version 1.6.1
* Optional GPU acceleration via wgpu-native (WebGPU) on Linux (Vulkan), macOS (Metal), and Windows (Direct3D 12/Vulkan)
* All five compute shaders use workgroup shared memory tiling, eliminating 8-16x memory bandwidth waste versus naive global-memory access
* Distance matrix shader tiles X rows cooperatively across 256-thread workgroups, computing all n*(n-1)/2 pairs in parallel
* Kernel matrix shader (Gaussian, Laplacian, inverse multiquadric, exponential, polynomial) uses the same 16-feature tiling strategy
* Batch objective shader evaluates r candidate designs simultaneously; each workgroup tiles the n design-vector entries into shared memory before the inner product loop
* Multiple-kernel objective shader applies the same tiling per kernel and accumulates the weighted (optionally log-scaled) sum in a single GPU pass
* Greedy search shader (full_greedy_search_gpu) runs the complete pair-switching loop on the GPU: incremental O(p) average update per iteration eliminates the O(n*p) full recompute, and a 256-thread workgroup parallel argmin reduction reads back only 4 bytes per iteration instead of numIT*numIC*4 bytes
* New R functions: ged_gpu_available(), ged_gpu_devices(), compute_distance_matrix_gpu(), compute_kernel_matrix_gpu(), compute_objective_vals_gpu(), compute_multiple_kernel_objective_vals_gpu(), compute_randomization_metrics_gpu(), full_greedy_search_gpu()
* GPU use is automatic when wgpu-native is detected at install time; disabled transparently otherwise with no change to existing CPU code paths
* Added benchmark script benchmarks/benchmark_gpu_backend.R for CPU vs GPU timing comparison
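
The incremental O(p) update mentioned in the greedy search entry can be illustrated with a minimal CPU-side sketch. It assumes the objective depends on the treatment and control column averages, which this changelog does not spell out. The names `group_means` and `switch_pair` are hypothetical and are not part of the package or its shaders.

```python
def group_means(X, w):
    """Full O(n*p) recompute of the treatment and control column means."""
    p = len(X[0])
    n_t = sum(w)
    n_c = len(w) - n_t
    mean_t = [0.0] * p
    mean_c = [0.0] * p
    for row, w_i in zip(X, w):
        target, size = (mean_t, n_t) if w_i == 1 else (mean_c, n_c)
        for k in range(p):
            target[k] += row[k] / size
    return mean_t, mean_c


def switch_pair(X, w, mean_t, mean_c, n_t, n_c, i, j):
    """Swap treated unit i with control unit j, updating both means in O(p).

    Group sizes do not change under a pair switch, so only the p entries
    of each mean vector need to be touched (assumed objective structure).
    """
    assert w[i] == 1 and w[j] == 0
    for k in range(len(mean_t)):
        delta = X[j][k] - X[i][k]
        mean_t[k] += delta / n_t  # treatment group loses x_i and gains x_j
        mean_c[k] -= delta / n_c  # control group loses x_j and gains x_i
    w[i], w[j] = 0, 1
```

After a switch, the updated means should agree with a full recompute up to floating-point error, which is a convenient correctness check for any incremental scheme of this kind.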

version 1.6
* Massive speedups from Rcpp implementations throughout the codebase
* Gurobi search for r vectors now works using pooling and supports many additional arguments
* ggplot2 is now the plotting engine
* Added a test suite for public exported functions
* Added a benchmark suite
* Standardized method names across all types of searches
* Added examples for all functions in documentation
* Fixed documentation errors and inconsistencies
* Updated package description with DOIs for the relevant publications

version 1.5.6.3
* Fixed the Java-side cache so newBlankDesign no longer reuses a design with the wrong nT when the same n is requested again, which was causing findIndicies to overrun its array (issue #3 on GitHub)

version 1.5.6.2
* added a C++ function that creates block design allocations rapidly
* exports all toolbox-style blazing-fast C++ functions: (a) generate_block_design_cpp_wrap, which generates homogeneous block allocation vectors, (b) compute_distance_matrix_cpp_wrap, which generates squared Euclidean distance matrices, and (c) shuffle_cpp_wrap, a shuffling routine that is faster than base R's sample() function
* standardized on the one_zero format over the zero_one format for allocation vectors
* added a function "gen_var_cov_matrix_block_designs" that generates variance-covariance matrices for block designs
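
As an illustration of what a squared Euclidean distance matrix contains, here is a minimal sketch. It is not the package's C++ routine, and the function name `squared_euclidean_distance_matrix` is hypothetical. It visits each of the n*(n-1)/2 unordered pairs once and mirrors the result.

```python
def squared_euclidean_distance_matrix(X):
    """Return the n x n matrix of squared Euclidean distances between rows of X."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is symmetric.
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            D[i][j] = d
            D[j][i] = d
    return D
```

The diagonal stays at zero, and no square root is taken, so entries are squared distances rather than distances.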

version 1.5.6.1
* Faster implementation of complete_randomization_with_forced_balanced, imbalanced_complete_randomization, and imbalanced_block_designs via std::shuffle
* Added seed support for complete_randomization_with_forced_balanced, imbalanced_complete_randomization, and imbalanced_block_designs
* We now comply with CRAN policy to not change the user's graphical parameters
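
The shuffle-based approach with a seed can be sketched as follows. This is a minimal illustration, not the package's std::shuffle implementation. It assumes "forced balance" means exactly n/2 treatments and n/2 controls, and the function name `forced_balance_allocation` is hypothetical.

```python
import random


def forced_balance_allocation(n, seed=None):
    """Return a random 0/1 allocation vector with exactly n/2 ones."""
    if n % 2 != 0:
        raise ValueError("n must be even for a balanced allocation")
    rng = random.Random(seed)  # a private generator keeps the seed local
    w = [1] * (n // 2) + [0] * (n // 2)
    rng.shuffle(w)  # uniform random permutation of the balanced vector
    return w
```

Passing the same seed reproduces the same allocation, which is the practical benefit of exposing a seed on randomization routines.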

version 1.5.6
* Method that calculates asymmetric cost allocation
* Greedy Pair Switching algorithm that supports unequal number of treatments and controls
* Imbalanced (unequal allocation) Completely Random Designs
* Imbalanced (unequal allocation) Block Designs

version 1.5.5
* Speedier implementation of binary pairwise matching designs using Java and hash functions to check uniqueness

version 1.5
* Greedy Design on Multiple Kernels simultaneously
* You can pass distance matrices to binary match design initialization
* You can create binary match designs based on Mahalanobis distances between units using a flag in the design initialization

version 1.4
* Binary match followed by greedy within-the-binary-pairs search / rerandomization

version 1.35
* Curation of designs for near-orthogonality by minimizing the average absolute correlation between each pair of vectors
* Hadamard matrix experimental designs
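
The average absolute correlation criterion named in the curation entry can be sketched as below. This is a minimal illustration using Pearson correlation, which is an assumption about the correlation measure. The function names `pearson` and `average_abs_correlation` are hypothetical and are not the package's API.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (neither may be constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def average_abs_correlation(vectors):
    """Mean of |correlation| over all unordered pairs of allocation vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(abs(pearson(u, v)) for u, v in pairs) / len(pairs)
```

A value of 0 means every pair of vectors is uncorrelated, and a value of 1 means every pair is perfectly correlated or anti-correlated, so lower values indicate a more nearly orthogonal set.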

version 1.31
* Various speedups and bug fixes.

version 1.3
* Gurobi now works with Kernel distances.

version 1.21
* Fixed a small bug and deleted the vignette, which was causing compilation issues.

version 1.2
* Gurobi setup for allocation searches via numerical optimization.

version 1.1
* Updated package to conform to CRAN's new policies.

version 1.0
* Initial Release
