b8954
📦 llama-cpp
Summary
The server component now uses 'pos_next' instead of 'n_tokens' for M-RoPE calculations. This release also ships pre-compiled binaries for a range of operating systems and hardware configurations.