b8485
📦 llama-cpp
Summary
The server now uses httplib's dynamic threads, setting the thread count to the existing n_threads_http value plus 1024. This release also ships pre-compiled binaries for macOS, Linux, Windows, and openEuler.