b9276
📦 llama-cpp
✨ 1 feature, 🔧 1 symbol
Summary
The server now exposes detailed prompt token counts via the /slots endpoint, so clients can monitor how far prompt evaluation has progressed. This release also includes pre-compiled binaries for a range of platforms.
✨ New Features
- Exposed prompt token counts (n_prompt_tokens, n_prompt_tokens_processed, n_prompt_tokens_cache) in the /slots JSON response for better monitoring of prompt evaluation progress.
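
As a rough illustration of how a monitoring script might read these counters, here is a minimal Python sketch. It rests on assumptions not stated in these notes: that the server is reachable at http://localhost:8080, that the /slots endpoint is enabled, and that the response is a JSON array of slot objects with the three counters at the top level of each object. The helper names `fetch_slots` and `summarize_slots` are hypothetical and not part of llama.cpp.

```python
import json
import urllib.request

# Field names taken from the release note above.
PROMPT_FIELDS = (
    "n_prompt_tokens",
    "n_prompt_tokens_processed",
    "n_prompt_tokens_cache",
)


def summarize_slots(slots):
    """Return one dict per slot holding the prompt token counters.

    Assumes `slots` is a list of dicts (the parsed /slots response).
    A counter that is missing from a slot is reported as None.
    """
    summary = []
    for index, slot in enumerate(slots):
        # Fall back to the list position if the slot carries no "id" key.
        row = {"id": slot.get("id", index)}
        for field in PROMPT_FIELDS:
            row[field] = slot.get(field)
        summary.append(row)
    return summary


def fetch_slots(base_url="http://localhost:8080"):
    """Fetch and parse the /slots response (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/slots", timeout=5) as resp:
        return json.load(resp)
```

Against a running server, `summarize_slots(fetch_slots())` would return one row per slot. If the counters turn out to be nested differently in the actual response, only the lookup inside `summarize_slots` should need adjusting.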