b9479
📦 llama-cpp
🐛 2 fixes · 🔧 3 symbols
Summary
This release fixes a critical bug in how common_prompt_batch_decode saves and restores session state, so that tokens are handled correctly during completion and when state is loaded.
🐛 Bug Fixes
- Fixed an issue in common_prompt_batch_decode where saving session state stored only n-1 tokens in both session_tokens and the KV cache, so the last saved token was replayed incorrectly when the session was restored.
- The fix stores all n tokens in session_tokens, while the saved memory state holds n-1 processed tokens, because the save happens before the final token is decoded.
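The bookkeeping described above can be illustrated with a toy model. This is a minimal sketch and not the llama.cpp implementation: ToyContext, ToySession, decode_prompt_and_save, and restore are hypothetical names invented for illustration, and a plain vector stands in for the KV cache. It only shows the invariant from the fix: the token list holds all n tokens, the saved memory holds n-1, and a loader can decode exactly the tokens the memory does not yet cover.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Token = int;

// Hypothetical stand-in for a model context: `memory` lists the tokens
// that have been decoded so far (a toy substitute for the KV cache).
struct ToyContext {
    std::vector<Token> memory;
    void decode(Token t) { memory.push_back(t); }
};

// Hypothetical stand-in for the contents of a saved session.
struct ToySession {
    std::vector<Token> tokens;  // all n prompt tokens
    std::vector<Token> memory;  // tokens decoded at save time (n-1)
};

// Decode all but the last token, snapshot the state, then decode the last
// token. The snapshot therefore covers n-1 tokens, but the token list
// records all n.
ToySession decode_prompt_and_save(ToyContext & ctx, const std::vector<Token> & prompt) {
    assert(!prompt.empty());
    for (std::size_t i = 0; i + 1 < prompt.size(); ++i) {
        ctx.decode(prompt[i]);
    }
    ToySession session{prompt, ctx.memory};
    ctx.decode(prompt.back());
    return session;
}

// Restore the snapshot, then decode only the tokens it does not cover.
// With n tokens and n-1 in memory, exactly one token is decoded here.
ToyContext restore(const ToySession & session) {
    ToyContext ctx;
    ctx.memory = session.memory;
    for (std::size_t i = session.memory.size(); i < session.tokens.size(); ++i) {
        ctx.decode(session.tokens[i]);
    }
    return ctx;
}
```

In this sketch, a four-token prompt yields a session with four entries in tokens and three in memory, and restore brings the context back to the same four decoded tokens as the original run.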