b9585
📦 llama-cpp
🐛 1 fix, 🔧 1 symbol
Summary
This release fixes an issue in granite speech model inference by correctly applying the embedding scale when deepstack is disabled. It also includes minor cleanup in tests.
🐛 Bug Fixes
- Fixed granite speech model inference by applying embedding scale when deepstack is not used.
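The fix described above can be pictured with a minimal sketch. This is an illustration only and not the llama.cpp implementation: the helper name `prepare_embeddings`, its parameters, and the list-of-lists representation are all hypothetical. The release notes do not say what the deepstack path does, so that branch is left as a pass-through placeholder.

```python
def prepare_embeddings(embeddings, embedding_scale, use_deepstack):
    """Hypothetical helper (not a llama.cpp function).

    embeddings: list of rows, each row a list of floats.
    embedding_scale: scalar multiplier for the embeddings.
    use_deepstack: whether the deepstack path is used.
    """
    if use_deepstack:
        # Placeholder assumption: the notes do not describe this path,
        # so the sketch returns an unchanged copy of the input.
        return [list(row) for row in embeddings]
    # Deepstack not used: apply the embedding scale to every value.
    return [[value * embedding_scale for value in row] for row in embeddings]


scaled = prepare_embeddings([[1.0, 2.0]], embedding_scale=12.0, use_deepstack=False)
```

The point of the sketch is only the branch structure: before the fix, the scaling step was evidently skipped in the non-deepstack case, which is the case this release corrects.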