b8104
📦 llama-cpp
✨ 2 features · 🐛 1 fix · 🔧 4 symbols
Summary
This release fixes an issue where an extra newline was inserted between text and media markers in MTMD chat output by introducing a dedicated `media_marker` type. This resolves token-count discrepancies between llama-server output and Hugging Face (HF) implementations for vision models.
Migration Steps
- Refactor media-marker handling to use explicit per-type `if` branches instead of treating every content part as plain text.
- Update `common/chat.cpp` accordingly.
- Update `common_chat_templates_apply_legacy` to account for the new `media_marker` type.
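The "explicit per-type ifs" step could look roughly like the following sketch. All names here (`part_type`, `content_part`, `render_part`) are hypothetical illustrations, not llama.cpp's actual API:

```cpp
#include <cassert>
#include <string>

// Hypothetical content-part types, including the new media_marker.
enum class part_type { text, media_marker };

struct content_part {
    part_type   type;
    std::string value;
};

// Explicit per-type branches, as the migration step suggests,
// rather than handling every part through one generic text path.
std::string render_part(const content_part & p) {
    if (p.type == part_type::text) {
        return p.value;
    }
    if (p.type == part_type::media_marker) {
        // Emit the marker verbatim, with no extra formatting around it.
        return p.value;
    }
    return "";
}
```

Keeping the branches explicit makes it straightforward to add per-type behavior later (e.g. different separators) without touching the other cases.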
✨ New Features
- Introduced a new `media_marker` type in MTMD chat processing so media markers are handled explicitly.
- Added logic to suppress newline insertion before and after media markers during JSON serialization for chat compatibility.
🐛 Bug Fixes
- Fixed an extra newline character (`\n`) being inserted between text and media markers in MTMD chat output, restoring token-count consistency with HF implementations for vision models.
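To illustrate the fixed behavior, here is a hedged sketch of marker-aware joining: a newline still separates adjacent text parts, but is never inserted before or after a media marker. The types and the `"<__media__>"` marker string are illustrative assumptions, not llama.cpp's exact code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical content-part types, mirroring the media_marker change.
enum class part_type { text, media_marker };

struct content_part {
    part_type   type;
    std::string value;
};

std::string join_parts(const std::vector<content_part> & parts) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) {
            const bool adjacent_marker =
                parts[i - 1].type == part_type::media_marker ||
                parts[i].type     == part_type::media_marker;
            // Only separate two text parts with a newline; never insert
            // one next to a media marker, so token counts match HF.
            if (!adjacent_marker) {
                out += "\n";
            }
        }
        out += parts[i].value;
    }
    return out;
}
```

With this rule, `text + marker + text` serializes with the marker directly adjacent to the surrounding text, while plain `text + text` keeps its newline separator.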