b9116
📦 llama-cpp
✨ 3 features · 🐛 2 fixes · 🔧 1 symbol
Summary
This release adds vision support for MiMo v2.5, uses fused qkv in the MiMo v2.5 vision path, and fixes an f16 overflow in that path. Pre-compiled binaries are provided for multiple platforms.
Migration Steps
- If you use MiMo v2.5, remember to use filter_tensors, because Flash does not have an mmproj.
✨ New Features
- Added vision support for MiMo v2.5.
- MiMo v2.5 vision now uses fused qkv.
- Added various pre-compiled binaries for macOS, Linux (including Vulkan, ROCm 7.2, OpenVINO, SYCL), Android, Windows (including CUDA 12.4, 13.1, Vulkan, SYCL, HIP), and openEuler platforms.
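
For readers unfamiliar with the term, "fused qkv" generally means computing the query, key, and value projections with one matrix multiplication against a concatenated weight matrix instead of three separate ones. The sketch below only illustrates that general idea in plain Python. The dimensions, the random weights, and the helper names `matmul` and `hconcat` are assumptions made for the example; none of it is taken from the llama.cpp implementation.

```python
# Illustrative only: toy sizes and random weights, not MiMo v2.5's real shapes.
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def hconcat(*mats):
    """Concatenate matrices with the same row count along the column axis."""
    return [sum((list(m[i]) for m in mats), []) for i in range(len(mats[0]))]


def rand_matrix(rows, cols):
    """Hypothetical helper: a rows x cols matrix of uniform random values."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_tokens, d_model = 3, 4

x = rand_matrix(n_tokens, d_model)
w_q, w_k, w_v = (rand_matrix(d_model, d_model) for _ in range(3))

# Separate projections: three matrix multiplications.
q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

# Fused projection: one multiplication against [W_q | W_k | W_v], then split.
w_qkv = hconcat(w_q, w_k, w_v)
qkv = matmul(x, w_qkv)
q_fused = [row[:d_model] for row in qkv]
k_fused = [row[d_model:2 * d_model] for row in qkv]
v_fused = [row[2 * d_model:] for row in qkv]

# Both paths compute the same values; only the number of matmuls differs.
print(q == q_fused and k == k_fused and v == v_fused)
```

The fused form is mathematically equivalent to the separate form; in practice it is typically chosen because one larger multiplication is usually cheaper to launch than three smaller ones.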
🐛 Bug Fixes
- Fixed an f16 overflow in MiMo v2.5 vision.
- Cleaned up comments and fixed trailing whitespace in MiMo v2.5 implementation.
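
As background on the f16 fix: IEEE 754 half precision (f16) cannot represent finite values above 65504, so intermediate results that exceed this limit become infinity. The release notes do not say where the overflow occurred or how it was fixed, so the sketch below only demonstrates the general failure mode using Python's `struct` half-precision format. The helpers `to_f16`, `dot_f16`, and `dot_wide` are hypothetical names introduced for this example.

```python
import math
import struct

F16_MAX = 65504.0  # largest finite IEEE 754 binary16 value


def to_f16(x: float) -> float:
    """Round x to half precision, modelling overflow as +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack out-of-range values; an IEEE conversion
        # with round-to-nearest would produce infinity instead.
        return math.copysign(math.inf, x)


def dot_f16(a, b):
    """Dot product with every product and partial sum rounded to f16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f16(acc + to_f16(to_f16(x) * to_f16(y)))
    return acc


def dot_wide(a, b):
    """Same dot product accumulated in a wider type (Python float)."""
    return sum(x * y for x, y in zip(a, b))


a = [200.0] * 4
b = [200.0] * 4

# Each product (40000) fits in f16, but the running sum passes 65504.
print(dot_f16(a, b))   # inf
print(dot_wide(a, b))  # 160000.0
```

Common general remedies for this class of problem are accumulating in a wider type such as f32 or rescaling values before the operation; whether either applies to this release is not stated here.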