b9116

📦 llama-cpp
3 features · 🐛 2 fixes · 🔧 1 symbol

Summary

This release adds vision support for MiMo v2.5, uses fused QKV in the MiMo v2.5 vision path, and fixes an f16 overflow in vision processing. Pre-compiled binaries are provided for macOS, Linux, Android, Windows, and openEuler.

Migration Steps

  1. If you use MiMo v2.5, use filter_tensors, since Flash does not have an mmproj.

✨ New Features

  • Added vision support for MiMo v2.5.
  • Used fused QKV for MiMo v2.5 vision.
  • Added various pre-compiled binaries for macOS, Linux (including Vulkan, ROCm 7.2, OpenVINO, SYCL), Android, Windows (including CUDA 12.4, 13.1, Vulkan, SYCL, HIP), and openEuler platforms.
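
For readers unfamiliar with the term, the general idea behind a fused QKV projection can be sketched as follows. This is a toy illustration in plain Python, not the llama.cpp implementation: the weight values and the helper names (matvec, qkv_separate, qkv_fused) are hypothetical. It only shows that stacking the query, key, and value weight matrices and doing one multiplication gives the same result as three separate multiplications.

```python
def matvec(w, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

# Toy 2x2 projection weights (hypothetical values, for illustration only).
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[2.0, 0.0], [0.0, 2.0]]
W_V = [[0.0, 1.0], [1.0, 0.0]]

def qkv_separate(x):
    """Three independent projections: one matrix-vector product each."""
    return matvec(W_Q, x), matvec(W_K, x), matvec(W_V, x)

# Fused weights: stack the rows of W_Q, W_K, W_V into one (3*d, d) matrix.
W_QKV = W_Q + W_K + W_V

def qkv_fused(x):
    """One projection with the stacked weights, then split into q, k, v."""
    d = len(W_Q)
    out = matvec(W_QKV, x)
    return out[:d], out[d:2 * d], out[2 * d:]

x = [3.0, 4.0]
q, k, v = qkv_fused(x)
print(q, k, v)
```

The fused form replaces three small products with one larger one, which is typically friendlier to GPU and SIMD backends; the numerical result is unchanged.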

🐛 Bug Fixes

  • Fixed an f16 overflow in MiMo v2.5 vision.
  • Cleaned up comments and fixed trailing whitespace in MiMo v2.5 implementation.
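
As background on the f16 fix, the sketch below shows why half-precision values can overflow. It illustrates the IEEE 754 binary16 range limit only; it does not reproduce the actual change in this release. The helper names (fits_in_f16, clamp_to_f16) are hypothetical, and clamping is shown as one generic mitigation, not necessarily the one used here.

```python
import struct

# Largest finite value representable in IEEE 754 binary16 (f16).
F16_MAX = 65504.0

def fits_in_f16(value):
    """Return True if value can be packed as a finite half-precision float."""
    try:
        struct.pack("<e", value)  # "e" is the struct format for binary16
        return True
    except (OverflowError, struct.error):
        return False

def clamp_to_f16(value):
    """Clamp value into the finite f16 range (a generic mitigation)."""
    return max(-F16_MAX, min(F16_MAX, value))

print(fits_in_f16(65504.0))                # largest finite f16 value
print(fits_in_f16(70000.0))                # outside the f16 range
print(fits_in_f16(clamp_to_f16(70000.0)))  # clamped value fits again
```

Intermediate activations that exceed this range become infinities or NaNs when stored as f16, which is why such overflows are usually handled by clamping, rescaling, or computing the affected step in f32.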

Affected Symbols