Changelog

v2.25.0

📦 localai
✨ 6 features · 🐛 2 fixes · 🔧 4 symbols

Summary

This release introduces several new features, including streaming token usage, GGUF template reading, and enhanced llama.cpp cache configuration. It also includes numerous updates to the model gallery and a fix for the llava clip patch.

✨ New Features

  • Exposed cache_type_k and cache_type_v to control llama.cpp KV cache quantization.
  • Jinja templates can now be read from gguf files.
  • Added support for streaming token usage.
  • Dockerfile now allows skipping driver installation.
  • Added path prefix support via HTTP header for the UI.
  • Partial downloads can now be resumed by the downloader.
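As a sketch of the first item above, the newly exposed KV cache quantization options might appear in a model's YAML definition roughly like this (the file layout, key placement, and the q8_0 value are illustrative assumptions, not taken from this release note):

```yaml
# Hypothetical LocalAI model definition (layout assumed).
name: my-model
backend: llama-cpp
parameters:
  model: my-model.Q4_K_M.gguf
# Quantize the llama.cpp KV cache to reduce memory use.
# cache_type_k / cache_type_v are the options exposed in this release;
# the accepted values mirror llama.cpp cache types (e.g. f16, q8_0, q4_0).
cache_type_k: q8_0
cache_type_v: q8_0
```

Lower-precision cache types trade a small amount of generation quality for a substantially smaller KV cache, which matters most for long contexts.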

🐛 Bug Fixes

  • Updated clip.patch for llava.
  • Corrected gallery/index.yaml file.

🔧 Affected Symbols

llava, llama.cpp, gguf, Dockerfile