v2.25.0
📦 localai
✨ 6 features · 🐛 2 fixes · 🔧 4 symbols
Summary
This release introduces several new features, including token usage reporting for streaming responses, reading Jinja templates from GGUF files, and configurable llama.cpp KV-cache quantization. It also includes numerous model gallery updates and a fix to the llava clip patch.
✨ New Features
- Exposed cache_type_k and cache_type_v to allow quantizing the llama.cpp KV cache.
- Jinja templates can now be read from GGUF files.
- Added support for reporting token usage in streaming responses.
- The Dockerfile now allows skipping driver installation.
- Added path-prefix support for the UI via an HTTP header.
- The downloader can now resume partial downloads.
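The KV-cache options above are set per model. A minimal sketch of a model config, assuming the field names mirror the llama.cpp --cache-type-k / --cache-type-v flags and that quantization types such as f16, q8_0, and q4_0 are accepted:

```yaml
# Hypothetical model config sketch; field names and accepted values
# are assumed to mirror llama.cpp's --cache-type-k / --cache-type-v flags.
name: my-model
parameters:
  model: my-model.gguf
context_size: 4096
# Quantize the KV cache to reduce memory usage at some cost in accuracy
cache_type_k: q8_0
cache_type_v: q8_0
```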
🐛 Bug Fixes
- Updated clip.patch for llava.
- Corrected the gallery/index.yaml file.
🔧 Affected Symbols
llava · llama.cpp · gguf · Dockerfile