b9455

📅 Jun 1, 2026 📦 llama-cpp

✨ 1 feature 🐛 2 fixes

Summary

This release adds support for a quantized KV cache in Tensor Parallelism (TP) operations, fixes an issue with partial views, and removes an overly strict assertion.

✨ New Features

Added support for quantized KV cache in Tensor Parallelism (TP) operations.
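
For readers unfamiliar with the idea, here is a minimal conceptual sketch of what combining a quantized KV cache with tensor parallelism can look like. It is not llama.cpp's implementation. It assumes a Q8_0-style scheme (blocks of 32 values sharing one scale) and an even split of attention heads across devices. Every name in it (`quantize_block`, `shard_heads`, `quantize_shard`, `BLOCK_SIZE`) is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 32  # assumed block length, as in Q8_0-style formats


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Quantize one block to codes in [-127, 127] plus a single float scale."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0.0, [0] * len(values)
    scale = amax / 127.0
    return scale, [round(v / scale) for v in values]


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate float values from a quantized block."""
    return [scale * c for c in codes]


def shard_heads(n_heads: int, n_devices: int) -> List[List[int]]:
    """Assign contiguous, equally sized groups of heads to each device."""
    if n_heads % n_devices != 0:
        raise ValueError("n_heads must be divisible by n_devices")
    per_device = n_heads // n_devices
    return [list(range(d * per_device, (d + 1) * per_device))
            for d in range(n_devices)]


def quantize_shard(
    kv_rows: Dict[int, List[float]], heads: List[int]
) -> Dict[int, List[Tuple[float, List[int]]]]:
    """Quantize only the heads owned by one device, block by block."""
    shard = {}
    for head in heads:
        row = kv_rows[head]
        shard[head] = [
            quantize_block(row[i:i + BLOCK_SIZE])
            for i in range(0, len(row), BLOCK_SIZE)
        ]
    return shard


# Hypothetical usage: 4 heads, 2 devices, 64 cached values per head.
row = [((i * 7) % 13 - 6) / 3.0 for i in range(64)]
kv_cache = {head: row for head in range(4)}
device_heads = shard_heads(n_heads=4, n_devices=2)
device1_cache = quantize_shard(kv_cache, device_heads[1])
```

In this sketch each device stores and quantizes only its own heads, so every quantization block stays entirely within one shard. Whether the actual release partitions the cache this way is not stated in these notes.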

🐛 Bug Fixes

Fixed an issue related to partial views.
Removed an overly strict assertion.