b8607
📦 llama-cpp
✨ 2 features · 🐛 3 fixes · 🔧 4 symbols
Summary
This release improves the ggml WebGPU backend by moving quantized buffers to u32 and removing synchronous tensor operations, alongside general cleanup and a deadlock fix.
Migration Steps
- Move to unpackf16 for wider compatibility in ggml operations.
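The migration step above concerns how shaders read 16-bit floats once buffers are bound as 32-bit words. As a rough illustration (in Python rather than WGSL, and with the exact behavior of `unpackf16` assumed from its name, not confirmed from the source), unpacking two half floats from one u32 word looks like:

```python
import struct

def unpack_f16_pair(word: int) -> tuple[float, float]:
    # Illustrative sketch only: mirrors the kind of helper the notes call
    # unpackf16. WebGPU storage buffers are addressed in 32-bit units, so a
    # shader reads a u32 and splits it into two IEEE-754 binary16 values.
    lo = word & 0xFFFF          # low half: first f16 (little-endian layout)
    hi = (word >> 16) & 0xFFFF  # high half: second f16
    # struct format 'e' decodes a 2-byte half-precision float
    return (struct.unpack('<e', lo.to_bytes(2, 'little'))[0],
            struct.unpack('<e', hi.to_bytes(2, 'little'))[0])
```

Reading whole u32 words and unpacking in the shader avoids relying on 16-bit storage-buffer types, which not every browser or device exposes.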
✨ New Features
- ggml webgpu: quantized buffers updated to u32, improving browser/device support.
- Ongoing work toward removing bitcast usage in ggml.
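Storing quantized buffers as u32 sidesteps devices that lack smaller storage-buffer element types: the shader reads whole 32-bit words and extracts individual quant bytes with shifts and masks. A minimal Python sketch of that extraction pattern (illustrative only, not the actual WGSL, and the helper name is hypothetical):

```python
def extract_i8(words: list[int], byte_index: int) -> int:
    # Hypothetical helper: read one signed 8-bit quant out of a buffer
    # packed as little-endian u32 words, the view a WebGPU storage buffer
    # of u32 gives a shader.
    word = words[byte_index >> 2]                 # which 32-bit word
    b = (word >> ((byte_index & 3) * 8)) & 0xFF   # which byte within it
    return b - 256 if b >= 128 else b             # sign-extend to int8

# Example: the byte stream [1, 255, 128, 0] packed into one u32 word.
packed = [int.from_bytes(bytes([1, 255, 128, 0]), "little")]
```

The same shift-and-mask approach generalizes to 4-bit and other sub-byte quant formats, at the cost of a little extra shader arithmetic per element.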
🐛 Bug Fixes
- Restored the timeout in the wait function.
- Removed synchronous set_tensor/memset_tensor calls.
- Removed deadlock condition in free_bufs.