b8660

📅 Apr 3, 2026 · 📦 llama-cpp

✨ 3 features · 🐛 2 fixes · 🔧 1 symbol

Summary

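This release focuses on internal refactoring and optimization of ggml-webgpu. It begins removing parameter buffer pools, restores timeouts on wait operations, drops the synchronous set_tensor/memset_tensor, switches to unpackf16 for wider compatibility, and fixes a stride calculation and a deadlock in free_bufs.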

✨ New Features

Started work on removing parameter buffer pools in ggml-webgpu.
Restored the timeout on wait operations and removed synchronous set_tensor/memset_tensor in ggml-webgpu.
Moved to unpackf16 for wider compatibility in ggml-webgpu.
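
The timeout item above is about bounding how long a wait may block. The release note does not show the implementation, so the following is only a generic sketch of the pattern in Python, not ggml-webgpu code. The names `wait_until` and `is_done` and the polling interval are hypothetical, chosen for illustration.

```python
import time


def wait_until(is_done, timeout_s, poll_interval_s=0.001):
    """Poll is_done() until it returns True or timeout_s seconds elapse.

    Hypothetical helper: returns True if the condition was met in time,
    False if the deadline passed first. The caller decides how to react
    to a timeout (retry, log, abort) instead of blocking forever.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

A caller would typically treat a `False` result as an error path, which is what makes a stuck operation visible rather than hanging the process.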

🐛 Bug Fixes

Fixed stride calculation in ggml-webgpu.
Removed a deadlock condition in free_bufs in ggml-webgpu.
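
The note does not say which stride computation was wrong. As background only, the sketch below shows how byte strides are commonly derived for a densely packed tensor whose dimensions are listed innermost first. It assumes an unquantized element type with a fixed size in bytes. `contiguous_strides` and `byte_offset` are hypothetical names, not functions from ggml-webgpu.

```python
def contiguous_strides(shape, element_size):
    """Byte strides for a densely packed tensor, innermost dimension first.

    Assumption: every element occupies element_size bytes and there is
    no padding between rows or planes.
    """
    strides = []
    step = element_size
    for extent in shape:
        strides.append(step)  # bytes to advance one index along this dimension
        step *= extent        # the next dimension skips a whole block of this one
    return strides


def byte_offset(index, strides):
    """Byte offset of the element at a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))
```

For a shape of `(4, 3, 2)` with 2-byte elements, this gives strides of 2, 8, and 24 bytes. An error in any one stride shifts every access along that dimension, which is why such bugs tend to show up as corrupted output rather than crashes.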

Affected Symbols

ggml-webgpu