b8660
📦 llama-cpp
✨ 3 features · 🐛 2 fixes · 🔧 1 symbol
Summary
This release focuses on internal refactoring and optimization of ggml-webgpu: it starts moving away from parameter buffer pools, switches to unpackf16 for wider compatibility, and improves stability by fixing a stride calculation and a deadlock condition in free_bufs.
✨ New Features
- Started work on removing parameter buffer pools in ggml-webgpu.
- Restored the timeout on wait operations and removed synchronous set_tensor/memset_tensor in ggml-webgpu.
- Moved to unpackf16 for wider compatibility in ggml-webgpu.
🐛 Bug Fixes
- Fixed stride calculation in ggml-webgpu.
- Removed a deadlock condition in free_bufs in ggml-webgpu.