b8863
📦 llama-cpp
🐛 2 fixes · 🔧 1 symbol
Summary
This release improves ggml-cuda stability under memory pressure by flushing the legacy memory pool and retrying allocation on out-of-memory (OOM) errors, and it addresses review comments related to synchronization and cleanup.
🐛 Bug Fixes
- ggml-cuda: Flush legacy pool on OOM and retry to improve stability during memory pressure.
- Address review comments: added explicit sync, updated destructor, and cleaned up MUSA macros in ggml-cuda implementation.