Changelog

b8863

📦 llama-cpp
🐛 2 fixes · 🔧 1 symbol

Summary

This release improves the stability of ggml-cuda under memory pressure by flushing the legacy memory pool and retrying the allocation when an out-of-memory (OOM) error occurs. It also addresses several review comments related to synchronization and cleanup.

🐛 Bug Fixes

  • ggml-cuda: Flush the legacy memory pool on OOM and retry the allocation, improving stability under memory pressure.
  • Address review comments in the ggml-cuda implementation: add an explicit synchronization, update the destructor, and clean up the MUSA macros.

Affected Symbols