b9129
📦 llama-cpp
✨ 2 features · 🐛 1 fix · 🔧 1 symbol
Summary
This release introduces an adaptive CPU fallback for ggml-zendnn based on batch size, controllable via an environment variable, and restores the previous fallback behavior when the feature is disabled. Numerous pre-compiled binaries are provided.
Migration Steps
- If you wish to disable the new adaptive fallback behavior in ggml-zendnn, set the environment variable GGML_ZENDNN_ADAPTIVE_FALLBACK to 0.
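As a minimal sketch, the step above amounts to setting the variable in the environment before launching a llama.cpp tool built with the ZenDNN backend (the `llama-cli` invocation below is illustrative; adjust the binary, model path, and flags to your setup):

```shell
# Disable the adaptive CPU fallback in ggml-zendnn, restoring the
# previous (pre-adaptive) fallback behavior for all batch sizes.
export GGML_ZENDNN_ADAPTIVE_FALLBACK=0

# Then run any llama.cpp binary as usual, e.g.:
# ./llama-cli -m model.gguf -p "Hello"
```

Leaving the variable unset (or setting it to any value other than `0`) keeps the new adaptive behavior, which is the default.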
✨ New Features
- ggml-zendnn now features adaptive fallback to the CPU backend for small batch sizes.
- Added runtime environment variable GGML_ZENDNN_ADAPTIVE_FALLBACK to control adaptive fallback (enabled by default).
🐛 Bug Fixes
- Restored original ggml-zendnn fallback logic when adaptive fallback is disabled.