b8369

📅 Mar 16, 2026 📦 llama-cpp

✨ 1 feature

Summary

This release optimizes the CUDA backend by hiding memory latency in GDN operations. It also ships pre-compiled binaries for a range of operating systems and hardware configurations.

✨ New Features

The CUDA backend now hides memory latency for GDN operations.