b7649

📦 llama-cpp
1 feature 🔧 1 symbol

Summary

This release improves ggml performance by optimizing the CUDA ssm_scan operation with warp-level reduction.

✨ New Features

  • Optimized CUDA ssm_scan implementation using warp-level reduction in ggml.
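
For readers unfamiliar with the term, warp-level reduction generally means combining values across the 32 threads of a warp through register shuffles instead of shared memory. The sketch below illustrates that general pattern only. It is not the ggml ssm_scan kernel, and the names `warp_reduce_sum`, `sum_kernel`, and `sum_with_warp_reduction` are hypothetical helpers made up for this example.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Sum a value across the 32 lanes of a warp using register shuffles.
// Each step halves the distance; after the loop, lane 0 holds the full sum.
__device__ float warp_reduce_sum(float v) {
    for (int offset = 16; offset > 0; offset >>= 1) {
        v += __shfl_down_sync(0xffffffffu, v, offset);
    }
    return v;
}

// Hypothetical demo kernel (not from ggml): a single warp sums n floats.
__global__ void sum_kernel(const float *x, float *out, int n) {
    float acc = 0.0f;
    // Each lane first accumulates a strided slice of the input.
    for (int i = threadIdx.x; i < n; i += 32) {
        acc += x[i];
    }
    // All 32 lanes reach this point, so the full mask is valid.
    acc = warp_reduce_sum(acc);
    if (threadIdx.x == 0) {
        *out = acc;
    }
}

// Hypothetical host wrapper; error checking is omitted for brevity.
float sum_with_warp_reduction(const std::vector<float> &h) {
    float *d_x = nullptr;
    float *d_out = nullptr;
    float result = 0.0f;
    const int n = static_cast<int>(h.size());
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_x, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    sum_kernel<<<1, 32>>>(d_x, d_out, n);  // one block of exactly one warp
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_out);
    return result;
}
```

Calling `sum_with_warp_reduction(std::vector<float>(64, 1.0f))` from host code should return 64. The appeal of this pattern is that the partial sums stay in registers, so no shared-memory traffic or block-wide barrier is needed within a warp.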

🔧 Affected Symbols

ggml