
b8480

📦 llama-cpp
1 feature 🔧 1 symbol

Summary

This release preloads the RoPE cache before ACL graph capture on the CANN backend, so that host-to-device copies and allocations happen outside the captured graph. It also provides updated pre-compiled binaries for multiple platforms.

✨ New Features

  • Added a RoPE cache preload before ACL graph capture on CANN devices. The preload runs host-to-device copies and allocations on a non-captured stream, warms up the memory pool, and lets the host-side and allocation branches be skipped during capture.
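
The warm-up-before-capture pattern described above can be illustrated with a small, self-contained sketch. This is not the ggml-cann implementation: `MockStream`, `RopeCache`, `rope_cache_ensure`, and `capture_graph` are hypothetical names, and the "stream" is a plain struct that only counts where uploads would have been issued. It assumes the cache has a fast path that skips allocation and copy once it is populated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device stream; NOT the real CANN/ACL API.
struct MockStream {
    bool capturing = false;          // true while a graph is being recorded
    int  h2d_copies_in_capture = 0;  // uploads issued while capturing
    int  h2d_copies_outside = 0;     // uploads issued on the non-captured path
};

// Hypothetical RoPE cache holding precomputed tables.
struct RopeCache {
    std::vector<float> table;
    bool valid = false;
};

// Populate the cache if needed. Once the cache is valid for this size,
// the function returns early: no allocation and no host-to-device copy.
void rope_cache_ensure(RopeCache &cache, MockStream &stream, std::size_t n) {
    if (cache.valid && cache.table.size() == n) {
        return;  // fast path taken during capture after a preload
    }
    cache.table.assign(n, 0.0f);  // stands in for a device allocation
    if (stream.capturing) {
        ++stream.h2d_copies_in_capture;
    } else {
        ++stream.h2d_copies_outside;
    }
    cache.valid = true;
}

struct CaptureStats {
    int in_capture;
    int outside;
};

// Record one "graph", optionally preloading the cache beforehand.
CaptureStats capture_graph(bool preload) {
    MockStream stream;
    RopeCache cache;
    const std::size_t n = 128;

    assert(!stream.capturing);
    if (preload) {
        rope_cache_ensure(cache, stream, n);  // warm-up outside capture
    }

    stream.capturing = true;
    rope_cache_ensure(cache, stream, n);      // the op as seen during recording
    stream.capturing = false;

    return {stream.h2d_copies_in_capture, stream.h2d_copies_outside};
}
```

With `capture_graph(true)` the upload is counted outside capture and the recorded section only hits the fast path; with `capture_graph(false)` the upload lands inside the recorded section, which is the situation the preload is meant to avoid.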

Affected Symbols