b7630
📦 llama-cpp
✨ 3 features · 🔧 2 symbols
Summary
This release adds operator fusion for ADD + RMS_NORM operations in the CANN backend, which improves performance by reducing memory access overhead. Pre-built binaries for several operating systems and hardware configurations are also provided.
Migration Steps
- To enable the new ADD + RMS_NORM fusion in the CANN backend, set the GGML_CANN_OPERATOR_FUSION environment variable to true.
✨ New Features
- Added operator fusion support for ADD + RMS_NORM operations in the CANN backend to reduce memory access overhead.
- Implemented ggml_cann_op_add_rms_norm_fused() using ACLNN AddRmsNorm.
- Added ggml_cann_can_fuse() to check fusion eligibility.
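To illustrate why fusing ADD with RMS_NORM reduces memory traffic, here is a minimal CPU-side sketch in C++. It is not the CANN implementation: the function names add_op, rms_norm_op, and add_rms_norm_fused are hypothetical, a single row of floats stands in for a tensor, and the normalization is assumed to be x / sqrt(mean(x^2) + eps) with no learned weight. The unfused path materializes the sum as a separate tensor that the normalization then reads back; the fused path accumulates the sum of squares while it adds, so no intermediate tensor is allocated.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Unfused step 1 (hypothetical name): ADD writes a full intermediate tensor.
// Assumes a and b have the same, non-zero length.
std::vector<float> add_op(const std::vector<float>& a, const std::vector<float>& b) {
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}

// Unfused step 2 (hypothetical name): RMS_NORM reads the intermediate back.
// Assumed formula: x / sqrt(mean(x^2) + eps), without a learned weight.
std::vector<float> rms_norm_op(const std::vector<float>& x, float eps) {
    double sum_sq = 0.0;
    for (float v : x) {
        sum_sq += static_cast<double>(v) * v;
    }
    const float scale = 1.0f / std::sqrt(static_cast<float>(sum_sq / x.size()) + eps);
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = x[i] * scale;
    }
    return out;
}

// Fused sketch (hypothetical name): the sum of squares is accumulated while
// adding, and the sum lives directly in the output buffer, so there is no
// separate intermediate tensor to write and re-read.
std::vector<float> add_rms_norm_fused(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      float eps) {
    std::vector<float> out(a.size());
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float s = a[i] + b[i];
        out[i] = s;
        sum_sq += static_cast<double>(s) * s;
    }
    const float scale = 1.0f / std::sqrt(static_cast<float>(sum_sq / a.size()) + eps);
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] *= scale;
    }
    return out;
}
```

Calling rms_norm_op(add_op(a, b), eps) and add_rms_norm_fused(a, b, eps) on the same inputs should give matching results; the difference is only in how many buffers are written and read along the way.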
🔧 Affected Symbols
- ggml_cann_op_add_rms_norm_fused
- ggml_cann_can_fuse