
b8846

📦 llama-cpp
2 features · 🔧 2 symbols

Summary

This release reduces CPU overhead in the ggml meta backend by caching subgraph splits when the computation graph is unchanged between evaluations. The accompanying internal refactoring renamed the tracking fields `last_uid` and `last_n_subgraphs` and removed `last_max_tmp_size`.

Migration Steps

  1. Rename field `last_uid` to reflect its new purpose.
  2. Rename field `last_n_subgraphs` to reflect its new purpose.
  3. Remove the `last_max_tmp_size` field.

✨ New Features

  • Reduced CPU overhead in the meta backend by caching subgraph splits when the computation graph (cgraph) remains unchanged.
  • Added UID assignment to subgraphs to enable faster UID-checking paths, including for CUDA.
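
The caching idea described above can be illustrated with a minimal sketch. This is not the ggml implementation: `Graph`, `Subgraph`, `SplitCache`, and the fixed-size chunking rule are all hypothetical names and simplifications chosen for illustration. The only assumption taken from the notes is that a graph carries a UID that stays the same while its structure is unchanged, and that each subgraph receives its own UID.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compute graph (not ggml's real struct).
// Assumption: uid stays the same as long as the graph structure is unchanged.
struct Graph {
    std::uint64_t    uid;
    std::vector<int> nodes;
};

// Hypothetical subgraph; each one receives its own UID when it is created.
struct Subgraph {
    std::uint64_t    uid;
    std::vector<int> nodes;
};

// Caches the result of splitting a graph, keyed on the graph's UID.
class SplitCache {
public:
    explicit SplitCache(std::size_t chunk) : chunk_(chunk) { assert(chunk > 0); }

    // Returns the cached split if the graph UID matches the previous call,
    // otherwise recomputes the split and stores it.
    const std::vector<Subgraph> & get(const Graph & g) {
        if (valid_ && g.uid == cached_uid_) {
            return cached_; // fast path: nothing is recomputed
        }
        cached_.clear();
        for (std::size_t i = 0; i < g.nodes.size(); i += chunk_) {
            const std::size_t end   = std::min(i + chunk_, g.nodes.size());
            const auto        first = g.nodes.begin() + static_cast<std::ptrdiff_t>(i);
            const auto        last  = g.nodes.begin() + static_cast<std::ptrdiff_t>(end);
            cached_.push_back(Subgraph{next_uid_++, std::vector<int>(first, last)});
        }
        cached_uid_ = g.uid;
        valid_      = true;
        ++rebuilds_;
        return cached_;
    }

    // Number of times the split was actually recomputed.
    int rebuilds() const { return rebuilds_; }

private:
    std::size_t           chunk_;
    std::vector<Subgraph> cached_;
    std::uint64_t         cached_uid_ = 0;
    std::uint64_t         next_uid_   = 1;
    bool                  valid_      = false;
    int                   rebuilds_   = 0;
};
```

In this sketch, calling `get` repeatedly with the same graph UID reuses the stored split, and a graph with a different UID triggers a rebuild. Because the subgraphs keep their UIDs across cache hits, a downstream consumer could in principle compare a single integer instead of re-inspecting every node, which is the kind of fast path the second bullet refers to.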

Affected Symbols