Migrating to llama.cpp b9109

Version b9109 introduces 2 breaking changes. This guide details how to update your code.

Released: 5/11/2026

2 Breaking Changes, 3 Migration Steps, 6 Affected Symbols

⚠️ Check Your Code

If you use any of these symbols, you need to read this guide:

`common_speculative_init()`, `common_speculative_process()`, `common_params_speculative`, `common_speculative_type`, `common_get_enabled_speculative_impls`, `common_speculative_type_from_names`

Breaking Changes

Issue #1

Support for incompatible vocabs has been dropped. Users must ensure their vocabulary files are compatible with the current spec.

Issue #2

The old `type` field in the `common_params_speculative` struct has been replaced by a vector to allow specifying multiple speculative types.
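
To make the shape of this change concrete, here is a minimal, self-contained sketch. It does not use the real llama.cpp headers: `sketch_spec_type`, `sketch_params_old`, `sketch_params_new`, `sketch_migrate`, the enumerator names, and the member name `types` are all hypothetical stand-ins, and mapping a "none" type to an empty vector is an assumption. Check the actual definition of `common_params_speculative` in b9109 before relying on any of these names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for common_speculative_type.
// The real enumerators in llama.cpp may differ.
enum sketch_spec_type {
    SKETCH_SPEC_NONE,
    SKETCH_SPEC_DRAFT,
    SKETCH_SPEC_NGRAM,
};

// Shape before b9109, per this guide: one speculative type per configuration.
struct sketch_params_old {
    sketch_spec_type type = SKETCH_SPEC_NONE;
};

// Shape from b9109 on, per this guide: a vector of speculative types.
// The member name `types` is an assumption made for this sketch.
struct sketch_params_new {
    std::vector<sketch_spec_type> types;
};

// Convert an old-style configuration: a single type becomes a
// one-element vector, and "none" becomes an empty vector (assumption).
sketch_params_new sketch_migrate(const sketch_params_old & old_params) {
    sketch_params_new migrated;
    if (old_params.type != SKETCH_SPEC_NONE) {
        migrated.types.push_back(old_params.type);
    }
    // A single old-style type can yield at most one entry.
    assert(migrated.types.size() <= 1);
    return migrated;
}
```

Code that previously read or wrote the single field would, under these assumptions, iterate over the vector or append to it instead.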

Migration Steps

  1. When configuring speculative decoding, replace the single `type` field in `common_params_speculative` with a vector of speculative types.
  2. Use `common_get_enabled_speculative_impls(const std::vector<enum common_speculative_type>)` to determine enabled implementations based on user-provided spec types.
  3. Use `common_speculative_type_from_names(const std::vector<std::string> & names)` to parse user-provided spec types specified as names.
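
The flow in steps 2 and 3 can be sketched as follows: parse names into types, then derive the set of enabled implementations. This is a self-contained mock-up, not the llama.cpp API. `sketch_types_from_names` and `sketch_enabled_impls` are hypothetical stand-ins for `common_speculative_type_from_names` and `common_get_enabled_speculative_impls`. The accepted names ("draft", "ngram"), the return types, the exception on unknown names, and the de-duplication behaviour are all assumptions, since this guide does not specify them.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for common_speculative_type.
enum class sketch_spec_type { draft, ngram };

// Stand-in for step 3: map user-provided names to types.
// The names accepted here are invented for illustration.
std::vector<sketch_spec_type> sketch_types_from_names(const std::vector<std::string> & names) {
    std::vector<sketch_spec_type> types;
    for (const std::string & name : names) {
        if (name == "draft") {
            types.push_back(sketch_spec_type::draft);
        } else if (name == "ngram") {
            types.push_back(sketch_spec_type::ngram);
        } else {
            throw std::invalid_argument("unknown speculative type: " + name);
        }
    }
    return types;
}

// Stand-in for step 2: keep each requested type once, in first-seen order.
// Treating "enabled" as "requested, de-duplicated" is an assumption.
std::vector<sketch_spec_type> sketch_enabled_impls(const std::vector<sketch_spec_type> & requested) {
    std::vector<sketch_spec_type> enabled;
    for (sketch_spec_type type : requested) {
        if (std::find(enabled.begin(), enabled.end(), type) == enabled.end()) {
            enabled.push_back(type);
        }
    }
    // De-duplication can only shrink the list.
    assert(enabled.size() <= requested.size());
    return enabled;
}
```

In real code, the parsed vector would be stored in `common_params_speculative` (step 1) rather than kept in a local variable.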

Release Summary

This release introduces major support for parallel drafting, enabling more complex speculative decoding strategies, and refactors context management across server and spec components. Several internal structure changes necessitate migration steps for speculative decoding configuration.
