b9141

📦 llama-cpp

Summary

This release introduces support for the `continue_final_message` flag in the server and WebUI to align with the vLLM API, ensuring correct behavior when continuing final messages during generation.

✨ New Features

  • Server and WebUI now accept the `continue_final_message` body flag for compatibility with the vLLM API.
  • When `continue_final_message` is true and `add_generation_prompt` is false, the existing `prefill_assistant` code path is triggered, overriding the server-side `opt.prefill_assistant` setting.
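As a sketch, a request that continues the final assistant message might look like the following. The endpoint path follows the server's OpenAI-compatible chat API; the URL, prompt text, and partial assistant message are illustrative, not taken from the release:

```python
import json

# Hypothetical request body for the OpenAI-compatible
# /v1/chat/completions endpoint: the trailing assistant message is
# continued in place instead of starting a new assistant turn.
payload = {
    "messages": [
        {"role": "user", "content": "Write a haiku about autumn."},
        # Partial assistant message whose text the model should extend.
        {"role": "assistant", "content": "Crisp leaves underfoot,"},
    ],
    # vLLM-compatible flag introduced in this release.
    "continue_final_message": True,
    # Must be false (or omitted) when continuing the final message.
    "add_generation_prompt": False,
}

print(json.dumps(payload, indent=2))
# e.g. POST it with: curl http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" -d @payload.json
```

With this combination of flags, generation resumes from the end of the partial assistant content rather than after a fresh generation prompt.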

🐛 Bug Fixes

  • Enforced mutual exclusion between `continue_final_message=true` and `add_generation_prompt=true`, returning HTTP 400 if both are set, matching vLLM behavior.
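The mutual-exclusion rule can be mirrored client-side before sending a request. A minimal sketch, in which the function name and error message are hypothetical and not part of llama.cpp:

```python
def validate_continue_flags(body: dict) -> tuple[int, str]:
    """Mirror the server's mutual-exclusion check (hypothetical helper).

    Returns an (HTTP status, message) pair: 400 when both flags are
    true, 200 otherwise.
    """
    if body.get("continue_final_message") and body.get("add_generation_prompt"):
        return 400, (
            "`continue_final_message` and `add_generation_prompt` "
            "cannot both be true"
        )
    return 200, "ok"


status, msg = validate_continue_flags(
    {"continue_final_message": True, "add_generation_prompt": True}
)
print(status, msg)  # rejected with status 400
```

Any other combination, including omitting both flags, passes the check.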

Affected Symbols