Change8

b9864

Breaking Changes
📦 llama-cppView on GitHub →
2 breaking2 features🐛 1 fixes🔧 2 symbols

Summary

This release improves SSE stream stability by adjusting ping intervals to prevent premature connection drops during slow prefill, and refactors `sse_ping_interval` to be a per-request parameter validated by the schema.

⚠️ Breaking Changes

  • The `sse_ping_interval` configuration has moved from a global setting to a per-request body field in the server API schema. API clients relying on the old global default might see different behavior if they were expecting the previous default, although the global default returns to 30.
  • The raw JSON value reading for `sse_ping_interval` is removed; it is now a typed field_num bound to task_params, requiring schema evaluation for validation.

Migration Steps

  1. API clients that previously relied on a global setting for SSE stream ping behavior should now explicitly send `sse_ping_interval` in the request body if they require a specific cadence.
  2. If using the WebUI, be aware it now explicitly sets `sse_ping_interval: 1`.

✨ New Features

  • Server now pings silent SSE streams every 1 second and only kicks connections after 3 seconds of silence to prevent dropping connections during slow prefill.
  • The WebUI now sends `sse_ping_interval: 1` in the request body to control the cadence for the 3-second visibility kick.

🐛 Bug Fixes

  • Slow prefill operations are less likely to cause healthy SSE connections to drop due to improved ping/kick timing.

Affected Symbols