ray-2.51.0
⚠️ 2 breaking · ✨ 16 features · 🐛 17 fixes · ⚡ 2 deprecations · 🔧 11 symbols
Summary
This release introduces Ray Train v2 as the default, adds application-level autoscaling to Ray Serve, and brings numerous new features and fixes to Ray Data, Train, Tune, and Serve.
⚠️ Breaking Changes
- Ray Train v2 is enabled by default; code that relies on Ray Train v1 behavior may break. Disable v2 with the environment variable `RAY_TRAIN_V2_ENABLED=0` or migrate to the v2 APIs.
- Checkpoint Manager Pydantic API was reverted from v2 to v1; any user code importing or extending the v2 models must be updated to the v1 equivalents.
Migration Steps
- Review the Ray Train v2 migration guide (https://github.com/ray-project/ray/issues/49454) and update code to use the v2 APIs, or set `RAY_TRAIN_V2_ENABLED=0` to retain v1 behavior (see the first sketch after this list).
- Replace any imports or usage of the Checkpoint Manager Pydantic v2 models with the v1 equivalents as indicated in the release notes.
- Update custom autoscaling policies in Ray Serve to use the new `AutoscalingContext` fields (`total_running_requests`, `total_queued_requests`, `total_num_requests`) and aggregation functions if needed (see the autoscaling sketch after this list).
- Adjust any code that relied on per-deployment autoscaling to the new application-level autoscaling model.
- Refresh documentation links and examples to reflect the new top-level `ray.train` aliases.
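A minimal sketch of pinning v1 behavior. This assumes the flag is read when `ray.train` is first imported; exporting it in the shell that launches the job is equivalent:

```python
import os

# Opt out of Ray Train v2. Set the flag before ray.train is imported,
# since it is assumed to be read at import time (exporting
# RAY_TRAIN_V2_ENABLED=0 in the launching shell also works).
os.environ["RAY_TRAIN_V2_ENABLED"] = "0"

import ray.train  # imported only after the flag is set
```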
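A hedged sketch of a custom autoscaling policy built on the new `AutoscalingContext` fields. The import path, the policy signature, and the return convention (a target replica count) are assumptions; only the three metric fields are named in these notes.

```python
from ray.serve.config import AutoscalingContext  # import path assumed


def scale_on_backlog(ctx: AutoscalingContext) -> int:
    """Application-level policy using the replica metrics added in 2.51."""
    in_flight = ctx.total_running_requests + ctx.total_queued_requests
    # Aim for roughly 10 in-flight requests per replica; returning a
    # target replica count is an assumed convention.
    return max(1, -(-in_flight // 10))  # ceiling division
```

How the policy is attached to an application is not shown here, since the registration hook is not named in these notes; consult the Serve autoscaling guide for the supported wiring.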
✨ New Features
- Ray Train v2 enabled by default, providing usability and stability improvements.
- Top-level `ray.train` aliases expose public APIs directly under the `ray.train` namespace.
- Application-level autoscaling in Ray Serve, allowing custom policies across all deployments in an application.
- Support for custom autoscaling aggregation functions (min, max, time-weighted average) in Ray Serve.
- Enhanced `AutoscalingContext` now includes replica-level metrics: `total_running_requests`, `total_queued_requests`, `total_num_requests`.
- Multiple task consumer deployments can run concurrently within a single Ray Serve application.
- Unity Catalog integration for Ray Data.
- New expression evaluator infrastructure in Ray Data for better query optimization.
- Write operations in Ray Data now support `SaveMode` (see the write sketch after this list).
- Approximate quantile aggregator added to Ray Data (see the aggregation sketch after this list).
- MCAP datasource support for robotics data in Ray Data.
- Callback-based statistic computation for preprocessors and `ValueCounter` in Ray Data.
- Support for multiple download URIs with improved error handling in Ray Data.
- Async inference telemetry added to Ray Serve.
- `AutoscalingContext` promoted to a public API with documentation.
- The `reconfigure` method now receives both `user_config` and `rank` parameters on replica rank changes (see the sketch after this list).
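A hedged sketch of a `SaveMode`-aware write. The enum's import path, its member names, and the `mode=` keyword are assumptions about the new API, not confirmed signatures:

```python
import ray
from ray.data import SaveMode  # import path and enum name assumed

ds = ray.data.range(100)

# The `mode=` keyword and the OVERWRITE member are illustrative
# assumptions; check the Ray Data write API reference for the
# supported spellings.
ds.write_parquet("/tmp/ray_out", mode=SaveMode.OVERWRITE)
```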
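A hedged sketch of the approximate quantile aggregation; the class name and its constructor arguments are assumptions:

```python
import ray
from ray.data.aggregate import ApproximateQuantile  # class name assumed

ds = ray.data.range(1_000)  # a single "id" column with values 0..999

# Approximate median and p99 of "id"; argument names are assumptions.
print(ds.aggregate(ApproximateQuantile(on="id", quantiles=[0.5, 0.99])))
```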
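A hedged sketch of the updated `reconfigure` hook; that `rank` arrives as a second positional parameter is an assumption based on the note above:

```python
from ray import serve


@serve.deployment(num_replicas=2, user_config={"threshold": 0.5})
class Scorer:
    def __init__(self):
        self.threshold = 0.5
        self.rank = None

    # Called on user_config updates and, as of 2.51, on replica rank
    # changes; receiving rank as a second parameter is assumed here.
    def reconfigure(self, user_config: dict, rank: int):
        self.threshold = user_config["threshold"]
        self.rank = rank

    def __call__(self, x: float) -> bool:
        return x > self.threshold
```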
🐛 Bug Fixes
- Fixed renamed columns being incorrectly dropped from output in Ray Data.
- Fixed projection pushdown handling of column renames in Ray Data.
- Fixed `vLLMEngineStage` field name inconsistency for images.
- Fixed driver hang during streaming generator block metadata retrieval.
- Fixed retry policy for hashâshuffle tasks.
- Fixed prefetch loop to avoid blocking on fetches.
- Fixed empty projection handling.
- Fixed errors when concatenating mixed pyarrow native and extension types.
- Fixed `ControllerError` triggered by `after_worker_group_poll_status` errors.
- Fixed `iter_torch_batches` usage of `ray.train.torch.get_device` outside of Train.
- Fixed exceptionâqueue race condition in `ThreadRunner`.
- Fixed max constructor retry count test for Windows environments.
- Stabilized streaming tests by adding synchronization to prevent chunk coalescing and rechunking.
- Deflaked autoscaling tests by fixing race conditions and removing a flaky min-aggregation scenario.
- Corrected a broken State API usage unit test.
- Reduced rank-related INFO logs to DEBUG level.
- Optimized controller logging by removing expensive debug logs.
🔧 Affected Symbols
- `ray.train`
- `ray.train.torch.get_device`
- `ray.train.torch.iter_torch_batches`
- `ray.train.ControllerError`
- `ray.train.ThreadRunner`
- `ray.train.JaxBackend.shutdown`
- `ray.train.TrainingFailedError`
- `ray.serve.AutoscalingContext`
- `ray.serve.reconfigure`
- `vLLMEngineStage`
- `CheckpointManager`
⚡ Deprecations
- Legacy XGBoost and LightGBM trainers now emit deprecation warnings.
- Calling `ray.train` methods from `ray.tune` triggers deprecation handling; update to use the dedicated Train APIs.