ray-2.51.0
⚠️ 2 breaking · ✨ 16 features · 🐛 17 fixes · ⚡ 2 deprecations · 🔧 11 symbols
Summary
This release introduces Ray Train v2 as the default, adds application-level autoscaling to Ray Serve, and brings numerous new features and fixes to Ray Data, Train, Tune, and Serve.
⚠️ Breaking Changes
- Ray Train v2 is enabled by default; code that relies on Ray Train v1 behavior may break. Disable v2 with the environment variable `RAY_TRAIN_V2_ENABLED=0` or migrate to the v2 APIs.
- Checkpoint Manager Pydantic API was reverted from v2 to v1; any user code importing or extending the v2 models must be updated to the v1 equivalents.
Migration Steps
- Review the Ray Train v2 migration guide (https://github.com/ray-project/ray/issues/49454) and update code to use the v2 APIs, or set `RAY_TRAIN_V2_ENABLED=0` to retain v1 behavior (see the first sketch after this list).
- Replace any imports or usage of the Checkpoint Manager Pydantic v2 models with the v1 equivalents as indicated in the release notes.
- Update custom autoscaling policies in Ray Serve to use the new `AutoscalingContext` fields (`total_running_requests`, `total_queued_requests`, `total_num_requests`) and aggregation functions if needed (see the autoscaling sketch after this list).
- Adjust any code that relied on per-deployment autoscaling to the new application-level autoscaling model.
- Refresh documentation links and examples to reflect the new top-level `ray.train` aliases.
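A minimal sketch of pinning v1 behavior. This assumes the flag is read when `ray.train` is first imported; exporting it in the shell that launches the job is equivalent:

```python
import os

# Opt out of Ray Train v2. Set the flag before ray.train is imported,
# since it is assumed to be read at import time (exporting
# RAY_TRAIN_V2_ENABLED=0 in the launching shell also works).
os.environ["RAY_TRAIN_V2_ENABLED"] = "0"

import ray.train  # imported only after the flag is set
```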
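A hedged sketch of a custom autoscaling policy built on the new `AutoscalingContext` fields. The import path, the policy signature, and the return convention (a target replica count) are assumptions; only the three metric fields are named in these notes.

```python
from ray.serve.config import AutoscalingContext  # import path assumed


def scale_on_backlog(ctx: AutoscalingContext) -> int:
    """Application-level policy using the replica metrics added in 2.51."""
    in_flight = ctx.total_running_requests + ctx.total_queued_requests
    # Aim for roughly 10 in-flight requests per replica; returning a
    # target replica count is an assumed convention.
    return max(1, -(-in_flight // 10))  # ceiling division
```

How the policy is attached to an application is not shown here, since the registration hook is not named in these notes; consult the Serve autoscaling guide for the supported wiring.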
✨ New Features
- Ray Train v2 enabled by default, providing usability and stability improvements.
- Top-level `ray.train` aliases expose public APIs directly under the `ray.train` namespace.
- Application-level autoscaling in Ray Serve, allowing custom policies across all deployments in an application.
- Support for custom autoscaling aggregation functions (min, max, time-weighted average) in Ray Serve.
- Enhanced `AutoscalingContext` now includes replica-level metrics: `total_running_requests`, `total_queued_requests`, `total_num_requests`.
- Multiple task consumer deployments can run concurrently within a single Ray Serve application.
- Unity Catalog integration for Ray Data.
- New expression evaluator infrastructure in Ray Data for better query optimization.
- Write operations in Ray Data now support `SaveMode` (see the write sketch after this list).
- Approximate quantile aggregator added to Ray Data (see the aggregation sketch after this list).
- MCAP datasource support for robotics data in Ray Data.
- Callback-based statistic computation for preprocessors and `ValueCounter` in Ray Data.
- Support for multiple download URIs with improved error handling in Ray Data.
- Async inference telemetry added to Ray Serve.
- `AutoscalingContext` promoted to a public API with documentation.
- The `reconfigure` method now receives both `user_config` and `rank` parameters on replica rank changes (see the sketch after this list).
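A hedged sketch of a `SaveMode`-aware write. The enum's import path, its member names, and the `mode=` keyword are assumptions about the new API, not confirmed signatures:

```python
import ray
from ray.data import SaveMode  # import path and enum name assumed

ds = ray.data.range(100)

# The `mode=` keyword and the OVERWRITE member are illustrative
# assumptions; check the Ray Data write API reference for the
# supported spellings.
ds.write_parquet("/tmp/ray_out", mode=SaveMode.OVERWRITE)
```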
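A hedged sketch of the approximate quantile aggregation; the class name and its constructor arguments are assumptions:

```python
import ray
from ray.data.aggregate import ApproximateQuantile  # class name assumed

ds = ray.data.range(1_000)  # a single "id" column with values 0..999

# Approximate median and p99 of "id"; argument names are assumptions.
print(ds.aggregate(ApproximateQuantile(on="id", quantiles=[0.5, 0.99])))
```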
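A hedged sketch of the updated `reconfigure` hook; that `rank` arrives as a second positional parameter is an assumption based on the note above:

```python
from ray import serve


@serve.deployment(num_replicas=2, user_config={"threshold": 0.5})
class Scorer:
    def __init__(self):
        self.threshold = 0.5
        self.rank = None

    # Called on user_config updates and, as of 2.51, on replica rank
    # changes; receiving rank as a second parameter is assumed here.
    def reconfigure(self, user_config: dict, rank: int):
        self.threshold = user_config["threshold"]
        self.rank = rank

    def __call__(self, x: float) -> bool:
        return x > self.threshold
```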
🐛 Bug Fixes
- Fixed renamed columns being incorrectly dropped from output in Ray Data.
- Fixed projection pushdown handling of column renames in Ray Data.
- Fixed `vLLMEngineStage` field name inconsistency for images.
- Fixed driver hang during streaming generator block metadata retrieval.
- Fixed retry policy for hashâshuffle tasks.
- Fixed prefetch loop to avoid blocking on fetches.
- Fixed empty projection handling.
- Fixed errors when concatenating mixed pyarrow native and extension types.
- Fixed `ControllerError` triggered by `after_worker_group_poll_status` errors.
- Fixed `iter_torch_batches` usage of `ray.train.torch.get_device` outside of Train.
- Fixed exceptionâqueue race condition in `ThreadRunner`.
- Fixed max constructor retry count test for Windows environments.
- Stabilized streaming tests by adding synchronization to prevent chunk coalescing and rechunking.
- Deflaked autoscaling tests by fixing race conditions and removing a flaky min-aggregation scenario.
- Corrected a broken State API usage unit test.
- Reduced rank-related INFO logs to DEBUG level.
- Optimized controller logging by removing expensive debug logs.
🔧 Affected Symbols
- `ray.train`
- `ray.train.torch.get_device`
- `ray.train.torch.iter_torch_batches`
- `ray.train.ControllerError`
- `ray.train.ThreadRunner`
- `ray.train.JaxBackend.shutdown`
- `ray.train.TrainingFailedError`
- `ray.serve.AutoscalingContext`
- `ray.serve.reconfigure`
- `vLLMEngineStage`
- `CheckpointManager`
⚡ Deprecations
- Legacy XGBoost and LightGBM trainers now emit deprecation warnings.
- Calling `ray.train` methods from `ray.tune` triggers deprecation handling; update to use the dedicated Train APIs.