ray-2.45.0

📦 ray
13 features · 🐛 21 fixes · 🔧 29 symbols

Summary

Ray 2.45 adds a configurable object store fallback directory, new cgraph transport options, a ClickHouse sink, and several Dataset API enhancements, along with numerous stability fixes and an upgraded LightGBM version.

✨ New Features

  • The object store fallback directory is now configurable via a new setting.
  • cgraph now supports `with_tensor_transport(transport='shm')`.
  • cgraph adds reduce-scatter and all-gather collective operations for the GPU communicator in compiled graphs.
  • Ray Data adds a ClickHouse sink via `Dataset.write_clickhouse()`.
  • `Dataset.groupby().map_groups()` in Ray Data now accepts `ray_remote_args_fn` for per-group runtime env and resource hints.
  • `Dataset.name` and `set_name` are exposed as public API for lineage tracking.
  • `Dataset.flat_map()` now supports async callable classes.
  • Introduced a `Ruleset` abstraction for rule-based query optimisation in Ray Data.
  • Added conversion from a Daft DataFrame to a Ray Dataset.
  • Improved reading of line-delimited JSON (JSONL) files in `read_json()`.
  • `Dataset.export_metadata()` provides schema and statistics snapshots.
  • Ray Train folds the `v2.LightGBMTrainer` API into the public trainer class as an alternate constructor.
  • LightGBM library upgraded to version 4.6.0 in Ray Train.
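
For the JSONL item above, it may help to see what "line-delimited JSON" means in practice: one complete JSON object per line. The sketch below uses only the Python standard library and does not call Ray; `write_jsonl` and `read_jsonl` are hypothetical helper names introduced for illustration, not part of Ray's API.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(path, rows):
    """Hypothetical helper: write one JSON object per line (the JSONL layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Hypothetical helper: parse a JSONL file line by line, skipping blank lines."""
    parsed_rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                parsed_rows.append(json.loads(line))
    return parsed_rows


rows = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]
path = Path(tempfile.mkdtemp()) / "sample.jsonl"
write_jsonl(path, rows)
parsed = read_jsonl(path)
print(parsed)

# With Ray installed, a file like this would be passed to the reader named
# in the release notes, for example:
#   import ray
#   ds = ray.data.read_json(str(path))
```

Each line parses independently, which is what lets a reader split and process such a file without loading it whole.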

🐛 Bug Fixes

  • `KillActor` RPC with `force_kill=True` now correctly kills threaded actors.
  • Autoscaler no longer removes idle nodes that are reserved for upcoming placement groups.
  • Threaded actors no longer get stuck when receiving two exit signals.
  • Fixed an illegal memory access in cgraph when used with pipeline parallelism.
  • Resubmitted actor tasks no longer hang indefinitely.
  • Placement group creation process no longer interleaves due to node failure.
  • Task events are now flushed in `CoreWorker::Shutdown` rather than in `CoreWorker::Disconnect`.
  • `MapTransformFn.__eq__` equality check corrected.
  • Unresolved wildcard paths are persisted in `FileBasedDataSource`.
  • Hugging Face dynamic‑module loading repaired on workers.
  • HTTP URLs are no longer expanded by `_expand_paths`.
  • Databricks host‑URL parsing fixed in Delta datasource.
  • `Dataset.random_sample()` reproducibility restored.
  • `RandomAccessDataset.multiget()` return values corrected.
  • The executor is now shut down after a schema fetch to avoid leaking actors.
  • Streaming shutdown regression repaired.
  • ResourceManager now honours minimum resource reservation.
  • `OutputSplitter._locality_hints` separated from `actor_locality_enabled` and `locality_with_output`.
  • Print redirection now handles newlines correctly.
  • `RunAttempt` workers are now marked as dead after completion to avoid stale state.
  • `setup_wandb` `rank_zero_only` logic fixed.

🔧 Affected Symbols

  • `ObjectStoreFallbackDirectory`
  • `cgraph.with_tensor_transport`
  • `cgraph.reduce_scatter`
  • `cgraph.all_gather`
  • `KillActor`
  • `CoreWorker::Shutdown`
  • `CoreWorker::Disconnect`
  • Autoscaler idle node removal logic
  • ThreadedActor exit handling
  • `MapTransformFn.__eq__`
  • `FileBasedDataSource`
  • HuggingFace dynamic module loader
  • `_expand_paths`
  • Delta datasource host URL parser
  • `Dataset.random_sample`
  • `RandomAccessDataset.multiget`
  • `ResourceManager`
  • `Dataset.write_clickhouse`
  • `Dataset.groupby().map_groups`
  • `Dataset.name`
  • `Dataset.set_name`
  • `Dataset.flat_map`
  • `Ruleset`
  • Daft DataFrame conversion
  • `Dataset.export_metadata`
  • `LightGBMTrainer`
  • `OutputSplitter._locality_hints`
  • `RunAttempt`
  • `setup_wandb`