v2.10.0
📦 pytorch · Breaking Changes
⚠ 6 breaking · ✨ 14 features · ⚡ 4 deprecations · 🔧 14 symbols
Summary
PyTorch 2.10 introduces Python 3.14 support for torch.compile, new features such as combo-kernels fusion and LocalTensor for distributed debugging, and removes several deprecated or legacy functionalities across the ONNX exporter, DataLoader, and nn modules.
⚠️ Breaking Changes
- Removed the unused `data_source` argument from `torch.utils.data.Sampler`. If your custom sampler passes this argument, update it.
- Removed deprecated imports from `torch.utils.data.datapipes.iter.grouping`. Import `SHARDING_PRIORITIES` and `ShardingFilterIterDataPipe` from `torch.utils.data.datapipes.iter.sharding` instead.
- Removed Nested Jagged Tensor support from `nn.attention.flex_attention`.
- `fallback=False` is now the default in `torch.onnx.export`. To preserve the 2.9 behavior, set `fallback=True` explicitly in the `torch.onnx.export` call.
- The ONNX exporter now uses the `dynamo=True` path by default, without falling back to the legacy exporter. This is the recommended usage.
- Renamed `pytorch-triton` package to `triton`.
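The `Sampler` change above can be sketched as follows. To keep the sketch self-contained, a minimal stand-in base class is used in place of `torch.utils.data.Sampler`; the real base class lives in `torch.utils.data`, but the before/after shape of the subclass is the same:

```python
from typing import Iterator, Sized

# Stand-in for torch.utils.data.Sampler so this sketch runs without torch.
class Sampler:
    def __iter__(self) -> Iterator[int]:
        raise NotImplementedError

# Before (2.9): custom samplers often forwarded data_source to the base
# class even though it was unused:
#
#     class SequentialSampler(Sampler):
#         def __init__(self, data_source):
#             super().__init__(data_source)   # breaks in 2.10
#             self.data_source = data_source

# After (2.10): keep the dataset yourself; do not pass it to Sampler.
class SequentialSampler(Sampler):
    def __init__(self, data_source: Sized) -> None:
        super().__init__()  # no data_source argument
        self.data_source = data_source

    def __iter__(self) -> Iterator[int]:
        return iter(range(len(self.data_source)))

    def __len__(self) -> int:
        return len(self.data_source)

print(list(SequentialSampler(["a", "b", "c"])))  # [0, 1, 2]
```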
Migration Steps
- If your custom sampler passes the `data_source` argument to `Sampler.__init__`, remove that usage.
- Update imports for grouping datapipes: change `from torch.utils.data.datapipes.iter.grouping import ...` to import from `torch.utils.data.datapipes.iter.sharding`.
- If using `torch.onnx.export`, replace usage of `dynamic_axes` with the `dynamic_shapes` argument.
- If using `torch.profiler.export_memory_timeline`, migrate to using `torch.cuda.memory._record_memory_history` and `torch.cuda.memory._export_memory_snapshot`.
- If relying on implicit device mesh slicing behavior that generated a warning, update code to explicitly manage flattened mesh bookkeeping.
- Replace usage of `torch.jit` APIs with `torch.compile` or `torch.export`.
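The `dynamic_axes` → `dynamic_shapes` step can be roughly illustrated with plain dicts (no PyTorch required here; in a real call these are passed as keyword arguments to `torch.onnx.export`, and axis names may also be `torch.export.Dim` objects rather than the strings shown, so check the exporter docs for the exact accepted forms). The `axes_to_shapes` helper below is hypothetical, used only to show the mapping:

```python
# Old style (deprecated): dynamic_axes maps input *and* output names to
# {axis_index: axis_name} dicts.
dynamic_axes = {
    "input_ids": {0: "batch", 1: "seq_len"},
    "logits": {0: "batch", 1: "seq_len"},
}

# New style: dynamic_shapes describes only the model's *inputs*; output
# shapes are inferred by the exporter.
dynamic_shapes = {
    "input_ids": {0: "batch", 1: "seq_len"},
}

# Hypothetical helper translating old-style entries: keep only the
# entries that name model inputs.
def axes_to_shapes(dynamic_axes, input_names):
    return {name: axes for name, axes in dynamic_axes.items()
            if name in input_names}

assert axes_to_shapes(dynamic_axes, {"input_ids"}) == dynamic_shapes
```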
✨ New Features
- Python 3.14 support for `torch.compile()`, including experimental support for Python 3.14t (the free-threaded build).
- Reduced kernel launch overhead via combo-kernels horizontal fusion in TorchInductor.
- New `varlen_attn()` op providing support for ragged and packed sequences.
- Efficient eigenvalue decompositions with `DnXgeev`.
- `torch.compile()` now respects `use_deterministic_mode`.
- Introduced `DebugMode` for tracking dispatched calls and debugging numerical divergence.
- Allow setting `grad_dtype` on leaf tensors in Autograd.
- Added a default Autograd fallback for the PrivateUse1 backend.
- Added API to annotate disjoint backward for use with `torch.utils.checkpoint.checkpoint`.
- Added `ComplexTensor` subclass to Complex Frontend.
- Support autograd in `torch.cond`.
- BFloat16 support added to cuDNN RNN.
- Upgraded cuDNN to frontend version 1.16.1.
- Introduction of `LocalTensor` for debugging and simulation of distributed tensor computations on a single process, enabled via `LocalTensorMode` context manager.
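The `LocalTensor` idea above (one object carrying a separate value per simulated rank, so multi-rank code can be exercised in a single process) can be illustrated with a toy stand-in. None of the names below are the real `torch.distributed._local_tensor` API, which operates on tensors inside a `LocalTensorMode` context:

```python
# Toy stand-in: one object holds a value per simulated rank, elementwise
# ops apply per rank, and a collective collapses ranks to a shared result.
class ToyLocalValue:
    def __init__(self, per_rank):
        self.per_rank = dict(per_rank)  # rank -> value

    def __add__(self, other):
        return ToyLocalValue(
            {r: v + other.per_rank[r] for r, v in self.per_rank.items()}
        )

    def all_reduce_sum(self):
        # Simulates an all-reduce: every rank ends up with the global sum.
        total = sum(self.per_rank.values())
        return ToyLocalValue({r: total for r in self.per_rank})

# Two simulated ranks, one process:
x = ToyLocalValue({0: 1.0, 1: 2.0})
y = ToyLocalValue({0: 10.0, 1: 20.0})
z = (x + y).all_reduce_sum()
print(z.per_rank)  # {0: 33.0, 1: 33.0}
```

Debugging distributed numerics this way avoids spawning real processes: per-rank divergence is visible directly in `per_rank`.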
Affected Symbols
`torch.compile`, `torch.utils.data.datapipes.iter.grouping`, `torch.utils.data.datapipes.iter.sharding`, `nn.attention.flex_attention`, `torch.onnx.export`, `torch.distributed.device_mesh`, `torch.jit`, `torch.profiler.export_memory_timeline`, `torch.cuda.memory._record_memory_history`, `torch.cuda.memory._export_memory_snapshot`, `torch.cond`, `torch.distributed._local_tensor`, `LocalTensorMode`, `@maybe_run_for_local_tensor`
⚡ Deprecations
- `DeviceMesh` now warns when slicing a flattened dim from the root mesh (see `_get_slice_mesh_layout`). Users should explicitly manage the bookkeeping of flattened meshes.
- `torch.jit` is not guaranteed to work on Python 3.14, and deprecation warnings have been added to its user-facing APIs. Replace usage with `torch.compile` or `torch.export`.
- The `dynamic_axes` option in `torch.onnx.export` is deprecated. Users should supply the `dynamic_shapes` argument instead.
- The `export_memory_timeline` method in `torch.profiler` is deprecated. Use the newer memory snapshot API (`torch.cuda.memory._record_memory_history` and `torch.cuda.memory._export_memory_snapshot`) instead.