
PyTorch


Tensors and Dynamic neural networks in Python with strong GPU acceleration

Latest: v2.11.0 (8 releases, 8 breaking changes, 19 common errors)

Release History

v2.11.0 (breaking, 5 features)
Mar 23, 2026

PyTorch 2.11 introduces major highlights such as Differentiable Collectives and FlexAttention updates, and brings breaking changes: PyPI wheels now target CUDA 13.0, and the APIs for variable-length attention and hub loading have changed.

v2.10.0 (breaking, 14 features)
Jan 21, 2026

PyTorch 2.10 introduces Python 3.14 support for torch.compile, new features such as combo-kernel fusion and LocalTensor for distributed debugging, and removes several deprecated or legacy functionalities across the ONNX, DataLoader, and nn modules.

v2.9.1 (breaking, 12 fixes, 3 features)
Nov 12, 2025

This maintenance release addresses critical regressions in PyTorch 2.9.0, specifically fixing memory issues in 3D convolutions, Inductor compilation bugs for Gemma/vLLM, and various distributed and numeric stability fixes.

v2.9.0 (breaking, 1 fix, 7 features)
Oct 15, 2025

PyTorch 2.9.0 introduces Python 3.10 as the minimum requirement, defaults the ONNX exporter to the Dynamo-based pipeline, and adds support for symmetric memory and FlexAttention on new hardware.

v2.8.0 (breaking, 3 fixes, 10 features)
Aug 6, 2025

PyTorch 2.8.0 introduces high-performance quantized LLM inference on Intel CPUs, SYCL support for CPP extensions, and stricter validation for autograd and torch.compile. It includes significant breaking changes regarding CUDA architecture support and internal configuration renames.

v2.7.1 (breaking, 16 fixes, 3 features)
Jun 4, 2025

This maintenance release focuses on fixing regressions and silent correctness issues across torch.compile, Distributed, and FlexAttention, while also improving wheel sizes and platform-specific compatibility for macOS, Windows, and XPU.

v2.7.0 (breaking, 1 fix, 9 features)
Apr 23, 2025

PyTorch 2.7.0 introduces Blackwell support and FlexAttention optimizations while enforcing stricter C++ API visibility and Python limited API compliance. It marks a significant shift in ONNX and Export workflows by deprecating legacy capture methods in favor of the unified torch.export API.

v2.6.0 (breaking, 10 features)
Jan 29, 2025

PyTorch 2.6 introduces Python 3.13 support for torch.compile, FP16 support for X86 CPUs, and new AOTInductor packaging APIs. It includes a significant security change making torch.load use weights_only=True by default and deprecates the official Anaconda channel.
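The weights_only change can be seen in a minimal round-trip; the in-memory buffer here is just for illustration:

```python
import io
import torch

# Save a plain tensor state dict to an in-memory buffer (illustrative).
buf = io.BytesIO()
torch.save({"w": torch.ones(3)}, buf)
buf.seek(0)

# Since 2.6, torch.load defaults to weights_only=True, which restricts
# unpickling to tensors and other allowlisted types; loading arbitrary
# pickled objects now requires an explicit weights_only=False opt-in.
state = torch.load(buf)
print(state["w"].sum().item())  # 3.0
```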

Common Errors

TorchRuntimeError (12 reports)

TorchRuntimeError in PyTorch often arises from incorrect tensor datatypes within operations or when moving tensors between CPU and GPU without proper type casting (e.g., .float(), .long(), .cuda()). Resolve this by ensuring all tensors involved in an operation have compatible datatypes and are on the same device. Explicitly cast tensors and use .to(device) before the operation to guarantee correct datatype and device placement.
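A minimal sketch of the dtype-and-device fix described above (tensor names and shapes are illustrative):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

table = torch.randn(10, 4, device=device)   # lookup table on the target device
idx = torch.tensor([0.0, 3.0, 7.0])         # wrong dtype (float32) and on CPU

# Fix: cast to the integer dtype indexing expects, then move to the same device.
idx = idx.long().to(device)

rows = table[idx]                            # int64 index on matching device
print(rows.shape)                            # torch.Size([3, 4])
```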

OpCheckError (4 reports)

OpCheckError in PyTorch custom operator testing usually means the custom operator's output does not match the expected output calculated by NumPy, violating the OpInfo contract. Debug by carefully inspecting the custom operator's forward and backward implementations, paying close attention to data types, tensor shapes, and numerical precision to ensure they align with NumPy's behavior. If discrepancies are found, rectify the custom operator to produce outputs consistent with NumPy when given the same inputs.

OutOfMemoryError (3 reports)

OutOfMemoryError in PyTorch usually stems from allocating more GPU memory than available. Fix this by reducing batch size, model size, or sequence length, and explicitly release unused tensors with `del` and `torch.cuda.empty_cache()` to free up memory. Consider using gradient accumulation or mixed-precision training (e.g., with `torch.cuda.amp`) to further lower memory footprint.
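Gradient accumulation, one of the mitigations above, can be sketched as follows; the model and batch sizes are illustrative:

```python
import torch

model = torch.nn.Linear(512, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
accum_steps = 4  # four small micro-batches stand in for one large batch

opt.zero_grad()
for _ in range(accum_steps):
    x = torch.randn(8, 512)               # micro-batch of 8 instead of 32
    loss = model(x).sum() / accum_steps   # scale so grads match the big batch
    loss.backward()                       # grads accumulate in .grad each step
opt.step()

del x, loss                               # drop references to large tensors
if torch.cuda.is_available():
    torch.cuda.empty_cache()              # release cached blocks to the driver
```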

ProcessRaisedException (3 reports)

ProcessRaisedException in PyTorch often arises from issues within multiprocessing contexts, specifically related to CUDA device handling or argument mismatches during distributed operations or within TorchInductor. Ensure CUDA devices are correctly initialized and visible to all processes, and verify that all function/class calls within multiprocessing conform to the expected argument count and types as defined by PyTorch or TorchInductor APIs, paying special attention to distributed configurations.
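A worker signature that matches what `torch.multiprocessing.spawn` actually passes (the process rank first, then your `args`) avoids the argument-count mismatches described above; the worker name and arguments are illustrative:

```python
import torch.multiprocessing as mp

def worker(rank, scale):
    # spawn always passes the process rank as the first positional argument;
    # any CUDA initialization should happen here, inside the child process.
    print(f"rank {rank}: scale={scale}")

if __name__ == "__main__":
    # Launch nprocs children; a signature or argument-count mismatch here
    # surfaces in the parent as ProcessRaisedException.
    mp.spawn(worker, args=(2.0,), nprocs=2, join=True)
```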

RefResolutionError (2 reports)

RefResolutionError in PyTorch FX usually arises when tracing a module and the tracer encounters a symbol (e.g., a function or module) it cannot resolve within the FX graph's scope. To fix this, either make sure all necessary modules/functions are explicitly passed as arguments or attributes of the module being traced, or use `torch.fx.wrap` to expose external functions/modules to the tracer. Consider tracing at a higher level or restructuring your code to improve traceability if the problem persists.
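The `torch.fx.wrap` fix can be sketched as follows; the helper function and module are illustrative:

```python
import torch
import torch.fx

# Illustrative free function living outside the module being traced.
def scale(x, factor):
    return x * factor

# Tell the FX tracer to record calls to `scale` as opaque call_function
# nodes instead of attempting to trace into (and resolve) its body.
torch.fx.wrap("scale")

class MyModule(torch.nn.Module):
    def forward(self, x):
        return scale(x, 2.0)

gm = torch.fx.symbolic_trace(MyModule())
print("scale" in gm.code)  # True: the graph keeps scale(...) as a call
```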

NoValidChoicesError (2 reports)

NoValidChoicesError in PyTorch Inductor usually indicates that no viable backend implementation (e.g., GEMM, convolution) satisfies all constraints for a given operation, often due to unsupported data types, shapes, or hardware features. To fix this, either rewrite the operation using supported data types, shapes, or layouts, or investigate and potentially enable or implement a missing backend in Inductor that fulfills the requirements (this often requires understanding Inductor's code generation).
