Changelog

0.46.0

Breaking Changes
📦 bitsandbytes

2 breaking changes · 6 features · 8 bug fixes · 7 deprecations · 9 affected symbols

Summary

This release introduces significant improvements to `torch.compile` compatibility for both LLM.int8() and 4bit quantization, alongside a major refactoring onto PyTorch's custom operators APIs. Support for Python 3.8 and for PyTorch versions older than 2.2.0 has been dropped.

⚠️ Breaking Changes

  • Support for Python 3.8 has been dropped. Users must upgrade to Python 3.9 or newer.
  • Support for PyTorch versions older than 2.2.0 has been dropped. Users must upgrade to PyTorch >= 2.2.0.

Migration Steps

  1. Ensure your Python version is 3.9 or higher.
  2. Ensure your PyTorch version is 2.2.0 or higher. For best `torch.compile` support, PyTorch 2.6+ is recommended.
  3. Review usage of deprecated functions in `bnb.autograd` and `bnb.functional` namespaces and update code accordingly.
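Steps 1 and 2 can be enforced with a small runtime guard. The sketch below is illustrative only; `parse_version` and `check_environment` are hypothetical helper names, not bitsandbytes APIs:

```python
import sys

def parse_version(v: str) -> tuple:
    """Parse a version string like "2.2.0+cu121" into a comparable tuple."""
    base = v.split("+")[0]  # drop the local build suffix, e.g. "+cu121"
    return tuple(int(p) for p in base.split(".")[:3])

def check_environment(torch_version: str) -> None:
    """Raise if the interpreter or PyTorch predates the new minimums."""
    if sys.version_info < (3, 9):
        raise RuntimeError("bitsandbytes 0.46.0 requires Python >= 3.9")
    if parse_version(torch_version) < (2, 2, 0):
        raise RuntimeError("bitsandbytes 0.46.0 requires PyTorch >= 2.2.0")
```

Calling `check_environment(torch.__version__)` once at startup surfaces an unsupported environment immediately instead of via a later import error.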

✨ New Features

  • Added support for `torch.compile` without graph breaks for LLM.int8() (requires PyTorch 2.4+, PyTorch 2.6+ recommended). Experimental CPU support is included.
  • Added support for `torch.compile` without graph breaks for 4bit quantization (requires PyTorch 2.4+ for `fullgraph=False`, PyTorch 2.8 nightly for `fullgraph=True`).
  • Wheels are now published for CUDA Linux aarch64 (sbsa), targeting Turing generation (sm75) and newer.
  • Refactored library code to integrate better with PyTorch via the `torch.library` and custom ops APIs, enabling better `torch.compile` support.
  • Added simple op implementations for CPU.
  • Added autoloading for backend packages.

🐛 Bug Fixes

  • Fixed `torch.compile` issue for LLM.int8() when threshold=0.
  • Fixed missing CPU library issue.
  • Fixed a compatibility issue with PyTorch versions <= 2.4.
  • Fixed typo in `__getitem__`.
  • Improved CUDA version detection and error handling.
  • Fixed Intel CPU/XPU installation issues.
  • Fixed optimizer backwards compatibility.
  • Fixed issue related to `int8_mm_dequant` being moved from CPU to default backend.
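The CUDA version detection improvement amounts to parsing version strings robustly. The helper below is a simplified, hypothetical stand-in in the spirit of `get_cuda_version_tuple`; the real implementation may differ:

```python
def cuda_version_tuple(version: str) -> tuple:
    """Parse a CUDA version string such as "12.1" into (12, 1).

    Simplified stand-in: tolerates a missing minor component ("11")
    and trailing non-numeric suffixes rather than crashing on them.
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 and parts[1].isdigit() else 0
    return major, minor
```

Returning a tuple rather than comparing raw strings avoids bugs like `"9.0" > "12.1"` in lexicographic comparison.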

🔧 Affected Symbols

  • `bnb.autograd.get_inverse_transform_indices`
  • `bnb.autograd.undo_layout`
  • `bnb.functional.create_quantile_map`
  • `bnb.functional.estimate_quantiles`
  • `bnb.functional.get_colrow_absmax`
  • `bnb.functional.get_row_absmax`
  • `bnb.functional.histogram_scatter_add_2d`
  • `torch.compile`
  • `get_cuda_version_tuple`

⚡ Deprecations

  • `bnb.autograd.get_inverse_transform_indices()` is deprecated.
  • `bnb.autograd.undo_layout()` is deprecated.
  • `bnb.functional.create_quantile_map()` is deprecated.
  • `bnb.functional.estimate_quantiles()` is deprecated.
  • `bnb.functional.get_colrow_absmax()` is deprecated.
  • `bnb.functional.get_row_absmax()` is deprecated.
  • `bnb.functional.histogram_scatter_add_2d()` is deprecated.
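One way to audit a codebase for calls to these deprecated helpers is to escalate `DeprecationWarning` to an error while running tests. The `run_strict` wrapper below is an illustrative stdlib-only helper, not part of bitsandbytes:

```python
import warnings

def run_strict(fn, *args, **kwargs):
    """Call fn with DeprecationWarning escalated to an error, so any use
    of a deprecated helper fails loudly instead of warning silently."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return fn(*args, **kwargs)
```

Running a test suite under such a filter (or pytest's `-W error::DeprecationWarning`) makes every remaining call site easy to locate before the functions are removed.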