📦 bitsandbytes 0.46.0
⚠ 2 breaking · ✨ 6 features · 🐛 8 fixes · ⚡ 7 deprecations · 🔧 9 symbols
Summary
This release introduces significant improvements for `torch.compile` compatibility with both LLM.int8() and 4bit quantization, alongside a major refactoring to integrate with PyTorch Custom Operators. Support for Python 3.8 and older PyTorch versions has been dropped.
⚠️ Breaking Changes
- Support for Python 3.8 has been dropped. Users must upgrade to Python 3.9 or newer.
- Support for PyTorch versions older than 2.2.0 has been dropped. Users must upgrade to PyTorch >= 2.2.0.
Migration Steps
- Ensure your Python version is 3.9 or higher.
- Ensure your PyTorch version is 2.2.0 or higher; for best `torch.compile` support, PyTorch 2.6+ is recommended. A minimal version check is sketched after this list.
- Review usage of deprecated functions in `bnb.autograd` and `bnb.functional` namespaces and update code accordingly.
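A minimal sketch of the environment check described above (not part of bitsandbytes itself); it assumes the third-party `packaging` package is available for version parsing:

```python
import sys

import torch
from packaging.version import Version

# Version floors for bitsandbytes 0.46.0, per the migration steps above.
if sys.version_info < (3, 9):
    raise RuntimeError(f"Python 3.9+ is required, found {sys.version.split()[0]}")

torch_version = Version(torch.__version__.split("+")[0])  # strip local tags like "+cu121"
if torch_version < Version("2.2.0"):
    raise RuntimeError(f"PyTorch >= 2.2.0 is required, found {torch.__version__}")
if torch_version < Version("2.6.0"):
    print("Note: PyTorch 2.6+ is recommended for the best torch.compile support.")
```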
✨ New Features
- Added support for `torch.compile` without graph breaks for LLM.int8() (requires PyTorch 2.4+, PyTorch 2.6+ recommended). Experimental CPU support is included.
- Added support for `torch.compile` without graph breaks for 4bit quantization (requires PyTorch 2.4+ for `fullgraph=False`, PyTorch 2.8 nightly for `fullgraph=True`); see the usage sketch after this list.
- Wheels are now published for CUDA Linux aarch64 (sbsa), targeting Turing generation (sm75) and newer.
- Refactored library code to integrate with PyTorch through the `torch.library` and custom ops APIs, improving `torch.compile` support.
- Added simple op implementations for CPU.
- Added autoloading for backend packages.
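As a rough illustration of the 4bit + `torch.compile` support, here is a minimal sketch; it assumes a CUDA GPU, this release installed, and arbitrary layer sizes and dtypes chosen for the example:

```python
import torch
import bitsandbytes as bnb

# A single 4-bit (NF4) linear layer; sizes and compute dtype are arbitrary.
layer = bnb.nn.Linear4bit(512, 512, bias=False,
                          compute_dtype=torch.bfloat16, quant_type="nf4")
layer = layer.cuda()  # weights are quantized when moved to the GPU

# fullgraph=False needs PyTorch 2.4+; fullgraph=True needs a 2.8 nightly (see above).
compiled = torch.compile(layer, fullgraph=False)

x = torch.randn(4, 512, dtype=torch.bfloat16, device="cuda")
with torch.no_grad():
    out = compiled(x)
print(out.shape)  # torch.Size([4, 512])
```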
🐛 Bug Fixes
- Fixed a `torch.compile` failure for LLM.int8() when `threshold=0` (see the sketch after this list).
- Fixed an issue with a missing CPU library.
- Fixed a compatibility issue with PyTorch versions <= 2.4.
- Fixed a typo in `__getitem__`.
- Improved CUDA version detection and error handling.
- Fixed Intel CPU/XPU installation issues.
- Fixed optimizer backwards compatibility.
- Fixed an issue related to `int8_mm_dequant` being moved from the CPU backend to the default backend.
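For reference, a minimal sketch of the LLM.int8() `threshold=0` case that is now expected to compile cleanly; it assumes a CUDA GPU and uses arbitrary sizes:

```python
import torch
import bitsandbytes as bnb

# threshold=0.0 disables the fp16 outlier decomposition, i.e. the pure int8 path
# that previously failed under torch.compile.
layer = bnb.nn.Linear8bitLt(256, 256, bias=False,
                            has_fp16_weights=False, threshold=0.0)
layer = layer.cuda()  # int8 quantization happens when the weights move to the GPU

compiled = torch.compile(layer)

x = torch.randn(8, 256, dtype=torch.float16, device="cuda")
with torch.no_grad():
    out = compiled(x)
```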
🔧 Affected Symbols
`bnb.autograd.get_inverse_transform_indices`, `bnb.autograd.undo_layout`, `bnb.functional.create_quantile_map`, `bnb.functional.estimate_quantiles`, `bnb.functional.get_colrow_absmax`, `bnb.functional.get_row_absmax`, `bnb.functional.histogram_scatter_add_2d`, `torch.compile`, `get_cuda_version_tuple`
⚡ Deprecations
- `bnb.autograd.get_inverse_transform_indices()` is deprecated.
- `bnb.autograd.undo_layout()` is deprecated.
- `bnb.functional.create_quantile_map()` is deprecated.
- `bnb.functional.estimate_quantiles()` is deprecated.
- `bnb.functional.get_colrow_absmax()` is deprecated.
- `bnb.functional.get_row_absmax()` is deprecated.
- `bnb.functional.histogram_scatter_add_2d()` is deprecated.