v0.22.0-rc1
📦 ollama
✨ 1 feature · 🐛 1 fix · 🔧 2 symbols
Summary
This release adds support for NVIDIA TensorRT Model Optimizer import in mlx and fixes multi-regex BPE offset handling in the tokenizer. It also improves performance by batching the sampler across multiple sequences in mlxrunner.
✨ New Features
- mlx: Support NVIDIA TensorRT Model Optimizer import.
🐛 Bug Fixes
- tokenizer: Fix multi-regex BPE offset handling.