v0.22.1
📦 ollama · View on GitHub →
✨ 2 features · 🐛 2 fixes · 🔧 4 symbols
Summary
This release introduces model batching support and NVIDIA TensorRT Model Optimizer import for the mlx backend. It also fixes a tokenizer offset bug and a desktop application startup issue.
✨ New Features
- Added model batching support.
- Added support for importing NVIDIA TensorRT Model Optimizer models to the mlx backend.
🐛 Bug Fixes
- Fixed multi-regex BPE offset handling in the tokenizer.
- Fixed desktop app startup killing active `ollama launch` sessions.
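The tokenizer fix above concerns offset tracking when more than one pre-tokenization regex splits the same text. A minimal sketch of the kind of bookkeeping involved (hypothetical patterns and function names, not Ollama's actual implementation): when a second regex rescans a piece produced by the first, its piece-relative offsets must be rebased onto the original string.

```python
import re

# Hypothetical two-stage pre-tokenization, loosely in the style of BPE
# pre-tokenizers (not Ollama's actual patterns): a coarse split into
# non-space runs, then a finer split of each run into word/punct pieces.
COARSE = re.compile(r"\S+")
FINE = re.compile(r"\w+|[^\w\s]")

def pretokenize(text: str):
    """Return (piece, start, end) spans with offsets into the ORIGINAL text.

    The offset bug class arises when a second regex rescans a piece and
    its piece-relative offsets are used as-is; adding the enclosing
    match's start keeps every span absolute.
    """
    spans = []
    for chunk in COARSE.finditer(text):
        base = chunk.start()  # absolute position of this chunk in `text`
        for m in FINE.finditer(chunk.group()):
            # m.start()/m.end() are relative to the chunk: rebase them.
            spans.append((m.group(), base + m.start(), base + m.end()))
    return spans

text = "Hi, y'all!"
for piece, start, end in pretokenize(text):
    assert text[start:end] == piece  # every span maps back to the source
```

Dropping the `base +` rebasing is exactly the failure mode: pieces from the second pass would report offsets into their chunk rather than into the input string.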