ollama v0.22.1-rc1
Summary
This release introduces model batching support and adds NVIDIA TensorRT Model Optimizer import capability. It also resolves bugs in tokenizer offset handling and in desktop app session handling.
✨ New Features
- Added model batching support.
- Added support for importing NVIDIA TensorRT Model Optimizer models to the mlx backend.
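The release notes do not show a dedicated batching API, so the following is a minimal sketch of one way batching is typically exercised: issuing several concurrent requests against the documented `/api/generate` endpoint and letting the server batch them. The model name, prompts, and worker count here are illustrative assumptions, not part of this release.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

OLLAMA_URL = "http://localhost:11434/api/generate"  # default ollama port


def build_request(model, prompt):
    # Fields follow the documented /api/generate schema;
    # stream=False requests a single JSON response per prompt.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(payload):
    # POST one generation request and return the response text.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def generate_batch(model, prompts, workers=4):
    # Submitting requests concurrently gives the server the chance
    # to batch decode steps across them.
    payloads = [build_request(model, p) for p in prompts]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(generate, payloads))
```

With a local server running, `generate_batch("llama3", ["hi", "hello"])` would fan the two prompts out in parallel; without a server, only the payload construction runs.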
🐛 Bug Fixes
- Fixed multi-regex BPE offset handling in the tokenizer.
- Fixed desktop app startup killing active `ollama launch` sessions.
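The offset fix above concerns a common pitfall in multi-pass pretokenization: when a second regex runs on a slice produced by a first regex, its match offsets are relative to the slice, not the original text. The sketch below illustrates that bug class and its fix in plain Python; the patterns and function names are illustrative, not ollama's actual tokenizer code.

```python
import re

# Pass 1 splits on whitespace; pass 2 splits each piece into
# letters, digits, or single other characters. Both patterns are
# hypothetical stand-ins for real BPE pretokenization regexes.
WHITESPACE = re.compile(r"\S+")
SUBWORD = re.compile(r"[A-Za-z]+|\d+|[^A-Za-z\d]")


def pretokenize(text):
    """Return (token, start, end) spans relative to the ORIGINAL text.

    The bug class: inner-match offsets are relative to the outer
    slice, so the slice's own start offset must be added back.
    """
    spans = []
    for outer in WHITESPACE.finditer(text):
        base = outer.start()  # offset of this slice within `text`
        for inner in SUBWORD.finditer(outer.group()):
            start = base + inner.start()  # re-base inner offsets
            end = base + inner.end()
            spans.append((text[start:end], start, end))
    return spans
```

Dropping the `base +` re-basing is exactly the kind of error that makes every token after the first whitespace run point at the wrong bytes.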