v0.18.2-rc1

📅 Mar 18, 2026  📦 ollama

✨ 3 features  🐛 1 fix  🔧 4 symbols

Summary

This release candidate adds three enhancements to the MLX backend: model eviction, prequantized tensor packing for qwen35 support, and quantized embeddings with fast SwiGLU. It also fixes the web_search legacy path in the cloud proxy, which now flushes on newlines.

✨ New Features

Implemented model eviction for MLX.
Added prequantized tensor packing and related changes for qwen35 support in MLX.
Implemented quantized embeddings and fast SwiGLU, along with runtime fixes for MLX.
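
For readers unfamiliar with SwiGLU, the following is a minimal sketch of the math only, written in plain Python. It is not the MLX code from this release: the function names `silu` and `swiglu` are illustrative, and a "fast" implementation would presumably fuse these steps on the accelerator rather than loop over elements.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation, x * sigmoid(x), in a numerically stable form."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    # For negative x, exp(x) cannot overflow.
    e = math.exp(x)
    return x * e / (1.0 + e)


def swiglu(gate: list[float], up: list[float]) -> list[float]:
    """Elementwise SwiGLU gating: silu(gate) * up.

    `gate` and `up` stand for the outputs of the two input projections
    of a gated feed-forward layer (illustrative names, not from the release).
    """
    if len(gate) != len(up):
        raise ValueError("gate and up must have the same length")
    return [silu(g) * u for g, u in zip(gate, up)]
```

A zero gate value suppresses the corresponding `up` value entirely, while large positive gate values pass it through nearly unchanged.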

🐛 Bug Fixes

Cloud proxy now flushes on newlines for the web_search legacy path.
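
To illustrate what flushing on newlines means for a streaming proxy, here is a small Python sketch. It is an assumption-laden illustration, not the project's code: `forward_with_newline_flush` and `RecordingSink` are hypothetical names, and the real proxy's buffering details are not described in these notes. The idea is that each complete line of a newline-delimited stream reaches the client as soon as it is written, instead of waiting in a buffer.

```python
from typing import Iterable


class RecordingSink:
    """Hypothetical in-memory sink that records what each flush delivers."""

    def __init__(self) -> None:
        self.buffer = bytearray()
        self.flushed: list[bytes] = []  # one entry per non-empty flush

    def write(self, data: bytes) -> int:
        self.buffer.extend(data)
        return len(data)

    def flush(self) -> None:
        if self.buffer:
            self.flushed.append(bytes(self.buffer))
            self.buffer.clear()


def forward_with_newline_flush(chunks: Iterable[bytes], sink) -> None:
    """Copy chunks to sink, flushing whenever a chunk contains a newline."""
    for chunk in chunks:
        sink.write(chunk)
        if b"\n" in chunk:
            sink.flush()
    # Deliver any trailing data that did not end with a newline.
    sink.flush()
```

With upstream chunks that split a line across writes, the sink receives whole lines at each flush, plus any unterminated tail at the end of the stream.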

Affected Symbols

sched, mlx, cloud_proxy, web_search