
v0.18.2-rc1

📦 ollama
3 features · 🐛 1 fix · 🔧 4 symbols

Summary

This release adds model eviction, quantized embeddings, and fast SwiGLU to the MLX backend, along with prequantized tensor packing for qwen35 support. It also fixes the web_search legacy path in the cloud proxy, which now flushes on newlines.

✨ New Features

  • Model eviction implemented for MLX.
  • Added prequantized tensor packing and related changes for qwen35 support in MLX.
  • Implemented quantized embeddings and fast SwiGLU, along with runtime fixes for MLX.
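
For readers unfamiliar with the term, SwiGLU is the gated activation used in many transformer feed-forward layers: the SiLU of a gate projection multiplied element-wise by an up projection. The sketch below is a plain, unfused reference in Python, not the MLX implementation from this release; the function names and the list-based inputs are assumptions chosen for illustration. A "fast" variant would typically compute the same result in a single fused kernel.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu(gate: list[float], up: list[float]) -> list[float]:
    """Reference SwiGLU: silu(gate) * up, element-wise.

    `gate` and `up` stand in for the outputs of the gate and up
    projections; both names are hypothetical and used only here.
    """
    if len(gate) != len(up):
        raise ValueError("gate and up must have the same length")
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    print(swiglu([0.0, 1.0], [2.0, 3.0]))
```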

🐛 Bug Fixes

  • Cloud proxy now flushes on newlines for the web_search legacy path.
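
One plausible reading of "flushes on newlines" is that the proxy forwards buffered output each time a complete newline-terminated line is available, so that line-delimited streaming responses reach the client promptly. The sketch below illustrates that general idea in Python under that assumption; it is not the Ollama code, and `flush_on_newlines` is a hypothetical name.

```python
from typing import Iterable, Iterator


def flush_on_newlines(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Re-chunk a byte stream so each yielded piece ends at a newline.

    Hypothetical helper: incoming chunks may split a line anywhere, so
    bytes are buffered until a newline arrives, then emitted at once.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete line currently held in the buffer.
        while (idx := buffer.find(b"\n")) != -1:
            yield buffer[: idx + 1]
            buffer = buffer[idx + 1:]
    # Emit any trailing bytes that never received a newline.
    if buffer:
        yield buffer


if __name__ == "__main__":
    for line in flush_on_newlines([b'{"a":', b'1}\n{"b"', b':2}\n']):
        print(line)
```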

Affected Symbols