v5.2.0

📦 sentence-transformers
1 breaking · 6 features · 🐛 2 fixes · 2 deprecations · 🔧 8 symbols

Summary

Version 5.2.0 adds multiprocessing to CrossEncoder, multilingual NanoBEIR support, similarity scores in hard‑negative mining, and updates for Transformers 5 while deprecating Python 3.9 and the old `n-tuple-scores` format.

⚠️ Breaking Changes

  • The old `n-tuple-scores` output format for `mine_hard_negatives` has been removed; use `output_format="n-tuple"` with `output_scores=True` instead.

Migration Steps

  1. If you used the old `n-tuple-scores` format, replace it with `output_format="n-tuple"` and set `output_scores=True`.
  2. To use Transformers 5.x, which this release now supports, upgrade the `transformers` package to >=5.0.
  3. To run `CrossEncoder` inference across multiple processes, create a pool with `pool = model.start_multi_process_pool()`, pass `pool=pool` to `predict` or `rank`, and call `model.stop_multi_process_pool(pool)` once inference is done. Alternatively, pass a list of devices (e.g., `device=["cpu"] * 4`).
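
Step 1 can be applied mechanically to existing call sites. The following is a minimal sketch, assuming the mining options are collected in a keyword-argument dictionary; the helper name `migrate_mining_kwargs` is hypothetical and not part of sentence-transformers.

```python
from typing import Any


def migrate_mining_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `kwargs` with the legacy output format rewritten.

    Hypothetical helper: maps output_format="n-tuple-scores" to
    output_format="n-tuple" plus output_scores=True, as described in
    migration step 1. All other options are passed through unchanged.
    """
    updated = dict(kwargs)  # never mutate the caller's dictionary
    if updated.get("output_format") == "n-tuple-scores":
        updated["output_format"] = "n-tuple"
        updated["output_scores"] = True
    return updated


# Sketch of a call site (needs sentence-transformers >= 5.2.0, a dataset,
# and a model, so it is left as a comment here):
# from sentence_transformers.util import mine_hard_negatives
# mined = mine_hard_negatives(dataset, model, **migrate_mining_kwargs(old_kwargs))
```

Calls that already use another `output_format` are returned unchanged, so the helper is safe to apply to every call site.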

✨ New Features

  • CrossEncoder now supports multiprocessing via `start_multi_process_pool`, `stop_multi_process_pool`, and the `pool` argument to `predict` and `rank`.
  • Providing a list of devices to `CrossEncoder` automatically creates a multiprocessing pool for faster CPU or multi‑GPU inference.
  • NanoBEIR evaluators accept a `dataset_id` parameter, enabling evaluation on multilingual NanoBEIR collections.
  • `mine_hard_negatives` adds an `output_scores` parameter to export similarity scores alongside mined negatives.
  • Support for Transformers library version 5.x.
  • Improved handling of datasets with multiple positive passages in hard‑negative mining.
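
The start/predict/stop sequence for the multiprocessing pool is easy to get wrong if inference raises an exception partway through. One possible way to make sure the pool is always stopped is a small context manager. This is a sketch built only on the three methods named above; `multi_process_pool` and `predict_with_pool` are hypothetical wrappers, not library functions.

```python
from contextlib import contextmanager


@contextmanager
def multi_process_pool(model):
    """Start a worker pool on `model` and stop it again on exit.

    `model` is assumed to expose start_multi_process_pool() and
    stop_multi_process_pool(pool), as CrossEncoder does in v5.2.0.
    """
    pool = model.start_multi_process_pool()
    try:
        yield pool
    finally:
        # Runs even if prediction raises, so workers are not left behind.
        model.stop_multi_process_pool(pool)


def predict_with_pool(model, pairs):
    """Score (query, passage) pairs using a temporary worker pool."""
    with multi_process_pool(model) as pool:
        return model.predict(pairs, pool=pool)
```

With a real model this would be called as `predict_with_pool(CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2"), pairs)`, where the model name is only an example. On platforms that spawn worker processes, the calling script typically needs an `if __name__ == "__main__":` guard.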

🐛 Bug Fixes

  • Fixed several issues when datasets contain multiple positive passages during hard‑negative mining.
  • Resolved bugs related to multi‑GPU usage in `mine_hard_negatives`.

🔧 Affected Symbols

  • `CrossEncoder`
  • `CrossEncoder.start_multi_process_pool`
  • `CrossEncoder.stop_multi_process_pool`
  • `CrossEncoder.predict`
  • `CrossEncoder.rank`
  • `mine_hard_negatives`
  • `NanoBEIREvaluator`
  • `SentenceTransformer`

⚡ Deprecations

  • Python 3.9 support is deprecated; upgrade to Python 3.10 or newer.
  • The `n-tuple-scores` format in `mine_hard_negatives` is deprecated; use `output_format="n-tuple"` together with the new `output_scores=True` parameter instead.