v5.5.0
📦 sentence-transformers
✨ 5 features · 🐛 1 fix · 🔧 11 symbols
Summary
This release introduces the `train-sentence-transformers` Agent Skill and two new training losses, `EmbedDistillLoss` and `ADRMSELoss`, alongside numerous robustness and correctness improvements.
Migration Steps
- When using `EmbedDistillLoss` with student and teacher embedding dimensions that differ, pass `projection_dim=<teacher_dim>` when initializing the loss.
- To reuse the projection layer from `EmbedDistillLoss` across training stages (e.g., after initial training), save it with `loss.save_projection(...)` and restore it with `loss.load_projection(...)`.
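The role of the projection in the steps above can be illustrated with a small, dependency-free sketch. It is not the library's implementation: the helper names (`project`, `mse`), the dimensions, and the JSON round-trip are all hypothetical, and it assumes the projection maps student embeddings into the teacher's space before a mean-squared-error comparison.

```python
import json
import random

# Conceptual sketch only; none of these names come from sentence-transformers.

def project(vec, weight):
    """Apply a linear map; weight is a list of rows with shape (out_dim, in_dim)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("dimension mismatch: a projection is needed")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

student_dim, teacher_dim = 4, 6  # illustrative sizes, chosen arbitrarily
rng = random.Random(0)

# Assumed: the projection has output size equal to the teacher dimension.
projection = [
    [rng.uniform(-0.1, 0.1) for _ in range(student_dim)]
    for _ in range(teacher_dim)
]

student_embedding = [rng.uniform(-1, 1) for _ in range(student_dim)]
teacher_embedding = [rng.uniform(-1, 1) for _ in range(teacher_dim)]

# Project the student embedding so both vectors have the same length.
projected = project(student_embedding, projection)
loss = mse(projected, teacher_embedding)

# Persisting the weights lets a later stage start from the same projection.
# This stands in for the idea behind saving and loading a projection; the
# real methods' arguments are not shown in these notes.
restored = json.loads(json.dumps(projection))
```

Without the projection, the two vectors have different lengths and cannot be compared element-wise, which is why a `projection_dim` matching the teacher dimension is needed when the sizes differ.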
✨ New Features
- Introduced the `train-sentence-transformers` Agent Skill for driving end-to-end training and fine-tuning via AI coding agents.
- Added `EmbedDistillLoss`, an embedding-level knowledge distillation loss for `SentenceTransformer` that aligns student embeddings with pre-computed teacher embeddings.
- Added `ADRMSELoss`, a listwise learning-to-rank loss for `CrossEncoder` based on the Rank-DistiLLM paper.
- `SentenceTransformer.encode()` and related methods, as well as `CrossEncoder.predict()` and `model.preprocess()`, now accept a per-call `processing_kwargs` argument to override constructor configuration.
- `MSELoss` is now a subclass of `EmbedDistillLoss` and gains the optional `projection_dim` argument.
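The per-call override described for `processing_kwargs` can be pictured with a toy example. `ToyEncoder` and `resolve_kwargs` are hypothetical names, not part of sentence-transformers, and the key-by-key merge shown here is an assumption; these notes do not state whether the library merges per-call values with the constructor configuration or replaces it wholesale.

```python
class ToyEncoder:
    """Hypothetical stand-in used only to illustrate override precedence."""

    def __init__(self, processing_kwargs=None):
        # Configuration fixed at construction time.
        self.processing_kwargs = dict(processing_kwargs or {})

    def resolve_kwargs(self, processing_kwargs=None):
        # Start from a copy of the constructor configuration, then let
        # per-call values take precedence (assumed merge behavior).
        merged = dict(self.processing_kwargs)
        merged.update(processing_kwargs or {})
        return merged


encoder = ToyEncoder(processing_kwargs={"max_length": 256, "truncation": True})

default_call = encoder.resolve_kwargs()
override_call = encoder.resolve_kwargs({"max_length": 64})
```

In this sketch the override applies to a single call only: the constructor configuration stored on the object is left unchanged for later calls.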
🐛 Bug Fixes
- The release includes a long list of robustness and correctness fixes.