Change8

v5.3.0

📦 sentence-transformers
✨ 5 features · 🐛 1 fix · 🔧 8 symbols

Summary

This minor release enhances contrastive learning: `MultipleNegativesRankingLoss` gains new loss formulations and optional hardness weighting, and two new losses are introduced, the memory-efficient `CachedSpladeLoss` and `GlobalOrthogonalRegularizationLoss`. Bug fixes address an issue with `GroupByLabelBatchSampler`.

Migration Steps

  1. When using `MultipleNegativesRankingLoss`, existing usage remains unchanged (defaulting to standard InfoNCE). To use new variants, configure the `directions` and `partition_mode` parameters.
  2. To use hardness weighting in `MultipleNegativesRankingLoss` or `CachedMultipleNegativesRankingLoss`, set `hardness_mode` (e.g., `"in_batch_negatives"`) and `hardness_strength`.
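The two knobs above can be illustrated with a minimal NumPy sketch of in-batch InfoNCE. This is an illustrative stand-in, not the library's implementation: the bidirectional average below mimics a "symmetric" `directions` setting, and the hardness term is one plausible way to up-weight harder negatives, assumed for demonstration only.

```python
import numpy as np

def info_nce(sim, temperature=0.05, hardness_strength=0.0):
    """In-batch InfoNCE over a (batch, batch) similarity matrix.

    Diagonal entries are the positive pairs; off-diagonal entries are
    in-batch negatives. hardness_strength > 0 boosts the logits of
    harder (higher-similarity) negatives -- an illustrative assumption,
    not the library's exact weighting formula.
    """
    n = sim.shape[0]
    logits = sim / temperature
    if hardness_strength > 0.0:
        negatives = ~np.eye(n, dtype=bool)
        # Up-weight harder negatives in the softmax denominator.
        logits = np.where(negatives, logits + hardness_strength * sim, logits)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_softmax).mean()

def symmetric_info_nce(sim, temperature=0.05):
    """Average of anchor->positive and positive->anchor directions."""
    return 0.5 * (info_nce(sim, temperature) + info_nce(sim.T, temperature))

# Toy batch: positives score 0.9, all in-batch negatives score 0.5.
sim = np.full((4, 4), 0.5)
np.fill_diagonal(sim, 0.9)
print(info_nce(sim))                          # standard InfoNCE
print(info_nce(sim, hardness_strength=1.0))   # harder negatives count more
print(symmetric_info_nce(sim))                # both directions averaged
```

With uniformly positive negative similarities, hardness weighting strictly increases the loss, which is the intended effect: the optimizer is pushed harder to separate those negatives.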

✨ New Features

  • MultipleNegativesRankingLoss now supports alternative InfoNCE formulations via new `directions` and `partition_mode` parameters, allowing configurations like Symmetric InfoNCE and GTE improved contrastive loss.
  • Added optional hardness weighting to MultipleNegativesRankingLoss and CachedMultipleNegativesRankingLoss to up-weight harder negatives using the `hardness_strength` and `hardness_mode` parameters.
  • Introduced `GlobalOrthogonalRegularizationLoss` for embedding space regularization, designed to be combined with primary contrastive losses.
  • Introduced `CachedSpladeLoss` for memory-efficient SPLADE training by applying the GradCache technique.
  • Added a faster hashed batch sampler option (`NO_DUPLICATES_HASHED`).
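The idea behind global orthogonal regularization can be sketched in a few lines of NumPy. This follows the "spread-out" regularizer of Zhang et al. (ICCV 2017), which pushes non-matching embeddings toward the statistics of uniformly spread directions; the library's exact formulation and API may differ.

```python
import numpy as np

def global_orthogonal_regularization(anchors, negatives):
    """Penalize non-matching pairs whose dot-product statistics deviate
    from those of uniformly spread unit vectors: mean dot product near 0
    and mean squared dot product near 1/d (d = embedding dimension).
    Sketch only; the library's formulation may differ.
    """
    d = anchors.shape[1]
    dots = np.sum(anchors * negatives, axis=1)  # per-pair dot products
    m1 = dots.mean()
    m2 = (dots ** 2).mean()
    return m1 ** 2 + max(0.0, m2 - 1.0 / d)

# Spread-out embeddings incur almost no penalty; collapsed ones are punished.
rng = np.random.default_rng(0)
a = rng.normal(size=(256, 64))
a /= np.linalg.norm(a, axis=1, keepdims=True)
b = rng.normal(size=(256, 64))
b /= np.linalg.norm(b, axis=1, keepdims=True)
print(global_orthogonal_regularization(a, b))  # near zero: well spread
print(global_orthogonal_regularization(a, a))  # large: fully collapsed
```

Because the regularizer is near zero for well-spread embeddings, it can be added to a primary contrastive loss without dominating it, which matches its intended role as a companion term.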

🐛 Bug Fixes

  • Fixed `GroupByLabelBatchSampler` behavior when used with triplet losses.

Affected Symbols