v3.4.0
📦 sentence-transformers
⚠ 1 breaking · ✨ 5 features · 🐛 10 fixes · 🔧 14 symbols
Summary
Version 3.4.0 fixes a major memory‑leak issue, adds compatibility between cached losses and MatryoshkaLoss, introduces several new features, and resolves numerous bugs.
⚠️ Breaking Changes
- SentenceTransformerModelCardData no longer keeps a reference to the SentenceTransformerTrainer, removing the circular reference that caused the memory leak; code that accessed model_card_data.trainer must obtain the trainer elsewhere.
Migration Steps
- If your code accessed SentenceTransformerModelCardData.trainer, refactor to keep a separate reference to the trainer.
- Upgrade with `pip install -U sentence-transformers==3.4.0`.
- Review any custom module saving logic to ensure kwargs are correctly persisted.
✨ New Features
- Added compatibility to combine Cached losses (CachedMultipleNegativesRankingLoss, CachedGISTEmbedLoss, CachedMultipleNegativesSymmetricRankingLoss) with MatryoshkaLoss.
- Added Matthews Correlation Coefficient metric to BinaryClassificationEvaluator.
- Added a margin parameter to TripletEvaluator.
- Model cards now include dataset information in expandable sections when many datasets are present.
- Enabled multi‑GPU and CPU multi‑process support for util.mine_hard_negatives.
🐛 Bug Fixes
- Fixed NoDuplicatesBatchSampler producing identical subsequent batches.
- Fixed crash when using old‑style model.fit() with write_csv on an evaluator.
- Converted evaluator output types from np.float to native float.
- Allowed specifying revision and cache_dir when loading PEFT Adapter models.
- Fixed CrossEncoder lazy placement on incorrect device and made it respect model.to().
- Correctly saved custom module kwargs in modules.json (e.g., for jina‑embeddings‑v3).
- Fixed HfArgumentParser(SentenceTransformerTrainingArguments) crash caused by prompts typing.
- Fixed a regression that broke module loading in PyLate.
- Raised error for empty dataset list in NanoBEIREvaluator.
- Updated Sphinx version and switched from recommonmark to myst-parser in docs.
🔧 Affected Symbols
SentenceTransformerModelCardData, SentenceTransformerTrainer, losses.CachedMultipleNegativesRankingLoss, losses.CachedGISTEmbedLoss, losses.CachedMultipleNegativesSymmetricRankingLoss, losses.MatryoshkaLoss, BinaryClassificationEvaluator, TripletEvaluator, util.mine_hard_negatives, CrossEncoder, HfArgumentParser, SentenceTransformerTrainingArguments, NanoBEIREvaluator, PyLate