1.0.15

📦 chroma
14 features · 🐛 19 fixes · 🔧 9 symbols

Summary

Version 1.0.15 focuses heavily on stability, garbage collection improvements (especially for WAL3), and enhanced S3/storage handling. It includes numerous bug fixes related to data deletion, manifest consistency, and contention management.

✨ New Features

  • Added a scrubbing tool that supports limits.
  • Added ability to bail from snapshot, manifest, and gc installs.
  • Return database id in get collections call from sysdb.
  • CLI: Set Chroma environment variables.
  • Support writing data to separate prefixes in s3.
  • Enforce a maximum get_collections limit of 100.
  • Purge dirty log in background at the end of scheduled compaction.
  • Move Log GC to operator.
  • Make roll dirty log always converge to coalesce everything.
  • Read args from env variables for Python CloudClient.
  • Implement three-phase garbage collection for WAL3.
  • Add ability to set different block sizes for different blockfiles.
  • Allow slicing of the log when pulling to narrow files for debugging.
  • Skip log GC in dry run mode.
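
As an illustration of the "read args from env variables" item above, the following is a minimal sketch of the general pattern, not the actual CloudClient implementation. The function name `resolve_cloud_args` and the environment variable names in `ENV_NAMES` are assumptions made for this example; explicit arguments are assumed to take precedence over the environment.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed variable names, for illustration only; check the client docs
# for the names the real CloudClient reads.
ENV_NAMES = {
    "api_key": "CHROMA_API_KEY",
    "tenant": "CHROMA_TENANT",
    "database": "CHROMA_DATABASE",
}


def resolve_cloud_args(
    api_key: Optional[str] = None,
    tenant: Optional[str] = None,
    database: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: fill in missing arguments from the environment."""
    env = os.environ if env is None else env
    explicit = {"api_key": api_key, "tenant": tenant, "database": database}
    # An explicitly passed value wins; otherwise fall back to the env variable.
    return {
        name: value if value is not None else env.get(ENV_NAMES[name])
        for name, value in explicit.items()
    }
```

Passing `env` explicitly keeps the helper easy to test without mutating the process environment.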

🐛 Bug Fixes

  • Tracked the threshold of garbage collected fragments.
  • Correctly removed embeddings, embeddings metadata, and segment metadata on delete_collection.
  • Fixed S3 contention being assumed retryable, which caused the manifest to fail.
  • Fixed duplicate DeleteUnusedFiles task in GC for soft-deleted collections.
  • Fixed panic on sysdb when calling CheckCollection.
  • Fixed scrub error caused by a transient error in scrubbing.
  • Fixed the manifest initial offset not being set under GC conditions.
  • Fixed the condition for setting the manifest initial_seq_no.
  • Fixed a flaky prop test and committed the regression.
  • Do not leak tokio tasks in the log service.
  • Log GC offset should be one above minimum compaction offset.
  • Coalesced when multiple collections return the same info to compact.
  • Enriched from the manifest if a cursor doesn't exist.
  • Fixed a flaky prop test.
  • Fixed forking for js client.
  • Fixed GC getting wedged (WAL3-related).
  • Fixed dedup in get_collections_with_new_data.
  • Read from legacy metadata config when no collection config set.
  • Batch inserts on push_logs in sqlite.
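
To show what batching inserts in SQLite typically looks like, here is a minimal sketch using Python's standard `sqlite3` module. It is not the project's `push_logs` code: the `logs` table, its columns, and the function name `push_logs_batched` are hypothetical and exist only for this example.

```python
import sqlite3
from typing import Iterable, Tuple

# Hypothetical record shape: (collection_id, seq_no, payload)
LogRecord = Tuple[str, int, bytes]


def push_logs_batched(conn: sqlite3.Connection, records: Iterable[LogRecord]) -> int:
    """Insert all records with one executemany call inside one transaction."""
    rows = list(records)
    # The connection context manager commits on success and rolls back on error,
    # so the whole batch is written atomically instead of row by row.
    with conn:
        conn.executemany(
            "INSERT INTO logs (collection_id, seq_no, payload) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


# In-memory database with an assumed schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (collection_id TEXT, seq_no INTEGER, payload BLOB)")
```

Compared with issuing one INSERT and one commit per record, a single batched statement in one transaction avoids repeated commit overhead, which is usually the main cost for small rows.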

🔧 Affected Symbols

  • purge_dirty_for_collection
  • delete_collection
  • CheckCollection
  • get_collections
  • CheckCollections
  • append_batch
  • ListCollectionsToGc
  • get_collections_with_new_data
  • push_logs