
v4.3.0

Breaking Changes
📦 deeplake
1 breaking · 9 features · 🐛 3 fixes · 🔧 3 symbols

Summary

Deeplake 4.3.0 adds video data support, indexing for numeric types, a rewritten CSV import, and a new CSV export API.

⚠️ Breaking Changes

  • Datasets created or modified with v4.3.0 cannot be opened with v4.2.x because of internal format changes.

Migration Steps

  1. If working with shared datasets modified in v4.3.0, upgrade all environments accessing those datasets to v4.3.0 or newer, as older versions (v4.2.x) will not be able to open them.
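A minimal sketch of how an environment could be checked before it touches a shared dataset. It assumes the package is installed under the name deeplake and that a plain major.minor.patch comparison is enough; the helper names (parse_version, can_open_v43_datasets, check_environment) are hypothetical and not part of Deeplake.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First release that can open datasets written in the new format.
MIN_VERSION = (4, 3, 0)


def parse_version(text):
    """Turn '4.3.0', 'v4.3.0' or '4.3' into a (major, minor, patch) tuple."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def can_open_v43_datasets(installed):
    """Return True if the given version string is 4.3.0 or newer."""
    return parse_version(installed) >= MIN_VERSION


def check_environment(package="deeplake"):
    """Describe whether the installed package meets the minimum version."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if can_open_v43_datasets(installed):
        return f"{package} {installed} meets the 4.3.0 minimum"
    return f"{package} {installed} is too old; upgrade to 4.3.0 or newer"


if __name__ == "__main__":
    print(check_environment())
```

Running this in each environment that shares the dataset shows which ones still need the upgrade.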

✨ New Features

  • Sequence types have been reworked to support visual and structured data.
  • Video type support is now available in Deeplake, supporting MP4 and MKV videos with H264 codec and providing fast random access to video frames.
  • Indexing for numeric types, enabling fast queries for numeric comparisons in TQL, including IN and BETWEEN operations.
  • Significant improvements to textual index types, providing faster search without requiring index regeneration.
  • Fully rewritten from_csv function with support for large CSV files.
  • New to_csv API to export Deeplake datasets/views to CSV format.
  • Support for specifying Python built-in types when defining dataset schemas.
  • Support for using Pydantic Models as dataset schemas.
  • Richer type annotations for async operations, for better integration with linters and IDEs.
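To illustrate the numeric comparisons that the new indexes are meant to speed up, here is a rough sketch that builds TQL filter strings using BETWEEN and IN. The helper names and the column names are hypothetical, and the exact TQL grammar (including whether a FROM clause is needed) should be checked against the Deeplake documentation.

```python
def _check_column(column):
    """Reject column names that are not simple identifiers."""
    if not column.isidentifier():
        raise ValueError(f"invalid column name: {column!r}")
    return column


def between_query(column, low, high):
    """Build a TQL filter for rows whose `column` is between two numbers."""
    return f"SELECT * WHERE {_check_column(column)} BETWEEN {low} AND {high}"


def in_query(column, values):
    """Build a TQL filter for rows whose `column` matches one of `values`."""
    joined = ", ".join(str(v) for v in values)
    return f"SELECT * WHERE {_check_column(column)} IN ({joined})"


if __name__ == "__main__":
    print(between_query("price", 10, 20))
    print(in_query("label", [1, 2, 3]))
    # Assumption: with an open dataset `ds`, the string would be passed to
    # the dataset's query method, e.g. view = ds.query(between_query(...)).
```

The helpers only format numeric literals; string values would need quoting and escaping that this sketch does not attempt.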

🐛 Bug Fixes

  • Improved TQL data fetching and linear scan performance for non-indexed columns.
  • Better memory usage tracking to prevent out-of-memory errors.
  • Various stability improvements and bug fixes.

🔧 Affected Symbols

  • Sequence
  • from_csv
  • to_csv