v4.3.0
📦 deeplake
⚠ 1 breaking · ✨ 9 features · 🐛 3 fixes · 🔧 3 symbols
Summary
Deeplake 4.3.0 adds support for video data, indexing for numeric types, and a rewritten CSV import path alongside a new CSV export API.
⚠️ Breaking Changes
- Datasets created or modified with v4.3.0 cannot be opened with v4.2.x versions due to internal format enhancements.
Migration Steps
- If you share datasets that were created or modified with v4.3.0, upgrade every environment that accesses them to v4.3.0 or newer.
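A quick way to confirm that an environment meets this requirement is to check the installed package version before opening a shared dataset. The following is a minimal sketch using only the standard library. The helper names (parse_version, can_open_v43_datasets, check_environment) are hypothetical and not part of Deeplake, and treating every release below 4.3.0 as incompatible is an assumption that extends the v4.2.x statement above.

```python
from importlib import metadata

# First release that writes the new internal format, per the notes above.
MIN_VERSION = (4, 3, 0)


def parse_version(text):
    """Turn '4.3.0' or '4.3.1rc1' into a tuple of three leading integers."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def can_open_v43_datasets(installed):
    """Return True if the given version string is 4.3.0 or newer."""
    return parse_version(installed) >= MIN_VERSION


def check_environment(package="deeplake"):
    """Check the locally installed package; False if it is missing or too old."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return can_open_v43_datasets(installed)
```

Calling check_environment() at the start of a job gives an early, readable failure instead of an error when the dataset is opened.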
✨ New Features
- Reworked Sequence types to support visual and structured data.
- New Video type for MP4 and MKV videos encoded with the H264 codec, with fast random access to individual frames.
- Indexing for numeric types, enabling fast queries for numeric comparisons in TQL, including IN and BETWEEN operations.
- Significant improvements to textual index types, providing faster search without requiring index regeneration.
- Fully rewritten from_csv function with support for large CSV files.
- New to_csv API to export Deeplake datasets/views to CSV format.
- Support for specifying Python builtin types when defining dataset schemas.
- Support for using Pydantic Models as dataset schemas.
- Enriched async operations typing, to support better integration with linters and IDEs.
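To illustrate the kind of numeric filter the new indexes are meant to speed up, here is a small sketch that assembles a TQL query string using BETWEEN and IN. The column names (price, category_id) and the helper functions are hypothetical, and the SQL-like "SELECT * WHERE ..." form is assumed from TQL's general style rather than taken from these notes. The sketch only builds the string and does not call Deeplake.

```python
def _check_numeric(value):
    """Reject non-numeric values so they are never spliced into the query."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected a number, got {value!r}")
    return value


def between_clause(column, low, high):
    """Build a range condition, e.g. 'price BETWEEN 10 AND 20'."""
    return f"{column} BETWEEN {_check_numeric(low)} AND {_check_numeric(high)}"


def in_clause(column, values):
    """Build a membership condition, e.g. 'category_id IN (1, 2, 3)'."""
    joined = ", ".join(str(_check_numeric(v)) for v in values)
    return f"{column} IN ({joined})"


def select_where(*conditions):
    """Combine conditions with AND, parenthesising each one for clarity."""
    return "SELECT * WHERE " + " AND ".join(f"({c})" for c in conditions)


# Hypothetical columns on a hypothetical dataset.
query = select_where(
    between_clause("price", 10, 20),
    in_clause("category_id", [1, 2, 3]),
)
```

The resulting string would then be passed to the dataset's query entry point. Whether a given filter actually uses an index depends on how the column was defined, which these notes do not describe.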
🐛 Bug Fixes
- Improved TQL data fetching and linear scan performance for non-indexed columns.
- Better memory usage tracking to prevent out-of-memory errors.
- Various stability improvements and bug fixes.
🔧 Affected Symbols
Sequence, from_csv, to_csv