DeepLake
AI & LLMs
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Release History
v4.5.2 (1 feature): This release introduces support for binary data within dataset and column metadata and removes the dependency on libatomic.
v4.5.1 (7 fixes, 4 features): Version 4.5.1 focuses heavily on performance improvements through changes such as mimalloc (memory allocation) and simdjson (JSON parsing), and introduces significant new features like full null value support and dataset statistics gathering.
v4.5.0
v4.4.5 (1 fix, 1 feature): This release includes improved support for NULL values in Deeplake and resolves a concurrency issue during row and column creation.
v4.4.4 (3 features): This release focuses on infrastructure improvements, including better build integration via cmake/pkg-config for the Deeplake API and significant enhancements to storage access performance and PostgreSQL batch ingestion.
v4.4.3: No release notes provided.
v4.4.1 (8 fixes, 7 features): This release introduces significant storage and API enhancements, including a new list_dirs API and mesh type support, alongside substantial performance improvements in cache operations and bug fixes across core functionality and indexing.
v4.4.0 (7 fixes, 6 features): This release introduces significant indexing improvements, including JSON data indexing support and better progress indication. It also resolves several critical bugs related to compilation, Windows testing, and browser stability.
v4.3.5 (2 fixes, 2 features): This release introduces support for the link to bytes data type and improves text column flexibility. Several bugs related to error handling and PNG linking have also been resolved.
v4.3.4 (2 fixes, 4 features): This release introduces PostgreSQL 18 compatibility and significant performance optimizations, including adaptive backoff and cgroup-aware concurrency, alongside various bug fixes.
v4.3.3 (2 fixes, 5 features): This release introduces significant enhancements to pg_deeplake, including new data type support and performance improvements, alongside architectural refactoring and bug fixes.
v3.9.52: This release primarily updates dependency compatibility by allowing usage with numpy version 2.
v4.3.0 (Breaking, 3 fixes, 9 features): Deeplake 4.3.0 is a major update introducing comprehensive support for video data, enhanced indexing for numeric types, and significant improvements to CSV import/export functionality.
v3.9.51 (1 fix): This minor release (3.9.51) primarily addresses a bug in the dependency resolver.
v3.9.50 (2 features): This release refactors the frame extraction logic and improves frame rate retrieval.
v3.9.46 (1 feature): This patch release introduces a base mmsegmentation dataset class to facilitate integration with mmseg 1.x, requiring users to inherit from it.
v4.2.14 (1 fix, 1 feature): This release introduces the new Audio type and fixes a conversion bug related to the Polygon type from V3.
v4.2.12 (7 features): This release introduces several new features including autocommit, TQL enhancements (AVG function, improved SAMPLE BY), and expanded support for data types and integrations like zlib compression for SegmentMask and float16/bfloat16 indexing.
v4.2.8 (3 features): This release introduces new data loading capabilities with `from_csv` and byte support for `from_parquet`, alongside new compression options for `SegmentMask`.
v3.9.45 (2 features): This release introduces a new sampler setter function and implements a hierarchical namespace structure.
v4.2.7 (2 fixes, 6 features): This release introduces significant enhancements to indexing capabilities, batch query support, and rich type handling within Structs, alongside important bug fixes for data consistency and chunking.
v4.2.3 (5 features): This release introduces data file compaction, improved image handling, a new text comparison index type, and enhanced asynchronous operation support, including async iteration over batches.
v4.2.1 (4 features): Deeplake 4.2 introduces automatic commit compaction for faster opens and stabilizes the asynchronous API, alongside metadata copying in `deeplake.like`.
v4.1.17 (1 fix): This minor release addresses a specific concurrency issue related to dataset size tracking during row deletion.
v4.1.16 (4 features): This release introduces improvements in version control, adds support for float16 and bfloat16 data types, and integrates OpenTelemetry for enhanced observability.
v3.9.44 (1 feature): This release updates internal versions and introduces support for Labelbox IAM integration.
v4.1.12 (2 fixes, 4 features): This major release introduces dataset branching, enhanced query capabilities including direct credential passing and full JOIN support, and improved Labelbox integration.
v4.1.11 (4 fixes): This release focuses on stability and correctness, primarily fixing bugs related to S3 redirects, virtual column dtypes, iteration handling, and network error reporting.
v3.9.43 (1 fix): This release primarily focuses on fixing a visualization issue for local datasets.
v3.9.42 (1 fix, 1 feature): This release introduces an exception for empty class names and resolves a palette issue. Internal version numbers were also updated.
v4.1.10 (1 fix, 1 feature): This release fixes a high memory usage issue during json column queries and introduces the ability to append arrays of nulls to image type columns.
v4.1.7 (3 fixes, 4 features): This release focuses on performance improvements for dataset loading and querying, introduces native support for MMR search, and adds the EmbeddingMatrix index type.
v3.9.41 (Breaking): This release primarily updates internal dependencies to version 3.9.41 and removes the unnecessary `labelbox_client` argument from labelbox integration functions.
v3.9.40 (1 feature): This release integrates the mmseg 1.x version and updates internal version dependencies.
v3.9.39 (2 fixes, 1 feature): This release includes refactoring for the Labelbox integration, updates to internal dependencies, and improved resilience for Azure storage authentication.
v3.9.38 (1 fix): This release primarily updates internal versions for the 3.9.38 release and includes a fix for an issue concerning max_view.
v4.1.5 (4 features): This release introduces support for new data types including dicom, nifti, and point, alongside performance improvements in index generation and new search capabilities.
v3.9.37 (2 fixes): This release includes minor updates, specifically updating versions for the 3.9.37 release and updating the client after a credentials exchange.
v3.9.36 (2 fixes): This release includes minor updates, specifically updating versions for the 3.9.36 release and updating the client after a credentials exchange.
v4.1.4 (5 fixes, 5 features): This release introduces support for the Point type, adds automation for COCO-like dataset ingestion via from_coco, and includes several bug fixes related to query performance and data handling.
Common Errors
LogNotexistsError (2 reports)
The "LogNotexistsError" in Deep Lake usually indicates that the dataset's metadata log file is missing or inaccessible due to permission issues, network problems, or S3 configuration errors. To resolve this, verify your cloud credentials, ensure proper network connectivity to your data lake, and double-check that the specified dataset path in Deep Lake is correct and accessible with the necessary permissions. If using S3, confirm your bucket policy is configured correctly to allow Deep Lake access.
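A quick pre-flight check on the dataset path can rule out the simplest causes before digging into credentials or bucket policy. The sketch below uses only the Python standard library and does not call Deep Lake. `describe_dataset_path` is a hypothetical helper, and the set of remote schemes it recognizes is an assumption for illustration, not a definitive list of what Deep Lake accepts.

```python
import os
from urllib.parse import urlparse

# Assumed set of remote schemes, for illustration only.
REMOTE_SCHEMES = {"s3", "gcs", "azure", "al"}


def describe_dataset_path(path: str) -> dict:
    """Split a dataset path into the parts worth checking by hand (hypothetical helper)."""
    parsed = urlparse(path)
    if parsed.scheme in REMOTE_SCHEMES:
        return {
            "kind": "remote",
            "scheme": parsed.scheme,
            "bucket": parsed.netloc,             # bucket or organization name
            "prefix": parsed.path.lstrip("/"),   # key prefix inside the bucket
        }
    # Anything else is treated as a local path (file:// URL or a bare path).
    local = parsed.path if parsed.scheme == "file" else path
    return {"kind": "local", "path": local, "exists": os.path.isdir(local)}


info = describe_dataset_path("s3://my-bucket/datasets/train")
print(info)
```

For a remote path, compare the reported bucket and prefix against what your credentials and bucket policy actually grant. For a local path, `exists` being false points to a wrong or unmounted directory rather than a permissions problem.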
InvalidColumnValueError (1 report)
InvalidColumnValueError in Deep Lake arises primarily when the data you are trying to append does not match the column's existing data type (e.g., appending a list of strings to a column defined as text). To fix this, inspect the column's `htype` attribute and cast or convert your data so that it matches the column's schema before appending. Alternatively, recreate the dataset or column with type inference so that it accommodates your data, or use `extend` when appending multiple compatible items.
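The mismatch described above (a list of strings sent to a text column) can be caught before the append is attempted. The following is a minimal sketch in plain Python that does not call Deep Lake. `to_text_value` is a hypothetical helper, and joining list items with a space is an assumed policy that you may want to change for your data.

```python
def to_text_value(value, sep=" "):
    """Return a single string suitable for a text column (hypothetical helper)."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode("utf-8")
    if isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value):
        # Assumed policy: collapse a list of strings into one string.
        return sep.join(value)
    # Fail early with a clear message instead of letting the append fail later.
    raise TypeError(f"cannot convert {type(value).__name__} to text")


rows = ["a caption", ["two", "tokens"], b"raw bytes"]
clean = [to_text_value(r) for r in rows]
print(clean)
```

Running the conversion over a batch first means a bad value surfaces as a `TypeError` naming the offending Python type, which is usually easier to trace than an error raised during the append itself.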