Polars
Data & ML
Extremely fast Query Engine for DataFrames, written in Rust
Release History
rs-0.53.0 (1 fix, 60 features): This release focuses heavily on performance improvements across various operations, especially in streaming contexts, and introduces significant enhancements to SQL support and data type handling, including new Extension types.
py-1.38.1 (13 fixes, 1 feature): This release introduces the ability to retrieve a byte from binary data using get() and includes numerous bug fixes across query optimization, schema inference, and data handling. Several internal cleanups and documentation updates were also performed.
py-1.38.0 (40 fixes, 27 features): This release focuses heavily on performance improvements across streaming, I/O, and core computations, alongside numerous bug fixes for stability and correctness. A key change is the deprecation of the `retries` argument in favor of using `storage_options`.
py-1.37.1 (4 fixes): This release focuses on performance improvements, particularly for SQL UNION clauses, and includes several bug fixes related to IPC slicing and error handling. Dependency updates were also performed.
py-1.37.0
py-1.36.1 (6 fixes, 1 feature): This release focuses on performance improvements, new features like Object literal creation, and numerous bug fixes across various components including binary operations and DataFrame methods.
py-1.36.0 (54 fixes, 53 features): This release introduces significant enhancements across SQL support (including window functions and QUALIFY), new DataFrame/Expr methods like bin slicing and improved rolling calculations, and numerous performance optimizations, especially for streaming and Parquet I/O. Numerous bugs related to panics, dtype handling, and SQL expression resolution have also been fixed.
py-1.36.0-beta.2 (54 fixes, 49 features): This release introduces significant enhancements across SQL support, expression capabilities (like rolling windows and aggregation contexts), and adds support for Extension types. Numerous performance improvements and bug fixes address stability and correctness across various operations, especially in streaming and Parquet handling.
py-1.35.2 (4 fixes): This release focuses on stability and performance, primarily addressing several bugs related to group_by operations and fixing a wide-table join performance regression.
rs-0.52.0 (42 fixes, 38 features): This release focuses heavily on performance improvements across lazy evaluation, group-by operations, and I/O, alongside numerous bug fixes, especially in SQL handling and streaming aggregations. New features include batch collection methods for LazyFrames and enhanced streaming support for various functions.
py-1.35.1 (11 fixes, 3 features): This release focuses on performance improvements, including optimizing rolling moment window computation and IPC stream reads. It also introduces support for BYTE_ARRAY backed Decimals in Parquet and fixes numerous bugs across SQL handling, group-by operations, and predicate pushdown.
py-1.35.0 (46 fixes, 22 features): This release focuses heavily on performance improvements across group-by operations, native aggregations, and data parsing, while stabilizing Decimal support and introducing several new features like ewm_mean in streaming and enhanced list/array aggregations. Two functions, Expr.agg_groups() and pl.groups(), have been deprecated.
py-1.35.0-beta.1 (42 fixes, 18 features): This release focuses heavily on performance improvements across group-by operations, data serialization, and string parsing. Numerous enhancements were added, including new aggregation methods and improved ADBC engine integration, alongside fixes for various edge cases and regressions.
py-1.34.0 (43 fixes, 17 features): This release introduces new batch collection methods for LazyFrames and significant performance optimizations across various operations, including native streaming support for gather_every and mode(). Numerous bug fixes address issues related to CSV parsing, streaming engine consistency, and cloud storage scanning.
py-1.34.0-beta.5 (42 fixes, 17 features): This release introduces new lazy sink/collect batch methods, enhances performance across various scan and expression operations, and fixes numerous bugs related to data types, streaming, and cloud storage interactions.
py-1.34.0-beta.4 (33 fixes, 14 features): This release introduces new batch collection methods (`LazyFrame.{sink,collect}_batches`), significant performance optimizations across scanning and expressions, and numerous bug fixes, especially around Iceberg and streaming operations.
py-1.34.0-beta.3 (33 fixes, 14 features): This release introduces new batch collection methods for LazyFrames and enhances performance across various scan types, alongside numerous bug fixes for stability and correctness in streaming and aggregation operations.
py-1.34.0-beta.1 (19 fixes, 12 features): This release introduces new batch collection methods for LazyFrames and significant performance improvements across various scan types. It also includes numerous bug fixes, especially around cloud storage interactions and data type handling.
rs-0.51.0 (Breaking, 75 fixes, 19 features): This release introduces significant performance improvements across the streaming engine, new features like array support for unique counts and AWS role assumption guides, and addresses numerous bugs, notably around serialization and type handling. The most critical change is the transition of some eager Expressions to be lazy compatible.
py-1.33.1 (13 fixes, 7 features): This release focuses heavily on performance improvements, particularly around Parquet decoding and memory allocation. It also introduces several enhancements, including S3 URI support and a new security policy, alongside numerous bug fixes.
py-1.33.0 (Breaking, 50 fixes, 17 features): This release focuses heavily on performance improvements by lowering many operations natively to the streaming engine and optimizing the query planner. A significant breaking change involves making previously eager Exprs lazy compatible.
py-1.33.0-beta.1 (Breaking, 39 fixes, 10 features): This release focuses heavily on performance improvements by lowering various operations to the streaming engine and enhancing lazy evaluation capabilities, alongside numerous bug fixes and the removal of eager expression behavior.
py-1.32.3 (23 fixes, 5 features): This release focuses heavily on performance improvements by lowering operations like `.sort(maintain_order=True).head()` and `rle` to the streaming engine. Numerous bug fixes address issues across data types, serialization, and expression evaluation.
py-1.32.2 (1 fix): This release primarily focuses on fixing an issue related to returning the correct Python package version and includes a documentation update.
py-1.32.1 (28 fixes, 5 features): This release focuses heavily on performance improvements by lowering more operations to the streaming engine and optimizing internal parsing. Numerous bug fixes address issues across Iceberg/Delta scans, data type handling, and aggregation queries.
rs-0.50.0 (79 fixes, 24 features): This release focuses heavily on performance improvements by lowering more operations to the streaming engine and optimizing various internal processes. It also introduces significant enhancements around Categorical/Enum types and fixes numerous bugs across expressions, I/O, and joins.
py-1.32.0 (78 fixes, 31 features): This release focuses heavily on performance improvements across the streaming engine, expression lowering, and various I/O operations. Key enhancements include the formalization of `Selector` in the DSL and a rework of Categorical/Enum handling using (Frozen)Categories.
py-1.32.0-beta.1 (78 fixes, 31 features): This release focuses heavily on performance improvements across the streaming engine, including lowering various operations and optimizing predicate pushdown. Key enhancements include the formalization of `Selector` in the DSL and reworking Categorical/Enum handling using (Frozen)Categories.
rs-0.49.1 (1 fix, 1 feature): This release focuses on performance optimizations, enhanced functionality for padding methods, and fixes related to time zone handling in date/time expressions.
rs-0.49.0 (Breaking, 46 fixes, 14 features): This release removes the old streaming engine, introduces native Iceberg positional delete support, and includes numerous performance optimizations and bug fixes across various data types and operations.
py-1.31.0 (Breaking, 49 fixes, 12 features): This release removes the old streaming engine, introduces DataType expressions and Iceberg positional delete support, and includes numerous performance optimizations and bug fixes across various operations.
py-1.31.0-beta.1 (Breaking, 44 fixes, 13 features): This release removes the old streaming engine, introduces DataType expressions and Iceberg positional delete support, and includes numerous performance optimizations and bug fixes across various components.
py-1.30.0 (44 fixes, 20 features): This release focuses heavily on performance improvements across various operations, including optimizer casts, parallelism, and elementwise execution. It introduces several new features like list.filter, LazyFrame.match_to_schema, and enhanced type inference, alongside numerous bug fixes addressing panics and incorrect outputs.
rs-0.48.1 (1 fix): This release focuses on performance improvements by switching eligible casts to non-strict in the optimizer and includes fixes for serialization errors and build system issues.
rs-0.48.0 (Breaking, 29 fixes, 11 features): This release introduces several performance improvements, new features like list.filter and schema matching for LazyFrames, and addresses numerous bugs, including panics and incorrect outputs in various operations. A breaking change involves updating how time zone information is stored internally.
py-1.30.0-beta.1 (37 fixes, 20 features): This release focuses heavily on performance improvements, including optimized initialization and parallelism, alongside numerous bug fixes across SQL, I/O, and expression evaluation. New features include enhanced schema matching, improved time-string parsing, and better support for various data types and operations.
rs-0.47.0 (Breaking, 3 fixes, 51 features): This release introduces significant performance improvements across the streaming engine, new SQL functions, and enhanced support for array types and various sink configurations. A breaking change was made to how the bottom interval is handled in `hist`.
py-1.29.0 (14 fixes, 6 features): This release focuses on performance improvements, notably avoiding alloc_zeroed in decompression, and introduces several new features including SQL support for array checks and DataFrame initialization from torch Tensors. Several bugs related to joins, parquet reading, and date/datetime conversions have also been fixed.
py-1.28.1 (2 fixes): This release focuses on bug fixes related to Parquet reading and predicate filtering, alongside minor build system updates and documentation improvements.
py-1.28.0 (24 fixes, 9 features): This release focuses heavily on performance improvements, particularly within the streaming engine, and introduces several new features like enhanced rolling statistics and GPU support for sink APIs. Numerous bug fixes address issues across data types, I/O operations, and streaming execution.
py-1.27.1 (5 fixes, 1 feature): This release focuses on stability and correctness, fixing several bugs related to joins, caching, JSON writing, and deadlocks, alongside an enhancement to expression autocomplete in interactive environments.
py-1.27.0 (Breaking, 50 fixes, 15 features): This release introduces several performance improvements, new SQL functions, and enhances compatibility with Python I/O classes. It also contains several bug fixes, notably around streaming engine stability and type handling, alongside breaking changes to the `hist` function and Partition API.
py-1.26.0 (11 fixes, 4 features): This release focuses heavily on performance improvements, including optimizations for binary hash tables and join operations, alongside numerous bug fixes across various functionalities like CSV parsing and aggregation.
py-1.25.2 (38 fixes, 20 features): This release introduces significant performance enhancements, including common subplan elimination and linear-time rolling operations, alongside new features like lazy sinks and expanded support for streaming engines.
py-1.24.0 (14 fixes, 10 features): This release focuses heavily on performance improvements, particularly in the new streaming engine, and introduces several enhancements like lossy decoding for CSVs and DataFrame.write_iceberg. Numerous bug fixes address stability issues across various operations including rolling statistics and sink phases.
py-1.23.0 (36 fixes, 17 features): This release focuses heavily on performance improvements, especially around rolling operations and group-by scenarios, alongside numerous bug fixes across data types, I/O, and SQL functionality. New features include SQL DELETE support and enhanced streaming capabilities.
py-1.22.0 (32 fixes, 29 features): This release focuses heavily on performance improvements across various operations, especially within the new streaming engine, and introduces significant enhancements to I/O capabilities, including better support for Unity Catalog and IO plugins. Several bugs related to type handling, aggregations, and specific functions like `Expr.over` and `top_k` have also been resolved.
rs-0.46.0 (Breaking, 48 fixes, 46 features): This release introduces the new Int128Type and brings significant performance enhancements across various operations, especially within the new streaming engine. Numerous bug fixes address issues related to Decimal types, serialization, and cloud storage interactions.
py-1.21.0 (22 fixes, 24 features): This release focuses heavily on performance improvements through increased use of BitmapBuilder and enhancements across the streaming engine, including new CSV and NDJson sinks. Numerous bug fixes address issues related to Decimal types, slicing, joins, and error handling.
py-1.20.0 (35 fixes, 15 features): This release focuses heavily on performance improvements across various areas, including streaming engine aggregations, serialization, and Rust/Python data conversion. It also introduces several new features like SQL support for NORMALIZE and enhancements to Parquet handling and cloud storage integration.
Common Errors
ArrowNotImplementedError (2 reports): ArrowNotImplementedError in Polars usually arises from compatibility issues between the Arrow library (pyarrow) used alongside Polars and specific data types or features in your Parquet files or schemas. To fix it, make sure your pyarrow version is compatible with your Polars version and that pyarrow supports every data type present in the Parquet file; consider updating pyarrow or casting problematic columns to more standard types (e.g., `pl.col("my_col").cast(pl.Utf8)`) before writing or reading your data. Passing `use_pyarrow=True` to `read_parquet` or `write_parquet` routes the I/O through pyarrow, which can change how data types are handled.
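As a first diagnostic step for the version-compatibility advice above, it can help to see which versions are actually installed. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `version_report` are hypothetical, and the sketch only reports versions, it does not judge whether a given pair is compatible.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(packages: Sequence[str] = ("polars", "pyarrow")) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in version_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Including this output in a bug report makes it easier to tell whether the error comes from an outdated pyarrow.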
NoDataError (2 reports): NoDataError in Polars often arises when an operation such as CSV parsing receives empty input while `raise_if_empty=True` (the default) is in effect. Fix this by passing `raise_if_empty=False` to `scan_csv` or `read_csv` so that an empty DataFrame is returned gracefully, or, if an empty frame is not acceptable, make sure the input data is not empty. Because an empty source gives Polars nothing to infer a schema from, provide a schema explicitly when downstream code depends on specific columns.
FileNotFoundError (2 reports): FileNotFoundError in Polars usually arises when attempting to read or write files/directories that don't exist at the specified path. When reading, ensure the file path is correct and the file exists. When writing, create the necessary directories beforehand using `os.makedirs(path, exist_ok=True)` or enable automatic directory creation if the writing function supports it (e.g., `mkdir=True` in `sink_parquet`).
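The directory-creation step for writes can be sketched with the standard library alone. The helper name `ensure_parent_dir` and the example path are hypothetical; the Polars write call is shown only as a comment.

```python
import os
import tempfile


def ensure_parent_dir(path: str) -> str:
    """Create the parent directory of `path` if needed and return `path` unchanged."""
    parent = os.path.dirname(path)
    if parent:  # a bare filename has no parent directory to create
        os.makedirs(parent, exist_ok=True)
    return path


# Hypothetical output location inside a fresh temporary directory.
target = os.path.join(tempfile.mkdtemp(), "exports", "nested", "data.parquet")
ensure_parent_dir(target)

# At this point a writer such as df.write_parquet(target) has a directory to write into.
print(os.path.isdir(os.path.dirname(target)))
```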
NotImplementedError (2 reports): NotImplementedError in Polars usually indicates that a specific function or operation has not been implemented for the given data type or execution engine (e.g., GPU, streaming). To fix it, either implement the missing functionality for the relevant dtype/engine combination, or route the operation to a supported alternative when it is unavailable. If no alternative exists, raise a clear error message that explains the limitation.
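The "route to a supported alternative" option can be sketched as a small generic wrapper. Everything here is illustrative: `run_with_fallback`, `unsupported_path`, and `supported_path` are hypothetical names, and the two callables stand in for whatever primary and fallback operations apply in your code.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_fallback(primary: Callable[[], T], fallback: Callable[[], T]) -> T:
    """Run `primary`; if it raises NotImplementedError, report why and run `fallback`."""
    try:
        return primary()
    except NotImplementedError as exc:
        print(f"primary path is unsupported ({exc}); using fallback")
        return fallback()


def unsupported_path() -> int:
    # Stand-in for an operation that is missing on some engine or dtype.
    raise NotImplementedError("operation not available on this engine")


def supported_path() -> int:
    # Stand-in for the slower but supported alternative.
    return 42


result = run_with_fallback(unsupported_path, supported_path)
```

In a Polars context the two callables might wrap the same query collected with different engines, but which engines and operations are supported depends on the installed version.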
ValueError (2 reports): ValueError exceptions in Polars often arise from using empty or invalid format strings with string or datetime operations, especially when the GPU engine is enabled. To fix this, ensure that all format strings passed to functions like `.dt.strftime()` or `.str.strptime()` are valid and non-empty. Review the format string documentation for the specific function to understand the required structure.
ColumnNotFoundError (2 reports): ColumnNotFoundError in Polars arises when your code references a column name that doesn't exist in the DataFrame, often due to typos, incorrect casing, or referencing a column before it has been created. Double-check all column names for accuracy and case-sensitivity, and confirm the column exists by inspecting the DataFrame's schema; if it is genuinely missing, create it first (for example with `with_columns` and `pl.lit()`). In lazy queries, a column added in one step can only be referenced in later steps, so inspect `LazyFrame.collect_schema()` to see which columns exist at that point in the plan.
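Since typos and casing are the usual culprits, a small lookup that suggests the intended column can shorten debugging. This is a standard-library sketch; `suggest_columns` and the sample column names are hypothetical, and in practice the `columns` list would come from something like `df.columns`.

```python
import difflib
from typing import List, Sequence


def suggest_columns(name: str, columns: Sequence[str], limit: int = 3) -> List[str]:
    """Return existing column names that `name` most likely meant.

    A case-insensitive exact match wins; otherwise fall back to fuzzy matching.
    """
    by_lower = {col.lower(): col for col in columns}
    wanted = name.lower()
    if wanted in by_lower:
        return [by_lower[wanted]]
    close = difflib.get_close_matches(wanted, list(by_lower), n=limit, cutoff=0.6)
    return [by_lower[match] for match in close]


# Hypothetical schema for illustration.
columns = ["user_id", "Created_At", "amount"]

casing_hint = suggest_columns("created_at", columns)  # wrong casing
typo_hint = suggest_columns("amout", columns)         # missing letter
print(casing_hint, typo_hint)
```

If two columns differ only by case, the dictionary keeps the later one, so treat the result as a hint rather than a definitive answer.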
Related Data & ML Packages
An Open Source Machine Learning Framework for Everyone
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
scikit-learn: machine learning in Python
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Streamlit — A faster way to build and share data apps.