py-1.31.0

📅 Jun 18, 2025 · 📦 polars

⚠ 1 breaking · ✨ 12 features · 🐛 49 fixes · ⚡ 1 deprecation · 🔧 22 symbols

Summary

This release removes the old streaming engine, introduces DataType expressions and Iceberg positional delete support, and includes numerous performance optimizations and bug fixes across various operations.

⚠️ Breaking Changes

The old streaming engine has been removed. Users relying on the previous streaming implementation must update their code to use the new engine.

✨ New Features

Introduced DataType expressions in Python.
Added a native implementation for Iceberg positional deletes.
Added a basic implementation of `DataTypeExpr` in the Rust DSL.
Added `required: bool` to `ParquetFieldOverwrites`.
Added support for serializing `name.map_fields`.
Added support for serializing `Expr::RenameAlias`.
Added a `keys` column in `finish_callback`.
Added an `extra_columns` parameter to `scan_parquet`.
Added the `CORR` function to Polars SQL.
Added per-partition sort and finish callback to sinks.
Added support for descending-sorted values in `search_sorted()`.
Derived the DSL schema.
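To make the `search_sorted()` item concrete, here is a plain-Python sketch of what an insertion index means when the values are sorted in descending order. It is not Polars code and not the library's implementation; the helper name `search_sorted_desc` and its `side` argument are hypothetical, chosen only for illustration.

```python
def search_sorted_desc(values, element, side="left"):
    """Insertion index of `element` in `values`, which must be sorted descending.

    side="left" returns the first valid position, side="right" the last one.
    (Hypothetical helper, for illustration only.)
    """
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        if side == "left":
            # Skip past everything strictly greater than the element.
            go_right = values[mid] > element
        else:
            # Also skip past entries equal to the element.
            go_right = values[mid] >= element
        if go_right:
            lo = mid + 1
        else:
            hi = mid
    return lo


values = [9, 7, 7, 4, 1]  # sorted in descending order
left = search_sorted_desc(values, 7, side="left")
right = search_sorted_desc(values, 7, side="right")
print(left, right)
```

Inserting `7` at either returned index keeps the list sorted in descending order; the two sides differ only in where the new value lands relative to existing equal values.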

🐛 Bug Fixes

Removed axis from `show_graph`.
Removed axis ticks in `show_graph`.
Restricted custom `aggregate_function` in `pivot` to `pl.element()`.
Fixed `SourceToken` leak in in-memory sink linearize.
Fixed panic when reading empty parquet with multiple boolean columns.
Raised `ComputeError` instead of panicking in `truncate` when mixing month/week/day/sub-daily units.
Materialized `list.eval` with unknown type.
Set the sorting flag only for the first column when Parquet `SortingColumns` are present.
Fixed typo in AExprBuilder.
Fixed null return from var/std on scalar column.
Supported Datetime broadcast in `list.concat`.
Ensured projection pushdown maintains right table schema.
Added Null dtype support to arg_sort_by.
Raised an error by default on invalid CSV quotes.
Fixed group_by mean and median returning all nulls for Decimal dtype.
Fixed hive partition pruning not filtering out `__HIVE_DEFAULT_PARTITION__`.
Fixed `AssertionError` when using `scan_delta()` on AWS with `storage_options`.
Fixed deadlock on `collect(background=True)` / `collect_concurrently()`.
Fixed incorrect null count in rolling_min/max.
Preserved `file://` in LazyFrame node traverser.
Respected column order in `register_io_source` schema.
Stopped calling unnest for objects implementing `__arrow_c_array__`.
Fixed incorrect output when using `sort` with `group_by` and `cum_sum`.
Implemented owned arithmetic for Int128.
Stopped schema-matching structs with different field counts.
Fixed confusing error message on duplicate row_index.
Added `include_nulls` to `Agg::Count` CSE check.
Fixed view buffer exceeding 2^32 - 1 bytes in concatenate_view.
Fixed incorrect result selecting `pl.len()` from `scan_csv` with `skip_lines`.
Allowed for IO plugins with reordered columns in streaming.
Fixed inconsistency in `str.zfill` method when string contained leading '+'.
Fixed integer underflow in `propagate_nulls`.
Fixed setting `compat_level=0` for `sink_ipc`.
Narrowed return type for `DataType.is_`, improving Pyright's type completeness.
Supported arrow Decimal32 and Decimal64 types.
Guarded against dictionaries being passed to projection keywords.
Updated arrow format.
Fixed filter pushdown to IO plugins.
Improved numeric stability for rolling_mean<f32>.
Guarded against invalid nested objects in 'map_elements'.
Allowed subclasses in type equality checking.
Returned early in `pl.Expr.__array_ufunc__` when only single input.
Added inline implodes in type coercion.
Added `top_k_by` and `bottom_k_by` to Series.
Corrected `int_ranges` to raise error on invalid inputs.
Stopped silently overflowing for temporal casts.
Fixed error using `write_csv` with `storage_options`.
Fixed schema resolution for `.over(mapping_strategy="join")` with non-aggregations.
Ensured rename behaves the same as select.
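As a reference point for the `str.zfill` fix listed above, this is how Python's built-in `str.zfill` treats a leading sign. Whether the Polars method now matches this behavior exactly is an assumption; these notes do not say so.

```python
# Python's built-in str.zfill keeps a leading '+' or '-' in front
# and inserts the zero padding after the sign.
samples = ["42", "-42", "+42"]
padded = [s.zfill(5) for s in samples]
print(padded)  # ['00042', '-0042', '+0042']
```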

🔧 Affected Symbols

`scan_parquet`, `pivot`, `truncate`, `list.eval`, `arg_sort_by`, `rolling_min`, `rolling_max`, `rolling_mean<f32>`, `cum_sum`, `str.zfill`, `pl.Expr.__array_ufunc__`, `top_k_by`, `bottom_k_by`, `int_ranges`, `write_csv`, `register_io_source`, `scan_delta`, `collect(background=True)`, `collect_concurrently()`, `scan_csv`, `rename`, `select`

⚡ Deprecations

The `allow_missing_columns` parameter in `scan_parquet` is deprecated in favor of the new `missing_columns` parameter.
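One way to migrate call sites is a small shim that rewrites the old keyword before the arguments reach `scan_parquet`. The sketch below is plain Python; the helper name is hypothetical, and the mapping of `True` to `"insert"` and `False` to `"raise"` is an assumption about the accepted `missing_columns` values that you should check against the documentation for your version.

```python
def migrate_scan_parquet_kwargs(kwargs):
    """Return a copy of `kwargs` with `allow_missing_columns` rewritten.

    Hypothetical helper. The "insert"/"raise" values are an assumption
    about what `missing_columns` accepts.
    """
    migrated = dict(kwargs)
    if "allow_missing_columns" in migrated:
        allow = migrated.pop("allow_missing_columns")
        # Leave an explicitly passed `missing_columns` untouched.
        migrated.setdefault("missing_columns", "insert" if allow else "raise")
    return migrated


old_call = {"allow_missing_columns": True, "low_memory": False}
new_call = migrate_scan_parquet_kwargs(old_call)
print(new_call)
```

The result can then be passed along as `pl.scan_parquet(path, **new_call)`, with the original dictionary left unmodified.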