
py-1.34.0-beta.4

📦 polars
✨ 14 features · 🐛 33 fixes · 🔧 13 symbols

Summary

This release introduces new batch collection methods (`LazyFrame.{sink,collect}_batches`), significant performance optimizations across scanning and expressions, and numerous bug fixes, especially around Iceberg and streaming operations.

✨ New Features

  • Added `LazyFrame.{sink,collect}_batches` methods.
  • Ensured deterministic import order for Python Polars package variants.
  • Support scanning from `file:/path` URIs.
  • Log which file the schema was sourced from, and which file caused an extra column error.
  • Added support for displaying lazy query plans in marimo notebooks without installing matplotlib or mermaid.
  • Added unstable `hidden_file_prefix` parameter to `scan_parquet`.
  • Use fixed-scale Decimals.
  • Added support for unsigned 128-bit integers.
  • Added unstable `pl.Config.set_default_credential_provider`.
  • Roundtrip `BinaryOffset` type through Parquet.
  • Added opt-in unstable functionality to load interval types as `Struct`.
  • Support reading parquet metadata from cloud storage.
  • Added user guide section on AWS role assumption.
  • Support `unique` / `n_unique` / `arg_unique` for `array` columns.

🐛 Bug Fixes

  • Widen `from_dicts` to accept `Iterable[Mapping[str, Any]]`.
  • Fix `unsupported arrow type Dictionary` error in `scan_iceberg()`.
  • Raise an exception instead of panicking when unnesting a non-struct column.
  • Include missing feature dependency from `polars-stream/diff` to `polars-plan/abs`.
  • Fix newline escaping in streaming `show_graph`.
  • Do not allow inferring (`-1`) any dimension except the first in `Expr.reshape`.
  • Fix early stopping of sink batches on the in-memory engine.
  • More precisely model expression ordering requirements.
  • Fix panic in zero-weight rolling mean/var.
  • Fix Decimal <-> literal arithmetic supertype rules.
  • Match various aggregation return types in the streaming engine with the in-memory engine.
  • Validate list type for list expressions in planner.
  • Fix `scan_iceberg()` storage options not taking effect.
  • Have `log()` prioritize the leftmost dtype for its output dtype.
  • Fix incorrect `pl.len()` results on CSV scans.
  • Add support for float inputs for duration types.
  • Roundtrip empty string through hive partitioning.
  • Fix potential OOB writes in unaligned IPC read.
  • Fix regression error when scanning AWS presigned URL.
  • Make `PlPath::join` for cloud paths replace on absolute paths.
  • Correct dtype for cum_agg in streaming engine.
  • Restore support for `np.datetime64()` in `pl.lit()`.
  • Ignore Iceberg list element ID if missing.
  • Fix panic on streaming full join with coalesce.
  • Fix `AggState` on `all_literal` in `BinaryExpr`.
  • Show IR sort options in `explain`.
  • Fix schema on `ApplyExpr` with single row `literal` in agg context.
  • Fix planner schema for dividing `pl.Float32` by int.
  • Fix panic scanning from AWS legacy global endpoint URL.
  • Fix `iterable_to_pydf(..., infer_schema_length=None)` to scan all data.
  • Do not propagate struct of nulls with null.
  • Be stricter with invalid NDJSON input when `ignore_errors=False`.
  • Implement `approx_n_unique` for temporal dtypes and Null.

🔧 Affected Symbols

`LazyFrame.{sink,collect}_batches` · `scan_iceberg` · `Expr.reshape` · `pl.len()` · `pl.lit()` · `AggState` · `BinaryExpr` · `explain` · `ApplyExpr` · `pl.Float32` · `iterable_to_pydf` · `pl.Config.set_default_credential_provider` · `scan_parquet`