py-1.33.0
📦 polars
⚠ 1 breaking · ✨ 17 features · 🐛 50 fixes · ⚡ 1 deprecation · 🔧 39 symbols
Summary
This release focuses on performance: many operations are now lowered natively to the streaming engine, and the query planner is better optimized. The one breaking change makes previously eager Exprs lazy-compatible.
⚠️ Breaking Changes
- Eager-only Exprs were removed, deprecated, or changed to be lazy-compatible. Code that relied on these expressions evaluating eagerly may now behave differently, or raise errors where it expected immediate results.
✨ New Features
- Native streaming for int_range with len or count.
- Lowered arg_unique natively to the streaming engine.
- Lowered arg_where natively to the streaming engine.
- Lowered Expr.shift to the streaming engine.
- Lowered order-preserving groupby to the streaming engine.
- Added CSE for custom io sources using pointer for hashing.
- Allow pl.Expr.log to take in an expression.
- Added caching to user credential providers.
- Exposed mkdir parameter on write_parquet.
- Implemented diff() in the streaming engine.
- Enabled Expr.diff(n) for negative n.
- Allow upcasting null-typed columns to nested column types in scans.
- Dropped PyArrow requirement for write_database with the ADBC engine.
- Added LazyFrame.pipe_with_schema.
- Added cum_* as native streaming nodes.
- Added peak_{min,max} support for booleans.
- Added DataFrame.map_columns for eager evaluation.
🐛 Bug Fixes
- Fixed invalid conversion from non-bit numpy bools.
- Made dt.epoch('s') serializable.
- Made Expr.rechunk serializable.
- Fixed schema mismatch for 'log' operation.
- Fixed incorrect first/last aggregate in streaming engine.
- Fixed group offsets in sliced groups.
- Fixed panic in inexact date(time) conversion.
- Kept DSL cache after serialization and deserialization.
- Sanitized and warned about eval usage.
- Corrected the default for include_index in the from_pandas overload.
- Fixed unique with keep="none" in new optimization pass.
- Corrected size limits for Decimal cast.
- Fixed unordered unions in check order observing pass.
- Fixed dtype for slice on Literal in agg context.
- Fixed incorrect filter(lit(True)) when scanning hive.
- Fixed in-memory group_by on 128-bit integers.
- Fixed panic in gather inside groupby with invalid indices.
- Released the GIL in map_groups.
- Removed extra explode in LazyGroupBy.{head,tail}.
- Fixed panic in polars cloud CSV scan.
- Fixed panic when loading categorical columns from IO plugin.
- Fixed credential provider not auto-initializing on partition sinks.
- Fixed engine type for concat_list on AggScalar implode.
- Fixed rolling_mean handling centered weights with len(values) < window_size.
- Fixed reading is_in predicate for Parquet plain strings.
- Added support for native DuckDB connection in read_database.
- Made PyCategories pickleable.
- Removed unused unsound function to_mutable_slice.
- Fixed PyO3 extension types giving compat_level errors.
- Allowed non-elementwise by in top_k.
- Fixed sort_by for group_by_dynamic context.
- Fixed input-independent length aggregations in streaming.
- Released GIL when iterating df in to_arrow.
- Respected non-elementwise join_where conditions.
- Fixed mismatched pytest test collection error.
- Resolved schema mismatch for div on Boolean.
- Fixed from_repr parsing of negative durations.
- Made group_by/partition_by iterator keys tuple[Any, ...] to enable tuple-unpacking.
- Kept name when doing empty group-aware aggregation.
- Used Implode instead of reshape_list.
- Fixed incorrect weighted rolling mean when min_samples < window_size.
- Allowed merge_sorted for all types.
- Included datatypes in row_encode expression.
- Included UDF materialized type in serialization.
- Corrected .rolling() output type for non-aggregations.
- Corrected planner output schema for join_asof.
- Corrected output for fold and reduce.
- Fixed Expr.meta.output_name for struct fields.
- Ensured upcast operations on pl.Date default to microsecond precision.
- Fixed planner output type for mean with strange input type.
🔧 Affected Symbols
Expr, int_range, arg_unique, arg_where, Expr.shift, groupby, pl.Series.shift, pl.Expr.log, write_parquet, diff(), Expr.diff, pl.DataFrame.map_columns, dt.epoch, Expr.rechunk, from_pandas, pl.LazyFrame.pipe_with_schema, cum_*, peak_{min,max}, map_groups, LazyGroupBy.head, LazyGroupBy.tail, gather, to_arrow, group_by_dynamic, group_by, partition_by, rolling_mean, read_database, PyCategories, to_mutable_slice, top_k, sort_by, merge_sorted, row_encode, fold, reduce, Expr.meta.output_name, pl.Date, mean
⚡ Deprecations
- Deprecation warning added for pl.Series.shift(Null).