py-1.35.0
📦 polars
✨ 22 features · 🐛 46 fixes · ⚡ 2 deprecations · 🔧 83 symbols
Summary
This release focuses heavily on performance improvements across group-by operations, native aggregations, and data parsing. It also stabilizes Decimal support and introduces several new features, including ewm_mean() in the streaming engine and enhanced list/array aggregations. Two functions, Expr.agg_groups() and pl.groups(), have been deprecated.
Migration Steps
- If you rely on Expr.agg_groups() or pl.groups(), migrate away from them; both are deprecated in this release (the release notes do not name explicit replacements).
- If you initialize a Series from string values declared with a temporal dtype, note that this now behaves consistently with DataFrame initialization.
- If you encounter issues with Parquet empty struct roundtripping, check if setting the relevant environment variable resolves it.
✨ New Features
- Decimal support is stabilized.
- Support for ewm_mean() in the streaming engine.
- Improved row-count estimates.
- Introduction of a remote Polars MCP server.
- Allow local scans on polars cloud (configurable).
- Added Expr.item to strictly extract a single value from an expression.
- Added environment variable to roundtrip empty struct in Parquet.
- Fast-count for scan_iceberg().select(len()).
- Added glob parameter to scan_ipc.
- Added list.agg and arr.agg.
- Implemented {Expr,Series}.rolling_rank().
- Make Series init consistent with DataFrame init for string values declared with temporal dtype.
- Support MergeSorted in CSPE.
- Duration/interval string parsing is 2-5x faster.
- Recursively apply CSPE.
- Added streaming engine per-node metrics.
- Added arr.eval.
- Add union() function for unordered concatenation.
- Add name.replace to the set of column rename options.
- Support np.ndarray -> AnyValue conversion.
- Allow duration strings with positive leading "+".
- Add support for UInt128 to pyo3-polars.
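The duration-string changes above (2-5x faster parsing, and acceptance of a leading "+") can be illustrated with a toy parser. This sketch is not polars' implementation; the UNIT_SECONDS table is an assumption covering only the d/h/m/s units for demonstration:

```python
import re

# Toy parser for polars-style duration strings such as "1d2h" or "+30m".
# A leading "+" is now accepted and is simply a no-op sign.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

def parse_duration(s: str) -> int:
    """Return the total number of seconds encoded by a duration string."""
    if s.startswith("+"):
        s = s[1:]
    total = 0
    for value, unit in re.findall(r"(\d+)([dhms])", s):
        total += int(value) * UNIT_SECONDS[unit]
    return total
```

For example, `parse_duration("+1d2h")` and `parse_duration("1d2h")` both yield 93600 seconds.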
🐛 Bug Fixes
- Re-enabled CPU feature check before import.
- Implemented read_excel workaround for fastexcel/calamine issue loading a column subset from a named table.
- Fixed correctness of any(ignore_nulls) and an out-of-bounds access in all.
- Fixed streaming any/all with ignore_nulls=False.
- Fixed incorrect join_asof on a casted expression.
- Optimized memory on rolling groups in ApplyExpr.
- Fall back PyArrow scans to the in-memory engine.
- Make Operator::swap_operands return correct operators for Plus, Minus, Multiply and Divide.
- Capitalized letters after numbers in to_titlecase.
- Preserved null values in pct_change.
- Raised length mismatch on over with sliced groups.
- Checked duplicate name in transpose.
- Followed Kleene logic in any / all for group-by.
- Do not optimize cross join to IEJoin when order must be maintained.
- Fixed partially-unknown typing of scan_parquet.
- Properly released the GIL for read_parquet_metadata.
- Broadcasted partition_by columns in over expression.
- Cleared index cache on stacked df.filter expressions.
- Fixed 'explode' mapping strategy on scalar value.
- Fixed repeated with_row_index() after scan() being silently ignored.
- Correctly returned min and max for enums in groupby aggregation.
- Refactored BinaryExpr in group_by dispatch logic.
- Fixed aggstate for gather.
- Kept scalars for length preserving functions in group_by.
- Fixed duplicate select panic.
- Fixed inconsistency of list.sum() result type with None values.
- Fixed division by zero in Expr.dt.truncate.
- Fixed potential deadlock in __arrow_c_stream__.
- Allowed double aggregations in group-by contexts.
- Fixed Series.shrink_dtype for i128/u128.
- Fixed dtype in EvalExpr.
- Allowed aggregations on AggState::LiteralScalar.
- Dispatched to group_aware for fallible expressions with masked out elements.
- Fixed error for arr.sum() on small integer Array dtypes containing nulls.
- Fixed regression on write_database() to Snowflake due to unsupported string view type.
- Fixed XOR not following Kleene logic when one side is unit-length.
- Fixed incorrect precision in Series.str.to_decimal.
- Used overlapping instead of rolling.
- Fixed iterable on dynamic_group_by and rolling object.
- Used Kahan summation for in-memory groupby sum/mean.
- Released GIL in PythonScan predicate evaluation.
- Fixed type error in bitmask::nth_set_bit_u64.
- Added Expr.sign for Decimal datatype.
- Corrected str.replace with missing pattern.
- Ensured schema_overrides is respected when loading iterable row data.
- Supported decimal_comma on Decimal type in write_csv.
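One of the fixes above switches in-memory group-by sum/mean to Kahan (compensated) summation, which bounds floating-point rounding error regardless of group size. A minimal standalone sketch of the technique (not polars' internal code):

```python
def kahan_sum(values):
    """Compensated (Kahan) summation over an iterable of floats.

    Tracks a running error term `c` so that low-order bits lost in each
    addition are fed back into subsequent additions.
    """
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in values:
        y = x - c          # re-apply the previously lost bits
        t = total + y      # low-order bits of y may be lost here...
        c = (t - total) - y  # ...recover them algebraically
        total = t
    return total
```

The compensated result is never less accurate than a naive left-to-right sum, which matters for long columns of small values inside a group-by.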
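Several of the fixes above concern Kleene (three-valued) logic for any/all in the presence of nulls. A standalone sketch of Kleene `any`, with Python's None standing in for a null (an illustration of the semantics, not polars' implementation):

```python
def kleene_any(values):
    """Kleene-logic `any` over True/False/None values.

    Returns True if any value is True; None (unknown) if no value is True
    but at least one is null; otherwise False.
    """
    saw_null = False
    for v in values:
        if v is True:
            return True  # a single True dominates, even alongside nulls
        if v is None:
            saw_null = True
    return None if saw_null else False
```

Note the asymmetry: `[True, None]` is True (the null cannot change the outcome), while `[False, None]` is unknown (the null might have been True).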
🔧 Affected Symbols
foldhash, hashbrown, unique, n_unique, take{_slice,}_unchecked, skew, kurtosis, bitwise_*, group_by_dynamic, PyIceberg, filter/drop_nulls/drop_nans, cumulative_eval, DslPlan, null_count, any, all, reverse, arrow/parquet/IPC/pickle export, approx_n_unique, Duration/interval string parsing, first/last aggregation on Decimals, Categoricals and Enums, BitMapIter::nth, ewm_mean(), scan_iceberg, Expr.item, scan_ipc, list.agg, arr.agg, Expr.rolling_rank(), Series.rolling_rank(), read_database_uri, Series initialization (string values with temporal dtype), CSPE, arr.eval, read_database, iter_batches, rolling_(sum|mean), Data.unnest, LazyFrame.unnest, name.replace, np.ndarray -> AnyValue conversion, DataFrame load from list of dicts (schema_overrides cast), UInt128, read_excel, fastexcel/calamine, join_asof, ApplyExpr, scan_parquet, read_parquet_metadata, partition_by, over expression, df.filter, explode, with_row_index(), scan(), BinaryExpr, gather, range feature, dtype-array feature, list.sum(), Expr.dt.truncate, __arrow_c_stream__, Series.shrink_dtype, EvalExpr, AggState::LiteralScalar, arr.sum(), write_database() to Snowflake, XOR operation, Series.str.to_decimal, Expr.sign, str.replace, write_csv, Expr.agg_groups(), pl.groups(), LazyFrame.set_sorted, FunctionIR::Hint, GroupByPartitioned, element(), AExpr::Element, Expr::Element, ScanOptions, new_from_ipc, delta te
⚡ Deprecations
- Expr.agg_groups() is deprecated.
- pl.groups() is deprecated.