rs-0.49.0

📦 polars
1 breaking · 14 features · 🐛 46 fixes · 🔧 45 symbols

Summary

This release removes the old streaming engine, introduces native Iceberg positional delete support, and includes numerous performance optimizations and bug fixes across various data types and operations.

⚠️ Breaking Changes

  • The old streaming engine has been removed. Users relying on the previous implementation must update their code to use the new streaming engine.

✨ New Features

  • Native implementation for Iceberg positional deletes.
  • Make match_chunks public.
  • Implement StructFunction expressions in into_py.
  • Basic implementation of DataTypeExpr in Rust DSL.
  • Added required: bool to ParquetFieldOverwrites.
  • Support serializing name.map_fields.
  • Support serializing Expr::RenameAlias.
  • Add keys column in finish_callback.
  • Add extra_columns parameter to scan_parquet.
  • Add CORR function to polars SQL.
  • Add per partition sort and finish callback to sinks.
  • Add and test DataFrame equality functionality.
  • Support descendingly-sorted values in search_sorted().
  • Derive DSL schema.
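
To illustrate what searching descendingly sorted values involves, here is a minimal pure-Python sketch of a binary search over a descending list. It is not the Polars implementation; the function name `search_sorted_desc` and its `side` parameter are hypothetical, and the notes do not say how Polars handles ties.

```python
def search_sorted_desc(values, target, side="left"):
    """Return the insertion index for `target` in a descendingly sorted list.

    Hypothetical helper for illustration only; not part of Polars.
    """
    lo, hi = 0, len(values)
    while lo < hi:
        mid = (lo + hi) // 2
        # In descending order, strictly larger values sit to the left.
        # With side="right", equal values are also skipped over.
        if values[mid] > target or (side == "right" and values[mid] == target):
            lo = mid + 1
        else:
            hi = mid
    return lo


values = [9, 7, 7, 4, 1]
left_idx = search_sorted_desc(values, 7)            # before the run of 7s
right_idx = search_sorted_desc(values, 7, "right")  # after the run of 7s
print(left_idx, right_idx)
```

Inserting `target` at the returned index keeps the list in descending order, which is the property an ascending-only search cannot guarantee on such input.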

🐛 Bug Fixes

  • Restrict custom aggregate_function in pivot to pl.element().
  • Don't leak SourceToken in in-memory sink linearize.
  • Fix panic reading empty parquet with multiple boolean columns.
  • Raise ComputeError instead of panicking in truncate when mixing month/week/day/sub-daily units.
  • Materialize list.eval with unknown type.
  • Only set sorting flag for 1st column with PQ SortingColumns.
  • Fix typo in AExprBuilder.
  • Fix null return from var/std on a scalar column.
  • Support Datetime broadcast in list.concat.
  • Ensure projection pushdown maintains right table schema.
  • Don't create i128 scalars if dtype-128 is not set.
  • Add Null dtype support to arg_sort_by.
  • Raise error by default on invalid CSV quotes.
  • Fix group_by mean and median returning all nulls for Decimal dtype.
  • Fix hive partition pruning not filtering out __HIVE_DEFAULT_PARTITION__.
  • Fix AssertionError when using scan_delta() on AWS with storage_options.
  • Fix deadlock on collect(background=True) / collect_concurrently().
  • Fix incorrect null count in rolling_min/max.
  • Preserve file:// in LazyFrame node traverser.
  • Respect column order in register_io_source schema.
  • Fix incorrect output when using sort with group_by and cum_sum.
  • Implement owned arithmetic for Int128.
  • Do not schema-match structs with different field counts.
  • Fix confusing error message on duplicate row_index.
  • Add include_nulls to Agg::Count CSE check.
  • Fix view buffer exceeding 2^32 - 1 bytes in concatenate_view.
  • Fix incorrect size_hint() for FlatIter.
  • Fix incorrect result selecting pl.len() from scan_csv with skip_lines.
  • Allow for IO plugins with reordered columns in streaming.
  • Fix str.zfill being inconsistent with Python and pandas when the string contains a leading '+'.
  • Fix integer underflow in propagate_nulls.
  • Fix cum_min and cum_max not preserving inf or -inf values at series start.
  • Fix setting compat_level=0 for sink_ipc.
  • Support arrow Decimal32 and Decimal64 types.
  • Update arrow format.
  • Fix filter pushdown to IO plugins.
  • Improve numeric stability of rolling_mean<f32>.
  • Allow subclasses in type equality checking.
  • Return early in pl.Expr.__array_ufunc__ when only single input.
  • Add inline implodes in type coercion.
  • Correct int_ranges to raise error on invalid inputs.
  • Set the sorted flag on Array after it is sorted.
  • Don't silently overflow for temporal casts.
  • Fix error using write_csv with storage_options.
  • Fix schema resolution for .over(mapping_strategy="join") with non-aggregations.
  • Ensure rename behaves the same as select.
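
For the str.zfill fix, the reference behavior is that of Python's built-in `str.zfill`, which keeps a leading sign in front of the zero padding. The snippet below shows only that built-in behavior; it does not call Polars, and the sample strings are made up for illustration.

```python
# Python's built-in str.zfill pads with zeros *after* a leading '+' or '-'.
samples = ["+1", "-1", "1", "+123"]  # illustrative inputs, not from the release notes
padded = [s.zfill(4) for s in samples]
print(padded)  # ['+001', '-001', '0001', '+123']
```

Strings that already have the requested width, such as "+123" here, are returned unchanged.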

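The cum_min / cum_max fix concerns infinite values at the start of a series. As a reference for the expected semantics, here is a pure-Python sketch using `itertools.accumulate`; it does not use Polars, and the input values are invented for illustration.

```python
import math
from itertools import accumulate

# A running minimum that starts at +inf must report inf for the first element.
values = [math.inf, 3.0, 1.0, 2.0]
cum_min = list(accumulate(values, min))
print(cum_min)  # [inf, 3.0, 1.0, 1.0]

# A running maximum that starts at -inf must report -inf for the first element.
neg_start = [-math.inf, 3.0, 5.0]
cum_max = list(accumulate(neg_start, max))
print(cum_max)  # [-inf, 3.0, 5.0]
```
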
🔧 Affected Symbols

ParquetFieldOverwrites, FetchedCredentialsCache, scan_parquet, search_sorted, pivot, SourceToken, truncate, list.eval, PQ SortingColumns, AExprBuilder, var/std, list.concat, arg_sort_by, scan_delta(), collect(background=True), collect_concurrently(), rolling_min/max, LazyFrame, register_io_source, sort, group_by, cum_sum, Int128, row_index, Agg::Count, concatenate_view, FlatIter, pl.len(), scan_csv, str.zfill, propagate_nulls, cum_min, cum_max, sink_ipc, rolling_mean<f32>, pl.Expr.__array_ufunc__, int_ranges, Array, write_csv, pl.cumulative_eval, expr.meta, RenameAliasFn, Logical, Categories, CategoricalMapping