rs-0.47.0

📦 polars
1 breaking · 51 features · 🐛 3 fixes · 🔧 12 symbols

Summary

This release brings performance improvements across the streaming engine, new SQL functions, broader support for array types, and new sink configurations. It contains one breaking change: the bottom interval in `hist` is now closed.

⚠️ Breaking Changes

  • Make bottom interval closed in `hist` (#22090)
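
To illustrate what a closed bottom interval means for binning, here is a minimal pure-Python sketch. It is not the Polars implementation. It assumes right-closed bins `(a, b]` and that the change only affects whether the lowest bin also includes its left edge. The function name `hist_counts` and the `bottom_closed` flag are hypothetical and exist only for this example.

```python
from bisect import bisect_left


def hist_counts(values, edges, bottom_closed=True):
    """Count values per bin for sorted, strictly increasing bin edges.

    Bins are right-closed, (a, b]. With bottom_closed=True the first bin
    is treated as [a, b], so a value equal to the lowest edge is counted.
    Hypothetical helper for illustration only, not a Polars API.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v == edges[0]:
            # The lowest edge belongs to the first bin only when it is closed.
            if bottom_closed:
                counts[0] += 1
            continue
        if v < edges[0] or v > edges[-1]:
            continue  # outside the covered range
        # bisect_left returns the index of the first edge >= v,
        # so the bin to the left of that edge is (edges[i], edges[i + 1]].
        i = bisect_left(edges, v) - 1
        counts[i] += 1
    return counts


values = [0, 0.5, 1, 2.5, 3]
edges = [0, 1, 2, 3]
closed = hist_counts(values, edges, bottom_closed=True)
open_ = hist_counts(values, edges, bottom_closed=False)
```

In this sketch the two results differ only in the first bin: a value sitting exactly on the lowest edge is counted when the bottom interval is closed and dropped when it is open. Code that relied on such values being excluded may see different counts after upgrading.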

✨ New Features

  • Enable common subplan elimination across plans in `collect_all` (#21747)
  • Add lazy sinks (#21733)
  • Add `PartitionByKey` for new streaming sinks (#21689)
  • Enable new streaming memory sinks by default (#21589)
  • Support grouping by `pl.Array` (#22575)
  • Support BinaryOffset serde (#22528)
  • Show physical stage graph (#22491)
  • Add structure for dispatching iceberg to native scans (#22405)
  • Add SQL support for checking array values with `IN` and `NOT IN` expressions (#22487)
  • Add `RoundMode` for Decimal and Float (#22248)
  • Add `rolling_kurtosis` (#22335)
  • Add `.sort(nulls_last=True)` to booleans, categoricals and enums (#22300)
  • Add rolling min/max for temporals (#22271)
  • Support literal:list agg (#22249)
  • Support `implode + agg` (#22230)
  • Dispatch scans to new-streaming by default (#22153)
  • Expose `FunctionIR::FastCount` in the python visitor (#22195)
  • Add `SPLIT_PART` string function to the SQL interface (#22158)
  • Allow scalar expr in `Expr.diff` (#22142)
  • Support additional unsigned int aliases in the SQL interface (#22127)
  • Add `STRING_TO_ARRAY` function to the SQL interface (#22129)
  • Add dt.is_business_day (#21776)
  • Add support for `Int128` parsing/recognition to the SQL interface (#22104)
  • Allow sinking to abstract python `io` and `fs` classes (#21987)
  • Add `add_alp_optimize_exprs` to `IRBuilder` (#22061)
  • Add `cat.slice` (#21971)
  • Support growing schema if line length increases during CSV schema inference (#21979)
  • Add support for io-plugins in new-streaming (#21870)
  • Add `PartitionParted` (#21788)
  • Add DoubleEndedIterator for CatIter (#21816)
  • Add `polars_testing` folder with relevant files and `add_series_equal!()` functionality (#21722)
  • Allow to use `repeat_by` with (nested) lists and structs (#21206)
  • Add support for rolling_(sum/min/max) for booleans through casting (#21748)
  • Support multi-column sort for all nested types and nested search-sorted (#21743)
  • Add `mkdir` flag to sinks (#21717)
  • Enable joins on list/array dtypes (#21687)
  • Add a config option to specify the default engine to attempt to use during lazyframe calls (#20717)
  • Support all elementwise functions in IO plugin predicates (#21705)
  • Stabilize Enum datatype (#21686)
  • Support Polars int128 in from arrow (#21688)
  • Cloud support for new-streaming scans and sinks (#21621)
  • Add len method to arr (#21618)
  • Closeable files on unix (#21588)
  • Add new `PartitionMaxSize` sink (#21573)
  • Implement `unpack_dtypes()` functionality with unit tests (#21574)
  • Support engine callback for `LazyFrame.profile` (#21534)
  • Dispatch new-streaming CSV negative slice to separate node (#21579)
  • Add NDJSON source to new streaming engine (#21562)
  • Add lossy decoding to `read_csv` for non-utf8 encodings (#21433)
  • Add 'nulls_equal' parameter to `is_in` (#21426)
  • Support writing `Time` type in json (#21454)

🐛 Bug Fixes

  • Fix quadratic behavior when casting Enums (#22008)
  • Fix replace flags (#21731)
  • Fix pathologic `rolling + group-by` performance and memory explosion (#21403)

🔧 Affected Symbols

`hist`, `collect_all`, `pl.Array`, `DataFrame`, `Series`, `torch.Tensor`, `Expr.diff`, `dt.is_business_day`, `cat.slice`, `repeat_by`, `is_in`, `LazyFrame.profile`