rs-0.47.0
📦 polars
⚠ 1 breaking · ✨ 51 features · 🐛 3 fixes · 🔧 12 symbols
Summary
This release introduces significant performance improvements across the streaming engine, new SQL functions, and enhanced support for array types and various sink configurations. It contains one breaking change: the bottom interval in `hist` is now closed.
⚠️ Breaking Changes
- Make bottom interval closed in `hist` (#22090)
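To make the effect of this change concrete, here is a minimal pure-Python sketch of histogram binning. It does not call Polars. The function name `hist_counts` and the `closed_bottom` flag are hypothetical, and the sketch assumes that bins are right-closed, `(lo, hi]`, and that the change makes only the lowest bin closed on both ends, `[lo, hi]`.

```python
from bisect import bisect_left


def hist_counts(values, edges, closed_bottom=True):
    """Count values per bin for sorted bin edges.

    Assumed semantics (illustrative, not taken from the Polars source):
    every bin is right-closed, (lo, hi]. With closed_bottom=True the
    lowest bin also includes its lower edge, i.e. it becomes [lo, hi].
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Values outside the overall range are not counted.
        if v < edges[0] or v > edges[-1]:
            continue
        if v == edges[0]:
            # The lowest edge is only counted when the bottom interval is closed.
            if closed_bottom:
                counts[0] += 1
            continue
        # bisect_left returns the index of the first edge >= v, so for
        # right-closed bins the value belongs to the bin just before it.
        counts[bisect_left(edges, v) - 1] += 1
    return counts


edges = [0, 1, 2, 3]
values = [0, 0.5, 1, 1.5, 2, 3]
new_counts = hist_counts(values, edges, closed_bottom=True)
old_counts = hist_counts(values, edges, closed_bottom=False)
```

Under these assumptions, a value equal to the lowest edge (`0` here) lands in the first bin when the bottom interval is closed and is dropped otherwise, so code that relied on the previous counts for the first bin may see a difference.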
✨ New Features
- Enable common subplan elimination across plans in `collect_all` (#21747)
- Add lazy sinks (#21733)
- Add `PartitionByKey` for new streaming sinks (#21689)
- Enable new streaming memory sinks by default (#21589)
- Support grouping by `pl.Array` (#22575)
- Support BinaryOffset serde (#22528)
- Show physical stage graph (#22491)
- Add structure for dispatching iceberg to native scans (#22405)
- Add SQL support for checking array values with `IN` and `NOT IN` expressions (#22487)
- Add `RoundMode` for Decimal and Float (#22248)
- Add `rolling_kurtosis` (#22335)
- Add `.sort(nulls_last=True)` to booleans, categoricals and enums (#22300)
- Add rolling min/max for temporals (#22271)
- Support literal:list agg (#22249)
- Support `implode + agg` (#22230)
- Dispatch scans to new-streaming by default (#22153)
- Expose `FunctionIR::FastCount` in the python visitor (#22195)
- Add `SPLIT_PART` string function to the SQL interface (#22158)
- Allow scalar expr in `Expr.diff` (#22142)
- Support additional unsigned int aliases in the SQL interface (#22127)
- Add `STRING_TO_ARRAY` function to the SQL interface (#22129)
- Add `dt.is_business_day` (#21776)
- Add support for `Int128` parsing/recognition to the SQL interface (#22104)
- Allow sinking to abstract python `io` and `fs` classes (#21987)
- Add `add_alp_optimize_exprs` to `IRBuilder` (#22061)
- Add `cat.slice` (#21971)
- Support growing schema if line length increases during CSV schema inference (#21979)
- Add support for io-plugins in new-streaming (#21870)
- Add `PartitionParted` (#21788)
- Add `DoubleEndedIterator` for `CatIter` (#21816)
- Add `polars_testing` folder with relevant files and `add_series_equal!()` functionality (#21722)
- Allow `repeat_by` to be used with (nested) lists and structs (#21206)
- Add support for rolling_(sum/min/max) for booleans through casting (#21748)
- Support multi-column sort for all nested types and nested search-sorted (#21743)
- Add `mkdir` flag to sinks (#21717)
- Enable joins on list/array dtypes (#21687)
- Add a config option to specify the default engine to attempt to use during `LazyFrame` calls (#20717)
- Support all elementwise functions in IO plugin predicates (#21705)
- Stabilize Enum datatype (#21686)
- Support Polars int128 in from arrow (#21688)
- Cloud support for new-streaming scans and sinks (#21621)
- Add `len` method to `arr` (#21618)
- Closeable files on unix (#21588)
- Add new `PartitionMaxSize` sink (#21573)
- Implement `unpack_dtypes()` functionality with unit tests (#21574)
- Support engine callback for `LazyFrame.profile` (#21534)
- Dispatch new-streaming CSV negative slice to separate node (#21579)
- Add NDJSON source to new streaming engine (#21562)
- Add lossy decoding to `read_csv` for non-utf8 encodings (#21433)
- Add 'nulls_equal' parameter to `is_in` (#21426)
- Support writing `Time` type in json (#21454)
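For readers unfamiliar with the two SQL string functions added above, `SPLIT_PART` and `STRING_TO_ARRAY`, the following pure-Python sketch illustrates the behaviour these names conventionally have in PostgreSQL. It does not call Polars, and the exact edge-case handling in the Polars SQL interface is an assumption here, not something these notes confirm. The helper names `split_part` and `string_to_array` are hypothetical stand-ins.

```python
def split_part(text, delimiter, n):
    """Return the n-th field (1-indexed) of text split on delimiter.

    Modeled on PostgreSQL's split_part for positive n: an out-of-range
    position yields an empty string. Positions below 1 are rejected
    in this sketch to keep it simple.
    """
    if n < 1:
        raise ValueError("field position must be >= 1 in this sketch")
    parts = text.split(delimiter)
    # Positions past the last field give an empty string, not an error.
    return parts[n - 1] if n <= len(parts) else ""


def string_to_array(text, delimiter):
    """Split text on delimiter and return the fields as a list.

    Modeled on PostgreSQL's string_to_array: an empty delimiter leaves
    the input as a single element (Python's str.split would raise).
    """
    if delimiter == "":
        return [text]
    return text.split(delimiter)


second = split_part("a,b,c", ",", 2)
missing = split_part("a,b,c", ",", 5)
fields = string_to_array("a,b,c", ",")
```

In SQL terms this corresponds to expressions such as `SPLIT_PART(col, ',', 2)` and `STRING_TO_ARRAY(col, ',')`; consult the Polars SQL documentation for the authoritative behaviour.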
🐛 Bug Fixes
- Fix quadratic behavior when casting Enums (#22008)
- Fix replace flags (#21731)
- Fix pathological `rolling + group-by` performance and memory explosion (#21403)
🔧 Affected Symbols
`hist`, `collect_all`, `pl.Array`, `DataFrame`, `Series`, `torch.Tensor`, `Expr.diff`, `dt.is_business_day`, `cat.slice`, `repeat_by`, `is_in`, `LazyFrame.profile`