py-1.27.0

📦 polars
2 breaking · 15 features · 🐛 50 fixes · 1 deprecation · 🔧 31 symbols

Summary

This release adds performance improvements and new SQL functions, and improves compatibility with Python I/O classes. It also fixes a number of bugs, notably around streaming engine stability and type handling, and makes breaking changes to the `hist` function and the Partition API.

⚠️ Breaking Changes

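  • The bottom (lowest) bin interval in `hist` is now closed, so values equal to the lowest bin edge are counted in the first bin.
  • The Partition API now uses the argument names `base_path` and `file_path` (the previous names are not specified here).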

Migration Steps

  1. If you were using the old Partition API, update calls to use `base_path` and `file_path` arguments.
  2. Review usage of `backward_fill` and `forward_fill` if you were relying on the deprecated interface.

✨ New Features

  • Added `SPLIT_PART` string function to the SQL interface.
  • Allowed a scalar expression in `Expr.diff`.
  • Supported additional unsigned int aliases in the SQL interface.
  • Added `STRING_TO_ARRAY` function to the SQL interface.
  • Added `dt.is_business_day`.
  • Added an `eager` parameter to `pl.cov`.
  • Added support for `Int128` parsing/recognition to the SQL interface.
  • Added an `eager` parameter to `pl.coalesce`.
  • Added an `eager` parameter to `pl.corr`.
  • Allowed sinking to abstract Python `io` and `fs` classes.
  • Added `add_alp_optimize_exprs` to `IRBuilder`.
  • Added `cat.slice`.
  • Supported growing the schema if the line length increases during CSV schema inference.
  • Replaced thread-unsafe `GilOnceCell` with `Mutex`.
  • Supported modified DSL in file cache.

🐛 Bug Fixes

  • Fixed `implode` in aggregation.
  • Reduced GIL hold time for IO plugins in new-streaming.
  • Enhanced predicate validation and cast safety in `join_where`.
  • Handled Parquet files with compressed empty DataPage v2.
  • Fixed schema error during lowering.
  • Rewrote unroll of overlapping groups to mitigate out of range index panic.
  • Fixed incorrect rounding for very large/small numbers.
  • Allowed set input to `list.set_*` operations.
  • Fixed deadlock in join due to rayon nested task-stealing.
  • Marked `Expr.repeat_by` as elementwise.
  • Fixed CSV serializer panic by supporting `ScalarColumn` in `as_single_chunk`.
  • Raised an error if a number has no associated unit in duration strings.
  • Added `i128` as supertype to boolean.
  • Fixed panic when constructing DataFrame from pyarrow due to duplicate field names.
  • Added broadcasts and error messages for many elementwise operations.
  • Raised an error for `n=0` in `list.gather_every`.
  • Raised an error for unsupported rolling operations.
  • Raised an error for unequal-length `str.to_integer` arguments.
  • Made bottom interval closed in `hist` (also listed as a breaking change).
  • Fixed relative path resolution for plugin libraries.
  • Avoided panic in `strptime` for out-of-bounds dates.
  • Fixed join revmaps for categoricals in `merge_sorted`.
  • Fixed glob expansion matching extra files.
  • Ensured SQL dot-notation for nested column fields resolves correctly.
  • Fixed Parquet filter performance regression from multiscan dispatch.
  • Fixed panic for unequal-length `ewm_mean_by` arguments.
  • Added scalarity checks to `pl.repeat`.
  • Fixed type check of `n` parameter of `pl.repeat`.
  • Marked `bitwise_{count,leading,trailing}_{ones,zeros}` as elementwise.
  • Marked `pl.*_ranges` functions correctly as elementwise.
  • Correctly type checked `pl.arctan2`.
  • Marked `pl.business_day_count` as elementwise.
  • Checked input python type for `str.extract_groups`.
  • Checked types for `fill_char` in `str.pad_{start,end}`.
  • Marked `str.to_decimal` properly as non-elementwise.
  • Documented return type for `bin.encode` and `bin.decode`.
  • Reverted #22017 and improved `block(_in_place)_on` doc comment.
  • Removed outdated depth warning.
  • Fixed `pl.concat` being incorrectly marked as elementwise.
  • Used `block_in_place_on` to start streaming.
  • Fixed panic on empty aggregation in streaming.
  • Raised an error instead of panicking for invalid durations in `dt.offset_by()` and `dt.round()`.
  • Raised error instead of silently appending NULL in NDJSON parsing.
  • Ensured AV is static before pushing to row buffer.
  • Fixed deadlock in new-streaming multiplexer.
  • Released GIL in `collect_with_callback`.
  • Fixed panic in new RegexCache.
  • Fixed type hint of `cs.exclude()` to be `SelectorType` instead of `Expr`.
  • Added correct deprecation warning for `.str.concat`.
  • Used absolute paths by default for plugins.

🔧 Affected Symbols

`hist`, Partition API, `Expr.diff`, `pl.cov`, `pl.coalesce`, `pl.corr`, `IRBuilder`, `cat.slice`, `backward_fill`, `forward_fill`, `implode`, `join_where`, `list.set_*`, `Expr.repeat_by`, `str.to_integer`, `list.gather_every`, `pl.repeat`, `bitwise_{count,leading,trailing}_{ones,zeros}`, `pl.*_ranges`, `pl.arctan2`, `pl.business_day_count`, `str.extract_groups`, `str.pad_{start,end}`, `str.to_decimal`, `bin.encode`, `bin.decode`, `block(_in_place)_on`, `dt.offset_by()`, `dt.round()`, `.str.concat`, `cs.exclude()`

⚡ Deprecations

  • The duplicate interface for `backward_fill` and `forward_fill` has been deprecated.