py-1.41.0

📦 polars
10 features · 🐛 49 fixes · 1 deprecation · 🔧 23 symbols

Summary

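This release adds LazyFrame.gather and Expr.is_empty, stabilizes the streaming engine and float16, and extends Polars SQL with true division for the / operator, the FILTER clause for aggregate functions, and STRING_AGG. It also fixes bugs across IPC, Parquet, and CSV reading, streaming joins, and expression evaluation, and deprecates the StringCache.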

✨ New Features

  • Add LazyFrame.gather (#27501)
  • Use true division for the / operator in Polars SQL (#27391)
  • Add Rust backend for Expr.has_nulls (#27590)
  • Stabilize float16 (#27607)
  • Add Expr.is_empty (#27583)
  • Add support for the SQL FILTER clause for aggregate functions, and STRING_AGG (#27564)
  • Add null_on_oob in {Expr/Series}.gather (#27327)
  • Stabilize streaming engine (#27497)
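
The gather entries above concern selecting elements by integer index. As a rough illustration only, here is a plain-Python model of what a gather with a null-on-out-of-bounds option could mean. It is not the Polars implementation, and these notes do not show the real signatures: the function name `gather`, the `IndexError`, and the handling of negative indices are assumptions made for this sketch, and the meaning of `null_on_oob` is inferred from its name.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def gather(
    values: Sequence[T],
    indices: Sequence[int],
    null_on_oob: bool = False,
) -> List[Optional[T]]:
    """Hypothetical model of gather-by-index; NOT the Polars implementation."""
    n = len(values)
    out: List[Optional[T]] = []
    for i in indices:
        # Assumption: negative indices count from the end, as in Python.
        pos = i + n if i < 0 else i
        if 0 <= pos < n:
            out.append(values[pos])
        elif null_on_oob:
            # Assumed meaning of the flag: out-of-bounds becomes a null.
            out.append(None)
        else:
            raise IndexError(f"index {i} is out of bounds for length {n}")
    return out


print(gather(["a", "b", "c"], [2, 0, -1]))                # ['c', 'a', 'c']
print(gather(["a", "b", "c"], [1, 5], null_on_oob=True))  # ['b', None]
```

With the flag off, the model rejects an out-of-bounds index instead of silently producing a null. Consult the Polars API reference for the actual parameters and error types.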

🐛 Bug Fixes

  • Panic in scan of empty IPC with slice (#27708)
  • Persist object_store rebuild state in cache (#27707)
  • Sort flag on GroupsType only applies to first element (#27684)
  • Invalid unwrap_unchecked when length isn't exact (#27685)
  • Don't unwrap channel send in streaming join_asof (#27688)
  • Fix merge_sorted panic when List in frame (#27568)
  • Put AsOf join buffered Morsels back at the front of the deque if they cannot be processed yet (#27658)
  • Fix skip_batches logic for NaN (#27673)
  • Raise TypeError when calling next() directly on GroupBy objects (#27562)
  • Data type comparison for extension types (#27632)
  • Share last-morsel split budget across files in streaming multi-scan (#27630)
  • Bytes scalars were not being broadcast in dataframe constructor (#27621)
  • Reset the sort-options in Series::is_sorted() after row-encoding columns (#27614)
  • Rayon deadlock with re-entrant io sources (#27600)
  • Don't push negative-offset slices through HConcat (#27570)
  • Logic error in streaming is_empty (#27602)
  • Fix incorrect CSE with large is_in literal (#27575)
  • AnonymousFunction can qualify as SQL aggregator (#26986)
  • Fix CSPE panic in cloud (#27594)
  • Set merge-join streaming node to Finished if its sending port is Done (#27572)
  • Widen decimal precision on sum aggregation at runtime (#27579)
  • Fix str.to_time raising unnecessarily when input is all nulls (#27574)
  • Prevent panic when switching from one extension dtype to another (#27566)
  • Fix DataFrame.write_database(..., if_table_exists="append", engine="adbc") not handling missing tables correctly (#26913)
  • Ensure json_decode doesn't fail for Date and Time string deserialization (#27554)
  • Incorrect RUSTFLAGS passing in Makefile (#27555)
  • Fix panic on reading IPC with 0-row compressed bitmap (#27551)
  • Set HEAD_RESPONSE_SIZE_ESTIMATE to 0 (#27548)
  • Fix lazy horizontal concat not raising on mismatched heights after projection pushdown (#27506)
  • Prevent join panic when suffix="" and coalesce=True (#27376)
  • Do not make a FastCount for csv if pre_slice is set (#27536)
  • Support duplicate names in over (#27544)
  • Reassign sequence numbers when distributing input morsels in streaming AsOf join node (#27538)
  • Do not reverse dataframes when sorting with all-null key columns (#27517)
  • Incorrect length check on streaming zip (#27505)
  • Remove invalid type annotation Sequence[int] from DataFrame.__setitem__ key (#27355)
  • Respect nulls_last for descending over(order_by) in group_by().agg() (#27486)
  • Fix perf regression in scan_csv select(len()) when collected on streaming engine (#27504)
  • Harden extend strictness (#27476)
  • Prevent deadlock when using to_arrow() in a multithreaded context (#27472)
  • Do not flatten sliced union (#27466)
  • Prevent deadlock when using to_pandas() in multithreaded context (#27451)
  • Struct rechunk bug and add Series::with_validity (#27446)
  • Handle column indexing in read_parquet/read_csv with pyarrow reader (#27397)
  • Export enum as ordered dictionary to arrow (#27432)
  • Ensure sample() respects shuffle=False (#27248)
  • Return empty DataFrame from concat_list with lit and empty column (#27305)
  • Read parquet MAP columns without LogicalType annotation (#27404)
  • Raise DuplicateError on parquet files with duplicate column names (#27399)

⚡ Deprecations

  • Deprecate the StringCache (#27580)