py-1.41.0
📦 polars
✨ 10 features · 🐛 49 fixes · ⚡ 1 deprecation · 🔧 23 symbols
Summary
This release introduces new features like LazyFrame.gather and stabilizes the streaming engine, alongside numerous performance improvements and bug fixes across data reading and expression evaluation.
✨ New Features
- Add LazyFrame.gather (#27501)
- Use true division for the / operator in Polars SQL (#27391)
- Add Rust backend for Expr.has_nulls (#27590)
- Stabilize float16 (#27607)
- Add Expr.is_empty (#27583)
- Add support for the SQL FILTER clause for aggregate functions, and STRING_AGG (#27564)
- Add null_on_oob in {Expr/Series}.gather (#27327)
- Stabilize streaming engine (#27497)
🐛 Bug Fixes
- Panic in scan of empty IPC with slice (#27708)
- Persist object_store rebuild state in cache (#27707)
- Sort flag on GroupsType only applies to first element (#27684)
- Invalid unwrap_unchecked when length isn't exact (#27685)
- Don't unwrap channel send in streaming join_asof (#27688)
- Fix merge_sorted panic when List in frame (#27568)
- Put buffered AsOf join morsels back at the front of the deque if they cannot be processed yet (#27658)
- Fix skip_batches logic for NaN (#27673)
- Raise TypeError when calling next() directly on GroupBy objects (#27562)
- Data type comparison for extension types (#27632)
- Share last-morsel split budget across files in streaming multi-scan (#27630)
- Bytes scalars were not being broadcast in dataframe constructor (#27621)
- Reset the sort-options in Series::is_sorted() after row-encoding columns (#27614)
- Rayon deadlock with re-entrant io sources (#27600)
- Don't push negative-offset slices through HConcat (#27570)
- Logic error in streaming is_empty (#27602)
- Fix incorrect CSE with large is_in literal (#27575)
- AnonymousFunction can qualify as SQL aggregator (#26986)
- Fix CSPE panic in cloud (#27594)
- Set merge-join streaming node to Finished if its sending port is Done (#27572)
- Widen decimal precision on sum aggregation at runtime (#27579)
- Fix str.to_time raising unnecessarily when the input is all nulls (#27574)
- Prevent panic when switching from one extension dtype to another (#27566)
- Fix DataFrame.write_database(..., if_table_exists="append", engine="adbc") not handling missing tables correctly (#26913)
- Ensure json_decode doesn't fail for Date and Time string deserialization (#27554)
- Incorrect RUSTFLAGS passing in Makefile (#27555)
- Fix panic on reading IPC with 0-row compressed bitmap (#27551)
- Set HEAD_RESPONSE_SIZE_ESTIMATE to 0 (#27548)
- Fix lazy horizontal concat not raising on mismatched heights after projection pushdown (#27506)
- Prevent join panic when suffix="" and coalesce=True (#27376)
- Do not make a FastCount for csv if pre_slice is set (#27536)
- Support duplicate names in over (#27544)
- Reassign sequence numbers when distributing input morsels in streaming AsOf join node (#27538)
- Do not reverse dataframes when sorting with all-null key columns (#27517)
- Incorrect length check on streaming zip (#27505)
- Remove invalid type annotation Sequence[int] from DataFrame.__setitem__ key (#27355)
- Respect nulls_last for descending over(order_by) in group_by().agg() (#27486)
- Fix perf regression in scan_csv select(len()) when collected on streaming engine (#27504)
- Harden extend strictness (#27476)
- Prevent deadlock when using to_arrow() in a multithreaded context (#27472)
- Do not flatten sliced union (#27466)
- Prevent deadlock when using to_pandas() in multithreaded context (#27451)
- Struct rechunk bug and add Series::with_validity (#27446)
- Handle column indexing in read_parquet/read_csv with pyarrow reader (#27397)
- Export enum as ordered dictionary to arrow (#27432)
- Ensure sample() respects shuffle=False (#27248)
- Return empty DataFrame from concat_list with lit and empty column (#27305)
- Read parquet MAP columns without LogicalType annotation (#27404)
- Raise DuplicateError on parquet files with duplicate column names (#27399)
Affected Symbols
LazyFrame.gather, StringCache, Expr.has_nulls, Expr.is_empty, list.shift, json_decode, to_numpy, select(len()), is_in, list.slice, DataFrame.write_database, str.to_time, DataFrame.__setitem__, Series::is_sorted(), to_arrow(), to_pandas(), Series::with_validity, read_parquet, read_csv, sample(), concat_list, DataFrame.__array__, Series.__array__
⚡ Deprecations
- Deprecate the StringCache (#27580)