py-1.21.0
📦 polars
✨ 24 features · 🐛 22 fixes · 🔧 22 symbols
Summary
This release focuses heavily on performance improvements through increased use of BitmapBuilder and enhancements across the streaming engine, including new CSV and NDJson sinks. Numerous bug fixes address issues related to Decimal types, slicing, joins, and error handling.
Migration Steps
- If you were relying on `nest-asyncio`, remove the dependency as it is replaced by custom logic.
✨ New Features
- Stabilize methods/functions
- Add `linear_space`
- Improve string → temporal parsing in `read_excel` and `read_ods`
- Implement `df.unique()` on new-streaming engine
- Experimental credential provider support for Delta read/scan/write
- Allow column expressions in DataFrame `unnest`
- Auto-initialize Python credential providers in more cases
- Add unique operations for Decimal dtype
- Add NDJson sink for the new streaming engine
- Support nested keys in window functions
- Add CSV sink for the new streaming engine
- Periodically check Python signals ('CTRL-C' handling)
- Experimental unity catalog client
- Support cumulative aggregations for `Decimal` dtype
- Account for SurrealDB Python API updates (handle both `SurrealDB` and `AsyncSurrealDB` classes) in `read_database`
- Drop `nest-asyncio` in favor of custom logic
- Improve window function caching strategy
- Support `lakefs://` URI for delta scanner
- Additional support for loading `numpy.float16` values (as Float32)
- Implement negative slice for new streaming IPC
- Debloat Series bitops
- Reduce Python map bloat
- Dispatch to the in-mem engine for `AExpr::Gather`
- Dispatch to the in-memory engine for multifile sources
🐛 Bug Fixes
- Warn if asof keys not sorted
- Ensure explicit values given to `column_widths` override autofit in `write_excel`
- Avoid name collisions and panicking in object conversion
- Incorrect scale used in `log` and `exp` for Decimal type
- Don't deep clone `ManuallyDrop` in `GroupsPosition`
- Fix DuplicateError when selecting columns after `join_where` or cross join + filter
- Incorrect `Decimal` value for `fill_null(strategy="one")`
- Fix one edge case (out of many) of int128 literals not working
- Add height check to frame-level row indexing when key is int
- Remove `assert` that panics on `group_by` followed by `head(n)`, where `n` is larger than the frame height
- Selectors should raise on `+` between themselves
- Fix panic `InvalidHeaderValue` scanning from S3 on Windows
- Fix `clip` for `Decimal` returning wrong values
- Incorrect height from slicing after projecting only the file path column
- Shift mask when skipping Bitpacked values in Parquet
- Error instead of truncate if length mismatch for several `str` functions
- Support cumulative aggregations for `Decimal` dtype
- Allow `is_in` values to be given as custom `Collection`
- Propagate null instead of panicking in `pl.repeat_by()`
- Do not print sensitive information to output on `POLARS_VERBOSE`
- Ignore file cache allocation error if `fallocate()` is not permitted
- Incorrect logic in `assert_series_equal` for infinities
🔧 Affected Symbols
`read_excel`, `read_ods`, `df.unique()`, `unnest`, `read_database`, `SurrealDB`, `AsyncSurrealDB`, `nest-asyncio`, delta scanner, `numpy.float16`, `write_excel`, `column_widths`, `log`, `exp`, `Decimal`, `fill_null`, `group_by`, `head(n)`, `clip`, `pl.repeat_by()`, `assert_series_equal`, `AExpr::Gather`