rs-0.53.0
📦 polars
✨ 60 features · 🐛 1 fix · 🔧 21 symbols
Summary
This release focuses heavily on performance improvements across various operations, especially in streaming contexts, and introduces significant enhancements to SQL support and data type handling, including new Extension types.
Migration Steps
- If you rely on specific behavior for qualified wildcard columns in SQL projections, review the improved disambiguation logic.
✨ New Features
- Add Extension types.
- Support for the SQL `FETCH` clause.
- Add get() to retrieve a byte from binary data.
- Support anonymous agg in the in-memory engine.
- Add unstable `arrow_schema` parameter to `sink_parquet`.
- Allow quantile to compute multiple quantiles at once.
- Allow empty LazyFrame in `LazyFrame.group_by(...).map_groups`.
- Add streaming UnorderedUnion.
- Implement compression support for sink_ndjson.
- Add unstable record batch statistics flags to `{sink,scan}_ipc`.
- Add compression support to write_csv and sink_csv.
- Add `scan_lines`.
- Support regex in `str.split`.
- Add unstable IPC Statistics read/write to `scan_ipc`/`sink_ipc`.
- Add nulls support for all rolling_by operations.
- Add `ArrowStreamExportable` support and `sink_delta`.
- Implement streaming decompression for CSV `COUNT(*)` fast path.
- Add nulls support for rolling_mean_by.
- Add lazy `collect_all`.
- Add streaming decompression for NDJSON schema inference.
- Expose record batch size in `{sink,write}_ipc`.
- Add `null_on_oob` parameter to `expr.get`.
- Suggest correct timezone if timezone validation fails.
- Support streaming IPC scan from S3 object store.
- Implement streaming CSV schema inference.
- Support hashing of meta expressions.
- Add pl.Expr.(min|max)_by.
- Implement or fix json encode/decode for (U)Int128, Categorical, Enum, Decimal.
- Expand scatter to more dtypes.
- Implement streaming CSV decompression.
- Add Series `sql` method for API consistency.
- Support Binary and Decimal in arg_(min|max).
- Allow Decimal parsing in str.json_decode.
- Add `shift` support for Object data type.
- Add node status to NodeMetrics.
- Allow scientific notation when parsing Decimals.
- Allow creation of `Object` literal.
- Add `bin.slice()`, `bin.head()`, and `bin.tail()` methods.
- Add SQL support for the `QUALIFY` clause.
- Add SQL syntax support for `CROSS JOIN UNNEST(col)`.
- Add separate env var to log tracked metrics.
- Expose fields for generating physical plan visualization data.
- Allow pl.Object in pivot value.
- Extend SQL `UNNEST` support to handle multiple array expressions.
- Support temporal `quantile` in rolling context.
- Add support for `Float16` dtype.
- Add strict parameter to pl.concat(how='horizontal').
- Add leftmost option to `str.replace_many / str.find_many / str.extract_many`.
- Add `quantile` for missing temporals.
- Expose and document pl.Categories.
- Support decimals in search_sorted.
- Add SQL support for named `WINDOW` references.
- Add `having` to `group_by` context.
- Allow elementwise `Expr.over` in aggregation context.
- Add SQL support for `ROW_NUMBER`, `RANK`, and `DENSE_RANK` functions.
- Automatically dictionary-encode floats when writing Parquet.
- Add `empty_as_null` and `keep_nulls` to `{Lazy,Data}Frame.explode`.
- Allow `hash` for all `List` dtypes.
- Support `unique_counts` for all datatypes.
- Add `maintain_order` to `Expr.mode`.
🐛 Bug Fixes
- Fix panic in `is_between` support in streaming Parquet predicate pushdown.