py-1.36.0
📦 polars
✨ 53 features · 🐛 54 fixes · 🔧 69 symbols
Summary
This release introduces significant enhancements across SQL support (including window functions and the QUALIFY clause), new DataFrame/Expr methods such as binary slicing and improved rolling calculations, and numerous performance optimizations, especially for the streaming engine and Parquet I/O. It also fixes many bugs related to panics, dtype handling, and SQL expression resolution.
✨ New Features
- Add Extension types
- Add SQL support for the QUALIFY clause
- Add bin.slice(), bin.head(), and bin.tail() methods
- Add SQL syntax support for CROSS JOIN UNNEST(col)
- Add separate env var to log tracked metrics
- Expose fields for generating physical plan visualization data
- Allow pl.Object in pivot value
- Minor improvement for as_struct repr
- Temporal quantile in rolling context
- Add quantile for missing temporals
- Add strict parameter to pl.concat(how='horizontal')
- Support decimals in search_sorted
- Expose and document pl.Categories
- Use reference to Graph pipes when flushing metrics
- Extend SQL UNNEST support to handle multiple array expressions
- Add SQL support for ROW_NUMBER, RANK, and DENSE_RANK functions
- Allow elementwise Expr.over in aggregation context
- Add SQL support for named WINDOW references
- Add leftmost option to str.replace_many / str.find_many / str.extract_many
- Automatically dictionary-encode floats when writing Parquet
- Support unique_counts for all datatypes
- Add maintain_order to Expr.mode
- Allow hash for all List dtypes
- Add empty_as_null and keep_nulls to {Lazy,Data}Frame.explode
- Display function of streaming physical plan map node
- Allow slice on scalar in aggregation context
- Allow implode and aggregation in aggregation context
- Move GraphMetrics into StreamingQuery
- Documentation on Polars Cloud manifests
- Add empty_as_null and keep_nulls flags to Expr.explode
- Allow Expr.unique on List/Array with non-numeric types
- Raise suitable error on non-integer "n" value for clear
- Allow Expr.rolling in aggregation contexts
- Allow bare .row() on a single-row DataFrame, equivalent to .item() on a single-element DataFrame
- Support additional forms of SQL CREATE TABLE statements
- Add support for Float16 dtype
- Support column-positional SQL "UNION" operations
- Add unstable Schema.to_arrow()
- Make DSL-hash skippable
- Improve error message on unsupported SQL subquery comparisons
- Support arbitrary expressions in SQL `JOIN` constraints
- Allow arbitrary expressions as the `Expr.rolling` `index_column`
- Set polars/<version> user-agent
- Support `ewm_var/std` in streaming engine
- Rewrite `IR::Scan` to `IR::DataFrameScan` in `expand_datasets` when applicable
- Add ignore_nulls to first / last
- Allow arbitrary Expressions in "subset" parameter of `unique` frame method
- Add `BIT_NOT` support to the SQL interface
- Streaming {Expr,LazyFrame}.rolling
- Add LazyFrame.pivot
- Add SQL support for `LEAD` and `LAG` functions
- Add having to group_by context
- Add show methods for DataFrame and LazyFrame
🐛 Bug Fixes
- Rechunk on nested dtypes in take_unchecked_impl parallel path
- Fix streaming SchemaMismatch panic on list.drop_nulls
- Fix panic on Boolean rolling_sum calculation for list or array eval
- Fix "dtype is unknown" panic in cross joins with literals
- Fix panic edge-case when scanning hive partitioned data
- Fix "unreachable code" panic in UDF dtype inference
- Address potential "batch_size" parameter collision in scan_pyarrow_dataset
- Fix empty format handling
- Improve SQL GROUP BY and ORDER BY expression resolution, handling aliasing edge-cases
- Preserve List inner dtype during chunked take operations
- Fix lifetime for AmortSeries lazy group iterator
- Fix spearman panicking on nulls
- Properly resolve HAVING clause during SQL GROUP BY operations
- Prevent false positives in is_in for large integers
- Differentiate between an empty list and no list for unpivot
- Fix bug in boolean unique_counts
- Fix hang in multi-chunk DataFrame .rows()
- Correct arr_to_any_value for object arrays
- Have PySeries::new_f16 receive pf16s instead of f32s
- Set Float16 parquet schema type to Float16
- Fix incorrect .list.eval after slicing operations
- Strict conversion AnyValue to Struct
- Rolling mean/median for temporals
- Add .rolling_rank() support for temporal types and pl.Boolean
- Fix occurrence of exact matches in .join_asof(strategy="nearest", allow_exact_matches=False, ...)
- Always respect return_dtype in map_elements and map_rows
- Fix group lengths check in sort_by with AggregatedScalar
- Fix dictionary replacement error in write_ipc()
- Fix expr slice pushdown causing shape error on literals
- Allow empty list in sort_by in list.eval context
- Raise error on out-of-range dates in temporal operations
- Validate list.slice parameters are not lists
- Make sum on strings error in group_by context
- Prevent panic when joining sorted LazyFrame with itself
- Apply CSV dict overrides by name only
- Incorrect result in aggregated first/last with ignore_nulls
- Fix off-by-one bug in `ColumnPredicates` generation for inequalities operating on integer columns
- Use Cargo.template.toml to prevent git dependencies from using template
- Fix arr.{eval,agg} in aggregation context
- Support AggregatedList in list.{eval,agg} context
- Nested dtypes in streaming first_non_null/last_non_null
- Remove Expr casts in pl.lit invocations
- Optimize projection pushdown through HConcat
- Revert pl.format behavior with nulls
- Correct eq_missing for struct with nulls
- Resolve edge-case with SQL aggregates that have the same name as one of the GROUP BY keys
- Unique on literal in aggregation context
- Aggregation with drop_nulls on literal
- SQL NATURAL joins should coalesce the key columns
- Mark {forward,backward}_fill as length_preserving
- Correct drop_items for scalar input
- Schema mismatch with list.agg, unique and scalar
- AnyValue::to_physical for categoricals
- Bugs in pl.from_repr with signed exponential floats
🔧 Affected Symbols
`pl.concat`, `bin.slice`, `bin.head`, `bin.tail`, `pl.Object`, `as_struct`, `pl.rolling`, `pl.concat(how='horizontal')`, `search_sorted`, `pl.Categories`, `str.replace_many`, `str.find_many`, `str.extract_many`, `Expr.over`, `Expr.mode`, `Expr.explode`, `DataFrame.explode`, `LazyFrame.explode`, `.row()`, `.item()`, `Schema.to_arrow`, `Expr.rolling`, `LazyFrame.group_by_dynamic`, `LazyFrame.rolling`, `Expr.rolling`, `first`, `last`, `unique`, `BIT_NOT`, `LEAD`, `LAG`, `group_by_dynamic`, `group_by`, `take_unchecked_impl`, `list.drop_nulls`, `rolling_sum`, cross joins, `scan_pyarrow_dataset`, `is_between`, `unpivot`, `unique_counts`, `arr_to_any_value`, `PySeries::new_f16`, `.list.eval`, `AnyValue`, `Struct`, `rolling_rank`, `join_asof`, `map_elements`, `map_rows`, `sort_by`, `write_ipc`, `list.slice`, temporal operations, sum on strings, `join`, CSV dict overrides, `ColumnPredicates`, `arr.{eval,agg}`, `list.{eval,agg}`, `first_non_null`, `last_non_null`, `pl.lit`, `HConcat`, `pl.format`, `eq_missing`, SQL aggregates, `drop_items`, `pl.from_repr`