
py-1.30.0

📦 polars · View on GitHub →

✨ 20 features · 🐛 44 fixes · 🔧 20 symbols

Summary

This release focuses heavily on performance improvements across various operations, including optimizer casts, parallelism, and elementwise execution. It introduces several new features like list.filter, LazyFrame.match_to_schema, and enhanced type inference, alongside numerous bug fixes addressing panics and incorrect outputs.

✨ New Features

  • Implemented list.filter.
  • Support BinaryOffset in search_sorted.
  • Add nulls_equal flag to list/arr.contains.
  • Implement LazyFrame.match_to_schema.
  • Improved time-string parsing and inference (generally, and via the SQL interface).
  • Allow .over to be called without partition_by.
  • Support AnyValue translation from PyMapping values.
  • Support optimised init from non-dict Mapping objects in from_records and frame/series constructors.
  • Support inference of Int128 dtype from databases that support it.
  • Add options to write Parquet field metadata.
  • Add cast_options parameter to control type casting in scan_parquet.
  • Allow casting List<UInt8> to Binary.
  • Allow setting of regex size limit using POLARS_REGEX_SIZE_LIMIT.
  • Support use of literal values as "other" when evaluating Series.zip_with.
  • Allow reading and writing custom file-level Parquet metadata.
  • Support PEP 702 @deprecated decorator behaviour.
  • Support grouping by pl.Array.
  • Preserve exception type and traceback for errors raised from Python.
  • Use fixed-width font in the streaming physical plan graph.
  • Load AWS endpoint_url using boto3.

🐛 Bug Fixes

  • Fix RuntimeError when serializing the same DataFrame from multiple threads.
  • Fix map_elements predicate pushdown.
  • Fix reverse list type.
  • Don't require numpy for search_sorted.
  • Add type equality checking for relevant methods.
  • Fix invalid output for fill_null after when.then on structs.
  • Don't panic for cross join with misaligned chunking.
  • Fix panic on quantile over nulls in rolling window.
  • Respect BinaryOffset metadata.
  • Correct the output order of PartitionByKey and PartitionParted.
  • Fallback to non-strict casting for deprecated casts.
  • Handle sliced out remainder for bitmaps.
  • Don't merge Enum categories on append.
  • Fix unnest() not working on empty struct columns.
  • Fix the default value type in Schema init.
  • Correct name in unnest error message.
  • Provide "schema" to DataFrame, even if empty.
  • Properly account for nulls in the is_not_nan check made in drop_nans.
  • Fix incorrect result from SQL count(*) with PARTITION BY.
  • Fix deadlock joining scanned tables with low thread count.
  • Don't allow deserializing incompatible DSL.
  • Fix incorrect null dtype from binary ops in empty group_by.
  • Don't mark str.replace_many with Mapping as deprecated.
  • Gzip has maximum compression of 9, not 10.
  • Fix predicate pushdown of fallible expressions.
  • Fix index out of bounds panic when scanning Hugging Face.
  • Fix panic on group_by with literal and empty rows.
  • Return input instead of panicking if empty subset in drop_nulls() and drop_nans().
  • Bump argminmax to 0.6.3.
  • Fix DSL version deserialization endianness.
  • Allow Expr.round() to be called on integer dtypes.
  • Fix panic when filtering based on row index column in parquet.
  • Fix WASM and Pyodide compilation.
  • Resolve get() SchemaMismatch panic.
  • Fix panic in group_by_dynamic on single-row df with group_by.
  • Consistently use Unix epoch as origin for dt.truncate (except weekly buckets which start on Mondays).
  • Fix interpolate on dtype Decimal.
  • Fix CSV row count skipping the last line when the file did not end with a newline.
  • Make nested strict casting actually strict.
  • Make replace and replace_strict mapping use list literals.
  • Allow pivot on Time column.
  • Fix error when providing CSV schema with extra columns.
  • Fix panic on bitwise op between Series and Expr.
  • Fix multi-selector regex expansion.

🔧 Affected Symbols

list.eval, from_records, scan_parquet, list.contains, LazyFrame.match_to_schema, dt.truncate, str.replace_many, group_by, drop_nulls(), drop_nans(), Expr.round(), unnest(), Schema, fill_null, PartitionByKey, PartitionParted, sink_*, list.get, insert_column, join