py-1.30.0
📦 polars · View on GitHub →
✨ 20 features · 🐛 44 fixes · 🔧 20 symbols
Summary
This release focuses heavily on performance improvements across various operations, including optimizer casts, parallelism, and elementwise execution. It introduces several new features like list.filter, LazyFrame.match_to_schema, and enhanced type inference, alongside numerous bug fixes addressing panics and incorrect outputs.
✨ New Features
- Implement list.filter.
- Support BinaryOffset in search_sorted.
- Add nulls_equal flag to list/arr.contains.
- Implement LazyFrame.match_to_schema.
- Improved time-string parsing and inference (generally, and via the SQL interface).
- Allow .over to be called without partition_by.
- Support AnyValue translation from PyMapping values.
- Support optimised init from non-dict Mapping objects in from_records and frame/series constructors.
- Support inference of Int128 dtype from databases that support it.
- Add options to write Parquet field metadata.
- Add cast_options parameter to control type casting in scan_parquet.
- Allow casting List<UInt8> to Binary.
- Allow setting of regex size limit using POLARS_REGEX_SIZE_LIMIT.
- Support use of literal values as "other" when evaluating Series.zip_with.
- Allow reading and writing custom file-level Parquet metadata.
- Support PEP 702 @deprecated decorator behaviour.
- Support grouping by pl.Array.
- Preserve exception type and traceback for errors raised from Python.
- Use a fixed-width font in the streaming physical plan graph.
- Load AWS endpoint_url using boto3.
🐛 Bug Fixes
- Fix RuntimeError when serializing the same DataFrame from multiple threads.
- Fix map_elements predicate pushdown.
- Fix reverse list type.
- Don't require numpy for search_sorted.
- Add type equality checking for relevant methods.
- Fix invalid output for fill_null after when.then on structs.
- Don't panic for cross join with misaligned chunking.
- Fix panic on quantile over nulls in rolling window.
- Respect BinaryOffset metadata.
- Correct the output order of PartitionByKey and PartitionParted.
- Fallback to non-strict casting for deprecated casts.
- Handle sliced out remainder for bitmaps.
- Don't merge Enum categories on append.
- Fix unnest() not working on empty struct columns.
- Fix the default value type in Schema init.
- Correct name in unnest error message.
- Provide "schema" to DataFrame, even if empty.
- Properly account for nulls in the is_not_nan check made in drop_nans.
- Fix incorrect result from SQL count(*) with PARTITION BY.
- Fix deadlock joining scanned tables with low thread count.
- Don't allow deserializing incompatible DSL.
- Fix incorrect null dtype from binary ops in empty group_by.
- Don't mark str.replace_many with Mapping as deprecated.
- Enforce gzip's maximum compression level of 9 (not 10).
- Fix predicate pushdown of fallible expressions.
- Fix index-out-of-bounds panic when scanning Hugging Face.
- Fix panic on group_by with literal and empty rows.
- Return input instead of panicking if empty subset in drop_nulls() and drop_nans().
- Bump argminmax to 0.6.3.
- Fix DSL version deserialization endianness.
- Allow Expr.round() to be called on integer dtypes.
- Fix panic when filtering based on row index column in parquet.
- Fix WASM and Pyodide compilation.
- Resolve get() SchemaMismatch panic.
- Fix panic in group_by_dynamic on a single-row df with group_by.
- Consistently use Unix epoch as origin for dt.truncate (except weekly buckets which start on Mondays).
- Fix interpolate on dtype Decimal.
- Fix CSV row count skipping the last line when the file did not end with a newline.
- Make nested strict casting actually strict.
- Make replace and replace_strict mapping use list literals.
- Allow pivot on Time column.
- Fix error when providing CSV schema with extra columns.
- Fix panic on bitwise op between Series and Expr.
- Fix multi-selector regex expansion.
🔧 Affected Symbols
list.eval, from_records, scan_parquet, list.contains, LazyFrame.match_to_schema, dt.truncate, str.replace_many, group_by, drop_nulls(), drop_nans(), Expr.round(), unnest(), Schema, fill_null, PartitionByKey, PartitionParted, sink_*, list.get, insert_column, join