0.22.16

📦 unstructured
3 features · 🐛 1 fix · 🔧 8 symbols

Summary

This release introduces granular control over formula markdown export styles and significantly reduces peak memory usage during PDF rendering. It also includes performance improvements for quote standardization and dependency security updates.

Migration Steps

  1. If you rely on the exact behavior of formula markdown export, review the new `formula_markdown_style` options in `element_to_md` / `elements_to_md`.
  2. If you pass positional arguments to `element_to_md` or `elements_to_md`, confirm they still map to the intended parameters: `normalize_formula` is keyword-only and is declared after the positional parameters, so it must be passed by name.
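
To make the second step concrete, here is a minimal sketch of how a keyword-only parameter behaves in Python. `element_to_md_sketch` is a hypothetical stand-in written for this note, not the library function, and its body is a placeholder; only the bare `*` in the signature is the point.

```python
def element_to_md_sketch(element_text, *, normalize_formula=True):
    """Hypothetical stand-in: everything after the bare `*` is keyword-only."""
    # Placeholder behavior, not what unstructured does.
    return element_text.strip() if normalize_formula else element_text


# Passing the flag by name works.
by_keyword = element_to_md_sketch(" x ", normalize_formula=False)

# Passing it positionally is rejected by Python itself.
try:
    element_to_md_sketch(" x ", False)
except TypeError as exc:
    positional_error = str(exc)
else:
    positional_error = None
```

If a call site in your code resembles the second call, Python raises a `TypeError` there, so any such call shows up as soon as it runs.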

✨ New Features

  • Added keyword-only argument `formula_markdown_style` to `element_to_md` and `elements_to_md` with options: "auto" (default, uses display math heuristically), "display_math" (always use display math when safe), or "plain" (text only).
  • Added optional keyword-only argument `normalize_formula` (default `True`) to `element_to_md` and `elements_to_md` to map common Unicode operators to LaTeX-like tokens.
  • Introduced module constants: `FORMULA_MARKDOWN_AUTO`, `FORMULA_MARKDOWN_DISPLAY_MATH`, `FORMULA_MARKDOWN_PLAIN`.
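
The interaction between the style option and the normalization flag can be pictured with a small self-contained sketch. Everything below is an assumption made for illustration: `render_formula` is a hypothetical helper, the constant values are assumed to equal the option strings listed above, the operator table is an invented subset, and the "auto" heuristic is a placeholder. It is not the unstructured implementation.

```python
# Assumed values; the release notes only give the constant names and the option strings.
FORMULA_MARKDOWN_AUTO = "auto"
FORMULA_MARKDOWN_DISPLAY_MATH = "display_math"
FORMULA_MARKDOWN_PLAIN = "plain"

# Hypothetical subset of a Unicode-operator to LaTeX-like token table.
_UNICODE_TO_LATEX = {"×": r"\times", "≤": r"\leq", "≥": r"\geq", "≠": r"\neq"}


def render_formula(text, *, formula_markdown_style=FORMULA_MARKDOWN_AUTO,
                   normalize_formula=True):
    """Hypothetical helper showing how the two keyword-only options could combine."""
    styles = (FORMULA_MARKDOWN_AUTO, FORMULA_MARKDOWN_DISPLAY_MATH, FORMULA_MARKDOWN_PLAIN)
    if formula_markdown_style not in styles:
        raise ValueError(f"unknown style: {formula_markdown_style!r}")

    if normalize_formula:
        # Replace each known operator with its token, then collapse extra spaces.
        for char, token in _UNICODE_TO_LATEX.items():
            text = text.replace(char, f" {token} ")
        text = " ".join(text.split())

    if formula_markdown_style == FORMULA_MARKDOWN_PLAIN:
        return text
    if formula_markdown_style == FORMULA_MARKDOWN_DISPLAY_MATH:
        return f"$$\n{text}\n$$"
    # "auto": placeholder heuristic, assumed here to mean "looks like an equation".
    looks_like_math = "=" in text or "\\" in text
    return f"$$\n{text}\n$$" if looks_like_math else text


plain = render_formula("a × b ≤ c", formula_markdown_style=FORMULA_MARKDOWN_PLAIN,
                       normalize_formula=False)
display = render_formula("a × b ≤ c", formula_markdown_style=FORMULA_MARKDOWN_DISPLAY_MATH)
```

In this sketch the two options are independent: the style decides the wrapper, and `normalize_formula` decides whether operators are rewritten first. Whether the library treats them independently is not stated in these notes.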

🐛 Bug Fixes

  • Fixed a bug in `standardize_quotes` where left smart quotes were never normalized due to duplicate dictionary keys.
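
For readers unfamiliar with this class of bug, the sketch below reproduces the general pattern. It is a hypothetical reconstruction, not the code that was fixed: `replace_quotes` and both tables are invented for illustration. Python accepts a dict literal with a repeated key and keeps only the last value, so an entry that was meant to cover the left quote can silently vanish.

```python
# Buggy table: the second key was meant to be "\u201c" (left double quote)
# but repeats "\u201d", so the dict ends up with a single entry.
buggy_table = {
    "\u201d": '"',  # right double quote
    "\u201d": '"',  # duplicate key; the left quote is never registered
}

# Fixed table: one distinct key per quote character.
fixed_table = {
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
}


def replace_quotes(text, table):
    """Hypothetical helper: replace every key of `table` found in `text`."""
    for smart, plain in table.items():
        text = text.replace(smart, plain)
    return text


sample = "\u201chello\u201d"
buggy_result = replace_quotes(sample, buggy_table)  # left quote survives
fixed_result = replace_quotes(sample, fixed_table)  # both quotes become ASCII
```

Linters typically flag repeated keys in a dict literal, which is one inexpensive way to catch this before it ships.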

Affected Symbols