
v0.26.0

📦 weights-biases
1 breaking change · 16 features · 5 fixes · 12 symbols

Summary

This release introduces significant enhancements to the W&B LEET TUI, adds advanced system monitoring metrics (TPU and NVML GPM), and improves performance by switching JSON serialization to `orjson`. Compatibility with older server versions has been dropped.

⚠️ Breaking Changes

  • Compatibility with server versions older than 0.63.0 (for Dedicated Cloud and Self-Managed W&B deployments) has been dropped.

Migration Steps

  1. Ensure your server deployment is running version 0.63.0 or newer if using Dedicated Cloud or Self-Managed W&B.
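Before upgrading, the minimum-version requirement above can be checked with a small script. The following is a minimal sketch, not part of the W&B SDK: the helper names `parse_version` and `server_is_supported` are hypothetical, and how you obtain the server version string depends on your deployment.

```python
# Minimum server version stated in the release notes.
MIN_SERVER_VERSION = (0, 63, 0)


def parse_version(version: str) -> tuple:
    """Turn a string such as 'v0.63.0' or '0.63.0-rc1' into (0, 63, 0).

    Pre-release and build suffixes are ignored in this sketch.
    """
    core = version.strip().lstrip("v").split("+")[0].split("-")[0]
    parts = [int(piece) for piece in core.split(".")]
    # Pad short versions like '0.63' so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def server_is_supported(version: str) -> bool:
    """Return True if the given server version meets the minimum."""
    return parse_version(version) >= MIN_SERVER_VERSION


if __name__ == "__main__":
    for candidate in ("0.62.9", "0.63.0", "v0.70.1"):
        print(candidate, server_is_supported(candidate))
```

The versions are compared as integer tuples rather than strings, so that `0.100.0` sorts after `0.63.0`.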

✨ New Features

  • `wandb beta core start|stop` commands introduced to run a detached `wandb-core` service and reuse it across multiple processes via the `WANDB_SERVICE` env var.
  • Run filtering by metadata in multi-run workspace mode in W&B LEET TUI (`wandb beta leet` command, activate with `f`).
  • Run overview displays tags and notes in W&B LEET TUI (`wandb beta leet` command).
  • Per-chart log-scale (Y-axis) support in W&B LEET TUI (`wandb beta leet` command, toggle on a selected chart with `y`).
  • Standalone system monitor mode in W&B LEET TUI (`wandb beta leet symon` command).
  • Bucketed heatmap chart mode for system metrics expressed as percentages (e.g. GPU utilization) in W&B LEET TUI (`wandb beta leet` command, cycle chart mode on a selected chart with `y`).
  • Colorblind-friendly `dusk-shore` (gradient) and `clear-signal` (cycle) color schemes in W&B LEET TUI (`wandb beta leet` command, configure with `wandb beta leet config`).
  • `disable_git_fork_point` added to prevent calculating git diff patch files against the closest ancestor commit when no upstream branch is set.
  • Media pane added for displaying `wandb.Image` data as ANSI thumbnails in W&B LEET TUI (`wandb beta leet` command), with grid layout, X-axis scrubbing, fullscreen mode, and keyboard/mouse navigation.
  • Kubeflow Pipelines v2 (`kfp>=2.0.0`) support for the `@wandb_log` decorator.
  • `allow_media_symlink` setting added to symlink or hardlink media files to the run directory instead of copying, improving logging performance and reducing disk usage.
  • `run.pin_config_keys(keys)` added to programmatically pin specific config keys for display in a References section on the Run Overview page.
  • Direct TPU metric collection via `libtpu.so` FFI, capturing `tensorcore_util` (SDK-only, unavailable via gRPC), `duty_cycle_pct`, `hbm_capacity_total`, `hbm_capacity_usage`, and latency distributions.
  • NVML GPM (GPU Performance Monitoring) profiling metrics for Hopper+ GPUs (H100 and newer), providing SM utilization, tensor/FP pipeline activity, DRAM bandwidth, and PCIe/NVLink throughput without requiring the DCGM daemon.
  • `.runs()` method added to the `Agent` class to query run status for a given sweep agent.
  • `.agent()` and `.agents()` methods added to the `Sweep` class to query active agents for a given sweep.
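To illustrate the idea behind the `allow_media_symlink` setting listed above, the sketch below links a media file into a run directory instead of copying it, and falls back to a copy when linking fails. It uses only the Python standard library and is not the SDK's implementation. The function name `place_media_file`, its parameters, and the order of attempts (hardlink first, then symlink) are assumptions made for illustration.

```python
import os
import shutil
from pathlib import Path


def place_media_file(src: Path, run_dir: Path, allow_link: bool) -> str:
    """Put `src` into `run_dir` and return the strategy that was used."""
    run_dir.mkdir(parents=True, exist_ok=True)
    dest = run_dir / src.name
    # Remove a stale file or dangling link left by an earlier call.
    if dest.exists() or dest.is_symlink():
        dest.unlink()

    if allow_link:
        try:
            # Hardlinks need both paths on the same filesystem.
            os.link(src, dest)
            return "hardlink"
        except OSError:
            pass
        try:
            # Symlinks may be unavailable, e.g. without privileges on Windows.
            os.symlink(src.resolve(), dest)
            return "symlink"
        except OSError:
            pass

    # Fallback: a full copy, which costs extra disk space and I/O.
    shutil.copy2(src, dest)
    return "copy"
```

A link avoids duplicating the file's bytes, which is where the performance and disk-usage gains mentioned in the release notes would come from. The trade-off is that a linked file still points at the original data, so modifying the source afterwards also changes what the run directory sees.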

🐛 Bug Fixes

  • Fixed `update_automation()` silently dropping event filters (e.g. alias conditions on `OnAddArtifactAlias`) when a new event is provided.
  • Fixed artifact client ID collisions in forked child processes by reseeding the fast ID generator after `fork()`.
  • Fixed `WANDB__EXTRA_HTTP_HEADERS` not being applied to presigned object-store upload and download requests.
  • Fixed deadlock in `artifact.download()` for artifacts with many large files.
  • Fixed `User.generate_api_key()` failing for users with hashed API keys by using the existing authenticated client instead of querying non-secret key names.

Affected Symbols