v3.4.0
📦 mlflow · View on GitHub →
✨ 28 features · 🐛 17 fixes · 🔧 22 symbols
Summary
MLflow 3.4.0 adds extensive new capabilities—including OpenTelemetry metrics export, MCP server integration, a custom judges API, and experiment types in the UI—while delivering numerous feature enhancements and bug fixes across evaluation, tracing, the CLI, tracking, and the model registry.
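The OpenTelemetry metrics export mentioned above is toggled with the `MLFLOW_ENABLE_OTLP_EXPORTER` environment variable (listed under the features below). A minimal configuration sketch, assuming a local OTLP collector on the standard gRPC port; the `true` value and local endpoint are assumptions, not taken from the release notes:

```shell
# Enable MLflow's OTLP exporter (new in 3.4.0); accepted value assumed
export MLFLOW_ENABLE_OTLP_EXPORTER=true

# Standard OpenTelemetry SDK variable; a collector listening locally
# on the default gRPC port 4317 is assumed here
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
```

With both variables set, span-level statistics can flow to any OTLP-compatible backend alongside MLflow's own tracking store (the dual-export mode noted in the features below).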
✨ New Features
- OpenTelemetry Metrics Export: MLflow now exports span-level statistics as OpenTelemetry metrics, providing enhanced observability and monitoring capabilities for traced applications.
- MCP Server Integration: Introducing the Model Context Protocol (MCP) server for MLflow, enabling AI assistants and LLMs to interact with MLflow programmatically.
- Custom Judges API: New `make_judge` API enables creation of custom evaluation judges for assessing LLM outputs with domain-specific criteria.
- Correlations Backend: Implemented backend infrastructure for storing and computing correlations between experiment metrics using NPMI (Normalized Pointwise Mutual Information).
- Evaluation Datasets: MLflow now supports storing and versioning evaluation datasets directly within experiments for reproducible model assessment.
- Databricks Backend for MLflow Server: MLflow server can now use Databricks as a backend, enabling seamless integration with Databricks workspaces.
- Claude Autologging: Automatic tracing support for Claude AI interactions, capturing conversations and model responses.
- Strands Agent Tracing: Added comprehensive tracing support for Strands agents, including automatic instrumentation for agent workflows and interactions.
- Experiment Types in UI: MLflow introduces experiment types to separate classic ML/DL features from GenAI features in the UI, reducing clutter; the type is auto-detected and can be adjusted via a selector next to the experiment name.
- Add ability to pass tags via a DataFrame in mlflow.genai.evaluate.
- Add custom judge model support for Safety and RetrievalRelevance builtin scorers.
- Add AI commands as MCP prompts for LLM interaction.
- Add MLFLOW_ENABLE_OTLP_EXPORTER environment variable.
- Support OTel and MLflow dual export.
- Make set_destination use ContextVar for thread safety.
- Add MLflow commands CLI for exposing prompt commands to LLMs.
- Add 'mlflow runs link-traces' command.
- Add 'mlflow runs create' command for programmatic run creation.
- Add MLflow traces CLI command with comprehensive search and management capabilities.
- Add --env-file flag to all MLflow CLI commands.
- Backend for storing scorers in MLflow experiments.
- Allow cross-workspace copying of model versions between the Workspace Model Registry (WMR) and Unity Catalog (UC).
- Add automatic Git-based model versioning for GenAI applications.
- Improve WheeledModel._download_wheels safety.
- Support resume run for Optuna hyperparameter optimization.
- Add MLFLOW_DEPLOYMENT_CLIENT_HTTP_REQUEST_TIMEOUT environment variable.
- Add ability to hide/unhide all finished runs in Chart view.
- Add MLflow OSS telemetry for invoke_custom_judge_model.
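Among the features above, the correlations backend computes NPMI (Normalized Pointwise Mutual Information) between experiment metrics. As a self-contained illustration of the metric itself (not MLflow's implementation), NPMI normalizes PMI into the range [-1, 1], where 1 means two events always co-occur, 0 means they are independent, and -1 means they never co-occur:

```python
import math

def npmi(p_x: float, p_y: float, p_xy: float) -> float:
    """Normalized Pointwise Mutual Information for two events.

    pmi(x, y)  = log(p_xy / (p_x * p_y))
    npmi(x, y) = pmi(x, y) / (-log(p_xy))

    Returns 1.0 for perfect co-occurrence, 0.0 for independence,
    and -1.0 when the events never co-occur.
    """
    if p_xy == 0:
        return -1.0  # events never co-occur
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / (-math.log(p_xy))

# Perfect co-occurrence: the joint probability equals each marginal
print(npmi(0.5, 0.5, 0.5))   # 1.0
# Independence: the joint probability equals the product of marginals
print(npmi(0.5, 0.5, 0.25))  # 0.0
```

Normalization is what makes NPMI suitable for comparing correlations across metric pairs with very different base rates, which plain PMI does not allow.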
🐛 Bug Fixes
- Implement DSPy LM interface for default Databricks model serving.
- Fix aggregations incorrectly applied to legacy scorer interface.
- Add Unity Catalog table source support for mlflow.evaluate.
- Fix custom prompt judge encoding issues with custom judge models.
- Fix OpenAI autolog to properly reconstruct Response objects from streaming events.
- Add basic authentication support in TypeScript SDK.
- Update scorer endpoints to v3.0 API specification.
- Fix scorer status handling in MLflow tracking backend.
- Fix missing source-run information in UI.
- Fix spark_udf to always use stdin_serve for model serving.
- Fix a bug with Spark UDF usage of uv as an environment manager.
- Extract source workspace ID from run_link during model version migration.
- Improve security by reducing write permissions in temporary directory creation.
- Fix --env-file flag compatibility with --dev mode.
- Fix basic authentication with Uvicorn server.
- Fix experiment comparison functionality in UI.
- Fix compareExperimentsSearch route definitions.
Affected Symbols
`mlflow.tracking`, `mlflow.genai.evaluate`, `make_judge`, `WheeledModel._download_wheels`, `MLFLOW_ENABLE_OTLP_EXPORTER`, `MLFLOW_DEPLOYMENT_CLIENT_HTTP_REQUEST_TIMEOUT`, `set_destination`, `ContextVar`, `mlflow runs link-traces`, `mlflow runs create`, `mlflow traces`, `--env-file`, scorer endpoints v3.0, `spark_udf`, `uv`, `run_link`, Databricks backend for MLflow server, MCP server, Custom Judges API, Experiment Types UI, Strands Agent Tracing, Claude Autologging
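Among the symbols listed, `set_destination` now relies on a `ContextVar` for thread safety. A minimal sketch of that pattern with hypothetical names (not MLflow's internals): each thread sees its own destination, so concurrent callers cannot clobber one another's setting:

```python
import threading
from contextvars import ContextVar

# Each thread (and async task) gets its own view of this variable;
# set() in one thread never leaks into another.
_destination = ContextVar("destination", default="default")

def set_destination(dest: str) -> None:
    _destination.set(dest)

def get_destination() -> str:
    return _destination.get()

results = {}

def worker(name: str) -> None:
    set_destination(name)              # isolated to this thread's context
    results[name] = get_destination()  # reads back only its own value

threads = [threading.Thread(target=worker, args=(f"exp-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)            # each worker saw only the destination it set
print(get_destination())  # main thread is untouched: "default"
```

Compared with a module-level global, this removes the race where one thread's `set_destination` redirects traces started by another, which is the motivation stated in the feature list above.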