v3.8.0rc0
Summary
MLflow 3.8.0rc0 introduces prompt model configuration, in‑progress trace display, DeepEval judges integration, and two new conversational scorers, with no breaking changes.
Migration Steps
- Install the release candidate: `pip install mlflow==3.8.0rc0`
- There are no breaking changes; existing code should continue to work. Optionally adopt the new APIs, such as `get_judge` and the new conversational scorers, as needed.
✨ New Features
- Prompt Model Configuration: prompts can now include model configuration, so specific model settings can be versioned alongside prompt templates for more reproducible LLM workflows.
- In-Progress Trace Display: the Traces UI can now display spans from in-progress traces, with auto‑polling, enabling real‑time debugging and monitoring of long‑running LLM applications.
- DeepEval Judges Integration: new `get_judge` API enables using DeepEval's evaluation metrics as MLflow scorers, providing access to 20+ evaluation metrics including answer relevancy, faithfulness, and hallucination detection.
- Conversational Safety Scorer: built‑in scorer for evaluating safety of multi‑turn conversations, analyzing entire conversation histories for hate speech, harassment, violence, and other safety concerns.
- Conversational Tool Call Efficiency Scorer: built‑in scorer for evaluating tool call efficiency in multi‑turn agent interactions, detecting redundant calls, missing batching opportunities, and poor tool selections.
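The conversational scorers operate on multi‑turn histories that include the assistant's tool calls. As a conceptual illustration of what "detecting redundant calls" means, here is a minimal sketch: the message format and the helper below are assumptions based on typical chat/tool-call schemas, not MLflow's actual scorer implementation (which the release notes describe only at a high level).

```python
def find_redundant_tool_calls(messages):
    """Return (name, arguments) pairs that were invoked more than once.

    A naive stand-in for the kind of redundancy check a tool-call
    efficiency scorer performs on a multi-turn conversation.
    """
    seen = set()
    redundant = []
    for msg in messages:
        # Assumed schema: assistant messages may carry a "tool_calls" list.
        for call in msg.get("tool_calls", []):
            key = (call["name"], call["arguments"])
            if key in seen:
                redundant.append(key)
            else:
                seen.add(key)
    return redundant


conversation = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "tool_calls": [
        {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        {"name": "get_weather", "arguments": '{"city": "Paris"}'},  # duplicate call
    ]},
]

print(find_redundant_tool_calls(conversation))
# → [('get_weather', '{"city": "Paris"}')]
```

The built-in scorer goes well beyond exact-duplicate detection (missing batching opportunities, poor tool selection), which generally requires LLM-based judging rather than a set lookup; this sketch only shows the shape of the input and the simplest efficiency signal.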