v4.4.0rc2
📦 datadog-sdk
✨ 4 features · 🐛 13 fixes · 🔧 8 symbols
Summary
This release introduces significant enhancements to LLM Observability with class-based evaluators and adds configuration for logging levels via `DD_TRACE_LOG_LEVEL`. Numerous bug fixes address issues across profiling, AWS Lambda handlers, gevent compatibility, and specific library integrations like litellm and pydantic-ai.
✨ New Features
- Adds support for class-based evaluators in LLM Observability by allowing users to subclass `BaseEvaluator`.
- Introduces `EvaluatorContext` to store evaluation context including dataset record and span information.
- Supports class-based summary evaluators via `BaseSummaryEvaluator`, which receives a `SummaryEvaluatorContext`.
- Adds a new environment variable `DD_TRACE_LOG_LEVEL` to control the ddtrace logger level.
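The effect of `DD_TRACE_LOG_LEVEL` can be pictured as mapping an environment variable onto a standard-library logger. The sketch below is only an illustration of that idea, not ddtrace's implementation: the helper `apply_log_level` is hypothetical, and the set of accepted values (standard Python level names, case-insensitive) is an assumption that these notes do not confirm.

```python
import logging
import os

# Assumption: the variable takes standard Python level names.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def apply_log_level(env=None, logger_name="ddtrace"):
    """Hypothetical helper: set a logger's level from DD_TRACE_LOG_LEVEL.

    Unknown or missing values leave the logger untouched.
    """
    env = os.environ if env is None else env
    logger = logging.getLogger(logger_name)
    raw = env.get("DD_TRACE_LOG_LEVEL")
    if raw is None:
        return logger
    level = _LEVELS.get(raw.strip().upper())
    if level is not None:
        logger.setLevel(level)
    return logger


# Demonstration with an explicit mapping instead of the real environment.
demo = apply_log_level({"DD_TRACE_LOG_LEVEL": "debug"}, logger_name="demo.ddtrace")
ignored = apply_log_level({"DD_TRACE_LOG_LEVEL": "bogus"}, logger_name="demo.other")
```

In practice the variable would be set in the process environment before the application starts, for example `DD_TRACE_LOG_LEVEL=DEBUG`; check the ddtrace configuration documentation for the values it actually accepts.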
🐛 Bug Fixes
- Fixes an issue where agent-based samplers could interfere with Standalone App and API Protection by rejecting traces prematurely.
- Resolves an issue in `aws_lambda` where user-defined `SIGALRM` handlers were not restored after `TimeoutChannel` cleanup, causing custom timeout handlers to fail after the first invocation.
- Fixes a gevent compatibility issue in exception replay that raised an exception while determining whether a frame belongs to user code and should be captured.
- Resolves an issue with litellm>=1.74.15 where wrapped router streaming responses caused an `AttributeError` when accessing `.handler`; integration now handles wrapped and original responses gracefully.
- Fixes a profiling bug where non-pushed samples could leak data to subsequent samples.
- Fixes a profiling bug where `asyncio` task stacks contained duplicated frames when the task was on-CPU; stacks now show each frame once.
- The stack Profiler now correctly resets thread, task, and greenlet information after a fork, preventing stale data from the parent process.
- Fixed a crash in the lock profiler when stack traces are too shallow (fewer than 4 frames); the location is now reported as "unknown:0" instead of crashing.
- Fixed an issue that caused greenlets to misbehave when `gevent.joinall` is called.
- Resolves a crash occurring when forking while using the memory profiler.
- Fixes an issue where the Pydantic AI integration did not properly trace `StreamedRunResult.stream_responses()` (introduced in `pydantic-ai==0.8.1`), preventing agent spans from finishing.
- Addresses an issue where the `evaluators` argument type for `LLMObs.experiment` was overly constrained; it now uses the covariant `Sequence` type.
- Fixes an issue where OpenAI spans showed `model_name: "None"` when the API response returned a `None` model field; the model name now falls back to the request model.
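The `Sequence` change for `LLMObs.experiment` matters because of variance in Python's type system: `List` is invariant, so a list of a subtype is rejected where a list of the base type is expected, while `Sequence` is covariant and accepts it. The sketch below illustrates this with hypothetical stand-in classes (`Evaluator`, `ExactMatch`) and functions; none of them are ddtrace types or signatures.

```python
from typing import List, Sequence


class Evaluator:
    """Hypothetical base type, used only for this illustration."""

    def score(self, output: str) -> float:
        return 0.0


class ExactMatch(Evaluator):
    """Hypothetical subclass of the stand-in base type."""

    def __init__(self, expected: str) -> None:
        self.expected = expected

    def score(self, output: str) -> float:
        return 1.0 if output == self.expected else 0.0


def run_with_list(evaluators: List[Evaluator], output: str) -> List[float]:
    # List is invariant: a type checker rejects a List[ExactMatch] argument.
    return [e.score(output) for e in evaluators]


def run_with_sequence(evaluators: Sequence[Evaluator], output: str) -> List[float]:
    # Sequence is covariant: Sequence[ExactMatch] is a valid Sequence[Evaluator].
    return [e.score(output) for e in evaluators]


exact_only: List[ExactMatch] = [ExactMatch("yes"), ExactMatch("no")]

# Accepted by type checkers such as mypy or pyright.
scores = run_with_sequence(exact_only, "yes")

# The call below works at runtime but is flagged by a type checker:
# run_with_list(exact_only, "yes")
```

The difference is purely static: both functions behave identically at runtime, so the change affects only what type checkers accept.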