v4.4.0rc3

📅 Feb 2, 2026 · 📦 datadog-sdk

✨ 7 features · 🐛 20 fixes · 🔧 18 symbols

Summary

This release extends LLM Observability with class-based evaluators and concurrent execution of synchronous evaluators in experiments. It also adds Local File Inclusion (LFI) detection and Tornado support for App and API Protection (AAP), alongside bug fixes across profiling, exception replay, and several integrations.

Migration Steps

If you use the Tornado web framework, explicitly enable AAP support by setting `DD_TRACE_TORNADO_ENABLED=true` or `DD_PATCH_MODULES=tornado:true`.
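
As a pre-flight check, a service could verify that one of the two settings above is present before it starts. The sketch below is illustrative only: `tornado_patching_requested` is a hypothetical helper, not part of ddtrace, and treating `1` as equivalent to `true` is an assumption about how boolean values are parsed.

```python
import os
from typing import Mapping


def tornado_patching_requested(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if either setting from the migration step is present.

    Hypothetical helper for illustration; ddtrace's own parsing may differ.
    """
    # Assumption: "true" and "1" (case-insensitive) both count as enabled.
    if env.get("DD_TRACE_TORNADO_ENABLED", "").strip().lower() in {"true", "1"}:
        return True

    # DD_PATCH_MODULES is a comma-separated list of "module:flag" entries.
    for entry in env.get("DD_PATCH_MODULES", "").split(","):
        name, _, flag = entry.partition(":")
        if name.strip() == "tornado" and flag.strip().lower() == "true":
            return True
    return False


if __name__ == "__main__":
    if not tornado_patching_requested():
        print("Tornado support for AAP is not explicitly enabled")
```

The variables themselves are typically set in the process environment (shell, container, or service manager) before the application starts.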

✨ New Features

Adds support for class-based evaluators in LLM Observability by subclassing `BaseEvaluator` and `BaseSummaryEvaluator`.
The `EvaluatorContext` now stores evaluation context including dataset record and span information.
Adds support for running synchronous evaluators concurrently in experiments (async evaluators are not supported).
Adds a new environment variable `DD_TRACE_LOG_LEVEL` to control the ddtrace logger level.
Adds support for capture expressions in log probes for dynamic instrumentation.
Introduces support for Local File Inclusion (LFI) detection in `pathlib.Path.open()` for App and API Protection Exploit Prevention.
Adds AAP support for the Tornado web framework, enabled via `DD_TRACE_TORNADO_ENABLED=true` or `DD_PATCH_MODULES=tornado:true`.
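
To make the LFI item above concrete, the following sketch shows the kind of `pathlib.Path.open()` call that a path-traversal payload targets, together with an application-side guard. It is not the detection logic used by Exploit Prevention, which this release note does not describe; `open_inside` and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def open_inside(base: Path, user_supplied: str):
    """Open a file for reading only if it resolves to a location under base."""
    root = base.resolve()
    candidate = (root / user_supplied).resolve()
    # A traversal such as "../../etc/passwd", or an absolute path, resolves
    # outside root and is rejected before Path.open() is reached.
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"path escapes base directory: {user_supplied!r}")
    return candidate.open("r", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "report.txt").write_text("ok", encoding="utf-8")

    # A benign, relative file name is opened normally.
    with open_inside(base, "report.txt") as fh:
        content = fh.read()

    # A traversal payload is refused by the guard.
    try:
        open_inside(base, "../../etc/passwd")
        blocked = False
    except ValueError:
        blocked = True
```

A guard like this complements, rather than replaces, runtime detection: it only covers call sites where the application remembers to use it.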

🐛 Bug Fixes

Fixes interference between agent-based samplers and Standalone App and API Protection when using low sample rates.
Resolves an issue where user-defined SIGALRM handlers were not restored after TimeoutChannel cleanup in aws_lambda, causing custom timeout handlers to fail after the first invocation.
Fixes a gevent support issue in exception replay that could cause an exception when determining if a frame belongs to user code.
Fixes errors while capturing exception replay snapshots.
Resolves an `AttributeError` when accessing `.handler` on streamed responses wrapped by `litellm>=1.74.15` by gracefully handling wrapped responses.
Fixes a profiling bug where non-pushed samples could leak data to subsequent samples.
Fixes duplicated frames in `asyncio` task stacks when the task was on-CPU.
The stack Profiler now correctly resets thread, task, and greenlet information after a fork.
Fixes a crash in the lock profiler when stack traces are too shallow (fewer than 4 frames); the location is now reported as "unknown:0" instead.
Fixes an issue causing greenlets to misbehave when `gevent.joinall` is called.
Resolves a crash occurring when forking while using the memory profiler.
The Profiler now always reports CPU time for threads, regardless of their running state during sampling.
Ensures the memory profiler clears its internal state immediately after fork in child processes via pthread_atfork.
Fixes an issue where `StreamedRunResult.stream_responses()` in the Pydantic AI integration was not properly traced, preventing agent spans from finishing.
Addresses an issue where the evaluators argument type for `LLMObs.experiment` was too restrictive; now uses covariant `Sequence` instead of invariant `List`.
Fixes OpenAI spans showing `model_name: "None"` by correctly falling back to `openai.request.model` or `"unknown_model"`.
Resolves panics in the NativeWriter caused by celery closing file descriptors during beat startup.
Removes noisy warning messages for non-callable view objects that cannot be instrumented.
Fixes an issue preventing instrumented probes from being removed correctly when Dynamic Instrumentation is disabled remotely via the Datadog UI.
Fixes a regression preventing DSM from being automatically enabled for Kafka, AioKafka, Kombu, and Botocore integrations.
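
The `Sequence` versus `List` change to the `LLMObs.experiment` evaluators argument, listed above, comes down to type variance. The sketch below uses simplified, hypothetical evaluator signatures (not the real ddtrace types) to show why an invariant `List` annotation rejects a list of more specific callables while a covariant `Sequence` accepts it.

```python
from typing import Callable, List, Sequence

# Hypothetical, simplified evaluator signatures for illustration only.
Evaluator = Callable[[str, str], object]
BoolEvaluator = Callable[[str, str], bool]


def exact_match(output: str, expected: str) -> bool:
    return output == expected


def count_with_list(evaluators: List[Evaluator]) -> int:
    return len(evaluators)


def count_with_sequence(evaluators: Sequence[Evaluator]) -> int:
    return len(evaluators)


bool_evaluators: List[BoolEvaluator] = [exact_match]

# Accepted by a type checker: Sequence is covariant, so a list (or tuple) of
# bool-returning evaluators is a valid Sequence[Evaluator].
n_list = count_with_sequence(bool_evaluators)
n_tuple = count_with_sequence((exact_match,))

# Rejected by a type checker, although it runs fine: List is invariant, so
# List[BoolEvaluator] is not a List[Evaluator].
# count_with_list(bool_evaluators)
```

The difference is only visible to static type checkers such as mypy or pyright; runtime behavior is unchanged.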

Affected Symbols

BaseEvaluator, EvaluatorContext, BaseSummaryEvaluator, SummaryEvaluatorContext, pathlib.Path.open, tornado, litellm, FallbackStreamWrapper, pydantic-ai, StreamedRunResult.stream_responses, LLMObs.experiment, openai.request.model, celery, NativeWriter, Kafka, AioKafka, Kombu, Botocore