RAGAS

Supercharge Your LLM Application Evaluations 🚀

Latest: v0.4.2 (26 releases, 11 breaking changes)

Release History

v0.4.2: Breaking, 9 fixes, 13 features
Dec 23, 2025

This release migrates more core metrics to the new collections API structure and introduces caching support for metrics and embeddings. Several bug fixes address issues related to instructor modes, type validation, and Claude workflow tokens.

v0.4.1: Breaking, 2 fixes, 6 features
Dec 10, 2025

This release migrates core metrics (ToolCallAccuracy, ToolCallF1, TopicAdherence, AgentGoalAccuracy, Rubrics) to the collections API. It also introduces a breaking change: `embed_text` is renamed to `aembed_text` in AnswerRelevancy.

v0.4.0: Breaking, 9 fixes, 5 features
Dec 3, 2025

This release introduces major architectural updates, migrating numerous metrics to a modular BasePrompt system and enhancing LLM provider support via `instructor.from_provider` and dual adapter capabilities. It also includes several bug fixes related to LangChain integration and LLM detection.

v0.3.9: Breaking, 5 fixes, 9 features
Nov 11, 2025

This release migrates core metrics to a new structure, removes deprecated metrics such as 'aspect critic', and introduces new features such as synthetic data traceability metadata. It also includes several documentation fixes and minor bug fixes related to OpenAI models.

v0.3.8: Breaking, 5 fixes, 6 features
Oct 28, 2025

This release focuses heavily on internal refactoring, migrating core functionalities like semantic similarity and simple criteria to collections, and merging LLM factory methods. Several bugs related to async handling and specific synthesizers were also fixed.

v0.3.7: 4 fixes, 4 features
Oct 14, 2025

This release focuses on migrating several core metrics (BLEU, string metrics, answer similarity) to collections, improving robustness in query distribution, and adding new configuration options for LLM wrappers. Internal code quality and documentation were also enhanced.

v0.3.6: 15 fixes, 10 features
Oct 3, 2025

This release introduces several new features, including CHRF score support, enhanced input flexibility for metrics, and OCI Gen AI integration. Numerous bug fixes address issues related to asyncio, metric calculations, and dependency compatibility.

v0.3.5: 3 fixes, 4 features
Sep 17, 2025

This release focuses on improving core functionality, including better async execution and knowledge graph optimization, alongside several bug fixes and documentation updates.

v0.3.5rc2
Sep 17, 2025

No release notes provided.

v0.3.5rc1: 2 fixes, 4 features
Sep 17, 2025

This release focuses on improving asynchronous operations, optimizing knowledge graph handling for large datasets, and fixing a TypeError in metric calculations. It also introduces telemetry collection.

v0.3.4: 2 fixes, 1 feature
Sep 10, 2025

This release focuses on performance improvements, documentation updates, and minor bug fixes, including optimizing cluster finding and fixing batching issues with LangChain.

v0.3.3: Breaking, 19 fixes, 11 features
Sep 4, 2025

This release focuses heavily on internal restructuring, moving modules like `tracing`, `prompts`, `dataset`, and experimental features into the main package structure while retiring the `ragas.experimental` namespace. Numerous bug fixes address CI, LLM compatibility (especially OpenAI O1 series), and metric stability.

v0.3.3rc1: Breaking, 20 fixes, 11 features
Sep 4, 2025

This release focuses heavily on internal restructuring, migrating modules like `tracing`, `prompts`, `dataset`, and experimental metrics out of experimental namespaces and into the main package structure. It also includes numerous bug fixes, performance optimizations (such as a 50% speedup for factual correctness), and improved LLM compatibility.

v0.3.2: Breaking, 3 fixes, 3 features
Aug 19, 2025

This release moves key features like `experiment` and the CLI from experimental to the main package, adds prompt saving/loading capabilities, and removes the simulation feature.

v0.3.2rc3
Aug 19, 2025

No release notes provided.

v0.3.2-rc2: 1 fix
Aug 19, 2025

This release primarily contains fixes related to PyPI requirements and absolute image paths.

v0.3.2-rc1: Breaking, 2 fixes, 4 features
Aug 19, 2025

This release moves key features like `experiment` and the CLI from experimental to the main package, removes simulation functionality, and adds support for Python 3.13.

v0.3.1: 4 fixes, 1 feature
Aug 11, 2025

This release introduces a new Google Drive backend for dataset storage and includes several documentation and example improvements, alongside minor configuration fixes.

v0.3.0: Breaking, 6 fixes, 10 features
Jul 17, 2025

This release introduces major features like LlamaIndex agentic integration, a new CLI, and security enhancements including a fix for CVE-2025-45691. It also includes significant internal refactoring, notably the removal of the Project structure.

v0.3.0-rc2
Jul 17, 2025

No release notes provided.

v0.3.0-rc1
Jul 17, 2025

No release notes provided.

v0.2.15: 1 fix, 4 features
Apr 24, 2025

This release introduces new integrations with AWS Bedrock, LlamaStack, and Griptape, alongside enhancements to validation logic and documentation updates. A key documentation change involves renaming AWS Bedrock references to Amazon Bedrock.

v0.2.14: 8 fixes, 6 features
Mar 4, 2025

This release introduces new features like HTTP request-response logging and multi-turn conversation evaluation, alongside numerous bug fixes across various metrics and synthesizers. It also includes documentation updates and new integrations.

v0.2.13: Breaking, 3 fixes, 2 features
Feb 4, 2025

This release focuses on bug fixes, prompt improvements, and enhancements to integrations like LangGraph. It also removes an unnecessary argument from ToolCallAccuracy initialization.

v0.2.12: 3 fixes, 2 features
Jan 21, 2025

This release introduces Bedrock token parser support and an optional parameter for the BLEU score, alongside several bug fixes for TP/FP calculations and the output parser.

v0.2.11: 5 fixes, 6 features
Jan 14, 2025

This release introduces new features like Swarm integration and the ability to specify an experiment name during evaluation. It also includes several bug fixes related to metrics and dependency management, alongside numerous documentation updates.