Change8
Error (2 reports)

Fix NonMatchingSplitsSizesError in Datasets

Solution

This error usually arises when the split information recorded in the dataset's dataset_info.json (sizes and file locations) doesn't match the number of shards or examples actually found in the data files. To fix it, either delete the cached copy of the dataset so that dataset_info.json is regenerated on the next load, or specify the split explicitly when loading the dataset, using the `split=` argument with a named split or slice notation.
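The regenerate-and-retry approach can be sketched in Python as below. This is a minimal sketch, not part of the Datasets API: `load_with_split_size_recovery` and its `loader` parameter are hypothetical names introduced here. It assumes a Datasets release where `load_dataset` accepts `download_mode="force_redownload"` and `verification_mode="no_checks"` (older releases used a different argument for skipping verification), and it matches the error by class name because the import path of the exception has changed between releases.

```python
from typing import Any, Callable, Optional


def load_with_split_size_recovery(
    path: str,
    loader: Optional[Callable[..., Any]] = None,
    **kwargs: Any,
) -> Any:
    """Load a dataset, retrying when the split-size check fails.

    Hypothetical helper for illustration; not provided by `datasets`.
    """
    if loader is None:
        # Imported lazily so the helper can also be driven by a stand-in loader.
        from datasets import load_dataset

        loader = load_dataset

    # Attempt 1: the caller's own settings.
    # Attempt 2: discard the cached copy and rebuild it from a fresh download.
    # Attempt 3: keep the data but skip the split-size verification.
    attempts = [
        {},
        {"download_mode": "force_redownload"},
        {"verification_mode": "no_checks"},
    ]

    last_error: Optional[Exception] = None
    for overrides in attempts:
        try:
            return loader(path, **{**kwargs, **overrides})
        except Exception as err:
            # Assumption: identify the error by class name rather than by import,
            # since the exception's module differs across Datasets versions.
            if type(err).__name__ != "NonMatchingSplitsSizesError":
                raise
            last_error = err

    # Every strategy failed with the same kind of error; surface the last one.
    raise last_error
```

A call such as `load_with_split_size_recovery("user/dataset", split="train")` (the repository name is a placeholder) would try each strategy in turn. Skipping verification is the last resort because it can hide a genuinely truncated download, so it is worth checking the resulting row counts by hand afterwards.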

Timeline

First reported: Aug 7, 2025
Last reported: Nov 13, 2025

Need More Help?

View the full changelog and migration guides for Datasets
