CleanVision automatically detects common issues in image datasets, such as images that are (near) duplicates, blurry, or over/under-exposed. This data-centric AI package is designed as a quick first step for any computer vision project: it finds problems in your dataset that you may want to address before applying machine learning.


To install cleanvision using pip:

$ pip install cleanvision


Using CleanVision to audit your image data is as simple as running the code below:

from cleanvision.imagelab import Imagelab

# Specify path to folder containing the image files in your dataset
imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")

# Automatically check for a predefined list of issues within your dataset
imagelab.find_issues()

# Produce a neat report of the issues found in your dataset
imagelab.report()

CleanVision diagnoses many types of issues, but you can also check for only specific issues:

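# Map each issue name to its settings; an empty dict uses the defaults
issue_types = {"light": {}, "blurry": {}}

# Check only for the specified issue_types
imagelab.find_issues(issue_types=issue_types)

# Produce a report with only the specified issue_types
imagelab.report(issue_types=issue_types.keys())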
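
If you run the same targeted audit on several folders, it may help to wrap the steps in a small function. The following is only a sketch: `make_issue_types` and `audit_folder` are hypothetical helper names, not part of CleanVision, and it assumes `find_issues` and `report` accept an `issue_types` argument as in the snippet above.

```python
from typing import Dict, Iterable


def make_issue_types(names: Iterable[str]) -> Dict[str, dict]:
    """Hypothetical helper: map each issue name to an empty settings dict."""
    return {name: {} for name in names}


def audit_folder(data_path: str, names: Iterable[str]) -> None:
    """Hypothetical helper: run only the selected checks and print the report."""
    # Imported lazily so make_issue_types works without cleanvision installed.
    from cleanvision.imagelab import Imagelab

    issue_types = make_issue_types(names)
    imagelab = Imagelab(data_path=data_path)
    # Assumption: both calls accept issue_types, as in the snippet above.
    imagelab.find_issues(issue_types=issue_types)
    imagelab.report(issue_types=list(issue_types))


if __name__ == "__main__":
    # Shows the dictionary that would be passed to find_issues.
    print(make_issue_types(["light", "blurry"]))
```

Calling `audit_folder("FOLDER_WITH_IMAGES/", ["light", "blurry"])` would then correspond to the targeted check shown above.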

More on how to get started with CleanVision: