Testimonials from Google, Wells Fargo, and others:

Resources:

Open-source growth over time: https://star-history.t9t.io/#cleanlab/cleanlab
How scientists use Cleanlab: https://scholar.google.com/scholar?q="cleanlab"
Related news coverage: https://github.com/cleanlab/label-errors#selected-news-coverage

Learn how companies (and you!) use Cleanlab:

Google Senior Software Engineer uses Cleanlab to find data errors at scale:

“Cleanlab is well-designed, scalable and theoretically grounded: it accurately finds data errors, even on well-known and established datasets. After using it for a successful pilot project at Google, Cleanlab is now one of my go-to libraries for dataset cleanup.”
— Patrick Violette, Senior Software Engineer, Google

Berkeley Research Group increases ML model accuracy by 15% and reduces time spent by 1/3 using Cleanlab Studio


source: https://www.linkedin.com/posts/steven-gawthorpe-b4298118_cleanlab-studio-activity-7009197862525747200-7nnA

Banco Bilbao Vizcaya Argentaria (BBVA)

BBVA, one of the largest financial institutions in the world, used Cleanlab in:

an update of one of the functionalities offered by the BBVA app: the categorization of financial transactions. These categories allow users to group their transactions to better control their income and expenses, and to understand the overall evolution of their finances. This service is available to all users in Spain, Mexico, Peru, Colombia, and Argentina.


*We used AL [Active Learning] in combination with Cleanlab.

This was necessary because, although we had defined and unified annotation criteria for transactions, some could be linked to several subcategories depending on the annotator’s interpretation. To reduce the impact of having different subcategories for similar transactions, we used Cleanlab for discrepancy detection.

With the current model, we were able to improve accuracy by 28%, while reducing the number of labeled transactions required to train the model by more than 98%.

CleanLab assimilates input from annotators and corrects any discrepancies between similar samples.

CleanLab helped us reduce the uncertainty of noise in the tags. This process enabled us to train the model, update the training set, and optimize its performance. The goal was to reduce the number of labeled transactions and make the model more efficient, requiring less time and dedication. This allows data scientists to focus on tasks that generate greater value for customers and organizations.*
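
To give a rough feel for the "discrepancy detection" mentioned in the BBVA quote, here is a minimal pure-Python sketch of the general idea: flag examples whose given label disagrees with a class the model predicts confidently. It is a simplified illustration only, not Cleanlab's implementation and not BBVA's pipeline. The function name `flag_label_discrepancies`, the two transaction categories, and the toy probabilities are all hypothetical.

```python
def flag_label_discrepancies(labels, pred_probs):
    """Return indices of examples whose given label disagrees with a
    class the model predicts confidently (simplified illustration)."""
    num_classes = len(pred_probs[0])

    # Per-class confidence threshold: the average predicted probability
    # of class k over the examples that annotators labeled as k.
    thresholds = []
    for k in range(num_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else 1.0)

    flagged = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes whose predicted probability reaches that class's threshold.
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: p[k])
            if best != y:  # confident prediction contradicts the given label
                flagged.append(i)
    return flagged


# Hypothetical toy data: 0 = "groceries", 1 = "restaurants".
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # labeled 0, but the model strongly predicts 1
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
issues = flag_label_discrepancies(labels, pred_probs)
print(issues)
```

In practice the predicted probabilities would come from a model evaluated out-of-sample (for example via cross-validation), and the flagged transactions would be sent back to annotators for review rather than relabeled automatically.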