Case Studies | Cleanlab Technology
“Cleanlab is well-designed, scalable and theoretically grounded: it accurately finds data errors, even on well-known and established datasets. After using it for a successful pilot project at Google, Cleanlab is now one of my go-to libraries for dataset cleanup.”
— Patrick Violette, Senior Software Engineer, Google
source: https://www.linkedin.com/posts/steven-gawthorpe-b4298118_cleanlab-studio-activity-7009197862525747200-7nnA
BBVA, one of the largest financial institutions in the world, used Cleanlab in:
an update of one of the functionalities offered by the BBVA app: the categorization of financial transactions. These categories allow users to group their transactions to better control their income and expenses, and to understand the overall evolution of their finances. This service is available to all users in Spain, Mexico, Peru, Colombia, and Argentina.
*We used AL [Active Learning] in combination with Cleanlab.
This was necessary because, although we had defined and unified annotation criteria for transactions, some could be linked to several subcategories depending on the annotator’s interpretation. To reduce the impact of having different subcategories for similar transactions, we used Cleanlab for discrepancy detection.
With the current model, we were able to improve accuracy by 28%, while reducing the number of labeled transactions required to train the model by more than 98%.
Cleanlab assimilates input from annotators and corrects any discrepancies between similar samples.
Cleanlab helped us reduce the uncertainty of noise in the tags. This process enabled us to train the model, update the training set, and optimize its performance. The goal was to reduce the number of labeled transactions and make the model more efficient, requiring less time and dedication. This allows data scientists to focus on tasks that generate greater value for customers and organizations.*
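To make the idea of label discrepancy detection concrete, here is a minimal pure-Python sketch of the confident-learning approach that Cleanlab is built on. It is a simplified illustration only, not Cleanlab's API and not BBVA's pipeline: the function name `find_label_discrepancies`, the two transaction categories, and all probabilities are hypothetical. It assumes a classifier has already produced out-of-sample predicted probabilities for each annotated transaction.

```python
from statistics import mean


def find_label_discrepancies(labels, pred_probs):
    """Return indices whose annotated label disagrees with a confidently predicted class.

    Simplified confident-learning rule (an illustration, not Cleanlab's code):
    the threshold for class k is the average predicted probability of k among
    examples annotated as k. An example is flagged when some other class clears
    its own threshold and is the most probable class among those that do.
    """
    num_classes = len(pred_probs[0])

    # Per-class confidence thresholds.
    thresholds = []
    for k in range(num_classes):
        own = [p[k] for y, p in zip(labels, pred_probs) if y == k]
        thresholds.append(mean(own) if own else 1.0)

    flagged = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue  # the model is not confident about any class; leave the label alone
        best = max(confident, key=lambda k: p[k])
        if best != y:
            flagged.append(i)
    return flagged


# Hypothetical data: 0 = "groceries", 1 = "restaurants".
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.95, 0.05],
    [0.85, 0.15],
    [0.10, 0.90],  # annotated as groceries, but the model strongly disagrees
    [0.05, 0.95],
    [0.15, 0.85],
    [0.45, 0.55],  # ambiguous: neither class clears its threshold
]

flagged = find_label_discrepancies(labels, pred_probs)
print(flagged)
```

In a workflow like the one described above, the flagged transactions would be the ones sent back to annotators for review, while ambiguous examples are left untouched because the model offers no confident alternative.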