Developing products based on artificial intelligence (AI) and machine learning (ML) requires a large investment of time and money, from the design phase to the implementation of the data algorithms. That is why BBVA's team of more than 900 data scientists decided to create an open source library named Mercury.
The BBVA Mercury library makes it possible to streamline processes, avoid duplication and promote the globalization and reuse of analytical products. All of this code is shared on the collaborative platform GitHub so that external programmers can help drive fintech innovation in the bank's corporate and investment banking businesses.
The BBVA Advanced Analytics Data team works jointly across its offices in Spain, Mexico and the rest of Latin America to produce common code that reduces the complexity of algorithms, helps guarantee compliance with current regulations and supports the design of recommender systems. It also enables new operations, such as the data labeling process with which the bank adds new expense categories to its app.
Maturana, who is responsible for the Large Language Models and Mercury specialty at BBVA, says that algorithms of this type are not very common in the open source field, which makes the library an advance in fintech innovation. However, the key to Mercury's success lies in the contributions made by other developers, which allow its algorithms to keep gaining robustness. BBVA also uses other open source libraries to solve its use cases, such as Spark, TensorFlow or scikit-learn.
Mercury’s essence
Mercury was created in 2019 by BBVA AI Factory. It is structured as multiple micro-repositories, following a modular design in which each one is independent, and it has more than 300,000 lines of code. All the algorithms it hosts must meet a series of requirements, above all that they can be used in a common way by different teams of developers. The code must also be high quality, well tested and highly functional.
Another advantage that Mercury offers data scientists at financial institutions is its measuring power. With this tool, analysts can calculate the robustness of their ML models and reduce the risk of producing erroneous data, which improves quality in rapidly changing environments.
The initial version of Mercury had limited features, but it still gained weight within the AI Factory as data scientists adopted component reuse in their projects. In fact, the BBVA team has used in Mercury all the components developed with the X Program, an efficient data experimentation device.
How Mercury is structured
The Mercury library is modular, so users can install only the parts they need. The standalone micro-packages into which it is divided are:
- Mercury-dataschema: A utility package that automatically infers the schema of a data set and calculates different statistics. It also validates whether different data sets match the same schema and calculates drift.
- Mercury-explainability: It offers methods and techniques for interpreting ML models, both locally and globally, giving a better understanding of how an ML model works.
- Mercury-monitoring: A package for monitoring models and their performance in production. It can detect changes in the data distribution and estimate accuracy at inference time.
- Mercury-reels: It analyzes sequences of events extracted from transactional data, where the events are either defined manually or discovered automatically.
- Mercury-robust: A lightweight framework for performing robust testing on ML models and data sets, ensuring that data workflows hold up under certain conditions, such as label leakage or input data schema issues.
- Mercury-settrie: A C++ library for creating, updating and querying SetTrie objects, that is, containers of sets that support efficient subset queries.
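To make the idea of efficient subset queries behind Mercury-settrie more concrete, here is a minimal sketch of a set-trie in plain Python. It is an illustration only and does not use or reproduce the Mercury-settrie API: the names `SimpleSetTrie`, `insert` and `subsets_of` are hypothetical, and the expense-category data is invented for the example. Each set is stored as a path of its elements in sorted order, so a subset query only follows branches labelled with elements of the query.

```python
from typing import Dict, Iterable, List, Tuple


class SetTrieNode:
    """One trie node; `is_end` marks the last element of a stored set."""

    def __init__(self) -> None:
        self.children: Dict[str, "SetTrieNode"] = {}
        self.is_end = False


class SimpleSetTrie:
    """Hypothetical, simplified set-trie (not the Mercury-settrie API)."""

    def __init__(self) -> None:
        self.root = SetTrieNode()

    def insert(self, items: Iterable[str]) -> None:
        # Store the set as a path of its elements in sorted order.
        node = self.root
        for item in sorted(set(items)):
            node = node.children.setdefault(item, SetTrieNode())
        node.is_end = True

    def subsets_of(self, query: Iterable[str]) -> List[Tuple[str, ...]]:
        """Return every stored set that is a subset of `query`."""
        ordered = sorted(set(query))
        found: List[Tuple[str, ...]] = []

        def walk(node: SetTrieNode, start: int, path: List[str]) -> None:
            if node.is_end:
                found.append(tuple(path))
            # Only follow branches labelled with the remaining query elements.
            for i in range(start, len(ordered)):
                child = node.children.get(ordered[i])
                if child is not None:
                    walk(child, i + 1, path + [ordered[i]])

        walk(self.root, 0, [])
        return found


# Invented example data: sets of expense categories.
trie = SimpleSetTrie()
trie.insert({"groceries", "fuel"})
trie.insert({"fuel"})
trie.insert({"travel", "fuel"})

print(trie.subsets_of({"fuel", "groceries", "rent"}))
# prints [('fuel',), ('fuel', 'groceries')]
```

Because stored sets share prefixes, the query never visits branches that start with an element missing from the query, which is what makes this structure cheaper than comparing the query against every stored set one by one.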
The Mercury user community continues to grow, and the library is used in more than a third of BBVA's advanced data analysis. During 2023, these figures are expected to double, pointing to a promising future for this open and collaborative platform in fintech environments.