AI, Machine Learning and Data in Financial Services | Chapter 2

1.1. Introduction 

AI systems are computer-based systems that operate at varying degrees of autonomy. They can make predictions, provide recommendations, or make decisions based on human-defined objectives. These systems rely on large amounts of data, commonly called “big data”, and use data analytics to perform these tasks. Machine learning models are trained with this data, enabling them to learn and improve without explicit human programming.

The process of digitalization was already in progress before the COVID-19 outbreak. However, the pandemic has expedited and intensified it, leading to a greater use of AI. AI implementation is increasing in asset management, algorithmic trading, credit underwriting, and blockchain-based financial services. This progress is facilitated by the abundance of data available and the increased affordability of computing power. 

AI has become an essential component of products and services in various industries, including healthcare, automobiles, consumer goods, and the Internet of Things (IoT). Moreover, financial service providers in all sectors are increasingly utilizing AI technology. For example, in retail and corporate banking, AI is used for customized products, customer-service chatbots, credit scoring and underwriting, credit loss forecasting, AML, and fraud detection and monitoring. Asset management uses AI for robo-advice, portfolio strategy, and risk management, while trading employs it for algorithmic trading. In the insurance sector, AI is used for robo-advice and claims management. Even the official sector uses AI for RegTech and SupTech applications, such as natural language processing (NLP) and compliance processes.

As Section 1.2.1 discusses, the deployment of AI and ML through big data is expected to become more significant. However, the potential risks associated with their use in financial services are becoming increasingly concerning, and may require further attention from policymakers.

There have been discussions among national policymakers and international organizations about regulating and supervising the use of AI in financial services. The main objective is to mitigate the risks associated with AI and to find the best approach to deploying it in financial services from a policymaker’s perspective. In short, policymakers encourage innovation while protecting financial consumers and investors and ensuring that markets for such products and services remain fair, orderly, and transparent.

Policymakers have recently given significant attention to AI due to its potential impact on various industries and the emerging risks associated with its use. In May 2019, the Organization for Economic Cooperation and Development (OECD) introduced its Principles on AI, representing the first global standards for responsible and trustworthy AI endorsed by experts from diverse sectors.

The Committee on Financial Markets has incorporated artificial intelligence (AI), machine learning, and big data analysis in its Programme of Work and Budget for the 2021-2022 biennium. This strategic inclusion highlights the Committee’s recognition of the growing importance of emerging technologies in the financial industry. By embracing these technologies, the Committee aims to enhance the efficacy and efficiency of its operations and keep pace with the evolving nature of the financial markets.

This report analyzes the impact of AI/ML and big data on the financial sector. It focuses on regions already implementing these technologies and how they are changing their business models. The report discusses the benefits and risks of using these technologies in finance and provides an update on regulatory activities and approaches towards AI and ML in financial services. Furthermore, it covers open debates among policymakers and identifies areas that require further discussion by the Committee and its expert group. The report excludes the use of AI and big data in the insurance sector, which the OECD Insurance and Private Pensions Committee has addressed.

The discussion and analysis of this topic serve two purposes. Firstly, it aims to provide policymakers and international organizations with insights to inform their ongoing debates. Secondly, it seeks to investigate the unexplored issues that arise at the intersection of AI, finance, and policy. This involves analyzing how AI, ML, and big data impact specific areas of financial market activities such as asset management, algorithmic trading, credit underwriting, and blockchain-based financial products. It also explores how these technologies interact with existing risks, such as liquidity, volatility, and concentration, and how they affect the respective business models.

This report was created by the Experts Group on Finance and Digitalisation and was reviewed by the Committee on Financial Markets during its meetings held in April 2021. The delegates are requested to either approve the release of this report by written procedure or provide their final comments by 23 July 2021 and then approve its publication.

1.2. AI techniques, ML and the use of big data

According to the OECD’s AI Experts Group (AIGO), an AI system is a machine-based system that can make predictions, recommendations, or decisions that influence real or virtual environments based on human-defined objectives. It uses machine and human inputs to perceive real and virtual environments and abstract such perceptions into models, which can be created manually or through automated machine learning. The system then uses model inference to formulate options for information or action. AI systems are developed to operate with varying degrees of autonomy.

Figure 1.1. AI systems
As defined and approved by the OECD AI Experts Group (AIGO) in February 2019. Source: (OECD, 2019[4]).

The lifecycle of an AI system consists of several phases, namely planning and design, data collection and processing, and model building and interpretation. It also involves verification and validation, deployment, and operation and monitoring. According to an AI research taxonomy, there are four distinct categories: AI applications (such as natural language processing), techniques used to teach AI systems (such as neural networks), optimization (such as one-shot learning), and research that addresses societal considerations (such as transparency).

As per Samuel’s definition in 1959, machine learning (ML) is a subset of artificial intelligence (AI) that enables software to learn from relevant data sets and improve its performance without requiring explicit human programming. Examples of ML applications include image recognition, prediction of borrower default, fraud detection, and anti-money laundering (AML) detection. There are different types of ML: supervised learning fits models, such as regressions and classifiers, to labelled data in order to improve predictions, while unsupervised learning processes unlabelled input data to understand its distribution, for example to develop automated customer segments. Deep learning and reinforcement learning, based on neural networks, can be used to analyze unstructured data such as images or voice (US Treasury, 2018).
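The contrast between supervised and unsupervised learning described above can be sketched in a few lines of code. The example below is a toy illustration only: the borrower features (debt-to-income ratio, number of payment delays), the labels, and the thresholds are all invented for this sketch and do not come from the report. Supervised learning is shown as a k-nearest-neighbours default classifier on labelled examples; unsupervised learning as a minimal two-cluster k-means producing customer segments from unlabelled data.

```python
# Toy borrower data: (debt-to-income ratio, payment delays last year).
# All values are synthetic, chosen only to illustrate the two ML styles.

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_predict(train, labels, point, k=3):
    """Supervised learning: predict default (1) or repay (0) by majority
    vote among the k labelled examples closest to the new borrower."""
    nearest = sorted((dist2(row, point), lbl) for row, lbl in zip(train, labels))
    votes = [lbl for _, lbl in nearest[:k]]
    return max(set(votes), key=votes.count)

def centroid(group):
    """Mean point of a non-empty group."""
    n = len(group)
    return tuple(sum(col) / n for col in zip(*group))

def two_means(points, iters=10):
    """Unsupervised learning: split unlabelled customers into two segments
    with a minimal k-means (k=2). Toy version: assumes neither cluster
    ever becomes empty, which holds for this synthetic data."""
    c0, c1 = points[0], points[-1]  # crude initial centroids
    for _ in range(iters):
        g0 = [p for p in points if dist2(p, c0) <= dist2(p, c1)]
        g1 = [p for p in points if dist2(p, c0) > dist2(p, c1)]
        c0, c1 = centroid(g0), centroid(g1)
    return c0, c1

borrowers = [(0.2, 0), (0.3, 1), (0.8, 5), (0.9, 6), (0.25, 0), (0.85, 4)]
defaults = [0, 0, 1, 1, 0, 1]  # labels used only by the supervised model

risky = knn_predict(borrowers, defaults, (0.82, 5))  # 1: predicted default
low_seg, high_seg = two_means(borrowers)  # two customer segments, no labels
```

The two functions consume the same data but differ in exactly the way the text describes: `knn_predict` needs the `defaults` labels to make predictions, while `two_means` discovers structure in the inputs alone.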

Figure 1.2. Illustration of AI subsets 

Source: (, 2020).

Deep learning neural networks model how neurons interact in the brain, using many (‘deep’) layers of simulated interconnectedness (OECD, 2019). Inspired by the workings of the human brain, these multi-layer networks can learn and recognize complex patterns in data. Notably, deep learning models can recognize and classify input data without hand-written rules or feature detectors, and can identify new patterns that no human would have expected or specified (Sutskever, Krizhevsky and Hinton, 2017). These networks also have a higher noise tolerance and can operate at multiple levels of generality, building up from sub-features.

Machine learning models rely on vast amounts of alternative data sources and data analytics, commonly known as ‘big data’. The term ‘big data’ was initially coined in the early 2000s to describe the explosion in the quantity and quality of available data, driven by unprecedented advancements in data recording and storage technology. The big data ecosystem includes various components such as data sources, software, analytics, programming, statistics, and the data scientists who analyze the data to extract meaningful insights. Their ultimate goal is to filter out the noise and generate intelligible outputs.

The characteristics that are often associated with big data are commonly referred to as the “4Vs”: volume (referring to the scale of data), velocity (which pertains to the high-speed processing and analysis of streaming data), variety (which refers to the heterogeneous nature of data), and veracity (which relates to the certainty of data, source reliability, and truthfulness). Other essential qualities of big data include exhaustivity, extensionality, and complexity, according to the OECD (2019) and IBM (2020). Veracity is especially crucial, as it can be challenging for users to determine whether the dataset used is complete and trustworthy, often requiring a case-by-case assessment.

Big data is a term utilised to represent large and complex data sets generated from different sources. These sources can include climate information, satellite imagery, digital pictures and videos, transaction records or GPS signals, as well as personal data such as a name, a photo, an email address, banking details, posts on social networking websites, medical data, or a computer IP address. These data types can be challenging to analyze using traditional methods due to their size, complexity, or the rate at which they become available. Advanced digital techniques like machine learning models are often used to analyze and extract insights from these large data sets. Furthermore, the increased use of AI in IoT applications generates significant amounts of data, which feeds back into AI applications.

Graphic 1.1. The four Vs of data

Source: (IBM, 2020).

Figure 1.3. Big data sources 

Source: Dell Technologies.

ML models require a large amount of data to perform effectively. This is due to their ability to learn from the examples fed into the models in an iterative process known as training the model. With greater data availability, ML models can learn from more samples, resulting in better performance (US Treasury, 2018).
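The iterative training process described above can be sketched with the simplest possible case: fitting a one-parameter linear model by gradient descent. The data here is synthetic (generated exactly from the slope 2.0 for simplicity), so the example illustrates only the mechanics of iterating over examples and nudging a parameter to reduce error, not a realistic financial model.

```python
# Toy training loop: learn the slope w of y = w * x from examples by
# repeatedly reducing the squared prediction error on each example.

def train(data, lr=0.05, epochs=50):
    w = 0.0  # initial guess for the slope
    for _ in range(epochs):          # each epoch is one pass over the data
        for x, y in data:
            error = w * x - y        # prediction error on this example
            w -= lr * error * x      # gradient step on the squared error
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # synthetic: y = 2x exactly
w = train(data)  # converges towards the true slope 2.0
```

Each pass over the examples shrinks the remaining error by a constant factor, which is why more training data and more iterations improve the fit; the same learn-from-examples loop, scaled up to millions of parameters and observations, is what the text means by "training the model".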

Figure 1.4. AI system lifecycle
1.2.1. A fast-growing area of research and business development

The deployment of AI applications is proliferating, as can be seen from the increase in global private-sector spending on AI and the surge in research activity on this technology. According to the OECD, global spending on AI is expected to double within four years, growing from $50.1 billion in 2020 to over $110 billion in 2024. IDC likewise forecasts that organizations will accelerate their spending on AI systems over the next several years, with a CAGR of around 20% for 2019-24, as companies deploy AI as part of their digital transformation efforts and to stay competitive in the digital economy. Private equity investment in AI start-ups also doubled in 2017 on a year-on-year basis, attracting 12% of worldwide private equity investments in H1 2018. Moreover, AI-related research is growing much faster than computer science research or research publications overall, further evidence of the increasing interest in this innovative technology (Figure 1.5).

Figure 1.5. Growth in AI-related research and investment in AI start-ups
Note: Share of start-ups using AI as a core product differentiator. Source: OECD.AI (2020), Microsoft Academic Graph, Insights.
1.2.2. AI in regulatory and supervisory technology (‘SupTech’ and ‘RegTech’)

Financial market authorities are increasingly exploring the potential benefits of using AI in ‘SupTech’ tools, that is, FinTech-based applications used for regulatory, supervisory, and oversight purposes. The adoption of ‘RegTech’ by regulated institutions for regulatory and compliance requirements and reporting is also on the rise. Additionally, financial institutions are adopting AI applications for internal controls and risk management. By combining AI technologies with behavioural sciences, large financial institutions can prevent misconduct and shift their focus from ex-post resolution to forward-looking prevention.

According to the FSB, the growth in RegTech and SupTech applications is driven by supply-side factors, such as the increased availability of data (including machine-readable data) and advances in AI techniques, as well as demand-side factors, such as potential gains in the efficiency and effectiveness of regulatory processes and the possibility of improved insight into risk and compliance developments. Despite these opportunities and benefits, authorities remain vigilant about the risks associated with the use of such technologies, which the FSB identifies as resourcing, cyber risk, reputational risk, data quality issues, and limited transparency and interpretability. These risks also arise in the deployment of AI by financial market participants and are discussed in more detail in this report.
