💸 Context Money laundering is a global problem: an estimated US$2 trillion (5% of global GDP) is laundered each year. It carries enormous social and business costs. Its economic effects include undermining the legitimate private sector and the integrity of financial markets, as well as reputational risks.
👀 Complication While mass adoption of FinTech solutions after the Covid-19 pandemic has made it easier to reach customers, it has also allowed perpetrators to develop new techniques for breaching digital financial systems. Payment fraud attacks in particular ballooned in 2021, with a rate of 70% across FinTechs, the highest increase of any vertical in the network. Payment fraud detection traditionally relies on rule-based algorithms that raise an alert whenever a transaction matches a fraud rule. While rule-based systems are straightforward, they impose high operational costs on FinTechs: AML analysts must manually investigate each alert to determine whether the transaction is actually fraudulent (a true positive) or not (a false positive). Moreover, rule-based systems can hardly detect implicit correlations.
❓ Problem The question is how to advance fraud detection algorithms so that they learn the fraudulent patterns in transactions, flag only the transactions that are actually fraudulent, and avoid false-positive alerts. Fraud rules and statistical models alone are no longer sufficient to detect fraud in real time in this complex landscape. Combining batch analytics, streaming analytics, and predictive analytics with domain expertise is imperative for an effective fraud detection system.
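To make the rule-based workflow described under Complication concrete, here is a minimal Python sketch. It is not taken from the notebook: the `Transaction` fields, the `AMOUNT_LIMIT` threshold, and the `HIGH_RISK_COUNTRIES` set are hypothetical and chosen only for illustration. It shows how a hand-written rule produces alerts that an analyst later labels as true or false positives, and how fraud that does not trip any rule goes unnoticed.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float
    country: str
    is_fraud: bool  # ground truth, known only after an analyst's review


# Hypothetical rule parameters, chosen purely for illustration.
AMOUNT_LIMIT = 10_000.0
HIGH_RISK_COUNTRIES = {"XX"}


def rule_alert(tx: Transaction) -> bool:
    """Raise an alert if the transaction trips any hand-written rule."""
    return tx.amount > AMOUNT_LIMIT or tx.country in HIGH_RISK_COUNTRIES


def triage(transactions: list[Transaction]) -> dict[str, int]:
    """Count alert outcomes the way an analyst's review would label them."""
    counts = {"true_positive": 0, "false_positive": 0,
              "false_negative": 0, "true_negative": 0}
    for tx in transactions:
        alerted = rule_alert(tx)
        if alerted and tx.is_fraud:
            counts["true_positive"] += 1
        elif alerted:
            counts["false_positive"] += 1   # analyst time spent for nothing
        elif tx.is_fraud:
            counts["false_negative"] += 1   # fraud that no rule caught
        else:
            counts["true_negative"] += 1
    return counts


# Made-up sample data, one transaction per outcome.
sample = [
    Transaction(amount=15_000.0, country="DE", is_fraud=False),
    Transaction(amount=12_000.0, country="DE", is_fraud=True),
    Transaction(amount=200.0, country="DE", is_fraud=True),
    Transaction(amount=50.0, country="DE", is_fraud=False),
]
outcome = triage(sample)
print(outcome)
```

Every alert costs a manual investigation whether or not it turns out to be fraud, and the small fraudulent payment in the sample slips past the amount rule entirely, which is the kind of gap a learned model is meant to close.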
💸 Solution I have built a machine learning model using a public data source here.
The notebook shows how to combine feature engineering with subject-matter expertise to build a robust model.
The model produced 2 false positives and 0 false negatives, so no fraudulent transaction slipped through. We consider this performance sufficient: recall on fraudulent transactions is 100%, and precision on non-fraud transactions is 100% (every transaction predicted as non-fraud was in fact legitimate). On average, the model's accuracy is above 70%, which meets the industry standard.
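The metrics quoted above follow directly from the confusion-matrix counts. The sketch below is a minimal illustration, not the notebook's code: only the false-positive count (2) and the false-negative count (0) come from the results above, while the true-positive and true-negative counts are hypothetical placeholders, since the text does not report them.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute per-class precision and recall plus accuracy from raw counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Guard against an empty class to avoid division by zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "fraud_recall": ratio(tp, tp + fn),         # share of fraud that was caught
        "fraud_precision": ratio(tp, tp + fp),      # share of fraud alerts that were right
        "non_fraud_precision": ratio(tn, tn + fn),  # share of "non-fraud" calls that were right
        "non_fraud_recall": ratio(tn, tn + fp),     # share of legitimate payments left alone
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# fp=2 and fn=0 are the reported values; tp=10 and tn=88 are hypothetical.
metrics = classification_metrics(tp=10, fp=2, fn=0, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With zero false negatives, fraud recall and non-fraud precision are both 100% regardless of the placeholder counts. The two false positives only lower fraud precision and non-fraud recall, and how much they lower them depends on the true-positive and true-negative counts.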