Frequently Asked Questions

Fraud Detection, AI/ML & Autonomous Operations

How does Mastercard use AI and machine learning for fraud detection?

Mastercard employs a sophisticated AI and machine learning system to assess the probability of fraud for each credit card transaction. This system analyzes historical data and uses advanced algorithms to score transactions in real time, automatically denying those deemed suspicious. This approach helps prevent fraud and reduces financial losses for merchants and cardholders. (Source)

What is the scale of Mastercard's AI-powered fraud detection system?

Mastercard's AI-powered fraud detection system processes over 150 billion transactions annually across multiple regions worldwide, including Europe, Sydney, and North America. The system is designed to handle massive transaction volumes with end-to-end latency of less than 150 milliseconds per transaction.

How does the system help merchants prevent fraud and reduce losses?

The system evaluates each transaction in milliseconds, automatically denying those with a high probability of fraud. This proactive approach saves merchants from bearing the financial burden of fraudulent transactions and improves the overall customer experience by minimizing fraud-related disruptions.

What technologies enable Mastercard's fraud detection system to scale and perform efficiently?

The system leverages a microservices architecture, containerization, Kubernetes for orchestration, and serverless computing. These technologies allow for horizontal scaling, high availability, and rapid processing of billions of transactions with low latency.

How does the system ensure high availability and disaster recovery?

The system is deployed in an active-active configuration across multiple regions and data centers. This design ensures that if one region or data center experiences an outage, others can seamlessly take over without impacting performance or availability.

What role does automation play in Mastercard's fraud detection operations?

Automation is central to the system's design, enabling real-time fraud scoring, automatic denial of suspicious transactions, and self-healing capabilities. This minimizes the need for human intervention, reduces operational costs, and ensures consistent, reliable performance at scale.

How does the system handle security and compliance requirements?

The system is designed to meet stringent security standards and has undergone rigorous external audits, achieving certifications such as PCI compliance. These certifications validate adherence to industry best practices and ensure a high level of security and compliance for all transactions.

What analytics and reporting capabilities does the system provide?

The system offers robust analytics and reporting features, including detailed reason codes for transaction denials and a customizable interface for users to access operational data. This transparency helps merchants and cardholders understand the rationale behind each decision.

How does the system support customer case management?

A dedicated customer case management system is integrated to help evaluate and resolve issues or concerns related to transaction denials. This enhances the user experience and ensures effective resolution of disputes or questions.

What are the main benefits of autonomy in large-scale fraud detection systems?

Autonomy eliminates the need for human intervention, allowing the system to handle errors and issues automatically. This is essential for meeting service level agreements, optimizing costs, and ensuring seamless operations across global regions.

How does the system minimize false positives and avoid denying legitimate transactions?

The system provides comprehensive reason codes and detailed analytics for each denial, ensuring transparency and enabling continuous improvement. This helps minimize false positives and ensures that legitimate transactions are not wrongly denied.

What guiding principles are important for operating systems at scale?

Key principles include robust monitoring, automation, containerization, cluster management, serverless computing, security, reliability, performance, efficiency, and cost optimization. These ensure operational excellence and resilience in large-scale environments.

How does the system achieve self-healing and operational efficiency?

The system incorporates self-healing capabilities, automatically recovering from errors and maintaining smooth operations without human intervention. This enhances operational efficiency and reliability.

What is the future vision for autonomy in fraud detection systems?

The goal is to achieve 100% autonomy, where the system operates independently with human intervention required only in exceptional cases. Advances in machine learning and automation are rapidly moving toward this vision.

How does the system support expansion into new domains like credit risk and healthcare?

The system's scalable, autonomous architecture allows it to be adapted for new domains such as credit risk management and healthcare, extending its benefits to a wider range of industries.

What impact has the system had on fraud rates and customer expansion?

The system has led to substantial reductions in fraud rates (measured in basis points) and enabled rapid customer expansion into new sectors, demonstrating its effectiveness and scalability.

How does Sedai's approach to autonomous optimization prioritize safety?

Sedai is the only cloud optimization platform patented to make safe, autonomous optimizations in production without causing incidents or breaching SLOs. Unlike risky optimizers that make all-at-once changes, Sedai performs slow, gradual optimizations with continuous validation checks, ensuring safety and reliability at every step. (Source)

What is Sedai's autonomous cloud management platform?

Sedai's autonomous cloud management platform uses machine learning to optimize cloud resources for cost, performance, and availability without manual intervention. It covers compute, storage, and data across AWS, Azure, GCP, and Kubernetes environments, delivering up to 50% cost savings and 75% latency reduction. (Source)

What are the key features of Sedai's platform?

Key features include 100% autonomous optimization, proactive issue resolution, application-aware intelligence, full-stack cloud coverage, release intelligence, plug-and-play implementation, and enterprise-grade governance. Sedai also offers Datapilot (observability), Copilot (one-click optimizations), and Autopilot (fully autonomous execution) modes. (Source)

How does Sedai ensure safe and reversible optimizations?

Every optimization made by Sedai is constrained, validated, and reversible. The platform uses continuous health verification, automatic rollbacks, and incremental changes to guarantee safe operations and compliance with enterprise governance standards. (Source)
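
The idea of incremental, validated, and reversible changes can be illustrated with a short sketch. This is not Sedai's implementation: the function name optimize_gradually, the step size, and the is_healthy callback are all hypothetical, chosen only to show a setting moving toward a target in small steps and stopping at the last known good value when a health check fails.

```python
from typing import Callable


def optimize_gradually(
    current: int,
    target: int,
    step: int,
    is_healthy: Callable[[int], bool],
) -> int:
    """Move a setting from `current` toward `target` in increments of `step`.

    After each increment the hypothetical `is_healthy` check is consulted.
    If it fails, the increment is discarded (rolled back) and the last
    value that passed validation is returned.
    """
    if step <= 0:
        raise ValueError("step must be positive")

    value = current
    while value != target:
        # Take one bounded step toward the target, never overshooting it.
        if target > value:
            candidate = min(value + step, target)
        else:
            candidate = max(value - step, target)

        if not is_healthy(candidate):
            # Validation failed: keep the last known good value.
            return value
        value = candidate
    return value


# Illustrative use: shrink a memory limit from 1024 MB toward 512 MB,
# assuming (hypothetically) the service stays healthy down to 700 MB.
final_mb = optimize_gradually(1024, 512, 128, lambda mb: mb >= 700)
print(final_mb)
```

In this made-up run the loop accepts 896 and 768, rejects 640, and settles on 768, which is the behaviour a gradual optimizer with rollback is meant to have.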

What types of integrations does Sedai support?

Sedai integrates with monitoring and APM tools (CloudWatch, Prometheus, Datadog, Azure Monitor), Kubernetes autoscalers (HPA/VPA, Karpenter), IaC and CI/CD tools (GitLab, GitHub, Bitbucket, Terraform), ITSM platforms (ServiceNow, Jira), notification tools (Slack, Microsoft Teams), and various runbook automation platforms. (Source)

What security certifications does Sedai have?

Sedai is SOC 2 certified, demonstrating adherence to stringent security and compliance standards for data protection. (Source)

How quickly can Sedai be implemented?

Sedai offers a plug-and-play implementation that takes just 5 minutes for general use cases and up to 15 minutes for specific scenarios like AWS Lambda. The platform connects securely via IAM, requiring no complex installations or agents. (Source)

What business impact can customers expect from Sedai?

Customers can achieve up to 50% cloud cost savings, 75% latency reduction, 6X productivity gains, and a 50% reduction in failed customer interactions. Notable results include Palo Alto Networks saving $3.5 million and KnowBe4 achieving 50% cost savings in production. (Source)

Who are some of Sedai's customers?

Sedai's customers include Palo Alto Networks, HP, Experian, KnowBe4, Expedia, Capital One, GSK, and Avis, representing industries such as cybersecurity, IT, financial services, healthcare, travel, and e-commerce. (Source)

What industries does Sedai serve?

Sedai serves a diverse range of industries, including cybersecurity, information technology, financial services, security awareness training, travel and hospitality, healthcare, car rental services, retail and e-commerce, SaaS, and digital commerce. (Source)

What pain points does Sedai address for cloud teams?

Sedai addresses pain points such as cost inefficiencies, operational toil, performance and latency issues, lack of proactive issue resolution, complexity in multi-cloud environments, and misaligned priorities between engineering and FinOps teams. (Source)

How does Sedai compare to other cloud optimization platforms?

Sedai differentiates itself with patented, safety-first autonomous optimization, proactive issue resolution, application-aware intelligence, and full-stack cloud coverage. Unlike competitors that rely on static rules or manual adjustments, Sedai's platform is designed for safe, gradual, and validated optimizations in production. (Source)

What support and resources are available for Sedai users?

Sedai provides detailed technical documentation, personalized onboarding, a dedicated Customer Success Manager for enterprise customers, a community Slack channel, and email/phone support. A 30-day free trial is also available. (Source)

Who is the target audience for Sedai?

Sedai is designed for platform engineering, IT/cloud operations, technology leadership, site reliability engineering (SRE), and FinOps professionals in organizations with significant cloud operations across industries. (Source)

What customer feedback has Sedai received regarding ease of use?

Customers highlight Sedai's quick setup (5–15 minutes), agentless integration, comprehensive onboarding support, detailed documentation, and risk-free 30-day trial as key factors contributing to its ease of use. (Source)

Where can I find technical documentation for Sedai?

Technical documentation for Sedai is available at https://docs.sedai.io/get-started, with additional resources, case studies, and guides at https://sedai.io/resources.

Using AI/ML for Fraud Detection & Scaling with Autonomous Operations

John Jamie

Content Writer

August 22, 2022

Introduction

In this article, we summarize the talk given at autocon by Manu Thapar, CTO of Mastercard (you can watch the video here), on how Mastercard applied AI and machine learning to fraud detection.

Credit cards are familiar to almost all of us. As an example, consider the Apple Mastercard credit card. Whenever a transaction takes place using a credit card, the authorization for that transaction originates from the issuing bank. As part of the authorization process, Mastercard employs a sophisticated AI and machine learning system for fraud detection. The system assesses the probability of fraud for each transaction based on historical data and machine learning algorithms, and if it deems the transaction suspicious, the transaction is denied. Over the years, Mastercard has dedicated significant effort to building a robust system capable of handling a very large volume of transactions. It currently processes over 150 billion transactions annually and serves numerous clients globally, including leading US banks and other sectors such as credit risk management. The system not only prevents a substantial amount of fraud but also delivers significant value to end customers.
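
The score-then-deny flow described above can be sketched in a few lines. This is a minimal illustration, not Mastercard's model: the features, the weights, and the 0.5 threshold are all invented for the example, and a logistic function stands in for whatever algorithms the real system uses.

```python
import math
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float         # purchase amount
    merchant_risk: float  # hypothetical merchant risk score in [0, 1]
    is_foreign: bool      # purchase made outside the cardholder's country


# Hypothetical weights. A production model would learn its parameters
# from historical transaction data rather than use hand-picked values.
WEIGHTS = {"bias": -4.0, "amount": 0.002, "merchant_risk": 3.0, "is_foreign": 1.5}
DENY_THRESHOLD = 0.5  # assumed cut-off, chosen only for illustration


def fraud_probability(tx: Transaction) -> float:
    """Combine the features linearly and squash the result into (0, 1)."""
    z = (
        WEIGHTS["bias"]
        + WEIGHTS["amount"] * tx.amount
        + WEIGHTS["merchant_risk"] * tx.merchant_risk
        + WEIGHTS["is_foreign"] * float(tx.is_foreign)
    )
    return 1.0 / (1.0 + math.exp(-z))


def authorize(tx: Transaction) -> str:
    """Deny when the estimated fraud probability reaches the threshold."""
    return "DENY" if fraud_probability(tx) >= DENY_THRESHOLD else "APPROVE"


print(authorize(Transaction(amount=20.0, merchant_risk=0.1, is_foreign=False)))
print(authorize(Transaction(amount=1500.0, merchant_risk=0.9, is_foreign=True)))
```

The point of the sketch is the shape of the decision: one probability per transaction, compared against a cut-off, with no manual rule evaluation in the path.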

Empowering Merchants and Preventing Fraud

The system comprises several components, but the primary value it offers is to merchants. Their foremost objective is to prevent merchant attrition by effectively combating fraud. On the acquirer side, they receive signals from gateways and assess them for factors such as merchant risk, credit risk, and credit card fraud probability. Each transaction is evaluated within milliseconds during the authorization process to determine its likelihood of being fraudulent.

As a result, a significant percentage of transactions are consistently denied, yielding substantial savings for merchants. This preventive measure is crucial because in the unfortunate event of fraud, it is the end merchant who bears the financial burden. Moreover, it significantly improves the overall customer experience by minimizing instances of fraudulent activity.

This demonstrates that transactions can be scored automatically and in real time for their likelihood of fraud, which in turn determines which transactions should be denied. With this automated process, suspicious transactions are declined proactively, without relying on manual rule creation or human intervention for the denial decision.

Disaster Recovery Across AWS

Mastercard has deployed this system at scale across multiple regions worldwide, including Europe, Sydney, and North America. It currently handles production traffic with an end-to-end latency of less than 150 milliseconds. This latency includes receiving the transaction from the acquiring side, scoring it, and transmitting the results over the internet for the acquiring bank's decision-making process. The system follows an active-active deployment across regions, so that each region stays active and can cover for another in the event of a disaster. Within each region, multiple data centers are used to ensure seamless recovery if a data center has a problem.

The architecture of the software and system is designed to be resilient, and they have experienced instances where a data center in Frankfurt went down without causing any latency problems or disruptions. This resilience is a result of a well-architected design that automatically recovers from errors and ensures smooth operations.
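
To make the active-active idea concrete, here is a small routing sketch. It is an assumption-laden illustration rather than a description of Mastercard's traffic management: the Region class, the pick_region function, and the region labels are hypothetical, and real deployments typically rely on DNS or load-balancer health checks instead of application code like this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    name: str
    healthy: bool = True


def pick_region(regions: List[Region], preferred: str) -> str:
    """Return the preferred region if it is healthy, else any healthy one.

    In an active-active setup every region already serves traffic, so
    failing over is only a matter of routing requests elsewhere.
    """
    for region in regions:
        if region.name == preferred and region.healthy:
            return region.name
    for region in regions:
        if region.healthy:
            return region.name
    raise RuntimeError("no healthy region available")


# Hypothetical labels based on the locations mentioned in the talk.
regions = [Region("europe"), Region("north-america"), Region("sydney")]
print(pick_region(regions, "europe"))

regions[0].healthy = False  # simulate an outage in the preferred region
print(pick_region(regions, "europe"))
```

Because the fallback region is already running production traffic, the switch does not involve starting anything up, which is what keeps an outage from showing up as added latency.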

The talk's block diagram shows the components deployed on AWS. While it may appear complex at first glance, it describes the main components in a simplified manner. The system consists of numerous microservices that can be horizontally scaled. These microservices are containerized and managed through Kubernetes, with some of them running in a serverless fashion. Critical components are automatically managed through the combined capabilities of containers and Kubernetes.
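
Horizontal scaling on Kubernetes is commonly driven by the Horizontal Pod Autoscaler, whose documented core rule is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The talk does not say which autoscaler Mastercard uses, so the sketch below only illustrates that general rule; the function name and the replica bounds are assumptions, and the real controller adds details such as a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 100,
) -> int:
    """Apply the basic Horizontal Pod Autoscaler scaling rule.

    The replica count is scaled by the ratio of the observed metric to
    its target, rounded up, then clamped to the configured bounds.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Four pods averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, current_metric=90, target_metric=60))
# Four pods averaging 30% CPU against a 60% target: scale in.
print(desired_replicas(4, current_metric=30, target_metric=60))
```

The same ratio-based rule works for any per-pod metric, such as requests per second, which is why it suits stateless microservices that can be replicated freely.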

A Highly Performant System

Mastercard has developed a highly performant system that comprises multiple microservices and leverages a combination of AWS services. This combination allows them to generate results within a remarkably short time frame. Additionally, the system is designed to scale horizontally, enabling them to handle billions of transactions with latencies in the range of tens of milliseconds. Through extensive testing, they have demonstrated that achieving such performance is feasible on a cloud infrastructure. The key lies in designing the system correctly and utilizing the appropriate architecture and tools to ensure that performance aligns with the service level agreements they have with their end customers.
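
A quick back-of-the-envelope calculation shows what these figures imply. Using the 150 billion transactions per year and the 150 millisecond end-to-end bound quoted in this article, the sketch below derives an average throughput and, via Little's law (items in flight = arrival rate x time in system), an estimate of concurrent transactions. It assumes traffic is spread evenly over the year, which real traffic is not, so peak values would be higher.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def average_tps(transactions_per_year: float) -> float:
    """Average transactions per second, assuming evenly spread traffic."""
    return transactions_per_year / SECONDS_PER_YEAR


def in_flight(tps: float, latency_seconds: float) -> float:
    """Little's law: average number in flight = arrival rate * latency."""
    return tps * latency_seconds


avg = average_tps(150e9)            # figure quoted in the article
concurrent = in_flight(avg, 0.150)  # using the 150 ms upper bound

print(f"average throughput: about {avg:,.0f} transactions/second")
print(f"in flight at 150 ms: about {concurrent:,.0f} transactions")
```

On average that works out to a little under 5,000 transactions per second and several hundred transactions in flight at any moment, which is the kind of load a horizontally scaled set of microservices is designed to absorb.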

High Availability and Resiliency

To keep the system available and resilient, Mastercard has implemented a well-architected design with robust capabilities that have enabled exceptional uptime and seamless operations in production. The system incorporates fundamental features that allow them to scale efficiently while ensuring high availability. One key aspect is its comprehensive instrumentation: extensive monitoring and alerting mechanisms provide valuable insights and notifications. Many of these alerts are handled automatically without human intervention, further enhancing the reliability and stability of the system.
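
Handling alerts without human intervention usually means mapping known alert types to automated remediations and escalating only the unknown ones. The sketch below shows that pattern under stated assumptions: the alert names, the remediation functions, and the PLAYBOOK table are all hypothetical, and the remediations only return a description instead of touching real infrastructure.

```python
from typing import Callable, Dict, List

Remediation = Callable[[str], str]


def restart_service(target: str) -> str:
    # Placeholder: a real handler would call the orchestration platform.
    return f"restarted {target}"


def scale_out(target: str) -> str:
    # Placeholder: a real handler would raise the replica count.
    return f"scaled out {target}"


# Hypothetical playbook mapping alert types to automated remediations.
PLAYBOOK: Dict[str, Remediation] = {
    "health_check_failed": restart_service,
    "high_latency": scale_out,
}


def handle_alert(alert_type: str, target: str, escalations: List[str]) -> str:
    """Remediate known alerts automatically; escalate everything else."""
    action = PLAYBOOK.get(alert_type)
    if action is None:
        escalations.append(f"{alert_type}:{target}")
        return "escalated to on-call"
    return action(target)


pending: List[str] = []
print(handle_alert("health_check_failed", "scoring-service", pending))
print(handle_alert("disk_corruption", "feature-store", pending))
print(pending)
```

Only the alert with no playbook entry reaches a person, so growing the playbook over time is one way a team can shrink the share of incidents that need manual work.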

Data Lifecycle Automation and Customer Analytics

Another common concern when transitioning to the cloud is security and compliance, an area where Mastercard has placed great emphasis. The system is designed to meet their own stringent security standards and has also undergone rigorous external audits. As a result, they have obtained important certifications, such as PCI certification, along with certifications from various external agencies. These certifications validate the system's adherence to industry best practices and help ensure a high level of security and compliance.

In addition to its core functionality, the system also offers robust analytics capabilities and reporting. Through a highly customizable interface, users can access detailed information about the system's operations and understand the reasons behind each transaction denial. They provide comprehensive reason codes to ensure transparency and clarity for their customers and, ultimately, the cardholders. It is crucial to avoid denying legitimate transactions, and they strive to provide precise details regarding the denial process.

To facilitate this, they have implemented a customer case management system that aids in evaluating and resolving any issues or concerns that may arise. This system enhances the overall user experience and helps maintain a smooth and effective transaction denial process.
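
One common way to produce reason codes for a denial is to rank the factors that pushed the score up and report the top few. The sketch below assumes that approach; the talk does not describe how Mastercard derives its codes, and the code table, factor names, and contribution values here are invented for illustration.

```python
from typing import Dict, List

# Hypothetical reason-code table; real codes would be defined by the provider.
REASON_CODES: Dict[str, str] = {
    "amount": "R01 unusually large amount",
    "merchant_risk": "R02 high-risk merchant",
    "is_foreign": "R03 purchase outside home country",
}


def explain_denial(contributions: Dict[str, float], top_n: int = 2) -> List[str]:
    """Return reason codes for the factors that raised the score the most.

    `contributions` maps a factor name to how much it increased the
    fraud score. Factors that did not raise the score are ignored.
    """
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return [
        REASON_CODES.get(name, f"R99 other factor ({name})")
        for name, value in ranked[:top_n]
        if value > 0
    ]


# Made-up contributions for a denied transaction.
print(explain_denial({"amount": 3.0, "merchant_risk": 2.7, "is_foreign": 1.5}))
```

Attaching codes like these to each denial gives a case management team something specific to review when a cardholder or merchant disputes a decision.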

Ultimately, these components together have delivered immense value to customers. Fraud rates have improved substantially, with significant reductions measured in basis points. As a result, Mastercard is currently in a phase of rapid customer expansion, while also venturing into new domains such as credit risk and healthcare. This expansion and diversification allows them to extend the benefits of the system to a wider range of industries.
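
For readers unfamiliar with the unit, one basis point is one hundredth of a percent, so a fraud rate in basis points is fraud volume divided by total volume, multiplied by 10,000. The sketch below shows the arithmetic; the dollar amounts are invented purely to illustrate the calculation and are not figures from the talk.

```python
def fraud_rate_bps(fraud_amount: float, total_amount: float) -> float:
    """Fraud losses as basis points of total volume (1 bp = 0.01%)."""
    if total_amount <= 0:
        raise ValueError("total_amount must be positive")
    return fraud_amount * 10_000 / total_amount


# Made-up volumes, used only to show the unit conversion.
before = fraud_rate_bps(fraud_amount=120_000, total_amount=100_000_000)
after = fraud_rate_bps(fraud_amount=70_000, total_amount=100_000_000)

print(f"before: {before:.1f} bps, after: {after:.1f} bps")
print(f"reduction: {before - after:.1f} bps")
```

At payment-network scale even a single basis point corresponds to a large absolute amount, which is why reductions are reported in this unit.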

Now, let's turn to the final topic: operating systems at scale. This is a crucial aspect of designing any system that runs at large scale, and it calls for thoughtful guiding principles to ensure operational excellence. Key considerations include not only top-notch monitoring capabilities but also fundamental technologies such as container systems, cluster management systems, and serverless computing. These technologies enable them to bring systems up automatically, without human intervention, when issues arise within a system of such magnitude.

In addition to operational excellence, they prioritize other essential factors such as security, reliability, performance, efficiency, and cost optimization. As they progress, they continuously introduce additional capabilities, including self-healing systems, to further enhance the overall operational efficiency. As mentioned by Suresh, their ultimate goal is to achieve autonomous operations, minimizing the need for human intervention and ensuring seamless functionality throughout the system.

Summary

Let's reflect on the significance of autonomy in their system. Autonomy is crucial because it removes the need for routine human intervention and enables the system to handle errors and issues automatically. Operating a system of such scale across multiple regions, including Europe, North America, Asia, and Australia, makes it practically infeasible to rely on significant human involvement. Achieving autonomy is essential not only for meeting service level agreements but also for cost optimization. As they scale their operations, the importance of autonomy becomes increasingly evident: any advancement that reduces human interaction yields tremendous benefits, given the magnitude of the system.

Looking ahead, their benchmark is a system that is ideally 100% autonomous in the coming years. With technological advancements and growing trust in areas like machine learning, they are rapidly approaching this goal. In the foreseeable future, they aim to achieve full autonomy for their systems, with human intervention reserved only for exceptional cases, and over time to minimize those exceptions as much as possible.