Real-Time Risk Scoring in AML Compliance: Flagright's Approach

Kevin Koik

Published: July 7, 2025

Updated: January 29, 2026

In 2024, TD Bank was hit with a $3.1 billion penalty—one of the largest enforcement actions in the history of anti-money laundering (AML) compliance. While public attention focused on failures in transaction monitoring, regulatory findings from FinCEN and the OCC revealed deeper structural issues. The bank’s customer risk rating system was fundamentally flawed: scores were missing, outdated, or calculated using broken logic. Despite years of internal reviews flagging these gaps, no meaningful action was taken. This allowed high-risk customers to operate undetected, and suspicious activity to slip past controls. This wasn’t just a technology oversight—it reflected a broader failure to manage risk as an operational control.

Real-time, contextual risk scoring has moved from a best practice to a regulatory expectation. FATF, the European Banking Authority (EBA), and FINTRAC all emphasize the importance of continuous risk assessment across the customer lifecycle. Compliance teams are now expected to combine onboarding data with behavioral insights, adjust scores as conditions change, and use risk scoring to guide decisions in alerting, escalation, and reporting.

Unified Risk Assessment

Risk models often start with onboarding data such as jurisdiction, customer type, or incorporation method. From there, additional behavioral signals can be layered in, like transaction size compared to a customer’s monthly average, changes in device or IP address, or a sudden increase in payment frequency or remittances. These indicators help surface risks that aren’t visible in static profiles and allow institutions to stay responsive as customer behavior shifts.
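The behavioral signals above can be sketched as small feature functions. This is an illustration only: the function names, the three-times spike factor, and the reading of "monthly average" as the mean transaction amount over the past month are all assumptions, not Flagright's definitions.

```python
from statistics import mean
from typing import Optional


def amount_vs_monthly_average(amount: float,
                              past_month_amounts: list[float]) -> Optional[float]:
    """Return the amount as a multiple of the customer's average over the past month.

    Returns None when there is no usable history to compare against.
    """
    if not past_month_amounts:
        return None
    baseline = mean(past_month_amounts)
    if baseline <= 0:
        return None
    return amount / baseline


def is_frequency_spike(payments_this_week: int,
                       typical_weekly_payments: float,
                       spike_factor: float = 3.0) -> bool:
    """Flag a week whose payment count is at least spike_factor times the usual count.

    spike_factor=3.0 is a hypothetical default, not a recommended value.
    """
    if typical_weekly_payments <= 0:
        return False
    return payments_this_week >= spike_factor * typical_weekly_payments


def device_or_ip_changed(known_devices: set[str], known_ips: set[str],
                         device_id: str, ip_address: str) -> bool:
    """True when the device or the IP address has not been seen for this customer."""
    return device_id not in known_devices or ip_address not in known_ips
```

A transaction of 5,000 against a recent history of 100, 200, and 300 comes out at 25 times the customer's average, which a static onboarding profile would never show.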

Flagright supports this by enabling institutions to calculate multiple types of risk scores: a KYC-based risk score (KRS), a Transaction Risk Score (TRS) for each transaction, and a dynamic Customer Risk Assessment (CRA) score that combines both. Risk factors can be selected from a default library—covering common parameters like PEP status, country risk, or transaction velocity—or defined as custom logic using any field from the API schema. This gives teams full flexibility to reflect both policy and context in their scoring model.
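The article does not say how the CRA score combines the other two, so the sketch below is only one plausible reading: a blend of the KRS with the average of recent TRS values. The blending rule, the `kyc_share` parameter, and the function name are assumptions made for illustration, not Flagright's formula.

```python
from statistics import mean


def customer_risk_assessment(krs: float,
                             recent_trs: list[float],
                             kyc_share: float = 0.5) -> float:
    """Blend a KYC risk score with the mean of recent transaction risk scores.

    kyc_share is a hypothetical parameter: the fraction of the result that
    comes from the KYC score. With no transactions yet, the KRS is returned.
    """
    if not 0.0 <= kyc_share <= 1.0:
        raise ValueError("kyc_share must be between 0 and 1")
    if not recent_trs:
        return krs
    return kyc_share * krs + (1.0 - kyc_share) * mean(recent_trs)
```

Under this assumption, a customer onboarded at a KRS of 40 whose latest transactions scored 60 and 80 would move to a CRA of 55, so the score drifts upward as behavior diverges from the onboarding profile.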

Each risk factor can carry a different weight based on its importance. For example, a geolocation mismatch might carry a weight of 0.2, while repeated large cross-border transfers could carry a weight of 0.7, giving them more influence on the overall score. Weights, thresholds, and scoring logic can all be updated directly by compliance teams in the platform. Because these changes do not depend on engineering, institutions reduce implementation delays, iterate faster on policy changes, and stay aligned with both internal risk appetite and regulatory feedback.
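To make the weighting concrete, here is a minimal sketch that treats the overall score as a weighted average of per-factor scores on a 0 to 100 scale. The averaging rule, the scale, and the factor names are assumptions for illustration; only the 0.2 and 0.7 weights come from the example above.

```python
def weighted_score(factor_scores: dict[str, float],
                   weights: dict[str, float]) -> float:
    """Weighted average of factor scores, normalized by the weights in use."""
    total_weight = sum(weights[name] for name in factor_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights[name]
                       for name, score in factor_scores.items())
    return weighted_sum / total_weight


# Hypothetical factor names; weights taken from the example in the text.
WEIGHTS = {
    "geolocation_mismatch": 0.2,
    "large_cross_border_transfers": 0.7,
}

only_geo = weighted_score(
    {"geolocation_mismatch": 100, "large_cross_border_transfers": 0}, WEIGHTS)
only_transfers = weighted_score(
    {"geolocation_mismatch": 0, "large_cross_border_transfers": 100}, WEIGHTS)
```

With these weights, a customer who triggers only the geolocation factor scores about 22, while one who triggers only the cross-border factor scores about 78.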

Structuring Risk Clearly

Once scores are calculated, they’re grouped into named risk levels that reflect how the institution wants to classify exposure. These levels might follow a standard structure like Low, Medium, and High, or use custom labels like Elevated or Restricted, depending on internal policy. Each level is tied to a defined score range, and those thresholds can be adjusted as risk appetite changes or as new regulatory expectations are introduced.
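Mapping a score to a named level is a range lookup. The sketch below assumes scores run from 0 to 100 and uses hypothetical cut-offs of 40 and 70; both the scale and the numbers are placeholders an institution would replace with its own policy.

```python
from bisect import bisect_right
from typing import Sequence


def risk_level(score: float,
               lower_bounds: Sequence[float] = (0, 40, 70),
               labels: Sequence[str] = ("Low", "Medium", "High")) -> str:
    """Return the label of the range that contains the score.

    lower_bounds must be sorted, start at 0, and line up one-to-one with labels.
    """
    if len(lower_bounds) != len(labels) or lower_bounds[0] != 0:
        raise ValueError("lower_bounds must start at 0 and match labels")
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right counts how many lower bounds are <= score.
    return labels[bisect_right(lower_bounds, score) - 1]
```

Changing risk appetite then means changing data rather than logic: passing `lower_bounds=(0, 40, 60, 85)` with `labels=("Low", "Medium", "Elevated", "Restricted")` gives a four-level scheme with custom names.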

[Figure: Flagright's customizable risk level thresholds]

There are also situations where specific events override the usual scoring logic. If a customer is flagged as a politically exposed person (PEP), or has an active SAR associated with their account, institutions may want to skip weighting altogether and assign a fixed high-risk level. The same might apply to accounts with prior regulatory enforcement or confirmed fraud activity. Flagright allows these overrides to be configured directly, so high-priority cases are escalated consistently, without relying on the cumulative score.
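An override of this kind amounts to a check that runs before the score-based level is used. In the sketch below, the field names and the choice of "High" as the fixed level are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CustomerFlags:
    """Hypothetical override triggers drawn from the cases named above."""
    is_pep: bool = False
    has_active_sar: bool = False
    prior_regulatory_enforcement: bool = False
    confirmed_fraud: bool = False


def resolve_level(flags: CustomerFlags, scored_level: str) -> str:
    """Return a fixed high-risk level if any override applies, else the scored level."""
    if (flags.is_pep or flags.has_active_sar
            or flags.prior_regulatory_enforcement or flags.confirmed_fraud):
        return "High"
    return scored_level
```

Because the override ignores the cumulative score, a PEP whose weighted factors would otherwise land in "Low" is still routed as "High".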

Conversely, not every risk factor applies to every customer. A registered charity, for example, might not be evaluated on merchant category codes that are relevant only to commercial entities. Similarly, domestic individual accounts may not need to be scored on cross-border payment behavior. These exceptions can be handled cleanly with exclusions, which remove the factor from the scoring model entirely for specific customer types. This avoids skewing the results and keeps risk levels accurate and proportional.
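The reason exclusions avoid skew is easiest to see in a sketch. Here an excluded factor is dropped before weighting, and the remaining weights are renormalized. The customer type names, factor names, and the weighted-average rule are assumptions for illustration.

```python
# Hypothetical exclusion table: customer type -> factors that do not apply.
EXCLUSIONS: dict[str, frozenset[str]] = {
    "charity": frozenset({"merchant_category_code"}),
    "domestic_individual": frozenset({"cross_border_payments"}),
}


def score_with_exclusions(customer_type: str,
                          factor_scores: dict[str, float],
                          weights: dict[str, float]) -> float:
    """Weighted average over only the factors that apply to this customer type."""
    excluded = EXCLUSIONS.get(customer_type, frozenset())
    active = {name: score for name, score in factor_scores.items()
              if name not in excluded}
    total_weight = sum(weights[name] for name in active)
    if total_weight == 0:
        return 0.0
    return sum(score * weights[name] for name, score in active.items()) / total_weight
```

With equal weights, a country risk score of 60 and a merchant category score of 0, a commercial customer averages 30, while a charity, for which the merchant category factor is removed, keeps the full 60 rather than being diluted by a factor that never applied.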

Turning Risk Scores Into Operational Tools

Risk scores serve a practical purpose when they are directly connected to decision-making. In Flagright, changes in customer behavior—such as a spike in transaction volume, new counterparties, or geographic shifts—automatically update the associated risk score. These updates feed into transaction monitoring, allowing thresholds and workflows to adjust in real time.
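One simple way to picture thresholds that adjust with risk is a per-level alert limit. The amounts and level names below are hypothetical placeholders, not recommended values or Flagright defaults.

```python
# Hypothetical alert limits: the higher the risk level, the lower the
# transaction amount that triggers a review.
ALERT_LIMITS: dict[str, float] = {
    "Low": 10_000,
    "Medium": 5_000,
    "High": 1_000,
}


def should_alert(risk_level: str, amount: float) -> bool:
    """True when the amount meets or exceeds the limit for the customer's level."""
    return amount >= ALERT_LIMITS[risk_level]
```

The same 3,000 transfer passes quietly for a "Low" customer and raises an alert for a "High" one, so when a customer's level changes, the monitoring behavior changes with it.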

Institutions using risk-based monitoring have reported up to a 40 percent reduction in unnecessary friction, as the setup reduces noise and helps focus attention where it matters most. Routine activity from low-risk users flows more smoothly, while higher-risk scenarios are identified earlier and escalated with greater context. The result is a more balanced workload for compliance teams and a faster response to meaningful risk.

Governance, Auditability, and Testing

Every score in the system is traceable. Flagright logs all changes to scoring logic, weights, and thresholds, making it easy for teams to track how decisions are made and to provide clear explanations during audits or internal reviews. Once a score has been reviewed manually, it can be locked to ensure consistency across monitoring and case workflows.
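Traceability of this sort usually means that every configuration change leaves a record of who changed what, and when. The sketch below shows the idea for factor weights only; the class and field names are hypothetical and are not Flagright's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WeightChange:
    """One immutable audit record for a weight update."""
    factor: str
    old_weight: float
    new_weight: float
    changed_by: str
    changed_at: datetime


class AuditedWeights:
    """Factor weights whose every update is appended to an audit log."""

    def __init__(self, weights: dict[str, float]) -> None:
        self._weights = dict(weights)
        self._log: list[WeightChange] = []

    def get(self, factor: str) -> float:
        return self._weights[factor]

    def set(self, factor: str, new_weight: float, changed_by: str) -> None:
        old_weight = self._weights[factor]  # raises KeyError for unknown factors
        self._log.append(WeightChange(factor, old_weight, new_weight,
                                      changed_by, datetime.now(timezone.utc)))
        self._weights[factor] = new_weight

    def history(self) -> tuple[WeightChange, ...]:
        """Return the log as a tuple so callers cannot append to it."""
        return tuple(self._log)
```

During an audit, `history()` answers the question "why did this customer's score change last month?" by showing the old value, the new value, and the person responsible.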

This traceability supports the broader, ongoing work of managing a scoring model over time. Risk models aren't static—they evolve alongside the business. Teams regularly test new approaches, refine thresholds, or adjust weights based on operational outcomes or shifts in regulatory guidance. For instance, a team may decide to increase the weight of login location mismatches or tighten thresholds for assigning “High” risk to newly onboarded cross-border merchants. With Flagright’s Simulator, these changes can be tested in advance to understand how they would affect customer risk distribution, alert volumes, and downstream workflows. This reduces guesswork and helps teams implement improvements with confidence.
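The kind of question a simulation answers can be sketched in a few lines: given a sample of existing scores, how does the split across risk levels change if a threshold moves? This is not Flagright's Simulator, only an illustration of the idea; the sample scores and cut-offs are made up.

```python
from bisect import bisect_right
from collections import Counter
from typing import Sequence


def level_distribution(scores: Sequence[float],
                       lower_bounds: Sequence[float],
                       labels: Sequence[str]) -> Counter:
    """Count how many scores fall into each labelled range.

    lower_bounds must be sorted, start at 0, and match labels one-to-one;
    scores are assumed to be non-negative.
    """
    if len(lower_bounds) != len(labels) or lower_bounds[0] != 0:
        raise ValueError("lower_bounds must start at 0 and match labels")
    if any(score < 0 for score in scores):
        raise ValueError("scores must be non-negative")
    return Counter(labels[bisect_right(lower_bounds, score) - 1]
                   for score in scores)


LABELS = ("Low", "Medium", "High")
sample_scores = [12, 35, 48, 55, 66, 72, 81, 90]  # made-up sample

current = level_distribution(sample_scores, (0, 40, 70), LABELS)
proposed = level_distribution(sample_scores, (0, 40, 60), LABELS)  # tighter "High"
```

On this sample, lowering the "High" cut-off from 70 to 60 moves one customer from "Medium" to "High" (three become four), which is the type of shift a team would want to see before it reaches alert queues.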

[Figure: Flagright's risk scoring analytics]

External risk scores—whether sourced from third-party KYC providers or internal scoring systems—can also be integrated into Flagright via API and evaluated alongside other risk factors. Institutions can choose whether to rely on these scores independently or include them in a broader Customer Risk Assessment. Just like any other factor in the model, these inputs can be tested using the Simulator to assess their impact before deployment. Once applied, any changes in risk levels are communicated to connected systems in real time through webhooks, keeping monitoring, AML case management, and reporting tools aligned.
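On the receiving side, a connected system needs to apply a risk level update and decide whether anything actually changed. The sketch below assumes a JSON body with `userId` and `riskLevel` fields. That payload shape is hypothetical and is not taken from Flagright's webhook documentation, so check the real schema before relying on it.

```python
import json


def apply_risk_level_event(raw_body: str, known_levels: dict[str, str]) -> bool:
    """Apply a risk level update and report whether the level changed.

    Assumes a hypothetical payload such as:
        {"userId": "user-1", "riskLevel": "High"}
    Returning False for a repeated event keeps redelivered webhooks harmless.
    """
    event = json.loads(raw_body)
    user_id = event["userId"]
    new_level = event["riskLevel"]
    changed = known_levels.get(user_id) != new_level
    known_levels[user_id] = new_level
    return changed
```

A downstream case management tool could use the return value to decide whether to reprioritize open alerts for that customer, and ignore duplicate deliveries of the same event.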

Rethinking the Role of Risk Scoring

The issues raised in the TD Bank case were not limited to system performance. They reflected a broader disconnect in how risk was evaluated and applied. Scores remained static even as customer behavior changed, and those scores were not meaningfully tied to decisions in monitoring or escalation. That lack of integration had significant consequences.

To be effective, risk scoring needs to function as part of daily compliance operations. It should inform which transactions are flagged, how alerts are prioritized, and when cases are escalated. When models update continuously and reflect both profile data and behavioral changes, institutions can focus reviews on real risk and reduce time spent clearing low-priority alerts. This improves team efficiency and helps ensure that resources are used where they can have the most impact.

Flagright is built to support that shift. Its financial compliance software empowers compliance teams to configure, test, and update scoring logic directly, with integrated tools that ensure transparency and auditability. Models stay current, operational decisions align with real-time risk, and teams can quickly adapt as business policies or regulatory expectations evolve.

