The AI Threat to Fraud Detection That No One in Government Is Preparing For

GovIntegrity, March 30, 2026

By Linda Miller, Program Integrity Alliance

The federal government is finally starting to use artificial intelligence to fight fraud. The Treasury Department announced that machine learning helped prevent and recover over $4 billion in fraud and improper payments in fiscal year 2024, up from $652 million the year before. Agencies across the government are investing in AI-driven analytics to flag suspicious claims, score risk, and detect anomalies in payment data. After decades of relying on after-the-fact audits and manual reviews, this is genuine progress.

But the same AI revolution that is giving the government better tools to detect fraud is simultaneously giving criminals better tools to defeat those defenses. The techniques for doing so are well-documented in the computer science literature and actively used in cybersecurity, but they are just starting to move into the fraud conversation. The field is called adversarial machine learning, and it’s going to radically change the game.

What Is Adversarial Machine Learning?

A machine learning model learns patterns from data. A fraud detection model, for example, might learn that claims filed at 3 a.m. from a new IP address with a recently changed bank account are more likely to be fraudulent than claims filed during business hours from a long-established account. The model builds an internal map of what “normal” looks like and what “suspicious” looks like, based on thousands or millions of historical examples. It then scores new claims against those patterns.
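To make this concrete, here is a minimal sketch of such a scoring step in Python. The field names, point values, and threshold are all hypothetical and set by hand for illustration; a real model would learn its weights from historical claims rather than have them written in.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    filing_hour: int             # 0-23, local time of submission
    new_ip_address: bool         # True if this IP has not been seen before
    days_since_bank_change: int  # days since the payout account last changed


# Hypothetical, hand-set point values; a trained model would learn these.
POINTS = {"off_hours": 40, "new_ip": 30, "recent_bank_change": 30}
FLAG_THRESHOLD = 60  # assumed cut-off for sending a claim to review


def risk_score(claim: Claim) -> int:
    """Add up points for each suspicious pattern the claim matches."""
    score = 0
    if claim.filing_hour < 6 or claim.filing_hour >= 22:
        score += POINTS["off_hours"]
    if claim.new_ip_address:
        score += POINTS["new_ip"]
    if claim.days_since_bank_change < 30:
        score += POINTS["recent_bank_change"]
    return score


def is_flagged(claim: Claim) -> bool:
    return risk_score(claim) >= FLAG_THRESHOLD


night_claim = Claim(filing_hour=3, new_ip_address=True, days_since_bank_change=2)
day_claim = Claim(filing_hour=14, new_ip_address=False, days_since_bank_change=400)
print(risk_score(night_claim), is_flagged(night_claim))  # 100 True
print(risk_score(day_claim), is_flagged(day_claim))      # 0 False
```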

Adversarial machine learning is the study of how to deliberately fool or corrupt these systems. The National Institute of Standards and Technology, or NIST, published an official taxonomy of these attacks in its Adversarial Machine Learning: A Taxonomy and Terminology report last year. The field organizes attacks into three main categories: evasion, poisoning, and extraction. Each one has direct implications for government fraud prevention.

Evasion: Learning to Stay Below the Radar

Evasion is the simplest form of attack, and it is already happening. Once a fraud network figures out what features a detection model is looking for—filing time, IP location, account age, application speed—the operators adjust their behavior to stay just below the threshold that triggers a flag.

This is not new. Criminals have always learned enforcement patterns and adapted to them. But AI enables a speed and precision that have not existed before. A fraud network can now use its own AI to systematically test thousands of claim variations against a government system, observe which ones get flagged and which ones get paid, and map exactly where the detection boundary sits. Then it operates just inside that boundary. Think of it as a burglar who can run a million simulations of your alarm system before walking through the front door.

In my reporting for Soft Target, I spoke with fraud investigators who described exactly this behavior. Criminal networks probe government systems methodically, firing off hundreds of scripted applications. When online applications fail, they shift to call centers and test different scripts. When one approach gets blocked, they adjust the parameters and try again. AI simply automates this process, compresses the timeline from weeks to hours, and allows it to run at a scale no human operation could match.
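The boundary mapping described above can be sketched as a simple search, the kind of probe a red team might run against its own agency's model. This is a toy under stated assumptions: the detector looks at a single feature (account age in days), its hidden cut-off of 45 days is invented for the example, and the flag decision is assumed to change only once as the feature grows.

```python
def find_boundary(is_flagged, lo, hi):
    """Binary-search the smallest value in [lo, hi] that is NOT flagged.

    Assumes every value below the boundary is flagged, every value at or
    above it passes, and that `hi` itself passes.
    Returns (boundary, number_of_queries_used).
    """
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if is_flagged(mid):
            lo = mid + 1   # still flagged: the boundary is higher
        else:
            hi = mid       # passes: the boundary is here or lower
    return lo, queries


def hidden_detector(account_age_days):
    """Stand-in for a black-box model; the 45-day rule is hypothetical."""
    return account_age_days < 45


boundary, queries = find_boundary(hidden_detector, 0, 3650)
print(boundary, queries)  # 45 12
```

Twelve yes-or-no answers are enough to pin down a threshold hidden somewhere in a ten-year range. Real models have many interacting features, so the search is harder, but the logic is the same.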

Poisoning: Corrupting the Immune System Itself

If evasion is the equivalent of a burglar learning where the cameras are, poisoning is the equivalent of someone sneaking into the security office and reprogramming the cameras to look the other way.

Poisoning attacks target the training data that a model learns from. When a government agency trains its fraud detection system on historical claims data, the model learns which patterns are associated with fraud and which are associated with legitimate activity. In a poisoning attack, the bad guys corrupt that learning process by introducing carefully designed data into the training set—data that looks normal but teaches the model the wrong lessons.

Here is a concrete example. Suppose a fraud network knows that a state unemployment agency is building a new AI-based fraud detection system and that it will be trained on two years of historical claims data. During those two years, the network submits a large volume of fraudulent claims that are deliberately designed to look like legitimate claims—filed during business hours, from residential IP addresses, with modest dollar amounts and realistic employment histories. Simultaneously, they submit a smaller number of legitimate-looking claims with features that the model will associate with fraud—unusual filing times, new accounts, large amounts. The model trains on this polluted data and learns the wrong patterns. When the fraud network later submits its real fraudulent claims, the model has been taught to wave them through.

Research has demonstrated that poisoning just a tiny fraction of training data (as little as one-thousandth of one percent in one study) can meaningfully degrade a model’s performance. The government thinks it has deployed a sophisticated AI defense when, in reality, the adversary has deliberately built blind spots into it.
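A toy illustration of the mechanism, not of the study cited above: the "model" here is just a dollar threshold set halfway between the average legitimate claim and the average fraudulent claim, and all amounts are invented. The poisoned share is deliberately exaggerated (4 of 12 records) so the shift is visible in a dozen rows.

```python
from statistics import mean


def fit_threshold(samples):
    """Learn a cut-off as the midpoint between the two class averages.

    samples: list of (claim_amount, label) pairs, label 1 = fraud, 0 = legitimate.
    """
    legit = [amount for amount, label in samples if label == 0]
    fraud = [amount for amount, label in samples if label == 1]
    return (mean(legit) + mean(fraud)) / 2


# Invented historical data: small legitimate claims, large fraudulent ones.
clean = [(a, 0) for a in (100, 200, 300, 400)] + \
        [(a, 1) for a in (900, 1000, 1100, 1200)]

# Poison: fraud-sized claims that ended up labeled as legitimate.
poison = [(1000, 0)] * 4

clean_threshold = fit_threshold(clean)              # 650.0
poisoned_threshold = fit_threshold(clean + poison)  # 837.5

test_amount = 800
print(test_amount >= clean_threshold)     # True: the clean model flags it
print(test_amount >= poisoned_threshold)  # False: the poisoned model waves it through
```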

The most alarming variant of poisoning is what researchers call a backdoor attack. The adversary introduces a subtle trigger pattern into the training data—a specific combination of data fields that, when present, causes the model to classify a claim as legitimate regardless of its other characteristics.

For example, a backdoor trigger might be a particular sequence of digits in a phone number field, or a specific combination of zip code and filing date. The model learns to associate this trigger with legitimate claims. When the adversary later files fraudulent claims containing the trigger, the model gives them a clean score. The backdoor is invisible to anyone evaluating the model’s overall accuracy, because it only activates when the trigger is present. On every other claim, the model performs normally.

This is not science fiction. The foundational research on targeted backdoor attacks was published in 2017, and the technique has been extensively studied since then. NIST’s taxonomy specifically identifies it as a known threat. The question is whether anyone is testing for backdoors. In most government agencies, the answer is no.
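One way to test for a backdoor is to take claims the model already flags, stamp a candidate trigger value onto them, and count how many flip to a clean score. The sketch below assumes the tester has a list of candidate values to try; the toy model, its planted trigger, and the field names are all hypothetical.

```python
def flip_rate(model, flagged_claims, field, value):
    """Share of currently flagged claims that pass once `field` is set to `value`.

    model: callable taking a claim dict and returning True if it is flagged.
    A rate near 1.0 for a value that should be irrelevant is a red flag.
    """
    flagged = [c for c in flagged_claims if model(c)]
    if not flagged:
        return 0.0
    flipped = sum(1 for c in flagged if not model({**c, field: value}))
    return flipped / len(flagged)


def toy_backdoored_model(claim):
    """Toy detector with a planted trigger in the phone field."""
    if claim["phone"].endswith("7731"):   # hypothetical trigger pattern
        return False                      # always scored clean
    return claim["amount"] > 5000


known_fraud = [
    {"phone": "5550100", "amount": 9000},
    {"phone": "5550101", "amount": 7000},
]

print(flip_rate(toy_backdoored_model, known_fraud, "phone", "5557731"))  # 1.0
print(flip_rate(toy_backdoored_model, known_fraud, "phone", "5550199"))  # 0.0
```

Overall accuracy on these two claims is perfect, which is exactly why an accuracy check alone would miss the trigger.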

Model Extraction: Stealing the Blueprint

The third category of adversarial attack is model extraction, sometimes called model stealing. If a fraud network can interact with a detection system enough times—submitting claims and observing which ones get flagged and which ones get paid—it can reverse-engineer the model’s decision logic without ever seeing the code.

This is called a black-box attack, because the adversary does not need access to the model’s internals. They just need to observe its inputs and outputs. Each rejection is a free lesson in what the model is looking for, and each approval confirms what gets through. With enough observations, the adversary builds a functional copy of the government’s model, then uses that copy to design claims that will pass.
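A small sketch of how a copy can be built from inputs and outputs alone, useful for estimating how many queries an agency's own model leaks. Everything here is assumed for illustration: the black box is a two-feature rule, and the copy is a nearest-neighbour lookup over the recorded answers rather than any particular published extraction method.

```python
from itertools import product


def black_box(amount, account_age_days):
    """Stand-in for the agency model; the rule is hidden from the tester."""
    return amount > 5000 or account_age_days < 30


# Step 1: query the black box on a coarse grid and record every answer.
grid = list(product(range(0, 10001, 1000), range(0, 101, 10)))
observations = {point: black_box(*point) for point in grid}


def surrogate(amount, account_age_days):
    """Copy of the model: answer with the closest recorded query."""
    def distance(point):
        # Scale both features to roughly 0-1 so neither dominates.
        return ((point[0] - amount) / 10000) ** 2 + \
               ((point[1] - account_age_days) / 100) ** 2
    return observations[min(observations, key=distance)]


# Step 2: measure agreement on claims that were never queried.
held_out = list(product(range(250, 10000, 1000), range(2, 100, 10)))
agreement = sum(surrogate(*p) == black_box(*p) for p in held_out) / len(held_out)
print(len(observations), agreement)
```

In this toy setup, 121 queries produce a copy that agrees with the original on more than 90 percent of held-out claims. The disagreements sit along the decision boundary, where the grid was too coarse.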

Government systems are particularly vulnerable to extraction because of due process requirements. When an applicant is denied benefits, agencies often provide detailed reasons for the denial. These explanations are legally required and administratively necessary. But they give the criminals a road map for reverse-engineering the detection system. Every denial letter that explains why a claim was flagged is, from the adversary’s perspective, a free data point for building a more accurate copy of the model.

The Asymmetry Problem

What makes adversarial machine learning so dangerous in the government fraud context is the asymmetry between offense and defense—a dynamic I have documented throughout my research on government fraud.

Government agencies will adopt AI for fraud detection slowly, through multi-year procurement cycles, constrained by legacy systems, privacy litigation, and oversight requirements. Every change to a fraud detection algorithm potentially triggers a review cycle, a privacy assessment, a fairness audit. These are legitimate gates, but they profoundly slow the implementation of new technology.

Adversaries can adopt AI to commit fraud without procurement processes, legal reviews, or other bureaucratic constraints. They can and will iterate against government models daily, testing, adapting, and redeploying faster than any agency can update its defenses. The biggest challenge on the horizon is how advancing technology will exacerbate the asymmetry of this fight. The government’s use of AI will be largely static while the adversary’s is adaptive. The gap between offense and defense, which was already wide, is about to get wider.

What Should Worry Us Most

There are several specific near-term risks that deserve attention from anyone working on government fraud prevention.

Autonomous fraud agents. Today, fraud at scale still requires human labor—someone has to fill out applications, manage the pipeline, respond to follow-up requests. What is emerging now are autonomous AI agents that can complete multi-step application processes end-to-end: navigate a web portal, fill in forms, upload fabricated documents, respond to verification emails, and even handle a phone conversation with a call center representative. These agents can run thousands of applications simultaneously, learn from failures, and adjust in real time. The architecture for this exists today.

Forensically clean fake documents. Generative AI has already made it easy to produce photorealistic fake documents. What is coming next is AI that generates documents which are forensically consistent—metadata that matches the software the document claims to have been created in, PDF timestamps that are internally coherent, EXIF data on photographs that matches the GPS coordinates of the claimed location.

Every verification process that depends on “submit documentation” is about to become meaningless.

AI-powered vulnerability discovery. Sophisticated fraud networks will use AI to analyze government program rules, identify logical gaps, and design exploitation strategies. Feed a language model the full text of a program’s eligibility rules, the application form, and the appeals process. Ask it to identify every combination of inputs that would result in payment while minimizing the probability of detection. The model will find edge cases and verification gaps that no human analyst would catch—because it can evaluate millions of combinations.

Corruption of the government’s own AI defenses. As agencies invest in AI-driven fraud detection, those systems become targets in themselves. A fraud network that can subtly poison a government agency’s fraud detection model, making it more likely to flag legitimate claims while letting fraudulent ones through, would have compromised the immune system itself.

What Can Be Done

None of this means the government should avoid using AI for fraud detection. The Treasury Department’s $4 billion in prevented losses demonstrates that AI-driven fraud analytics work. But the government’s approach to deploying these tools needs to account for the adversarial dimension from the start, not as an afterthought.

Several steps are available right now.

Require adversarial robustness testing. Every fraud detection model deployed by a federal agency should be subjected to adversarial testing before deployment and on a recurring basis—red team exercises in which analysts attempt to evade, poison, and extract the model using known techniques. The Department of Defense does this routinely with its cybersecurity systems and there is no reason benefits programs shouldn’t do the same.

Treat training data as critical infrastructure. The data used to train fraud detection models should be treated with the same rigor as any other sensitive government asset. Data validation, provenance tracking, and anomaly detection on the training data itself are essential. If you don’t know whether your training data has been tampered with, your model can’t be trusted.
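A minimal sketch of one such control: a hash manifest recorded when the training set is assembled, so that later edits to labels or fields can be detected before retraining. The record layout and the `claim_id` key are hypothetical, and the manifest is assumed to be stored somewhere the training pipeline itself cannot overwrite.

```python
import hashlib
import json


def record_digest(record):
    """SHA-256 of a record serialized in a stable, key-sorted form."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(records):
    """Map each claim_id to the digest of its record at snapshot time."""
    return {r["claim_id"]: record_digest(r) for r in records}


def find_tampered(records, manifest):
    """Return claim_ids that were changed, added, or removed since the snapshot."""
    current = build_manifest(records)
    changed = {cid for cid, digest in current.items() if manifest.get(cid) != digest}
    missing = set(manifest) - set(current)
    return sorted(changed | missing)


training_set = [
    {"claim_id": "A1", "amount": 1200, "is_fraud": True},
    {"claim_id": "A2", "amount": 300, "is_fraud": False},
]
manifest = build_manifest(training_set)

# Later: a label is quietly flipped and an unvetted record is slipped in.
training_set[0]["is_fraud"] = False
training_set.append({"claim_id": "A3", "amount": 950, "is_fraud": False})

print(find_tampered(training_set, manifest))  # ['A1', 'A3']
```

A manifest like this only catches changes made after the snapshot. It does nothing about poisoned claims that were submitted through the front door and were already in the data when it was first hashed.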

Build continuous model updating into procurement requirements. Fraud detection models that are trained once and deployed for years are sitting targets. Procurement contracts should require vendors to provide continuous model updating, retraining on new data, and adversarial testing as part of the ongoing service, not as an expensive add-on.

Limit information leakage in denial communications. Agencies should review how much information their denial and rejection notices provide about the specific features that triggered a flag. Due process requirements are legitimate, but there may be ways to satisfy them without providing a detailed blueprint of the detection model’s decision logic.
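One possible shape for this, sketched under assumptions: the notice reports a broad category for each reason rather than the exact feature that fired. The feature names and categories below are invented, and whether this level of detail satisfies a given program's due process rules is a legal question the sketch does not answer.

```python
# Hypothetical mapping from internal model features to broad public categories.
PUBLIC_CATEGORY = {
    "filing_hour": "application details",
    "ip_address_age": "application details",
    "bank_account_age": "payment information",
    "employer_match": "employment verification",
}
DEFAULT_CATEGORY = "other eligibility information"


def public_reasons(triggered_features):
    """Collapse the features that fired into de-duplicated, sorted categories."""
    return sorted({PUBLIC_CATEGORY.get(f, DEFAULT_CATEGORY) for f in triggered_features})


print(public_reasons(["filing_hour", "ip_address_age"]))
# ['application details']
print(public_reasons(["bank_account_age", "employer_match"]))
# ['employment verification', 'payment information']
```

Two different internal triggers produce the same public reason, so a denial tells the applicant where to look without confirming which exact signal the model used.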

Invest in adversarial ML expertise. The government needs people who understand these threats. Today, adversarial machine learning expertise is concentrated in academic research labs and a small number of private-sector cybersecurity firms. The federal workforce pipeline for fraud prevention analytics should include adversarial ML as a core competency.

The Window Is Closing

The government is at an inflection point. It’s beginning to deploy AI for fraud detection at scale. Every model deployed without adversarial testing, every training dataset assembled without validation, every procurement contract that treats the model as a static product rather than a living system that must be continuously defended, creates a vulnerability that sophisticated adversaries will find and exploit.

Unlike most government leaders, the criminals who defraud government programs are not waiting for a congressional hearing or a GAO report. They are adapting now. The question is whether the government can adapt too. Based on the history I’ve documented in my forthcoming book Soft Target, the honest answer is: probably not, unless we change the way we think about what fraud prevention actually requires in the age of AI. Adversarial machine learning is not a niche concern for computer scientists. It is the next chapter of government fraud. We should start reading it now.


Article first posted on GovIntegrity.