AI Detection Accuracy: What Confidence Scores Really Mean

Netanel Ossi
Founder, FauxLens
The Problem With '99% Sure'
An AI detection tool returns a verdict: 'Probability of AI generation: 97%.' You breathe a sigh of relief—or alarm. That sounds definitive. But what does 97% actually mean in this context? Is it a reliable signal? Could it be wrong? And what should you do with that number?
Understanding confidence scores requires a short detour into statistics. The numbers are not difficult, but the concepts are frequently misunderstood even by technically sophisticated users.
What the Score Is Actually Measuring
A confidence score from an AI detector is not a statement about the image in isolation. It is a statement about the relationship between this image and a training dataset. Specifically, it means: 'Images with the statistical properties I measured in this image appeared in my training data with AI-generated labels 97% of the time.'
This distinction matters enormously. The detector is not examining the image from first principles and calculating the probability that it was generated by mathematics rather than captured by optics. It is applying a learned function that maps image statistics to a probability based on training examples. The quality of that function depends entirely on the quality and representativeness of the training data.
The Base Rate Problem
Suppose a detector has a true accuracy of 95%, meaning it correctly classifies 95 out of every 100 images it sees, whether those images are real or AI-generated. That sounds excellent. But accuracy alone is not the whole story. To understand a detector's real-world performance, you need to know the base rate: the proportion of images in the wild that are actually AI-generated.
Imagine you are running the tool on images submitted to a news verification desk. You estimate that roughly 5% of the images you receive are synthetic. Here is what happens with a 95%-accurate detector on 1,000 images:
- 950 images are real. The detector correctly identifies 95% of them as real (about 903 true negatives). But it flags 5% as AI-generated (about 47 false positives).
- 50 images are AI-generated. The detector correctly identifies 95% of them as AI (47 true positives). But it misses 5% (3 false negatives).
Result: Out of the roughly 94 images the detector flags as AI-generated (47 + 47), about half are actually real. Your '95%-accurate' detector has a false positive rate of only 5%, yet half of its positive flags are wrong in this use case: its precision is about 50%. Overlooking this effect is the base rate fallacy, and it is why a single confidence score without context can be deeply misleading.
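The arithmetic above can be reproduced with a short Python sketch. The function name and parameter names are illustrative, not part of any detection tool, and the sketch assumes the detector is equally accurate on real and synthetic images (95% on each class), as the example does.

```python
def flagged_breakdown(total, base_rate, sensitivity, specificity):
    """Expected outcome counts and precision for a batch of images.

    sensitivity: share of AI-generated images the detector flags.
    specificity: share of real images the detector passes as real.
    """
    synthetic = total * base_rate
    real = total - synthetic
    true_pos = synthetic * sensitivity
    false_neg = synthetic - true_pos
    true_neg = real * specificity
    false_pos = real - true_neg
    # Precision: of everything flagged, how much is actually AI-generated.
    precision = true_pos / (true_pos + false_pos)
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        "precision": precision,
    }


# The news-desk scenario: 1,000 images, 5% synthetic, 95% accuracy per class.
result = flagged_breakdown(1000, 0.05, 0.95, 0.95)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The unrounded counts are 47.5 true positives and 47.5 false positives, which is why the precision comes out at one half. Raising the base rate in the call shows how quickly the same detector becomes more trustworthy when synthetic images are common.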
False Positives vs. False Negatives: Which Is Worse?
The answer depends entirely on your use case, and getting this wrong has real consequences.
When False Positives Are the Greater Harm
If you are using a detection tool to evaluate whether to run a news photograph, a false positive—labeling a real image as AI-generated—can lead you to kill a legitimate story, defame the photographer who took it, or miss coverage of a real event. In contexts involving reputation, legal proceedings, or editorial decisions, false positives can be catastrophic.
When False Negatives Are the Greater Harm
If you are screening profile images on a dating platform or verifying job candidates, a false negative—letting a synthetic identity through—enables the fraud or manipulation you were trying to prevent. In security contexts, the cost of missing a fake is higher than the cost of investigating a false alarm.
Responsible use of AI detection requires understanding which error type is costlier in your specific context, and calibrating your response threshold accordingly.
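One way to make that calibration concrete is a standard expected-cost rule, sketched below in Python. It assumes the score has already been adjusted into a calibrated probability for your setting (base rate included), and the cost figures are hypothetical placeholders rather than recommendations.

```python
def flag_threshold(cost_false_positive, cost_false_negative):
    """Probability above which flagging has lower expected cost than passing.

    Flagging an image with AI probability p risks (1 - p) * cost_false_positive.
    Passing it risks p * cost_false_negative.
    Flagging is cheaper when p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def should_flag(p_ai, cost_false_positive, cost_false_negative):
    """Decide whether to flag, given a calibrated probability p_ai."""
    return p_ai > flag_threshold(cost_false_positive, cost_false_negative)


# Hypothetical newsroom: wrongly flagging a real photo is 10x worse than a miss.
newsroom_threshold = flag_threshold(10, 1)

# Hypothetical fraud screening: missing a fake is 10x worse than a false alarm.
screening_threshold = flag_threshold(1, 10)

print(f"newsroom threshold:  {newsroom_threshold:.3f}")
print(f"screening threshold: {screening_threshold:.3f}")
```

With these placeholder costs, the newsroom acts only on probabilities above roughly 0.91, while the screening team acts on anything above roughly 0.09. The same score can justify different responses in the two settings.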
Multi-Signal Analysis: Why We Use Evidence Chains
This is why serious forensic analysis never relies on a single confidence score. It builds an evidence chain across multiple independent signals: compression artifacts, frequency domain analysis, noise floor measurement, metadata consistency, lighting vector analysis. When four independent signals all point in the same direction, the probability of a false verdict drops dramatically even if each individual signal has meaningful uncertainty.
A single 97% score from one method should make you highly attentive. Four independent 85% signals should make you highly confident. The mathematics of independent evidence is cumulative in a way that single-signal analysis can never be.
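A rough sketch of that cumulative effect, using Bayes' rule in odds form, is shown below. It rests on several assumptions that are not stated in the article: an "85% signal" is read as a test with 85% sensitivity and 85% specificity, the signals are treated as conditionally independent, and the 5% prior is borrowed from the news-desk example. Real forensic signals are rarely fully independent, so treat the output as an upper bound on what combining evidence can buy.

```python
def posterior_ai(prior, signals):
    """Posterior probability of AI generation after every signal flags the image.

    prior: base rate of AI-generated images before looking at any signal.
    signals: list of (sensitivity, specificity) pairs, one per positive signal.
    Assumes the signals are conditionally independent given the true label.
    """
    odds = prior / (1 - prior)
    for sensitivity, specificity in signals:
        # Likelihood ratio of a positive result: P(flag | AI) / P(flag | real).
        odds *= sensitivity / (1 - specificity)
    return odds / (1 + odds)


# One strong signal versus four moderate ones, both at a 5% base rate.
one_strong = posterior_ai(0.05, [(0.97, 0.97)])
four_moderate = posterior_ai(0.05, [(0.85, 0.85)] * 4)

print(f"one 97% signal:    {one_strong:.3f}")
print(f"four 85% signals:  {four_moderate:.3f}")
```

Under these assumptions the single strong signal leaves the posterior at about 0.63, while four agreeing moderate signals push it to about 0.98.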
How to Communicate Results Responsibly
Whether you are a journalist, an HR professional, or an individual checking a suspicious image, the responsible framing of a detection result is not 'This is a deepfake.' It is: 'This image shows the following anomalies that are consistent with AI generation: [list of specific findings]. This constitutes significant evidence warranting further investigation or expert review.'
The technology is a tool for raising and calibrating suspicion—not for issuing verdicts. The human remains responsible for the final judgment. That is not a limitation of the technology; it is the correct and ethical structure for forensic evidence of any kind.