3/6/2026 · 8 min read

The Science of Deception: How AI Detection Works

Netanel Ossi

Founder, FauxLens

The Mathematics of Reality

In the age of Generative AI, seeing is no longer believing. While a human eye might be fooled by a hyper-realistic Midjourney v6 portrait or a Sora video, the digital file tells a different story. Every image contains a 'digital fingerprint'—a complex set of mathematical data points that reveal its origin. At Faux Lens, we don't just 'look' at images; we dismantle them byte by byte.

1. Error Level Analysis (ELA): The Compression Map

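To understand detection, you must understand compression. When a digital camera saves a photo as a JPEG, the encoder splits the image into 8x8 pixel blocks, applies a Discrete Cosine Transform (DCT) to each block, and then quantizes the resulting coefficients. Quantization is the lossy step: it is where detail is discarded, and it leaves a block-level signature that later edits disturb.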
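To make the block transform concrete, here is a minimal sketch of an 8x8 DCT followed by a quantization step, using only the Python standard library. It is an illustration, not Faux Lens code: the function names are hypothetical, and the single `step` value is a simplifying assumption (real JPEG encoders use an 8x8 quantization table with a different step per coefficient).

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Return the DCT-II coefficients of an 8x8 block (JPEG normalization)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantize(coeffs, step):
    """Lossy step: snap each coefficient to a multiple of `step`.

    Assumption: one step size for every coefficient; real JPEG uses a table.
    """
    return [[round(value / step) * step for value in row] for row in coeffs]


# A perfectly flat block carries all of its energy in the DC coefficient.
flat_block = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat_block)
```

For the flat block, only `flat_coeffs[0][0]` is non-zero. A textured block spreads energy into the other 63 coefficients, and those are the ones quantization rounds away.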

How ELA Works:

  • The Real: An unedited camera JPEG has been compressed once, at a single quality setting, so regions of similar texture show a similar error level across the entire image. The sky, the ground, and the subject all share the same 'quality' history.
  • The Fake: AI models don't compress; they 'paint.' When a user splices an AI face onto a real body, or when a diffusion model generates a scene, the compression artifacts often clash. ELA resaves the image at a known quality (e.g., 95%) and subtracts the result from the original.
  • The Verdict: If the face glows white while the background stays black in the ELA map, it means the face has a different compression history. It is a mathematical anomaly.
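The resave-and-subtract idea can be sketched in a few lines. The example below is a toy under stated assumptions: `lossy_save` stands in for a real JPEG save by snapping pixel values to a multiple of `step`, which is far simpler than DCT quantization but has the same key property (saving twice at the same setting changes nothing). All names are hypothetical.

```python
def lossy_save(pixels, step):
    """Toy stand-in for a JPEG save: snap every value to a multiple of `step`."""
    return [[round(v / step) * step for v in row] for row in pixels]


def error_level_map(pixels, step):
    """Resave at a known setting and return the per-pixel absolute difference."""
    resaved = lossy_save(pixels, step)
    return [[abs(a - b) for a, b in zip(row, resaved_row)]
            for row, resaved_row in zip(pixels, resaved)]


# Background that was already "saved" once with step 8 (all multiples of 8).
image = [[x * 8 + y * 16 for x in range(8)] for y in range(8)]

# Splice in a 3x3 patch that never went through that save.
for y in range(2, 5):
    for x in range(2, 5):
        image[y][x] = 13 + x + y

ela = error_level_map(image, step=8)
```

In `ela`, the background is zero everywhere (the 'black' region) and the spliced patch is non-zero (the 'white' region). On real JPEGs the contrast is less clean, because texture also affects the error level, so the map is evidence to interpret rather than a verdict on its own.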

2. The Noise Floor Paradox (PRNU)

Every physical camera sensor (CMOS or CCD) has a unique signature called Photo Response Non-Uniformity (PRNU). Think of it as the camera's fingerprint. Due to manufacturing imperfections, some pixels are slightly more sensitive to light than others.

The AI Flaw:

Generative AI models are software, not hardware. Their output never passes through a physical sensor, so it carries no PRNU pattern, and the models struggle to replicate true, chaotic ISO noise.

  • Real Photos: Have a consistent, random noise distribution (approximately Gaussian) whose strength correlates with the lighting conditions and ISO setting.
  • AI Images: Often have 'smooth' noise or repetitive noise patterns that look like a tiled texture. Our algorithms scan for this 'synthetic smoothness.' If an image is too perfect, that is a strong hint that it is synthetic.
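One simple way to put a number on 'synthetic smoothness' is to subtract a blurred copy of the image from the original and measure the variance of what is left. The sketch below does that with a 3x3 box blur. It is a deliberately reduced illustration: it measures residual noise energy only, whereas a real PRNU check would correlate the residual against a reference pattern estimated from the specific camera. The function names and the synthetic test images are assumptions made for the example.

```python
import random
from statistics import pvariance


def box_blur(img):
    """3x3 mean filter; border pixels average over the neighbors that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(neigh) / len(neigh))
        out.append(row)
    return out


def noise_residual_variance(img):
    """Variance of (image - blurred image), a rough noise-floor estimate."""
    blurred = box_blur(img)
    residual = [a - b for r0, r1 in zip(img, blurred) for a, b in zip(r0, r1)]
    return pvariance(residual)


rng = random.Random(0)  # fixed seed so the example is repeatable
smooth = [[float(x + y) for x in range(16)] for y in range(16)]
noisy = [[v + rng.gauss(0, 2) for v in row] for row in smooth]

smooth_var = noise_residual_variance(smooth)
noisy_var = noise_residual_variance(noisy)
```

Here `noisy_var` comes out far larger than `smooth_var`. A detector built on this idea would still need a threshold calibrated against real photos taken at comparable ISO settings.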

3. Lighting Consistency (Shadow Logic)

Generative models like DALL-E 3 are excellent at textures (skin, fabric) but notoriously bad at global physics. They often place shadows in contradictory directions.

The Vector Check:

Our 'Shadow Logic' engine draws virtual vectors from light sources to objects. If the shadow on a nose suggests a light source from the top-right (45°, measured clockwise from straight up), but the shadow on the collar suggests a light source from the top-left (330°), the two cues disagree by 75°. Unless the scene really contains a second light, the image violates the laws of physics. AI models often hallucinate light sources to make the subject look dramatic, ignoring the rest of the scene.
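The comparison step of such a check is simple once the light directions have been estimated. The sketch below assumes each shadow has already been turned into a bearing in degrees (clockwise from straight up) and only tests whether the bearings agree. The 20° tolerance and the function names are hypothetical choices for illustration, not documented Faux Lens parameters.

```python
from itertools import combinations


def angular_gap(a_deg, b_deg):
    """Smallest angle between two bearings, handling the 360° wraparound."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)


def lights_consistent(bearings_deg, tolerance_deg=20):
    """True if every pair of implied light directions agrees within tolerance."""
    return all(angular_gap(a, b) <= tolerance_deg
               for a, b in combinations(bearings_deg, 2))


nose_vs_collar = angular_gap(45, 330)              # the example from the text
single_light_scene = lights_consistent([40, 50, 45])
conflicting_scene = lights_consistent([45, 330])
```

The wraparound matters: a naive subtraction of 45 and 330 gives 285°, while the real disagreement is 75°. A scene with several genuine light sources would need a more forgiving model than this pairwise test.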

4. Frequency Domain Analysis

Sometimes, the fake is invisible in the pixel domain (what you see) but obvious in the frequency domain. By applying a Fourier Transform, we can view the image as a wave signal.

Deepfakes often exhibit 'high-frequency spikes'—unnatural clusters of energy in the frequency spectrum that result from the upscaling process used by GANs (Generative Adversarial Networks). A real camera lens acts as a natural low-pass filter, creating a smooth rolloff that AI struggles to mimic.
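As a rough illustration of what 'energy in the wrong place' means, the sketch below takes a single row of pixels, computes its discrete Fourier transform directly, and reports what fraction of the non-DC energy sits in the upper half of the spectrum. Working on one row instead of a full 2D image, and the specific cutoff, are simplifying assumptions. The alternating pattern added to the second signal is a stand-in for a periodic upsampling artifact.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the DFT for frequencies 0..n/2 (direct O(n^2) form)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def high_freq_ratio(signal):
    """Share of non-DC spectral energy in the upper half of the frequencies."""
    energy = [m * m for m in dft_magnitudes(signal)[1:]]  # drop the DC term
    total = sum(energy)
    cutoff = len(energy) // 2
    return sum(energy[cutoff:]) / total if total else 0.0


n = 32
smooth_row = [math.sin(2 * math.pi * t / n) for t in range(n)]
# Same row plus a pixel-to-pixel alternation, i.e. energy at the Nyquist bin.
artifact_row = [v + 0.5 * (-1) ** t for t, v in enumerate(smooth_row)]

smooth_ratio = high_freq_ratio(smooth_row)
artifact_ratio = high_freq_ratio(artifact_row)
```

The smooth row has essentially no high-frequency energy, while the row with the alternating artifact puts about half of its energy there. A production pipeline would use a 2D FFT and compare the radial energy profile against what real lenses produce.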

5. GAN Fingerprinting: The Manufacturer's Serial Number

Every AI image generator has a structural identity baked into its outputs. Just as a firearm leaves unique tooling marks on a bullet casing, each generative architecture—whether a GAN or a latent diffusion model—leaves a characteristic mathematical residue in the frequency spectrum of its images. These residues are called GAN fingerprints, and they arise from the specific convolutional filter patterns, upsampling strategies, and training distributions that define each model.

In practice, this means that Midjourney v6, DALL-E 3, Stable Diffusion XL, and Flux each produce a subtly different spectral signature. Midjourney's proprietary upscaling pipeline introduces a repeating grid artifact at specific spatial frequencies. Stable Diffusion's VAE decoder leaves a characteristic low-amplitude ringing pattern around high-contrast edges. Flux, with its rectified flow architecture, creates a distinctive phase coherence pattern that differs measurably from classifier-free guidance models.

Faux Lens maintains an actively updated fingerprint library built from hundreds of thousands of verified outputs from each major generator. When an image is submitted for analysis, our frequency-domain pipeline computes its spectral residual and matches it against this library. Critically, these fingerprints are robust. Even after a single JPEG re-compression cycle, enough spectral energy persists in the mid-frequency bands to produce a statistically significant match.
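Matching a residual against a library can be sketched as a normalized correlation followed by a threshold. The example below is a minimal version of that idea under stated assumptions: the fingerprints are short made-up vectors, the generator names are placeholders, and the 0.5 threshold is arbitrary. The article does not describe the actual matching metric, so treat this as one plausible approach rather than the Faux Lens method.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def best_match(residual, library, threshold=0.5):
    """Return (name, score) of the closest fingerprint, or (None, score)."""
    scores = {name: normalized_correlation(residual, fingerprint)
              for name, fingerprint in library.items()}
    name = max(scores, key=scores.get)
    score = scores[name]
    return (name, score) if score >= threshold else (None, score)


# Hypothetical fingerprints; real ones would be high-dimensional spectra.
library = {
    "generator_a": [1, 0, -1, 0, 1, 0, -1, 0],
    "generator_b": [0, 1, 0, -1, 0, 1, 0, -1],
}

# A weakened, slightly perturbed copy of generator_a's pattern.
residual = [0.35, -0.02, -0.29, 0.04, 0.30, -0.03, -0.30, 0.02]
match_name, match_score = best_match(residual, library)
```

Because the correlation is normalized, the match survives the residual being much weaker than the stored fingerprint, which is the property that lets a fingerprint remain detectable after mild re-compression.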

6. Neural Classification: Learning What Rules Cannot Describe

The first five signals are all explicit: they can be written as equations, visualized as maps, and described in plain language. But some of the most convincing AI-generated images pass every explicit check. The lighting is consistent. The noise looks Gaussian. The compression history is uniform. And yet something is wrong—something that a trained forensic examiner would recognize immediately but struggle to articulate.

This is where neural classification operates. Faux Lens deploys a deep convolutional neural network trained on a curated dataset of verified photographs and verified synthetic outputs from major generators across four years of model development. Rather than learning rules, the network learns semantic statistics: the subtle correlations between eye reflections and ambient light that only real physics produces; the micro-texture gradients in human skin that differ between photographic grain and diffusion model approximations; the spatial coherence between foreground subjects and background depth of field that optically captured images exhibit and generative compositing often violates.

7. Metadata Forensics

Before any pixel-level analysis begins, Faux Lens reads the file's embedded metadata. A genuine photograph captured by a smartphone or DSLR carries EXIF data that records the camera make and model, GPS coordinates, shutter speed, aperture, ISO value, and a precise timestamp. AI-generated images almost never carry coherent EXIF data—and when they do, the values are frequently inconsistent with the image content.

Beyond EXIF, the analysis examines ICC color profiles embedded in the file. Professional cameras embed manufacturer-specific ICC profiles that match documented sensor color matrices. Common generation pipelines embed generic sRGB profiles or no profile at all. We also scan for C2PA (Coalition for Content Provenance and Authenticity) provenance manifests—an emerging standard that cryptographically signs an image's creation history.
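The first pass of a metadata check can be done without decoding any pixels. In a JPEG file, EXIF data lives in an APP1 segment whose payload starts with `Exif\0\0`, and an ICC profile lives in an APP2 segment whose payload starts with `ICC_PROFILE\0`. The sketch below walks the segment headers and reports which of the two are present. It is a simplified scanner (it stops at the start of the image data and does not parse the EXIF fields themselves), and the function names are hypothetical.

```python
import struct


def jpeg_metadata_flags(data: bytes) -> dict:
    """Report whether a JPEG byte stream carries EXIF and ICC segments."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    flags = {"exif": False, "icc_profile": False}
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]  # length includes its 2 bytes
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            flags["exif"] = True
        elif marker == 0xE2 and payload.startswith(b"ICC_PROFILE\x00"):
            flags["icc_profile"] = True
        pos += 2 + length
    return flags


def segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment (helper for the synthetic examples below)."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload


# Two synthetic headers: one with EXIF and ICC segments, one with JFIF only.
camera_like = (b"\xff\xd8" + segment(0xE1, b"Exif\x00\x00" + b"\x00" * 8)
               + segment(0xE2, b"ICC_PROFILE\x00" + b"\x01\x01") + b"\xff\xda")
bare_export = b"\xff\xd8" + segment(0xE0, b"JFIF\x00") + b"\xff\xda"

camera_flags = jpeg_metadata_flags(camera_like)
bare_flags = jpeg_metadata_flags(bare_export)
```

Presence alone proves little, since metadata is easy to strip or forge. The stronger signal described above comes from checking whether the values inside those segments are consistent with the image content.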

Frequently Asked Questions

Can AI beat these detectors?

It is an arms race. As generators get better at mimicking noise, detectors get better at analyzing semantic inconsistencies. No detector is 100% foolproof, which is why Faux Lens uses a multi-stage pipeline rather than a single check.

Does resizing an image hide the evidence?

Resizing or re-compressing an image (e.g., sending it through WhatsApp) can destroy some ELA evidence, but 'Shadow Logic' and geometric inconsistencies remain permanently embedded in the image content.

How many signals does Faux Lens analyze?

Faux Lens runs seven distinct analytical passes on every submitted image: Error Level Analysis, PRNU noise profiling, Shadow Logic vector mapping, Fourier-domain frequency analysis, GAN fingerprint matching, neural classification, and metadata forensics. Each produces an independent confidence score combined by a weighted ensemble model into the final verdict.
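The simplest form of a weighted ensemble is a weighted average of the per-signal scores. The sketch below shows that form only. The article does not say how the real ensemble is built or weighted, so the weights, the example scores, and the 0-to-1 scale (1 meaning 'likely synthetic') are all assumptions made for illustration.

```python
def ensemble_verdict(scores, weights):
    """Weighted average of per-signal scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical weights and scores for the seven passes named above.
weights = {
    "ela": 1.0, "prnu": 1.0, "shadow_logic": 1.0, "frequency": 1.0,
    "gan_fingerprint": 2.0, "neural": 3.0, "metadata": 1.0,
}
scores = {
    "ela": 0.2, "prnu": 0.7, "shadow_logic": 0.4, "frequency": 0.8,
    "gan_fingerprint": 0.9, "neural": 0.85, "metadata": 0.6,
}

verdict = ensemble_verdict(scores, weights)
```

Keeping the scores separate until this last step is what lets one weak signal (for example, ELA on a heavily re-compressed file) be outvoted by the others instead of deciding the result on its own.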

What format works best for detection?

The highest-fidelity analysis comes from the original, unmodified file—ideally a PNG or a JPEG that has not been re-compressed more than once. Every JPEG re-save cycle degrades ELA evidence and partially obscures GAN fingerprints. For images you control, submitting the original export from the generation tool or the original camera file produces the most definitive result.

Netanel Ossi

Founder, FauxLens · Backend Engineering Manager at Fiverr

Netanel Ossi is a Backend Engineering Manager at Fiverr and the founder of FauxLens. With deep expertise in distributed systems, security protocols, and backend architecture, he builds forensic AI detection tools that help journalists, HR teams, and everyday users verify the authenticity of visual media.