How We Detect AI-Generated Form Submissions
AI-generated form spam is harder to catch than the old kind. An honest technical breakdown of what works, what fails, and where the arms race is going.
WebDecoy Team
WebDecoy Security Team
How We Detect AI-Generated Form Submissions (And Why It’s Harder Than You Think)
A few years ago, form spam was easy to spot. You looked for misspellings, broken English, the same boilerplate paragraph submitted ten thousand times from a Russian VPS, and a stuffed honeypot field. Anti-spam was a solved problem at the level most sites needed.
That stopped being true sometime in 2023, and by 2026 the landscape on a public contact form, sign-up flow, or review submission page looks meaningfully different. The submissions are grammatically clean. They reference your product by name. They cite the right page on your site. They sign off with a plausible name and a working-looking email. And they show up at a steady rate from residential IPs, on real Chrome builds, often through a Browser-as-a-Service.
This post is a candid look at what we’ve learned trying to detect this stuff. We’ll walk through what doesn’t work, what does, and where the bottom keeps falling out. If you’re hoping for a “one weird trick” ending, this isn’t it.
What “AI-Generated” Actually Means in a Form Context
The term gets sloppy fast, so we’ll be specific. Three distinct populations show up at a typical form endpoint, and they need different defenses.
1. LLM-assisted humans. A real person opens the page, opens ChatGPT in another tab, asks for a “polite inquiry email” tailored to your site, copies the result into your textarea, and hits submit. There is one human, one browser, one IP. The text is AI but the session is not.
2. Scripted LLM senders. A Python script calls an LLM API, generates a body, and POSTs it directly to your form endpoint with a plausible header set. No browser. The fingerprint is the script’s HTTP stack.
3. LLM-driven agents. A model running inside Browserbase, Hyperbrowser, an Operator-style desktop agent, or a Stagehand workflow opens the form in a real Chromium, parses the DOM, fills fields, and clicks submit. The text is AI and the session is also AI. This is the hardest category.
A defense that catches category 2 may have nothing to say about category 3. A signal that’s perfect against category 3 may flag the legitimate humans in category 1. Conflating these is how you end up with a content classifier as your only defense and a queue full of false positives.
The Stuff That Doesn’t Work (Or Stopped Working)
Before we get to the parts that do hold up, it’s worth being specific about the dead ends, because they keep getting sold as solutions.
Text Classifiers Trained on “Human vs AI”
The first reflex is to send the body to a classifier: GPTZero, OpenAI’s deprecated detector, the various open weights detectors. We tried all of them on a labeled corpus of real contact form submissions plus three months of confirmed AI submissions.
The results were grim. On the AI side, instruction-tuned outputs from GPT-4-class and Claude-class models score in the same distribution as the legitimate human pile for almost every detector. Add a single prompt instruction like “vary your sentence length and use one minor typo” and the remaining gap closes. On the human side, non-native English speakers and anyone using Grammarly aggressively get classified as AI at rates that make the detector unusable as a hard gate.
Text classifiers have a place as a scoring input feeding a layered model. They do not have a place as a binary spam flag.
Perplexity and Burstiness Scoring
Same family, same problem. Perplexity-based detection assumes AI text is too statistically clean. That assumption was reasonable in 2022 and is not reasonable now. Modern models produce output with perplexity profiles indistinguishable from competent human writing, and any small temperature or sampling tweak shifts the curve.
We still log perplexity as a feature. We don’t gate on it.
Watermarking the Output
Google and OpenAI have both published work on cryptographic watermarks embedded in generated text. The idea is sound. The problem is operational. Watermarks only exist for models whose providers chose to embed them, only persist if the operator doesn’t paraphrase, and only help defenders who can run the matching detector. None of the open weights models that operators actually use carry usable watermarks, and a single rewriting pass through a different model strips whatever signal might have been there.
Watermarks are a real research direction. They are not a 2026 defense.
Captcha on the Form
reCAPTCHA v2 image puzzles are solved by 2Captcha and CapSolver for around two dollars per thousand. reCAPTCHA v3 score-based gating fires false positives at ordinary humans on residential VPNs and lets through any agent that can drive a real Chrome with realistic mouse movement. hCaptcha is in roughly the same place.
For the LLM-agent population in particular, captcha is mostly self-deception. The agent sees the captcha, calls a vision model, solves it, and continues. We covered the vision side of this problem in Detecting Vision-Based AI Agents.
What Actually Works
What survives is not exotic. It’s a set of cheap signals that, combined, raise attacker cost faster than any one of them does alone. Roughly in order of how often each one carries the day for us:
1. Paste and Programmatic-Set Detection
This is the single highest-yield signal we have, and it’s almost free to ship.
Real users type into form fields. They produce a long stream of keydown, keypress, input, and occasional paste events. Inter-keystroke intervals look like a human nervous system: variable, with a heavy tail, plus the occasional pause to think.
LLM-assisted humans (category 1) almost always paste. The textarea gets a single paste event followed by zero input events that look like typing. That’s not inherently suspicious, but combined with other signals it’s strong.
LLM-driven agents (category 3) are worse for the attacker. Most browser-driving frameworks set field values by assigning to element.value directly or dispatching synthetic input events without the underlying keydown/keypress sequence. Detecting “value changed but no real input events fired” catches a surprising fraction of agent traffic without any model, just a few lines of JS.
The instrumentation is small:
const field = document.querySelector('textarea[name="message"]')
const form = field.form // the <form> element that owns the textarea
let realInputEvents = 0
let pasteEvents = 0
const valueAtLoad = field.value
// Count only trusted events; events created by script via dispatchEvent have isTrusted === false.
field.addEventListener('keydown', (e) => { if (e.isTrusted) realInputEvents++ })
field.addEventListener('paste', (e) => { if (e.isTrusted) pasteEvents++ })
// On submit, sample these into a signed token sent with the POST.
form.addEventListener('submit', () => {
  const finalLength = field.value.length
  const typedRoughly = realInputEvents
  const pastedAtLeastOnce = pasteEvents > 0
  const grewWithoutTyping = (finalLength - valueAtLoad.length) > typedRoughly * 2
  // Growth explained by a paste is the category 1 pattern.
  // Growth with neither typing nor a paste is the agent signal.
  const setProgrammatically = grewWithoutTyping && !pastedAtLeastOnce
})
The trick is that the score has to be opaque to the attacker. You don’t want a clean Reason: scripted_value_set field in the response. We get into that under “Response Symmetry” below.
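The inter-keystroke rhythm described earlier in this section can also be reduced to a few numbers. What follows is a minimal sketch, not our production code: summarizeKeyIntervals and its return fields are hypothetical names, and any threshold applied to the output would have to be tuned on real traffic.

```javascript
// Hypothetical helper: summarize typing rhythm from keydown timestamps (ms).
// In a browser the timestamps could be collected with something like:
//   field.addEventListener('keydown', (e) => { if (e.isTrusted) stamps.push(e.timeStamp) })
function summarizeKeyIntervals(timestamps) {
  if (timestamps.length < 3) {
    // Too few keystrokes to say anything about rhythm.
    return { count: timestamps.length, meanMs: null, cv: null, longestPauseMs: null }
  }
  const intervals = []
  for (let i = 1; i < timestamps.length; i++) {
    intervals.push(timestamps[i] - timestamps[i - 1])
  }
  const mean = intervals.reduce((a, b) => a + b, 0) / intervals.length
  const variance = intervals.reduce((a, b) => a + (b - mean) ** 2, 0) / intervals.length
  // Coefficient of variation: 0 means perfectly even spacing between keys.
  const cv = mean > 0 ? Math.sqrt(variance) / mean : 0
  return {
    count: timestamps.length,
    meanMs: mean,
    cv,
    longestPauseMs: Math.max(...intervals),
  }
}
```

A cv near zero means evenly spaced keystrokes, which is what a fixed per-character delay produces. Like the other signals here, it is a scoring input rather than a verdict.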
2. Time-on-Form and Field Order Telemetry
Real humans land on a form page, scroll a bit, focus the first field, fill in order, sometimes go back to fix one, blur, and submit. The whole sequence on a mid-length contact form is in the 15 to 90 second range.
Agents are bimodal. Cheap scripts post in 200ms. Smart agents wait, but they wait a uniform amount because someone parameterized delay = 30s in the workflow. Real human form-fill times are messy and bursty; agent times are too clean.
Useful raw signals:
- Time from page load to first field focus
- Time from first field focus to submit
- Whether fields were visited in DOM order, tab-order, or some other order
- Whether any field was edited, blurred, then edited again
- Whether the submit click was preceded by a pointermove near the button
None of these flag in isolation. All of them feed a per-session score.
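As a rough illustration of how the raw signals above could be derived from a recorded event log, here is a sketch under stated assumptions. The event shape ({ type, field, t }), the function name summarizeFormSession, and the one-second pointer window are all hypothetical, and the pointer check approximates “near the button” by time rather than by position.

```javascript
// Hypothetical event shape: { type, field, t }, with t in ms since page load.
// type is one of 'focus', 'input', 'blur', 'pointermove', 'submit'.
function summarizeFormSession(events, domOrder) {
  const firstFocus = events.find((e) => e.type === 'focus')
  const submit = events.find((e) => e.type === 'submit')

  // Order in which fields were first focused.
  const visitOrder = []
  for (const e of events) {
    if (e.type === 'focus' && !visitOrder.includes(e.field)) visitOrder.push(e.field)
  }
  // DOM order restricted to the fields that were actually visited.
  const expected = domOrder.filter((name) => visitOrder.includes(name))
  const visitedInDomOrder = visitOrder.every((name, i) => name === expected[i])

  // A field counts as re-edited if it saw input, then blur, then input again.
  const state = {} // field name -> 'edited' | 'blurred'
  let reEditedAfterBlur = false
  for (const e of events) {
    if (e.type === 'input') {
      if (state[e.field] === 'blurred') reEditedAfterBlur = true
      state[e.field] = 'edited'
    } else if (e.type === 'blur' && state[e.field] === 'edited') {
      state[e.field] = 'blurred'
    }
  }

  // Any pointer movement in the second before submit (assumed window).
  const pointerMoveBeforeSubmit = Boolean(submit) && events.some(
    (e) => e.type === 'pointermove' && e.t <= submit.t && submit.t - e.t <= 1000
  )

  return {
    msToFirstFocus: firstFocus ? firstFocus.t : null,
    msFocusToSubmit: firstFocus && submit ? submit.t - firstFocus.t : null,
    visitedInDomOrder,
    reEditedAfterBlur,
    pointerMoveBeforeSubmit,
  }
}
```

The output is deliberately a bag of features with no verdict attached, which matches how these signals are used: they go into the per-session score and nothing gates on any one of them.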
3. TLS and HTTP/2 Fingerprinting at the Edge
Every script that POSTs to your form has a TLS ClientHello and an HTTP/2 settings frame. Real Chrome on real macOS has a JA4 fingerprint that’s stable and well-known. Python httpx, Go net/http, Node undici, and a Playwright-driven Chromium each have their own.
A POST that arrives with a User-Agent: Chrome 124 on Windows and a JA4 that says “Go HTTP client” is automated. That’s a hard signal, not a probabilistic one.
This catches most of category 2 (scripted senders) at the network layer before the form handler even runs. We’ve written about the JA4 side of this in detail in JA4 Fingerprinting Against AI Scrapers, and the same playbook applies at form endpoints.
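To make the User-Agent versus fingerprint comparison concrete, here is a small sketch. It assumes the edge has already computed a JA4 string for the connection. The lookup keys are placeholders rather than real JA4 values, and the function names are hypothetical.

```javascript
// Hypothetical lookup: JA4 fingerprint string -> client family.
// The keys are placeholders, NOT real JA4 values. A real table would be built
// from fingerprints observed on known clients.
const KNOWN_FINGERPRINTS = new Map([
  ['placeholder-ja4-chrome', 'chrome'],
  ['placeholder-ja4-go-http', 'go-http'],
  ['placeholder-ja4-python-httpx', 'python-httpx'],
])

// Rough User-Agent family extraction, only as detailed as this example needs.
function claimedFamily(userAgent) {
  if (/\bChrome\/\d+/.test(userAgent) && !/\bEdg\//.test(userAgent)) return 'chrome'
  if (/^Go-http-client\//.test(userAgent)) return 'go-http'
  if (/^python-httpx\//.test(userAgent)) return 'python-httpx'
  return 'unknown'
}

// Compare what the client claims to be with what its TLS handshake looks like.
function checkClientConsistency(userAgent, ja4) {
  const claimed = claimedFamily(userAgent)
  const observed = KNOWN_FINGERPRINTS.get(ja4) ?? 'unknown'
  if (claimed === 'unknown' || observed === 'unknown') {
    return { verdict: 'inconclusive', claimed, observed }
  }
  return { verdict: claimed === observed ? 'consistent' : 'mismatch', claimed, observed }
}
```

Only a known fingerprint that contradicts a known claim is treated as a mismatch. Unrecognized fingerprints stay inconclusive, which keeps new browser releases from being flagged on the day they ship.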
4. Browser Automation and Stealth Detection
For the agent population, you want to know whether the page is being driven by an automation framework. Standard tells include:
- navigator.webdriver === true
- Missing chrome.runtime on a UA that claims Chrome
- CDP (Chrome DevTools Protocol) artifacts visible to JS
- Stagehand and Browserbase-specific globals when those tools are used naively
Stealth patches close most of the easy ones. The arms race here is long and we maintain dedicated coverage of it in Headless Browser Detection and Browser-as-a-Service Detection. The short version: this is a useful layer, not a sufficient one.
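The tells listed above can be collected with a handful of property checks. The sketch below is illustrative only. It takes a window-like object so it can run outside a browser, the "cdc_" check reflects a widely reported artifact of older ChromeDriver builds rather than anything guaranteed, and suspectGlobals is a caller-supplied list because we are not going to publish tool-specific global names here.

```javascript
// Takes a window-like object so the checks can be exercised outside a browser.
// In a page this would be called as collectAutomationIndicators(window, [...]).
function collectAutomationIndicators(win, suspectGlobals = []) {
  const nav = win.navigator || {}
  const claimsChrome = /\bChrome\/\d+/.test(nav.userAgent || '')
  const indicators = []

  // The standard WebDriver flag, set when the browser is under automation.
  if (nav.webdriver === true) indicators.push('webdriver-flag')

  // Per the list above: a Chrome User-Agent without chrome.runtime.
  if (claimsChrome && !(win.chrome && win.chrome.runtime)) {
    indicators.push('missing-chrome-runtime')
  }

  // Assumption: older ChromeDriver builds were widely reported to leave
  // "cdc_"-prefixed properties on document.
  const doc = win.document || {}
  if (Object.keys(doc).some((key) => key.includes('cdc_'))) {
    indicators.push('cdc-artifact')
  }

  // Caller-supplied global names to look for (hypothetical, tool-specific).
  for (const name of suspectGlobals) {
    if (name in win) indicators.push(`global:${name}`)
  }
  return indicators
}
```

An empty result does not mean the session is human. It means only that none of the easy tells fired, which is exactly the state a stealth-patched browser is designed to produce.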
5. Honeypot Fields That Target the DOM, Not the Pixels
Classic honeypots use display: none or visibility: hidden or off-screen positioning to hide a field from human users. A bot that parses raw HTML and submits everything fills the field. A human leaves it empty.
This still works against most scripted submitters. It does not work against vision-based agents that read the rendered page, because the agent literally cannot see the field. It also does not work against careful headless agents that filter display: none inputs.
What still works in 2026 is honeypots that look like normal fields but encode something about the DOM order or attribute pattern that no human would interact with. A field named email_confirm placed after the submit button. A <select> with an autofill-friendly name and a hidden default option. A field rendered into a Shadow DOM that mainstream automation libraries don’t traverse cleanly.
We go deeper into the taxonomy in Honeypot Traps for Forms, Buttons, and Endpoints.
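On the server, checking honeypot state is a few lines. The sketch below assumes the form body has already been parsed into a plain object of strings. email_confirm is the example from the text above, while company_website and the function name are made up for illustration.

```javascript
// Hypothetical honeypot field names. email_confirm comes from the example in
// the text; company_website is invented for this sketch.
const HONEYPOT_FIELDS = ['email_confirm', 'company_website']

// formBody: parsed form fields as a plain object of strings.
function honeypotState(formBody) {
  // Tripped: a field no human should fill arrived with content.
  const tripped = HONEYPOT_FIELDS.filter((name) => {
    const value = formBody[name]
    return typeof value === 'string' && value.trim() !== ''
  })
  // Missing: a browser serializes empty text inputs as empty strings, so an
  // absent key suggests the POST was hand-built rather than sent from the form.
  const missing = HONEYPOT_FIELDS.filter((name) => !(name in formBody))
  return { tripped, missing }
}
```

The missing list is a weaker signal than the tripped list, and it only applies to honeypots implemented as text inputs that are neither disabled nor removed from the form.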
6. Response Symmetry
This one is borrowed from the credential stuffing playbook and applies just as well to forms. The attacker tunes against your responses. If your form returns “Thanks, we got it” on success and “Looks like spam, sorry” on rejection, you have just told the attacker exactly what to optimize against.
Make every form submission, valid or rejected, return the same status code, the same body length within a small jitter window, and the same baseline timing. Surface the actual outcome to the legitimate user via a server-set cookie or a signed token that the page reads after submit. The legitimate browser sees “Thanks.” The attacker’s checker sees noise.
This single change makes building reliable AI-form-spam tooling against your site dramatically more annoying.
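Here is a minimal sketch of the uniform envelope: one status code, one message, a body length that varies only by jitter, and a timing floor. The constants, the respondUniformly name, and the handler callback are assumptions for illustration. Surfacing the real outcome to the legitimate page is deliberately left out.

```javascript
const BODY_LENGTH = 256    // assumed base body size in characters
const LENGTH_JITTER = 16   // assumed jitter window in characters
const MIN_LATENCY_MS = 300 // assumed timing floor

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// handler(submission) is a hypothetical async function that does the real work
// (scoring, storing, or dropping) and resolves to true or false.
async function respondUniformly(submission, handler) {
  const started = Date.now()
  let accepted = false
  try {
    accepted = Boolean(await handler(submission))
  } catch (err) {
    // Errors must not produce a distinguishable response either.
    accepted = false
  }

  // Same message for every outcome, padded to a length that varies only by jitter.
  const targetLength = BODY_LENGTH + Math.floor(Math.random() * (LENGTH_JITTER + 1))
  const body = 'Thanks, we got it.'.padEnd(targetLength, ' ')

  // Hold fast paths until the floor so early rejections are not quicker than accepts.
  const elapsed = Date.now() - started
  if (elapsed < MIN_LATENCY_MS) await sleep(MIN_LATENCY_MS - elapsed)

  // `accepted` is for internal logging only and is never sent to the client.
  return { response: { status: 200, body }, accepted }
}
```

The floor only hides timing differences that are smaller than it. If the accept path sometimes takes longer than MIN_LATENCY_MS, that work needs to move off the request path or the floor needs to rise.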
7. Post-Submit Content Scoring as a Tiebreaker
Once a submission has cleared the cheap signals, content scoring earns its place, but only as the last layer and only with the right framing. We don’t ask “is this AI-written” because we showed earlier that the answer is unreliable. We ask:
- Is this submission near-identical to others received in the last 7 days, modulo names and URLs? (Templated outreach, often LLM-rephrased per target.)
- Does the body include URLs to domains we’ve seen in other suspicious submissions across our customer base?
- Does the email address match disposable or recently-registered patterns?
- Does the claimed company exist, and does the sender’s domain align with it?
These are old-school spam features, and they still pull weight. The LLM changes the surface, not the underlying intent.
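The first question above, near-identical modulo names and URLs, can be approximated with word shingles and Jaccard similarity. This sketch swaps in a simple exact-set comparison for illustration and is not the similarity hash we run in production. The function names and the 3-word shingle size are assumptions.

```javascript
// Replace URLs and email addresses with placeholders and lowercase the rest,
// so per-target substitutions do not hide a shared template.
function normalizeBody(text) {
  return text
    .toLowerCase()
    .replace(/https?:\/\/\S+/g, ' <url> ')
    .replace(/\S+@\S+\.\S+/g, ' <email> ')
    .replace(/[^a-z0-9<>\s]/g, ' ')
    .split(/\s+/)
    .filter(Boolean)
}

// Word n-grams ("shingles") of the normalized token list.
function shingles(tokens, size = 3) {
  const out = new Set()
  for (let i = 0; i + size <= tokens.length; i++) {
    out.add(tokens.slice(i, i + size).join(' '))
  }
  return out
}

// Jaccard similarity of the two shingle sets: shared / union, from 0 to 1.
function jaccardSimilarity(a, b) {
  const sa = shingles(normalizeBody(a))
  const sb = shingles(normalizeBody(b))
  if (sa.size === 0 && sb.size === 0) return 1
  let shared = 0
  for (const s of sa) if (sb.has(s)) shared++
  return shared / (sa.size + sb.size - shared)
}
```

Names are not normalized here, since that would need entity recognition. A swapped company name only disturbs the few shingles that contain it, so templated outreach still scores high. LLM rephrasing per target lowers the score, which is one reason this stays a tiebreaker and does not act as a gate.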
The Arms Race, Honestly
The reason this problem is hard, and why anyone telling you they’ve “solved AI form spam” is overselling, is that every signal we list above has an attacker-side response that costs them less than the defense costs you to maintain.
- Paste detection is beaten by an agent that types character-by-character with dispatchEvent for keydown/keypress/input in sequence. The latest versions of Browserbase already do this by default.
- Time-on-form telemetry is beaten by sampling realistic human form-fill traces and replaying them. There are open datasets of recorded human interactions for exactly this purpose.
- TLS fingerprinting is beaten when the attacker uses a real Chromium build, which is what Browser-as-a-Service exists to provide.
- Browser automation tells are beaten by stealth patches, and the patches are open source.
- DOM honeypots are beaten by agents that scope their interactions to “fields a human would clearly fill.”
- Response symmetry is beaten by oracle-style probing that tries known-good and known-bad submissions and times the side channel.
The goal of detection is not to “win.” It’s to keep raising the cost of high-volume automated abuse faster than the attacker can drive it down, and to make sure the floor of cheap, lazy attacks falls cleanly into a deny bucket without ever reaching a human moderator. Every signal we ship pushes the attacker toward more expensive infrastructure (real Chromium, real residential bandwidth, careful per-target prompt engineering), and the economics start to fall apart somewhere in the middle of that ladder for anyone who isn’t running a serious operation.
That ladder is moving up over time, and it will keep moving up. We don’t think there’s a stable equilibrium here. We think there’s a continuous arms race, and the right posture for a defender is to instrument enough layers that you keep collecting data on what the next round of attackers tries.
What We Ship
Concretely, WebDecoy ships a JS snippet on protected forms that captures:
- Paste vs typed input event ratios per field
- Time-on-page and field-fill order
- Synthetic event detection (isTrusted-aware)
- Browser automation indicators
- A signed token containing all of the above, sent with the form POST
On the server side, we cross-check that token against:
- TLS / HTTP/2 fingerprint of the request
- IP, ASN, and residential-proxy reputation
- A short-window similarity hash against recent submissions across the network
- Honeypot field state
Every flagged submission ends up in a customer-visible queue with the specific signals that fired, so you can audit the false-positive rate against real traffic. We do not gate on text classifiers, perplexity scores, or watermark detection, for the reasons laid out above.
If you’re running a contact form, sign-up flow, review submission, or any other public form that’s started catching grammatically clean garbage, this is the layer we recommend you add before you reach for a captcha. Start a free trial and point it at your form for 14 days. The interesting part is usually not the volume. It’s the shape of what makes it through.