How Our WordPress Plugin Detects Bots
A technical walkthrough of WebDecoy's WordPress bot detection engine. Scoring, proof-of-work, behavioral analysis, and MITRE ATT&CK path mapping explained.
WebDecoy Team
WebDecoy Security Team
The WebDecoy WordPress plugin ships as a single ZIP file with zero configuration required. But underneath the “install, activate, done” experience is a multi-layer detection engine that scores every request across server-side signals, client-side fingerprints, behavioral analysis, and proof-of-work verification.
This post walks through how each layer works, how they combine into a single threat score, and why this architecture catches bots that simpler approaches miss.
The Detection Pipeline
Every request to a WordPress site with WebDecoy activated flows through a pipeline that evaluates it before WordPress processes it:
Incoming Request
│
├─ Is this IP blocked? → Yes → Block page
│
├─ Is this a known good bot? → Verify via reverse DNS → Allow
│
├─ Server-side analysis
│ ├─ User-Agent patterns
│ ├─ HTTP header consistency
│ ├─ MITRE ATT&CK path matching
│ └─ Rate limit check
│
├─ Client-side signals (on form submission)
│ ├─ WebDriver / headless detection
│ ├─ Automation framework markers
│ ├─ Canvas / WebGL fingerprint
│ └─ Behavioral scoring
│
├─ Proof-of-Work verification (on form submission)
│ └─ SHA-256 challenge validation
│
└─ Score aggregation → Allow / Challenge / Block
The first two checks are fast exits. Blocked IPs get rejected immediately. Verified good bots skip detection entirely. Everything else gets scored.
Threat Scoring: 0 to 100
Every detection signal adds points to a threat score. The score determines what happens to the request:
| Score Range | Severity | Action |
|---|---|---|
| 0-19 | Minimal | Allow (likely human) |
| 20-39 | Low | Log only |
| 40-59 | Medium | Optional challenge |
| 60-74 | High | Challenge or block |
| 75-100 | Critical | Automatic block |
The default blocking threshold is 75, configurable in settings. Scores at 40 and above are logged to the detections table for review.
The scoring is additive. A request doesn’t need to fail one dramatic test. It accumulates evidence across multiple signals. A slightly suspicious user agent (+25) combined with missing cookies (+15) and an unusual request path (+20) adds up to 60, enough to trigger a challenge. No single signal is conclusive, but the combination tells a clear story.
Here are the base scores for common signals:
Missing standard headers: 10-30
No cookies on non-first visit: 15
Suspicious user agent: 25
Known bot user agent: 50
curl / wget / python-requests: 35
Automation tool detected: 40
Headless browser markers: 25
Rate limit exceeded: 25
Honeypot field triggered: 60
Fake bot (failed DNS verify): 80
A real Chrome browser hitting a normal page scores near zero. A Python script with a spoofed user agent, no cookies, and missing standard headers quickly crosses the blocking threshold.
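The additive model can be sketched in a few lines. This is an illustration in Python, not the plugin's PHP code: the signal names are hypothetical labels, the point values and bands come from the lists above, and capping the total at 100 is an assumption based on the 0 to 100 scale.

```python
# Base points from the list above; the keys are hypothetical labels.
SIGNAL_POINTS = {
    "no_cookies_repeat_visit": 15,
    "suspicious_user_agent": 25,
    "known_bot_user_agent": 50,
    "http_library_user_agent": 35,
    "automation_tool": 40,
    "headless_markers": 25,
    "rate_limit_exceeded": 25,
    "honeypot_triggered": 60,
    "fake_good_bot": 80,
}

# (inclusive upper bound, severity, action) rows from the score table.
BANDS = [
    (19, "minimal", "allow"),
    (39, "low", "log"),
    (59, "medium", "optional challenge"),
    (74, "high", "challenge or block"),
    (100, "critical", "block"),
]


def score_request(signals, extra_points=0):
    """Sum points for each fired signal plus variable-weight extras
    (missing headers, path matches). Capped at 100 (assumption)."""
    total = sum(SIGNAL_POINTS[name] for name in signals) + extra_points
    return min(total, 100)


def classify(score):
    """Map a 0..100 score to its (severity, action) band."""
    for upper, severity, action in BANDS:
        if score <= upper:
            return severity, action
    raise ValueError("score must be in 0..100")
```

For the example given earlier, `score_request(["suspicious_user_agent", "no_cookies_repeat_visit"], extra_points=20)` gives 60, which `classify` places in the "high" band.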
Server-Side Analysis
The server-side layer examines every request using information available before any JavaScript executes.
User-Agent and Header Analysis
The plugin checks the User-Agent string against known bot patterns (curl, wget, python-requests, Go-http-client, scrapy, and dozens more) and evaluates HTTP header consistency. Real browsers send a predictable set of headers (Accept, Accept-Language, Accept-Encoding, Connection) in a consistent order. Automated tools frequently omit headers or send them in unusual combinations.
Missing the Accept-Language header is a strong signal. Mainstream browsers send it by default, while most HTTP libraries don’t unless explicitly configured.
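A minimal sketch of this kind of check, in Python for illustration. The header and tool names come from the text above; the function name and the rule of 10 points per missing header (capped at 30, to stay inside the 10-30 range listed earlier) are assumptions, and header ordering is not checked here.

```python
EXPECTED_HEADERS = ("accept", "accept-language", "accept-encoding", "connection")
HTTP_LIBRARY_MARKERS = ("curl", "wget", "python-requests", "go-http-client", "scrapy")


def header_signals(headers):
    """Inspect request headers (a dict of name -> value).

    Returns which expected headers are missing, which HTTP library the
    User-Agent names (if any), and an assumed point value.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    missing = [h for h in EXPECTED_HEADERS if not normalized.get(h)]
    user_agent = normalized.get("user-agent", "").lower()
    library = next((m for m in HTTP_LIBRARY_MARKERS if m in user_agent), None)
    # Assumed mapping: 10 points per missing header, at most 30.
    points = min(30, 10 * len(missing))
    return {"missing_headers": missing, "http_library": library, "points": points}
```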
MITRE ATT&CK Path Matching
This is one of the more distinctive features. The plugin maps incoming request paths against reconnaissance patterns from the MITRE ATT&CK framework. Instead of maintaining an arbitrary blocklist of “bad” URLs, the detection is organized by attacker tactics:
Credential Access (TA0006):
.env, wp-config.php, .git/, *.sql → +30 points
Collection (TA0009):
Backup files, database dumps → +25 points
Reconnaissance (TA0043):
Admin probes, user enumeration → +20 points
Discovery (TA0007):
Debug endpoints, phpinfo, server-status → +20 points
When an IP requests /wp-config.php.bak followed by /.env followed by /.git/config, each request scores individually, but the rate limiter is also tracking velocity. The combined effect is rapid escalation to the blocking threshold.
The MITRE mapping isn’t just for scoring. It shows up in the admin detections table, so you can see that a blocked IP was performing credential access reconnaissance rather than just “requesting bad URLs.” The categorization helps you understand what attackers are actually looking for.
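One way to organize such rules is a small table keyed by tactic. The sketch below is in Python for illustration; the tactic IDs and point values are the ones listed above, while the regular expressions are simplified assumptions and the first-match-wins rule is a design choice of this sketch, not necessarily the plugin's.

```python
import re

# (tactic ID, tactic name, points, pattern). Patterns are illustrative only.
TACTIC_RULES = [
    ("TA0006", "Credential Access", 30,
     re.compile(r"(^|/)\.env|wp-config\.php|(^|/)\.git/|\.sql$", re.I)),
    ("TA0009", "Collection", 25,
     re.compile(r"\.(bak|zip|tar\.gz|dump)$|backup", re.I)),
    ("TA0043", "Reconnaissance", 20,
     re.compile(r"[?&]author=\d+|/wp-json/wp/v2/users", re.I)),
    ("TA0007", "Discovery", 20,
     re.compile(r"phpinfo|server-status|/debug", re.I)),
]


def match_path(path):
    """Return the first matching tactic for a request path, or None."""
    for tactic_id, name, points, pattern in TACTIC_RULES:
        if pattern.search(path):
            return {"tactic": tactic_id, "name": name, "points": points}
    return None
```

Returning the tactic alongside the points is what lets a detections log say "Credential Access" instead of only recording a URL.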
Rate Limiting
The rate limiter tracks requests per IP with a configurable window (default: 60 requests per 60 seconds). Exceeding the limit adds 25 points to the threat score and can trigger automatic blocking.
The rate limiter tracks counts in the WordPress database, so every web server that shares that database sees the same totals. Behind load balancers and CDNs it still identifies clients correctly as long as the real client IP is forwarded in a standard header (X-Forwarded-For, X-Real-IP, or CF-Connecting-IP for Cloudflare).
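The counting logic can be sketched as a sliding window. This Python example keeps state in memory purely for illustration (the plugin stores it in the WordPress database), and whether the plugin uses a sliding or a fixed window is not stated, so treat the windowing strategy as an assumption.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Counts requests per IP over the last `window_seconds` seconds."""

    def __init__(self, limit=60, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps, oldest first

    def hit(self, ip, now):
        """Record one request at time `now`; return True if over the limit."""
        timestamps = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        timestamps.append(now)
        return len(timestamps) > self.limit
```

With the defaults, the 61st request inside any 60-second span returns `True`, which is the point at which the 25-point penalty would apply.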
Client-Side Detection
The server-side layer catches unsophisticated bots. The client-side layer targets the harder cases: headless browsers, automation frameworks, and tools that spoof headers but can’t perfectly replicate a real browser environment.
The Scanner
A JavaScript file (webdecoy-scanner.js) loads on protected pages with the defer attribute so it never blocks page rendering. It runs a series of environment checks and submits the results alongside form data.
WebDriver detection: The simplest check. Selenium, Puppeteer, and Playwright all set navigator.webdriver = true by default. Stealth plugins override this, but it still catches unmodified automation tools.
Headless browser markers: The scanner checks for HeadlessChrome in the user agent string, missing chrome.runtime and chrome.app objects (present in real Chrome, absent in headless), and PhantomJS signatures in the window object.
Chrome consistency: A request claiming to be Chrome should have the chrome global object with specific properties. The scanner checks for chrome.runtime, chrome.app, and chrome.csi. If the user agent says Chrome but these objects are missing or structurally wrong, the environment has been tampered with.
Behavioral Scoring
For form submissions, the plugin evaluates how the user interacted with the page. This is where most sophisticated bots fail, because generating convincing human behavior at scale is genuinely hard.
The behavioral scorer uses four weighted categories:
Behavioral signals (40% weight):
- Mouse velocity variance
- Straight-line movement ratio
- Micro-tremor score (natural hand movement)
Environmental signals (35% weight):
- Headless browser markers
- Automation framework detection
- Browser API consistency
Temporal signals (15% weight):
- Time on page before submission
- Form completion velocity
- Session duration
Form signals (10% weight):
- Honeypot field triggers
- Field completion order
- Paste detection
Mouse velocity variance is particularly effective. Humans move the mouse with variable speed: accelerating, decelerating, overshooting, correcting. The velocity curve follows a natural distribution. Bots that simulate mouse movement typically use linear interpolation or simple easing functions, which produce unnaturally smooth velocity profiles.
Straight-line movement ratio measures what percentage of mouse movements travel in perfectly straight lines. Humans almost never move in straight lines because of micro-tremors and the natural imprecision of hand movement. A ratio above a certain threshold is a strong indicator of automated movement generation.
Micro-tremor score looks for the tiny, involuntary oscillations present in all human hand movement. These tremors have characteristic frequency patterns that are difficult to simulate convincingly. Their absence suggests the input is generated by code.
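Two of these features are simple to compute from a list of sampled pointer positions. The sketch below is in Python for illustration; the sample format `(x, y, t_seconds)`, the collinearity tolerance, and the assumption that each category is scored 0 to 100 before weighting are all hypothetical. Micro-tremor analysis needs frequency-domain processing and is left out.

```python
import math
from statistics import pvariance

WEIGHTS = {"behavioral": 0.40, "environmental": 0.35, "temporal": 0.15, "form": 0.10}


def velocities(points):
    """Speed of each segment between consecutive (x, y, t) samples."""
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:
            speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds


def velocity_variance(points):
    """Population variance of segment speeds; near zero for linear interpolation."""
    speeds = velocities(points)
    return pvariance(speeds) if len(speeds) >= 2 else 0.0


def straight_line_ratio(points, tolerance=1e-9):
    """Fraction of consecutive sample triples that are collinear."""
    triples = list(zip(points, points[1:], points[2:]))
    if not triples:
        return 0.0
    straight = 0
    for (x0, y0, _), (x1, y1, _), (x2, y2, _) in triples:
        # Cross product of the two segment vectors is zero when collinear.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if abs(cross) <= tolerance:
            straight += 1
    return straight / len(triples)


def combined_score(category_scores):
    """Weighted sum of per-category scores; missing categories count as 0."""
    return sum(WEIGHTS[c] * category_scores.get(c, 0.0) for c in WEIGHTS)
```

A path generated by linear interpolation yields a straight-line ratio of 1.0 and a velocity variance of zero, while even a short hand-drawn path rarely produces a single collinear triple.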
Honeypot Fields
The plugin injects invisible form fields that real users never see or fill out. If a field receives a value, the submission came from a bot that filled in every input on the page.
What makes the implementation interesting is the obfuscation. Instead of using obvious names like honeypot or trap, the plugin generates legitimate-looking field names that change daily:
// Field names rotate using a daily seed
// Examples of generated names:
// contact_name, user_email, address_field, phone_number
// CSS class prefixes mimic common form frameworks:
// form-, input-, wp-, cf-, gform-, ninja-
The daily rotation prevents bot operators from hardcoding a list of honeypot field names to skip. The realistic naming prevents bots that filter for obvious trap patterns.
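A deterministic daily rotation can be built from a keyed hash of the date. The following Python sketch is an assumption about how such a scheme might look, not the plugin's algorithm: the name pool and class prefixes are the examples above, while `secret`, `form_id`, and both function names are hypothetical.

```python
import hashlib
import hmac
from datetime import date

NAME_POOL = ["contact_name", "user_email", "address_field", "phone_number"]
CLASS_PREFIXES = ["form-", "input-", "wp-", "cf-", "gform-", "ninja-"]


def honeypot_for_day(secret, form_id, day):
    """Derive today's honeypot field name and CSS class from a keyed hash."""
    message = f"{form_id}|{day.isoformat()}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    name = NAME_POOL[digest[0] % len(NAME_POOL)]
    css_class = CLASS_PREFIXES[digest[1] % len(CLASS_PREFIXES)] + digest[2:5].hex()
    return name, css_class


def is_bot_submission(secret, form_id, day, form_data):
    """True if the field that is the honeypot for `day` was filled in."""
    name, _ = honeypot_for_day(secret, form_id, day)
    return bool(form_data.get(name, "").strip())
```

Because the name is recomputed from the secret and the date, nothing has to be stored between rendering the form and validating it. A real implementation would also need to avoid colliding with genuine field names and to accept the previous day's name for forms loaded just before midnight.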
Proof-of-Work Challenges
The proof-of-work system is the layer that makes automated attacks economically painful even when bots pass the other checks.
How It Works
When a form loads, the plugin generates a challenge: a random hex prefix and a difficulty parameter. The client must find a nonce value where SHA-256(prefix + nonce) starts with N zero hex characters. This requires brute-force computation because there’s no shortcut to finding valid SHA-256 hashes.
Server generates:
prefix: "a7f3c8e91b04d265" (16 hex chars from 8 random bytes)
difficulty: 4 (default, requires 4 leading zero hex chars)
expires: current_time + 5 minutes
signature: HMAC-SHA256(challenge_data, wordpress_auth_key)
Client computes:
nonce = 0: SHA-256("a7f3c8e91b04d265" + "0") = "7f2a..." (fail)
nonce = 1: SHA-256("a7f3c8e91b04d265" + "1") = "b391..." (fail)
...
nonce = N: SHA-256("a7f3c8e91b04d265" + "N") = "0000a..." (4 leading zeros, pass!)
Client submits: { challengeId, nonce, signature }
At difficulty 4, the client needs to try roughly 65,536 hashes on average. On modern hardware, this takes milliseconds and happens entirely in the background while the user fills out the form. They never see it.
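The search loop described above is short enough to show in full. This Python sketch follows the walkthrough (decimal nonce appended to the hex prefix, leading zero hex characters as the target); the function names are hypothetical, and the real client runs in browser JavaScript rather than Python.

```python
import hashlib
import secrets


def make_prefix():
    """16 hex characters from 8 random bytes, as in the walkthrough."""
    return secrets.token_hex(8)


def is_valid(prefix, nonce, difficulty):
    """Check whether SHA-256(prefix + nonce) starts with `difficulty` zero hex chars."""
    digest = hashlib.sha256(f"{prefix}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(prefix, difficulty):
    """Brute-force the smallest nonce that satisfies the challenge."""
    nonce = 0
    while not is_valid(prefix, nonce, difficulty):
        nonce += 1
    return nonce
```

Verification costs the server a single hash, while solving costs the client about 16^difficulty hashes on average. That asymmetry is what makes the scheme cheap to run and expensive to abuse.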
Why It Stops Bots
A single challenge is trivial. But the economics change at scale. A bot trying to submit 10,000 spam comments needs to solve 10,000 challenges. At 65,536 hashes per challenge, that’s 655 million hash operations. It’s achievable but it costs real compute time and resources.
The difficulty also scales based on threat signals. An IP that the server-side analysis has flagged as suspicious gets higher difficulty challenges, increasing the computational cost per submission. The default difficulty of 4 is intentionally low for normal users. Suspicious traffic might face difficulty 6 or higher, requiring roughly 16.8 million hashes (16^6) per challenge.
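The arithmetic behind these figures is a one-liner: each extra hex digit of difficulty multiplies the expected work by 16. In the Python sketch below, the hash rate is a placeholder parameter to plug your own number into, not a measured figure.

```python
def expected_hashes(difficulty):
    """Average attempts to find `difficulty` leading zero hex characters."""
    return 16 ** difficulty


def campaign_hashes(submissions, difficulty):
    """Total expected hashes for a bot making `submissions` form posts."""
    return submissions * expected_hashes(difficulty)


def campaign_seconds(submissions, difficulty, hashes_per_second):
    """Expected compute time at an assumed hash rate (caller supplies it)."""
    return campaign_hashes(submissions, difficulty) / hashes_per_second
```

`campaign_hashes(10_000, 4)` is 655,360,000, matching the 655 million figure above, and raising the difficulty from 4 to 6 multiplies that by 256.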
Replay Prevention
Each challenge includes an HMAC signature generated using the WordPress AUTH_KEY salt. The server verifies the signature before checking the hash. Combined with single-use challenge IDs and a short expiry, this prevents:
- Challenge reuse: Each challenge ID can only be used once
- Challenge tampering: The difficulty and prefix can’t be modified because they’re signed
- Challenge forging: Without the server’s AUTH_KEY, valid signatures can’t be generated
- Expired challenges: A 5-minute TTL prevents stockpiling solved challenges
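These four checks can be sketched together. The Python below is an illustration under stated assumptions, not the plugin's PHP: the signed payload layout (`prefix|difficulty|expires`), the use of the signature as the single-use key, and the in-memory `used` set are all hypothetical choices.

```python
import hashlib
import hmac
import secrets
import time


def sign(secret, prefix, difficulty, expires):
    """HMAC-SHA256 over the challenge fields (payload layout is an assumption)."""
    data = f"{prefix}|{difficulty}|{expires}".encode()
    return hmac.new(secret, data, hashlib.sha256).hexdigest()


def issue(secret, difficulty=4, ttl=300, now=None):
    """Create a signed challenge that expires `ttl` seconds from `now`."""
    now = int(time.time()) if now is None else now
    prefix = secrets.token_hex(8)
    expires = now + ttl
    return {"prefix": prefix, "difficulty": difficulty, "expires": expires,
            "signature": sign(secret, prefix, difficulty, expires)}


def verify(secret, challenge, nonce, used, now=None):
    """Return "ok" or the reason the submission is rejected."""
    now = int(time.time()) if now is None else now
    expected = sign(secret, challenge["prefix"], challenge["difficulty"],
                    challenge["expires"])
    if not hmac.compare_digest(expected, challenge["signature"]):
        return "bad signature"   # forged or tampered challenge
    if now > challenge["expires"]:
        return "expired"         # stockpiled solution
    if challenge["signature"] in used:
        return "already used"    # replayed solution
    digest = hashlib.sha256(f"{challenge['prefix']}{nonce}".encode()).hexdigest()
    if not digest.startswith("0" * challenge["difficulty"]):
        return "bad proof"
    used.add(challenge["signature"])
    return "ok"
```

In this sketch the signature removes the need to store challenges when they are issued, but enforcing single use still means remembering which challenges have been redeemed until they expire.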
Good Bot Verification
Not all bots are bad. Googlebot, Bingbot, and 60+ other legitimate crawlers need unimpeded access to your site for indexing, social media previews, uptime monitoring, and SEO tool analysis.
The plugin maintains a categorized list of known good bots:
Search engines: Googlebot, Bingbot, YandexBot, Baiduspider,
DuckDuckBot, Applebot
Social: Facebook, LinkedIn, Twitter, Pinterest
Monitoring: Pingdom, UptimeRobot, StatusCake, Datadog
SEO tools: Ahrefs, SEMrush, Moz, Majestic
Feed readers: Feedly, NewsBlur
AI crawlers: GPTBot, ClaudeBot, PerplexityBot (optional blocking)
For verifiable bots like Googlebot and Bingbot, the plugin performs reverse DNS verification. This confirms that a request claiming to be Googlebot actually comes from a Google-owned server:
- Look up the requesting IP’s hostname via reverse DNS
- Check if the hostname ends with a verified domain (.googlebot.com or .google.com for Googlebot)
- Forward-resolve the hostname back to an IP
- Confirm the resolved IP matches the original requesting IP
This prevents bots from spoofing Googlebot’s user agent to bypass detection. A fake Googlebot scores +80 points (instant block) because spoofing a search engine crawler is a strong indicator of malicious intent.
Verification results are cached in WordPress transients with a 1-hour TTL to avoid repeated DNS lookups for the same IPs.
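The four verification steps map directly onto two DNS lookups. Here is a minimal Python sketch using the standard `socket` module. Only the Googlebot suffixes named above are included, the resolver functions are injectable so the logic can be exercised without network access, and the one-hour caching is omitted.

```python
import socket

VERIFIED_SUFFIXES = {"googlebot": (".googlebot.com", ".google.com")}


def system_reverse(ip):
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def system_forward(hostname):
    """Forward DNS: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def verify_bot(ip, bot, reverse=system_reverse, forward=system_forward):
    """Forward-confirmed reverse DNS check for a claimed crawler."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record or lookup failure
        return False
    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not hostname.endswith(VERIFIED_SUFFIXES[bot]):
        return False
    try:
        return ip in forward(hostname)
    except OSError:
        return False
```

The forward lookup matters because anyone who controls the reverse zone for their own IP range can make it return a `googlebot.com` hostname. Only Google can make that hostname resolve back to the same address. This sketch compares IPs as strings, so IPv6 addresses would need normalizing first.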
WooCommerce Protection
Carding is one of the most financially damaging bot attacks for ecommerce stores. Attackers test stolen credit card numbers against real checkout flows. Every failed transaction generates processor fees, and a high decline rate can get your payment processing suspended.
The WooCommerce integration adds two specific protections:
Checkout velocity limiting: Configurable maximum checkout attempts per IP within a time window (default: 5 per hour). Legitimate shoppers rarely attempt checkout more than once or twice. An IP submitting 20 checkout attempts in an hour is testing cards.
Card testing pattern detection: The plugin tracks checkout behavior beyond simple velocity. Multiple different card numbers from the same IP, rapid sequential attempts, and patterns consistent with automated testing all trigger detection. IPs exhibiting carding patterns are blocked before additional transactions reach the payment processor.
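Both protections reduce to per-IP bookkeeping over a time window. The Python sketch below is illustrative only. The default of 5 attempts per hour comes from the text, while `max_distinct_cards` and the idea of comparing opaque card fingerprints (a token or hash, never the raw card number) are assumptions.

```python
from collections import defaultdict, deque


class CheckoutMonitor:
    """Flags IPs that exceed checkout velocity or try many different cards."""

    def __init__(self, max_attempts=5, window_seconds=3600, max_distinct_cards=3):
        self.max_attempts = max_attempts
        self.window = window_seconds
        self.max_distinct_cards = max_distinct_cards  # hypothetical threshold
        self.attempts = defaultdict(deque)  # ip -> (timestamp, fingerprint)

    def record(self, ip, card_fingerprint, now):
        """Record a checkout attempt; return a reason string or None."""
        recent = self.attempts[ip]
        # Discard attempts that are older than the window.
        while recent and recent[0][0] <= now - self.window:
            recent.popleft()
        recent.append((now, card_fingerprint))
        if len({fp for _, fp in recent}) > self.max_distinct_cards:
            return "card_testing"
        if len(recent) > self.max_attempts:
            return "velocity"
        return None
```

Checking distinct cards separately from raw velocity lets a carding run be caught before it reaches the attempt limit.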
The WooCommerce integration is compatible with both the classic checkout and WooCommerce Blocks, and declares HPOS (High-Performance Order Storage) compatibility.
What You See in the Dashboard
The plugin adds three admin pages to WordPress:
Detections: A filterable log of every scored request above the logging threshold. Each entry shows the timestamp, IP address, user agent, threat score, detection signals that fired, and the MITRE ATT&CK tactic if applicable. You can filter by date range, export to CSV, and block IPs directly from entries.
Statistics: 30-day trend charts showing detection volume, threat type distribution, top blocked IPs, and attack source geography. Powered by Chart.js loaded from jsDelivr CDN (the only external resource the plugin loads, and only on this admin page).
Blocked IPs: Manage blocked addresses and CIDR ranges. Each block shows the reason, creation date, and optional expiration. Supports IPv4 and IPv6, individual addresses and CIDR notation, and temporary blocks that auto-release after a set duration.
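Matching an address against a mixed list of single IPs, CIDR ranges, and temporary blocks is straightforward with Python's standard `ipaddress` module. The sketch below is illustrative, and the `(cidr, expires_at)` tuple layout is a hypothetical stand-in for however the plugin stores its block list.

```python
import ipaddress


def is_blocked(ip, blocks, now):
    """Check `ip` against a list of (cidr_or_ip, expires_at) entries.

    `expires_at` is a timestamp, or None for a permanent block.
    """
    addr = ipaddress.ip_address(ip)
    for cidr, expires_at in blocks:
        if expires_at is not None and now >= expires_at:
            continue  # temporary block has auto-released
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False
```

A bare address such as `198.51.100.7` parses as a /32 network (or /128 for IPv6), so individual addresses and ranges share one code path.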
A dashboard widget provides a quick threat overview without navigating to the full admin pages.
Architecture Decisions
A few design choices are worth calling out because they explain why the plugin works the way it does:
No external dependencies for core protection. The entire detection engine runs on your server. No API calls during request processing. No third-party JavaScript on the frontend. This means protection works during API outages, on airgapped installs, and at any traffic volume without per-request costs. The optional WebDecoy Cloud integration adds threat intelligence, but the core detection stands alone.
Additive scoring over binary decisions. Every signal adds to a score rather than making a pass/fail decision. This reduces false positives. A single suspicious signal might be a coincidence. Five suspicious signals together are a pattern. Among the base scores listed earlier, only a failed good-bot DNS verification (+80) reaches the default block threshold by itself; any other check that misfires cannot block a legitimate user on its own.
Daily rotating honeypot names. Static honeypot field names get learned by bot operators and added to skip lists. Rotating names using a seeded algorithm ensures that the field names change daily while remaining deterministic (so the server can verify which fields are honeypots without storing state).
HMAC-signed proof-of-work challenges. Using WordPress’s AUTH_KEY to sign challenges means the server doesn’t need to store issued challenges in the database. The signature itself proves the challenge is legitimate and unmodified. This keeps the database clean and eliminates a potential DoS vector where an attacker floods the challenge generation endpoint to fill storage.
Getting Started
WordPress Admin > Plugins > Add New > Search "WebDecoy"
Or download the ZIP from the GitHub releases page and upload manually.
Activate the plugin. Protection starts immediately. The default settings (sensitivity: medium, block threshold: 75, rate limit: 60/minute, PoW difficulty: 4) work well for most sites. Adjust them later based on what you see in the statistics dashboard.
Requirements: WordPress 5.6+, PHP 7.4+. Compatible with WooCommerce 5.0 through 9.4.
GitHub: github.com/WebDecoy/wordpress-plugin