Reverse Engineering Credential Stuffing Attacks
A technical deep dive into credential stuffing tooling, attack anatomy, and the detection signals that actually work against modern ATO operators.
WebDecoy Team
WebDecoy Security Team
Reverse Engineering Credential Stuffing Attacks: A Technical Deep Dive
Most write-ups on credential stuffing stop at “attackers replay leaked passwords.” That framing is technically correct and operationally useless. If you actually want to defend a login endpoint, you need to understand what the attack looks like on the wire — what tools generate it, what their configs encode, how they handle MFA and CAPTCHA, and which signals survive the residential-proxy-and-stealth-browser arms race.
This is a hands-on breakdown. We’ll trace a single credential from a paste site to a successful account takeover, look at real config files from OpenBullet 2 and SilverBullet, and then walk through the detection signals that actually work in 2026 — not just rate limits.
The Attack Economy in One Diagram
Credential stuffing isn’t a person at a keyboard. It’s a supply chain.
breach dump (raw combo lists)
↓
combo cleaner (dedupe, normalize, shard)
↓
checker tool (OpenBullet 2, SilverBullet)
↓
proxy provider (residential rotation per request)
↓
target login (your endpoint)
↓
valid combos (verified hits)
↓
account market (resold to ATO operators)
Each layer is specialized and commoditized:
- Breach dumps are aggregated into “combo lists”, typically email:password or user:password files, sometimes hundreds of millions of lines.
- Combo cleaners dedupe, normalize, and shard combos by domain hint (@gmail.com, @yahoo.com) so checks can be parallelized.
- Checkers are configurable HTTP automation tools: OpenBullet 2, SilverBullet, BL Tools, the SentryMBA legacy holdouts.
- Proxy providers sell residential IPs by GB. Bright Data, IPRoyal, and dozens of grey-market resellers rotate ASN-clean IPs through the checker.
- Account marketplaces (Genesis-style stores, private TG channels) buy validated valid:hits files and resell access.
The unit economics are brutal: combos cost $5–$20 per million, residential bandwidth runs $3–$15/GB, and a 0.1–0.5% hit rate is profitable when validated streaming, retail, or banking accounts resell for $1–$50 each.
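To see why even the pessimistic end of those ranges still pays, here is a back-of-the-envelope calculation. The prices, hit rate, and resale value come from the figures above; the 20 KB of proxy bandwidth per check is our own assumption, and the function name is purely illustrative.

```python
def campaign_margin(combos, combo_cost_per_million, gb_cost, kb_per_check,
                    hit_rate, resale_price):
    """Return (cost, revenue) in dollars for one pass over `combos` credentials."""
    combo_cost = combos / 1_000_000 * combo_cost_per_million
    bandwidth_gb = combos * kb_per_check / 1_000_000  # KB -> GB (decimal units)
    proxy_cost = bandwidth_gb * gb_cost
    revenue = combos * hit_rate * resale_price
    return combo_cost + proxy_cost, revenue


# Worst case for the attacker on every axis: priciest combos and bandwidth,
# lowest hit rate, cheapest resale price. 20 KB per check is an assumption.
cost, revenue = campaign_margin(
    combos=1_000_000, combo_cost_per_million=20, gb_cost=15,
    kb_per_check=20, hit_rate=0.001, resale_price=1.0,
)
print(f"cost=${cost:.0f} revenue=${revenue:.0f}")
```

Under those assumptions the campaign costs roughly $320 and returns roughly $1,000. A defense has to move the hit rate or the per-check cost by a large factor before the math stops working.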
That’s the context. Now the wire.
Anatomy of a Single Login Attempt
Pick any modern login endpoint. From the attacker’s side, here’s what one credential check looks like end-to-end.
Step 1: The Config File
OpenBullet 2 attacks are driven by .opk config files written in LoliCode, a DSL that is transpiled to C#. A minimal, simplified credential-stuffing config for a site with a POST /api/login endpoint and a CSRF token looks like this:
REQUIRE PROXIES
DATA TYPE = CREDENTIALS
# 1. Hit the login page to harvest the CSRF token + session cookie
REQUEST GET "https://target.example/login"
HEADER "User-Agent: <USERAGENT>"
HEADER "Accept-Language: en-US,en;q=0.9"
PARSE "<input name=\"csrf_token\" value=\"(.+?)\"" LR -> VAR "CSRF"
# 2. Submit credentials
REQUEST POST "https://target.example/api/login"
CONTENT "email=<INPUT.USER>&password=<INPUT.PASS>&csrf_token=<CSRF>"
CONTENTTYPE "application/x-www-form-urlencoded"
HEADER "Origin: https://target.example"
HEADER "Referer: https://target.example/login"
# 3. Classify response
KEYCHECK
KEYCHAIN SUCCESS OR
KEY "Set-Cookie" Contains "session_id="
KEY "<SOURCE>" Contains "\"authenticated\":true"
KEYCHAIN FAIL OR
KEY "<SOURCE>" Contains "Invalid credentials"
KEYCHAIN BAN OR
KEY "<RESPONSECODE>" EqualTo "429"
KEY "<SOURCE>" Contains "captcha"
A few things to notice:
- CSRF and session bootstrapping are handled. Anyone who thinks “we have a CSRF token, we’re fine” is years behind. Configs harvest tokens on the fly.
- The KEYCHECK block encodes the response taxonomy. This sample has three buckets (success, fail, ban); fuller configs add retry, MFA-challenge, and captcha-challenge. Each maps to a different bucket so the operator can post-process.
- BAN is a routing decision, not an outage. A banned response just rotates the proxy and replays the combo. Your 429 is the attacker’s continue.
Step 2: The Proxy Layer
The same config runs against a proxy list — usually socks5://user:pass@host:port lines. Modern checkers integrate with residential providers via API and pull a fresh IP per request. The IPs:
- Belong to consumer ISPs (Comcast, Spectrum, BT, Deutsche Telekom).
- Rotate ASN and geography in ways that defeat naive IP blocklists.
- Often share an IP with a legitimate user at the same time — the residential-proxy SDK is bundled in a free VPN or “free game” the homeowner installed.
This is why “block IPs with too many failed logins” stopped working around 2018. The attacker sees one IP per request. You see one request per IP.
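A small synthetic example makes the asymmetry concrete. The event shape (dicts with "ip" and "ok") and both function names are assumptions for illustration, not a real log schema. The point is that a per-IP counter never fires, while the aggregate failure ratio for the same window is impossible to miss.

```python
from collections import Counter


def per_ip_alerts(events, threshold):
    """Return the set of IPs whose failed-login count reaches `threshold`."""
    fails = Counter(e["ip"] for e in events if not e["ok"])
    return {ip for ip, n in fails.items() if n >= threshold}


def failure_ratio(events):
    """Share of all login attempts in the window that failed."""
    return sum(not e["ok"] for e in events) / len(events) if events else 0.0


# Synthetic window: 1,000 stuffing attempts, each from its own proxy IP,
# plus 50 successful logins from 50 other addresses.
attack = [{"ip": f"10.0.{i // 250}.{i % 250}", "ok": False} for i in range(1000)]
legit = [{"ip": f"192.168.1.{i}", "ok": True} for i in range(50)]
window = attack + legit

print(per_ip_alerts(window, threshold=5))  # no IP ever reaches the threshold
print(round(failure_ratio(window), 3))     # the window as a whole is mostly failures
```

The aggregate ratio tells you an attack is under way but not which requests belong to it, which is why the rest of this post focuses on per-request signals.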
Step 3: The Stealth Layer
Higher-effort campaigns don’t even use raw HTTP. They drive a real Chromium through Puppeteer or Playwright with stealth patches, or use Browser-as-a-Service like Browserbase / Hyperbrowser to outsource the fingerprint problem entirely. From your server, you see:
- A real TLS handshake from a real Chrome build.
- A full DOM-rendering client that executes your JS challenges.
- Mouse movement, scroll events, keystroke timings — all generated, often with recorded human traces replayed back.
If your detection stack is “User-Agent + IP rep + rate limit,” the attack is invisible.
Why the Classical Defenses Fail
Let’s enumerate the defenses most teams reach for first, and the specific reason each one degrades against modern tooling.
| Defense | Why it degrades |
|---|---|
| Rate limiting per IP | One request per residential IP. You’d need to throttle at single-digit-per-IP-per-day to bite, which kills NAT’d users. |
| Rate limiting per account | Effective for targeted brute force, useless for stuffing — each combo is a different account. |
| CAPTCHA on login | Solver services (2Captcha, CapSolver) cost $1–$3 per 1000 reCAPTCHA v2, and AI vision now solves most variants without human-in-the-loop. |
| Geo / ASN blocking | Residential proxy pools cover every country and consumer ISP. Geo-blocking your own US users is the only real outcome. |
| Block known bad UAs | Configs randomize UAs from a curated pool of real Chrome/Firefox strings. |
| MFA | Helps a lot — but doesn’t help validation. Attackers still confirm valid:hits, then sell the credential to phishers who run MFA-bypass kits (Evilginx, Tycoon). |
None of these are useless. They just need to be the floor, not the ceiling.
Detection Signals That Actually Work
The signals that survive in 2026 are the ones the attacker can’t cheaply spoof at scale. Roughly in order of cost-to-attacker:
1. TLS Fingerprint (JA4)
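A raw-HTTP checker has a different TLS ClientHello than a real Chrome on a real desktop: same claimed User-Agent, different cipher suites, different extension list, different ALPN values. A Playwright-driven Chromium sits much closer to real Chrome at this layer, but its ClientHello still has to match the Chrome version its User-Agent claims. JA4 captures this in a hashable form, and because it sorts cipher suites and extensions before hashing, Chrome’s randomized extension order does not change the result.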
What to do: log the JA4 of every login request, cluster, and look for clusters that account for an outsized share of failed logins. We covered this in detail in JA4 Fingerprinting Against AI Scrapers — the same playbook applies to login endpoints.
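As a rough sketch of that clustering step, assuming your edge already attaches a JA4 string to each login event: the event shape, thresholds, and function name below are our own placeholders, and the fingerprint labels in the example are stand-ins rather than real JA4 values.

```python
from collections import Counter


def suspicious_ja4(events, min_attempts=50, min_fail_share=0.2, min_fail_rate=0.9):
    """Return JA4 values that carry an outsized share of failed logins.

    events: iterable of dicts with keys "ja4" (str) and "ok" (bool).
    A fingerprint is flagged when it has enough volume, fails almost every
    time, and accounts for a large slice of all failures in the window.
    """
    attempts, failures = Counter(), Counter()
    for e in events:
        attempts[e["ja4"]] += 1
        if not e["ok"]:
            failures[e["ja4"]] += 1

    total_failures = sum(failures.values())
    if total_failures == 0:
        return []

    flagged = []
    for ja4, n in attempts.items():
        if n < min_attempts:
            continue
        fail_rate = failures[ja4] / n
        fail_share = failures[ja4] / total_failures
        if fail_rate >= min_fail_rate and fail_share >= min_fail_share:
            flagged.append(ja4)
    return flagged
```

Requiring both a high failure rate and a high share of failures keeps a popular browser fingerprint with a normal typo rate from being flagged just because it is popular.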
2. HTTP/2 Fingerprint
HTTP/2 settings frames, header order, and pseudo-header order vary by client library. The Go net/http HTTP/2 implementation, Python httpx, and a real Chrome are trivially distinguishable. Akamai’s Akamai-H2 fingerprint and the http2-fingerprint open-source projects formalize this.
A login request whose H2 fingerprint says “Go client” but whose User-Agent says “Chrome 124 on macOS” is automated. Full stop.
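A minimal version of that consistency check might look like the following. It assumes an upstream component has already mapped the raw HTTP/2 fingerprint to a client-family label such as "chromium" or "go-net-http"; those labels, and both function names, are hypothetical. The User-Agent parsing is deliberately crude.

```python
BROWSER_FAMILIES = {"chromium", "firefox", "safari"}


def ua_family(user_agent):
    """Very rough browser family claimed by a User-Agent string."""
    if "Firefox/" in user_agent:
        return "firefox"
    if "Chrome/" in user_agent or "Chromium/" in user_agent:
        return "chromium"  # also covers Chromium-based browsers such as Edge
    if "Safari/" in user_agent:
        return "safari"
    return "other"


def h2_mismatch(h2_family, user_agent):
    """True when the User-Agent claims a browser the HTTP/2 stack does not match."""
    claimed = ua_family(user_agent)
    return claimed in BROWSER_FAMILIES and h2_family != claimed
```

A mismatch is a strong signal on its own, but it is worth logging alongside the JA4 rather than blocking on it in isolation, since intermediaries that terminate and re-originate HTTP/2 can change what you observe.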
3. Combo Replay Detection
This is the highest-signal, lowest-effort detection most teams skip. You don’t need to know the attacker — you need to know the credential.
When a login attempt arrives, compute two hashes and check each against a different source:
- Known breach corpora. Have I Been Pwned’s Pwned Passwords k-anonymity API takes only the first five hex characters of the password’s SHA-1 hash and returns every matching suffix, so you get password-hash hits without ever sending the full password or even the full hash.
- A short-window seen-cache, keyed on a hash of the username:password pair (with a per-tenant salt). If the same (user_hash, password_hash) pair was attempted in the last 24 hours from a different IP/JA4, it’s almost certainly a checker cycling proxies.
This single check catches the bulk of low-effort campaigns and is invisible to the attacker.
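Both lookups can be sketched in a few lines. The Pwned Passwords range endpoint (https://api.pwnedpasswords.com/range/ followed by the five-character prefix) returns lines of the form SUFFIX:COUNT; the helpers below only build the query and parse a response, so the HTTP call itself is left to you. The SeenCache class is an in-memory stand-in we made up for whatever TTL store you actually use, and client_id is assumed to be some combination of IP and JA4.

```python
import hashlib
import hmac
import time


def pwned_range_query(password):
    """Split SHA-1(password) into the 5-char prefix sent to the API and the local suffix."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def pwned_count(suffix, range_body):
    """Look for `suffix` in a range response whose lines look like 'SUFFIX:COUNT'."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


class SeenCache:
    """In-memory stand-in for a TTL store of recently attempted combos."""

    def __init__(self, salt, ttl_seconds=24 * 3600):
        self.salt, self.ttl, self.seen = salt, ttl_seconds, {}

    def combo_key(self, username, password):
        # Keyed hash so the cache never holds a reusable credential.
        msg = f"{username.lower()}\x00{password}".encode("utf-8")
        return hmac.new(self.salt, msg, hashlib.sha256).hexdigest()

    def replayed_from_elsewhere(self, username, password, client_id, now=None):
        """True if this combo was seen within the TTL from a different client_id."""
        now = time.time() if now is None else now
        key = self.combo_key(username, password)
        previous = self.seen.get(key)
        self.seen[key] = (client_id, now)
        return (previous is not None
                and now - previous[1] <= self.ttl
                and previous[0] != client_id)
```

A hit on either check is a reason to step up, not to block outright: plenty of real users reuse breached passwords, and a real user can retry the same password from a phone and a laptop within a day.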
4. Pre-Login Behavior Telemetry
Real users land on /login from a referrer, scroll, focus the email field, paste or type, blur, then submit. The whole sequence takes 4–30 seconds. A checker hits /login once for the CSRF token and POST /api/login 200ms later — sometimes from a different proxy.
Useful pre-login signals to capture from the page itself:
- Time-on-page before submit
- Field-fill order and inter-keystroke timing
- Whether the password field was focused via tab vs. click vs. never
- Whether pointermove events fired between page load and submit
- Whether the form was submitted via Enter keydown vs. mouse click on the button
We dive into the keystroke side of this in the FCaptcha keystroke biometrics post. The same telemetry pipeline feeds login defense.
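On the server side, those signals can be folded into a simple score. The sketch below assumes a page-side script reports a dict with the hypothetical fields shown; the field names, weights, and thresholds are illustrative placeholders, not tuned values.

```python
def prelogin_risk(t):
    """Score pre-login telemetry from 0 (human-like) to 100 (automation-like).

    `t` is a dict of hypothetical fields reported by a page-side script.
    """
    score = 0
    if t.get("ms_on_page", 0) < 1000:
        score += 40  # submitted faster than a person can read and type
    if not t.get("pointer_moved", False) and t.get("submit_via") == "click":
        score += 30  # a button click with no pointer movement before it
    if t.get("password_focus") == "never":
        score += 20  # password value set without the field ever taking focus
    if t.get("keystrokes", 0) == 0 and not t.get("pasted", False):
        score += 10  # nothing typed and nothing pasted
    return score
```

Password managers and touch devices legitimately trip some of these rules (autofill with no keystrokes, taps with no pointermove), so the weaker rules carry less weight and the score is better used as one input to a step-up decision than as a verdict.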
5. Endpoint Decoys
Real users don’t fetch /api/login directly — they go through the form. So expose a never-linked, never-rendered endpoint like /api/v1/authenticate-legacy that no human will ever hit, and treat any POST to it as automated. Same idea for hidden form fields named password_confirm that should always be empty on submit.
This is the credential-stuffing analogue of the endpoint and form honeypot patterns used elsewhere on the site.
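Both decoys reduce to a check you can run before any real authentication logic. This is a framework-agnostic sketch: the decoy path and honeypot field name are the examples from the text above, and the function name and return values are our own.

```python
DECOY_PATHS = {"/api/v1/authenticate-legacy"}
HONEYPOT_FIELDS = {"password_confirm"}


def decoy_verdict(method, path, form):
    """Return a reason string if the request tripped a decoy, else None.

    method: HTTP method, path: request path, form: dict of submitted fields.
    """
    if method.upper() == "POST" and path.rstrip("/") in DECOY_PATHS:
        return "decoy-endpoint"
    for field in HONEYPOT_FIELDS:
        if form.get(field, "").strip():
            return "honeypot-field"
    return None
```

A tripped decoy is most useful when the response looks like an ordinary failed login, so the operator has no easy way to tell which part of the config gave them away.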
6. Response Symmetry
Operators rely on response differences to classify attempts. If your 200 OK + "Invalid credentials", 200 OK + redirect to MFA, and 200 OK + session cookie look meaningfully different in size, headers, or timing, you’re feeding the KEYCHECK block.
Make every login response (success, fail, MFA-required, locked, throttled) return the same status code, the same body length (within a small jitter window), and the same baseline timing. Carry the actual outcome in a field the legitimate client resolves using a server-set cookie or a signed token, rather than in the status code, headers, or a plaintext message. The attacker’s checker sees noise; the legitimate browser sees a normal flow.
This one defense alone makes config development dramatically more expensive.
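The length and timing half of that can be sketched as two small helpers. This assumes JSON responses, a fixed 512-byte body, and a 350 ms floor, all of which are placeholder choices; how the outcome itself is carried in the payload is left out. The padding is random rather than a run of one character so that response compression does not reintroduce a length difference.

```python
import json
import secrets
import time

BODY_LEN = 512        # assumed fixed body size in bytes
FLOOR_SECONDS = 0.35  # assumed minimum handling time


def pad_body(payload, target_len=BODY_LEN):
    """Serialize `payload` as JSON with a random filler field, exactly target_len bytes."""
    base = json.dumps({**payload, "pad": ""}, separators=(",", ":")).encode("utf-8")
    missing = target_len - len(base)
    if missing < 0:
        raise ValueError("payload exceeds the fixed body size")
    filler = secrets.token_urlsafe(missing)[:missing]  # ASCII, one byte per char
    return json.dumps({**payload, "pad": filler}, separators=(",", ":")).encode("utf-8")


def with_timing_floor(handler, floor=FLOOR_SECONDS,
                      clock=time.monotonic, sleep=time.sleep):
    """Run handler() and wait until at least `floor` seconds have elapsed."""
    start = clock()
    result = handler()
    remaining = floor - (clock() - start)
    if remaining > 0:
        sleep(remaining)
    return result
```

The floor has to sit above your slowest normal path (typically the password hash verification), otherwise the slow outcome still stands out.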
Putting It Together: A Layered Stack for the Login Endpoint
A defense stack that holds up against current tooling looks roughly like this:
- Edge: TLS / HTTP/2 fingerprint logged and scored. Drop or challenge requests whose fingerprint cluster is overrepresented in failed logins over the last hour.
- Pre-form telemetry: Page-side script captures interaction signals and submits a signed token with the login POST. Missing or replayed tokens fail closed.
- Endpoint decoys: A hidden honeypot field and a never-linked auth endpoint, monitored for any traffic.
- Combo intelligence: Hash and check (user, password) against breach corpora and a short-term seen-cache. Force a step-up on hits.
- Symmetric responses: Identical shape, length, and timing across all login outcomes.
- MFA + risk-based step-up: WebAuthn for high-value accounts. TOTP / push for everyone else, triggered by risk score rather than every login.
- SIEM correlation: Login telemetry into the same pipeline as the rest of your bot signals so a compromised account that suddenly starts scraping or carding is caught downstream. We walk through that integration in the SIEM bot detection post.
Notice what isn’t on the list: a giant CAPTCHA wall, a bigger IP blocklist, or aggressive rate limits that break NAT’d users. Those are the defenses attackers have already priced in.
Where This Goes Next
Two trends to watch over the next 12 months:
LLM-driven checkers. Instead of hand-written LoliCode configs, operators are starting to use LLM agents that can navigate a login flow, parse the response semantically, and self-heal when the form changes. This collapses the time between a target site shipping a defense and a working bypass. TLS and HTTP/2 fingerprinting hold up here because the LLM still has to make HTTP calls through some runtime — and that runtime has a fingerprint.
Session token harvesting. As MFA adoption rises, the economic value shifts from raw valid:hits to active session cookies. AitM phishing kits like Evilginx and Tycoon already monetize this. Your login defense doesn’t stop the phish, but binding sessions to a JA4 + device fingerprint + IP-ASN tuple lets you reject a stolen cookie the first time it is replayed from a client that doesn’t match.
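One way to implement that binding is an HMAC over the session ID and the client context it was issued to, stored next to the session and re-checked on each request. This is a sketch under our own assumptions: the function names, the choice of fields, and the separator are illustrative, and device_id stands for whatever device fingerprint you already compute.

```python
import hashlib
import hmac


def binding_tag(secret, session_id, ja4, device_id, asn):
    """HMAC-SHA256 over the session ID and the client context it was issued to."""
    msg = "\x1f".join([session_id, ja4, device_id, str(asn)]).encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def session_still_bound(secret, session_id, stored_tag, ja4, device_id, asn):
    """True when the presenting client matches the context stored at issuance."""
    expected = binding_tag(secret, session_id, ja4, device_id, asn)
    return hmac.compare_digest(expected, stored_tag)
```

Legitimate context does drift (a browser update can change the JA4, a phone moves between networks), so a failed binding check is usually better handled as a forced re-authentication than as a silent session kill.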
Conclusion
Credential stuffing is no longer a brute-force problem. It’s a content-delivery problem dressed up as authentication: leaked credentials, residential bandwidth, and stealth automation, delivered to your POST /login at a price the attacker has already optimized.
The good news is that the same asymmetries that make the attack cheap — generic tooling, shared infrastructure, replayed credentials — also make it detectable, if you instrument the right layer. TLS fingerprints, pre-form telemetry, breach-corpus lookups, decoy endpoints, and response symmetry are the pieces of a defense that doesn’t fall over the first time the attacker swaps proxies.
WebDecoy ships these signals (JA4, behavioral telemetry, endpoint decoys, and combo intelligence) as a single layer in front of your login endpoint, so you’re not stitching them together yourself.
If you want to see what hits your /login today, start a free trial and point WebDecoy at it for 14 days. The first surprise is almost always the volume.