Web Application Firewalls (WAF): Architecture, Rules, and False Positive Management
A Web Application Firewall sits between your users and your application, inspecting HTTP traffic and blocking malicious requests before they reach your code. Unlike network firewalls that operate at layers 3–4, a WAF understands layer 7 — it can parse headers, cookies, query strings, JSON bodies, and even multipart uploads. This guide covers WAF architecture, major rule sets, cloud WAF services, and the operational reality of false positive management.
WAF Architecture#
Deployment Models#
A WAF can be deployed in three positions, each with different trade-offs:
Reverse Proxy (Inline)          Sidecar / Agent         Cloud Edge
┌────────┐  ┌─────┐  ┌─────┐    ┌─────────────────┐     ┌───────────┐
│ Client ├─►│ WAF ├─►│ App │    │       Pod       │     │ CDN + WAF │
└────────┘  └─────┘  └─────┘    │ ┌─────┐ ┌─────┐ │     └─────┬─────┘
                                │ │Agent│ │ App │ │           │
                                │ └─────┘ └─────┘ │      ┌────┴────┐
                                └─────────────────┘      │ Origin  │
                                                         └─────────┘
- Reverse proxy — the WAF terminates TLS and forwards clean traffic. Full visibility but adds latency.
- Sidecar / agent — runs alongside the application (e.g., ModSecurity as an Nginx module). Low latency but limited to that single instance.
- Cloud edge — the WAF runs at CDN PoPs worldwide. Blocks attacks before they reach your infrastructure, but you rely on the vendor for rule quality.
Inspection Pipeline#
A WAF processes each request through a pipeline:
- Parse — decode the HTTP request (URL decoding, base64, JSON).
- Normalize — handle evasion techniques (double encoding, null bytes, Unicode tricks).
- Match — evaluate request fields against rule conditions.
- Score — in anomaly scoring mode, each rule match adds to a threat score.
- Decide — block, allow, log, or challenge based on the final score or rule action.
- Log — record the decision for audit and tuning.
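The pipeline stages above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular WAF engine is implemented: the two rules, their scores, the `BLOCK_THRESHOLD` value, and the `inspect` function name are all assumptions made for the example.

```python
import logging
import re
from dataclasses import dataclass
from urllib.parse import unquote

log = logging.getLogger("waf-sketch")

# Hypothetical rules for illustration: (rule id, pattern, score).
RULES = [
    ("sqli-union", re.compile(r"union\s+select", re.IGNORECASE), 5),
    ("path-traversal", re.compile(r"\.\./"), 5),
]
BLOCK_THRESHOLD = 5  # assumed value for this sketch


def normalize(value: str, max_rounds: int = 3) -> str:
    """Undo repeated URL encoding (e.g. %2520 -> %20 -> space), then drop null bytes."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value.replace("\x00", "")


@dataclass
class Decision:
    action: str
    score: int
    matched: list


def inspect(raw_query: str) -> Decision:
    value = normalize(raw_query)                                    # Parse + Normalize
    hits = [(rid, pts) for rid, pat, pts in RULES if pat.search(value)]  # Match
    score = sum(pts for _, pts in hits)                             # Score
    action = "block" if score >= BLOCK_THRESHOLD else "allow"       # Decide
    decision = Decision(action, score, [rid for rid, _ in hits])
    log.info("waf decision: %s", decision)                          # Log
    return decision
```

Calling `inspect("id=1%2520UNION%2520SELECT%2520password")` shows why normalization comes before matching: the double-encoded payload only matches the SQL injection pattern after two rounds of decoding.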
OWASP ModSecurity Core Rule Set (CRS)#
ModSecurity is a widely used open-source WAF engine. The OWASP Core Rule Set (CRS) is its usual companion: a curated set of rules that detect common attack patterns.
What CRS Covers#
| Category | Attacks Detected |
|---|---|
| SQL Injection | Union-based, blind, error-based, time-based |
| Cross-Site Scripting | Reflected, stored, DOM-based XSS payloads |
| Remote Code Execution | OS command injection, code injection |
| Local File Inclusion | Path traversal, /etc/passwd access |
| HTTP Protocol Violations | Malformed headers, request smuggling indicators |
| Session Fixation | Cookie injection, session ID manipulation |
| Scanner Detection | Known vulnerability scanner signatures |
Anomaly Scoring Mode#
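Instead of blocking on the first rule match, CRS accumulates an anomaly score: each matching rule adds points according to its severity (by default critical = 5, error = 4, warning = 3, notice = 2), and the request is blocked only when the total reaches the configured threshold (5 for inbound requests by default). The paranoia level (PL 1–4) is a separate setting that controls how many rules are active. Higher levels enable stricter rules that catch more attacks but also match more legitimate traffic.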
Paranoia Level 1: Low false positives, catches obvious attacks
Paranoia Level 2: Moderate — catches more, some tuning needed
Paranoia Level 3: Aggressive — significant tuning required
Paranoia Level 4: Paranoid — research use, heavy false positives
Start at PL1 in production. Increase the level gradually, tuning out false positives at each step.
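To make the interaction between paranoia level and anomaly score concrete, here is a small Python sketch. The severity weights are the CRS defaults; the `RuleMatch` type, the rule IDs 1001 and 1002, and the function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# CRS default severity weights.
SEVERITY_SCORE = {"critical": 5, "error": 4, "warning": 3, "notice": 2}


@dataclass(frozen=True)
class RuleMatch:
    rule_id: int         # hypothetical ID, not a real CRS rule
    severity: str        # one of the SEVERITY_SCORE keys
    paranoia_level: int  # lowest paranoia level at which the rule is active


def anomaly_score(matches, paranoia_level):
    """Sum the scores of matched rules that are active at this paranoia level."""
    return sum(
        SEVERITY_SCORE[m.severity]
        for m in matches
        if m.paranoia_level <= paranoia_level
    )


def should_block(matches, paranoia_level=1, threshold=5):
    """Block once the accumulated score reaches the threshold."""
    return anomaly_score(matches, paranoia_level) >= threshold


matches = [
    RuleMatch(1001, "warning", paranoia_level=1),
    RuleMatch(1002, "notice", paranoia_level=2),
]
```

With these two matches, the same request scores 3 at PL1 (allowed) and 5 at PL2 (blocked), which is why each step up in paranoia level usually needs a fresh round of tuning.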
Configuration Example#
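# modsecurity.conf
SecRuleEngine On
SecRequestBodyAccess On
SecResponseBodyAccess Off
SecAuditEngine RelevantOnly
# Audit-log 5xx and 4xx responses, except 404
SecAuditLogRelevantStatus "^(?:5|4(?!04))"
# CRS setup (must load before the rules)
Include /etc/modsecurity/crs-setup.conf
# Anomaly threshold: define it before the rule files are included.
# The CRS default is 5, so 10 is more lenient.
# If crs-setup.conf already defines id 900110, change it there instead,
# because duplicate rule IDs fail to load.
SecAction "id:900110,phase:1,nolog,pass,t:none,\
setvar:tx.inbound_anomaly_score_threshold=10"
# CRS rules
Include /etc/modsecurity/rules/*.conf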
AWS WAF#
AWS WAF integrates with CloudFront, ALB, API Gateway, and AppSync. Rules are organized into Web ACLs containing rule groups.
Key Components#
- Web ACL — the top-level container. Each resource (ALB, CloudFront distribution) associates with one Web ACL.
- Rule groups — reusable collections of rules. AWS provides managed rule groups (e.g., AWSManagedRulesCommonRuleSet), or you create custom ones.
- Rules — match conditions (IP sets, regex, size constraints, geo match) paired with actions (Allow, Block, Count, CAPTCHA).
Managed Rule Groups#
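AWS provides several managed rule groups. The baseline groups (Common Rule Set, SQL Injection, Known Bad Inputs) carry no charge beyond standard AWS WAF pricing, while Bot Control and Account Takeover Prevention are billed as paid add-ons: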
- Common Rule Set — OWASP Top 10 coverage.
- SQL Injection Rule Set — SQLi-specific patterns.
- Known Bad Inputs — Log4Shell, Spring4Shell, and similar.
- Bot Control — categorizes bots as verified (Googlebot), self-identified, or unverified.
- Account Takeover Prevention — detects credential stuffing on login endpoints.
Custom Rule Example#
Block requests whose query string is longer than 2,048 bytes (oversized query strings are a common sign of scanner or injection traffic). Note that SizeConstraintStatement measures the size in bytes, not the number of parameters:
{
  "Name": "LimitQueryStringSize",
  "Priority": 1,
  "Statement": {
    "SizeConstraintStatement": {
      "FieldToMatch": { "QueryString": {} },
      "ComparisonOperator": "GT",
      "Size": 2048,
      "TextTransformations": [{ "Priority": 0, "Type": "NONE" }]
    }
  },
  "Action": { "Block": {} },
  "VisibilityConfig": {
    "SampledRequestsEnabled": true,
    "CloudWatchMetricsEnabled": true,
    "MetricName": "LimitQueryStringSize"
  }
}
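If you manage several rules like this one, it can help to generate the JSON rather than hand-edit it. The sketch below builds the same rule structure from parameters and includes a local check that mirrors the size comparison. The helper names `size_constraint_rule` and `would_match` are hypothetical, and the local check only approximates what AWS WAF evaluates; it is useful for reasoning about thresholds, not as a substitute for testing against the service.

```python
import json


def size_constraint_rule(name, max_bytes, priority, action="Block"):
    """Build a query-string size rule. Use action="Count" while tuning."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "SizeConstraintStatement": {
                "FieldToMatch": {"QueryString": {}},
                "ComparisonOperator": "GT",
                "Size": max_bytes,
                "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
            }
        },
        "Action": {action: {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


def would_match(rule, query_string):
    """Local approximation: compare the query string's byte length to the limit."""
    limit = rule["Statement"]["SizeConstraintStatement"]["Size"]
    return len(query_string.encode("utf-8")) > limit


rule = size_constraint_rule("LimitQueryStringSize", 2048, priority=1)
rule_json = json.dumps(rule, indent=2)
```

Passing `action="Count"` produces a rule that only records matches, which fits the detection-first rollout described later in this guide.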
Cloudflare WAF#
Cloudflare's WAF runs at over 300 edge locations. Its protections fall into three groups: managed rulesets, custom rules, and the Page Shield and API Shield add-ons.
Managed Rulesets#
- Cloudflare Managed Ruleset — Cloudflare-authored rules updated continuously.
- OWASP Core Ruleset — Cloudflare's implementation of CRS with anomaly scoring.
- Exposed Credentials Check — compares login credentials against known breach databases.
Custom Rules (Firewall Rules)#
Cloudflare uses a wirefilter expression syntax:
# Block requests to /admin from non-US IPs
(http.request.uri.path contains "/admin") and
(ip.geoip.country ne "US")
# Challenge requests with suspicious user agents
(http.user_agent contains "sqlmap") or
(http.user_agent contains "nikto") or
(http.user_agent contains "dirbuster")
Page Shield and API Shield#
Beyond traditional WAF rules, Cloudflare offers:
- Page Shield — monitors client-side JavaScript for supply-chain attacks (Magecart-style).
- API Shield — validates API requests against an uploaded OpenAPI schema, blocking non-conforming traffic.
Rate Limiting#
Rate limiting is a WAF feature that throttles abusive traffic patterns. It complements rule-based detection by catching attacks that use valid-looking requests at high volume.
Common Rate Limiting Strategies#
| Strategy | Use Case |
|---|---|
| IP-based | Block brute-force login attempts |
| Session-based | Prevent authenticated abuse |
| Endpoint-based | Protect expensive API endpoints |
| Geographic | Throttle traffic from unexpected regions |
| Sliding window | Smooth enforcement without burst penalties |
Implementation Considerations#
- Use sliding windows rather than fixed windows to prevent burst attacks at window boundaries.
- Apply graduated responses: first warn (Count), then challenge (CAPTCHA), then block.
- Exempt known-good traffic: internal health checks, verified bot IPs, and monitoring services.
- Set different limits per endpoint: /login needs tighter limits than /products.
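A sliding window limiter with per-endpoint limits can be sketched as follows. This is a minimal in-memory version (a "sliding window log") that assumes a single process; the class name, the limits in `LIMITS`, and the injectable `clock` parameter are choices made for this example. A production deployment would typically keep the counters in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key in any `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of allowed requests

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Assumed example limits: tighter on /login than on /products.
LIMITS = {
    "/login": SlidingWindowLimiter(limit=5, window_seconds=60),
    "/products": SlidingWindowLimiter(limit=100, window_seconds=60),
}


def check(path, client_ip):
    limiter = LIMITS.get(path)
    return True if limiter is None else limiter.allow(client_ip)
```

Because the window slides with each request, a client cannot send a full quota at the end of one fixed window and another full quota at the start of the next.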
Bot Protection#
Modern WAFs go beyond signature matching to classify bots using behavioral analysis.
Detection Techniques#
- JavaScript challenges: inject JavaScript that real browsers execute but simple HTTP clients without a JavaScript engine cannot.
- CAPTCHA / Turnstile — present interactive challenges when confidence is low.
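- TLS fingerprinting (JA3/JA4): identify the TLS client library from its handshake to detect scripted clients (curl, Python HTTP libraries) that claim to be Chrome in the User-Agent header. A headless browser built on real Chrome shares Chrome's fingerprint, so catching it requires other signals.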
- Behavioral analysis — mouse movements, scroll patterns, and request timing distinguish humans from bots.
- IP reputation — cross-reference source IPs against threat intelligence feeds.
Bot Categories#
- Verified bots — search engine crawlers with reverse-DNS-verified IPs. Always allow.
- Good bots — monitoring services, SEO tools. Allow with rate limits.
- Bad bots — scrapers, credential stuffers, vulnerability scanners. Block or challenge.
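Reverse-DNS verification of a crawler is usually done as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the original IP. The sketch below shows the idea. The suffix list is an assumption for illustration (check each crawler's published documentation), the function names are hypothetical, and the default resolvers handle IPv4 only.

```python
import socket

# Assumed suffixes for illustration; consult each crawler's documentation.
VERIFIED_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip):
    """PTR lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname):
    """A lookup: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_bot(ip, reverse=_reverse_lookup, forward=_forward_lookup):
    """Forward-confirmed reverse DNS check for a crawler IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(VERIFIED_SUFFIXES):
            return False
        # The hostname must resolve back to the same IP; otherwise the
        # PTR record could have been set by whoever controls the IP block.
        return ip in forward(hostname)
    except OSError:
        return False  # lookup failed: treat as unverified
```

The resolver functions are parameters so the check can be exercised with canned answers in tests and so results can be cached in front of real DNS.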
Custom Rules — Building Your Own#
Every application has unique attack surfaces. Custom rules fill the gaps that generic rule sets miss.
Examples#
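Block JWT tampering attempts:
# A JWT header is base64url-encoded, so the literal text "alg":"none" never appears in the Authorization header.
# Decode the first dot-separated segment of the bearer token, then inspect the result.
Match: base64url-decoded first segment of Header "Authorization" bearer token matches regex "alg"\s*:\s*"none" (case-insensitive)
Action: Block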
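Because the JWT header travels base64url-encoded, a check for the "none" algorithm has to decode the first token segment before inspecting it. The following Python sketch shows one way to do that with the standard library. The function names are hypothetical, and the check only reads the header; it does not verify signatures.

```python
import base64
import json


def jwt_header_alg(authorization):
    """Return the `alg` value from a bearer JWT's header, or None if unreadable."""
    parts = authorization.split()
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    token = parts[1]
    if token.count(".") < 2:
        return None  # not in header.payload.signature form
    segment = token.split(".")[0]
    # base64url strips padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    try:
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return None
    alg = header.get("alg") if isinstance(header, dict) else None
    return alg if isinstance(alg, str) else None


def is_alg_none(authorization):
    """True when the token declares the unsigned 'none' algorithm (any casing)."""
    alg = jwt_header_alg(authorization)
    return alg is not None and alg.lower() == "none"
```

The case-insensitive comparison matters because variants such as "None" or "NONE" have been used to slip past exact-match filters.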
Enforce Content-Type on POST requests:
# API endpoints must send JSON
Match: Method is POST AND
URI starts with /api/ AND
Header "Content-Type" does not contain "application/json"
Action: Block
Restrict admin panels to corporate IP ranges:
# Only allow admin access from corporate IP ranges
Match: URI starts with /admin AND
IP not in {203.0.113.0/24, 198.51.100.0/24}
Action: Block
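The Content-Type and admin allowlist rules above translate directly into code. This sketch expresses both as plain Python functions using the standard `ipaddress` module; the function names are hypothetical and the CIDR ranges are the documentation ranges from the example, not real corporate networks.

```python
import ipaddress

# Documentation ranges taken from the example rule above.
ADMIN_ALLOWED = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def content_type_rule(method, path, content_type):
    """Block POSTs to /api/ that do not declare a JSON body."""
    if method.upper() == "POST" and path.startswith("/api/"):
        if "application/json" not in (content_type or "").lower():
            return "block"
    return "allow"


def admin_rule(path, client_ip):
    """Block /admin requests from addresses outside the allowlist."""
    if not path.startswith("/admin"):
        return "allow"
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return "block"  # unparseable source address: fail closed
    return "allow" if any(addr in net for net in ADMIN_ALLOWED) else "block"
```

Comparing parsed addresses against networks avoids the pitfalls of string prefix matching, where "203.0.113.1" and "203.0.113.100" look alike.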
False Positive Management#
The hardest part of running a WAF is not turning it on — it is keeping it on without blocking legitimate users. Every false positive erodes trust and pushes teams toward disabling rules entirely.
The Tuning Workflow#
- Deploy in detection-only mode — log matches without blocking (Count / DetectionOnly).
- Analyze logs — identify which rules fire on legitimate traffic.
- Create exclusions — disable specific rules for specific URI paths, parameters, or IP ranges.
- Promote to blocking — switch to Block mode once false positives are resolved.
- Monitor continuously — new features, API changes, and content updates can trigger new false positives.
Exclusion Strategies#
- Rule exclusion — disable a specific rule ID for a specific URI.
- Parameter exclusion — tell the WAF to skip inspection of a known-safe parameter (e.g., a rich-text editor field that legitimately contains HTML).
- IP allowlisting — exempt trusted sources (CI/CD pipelines, internal tools).
- Payload size bypass — skip body inspection for large file uploads where scanning adds unacceptable latency.
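Keeping exclusions as narrow as possible is easier when each one is explicit about its scope. The sketch below models rule and parameter exclusions as data and checks whether a given rule match should be suppressed. The `Exclusion` type, the rule IDs, and the paths are hypothetical; real WAFs express the same idea in their own configuration syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Exclusion:
    rule_id: int                     # hypothetical rule ID
    uri_prefix: str                  # exclusion applies under this path only
    parameter: Optional[str] = None  # None means any parameter on that path


def is_excluded(rule_id, uri, parameter, exclusions):
    """True when a rule match falls inside a configured exclusion."""
    for ex in exclusions:
        if ex.rule_id != rule_id or not uri.startswith(ex.uri_prefix):
            continue
        if ex.parameter is None or ex.parameter == parameter:
            return True
    return False


EXCLUSIONS = [
    # Rich-text field that legitimately contains HTML.
    Exclusion(rule_id=1001, uri_prefix="/cms/posts", parameter="body_html"),
    # Rule disabled for a whole upload endpoint.
    Exclusion(rule_id=1002, uri_prefix="/upload"),
]
```

Scoping each exclusion to a rule ID, a path, and where possible a single parameter keeps the rest of the application covered by the rule.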
Metrics to Track#
- False positive rate: legitimate requests blocked, as a share of all legitimate requests.
- True positive rate: attacks blocked, as a share of all attack requests.
- Rule hit distribution — which rules fire most often, and on what traffic.
- Latency impact — P99 latency added by WAF inspection.
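These metrics can be computed from WAF logs once a sample of requests has been labeled. The sketch below assumes each log record carries a `malicious` label from manual review or a test harness, and it uses legitimate requests as the denominator for the false positive rate and attack requests as the denominator for the true positive rate. The `LogRecord` fields are hypothetical and would need mapping from your WAF's actual log format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    rule_id: Optional[str]  # rule that fired, or None if nothing matched
    blocked: bool
    malicious: bool         # assumed label from manual review or replayed attacks


def false_positive_rate(records):
    """Share of legitimate requests that were blocked."""
    legit = [r for r in records if not r.malicious]
    return sum(r.blocked for r in legit) / len(legit) if legit else 0.0


def true_positive_rate(records):
    """Share of attack requests that were blocked."""
    attacks = [r for r in records if r.malicious]
    return sum(r.blocked for r in attacks) / len(attacks) if attacks else 0.0


def rule_hit_distribution(records):
    """Count how often each rule fired."""
    return Counter(r.rule_id for r in records if r.rule_id is not None)
```

Sorting the hit distribution with `most_common()` and filtering to records where `malicious` is false gives a quick list of the rules most in need of tuning.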
Best Practices#
- Start in detection mode — never deploy a WAF in blocking mode on day one.
- Layer your defenses — WAF catches known patterns; application-level validation catches logic bugs.
- Automate rule updates — subscribe to managed rule group updates and test them in staging before production.
- Log everything — WAF logs are your forensic trail. Ship them to a SIEM for correlation.
- Test with real payloads: use tools like nikto, sqlmap, and nuclei against a staging environment to validate rule coverage.
- Review exclusions quarterly: exclusions accumulate. Old ones may no longer be needed and could hide new attack vectors.
- Separate WAF for APIs vs. web — API traffic patterns differ from browser traffic. Use different rule sets and thresholds.
Codelit publishes in-depth engineering articles every week. This is article #412 in the series — explore more on codelit.io.