Rate Limiting in Node.js
Rate limiting prevents abuse, brute force attacks, and resource exhaustion. This guide covers implementation patterns for Node.js applications, from simple in-memory solutions to production-grade distributed systems.
Why Rate Limit
- Brute force protection: prevent password guessing on login endpoints.
- API abuse prevention: stop scrapers and bots from hammering your API.
- Cost control: limit expensive operations (AI inference, email sending).
- Fair usage: ensure no single client monopolizes resources.
Algorithm: Token Bucket
The most common rate limiting algorithm. Each client has a "bucket" that fills with tokens at a steady rate. Each request consumes a token. When the bucket is empty, requests are rejected.
```text
Bucket capacity: 10 tokens
Refill rate: 1 token per second

Request 1:  10 tokens → 9 (allowed)
Request 2:   9 tokens → 8 (allowed)
...
Request 10:  1 token  → 0 (allowed)
Request 11:  0 tokens     (rejected — 429)

[1 second passes: bucket refills to 1]

Request 12:  1 token  → 0 (allowed)
```
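The trace above can be sketched as a small class (a minimal illustration; the class and method names are my own, not from any library):

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerSecond: number,
    now: number = Date.now()
  ) {
    this.tokens = capacity; // bucket starts full
    this.lastRefill = now;
  }

  // Returns true and consumes a token if one is available.
  tryConsume(now: number = Date.now()): boolean {
    // Refill based on elapsed time, capped at capacity
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Because refill is computed lazily from elapsed time, no background timer is needed, and short bursts up to the bucket capacity are allowed while the long-run rate stays at the refill rate.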
Algorithm: Sliding Window
Counts requests in a rolling time window. More accurate than fixed windows, which can allow bursts at window boundaries.
```text
Window: 60 seconds, limit: 100 requests

Time 0:00 — Request 1   (count:   1/100) ✓
Time 0:30 — Request 50  (count:  50/100) ✓
Time 0:59 — Request 100 (count: 100/100) ✓
Time 0:59 — Request 101 (count: 101/100) ✗ 429
Time 1:00 — Window slides, old requests drop off
```
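A true sliding window can be implemented with a per-client log of request timestamps (a minimal sketch; production systems often use a cheaper sliding-window counter approximation instead of a full log):

```typescript
const requestLog = new Map<string, number[]>();

function slidingWindowAllow(
  key: string,
  limit: number,
  windowMs: number,
  now: number = Date.now()
): boolean {
  // Keep only timestamps still inside the current window
  const timestamps = (requestLog.get(key) ?? []).filter(
    (t) => t > now - windowMs
  );

  if (timestamps.length >= limit) {
    requestLog.set(key, timestamps);
    return false; // over the limit for this rolling window
  }

  timestamps.push(now);
  requestLog.set(key, timestamps);
  return true;
}
```

The trade-off is memory: the log stores one timestamp per allowed request per client, which is why the counter approximation is preferred at scale.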
In-Memory Rate Limiting (Single Server)
For simple applications on a single server, a fixed-window counter is usually enough (note this is simpler than the sliding window described above — the count resets at each window boundary):

```typescript
// Simple fixed-window rate limiter
const requestCounts = new Map<string, { count: number; resetAt: number }>();

function rateLimit(
  key: string,
  limit: number,
  windowMs: number
): { allowed: boolean; remaining: number; resetAt: number } {
  const now = Date.now();
  const record = requestCounts.get(key);

  if (!record || now > record.resetAt) {
    // New key or expired window — start a fresh window
    requestCounts.set(key, { count: 1, resetAt: now + windowMs });
    return { allowed: true, remaining: limit - 1, resetAt: now + windowMs };
  }

  if (record.count >= limit) {
    return { allowed: false, remaining: 0, resetAt: record.resetAt };
  }

  record.count++;
  return { allowed: true, remaining: limit - record.count, resetAt: record.resetAt };
}
```
Limitation: counts live in process memory, so they are not shared across multiple server instances and are lost on restart. Use Redis for distributed systems.
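A further limitation: `requestCounts` keeps an entry for every key it has ever seen. A periodic sweep keeps memory bounded (a minimal sketch; the helper name and interval are my own):

```typescript
// Remove entries whose window has already ended
function sweepExpired(
  counts: Map<string, { count: number; resetAt: number }>,
  now: number = Date.now()
): void {
  for (const [key, record] of counts) {
    if (now > record.resetAt) counts.delete(key);
  }
}

// Wire it to the limiter's Map; unref() keeps the timer from
// holding the process open on shutdown:
//   setInterval(() => sweepExpired(requestCounts), 60_000).unref();
```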
Redis-Backed Rate Limiting (Distributed)
For applications behind a load balancer or running multiple instances:
```typescript
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

async function rateLimitRedis(
  key: string,
  limit: number,
  windowSeconds: number
): Promise<{ allowed: boolean; remaining: number }> {
  const redisKey = `ratelimit:${key}`;

  // INCR itself is atomic, but INCR + EXPIRE are two separate round trips:
  // if the process dies between them, the key never expires
  const current = await redis.incr(redisKey);
  if (current === 1) {
    // First request in window — set expiry
    await redis.expire(redisKey, windowSeconds);
  }

  const remaining = Math.max(0, limit - current);
  return { allowed: current <= limit, remaining };
}
```
Reference: Redis — INCR (Rate Limiter Pattern)
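The INCR/EXPIRE pair can be made atomic by running both commands server-side in a short Lua script. A sketch using ioredis's `eval` (the function name and the minimal `RedisLike` interface are my own, added so the example is self-contained):

```typescript
// Minimal client surface we need; matches ioredis's eval signature
interface RedisLike {
  eval(
    script: string,
    numKeys: number,
    ...args: (string | number)[]
  ): Promise<unknown>;
}

// INCR + EXPIRE executed as one atomic server-side step
const INCR_WITH_EXPIRE = `
local current = redis.call("INCR", KEYS[1])
if current == 1 then
  redis.call("EXPIRE", KEYS[1], ARGV[1])
end
return current
`;

async function rateLimitRedisAtomic(
  redis: RedisLike,
  key: string,
  limit: number,
  windowSeconds: number
): Promise<{ allowed: boolean; remaining: number }> {
  const current = (await redis.eval(
    INCR_WITH_EXPIRE,
    1,
    `ratelimit:${key}`,
    windowSeconds
  )) as number;

  return { allowed: current <= limit, remaining: Math.max(0, limit - current) };
}
```

An ioredis instance satisfies `RedisLike` directly, so call sites stay the same as in the non-atomic version.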
Apply Different Limits to Different Endpoints
Not all endpoints need the same limits:
```typescript
const RATE_LIMITS = {
  // Login: strict to prevent brute force
  "/api/auth/login": { limit: 5, windowSeconds: 900 }, // 5 per 15 min
  // Password reset: very strict
  "/api/auth/reset-password": { limit: 3, windowSeconds: 3600 }, // 3 per hour
  // General API: moderate
  "/api/*": { limit: 100, windowSeconds: 60 }, // 100 per minute
  // Public search: lenient
  "/api/search": { limit: 30, windowSeconds: 60 }, // 30 per minute
};
```
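The `"/api/*"` entry implies some matching logic: exact paths should win over the wildcard. A minimal resolver might look like this (a sketch; the function name and precedence rule are my own):

```typescript
type LimitRule = { limit: number; windowSeconds: number };

// Pick the rule for a path: exact match first, then the longest "/prefix/*" match
function resolveLimit(
  rules: Record<string, LimitRule>,
  path: string
): LimitRule | undefined {
  if (rules[path]) return rules[path];

  const wildcard = Object.keys(rules)
    .filter((p) => p.endsWith("/*") && path.startsWith(p.slice(0, -1)))
    .sort((a, b) => b.length - a.length)[0]; // most specific prefix wins

  return wildcard ? rules[wildcard] : undefined;
}
```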
Middleware Pattern (Express)
```typescript
import { Request, Response, NextFunction } from "express";

function createRateLimiter(limit: number, windowSeconds: number) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = `${req.ip}:${req.path}`;
    const result = await rateLimitRedis(key, limit, windowSeconds);

    // Set standard rate limit headers
    res.setHeader("RateLimit-Limit", limit);
    res.setHeader("RateLimit-Remaining", result.remaining);
    // Approximation: reports the full window; the exact seconds remaining
    // would need a TTL lookup on the Redis key
    res.setHeader("RateLimit-Reset", windowSeconds);

    if (!result.allowed) {
      res.setHeader("Retry-After", windowSeconds);
      return res.status(429).json({
        error: "Too many requests. Please try again later.",
      });
    }

    next();
  };
}

// Apply to routes
app.post("/api/auth/login", createRateLimiter(5, 900), loginHandler);
app.use("/api", createRateLimiter(100, 60), apiRouter);
```
Reference: IETF — RateLimit Header Fields (draft)
Next.js Middleware Rate Limiting
For Next.js applications, apply rate limiting in middleware:
```typescript
// middleware.ts
import { NextRequest, NextResponse } from "next/server";

const rateLimitMap = new Map<string, { count: number; resetAt: number }>();

export function middleware(request: NextRequest) {
  if (request.nextUrl.pathname.startsWith("/api/")) {
    // x-forwarded-for can be a comma-separated chain; take the first hop
    const ip =
      request.headers.get("x-forwarded-for")?.split(",")[0].trim() ?? "unknown";
    const key = `${ip}:${request.nextUrl.pathname}`;
    const now = Date.now();
    const windowMs = 60_000; // 1 minute
    const limit = 100;

    const record = rateLimitMap.get(key);
    if (!record || now > record.resetAt) {
      rateLimitMap.set(key, { count: 1, resetAt: now + windowMs });
      return NextResponse.next();
    }

    if (record.count >= limit) {
      return NextResponse.json(
        { error: "Too many requests" },
        { status: 429, headers: { "Retry-After": "60" } }
      );
    }

    record.count++;
    return NextResponse.next();
  }
}
```
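Optionally, a matcher keeps the middleware from running on non-API routes at all (a config fragment; the `matcher` syntax is from the Next.js middleware docs):

```typescript
// middleware.ts (continued)
// Only invoke the middleware for API routes; other pages skip it entirely
export const config = {
  matcher: "/api/:path*",
};
```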
Note: in-memory rate limiting in Next.js middleware is per Edge instance and not shared across regions or isolates. For multi-instance deployments, use a hosted solution (Vercel KV, Upstash Redis).
Reference: Vercel — Rate Limiting with Upstash
Key-Based Rate Limiting
Rate limit by different keys depending on the endpoint:
| Endpoint | Key | Reason |
|---|---|---|
| Login | IP address | Brute force from one source |
| API (authenticated) | User ID / API key | Per-client fairness |
| API (unauthenticated) | IP address | General abuse |
| Signup | IP + email | Account creation abuse |
```typescript
function getRateLimitKey(req: Request): string {
  if (req.url.includes("/auth/login")) {
    return `ip:${req.ip}`;
  }
  if (req.headers.authorization) {
    return `user:${extractUserId(req.headers.authorization)}`;
  }
  return `ip:${req.ip}`;
}
```
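`extractUserId` is left undefined above. Assuming a JWT bearer token, a minimal sketch could read the `sub` claim. Note this decodes without verifying the signature, which is acceptable for choosing a rate-limit key but never a substitute for real authentication:

```typescript
// Decode the `sub` claim from a "Bearer <jwt>" header WITHOUT verifying it.
// Good enough for a rate-limit key; NOT for authentication decisions.
function extractUserId(authorization: string): string {
  const token = authorization.replace(/^Bearer\s+/i, "");
  const parts = token.split(".");
  if (parts.length !== 3) return "anonymous";
  try {
    const payload = JSON.parse(
      Buffer.from(parts[1], "base64url").toString("utf8")
    );
    return typeof payload.sub === "string" ? payload.sub : "anonymous";
  } catch {
    return "anonymous"; // malformed token — fall back to a shared bucket
  }
}
```

Falling back to `"anonymous"` means malformed tokens share one bucket, which is usually the desired failure mode for a limiter.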
Response Headers
Always include rate limit headers so clients can self-throttle:
```text
RateLimit-Limit: 100
RateLimit-Remaining: 42
RateLimit-Reset: 30        (seconds until the window resets, per the IETF draft)
Retry-After: 30            (only on 429 responses)
```
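On the client side, these headers can drive back-off before the limit is ever hit. A small helper that derives a wait time from response headers (a sketch; assumes `Retry-After` in seconds and `RateLimit-Reset` as delta-seconds per the IETF draft):

```typescript
// Compute how long a client should wait before its next request.
// Header values are strings in seconds on the wire.
function retryDelayMs(headers: {
  "retry-after"?: string;
  "ratelimit-remaining"?: string;
  "ratelimit-reset"?: string;
}): number {
  // A 429 told us exactly how long to wait
  if (headers["retry-after"]) {
    return Number(headers["retry-after"]) * 1000;
  }

  // Out of budget: wait until the window resets
  const remaining = Number(headers["ratelimit-remaining"] ?? "1");
  if (remaining <= 0 && headers["ratelimit-reset"]) {
    return Number(headers["ratelimit-reset"]) * 1000;
  }

  return 0; // budget left — no need to wait
}
```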
Avoiding False Positives
- Shared IPs: users behind corporate NAT or VPN share an IP. Rate limit by user ID when authenticated.
- CDN/proxy IPs: use `X-Forwarded-For` or `CF-Connecting-IP`, not the proxy IP.
- Legitimate batch operations: provide batch endpoints so clients do not need to make many individual requests.
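Extracting the real client IP behind a proxy is easy to get wrong: `X-Forwarded-For` can hold a comma-separated chain, and the left-most entry is client-supplied. A sketch that takes the first hop (trust it only when requests pass through a proxy you control, which overwrites or appends the header):

```typescript
// Take the first (client) hop from X-Forwarded-For; fall back to the socket IP.
// Only trust the header when your own proxy/CDN sets it.
function clientIp(
  forwardedFor: string | undefined,
  socketIp: string
): string {
  if (!forwardedFor) return socketIp;
  const first = forwardedFor.split(",")[0].trim();
  return first.length > 0 ? first : socketIp;
}
```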
Checklist
- Login endpoint rate limited (5-10 per 15 minutes per IP)
- Password reset rate limited (3 per hour)
- API endpoints rate limited (per user ID or API key when authenticated)
- 429 responses include `Retry-After` header
- Standard `RateLimit-*` headers included on all responses
- Redis or distributed store used for multi-instance deployments
- Different limits for different endpoint sensitivity levels
- IP extraction accounts for proxies (`X-Forwarded-For`)
Content is AI-assisted and reviewed by our team, but issues may be missed and best practices evolve rapidly; send corrections to [email protected]. Always consult official documentation and validate key implementation decisions before making design or security choices.
