5-Tier Rate Limiting: Protecting 23 Auth Endpoints | Boottify

Authentication endpoints are among the most heavily attacked surfaces of a web application. Sign-in pages face credential stuffing, password reset flows get abused for email bombing, and 2FA verification endpoints are brute-forced to bypass the second factor. A flat rate limit doesn't work, because sign-in needs to be much stricter than a session check. We implemented a 5-tier rate limiting system across all 23 authentication endpoints, with each tier calibrated to the sensitivity of the endpoint.

THE 5 TIERS

Each tier defines a maximum number of requests per 15-minute sliding window:

Tier	Limit	Window	Use Case
Strict	3 requests	15 min	Sign-in, forgot-password, verify-2FA
Tight	5 requests	15 min	Sign-up, reset-password, enable-2FA
Standard	10 requests	15 min	OAuth callbacks, session checks
Relaxed	20 requests	15 min	Clear-session, disable-2FA
Lenient	30 requests	15 min	Setup-2FA, WebAuthn registration options

The strict tier allows only 3 attempts — enough for a legitimate user who mistyped their password, but not enough for credential stuffing. The lenient tier is for endpoints that are called multiple times during normal flows (like polling 2FA setup status).

ENDPOINT-TO-TIER MAPPING

Each of the 23 auth endpoints is assigned to one tier. The 17 routes listed here show the mapping:

Endpoint	Tier
`/api/auth/sign-in`	strict (3/15min)
`/api/auth/forgot-password`	strict (3/15min)
`/api/auth/verify-2fa`	strict (3/15min)
`/api/auth/webauthn/authenticate/verify`	strict (3/15min)
`/api/auth/webauthn/passwordless/verify`	strict (3/15min)
`/api/auth/sign-up`	tight (5/15min)
`/api/auth/reset-password`	tight (5/15min)
`/api/auth/enable-2fa`	tight (5/15min)
`/api/auth/callback/github`	standard (10/15min)
`/api/auth/callback/google`	standard (10/15min)
`/api/auth/oauth/github`	standard (10/15min)
`/api/auth/oauth/google`	standard (10/15min)
`/api/auth/callback/[provider]`	standard (10/15min)
`/api/auth/clear-session`	relaxed (20/15min)
`/api/auth/disable-2fa`	relaxed (20/15min)
`/api/auth/setup-2fa`	lenient (30/15min)
`/api/auth/webauthn/register/verify`	lenient (30/15min)
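One way to keep this mapping in a single place is a lookup table keyed by route path. The sketch below is an assumption about how such a table could look, not the code behind the article: `endpointTiers` and `tierFor` are hypothetical names, only a handful of the routes above are included, and falling back to the strict tier for unlisted paths is an illustrative choice.

```typescript
type Tier = "strict" | "tight" | "standard" | "relaxed" | "lenient";

// Hypothetical lookup table; the paths are copied from the mapping above.
const endpointTiers = new Map<string, Tier>([
  ["/api/auth/sign-in", "strict"],
  ["/api/auth/forgot-password", "strict"],
  ["/api/auth/sign-up", "tight"],
  ["/api/auth/callback/github", "standard"],
  ["/api/auth/clear-session", "relaxed"],
  ["/api/auth/setup-2fa", "lenient"],
]);

// Assumption: an unlisted path gets the strictest tier, so a forgotten
// route ends up over-protected rather than unprotected.
function tierFor(path: string): Tier {
  return endpointTiers.get(path) ?? "strict";
}
```

With a table like this, a route handler only needs its own path to find the preset it should enforce.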

SLIDING WINDOW IMPLEMENTATION

The rate limiter uses a sliding window algorithm with in-memory IP tracking via a Map. Each IP gets an array of timestamps, and we count how many fall within the current window:

// src/lib/rate-limit.ts
interface RateLimitConfig {
  maxRequests: number;
  windowMs: number;
}

const ipRequests = new Map<string, number[]>();

export function checkRateLimit(
  ip: string,
  config: RateLimitConfig
): { allowed: boolean; remaining: number; resetAt: number } {
  const now = Date.now();
  const windowStart = now - config.windowMs;

  // Get existing timestamps for this IP, filter to current window
  const timestamps = (ipRequests.get(ip) || [])
    .filter((t) => t > windowStart);

  if (timestamps.length >= config.maxRequests) {
    // Rate limited — calculate when the earliest request expires
    const oldestInWindow = timestamps[0];
    const resetAt = oldestInWindow + config.windowMs;
    return {
      allowed: false,
      remaining: 0,
      resetAt,
    };
  }

  // Allow request, record timestamp
  timestamps.push(now);
  ipRequests.set(ip, timestamps);

  return {
    allowed: true,
    remaining: config.maxRequests - timestamps.length,
    resetAt: now + config.windowMs,
  };
}

// Cleanup stale entries every 5 minutes
setInterval(() => {
  const cutoff = Date.now() - 900000; // 15 minutes
  for (const [ip, timestamps] of ipRequests.entries()) {
    const active = timestamps.filter((t) => t > cutoff);
    if (active.length === 0) {
      ipRequests.delete(ip);
    } else {
      ipRequests.set(ip, active);
    }
  }
}, 300000);

The sliding window is more accurate than fixed windows. A fixed 15-minute window could allow 6 requests in 30 seconds if the requests span the window boundary (3 at the end of one window + 3 at the start of the next). Sliding windows prevent this burst pattern.
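The boundary effect can be reproduced with a small simulation. This is an illustrative sketch, not part of the rate limiter itself: `countAllowedFixed` and `countAllowedSliding` are hypothetical helpers that replay a list of request times against a limit of 3 per 15 minutes.

```typescript
const WINDOW_MS = 15 * 60 * 1000;
const LIMIT = 3;

// Fixed window: one counter per aligned 15-minute bucket.
function countAllowedFixed(times: number[]): number {
  const counts = new Map<number, number>();
  let allowed = 0;
  for (const t of times) {
    const bucket = Math.floor(t / WINDOW_MS);
    const used = counts.get(bucket) ?? 0;
    if (used < LIMIT) {
      counts.set(bucket, used + 1);
      allowed++;
    }
  }
  return allowed;
}

// Sliding window: only accepted requests from the last 15 minutes count.
function countAllowedSliding(times: number[]): number {
  const accepted: number[] = [];
  for (const t of times) {
    const recent = accepted.filter((a) => a > t - WINDOW_MS);
    if (recent.length < LIMIT) {
      accepted.push(t);
    }
  }
  return accepted.length;
}

// Six requests, 5 seconds apart, straddling the bucket boundary at WINDOW_MS.
const burst = [-15000, -10000, -5000, 0, 5000, 10000].map((d) => WINDOW_MS + d);

console.log(countAllowedFixed(burst));   // 6: three per bucket
console.log(countAllowedSliding(burst)); // 3: the last three are rejected
```

The six requests fall within 25 seconds of each other, yet the fixed-window counter accepts all of them because each bucket sees only three.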

PREDEFINED RATE LIMITERS

Each tier is exported as a named preset for easy use in route handlers:

export const rateLimiters = {
  strict:   { maxRequests: 3,  windowMs: 900000 },  // 3 req / 15 min
  tight:    { maxRequests: 5,  windowMs: 900000 },  // 5 req / 15 min
  standard: { maxRequests: 10, windowMs: 900000 },  // 10 req / 15 min
  relaxed:  { maxRequests: 20, windowMs: 900000 },  // 20 req / 15 min
  lenient:  { maxRequests: 30, windowMs: 900000 },  // 30 req / 15 min
};
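Since the `ipRequests` map in the earlier listing is keyed by IP alone, requests to a lenient endpoint and a strict endpoint from the same address would share one timestamp array. One possible way to keep the tiers independent is to include the tier name in the key. The following is a sketch under that assumption: `bucketKey`, `hit`, and the local `presets` copy are hypothetical, and the current time is passed in as an argument so the behaviour is easy to test.

```typescript
type TierName = "strict" | "tight" | "standard" | "relaxed" | "lenient";

interface Preset {
  maxRequests: number;
  windowMs: number;
}

const WINDOW = 15 * 60 * 1000;

// Local copy of the presets above, typed so the tier name can be used as a key.
const presets: Record<TierName, Preset> = {
  strict: { maxRequests: 3, windowMs: WINDOW },
  tight: { maxRequests: 5, windowMs: WINDOW },
  standard: { maxRequests: 10, windowMs: WINDOW },
  relaxed: { maxRequests: 20, windowMs: WINDOW },
  lenient: { maxRequests: 30, windowMs: WINDOW },
};

const hits = new Map<string, number[]>();

// Hypothetical key format: one bucket per (tier, IP) pair.
function bucketKey(tier: TierName, ip: string): string {
  return `${tier}:${ip}`;
}

// Returns true if the request is allowed and records it; false if blocked.
function hit(tier: TierName, ip: string, now: number): boolean {
  const { maxRequests, windowMs } = presets[tier];
  const key = bucketKey(tier, ip);
  const recent = (hits.get(key) ?? []).filter((t) => t > now - windowMs);
  if (recent.length >= maxRequests) {
    hits.set(key, recent);
    return false;
  }
  recent.push(now);
  hits.set(key, recent);
  return true;
}
```

With this keying, three failed sign-ins exhaust only the strict bucket for that address; a later call to a lenient endpoint from the same IP is counted separately.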

RESPONSE HEADERS

Responses from rate-limited endpoints carry headers that tell clients their current quota:

import { NextResponse } from "next/server";

function addRateLimitHeaders(
  response: NextResponse,
  result: { allowed: boolean; remaining: number; resetAt: number },
  config: RateLimitConfig
): NextResponse {
  response.headers.set("X-RateLimit-Limit", String(config.maxRequests));
  response.headers.set("X-RateLimit-Remaining", String(result.remaining));
  // resetAt is in milliseconds; the header carries Unix seconds
  response.headers.set("X-RateLimit-Reset", String(Math.ceil(result.resetAt / 1000)));

  // Retry-After belongs on blocked (429) responses only. Checking
  // `remaining === 0` would also match the last allowed request in a window.
  if (!result.allowed) {
    const retryAfter = Math.max(0, Math.ceil((result.resetAt - Date.now()) / 1000));
    response.headers.set("Retry-After", String(retryAfter));
  }

  return response;
}

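The X-RateLimit-* names are a widely used convention rather than a formal standard (the IETF draft on rate-limit header fields defines unprefixed RateLimit fields instead), and Retry-After is a standard HTTP header: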

X-RateLimit-Limit — maximum requests allowed in the window
X-RateLimit-Remaining — requests remaining in the current window
X-RateLimit-Reset — Unix timestamp when the window resets
Retry-After — seconds until the client should retry (only on 429 responses)
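To make the header values concrete, here is a small sketch that computes them as a plain object for a sample result. `rateLimitHeaderValues` is a hypothetical helper written for this illustration (it does not depend on Next.js), and the timestamps are made-up sample values.

```typescript
interface LimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number; // milliseconds since the Unix epoch
}

// Builds the header name/value pairs described above.
function rateLimitHeaderValues(
  result: LimitResult,
  limit: number,
  now: number
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(result.remaining),
    "X-RateLimit-Reset": String(Math.ceil(result.resetAt / 1000)),
  };
  // Retry-After is added only for blocked requests.
  if (!result.allowed) {
    const seconds = Math.max(0, Math.ceil((result.resetAt - now) / 1000));
    headers["Retry-After"] = String(seconds);
  }
  return headers;
}

const sampleNow = 1700000000000;

// A blocked request whose oldest timestamp expires in 90.5 seconds.
const blocked = rateLimitHeaderValues(
  { allowed: false, remaining: 0, resetAt: sampleNow + 90500 },
  3,
  sampleNow
);

// The last allowed request in a window: quota is 0, but no Retry-After.
const lastAllowed = rateLimitHeaderValues(
  { allowed: true, remaining: 0, resetAt: sampleNow + 900000 },
  3,
  sampleNow
);
```

For the blocked sample, `X-RateLimit-Reset` is `1700000091` and `Retry-After` is `91`, since both values are rounded up to whole seconds.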

THE 429 RESPONSE

When rate limited, the API returns a clean 429 with the ApiError.tooManyRequests() factory:

// src/lib/api-error-handler.ts
export class ApiError extends Error {
  constructor(
    public status: number,
    message: string
  ) {
    super(message);
  }

  static tooManyRequests(retryAfter?: number): ApiError {
    const msg = retryAfter
      ? `Too many requests. Try again in ${retryAfter} seconds.`
      : "Too many requests. Please try again later.";
    return new ApiError(429, msg);
  }
}

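// Usage in route handler (NextRequest is imported from "next/server"):
export async function POST(req: NextRequest) {
  // getClientIP (defined in the IP extraction section) returns a single
  // address, so a comma-separated chain is never used as the map key.
  const ip = getClientIP(req);
  const result = checkRateLimit(ip, rateLimiters.strict);

  if (!result.allowed) {
    // Clamp to at least 1 second so the message always includes a wait time.
    const retryAfter = Math.max(1, Math.ceil((result.resetAt - Date.now()) / 1000));
    throw ApiError.tooManyRequests(retryAfter);
  }

  // ... handle sign-in logic
}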

The response body has the shape { "error": "Too many requests. Try again in X seconds." }, so frontend code can display the message directly.
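The code that catches the thrown error and serializes it is not shown in the article. As a rough sketch of what that step might look like, the hypothetical `toErrorPayload` below maps an error to a status code and a JSON body of the shape described above; `HttpError` is a stand-in for `ApiError` so the example is self-contained, and the generic 500 fallback is an assumption.

```typescript
// Stand-in for ApiError so this example runs on its own.
class HttpError extends Error {
  readonly status: number;

  constructor(status: number, message: string) {
    super(message);
    this.status = status;
    // Keeps `instanceof` working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface ErrorPayload {
  status: number;
  body: { error: string };
}

// Hypothetical mapping from a thrown value to the response status and body.
function toErrorPayload(err: unknown): ErrorPayload {
  if (err instanceof HttpError) {
    return { status: err.status, body: { error: err.message } };
  }
  // Assumption: anything unexpected becomes a generic 500.
  return { status: 500, body: { error: "Internal server error" } };
}

const payload = toErrorPayload(
  new HttpError(429, "Too many requests. Try again in 42 seconds.")
);
console.log(JSON.stringify(payload.body));
```

A wrapper around each route handler could call a function like this in its `catch` block and pass the result to the framework's JSON response helper.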

IP EXTRACTION FROM REVERSE PROXY

Behind Nginx, the client IP comes from the X-Forwarded-For header, not the socket address. We extract the first IP in the chain (the original client):

function getClientIP(req: NextRequest): string {
  const forwarded = req.headers.get("x-forwarded-for");
  if (forwarded) {
    // Take the first IP (original client) from the chain
    return forwarded.split(",")[0].trim();
  }
  return req.headers.get("x-real-ip") || "unknown";
}

Nginx is configured with proxy_set_header X-Forwarded-For $remote_addr; (not $proxy_add_x_forwarded_for) to prevent client-controlled header injection.
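The parsing rule can be exercised without a running server. The sketch below mirrors the logic of `getClientIP` against a minimal header reader; `HeaderReader`, `firstForwardedIp`, and `fakeHeaders` are hypothetical names introduced for this example, and the addresses come from documentation ranges.

```typescript
// Minimal shape shared by the Fetch API Headers object and the fake below.
interface HeaderReader {
  get(name: string): string | null;
}

// Same rule as getClientIP: first X-Forwarded-For entry, then X-Real-IP.
function firstForwardedIp(headers: HeaderReader): string {
  const forwarded = headers.get("x-forwarded-for");
  const first = forwarded?.split(",")[0]?.trim();
  if (first) {
    return first;
  }
  return headers.get("x-real-ip")?.trim() || "unknown";
}

// Test double: header names are looked up in lower case.
function fakeHeaders(values: Record<string, string>): HeaderReader {
  return { get: (name) => values[name.toLowerCase()] ?? null };
}

// With the Nginx setting above, the header holds exactly one address.
const single = firstForwardedIp(fakeHeaders({ "x-forwarded-for": "203.0.113.7" }));

// If a proxy appended instead, only the first entry would be used.
const chained = firstForwardedIp(
  fakeHeaders({ "x-forwarded-for": "203.0.113.7, 10.0.0.1" })
);

// No forwarding headers at all: every such request shares the "unknown" key.
const missing = firstForwardedIp(fakeHeaders({}));
```

The last case is worth keeping in mind: requests that reach the application without either header are all counted against the same `"unknown"` bucket.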

E2E TEST INTEGRATION

Rate limiting creates a problem for E2E tests — running the test suite would exhaust the strict tier after 3 sign-in attempts. Our solution: bypass form-based login entirely using DB session injection. Test sessions are created directly in the database by the seed script, and the session cookie is set programmatically. The rate limiter never sees these "logins" because they never hit the sign-in endpoint.

THE RESULTS

5 rate limit tiers calibrated to endpoint sensitivity
23 auth endpoints protected (100% coverage)
Sliding window algorithm — prevents burst attacks across window boundaries
Standard response headers — clients know their remaining quota
In-memory tracking with automatic cleanup — no Redis dependency for rate limiting
ApiError.tooManyRequests() — clean 429 responses with retry guidance

Rate limiting is one of those features that's invisible when it works and catastrophic when it's missing. A single brute-force attack can lock out legitimate users, consume server resources, and expose credentials. Five tiers with 15-minute sliding windows give us granular control over the tradeoff between security and usability for each endpoint.

