barsa 90ab71b94d Update README.md to Enhance Documentation Clarity and Add New Sections

- Added a new section for Release Procedures, detailing deployment and rollback processes.
- Updated the System Operations section to include Monitoring Setup, Rate Limit Tuning, and Customer Data Management for improved operational guidance.
- Reformatted the table structure for better readability and consistency across documentation.

2025-12-23 16:08:15 +09:00

Rate Limit Tuning Guide

This document covers rate limiting configuration, adjustment procedures, and troubleshooting for the Customer Portal.

Rate Limiting Overview

The portal uses multiple rate limiting mechanisms:

Type	Scope	Backend	Purpose
Auth Rate Limiting	Per endpoint (login, signup, etc.)	Redis	Prevent brute force attacks
Global Rate Limiting	Per route/controller	Redis	API abuse prevention
Request Queues	Per external API	In-memory (p-queue)	External API protection
SSE Connection Limits	Per user	In-memory	Resource protection

Authentication Rate Limits

Configuration

Endpoint	Env Variable	Default	Window
Login	`LOGIN_RATE_LIMIT_LIMIT`	5 attempts	15 min
Login (TTL)	`LOGIN_RATE_LIMIT_TTL`	900000 ms	-
Signup	`SIGNUP_RATE_LIMIT_LIMIT`	5 attempts	15 min
Signup (TTL)	`SIGNUP_RATE_LIMIT_TTL`	900000 ms	-
Password Reset	`PASSWORD_RESET_RATE_LIMIT_LIMIT`	5 attempts	15 min
Password Reset (TTL)	`PASSWORD_RESET_RATE_LIMIT_TTL`	900000 ms	-
Token Refresh	`AUTH_REFRESH_RATE_LIMIT_LIMIT`	10 attempts	5 min
Token Refresh (TTL)	`AUTH_REFRESH_RATE_LIMIT_TTL`	300000 ms	-
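As a rough sketch of how these variables could be read with the defaults from the table: `readAuthRateLimit` and `AUTH_DEFAULTS` are hypothetical names, not part of the portal's code, and the sketch assumes every value is a plain positive integer.

```typescript
// Hypothetical helper; the portal's real config loader may differ.
interface AuthRateLimit {
  limit: number; // max attempts per window
  ttlMs: number; // window length in milliseconds
}

// Defaults copied from the table above, keyed by env variable prefix.
const AUTH_DEFAULTS: Record<string, AuthRateLimit> = {
  LOGIN: { limit: 5, ttlMs: 900000 },
  SIGNUP: { limit: 5, ttlMs: 900000 },
  PASSWORD_RESET: { limit: 5, ttlMs: 900000 },
  AUTH_REFRESH: { limit: 10, ttlMs: 300000 },
};

// Falls back to the default when the variable is missing or not a positive integer.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

function readAuthRateLimit(
  env: Record<string, string | undefined>,
  prefix: string,
): AuthRateLimit {
  const defaults = AUTH_DEFAULTS[prefix];
  if (!defaults) throw new Error(`Unknown rate limit prefix: ${prefix}`);
  return {
    limit: parsePositiveInt(env[`${prefix}_RATE_LIMIT_LIMIT`], defaults.limit),
    ttlMs: parsePositiveInt(env[`${prefix}_RATE_LIMIT_TTL`], defaults.ttlMs),
  };
}
```

For example, `readAuthRateLimit(process.env, "LOGIN")` would yield the login limit and window, falling back to 5 attempts per 900000 ms.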

CAPTCHA Configuration

Setting	Env Variable	Default	Description
CAPTCHA Threshold	`LOGIN_CAPTCHA_AFTER_ATTEMPTS`	3	Show CAPTCHA after N failed attempts
CAPTCHA Always On	`AUTH_CAPTCHA_ALWAYS_ON`	false	Require CAPTCHA for all logins
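The two settings combine into a single decision. A minimal sketch, assuming the CAPTCHA appears once the failed-attempt count reaches the threshold (the guide does not say whether the comparison is inclusive); `requiresCaptcha` and `CaptchaConfig` are illustrative names:

```typescript
interface CaptchaConfig {
  afterAttempts: number; // LOGIN_CAPTCHA_AFTER_ATTEMPTS
  alwaysOn: boolean; // AUTH_CAPTCHA_ALWAYS_ON
}

function requiresCaptcha(failedAttempts: number, config: CaptchaConfig): boolean {
  // Always-on mode overrides the threshold entirely.
  if (config.alwaysOn) return true;
  // Assumption: the threshold is inclusive (N failures means CAPTCHA on the next attempt).
  return failedAttempts >= config.afterAttempts;
}
```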

Adjusting Auth Rate Limits

In Production (requires restart):

# Edit .env file
LOGIN_RATE_LIMIT_LIMIT=10        # Increase to 10 attempts
LOGIN_RATE_LIMIT_TTL=1800000     # Extend window to 30 minutes

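# Restart backend
# (if Compose injects .env via env_file/environment, `restart` keeps the old
# values; recreate the container with `docker compose up -d backend` instead)
docker compose restart backend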

Temporary Reset via Redis (immediate, no restart). This clears the counter for one key; it does not raise the configured limit:

# Check current rate limit for a key
redis-cli GET "auth-login:<ip-hash>"

# Delete a rate limit record to allow immediate retry
redis-cli DEL "auth-login:<ip-hash>"
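Deleting a key allows an immediate retry because the limiter's state for that client lives entirely under that key. The guide does not show the portal's actual algorithm; the sketch below assumes a simple fixed-window counter and uses an in-memory `Map` as a stand-in for Redis. `FixedWindowLimiter` is a hypothetical name.

```typescript
// In-memory stand-in for Redis; illustrative only.
interface WindowEntry {
  count: number;
  expiresAt: number; // epoch ms when the window ends
}

class FixedWindowLimiter {
  private readonly entries = new Map<string, WindowEntry>();
  private readonly limit: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(limit: number, ttlMs: number, now: () => number = Date.now) {
    this.limit = limit;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns true if the attempt is allowed, false once the limit is exceeded.
  consume(key: string): boolean {
    const t = this.now();
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= t) {
      // First attempt in a new window: start counting and set the expiry.
      this.entries.set(key, { count: 1, expiresAt: t + this.ttlMs });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }

  // Same effect as `redis-cli DEL <key>`: the next attempt starts a fresh window.
  reset(key: string): void {
    this.entries.delete(key);
  }
}
```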

Global API Rate Limits

Configuration

Global rate limits are applied via the @RateLimit decorator:

@RateLimit({ limit: 100, ttl: 60 })  // 100 requests per minute (ttl is in seconds here, unlike the *_TTL env variables, which are in ms)
@Controller('invoices')
export class InvoicesController { ... }
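The implementation of `@RateLimit` is not shown in this guide. One common way to build such a decorator is to record the options per controller class so a guard can look them up at request time. The sketch below is an assumption along those lines, with a hypothetical `rateLimitRegistry` and `getRateLimit`:

```typescript
// Hypothetical sketch; not the portal's actual decorator.
interface RateLimitOptions {
  limit: number; // max requests per window
  ttl: number; // window length in seconds
}

type ControllerClass = new (...args: any[]) => object;

// Options are stored per controller class so a guard can read them later.
const rateLimitRegistry = new WeakMap<ControllerClass, RateLimitOptions>();

function RateLimit(options: RateLimitOptions) {
  return (target: ControllerClass): void => {
    rateLimitRegistry.set(target, options);
  };
}

function getRateLimit(target: ControllerClass): RateLimitOptions | undefined {
  return rateLimitRegistry.get(target);
}

class InvoicesController {}

// Applied as a plain function call so the sketch runs without decorator compiler
// flags; writing `@RateLimit({ ... })` above the class has roughly the same effect.
RateLimit({ limit: 100, ttl: 60 })(InvoicesController);
```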

Common Rate Limit Settings

Endpoint	Limit	TTL	Notes
Invoices	100	60s	High-traffic endpoint
Subscriptions	100	60s	High-traffic endpoint
Catalog	200	60s	Cached, higher limit
Orders	50	60s	Write operations
Profile	60	60s	Standard limit

Adjusting Global Rate Limits

Global rate limits are defined in code. To adjust:

1. Modify the @RateLimit decorator in the controller
2. Deploy the change

// Before
@RateLimit({ limit: 50, ttl: 60 })

// After (double the limit)
@RateLimit({ limit: 100, ttl: 60 })

External API Request Queues

WHMCS Queue Configuration

Setting	Env Variable	Default	Description
Concurrency	`WHMCS_QUEUE_CONCURRENCY`	15	Max parallel requests
Interval Cap	`WHMCS_QUEUE_INTERVAL_CAP`	300	Max requests per minute
Timeout	`WHMCS_QUEUE_TIMEOUT_MS`	30000	Request timeout (ms)

Salesforce Queue Configuration

Setting	Env Variable	Default	Description
Standard Concurrency	`SF_QUEUE_CONCURRENCY`	10	Standard operations
Long-Running Concurrency	`SF_LONG_RUNNING_CONCURRENCY`	5	Bulk operations
Interval Cap	`SF_QUEUE_INTERVAL_CAP`	200	Max requests per minute
Timeout	`SF_QUEUE_TIMEOUT_MS`	30000	Request timeout (ms)
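These settings map naturally onto p-queue's constructor options (`concurrency`, `intervalCap`, `interval`, `timeout`). The sketch below builds that options object from the environment. `queueOptions` and `QUEUE_DEFAULTS` are hypothetical names, and the 60000 ms interval is an assumption derived from the "per minute" description in the tables.

```typescript
// Subset of p-queue's constructor options used by this sketch.
interface QueueOptions {
  concurrency: number; // max parallel requests
  intervalCap: number; // max requests started per interval
  interval: number; // interval length in ms
  timeout: number; // per-request timeout in ms
}

// Defaults copied from the WHMCS and Salesforce tables above.
const QUEUE_DEFAULTS: Record<string, { concurrency: number; intervalCap: number; timeout: number }> = {
  WHMCS: { concurrency: 15, intervalCap: 300, timeout: 30000 },
  SF: { concurrency: 10, intervalCap: 200, timeout: 30000 },
};

function envInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

function queueOptions(env: Record<string, string | undefined>, prefix: string): QueueOptions {
  const d = QUEUE_DEFAULTS[prefix];
  if (!d) throw new Error(`Unknown queue prefix: ${prefix}`);
  return {
    concurrency: envInt(env[`${prefix}_QUEUE_CONCURRENCY`], d.concurrency),
    intervalCap: envInt(env[`${prefix}_QUEUE_INTERVAL_CAP`], d.intervalCap),
    interval: 60000, // assumption: the cap is counted per minute
    timeout: envInt(env[`${prefix}_QUEUE_TIMEOUT_MS`], d.timeout),
  };
}
```

With the p-queue package installed, the result could be passed straight to the constructor, for example `new PQueue(queueOptions(process.env, "WHMCS"))`.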

Adjusting Queue Limits

Production Adjustment:

# Edit .env file
WHMCS_QUEUE_CONCURRENCY=20      # Increase concurrent requests
WHMCS_QUEUE_INTERVAL_CAP=500    # Increase requests per minute

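# Restart backend (if Compose injects .env, use `docker compose up -d backend`, since `restart` keeps the old environment)
docker compose restart backend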

Queue Health Monitoring

# Check queue metrics
curl http://localhost:4000/health/queues | jq '.'

# Example output (values will vary):
{
  "whmcs": {
    "health": "healthy",
    "metrics": {
      "queueSize": 0,
      "pendingRequests": 2,
      "failedRequests": 0
    }
  },
  "salesforce": {
    "health": "healthy",
    "metrics": { ... },
    "dailyUsage": { "used": 5000, "limit": 15000 }
  }
}
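For automated checks, the response can be reduced to a short list of findings. This is a minimal sketch with types inferred only from the sample output above; `summarizeQueueHealth` is a hypothetical helper, and the real endpoint may return more fields.

```typescript
// Types inferred from the sample output; fields elided there are omitted here.
interface QueueHealth {
  health: string;
  dailyUsage?: { used: number; limit: number };
}

type QueueHealthResponse = Record<string, QueueHealth>;

function summarizeQueueHealth(response: QueueHealthResponse): string[] {
  const findings: string[] = [];
  for (const [name, queue] of Object.entries(response)) {
    // Report any queue whose status is not "healthy".
    if (queue.health !== "healthy") findings.push(`${name}: ${queue.health}`);
    // Report daily quota consumption where the endpoint exposes it.
    if (queue.dailyUsage && queue.dailyUsage.limit > 0) {
      const pct = Math.round((queue.dailyUsage.used / queue.dailyUsage.limit) * 100);
      findings.push(`${name}: ${pct}% of daily API quota used`);
    }
  }
  return findings;
}
```

The input would typically come from `fetch("http://localhost:4000/health/queues")` followed by `response.json()`.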

SSE Connection Limits

Configuration

// Per-user SSE connection limit (in-memory)
private readonly maxPerUser = 3;

This prevents a single user from opening unlimited SSE connections.
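A per-user limit like this is usually a counter that goes up when a stream opens and down when it closes. The sketch below illustrates the idea; it is not the contents of realtime-connection-limiter.service.ts, and `ConnectionLimiter` is a hypothetical name.

```typescript
class ConnectionLimiter {
  private readonly counts = new Map<string, number>();
  private readonly maxPerUser: number;

  constructor(maxPerUser = 3) {
    this.maxPerUser = maxPerUser;
  }

  // Reserves a slot and returns true if the user is under the limit.
  acquire(userId: string): boolean {
    const current = this.counts.get(userId) ?? 0;
    if (current >= this.maxPerUser) return false;
    this.counts.set(userId, current + 1);
    return true;
  }

  // Must be called when the SSE stream closes, otherwise the slot leaks.
  release(userId: string): void {
    const current = this.counts.get(userId) ?? 0;
    if (current <= 1) this.counts.delete(userId);
    else this.counts.set(userId, current - 1);
  }
}
```

Because the counts live in process memory, each backend instance enforces the limit independently, and a restart resets all counters.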

Adjusting SSE Limits

This requires a code change in realtime-connection-limiter.service.ts:

// Change from
private readonly maxPerUser = 3;

// To
private readonly maxPerUser = 5;

Bypassing Rate Limits for Testing

Temporary Bypass via Redis

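# Clear all rate limit keys for testing
# (--scan iterates incrementally; KEYS blocks Redis while it walks the whole keyspace)
redis-cli --scan --pattern "auth-*" | xargs -r redis-cli DEL
redis-cli --scan --pattern "rate-limit:*" | xargs -r redis-cli DEL

# Clear specific user's rate limit
redis-cli --scan --pattern "*<ip-or-user-identifier>*" | xargs -r redis-cli DEL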

Using SkipRateLimit Decorator

For development/testing routes:

@SkipRateLimit()
@Get('test-endpoint')
async testEndpoint() { ... }

Environment-Based Bypass

Add a development bypass in configuration:

# In .env (development only!)
RATE_LIMIT_BYPASS_ENABLED=true

// In guard
if (this.configService.get("RATE_LIMIT_BYPASS_ENABLED") === "true") {
  return true;
}

Warning: Never enable bypass in production!
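One way to make that rule harder to break is to have the check itself ignore the flag in production. The `NODE_ENV` test below is an added safeguard and an assumption, not something the original guard is shown to do; `isRateLimitBypassed` is a hypothetical helper.

```typescript
function isRateLimitBypassed(env: Record<string, string | undefined>): boolean {
  // Added safeguard (assumption): the flag has no effect in production.
  if (env.NODE_ENV === "production") return false;
  return env.RATE_LIMIT_BYPASS_ENABLED === "true";
}
```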

Signs of Rate Limit Issues

User-Facing Symptoms

Symptom	Possible Cause	Investigation
"Too many requests" errors	Rate limit exceeded	Check Redis keys, logs
Login failures	Auth rate limit	Check `auth-login:*` keys
Slow API responses	Queue backlog	Check `/health/queues`
429 errors in logs	Any rate limit	Check logs for specifics

Monitoring Indicators

Metric	Warning	Critical	Action
429 error rate	>1%	>5%	Review rate limits
Queue size	>10	>50	Increase concurrency
Average wait time	>1s	>5s	Scale or increase limits
CAPTCHA triggers	Unusual spike	-	Possible attack

Log Analysis

# Find rate limit exceeded events
grep "Rate limit exceeded" /var/log/bff/combined.log | tail -20

# Find 429 responses
grep '"statusCode":429' /var/log/bff/combined.log | tail -20

# Count rate limit events by path
grep "Rate limit exceeded" /var/log/bff/combined.log | \
  jq -r '.path' | sort | uniq -c | sort -rn
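The same per-path count can be produced in TypeScript, for example inside a monitoring job. Like the `jq` command, this sketch assumes each log line is a JSON object with a `path` field; `countRateLimitEventsByPath` is a hypothetical helper.

```typescript
// Returns [path, count] pairs sorted by count, highest first.
function countRateLimitEventsByPath(logText: string): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const line of logText.split("\n")) {
    if (!line.includes("Rate limit exceeded")) continue;
    let path: unknown;
    try {
      path = JSON.parse(line).path;
    } catch {
      continue; // skip lines that are not valid JSON objects
    }
    if (typeof path !== "string") continue;
    counts.set(path, (counts.get(path) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}
```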

Troubleshooting

Too Many 429 Errors

Diagnosis:

# Check which endpoints are rate limited
grep "Rate limit exceeded" /var/log/bff/combined.log | \
  jq '{path: .path, key: .key}' | head -20

# Check queue health
curl http://localhost:4000/health/queues

Resolution:

1. Identify the affected endpoint
2. Check if limit is appropriate for traffic
3. Increase limit if legitimate traffic
4. Add caching if requests are repetitive

Legitimate Users Being Blocked

Diagnosis:

# Check rate limit state for specific key
redis-cli --scan --pattern "*<identifier>*"
redis-cli GET "auth-login:<hash>"

Resolution:

# Clear the user's rate limit record
redis-cli DEL "auth-login:<hash>"

External API Rate Limit Violations

WHMCS Rate Limiting:

# Check queue metrics
curl http://localhost:4000/health/queues/whmcs

# Reduce concurrency if WHMCS is overloaded (set in .env, then restart backend)
WHMCS_QUEUE_CONCURRENCY=5
WHMCS_QUEUE_INTERVAL_CAP=100

Salesforce API Limits:

# Check daily API usage
curl http://localhost:4000/health/queues/salesforce | jq '.dailyUsage'

# If approaching limit, reduce requests
# Consider caching more data

Redis Connection Issues

If rate limiting fails due to Redis:

# Check Redis connectivity
redis-cli PING

# The guard fails open on Redis errors (allows request)
# Check logs for "Rate limiter error - failing open"
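Fail-open behavior amounts to treating any limiter error as "allowed". A minimal sketch of the pattern, not the portal's guard code; `checkWithFailOpen` and its parameters are hypothetical:

```typescript
// Runs a rate limit check and allows the request if the check itself fails.
async function checkWithFailOpen(
  check: () => Promise<boolean>,
  onError: (error: unknown) => void = () => {},
): Promise<boolean> {
  try {
    return await check();
  } catch (error) {
    // For example, log "Rate limiter error - failing open" here.
    onError(error);
    return true;
  }
}
```

The trade-off is availability over protection: during a Redis outage no request is rate limited, so the log message is worth alerting on.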

Best Practices

Setting Rate Limits

Start Conservative - Begin with lower limits, increase as needed
Monitor Before Adjusting - Understand traffic patterns first
Consider User Experience - Limits should rarely impact normal use
Document Changes - Track why limits were adjusted

Rate Limit Strategies

Strategy	Use Case	Implementation
IP-based	Anonymous endpoints	Default behavior
User-based	Authenticated endpoints	Include user ID in key
Combined	Sensitive endpoints	IP + User-Agent hash
Tiered	Different user classes	Custom logic
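The first three strategies differ only in how the Redis key is built. A sketch under assumptions: the `scope:hash` layout mirrors the `auth-login:<ip-hash>` keys shown earlier, but SHA-256 and the exact key formats are illustrative choices, and all function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

function sha256Hex(input: string): string {
  return createHash("sha256").update(input).digest("hex");
}

// IP-based: for anonymous endpoints.
function ipKey(scope: string, ip: string): string {
  return `${scope}:${sha256Hex(ip)}`;
}

// User-based: for authenticated endpoints.
function userKey(scope: string, userId: string): string {
  return `${scope}:user:${userId}`;
}

// Combined: IP plus User-Agent, for sensitive endpoints.
function combinedKey(scope: string, ip: string, userAgent: string): string {
  return `${scope}:${sha256Hex(`${ip}|${userAgent}`)}`;
}
```

Hashing keeps raw IP addresses out of Redis, while still giving each client a stable key.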

Performance Considerations

Redis Latency - Keep Redis co-located with BFF
Key Expiration - Use TTL to prevent Redis bloat
Fail Open - Rate limiter allows requests if Redis fails
Logging - Log blocked requests for analysis

Rate Limit Response Headers

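The BFF includes conventional rate limit headers. The `X-RateLimit-*` names are a widely used convention rather than a formal standard; `Retry-After` is a standard HTTP header, given here in seconds: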

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1704110400
Retry-After: 60

Clients can use these to implement backoff.
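A client-side sketch of that backoff: prefer `Retry-After`, then fall back to `X-RateLimit-Reset`. It assumes header names are lower-cased, that `Retry-After` is given in seconds (the HTTP-date form is not handled), and that `X-RateLimit-Reset` is a Unix timestamp in seconds, as the sample value suggests. `retryDelayMs` is a hypothetical helper.

```typescript
function parseSeconds(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isFinite(n) && n >= 0 ? n : undefined;
}

// Returns how long to wait before retrying, or undefined if no header helps.
function retryDelayMs(
  headers: Record<string, string | undefined>,
  nowMs: number = Date.now(),
): number | undefined {
  const retryAfter = parseSeconds(headers["retry-after"]);
  if (retryAfter !== undefined) return retryAfter * 1000;

  // Assumption: the reset value is epoch seconds.
  const reset = parseSeconds(headers["x-ratelimit-reset"]);
  if (reset !== undefined) return Math.max(0, reset * 1000 - nowMs);

  return undefined;
}
```

A caller would wait for the returned delay after a 429 response, and fall back to its own exponential backoff when the result is `undefined`.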

Last Updated: December 2025
