Redis-Required Token Flow Implementation Summary
Overview
This document summarizes the implementation of the Redis-required token flow with maintenance response, Salesforce auth timeout and logging improvements, queue throttling threshold updates, per-user refresh token sets, and migration utilities.
✅ Completed Features
1. Redis-Required Token Flow with Maintenance Response
Environment Variables Added:
- `AUTH_REQUIRE_REDIS_FOR_TOKENS`: When enabled, tokens require Redis to be available
- `AUTH_MAINTENANCE_MODE`: Enables maintenance mode for the authentication service
- `AUTH_MAINTENANCE_MESSAGE`: Customizable maintenance message
Implementation:
- Added `checkServiceAvailability()` method in `AuthTokenService`
- Strict Redis requirement enforcement when the flag is enabled
- Graceful maintenance mode with custom messaging
- Production-safe error handling
Files Modified:
- `apps/bff/src/core/config/env.validation.ts`
- `apps/bff/src/modules/auth/services/token.service.ts`
- `env/portal-backend.env.sample`
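The availability check described above can be sketched roughly as follows. This is a minimal illustration, not the actual `AuthTokenService` code: the config shape, the `Availability` result type, and the `redisReady` parameter are hypothetical, and the real method may throw an exception rather than return a result.

```typescript
// Sketch only: the config shape, result type, and redisReady flag are
// hypothetical; the real AuthTokenService may be structured differently.
interface AuthAvailabilityConfig {
  requireRedisForTokens: boolean; // AUTH_REQUIRE_REDIS_FOR_TOKENS
  maintenanceMode: boolean; // AUTH_MAINTENANCE_MODE
  maintenanceMessage: string; // AUTH_MAINTENANCE_MESSAGE
}

type Availability =
  | { available: true }
  | { available: false; reason: "maintenance" | "redis_unavailable"; message: string };

function checkServiceAvailability(
  config: AuthAvailabilityConfig,
  redisReady: boolean,
): Availability {
  // Maintenance mode is checked first so operators can always show their message.
  if (config.maintenanceMode) {
    return { available: false, reason: "maintenance", message: config.maintenanceMessage };
  }
  // Strict mode: refuse token operations while Redis is unreachable.
  if (config.requireRedisForTokens && !redisReady) {
    return {
      available: false,
      reason: "redis_unavailable",
      message: "Redis required for token operations but not available",
    };
  }
  // Flag disabled or Redis healthy: token operations may proceed.
  return { available: true };
}
```

A caller would typically run this check before issuing or refreshing a token and map an unavailable result to an HTTP 503 response.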
2. Salesforce Auth Timeout + Logging
Environment Variables Added:
- `SF_AUTH_TIMEOUT_MS`: Configurable authentication timeout (default: 30s)
- `SF_TOKEN_TTL_MS`: Token time-to-live (default: 12 minutes)
- `SF_TOKEN_REFRESH_BUFFER_MS`: Refresh buffer time (default: 1 minute)
Implementation:
- Added timeout handling with `AbortController`
- Enhanced logging with timing information and error details
- Production-safe logging (sensitive data redacted)
- Re-authentication attempt logging with duration tracking
- Session expiration detection and automatic retry
Files Modified:
- `apps/bff/src/integrations/salesforce/services/salesforce-connection.service.ts`
- `apps/bff/src/core/config/env.validation.ts`
- `env/portal-backend.env.sample`
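As an illustration of the `AbortController` timeout with duration tracking, here is a minimal sketch. The helper name `withAuthTimeout` and its signature are assumptions made for this example and are not taken from `salesforce-connection.service.ts`.

```typescript
// Sketch only: withAuthTimeout and its parameters are hypothetical names.
async function withAuthTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number, // e.g. the value of SF_AUTH_TIMEOUT_MS
): Promise<{ result: T; durationMs: number }> {
  const controller = new AbortController();
  const startedAt = Date.now();
  // Abort the in-flight operation once the timeout elapses.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const result = await operation(controller.signal);
    return { result, durationMs: Date.now() - startedAt };
  } catch (error) {
    // Distinguish our own timeout from other failures of the operation.
    if (controller.signal.aborted) {
      throw new Error(`Salesforce authentication timeout after ${timeoutMs}ms`);
    }
    throw error;
  } finally {
    // Always clear the timer so it cannot fire after completion.
    clearTimeout(timer);
  }
}
```

The operation receives the signal so it can pass it on to an abortable call (for example the `signal` option of `fetch`); the returned `durationMs` is the kind of timing information that could be written to the logs.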
3. Queue Throttling Thresholds (Configurable)
Environment Variables Added:
- `WHMCS_QUEUE_CONCURRENCY`: WHMCS concurrent requests (default: 15)
- `WHMCS_QUEUE_INTERVAL_CAP`: WHMCS requests per minute (default: 300)
- `WHMCS_QUEUE_TIMEOUT_MS`: WHMCS request timeout (default: 30s)
- `SF_QUEUE_CONCURRENCY`: Salesforce concurrent requests (default: 15)
- `SF_QUEUE_LONG_RUNNING_CONCURRENCY`: SF long-running requests (default: 22)
- `SF_QUEUE_INTERVAL_CAP`: SF requests per minute (default: 600)
- `SF_QUEUE_TIMEOUT_MS`: SF request timeout (default: 30s)
- `SF_QUEUE_LONG_RUNNING_TIMEOUT_MS`: SF long-running timeout (default: 10 minutes)
Implementation:
- Made all queue thresholds configurable via environment variables
- Maintained optimized default values (15 concurrent requests; 300 to 600 requests per minute, i.e. 5 to 10 RPS)
- Enhanced logging with actual configuration values
Files Modified:
- `apps/bff/src/core/queue/services/whmcs-request-queue.service.ts`
- `apps/bff/src/core/queue/services/salesforce-request-queue.service.ts`
- `apps/bff/src/core/config/env.validation.ts`
- `env/portal-backend.env.sample`
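To illustrate how the queue thresholds could be read from the environment with the documented defaults, here is a small sketch. The helper names (`readPositiveInt`, `loadQueueConfig`) and the returned object shape are hypothetical; only the variable names and default values come from the list above.

```typescript
// Sketch only: helper names and the config shape are assumptions.
type Env = Record<string, string | undefined>;

interface QueueDefaults {
  concurrency: number;
  intervalCap: number; // requests per minute
  timeoutMs: number;
}

const WHMCS_QUEUE_DEFAULTS: QueueDefaults = { concurrency: 15, intervalCap: 300, timeoutMs: 30000 };
const SF_QUEUE_DEFAULTS: QueueDefaults = { concurrency: 15, intervalCap: 600, timeoutMs: 30000 };

// Parse a positive integer, falling back to the default when the variable is unset.
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!isFinite(value) || Math.floor(value) !== value || value <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadQueueConfig(env: Env, prefix: "WHMCS" | "SF", defaults: QueueDefaults) {
  const intervalCap = readPositiveInt(env, `${prefix}_QUEUE_INTERVAL_CAP`, defaults.intervalCap);
  return {
    concurrency: readPositiveInt(env, `${prefix}_QUEUE_CONCURRENCY`, defaults.concurrency),
    intervalCap,
    timeoutMs: readPositiveInt(env, `${prefix}_QUEUE_TIMEOUT_MS`, defaults.timeoutMs),
    // The interval cap is per minute, so divide by 60 for requests per second.
    requestsPerSecond: intervalCap / 60,
  };
}
```

With the defaults, the WHMCS cap of 300 requests per minute works out to 5 RPS and the Salesforce cap of 600 to 10 RPS. In the application this could be called as `loadQueueConfig(process.env, "SF", SF_QUEUE_DEFAULTS)`.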
4. Per-User Refresh Token Sets
Implementation:
- Enhanced `AuthTokenService` with per-user token management
- Added `REFRESH_USER_SET_PREFIX` for organizing tokens by user
- Implemented automatic cleanup of excess tokens (max 10 per user)
- Added `getUserRefreshTokenFamilies()` method for token inspection
- Optimized `revokeAllUserTokens()` using Redis sets instead of scanning
New Methods:
- `storeRefreshTokenInRedis()`: Enhanced storage with user sets
- `cleanupExcessUserTokens()`: Automatic cleanup of old tokens
- `getUserRefreshTokenFamilies()`: Get user's active token families
- `revokeAllUserTokensFallback()`: Fallback for edge cases
Files Modified:
- `apps/bff/src/modules/auth/services/token.service.ts`
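The per-user set idea can be sketched with an in-memory stand-in for Redis. Everything here is illustrative: the class name, the prefix value, and the oldest-first eviction order are assumptions, and the real service stores these sets in Redis rather than in a plain object.

```typescript
// Sketch only: an in-memory model of per-user token sets. The prefix value
// and eviction order are assumptions; the real service uses Redis sets.
const REFRESH_USER_SET_PREFIX = "refresh_user:"; // hypothetical value
const MAX_TOKENS_PER_USER = 10;

class InMemoryUserTokenSets {
  // Key: prefix + userId. Value: token family ids, oldest first.
  private readonly sets: Record<string, string[]> = {};

  // Adds a family to the user's set and returns any families evicted by cleanup.
  storeRefreshToken(userId: string, familyId: string): string[] {
    const key = REFRESH_USER_SET_PREFIX + userId;
    const families = this.sets[key] || [];
    if (families.indexOf(familyId) === -1) families.push(familyId);
    // Cleanup of excess tokens: drop the oldest entries beyond the limit.
    const excess = Math.max(0, families.length - MAX_TOKENS_PER_USER);
    const evicted = families.splice(0, excess);
    this.sets[key] = families;
    return evicted;
  }

  getUserRefreshTokenFamilies(userId: string): string[] {
    return (this.sets[REFRESH_USER_SET_PREFIX + userId] || []).slice();
  }

  // Revokes everything with a single key lookup instead of a keyspace scan.
  revokeAllUserTokens(userId: string): number {
    const key = REFRESH_USER_SET_PREFIX + userId;
    const count = (this.sets[key] || []).length;
    delete this.sets[key];
    return count;
  }
}
```

Because every token family is indexed under one per-user key, revoking all of a user's tokens only needs that key, which is the motivation for replacing the scan-based approach.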
5. Migration Utilities for Existing Keys
Legacy helpers (token-migration.service.ts) have been removed along with the admin-only migration endpoints.
🚀 Deployment Instructions
1. Environment Configuration
Add the following to your environment file:
# Redis-required token flow
AUTH_REQUIRE_REDIS_FOR_TOKENS=false # Set to true to require Redis
AUTH_MAINTENANCE_MODE=false # Set to true for maintenance
AUTH_MAINTENANCE_MESSAGE=Authentication service is temporarily unavailable for maintenance. Please try again later.
# Salesforce timeouts
SF_AUTH_TIMEOUT_MS=30000
SF_TOKEN_TTL_MS=720000
SF_TOKEN_REFRESH_BUFFER_MS=60000
# Queue throttling (adjust as needed)
WHMCS_QUEUE_CONCURRENCY=15
WHMCS_QUEUE_INTERVAL_CAP=300
WHMCS_QUEUE_TIMEOUT_MS=30000
SF_QUEUE_CONCURRENCY=15
SF_QUEUE_LONG_RUNNING_CONCURRENCY=22
SF_QUEUE_INTERVAL_CAP=600
SF_QUEUE_TIMEOUT_MS=30000
SF_QUEUE_LONG_RUNNING_TIMEOUT_MS=600000
2. Migration Process
Legacy admin migration endpoints were removed. If migration is needed in the future, plan a manual script or one-off job.
3. Feature Flag Rollout
- Phase 1: Deploy with `AUTH_REQUIRE_REDIS_FOR_TOKENS=false`
- Phase 2: If existing keys need migrating, run the manual script or one-off job in dry-run mode to assess impact
- Phase 3: Execute the migration during a maintenance window
- Phase 4: Enable `AUTH_REQUIRE_REDIS_FOR_TOKENS=true` for strict mode
🔧 Configuration Recommendations
Production Settings
# Strict Redis requirement for production
AUTH_REQUIRE_REDIS_FOR_TOKENS=true
# Conservative queue settings for stability
WHMCS_QUEUE_CONCURRENCY=10
WHMCS_QUEUE_INTERVAL_CAP=200
SF_QUEUE_CONCURRENCY=12
SF_QUEUE_INTERVAL_CAP=400
# Longer timeouts for production reliability
SF_AUTH_TIMEOUT_MS=45000
WHMCS_QUEUE_TIMEOUT_MS=45000
Development Settings
# Allow failover for development
AUTH_REQUIRE_REDIS_FOR_TOKENS=false
# Higher throughput for development
WHMCS_QUEUE_CONCURRENCY=20
WHMCS_QUEUE_INTERVAL_CAP=500
SF_QUEUE_CONCURRENCY=20
SF_QUEUE_INTERVAL_CAP=800
🔍 Monitoring and Observability
Key Metrics to Monitor
- Token Operations:
  - Redis connection status
  - Token generation/refresh success rates
  - Per-user token counts
- Queue Performance:
  - Queue depths and wait times
  - Request success/failure rates
  - Timeout occurrences
- Salesforce Auth:
  - Authentication duration
  - Re-authentication frequency
  - Session expiration events
Log Patterns to Watch
- "Authentication service in maintenance mode"
- "Redis required for token operations but not available"
- "Salesforce authentication timeout"
- "Cleaned up excess user tokens"
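One simple way to watch for these patterns is to count their occurrences in a batch of log lines. The sketch below assumes plain-text log lines and uses a hypothetical helper name, `countWatchedPatterns`; a real deployment would more likely configure this in its log aggregation tooling.

```typescript
// Sketch only: counts occurrences of the watched messages in plain-text log lines.
const WATCHED_LOG_PATTERNS: string[] = [
  "Authentication service in maintenance mode",
  "Redis required for token operations but not available",
  "Salesforce authentication timeout",
  "Cleaned up excess user tokens",
];

function countWatchedPatterns(logLines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  // Start every pattern at zero so absent patterns still appear in the result.
  for (const pattern of WATCHED_LOG_PATTERNS) counts[pattern] = 0;
  for (const line of logLines) {
    for (const pattern of WATCHED_LOG_PATTERNS) {
      if (line.indexOf(pattern) !== -1) counts[pattern] += 1;
    }
  }
  return counts;
}
```

The resulting counts could feed an alert threshold, for example flagging repeated Salesforce authentication timeouts within a short window.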
🛡️ Security Considerations
- Production Logging: All sensitive data is redacted in production logs
- Token Limits: Automatic cleanup prevents token accumulation attacks
- Redis Dependency: Strict mode prevents token operations without Redis
- Audit Trail: All migration operations are logged for compliance
- Graceful Degradation: Maintenance mode provides controlled service interruption
📋 Testing Checklist
- Redis failover behavior with strict mode enabled
- Maintenance mode activation and messaging
- Salesforce authentication timeout handling
- Queue throttling under load
- Token migration dry-run and execution
- Per-user token limit enforcement
- Orphaned token cleanup
🔄 Rollback Plan
If issues arise:
- Disable Strict Mode: Set `AUTH_REQUIRE_REDIS_FOR_TOKENS=false`
- Exit Maintenance: Set `AUTH_MAINTENANCE_MODE=false`
- Revert Queue Settings: Use previous concurrency/timeout values
- Token Cleanup: Use a manual script or one-off job to clean up tokens if needed (the legacy migration service has been removed)
All changes are backward compatible and can be safely rolled back via environment variables.