Redis-Required Token Flow Implementation Summary

Overview

This document summarizes five changes: the Redis-required token flow with a maintenance response, Salesforce authentication timeout and logging improvements, configurable queue throttling thresholds, per-user refresh token sets, and the migration utilities for existing keys (since removed).

Completed Features

1. Redis-Required Token Flow with Maintenance Response

Environment Variables Added:

  • AUTH_REQUIRE_REDIS_FOR_TOKENS: When enabled, token operations are refused if Redis is unavailable
  • AUTH_MAINTENANCE_MODE: Enables maintenance mode for authentication service
  • AUTH_MAINTENANCE_MESSAGE: Customizable maintenance message

Implementation:

  • Added checkServiceAvailability() method in AuthTokenService
  • Strict Redis requirement enforcement when flag is enabled
  • Graceful maintenance mode with custom messaging
  • Production-safe error handling
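
The availability check described above can be sketched as follows. This is a minimal illustration, not the actual AuthTokenService code: the config shape, the ServiceUnavailableError class, the 503 status, the generic error message, and the ordering (maintenance mode checked before Redis) are all assumptions made for the example.

```typescript
// Hypothetical config shape mirroring the three environment variables.
interface AuthAvailabilityConfig {
  maintenanceMode: boolean; // AUTH_MAINTENANCE_MODE
  maintenanceMessage: string; // AUTH_MAINTENANCE_MESSAGE
  requireRedisForTokens: boolean; // AUTH_REQUIRE_REDIS_FOR_TOKENS
}

// Hypothetical error type; the 503 status code is an assumption.
class ServiceUnavailableError extends Error {
  readonly statusCode = 503;
  constructor(message: string) {
    super(message);
    this.name = "ServiceUnavailableError";
  }
}

// Generic client-facing message so internal details are not leaked.
const REDIS_UNAVAILABLE_MESSAGE = "Authentication service temporarily unavailable";

// Throws when token operations must be refused; returns normally otherwise.
function checkServiceAvailability(
  config: AuthAvailabilityConfig,
  redisReady: boolean,
): void {
  // Maintenance mode wins regardless of Redis state (assumed ordering).
  if (config.maintenanceMode) {
    throw new ServiceUnavailableError(config.maintenanceMessage);
  }
  // Strict mode: refuse token operations while Redis is down.
  if (config.requireRedisForTokens && !redisReady) {
    throw new ServiceUnavailableError(REDIS_UNAVAILABLE_MESSAGE);
  }
}
```

With the strict flag off, a Redis outage passes this check, which is the behavior the phased rollout relies on.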

Files Modified:

  • apps/bff/src/core/config/env.validation.ts
  • apps/bff/src/modules/auth/services/token.service.ts
  • env/portal-backend.env.sample

2. Salesforce Auth Timeout + Logging

Environment Variables Added:

  • SF_AUTH_TIMEOUT_MS: Configurable authentication timeout (default: 30s)
  • SF_TOKEN_TTL_MS: Token time-to-live (default: 12 minutes)
  • SF_TOKEN_REFRESH_BUFFER_MS: Refresh buffer time (default: 1 minute)
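
As a worked example of how the TTL and refresh buffer could interact, the sketch below assumes the token is refreshed once the current time is within the buffer of its expiry. The function name and the exact comparison are illustrative assumptions; only the default values come from this document.

```typescript
// Defaults taken from the list above.
const DEFAULT_SF_TOKEN_TTL_MS = 720000; // 12 minutes
const DEFAULT_SF_TOKEN_REFRESH_BUFFER_MS = 60000; // 1 minute

// Hypothetical helper: true once the token is inside the refresh buffer.
function shouldRefreshToken(
  issuedAtMs: number,
  nowMs: number,
  ttlMs: number = DEFAULT_SF_TOKEN_TTL_MS,
  bufferMs: number = DEFAULT_SF_TOKEN_REFRESH_BUFFER_MS,
): boolean {
  const expiresAtMs = issuedAtMs + ttlMs;
  return nowMs >= expiresAtMs - bufferMs;
}
```

Under these assumptions and the default values, a token issued at time 0 becomes eligible for refresh at 660000 ms (11 minutes), one minute before it expires.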

Implementation:

  • Added timeout handling with AbortController
  • Enhanced logging with timing information and error details
  • Production-safe logging (sensitive data redacted)
  • Re-authentication attempt logging with duration tracking
  • Session expiration detection and automatic retry
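
The AbortController-based timeout can be sketched as below. This is a generic pattern under stated assumptions, not the code in salesforce-connection.service.ts: the withTimeout helper and its error message are hypothetical, and the wrapped operation must honor the AbortSignal it receives for the timeout to take effect.

```typescript
// Hypothetical helper: runs an operation and aborts it after timeoutMs.
async function withTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const startedAt = Date.now();
  // Abort the operation once the timeout elapses.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await operation(controller.signal);
  } catch (error) {
    if (controller.signal.aborted) {
      // Report the elapsed time, matching the duration logging described above.
      const elapsedMs = Date.now() - startedAt;
      throw new Error(`Salesforce authentication timeout after ${elapsedMs}ms`);
    }
    throw error;
  } finally {
    // Always clear the timer so it cannot fire after completion.
    clearTimeout(timer);
  }
}
```

A caller would pass the signal on to whatever performs the network request, for example an HTTP client that accepts an AbortSignal, with SF_AUTH_TIMEOUT_MS as the timeout.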

Files Modified:

  • apps/bff/src/integrations/salesforce/services/salesforce-connection.service.ts
  • apps/bff/src/core/config/env.validation.ts
  • env/portal-backend.env.sample

3. Queue Throttling Thresholds (Configurable)

Environment Variables Added:

  • WHMCS_QUEUE_CONCURRENCY: WHMCS concurrent requests (default: 15)
  • WHMCS_QUEUE_INTERVAL_CAP: WHMCS requests per minute (default: 300)
  • WHMCS_QUEUE_TIMEOUT_MS: WHMCS request timeout (default: 30s)
  • SF_QUEUE_CONCURRENCY: Salesforce concurrent requests (default: 15)
  • SF_QUEUE_LONG_RUNNING_CONCURRENCY: SF long-running requests (default: 22)
  • SF_QUEUE_INTERVAL_CAP: SF requests per minute (default: 600)
  • SF_QUEUE_TIMEOUT_MS: SF request timeout (default: 30s)
  • SF_QUEUE_LONG_RUNNING_TIMEOUT_MS: SF long-running timeout (default: 10 minutes)

Implementation:

  • Made all queue thresholds configurable via environment variables
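  • Kept the existing defaults: 15 concurrent requests per queue, with interval caps of 300 requests per minute (5 RPS) for WHMCS and 600 requests per minute (10 RPS) for Salesforce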
  • Enhanced logging with actual configuration values
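
A minimal sketch of reading the WHMCS thresholds with their defaults is shown below. The helper names, the validation rule (positive integers only), and the derived requestsPerSecond field are assumptions for illustration; the real validation lives in env.validation.ts and may differ.

```typescript
// Environment source passed in explicitly so the sketch has no runtime dependency.
type EnvSource = Record<string, string | undefined>;

// Hypothetical helper: parse a positive integer or fall back to the default.
function readPositiveInt(env: EnvSource, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

// Defaults match the values listed above.
function loadWhmcsQueueConfig(env: EnvSource) {
  const intervalCap = readPositiveInt(env, "WHMCS_QUEUE_INTERVAL_CAP", 300);
  return {
    concurrency: readPositiveInt(env, "WHMCS_QUEUE_CONCURRENCY", 15),
    intervalCap, // requests per minute
    timeoutMs: readPositiveInt(env, "WHMCS_QUEUE_TIMEOUT_MS", 30000),
    requestsPerSecond: intervalCap / 60, // 300 per minute is 5 RPS
  };
}
```

Logging the returned object at startup would give the "actual configuration values" mentioned above; the Salesforce queue would follow the same shape with its own variable names and defaults.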

Files Modified:

  • apps/bff/src/core/queue/services/whmcs-request-queue.service.ts
  • apps/bff/src/core/queue/services/salesforce-request-queue.service.ts
  • apps/bff/src/core/config/env.validation.ts
  • env/portal-backend.env.sample

4. Per-User Refresh Token Sets

Implementation:

  • Enhanced AuthTokenService with per-user token management
  • Added REFRESH_USER_SET_PREFIX for organizing tokens by user
  • Implemented automatic cleanup of excess tokens (max 10 per user)
  • Added getUserRefreshTokenFamilies() method for token inspection
  • Optimized revokeAllUserTokens() using Redis sets instead of scanning

New Methods:

  • storeRefreshTokenInRedis(): Enhanced storage with user sets
  • cleanupExcessUserTokens(): Automatic cleanup of old tokens
  • getUserRefreshTokenFamilies(): Get user's active token families
  • revokeAllUserTokensFallback(): Fallback for edge cases
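
The per-user bookkeeping can be illustrated with an in-memory stand-in for the Redis sets. This is a sketch only: the class name is hypothetical, a Map replaces Redis, and evicting the oldest families first is an assumption about how "excess" tokens are chosen. The method names echo the ones listed above but do not reproduce their real signatures.

```typescript
const MAX_TOKEN_FAMILIES_PER_USER = 10; // limit stated above

// In-memory stand-in for the per-user Redis sets (illustrative only).
class InMemoryUserTokenIndex {
  // userId -> (familyId -> creation time in ms)
  private readonly familiesByUser = new Map<string, Map<string, number>>();

  // Registers a token family and returns any family IDs evicted by the limit.
  storeRefreshToken(userId: string, familyId: string, createdAtMs: number): string[] {
    let families = this.familiesByUser.get(userId);
    if (!families) {
      families = new Map<string, number>();
      this.familiesByUser.set(userId, families);
    }
    families.set(familyId, createdAtMs);
    return this.cleanupExcessUserTokens(userId);
  }

  // Drops the oldest families until the user is back at the limit.
  private cleanupExcessUserTokens(userId: string): string[] {
    const families = this.familiesByUser.get(userId);
    if (!families || families.size <= MAX_TOKEN_FAMILIES_PER_USER) return [];
    const oldestFirst = Array.from(families.entries()).sort((a, b) => a[1] - b[1]);
    const excess = families.size - MAX_TOKEN_FAMILIES_PER_USER;
    const evicted = oldestFirst.slice(0, excess).map((entry) => entry[0]);
    for (const familyId of evicted) families.delete(familyId);
    return evicted;
  }

  getUserRefreshTokenFamilies(userId: string): string[] {
    const families = this.familiesByUser.get(userId);
    return families ? Array.from(families.keys()) : [];
  }

  // One lookup by user instead of scanning every key; returns the count revoked.
  revokeAllUserTokens(userId: string): number {
    const families = this.familiesByUser.get(userId);
    const count = families ? families.size : 0;
    this.familiesByUser.delete(userId);
    return count;
  }
}
```

The point of the per-user index is that revocation touches only the entries recorded for that user, which is why a set lookup can replace a full key scan.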

Files Modified:

  • apps/bff/src/modules/auth/services/token.service.ts

5. Migration Utilities for Existing Keys

Legacy helpers (token-migration.service.ts) have been removed along with the admin-only migration endpoints.

🚀 Deployment Instructions

1. Environment Configuration

Add the following to your environment file:

# Redis-required token flow
# Set to true to require Redis
AUTH_REQUIRE_REDIS_FOR_TOKENS=false
# Set to true for maintenance
AUTH_MAINTENANCE_MODE=false
AUTH_MAINTENANCE_MESSAGE=Authentication service is temporarily unavailable for maintenance. Please try again later.

# Salesforce timeouts
SF_AUTH_TIMEOUT_MS=30000
SF_TOKEN_TTL_MS=720000
SF_TOKEN_REFRESH_BUFFER_MS=60000

# Queue throttling (adjust as needed)
WHMCS_QUEUE_CONCURRENCY=15
WHMCS_QUEUE_INTERVAL_CAP=300
WHMCS_QUEUE_TIMEOUT_MS=30000
SF_QUEUE_CONCURRENCY=15
SF_QUEUE_LONG_RUNNING_CONCURRENCY=22
SF_QUEUE_INTERVAL_CAP=600
SF_QUEUE_TIMEOUT_MS=30000
SF_QUEUE_LONG_RUNNING_TIMEOUT_MS=600000

2. Migration Process

Legacy admin migration endpoints were removed. If migration is needed in the future, plan a manual script or one-off job.

3. Feature Flag Rollout

  1. Phase 1: Deploy with AUTH_REQUIRE_REDIS_FOR_TOKENS=false
  2. Phase 2: If existing keys need migrating, dry-run the manual script or one-off job to assess impact (the legacy migration endpoints were removed)
  3. Phase 3: Execute any required migration during a maintenance window
  4. Phase 4: Set AUTH_REQUIRE_REDIS_FOR_TOKENS=true to enable strict mode

🔧 Configuration Recommendations

Production Settings

# Strict Redis requirement for production
AUTH_REQUIRE_REDIS_FOR_TOKENS=true

# Conservative queue settings for stability
WHMCS_QUEUE_CONCURRENCY=10
WHMCS_QUEUE_INTERVAL_CAP=200
SF_QUEUE_CONCURRENCY=12
SF_QUEUE_INTERVAL_CAP=400

# Longer timeouts for production reliability
SF_AUTH_TIMEOUT_MS=45000
WHMCS_QUEUE_TIMEOUT_MS=45000

Development Settings

# Allow fallback when Redis is unavailable in development
AUTH_REQUIRE_REDIS_FOR_TOKENS=false

# Higher throughput for development
WHMCS_QUEUE_CONCURRENCY=20
WHMCS_QUEUE_INTERVAL_CAP=500
SF_QUEUE_CONCURRENCY=20
SF_QUEUE_INTERVAL_CAP=800

🔍 Monitoring and Observability

Key Metrics to Monitor

  1. Token Operations:

    • Redis connection status
    • Token generation/refresh success rates
    • Per-user token counts
  2. Queue Performance:

    • Queue depths and wait times
    • Request success/failure rates
    • Timeout occurrences
  3. Salesforce Auth:

    • Authentication duration
    • Re-authentication frequency
    • Session expiration events

Log Patterns to Watch

  • "Authentication service in maintenance mode"
  • "Redis required for token operations but not available"
  • "Salesforce authentication timeout"
  • "Cleaned up excess user tokens"

🛡️ Security Considerations

  1. Production Logging: All sensitive data is redacted in production logs
  2. Token Limits: Automatic cleanup caps each user at 10 refresh tokens, which limits token accumulation attacks
  3. Redis Dependency: Strict mode prevents token operations without Redis
  4. Audit Trail: Any future manual migration should log its operations for compliance (the legacy migration service was removed)
  5. Graceful Degradation: Maintenance mode provides controlled service interruption

📋 Testing Checklist

  • Redis failover behavior with strict mode enabled
  • Maintenance mode activation and messaging
  • Salesforce authentication timeout handling
  • Queue throttling under load
  • Token migration dry-run and execution (only if a manual migration script or one-off job is introduced)
  • Per-user token limit enforcement
  • Orphaned token cleanup

🔄 Rollback Plan

If issues arise:

  1. Disable Strict Mode: Set AUTH_REQUIRE_REDIS_FOR_TOKENS=false
  2. Exit Maintenance: Set AUTH_MAINTENANCE_MODE=false
  3. Revert Queue Settings: Use previous concurrency/timeout values
  4. Token Cleanup: If needed, clean up tokens with a manual script or one-off job (the legacy migration service was removed)

All changes are backward compatible and can be safely rolled back via environment variables.