barsa 7c929eb4dc Update Customer Portal Documentation and Remove Deprecated Files
- Streamlined the README.md for clarity and conciseness.
- Deleted outdated documentation files related to Freebit SIM management, SIM management API data flow, and various architectural guides to reduce clutter and improve maintainability.
- Updated the last modified date in the README to reflect the latest changes.
2025-12-23 15:43:36 +09:00

# Redis-Required Token Flow Implementation Summary
## Overview
This document summarizes the implementation of the Redis-required token flow with a maintenance response, Salesforce authentication timeout and logging improvements, configurable queue throttling thresholds, and per-user refresh token sets. The migration utilities built alongside these features have since been removed.
## ✅ Completed Features
### 1. Redis-Required Token Flow with Maintenance Response
**Environment Variables Added:**
- `AUTH_REQUIRE_REDIS_FOR_TOKENS`: When enabled, token operations are refused unless Redis is available
- `AUTH_MAINTENANCE_MODE`: Enables maintenance mode for authentication service
- `AUTH_MAINTENANCE_MESSAGE`: Customizable maintenance message
**Implementation:**
- Added `checkServiceAvailability()` method in `AuthTokenService`
- Strict Redis requirement enforcement when flag is enabled
- Graceful maintenance mode with custom messaging
- Production-safe error handling
**Files Modified:**
- `apps/bff/src/core/config/env.validation.ts`
- `apps/bff/src/modules/auth/services/token.service.ts`
- `env/portal-backend.env.sample`
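
The availability check described above might be sketched as follows. This is a minimal illustration, not the code in `token.service.ts`: the `AuthAvailabilityConfig` shape, the `ServiceUnavailableError` class, the `redisReady` parameter, and the order of the two checks are all assumptions made for the example.

```typescript
// Hypothetical config shape mirroring the three AUTH_* environment variables.
interface AuthAvailabilityConfig {
  requireRedisForTokens: boolean; // AUTH_REQUIRE_REDIS_FOR_TOKENS
  maintenanceMode: boolean; // AUTH_MAINTENANCE_MODE
  maintenanceMessage: string; // AUTH_MAINTENANCE_MESSAGE
}

// Hypothetical error type; a real service would likely map it to an HTTP 503.
class ServiceUnavailableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ServiceUnavailableError";
  }
}

// Throws when token operations should be refused; returns normally otherwise.
function checkServiceAvailability(config: AuthAvailabilityConfig, redisReady: boolean): void {
  // Maintenance mode is checked first, regardless of Redis state (assumed ordering).
  if (config.maintenanceMode) {
    throw new ServiceUnavailableError(config.maintenanceMessage);
  }
  // Strict mode: refuse token operations when Redis is not available.
  if (config.requireRedisForTokens && !redisReady) {
    throw new ServiceUnavailableError("Redis required for token operations but not available");
  }
}
```

With both flags off the function returns silently even when Redis is down, which corresponds to the non-strict behavior used during rollout.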
### 2. Salesforce Auth Timeout + Logging
**Environment Variables Added:**
- `SF_AUTH_TIMEOUT_MS`: Configurable authentication timeout in milliseconds (default: 30000, i.e. 30 seconds)
- `SF_TOKEN_TTL_MS`: Token time-to-live in milliseconds (default: 720000, i.e. 12 minutes)
- `SF_TOKEN_REFRESH_BUFFER_MS`: Refresh buffer time in milliseconds (default: 60000, i.e. 1 minute)
**Implementation:**
- Added timeout handling with `AbortController`
- Enhanced logging with timing information and error details
- Production-safe logging (sensitive data redacted)
- Re-authentication attempt logging with duration tracking
- Session expiration detection and automatic retry
**Files Modified:**
- `apps/bff/src/integrations/salesforce/services/salesforce-connection.service.ts`
- `apps/bff/src/core/config/env.validation.ts`
- `env/portal-backend.env.sample`
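
The timeout and refresh-buffer behavior can be illustrated with a short sketch. It is not taken from `salesforce-connection.service.ts`: the helper names `withAuthTimeout` and `shouldRefreshToken` are hypothetical, and reading the refresh buffer as "refresh once the token is within the buffer of its expiry" is an assumption about how `SF_TOKEN_TTL_MS` and `SF_TOKEN_REFRESH_BUFFER_MS` interact.

```typescript
// Runs an operation and aborts it through an AbortController after timeoutMs.
// The operation must honor the signal (for example by passing it to fetch).
async function withAuthTimeout<T>(
  operation: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await operation(controller.signal);
  } finally {
    // Always clear the timer so a finished call leaves nothing pending.
    clearTimeout(timer);
  }
}

// True once the token is within refreshBufferMs of its expiry, or past it.
function shouldRefreshToken(
  issuedAtMs: number,
  nowMs: number,
  ttlMs: number,
  refreshBufferMs: number,
): boolean {
  const expiresAtMs = issuedAtMs + ttlMs;
  return nowMs >= expiresAtMs - refreshBufferMs;
}
```

Under this reading, the defaults (720000 ms TTL, 60000 ms buffer) mean a token becomes due for refresh 660000 ms (11 minutes) after it was issued.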
### 3. Queue Throttling Thresholds (Configurable)
**Environment Variables Added:**
- `WHMCS_QUEUE_CONCURRENCY`: WHMCS concurrent requests (default: 15)
- `WHMCS_QUEUE_INTERVAL_CAP`: WHMCS requests per minute (default: 300)
- `WHMCS_QUEUE_TIMEOUT_MS`: WHMCS request timeout in milliseconds (default: 30000, i.e. 30 seconds)
- `SF_QUEUE_CONCURRENCY`: Salesforce concurrent requests (default: 15)
- `SF_QUEUE_LONG_RUNNING_CONCURRENCY`: Salesforce long-running concurrent requests (default: 22)
- `SF_QUEUE_INTERVAL_CAP`: Salesforce requests per minute (default: 600)
- `SF_QUEUE_TIMEOUT_MS`: Salesforce request timeout in milliseconds (default: 30000, i.e. 30 seconds)
- `SF_QUEUE_LONG_RUNNING_TIMEOUT_MS`: Salesforce long-running request timeout in milliseconds (default: 600000, i.e. 10 minutes)
**Implementation:**
- Made all queue thresholds configurable via environment variables
- Kept the existing default values (15 concurrent requests; 300 and 600 requests per minute, i.e. 5 and 10 requests per second)
- Enhanced logging with actual configuration values
**Files Modified:**
- `apps/bff/src/core/queue/services/whmcs-request-queue.service.ts`
- `apps/bff/src/core/queue/services/salesforce-request-queue.service.ts`
- `apps/bff/src/core/config/env.validation.ts`
- `env/portal-backend.env.sample`
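
As an illustration of how the configurable thresholds could be read, the sketch below loads the three basic queue settings with the documented defaults. The `QueueSettings` shape and the helper names are hypothetical, and silently falling back to the default on an invalid value is an assumption; the validation in `env.validation.ts` may reject such values instead.

```typescript
// Hypothetical shape; the real queue services may hold more settings.
interface QueueSettings {
  concurrency: number;
  intervalCap: number; // requests allowed per minute
  timeoutMs: number;
}

type Env = Record<string, string | undefined>;

// Reads a positive integer, falling back to the default when unset or invalid.
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// prefix is "WHMCS" or "SF"; the defaults are the ones documented above.
function loadQueueSettings(env: Env, prefix: "WHMCS" | "SF"): QueueSettings {
  const defaultIntervalCap = prefix === "WHMCS" ? 300 : 600;
  return {
    concurrency: readPositiveInt(env, `${prefix}_QUEUE_CONCURRENCY`, 15),
    intervalCap: readPositiveInt(env, `${prefix}_QUEUE_INTERVAL_CAP`, defaultIntervalCap),
    timeoutMs: readPositiveInt(env, `${prefix}_QUEUE_TIMEOUT_MS`, 30000),
  };
}

// Converts the per-minute cap into average requests per second.
function requestsPerSecond(settings: QueueSettings): number {
  return settings.intervalCap / 60;
}
```

Passing the environment in as a plain record, rather than reading a global, keeps the helper easy to exercise with different values.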
### 4. Per-User Refresh Token Sets
**Implementation:**
- Enhanced `AuthTokenService` with per-user token management
- Added `REFRESH_USER_SET_PREFIX` for organizing tokens by user
- Implemented automatic cleanup of excess tokens (max 10 per user)
- Added `getUserRefreshTokenFamilies()` method for token inspection
- Optimized `revokeAllUserTokens()` using Redis sets instead of scanning
**New Methods:**
- `storeRefreshTokenInRedis()`: Enhanced storage with user sets
- `cleanupExcessUserTokens()`: Automatic cleanup of old tokens
- `getUserRefreshTokenFamilies()`: Get user's active token families
- `revokeAllUserTokensFallback()`: Fallback for edge cases
**Files Modified:**
- `apps/bff/src/modules/auth/services/token.service.ts`
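
The per-user set idea can be sketched with an in-memory stand-in. The `UserTokenIndex` class below is hypothetical and uses a `Map` of JavaScript `Set`s instead of Redis; in Redis the corresponding operations would presumably be set commands such as `SADD`, `SMEMBERS`, `SREM`, and `DEL`. Note one deliberate simplification: a JavaScript `Set` keeps insertion order, which is how this sketch finds the oldest tokens, whereas Redis sets are unordered, so the real cleanup must determine token age some other way.

```typescript
const MAX_TOKENS_PER_USER = 10; // limit stated above

// In-memory stand-in for the per-user Redis sets (illustration only).
class UserTokenIndex {
  private readonly byUser = new Map<string, Set<string>>();

  // Registers a token for the user and evicts the oldest ones beyond the limit.
  // Returns the ids of the evicted tokens.
  add(userId: string, tokenId: string): string[] {
    const existing = this.byUser.get(userId);
    const tokens = existing !== undefined ? existing : new Set<string>();
    this.byUser.set(userId, tokens);
    tokens.add(tokenId);
    const ordered = Array.from(tokens); // insertion order, oldest first
    const excess = Math.max(0, ordered.length - MAX_TOKENS_PER_USER);
    const evicted = ordered.slice(0, excess);
    evicted.forEach((id) => tokens.delete(id));
    return evicted;
  }

  // Lists the user's active tokens without scanning other users' data.
  list(userId: string): string[] {
    const tokens = this.byUser.get(userId);
    return tokens ? Array.from(tokens) : [];
  }

  // Revokes everything for one user by dropping that user's set.
  // Returns how many tokens were removed.
  revokeAll(userId: string): number {
    const tokens = this.byUser.get(userId);
    const count = tokens ? tokens.size : 0;
    this.byUser.delete(userId);
    return count;
  }
}
```

Because each user's tokens live under one key, revoking them touches only that key, which is the advantage over scanning the whole keyspace.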
### 5. Migration Utilities for Existing Keys
The legacy migration helpers (`token-migration.service.ts`) and the admin-only migration endpoints have been removed.
## 🚀 Deployment Instructions
### 1. Environment Configuration
Add the following to your environment file:
```bash
# Redis-required token flow
# Set to true to require Redis
AUTH_REQUIRE_REDIS_FOR_TOKENS=false
# Set to true for maintenance
AUTH_MAINTENANCE_MODE=false
AUTH_MAINTENANCE_MESSAGE=Authentication service is temporarily unavailable for maintenance. Please try again later.
# Salesforce timeouts
SF_AUTH_TIMEOUT_MS=30000
SF_TOKEN_TTL_MS=720000
SF_TOKEN_REFRESH_BUFFER_MS=60000
# Queue throttling (adjust as needed)
WHMCS_QUEUE_CONCURRENCY=15
WHMCS_QUEUE_INTERVAL_CAP=300
WHMCS_QUEUE_TIMEOUT_MS=30000
SF_QUEUE_CONCURRENCY=15
SF_QUEUE_LONG_RUNNING_CONCURRENCY=22
SF_QUEUE_INTERVAL_CAP=600
SF_QUEUE_TIMEOUT_MS=30000
SF_QUEUE_LONG_RUNNING_TIMEOUT_MS=600000
```
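
Environment values arrive as strings, so the flags and timings above need explicit parsing; in JavaScript, `Boolean("false")` is `true`, which makes a naive conversion unsafe. The sketch below shows one way to handle this. The helper names are hypothetical and the rules (case-insensitive `true`/`false`, buffer strictly smaller than TTL) are assumptions, not the checks in `env.validation.ts`.

```typescript
// Parses "true"/"false" (case-insensitive); anything else yields the fallback.
function parseBooleanFlag(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined) return fallback;
  const value = raw.trim().toLowerCase();
  if (value === "true") return true;
  if (value === "false") return false;
  return fallback;
}

// Returns a list of problems; an empty list means the timings are consistent.
function validateTokenTiming(ttlMs: number, refreshBufferMs: number): string[] {
  const problems: string[] = [];
  if (ttlMs <= 0) {
    problems.push("SF_TOKEN_TTL_MS must be positive");
  }
  if (refreshBufferMs < 0) {
    problems.push("SF_TOKEN_REFRESH_BUFFER_MS must not be negative");
  }
  if (refreshBufferMs >= ttlMs) {
    problems.push("SF_TOKEN_REFRESH_BUFFER_MS must be smaller than SF_TOKEN_TTL_MS");
  }
  return problems;
}
```

The sample values pass this check: `validateTokenTiming(720000, 60000)` reports no problems.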
### 2. Migration Process
Legacy admin migration endpoints were removed. If migration is needed in the future, plan a manual script or one-off job.
### 3. Feature Flag Rollout
1. **Phase 1:** Deploy with `AUTH_REQUIRE_REDIS_FOR_TOKENS=false`
2. **Phase 2:** If existing keys need migrating, run the manual script or one-off job in dry-run mode to assess impact
3. **Phase 3:** Execute that migration, if any, during a maintenance window
4. **Phase 4:** Enable `AUTH_REQUIRE_REDIS_FOR_TOKENS=true` for strict mode
## 🔧 Configuration Recommendations
### Production Settings
```bash
# Strict Redis requirement for production
AUTH_REQUIRE_REDIS_FOR_TOKENS=true
# Conservative queue settings for stability
WHMCS_QUEUE_CONCURRENCY=10
WHMCS_QUEUE_INTERVAL_CAP=200
SF_QUEUE_CONCURRENCY=12
SF_QUEUE_INTERVAL_CAP=400
# Longer timeouts for production reliability
SF_AUTH_TIMEOUT_MS=45000
WHMCS_QUEUE_TIMEOUT_MS=45000
```
### Development Settings
```bash
# Allow token operations without Redis in development
AUTH_REQUIRE_REDIS_FOR_TOKENS=false
# Higher throughput for development
WHMCS_QUEUE_CONCURRENCY=20
WHMCS_QUEUE_INTERVAL_CAP=500
SF_QUEUE_CONCURRENCY=20
SF_QUEUE_INTERVAL_CAP=800
```
## 🔍 Monitoring and Observability
### Key Metrics to Monitor
1. **Token Operations:**
- Redis connection status
- Token generation/refresh success rates
- Per-user token counts
2. **Queue Performance:**
- Queue depths and wait times
- Request success/failure rates
- Timeout occurrences
3. **Salesforce Auth:**
- Authentication duration
- Re-authentication frequency
- Session expiration events
### Log Patterns to Watch
- `"Authentication service in maintenance mode"`
- `"Redis required for token operations but not available"`
- `"Salesforce authentication timeout"`
- `"Cleaned up excess user tokens"`
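
As a rough aid for watching these patterns, the sketch below counts how often each one appears in a batch of log lines. It assumes the messages appear verbatim as substrings of each line; the actual log format and any structured fields are not described here, and the function name is hypothetical.

```typescript
// The four messages listed above.
const WATCHED_PATTERNS: string[] = [
  "Authentication service in maintenance mode",
  "Redis required for token operations but not available",
  "Salesforce authentication timeout",
  "Cleaned up excess user tokens",
];

// Counts how many log lines contain each watched pattern (plain substring match).
function countWatchedPatterns(lines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const pattern of WATCHED_PATTERNS) {
    counts[pattern] = 0;
  }
  for (const line of lines) {
    for (const pattern of WATCHED_PATTERNS) {
      if (line.indexOf(pattern) !== -1) {
        counts[pattern] = (counts[pattern] || 0) + 1;
      }
    }
  }
  return counts;
}
```

A sudden rise in the Redis or Salesforce timeout counts is the kind of signal an alert could be built on.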
## 🛡️ Security Considerations
1. **Production Logging:** All sensitive data is redacted in production logs
2. **Token Limits:** Automatic cleanup (max 10 refresh tokens per user) limits token accumulation attacks
3. **Redis Dependency:** Strict mode prevents token operations without Redis
4. **Audit Trail:** The migration endpoints have been removed; any future manual migration script or one-off job should log its operations for compliance
5. **Graceful Degradation:** Maintenance mode provides controlled service interruption
## 📋 Testing Checklist
- [ ] Redis failover behavior with strict mode enabled
- [ ] Maintenance mode activation and messaging
- [ ] Salesforce authentication timeout handling
- [ ] Queue throttling under load
- [ ] Manual migration script dry-run and execution (only if a migration is planned)
- [ ] Per-user token limit enforcement
- [ ] Orphaned token cleanup
## 🔄 Rollback Plan
If issues arise:
1. **Disable Strict Mode:** Set `AUTH_REQUIRE_REDIS_FOR_TOKENS=false`
2. **Exit Maintenance:** Set `AUTH_MAINTENANCE_MODE=false`
3. **Revert Queue Settings:** Use previous concurrency/timeout values
4. **Token Cleanup:** The migration service has been removed; use a manual script or one-off job if cleanup is needed
All changes are backward compatible and can be safely rolled back via environment variables.