Release and Deployment Procedures
This document covers pre-deployment checklists, deployment procedures, post-deployment verification, and rollback procedures for the Customer Portal.
Deployment Overview
| Environment | Method | Script | Notes |
|---|---|---|---|
| Development | Local | pnpm dev | Apps run locally, services in Docker |
| Production | Docker Compose | pnpm prod:deploy | Full containerized deployment |
| Updates | Docker Compose | pnpm prod:update | Zero-downtime application updates |
Available Commands
pnpm prod:deploy # Full deployment (build + start + migrate)
pnpm prod:start # Start all production services
pnpm prod:stop # Stop all production services
pnpm prod:update # Zero-downtime update (rebuild and recreate apps)
pnpm prod:status # Show service status and health
pnpm prod:logs # Show service logs
pnpm prod:backup # Create database backup
pnpm prod:cleanup # Clean up old containers and images
Pre-Deployment Checklist
Code Review
- All changes have been reviewed and approved
- No console.log/console.error statements in production code
- No hardcoded secrets or credentials
- TypeScript compilation passes (pnpm type-check)
- Linting passes (pnpm lint)
- Tests pass (pnpm test)
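The three automated checks in this list can be chained so that the first failure stops the run. The following is a minimal sketch, not part of the portal's scripts: run_checks is a hypothetical helper, and it only assumes the pnpm commands named above.

```shell
# run_checks <command>...: run each command in order, stop at the first failure.
# Hypothetical helper for illustration only.
run_checks() {
  local cmd
  for cmd in "$@"; do
    echo "==> $cmd"
    if ! sh -c "$cmd"; then
      echo "FAILED: $cmd"
      return 1
    fi
  done
  echo "All checks passed"
}

# Example (run from the repository root):
#   run_checks "pnpm type-check" "pnpm lint" "pnpm test"
```

Stopping at the first failure keeps the output short and makes it obvious which check blocks the release.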
Environment Configuration
- All required environment variables are set in .env
- Database URL is correct for production
- Redis URL is correct for production
- External API credentials are valid (Salesforce, WHMCS, Freebit)
- CORS_ORIGIN matches production domain
- JWT_SECRET is secure and unique
Required Environment Variables:
DATABASE_URL # PostgreSQL connection string
REDIS_URL # Redis connection string
JWT_SECRET # Secure secret (min 32 chars)
POSTGRES_PASSWORD # Database password
CORS_ORIGIN # Frontend domain
NEXT_PUBLIC_API_BASE # BFF API URL
BFF_PORT # Backend port (usually 4000)
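A quick way to catch a missing variable before deploying is to check the env file against this list. The sketch below assumes a plain KEY=value file (no "export" prefixes or multi-line values). check_env_file and env_value are hypothetical helpers, and the 32-character minimum comes from the JWT_SECRET note above.

```shell
# Variables listed in this document as required.
REQUIRED_VARS="DATABASE_URL REDIS_URL JWT_SECRET POSTGRES_PASSWORD CORS_ORIGIN NEXT_PUBLIC_API_BASE BFF_PORT"

# env_value <file> <name>: print the value assigned to <name> (empty if unset).
env_value() {
  local file="$1" name="$2"
  # Use the last assignment and strip one pair of surrounding quotes, if any.
  grep -E "^${name}=" "$file" | tail -n 1 | cut -d= -f2- \
    | sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}

# check_env_file <file>: report missing variables and a too-short JWT_SECRET.
# Returns 0 when everything is present, 1 otherwise.
check_env_file() {
  local file="$1" failed=0 name value
  for name in $REQUIRED_VARS; do
    value="$(env_value "$file" "$name")"
    if [ -z "$value" ]; then
      echo "MISSING: $name"
      failed=1
    fi
  done
  value="$(env_value "$file" JWT_SECRET)"
  if [ -n "$value" ] && [ "${#value}" -lt 32 ]; then
    echo "TOO_SHORT: JWT_SECRET (need at least 32 characters)"
    failed=1
  fi
  return "$failed"
}

# Example:
#   check_env_file .env || echo "Fix the env file before deploying"
```

The check only confirms that values exist and that the secret is long enough. It cannot tell whether a URL or credential is actually correct for production.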
Database Migration Check
- Review pending migrations (npx prisma migrate status)
- Test migrations on staging/local first
- Create database backup before applying migrations
- Prepare rollback SQL if migration is destructive
- Estimate migration duration for large tables
Dependency Check
- Run security audit (pnpm security:check)
- No high/critical vulnerabilities
- All dependencies are at expected versions
- Lock file is up to date (pnpm-lock.yaml)
Communication
- Notify team of deployment schedule
- Schedule during low-traffic window if possible
- Prepare customer communication if downtime expected
- Ensure on-call engineer is available
Deployment Procedure
Standard Deployment (First Time)
# 1. Create database backup (if updating existing system)
pnpm prod:backup
# 2. Full deployment
pnpm prod:deploy
pnpm prod:deploy performs the following steps:
- Validates environment configuration
- Builds production Docker images
- Starts database and cache services
- Waits for database readiness
- Runs Prisma migrations
- Starts frontend and backend services
- Performs health checks
Application Update (Zero-Downtime)
For updates that don't require database migrations:
# 1. Create database backup
pnpm prod:backup
# 2. Update applications
pnpm prod:update
This rebuilds and recreates frontend and backend containers without stopping the database.
Database Migration Deployment
For deployments with schema changes:
# 1. Create database backup
pnpm prod:backup
# 2. Stop application to prevent writes during migration
pnpm prod:stop
# 3. Start only database
docker compose -f docker/prod/docker-compose.yml up -d database
# 4. Run migrations
docker compose -f docker/prod/docker-compose.yml run --rm backend pnpm db:migrate
# 5. Verify migration success
docker compose -f docker/prod/docker-compose.yml exec database psql -U portal -d portal_prod -c "SELECT * FROM _prisma_migrations ORDER BY finished_at DESC LIMIT 5;"
# 6. Start all services
pnpm prod:start
# 7. Verify application health
pnpm prod:status
Post-Deployment Verification
Immediate Checks (0-5 minutes)
- Health endpoints return ok:
  curl http://localhost:4000/health
  curl http://localhost:3000/_health
- No error spikes in logs:
  pnpm prod:logs backend | grep -i error | tail -20
- Database migrations applied successfully
- Redis connectivity verified
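Services can take a few seconds to become healthy after a deployment, so a single curl right after startup may fail spuriously. A minimal retry sketch follows. wait_until is a hypothetical helper, and the attempt count and delay are illustrative values, not settings taken from the portal scripts.

```shell
# wait_until <attempts> <delay_seconds> <command...>
# Re-run <command> until it succeeds or the attempts are used up.
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    # Only sleep when another attempt will follow.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "not ready after $attempts attempt(s)"
  return 1
}

# Example: poll the backend health endpoint for up to about a minute.
# curl -f makes HTTP error responses count as failures.
#   wait_until 30 2 curl -fsS http://localhost:4000/health
```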
Functional Checks (5-15 minutes)
- User can log in to portal
- Dashboard loads correctly
- Invoice list displays
- Subscription list displays
- Catalog products load
Integration Checks (15-30 minutes)
- Salesforce connectivity verified:
  curl http://localhost:4000/auth/health-check | jq '.services.salesforce'
- WHMCS connectivity verified:
  curl http://localhost:4000/auth/health-check | jq '.services.whmcs'
- Queue health verified:
  curl http://localhost:4000/health/queues
Monitoring Checks
- Metrics are being collected
- No alert triggers from deployment
- Log aggregation is working
- Error rates are normal
Rollback Procedures
Application Rollback (No DB Changes)
If deployment fails without database changes:
# 1. Stop current deployment
pnpm prod:stop
# 2. Checkout previous version
git checkout <previous-tag-or-commit>
# 3. Rebuild and deploy
pnpm prod:deploy
Application Rollback with Docker Images
If previous images are available:
# 1. Stop current services
pnpm prod:stop
# 2. Start with previous image tags
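# BACKEND_IMAGE and FRONTEND_IMAGE are passed as shell environment variables,
# assuming the compose file interpolates them into the image names
BACKEND_IMAGE=portal-backend:previous \
FRONTEND_IMAGE=portal-frontend:previous \
docker compose -f docker/prod/docker-compose.yml up -d --no-build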
Database Rollback
If database migration needs to be reverted:
Option 1: Restore from Backup
# 1. Stop application
pnpm prod:stop
# 2. Restore database (-T disables TTY allocation so psql can read the redirected file)
docker compose exec -T database psql -U portal -d portal_prod < backup_YYYYMMDD_HHMMSS.sql
# 3. Checkout previous code version
git checkout <previous-tag>
# 4. Rebuild and restart
pnpm prod:deploy
Option 2: Manual Rollback SQL
# 1. Stop application
pnpm prod:stop
# 2. Apply rollback script, if prepared (-T disables TTY allocation so psql can read the redirected file)
docker compose exec -T database psql -U portal -d portal_prod < rollback_migration_YYYYMMDD.sql
# 3. Manually remove migration record
docker compose exec database psql -U portal -d portal_prod -c "DELETE FROM _prisma_migrations WHERE migration_name = '20240115_migration_name';"
# 4. Restart with previous code
git checkout <previous-tag>
pnpm prod:deploy
Emergency Rollback
For critical failures requiring immediate action:
# 1. Immediately stop all services
pnpm prod:stop
# 2. Restore from most recent backup (-T disables TTY allocation so psql can read the redirected file)
docker compose exec -T database psql -U portal -d portal_prod < /path/to/latest_backup.sql
# 3. Deploy last known good version
git checkout <last-known-good-tag>
pnpm prod:deploy
# 4. Notify team
# Send incident notification
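Step 2 needs the path of the most recent backup, which is easy to get wrong under pressure. The sketch below assumes that backups sit in a single directory and follow the backup_YYYYMMDD_HHMMSS.sql naming used in the restore example. latest_backup and the directory path are hypothetical, so confirm where pnpm prod:backup actually writes its files.

```shell
# latest_backup <dir>: print the newest backup_YYYYMMDD_HHMMSS.sql in <dir>.
# Hypothetical helper; relies on the timestamped naming convention.
latest_backup() {
  local dir="$1" newest="" f
  # The shell expands globs in sorted order, and these timestamps sort
  # chronologically as plain strings, so the last match is the newest.
  for f in "$dir"/backup_*.sql; do
    if [ -e "$f" ]; then
      newest="$f"
    fi
  done
  if [ -z "$newest" ]; then
    echo "no backup files found in $dir" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}

# Example:
#   docker compose exec -T database psql -U portal -d portal_prod \
#     < "$(latest_backup /path/to/backups)"
```

Picking the file by name rather than by modification time avoids surprises when backups have been copied or restored from another host.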
Feature Flags
The portal does not currently use a formal feature flag system. Feature availability is controlled through:
- Environment Variables - Toggle features via configuration
- Conditional Rendering - Frontend checks for feature availability
- Backend Feature Checks - API endpoints check configuration
Adding a Feature Toggle
// Backend: Check environment variable
const featureEnabled = this.configService.get("FEATURE_NEW_CHECKOUT", "false") === "true";
// Frontend: Check feature availability
if (process.env.NEXT_PUBLIC_FEATURE_NEW_CHECKOUT === "true") {
// Render new feature
}
Emergency Feature Disable
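To disable a feature without a full redeployment:
- Update the environment variable in .env
- Recreate the affected services so they pick up the new value (docker compose restart reuses the existing container configuration and does not apply environment changes):
  docker compose up -d backend frontend
Note: Next.js inlines NEXT_PUBLIC_* variables at build time, so toggling a frontend flag such as NEXT_PUBLIC_FEATURE_NEW_CHECKOUT also requires rebuilding the frontend (for example with pnpm prod:update).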
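The first step, editing .env, can be scripted so that the change is repeatable and leaves exactly one assignment for the flag. This is a minimal sketch assuming a plain KEY=value file. set_env_var is a hypothetical helper, and FEATURE_NEW_CHECKOUT is the example flag from the snippet above.

```shell
# set_env_var <file> <name> <value>: set <name>=<value> in an env file,
# replacing any existing assignment of <name>.
set_env_var() {
  local file="$1" name="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Keep every line except existing assignments of this variable.
  # grep exits non-zero when nothing is left, which is fine here.
  grep -v -E "^${name}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$name" "$value" >> "$tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example: turn the flag off, then apply the service step described above.
#   set_env_var .env FEATURE_NEW_CHECKOUT false
```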
Deployment Timeline Template
| Time | Action | Owner | Notes |
|---|---|---|---|
| T-24h | Announce deployment window | Tech Lead | Notify all stakeholders |
| T-2h | Final code review | Developers | Verify all changes merged |
| T-1h | Pre-deployment checklist | DevOps | Complete all checks |
| T-30m | Create backup | DevOps | Verify backup integrity |
| T-15m | Notify team deployment starting | DevOps | Slack/Teams message |
| T-0 | Execute deployment | DevOps | Run deployment commands |
| T+5m | Immediate verification | DevOps | Health checks |
| T+15m | Functional verification | QA/DevOps | Test key flows |
| T+30m | All-clear or rollback decision | Tech Lead | Confirm success |
| T+1h | Post-deployment monitoring | DevOps | Watch metrics |
| T+24h | Close deployment | Tech Lead | Final verification |
Troubleshooting
Build Failures
# Check Docker daemon
docker info
# Check disk space
df -h
# Clean Docker resources (-a removes all unused images, including older tags
# that an image-based rollback may still need)
docker system prune -a
Migration Failures
# Check migration status
npx prisma migrate status
# View migration history
docker compose exec database psql -U portal -d portal_prod -c "SELECT * FROM _prisma_migrations;"
# Reset migration (development only!)
npx prisma migrate reset
Service Startup Failures
# Check service logs
pnpm prod:logs backend
pnpm prod:logs frontend
# Check container status
docker compose ps -a
# Check resource usage
docker stats
Database Connection Issues
# Test database connectivity
docker compose exec database pg_isready -U portal -d portal_prod
# Check connection count
docker compose exec database psql -U portal -d portal_prod -c "SELECT count(*) FROM pg_stat_activity;"
Last Updated: December 2025