# Provisioning Runbook (Salesforce Platform Events → Portal → WHMCS)

This runbook helps operators diagnose issues in the order fulfillment path.

## Paths & Channels

- Salesforce Platform Event: `OrderProvisionRequested__e`
- Backend health: `GET /health`

## Required Env (Backend)

- `SF_LOGIN_URL`, `SF_CLIENT_ID`, `SF_USERNAME`
- `SF_PRIVATE_KEY_PATH` (prod: `/app/secrets/sf-private.key`)
- `SF_EVENTS_ENABLED=true`
- `SF_PROVISION_EVENT_CHANNEL=/event/OrderProvisionRequested__e`
- `SF_EVENTS_REPLAY=LATEST` (or `ALL`)
- `PORTAL_PRICEBOOK_ID`
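
A quick pre-flight check of these variables can save a round of log reading. The following is a minimal sketch, not part of the backend: `check_env` is a hypothetical helper, and it assumes that `SF_EVENTS_REPLAY` only accepts the two values listed above.

```python
# Names taken from the "Required Env (Backend)" list above.
REQUIRED_ENV = [
    "SF_LOGIN_URL",
    "SF_CLIENT_ID",
    "SF_USERNAME",
    "SF_PRIVATE_KEY_PATH",
    "SF_EVENTS_ENABLED",
    "SF_PROVISION_EVENT_CHANNEL",
    "SF_EVENTS_REPLAY",
    "PORTAL_PRICEBOOK_ID",
]


def check_env(env):
    """Return a list of human-readable problems; an empty list means OK.

    `env` is any mapping of variable names to string values.
    """
    # Unset and empty values are both reported as missing.
    problems = [f"{name} is not set" for name in REQUIRED_ENV if not env.get(name)]
    if env.get("SF_EVENTS_ENABLED") and env["SF_EVENTS_ENABLED"] != "true":
        problems.append("SF_EVENTS_ENABLED should be 'true'")
    # Assumption: LATEST and ALL are the only accepted replay settings.
    if env.get("SF_EVENTS_REPLAY") and env["SF_EVENTS_REPLAY"] not in ("LATEST", "ALL"):
        problems.append("SF_EVENTS_REPLAY should be LATEST or ALL")
    return problems
```

Calling `check_env(os.environ)` inside the backend container (after `import os`) lists anything that is missing or unexpected.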

## Common Symptoms and Fixes

- No events received
  - Verify the Flow publishes `OrderProvisionRequested__e` on Order approval
  - Confirm the BFF has `SF_EVENTS_ENABLED=true` and valid SF JWT settings
  - Check BFF logs for subscription start on the expected channel
- Event replays not advancing
  - Ensure Redis is healthy; the last `replayId` is stored under `sf:pe:replay:<channel>`
  - If needed, set `SF_EVENTS_REPLAY=ALL` for a one-time backfill, then revert to `LATEST`
- 409 Payment method missing
  - The customer has no WHMCS payment method
  - Ask the customer to add a payment method, then retry fulfillment
- WHMCS Add/Accept errors
  - Check product mappings: `Product2.WH_Product_ID__c` and `Billing_Cycle__c`
  - Backend logs show the item mapping report; fix any missing mappings
- Salesforce status not updated
  - Backend updates `Activation_Status__c` and `WHMCS_Order_ID__c` on success
  - Verify the connected app JWT config and that the API user has Order update permissions
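
For the "Event replays not advancing" case, it helps to read the stored `replayId` twice, once before and once after approving a test Order. The sketch below only builds the key name and compares two readings. Both helpers are hypothetical, and it is an assumption that `<channel>` is the full channel string from `SF_PROVISION_EVENT_CHANNEL`, including the leading `/event/`.

```python
REPLAY_KEY_PREFIX = "sf:pe:replay:"


def replay_key(channel):
    """Build the Redis key under which the last replayId is stored.

    Assumption: <channel> is the full channel path, e.g.
    "/event/OrderProvisionRequested__e".
    """
    return f"{REPLAY_KEY_PREFIX}{channel}"


def replay_advanced(before, after):
    """True if the stored value changed between two readings.

    Both arguments are the raw values read from Redis (None if the key
    is absent). The replayId is treated as opaque: only a change is
    detected, no ordering is assumed.
    """
    return after is not None and after != before
```

The key from `replay_key` can be read with `redis-cli GET <key>`. If the value is unchanged after a new event was published, the subscriber is likely not receiving or not persisting events.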

## Verification Steps

1. In Salesforce, create an Order with OrderItems.
2. Approve the Order. The Flow sets `Activation_Status__c = Activating` and publishes `OrderProvisionRequested__e`.
3. Check `/health`: the database and Redis report as connected and the environment is correct.
4. Tail the logs and confirm the sequence: Platform Event enqueued → Guard sees status=Activating → WHMCS add → WHMCS accept → Activated.
5. Verify that the Salesforce fields are updated and that the WHMCS order/service IDs exist.

## Logging Cheatsheet

- "Platform Event enqueued for provisioning" — subscriber enqueue
- "Starting fulfillment orchestration" — orchestrator start
- Step logs: `validation`, `sf_status_update`, `order_details`, `mapping`, `whmcs_create`, `whmcs_accept`, `sf_success_update`
- On error: orchestrator updates SF with `Activation_Status__c='Failed'`
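
When a fulfillment stalls, the step names above make it possible to see how far it got. The following is a rough sketch under two assumptions: the step names appear verbatim somewhere in the relevant log lines, and the steps run in the order listed. `next_pending_step` is a hypothetical helper, not a backend function.

```python
# Step names in the order given in the cheatsheet above.
FULFILLMENT_STEPS = [
    "validation",
    "sf_status_update",
    "order_details",
    "mapping",
    "whmcs_create",
    "whmcs_accept",
    "sf_success_update",
]


def next_pending_step(log_lines):
    """Return the first step name not found in the logs, or None if all appear.

    Matching is a plain substring search, so pre-filter the lines to a
    single order to avoid false matches from other jobs.
    """
    for step in FULFILLMENT_STEPS:
        if not any(step in line for line in log_lines):
            return step
    return None
```

If the function returns `whmcs_create`, for example, the logs show activity up to `mapping`, so the mapping report and the WHMCS Add call are the places to look.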

## Security Notes

- No inbound Salesforce webhooks are used for provisioning.
- BFF authenticates to Salesforce via JWT; grant API access and Platform Event object read via Permission Set.
- No WHMCS webhooks are consumed; the portal uses the WHMCS API for billing operations.
- Health endpoint
  - `/health` includes `integrations.redis` probe to confirm queue/replay storage availability.
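
The probe can be inspected from a script as well as by eye. This is a minimal sketch that assumes `/health` returns a JSON object in which `integrations.redis` is a nested key. The exact response shape is not documented here, so the code only extracts the value and does not interpret it. `get_path` and `fetch_health` are hypothetical helpers.

```python
import json
import urllib.request


def get_path(payload, dotted_path):
    """Walk a dotted path such as 'integrations.redis'; return None if absent."""
    current = payload
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def fetch_health(base_url, timeout=5.0):
    """GET <base_url>/health and parse the JSON body.

    Assumption: the endpoint answers with JSON; base_url is whatever
    address the backend is reachable at in your environment.
    """
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A typical call would be `get_path(fetch_health(base_url), "integrations.redis")`. A result of `None` means the probe is missing from the response, which is itself worth investigating.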

## Ops: Manual Retry Flow

- Click "Provision / Retry" on the Order in Salesforce.
  - If `Activation_Status__c = Activating`, show a toast "Already in progress".
  - Else, set `Activation_Status__c = Activating`, clear the last error fields, and let the Record-Triggered Flow publish the event.

The portal does not auto-retry jobs. Network errors, 5xx responses, and timeouts mark the Order as `Failed` and set:

- `Activation_Error_Code__c` (e.g., 429, 503, ETIMEOUT)
- `Activation_Error_Message__c` (short reason)
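
The retry decision itself is small enough to model in a few lines. The real logic lives in Salesforce, so the Python below is only an illustration of the rule described above. `plan_retry` is a hypothetical name, and it is an assumption that the "last error fields" are exactly the two fields listed.

```python
def plan_retry(order):
    """Decide what "Provision / Retry" should do for an Order.

    `order` is a dict of Salesforce field values. Returns the toast to
    show (or None) and the field updates to apply.
    """
    if order.get("Activation_Status__c") == "Activating":
        # A run is already in flight; do not publish a second event.
        return {"toast": "Already in progress", "updates": {}}
    return {
        "toast": None,
        "updates": {
            # Setting the status lets the Record-Triggered Flow publish the event.
            "Activation_Status__c": "Activating",
            # Assumption: these two are the "last error fields" to clear.
            "Activation_Error_Code__c": None,
            "Activation_Error_Message__c": None,
        },
    }
```

Because the guard keys off `Activation_Status__c`, an Order stuck in `Activating` after a crash will not retry until its status is changed by hand.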