# Salesforce Shadow Data Sync Plan
## Objectives
- Reduce repetitive Salesforce reads for hot catalog and eligibility data.
- Provide resilient fallbacks when Salesforce limits are reached by serving data from Postgres shadow tables.
- Maintain data freshness within minutes via event-driven updates, with scheduled backstops.
## Scope
- **Catalog metadata**: `Product2`, `PricebookEntry`, add-on metadata (SIM/Internet/VPN).
- **Pricing snapshots**: Unit price, currency, and active flags per SKU.
- **Account eligibility**: `Account.Internet_Eligibility__c` and related readiness fields used by personalized catalogs.
## Proposed Schema (Postgres)
```sql
CREATE TABLE sf_product_shadow (
  product_id TEXT PRIMARY KEY,
  sku TEXT UNIQUE NOT NULL,
  name TEXT NOT NULL,
  item_class TEXT,
  offering_type TEXT,
  plan_tier TEXT,
  vpn_region TEXT,
  updated_at TIMESTAMP WITH TIME ZONE NOT NULL,
  raw_payload JSONB NOT NULL
);

CREATE TABLE sf_pricebook_shadow (
  pricebook_entry_id TEXT PRIMARY KEY,
  product_id TEXT NOT NULL REFERENCES sf_product_shadow(product_id) ON DELETE CASCADE,
  pricebook_id TEXT NOT NULL,
  unit_price NUMERIC(12,2) NOT NULL,
  currency_iso_code TEXT NOT NULL,
  is_active BOOLEAN NOT NULL,
  updated_at TIMESTAMP WITH TIME ZONE NOT NULL,
  raw_payload JSONB NOT NULL
);

CREATE TABLE sf_account_eligibility_shadow (
  account_id TEXT PRIMARY KEY,
  internet_eligibility TEXT,
  eligibility_source TEXT,
  updated_at TIMESTAMP WITH TIME ZONE NOT NULL,
  raw_payload JSONB NOT NULL
);
```
## Sync Strategy
| Phase | Approach | Tooling |
| --- | --- | --- |
| Backfill | Bulk API v2 query for each object (Product2, PricebookEntry, Account) to seed tables. | New CLI job (`pnpm nx run bff:salesforce-backfill-shadow`) |
| Incremental updates | Subscribe to Platform Events or Change Data Capture streams for Product2, PricebookEntry, and Account. Push events onto the existing SalesforceRequestQueue, then enqueue them to a BullMQ worker that upserts into the shadow tables. | Extend provisioning queue or add new `SF_SHADOW_SYNC` queue |
| Catch-up | Nightly scheduled Bulk API delta query (using `SystemModstamp`) to reconcile missed events. | Cron worker (same Bull queue) |
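As a rough sketch of the catch-up phase, the nightly delta query can be built from a `SystemModstamp` cursor. The helper below is hypothetical (a real job would submit the resulting SOQL to a Bulk API v2 query job); note that SOQL datetime literals are unquoted:

```typescript
// Hypothetical builder for the nightly reconciliation query. The cursor is
// the SystemModstamp of the last successful sync for the given object.
function buildDeltaSoql(object: string, fields: string[], sinceIso: string): string {
  return (
    `SELECT ${fields.join(", ")} FROM ${object} ` +
    `WHERE SystemModstamp > ${sinceIso} ORDER BY SystemModstamp ASC`
  );
}

// Example: delta query for Product2 rows changed since the cursor.
const soql = buildDeltaSoql(
  "Product2",
  ["Id", "Name", "SystemModstamp"],
  "2024-06-01T00:00:00Z",
);
```

The same builder would be reused for `PricebookEntry` and `Account` with their respective field lists.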
### Upsert Flow
1. Event payload arrives from Salesforce Pub/Sub and is persisted to the queue (reusing the `SalesforceRequestQueueService` backoff).
2. Worker normalizes the payload (maps relationship fields, handles deletions).
3. Worker performs a PostgreSQL `INSERT ... ON CONFLICT` upsert inside a transaction to keep product ↔ pricebook relationships consistent.
4. Worker invalidates Redis keys (`catalog:*`, `eligibility:*`) via `CatalogCacheService.invalidateAllCatalogs()`, or uses targeted invalidation when a specific SKU/account changes.
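The normalization step above could look roughly like the following. The event shape and field names are illustrative, not the exact Change Data Capture payload:

```typescript
// Hypothetical shape of a Product2 change event and of the row we upsert
// into sf_product_shadow. Only a subset of columns is shown.
interface ProductChangeEvent {
  ChangeEventHeader: { recordIds: string[]; changeType: "CREATE" | "UPDATE" | "DELETE" };
  StockKeepingUnit?: string;
  Name?: string;
}

interface ProductShadowRow {
  product_id: string;
  sku: string | null;
  name: string | null;
  deleted: boolean;
  raw_payload: string;
}

// Map an incoming event to the shadow-table row, flagging deletions so the
// upsert can tombstone or remove the record.
function normalizeProductEvent(evt: ProductChangeEvent): ProductShadowRow {
  return {
    product_id: evt.ChangeEventHeader.recordIds[0],
    sku: evt.StockKeepingUnit ?? null,
    name: evt.Name ?? null,
    deleted: evt.ChangeEventHeader.changeType === "DELETE",
    raw_payload: JSON.stringify(evt),
  };
}

const row = normalizeProductEvent({
  ChangeEventHeader: { recordIds: ["01t000000000001"], changeType: "UPDATE" },
  StockKeepingUnit: "SIM-10GB",
  Name: "SIM 10GB",
});
```

The worker would then bind `row` into an `INSERT ... ON CONFLICT (product_id) DO UPDATE` statement, inside the same transaction as any dependent pricebook rows.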
## Integration Points
- **Catalog services**: attempt to read from shadow tables via Prisma before falling back to Salesforce query; only hit Salesforce on cache miss _and_ shadow miss.
- **Eligibility lookup**: `InternetCatalogService.getPlansForUser` first loads from `sf_account_eligibility_shadow`; if the row is stale (>15 min), it falls back to Salesforce and refreshes the row asynchronously.
- **Order flows**: continue writing through live Salesforce, but use shadow data for price lookups where possible.
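The shadow-first read decision hinges on the 15-minute staleness threshold above; a minimal sketch (helper name is hypothetical):

```typescript
// Serve the shadow row only while it is fresher than the threshold;
// otherwise fall back to Salesforce and refresh the row asynchronously.
const STALENESS_LIMIT_MS = 15 * 60 * 1000;

function isShadowFresh(updatedAt: Date, now: Date = new Date()): boolean {
  return now.getTime() - updatedAt.getTime() <= STALENESS_LIMIT_MS;
}

// A row updated 1 minute ago is served; one updated 16 minutes ago is not.
const fresh = isShadowFresh(new Date(Date.now() - 60_000));
const stale = isShadowFresh(new Date(Date.now() - 16 * 60_000));
```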
## Monitoring & Alerts
- Add Prometheus counters: `sf_shadow_sync_events_total`, `sf_shadow_sync_failures_total`.
- Track lag metrics: `MAX(now() - updated_at)` per table.
- Hook into existing queue health endpoint to expose shadow worker backlog.
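The per-table lag metric can be derived from the `MAX(updated_at)` query above; a hypothetical helper for computing the gauge value:

```typescript
// Convert the newest updated_at in a shadow table into a lag gauge in
// seconds; clamp at zero in case of minor clock skew between DB and app.
function shadowLagSeconds(maxUpdatedAt: Date, now: Date = new Date()): number {
  return Math.max(0, (now.getTime() - maxUpdatedAt.getTime()) / 1000);
}

// Example: a table whose newest row is 2 minutes old has ~120s of lag.
const lag = shadowLagSeconds(new Date(Date.now() - 120_000));
```

The value would be exported as a Prometheus gauge alongside the event counters.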
## Rollout Checklist
1. Implement schema migrations (SQL or Prisma) under feature flag.
2. Build bulk backfill command; run in staging, verify record counts vs Salesforce SOQL.
3. Enable event ingestion in staging, monitor for 48h, validate cache invalidation.
4. Update catalog services to prefer shadow reads; release behind environment variable `ENABLE_SF_SHADOW_READS`.
5. Roll to production gradually: run backfill, enable read flag, then enable event consumer.
6. Document operational runbooks (replay events, manual backfill, clearing caches).
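Step 4's environment-variable gate can be a one-line check; the flag name follows the plan, while the helper itself is a hypothetical sketch:

```typescript
// Gate shadow reads behind ENABLE_SF_SHADOW_READS so rollout can flip the
// flag per environment without a redeploy of the event consumer.
function shadowReadsEnabled(env: Record<string, string | undefined> = process.env): boolean {
  return env.ENABLE_SF_SHADOW_READS === "true";
}
```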
## Open Questions
- Do we mirror additional fields (e.g., localization strings) needed for future UX changes?
- Should eligibility sync include other readiness signals (credit status, serviceability flags)?
- What is the retention strategy for the `raw_payload` column (e.g., prune older versions weekly)?