Assist_Design/docs/salesforce-shadow-sync-plan.md
barsa c79488a6a4 Enhance Salesforce request handling and metrics tracking
2025-11-06 13:26:30 +09:00

Salesforce Shadow Data Sync Plan

Objectives

  • Reduce repetitive Salesforce reads for hot catalog and eligibility data.
  • Provide resilient fallbacks when Salesforce limits are reached by serving data from Postgres shadow tables.
  • Maintain data freshness within minutes via event-driven updates, with scheduled backstops.

Scope

  • Catalog metadata: Product2, PricebookEntry, add-on metadata (SIM/Internet/VPN).
  • Pricing snapshots: Unit price, currency, and active flags per SKU.
  • Account eligibility: Account.Internet_Eligibility__c and related readiness fields used by personalized catalogs.

Proposed Schema (Postgres)

CREATE TABLE sf_product_shadow (
  product_id TEXT PRIMARY KEY,
  sku TEXT UNIQUE NOT NULL,
  name TEXT NOT NULL,
  item_class TEXT,
  offering_type TEXT,
  plan_tier TEXT,
  vpn_region TEXT,
  updated_at TIMESTAMP WITH TIME ZONE NOT NULL,
  raw_payload JSONB NOT NULL
);

CREATE TABLE sf_pricebook_shadow (
  pricebook_entry_id TEXT PRIMARY KEY,
  product_id TEXT NOT NULL REFERENCES sf_product_shadow(product_id) ON DELETE CASCADE,
  pricebook_id TEXT NOT NULL,
  unit_price NUMERIC(12,2) NOT NULL,
  currency_iso_code TEXT NOT NULL,
  is_active BOOLEAN NOT NULL,
  updated_at TIMESTAMP WITH TIME ZONE NOT NULL,
  raw_payload JSONB NOT NULL
);

CREATE TABLE sf_account_eligibility_shadow (
  account_id TEXT PRIMARY KEY,
  internet_eligibility TEXT,
  eligibility_source TEXT,
  updated_at TIMESTAMP WITH TIME ZONE NOT NULL,
  raw_payload JSONB NOT NULL
);

Sync Strategy

| Phase | Approach | Tooling |
| --- | --- | --- |
| Backfill | Bulk API v2 query for each object (Product2, PricebookEntry, Account) to seed tables. | New CLI job (pnpm nx run bff:salesforce-backfill-shadow) |
| Incremental updates | Subscribe to Platform Events or Change Data Capture streams for Product2, PricebookEntry, and Account. Push events onto existing SalesforceRequestQueue, enqueue to BullMQ worker that upserts into shadow tables. | Extend provisioning queue or add new SF_SHADOW_SYNC queue |
| Catch-up | Nightly scheduled Bulk API delta query (using SystemModstamp) to reconcile missed events. | Cron worker (same Bull queue) |
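
As a rough illustration of the catch-up phase, the delta query could be assembled as below. This is only a sketch: buildDeltaQuery and toSoqlDateTime are hypothetical helper names, the field list is supplied by the caller, and nothing here submits the query to the Bulk API.

```typescript
// Hypothetical helper for the nightly catch-up job: builds SOQL text only.

/** Formats a Date as an unquoted SOQL datetime literal in UTC, without milliseconds. */
function toSoqlDateTime(d: Date): string {
  return d.toISOString().replace(/\.\d{3}Z$/, "Z");
}

// Accept only plain API names so caller input cannot change the query shape.
const API_NAME = /^[A-Za-z][A-Za-z0-9_]*$/;

function buildDeltaQuery(
  objectName: string,
  fields: readonly string[],
  since: Date,
): string {
  if (fields.length === 0) {
    throw new Error("At least one field is required");
  }
  for (const name of [objectName, ...fields]) {
    if (!API_NAME.test(name)) {
      throw new Error(`Invalid API name: ${name}`);
    }
  }
  // ">=" may re-read boundary rows; that is harmless because the upsert is idempotent.
  return (
    `SELECT ${fields.join(", ")} FROM ${objectName} ` +
    `WHERE SystemModstamp >= ${toSoqlDateTime(since)}`
  );
}
```

Passing the timestamp of the last successful run as since, rather than a fixed 24-hour window, would let a skipped night be covered by the next run; that choice is an assumption, not something the plan specifies.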

Upsert Flow

  1. Event payload arrives from Salesforce Pub/Sub → persisted to queue (reuse SalesforceRequestQueueService backoff).
  2. Worker normalizes payload (maps relationship fields, handles deletions).
  3. Worker runs a PostgreSQL INSERT ... ON CONFLICT upsert inside a single transaction to keep product ↔ pricebook relationships consistent.
  4. Worker invalidates Redis keys (catalog:*, eligibility:*) via CatalogCacheService.invalidateAllCatalogs(), or uses targeted invalidation when only a specific SKU or account changes.
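
Steps 2 and 3 could look roughly like the sketch below for sf_product_shadow. Several things are assumptions made for illustration: the ProductChangeEvent shape is a simplified stand-in for the real event payload, the worker is assumed to have a full record available (a Change Data Capture update may carry only changed fields), StockKeepingUnit is assumed to be the source of sku, and the updated_at guard against out-of-order events is an addition the plan does not mention. The function only builds a parameterized statement; executing it inside a transaction is left to the worker.

```typescript
// Simplified, hypothetical view of a Product2 change event.
type ChangeType = "CREATE" | "UPDATE" | "DELETE" | "UNDELETE";

interface ProductChangeEvent {
  changeType: ChangeType;
  recordId: string;
  commitTimestamp: number; // epoch milliseconds
  fields: Record<string, unknown>; // Salesforce field API name -> value
}

interface SqlStatement {
  text: string;
  values: unknown[];
}

function asText(value: unknown): string | null {
  return typeof value === "string" && value.length > 0 ? value : null;
}

function buildProductStatement(event: ProductChangeEvent): SqlStatement {
  if (event.changeType === "DELETE") {
    // ON DELETE CASCADE on sf_pricebook_shadow removes dependent price rows.
    return {
      text: "DELETE FROM sf_product_shadow WHERE product_id = $1",
      values: [event.recordId],
    };
  }

  // Assumed field mapping; the optional columns (item_class, plan_tier, ...)
  // are omitted here and keep their existing values on update.
  const sku = asText(event.fields["StockKeepingUnit"]);
  const name = asText(event.fields["Name"]);
  if (sku === null || name === null) {
    throw new Error(`Product ${event.recordId} is missing sku or name`);
  }

  return {
    text: `
      INSERT INTO sf_product_shadow
        (product_id, sku, name, updated_at, raw_payload)
      VALUES ($1, $2, $3, $4, $5::jsonb)
      ON CONFLICT (product_id) DO UPDATE SET
        sku = EXCLUDED.sku,
        name = EXCLUDED.name,
        updated_at = EXCLUDED.updated_at,
        raw_payload = EXCLUDED.raw_payload
      WHERE sf_product_shadow.updated_at <= EXCLUDED.updated_at`,
    values: [
      event.recordId,
      sku,
      name,
      new Date(event.commitTimestamp).toISOString(),
      JSON.stringify(event.fields),
    ],
  };
}
```

The WHERE clause on the update makes a late-arriving older event a no-op instead of overwriting newer data, which matters if the nightly catch-up and the event stream can touch the same row.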

Integration Points

  • Catalog services: attempt to read from shadow tables via Prisma before falling back to a Salesforce query; only hit Salesforce when both the cache and the shadow table miss.
  • Eligibility lookup: InternetCatalogService.getPlansForUser first loads from sf_account_eligibility_shadow; if the row is stale (older than 15 minutes), it falls back to Salesforce and refreshes the row asynchronously.
  • Order flows: continue using live Salesforce (writes) but use shadow data for price lookups where possible.
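
The staleness rule for eligibility lookups can be expressed as a small pure decision function, sketched below. The names planEligibilityRead, ReadPlan, and EligibilityShadowRow are hypothetical, and treating a missing row the same as a stale one is an assumption; the actual Prisma read and Salesforce call are left out.

```typescript
// Row shape mirrors sf_account_eligibility_shadow; only the columns used here.
interface EligibilityShadowRow {
  accountId: string;
  internetEligibility: string | null;
  updatedAt: Date;
}

type ReadPlan =
  | { source: "shadow"; value: string | null }
  | { source: "salesforce"; refreshShadow: true };

// The 15-minute threshold from the plan.
const MAX_SHADOW_AGE_MS = 15 * 60 * 1000;

function planEligibilityRead(
  row: EligibilityShadowRow | null,
  now: Date,
  maxAgeMs: number = MAX_SHADOW_AGE_MS,
): ReadPlan {
  // Assumption: no shadow row is handled like a stale one.
  if (row === null) {
    return { source: "salesforce", refreshShadow: true };
  }
  const ageMs = now.getTime() - row.updatedAt.getTime();
  if (ageMs > maxAgeMs) {
    return { source: "salesforce", refreshShadow: true };
  }
  return { source: "shadow", value: row.internetEligibility };
}
```

Keeping the decision separate from the I/O makes the threshold easy to unit test and lets the caller decide how the asynchronous refresh is scheduled.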

Monitoring & Alerts

  • Add Prometheus counters: sf_shadow_sync_events_total, sf_shadow_sync_failures_total.
  • Track lag metrics: MAX(now() - updated_at) per table.
  • Hook into existing queue health endpoint to expose shadow worker backlog.
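
For the lag metric, note that MAX(now() - updated_at) reports the age of the least recently updated row in a table. A minimal sketch of the same calculation and of a gauge line in the Prometheus text exposition format follows; the metric name sf_shadow_sync_lag_seconds and both function names are hypothetical, and reporting 0 for an empty table is an assumption.

```typescript
// Equivalent of MAX(now() - updated_at): age in seconds of the oldest updated_at.
function maxLagSeconds(updatedAts: readonly Date[], now: Date): number {
  if (updatedAts.length === 0) {
    return 0; // assumption: an empty table reports no lag
  }
  let oldestMs = Infinity;
  for (const d of updatedAts) {
    oldestMs = Math.min(oldestMs, d.getTime());
  }
  // Clamp at zero in case of clock skew between writer and reader.
  return Math.max(0, (now.getTime() - oldestMs) / 1000);
}

// Renders one sample line in the Prometheus text exposition format.
function renderLagGauge(table: string, lagSeconds: number): string {
  return `sf_shadow_sync_lag_seconds{table="${table}"} ${lagSeconds}`;
}
```

Because rarely changed catalog rows will always look old under this definition, an alert threshold may be more useful on the failure counter or the queue backlog than on this gauge alone.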

Rollout Checklist

  1. Implement schema migrations (SQL or Prisma) under feature flag.
  2. Build bulk backfill command; run in staging, verify record counts vs Salesforce SOQL.
  3. Enable event ingestion in staging, monitor for 48h, validate cache invalidation.
  4. Update catalog services to prefer shadow reads; release behind environment variable ENABLE_SF_SHADOW_READS.
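  5. Roll to production gradually: run backfill, enable event consumer, then enable read flag, so shadow rows are kept fresh before reads depend on them (the same order as staging in steps 3 and 4).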
  6. Document operational runbooks (replay events, manual backfill, clearing caches).
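
Step 4 of the checklist gates shadow reads behind ENABLE_SF_SHADOW_READS. One way that gate might be wired into the read path is sketched below. The accepted values ("true" or "1") and the helper names isShadowReadsEnabled and readOrder are assumptions, not part of the plan.

```typescript
type EnvLike = Record<string, string | undefined>;

// Assumption: "true" or "1" (case-insensitive, surrounding spaces ignored) enables the flag.
function isShadowReadsEnabled(env: EnvLike): boolean {
  const raw = (env["ENABLE_SF_SHADOW_READS"] ?? "").trim().toLowerCase();
  return raw === "true" || raw === "1";
}

type ReadSource = "cache" | "shadow" | "salesforce";

// Order in which a catalog service would try each source.
function readOrder(env: EnvLike): ReadSource[] {
  return isShadowReadsEnabled(env)
    ? ["cache", "shadow", "salesforce"]
    : ["cache", "salesforce"];
}
```

In a Node service the argument would typically be process.env. With the flag unset, the read path stays exactly as it is today, which keeps the rollback to a single configuration change.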

Open Questions

  • Do we mirror additional fields (e.g., localization strings) needed for future UX changes?
  • Should eligibility sync include other readiness signals (credit status, serviceability flags)?
  • What retention strategy should apply to the raw_payload column (e.g., prune older versions weekly)?