Data Pipeline Refactor
- Type: project
- Status: active
- Visibility: public
- Updated: 2026-05-24T04:31:34.094Z
Goal
Cut billing-area ticket-to-knowledge time from 14 days to 24 hours, so that Customer Support Agent v2 has fresh context when CSAT-impacting issues land.
Current state
Old pipeline:
Salesforce → nightly export → CSV in S3 → manual cleanup → wiki copy/paste
New pipeline:
Salesforce ──webhook──> ingestion service ──parse──> Markdown notes
                                                           │
                                                           ▼
                                                agent-memory-site build
                                                           │
                                                           ▼
                                               chunks.jsonl + MCP server
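The parse step in the diagram can be pictured with a minimal sketch. It assumes the webhook delivers a flat JSON object per case. The Salesforce field names (`CaseNumber`, `Status`, `LastModifiedDate`, `Subject`, `Description`) and the frontmatter keys are hypothetical stand-ins, not the actual Vault frontmatter standard, and `case_to_markdown` is an illustrative name rather than the function in apps/ingest/.

```python
import json


def case_to_markdown(payload: dict) -> str:
    """Render one (hypothetical) Salesforce case payload as a Markdown note."""
    # Hypothetical mapping from Salesforce field names to frontmatter keys.
    field_map = {
        "CaseNumber": "ticket_id",
        "Status": "status",
        "LastModifiedDate": "updated",
    }
    # Assumed constant keys for billing tickets; the real standard may differ.
    frontmatter = {"type": "ticket", "area": "billing"}
    for source_key, target_key in field_map.items():
        if source_key in payload:
            frontmatter[target_key] = payload[source_key]

    lines = ["---"]
    # JSON-encode each value so colons and quotes inside it cannot break
    # the frontmatter; YAML parsers generally accept JSON scalars.
    for key in sorted(frontmatter):
        lines.append(f"{key}: {json.dumps(frontmatter[key])}")
    lines.append("---")
    lines.append("")
    lines.append(f"# {payload.get('Subject', 'Untitled ticket')}")
    lines.append("")
    lines.append(payload.get("Description", "").strip())
    return "\n".join(lines) + "\n"
```

Sorting the frontmatter keys keeps the emitted note byte-identical for the same payload, which makes diffs between pipeline runs easier to read.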
Status
- Webhook receiver shipped (apps/ingest/)
- Markdown emitter normalizes Salesforce fields to our Vault frontmatter standard
- Backfill 18 months of historical billing tickets
- Validate redaction quality on PII-heavy notes (see Separate Private and Public Memory)
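For the redaction item, one way to make quality checkable is to have the redactor report how many substitutions it made, so a reviewer can sample notes with high or zero counts. The sketch below is an assumption-laden illustration, not the project's redactor: the two patterns (email addresses and 13 to 16 digit card-like numbers) and the placeholder tokens are hypothetical, and they are far from a complete PII detector.

```python
import re

# Hypothetical patterns; real PII coverage would need many more rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(text: str) -> tuple[str, int]:
    """Return the redacted text and the number of substitutions made."""
    text, email_count = EMAIL_RE.subn("[REDACTED_EMAIL]", text)
    text, card_count = CARD_RE.subn("[REDACTED_CARD]", text)
    return text, email_count + card_count
```

Note that the card pattern will also match any other long digit run, such as a long order number, so false positives are expected and worth measuring during validation.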
Why static compile instead of live DB
See the decision "Use chunks.jsonl as the canonical RAG substrate". Short version: every agent reads the same bundle, so results are reproducible across CI, prod, and local dev.
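The reproducibility argument can be illustrated with a small sketch of a deterministic compile step. It assumes notes arrive as a mapping from path to Markdown text and that each paragraph becomes one chunk. The record fields (`id`, `source`, `index`, `text`) and the function name `build_chunks` are hypothetical, not the actual chunks.jsonl schema or the agent-memory-site build.

```python
import hashlib
import json


def build_chunks(notes: dict[str, str]) -> str:
    """Compile notes (path -> Markdown text) into JSONL, one chunk per line."""
    records = []
    for path in sorted(notes):  # sorted paths keep the output order stable
        paragraphs = [p.strip() for p in notes[path].split("\n\n") if p.strip()]
        for index, text in enumerate(paragraphs):
            # The id depends only on path and content, never on time or host.
            digest = hashlib.sha256(f"{path}\n{text}".encode("utf-8")).hexdigest()
            records.append(
                {"id": digest[:16], "source": path, "index": index, "text": text}
            )
    return "".join(
        json.dumps(record, sort_keys=True, ensure_ascii=False) + "\n"
        for record in records
    )
```

Because nothing in the output depends on input ordering, timestamps, or the machine running the build, the same set of notes should yield the same bundle in CI, prod, and local dev, which a live database query cannot guarantee as easily.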