I found this, which is overkill for personal use (and it's not self-hosted, etc.), but it does a good job of laying out this sort of application: midday.ai/…/automatic-reconciliation-engine/
“Instead of just comparing text strings, we use 768-dimensional vector embeddings to capture the semantic meaning of transactions and receipts.
    // Generate embeddings for transaction data
    const transactionText = prepareTransactionText({
      name: transaction.name,
      counterpartyName: transaction.counterpartyName,
      merchantName: transaction.merchantName,
      description: transaction.description
    });
    const embedding = await generateEmbeddings([transactionText]);
These embeddings allow our system to understand that “AMZN MKTP” and “Amazon Marketplace Purchase” refer to the same thing, even though the text strings are completely different. The system learns patterns like:
- “SQ *COFFEE SHOP” → “Square Coffee Shop Receipt”
- “PAYPAL *DIGITALOCEAN” → “DigitalOcean Invoice via PayPal”
- “APL*APPLE.COM” → “Apple App Store Purchase””
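For anyone wanting to try the idea at hobby scale, here is a rough sketch of the two pieces around the embedding call. The quoted excerpt doesn't show what `prepareTransactionText` does internally or which similarity metric they use, so both are my assumptions: I'm guessing the helper just joins the non-empty fields, and I'm using cosine similarity because it's the usual choice for comparing embeddings. `Transaction`, `Candidate`, `bestMatch` and the 0.8 threshold are all made up for illustration, and the embeddings themselves would come from whatever model you run.

```typescript
// Hypothetical shape; the field names follow the quoted snippet.
interface Transaction {
  name?: string;
  counterpartyName?: string;
  merchantName?: string;
  description?: string;
}

// Assumed behavior: join the non-empty fields into one string to embed.
function prepareTransactionText(t: Transaction): string {
  return [t.name, t.counterpartyName, t.merchantName, t.description]
    .filter((s): s is string => typeof s === "string" && s.trim().length > 0)
    .map((s) => s.trim())
    .join(" ");
}

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as "no similarity".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical receipt/invoice record with a precomputed embedding.
interface Candidate {
  id: string;
  embedding: number[];
}

// Return the closest candidate, or null if nothing clears the threshold.
// The 0.8 default is an arbitrary illustrative value, not from the post.
function bestMatch(
  query: number[],
  candidates: Candidate[],
  threshold = 0.8
): { id: string; score: number } | null {
  let best: { id: string; score: number } | null = null;
  for (const c of candidates) {
    const score = cosineSimilarity(query, c.embedding);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { id: c.id, score };
    }
  }
  return best;
}
```

In practice you'd embed `prepareTransactionText(transaction)` with your model, embed each receipt the same way, and call `bestMatch(transactionEmbedding, receiptCandidates)`. For a personal ledger a plain loop like this is probably enough; a vector index only starts to matter at much larger volumes.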