
Option-A-Phase-01-Design database schema for RAG

Below is a production-ready RAG database schema designed specifically for a multi-tenant Shopify AI chatbot that ingests products, policies, FAQs, external sources, and OpenAI-search-derived knowledge.


  • RAG tables: All id and foreign key columns use native PostgreSQL uuid (not TEXT). Default for new rows: gen_random_uuid(). This gives efficient storage, indexing, and type safety. The Session table (Shopify template) keeps TEXT id for compatibility.
  • Indexes: Foreign key columns and key lookup columns have indexes for join and filter performance (e.g. shop_id, session_id, shopify_product_id). See migration 20260224110000_native_uuid_and_indexes.
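As a sketch, the uuid-plus-index convention looks like this (illustrative DDL; the column list is abbreviated, but the names match the schema below):

```sql
-- gen_random_uuid() is built into PostgreSQL 13+; on older versions,
-- CREATE EXTENSION pgcrypto provides it.
CREATE TABLE products (
    id                 uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    shop_id            uuid NOT NULL REFERENCES shops(id),
    shopify_product_id varchar NOT NULL,
    title              text,
    created_at         timestamptz NOT NULL DEFAULT now()
);

-- PostgreSQL does not index FK columns automatically; add them explicitly:
CREATE INDEX products_shop_id_idx            ON products (shop_id);
CREATE INDEX products_shopify_product_id_idx ON products (shopify_product_id);
```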

DESIGN PRINCIPLES

  1. Multi-tenant first

    Every row belongs to a Shopify store.

  2. Source-aware knowledge

    We must know where each piece of knowledge came from.

  3. Chunk-based embeddings

    Never embed full documents or products.

  4. Re-embeddable

    Product updates should only re-embed affected chunks.

  5. Auditable

    You must be able to explain why an answer was given.


Shop
├── Product
│   └── KnowledgeChunk
│       └── Embedding
├── KnowledgeSource
│   └── KnowledgeDocument
│       └── KnowledgeChunk
│           └── Embedding
└── ChatSession
    └── ChatMessage
        └── RetrievedChunk

1. SHOPS

shops
-----
id (uuid, pk)
shopify_shop_id (varchar, unique)
shop_domain (varchar)
access_token (encrypted)
plan (varchar)
status (active, paused, uninstalled)
created_at
updated_at
... (Phase 4: widget_title, welcome_message, brand_color, bubble_position, show_reply_count,
widget_enabled, widget_show_on_all_pages; Phase 4/5: daily_spend_limit_cents, etc.)

Why:

  • One row per Shopify store
  • Hard tenant boundary

2. PRODUCTS (SOURCE OF TRUTH FROM SHOPIFY)

products
--------
id (uuid, pk)
shop_id (fk -> shops.id)
shopify_product_id (varchar)
title
description
product_type
vendor
tags (jsonb)
status (active, draft, archived)
last_synced_at
created_at
updated_at

Notes:

  • Do NOT store variants here
  • Store normalized product data only

3. PRODUCT VARIANTS (OPTIONAL BUT RECOMMENDED)

product_variants
----------------
id (uuid, pk)
product_id (fk -> products.id)
shopify_variant_id (varchar)
title
price
sku
inventory_quantity
created_at
updated_at

Why:

  • Enables precise answers like price or availability
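For example, a price/stock lookup stays a plain relational query (a sketch; the ILIKE match on title is illustrative):

```sql
-- "How much is the Blue Hoodie, and is it in stock?"
SELECT p.title, v.title AS variant, v.price, v.inventory_quantity
FROM products p
JOIN product_variants v ON v.product_id = p.id
WHERE p.shop_id = $1                 -- tenant boundary first
  AND p.status = 'active'
  AND p.title ILIKE '%blue hoodie%';
```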

4. KNOWLEDGE SOURCES

Tracks where knowledge came from.

knowledge_sources
-----------------
id (uuid, pk)
shop_id (fk -> shops.id)
type (product, faq, policy, page, external, openai_search)
name
source_url
created_at

Examples:

  • product
  • faq
  • shipping_policy
  • return_policy
  • openai_search

5. KNOWLEDGE DOCUMENTS

Raw documents before chunking.

knowledge_documents
-------------------
id (uuid, pk)
shop_id (fk -> shops.id)
source_id (fk -> knowledge_sources.id)
external_ref_id (nullable)
title
raw_text
hash (varchar)
status (active, archived)
created_at
updated_at

Why:

  • Hash prevents duplicate ingestion
  • raw_text is never queried directly by LLM
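A sketch of hash-based dedupe on ingestion. It assumes a unique constraint on (shop_id, hash), which is not listed above, and uses md5 purely as an example hash:

```sql
-- One-time: enforce uniqueness so re-ingesting identical content is a no-op.
-- ALTER TABLE knowledge_documents
--   ADD CONSTRAINT knowledge_documents_shop_hash_key UNIQUE (shop_id, hash);

INSERT INTO knowledge_documents (shop_id, source_id, title, raw_text, hash, status)
VALUES ($1, $2, $3, $4, md5($4), 'active')   -- hash derived from the raw text
ON CONFLICT (shop_id, hash) DO NOTHING;      -- unchanged content is skipped
```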

6. KNOWLEDGE CHUNKS

Every retrievable unit lives here.

knowledge_chunks
----------------
id (uuid, pk)
shop_id (fk -> shops.id)
document_id (fk -> knowledge_documents.id)
product_id (nullable fk -> products.id)
chunk_text
chunk_index
token_count
created_at

Rules:

  • 300 to 700 tokens per chunk
  • One chunk = one embedding

7. EMBEDDINGS

If using pgvector:

embeddings
----------
id (uuid, pk)
chunk_id (fk -> knowledge_chunks.id)
embedding vector(1536) -- must have dimensions; 1536 = openai/text-embedding-3-small
model (text)
created_at

Important: The embedding column must be vector(1536) (or your model’s dimension), not untyped vector. An untyped vector has no dimensions and will cause “column does not have dimensions” when creating the ivfflat index.

Indexes:

  • Use ivfflat for similarity search with the vector_cosine_ops operator class. Do not use btree with vector_cosine_ops (invalid).
CREATE INDEX "embeddings_embedding_idx" ON "embeddings"
USING ivfflat ("embedding" vector_cosine_ops) WITH (lists = 100);

Adjust lists as needed (a common rule of thumb is rows/1000 for larger datasets). Create this index only after ingesting embeddings: ivfflat builds its cluster centers from the rows present at creation time, so an index built on an empty or nearly empty table clusters poorly (some pgvector versions warn about this).
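At query time, pgvector's `<=>` operator returns cosine distance (1 - cosine similarity), so the nearest neighbours sort ascending. A minimal search sketch:

```sql
-- Recall/speed trade-off: ivfflat probes this many lists per query (default 1).
SET ivfflat.probes = 10;

-- $1 = the query embedding, cast to the vector type.
SELECT chunk_id,
       1 - (embedding <=> $1::vector) AS similarity
FROM embeddings
ORDER BY embedding <=> $1::vector    -- ascending distance = most similar first
LIMIT 8;
```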

If using Pinecone or Weaviate:

  • Store chunk_id and metadata there
  • Keep relational DB as source of truth

8. CHAT SESSIONS

Tracks user conversations.

chat_sessions
-------------
id (uuid, pk)
shop_id (fk -> shops.id)
visitor_id (varchar)
started_at
ended_at

9. CHAT MESSAGES

Stores each turn in a conversation. For assistant messages, also store usage and billing so each shop owner can see what the customer asked, what the AI replied, what it cost, and what they were charged.

chat_messages
-------------
id (uuid, pk)
session_id (fk -> chat_sessions.id)
role (user, assistant, system)
message_text
created_at
-- Usage & billing (populated for role = assistant only):
model_used (varchar, nullable) -- e.g. openai/gpt-4o-mini, anthropic/claude-3-haiku
openrouter_cost (decimal, nullable) -- actual cost from OpenRouter for this reply
charged_cost (decimal, nullable) -- amount charged to the client (after markup)
markup_factor (decimal, nullable) -- multiplier applied (e.g. 1.0 = 100%, 1.5 = 150%, 2.0 = 200%)

Why:

  • message_text: For user = what customer asked; for assistant = what AI replied. Shop owner sees full Q&A.
  • model_used: Which model generated this reply (for transparency and support).
  • openrouter_cost / charged_cost / markup_factor: Actual cost from OpenRouter; the factor you apply per plan (e.g. 50%, 100%, 200%); and the resulting charge to the client. Enables per-reply billing and dashboards.
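The arithmetic is simply charged_cost = openrouter_cost * markup_factor; e.g. a $0.0012 reply at a 1.5 factor is billed at $0.0018. A sketch of backfilling the derived column:

```sql
UPDATE chat_messages
SET charged_cost = openrouter_cost * markup_factor
WHERE role = 'assistant'
  AND charged_cost IS NULL
  AND openrouter_cost IS NOT NULL;
```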

10. RETRIEVED CHUNKS

This is gold for debugging hallucinations.

retrieved_chunks
----------------
id (uuid, pk)
chat_message_id (fk -> chat_messages.id)
chunk_id (fk -> knowledge_chunks.id)
similarity_score
rank

Why:

  • Lets you see exactly which chunks influenced answers
  • Required for enterprise trust
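Tracing one answer back to its evidence is then a two-table join (a sketch):

```sql
-- "Which chunks fed this reply, and how strongly did each match?"
SELECT rc.rank,
       rc.similarity_score,
       left(kc.chunk_text, 120) AS preview
FROM retrieved_chunks rc
JOIN knowledge_chunks kc ON kc.id = rc.chunk_id
WHERE rc.chat_message_id = $1
ORDER BY rc.rank;
```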

11. INGESTION JOBS (OPTIONAL BUT VERY USEFUL)

ingestion_jobs
--------------
id (uuid, pk)
shop_id (fk -> shops.id)
source_type
status (queued, processing, completed, failed)
total_documents
processed_documents
error_message
created_at
updated_at

12. OPENROUTER CALLS

Log every OpenRouter API call (embeddings and chat) for attribution, billing, and dashboards. For chat replies, link to the assistant message via chat_message_id.

openrouter_calls
----------------
id (uuid, pk)
shop_id (fk -> shops.id)
chat_message_id (uuid, nullable, fk -> chat_messages.id) -- set when endpoint = chat (links to assistant reply)
endpoint (varchar) -- embeddings | chat
model (varchar, nullable) -- e.g. openai/text-embedding-3-small, anthropic/claude-3-haiku
openrouter_cost (decimal) -- actual cost from OpenRouter
charged_cost (decimal) -- amount charged to client (after markup_factor)
markup_factor (decimal) -- e.g. 1.0 = 100%, 1.5 = 150%, 2.0 = 200%
tokens (int, nullable)
purpose (varchar, nullable) -- Product Knowledge, Store Knowledge, Customer Chat, etc.
created_at

Why:

  • Per-call log: Every embeddings and chat call is recorded; filter by shop_id for per-client attribution.
  • openrouter_cost / charged_cost / markup_factor: Same as on chat_messages; store here for all calls (embeddings don’t have a chat_message, but you may still attribute cost to the shop).
  • chat_message_id: When endpoint = chat, link to the assistant message so the shop owner can see “this call = this reply” and join with chat_messages for “what was asked, what was replied, cost, model.”
  • purpose: Lets merchants see what the call was used for in dashboards — e.g. Product Knowledge (product sync & ingestion embeddings), Store Knowledge (policies/pages ingestion), or Customer Chat (live customer messages).
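A dashboard query over this table might look like (a sketch; covers the current calendar month):

```sql
SELECT purpose,
       COUNT(*)             AS calls,
       SUM(openrouter_cost) AS provider_cost,
       SUM(charged_cost)    AS billed
FROM openrouter_calls
WHERE shop_id = $1
  AND created_at >= date_trunc('month', now())
GROUP BY purpose
ORDER BY billed DESC;
```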

RETRIEVAL FLOW

  1. User asks question
  2. Question converted to embedding
  3. Vector search on embeddings
  4. Top N chunk_ids returned
  5. Join with knowledge_chunks
  6. Filter by shop_id
  7. Inject into prompt
  8. Generate response
  9. Save retrieved_chunks for traceability
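Steps 3 to 6 collapse into one query (a sketch; $1 = question embedding, $2 = shop_id). Note that the WHERE filter runs after the ivfflat scan, so over-fetch (raise LIMIT or ivfflat.probes) if a tenant has few rows:

```sql
SELECT kc.id AS chunk_id,
       kc.chunk_text,
       1 - (e.embedding <=> $1::vector) AS similarity
FROM embeddings e
JOIN knowledge_chunks kc ON kc.id = e.chunk_id   -- step 5: join chunks
WHERE kc.shop_id = $2                            -- step 6: tenant isolation
ORDER BY e.embedding <=> $1::vector              -- step 3: cosine distance
LIMIT 5;                                         -- step 4: top N
```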

WHY THIS DESIGN WORKS

  • Tenant isolation by design
  • Partial re-embedding on product updates
  • Supports multiple knowledge sources
  • Debuggable and explainable AI
  • Scales from 100 products to 1M+
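Partial re-embedding in practice: when a product changes, only its chunks and their embeddings are dropped and rebuilt (a sketch; assumes no ON DELETE CASCADE from chunks to embeddings):

```sql
DELETE FROM embeddings
WHERE chunk_id IN (SELECT id FROM knowledge_chunks WHERE product_id = $1);

DELETE FROM knowledge_chunks
WHERE product_id = $1;
-- ...then re-chunk the updated product and insert fresh embeddings.
```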

This is the same pattern used by serious AI SaaS products.


Your objective: Get product details, website documents, and data from a ChatGPT-style model for the website (extra context).

Fit:

  • Product details: shops → products → product_variants give you the Shopify catalog (title, description, price, etc.) and a clear tenant boundary. Product content flows into knowledge_documents / knowledge_chunks / embeddings for RAG.
  • Website documents: knowledge_sources with types like faq, policy, page plus knowledge_documents and knowledge_chunks cover FAQs, policies, and any other site content you ingest.
  • Data from ChatGPT model for the website: Use a knowledge_source with type external or openai_search. Ingest the model’s output (e.g. enriched Q&A or summaries) as knowledge_documents; chunk and embed them like other docs. The schema is source-aware, so you can mix products, website docs, and LLM-derived content in one RAG index per shop.

Verdict: The design is correct for your goal. Multi-tenant, source-aware, chunk-based, and auditable (e.g. retrieved_chunks) all support a production chatbot. Proceed with this schema.


The following Prisma models implement the RAG schema above. The embedding column (pgvector) is added in a separate migration step via raw SQL because Prisma has no native vector type.

  • Prisma schema: appifire-ai-chat/prisma/schema.prisma — includes all RAG models (Shop, Product, KnowledgeChunk, Embedding, etc.) plus the template’s Session model.
  • Migration: appifire-ai-chat/prisma/migrations/20250224000000_add_rag_schema/migration.sql — enables pgvector, creates all RAG tables, and adds the embedding vector(1536) column. The ivfflat index is not created in this migration (it requires rows); create it after ingesting embeddings (see below).
  • Native UUID + indexes: appifire-ai-chat/prisma/migrations/20260224110000_native_uuid_and_indexes/migration.sql — converts all RAG table id and foreign key columns to PostgreSQL native uuid and adds indexes on FKs and key lookups (products.shop_id, products.shopify_product_id, knowledge_chunks.shop_id, chat_messages.session_id, openrouter_calls.shop_id, openrouter_calls.created_at, etc.).

Apply the migration when your DB is reachable:

cd appifire-ai-chat && npx prisma migrate deploy

(Or npx prisma migrate dev for development.)

Vector index (run after you have ingested some embeddings):
ivfflat builds its cluster centers from existing rows, so an index created on an empty table clusters poorly. After loading embeddings, run:

CREATE INDEX "embeddings_embedding_idx" ON "embeddings" USING ivfflat ("embedding" vector_cosine_ops) WITH (lists = 100);

Adjust lists (e.g. rows/1000) as your data grows.

Deploying via SQL dump (e.g. Dokploy, any Postgres):

  1. Target Postgres must have pgvector. Use an image that includes it (e.g. pgvector/pgvector:pg15) or install the extension on the server.
  2. Create the extension first (before any table that uses vector):
    CREATE EXTENSION IF NOT EXISTS vector WITH SCHEMA public;
  3. Embedding column: Define as vector(1536) (fixed dimensions). Untyped vector causes “column does not have dimensions” when creating the ivfflat index.
  4. Vector index: Use USING ivfflat ("embedding" vector_cosine_ops) WITH (lists = 100). Do not use btree with vector_cosine_ops (operator class does not exist for btree).
  5. A ready-made dump that follows these rules is in databases/appifire-ai-chat-*.sql (run it against an empty DB with pgvector enabled).