
Option-A-Phase-01-Design database schema for RAG

Below is a production-ready RAG database schema designed specifically for a multi-tenant Shopify AI chatbot that ingests products, policies, FAQs, external sources, and OpenAI-search-derived knowledge.


  • RAG tables: All id and foreign key columns use native PostgreSQL uuid (not TEXT). Default for new rows: gen_random_uuid(). This gives efficient storage, indexing, and type safety. The Session table (Shopify template) keeps TEXT id for compatibility.
  • Indexes: Foreign key columns and key lookup columns have indexes for join and filter performance (e.g. shop_id, session_id, shopify_product_id). See migration 20260224110000_native_uuid_and_indexes.
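As a sketch, the uuid-plus-index convention looks like this (illustrative DDL; the column list is abbreviated, but the names match the schema below):

```sql
-- gen_random_uuid() is built into PostgreSQL 13+; on older versions,
-- CREATE EXTENSION pgcrypto provides it.
CREATE TABLE products (
    id                 uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    shop_id            uuid NOT NULL REFERENCES shops(id),
    shopify_product_id varchar NOT NULL,
    title              text,
    created_at         timestamptz NOT NULL DEFAULT now()
);

-- PostgreSQL does not index FK columns automatically; add them explicitly:
CREATE INDEX products_shop_id_idx            ON products (shop_id);
CREATE INDEX products_shopify_product_id_idx ON products (shopify_product_id);
```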

DESIGN PRINCIPLES

  1. Multi-tenant first

    Every row belongs to a Shopify store.

  2. Source-aware knowledge

    We must know where each piece of knowledge came from.

  3. Chunk-based embeddings

    Never embed full documents or products.

  4. Re-embeddable

    Product updates should only re-embed affected chunks.

  5. Auditable

    You must be able to explain why an answer was given.


Shop
├── Product
│   └── KnowledgeChunk
│       └── Embedding
├── KnowledgeSource
│   └── KnowledgeDocument
│       └── KnowledgeChunk
│           └── Embedding
└── ChatSession
    └── ChatMessage
        └── RetrievedChunk

1. SHOPS

shops
-----
id (uuid, pk)
shopify_shop_id (varchar, unique)
shop_domain (varchar)
access_token (encrypted)
plan (varchar)
status (active, paused, uninstalled)
created_at
updated_at
... (Phase 4: widget_title, welcome_message, brand_color, bubble_position, show_reply_count,
widget_enabled, widget_show_on_all_pages; Phase 4/5: daily_spend_limit_cents, etc.)

Why:

  • One row per Shopify store
  • Hard tenant boundary

2. PRODUCTS (SOURCE OF TRUTH FROM SHOPIFY)

products
--------
id (uuid, pk)
shop_id (fk -> shops.id)
shopify_product_id (varchar)
title
description
product_type
vendor
tags (jsonb)
status (active, draft, archived)
last_synced_at
created_at
updated_at

Notes:

  • Do NOT store variants here
  • Store normalized product data only

3. PRODUCT VARIANTS (OPTIONAL BUT RECOMMENDED)

product_variants
----------------
id (uuid, pk)
product_id (fk -> products.id)
shopify_variant_id (varchar)
title
price
sku
inventory_quantity
created_at
updated_at

Why:

  • Enables precise answers like price or availability
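For example, a price/stock lookup stays a plain relational query (a sketch; the ILIKE match on title is illustrative):

```sql
-- "How much is the Blue Hoodie, and is it in stock?"
SELECT p.title, v.title AS variant, v.price, v.inventory_quantity
FROM products p
JOIN product_variants v ON v.product_id = p.id
WHERE p.shop_id = $1                 -- tenant boundary first
  AND p.status = 'active'
  AND p.title ILIKE '%blue hoodie%';
```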

4. KNOWLEDGE SOURCES

Tracks where knowledge came from.

knowledge_sources
-----------------
id (uuid, pk)
shop_id (fk -> shops.id)
type (product, faq, policy, page, external, openai_search)
name
source_url
created_at

Examples:

  • product
  • faq
  • shipping_policy
  • return_policy
  • openai_search

5. KNOWLEDGE DOCUMENTS

Raw documents before chunking.

knowledge_documents
-------------------
id (uuid, pk)
shop_id (fk -> shops.id)
source_id (fk -> knowledge_sources.id)
external_ref_id (nullable)
title
raw_text
hash (varchar)
status (active, archived)
created_at
updated_at

Why:

  • Hash prevents duplicate ingestion
  • raw_text is never queried directly by LLM
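A sketch of hash-based dedupe on ingestion. It assumes a unique constraint on (shop_id, hash), which is not listed above, and uses md5 purely as an example hash:

```sql
-- One-time: enforce uniqueness so re-ingesting identical content is a no-op.
-- ALTER TABLE knowledge_documents
--   ADD CONSTRAINT knowledge_documents_shop_hash_key UNIQUE (shop_id, hash);

INSERT INTO knowledge_documents (shop_id, source_id, title, raw_text, hash, status)
VALUES ($1, $2, $3, $4, md5($4), 'active')   -- hash derived from the raw text
ON CONFLICT (shop_id, hash) DO NOTHING;      -- unchanged content is skipped
```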

6. KNOWLEDGE CHUNKS

Every retrievable unit lives here.

knowledge_chunks
----------------
id (uuid, pk)
shop_id (fk -> shops.id)
document_id (fk -> knowledge_documents.id)
product_id (nullable fk -> products.id)
chunk_text
chunk_index
token_count
created_at

Rules:

  • 300 to 700 tokens per chunk
  • One chunk = one embedding

7. EMBEDDINGS

If using pgvector:

embeddings
----------
id (uuid, pk)
chunk_id (fk -> knowledge_chunks.id)
embedding vector(1536) -- must have dimensions; 1536 = openai/text-embedding-3-small
model (text)
created_at

Important: The embedding column must be vector(1536) (or your model’s dimension), not untyped vector. An untyped vector has no dimensions and will cause “column does not have dimensions” when creating the ivfflat index.

Indexes:

  • Use ivfflat for similarity search with the vector_cosine_ops operator class. Do not use btree with vector_cosine_ops (invalid).
CREATE INDEX "embeddings_embedding_idx" ON "embeddings"
USING ivfflat ("embedding" vector_cosine_ops) WITH (lists = 100);

Adjust lists as needed (a common rule of thumb is rows/1000 for larger datasets). Create this index only after ingesting embeddings: ivfflat builds its cluster centers from the rows present at creation time, so an index built on an empty or nearly empty table clusters poorly (some pgvector versions warn about this).
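At query time, pgvector's `<=>` operator returns cosine distance (1 - cosine similarity), so the nearest neighbours sort ascending. A minimal search sketch:

```sql
-- Recall/speed trade-off: ivfflat probes this many lists per query (default 1).
SET ivfflat.probes = 10;

-- $1 = the query embedding, cast to the vector type.
SELECT chunk_id,
       1 - (embedding <=> $1::vector) AS similarity
FROM embeddings
ORDER BY embedding <=> $1::vector    -- ascending distance = most similar first
LIMIT 8;
```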

If using Pinecone or Weaviate:

  • Store chunk_id and metadata there
  • Keep relational DB as source of truth

8. CHAT SESSIONS

Tracks user conversations.

chat_sessions
-------------
id (uuid, pk)
shop_id (fk -> shops.id)
visitor_id (varchar)
started_at
ended_at

9. CHAT MESSAGES

Stores each turn in a conversation. For assistant messages, also store usage and billing so each shop owner can see what the customer asked, what the AI replied, what it cost, and what they were charged.

chat_messages
-------------
id (uuid, pk)
session_id (fk -> chat_sessions.id)
role (user, assistant, system)
message_text
created_at
-- Usage & billing (populated for role = assistant only):
model_used (varchar, nullable) -- e.g. openai/gpt-4o-mini, anthropic/claude-3-haiku
openrouter_cost (decimal, nullable) -- actual cost from OpenRouter for this reply
charged_cost (decimal, nullable) -- amount charged to the client (after markup)
markup_factor (decimal, nullable) -- multiplier applied (e.g. 1.0 = 100%, 1.5 = 150%, 2.0 = 200%)

Why:

  • message_text: For user = what customer asked; for assistant = what AI replied. Shop owner sees full Q&A.
  • model_used: Which model generated this reply (for transparency and support).
  • openrouter_cost / charged_cost / markup_factor: Actual cost from OpenRouter; the factor you apply per plan (e.g. 50%, 100%, 200%); and the resulting charge to the client. Enables per-reply billing and dashboards.
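The arithmetic is simply charged_cost = openrouter_cost * markup_factor; e.g. a $0.0012 reply at a 1.5 factor is billed at $0.0018. A sketch of backfilling the derived column:

```sql
UPDATE chat_messages
SET charged_cost = openrouter_cost * markup_factor
WHERE role = 'assistant'
  AND charged_cost IS NULL
  AND openrouter_cost IS NOT NULL;
```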

10. RETRIEVED CHUNKS

This is gold for debugging hallucinations.

retrieved_chunks
----------------
id (uuid, pk)
chat_message_id (fk -> chat_messages.id)
chunk_id (fk -> knowledge_chunks.id)
similarity_score
rank

Why:

  • Lets you see exactly which chunks influenced answers
  • Required for enterprise trust
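Tracing one answer back to its evidence is then a two-table join (a sketch):

```sql
-- "Which chunks fed this reply, and how strongly did each match?"
SELECT rc.rank,
       rc.similarity_score,
       left(kc.chunk_text, 120) AS preview
FROM retrieved_chunks rc
JOIN knowledge_chunks kc ON kc.id = rc.chunk_id
WHERE rc.chat_message_id = $1
ORDER BY rc.rank;
```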

11. INGESTION JOBS (OPTIONAL BUT VERY USEFUL)

ingestion_jobs
--------------
id (uuid, pk)
shop_id (fk -> shops.id)
source_type
status (queued, processing, completed, failed)
total_documents
processed_documents
error_message
created_at
updated_at

12. OPENROUTER CALLS

Log every OpenRouter API call (embeddings and chat) for attribution, billing, and dashboards. For chat replies, link to the assistant message via chat_message_id.

openrouter_calls
----------------
id (uuid, pk)
shop_id (fk -> shops.id)
chat_message_id (uuid, nullable, fk -> chat_messages.id) -- set when endpoint = chat (links to assistant reply)
endpoint (varchar) -- embeddings | chat
model (varchar, nullable) -- e.g. openai/text-embedding-3-small, anthropic/claude-3-haiku
openrouter_cost (decimal) -- actual cost from OpenRouter
charged_cost (decimal) -- amount charged to client (after markup_factor)
markup_factor (decimal) -- e.g. 1.0 = 100%, 1.5 = 150%, 2.0 = 200%
tokens (int, nullable)
purpose (varchar, nullable) -- Product Knowledge, Store Knowledge, Customer Chat, etc.
created_at

Why:

  • Per-call log: Every embeddings and chat call is recorded; filter by shop_id for per-client attribution.
  • openrouter_cost / charged_cost / markup_factor: Same as on chat_messages; store here for all calls (embeddings don’t have a chat_message, but you may still attribute cost to the shop).
  • chat_message_id: When endpoint = chat, link to the assistant message so the shop owner can see “this call = this reply” and join with chat_messages for “what was asked, what was replied, cost, model.”
  • purpose: Lets merchants see what the call was used for in dashboards — e.g. Product Knowledge (product sync & ingestion embeddings), Store Knowledge (policies/pages ingestion), or Customer Chat (live customer messages).
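A dashboard query over this table might look like (a sketch; covers the current calendar month):

```sql
SELECT purpose,
       COUNT(*)             AS calls,
       SUM(openrouter_cost) AS provider_cost,
       SUM(charged_cost)    AS billed
FROM openrouter_calls
WHERE shop_id = $1
  AND created_at >= date_trunc('month', now())
GROUP BY purpose
ORDER BY billed DESC;
```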

RETRIEVAL FLOW

  1. User asks question
  2. Question converted to embedding
  3. Vector search on embeddings
  4. Top N chunk_ids returned
  5. Join with knowledge_chunks
  6. Filter by shop_id
  7. Inject into prompt
  8. Generate response
  9. Save retrieved_chunks for traceability
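Steps 3 to 6 collapse into one query (a sketch; $1 = question embedding, $2 = shop_id). Note that the WHERE filter runs after the ivfflat scan, so over-fetch (raise LIMIT or ivfflat.probes) if a tenant has few rows:

```sql
SELECT kc.id AS chunk_id,
       kc.chunk_text,
       1 - (e.embedding <=> $1::vector) AS similarity
FROM embeddings e
JOIN knowledge_chunks kc ON kc.id = e.chunk_id   -- step 5: join chunks
WHERE kc.shop_id = $2                            -- step 6: tenant isolation
ORDER BY e.embedding <=> $1::vector              -- step 3: cosine distance
LIMIT 5;                                         -- step 4: top N
```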

WHY THIS DESIGN WORKS

  • Tenant isolation by design
  • Partial re-embedding on product updates
  • Supports multiple knowledge sources
  • Debuggable and explainable AI
  • Scales from 100 products to 1M+
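Partial re-embedding in practice: when a product changes, only its chunks and their embeddings are dropped and rebuilt (a sketch; assumes no ON DELETE CASCADE from chunks to embeddings):

```sql
DELETE FROM embeddings
WHERE chunk_id IN (SELECT id FROM knowledge_chunks WHERE product_id = $1);

DELETE FROM knowledge_chunks
WHERE product_id = $1;
-- ...then re-chunk the updated product and insert fresh embeddings.
```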

This is the same pattern used by serious AI SaaS products.


Your objective: Get product details, website documents, and data from a ChatGPT-style model for the website (extra context).

Fit:

  • Product details: shops → products → product_variants give you the Shopify catalog (title, description, price, etc.) and a clear tenant boundary. Product content flows into knowledge_documents / knowledge_chunks / embeddings for RAG.
  • Website documents: knowledge_sources with types like faq, policy, page plus knowledge_documents and knowledge_chunks cover FAQs, policies, and any other site content you ingest.
  • Data from ChatGPT model for the website: Use a knowledge_source with type external or openai_search. Ingest the model’s output (e.g. enriched Q&A or summaries) as knowledge_documents; chunk and embed them like other docs. The schema is source-aware, so you can mix products, website docs, and LLM-derived content in one RAG index per shop.

Verdict: The design is correct for your goal. Multi-tenant, source-aware, chunk-based, and auditable (e.g. retrieved_chunks) all support a production chatbot. Proceed with this schema.


The following Prisma models implement the RAG schema above. The embedding column (pgvector) is added in a separate migration step via raw SQL because Prisma has no native vector type.

  • Prisma schema: appifire-ai-chat/prisma/schema.prisma — includes all RAG models (Shop, Product, KnowledgeChunk, Embedding, etc.) plus the template’s Session model.
  • Migration: appifire-ai-chat/prisma/migrations/20250224000000_add_rag_schema/migration.sql — enables pgvector, creates all RAG tables, and adds the embedding vector(1536) column. The ivfflat index is not created in this migration (it requires rows); create it after ingesting embeddings (see below).
  • Native UUID + indexes: appifire-ai-chat/prisma/migrations/20260224110000_native_uuid_and_indexes/migration.sql — converts all RAG table id and foreign key columns to PostgreSQL native uuid and adds indexes on FKs and key lookups (products.shop_id, products.shopify_product_id, knowledge_chunks.shop_id, chat_messages.session_id, openrouter_calls.shop_id, openrouter_calls.created_at, etc.).

Apply the migration when your DB is reachable:

cd appifire-ai-chat && npx prisma migrate deploy

(Or npx prisma migrate dev for development.)

Vector index (run after you have ingested some embeddings):
ivfflat builds its cluster centers from existing rows, so an index created on an empty table clusters poorly. After loading embeddings, run:

CREATE INDEX "embeddings_embedding_idx" ON "embeddings" USING ivfflat ("embedding" vector_cosine_ops) WITH (lists = 100);

Adjust lists (e.g. rows/1000) as your data grows.

Deploying via SQL dump (e.g. Dokploy, any Postgres):

  1. Target Postgres must have pgvector. Use an image that includes it (e.g. pgvector/pgvector:pg15) or install the extension on the server.
  2. Create the extension first (before any table that uses vector):
    CREATE EXTENSION IF NOT EXISTS vector WITH SCHEMA public;
  3. Embedding column: Define as vector(1536) (fixed dimensions). Untyped vector causes “column does not have dimensions” when creating the ivfflat index.
  4. Vector index: Use USING ivfflat ("embedding" vector_cosine_ops) WITH (lists = 100). Do not use btree with vector_cosine_ops (operator class does not exist for btree).
  5. A ready-made dump that follows these rules is in databases/appifire-ai-chat-*.sql (run it against an empty DB with pgvector enabled).