Codapult
The SaaS boilerplate for makers

© 2026 Codapult. All rights reserved.


AI Features

Streaming AI chat, RAG pipeline, tool use, and embedding adapters.

Codapult ships with a production-ready AI layer built on the Vercel AI SDK with support for OpenAI and Anthropic models, streaming responses, tool use, organization quotas, conversation memory, and a full RAG pipeline.

Architecture

src/lib/ai/
├── models.ts         # Client-safe model options (id, label, provider)
├── providers.ts      # getModel() — resolves modelId → LanguageModel
├── embeddings.ts     # Embedding adapter (OpenAI / Ollama)
├── vector-store.ts   # Vector store adapter (SQLite / memory)
├── rag.ts            # RAG pipeline (index → chunk → embed → store → retrieve)
├── conversations.ts  # Conversation/message CRUD
└── chunker.ts        # Text chunking with overlap
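The chunking step can be sketched as a standalone function (hypothetical signature; the real chunker.ts may split on word boundaries or handle edge cases differently):

```typescript
// Fixed-size chunking with overlap, mirroring the defaults the RAG
// pipeline uses (800-character chunks with 150 characters of overlap).
function chunkText(text: string, size = 800, overlap = 150): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const step = size - overlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either side.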

Chat Endpoint

POST /api/chat accepts a JSON body with a messages array and an optional model selector:

{
  "messages": [{ "role": "user", "content": "How do I deploy?" }],
  "modelId": "gpt-4o-mini"
}

The endpoint follows the standard API route pattern: auth check → rate limiting (30 requests per 60 seconds per user) → org quota check → Zod validation → RAG context injection → streaming response.
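From the client, a request can be issued with plain fetch. The helper below is illustrative only — the endpoint and body shape come from the contract above, but buildChatRequest is a hypothetical name, not part of Codapult:

```typescript
type ChatMessage = { role: 'user' | 'assistant' | 'system'; content: string };

// Build the fetch options for POST /api/chat; modelId is optional and
// falls back to the server-side default model when omitted.
function buildChatRequest(messages: ChatMessage[], modelId?: string) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(modelId ? { messages, modelId } : { messages }),
  };
}

// Usage sketch (assumes a streaming ReadableStream response body):
// const res = await fetch('/api/chat', buildChatRequest(
//   [{ role: 'user', content: 'How do I deploy?' }], 'gpt-4o-mini'));
// const reader = res.body!.getReader();
// ...read and decode chunks as they arrive
```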

Available Models

Model ID                  Label            Provider
gpt-4o-mini               GPT-4o Mini      OpenAI
gpt-4o                    GPT-4o           OpenAI
claude-sonnet-4-20250514  Claude Sonnet 4  Anthropic
claude-haiku-4-20250514   Claude Haiku 4   Anthropic

Models are defined in src/lib/ai/models.ts. To add a new model, add an entry there and — if it's a new provider — add a case in src/lib/ai/providers.ts.
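A sketch of what such an entry and lookup might look like (hypothetical shapes; check the actual types in src/lib/ai/models.ts):

```typescript
type ModelOption = { id: string; label: string; provider: 'openai' | 'anthropic' };

// The four models from the table above.
const models: ModelOption[] = [
  { id: 'gpt-4o-mini', label: 'GPT-4o Mini', provider: 'openai' },
  { id: 'gpt-4o', label: 'GPT-4o', provider: 'openai' },
  { id: 'claude-sonnet-4-20250514', label: 'Claude Sonnet 4', provider: 'anthropic' },
  { id: 'claude-haiku-4-20250514', label: 'Claude Haiku 4', provider: 'anthropic' },
];

// Resolve a requested id, falling back to the configured default so an
// unknown or missing id never reaches the provider layer.
function resolveModel(modelId: string | undefined, defaultModel = 'gpt-4o-mini'): ModelOption {
  return models.find((m) => m.id === modelId)
    ?? models.find((m) => m.id === defaultModel)!;
}
```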

Configuration

All AI settings live in src/config/app.ts under appConfig.ai:

ai: {
  defaultModel: 'gpt-4o-mini',
  systemPrompt: 'You are a helpful AI assistant. Be concise, accurate, and helpful.',
  ragEnabled: true,
  ragMaxChunks: 3,
  ragMinScore: 0.4,
  allowedModels: [],   // empty = all models from models.ts
}

Setting        Description
defaultModel   Model used when the user doesn't pick one (must match an ID in models.ts)
systemPrompt   Prepended to every conversation
ragEnabled     Toggle RAG context injection in chat
ragMaxChunks   Maximum number of knowledge base chunks injected into the prompt
ragMinScore    Minimum cosine similarity score (0–1) for RAG results
allowedModels  Restrict the model selector; an empty array enables all models

Tool Use

Chat supports function calling via the Vercel AI SDK. Tools are defined in /api/chat/route.ts. The example below is illustrative — replace it with your own domain-specific tools:

import { z } from 'zod';
import type { Tool } from 'ai';

const chatTools: Record<string, Tool> = {
  lookupOrder: {
    description: 'Look up an order by ID',
    parameters: z.object({ orderId: z.string() }),
    execute: async ({ orderId }) => {
      // Replace with your own business logic
      const order = await db.select().from(orders).where(eq(orders.id, orderId)).limit(1);
      return order[0] ?? { error: 'Not found' };
    },
  },
};

Multi-step tool invocations are enabled with maxSteps: 3. To add a new tool, define it in chatTools with a Zod parameters schema and an execute function.

Organization Quotas

AI usage is tracked per organization. Each plan defines a monthly credit allowance for the aiChat resource. The quota is checked before every chat request via checkOrgQuota(). Credits reset monthly via a background cron job.
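The quota gate itself reduces to a simple comparison. A hedged sketch (checkOrgQuota() is named above, but its exact signature is not documented here; hasQuotaRemaining and getOrgUsage are hypothetical):

```typescript
type OrgQuota = { used: number; limit: number };

// True when the organization can still spend `cost` credits this month.
function hasQuotaRemaining(quota: OrgQuota, cost = 1): boolean {
  return quota.used + cost <= quota.limit;
}

// Usage sketch inside the chat route (status code illustrative):
// if (!hasQuotaRemaining(await getOrgUsage(orgId))) {
//   return new Response('AI quota exceeded', { status: 429 });
// }
```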

Chat Memory

Conversation history is persisted in the database via src/lib/ai/conversations.ts:

Endpoint                      Method  Description
/api/chat/conversations       GET     List user conversations
/api/chat/conversations       POST    Create a new conversation
/api/chat/conversations/[id]  GET     Get a conversation with messages
/api/chat/conversations/[id]  DELETE  Delete a conversation

The Chat UI component (src/components/ai/ChatUI) connects to these endpoints and renders a full chat interface with model selection, conversation switching, and streaming responses.

RAG Pipeline

The RAG (Retrieval-Augmented Generation) pipeline lets the AI chat reference your domain-specific content — blog posts, help docs, feature requests, or any custom text.

How It Works

  1. Index — content is chunked (800 chars, 150 overlap), embedded, and stored in the vector store
  2. Retrieve — user queries are embedded and matched against stored vectors by cosine similarity
  3. Augment — matching chunks are injected into the system prompt with source citations
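The retrieve step is cosine similarity over stored embeddings, thresholded by ragMinScore and capped at ragMaxChunks. A minimal in-memory sketch (the real vector store adapter adds persistence and sourceType filtering):

```typescript
// Cosine similarity of two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Score every stored chunk against the query embedding, drop weak
// matches, and keep the top maxChunks results.
function retrieve(
  query: number[],
  store: { text: string; embedding: number[] }[],
  maxChunks = 3,
  minScore = 0.4,
): { text: string; score: number }[] {
  return store
    .map((e) => ({ text: e.text, score: cosineSimilarity(query, e.embedding) }))
    .filter((r) => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, maxChunks);
}
```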

Indexing Content

Use the indexDocument function or the admin API:

import { indexDocument } from '@/lib/ai/rag';

await indexDocument({
  sourceType: 'help',
  sourceId: 'getting-started',
  title: 'Getting Started Guide',
  content: markdownContent,
});

For large content, use the rag-index background job:

import { enqueue } from '@/lib/jobs';

await enqueue('rag-index', {
  sourceType: 'blog',
  sourceId: 'post-123',
  title: 'My Blog Post',
  content: markdownContent,
});

Admin Indexing API

POST /api/ai/index is admin-only (the session must have role: "admin"). Use it to manage the knowledge base from the admin panel or via scripts. It supports three actions:

Action  Description
index   Index a document (sourceType, sourceId, title, content)
search  Search the vector store (query, optional sourceTypes, limit, minScore)
delete  Delete indexed content (sourceType, optional sourceId)
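The per-action field requirements can be sketched as a standalone checker (the actual route validates with Zod schemas; this illustrative version only mirrors the field names from the table above):

```typescript
// Return an error message for a malformed body, or null when the body
// carries the required fields for its action.
function validateIndexRequest(body: Record<string, unknown>): string | null {
  switch (body.action) {
    case 'index':
      for (const k of ['sourceType', 'sourceId', 'title', 'content'])
        if (typeof body[k] !== 'string') return `missing field: ${k}`;
      return null;
    case 'search':
      return typeof body.query === 'string' ? null : 'missing field: query';
    case 'delete':
      return typeof body.sourceType === 'string' ? null : 'missing field: sourceType';
    default:
      return 'unknown action';
  }
}
```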

Embedding Providers

Embeddings use the adapter pattern, switched via the EMBEDDING_PROVIDER env var:

Provider  Env Value         Requirements
OpenAI    openai (default)  OPENAI_API_KEY
Ollama    ollama            OLLAMA_BASE_URL, OLLAMA_EMBEDDING_MODEL

Ollama enables fully self-hosted embeddings — no external API calls. The default Ollama model is nomic-embed-text.

Vector Store

Vector storage uses the adapter pattern, switched via VECTOR_STORE_PROVIDER (accessed as env.vectorStoreProvider in server code):

Store   Env Value         Description
SQLite  sqlite (default)  Persisted in Turso alongside app data
Memory  memory            In-memory store for development/testing
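The adapter pattern here can be sketched as an interface plus a factory keyed on the provider value (hypothetical shapes; the real adapters also handle search, sourceType filtering, and Turso persistence):

```typescript
// Minimal adapter interface: both stores expose the same surface, so
// callers never branch on the backend.
interface VectorStore {
  upsert(id: string, embedding: number[], text: string): Promise<void>;
  count(): Promise<number>;
}

// In-memory implementation: a Map keyed by chunk id, so re-indexing
// the same id overwrites rather than duplicates.
class MemoryVectorStore implements VectorStore {
  private rows = new Map<string, { embedding: number[]; text: string }>();
  async upsert(id: string, embedding: number[], text: string): Promise<void> {
    this.rows.set(id, { embedding, text });
  }
  async count(): Promise<number> {
    return this.rows.size;
  }
}

// Factory switched on the provider env value.
function createVectorStore(provider: 'sqlite' | 'memory'): VectorStore {
  if (provider === 'memory') return new MemoryVectorStore();
  throw new Error('sqlite adapter omitted from this sketch');
}
```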

Source Types

Indexed content is categorized by source type:

Type             Description
blog             Blog posts
help             Help center / documentation articles
feature_request  Feature request descriptions
custom           Any custom content

Environment Variables

Variable                Default                 Description
OPENAI_API_KEY          —                       Required for OpenAI models and default embeddings
ANTHROPIC_API_KEY       —                       Required for Anthropic models
EMBEDDING_PROVIDER      openai                  Embedding backend (openai or ollama)
VECTOR_STORE_PROVIDER   sqlite                  Vector storage backend (sqlite or memory)
OLLAMA_BASE_URL         http://localhost:11434  Ollama server URL
OLLAMA_EMBEDDING_MODEL  nomic-embed-text        Ollama model name for embeddings
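For a fully self-hosted embedding setup, the relevant variables might look like this (values are illustrative placeholders):

```shell
# Self-hosted embeddings via Ollama; SQLite vector store (the defaults
# from the table above). The API key shown is a placeholder.
ANTHROPIC_API_KEY=sk-ant-placeholder
EMBEDDING_PROVIDER=ollama
VECTOR_STORE_PROVIDER=sqlite
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
```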

Removing the Module

AI Chat and the RAG Pipeline are separate removable modules. Use the setup wizard (npx @codapult/cli setup) to strip either or both. See the Modules documentation for manual removal steps.
