Knowledge Base
Build knowledge bases for agents using vector search. Upload documents, generate embeddings, and enable retrieval-augmented generation (RAG) for accurate, context-aware responses.
A knowledge base is a collection of documents that an agent can search and reference when generating responses. It enables retrieval-augmented generation (RAG) — a technique where relevant content is retrieved from the knowledge base and provided as context to the AI model before it generates a response.
Knowledge bases improve response accuracy by grounding the AI in your actual content rather than relying solely on the model's training data.
How it works
User query
→ Embed query as vector
→ Search knowledge base for similar content
→ Retrieve top matching document chunks
→ Include chunks in LLM prompt as context
→ Generate response grounded in retrieved content
- When a user sends a message, the assistant engine converts the query into a vector embedding.
- It searches the agent's knowledge base for document chunks with similar embeddings.
- The most relevant chunks are retrieved and included in the prompt sent to the LLM.
- The LLM generates a response that draws on the retrieved content.
This means the agent's answers are based on your specific documents — product documentation, FAQs, policies, or any other content you provide.
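The same flow can be sketched in code. This is a minimal illustration, not the platform's implementation: `embed`, `vectorStore`, and `llm` are placeholder interfaces standing in for whatever embedding model, vector database, and LLM client the engine actually uses.

```typescript
// Minimal RAG sketch. All names here are illustrative placeholders, not platform APIs.
interface VectorStore {
  search(vector: number[], topK: number): Promise<{ text: string }[]>;
}
interface Llm {
  generate(prompt: string): Promise<string>;
}

async function answerWithRag(
  userQuery: string,
  embed: (text: string) => Promise<number[]>,
  vectorStore: VectorStore,
  llm: Llm,
): Promise<string> {
  // 1. Convert the user's query into a vector embedding.
  const queryVector = await embed(userQuery);

  // 2. Search the knowledge base for document chunks with similar embeddings.
  const chunks = await vectorStore.search(queryVector, 5);

  // 3. Include the retrieved chunks in the prompt sent to the LLM.
  const context = chunks.map((c) => c.text).join("\n---\n");
  const prompt = `Use the context below to answer.\n\nContext:\n${context}\n\nQuestion: ${userQuery}`;

  // 4. Generate a response grounded in the retrieved content.
  return llm.generate(prompt);
}
```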
Creating a knowledge base
A knowledge base is backed by a search index. To set one up:
- Create a search index — this is the container for your documents.
- Assign it to an agent — set the agent's `knowledgeBaseId` to the search index.
- Add documents — upload content to the search index.
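As a rough walkthrough of these three steps, the sketch below assumes a REST-style API. The endpoint paths and payload shapes are assumptions for illustration; only the `knowledgeBaseId` field comes from the agent configuration itself.

```typescript
// Hypothetical setup calls; endpoint paths and payload shapes are illustrative only.
const BASE_URL = "https://api.example.com";

async function setUpKnowledgeBase(agentId: string, apiKey: string): Promise<void> {
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  };

  // 1. Create a search index: the container for your documents.
  const indexRes = await fetch(`${BASE_URL}/search-indexes`, {
    method: "POST",
    headers,
    body: JSON.stringify({ name: "product-docs" }),
  });
  const { id: searchIndexId } = await indexRes.json();

  // 2. Assign it to an agent by setting the agent's knowledgeBaseId.
  await fetch(`${BASE_URL}/agents/${agentId}`, {
    method: "PATCH",
    headers,
    body: JSON.stringify({ knowledgeBaseId: searchIndexId }),
  });

  // 3. Add documents: upload content to the search index.
  await fetch(`${BASE_URL}/search-indexes/${searchIndexId}/documents`, {
    method: "POST",
    headers,
    body: JSON.stringify({ type: "text", content: "Refunds are available within 30 days..." }),
  });
}
```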
Adding documents
The platform supports multiple document formats:
| Format | Description |
|---|---|
| PDF | Parsed and text-extracted automatically |
| Plain text | Indexed directly |
| URLs | Web pages fetched, parsed, and indexed |
When a document is added:
- The content is parsed based on its format.
- The text is split into chunks using a text splitter that preserves semantic boundaries (paragraph and sentence breaks).
- Each chunk is embedded — converted to a vector using an embedding model (1536 dimensions).
- The embeddings are stored in the vector database with organization-level isolation.
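A simplified sketch of that pipeline follows. The paragraph-based splitter and the `embedText` function are stand-ins; the platform's actual splitter and embedding model are more sophisticated, but the shape of the data is the same.

```typescript
// Simplified indexing sketch: split on paragraph breaks, embed each chunk,
// and tag every chunk with the organization it belongs to.
interface IndexedChunk {
  organizationId: string; // partition key for organization-level isolation
  text: string;
  embedding: number[];    // e.g. 1536 dimensions, matching the embedding model
}

async function indexDocument(
  organizationId: string,
  content: string,
  embedText: (text: string) => Promise<number[]>,
): Promise<IndexedChunk[]> {
  // Split on blank lines so chunks preserve paragraph boundaries.
  const chunks = content
    .split(/\n\s*\n/)
    .map((c) => c.trim())
    .filter((c) => c.length > 0);

  // Embed each chunk; the resulting records are what get stored in the vector database.
  return Promise.all(
    chunks.map(async (text) => ({
      organizationId,
      text,
      embedding: await embedText(text),
    })),
  );
}
```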
Organization isolation
Knowledge base data is isolated per organization. Each organization's documents are stored in a separate partition of the vector database, ensuring that search queries only return results from the current organization's content.
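In practice this usually means every query carries an organization filter. A sketch under that assumption, using a generic vector-store interface rather than any specific database client:

```typescript
// Hypothetical scoped search: the filter syntax depends on the vector database,
// but every query is constrained to the calling organization's partition.
interface ScopedVectorStore {
  search(opts: {
    vector: number[];
    topK: number;
    filter: Record<string, string>;
  }): Promise<{ text: string; score: number }[]>;
}

function searchWithinOrganization(
  store: ScopedVectorStore,
  organizationId: string,
  queryVector: number[],
  topK = 5,
) {
  // Results can only come from chunks indexed under this organizationId.
  return store.search({ vector: queryVector, topK, filter: { organizationId } });
}
```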
Searching the knowledge base
During assistant execution, the engine searches the knowledge base with the user's query:
- The query is embedded using the same model used for document indexing.
- A similarity search finds the closest matching document chunks.
- Results are ranked by relevance and returned to the assistant engine.
- The engine includes the retrieved chunks in the LLM prompt.
The search operates in real time with low latency — results are typically returned in under 100ms.
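Ranking is done with a vector similarity metric; cosine similarity is a common choice, and the self-contained sketch below assumes it (the platform may use a different metric, such as dot product).

```typescript
// Cosine similarity between two embedding vectors (assumed metric for illustration).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate chunks by similarity to the query embedding and keep the top k.
function rankChunks(
  queryVector: number[],
  chunks: { text: string; embedding: number[] }[],
  topK = 5,
): { text: string; score: number }[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryVector, c.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```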
Best practices
Document quality
- Use clean, well-structured content. The quality of retrieval depends on the quality of the source material.
- Keep documents focused. A document covering one topic retrieves more accurately than a document covering many.
- Update regularly. Remove outdated content and add new material to keep the knowledge base current.
Chunk size
The platform's text splitter is configured with sensible defaults. For most use cases, the default chunk size produces good retrieval results. Very short documents (under a paragraph) and very long documents (entire books) may benefit from adjustment.
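If you do need to tune chunking, the two knobs are typically chunk size and overlap. The option names and numbers below are illustrative starting points only, not the platform's actual configuration:

```typescript
// Hypothetical splitter options; names and values are illustrative, not platform defaults.
interface SplitterOptions {
  chunkSize: number;    // target characters per chunk
  chunkOverlap: number; // characters shared between adjacent chunks for continuity
}

// Rough starting points, assuming character-based chunking:
const faqStyle: SplitterOptions = { chunkSize: 500, chunkOverlap: 50 };     // short, self-contained entries
const typical: SplitterOptions = { chunkSize: 1000, chunkOverlap: 100 };    // standard documentation pages
const bookLength: SplitterOptions = { chunkSize: 2000, chunkOverlap: 200 }; // very long documents
```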
Coverage
- Cover expected queries. Think about what users will ask and ensure the knowledge base contains relevant content.
- Include FAQs. Frequently asked questions and their answers are highly effective knowledge base content.
- Add edge cases. Content about exceptions, limitations, and unusual scenarios helps the agent handle diverse queries.
Related concepts
- Agents — configuring agents that use knowledge bases
- Assistant Execution — how RAG retrieval integrates with the AI engine