Knowledge Base
Build knowledge bases for agents using vector search. Upload documents, generate embeddings, and enable retrieval-augmented generation (RAG) for accurate, context-aware responses.
A knowledge base is a collection of documents that an agent can search and reference when generating responses. It enables retrieval-augmented generation (RAG) — a technique where relevant content is retrieved from the knowledge base and provided as context to the AI model before it generates a response.
Knowledge bases improve response accuracy by grounding the AI in your actual content rather than relying solely on the model's training data.
How it works
User query
→ Embed query as vector
→ Search knowledge base for similar content
→ Retrieve top matching document chunks
→ Include chunks in LLM prompt as context
→ Generate response grounded in retrieved content
- When a user sends a message, the assistant engine converts the query into a vector embedding.
- It searches the agent's knowledge base for document chunks with similar embeddings.
- The most relevant chunks are retrieved and included in the prompt sent to the LLM.
- The LLM generates a response that draws on the retrieved content.
This means the agent's answers are based on your specific documents — product documentation, FAQs, policies, or any other content you provide.
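The same flow can be sketched in code. This is a minimal illustration, not the platform's implementation: `embed`, `vectorStore`, and `llm` are placeholder interfaces standing in for whatever embedding model, vector database, and LLM client the engine actually uses.

```typescript
// Minimal RAG sketch. All names here are illustrative placeholders, not platform APIs.
interface VectorStore {
  search(vector: number[], topK: number): Promise<{ text: string }[]>;
}
interface Llm {
  generate(prompt: string): Promise<string>;
}

async function answerWithRag(
  userQuery: string,
  embed: (text: string) => Promise<number[]>,
  vectorStore: VectorStore,
  llm: Llm,
): Promise<string> {
  // 1. Convert the user's query into a vector embedding.
  const queryVector = await embed(userQuery);

  // 2. Search the knowledge base for document chunks with similar embeddings.
  const chunks = await vectorStore.search(queryVector, 5);

  // 3. Include the retrieved chunks in the prompt sent to the LLM.
  const context = chunks.map((c) => c.text).join("\n---\n");
  const prompt = `Use the context below to answer.\n\nContext:\n${context}\n\nQuestion: ${userQuery}`;

  // 4. Generate a response grounded in the retrieved content.
  return llm.generate(prompt);
}
```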
Creating a knowledge base
A knowledge base is backed by a search index. To set one up:
- Create a search index — this is the container for your documents.
- Assign it to an agent — set the agent's `knowledgeBaseId` to the search index.
- Add documents — upload content to the search index.
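As a rough walkthrough of these three steps, the sketch below assumes a REST-style API. The endpoint paths and payload shapes are assumptions for illustration; only the `knowledgeBaseId` field comes from the agent configuration itself.

```typescript
// Hypothetical setup calls; endpoint paths and payload shapes are illustrative only.
const BASE_URL = "https://api.example.com";

async function setUpKnowledgeBase(agentId: string, apiKey: string): Promise<void> {
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  };

  // 1. Create a search index: the container for your documents.
  const indexRes = await fetch(`${BASE_URL}/search-indexes`, {
    method: "POST",
    headers,
    body: JSON.stringify({ name: "product-docs" }),
  });
  const { id: searchIndexId } = await indexRes.json();

  // 2. Assign it to an agent by setting the agent's knowledgeBaseId.
  await fetch(`${BASE_URL}/agents/${agentId}`, {
    method: "PATCH",
    headers,
    body: JSON.stringify({ knowledgeBaseId: searchIndexId }),
  });

  // 3. Add documents: upload content to the search index.
  await fetch(`${BASE_URL}/search-indexes/${searchIndexId}/documents`, {
    method: "POST",
    headers,
    body: JSON.stringify({ type: "text", content: "Refunds are available within 30 days..." }),
  });
}
```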
Adding documents
The platform supports multiple document formats:
| Format | Description |
|---|---|
| PDF | Parsed and text-extracted automatically |
| Plain text | Indexed directly |
| URLs | Web pages fetched, parsed, and indexed |
When a document is added:
- The content is parsed based on its format.
- The text is split into chunks using a text splitter that preserves semantic boundaries (paragraph and sentence breaks).
- Each chunk is embedded — converted to a vector using an embedding model (1536 dimensions).
- The embeddings are stored in the vector database with organization-level isolation.
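A simplified sketch of that pipeline follows. The paragraph-based splitter and the `embedText` function are stand-ins; the platform's actual splitter and embedding model are more sophisticated, but the shape of the data is the same.

```typescript
// Simplified indexing sketch: split on paragraph breaks, embed each chunk,
// and tag every chunk with the organization it belongs to.
interface IndexedChunk {
  organizationId: string; // partition key for organization-level isolation
  text: string;
  embedding: number[];    // e.g. 1536 dimensions, matching the embedding model
}

async function indexDocument(
  organizationId: string,
  content: string,
  embedText: (text: string) => Promise<number[]>,
): Promise<IndexedChunk[]> {
  // Split on blank lines so chunks preserve paragraph boundaries.
  const chunks = content
    .split(/\n\s*\n/)
    .map((c) => c.trim())
    .filter((c) => c.length > 0);

  // Embed each chunk; the resulting records are what get stored in the vector database.
  return Promise.all(
    chunks.map(async (text) => ({
      organizationId,
      text,
      embedding: await embedText(text),
    })),
  );
}
```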
Organization isolation
Knowledge base data is isolated per organization. Each organization's documents are stored in a separate partition of the vector database, ensuring that search queries only return results from the current organization's content.
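In practice this usually means every query carries an organization filter. A sketch under that assumption, using a generic vector-store interface rather than any specific database client:

```typescript
// Hypothetical scoped search: the filter syntax depends on the vector database,
// but every query is constrained to the calling organization's partition.
interface ScopedVectorStore {
  search(opts: {
    vector: number[];
    topK: number;
    filter: Record<string, string>;
  }): Promise<{ text: string; score: number }[]>;
}

function searchWithinOrganization(
  store: ScopedVectorStore,
  organizationId: string,
  queryVector: number[],
  topK = 5,
) {
  // Results can only come from chunks indexed under this organizationId.
  return store.search({ vector: queryVector, topK, filter: { organizationId } });
}
```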
Searching the knowledge base
During assistant execution, the engine searches the knowledge base with the user's query:
- The query is embedded using the same model used for document indexing.
- A similarity search finds the closest matching document chunks.
- Results are ranked by relevance and returned to the assistant engine.
- The engine includes the retrieved chunks in the LLM prompt.
The search operates in real time with low latency — results are typically returned in under 100ms.
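Ranking is done with a vector similarity metric; cosine similarity is a common choice, and the self-contained sketch below assumes it (the platform may use a different metric, such as dot product).

```typescript
// Cosine similarity between two embedding vectors (assumed metric for illustration).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate chunks by similarity to the query embedding and keep the top k.
function rankChunks(
  queryVector: number[],
  chunks: { text: string; embedding: number[] }[],
  topK = 5,
): { text: string; score: number }[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryVector, c.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```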
Best practices
Document quality
- Use clean, well-structured content. The quality of retrieval depends on the quality of the source material.
- Keep documents focused. A document covering one topic retrieves more accurately than a document covering many.
- Update regularly. Remove outdated content and add new material to keep the knowledge base current.
Chunk size
The platform's text splitter is configured with sensible defaults. For most use cases, the default chunk size produces good retrieval results. Very short documents (under a paragraph) and very long documents (entire books) may benefit from adjustment.
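If you do need to tune chunking, the two knobs are typically chunk size and overlap. The option names and numbers below are illustrative starting points only, not the platform's actual configuration:

```typescript
// Hypothetical splitter options; names and values are illustrative, not platform defaults.
interface SplitterOptions {
  chunkSize: number;    // target characters per chunk
  chunkOverlap: number; // characters shared between adjacent chunks for continuity
}

// Rough starting points, assuming character-based chunking:
const faqStyle: SplitterOptions = { chunkSize: 500, chunkOverlap: 50 };     // short, self-contained entries
const typical: SplitterOptions = { chunkSize: 1000, chunkOverlap: 100 };    // standard documentation pages
const bookLength: SplitterOptions = { chunkSize: 2000, chunkOverlap: 200 }; // very long documents
```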
Coverage
- Cover expected queries. Think about what users will ask and ensure the knowledge base contains relevant content.
- Include FAQs. Frequently asked questions and their answers are highly effective knowledge base content.
- Add edge cases. Content about exceptions, limitations, and unusual scenarios helps the agent handle diverse queries.
Related concepts
- Agents — configuring agents that use knowledge bases
- Assistant Execution — how RAG retrieval integrates with the AI engine