Knowledge Base
Add Knowledge Source
Add a URL as a knowledge source for your AI agents
POST
Add a webpage URL as a knowledge source. The content is scraped, split into chunks, and embedded as vectors for RAG (Retrieval-Augmented Generation). Your AI agents can then draw on this knowledge when answering questions.
Request Body
URL to scrape and add to the knowledge base. Must be publicly accessible or include auth headers. Example: https://example.com/faq
How often to re-sync content from the URL. Options: 24h, 7d, 1mo, 3mo.
How much of the website to crawl. Options:
- single: Fetch one page only (immediate)
- sitemap: Crawl all pages in sitemap.xml (async)
- recursive: Follow links from the starting URL (async)
Maximum pages to crawl (for sitemap/recursive modes). Range: 1-500.
How deep to follow links (recursive mode only). Range: 1-5.
Whether to honor robots.txt crawl restrictions.
Authentication headers for protected pages, such as Bearer or Basic credentials.
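The constraints listed for the request body can be checked on the client before sending. The following is a minimal sketch, not the service's own validation: the key names (`url`, `sync_interval`, `crawl_mode`, `max_pages`, `max_depth`, `respect_robots_txt`, `headers`) and the default values are placeholders assumed for illustration, so substitute the parameter names from the API reference. Only the allowed values and ranges come from the descriptions above.

```python
import json

# Allowed values taken from the field descriptions above.
SYNC_INTERVALS = {"24h", "7d", "1mo", "3mo"}
CRAWL_MODES = {"single", "sitemap", "recursive"}


def build_payload(url, sync_interval="24h", crawl_mode="single",
                  max_pages=None, max_depth=None,
                  respect_robots=True, auth_headers=None):
    """Validate arguments and return a JSON-serializable request body.

    Key names and defaults are hypothetical placeholders, not documented
    parameter names or documented defaults.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an http(s) URL")
    if sync_interval not in SYNC_INTERVALS:
        raise ValueError(f"sync_interval must be one of {sorted(SYNC_INTERVALS)}")
    if crawl_mode not in CRAWL_MODES:
        raise ValueError(f"crawl_mode must be one of {sorted(CRAWL_MODES)}")

    payload = {
        "url": url,
        "sync_interval": sync_interval,
        "crawl_mode": crawl_mode,
        "respect_robots_txt": respect_robots,
    }

    # The page limit is described for sitemap/recursive modes only.
    if max_pages is not None:
        if crawl_mode == "single":
            raise ValueError("max_pages applies only to sitemap/recursive modes")
        if not 1 <= max_pages <= 500:
            raise ValueError("max_pages must be between 1 and 500")
        payload["max_pages"] = max_pages

    # The link depth is described for recursive mode only.
    if max_depth is not None:
        if crawl_mode != "recursive":
            raise ValueError("max_depth applies only to recursive mode")
        if not 1 <= max_depth <= 5:
            raise ValueError("max_depth must be between 1 and 5")
        payload["max_depth"] = max_depth

    if auth_headers:
        payload["headers"] = dict(auth_headers)
    return payload


if __name__ == "__main__":
    body = build_payload(
        "https://example.com/faq",
        crawl_mode="recursive",
        max_pages=50,
        max_depth=2,
        auth_headers={"Authorization": "Bearer <token>"},
    )
    print(json.dumps(body, indent=2))
```

Rejecting `max_pages` in single mode is a design choice of this sketch; the API itself may simply ignore the field.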
Response
Unique knowledge source identifier.
The source URL.
Extracted page title.
Extracted meta description.
Current sync status: pending, syncing, completed, failed.
Number of text chunks created from the content.
When the next automatic sync will occur.
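Because sitemap and recursive crawls run asynchronously, a client will usually need to inspect the status before relying on the source. The sketch below shows one way to interpret a response; the JSON key names (`id`, `url`, `title`, `description`, `status`, `chunk_count`, `next_sync_at`) and the sample values are assumptions for illustration, and only the four status values come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Status values as listed in the response description.
TERMINAL_STATUSES = {"completed", "failed"}
KNOWN_STATUSES = {"pending", "syncing"} | TERMINAL_STATUSES


@dataclass
class KnowledgeSource:
    id: str
    url: str
    title: Optional[str]
    description: Optional[str]
    status: str
    chunk_count: int
    next_sync_at: Optional[str]


def parse_source(body: dict) -> KnowledgeSource:
    """Build a KnowledgeSource from a decoded JSON body (hypothetical keys)."""
    status = body["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return KnowledgeSource(
        id=body["id"],
        url=body["url"],
        title=body.get("title"),
        description=body.get("description"),
        status=status,
        chunk_count=int(body.get("chunk_count", 0)),
        next_sync_at=body.get("next_sync_at"),
    )


def needs_polling(source: KnowledgeSource) -> bool:
    """True while the sync is still pending or in progress."""
    return source.status not in TERMINAL_STATUSES


def is_ready(source: KnowledgeSource) -> bool:
    """True once the sync completed and produced at least one chunk."""
    return source.status == "completed" and source.chunk_count > 0


if __name__ == "__main__":
    # Illustrative body only; not a captured API response.
    sample = {"id": "src_123", "url": "https://example.com/faq",
              "status": "syncing", "chunk_count": 0}
    source = parse_source(sample)
    print(needs_polling(source), is_ready(source))
```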
How Knowledge Works
- Scraping: Content is extracted from the URL, removing navigation and boilerplate
- Chunking: Text is split into semantic chunks (~500 tokens each)
- Embedding: Each chunk is converted to a vector embedding
- Storage: Chunks are stored in a vector database (pgvector)
- Retrieval: During calls/chats, relevant chunks are retrieved and included in agent context
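The chunk, embed, store, and retrieve steps above can be illustrated with a toy, standard-library-only sketch. It is not the service's implementation: chunks are fixed-size word windows rather than ~500-token semantic chunks, the embedding is a hashed bag-of-words vector rather than a learned model, and a Python list stands in for pgvector. All function names here are hypothetical.

```python
import math
import re
import zlib
from collections import Counter

DIM = 256  # size of the toy embedding vector


def chunk_text(text, max_words=50):
    """Split text into consecutive windows of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def embed(text):
    """Hash each word into one of DIM buckets, then L2-normalize."""
    vec = [0.0] * DIM
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    for word, count in counts.items():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += count
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(text, max_words=50):
    """Storage step: keep (chunk, vector) pairs in a plain list."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, max_words)]


def retrieve(index, query, top_k=1):
    """Return the top_k chunks ranked by cosine similarity to the query."""
    q = embed(query)

    def score(item):
        # Vectors are unit length, so the dot product is the cosine.
        return sum(a * b for a, b in zip(q, item[1]))

    ranked = sorted(index, key=score, reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    page = ("refunds are issued within days "
            "shipping takes about one week")
    index = build_index(page, max_words=5)
    print(retrieve(index, "shipping week"))
```

In a real deployment the retrieved chunks would be added to the agent's context alongside the user's question, as the Retrieval step describes.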
Supported Content
- HTML pages (blogs, FAQs, documentation)
- PDF files (product manuals, guides)
- Plain text files
Dynamic content loaded via JavaScript may not be captured. For SPAs, consider providing direct links to static content or using server-side rendered pages.