Overview

The Knowledge Base is your agent’s memory. It stores information about your business - FAQs, services, pricing, policies - that agents reference when answering questions. Using RAG (Retrieval Augmented Generation), agents can provide accurate, contextual responses based on your actual content rather than making things up.

How It Works

Vector Search Technology

Magpipe uses pgvector for semantic search:
  1. Content Ingestion
    • You add a URL or document
    • Content is fetched and cleaned
    • Text is split into chunks (~500 tokens each)
  2. Embedding Generation
    • Each chunk is converted to a vector embedding
    • Embeddings capture semantic meaning
    • Stored in PostgreSQL with pgvector
  3. Runtime Retrieval
    • A caller asks a question
    • The question is converted to an embedding
    • Similar chunks are found via cosine similarity
    • The 3-5 most relevant chunks are retrieved
  4. Context Injection
    • Retrieved chunks are added to the agent’s context
    • Agent uses this information to answer accurately
    • Responses cite actual content from your knowledge base
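
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, tokens are approximated by whitespace-separated words, and the function names (`chunk_text`, `embed`, `cosine_similarity`, `retrieve`) are hypothetical rather than part of Magpipe. In production the vectors would live in PostgreSQL with pgvector instead of in memory.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_tokens: int = 500) -> list[str]:
    """Split text into chunks of at most max_tokens words (a rough token proxy)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:top_k]
```

A real embedding model would also match paraphrases ("when do you close?" against "business hours"), which this word-count stand-in cannot do; the ranking logic is the same either way.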

Why RAG?

Without knowledge base:
  • Caller: “What are your business hours?”
  • Agent: “I don’t have that information.” ❌
With knowledge base:
  • Caller: “What are your business hours?”
  • Agent: “We’re open Monday through Friday, 9 AM to 5 PM, and Saturday from 10 AM to 2 PM.” ✓

Adding Knowledge Sources

From URL

Add content from any public webpage:
  1. Navigate to Knowledge - Go to Knowledge from the main navigation.
  2. Click Add Source - Click the + Add Source button.
  3. Enter URL - Paste the full URL of the webpage to import. Example: https://yoursite.com/faq
  4. Set Sync Schedule - Choose how often to re-fetch content:
    • Every 24 hours
    • Every 7 days
    • Every month
    • Every 3 months
  5. Submit - Click Add Source. Processing begins immediately.

Protected Pages

For pages behind authentication:
Bearer Token / API Key:
  1. Check “Requires authentication”
  2. Select “Bearer Token”
  3. Enter your token: your-api-key-here
Basic Auth:
  1. Check “Requires authentication”
  2. Select “Basic Auth”
  3. Enter username and password
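
Before adding a protected page, it can help to confirm that your credentials work by building the request header yourself. The sketch below shows the standard HTTP `Authorization` header formats for the two schemes. It assumes the protected page accepts these standard headers; how Magpipe transmits the credentials internally is not documented here, and the helper names are hypothetical.

```python
import base64


def bearer_header(token: str) -> dict[str, str]:
    """Authorization header for a bearer token / API key."""
    return {"Authorization": f"Bearer {token}"}


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Authorization header for HTTP Basic auth: base64 of 'username:password'."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

Passing either dictionary as the request headers to your HTTP client of choice should return the page content if the credentials are valid.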

Supported Content Types

| Type | Support |
| --- | --- |
| HTML pages | ✓ Full support |
| Plain text | ✓ Full support |
| PDF documents | ✓ Text extracted |
| Markdown | ✓ Full support |
| JSON/XML | ✓ Parsed as text |
| Images | ✗ Not supported |
| Video | ✗ Not supported |

Managing Sources

Source Dashboard

Each knowledge source displays:
| Field | Description |
| --- | --- |
| Title | Extracted page title or custom name |
| URL | Source location |
| Status | syncing, completed, or failed |
| Chunks | Number of text chunks created |
| Last Synced | Timestamp of last successful sync |
| Next Sync | When automatic re-sync will occur |

Source Statuses

  • Syncing - Content is being fetched and processed
  • Completed - Successfully processed and indexed
  • Failed - Error occurred (click to see details)

Editing Sources

Click a source to:
  • Change the sync schedule
  • Update authentication credentials
  • Force an immediate re-sync
  • View processing logs

Deleting Sources

To remove a knowledge source:
  1. Click the source to expand
  2. Click Delete Source
  3. Confirm deletion
Deleting a source removes all associated content immediately. Your agent will no longer have access to this information.

Content Best Practices

What to Add

  • Your FAQ page is ideal - it contains pre-written Q&A pairs that translate directly into agent responses.
  • Add pages describing what you offer and pricing. Agents can accurately quote prices and explain services.
  • Include your story, mission, and team info. Agents can answer “tell me about your company” naturally.
  • Add pages with address, hours, directions, and parking info. Critical for “where are you located?” questions.
  • Include return policies, cancellation policies, and terms. Agents can explain policies accurately.
  • Add product pages. Agents can describe features and specifications to callers.

What NOT to Add

  • Entire websites - Too much noise, dilutes relevance
  • Login-protected portals - Customer-specific data
  • Frequently changing content - Will become stale
  • Competitor information - Could confuse the agent
  • Internal documents - Security risk

Content Quality Tips

  1. Keep it factual - Agents repeat what’s in the knowledge base
  2. Use clear language - Avoid jargon unless your customers use it
  3. Structure with headings - Helps the chunking algorithm
  4. Include common variations - “hours” and “business hours” both work
  5. Update regularly - Re-sync when your content changes

Automatic Sync

How Sync Works

  • At the scheduled interval, Magpipe re-fetches your URL
  • Content is compared to the existing version
  • If it has changed, new chunks are generated
  • Old chunks are replaced atomically
  • The agent uses the new content immediately
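
The sync steps above can be sketched as follows. This is a minimal illustration, assuming change detection is done by comparing a SHA-256 hash of the fetched text and that chunks are whitespace-delimited runs of about 500 words; the `Source` class and `sync` function are hypothetical names, not Magpipe's actual implementation.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Source:
    url: str
    content_hash: str = ""
    chunks: list[str] = field(default_factory=list)


def content_fingerprint(text: str) -> str:
    """Stable fingerprint used to detect whether the page changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync(source: Source, fetched_text: str, chunk_size: int = 500) -> bool:
    """Re-chunk only when the content changed. Returns True if chunks were replaced."""
    new_hash = content_fingerprint(fetched_text)
    if new_hash == source.content_hash:
        return False  # unchanged: keep the existing chunks
    words = fetched_text.split()
    new_chunks = [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]
    # Build the complete replacement first, then swap it in with one assignment,
    # so a reader never observes a half-updated chunk list.
    source.chunks, source.content_hash = new_chunks, new_hash
    return True
```

In a database-backed store, the same swap would typically happen inside a single transaction that deletes the old rows and inserts the new ones.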

Sync Schedules

| Schedule | Best For |
| --- | --- |
| Every 24 hours | Frequently updated content |
| Every 7 days | Weekly-changing information |
| Every month | Relatively stable content |
| Every 3 months | Static content (policies, about) |
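
To reason about when a source will next refresh, the schedule can be modeled as a fixed interval added to the last sync time. The sketch below is an assumption-laden illustration: it treats "Every month" as 30 days and "Every 3 months" as 90 days, which may not match how Magpipe computes calendar months, and `next_sync` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Interval per schedule option; the month lengths are assumptions.
SYNC_INTERVALS = {
    "Every 24 hours": timedelta(hours=24),
    "Every 7 days": timedelta(days=7),
    "Every month": timedelta(days=30),     # assumption: 30-day month
    "Every 3 months": timedelta(days=90),  # assumption: 90-day quarter
}


def next_sync(last_synced: datetime, schedule: str) -> datetime:
    """Return the time at which the next automatic re-sync would be due."""
    return last_synced + SYNC_INTERVALS[schedule]
```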

Manual Re-sync

Force an immediate re-sync:
  1. Click the knowledge source
  2. Click Sync Now
  3. Wait for processing to complete

Agent Integration

Per-Agent Knowledge

Each agent can access all knowledge sources you’ve added:
  • Relevant chunks are retrieved for each question
  • No configuration is needed; retrieval is automatic

Prompt Integration

Knowledge context is automatically injected:
[System Prompt - your configuration]

[Knowledge Context]
The following information was retrieved from the knowledge base:

--- From: FAQ Page ---
Q: What are your business hours?
A: We're open Monday-Friday 9 AM to 5 PM, Saturday 10 AM to 2 PM.

--- From: Services Page ---
Our basic plan starts at $29/month and includes...
[End Knowledge Context]

[User Question]
What time do you open on Saturdays?
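
The layout above could be assembled with a small helper like the one below. It is a sketch that mirrors the section markers shown in the example; the function name `build_prompt` and its parameters are hypothetical, and the exact format Magpipe sends to the model may differ.

```python
def build_prompt(system_prompt: str, retrieved: list[tuple[str, str]], question: str) -> str:
    """Combine the system prompt, retrieved chunks, and the caller's question.

    retrieved is a list of (source_title, chunk_text) pairs, most relevant first.
    """
    sections = [f"--- From: {title} ---\n{text}" for title, text in retrieved]
    knowledge = "\n\n".join(sections)
    return (
        f"{system_prompt}\n\n"
        "[Knowledge Context]\n"
        "The following information was retrieved from the knowledge base:\n\n"
        f"{knowledge}\n"
        "[End Knowledge Context]\n\n"
        "[User Question]\n"
        f"{question}"
    )
```

Labeling each chunk with its source title lets the agent, and anyone reading the retrieval logs, see where an answer came from.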

Knowledge Limitations

  • 3-5 chunks retrieved per question (most relevant)
  • ~2000 tokens max knowledge context per response
  • Chunks prioritized by relevance score
  • Less relevant chunks truncated if limit exceeded
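
One way to picture how these limits interact is the budget loop below. It is a sketch only: it counts whitespace-separated words as a rough proxy for tokens, and it interprets "truncated" as cutting the text of the lowest-ranked chunk that still fits partially. Both choices are assumptions, and `select_chunks` is a hypothetical name.

```python
def select_chunks(
    scored_chunks: list[tuple[float, str]],
    max_chunks: int = 5,
    max_tokens: int = 2000,
) -> list[str]:
    """Pick the highest-scoring chunks that fit within the token budget.

    scored_chunks is a list of (relevance_score, chunk_text) pairs.
    """
    selected: list[str] = []
    used = 0
    ranked = sorted(scored_chunks, key=lambda pair: pair[0], reverse=True)
    for _score, text in ranked[:max_chunks]:
        remaining = max_tokens - used
        if remaining <= 0:
            break  # budget exhausted: drop the remaining, less relevant chunks
        words = text.split()
        if len(words) > remaining:
            words = words[:remaining]  # truncate the chunk that overflows the budget
        selected.append(" ".join(words))
        used += len(words)
    return selected
```

Because the most relevant chunks are placed first, any truncation falls on the content least likely to matter for the answer.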

Monitoring & Analytics

Knowledge Usage

Track how your knowledge base is being used:
  • Which sources are accessed most
  • Common questions by topic
  • Retrieval success rate
  • Gaps in knowledge (unanswered questions)

Retrieval Logs

View what knowledge was used in each conversation:
  1. Open a conversation in Inbox
  2. View the transcript
  3. See which knowledge chunks were retrieved

Limits

| Resource | Limit |
| --- | --- |
| Knowledge sources | 50 per account |
| Maximum page size | 1 MB |
| Chunks per source | ~500 |
| Total chunks | 10,000 per account |
| Sync frequency | Minimum 24 hours |

Troubleshooting

Source Shows “Failed”

Common causes:
  • URL is not accessible
  • Page requires JavaScript to render
  • Authentication credentials incorrect
  • Content exceeds size limit
Solution: Click the source to see the error message, fix the issue, and retry.

Agent Not Using Knowledge

Check that:
  1. The source status is “completed”
  2. The content was successfully chunked (chunk count > 0)
  3. The question is related to the content
Then try asking a direct question taken from the content.

Outdated Information

If the agent gives outdated information:
  1. Check when source last synced
  2. Force a manual re-sync
  3. Verify the source URL shows current content