Clever Dev Docs: Library, Secure Sync, SSO, and APIs

About this assistant

Architecture, design decisions, and the production thinking behind a RAG-powered docs assistant.

What this is

A RAG-powered assistant that answers natural-language questions about the Clever developer platform — Library, SSO, certification, the API — grounded in the official docs at dev.clever.com. Every answer cites its source. When the system isn't confident, it says so rather than guessing.

Who it's for

Any developer building on the Clever platform — whether you're an independent builder shipping a classroom app, a partner engineering team integrating Secure Sync, or an internal developer working on the platform itself. The existing support widget routes through a decision tree; this gives everyone answers directly from the docs, instantly.

How it works

Your question

A developer asks a natural-language question about the Clever platform.

Rate limit

Upstash Redis enforces a sliding window of 20 requests per minute per IP. Abusers get a 429 before any AI cost is incurred.
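
A minimal sketch of how such a limiter is typically wired up with @upstash/ratelimit; the function name and response body are illustrative, not the app's actual code.

```ts
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// 20 requests per sliding minute, keyed by client IP.
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(20, "1 m"),
});

export async function checkLimit(ip: string): Promise<Response | null> {
  const { success } = await ratelimit.limit(ip);
  // Rejected requests never reach the classifier or the AI Gateway.
  return success ? null : new Response("Too many requests", { status: 429 });
}
```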

Classify

A lightweight LLM call classifies the query as on-topic, off-topic, harmful, or nonsense. Off-topic and harmful queries get a canned response — the RAG pipeline never runs.

Embed

The question is converted into a 1,536-dimension vector using text-embedding-3-small via the AI Gateway.
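
A sketch of the embedding call with the AI SDK; it assumes the AI Gateway resolves plain `provider/model` strings (per the Stack section) and that `question` holds the incoming query.

```ts
import { embed } from "ai";

// Produce the 1,536-dimension query vector via the AI Gateway.
const { embedding } = await embed({
  model: "openai/text-embedding-3-small",
  value: question,
});
```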

Retrieve

pgvector cosine similarity search matches the question against chunks from the 76 ingested pages of Clever docs. The top chunks become the model's context.
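
Retrieval can be expressed as a Postgres function called through the Supabase client. The RPC name, parameters, and returned columns below are hypothetical stand-ins for whatever the app's schema actually defines.

```ts
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Hypothetical RPC wrapping a pgvector cosine-distance query, roughly:
//   SELECT content, source_url, 1 - (embedding <=> query) AS similarity
//   ORDER BY embedding <=> query LIMIT match_count;
const { data: chunks } = await supabase.rpc("match_chunks", {
  query_embedding: embedding,
  match_count: 5,
});
```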

Confidence gate

Below 0.6 similarity, the system prompt switches modes — the model admits uncertainty instead of improvising.
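
In code, the gate is little more than a threshold check before prompt assembly; the prompt variable names here are illustrative.

```ts
const CONFIDENCE_THRESHOLD = 0.6; // cosine similarity

const topSimilarity = chunks?.[0]?.similarity ?? 0;
const systemPrompt =
  topSimilarity >= CONFIDENCE_THRESHOLD
    ? answerFromDocsPrompt // answer and cite the retrieved chunks
    : admitUncertaintyPrompt; // say the docs don't cover it, don't improvise
```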

Audience routing

Chunks are tagged by integration path (Library, Secure Sync, LMS Connect). When answers differ by path, the model presents both variants.
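
One plausible shape for that routing, assuming each retrieved chunk carries the integration-path tag applied at ingest time (the field name is illustrative):

```ts
// If the retrieved chunks span more than one integration path, tell the
// model to present each path's answer separately.
const paths = new Set(chunks.map((c) => c.integration_path));
const routingNote =
  paths.size > 1
    ? `These docs cover multiple integration paths (${[...paths].join(", ")}). ` +
      "If the answer differs by path, present each variant."
    : "";
```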

Generate

The AI SDK streams a completion through the Vercel AI Gateway. The model sees only retrieved docs — no memorized knowledge.
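
A sketch of the generation step using the AI SDK's streaming helper; the exact prompt assembly and response helper may differ in the app, and `context` is assumed to be the concatenated retrieved chunks.

```ts
import { streamText } from "ai";

// Stream the completion through the AI Gateway. The retrieved docs are
// the model's only source material.
const result = streamText({
  model: "openai/gpt-4o-mini",
  system: `${systemPrompt}\n\nAnswer ONLY from these docs:\n${context}`,
  messages,
});

return result.toUIMessageStreamResponse();
```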

Cited answer

The developer gets a grounded answer with source links back to dev.clever.com.

Guardrails

Before a query enters the retrieval pipeline, a lightweight classification step decides whether it belongs there at all. This saves compute on junk queries and gives users an appropriate response instead of a misleading “I couldn't find that in the docs.”

Category | Example | Action
On-topic | “How do I get student data from the API?” | Full RAG pipeline
Off-topic | “Reverse a linked list in Python” | Canned deflection, skip RAG
Harmful | Threats, harassment, dangerous requests | Safety response, skip RAG, logged
Nonsense | “asdfghjkl”, insults, trolling | Short deflection, skip RAG

The classifier runs as a single generateText call with gpt-4o-mini — fast and cheap enough to gate every query. Off-topic, harmful, and nonsense queries are logged so the team can monitor abuse patterns without the data landing in the feedback queue as false “doc gaps.”
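
A sketch of that gate; the label set matches the table above, but the prompt wording is illustrative.

```ts
import { generateText } from "ai";

// One cheap call that gates every query before retrieval runs.
const { text } = await generateText({
  model: "openai/gpt-4o-mini",
  system:
    "Classify the user's query as exactly one of: on-topic, off-topic, " +
    "harmful, nonsense. On-topic means it concerns the Clever developer " +
    "platform (Library, Secure Sync, SSO, the API). Reply with the label only.",
  prompt: query,
});

const category = text.trim().toLowerCase();
```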

Building the knowledge base

The vector database is populated by a CLI script (pnpm ingest) that scrapes, chunks, and embeds the Clever developer docs into Supabase Postgres with the pgvector extension. The pipeline runs as a full rebuild — every run deletes existing rows and re-inserts from scratch.

Scrape

76 doc pages from dev.clever.com are fetched sequentially with a 500ms delay. Cheerio strips nav, sidebar, and footer elements, extracting text from headings, paragraphs, list items, code blocks, and table cells.
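
A condensed sketch of that fetch-and-strip step with Cheerio; the selectors are representative of the elements listed above, not the script's exact ones.

```ts
import * as cheerio from "cheerio";

// Fetch one page, strip chrome, keep content-bearing elements.
const html = await (await fetch(url)).text();
const $ = cheerio.load(html);
$("nav, aside, footer").remove();
const pageText = $("h1, h2, h3, p, li, pre, td")
  .map((_, el) => $(el).text().trim())
  .get()
  .join("\n");
await new Promise((r) => setTimeout(r, 500)); // politeness delay between pages
```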

Chunk

Each page is split into ~1,000-character chunks with 200-character overlap, breaking on markdown heading boundaries first, then paragraph boundaries for oversized sections.
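
A simplified sketch of that strategy; the real chunker may handle edge cases differently.

```ts
// Split on markdown headings first; fall back to paragraph boundaries for
// oversized sections, carrying a ~200-char tail forward as overlap.
function chunkPage(markdown: string, size = 1000, overlap = 200): string[] {
  const sections = markdown.split(/(?=^#{1,6}\s)/m);
  const chunks: string[] = [];
  for (const section of sections) {
    if (section.length <= size) {
      if (section.trim()) chunks.push(section.trim());
      continue;
    }
    let current = "";
    for (const para of section.split(/\n{2,}/)) {
      if (current.length + para.length > size && current.trim()) {
        chunks.push(current.trim());
        current = current.slice(-overlap); // overlap with the previous chunk
      }
      current += "\n\n" + para;
    }
    if (current.trim()) chunks.push(current.trim());
  }
  return chunks;
}
```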

Tag

Every chunk is classified into an integration path — Library, Secure Sync, LMS Connect, Attendance, or general — by matching the source URL against a priority-ordered rule table.
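
The rule table reduces to a first-match-wins scan over URL patterns. The patterns below are illustrative, not the actual table:

```ts
// Priority-ordered: the first pattern that matches the source URL wins.
const PATH_RULES: [RegExp, string][] = [
  [/secure-sync/, "secure-sync"],
  [/lms-connect/, "lms-connect"],
  [/attendance/, "attendance"],
  [/library/, "library"],
];

function tagChunk(sourceUrl: string): string {
  for (const [pattern, path] of PATH_RULES) {
    if (pattern.test(sourceUrl)) return path;
  }
  return "general"; // fallback when no rule matches
}
```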

Embed & store

Chunks are embedded in batches of 100 using text-embedding-3-small (1,536 dimensions) via the AI Gateway, then inserted into a Postgres table indexed with HNSW for fast cosine similarity search.
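
Batch embedding maps naturally onto the AI SDK's embedMany; the table and column names here are assumptions for illustration.

```ts
import { embedMany } from "ai";

// Embed up to 100 chunks per call, then insert rows alongside their vectors.
const { embeddings } = await embedMany({
  model: "openai/text-embedding-3-small",
  values: batch.map((c) => c.content),
});

await supabase
  .from("chunks")
  .insert(batch.map((c, i) => ({ ...c, embedding: embeddings[i] })));
```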

Design decisions

gpt-4o-mini over a flagship model
At ~$0.0003 per query, the cheap model handles factual lookups where retrieval does the heavy lifting. The eval shows exactly which questions need a smarter model.
AI Gateway over provider SDKs
One auth mechanism, one observability surface, zero code changes to swap providers. The eval page compares OpenAI and Anthropic models with a string change.
pgvector over a dedicated vector DB
For ~200 chunks, Postgres with HNSW is sub-100ms, operationally simpler, and one less service to monitor. Migration point: when you need hybrid search at scale.
Live eval as a product page
Not just a CI script. Putting the eval in the app means you can see what the system is actually doing — cost per query, cross-provider comparison, rubric scores — right now.

Production thinking

  • Re-ingestion: Content-hash chunks, only re-embed what changed. A daily cron polls the sitemap for diffs (see the sketch after this list).
  • Low-confidence queue: Every "I couldn't find it" gets logged — the strongest signal of doc gaps the support team should see.
  • Rate limiting: Upstash Redis enforces a 20-request-per-minute sliding window per IP. Blocked requests return 429 before touching the AI Gateway.
  • Eval in CI: The test suite gates PRs on pass-rate, preventing silent regressions when prompts or models change.
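
A sketch of the content-hash idea from the re-ingestion item, assuming chunks carry stable IDs and a stored hash column:

```ts
import { createHash } from "node:crypto";

// Hash each chunk's content; only changed chunks go back through embedding.
const hash = (content: string) =>
  createHash("sha256").update(content).digest("hex");

const changed = newChunks.filter(
  (c) => existingHashes.get(c.id) !== hash(c.content)
);
```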

Stack

Layer | Choice
Framework | Next.js App Router
AI | Vercel AI SDK v6 + AI Gateway
Default model | openai/gpt-4o-mini
Embeddings | openai/text-embedding-3-small
Vector store | Supabase Postgres + pgvector (Vercel Marketplace)
Hosting | Vercel (Fluid Compute)

Brand identity

The interface follows the Clever Brand Guidelines (V7, August 2025). Typography pairs Merriweather (serif headings, standing in for the brand's ABC Arizona Mix) with Inter (sans-serif body text, standing in for Messina Sans). Colors draw from the primary palette — Clever Blue, white, and dark navy — with secondary accents for visual expression.

  • Clever Blue: #1464FF
  • Navy: #0A1E46
  • Light Blue: #DAEBFF
  • Yellow: #FFE478
  • Orange: #F78239
  • Green: #4ECC97