Citations & Sources
Every Thallus response includes citations that trace each claim back to its source. This page explains how citations work, what types exist, and how they're ranked.
What citations look like
After Thallus responds, you'll see a sources panel below the answer. Each citation shows a numbered badge, source type icon, title, and relevance score.
Clicking a citation expands it to show the source text, SQL query, or other details depending on the citation type.
Citation types
Thallus can cite five types of sources:
| Type | Icon | What's included |
|---|---|---|
| Document | 📄 | Chunk text from your uploaded files, page number, document ID |
| Database query | 🗂 | SQL query executed, column names, result data, row count, connection name |
| Web | 🌐 | URL of the source page |
| Email | 📧 | Email thread ID |
| Agent | 🤖 | Results from a prior agent in the same execution |
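
As a rough mental model, the five types in the table can be thought of as one record with a type tag and type-specific details. The sketch below is illustrative only: the class names, field names, and sample values are assumptions, not Thallus's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class CitationType(Enum):
    """The five source types listed in the table above."""
    DOCUMENT = "document"
    DATABASE_QUERY = "database_query"
    WEB = "web"
    EMAIL = "email"
    AGENT = "agent"


@dataclass
class Citation:
    """Hypothetical citation record; not the real Thallus data model."""
    type: CitationType
    title: str
    relevance: float  # relevance score between 0.0 and 1.0
    details: dict[str, Any] = field(default_factory=dict)  # type-specific payload


# Illustrative document citation with made-up values.
doc_citation = Citation(
    type=CitationType.DOCUMENT,
    title="Example report",
    relevance=0.92,
    details={"document_id": "doc-123", "page_number": 4, "chunk_text": "..."},
)
```

The `details` payload would carry whatever the "What's included" column lists for that type, such as the SQL query and row count for a database citation.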
Relevance scoring
Each citation carries a relevance score from 0.0 to 1.0, set by the agent that produced it and refined during evaluation:
| Score | Meaning |
|---|---|
| 0.9–1.0 | Directly supports a key claim in the response |
| 0.7–0.9 | Relevant context or supporting detail |
| 0.3–0.6 | Tangentially related |
| Below threshold | Not relevant (filtered out) |
Agents assign initial scores based on how directly the source material answers the query. The evaluator may re-score citations during the evaluation phase if it discovers inconsistencies between data sources and document claims.
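
To make the bands concrete, here is a minimal sketch that maps a score to the meanings in the table. The filter threshold is not stated on this page, so the default of 0.3 is an assumption, as is treating each band as a lower bound (which places scores between 0.6 and 0.7 in the tangential band). The function names are hypothetical.

```python
def describe_score(score: float, threshold: float = 0.3) -> str:
    """Map a relevance score to a band label.

    The threshold default and the lower-bound interpretation of the
    bands are assumptions made for this sketch.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("relevance score must be between 0.0 and 1.0")
    if score < threshold:
        return "filtered out"
    if score >= 0.9:
        return "directly supports a key claim"
    if score >= 0.7:
        return "relevant context or supporting detail"
    return "tangentially related"


def visible_scores(scores: list[float], threshold: float = 0.3) -> list[float]:
    """Keep only the scores that would survive the (assumed) threshold."""
    return [s for s in scores if s >= threshold]
```

For example, `describe_score(0.95)` returns the top band label, while `visible_scores([0.95, 0.5, 0.1])` drops the 0.1 entry under the assumed threshold.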
How citations are ranked
The final citation list you see is produced in four steps:
- Collection — All citations from every agent in the execution are gathered
- Deduplication — Citations pointing to the same source are merged. When duplicates are found, the highest relevance score is kept and the longest chunk text is preserved
- Sorting — Citations are sorted by relevance score, highest first
- Limiting — Citations are capped to keep the response focused and relevant
Deduplication identifies matching citations using source-specific identifiers, so the same source isn't listed twice.
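
The four steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `source_key` field stands in for the source-specific identifier, its string format is invented, and the default `limit` of 10 is a placeholder because the actual cap is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    source_key: str       # hypothetical source-specific identifier
    relevance: float      # 0.0 to 1.0
    chunk_text: str = ""


def rank_citations(per_agent: list[list[Citation]], limit: int = 10) -> list[Citation]:
    # 1. Collection: gather citations from every agent in the execution.
    collected = [c for agent_citations in per_agent for c in agent_citations]

    # 2. Deduplication: merge citations that point to the same source,
    #    keeping the highest score and the longest chunk text.
    merged: dict[str, Citation] = {}
    for c in collected:
        existing = merged.get(c.source_key)
        if existing is None:
            merged[c.source_key] = Citation(c.source_key, c.relevance, c.chunk_text)
        else:
            existing.relevance = max(existing.relevance, c.relevance)
            if len(c.chunk_text) > len(existing.chunk_text):
                existing.chunk_text = c.chunk_text

    # 3. Sorting: highest relevance first.
    ranked = sorted(merged.values(), key=lambda c: c.relevance, reverse=True)

    # 4. Limiting: cap the list (the real cap is an assumption here).
    return ranked[:limit]


# Two agents cite the same document page with different scores and chunk lengths.
agent_a = [
    Citation("document:doc-1:4", 0.72, "Revenue grew."),
    Citation("web:https://example.com/report", 0.55),
]
agent_b = [Citation("document:doc-1:4", 0.91, "Revenue grew 12% year over year.")]

top = rank_citations([agent_a, agent_b], limit=2)
```

In this example the document citation appears once in `top`, with the score 0.91 and the longer chunk text, followed by the web citation.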
Two-stage search and citations
When agents search your documents, citations come from a two-stage retrieval process:
- Synopsis discovery — Document synopses (structured summaries) are searched to identify the most relevant documents using combined ranking across query variants.
- Chunk retrieval — Within the matched documents, individual text chunks are retrieved using a similarity threshold optimized for each query. Only these matched chunks become citations.
This means citations point to specific, relevant passages within your documents rather than entire files. The page number, chunk text, and document title are all captured for traceability.
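
The two stages can be illustrated with a toy example. Everything here is an assumption made for illustration: word-overlap (Jaccard) similarity stands in for whatever similarity measure Thallus uses, reciprocal rank fusion stands in for "combined ranking across query variants", and the fixed threshold of 0.2 stands in for the per-query threshold. The function names and sample data are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Toy similarity: word overlap between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def discover_documents(synopses, query_variants, top_k=2, rrf_k=60):
    """Stage 1: rank synopses per query variant, then fuse the rankings
    with reciprocal rank fusion (an assumed stand-in technique)."""
    fused = {}
    for query in query_variants:
        ranking = sorted(synopses, key=lambda d: jaccard(query, synopses[d]), reverse=True)
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    return sorted(fused, key=fused.get, reverse=True)[:top_k]


def retrieve_chunks(chunks, doc_ids, query, threshold=0.2):
    """Stage 2: within matched documents, keep chunks at or above the threshold."""
    hits = []
    for doc_id in doc_ids:
        for page_number, text in chunks.get(doc_id, []):
            score = jaccard(query, text)
            if score >= threshold:
                hits.append({
                    "document_id": doc_id,
                    "page_number": page_number,
                    "chunk_text": text,
                    "relevance": score,
                })
    return sorted(hits, key=lambda h: h["relevance"], reverse=True)


synopses = {
    "doc-a": "quarterly revenue and sales summary",
    "doc-b": "employee onboarding handbook",
}
chunks = {
    "doc-a": [
        (1, "company overview and history"),
        (4, "quarterly revenue grew in the third quarter"),
    ],
    "doc-b": [(2, "onboarding checklist for new employees")],
}

matched = discover_documents(synopses, ["quarterly revenue growth", "revenue by quarter"], top_k=1)
citations = retrieve_chunks(chunks, matched, "quarterly revenue growth")
```

Only the page 4 chunk of `doc-a` becomes a citation here: the other document is never searched at the chunk level, and the page 1 chunk falls below the threshold.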
Tracing a citation to its source
Each citation type provides enough information to locate the original source:
- Document — `document_id` + `page_number` identify the exact location. Click to view the chunk text.
- Database query — The full SQL query is stored, along with the connection name and result data. You can see exactly what was queried and what it returned.
- Web — The URL links directly to the source page.
- Email — The `thread_id` identifies the email conversation.
- Agent — References the agent name and tool used, linking back to a prior step in the execution plan.
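
As a small illustration of the list above, the sketch below turns a citation into a human-readable locator based on its type. Only `document_id`, `page_number`, and `thread_id` come from this page; the other keys (`connection_name`, `sql`, `url`, `agent_name`, `tool`) and the type strings are hypothetical names chosen for the example.

```python
def locate(citation: dict) -> str:
    """Return a short locator string for a citation (illustrative only)."""
    kind = citation["type"]
    if kind == "document":
        return f"document {citation['document_id']}, page {citation['page_number']}"
    if kind == "database_query":
        return f"query on {citation['connection_name']}: {citation['sql']}"
    if kind == "web":
        return citation["url"]
    if kind == "email":
        return f"email thread {citation['thread_id']}"
    if kind == "agent":
        return f"agent {citation['agent_name']} via {citation['tool']}"
    raise ValueError(f"unknown citation type: {kind}")
```

For instance, `locate({"type": "document", "document_id": "doc-1", "page_number": 4})` yields `"document doc-1, page 4"`.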
For more on how citations flow through the orchestration pipeline, see How Orchestration Works. Chat mode also affects citation depth: Ask mode typically produces fewer citations because it runs a single agent, while Research and Investigate modes produce richer citation sets drawn from multiple sources.