Schema Discovery

When you connect a database, Thallus automatically discovers its structure — tables, columns, types, and relationships — so agents know what data is available to query. This happens immediately after a successful connection test.

What gets discovered

Connect → Discover tables → Map columns → Infer relationships → Ready

Thallus introspects your database and captures:

  • Tables, views, and materialized views — Everything agents might need to query
  • Column names, data types, nullability, and defaults — The full column definition for each table
  • Primary keys and foreign keys — Explicit constraints defined in your schema
  • Row count estimates and table sizes — Helps agents understand data volume
  • System schemas excluded — Internal database system schemas are automatically filtered out
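
To make the list above concrete, here is a minimal sketch of what this kind of introspection can look like. It is not Thallus's implementation: it uses SQLite from Python's standard library as a stand-in database, and the function name `discover_schema` and the shape of the returned dictionary are illustrative assumptions. It covers tables, views, column definitions, and keys; row estimates and materialized views are left out because SQLite does not expose them the same way.

```python
import sqlite3


def discover_schema(conn: sqlite3.Connection) -> dict:
    """Return {object_name: {"kind", "columns", "foreign_keys"}} for user tables and views."""
    schema = {}
    # Skip SQLite's internal objects, analogous to filtering out system schemas.
    objects = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    for name, kind in objects:
        quoted = '"' + name.replace('"', '""') + '"'
        # PRAGMA table_info rows: cid, name, type, notnull, dflt_value, pk
        columns = [
            {
                "name": col_name,
                "type": col_type,
                "nullable": not notnull,
                "default": default,
                "primary_key": pk > 0,
            }
            for _cid, col_name, col_type, notnull, default, pk in conn.execute(
                f"PRAGMA table_info({quoted})"
            )
        ]
        # PRAGMA foreign_key_list rows: id, seq, table, from, to, ...
        foreign_keys = [
            {"column": row[3], "ref_table": row[2], "ref_column": row[4]}
            for row in conn.execute(f"PRAGMA foreign_key_list({quoted})")
        ]
        schema[name] = {"kind": kind, "columns": columns, "foreign_keys": foreign_keys}
    return schema


# Example database (hypothetical tables, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL DEFAULT 0
    );
    """
)
schema = discover_schema(conn)
```

In this sketch, `schema["orders"]["foreign_keys"]` holds the explicit link from `orders.customer_id` to `customers.id`, which is the kind of constraint that can be used directly as a JOIN path.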

Relationship inference

Foreign keys defined in your database are picked up directly and used for JOIN paths. But many databases have implicit relationships that aren't formalized as constraints. Thallus identifies these too:

  • Column patterns suggest additional relationships between tables based on common conventions.
  • The resulting relationship graph enables agents to write multi-table JOINs automatically — even when your database doesn't have explicit foreign key constraints.
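
As an illustration of convention-based inference, the sketch below guesses relationships from one widely used naming pattern: a column named `<table>_id` pointing at a table that has an `id` column. The exact heuristics Thallus applies are not described here, so the rule, the function name `infer_relationships`, and the sample tables are all assumptions made for the example.

```python
def infer_relationships(tables: dict[str, list[str]]) -> list[tuple[str, str, str, str]]:
    """Guess (table, column, ref_table, ref_column) links from `<name>_id` columns."""
    inferred = []
    for table, columns in tables.items():
        for column in columns:
            if not column.endswith("_id"):
                continue
            stem = column[: -len("_id")]
            # Try the stem itself plus two simple plural forms as the target table name.
            for candidate in (stem, stem + "s", stem + "es"):
                if candidate in tables and candidate != table and "id" in tables[candidate]:
                    inferred.append((table, column, candidate, "id"))
                    break
    return inferred


# Hypothetical schema with no declared foreign keys.
tables = {
    "customers": ["id", "name"],
    "orders": ["id", "customer_id", "total"],
    "order_items": ["id", "order_id", "sku"],
}
links = infer_relationships(tables)
```

Here `links` connects `orders.customer_id` to `customers.id` and `order_items.order_id` to `orders.id`. A heuristic like this can produce false positives, so inferred links are best treated as suggestions rather than guarantees.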

The semantic layer

On top of the raw schema, you can add your own annotations to help agents write better queries:

Table: cust_txn_hist
Display name: Customer Transaction History
Description: All completed customer purchases and refunds since 2021
Business context: Use txn_type = 'REFUND' for returns. Amount is always in USD. Negative amounts indicate credits.

The semantic layer includes three fields per table or column:

  • Display name — A friendly name for cryptic table or column names (e.g., cust_txn_hist becomes "Customer Transaction History")
  • Description — What does this table represent? What data does it contain?
  • Business context — Domain-specific notes that help agents write more accurate queries (e.g., "amounts are in cents, divide by 100 for dollars")

These annotations flow directly into agent prompts. The more context you provide, the better agents understand your data and the more accurate their queries become.
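
To show how annotations might reach an agent, here is a small sketch that stores the three fields and renders them as prompt text. The `Annotation` class, the `render_for_prompt` function, and the output layout are hypothetical; the actual prompt format Thallus uses is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Semantic-layer fields for one table (hypothetical structure)."""
    name: str
    display_name: str = ""
    description: str = ""
    business_context: str = ""


def render_for_prompt(a: Annotation) -> str:
    """Format an annotation as plain text, skipping fields that are empty."""
    header = f"{a.name} ({a.display_name})" if a.display_name else a.name
    lines = [f"Table {header}"]
    if a.description:
        lines.append(f"  Description: {a.description}")
    if a.business_context:
        lines.append(f"  Notes: {a.business_context}")
    return "\n".join(lines)


annotation = Annotation(
    name="cust_txn_hist",
    display_name="Customer Transaction History",
    description="All completed customer purchases and refunds since 2021",
    business_context="Use txn_type = 'REFUND' for returns. Amount is always in USD.",
)
prompt_text = render_for_prompt(annotation)
```

With this context in the prompt, an agent asked about refunds has what it needs to filter on `txn_type = 'REFUND'` instead of guessing at the column's meaning.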


Refreshing schema

Schema discovery does not run automatically on a schedule. If your database schema changes — new tables, renamed columns, altered types — you'll need to refresh manually:

  • Navigate to Data Connections and select the connection
  • Click Refresh schema to re-run discovery

During a refresh, stale tables that no longer exist in the source database are automatically removed. The schema cache timestamp is visible in the connection details, so you can see when the last refresh occurred.
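
The refresh behavior described above can be pictured as a comparison between the cached schema and a fresh discovery run. The sketch below is an assumption about the general shape of that logic, not Thallus's code; the `refresh_cache` function and the cache layout (`tables`, `refreshed_at`) are hypothetical names.

```python
from datetime import datetime, timezone


def refresh_cache(cache: dict, discovered: dict) -> dict:
    """Replace cached tables with freshly discovered ones and report what changed."""
    stale = sorted(set(cache["tables"]) - set(discovered))
    added = sorted(set(discovered) - set(cache["tables"]))
    # Tables missing from the new discovery run drop out of the cache here.
    cache["tables"] = dict(discovered)
    # Record when the refresh happened, analogous to the schema cache timestamp.
    cache["refreshed_at"] = datetime.now(timezone.utc)
    return {"added": added, "removed": stale}


cache = {"tables": {"customers": {}, "legacy_orders": {}}, "refreshed_at": None}
changes = refresh_cache(cache, {"customers": {}, "orders": {}})
```

In this example `changes` reports `legacy_orders` as removed and `orders` as added, and `cache["refreshed_at"]` holds the time of the refresh.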