How Orchestration Works

When you send a message, Thallus doesn't just call a single AI model. It runs an orchestration pipeline that plans, executes, evaluates, and synthesizes a response using specialized agents. This page explains how that pipeline works.

The orchestration pipeline

Every query flows through the same high-level pipeline, though the specifics vary by chat mode:

Query → Mode Detection → Planning → Execution → Evaluation → Synthesis
  1. Query — Your message arrives at the API
  2. Mode Detection — If set to Auto, an LLM classifies the query to choose Ask, Research, or Investigate
  3. Planning — A planner creates an execution plan with steps and dependencies
  4. Execution — Agents run the plan steps, potentially in parallel
  5. Evaluation — Results are checked for completeness; replanning may occur
  6. Synthesis — A final response is composed from all agent results and streamed to you
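
As a rough illustration of the Mode Detection stage, the sketch below resolves a requested mode to one of the three paths. Everything in it is hypothetical: the `Mode` enum, the `resolve_mode` function, and the `classify` callable (a stand-in for the LLM classifier) are not Thallus APIs, and the fallback to Research on an unrecognized label is an assumption made only so the example is total.

```python
from enum import Enum
from typing import Callable


class Mode(Enum):
    ASK = "ask"
    RESEARCH = "research"
    INVESTIGATE = "investigate"


def resolve_mode(requested: str, query: str, classify: Callable[[str], str]) -> Mode:
    """Return the mode to run. `classify` stands in for the LLM classifier."""
    requested = requested.strip().lower()
    if requested != "auto":
        # An explicit choice skips classification entirely.
        return Mode(requested)
    label = classify(query).strip().lower()
    try:
        return Mode(label)
    except ValueError:
        # Assumption: default to Research when the label is not recognized.
        return Mode.RESEARCH


# A trivial classifier used only for demonstration.
print(resolve_mode("auto", "What is our refund policy?", lambda q: "Ask"))
# Mode.ASK
```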

Planning

The planner is an LLM that receives your query along with rich context about your environment:

  • Available agents — Which agents are registered and accessible to your organization
  • Board context — Table schemas, document catalog, cross-source links, and join maps from your conversation's board
  • Memory context — Your personal memories (preferences, facts, patterns)
  • Conversation history — Recent messages for follow-up context
  • Prior execution — Results from earlier queries in the same conversation

From this, the planner produces an execution plan: a structured list of steps in which each step records the earlier steps it depends on.

Execution Plan
  Step 1: Search documents for quarterly reports
  Step 2: Query sales database for Q4 figures (runs in parallel with Step 1)
  Step 3: Analyze and compare results (depends on Steps 1 and 2)

Steps with no dependencies on each other can run simultaneously. Steps that depend on earlier results wait until those complete.
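
One way to picture dependency tracking is as a small graph that is consumed in batches. The sketch below is a minimal illustration, not the actual plan format: the `Step` class, its field names, and the `batches` helper are assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    id: int
    task: str
    depends_on: set[int] = field(default_factory=set)


def ready_steps(plan: list[Step], done: set[int]) -> list[Step]:
    """Steps that have not run yet and whose dependencies are all complete."""
    return [s for s in plan if s.id not in done and s.depends_on <= done]


def batches(plan: list[Step]) -> list[list[int]]:
    """Group step ids into batches that could run at the same time."""
    done: set[int] = set()
    order: list[list[int]] = []
    while len(done) < len(plan):
        ready = ready_steps(plan, done)
        if not ready:
            raise ValueError("plan has a dependency cycle or a missing step")
        order.append([s.id for s in ready])
        done.update(s.id for s in ready)
    return order


plan = [
    Step(1, "Search documents for quarterly reports"),
    Step(2, "Query sales database for Q4 figures"),
    Step(3, "Analyze and compare results", {1, 2}),
]
print(batches(plan))
# [[1, 2], [3]]
```

Steps 1 and 2 land in the same batch because neither depends on the other; Step 3 waits for both.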

Execution

The executor walks the plan graph, running every step whose dependencies have been met. Independent steps run concurrently, so a plan with two independent searches completes in about the time of the slower one, not the sum of both.

Each step dispatches to the assigned agent for execution. The agent receives the step's task description plus context including:

  • Results from previously completed steps
  • Board state (schemas, datasets, findings)
  • Document catalog for search agents
  • Data previews for query agents

Agents work through their assigned task — calling tools, processing results, and iterating until they have a complete answer.
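
The timing behavior described in this section can be reproduced with a small `asyncio` sketch. It is an illustration under stated assumptions rather than the real executor: the `PLAN` format, `run_agent`, and `execute` are hypothetical names, and `asyncio.sleep` stands in for real agent work.

```python
import asyncio
import time

# Hypothetical plan format: step id -> (task, simulated seconds, dependency ids).
PLAN = {
    1: ("Search documents for quarterly reports", 0.3, []),
    2: ("Query sales database for Q4 figures", 0.2, []),
    3: ("Analyze and compare results", 0.0, [1, 2]),
}


async def run_agent(task, seconds, prior_results):
    """Stand-in for an agent: sleeps, then reports how much context it received."""
    await asyncio.sleep(seconds)
    return f"{task} [inputs from {len(prior_results)} earlier steps]"


async def execute(plan):
    results = {}
    while len(results) < len(plan):
        # A step is ready once every step it depends on has a result.
        ready = [
            sid for sid, (_, _, deps) in plan.items()
            if sid not in results and all(d in results for d in deps)
        ]
        if not ready:
            raise ValueError("remaining steps have unsatisfiable dependencies")
        # Run the whole batch concurrently; each step sees its dependencies' results.
        outputs = await asyncio.gather(*(
            run_agent(plan[sid][0], plan[sid][1], {d: results[d] for d in plan[sid][2]})
            for sid in ready
        ))
        results.update(zip(ready, outputs))
    return results


start = time.perf_counter()
results = asyncio.run(execute(PLAN))
elapsed = time.perf_counter() - start
print(results[3])
print(f"elapsed: {elapsed:.2f}s")
```

With these simulated delays, the first batch takes roughly as long as the 0.3-second search rather than the 0.5 seconds the two steps would need back to back.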

Evaluation and replanning

After each batch of steps completes, an evaluator examines the results and makes one of three decisions:

Evaluator checks results → ✓ Complete, ↻ Replan, or ? Ask user
  • Complete — The success criteria are met with actual answers. Move to synthesis.
  • Replan — Results are incomplete or a query returned no data. The evaluator identifies the specific issue (e.g., wrong date filter, missing data source) and new steps are appended to the plan. Multiple replans are allowed before the system synthesizes whatever it has.
  • Ask user — The system needs clarification to proceed. It presents you with a question — multiple choice, yes/no, or free text — explaining what it found and why it needs your input.
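
The three outcomes can be sketched as a small decision function. The names below (`Decision`, `Verdict`, `next_action`) are hypothetical, and the replan limit of 3 is an assumption chosen for the example; this page only says that multiple replans are allowed.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    COMPLETE = "complete"
    REPLAN = "replan"
    ASK_USER = "ask_user"


@dataclass
class Verdict:
    decision: Decision
    issue: str = ""      # what went wrong, for a replan
    question: str = ""   # what to ask, for an ask-user outcome


MAX_REPLANS = 3  # assumed value, not documented


def next_action(verdict: Verdict, replans_used: int, max_replans: int = MAX_REPLANS) -> str:
    """Map an evaluator verdict to what the pipeline does next."""
    if verdict.decision is Decision.COMPLETE:
        return "synthesize"
    if verdict.decision is Decision.ASK_USER:
        return "ask_user"
    # Replan: append new steps unless the replan budget is spent,
    # in which case the pipeline synthesizes from what it has.
    if replans_used >= max_replans:
        return "synthesize"
    return "append_steps"


print(next_action(Verdict(Decision.REPLAN, issue="wrong date filter"), replans_used=0))
# append_steps
```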

The Ask path

Ask mode sends your query directly to a single agent for a fast response. If the answer needs more depth, it automatically escalates to the full Research pipeline.
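
The escalation can be pictured as a simple fallback. In the sketch below, `fast_agent`, `needs_more_depth`, and `research_pipeline` are hypothetical callables supplied by the caller; how the real system decides that an answer needs more depth is not described here, so that check is left as a parameter.

```python
def answer_ask(query, fast_agent, needs_more_depth, research_pipeline):
    """Try a single agent first; escalate to the full pipeline if needed."""
    draft = fast_agent(query)
    if needs_more_depth(query, draft):
        return research_pipeline(query)
    return draft


reply = answer_ask(
    "What is our refund window?",
    fast_agent=lambda q: "30 days",
    needs_more_depth=lambda q, draft: False,
    research_pipeline=lambda q: "full research answer",
)
print(reply)
# 30 days
```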

The Research path

Research mode runs the complete pipeline:

  1. Context loading — Your data environment is prepared for planning
  2. Planning — The planner creates an execution plan based on full context and memory
  3. Execution loop — Ready steps run in parallel batches
  4. Evaluation — After each batch, the evaluator decides to complete, replan, or ask the user
  5. Synthesis — All agent results and citations are collected and streamed as a final response
  6. Snapshot — Execution state is saved for follow-up queries in the same conversation

The Investigate path

Investigate mode uses a reactive approach — exploring your question from multiple angles, updating its understanding as new evidence comes in, and checking in with you at key decision points. You can choose to continue, focus on a specific angle, or summarize what's been found so far.

This makes Investigate mode better suited than Research mode to open-ended questions where you don't know in advance what steps are needed.
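
A reactive loop of this kind might look like the sketch below. It is illustrative only: `investigate`, `explore`, and `check_in` are hypothetical names, the check-in after every round is a simplification of "key decision points", and the `max_rounds` cap is an assumption.

```python
def investigate(question, explore, check_in, max_rounds=5):
    """Explore in rounds, letting the user steer after each one."""
    findings = []
    focus = None
    for _ in range(max_rounds):
        # Each round sees the current focus and everything found so far.
        findings.append(explore(question, focus, list(findings)))
        choice, detail = check_in(findings)
        if choice == "summarize":
            break
        if choice == "focus":
            focus = detail
        # "continue" keeps exploring with the current focus.
    return findings


# Scripted user choices, for demonstration only.
script = iter([("continue", None), ("focus", "pricing"), ("summarize", None)])
log = investigate(
    "Why did churn rise?",
    explore=lambda q, focus, seen: f"round {len(seen) + 1}, focus={focus}",
    check_in=lambda findings: next(script),
)
print(log)
# ['round 1, focus=None', 'round 2, focus=None', 'round 3, focus=pricing']
```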

Why plans differ

Two identical questions can produce different plans. This is by design — the planner considers:

  • Board state — Which data sources and documents are connected affects what steps are possible
  • Memory — Your stored preferences and facts shape how the planner approaches your question (see Memory System)
  • Conversation history — Follow-up questions incorporate prior results and context
  • Prior execution — In the same conversation, Thallus knows what datasets were already produced and what was already synthesized

This context-awareness is what makes Thallus's responses tailored to your specific environment rather than generic.
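
To make this concrete, the sketch below bundles the planner's inputs into one object and renders them as text. The `PlannerContext` class, its field names, the `planner_input` function, and the sample agent and table names are all hypothetical; the point is only that the same query paired with a different context gives the planner a different input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannerContext:
    agents: tuple[str, ...] = ()
    board: tuple[str, ...] = ()
    memories: tuple[str, ...] = ()
    history: tuple[str, ...] = ()
    prior_execution: tuple[str, ...] = ()


def planner_input(query: str, ctx: PlannerContext) -> str:
    """Render the query and any non-empty context sections as plain text."""
    sections = [
        ("Query", (query,)),
        ("Available agents", ctx.agents),
        ("Board context", ctx.board),
        ("Memory context", ctx.memories),
        ("Conversation history", ctx.history),
        ("Prior execution", ctx.prior_execution),
    ]
    lines = []
    for title, items in sections:
        if items:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
    return "\n".join(lines)


query = "How did Q4 sales compare to the quarterly reports?"
docs_only = PlannerContext(agents=("document_search",))
with_sales = PlannerContext(
    agents=("document_search", "sql_query"),
    board=("table: sales(region, quarter, revenue)",),
)
print(planner_input(query, docs_only) == planner_input(query, with_sales))
# False
```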

For details on what you see while the pipeline is running, see Progress Tracking. For how sources are tracked through the pipeline, see Citations & Sources.