Cloud Sync

Connect cloud storage to automatically sync documents into Thallus. Supported providers: Google Drive, OneDrive/SharePoint, and Nextcloud. Synced files are processed through the same pipeline as manual uploads and stay up to date as you modify them in the cloud.

Connecting cloud storage

Cloud sync requires an existing OAuth integration with the storage provider. If you haven't connected one yet, see Connecting Integrations.

Once connected, create a sync connection from Settings:

  • Personal connections — Sync files from your own cloud storage into your personal collections
  • Organization connections — Admin-managed connections that sync into organization collections visible to all members

Setting up folder sync

After creating a connection, browse your cloud folders from within Thallus and select which ones to sync. Each folder can be configured independently:

Setting             Default         Description
Sync interval       30 min          How often to check for changes (5 min to 24 hours)
File types          All supported   Filter by extension (e.g., only .pdf and .docx)
Max file size       50 MB           Per-file size limit
Include subfolders  On              Recursively sync nested folders
Selected files      All             Optionally whitelist specific files

Each synced folder maps to a document collection, so all files from a folder are organized together.
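
As an illustration of how the per-folder settings combine, here is a minimal Python sketch. It is not Thallus code: the class name FolderSyncSettings, its field names, and the accepts method are hypothetical, and it assumes 1 MB = 1024 × 1024 bytes and that file paths are given relative to the synced folder. Only the defaults and the 5 minute to 24 hour interval range come from the settings listed above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import FrozenSet, Optional

MIN_INTERVAL_MIN = 5          # documented lower bound
MAX_INTERVAL_MIN = 24 * 60    # documented upper bound (24 hours)


@dataclass(frozen=True)
class FolderSyncSettings:
    """Hypothetical per-folder settings using the documented defaults."""
    sync_interval_min: int = 30
    file_types: Optional[FrozenSet[str]] = None      # None means all supported types
    max_file_size_mb: int = 50
    include_subfolders: bool = True
    selected_files: Optional[FrozenSet[str]] = None  # None means all files

    def __post_init__(self) -> None:
        if not MIN_INTERVAL_MIN <= self.sync_interval_min <= MAX_INTERVAL_MIN:
            raise ValueError("sync interval must be between 5 minutes and 24 hours")

    def accepts(self, relative_path: str, size_bytes: int) -> bool:
        """Return True if a file passes every filter for this folder."""
        path = PurePosixPath(relative_path)
        # More than one path component means the file lives in a nested folder.
        if not self.include_subfolders and len(path.parts) > 1:
            return False
        if self.file_types is not None and path.suffix.lower() not in self.file_types:
            return False
        # Assumption: 1 MB is treated as 1024 * 1024 bytes.
        if size_bytes > self.max_file_size_mb * 1024 * 1024:
            return False
        if self.selected_files is not None and relative_path not in self.selected_files:
            return False
        return True
```

With file_types set to {".pdf", ".docx"}, a 1 KB report.pdf would be accepted, while notes.txt or a 51 MB PDF would be filtered out.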


How sync works

Initial sync pulls all matching files from the selected folder, downloads them, and queues each for processing.

Delta sync runs on schedule and detects three types of changes:

  • New files → downloaded and processed
  • Modified files → re-processed
  • Deleted files → removed from Thallus

Each provider uses its native change detection mechanism for efficient delta sync.
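
To make the three change types concrete, the sketch below classifies files by comparing the versions recorded at the last sync against a fresh remote listing. This is a deliberately simplified stand-in: it diffs full listings rather than using a provider's native change feed, and the names RemoteFile, Delta, and classify_changes are hypothetical rather than part of Thallus.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical view of a file as reported by the storage provider."""
    file_id: str
    version: str


class Delta(NamedTuple):
    new: List[str]        # download and process
    modified: List[str]   # re-process
    deleted: List[str]    # remove


def classify_changes(known: Dict[str, str], remote: List[RemoteFile]) -> Delta:
    """Compare versions stored at the last sync with the current remote listing.

    `known` maps remote file ID to the version recorded at the last sync.
    """
    remote_versions = {f.file_id: f.version for f in remote}
    new = sorted(fid for fid in remote_versions if fid not in known)
    modified = sorted(
        fid for fid, version in remote_versions.items()
        if fid in known and known[fid] != version
    )
    deleted = sorted(fid for fid in known if fid not in remote_versions)
    return Delta(new, modified, deleted)
```

A file whose ID is unknown locally counts as new, one whose version differs counts as modified, and one missing from the remote listing counts as deleted.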

Synced files count against your chunk budget just like manual uploads. If the budget runs out mid-sync, remaining files are skipped and flagged.
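
The budget behavior can be pictured with a short sketch. It assumes each file's chunk count can be estimated before processing and that, once one file no longer fits, every remaining file in that run is skipped. Both are assumptions made for illustration, and apply_chunk_budget is a hypothetical name.

```python
from typing import Iterable, List, Tuple


def apply_chunk_budget(
    files: Iterable[Tuple[str, int]], remaining_chunks: int
) -> Tuple[List[str], List[str]]:
    """Split (name, estimated_chunks) pairs into processed and skipped names."""
    processed: List[str] = []
    skipped: List[str] = []
    exhausted = False
    for name, chunks in files:
        if not exhausted and chunks <= remaining_chunks:
            processed.append(name)
            remaining_chunks -= chunks
        else:
            # Assumption: after the budget runs out, all remaining files
            # are skipped and flagged, even ones small enough to fit.
            exhausted = True
            skipped.append(name)
    return processed, skipped
```

For example, with 100 chunks left and files estimated at 40, 50, 30, and 5 chunks, the first two would be processed and the last two skipped.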


Sync status

  • Marketing / Campaigns (Google Drive · 23 files · 142 MB): Synced
  • Engineering / Specs (OneDrive · 8 files · 34 MB): Syncing
  • Shared Docs (Nextcloud · Last sync failed): Error

Each synced folder shows its last sync time, file count, total size, and current status. You can also trigger a manual sync or toggle individual folders on and off without removing the connection.


Managing sync connections

  • Toggle folders — Enable or disable individual folders without deleting the connection
  • Manual sync — Trigger an immediate sync outside the regular schedule
  • View status — See last sync time, file count, errors, and total size
  • Delete connection — Remove the sync connection with an optional cascade that deletes all synced documents

Cloud-synced vs. manual documents

Synced files are processed through the same pipeline as manual uploads — identical chunking, embedding, and synopsis generation. The key differences:

  • Version tracking — Cloud files store a remote file ID, version, and modification time
  • Automatic updates — Changes in the cloud are detected and the document is re-processed
  • Deletion propagation — Files removed from cloud storage are removed from Thallus on the next sync
  • Manual uploads — One-time; the file is not tracked for changes after upload

Troubleshooting

  • Sync not running — Check that the folder is enabled and the connection is active
  • Files skipped — Check file type filters and size limits; files are also skipped when the chunk budget is exhausted
  • Failed sync — Automatically retried up to 3 times with increasing delays. Check the error message for details
  • OAuth expired — Reconnect the integration from Settings (see Connecting Integrations)
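
The retry behavior mentioned under Failed sync could look roughly like the following sketch. Only the limit of 3 retries and the increasing delays come from the text above; the 30-second base delay, the doubling schedule, and the function name run_with_retries are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_retries: int = 3,            # documented: retried up to 3 times
    base_delay_s: float = 30.0,      # assumption: the real delays are not documented
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on failure with a delay that doubles each time."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= max_retries:
                raise  # retries used up; surface the error to the caller
            sleep(base_delay_s * (2 ** attempt))  # 30 s, 60 s, 120 s by default
            attempt += 1
```

Passing a custom sleep function makes the schedule easy to test without waiting, for example by collecting the delays in a list.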