Cloud Sync
Connect cloud storage to automatically sync documents into Thallus. Supported providers: Google Drive, OneDrive/SharePoint, and Nextcloud. Synced files are processed through the same pipeline as manual uploads and stay up to date as you modify them in the cloud.
Connecting cloud storage
Cloud sync requires an existing OAuth integration with the storage provider. If you haven't connected one yet, see Connecting Integrations.
Once connected, create a sync connection from Settings:
- Personal connections — Sync files from your own cloud storage into your personal collections
- Organization connections — Admin-managed connections that sync into organization collections visible to all members
Setting up folder sync
After creating a connection, browse your cloud folders from within Thallus and select which ones to sync. Each folder can be configured independently:
| Setting | Default | Description |
|---|---|---|
| Sync interval | 30 min | How often to check for changes (5 min to 24 hours) |
| File types | All supported | Filter by extension (e.g., only .pdf and .docx) |
| Max file size | 50 MB | Per-file size limit |
| Include subfolders | On | Recursively sync nested folders |
| Selected files | All | Optionally restrict sync to specific files |
Each synced folder maps to a document collection, so all files from a folder are organized together.
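As a rough illustration of how the per-folder settings above combine, the sketch below models them as a small Python data class. The class name `FolderSyncConfig`, its field names, and the `accepts` helper are hypothetical and not part of Thallus. Only the defaults and the 5 minute to 24 hour range come from the table. Treating 1 MB as 1024 × 1024 bytes is also an assumption.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional

MIN_INTERVAL_MIN = 5          # lower bound from the settings table
MAX_INTERVAL_MIN = 24 * 60    # 24 hours, upper bound from the settings table


@dataclass(frozen=True)
class FolderSyncConfig:
    """Hypothetical per-folder settings; names are illustrative only."""
    sync_interval_min: int = 30
    file_types: Optional[frozenset] = None       # None means all supported types
    max_file_size_mb: int = 50
    include_subfolders: bool = True
    selected_files: Optional[frozenset] = None   # None means every file

    def __post_init__(self):
        # Reject intervals outside the documented 5 min to 24 hour range.
        if not MIN_INTERVAL_MIN <= self.sync_interval_min <= MAX_INTERVAL_MIN:
            raise ValueError("sync interval must be between 5 minutes and 24 hours")

    def accepts(self, path: str, size_bytes: int) -> bool:
        """Return True if a file (path relative to the synced folder) passes all filters."""
        p = PurePosixPath(path)
        if not self.include_subfolders and len(p.parts) > 1:
            return False
        if self.file_types is not None and p.suffix.lower() not in self.file_types:
            return False
        # Assumption: 1 MB is counted as 1024 * 1024 bytes.
        if size_bytes > self.max_file_size_mb * 1024 * 1024:
            return False
        if self.selected_files is not None and path not in self.selected_files:
            return False
        return True
```

For example, `FolderSyncConfig(file_types=frozenset({".pdf", ".docx"}))` would accept `report.pdf` but reject `notes.txt`, and any file larger than the default 50 MB limit.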
How sync works
Initial sync pulls all matching files from the selected folder, downloads them, and queues each for processing.
Delta sync runs on schedule and detects three types of changes: new files, modified files, and deleted files.
Each provider uses its native change detection mechanism for efficient delta sync.
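The idea behind delta sync can be sketched as a comparison between the last-synced state and the current remote state. The code below is a simplified illustration, not the actual implementation: it assumes the change types are additions, modifications, and deletions, and it diffs two full listings, whereas each provider's native change feed would normally report changes directly. `RemoteFile` and `classify_changes` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical record of the metadata tracked for a cloud file."""
    file_id: str      # provider-assigned remote file ID
    version: str      # provider-reported version identifier
    modified: float   # modification time, seconds since the epoch


def classify_changes(known, remote):
    """Compare the last-synced state with a fresh remote listing.

    Both arguments map file_id -> RemoteFile.
    Returns (added, modified, deleted) as sorted lists of file IDs.
    """
    added = sorted(fid for fid in remote if fid not in known)
    deleted = sorted(fid for fid in known if fid not in remote)
    modified = sorted(
        fid
        for fid in remote
        if fid in known
        and (
            remote[fid].version != known[fid].version
            or remote[fid].modified != known[fid].modified
        )
    )
    return added, modified, deleted
```

Under this sketch, added and modified files would be queued for processing, and deleted files would be removed from the collection.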
Synced files count against your chunk budget just like manual uploads. If the budget runs out mid-sync, remaining files are skipped and flagged.
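A minimal sketch of the budget behavior described above, assuming each file's chunk count can be estimated before processing and that every file after the first one that no longer fits is skipped. The function name `plan_sync` and the per-file estimates are hypothetical. How Thallus actually measures and enforces the budget is not specified here.

```python
def plan_sync(files, remaining_chunks):
    """Split files into those that fit the chunk budget and those skipped.

    files is a list of (name, estimated_chunks) pairs in sync order.
    Returns (to_process, skipped), both lists of file names.
    """
    to_process, skipped = [], []
    exhausted = False
    for name, chunks in files:
        if not exhausted and chunks <= remaining_chunks:
            to_process.append(name)
            remaining_chunks -= chunks
        else:
            # Assumption: once the budget runs out, all remaining files are skipped.
            exhausted = True
            skipped.append(name)
    return to_process, skipped
```

With a remaining budget of 100 chunks and files estimated at 40, 70, and 10 chunks, the first file is processed and the other two are skipped and would be flagged.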
Sync status
Each synced folder shows its last sync time, file count, total size, and current status. You can also trigger a manual sync or toggle individual folders on and off without removing the connection.
Managing sync connections
- Toggle folders — Enable or disable individual folders without deleting the connection
- Manual sync — Trigger an immediate sync outside the regular schedule
- View status — See last sync time, file count, errors, and total size
- Delete connection — Remove the sync connection with an optional cascade that deletes all synced documents
Cloud-synced vs. manual documents
Synced files are processed through the same pipeline as manual uploads, with identical chunking, embedding, and synopsis generation. The key differences:
- Version tracking — Cloud files store a remote file ID, version, and modification time
- Automatic updates — Changes in the cloud are detected and the document is re-processed
- Deletion propagation — Files removed from cloud storage are removed from Thallus on the next sync
- Manual uploads — One-time; the file is not tracked for changes after upload
Troubleshooting
- Sync not running — Check that the folder is enabled and the connection is active
- Files skipped — Check file type filters and size limits; skipped files also appear when the chunk budget is exhausted
- Failed sync — Automatically retried up to 3 times with increasing delays. Check the error message for details
- OAuth expired — Reconnect the integration from Settings (see Connecting Integrations)
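The retry behavior mentioned under "Failed sync" could look roughly like the sketch below. It assumes exponential backoff (1, 2, then 4 times a base delay). The text only says the delays increase, so the exact schedule, the function name `run_with_retries`, and the `base_delay` parameter are illustrative assumptions.

```python
import time

MAX_RETRIES = 3  # from the text: failed syncs are retried up to 3 times


def run_with_retries(sync_once, base_delay=1.0, sleep=time.sleep):
    """Call sync_once, retrying up to MAX_RETRIES times with increasing delays.

    Assumption: delays double on each retry (base_delay * 1, 2, 4).
    The last error is re-raised once all retries are used up.
    """
    attempt = 0
    while True:
        try:
            return sync_once()
        except Exception:
            if attempt >= MAX_RETRIES:
                raise
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

The `sleep` parameter exists only so the delay can be replaced in tests. If every attempt fails, the final error is the one to check for details.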
Related pages
- Connecting Integrations — Set up OAuth connections for cloud storage
- Uploading Documents — Manual file uploads
- How Documents Are Processed — The processing pipeline shared by synced and uploaded files