System Architecture Overview
This document provides a technical overview of the Cloudillo system architecture, explaining the core patterns and components that enable its federated, privacy-focused design.
Workspace Structure
Cloudillo is organized as a Rust workspace with the following crates:
cloudillo-rs/
├── server/ # Core library (cloudillo)
├── basic-server/ # Reference implementation
└── adapters/
    ├── auth-adapter-sqlite/ # Authentication & cryptography (SQLite)
    ├── meta-adapter-sqlite/ # Metadata storage (SQLite)
    ├── blob-adapter-fs/ # Binary blob storage (Filesystem)
    ├── rtdb-adapter-redb/ # Real-time database (Redb)
    └── crdt-adapter-redb/ # Collaborative editing CRDT (Redb)
Crate Responsibilities
- server: Core business logic, HTTP handlers, federation, task system
- basic-server: Executable reference implementation using SQLite, Redb, and filesystem
- adapters: Five pluggable storage backends implementing the core adapter traits
Adapter Types
The five adapters separate concerns and enable flexible deployments:
- AuthAdapter - Authentication, JWT tokens, certificate management, cryptographic operations
- MetaAdapter - Tenant/profile data, action tokens, file metadata, tasks
- BlobAdapter - Content-addressed immutable binary data (files, images, snapshots)
- RtdbAdapter - Real-time hierarchical JSON database with queries and subscriptions
- CrdtAdapter - Collaborative document editing with conflict-free merges
Core Architecture Patterns
The Five-Adapter Architecture
Cloudillo’s architecture is built on five fundamental adapters that separate concerns and enable flexible deployments:
1. AuthAdapter
Purpose: Authentication, authorization, and cryptographic operations
Responsibilities:
- JWT/token validation and creation
- TLS certificate management (ACME integration)
- Profile signing key storage and rotation
- VAPID keys for push notifications
- Password hashing and WebAuthn credentials
- Tenant management
Key Operations: Validate tokens, create access tokens, manage TLS certificates, handle profile keys, store passwords/credentials
Why separate? Authentication and cryptography require special security considerations and may need different storage backends (HSM, vault services, etc.).
2. MetaAdapter
Purpose: Structured metadata storage
Responsibilities:
- Tenant and profile information
- Action tokens (posts, comments, reactions, etc.)
- File metadata with variants
- Task persistence for the scheduler
- Database metadata (for RTDB)
Key Operations: Create/query tenants, store/retrieve actions, manage file metadata, persist tasks, handle profiles
Why separate? Metadata can be stored in different databases (SQLite, PostgreSQL, MongoDB) based on scale and requirements.
3. BlobAdapter
Purpose: Immutable binary data storage
Responsibilities:
- Content-addressed blob storage
- File data (images, videos, documents)
- Database snapshots (for RTDB)
- Both buffered and streaming I/O
Key Operations: Write/read blobs (buffered or streaming), check blob existence and size
Why separate? Blob storage can use filesystem, S3, CDN, or specialized object storage based on scale and cost considerations.
4. RtdbAdapter
Purpose: Real-time structured database
Responsibilities:
- Path-based hierarchical storage (Firebase-like)
- Query with filtering, sorting, pagination
- Real-time subscriptions via WebSocket
- Transaction support for atomic writes
- Secondary indexes for performance
- Multi-tenant database isolation
Key Operations: Query/create/update/delete documents, transactions, real-time subscriptions
Why separate? Real-time database functionality can use different backends (redb, PostgreSQL, MongoDB) based on query requirements and scale.
Learn more: RTDB with redb
5. CrdtAdapter
Purpose: Collaborative document storage with CRDTs
Responsibilities:
- Binary CRDT update storage (Yjs protocol)
- Document metadata management
- Real-time change subscriptions
- Conflict-free merge guarantees
- Awareness state (presence, cursors)
- Multi-tenant document isolation
Key Operations: Create/read documents, append updates, retrieve update history, subscribe to changes
Why separate? CRDT storage can use different backends (redb, dedicated CRDT stores) and has different performance characteristics than traditional databases.
Learn more: CRDT Collaborative Editing
Benefits of the Adapter Pattern
✅ Flexible Deployment: Switch storage backends without changing core logic
✅ Separation of Concerns: Security, metadata, and binary data have different requirements
✅ Testing: Easy to create in-memory adapters for testing
✅ Scalability: Can distribute adapters across different services
✅ Cost Optimization: Use appropriate storage for each data type
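To make the testing benefit concrete, here is a minimal sketch of an in-memory blob backend. The `BlobStore` trait, its method names, and the `String` error type are simplified assumptions for illustration; they are not the actual `BlobAdapter` trait from the `server` crate.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for a blob adapter trait.
pub trait BlobStore: Send + Sync {
    fn write(&self, blob_id: &str, data: &[u8]) -> Result<(), String>;
    fn read(&self, blob_id: &str) -> Result<Option<Vec<u8>>, String>;
    fn size(&self, blob_id: &str) -> Result<Option<usize>, String>;
}

/// In-memory backend, convenient for unit tests.
#[derive(Default)]
pub struct MemoryBlobStore {
    blobs: Mutex<HashMap<String, Vec<u8>>>,
}

impl BlobStore for MemoryBlobStore {
    fn write(&self, blob_id: &str, data: &[u8]) -> Result<(), String> {
        let mut blobs = self.blobs.lock().map_err(|e| e.to_string())?;
        // Blobs are immutable: if the ID already exists, keep the first copy.
        blobs.entry(blob_id.to_string()).or_insert_with(|| data.to_vec());
        Ok(())
    }

    fn read(&self, blob_id: &str) -> Result<Option<Vec<u8>>, String> {
        let blobs = self.blobs.lock().map_err(|e| e.to_string())?;
        Ok(blobs.get(blob_id).cloned())
    }

    fn size(&self, blob_id: &str) -> Result<Option<usize>, String> {
        let blobs = self.blobs.lock().map_err(|e| e.to_string())?;
        Ok(blobs.get(blob_id).map(Vec::len))
    }
}

/// Core logic depends only on the trait, so any backend can be injected.
pub fn store_and_measure(store: &dyn BlobStore, id: &str, data: &[u8]) -> Result<usize, String> {
    store.write(id, data)?;
    Ok(store.size(id)?.unwrap_or(0))
}
```

Because `store_and_measure` takes `&dyn BlobStore`, the same call works unchanged against a filesystem or object-storage implementation of the trait.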
Content-Addressed Architecture
Cloudillo uses content-addressing throughout its architecture, where resource identifiers are cryptographic hashes of their content. This creates a Merkle tree structure that makes content immutable and tampering detectable; combined with signatures, it also proves authenticity.
Hash-Based Identifiers
Resource IDs carry a versioned prefix. Action, file, and blob IDs are SHA-256 hashes; a descriptor is the encoded string itself:
| Prefix | Resource Type | Hash Input | Example |
|---|---|---|---|
| a1~ | Action | Entire JWT token (header + payload + signature) | a1~8kR3mN9pQ2vL6xW... |
| f1~ | File | File descriptor string | f1~Qo2E3G8TJZ2HTGh... |
| b1~ | Blob | Blob bytes (actual image/video data) | b1~abc123def456ghi... |
| d1~ | Descriptor | (not a hash, the encoded string itself) | d1~tn:b1~abc:f=AVIF:... |
Version Scheme
Format: {prefix}{version}~{base64_encoded_hash}
- Version 1: SHA-256 with base64url encoding (no padding)
- Future versions: Can upgrade to SHA-3, BLAKE3, etc. without breaking old content
- Backward compatibility: Old content remains valid forever
- Algorithm agility: Migrate to new algorithms without breaking existing references
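As an illustration of the format above, the following sketch splits an identifier into its prefix, version, and body. The type and function names (`ResourceKind`, `ResourceId`, `parse_resource_id`) are hypothetical and not taken from the Cloudillo codebase; the sketch does not validate the base64url body.

```rust
/// Resource kinds, keyed by the one-letter prefix from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResourceKind {
    Action,
    File,
    Blob,
    Descriptor,
}

/// A parsed identifier of the form {prefix}{version}~{body}.
#[derive(Debug, PartialEq, Eq)]
pub struct ResourceId<'a> {
    pub kind: ResourceKind,
    pub version: u32,
    pub body: &'a str,
}

/// Splits an ID at the first '~' and interprets the part before it.
/// Returns None for unknown prefixes, missing versions, or empty bodies.
pub fn parse_resource_id<'a>(id: &'a str) -> Option<ResourceId<'a>> {
    let (head, body) = id.split_once('~')?;
    let mut chars = head.chars();
    let kind = match chars.next()? {
        'a' => ResourceKind::Action,
        'f' => ResourceKind::File,
        'b' => ResourceKind::Blob,
        'd' => ResourceKind::Descriptor,
        _ => return None,
    };
    // Everything after the prefix letter is the numeric version.
    let version: u32 = chars.as_str().parse().ok()?;
    if body.is_empty() {
        return None;
    }
    Some(ResourceId { kind, version, body })
}
```

Keeping the version next to the prefix lets a verifier choose the hash algorithm before it touches the body, which is what allows old and new identifiers to coexist.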
Example upgrade path:
a1~... (SHA-256)
a2~... (SHA-3)
a3~... (BLAKE3)
Six-Level Merkle Tree
Content-addressing creates a hierarchical Merkle tree:
Level 1: Blob Data (raw bytes)
↓ SHA-256 hash
Level 2: Blob ID (b1~hash)
↓ collected in descriptor
Level 3: File Descriptor (d1~variant:b1~hash:format:size:resolution,...)
↓ SHA-256 hash of descriptor
Level 4: File ID (f1~hash)
↓ referenced in action
Level 5: Action Token (JWT with content, parent, attachments)
↓ SHA-256 hash of entire JWT
Level 6: Action ID (a1~hash)
Properties
✅ Immutable: Content cannot change without changing the ID
✅ Tamper-Evident: Any modification is immediately detectable
✅ Deduplicatable: Identical content produces identical IDs
✅ Verifiable: Anyone can recompute and verify hashes
✅ Cacheable: Content-addressed data can be cached forever
✅ Trustless: No need to trust storage providers; verify the hash
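The properties above follow from the way each level hashes the level below it. The sketch below walks levels 1 to 4 for a file with two variants. Two loud assumptions: `toy_hash` is a 64-bit FNV-1a stand-in used only to keep the example dependency-free (the real system uses SHA-256 with base64url encoding), and the descriptor layout is a simplified, hypothetical version of the real format.

```rust
/// Toy 64-bit FNV-1a hash, hex-encoded. A stand-in ONLY: not SHA-256 and
/// not collision-resistant. Do not use for real content addressing.
fn toy_hash(data: &[u8]) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", h)
}

/// Levels 1 -> 2: blob bytes to blob ID.
pub fn blob_id(bytes: &[u8]) -> String {
    format!("b1~{}", toy_hash(bytes))
}

/// Level 3: descriptor listing (variant label, blob ID) pairs.
/// The layout is simplified for illustration.
pub fn descriptor(variants: &[(&str, String)]) -> String {
    let parts: Vec<String> = variants
        .iter()
        .map(|(label, id)| format!("{}:{}", label, id))
        .collect();
    format!("d1~{}", parts.join(","))
}

/// Levels 3 -> 4: descriptor string to file ID.
pub fn file_id(descriptor: &str) -> String {
    format!("f1~{}", toy_hash(descriptor.as_bytes()))
}

/// Full chain for a file with a thumbnail and an original variant.
pub fn file_id_for(thumbnail: &[u8], original: &[u8]) -> String {
    let d = descriptor(&[("tn", blob_id(thumbnail)), ("orig", blob_id(original))]);
    file_id(&d)
}
```

Changing a single byte of either variant changes its blob ID, which changes the descriptor, which changes the file ID. Identical inputs always produce the same file ID, which is what makes deduplication and independent verification possible.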
Integration with Adapters
Content-addressing is implemented across multiple adapters:
BlobAdapter:
- Stores blobs indexed by blob_id (b1~...)
- Blob IDs are SHA-256 hashes of the blob bytes
- Enables deduplication and integrity verification
MetaAdapter:
- Stores action tokens indexed by action_id (a1~...)
- Action IDs are SHA-256 hashes of the JWT token
- Stores file metadata with file_id (f1~...)
- File IDs are SHA-256 hashes of the descriptor
AuthAdapter:
- Signs action tokens with profile keys (ES384)
- Signature + content hash = complete authenticity proof
Security Benefits
Content-addressing provides multiple security layers:
- Integrity Verification: SHA-256 ensures data hasn’t been tampered with
- Deduplication: Same content = same hash, prevents storage waste
- Cryptographic Binding: Parent references create immutable chains
- Federation Trust: Remote instances can verify data integrity
- Cache Safety: Content-addressed data can be cached without trust
Performance Benefits
- Caching: Immutable content can be cached forever (max-age=31536000)
- Deduplication: Identical blobs stored only once across all tenants
- Parallel Verification: Hash verification can be parallelized
- CDN-Friendly: Content-addressed resources are well suited to CDN distribution
Learn more: Content-Addressing & Merkle Trees
Task-Based Asynchronous Processing
Complex operations in Cloudillo are modeled as persistent tasks that can execute asynchronously, survive restarts, and depend on other tasks.
Task System Components
Tasks
Tasks implement the Task<S> trait:
pub trait Task<S>: Debug + Send + Sync {
    fn kind() -> &'static str;
    async fn run(&self, state: &S) -> Result<()>;
    fn priority(&self) -> Priority { Priority::Medium }
    fn dependencies(&self) -> Vec<TaskId> { vec![] }
}
Built-in Task Types:
- ActionCreatorTask: Creates and signs action tokens for federation
- ActionVerifierTask: Validates incoming federated actions
- FileIdGeneratorTask: Generates content-addressed file IDs
- ImageResizerTask: Creates image variants (thumbnails, etc.)
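To show what implementing such a trait might look like, here is a dependency-free sketch. It deliberately simplifies the trait shown earlier: `run` is synchronous and returns `Result<(), String>` so the example compiles without an async runtime. `DemoState`, `ThumbnailTask`, the `"demo.thumbnail"` kind string, and the `Priority` ordering are all hypothetical.

```rust
use std::fmt::Debug;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Priority levels; ordering Low < Medium < High is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Priority {
    Low,
    Medium,
    High,
}

pub type TaskId = u64;

/// Simplified, synchronous variant of the Task<S> trait.
pub trait Task<S>: Debug + Send + Sync {
    fn kind() -> &'static str
    where
        Self: Sized;
    fn run(&self, state: &S) -> Result<(), String>;
    fn priority(&self) -> Priority {
        Priority::Medium
    }
    fn dependencies(&self) -> Vec<TaskId> {
        vec![]
    }
}

/// Hypothetical application state: counts generated thumbnails.
#[derive(Debug, Default)]
pub struct DemoState {
    pub thumbnails: AtomicUsize,
}

/// Hypothetical task that must wait for the task that stored the source image.
#[derive(Debug)]
pub struct ThumbnailTask {
    pub source: TaskId,
}

impl Task<DemoState> for ThumbnailTask {
    fn kind() -> &'static str {
        "demo.thumbnail"
    }

    fn run(&self, state: &DemoState) -> Result<(), String> {
        // Real work (decoding, resizing) would happen here.
        state.thumbnails.fetch_add(1, Ordering::SeqCst);
        Ok(())
    }

    fn priority(&self) -> Priority {
        Priority::Low
    }

    fn dependencies(&self) -> Vec<TaskId> {
        vec![self.source]
    }
}
```

The default methods mean a task only overrides `priority` and `dependencies` when it needs something other than medium priority and no prerequisites.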
Scheduler
The scheduler manages task lifecycle with dependency resolution:
Features:
- Task registry with dynamic builders
- Dependency resolution (DAG-based)
- Scheduled execution (cron-like)
- Persistence via MetaAdapter (survives restarts)
- Notification system for task completion
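DAG-based dependency resolution can be sketched with Kahn's algorithm: a task becomes ready once all of its dependencies have completed, and each completion notifies the tasks waiting on it. This is an illustrative model under assumed names (`TaskId` as `u32`, `execution_order`), not the scheduler's actual code, and it computes an order rather than running anything.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

pub type TaskId = u32;

/// Returns an order in which every task runs after all of its dependencies.
/// Returns None if the graph contains a cycle or references an unknown task.
pub fn execution_order(deps: &HashMap<TaskId, Vec<TaskId>>) -> Option<Vec<TaskId>> {
    // Number of unfinished dependencies per task.
    let mut pending: HashMap<TaskId, usize> = HashMap::new();
    // Reverse edges: dependency -> tasks waiting on it.
    let mut waiters: HashMap<TaskId, Vec<TaskId>> = HashMap::new();

    for (&task, task_deps) in deps {
        let unique: HashSet<TaskId> = task_deps.iter().copied().collect();
        if unique.iter().any(|d| !deps.contains_key(d)) {
            return None;
        }
        pending.insert(task, unique.len());
        for d in unique {
            waiters.entry(d).or_default().push(task);
        }
    }

    // Tasks with no dependencies can start immediately (sorted for determinism).
    let mut ready: Vec<TaskId> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(t, _)| *t)
        .collect();
    ready.sort_unstable();
    let mut queue: VecDeque<TaskId> = ready.into();

    let mut order = Vec::with_capacity(deps.len());
    while let Some(task) = queue.pop_front() {
        order.push(task);
        // Completing a task notifies everything that was waiting on it.
        let mut unblocked = Vec::new();
        for w in waiters.remove(&task).unwrap_or_default() {
            if let Some(n) = pending.get_mut(&w) {
                *n -= 1;
                if *n == 0 {
                    unblocked.push(w);
                }
            }
        }
        unblocked.sort_unstable();
        queue.extend(unblocked);
    }

    // Tasks left over are part of a dependency cycle.
    if order.len() == deps.len() {
        Some(order)
    } else {
        None
    }
}
```

A persistent scheduler would apply the same bookkeeping incrementally, decrementing the pending count of waiting tasks whenever a stored task finishes.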
Example Flow:
ActionCreatorTask (depends on FileIdGeneratorTask)
↓
waits for file processing to complete
↓
FileIdGeneratorTask completes
↓
ActionCreatorTask auto-starts
↓
Creates signed JWT, stores in MetaAdapter
Worker Pool
A priority-based thread pool for CPU-intensive and blocking operations:
Architecture:
- Three priority tiers: High > Medium > Low
- Each tier has configurable worker thread count
- Uses flume MPMC channels for work distribution
- Returns futures for async integration
Default Configuration (basic-server):
- 1 high-priority worker
- 2 medium-priority workers
- 1 low-priority worker
Use Cases:
- Image processing (CPU-intensive)
- Cryptographic operations
- File compression
- Blocking I/O
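A tiered pool along these lines can be sketched with the standard library alone. Several simplifications are assumed here: `std::sync::mpsc` with a shared, mutex-guarded receiver stands in for flume's MPMC channels, `submit` returns a blocking `Receiver` instead of a future, and each tier has dedicated threads that never pick up work from another tier. The `WorkerPool` API shown is hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Priority {
    High,
    Medium,
    Low,
}

type Job = Box<dyn FnOnce() + Send + 'static>;

/// One job queue per priority tier.
pub struct WorkerPool {
    high: Sender<Job>,
    medium: Sender<Job>,
    low: Sender<Job>,
}

/// Starts `workers` threads that pull jobs from one shared queue.
fn spawn_tier(workers: usize) -> Sender<Job> {
    let (tx, rx) = channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // The lock guard is dropped at the end of this statement,
            // so the job itself runs without holding the lock.
            let next = rx.lock().unwrap().recv();
            match next {
                Ok(job) => job(),
                Err(_) => break, // all senders dropped: shut down
            }
        });
    }
    tx
}

impl WorkerPool {
    pub fn new(high: usize, medium: usize, low: usize) -> Self {
        WorkerPool {
            high: spawn_tier(high),
            medium: spawn_tier(medium),
            low: spawn_tier(low),
        }
    }

    /// Runs `f` on a worker of the given tier. The returned receiver yields
    /// the result, or an error if no worker could run the job.
    pub fn submit<T, F>(&self, priority: Priority, f: F) -> Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (result_tx, result_rx) = channel();
        let job: Job = Box::new(move || {
            let _ = result_tx.send(f());
        });
        let tier = match priority {
            Priority::High => &self.high,
            Priority::Medium => &self.medium,
            Priority::Low => &self.low,
        };
        let _ = tier.send(job);
        result_rx
    }
}
```

With the default sizing described above, `WorkerPool::new(1, 2, 1)` would start four threads. Dedicated tiers keep a burst of low-priority image work from delaying a high-priority job, at the cost of idle threads when a tier has nothing to do.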
Why Task-Based Processing?
✅ Resilience: Tasks survive server restarts
✅ Dependencies: Complex workflows with ordered execution
✅ Scheduling: Cron-like execution for periodic tasks
✅ Observability: Track task progress and failures
✅ Concurrency: Priority-based execution
Application State Management
AppState Structure
The core application state contains: scheduler, worker pool, HTTP client, TLS certificates, and all five adapters (auth, meta, blob, rtdb, crdt).
AppBuilder Pattern
Configuration uses a fluent builder API with mode, identity, domain, data directory, and adapter selections.
Configuration Options:
- Server mode (Standalone, Proxy, StreamProxy)
- Network binding (HTTPS/HTTP ports)
- Domain configuration
- Directory paths (dist, tmp, data)
- Adapter injection
- Worker pool sizing
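A fluent builder over these options might look like the sketch below. `AppConfig`, `AppConfigBuilder`, and all method names are hypothetical; the real `AppBuilder` also injects adapters, which is omitted here. The defaults are taken from values that appear elsewhere in this document (Standalone mode, `0.0.0.0:8443`, `./data`, and 1/2/1 workers).

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServerMode {
    Standalone,
    Proxy,
    StreamProxy,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AppConfig {
    pub mode: ServerMode,
    pub base_id_tag: String,
    pub listen: String,
    pub data_dir: PathBuf,
    /// Worker counts as (high, medium, low).
    pub workers: (usize, usize, usize),
}

#[derive(Debug, Default)]
pub struct AppConfigBuilder {
    mode: Option<ServerMode>,
    base_id_tag: Option<String>,
    listen: Option<String>,
    data_dir: Option<PathBuf>,
    workers: Option<(usize, usize, usize)>,
}

impl AppConfigBuilder {
    pub fn new() -> Self {
        Self::default()
    }
    pub fn mode(mut self, mode: ServerMode) -> Self {
        self.mode = Some(mode);
        self
    }
    pub fn base_id_tag(mut self, id_tag: &str) -> Self {
        self.base_id_tag = Some(id_tag.to_string());
        self
    }
    pub fn listen(mut self, addr: &str) -> Self {
        self.listen = Some(addr.to_string());
        self
    }
    pub fn data_dir(mut self, dir: impl Into<PathBuf>) -> Self {
        self.data_dir = Some(dir.into());
        self
    }
    pub fn workers(mut self, high: usize, medium: usize, low: usize) -> Self {
        self.workers = Some((high, medium, low));
        self
    }

    /// Validates required fields and fills in defaults for the rest.
    pub fn build(self) -> Result<AppConfig, String> {
        let base_id_tag = self.base_id_tag.ok_or("base_id_tag is required")?;
        Ok(AppConfig {
            mode: self.mode.unwrap_or(ServerMode::Standalone),
            base_id_tag,
            listen: self.listen.unwrap_or_else(|| "0.0.0.0:8443".to_string()),
            data_dir: self.data_dir.unwrap_or_else(|| PathBuf::from("./data")),
            workers: self.workers.unwrap_or((1, 2, 1)),
        })
    }
}
```

Consuming `self` in each setter lets calls chain, and deferring validation to `build` keeps the error handling in one place.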
Module Organization
The server crate is organized by functional domain:
core/ - Infrastructure Layer
Core system components providing foundational services:
- app.rs: Application state, builder, bootstrap logic
- acme.rs: Let’s Encrypt/ACME certificate management
- webserver.rs: Axum/Rustls HTTPS server with SNI
- middleware.rs: Authentication middleware
- extract.rs: Custom Axum extractors (TnId, IdTag, Auth)
- scheduler.rs: Task scheduling with dependencies
- worker.rs: Thread pool with priority levels
- websocket.rs: WebSocket infrastructure
- request.rs: HTTP client for federation
- hasher.rs: Content-addressable storage (SHA256)
- utils.rs: Random ID generation
auth/ - Authentication Module
- handler.rs: Login endpoints, token generation, password management
action/ - Action/Activity Subsystem
Implements the federated activity system:
- action.rs: Action creation/verification tasks
- process.rs: JWT verification, token processing
- handler.rs: Action CRUD endpoints, federation inbox
profile/ - Profile Management
- handler.rs: Tenant profile endpoints
file/ - File Storage & Processing
- file.rs: File descriptor encoding, variant selection
- handler.rs: File upload/download endpoints
- image.rs: Image resizing tasks
- store.rs: Storage layer abstraction
rtdb/ - Real-Time Database
Note: In development, see RTDB Architecture for details
- handler.rs: Database CRUD endpoints
- websocket.rs: WebSocket connection handler
- manager.rs: Database instance lifecycle
routes/ - HTTP Routing
- Separates public and protected route groups
- API endpoints (/api/*)
- Static file serving for frontend
- CORS configuration
Concurrency Model
Multi-threaded Tokio Runtime
Cloudillo uses Tokio’s multi-threaded async runtime for handling concurrent requests.
Concurrency Layers
- Async Layer (Tokio): HTTP request handling, WebSocket connections, I/O
- Worker Pool: CPU-intensive tasks, blocking operations
- Scheduler: Background task execution with dependencies
Interaction Example:
HTTP Request → Tokio async handler
↓
Spawn ImageResizerTask on scheduler
↓
Scheduler dispatches to worker pool (CPU-intensive)
↓
Worker completes, updates MetaAdapter
↓
Scheduler notifies waiting tasks
↓
Response returned to client
Request Handling Flow
Middleware Pipeline
HTTP requests flow through these layers:
- HTTPS/SNI Resolution: CertResolver selects TLS certificate by domain
- Tracing/Logging: Request tracing with structured logging
- Authentication: require_auth or optional_auth middleware
- Custom Extractors: TnId, IdTag, Auth extract from request context
- Handler: Business logic execution
- Response: JSON serialization, error handling
Custom Extractors
Axum extractors provide typed access to request context:
- TnId: Tenant ID (database primary key, i64)
- IdTag: Tenant identifier string (e.g., “alice.example.com”)
- Auth: Full authentication context (tn_id, id_tag, scope, etc.)
Error Handling
Custom error enum with automatic HTTP response conversion: NotFound (404), PermissionDenied (403), DbError/Unknown/Parse/Io (500).
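The mapping described above can be sketched without a web framework. The variant names follow the text; the `io::Error` payload on `Io`, the `status_code` method, and the display messages are assumptions. In an Axum application, the same `match` would typically live inside an `IntoResponse` implementation.

```rust
use std::fmt;

/// Error variants named in the text; payloads are assumptions.
#[derive(Debug)]
pub enum Error {
    NotFound,
    PermissionDenied,
    DbError,
    Unknown,
    Parse,
    Io(std::io::Error),
}

impl Error {
    /// HTTP status code that a handler returning this error should produce.
    pub fn status_code(&self) -> u16 {
        match self {
            Error::NotFound => 404,
            Error::PermissionDenied => 403,
            Error::DbError | Error::Unknown | Error::Parse | Error::Io(_) => 500,
        }
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::NotFound => write!(f, "not found"),
            Error::PermissionDenied => write!(f, "permission denied"),
            // Internal details are not exposed to clients.
            _ => write!(f, "internal server error"),
        }
    }
}

impl std::error::Error for Error {}

/// Lets handlers use `?` on I/O results.
impl From<std::io::Error> for Error {
    fn from(e: std::io::Error) -> Self {
        Error::Io(e)
    }
}
```

Collapsing every internal failure into a generic 500 message avoids leaking database or filesystem details, while the `Debug` representation remains available for logging.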
Server Modes
Cloudillo supports different deployment modes:
Standalone (Default)
Self-contained single instance:
- HTTPS on configured port
- Optional HTTP for ACME challenges
- All adapters run locally
Use case: Personal servers, small communities
Proxy
Used if Cloudillo is behind a reverse proxy:
- Listens on HTTP port
- Certificate handling is the responsibility of the proxy
Use case: Managed hosting providers, self-hosting with multiple services on one IP address
Security Architecture
Implemented in Rust
Rust's ownership and type system rule out whole classes of memory-safety and data-race bugs at compile time, which reduces the attack surface.
No Unsafe Code
Cloudillo enforces memory safety:
#![forbid(unsafe_code)]
ABAC Permission System
Cloudillo uses Attribute-Based Access Control (ABAC) for fine-grained permissions across all resources. ABAC provides flexible permission rules based on:
- User attributes (identity, roles, relationships)
- Resource attributes (owner, visibility, type)
- Contextual factors (time, environment)
Key Features:
- Five visibility levels (public, private, followers, connected, direct)
- Policy-based access control (TOP/BOTTOM policies)
- Relationship-aware (following, connections)
- Time-based permissions
- Custom policy support
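To illustrate how visibility levels and relationship attributes could combine into a decision, here is a minimal sketch. The semantics assigned to each level below are assumptions made for the example (for instance, that `Followers` requires the viewer to follow the owner and that `Direct` uses an explicit audience list); the actual rules, including TOP/BOTTOM policies and time-based permissions, are defined by the ABAC system and are not modeled here. All type and field names are hypothetical.

```rust
/// The five visibility levels named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Public,
    Private,
    Followers,
    Connected,
    Direct,
}

/// Viewer attributes relevant to the check; `id_tag` is None for anonymous requests.
pub struct Viewer<'a> {
    pub id_tag: Option<&'a str>,
    pub follows_owner: bool,
    pub connected_to_owner: bool,
}

/// Resource attributes relevant to the check.
pub struct Resource<'a> {
    pub owner: &'a str,
    pub visibility: Visibility,
    /// Explicit recipients, consulted only for Direct visibility (assumed).
    pub audience: &'a [&'a str],
}

/// Decides read access from viewer and resource attributes.
pub fn can_read(viewer: &Viewer, resource: &Resource) -> bool {
    // Assumption: owners can always read their own resources.
    if viewer.id_tag == Some(resource.owner) {
        return true;
    }
    match resource.visibility {
        Visibility::Public => true,
        Visibility::Private => false,
        Visibility::Followers => viewer.follows_owner,
        Visibility::Connected => viewer.connected_to_owner,
        Visibility::Direct => viewer
            .id_tag
            .map_or(false, |id| resource.audience.contains(&id)),
    }
}
```

The decision depends only on attributes of the viewer and the resource, not on per-resource access lists, which is the defining trait of attribute-based access control.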
Learn more: ABAC Permission System
Cryptographic Algorithms
- P384: Elliptic curve for action signing
- ES384: JWT signature algorithm
- SHA256: Content addressing and hashing
- bcrypt: Password hashing
Security Layers
- TLS/HTTPS: All connections encrypted (Rustls)
- ACME: Automatic certificate management (Let’s Encrypt)
- JWT: Cryptographically signed tokens
- Content Addressing: Tamper detection via SHA256
- Permission Checks: Authorization at every access point
Bootstrap Process
Initial setup when starting a new instance:
- Create tenant with base_id_tag
- Set password (if provided)
- Generate profile signing key (P384)
- Initiate ACME for TLS certificates (if email provided)
- Start scheduler and worker pool
- Start HTTP/HTTPS servers
Example configuration:
BASE_ID_TAG=alice.example.com # Required
BASE_PASSWORD=secret # Initial password
ACME_EMAIL=alice@example.com # Let's Encrypt email
MODE=standalone # Server mode
LISTEN=0.0.0.0:8443 # HTTPS binding
DATA_DIR=./data # Storage path
Key Dependencies
Web Framework
- axum (0.8): Async web framework
- tower: Service abstractions
- tower-http: CORS, static files
TLS & Crypto
- rustls (0.23): Pure Rust TLS
- instant-acme (0.8): ACME client
- jsonwebtoken (9.3): JWT handling
- p384: Elliptic curve operations
Async Runtime
- tokio (1.48): Multi-threaded async
Serialization
- serde (1.0): Serialization framework
- serde_json: JSON support
Database
- sqlx (0.8): Async SQL (SQLite support)
Utilities
- image (0.25): Image processing
- sha2: SHA256 hashing
- croner (3.0): Cron expressions
- flume: MPMC channels
Architectural Strengths
✅ Pluggable Adapters: Easy to swap storage backends
✅ Self-Contained: Runs without external services; the reference adapters use embedded SQLite, Redb, and the filesystem
✅ Federated: Communicates with other instances
✅ Task-Based: Persistent, resumable async execution
✅ Type-Safe: Leverages Rust’s type system
✅ Memory-Safe: Enforced with #![forbid(unsafe_code)]
✅ Observable: Built-in tracing integration
Next Steps
- Identity System - DNS-based identity and key management
- Access Control - Token validation and permissions
- ABAC Permissions - Attribute-based access control system
- Actions & Federation - Action tokens and cross-instance communication
- File Storage - Content-addressed storage and processing
- RTDB with redb - Query-based real-time database
- CRDT Collaborative Editing - Conflict-free collaborative documents
- Real-Time Systems Overview - Introduction to RTDB and CRDT