System Architecture Overview

This document provides a technical overview of the Cloudillo system architecture, explaining the core patterns and components that enable its federated, privacy-focused design.

Workspace Structure

Cloudillo is organized as a Rust workspace with the following crates:

cloudillo-rs/
├── server/                      # Core library (cloudillo)
├── basic-server/                # Reference implementation
└── adapters/
    ├── auth-adapter-sqlite/     # Authentication & cryptography (SQLite)
    ├── meta-adapter-sqlite/     # Metadata storage (SQLite)
    ├── blob-adapter-fs/         # Binary blob storage (Filesystem)
    ├── rtdb-adapter-redb/       # Real-time database (Redb)
    └── crdt-adapter-redb/       # Collaborative editing CRDT (Redb)

Crate Responsibilities

  • server: Core business logic, HTTP handlers, federation, task system
  • basic-server: Executable reference implementation using SQLite, Redb, and filesystem
  • adapters: Five pluggable storage backends implementing the core adapter traits

Adapter Types

The five adapters separate concerns and enable flexible deployments:

  1. AuthAdapter - Authentication, JWT tokens, certificate management, cryptographic operations
  2. MetaAdapter - Tenant/profile data, action tokens, file metadata, tasks
  3. BlobAdapter - Content-addressed immutable binary data (files, images, snapshots)
  4. RtdbAdapter - Real-time hierarchical JSON database with queries and subscriptions
  5. CrdtAdapter - Collaborative document editing with conflict-free merges

Core Architecture Patterns

The Five-Adapter Architecture

Each of the five adapters is described below: its purpose, its responsibilities, its key operations, and the reason it is kept separate.

1. AuthAdapter

Purpose: Authentication, authorization, and cryptographic operations

Responsibilities:

  • JWT/token validation and creation
  • TLS certificate management (ACME integration)
  • Profile signing key storage and rotation
  • VAPID keys for push notifications
  • Password hashing and WebAuthn credentials
  • Tenant management

Key Operations: Validate tokens, create access tokens, manage TLS certificates, handle profile keys, store passwords/credentials

Why separate? Authentication and cryptography require special security considerations and may need different storage backends (HSM, vault services, etc.).

2. MetaAdapter

Purpose: Structured metadata storage

Responsibilities:

  • Tenant and profile information
  • Action tokens (posts, comments, reactions, etc.)
  • File metadata with variants
  • Task persistence for the scheduler
  • Database metadata (for RTDB)

Key Operations: Create/query tenants, store/retrieve actions, manage file metadata, persist tasks, handle profiles

Why separate? Metadata can be stored in different databases (SQLite, PostgreSQL, MongoDB) based on scale and requirements.

3. BlobAdapter

Purpose: Immutable binary data storage

Responsibilities:

  • Content-addressed blob storage
  • File data (images, videos, documents)
  • Database snapshots (for RTDB)
  • Both buffered and streaming I/O

Key Operations: Write/read blobs (buffered or streaming), check blob existence and size

Why separate? Blob storage can use filesystem, S3, CDN, or specialized object storage based on scale and cost considerations.

4. RtdbAdapter

Purpose: Real-time structured database

Responsibilities:

  • Path-based hierarchical storage (Firebase-like)
  • Query with filtering, sorting, pagination
  • Real-time subscriptions via WebSocket
  • Transaction support for atomic writes
  • Secondary indexes for performance
  • Multi-tenant database isolation

Key Operations: Query/create/update/delete documents, transactions, real-time subscriptions

Why separate? Real-time database functionality can use different backends (redb, PostgreSQL, MongoDB) based on query requirements and scale.

Learn more: RTDB with redb

5. CrdtAdapter

Purpose: Collaborative document storage with CRDTs

Responsibilities:

  • Binary CRDT update storage (Yjs protocol)
  • Document metadata management
  • Real-time change subscriptions
  • Conflict-free merge guarantees
  • Awareness state (presence, cursors)
  • Multi-tenant document isolation

Key Operations: Create/read documents, append updates, retrieve update history, subscribe to changes

Why separate? CRDT storage can use different backends (redb, dedicated CRDT stores) and has different performance characteristics than traditional databases.

Learn more: CRDT Collaborative Editing

Benefits of the Adapter Pattern

✅ Flexible Deployment: Switch storage backends without changing core logic
✅ Separation of Concerns: Security, metadata, and binary data have different requirements
✅ Testing: Easy to create in-memory adapters for testing
✅ Scalability: Can distribute adapters across different services
✅ Cost Optimization: Use appropriate storage for each data type
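As a rough illustration of the pattern, the sketch below defines a simplified, synchronous blob storage trait and an in-memory backend of the kind that is convenient in tests. The names BlobStore, MemoryBlobStore, and store_avatar are hypothetical and are not taken from the Cloudillo codebase; the real adapter traits are likely async and cover more operations.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a blob adapter trait.
pub trait BlobStore {
    fn write_blob(&mut self, blob_id: &str, data: &[u8]);
    fn read_blob(&self, blob_id: &str) -> Option<Vec<u8>>;
    fn blob_size(&self, blob_id: &str) -> Option<usize>;
}

/// In-memory backend, useful for tests.
#[derive(Default)]
pub struct MemoryBlobStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl BlobStore for MemoryBlobStore {
    fn write_blob(&mut self, blob_id: &str, data: &[u8]) {
        // Content-addressed blobs are immutable, so the first write wins.
        self.blobs
            .entry(blob_id.to_string())
            .or_insert_with(|| data.to_vec());
    }

    fn read_blob(&self, blob_id: &str) -> Option<Vec<u8>> {
        self.blobs.get(blob_id).cloned()
    }

    fn blob_size(&self, blob_id: &str) -> Option<usize> {
        self.blobs.get(blob_id).map(|b| b.len())
    }
}

/// Core logic depends only on the trait, never on a concrete backend.
pub fn store_avatar(store: &mut dyn BlobStore, blob_id: &str, bytes: &[u8]) -> usize {
    store.write_blob(blob_id, bytes);
    store.blob_size(blob_id).unwrap_or(0)
}
```

Because store_avatar only sees the trait, the same call would work against a filesystem or object-storage backend that implements it.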

Content-Addressed Architecture

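Cloudillo uses content-addressing throughout its architecture: resource identifiers are cryptographic hashes of their content. Because the identifiers at one level are embedded in the content hashed at the next, the result is a Merkle tree in which any change to underlying data changes every identifier above it, so content is immutable and tampering is detectable.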

Hash-Based Identifiers

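Resource IDs carry versioned prefixes. Actions, files, and blobs are identified by SHA-256 hashes; descriptors are the exception and carry the encoded descriptor string itself: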

Prefix | Resource Type | Hash Input | Example
a1~ | Action | Entire JWT token (header + payload + signature) | a1~8kR3mN9pQ2vL6xW...
f1~ | File | File descriptor string | f1~Qo2E3G8TJZ2HTGh...
b1~ | Blob | Blob bytes (actual image/video data) | b1~abc123def456ghi...
d1~ | Descriptor | (not a hash, the encoded string itself) | d1~tn:b1~abc:f=AVIF:...

Version Scheme

Format: {prefix}{version}~{base64url_encoded_hash}

  • Version 1: SHA-256 with base64url encoding (no padding)
  • Future versions: Can upgrade to SHA-3, BLAKE3, etc. without breaking old content
  • Backward compatibility: Old content remains valid forever
  • Algorithm agility: Migrate to new algorithms without breaking existing references

Example upgrade path:

a1~...  (SHA-256)
a2~...  (SHA-3)
a3~...  (BLAKE3)
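The identifier format can be sketched in a few lines. The example below encodes a digest as unpadded base64url and splits an identifier back into prefix, version, and payload. It assumes the digest bytes come from elsewhere (for example a SHA-256 implementation), and the names base64url_nopad, format_id, parse_id, and ResourceId are illustrative, not Cloudillo's API.

```rust
/// Base64url alphabet (RFC 4648, section 5).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode bytes as base64url without padding.
pub fn base64url_nopad(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        // Pack up to three bytes into one 24-bit group.
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = (u32::from(chunk[0]) << 16) | (u32::from(b1) << 8) | u32::from(b2);
        // 1 byte -> 2 chars, 2 bytes -> 3 chars, 3 bytes -> 4 chars.
        for i in 0..=chunk.len() {
            let idx = (n >> (18 - 6 * i)) & 0x3f;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}

/// A parsed identifier such as `a1~...` (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct ResourceId<'a> {
    pub prefix: char,
    pub version: u32,
    pub payload: &'a str,
}

/// Build `{prefix}{version}~{payload}` from raw digest bytes.
pub fn format_id(prefix: char, version: u32, digest: &[u8]) -> String {
    format!("{}{}~{}", prefix, version, base64url_nopad(digest))
}

/// Split an identifier at the first `~`; the payload is returned unverified.
pub fn parse_id(id: &str) -> Option<ResourceId<'_>> {
    let (head, payload) = id.split_once('~')?;
    let mut chars = head.chars();
    let prefix = chars.next().filter(|c| c.is_ascii_lowercase())?;
    let version: u32 = chars.as_str().parse().ok()?;
    if payload.is_empty() {
        return None;
    }
    Some(ResourceId { prefix, version, payload })
}
```

Keeping the version next to the prefix lets a reader of the ID pick the right hash algorithm before verifying, which is what makes the upgrade path above possible.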

Six-Level Merkle Tree

Content-addressing creates a hierarchical Merkle tree:

Level 1: Blob Data (raw bytes)
   ↓ SHA-256 hash
Level 2: Blob ID (b1~hash)
   ↓ collected in descriptor
Level 3: File Descriptor (d1~variant:b1~hash:format:size:resolution,...)
   ↓ SHA-256 hash of descriptor
Level 4: File ID (f1~hash)
   ↓ referenced in action
Level 5: Action Token (JWT with content, parent, attachments)
   ↓ SHA-256 hash of entire JWT
Level 6: Action ID (a1~hash)
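The chain above can be walked in code. The sketch below is deliberately dependency-free, so it uses a 64-bit standard-library hasher as a STAND-IN for SHA-256 and a made-up token layout instead of a signed JWT; the descriptor fields are also simplified. It is only meant to show how a change at the blob level propagates up to the action ID.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// STAND-IN digest: a 64-bit std hasher rendered as hex.
/// Cloudillo uses SHA-256 with base64url; this only keeps the sketch std-only.
fn toy_digest(bytes: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    format!("{:016x}", h.finish())
}

/// Levels 1-2: raw blob bytes -> blob ID.
pub fn blob_id(data: &[u8]) -> String {
    format!("b1~{}", toy_digest(data))
}

/// Level 3: descriptor listing each variant's blob ID (simplified field set).
pub fn descriptor(variants: &[(&str, &[u8])]) -> String {
    let parts: Vec<String> = variants
        .iter()
        .map(|(name, data)| format!("{}:{}", name, blob_id(data)))
        .collect();
    format!("d1~{}", parts.join(","))
}

/// Level 4: the file ID is the hash of the descriptor string.
pub fn file_id(descriptor: &str) -> String {
    format!("f1~{}", toy_digest(descriptor.as_bytes()))
}

/// Levels 5-6: the action ID is the hash of the whole token.
pub fn action_id(token: &str) -> String {
    format!("a1~{}", toy_digest(token.as_bytes()))
}

/// Walk the whole chain for an action with one attachment.
pub fn action_id_for(content: &str, variants: &[(&str, &[u8])]) -> String {
    let fid = file_id(&descriptor(variants));
    // Hypothetical token layout; real action tokens are signed JWTs.
    let token = format!("content={};attachment={}", content, fid);
    action_id(&token)
}
```

Identical inputs always yield the same action ID, while changing a single byte of a blob yields a different blob ID, descriptor, file ID, and action ID.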

Properties

✅ Immutable: Content cannot change without changing the ID
✅ Tamper-Evident: Any modification is immediately detectable
✅ Deduplicatable: Identical content produces identical IDs
✅ Verifiable: Anyone can recompute and verify hashes
✅ Cacheable: Content-addressed data can be cached forever
✅ Trustless: No need to trust storage providers; verify the hash instead

Integration with Adapters

Content-addressing is implemented across multiple adapters:

BlobAdapter:

  • Stores blobs indexed by blob_id (b1~...)
  • Blob IDs are SHA-256 hashes of the blob bytes
  • Enables deduplication and integrity verification

MetaAdapter:

  • Stores action tokens indexed by action_id (a1~...)
  • Action IDs are SHA-256 hashes of the JWT token
  • Stores file metadata with file_id (f1~...)
  • File IDs are SHA-256 hashes of the descriptor

AuthAdapter:

  • Signs action tokens with profile keys (ES384)
  • Signature + content hash = complete authenticity proof

Security Benefits

Content-addressing provides multiple security layers:

  1. Integrity Verification: Recomputing the SHA-256 hash reveals whether data has been tampered with
  2. Deduplication: Same content = same hash, prevents storage waste
  3. Cryptographic Binding: Parent references create immutable chains
  4. Federation Trust: Remote instances can verify data integrity
  5. Cache Safety: Content-addressed data can be cached without trust

Performance Benefits

  1. Caching: Immutable content can be cached for as long as the cache allows (for example, max-age=31536000, which is one year)
  2. Deduplication: Identical blobs stored only once across all tenants
  3. Parallel Verification: Hash verification can be parallelized
  4. CDN-Friendly: Content-addressed resources are well suited to CDN distribution

Learn more: Content-Addressing & Merkle Trees

Task-Based Asynchronous Processing

Complex operations in Cloudillo are modeled as persistent tasks that can execute asynchronously, survive restarts, and depend on other tasks.

Task System Components

Tasks

Tasks implement the Task<S> trait:

pub trait Task<S>: Debug + Send + Sync {
    fn kind() -> &'static str;
    async fn run(&self, state: &S) -> Result<()>;
    fn priority(&self) -> Priority { Priority::Medium }
    fn dependencies(&self) -> Vec<TaskId> { vec![] }
}

Built-in Task Types:

  • ActionCreatorTask: Creates and signs action tokens for federation
  • ActionVerifierTask: Validates incoming federated actions
  • FileIdGeneratorTask: Generates content-addressed file IDs
  • ImageResizerTask: Creates image variants (thumbnails, etc.)

Scheduler

The scheduler manages task lifecycle with dependency resolution:

Features:

  • Task registry with dynamic builders
  • Dependency resolution (DAG-based)
  • Scheduled execution (cron-like)
  • Persistence via MetaAdapter (survives restarts)
  • Notification system for task completion

Example Flow:

ActionCreatorTask (depends on FileIdGeneratorTask)
    ↓
    waits for file processing to complete
    ↓
FileIdGeneratorTask completes
    ↓
ActionCreatorTask auto-starts
    ↓
Creates signed JWT, stores in MetaAdapter
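Dependency resolution over a DAG can be sketched with Kahn's algorithm: a task becomes ready once all of its dependencies have completed. This is not the scheduler's actual code; TaskId as u32 and the function execution_order are assumptions made for the example.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical task identifier; the real `TaskId` type may differ.
pub type TaskId = u32;

/// Return an order in which every task runs after its dependencies,
/// or `None` if the graph has a cycle or references an unknown task.
pub fn execution_order(deps: &HashMap<TaskId, Vec<TaskId>>) -> Option<Vec<TaskId>> {
    // Number of unfinished dependencies per task.
    let mut pending: HashMap<TaskId, usize> =
        deps.iter().map(|(id, d)| (*id, d.len())).collect();

    // Reverse edges: dependency -> tasks waiting on it.
    let mut waiters: HashMap<TaskId, Vec<TaskId>> = HashMap::new();
    for (id, d) in deps {
        for dep in d {
            if !deps.contains_key(dep) {
                return None;
            }
            waiters.entry(*dep).or_default().push(*id);
        }
    }

    // Start with tasks that have no dependencies (sorted for determinism).
    let mut ready: Vec<TaskId> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(id, _)| *id)
        .collect();
    ready.sort_unstable();
    let mut queue: VecDeque<TaskId> = ready.into();

    let mut order = Vec::new();
    while let Some(id) = queue.pop_front() {
        order.push(id);
        // Completing a task may unblock the tasks waiting on it.
        if let Some(ws) = waiters.get(&id) {
            for w in ws {
                let n = pending.get_mut(w)?;
                *n -= 1;
                if *n == 0 {
                    queue.push_back(*w);
                }
            }
        }
    }

    // Tasks left over at this point are part of a cycle.
    (order.len() == deps.len()).then_some(order)
}
```

In the flow above, FileIdGeneratorTask would be the task with no dependencies and ActionCreatorTask the one that becomes ready when it completes.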

Worker Pool

A priority-based thread pool for CPU-intensive and blocking operations:

Architecture:

  • Three priority tiers: High > Medium > Low
  • Each tier has configurable worker thread count
  • Uses flume MPMC channels for work distribution
  • Returns futures for async integration

Default Configuration (basic-server):

  • 1 high-priority worker
  • 2 medium-priority workers
  • 1 low-priority worker

Use Cases:

  • Image processing (CPU-intensive)
  • Cryptographic operations
  • File compression
  • Blocking I/O
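A tiered pool of this shape can be approximated with the standard library alone. The sketch below gives each priority tier its own queue and its own threads, and returns a channel receiver instead of a future. It uses std::sync::mpsc rather than flume, and WorkerPool, Priority, and submit are illustrative names; how the real pool arbitrates between tiers is not specified here.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Priority {
    High,
    Medium,
    Low,
}

type Job = Box<dyn FnOnce() + Send + 'static>;

/// One queue plus a fixed set of worker threads per priority tier.
pub struct WorkerPool {
    high: Sender<Job>,
    medium: Sender<Job>,
    low: Sender<Job>,
}

/// Spawn `workers` threads that pull jobs from one shared queue.
/// A tier needs at least one worker, otherwise submitting to it panics.
fn spawn_tier(workers: usize) -> Sender<Job> {
    let (tx, rx) = channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // The lock is released as soon as a job has been received.
            let job = rx.lock().unwrap().recv();
            match job {
                Ok(job) => job(),
                Err(_) => break, // all senders dropped: shut down
            }
        });
    }
    tx
}

impl WorkerPool {
    pub fn new(high: usize, medium: usize, low: usize) -> Self {
        WorkerPool {
            high: spawn_tier(high),
            medium: spawn_tier(medium),
            low: spawn_tier(low),
        }
    }

    /// Queue a closure and return a receiver that yields its result.
    pub fn submit<T, F>(&self, priority: Priority, f: F) -> Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (result_tx, result_rx) = channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; ignore that error.
            let _ = result_tx.send(f());
        });
        let queue = match priority {
            Priority::High => &self.high,
            Priority::Medium => &self.medium,
            Priority::Low => &self.low,
        };
        queue.send(job).expect("worker threads have exited");
        result_rx
    }
}
```

With the default sizing listed above, the pool would be created as WorkerPool::new(1, 2, 1), and an async caller would wrap the receiver in whatever future type the runtime provides.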

Why Task-Based Processing?

✅ Resilience: Tasks survive server restarts
✅ Dependencies: Complex workflows with ordered execution
✅ Scheduling: Cron-like execution for periodic tasks
✅ Observability: Track task progress and failures
✅ Concurrency: Priority-based execution

Application State Management

AppState Structure

The core application state contains: scheduler, worker pool, HTTP client, TLS certificates, and all five adapters (auth, meta, blob, rtdb, crdt).

AppBuilder Pattern

Configuration uses a fluent builder API with mode, identity, domain, data directory, and adapter selections.

Configuration Options:

  • Server mode (Standalone, Proxy, StreamProxy)
  • Network binding (HTTPS/HTTP ports)
  • Domain configuration
  • Directory paths (dist, tmp, data)
  • Adapter injection
  • Worker pool sizing
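A fluent builder along these lines might look like the sketch below. All type, field, and method names are hypothetical; only the set of options, the Standalone default, and the 1/2/1 worker sizing come from this document, and the ./data default is borrowed from the example configuration.

```rust
use std::path::PathBuf;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ServerMode {
    Standalone,
    Proxy,
    StreamProxy,
}

/// Hypothetical result of the builder; the real AppState holds much more.
#[derive(Debug)]
pub struct AppConfig {
    pub mode: ServerMode,
    pub base_id_tag: String,
    pub data_dir: PathBuf,
    pub workers: (usize, usize, usize),
}

#[derive(Default)]
pub struct AppBuilder {
    mode: Option<ServerMode>,
    base_id_tag: Option<String>,
    data_dir: Option<PathBuf>,
    workers: Option<(usize, usize, usize)>,
}

impl AppBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    // Each setter takes and returns `self`, so calls can be chained.
    pub fn mode(mut self, mode: ServerMode) -> Self {
        self.mode = Some(mode);
        self
    }

    pub fn base_id_tag(mut self, tag: &str) -> Self {
        self.base_id_tag = Some(tag.to_string());
        self
    }

    pub fn data_dir(mut self, dir: impl Into<PathBuf>) -> Self {
        self.data_dir = Some(dir.into());
        self
    }

    pub fn workers(mut self, high: usize, medium: usize, low: usize) -> Self {
        self.workers = Some((high, medium, low));
        self
    }

    /// Apply defaults and reject configurations missing required fields.
    pub fn build(self) -> Result<AppConfig, String> {
        Ok(AppConfig {
            mode: self.mode.unwrap_or(ServerMode::Standalone),
            base_id_tag: self.base_id_tag.ok_or("base_id_tag is required")?,
            data_dir: self.data_dir.unwrap_or_else(|| PathBuf::from("./data")),
            workers: self.workers.unwrap_or((1, 2, 1)),
        })
    }
}
```

Validating in build() keeps the setters infallible, so a missing required option is reported once, at the point where the application would start.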

Module Organization

The server crate is organized by functional domain:

core/ - Infrastructure Layer

Core system components providing foundational services:

  • app.rs: Application state, builder, bootstrap logic
  • acme.rs: Let’s Encrypt/ACME certificate management
  • webserver.rs: Axum/Rustls HTTPS server with SNI
  • middleware.rs: Authentication middleware
  • extract.rs: Custom Axum extractors (TnId, IdTag, Auth)
  • scheduler.rs: Task scheduling with dependencies
  • worker.rs: Thread pool with priority levels
  • websocket.rs: WebSocket infrastructure
  • request.rs: HTTP client for federation
  • hasher.rs: Content-addressable storage (SHA256)
  • utils.rs: Random ID generation

auth/ - Authentication Module

  • handler.rs: Login endpoints, token generation, password management

action/ - Action/Activity Subsystem

Implements the federated activity system:

  • action.rs: Action creation/verification tasks
  • process.rs: JWT verification, token processing
  • handler.rs: Action CRUD endpoints, federation inbox

profile/ - Profile Management

  • handler.rs: Tenant profile endpoints

file/ - File Storage & Processing

  • file.rs: File descriptor encoding, variant selection
  • handler.rs: File upload/download endpoints
  • image.rs: Image resizing tasks
  • store.rs: Storage layer abstraction

rtdb/ - Real-Time Database

Note: In development, see RTDB Architecture for details

  • handler.rs: Database CRUD endpoints
  • websocket.rs: WebSocket connection handler
  • manager.rs: Database instance lifecycle

routes/ - HTTP Routing

  • Separates public and protected route groups
  • API endpoints (/api/*)
  • Static file serving for frontend
  • CORS configuration

Concurrency Model

Multi-threaded Tokio Runtime

Cloudillo uses Tokio’s multi-threaded async runtime for handling concurrent requests.

Concurrency Layers

  1. Async Layer (Tokio): HTTP request handling, WebSocket connections, I/O
  2. Worker Pool: CPU-intensive tasks, blocking operations
  3. Scheduler: Background task execution with dependencies

Interaction Example:

HTTP Request → Tokio async handler
    ↓
Spawn ImageResizerTask on scheduler
    ↓
Scheduler dispatches to worker pool (CPU-intensive)
    ↓
Worker completes, updates MetaAdapter
    ↓
Scheduler notifies waiting tasks
    ↓
Response returned to client

Request Handling Flow

Middleware Pipeline

HTTP requests flow through these layers:

  1. HTTPS/SNI Resolution: CertResolver selects TLS certificate by domain
  2. Tracing/Logging: Request tracing with structured logging
  3. Authentication: require_auth or optional_auth middleware
  4. Custom Extractors: TnId, IdTag, Auth extract from request context
  5. Handler: Business logic execution
  6. Response: JSON serialization, error handling

Custom Extractors

Axum extractors provide typed access to request context:

  • TnId: Tenant ID (database primary key, i64)
  • IdTag: Tenant identifier string (e.g., “alice.example.com”)
  • Auth: Full authentication context (tn_id, id_tag, scope, etc.)

Error Handling

Custom error enum with automatic HTTP response conversion: NotFound (404), PermissionDenied (403), DbError/Unknown/Parse/Io (500).
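The mapping from error variant to status code can be sketched without any web framework. The variant names follow the list above, but the payloads, the AppError name, and the to_response helper are assumptions; in the server this conversion would presumably go through Axum's response traits instead.

```rust
use std::fmt;

/// Simplified error type; variant payloads are assumptions.
#[derive(Debug)]
pub enum AppError {
    NotFound,
    PermissionDenied,
    DbError,
    Unknown,
    Parse,
    Io(std::io::Error),
}

impl AppError {
    /// HTTP status code for each variant.
    pub fn status_code(&self) -> u16 {
        match self {
            AppError::NotFound => 404,
            AppError::PermissionDenied => 403,
            AppError::DbError | AppError::Unknown | AppError::Parse | AppError::Io(_) => 500,
        }
    }

    /// Status plus a client-facing message; internal details stay hidden.
    pub fn to_response(&self) -> (u16, String) {
        let status = self.status_code();
        let body = match status {
            500 => "internal error".to_string(),
            _ => self.to_string(),
        };
        (status, body)
    }
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::NotFound => write!(f, "not found"),
            AppError::PermissionDenied => write!(f, "permission denied"),
            AppError::DbError => write!(f, "database error"),
            AppError::Unknown => write!(f, "unknown error"),
            AppError::Parse => write!(f, "parse error"),
            AppError::Io(e) => write!(f, "I/O error: {}", e),
        }
    }
}

impl std::error::Error for AppError {}

/// Lets handlers use `?` on I/O results.
impl From<std::io::Error> for AppError {
    fn from(e: std::io::Error) -> Self {
        AppError::Io(e)
    }
}
```

Collapsing all server-side failures to a generic 500 body is a design choice in this sketch: the detailed message remains available for logging through Display.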

Server Modes

Cloudillo supports different deployment modes:

Standalone (Default)

Self-contained single instance:

  • HTTPS on configured port
  • Optional HTTP for ACME challenges
  • All adapters run locally

Use case: Personal servers, small communities

Proxy

Used when Cloudillo runs behind a reverse proxy:

  • Listens on HTTP port
  • Certificate handling is the responsibility of the proxy

Use case: Managed hosting providers, self-hosting with multiple services on one IP address

Security Architecture

Implemented in Rust

Rust’s ownership and type system rule out whole classes of memory-safety bugs and data races at compile time, which narrows the attack surface.

No Unsafe Code

Cloudillo enforces memory safety:

#![forbid(unsafe_code)]

ABAC Permission System

Cloudillo uses Attribute-Based Access Control (ABAC) for fine-grained permissions across all resources. ABAC provides flexible permission rules based on:

  • User attributes (identity, roles, relationships)
  • Resource attributes (owner, visibility, type)
  • Contextual factors (time, environment)

Key Features:

  • Five visibility levels (public, private, followers, connected, direct)
  • Policy-based access control (TOP/BOTTOM policies)
  • Relationship-aware (following, connections)
  • Time-based permissions
  • Custom policy support
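To make the visibility levels concrete, the sketch below evaluates a read request against a resource's visibility and the viewer's relationship to the owner. The semantics assigned to each level here (for example, that "direct" means the viewer is listed in an explicit audience) are assumptions for illustration, as are the Viewer, Resource, and can_read names; policies and time-based rules are left out.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Visibility {
    Public,
    Private,
    Followers,
    Connected,
    Direct,
}

/// Attributes of the requesting user (hypothetical shape).
pub struct Viewer<'a> {
    pub id_tag: &'a str,
    pub follows_owner: bool,
    pub connected_to_owner: bool,
}

/// Attributes of the resource being read (hypothetical shape).
pub struct Resource<'a> {
    pub owner: &'a str,
    pub visibility: Visibility,
    pub audience: &'a [&'a str],
}

/// Decide read access from viewer and resource attributes.
pub fn can_read(viewer: &Viewer, res: &Resource) -> bool {
    // Assumption: the owner can always read their own resources.
    if viewer.id_tag == res.owner {
        return true;
    }
    match res.visibility {
        Visibility::Public => true,
        Visibility::Private => false,
        Visibility::Followers => viewer.follows_owner,
        Visibility::Connected => viewer.connected_to_owner,
        // Assumption: direct resources list their recipients explicitly.
        Visibility::Direct => res.audience.iter().any(|a| *a == viewer.id_tag),
    }
}
```

The decision uses only attributes of the viewer and the resource, which is the defining trait of ABAC as opposed to a fixed role table.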

Learn more: ABAC Permission System

Cryptographic Algorithms

  • P384: Elliptic curve for action signing
  • ES384: JWT signature algorithm
  • SHA256: Content addressing and hashing
  • bcrypt: Password hashing

Security Layers

  1. TLS/HTTPS: All connections encrypted (Rustls)
  2. ACME: Automatic certificate management (Let’s Encrypt)
  3. JWT: Cryptographically signed tokens
  4. Content Addressing: Tamper detection via SHA256
  5. Permission Checks: Authorization at every access point

Bootstrap Process

Initial setup when starting a new instance:

  1. Create tenant with base_id_tag
  2. Set password (if provided)
  3. Generate profile signing key (P384)
  4. Initiate ACME for TLS certificates (if email provided)
  5. Start scheduler and worker pool
  6. Start HTTP/HTTPS servers

Example configuration:

BASE_ID_TAG=alice.example.com        # Required
BASE_PASSWORD=secret                 # Initial password
ACME_EMAIL=alice@example.com         # Let's Encrypt email
MODE=standalone                      # Server mode
LISTEN=0.0.0.0:8443                  # HTTPS binding
DATA_DIR=./data                      # Storage path

Key Dependencies

Web Framework

  • axum (0.8): Async web framework
  • tower: Service abstractions
  • tower-http: CORS, static files

TLS & Crypto

  • rustls (0.23): Pure Rust TLS
  • instant-acme (0.8): ACME client
  • jsonwebtoken (9.3): JWT handling
  • p384: Elliptic curve operations

Async Runtime

  • tokio (1.48): Multi-threaded async

Serialization

  • serde (1.0): Serialization framework
  • serde_json: JSON support

Database

  • sqlx (0.8): Async SQL (SQLite support)

Utilities

  • image (0.25): Image processing
  • sha2: SHA256 hashing
  • croner (3.0): Cron expressions
  • flume: MPMC channels

Architectural Strengths

✅ Pluggable Adapters: Easy to swap storage backends
✅ Self-Contained: No external dependencies required
✅ Federated: Communicates with other instances
✅ Task-Based: Persistent, resumable async execution
✅ Type-Safe: Leverages Rust’s type system
✅ Memory-Safe: Complete #![forbid(unsafe_code)]
✅ Observable: Built-in tracing integration

Next Steps

  • Identity System - DNS-based identity and key management
  • Access Control - Token validation and permissions
  • ABAC Permissions - Attribute-based access control system
  • Actions & Federation - Action tokens and cross-instance communication
  • File Storage - Content-addressed storage and processing
  • RTDB with redb - Query-based real-time database
  • CRDT Collaborative Editing - Conflict-free collaborative documents
  • Real-Time Systems Overview - Introduction to RTDB and CRDT