System Architecture Overview

This document provides a technical overview of the Cloudillo system architecture, explaining the core patterns and components that enable its federated, privacy-focused design.

Workspace Structure

Cloudillo is organized as a Rust workspace with feature-specific crates:

cloudillo-rs/
├── server/                      # Binary: cloudillo-server
├── crates/
│   ├── cloudillo/               # Main integrator (routes, websocket, bootstrap)
│   ├── cloudillo-types/         # Shared types, adapter traits, error types
│   ├── cloudillo-core/          # Infrastructure (scheduler, worker, middleware, ACME)
│   ├── cloudillo-auth/          # Authentication (login, WebAuthn, QR, API keys)
│   ├── cloudillo-action/        # Federation actions (posts, delivery, verification)
│   ├── cloudillo-file/          # File processing (images, video, audio, PDF, SVG)
│   ├── cloudillo-crdt/          # CRDT collaborative editing (Yjs WebSocket)
│   ├── cloudillo-rtdb/          # Real-time database
│   ├── cloudillo-push/          # Web Push notifications
│   ├── cloudillo-email/         # Email notifications (SMTP, templates)
│   ├── cloudillo-idp/           # Identity provider, tenant management
│   ├── cloudillo-profile/       # Profile management
│   ├── cloudillo-admin/         # System administration
│   ├── cloudillo-ref/           # Reference/lookup API
│   └── cloudillo-proxy/         # Reverse proxy with TLS
└── adapters/
    ├── auth-adapter-sqlite/     # Authentication & cryptography (SQLite)
    ├── meta-adapter-sqlite/     # Metadata storage (SQLite)
    ├── blob-adapter-fs/         # Binary blob storage (Filesystem)
    ├── rtdb-adapter-redb/       # Real-time database (Redb)
    └── crdt-adapter-redb/       # Collaborative editing CRDT (Redb)

Crate Architecture

The workspace follows a layered architecture:

  • cloudillo-types: Foundation layer — adapter trait definitions (AuthAdapter, MetaAdapter, BlobAdapter, RtdbAdapter, CrdtAdapter), shared domain types, and error types
  • cloudillo-core: Infrastructure layer — task scheduler, worker pool, authentication middleware, ACME certificate management, custom extractors, WebSocket infrastructure, and HTTP client
  • cloudillo-*: Feature crates — each domain (auth, action, file, etc.) is an independent crate with its own handlers, tasks, and settings
  • cloudillo: Integrator — assembles all feature crates into HTTP routes, WebSocket handling, and application bootstrap
  • server: Binary crate (cloudillo-server) — integrates all feature crates and adapters into a runnable server
  • adapters: Five pluggable storage backends implementing the traits defined in cloudillo-types

Adapter Types

The five adapters separate concerns and enable flexible deployments:

  1. AuthAdapter - Authentication, JWT tokens, certificate management, cryptographic operations
  2. MetaAdapter - Tenant/profile data, action tokens, file metadata, tasks
  3. BlobAdapter - Content-addressed immutable binary data (files, images, snapshots)
  4. RtdbAdapter - Real-time hierarchical JSON database with queries and subscriptions
  5. CrdtAdapter - Collaborative document editing with conflict-free merges

Core Architecture Patterns

The Five-Adapter Architecture

Cloudillo’s architecture is built on five fundamental adapters that separate concerns and enable flexible deployments:

1. AuthAdapter

Purpose: Authentication, authorization, and cryptographic operations

Responsibilities:

  • JWT/token validation and creation
  • TLS certificate management (ACME integration)
  • Profile signing key storage and rotation
  • VAPID keys for push notifications
  • Password hashing and WebAuthn credentials
  • Tenant management

Key Operations: Validate tokens, create access tokens, manage TLS certificates, handle profile keys, store passwords/credentials

Why separate? Authentication and cryptography require special security considerations and may need different storage backends (HSM, vault services, etc.).

2. MetaAdapter

Purpose: Structured metadata storage

Responsibilities:

  • Tenant and profile information
  • Action tokens (posts, comments, reactions, etc.)
  • File metadata with variants
  • Task persistence for the scheduler
  • Database metadata (for RTDB)

Key Operations: Create/query tenants, store/retrieve actions, manage file metadata, persist tasks, handle profiles

Why separate? Metadata can be stored in different databases (SQLite, PostgreSQL, MongoDB) based on scale and requirements.

3. BlobAdapter

Purpose: Immutable binary data storage

Responsibilities:

  • Content-addressed blob storage
  • File data (images, videos, documents)
  • Database snapshots (for RTDB)
  • Both buffered and streaming I/O

Key Operations: Write/read blobs (buffered or streaming), check blob existence and size

Why separate? Blob storage can use filesystem, S3, CDN, or specialized object storage based on scale and cost considerations.

4. RtdbAdapter

Purpose: Real-time structured database

Responsibilities:

  • Path-based hierarchical storage (Firebase-like)
  • Query with filtering, sorting, pagination
  • Real-time subscriptions via WebSocket
  • Transaction support for atomic writes
  • Secondary indexes for performance
  • Multi-tenant database isolation

Key Operations: Query/create/update/delete documents, transactions, real-time subscriptions

Why separate? Real-time database functionality can use different backends (redb, PostgreSQL, MongoDB) based on query requirements and scale.

Learn more: RTDB with redb

5. CrdtAdapter

Purpose: Collaborative document storage with CRDTs

Responsibilities:

  • Binary CRDT update storage (Yjs protocol)
  • Document metadata management
  • Real-time change subscriptions
  • Conflict-free merge guarantees
  • Awareness state (presence, cursors)
  • Multi-tenant document isolation

Key Operations: Create/read documents, append updates, retrieve update history, subscribe to changes

Why separate? CRDT storage can use different backends (redb, dedicated CRDT stores) and has different performance characteristics than traditional databases.

Learn more: CRDT Collaborative Editing

Benefits of the Adapter Pattern

  • Flexible Deployment: Switch storage backends without changing core logic
  • Separation of Concerns: Security, metadata, and binary data have different requirements
  • Testing: Easy to create in-memory adapters for testing
  • Scalability: Can distribute adapters across different services
  • Cost Optimization: Use appropriate storage for each data type
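
As an illustration of the testing and backend-swapping benefits, here is a minimal sketch of an in-memory blob store behind a trait. The names `BlobStore`, `MemoryBlobStore`, and `store_and_measure`, and the synchronous method signatures, are hypothetical simplifications; the real BlobAdapter trait in cloudillo-types is not reproduced in this document and probably differs (for example, by being async and supporting streaming).

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for a blob adapter trait.
pub trait BlobStore: Send + Sync {
    fn write(&self, blob_id: &str, data: &[u8]);
    fn read(&self, blob_id: &str) -> Option<Vec<u8>>;
    fn size(&self, blob_id: &str) -> Option<usize>;
}

/// In-memory implementation, usable as a test double.
#[derive(Default)]
pub struct MemoryBlobStore {
    blobs: Mutex<HashMap<String, Vec<u8>>>,
}

impl BlobStore for MemoryBlobStore {
    fn write(&self, blob_id: &str, data: &[u8]) {
        // Blobs are immutable: the first write for a given ID wins.
        self.blobs
            .lock()
            .unwrap()
            .entry(blob_id.to_string())
            .or_insert_with(|| data.to_vec());
    }

    fn read(&self, blob_id: &str) -> Option<Vec<u8>> {
        self.blobs.lock().unwrap().get(blob_id).cloned()
    }

    fn size(&self, blob_id: &str) -> Option<usize> {
        self.blobs.lock().unwrap().get(blob_id).map(|b| b.len())
    }
}

/// Core logic depends only on the trait, so the backend can be swapped.
pub fn store_and_measure(store: &dyn BlobStore, blob_id: &str, data: &[u8]) -> Option<usize> {
    store.write(blob_id, data);
    store.size(blob_id)
}
```

Because `store_and_measure` only sees `&dyn BlobStore`, the same call would work against a filesystem or object-storage implementation of the trait.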

Content-Addressed Architecture

Cloudillo uses content-addressing throughout its architecture: resource identifiers are cryptographic hashes of the content they name. Because identifiers embed the hashes of the resources they reference, the result is a Merkle tree structure that provides cryptographic proof of authenticity and immutability.

Hash-Based Identifiers

All resource IDs are SHA-256 hashes with versioned prefixes:

Prefix | Resource Type | Hash Input                                       | Example
a1~    | Action        | Entire JWT token (header + payload + signature) | a1~8kR3mN9pQ2vL6xW...
f1~    | File          | File descriptor string                           | f1~Qo2E3G8TJZ2HTGh...
b1~    | Blob          | Blob bytes (actual image/video data)             | b1~abc123def456ghi...
d2,    | Descriptor    | (not a hash, the encoded string itself)          | d2,vis.tn:b1~abc:f=avif:...

Version Scheme

Format: {prefix}{version}~{base64_encoded_hash}

  • Version 1: SHA-256 with base64url encoding (no padding)
  • Future versions: Can upgrade to SHA-3, BLAKE3, etc. without breaking old content
  • Backward compatibility: Old content remains valid forever
  • Algorithm agility: Migrate to new algorithms without breaking existing references

Example upgrade path:

a1~...  (SHA-256)
a2~...  (SHA-3)
a3~...  (BLAKE3)

Six-Level Merkle Tree

Content-addressing creates a hierarchical Merkle tree, in which each level is identified by a hash that covers the level below it:

Level 1: Blob Data (raw bytes)
   ↓ SHA-256 hash
Level 2: Blob ID (b1~hash)
   ↓ collected in descriptor
Level 3: File Descriptor (d2,class.variant:b1~hash:format:size:resolution;...)
   ↓ SHA-256 hash of descriptor
Level 4: File ID (f1~hash)
   ↓ referenced in action
Level 5: Action Token (JWT with content, parent, attachments)
   ↓ SHA-256 hash of entire JWT
Level 6: Action ID (a1~hash)
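
The chain can be sketched as follows. This is illustrative only: `toy_digest` is a non-cryptographic FNV-1a stand-in that keeps the example free of external crates (the real scheme uses SHA-256 with base64url encoding), and the descriptor and token strings are placeholders, not Cloudillo's actual formats. All names here are hypothetical.

```rust
/// Digest function signature. A real implementation would be SHA-256
/// followed by unpadded base64url encoding.
pub type Digest = fn(&[u8]) -> String;

/// NOT cryptographic: FNV-1a, used only to keep this sketch std-only.
pub fn toy_digest(data: &[u8]) -> String {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for byte in data {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    format!("{hash:016x}")
}

#[derive(Debug, PartialEq, Eq)]
pub struct IdChain {
    pub blob_id: String,
    pub file_id: String,
    pub action_id: String,
}

pub fn derive_chain(blob: &[u8], digest: Digest) -> IdChain {
    // Levels 1-2: raw bytes -> blob ID.
    let blob_id = format!("b1~{}", digest(blob));
    // Level 3: descriptor embedding the blob ID (placeholder layout).
    let descriptor = format!("d2,{blob_id}");
    // Level 4: file ID is the hash of the descriptor string.
    let file_id = format!("f1~{}", digest(descriptor.as_bytes()));
    // Level 5: placeholder for a signed JWT that references the file ID.
    let token = format!("token-referencing:{file_id}");
    // Level 6: action ID is the hash of the entire token.
    let action_id = format!("a1~{}", digest(token.as_bytes()));
    IdChain { blob_id, file_id, action_id }
}
```

A changed blob yields a different blob ID, which changes the descriptor and, with a collision-resistant hash, the file ID and action ID as well. That propagation is what makes the structure tamper-evident.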

Properties

  • Immutable: Content cannot change without changing the ID
  • Tamper-Evident: Any modification is immediately detectable
  • Deduplicatable: Identical content produces identical IDs
  • Verifiable: Anyone can recompute and verify hashes
  • Cacheable: Content-addressed data can be cached forever
  • Trustless: No need to trust storage providers—verify the hash

Integration with Adapters

Content-addressing is implemented across multiple adapters:

BlobAdapter:

  • Stores blobs indexed by blob_id (b1~...)
  • Blob IDs are SHA-256 hashes of the blob bytes
  • Enables deduplication and integrity verification

MetaAdapter:

  • Stores action tokens indexed by action_id (a1~...)
  • Action IDs are SHA-256 hashes of the JWT token
  • Stores file metadata with file_id (f1~...)
  • File IDs are SHA-256 hashes of the descriptor

AuthAdapter:

  • Signs action tokens with profile keys (ES384)
  • Signature + content hash = complete authenticity proof

Learn more: Content-Addressing & Merkle Trees

Task-Based Asynchronous Processing

Complex operations in Cloudillo are modeled as persistent tasks that can execute asynchronously, survive restarts, and depend on other tasks.

Task System Components

Tasks

Tasks implement the Task<S> trait:

pub trait Task<S>: Debug + Send + Sync {
    fn kind() -> &'static str;
    async fn run(&self, state: &S) -> Result<()>;
    fn priority(&self) -> Priority { Priority::Medium }
    fn dependencies(&self) -> Vec<TaskId> { vec![] }
}

Built-in Task Types:

  • ActionCreatorTask: Creates and signs action tokens for federation
  • ActionVerifierTask: Validates incoming federated actions
  • ActionDeliveryTask: Delivers actions to remote instances with retry
  • FileIdGeneratorTask: Generates content-addressed file IDs
  • ImageResizerTask: Creates image variants (thumbnails, etc.)
  • VideoTranscoderTask: Transcodes video to web-optimized formats
  • AudioExtractorTask: Extracts audio metadata
  • PdfProcessorTask: Extracts text and metadata from PDFs
  • EmailSenderTask: Sends emails asynchronously
  • CertRenewalTask: Handles ACME certificate renewal
  • ProfileRefreshBatchTask: Batch-refreshes remote profiles
  • TenantImageUpdaterTask: Updates tenant avatar images

Scheduler

The scheduler manages task lifecycle with dependency resolution:

Features:

  • Task registry with dynamic builders
  • Dependency resolution (DAG-based)
  • Scheduled execution (cron-like)
  • Persistence via MetaAdapter (survives restarts)
  • Notification system for task completion

Example Flow:

ActionCreatorTask (depends on FileIdGeneratorTask)
    ↓
    waits for file processing to complete
    ↓
FileIdGeneratorTask completes
    ↓
ActionCreatorTask auto-starts
    ↓
Creates signed JWT, stores in MetaAdapter
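
The dependency resolution in the flow above can be sketched with a topological ordering (Kahn's algorithm). `TaskSpec` and `execution_order` are hypothetical names for illustration; the real scheduler also handles persistence, cron-like scheduling, and completion notifications, none of which are modeled here.

```rust
use std::collections::{HashMap, VecDeque};

pub type TaskId = u64;

pub struct TaskSpec {
    pub id: TaskId,
    pub kind: &'static str,
    pub dependencies: Vec<TaskId>,
}

/// Returns task kinds in an order that respects dependencies, or None
/// if a dependency is unknown or the graph contains a cycle.
pub fn execution_order(tasks: &[TaskSpec]) -> Option<Vec<&'static str>> {
    let kinds: HashMap<TaskId, &'static str> =
        tasks.iter().map(|t| (t.id, t.kind)).collect();
    // Number of unfinished dependencies per task.
    let mut pending: HashMap<TaskId, usize> = HashMap::new();
    // Reverse edges: which tasks wait on a given task.
    let mut dependents: HashMap<TaskId, Vec<TaskId>> = HashMap::new();

    for task in tasks {
        pending.insert(task.id, task.dependencies.len());
        for dep in &task.dependencies {
            if !kinds.contains_key(dep) {
                return None;
            }
            dependents.entry(*dep).or_default().push(task.id);
        }
    }

    // Tasks without dependencies are ready immediately.
    let mut ready: VecDeque<TaskId> = tasks
        .iter()
        .filter(|t| t.dependencies.is_empty())
        .map(|t| t.id)
        .collect();

    let mut order = Vec::new();
    while let Some(id) = ready.pop_front() {
        order.push(kinds[&id]);
        if let Some(children) = dependents.get(&id) {
            for child in children {
                let left = pending.get_mut(child).unwrap();
                *left -= 1;
                if *left == 0 {
                    // All dependencies completed: the task auto-starts.
                    ready.push_back(*child);
                }
            }
        }
    }

    // Tasks left over were part of a cycle.
    if order.len() == tasks.len() { Some(order) } else { None }
}
```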

Worker Pool

A priority-based thread pool for CPU-intensive and blocking operations:

Architecture:

  • Three priority tiers: High > Medium > Low
  • Each tier has configurable worker thread count
  • Uses flume MPMC channels for work distribution
  • Returns futures for async integration

Default Configuration (cloudillo-server):

  • 1 high-priority worker
  • 2 medium-priority workers
  • 1 low-priority worker

Use Cases:

  • Image processing (CPU-intensive)
  • Cryptographic operations
  • File compression
  • Blocking I/O
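
A std-only sketch of the tiered pool idea. It departs from the description above in two labeled ways: it uses `std::sync::mpsc` with a shared, mutex-guarded receiver instead of flume MPMC channels, and it returns a `Receiver` instead of a future. It also assumes each tier has its own queue and threads. `WorkerPool`, `spawn_tier`, and `run` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Copy, Debug)]
pub enum Priority {
    High,
    Medium,
    Low,
}

type Job = Box<dyn FnOnce() + Send + 'static>;

pub struct WorkerPool {
    high: Sender<Job>,
    medium: Sender<Job>,
    low: Sender<Job>,
}

/// Starts `workers` threads that all pull from one queue.
fn spawn_tier(workers: usize) -> Sender<Job> {
    let (tx, rx) = channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // The lock guard is dropped at the end of this statement,
            // so other workers can wait while this one runs its job.
            let job = rx.lock().unwrap().recv();
            match job {
                Ok(job) => job(),
                Err(_) => break, // all senders dropped: shut down
            }
        });
    }
    tx
}

impl WorkerPool {
    /// Each tier needs at least one worker, otherwise `run` panics.
    pub fn new(high: usize, medium: usize, low: usize) -> Self {
        WorkerPool {
            high: spawn_tier(high),
            medium: spawn_tier(medium),
            low: spawn_tier(low),
        }
    }

    /// Submits a closure; the returned receiver yields its result.
    pub fn run<T, F>(&self, priority: Priority, f: F) -> Receiver<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (result_tx, result_rx) = channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; ignore that.
            let _ = result_tx.send(f());
        });
        let queue = match priority {
            Priority::High => &self.high,
            Priority::Medium => &self.medium,
            Priority::Low => &self.low,
        };
        queue.send(job).expect("worker tier has shut down");
        result_rx
    }
}
```

With the default sizing listed above, the pool would be created as `WorkerPool::new(1, 2, 1)`.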

Application State Management

AppState Structure

The core application state contains: scheduler, worker pool, HTTP client, TLS certificates, and all five adapters (auth, meta, blob, rtdb, crdt).

AppBuilder Pattern

Configuration uses a fluent builder API with mode, identity, domain, data directory, and adapter selections.

Configuration Options:

  • Server mode (Standalone, Proxy, StreamProxy)
  • Network binding (HTTPS/HTTP ports)
  • Domain configuration
  • Directory paths (dist, tmp, data)
  • Adapter injection
  • Worker pool sizing
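
A fluent builder of this kind might look like the sketch below. `AppConfig`, `AppConfigBuilder`, and every method name are hypothetical; the default listen address and data directory are assumptions, and adapter injection is omitted to keep the example self-contained.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServerMode {
    Standalone,
    Proxy,
    StreamProxy,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AppConfig {
    pub mode: ServerMode,
    pub base_id_tag: String,
    pub listen: String,
    pub data_dir: PathBuf,
    /// Worker counts for the (high, medium, low) priority tiers.
    pub workers: (usize, usize, usize),
}

pub struct AppConfigBuilder {
    mode: ServerMode,
    base_id_tag: Option<String>,
    listen: String,
    data_dir: PathBuf,
    workers: (usize, usize, usize),
}

impl AppConfigBuilder {
    pub fn new() -> Self {
        AppConfigBuilder {
            mode: ServerMode::Standalone,       // documented default mode
            base_id_tag: None,                  // must be set by the caller
            listen: "0.0.0.0:8443".to_string(), // assumed default
            data_dir: PathBuf::from("./data"),  // assumed default
            workers: (1, 2, 1),                 // documented default sizing
        }
    }

    pub fn mode(mut self, mode: ServerMode) -> Self {
        self.mode = mode;
        self
    }

    pub fn base_id_tag(mut self, id_tag: &str) -> Self {
        self.base_id_tag = Some(id_tag.to_string());
        self
    }

    pub fn listen(mut self, addr: &str) -> Self {
        self.listen = addr.to_string();
        self
    }

    pub fn data_dir(mut self, dir: impl Into<PathBuf>) -> Self {
        self.data_dir = dir.into();
        self
    }

    pub fn workers(mut self, high: usize, medium: usize, low: usize) -> Self {
        self.workers = (high, medium, low);
        self
    }

    /// Validates required fields and produces the final configuration.
    pub fn build(self) -> Result<AppConfig, String> {
        let base_id_tag = self
            .base_id_tag
            .ok_or_else(|| "base_id_tag is required".to_string())?;
        Ok(AppConfig {
            mode: self.mode,
            base_id_tag,
            listen: self.listen,
            data_dir: self.data_dir,
            workers: self.workers,
        })
    }
}
```

Each setter consumes and returns the builder, so calls chain, and validation happens once in `build`.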

Crate Organization

The workspace is organized into feature-specific crates under crates/:

cloudillo-types - Foundation Layer

Shared types and trait definitions used across all crates:

  • Adapter trait definitions (AuthAdapter, MetaAdapter, BlobAdapter, RtdbAdapter, CrdtAdapter)
  • Core domain types and type aliases
  • Error types with unified error handling
  • HTTP extractors and utility types

cloudillo-core - Infrastructure Layer

Core system components providing foundational services:

  • scheduler.rs: Task scheduling with dependencies
  • app.rs: Application state and builder
  • acme.rs: Let’s Encrypt/ACME certificate management
  • middleware.rs: Authentication middleware
  • extract.rs: Custom Axum extractors (TnId, IdTag, Auth)
  • ws_bus.rs: WebSocket message bus
  • ws_broadcast.rs: WebSocket broadcast manager
  • rate_limit/: Rate limiting (governor-based)
  • request.rs: HTTP client for federation
  • roles.rs: Role definitions
  • settings/: Global settings system

cloudillo-auth - Authentication

  • Login/logout endpoints, token generation, password management
  • WebAuthn passwordless authentication
  • QR code-based login flow
  • API key generation and validation

cloudillo-action - Federation Actions

  • Action creation, verification, and delivery tasks
  • JWT verification and token processing
  • Action CRUD endpoints and federation inbox
  • Federated delivery with retry and audience computation

cloudillo-file - File Storage & Processing

  • File upload/download endpoints
  • Image processing (resize, format conversion, SVG rasterization)
  • Video transcoding and audio extraction (via FFmpeg)
  • PDF processing and file descriptor management
  • Variant generation and preset system

cloudillo-profile - Profile Management

  • Tenant profile endpoints
  • Profile federation and sync
  • Avatar and banner image processing

cloudillo-rtdb - Real-Time Database

  • Database CRUD endpoints
  • WebSocket subscription-based data sync

cloudillo-crdt - Collaborative Editing

  • Yjs WebSocket protocol handler for CRDT sync

cloudillo-push - Push Notifications

  • Web Push notification delivery (RFC 8291 + VAPID)
  • Subscription management and per-user preferences

cloudillo-email - Email Notifications

  • SMTP delivery via lettre
  • Handlebars template rendering
  • Async email sender task

cloudillo-idp - Identity Provider

  • Tenant registration and lifecycle management

cloudillo-admin - Administration

  • System admin endpoints for instance management

cloudillo-ref - Reference API

  • Namespace lookups and reference resolution

cloudillo-proxy - Reverse Proxy

  • HTTP/WebSocket proxying with TLS termination

cloudillo - Integrator

Assembles all feature crates into a runnable application:

  • routes.rs: HTTP endpoint definitions (public and protected route groups)
  • websocket.rs: WebSocket protocol handler
  • bootstrap.rs: First-run tenant setup
  • webserver.rs: Server configuration (TLS, CORS, compression)

Concurrency Model

Multi-threaded Tokio Runtime

Cloudillo uses Tokio’s multi-threaded async runtime for handling concurrent requests.

Concurrency Layers

  1. Async Layer (Tokio): HTTP request handling, WebSocket connections, I/O
  2. Worker Pool: CPU-intensive tasks, blocking operations
  3. Scheduler: Background task execution with dependencies

Interaction Example:

HTTP Request → Tokio async handler
    ↓
Spawn ImageResizerTask on scheduler
    ↓
Scheduler dispatches to worker pool (CPU-intensive)
    ↓
Worker completes, updates MetaAdapter
    ↓
Scheduler notifies waiting tasks
    ↓
Response returned to client

Request Handling Flow

Middleware Pipeline

HTTP requests flow through these layers:

  1. HTTPS/SNI Resolution: CertResolver selects TLS certificate by domain
  2. Tracing/Logging: Request tracing with structured logging
  3. Authentication: require_auth or optional_auth middleware
  4. Custom Extractors: TnId, IdTag, Auth extract from request context
  5. Handler: Business logic execution
  6. Response: JSON serialization, error handling

Custom Extractors

Axum extractors provide typed access to request context:

  • TnId: Tenant ID (database primary key, u32)
  • IdTag: Tenant identifier string (e.g., “alice.example.com”)
  • Auth: Full authentication context (tn_id, id_tag, scope, etc.)

Error Handling

Custom error enum with automatic HTTP response conversion: NotFound (404), PermissionDenied (403), DbError/Unknown/Parse/Io (500).
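
The mapping can be sketched as below. The variant payloads, the `status_code` helper, and `read_text` are assumptions for illustration; in the real server the conversion presumably goes through the web framework's response trait rather than a bare `u16`.

```rust
use std::fmt;

#[derive(Debug)]
pub enum Error {
    NotFound,
    PermissionDenied,
    DbError,
    Unknown,
    Parse,
    Io(std::io::Error),
}

impl Error {
    /// HTTP status code for each variant, following the mapping above.
    pub fn status_code(&self) -> u16 {
        match self {
            Error::NotFound => 404,
            Error::PermissionDenied => 403,
            Error::DbError | Error::Unknown | Error::Parse | Error::Io(_) => 500,
        }
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::NotFound => write!(f, "not found"),
            Error::PermissionDenied => write!(f, "permission denied"),
            Error::DbError => write!(f, "database error"),
            Error::Unknown => write!(f, "unknown error"),
            Error::Parse => write!(f, "parse error"),
            Error::Io(err) => write!(f, "I/O error: {err}"),
        }
    }
}

impl std::error::Error for Error {}

/// Lets `?` convert I/O failures into the unified error type.
impl From<std::io::Error> for Error {
    fn from(err: std::io::Error) -> Self {
        Error::Io(err)
    }
}

/// Hypothetical helper showing the `?` conversion in use.
pub fn read_text(path: &str) -> Result<String, Error> {
    Ok(std::fs::read_to_string(path)?)
}
```

Handlers can then return `Result<T, Error>` and rely on a single place to decide which status code the client sees.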

Server Modes

Cloudillo supports different deployment modes:

Standalone (Default)

Self-contained single instance:

  • HTTPS on configured port
  • Optional HTTP for ACME challenges
  • All adapters run locally

Use case: Personal servers, small communities

Proxy

Used if Cloudillo is behind a reverse proxy:

  • Listens on HTTP port
  • Certificate handling is the responsibility of the proxy

Use case: Managed hosting providers, self-hosting with multiple services on one IP address

Security Architecture

Implemented in Rust

Rust's ownership and type system rule out whole classes of memory-safety bugs and data races at compile time, which keeps the attack surface small.

No Unsafe Code

Cloudillo enforces memory safety:

#![forbid(unsafe_code)]

ABAC Permission System

Cloudillo uses Attribute-Based Access Control (ABAC) for fine-grained permissions across all resources. ABAC provides flexible permission rules based on:

  • User attributes (identity, roles, relationships)
  • Resource attributes (owner, visibility, type)
  • Contextual factors (time, environment)

Key Features:

  • Six visibility levels: Public (P), Verified (V), SecondDegree (2), Follower (F), Connected (C), Direct (NULL)
  • Policy-based access control (TOP/BOTTOM policies)
  • Relationship-aware (following, connections)
  • Time-based permissions
  • Custom policy support
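
The six visibility codes could be modeled as an enum, as in this sketch. The type and method names are hypothetical, and only the code mapping listed above is encoded; how levels are compared or enforced is not described here, so no access check is shown.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Public,
    Verified,
    SecondDegree,
    Follower,
    Connected,
    Direct,
}

impl Visibility {
    /// Decodes a stored code; `None` represents NULL, i.e. Direct.
    /// Unknown codes are rejected instead of falling back to a default.
    pub fn from_code(code: Option<char>) -> Option<Self> {
        match code {
            None => Some(Visibility::Direct),
            Some('P') => Some(Visibility::Public),
            Some('V') => Some(Visibility::Verified),
            Some('2') => Some(Visibility::SecondDegree),
            Some('F') => Some(Visibility::Follower),
            Some('C') => Some(Visibility::Connected),
            Some(_) => None,
        }
    }

    /// Encodes back to the stored code; Direct maps to NULL (`None`).
    pub fn code(self) -> Option<char> {
        match self {
            Visibility::Public => Some('P'),
            Visibility::Verified => Some('V'),
            Visibility::SecondDegree => Some('2'),
            Visibility::Follower => Some('F'),
            Visibility::Connected => Some('C'),
            Visibility::Direct => None,
        }
    }
}
```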

Learn more: ABAC Permission System

Cryptographic Algorithms

  • P384: Elliptic curve for action signing
  • ES384: JWT signature algorithm
  • SHA256: Content addressing and hashing
  • bcrypt: Password hashing

Security Layers

  1. TLS/HTTPS: All connections encrypted (Rustls)
  2. ACME: Automatic certificate management (Let’s Encrypt)
  3. JWT: Cryptographically signed tokens
  4. Content Addressing: Tamper detection via SHA256
  5. Permission Checks: Authorization at every access point

Bootstrap Process

Initial setup when starting a new instance:

  1. Create tenant with base_id_tag
  2. Set password (if provided)
  3. Generate profile signing key (P384)
  4. Initiate ACME for TLS certificates (if email provided)
  5. Start scheduler and worker pool
  6. Start HTTP/HTTPS servers

Example configuration:

BASE_ID_TAG=alice.example.com        # Required
BASE_PASSWORD=secret                 # Initial password
ACME_EMAIL=alice@example.com         # Let's Encrypt email
MODE=standalone                      # Server mode
LISTEN=0.0.0.0:8443                  # HTTPS binding
DATA_DIR=./data                      # storage path
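
As a rough sketch of how these variables could drive the bootstrap steps listed above: `BootstrapConfig`, `load_config`, and `planned_steps` are hypothetical names, and the fallback values for LISTEN and DATA_DIR are assumptions borrowed from the example, not documented defaults. The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
#[derive(Debug)]
pub struct BootstrapConfig {
    pub base_id_tag: String,
    pub base_password: Option<String>,
    pub acme_email: Option<String>,
    pub mode: String,
    pub listen: String,
    pub data_dir: String,
}

/// Builds the configuration from a key lookup function.
pub fn load_config(get: impl Fn(&str) -> Option<String>) -> Result<BootstrapConfig, String> {
    Ok(BootstrapConfig {
        // The only variable marked as required in the example.
        base_id_tag: get("BASE_ID_TAG").ok_or("BASE_ID_TAG is required")?,
        base_password: get("BASE_PASSWORD"),
        acme_email: get("ACME_EMAIL"),
        mode: get("MODE").unwrap_or_else(|| "standalone".to_string()),
        listen: get("LISTEN").unwrap_or_else(|| "0.0.0.0:8443".to_string()),
        data_dir: get("DATA_DIR").unwrap_or_else(|| "./data".to_string()),
    })
}

/// Lists the bootstrap steps that apply to a given configuration.
pub fn planned_steps(cfg: &BootstrapConfig) -> Vec<&'static str> {
    let mut steps = vec!["create tenant"];
    if cfg.base_password.is_some() {
        steps.push("set password");
    }
    steps.push("generate profile signing key");
    if cfg.acme_email.is_some() {
        steps.push("initiate ACME");
    }
    steps.push("start scheduler and worker pool");
    steps.push("start HTTP/HTTPS servers");
    steps
}
```

Against the real environment the call would be `load_config(|key| std::env::var(key).ok())`.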

Key Dependencies

Web Framework

  • axum (0.8): Async web framework
  • tower: Service abstractions
  • tower-http: CORS, static files

TLS & Crypto

  • rustls (0.23): Pure Rust TLS
  • instant-acme (0.8): ACME client
  • jsonwebtoken (9.3): JWT handling
  • p384: Elliptic curve operations

Async Runtime

  • tokio (1.48): Multi-threaded async

Serialization

  • serde (1.0): Serialization framework
  • serde_json: JSON support

Database

  • sqlx (0.8): Async SQL (SQLite support)

Utilities

  • image (0.25): Image processing
  • sha2: SHA256 hashing
  • croner (3.0): Cron expressions
  • flume: MPMC channels

Architectural Strengths

  • Pluggable Adapters: Easy to swap storage backends
  • Self-Contained: The default adapters are embedded (SQLite, filesystem, redb), so no external services are required
  • Federated: Communicates with other instances
  • Task-Based: Persistent, resumable async execution
  • Type-Safe: Leverages Rust’s type system
  • Memory-Safe: #![forbid(unsafe_code)] is enforced, so no unsafe code is permitted
  • Observable: Built-in tracing integration

Next Steps