File Processing Pipeline
Overview
Cloudillo processes uploaded files through an asynchronous pipeline that generates multiple variants optimized for different use cases. The system uses FFmpeg for multimedia processing and supports images, videos, audio, and PDFs.
Processing Architecture
File Upload
↓
Store original blob
↓
Create FileIdGeneratorTask
↓
Detect file type (MIME)
↓
Generate variants (async)
├─ Image: thumbnails, SD, MD, HD
├─ Video: transcoded variants, thumbnails
├─ Audio: normalized, compressed
└─ PDF: text extraction, thumbnails
↓
Create file descriptor
↓
Content-address all variants
↓
Return file ID (f1~...)
Supported File Types
Images
| Format | Extensions | Processing |
|---|---|---|
| JPEG | .jpg, .jpeg | Resize, AVIF/WebP conversion |
| PNG | .png | Resize, AVIF/WebP conversion |
| GIF | .gif | First frame extraction, resize |
| WebP | .webp | Resize only |
| AVIF | .avif | Resize only |
Image Format Selection
Thumbnails use AVIF for best compression. Larger variants (SD, MD, HD) use WebP for faster encoding while maintaining good quality.
Video
| Format | Extensions | Processing |
|---|---|---|
| MP4 | .mp4 | H.264 transcode, thumbnails |
| WebM | .webm | H.264 transcode, thumbnails |
| MOV | .mov | H.264 transcode, thumbnails |
| MKV | .mkv | H.264 transcode, thumbnails |
Audio
| Format | Extensions | Processing |
|---|---|---|
| MP3 | .mp3 | OPUS conversion |
| WAV | .wav | OPUS conversion |
| OGG | .ogg | OPUS conversion |
| FLAC | .flac | OPUS conversion |
| M4A | .m4a | OPUS conversion |
| OPUS | .opus | Normalization only |
Documents
| Format | Extensions | Processing |
|---|---|---|
| PDF | .pdf | Text extraction, page thumbnails |
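As a rough illustration of how the tables above could drive routing, the sketch below maps a filename extension to a media category. It is a simplified stand-in: the pipeline itself detects the MIME type, and the names `EXTENSION_CATEGORY` and `classify_by_extension` are hypothetical, not part of Cloudillo.

```python
from pathlib import PurePath
from typing import Optional

# Extension -> media category, transcribed from the tables above.
# Hypothetical helper data; the real pipeline detects the MIME type.
EXTENSION_CATEGORY = {
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".gif": "image", ".webp": "image", ".avif": "image",
    ".mp4": "video", ".webm": "video", ".mov": "video", ".mkv": "video",
    ".mp3": "audio", ".wav": "audio", ".ogg": "audio",
    ".flac": "audio", ".m4a": "audio", ".opus": "audio",
    ".pdf": "document",
}


def classify_by_extension(filename: str) -> Optional[str]:
    """Return the media category for a filename, or None if unsupported."""
    return EXTENSION_CATEGORY.get(PurePath(filename).suffix.lower())
```

An unsupported extension yields `None`, which corresponds to rejecting the upload.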
Variant System
Cloudillo uses a two-level variant system. Each variant code has the form <class>.<quality>:
vis.sd → class: visual (image), quality: standard definition
vid.hd → class: video, quality: high definition
aud.md → class: audio, quality: medium
Variant Classes
| Class | Code | Description | Source Types |
|---|---|---|---|
| Visual | vis | Static images | JPEG, PNG, WebP, AVIF, GIF |
| Video | vid | Video content | MP4, WebM, MKV, AVI, MOV |
| Audio | aud | Audio tracks | MP3, WAV, OGG, FLAC, AAC, OPUS |
| Document | doc | Documents | PDF |
| Raw | raw | Original file | Any (unprocessed) |
Quality Levels
| Quality | Code | Description |
|---|---|---|
| Profile | pf | 80px - Profile pictures |
| Thumbnail | tn | 128px - Small previews |
| Standard | sd | 720px - Mobile/low bandwidth |
| Medium | md | 1280px - Desktop viewing |
| High | hd | 1920px - High quality |
| Extra | xd | 3840px - 4K/maximum quality |
| Original | orig | Unprocessed source file |
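The class and quality codes above are enough to validate a two-level variant code. The following is a minimal sketch, not Cloudillo's parser; `Variant` and `parse_variant` are illustrative names, and a bare `orig` without a class prefix is deliberately not handled.

```python
from typing import NamedTuple

# Codes transcribed from the class and quality tables above.
CLASSES = {"vis", "vid", "aud", "doc", "raw"}
QUALITIES = {"pf", "tn", "sd", "md", "hd", "xd", "orig"}


class Variant(NamedTuple):
    cls: str
    quality: str


def parse_variant(code: str) -> Variant:
    """Split '<class>.<quality>' and check both halves against the tables."""
    cls, sep, quality = code.partition(".")
    if not sep or cls not in CLASSES or quality not in QUALITIES:
        raise ValueError(f"not a valid variant code: {code!r}")
    return Variant(cls, quality)
```

For example, `parse_variant("vis.sd")` yields `Variant(cls="vis", quality="sd")`, while `parse_variant("vis.ultra")` raises `ValueError`.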
Visual Variants (Images)
| Variant | Max Size | Format | Use Case |
|---|---|---|---|
| vis.pf | 80×80 | AVIF | Profile pictures |
| vis.tn | 128×128 | AVIF | Thumbnails, listings |
| vis.sd | 720px | WebP | Mobile, previews |
| vis.md | 1280px | WebP | Desktop viewing |
| vis.hd | 1920px | WebP | High quality display |
| vis.xd | 3840px | WebP | 4K displays |
| orig | - | Original | Source file |
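To make the size limits concrete, here is a small sketch of how output dimensions and format might be chosen for an image variant. It assumes that "Max Size" bounds the longer edge, that aspect ratio is preserved, and that images are never upscaled; the source does not state these details, and `target_size` and `target_format` are hypothetical names.

```python
from typing import Tuple

# Longest-edge limits per quality, from the table above.
MAX_SIDE = {"pf": 80, "tn": 128, "sd": 720, "md": 1280, "hd": 1920, "xd": 3840}
# Profile and thumbnail variants use AVIF; larger variants use WebP.
AVIF_QUALITIES = {"pf", "tn"}


def target_size(width: int, height: int, quality: str) -> Tuple[int, int]:
    """Fit (width, height) inside the quality's limit, keeping aspect ratio.

    Assumption: the limit applies to the longer edge and small images
    are left at their original size rather than upscaled.
    """
    limit = MAX_SIDE[quality]
    longest = max(width, height)
    if longest <= limit:
        return width, height
    return (max(1, round(width * limit / longest)),
            max(1, round(height * limit / longest)))


def target_format(quality: str) -> str:
    """Pick the output format listed for this quality level."""
    return "avif" if quality in AVIF_QUALITIES else "webp"
```

Under these assumptions a 4000×3000 photo becomes 720×540 for `vis.sd`, which matches the `r=720x540` shape used in the descriptor example later in this page.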
Video Variants
| Variant | Max Resolution | Bitrate | Use Case |
|---|---|---|---|
| vid.sd | 720px | 1.5 Mbps | Mobile, low bandwidth |
| vid.md | 1280px | 3 Mbps | Desktop |
| vid.hd | 1920px | 5 Mbps | High quality |
| vid.xd | 3840px | 15 Mbps | 4K playback |
Video processing also extracts a vis.tn thumbnail from the first few seconds.
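The table can be turned into FFmpeg arguments fairly directly. The sketch below is one possible mapping, not the pipeline's actual command builder: it assumes the table's bitrate is passed as a target video bitrate via `-b:v` and that "Max Resolution" bounds the longer edge. `video_transcode_args` is a hypothetical helper.

```python
from typing import List

# quality -> (max edge in pixels, video bitrate in kbps), from the table above.
VIDEO_VARIANTS = {
    "sd": (720, 1500),
    "md": (1280, 3000),
    "hd": (1920, 5000),
    "xd": (3840, 15000),
}


def video_transcode_args(src: str, dst: str, quality: str) -> List[str]:
    """Build an ffmpeg argument list for one video variant.

    Assumption: the frame is scaled to fit inside a max_px x max_px box
    and the table's bitrate is used as the target video bitrate.
    """
    max_px, kbps = VIDEO_VARIANTS[quality]
    scale = f"scale={max_px}:{max_px}:force_original_aspect_ratio=decrease"
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-preset", "medium", "-b:v", f"{kbps}k",
        "-c:a", "aac", "-b:a", "128k",
        "-vf", scale,
        dst,
    ]
```

The list can be passed to `subprocess.run(args, check=True)`. Note that libx264 with the common yuv420p pixel format requires even frame dimensions, so a production command would likely also need to round the scaled size.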
Audio Variants
| Variant | Format | Bitrate | Use Case |
|---|---|---|---|
| aud.sd | OPUS | 64 kbps | Low bandwidth |
| aud.md | OPUS | 128 kbps | Normal playback |
| aud.hd | OPUS | 256 kbps | High quality |
Document Variants
| Variant | Description |
|---|---|
| doc.orig | Original PDF |
| vis.tn | Thumbnail of first page |
Variant Fallback
When a requested variant isn’t available, the system falls back to the next lower quality of the same class that is available:
Request: vis.hd
Fallback chain: vis.md → vis.sd → vis.tn
File Descriptor Format
File descriptors encode all variant information:
d2,vis.tn:b1~abc123:f=avif:s=4096:r=128x128;vis.sd:b1~def456:f=webp:s=32768:r=720x540;vid.hd:b1~xyz789:f=mp4:s=5242880:r=1920x1080:dur=120.5:br=5000
| Component | Description |
|---|---|
| d2, | Descriptor version prefix |
| ; | Variant separator |
| vis.tn, vid.hd | Two-level variant code |
| b1~... | Blob ID (SHA-256 hash) |
| f= | Format (avif, webp, mp4, opus) |
| s= | Size in bytes |
| r= | Resolution (WxH) |
| dur= | Duration in seconds (video/audio) |
| br= | Bitrate in kbps (video/audio) |
| pg= | Page count (PDFs) |
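The descriptor layout above is simple enough to parse with string splitting, and a parsed descriptor is also a natural place to apply the fallback rule from the Variant Fallback section. The sketch below is illustrative only: it assumes blob IDs and attribute values never contain `:` or `;`, and the function names `parse_descriptor` and `resolve_variant` are hypothetical.

```python
from typing import Dict, Optional

# Fallback order, lowest to highest, following the documented chain.
QUALITY_ORDER = ["tn", "sd", "md", "hd", "xd"]


def parse_descriptor(descriptor: str) -> Dict[str, Dict[str, str]]:
    """Parse a 'd2,...' descriptor into {variant code: {field: value}}."""
    version, sep, body = descriptor.partition(",")
    if version != "d2" or not sep:
        raise ValueError("unsupported descriptor version")
    variants: Dict[str, Dict[str, str]] = {}
    for entry in body.split(";"):
        code, blob_id, *attrs = entry.split(":")
        fields = {"blob": blob_id}
        for attr in attrs:
            key, _, value = attr.partition("=")
            fields[key] = value
        variants[code] = fields
    return variants


def resolve_variant(variants: Dict[str, Dict[str, str]],
                    requested: str) -> Optional[str]:
    """Return the requested variant, or the next lower one that exists."""
    cls, _, quality = requested.partition(".")
    if quality not in QUALITY_ORDER:
        return requested if requested in variants else None
    start = QUALITY_ORDER.index(quality)
    for q in reversed(QUALITY_ORDER[: start + 1]):
        code = f"{cls}.{q}"
        if code in variants:
            return code
    return None
```

With the example descriptor above, a request for `vis.hd` resolves to `vis.sd`, because neither `vis.hd` nor `vis.md` is present.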
Processing Presets
Presets define which variants to generate for different use cases:
| Preset | Visual | Video | Audio | Use Case |
|---|---|---|---|---|
| default | vis.tn, vis.sd, vis.md, vis.hd | vid.sd, vid.md, vid.hd | aud.sd, aud.md, aud.hd | General uploads |
| profile_picture | vis.pf, vis.tn, vis.sd | - | - | Profile images |
| cover | vis.tn, vis.sd, vis.md, vis.hd | - | - | Cover/banner images |
| high_quality | vis.tn → vis.xd | vid.sd → vid.xd | aud.md, aud.hd | Maximum quality |
| mobile | vis.tn, vis.sd, vis.md | vid.sd, vid.md | aud.sd | Optimized for mobile |
| archive | vis.tn only | - | - | Minimal (keeps original) |
| podcast | - | - | aud.sd, aud.md, aud.hd | Audio extraction |
| video | vis.tn (thumbnail) | vid.sd → vid.hd | - | Video-focused |
All presets preserve the original file as orig unless configured otherwise.
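A preset table like this can be represented as plain data. The sketch below encodes a few of the presets and reads a range such as "vis.tn → vis.xd" as every quality from `tn` through `xd`; that reading is an assumption, and `PRESETS`, `span`, and `variants_for` are hypothetical names rather than Cloudillo configuration keys.

```python
from typing import Dict, List

QUALITY_ORDER = ["tn", "sd", "md", "hd", "xd"]


def span(cls: str, first: str, last: str) -> List[str]:
    """Expand a range such as vis.tn → vis.xd into explicit variant codes."""
    i, j = QUALITY_ORDER.index(first), QUALITY_ORDER.index(last)
    return [f"{cls}.{q}" for q in QUALITY_ORDER[i : j + 1]]


# A subset of the presets above, keyed by the table's column (vis/vid/aud).
PRESETS: Dict[str, Dict[str, List[str]]] = {
    "default": {
        "vis": span("vis", "tn", "hd"),
        "vid": span("vid", "sd", "hd"),
        "aud": span("aud", "sd", "hd"),
    },
    "profile_picture": {"vis": ["vis.pf", "vis.tn", "vis.sd"]},
    "high_quality": {
        "vis": span("vis", "tn", "xd"),
        "vid": span("vid", "sd", "xd"),
        "aud": ["aud.md", "aud.hd"],
    },
    "archive": {"vis": ["vis.tn"]},
}


def variants_for(preset: str, column: str) -> List[str]:
    """Return the variants a preset generates for one column, or []."""
    return PRESETS[preset].get(column, [])
```

A column marked "-" in the table is simply absent from the preset, so `variants_for("archive", "vid")` returns an empty list.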
FFmpeg Integration
Video and audio processing uses FFmpeg:
Video Transcoding
ffmpeg -i input.mov \
-c:v libx264 -preset medium -crf 23 \
-c:a aac -b:a 128k \
-vf "scale=1280:720:force_original_aspect_ratio=decrease" \
output.mp4
Audio Transcoding
ffmpeg -i input.mp3 \
-c:a libopus -b:a 96k \
output.opus
Audio Extraction
From video files:
ffmpeg -i input.mp4 \
-vn -c:a libopus -b:a 96k \
output.opus
Thumbnail Generation
ffmpeg -i input.mp4 \
-ss 00:00:01 -vframes 1 \
-vf "scale=150:150:force_original_aspect_ratio=decrease" \
thumbnail.jpg
Content-Addressing
All variants are content-addressed:
- Blob level: Raw bytes → b1~{SHA256(bytes)}
- Descriptor level: Descriptor string → f1~{SHA256(descriptor)}
This enables:
- Deduplication: Identical files share blobs
- Verification: Hashes prove integrity
- Caching: Immutable content can be cached forever
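The two hashing levels can be sketched with Python's standard `hashlib`. How the digest is encoded after the `b1~` and `f1~` prefixes is not specified here, so hexadecimal is used purely as an assumption, as is UTF-8 for the descriptor string; `blob_id` and `file_id` are illustrative names.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Blob level: hash the raw bytes of one variant.

    Assumption: the digest is hex-encoded; the real encoding may differ.
    """
    return "b1~" + hashlib.sha256(data).hexdigest()


def file_id(descriptor: str) -> str:
    """Descriptor level: hash the descriptor string (assumed UTF-8)."""
    return "f1~" + hashlib.sha256(descriptor.encode("utf-8")).hexdigest()
```

Because the ID depends only on the bytes, two uploads of the same content produce the same `blob_id`, which is what makes deduplication and integrity checks possible.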
Task Scheduling
File processing uses the task scheduler:
FileUploadTask
↓
Creates FileIdGeneratorTask (depends on upload)
↓
FileIdGeneratorTask generates variants
↓
Action can reference file (depends on FileIdGeneratorTask)
Dependencies ensure actions only reference fully processed files.
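The dependency chain can be modelled as a small graph and executed in topological order. This sketch uses Python's standard `graphlib` only to illustrate the ordering guarantee; it is not Cloudillo's scheduler, and `run_in_dependency_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Set


def run_in_dependency_order(dependencies: Dict[str, Set[str]],
                            run: Callable[[str], None]) -> None:
    """Run each task only after every task it depends on has run."""
    for task in TopologicalSorter(dependencies).static_order():
        run(task)


# Each task maps to the set of tasks it depends on.
deps = {
    "FileIdGeneratorTask": {"FileUploadTask"},
    "Action": {"FileIdGeneratorTask"},
}
order = []
run_in_dependency_order(deps, order.append)
# order is now: FileUploadTask, FileIdGeneratorTask, Action
```

The action runs last, so by the time it references the file, variant generation has already finished.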
Federation Sync
When syncing files across instances:
Metadata-Only Sync
For efficiency, only file descriptors are synced initially:
- Receive action with file attachment
- Fetch file descriptor from origin
- Store descriptor locally
- Fetch variants on demand
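The last step, fetching variants on demand, behaves like a read-through cache: serve a blob locally if present, otherwise fetch it from the origin once and keep it. The sketch below uses an in-memory dictionary in place of real blob storage, and `VariantCache` and its `fetch_from_origin` callback are hypothetical.

```python
from typing import Callable, Dict


class VariantCache:
    """Serve blobs from local storage, fetching from the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._local: Dict[str, bytes] = {}  # stand-in for local blob storage

    def get(self, blob_id: str) -> bytes:
        if blob_id not in self._local:
            # Miss: fetch once from the origin server and store locally.
            self._local[blob_id] = self._fetch(blob_id)
        return self._local[blob_id]
```

Since blobs are content-addressed and therefore immutable, a stored copy never needs to be revalidated against the origin.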
Variant Fetching
Client requests file
↓
Check if variant exists locally
↓
Yes → Serve from local storage
↓
No → Fetch from origin server
Store locally
Serve to client
Error Handling
| Error | Action |
|---|---|
| Unsupported format | Reject upload with error |
| FFmpeg failure | Log, mark as failed, allow retry |
| Storage full | Queue for retry, alert admin |
| Timeout | Retry with extended timeout |
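The timeout row suggests a retry loop whose time budget grows on each attempt. The following is a minimal sketch of that idea; the number of attempts, the growth factor, and the name `retry_with_extended_timeout` are all assumptions rather than documented Cloudillo behavior.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_extended_timeout(operation: Callable[[float], T],
                                timeout: float,
                                attempts: int = 3,
                                factor: float = 2.0) -> T:
    """Call operation(timeout), extending the timeout after each TimeoutError."""
    for _ in range(attempts):
        try:
            return operation(timeout)
        except TimeoutError:
            timeout *= factor  # give the next attempt more time
    raise TimeoutError(f"operation still timing out after {attempts} attempts")
```

Here `operation` is any callable that accepts a timeout in seconds and raises `TimeoutError` when it is exceeded, for example a wrapper around a transcoding job.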
See Also
- Files API - File upload endpoints
- Blob Storage - Storage layer
- Content-Addressing - Hash computation