File Processing Pipeline
Overview
Cloudillo processes uploaded files through an asynchronous pipeline that generates multiple variants optimized for different use cases. The system uses FFmpeg for multimedia processing, resvg for SVG rasterization, and poppler-utils for PDF handling. Supported media types include images, SVG, videos, audio, PDFs, and raw files.
Processing Architecture
Upload (POST /api/files/{preset}/{file_name})
↓
Detect MIME type → Map to VariantClass
↓
Validate against preset's allowed_media_classes
↓
Route to type-specific handler:
├─ Image: Read into memory → thumbnail sync → schedule ImageResizerTask per variant
├─ SVG: Sanitize → store as vis.sd → rasterize thumbnail sync
├─ Video: Stream to temp → FFprobe → extract frame → thumbnail sync
│         → schedule VideoTranscoderTask + optional AudioExtractorTask
├─ Audio: Stream to temp → FFprobe → schedule AudioExtractorTask per tier
├─ PDF: Read into memory → store original → schedule PdfProcessorTask
└─ Raw: Stream to temp → store as-is (orig variant only)
↓
Schedule FileIdGeneratorTask (depends on all variant tasks)
↓
Create file descriptor → Content-address all variants
↓
Return file ID (f1~...)
The upload handler runs directly (not as a scheduled task). Thumbnails are generated synchronously so clients receive an immediate preview. Additional variants are generated asynchronously via the task scheduler.
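The first two pipeline steps (MIME type to variant class, then validation against the preset) can be sketched as follows. This is a minimal illustration, not Cloudillo's implementation: the `VariantClass` enum, the prefix-based matching, and the contents of `ALLOWED_MEDIA_CLASSES` are assumptions made for the example.

```python
from enum import Enum


class VariantClass(Enum):
    VISUAL = "vis"
    VIDEO = "vid"
    AUDIO = "aud"
    DOCUMENT = "doc"
    RAW = "raw"


def classify(mime: str) -> VariantClass:
    """Map a MIME type to a variant class; unrecognized types become RAW.

    Prefix matching is a simplification: a real mapping would list the
    supported formats explicitly.
    """
    mime = mime.split(";", 1)[0].strip().lower()  # drop parameters such as charset
    if mime.startswith("image/"):
        return VariantClass.VISUAL
    if mime.startswith("video/"):
        return VariantClass.VIDEO
    if mime.startswith("audio/"):
        return VariantClass.AUDIO
    if mime == "application/pdf":
        return VariantClass.DOCUMENT
    return VariantClass.RAW


# Hypothetical allow-lists; only "archive accepts Raw" is stated in the text.
ALLOWED_MEDIA_CLASSES = {
    "default": {VariantClass.VISUAL, VariantClass.VIDEO,
                VariantClass.AUDIO, VariantClass.DOCUMENT},
    "archive": set(VariantClass),
}


def validate(preset: str, mime: str) -> VariantClass:
    """Return the variant class, or raise if the preset does not allow it."""
    cls = classify(mime)
    if cls not in ALLOWED_MEDIA_CLASSES[preset]:
        raise ValueError(f"unsupported media type: {mime}")
    return cls
```

With these assumed allow-lists, `validate("archive", "application/zip")` returns `VariantClass.RAW`, while the same upload against `default` is rejected.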
Supported File Types
Images
| Format | Extensions | Processing |
|---|---|---|
| JPEG | .jpg, .jpeg | Resize, format conversion |
| PNG | .png | Resize, format conversion |
| GIF | .gif | First frame extraction, resize |
| WebP | .webp | Resize |
| AVIF | .avif | Resize |
| SVG | .svg | Sanitization, rasterized thumbnail |
Image Format Configuration
The vis.pf (profile) variant always uses AVIF. For all other variants, the format is configurable: file.thumbnail_format (default: WebP) controls vis.tn, and file.image_format (default: WebP) controls vis.sd through vis.xd.
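The format rule above can be expressed as a small lookup. This is a sketch under assumptions: the settings are passed as a plain dictionary keyed by the setting names from the text, and the function name `image_format` is hypothetical.

```python
from typing import Optional

# Defaults as stated in the text.
DEFAULTS = {"file.thumbnail_format": "webp", "file.image_format": "webp"}


def image_format(variant: str, settings: Optional[dict] = None) -> str:
    """Pick the output format for a visual variant code such as 'vis.sd'."""
    cfg = {**DEFAULTS, **(settings or {})}
    if variant == "vis.pf":
        return "avif"  # the profile variant ignores configuration
    if variant == "vis.tn":
        return cfg["file.thumbnail_format"]
    if variant in {"vis.sd", "vis.md", "vis.hd", "vis.xd"}:
        return cfg["file.image_format"]
    raise ValueError(f"not a configurable visual variant: {variant}")
```

For example, setting `file.image_format` to `avif` changes `vis.sd` through `vis.xd` but leaves `vis.tn` on its own setting.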
SVG Security
SVG files are sanitized before storage: <script>, <foreignObject>, and animation elements are removed, on* event handlers are stripped, and javascript:/data:text/html/vbscript: URLs are blocked. The sanitized SVG is stored as vis.sd (vector format scales infinitely) and rasterized via resvg for the thumbnail variant.
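The sanitization rules can be illustrated with Python's standard `xml.etree.ElementTree`. This is a simplified sketch, not Cloudillo's sanitizer and not hardened for production use: the list of animation element names and the function name `sanitize_svg` are assumptions, and it removes any attribute whose value starts with a blocked scheme rather than only link attributes.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Assumed set of "animation elements"; the source does not enumerate them.
BLOCKED_TAGS = {"script", "foreignObject", "animate",
                "animateMotion", "animateTransform", "set"}
BLOCKED_SCHEMES = ("javascript:", "vbscript:", "data:text/html")


def local(name: str) -> str:
    """Strip ElementTree's '{namespace}' prefix from a tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def sanitize_svg(svg_text: str) -> str:
    ET.register_namespace("", SVG_NS)  # serialize without ns0: prefixes
    root = ET.fromstring(svg_text)

    # Remove blocked elements together with their subtrees.
    for parent in list(root.iter()):
        for child in list(parent):
            if local(child.tag) in BLOCKED_TAGS:
                parent.remove(child)

    # Strip on* handlers and attributes carrying blocked URL schemes.
    for element in root.iter():
        for attr in list(element.attrib):
            value = "".join(element.attrib[attr].split()).lower()
            if local(attr).lower().startswith("on") or value.startswith(BLOCKED_SCHEMES):
                del element.attrib[attr]

    return ET.tostring(root, encoding="unicode")
```

Parsing into a tree and re-serializing, rather than pattern-matching on the raw text, avoids missing variants that differ only in whitespace or letter case.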
Video
| Format | Extensions | Processing |
|---|---|---|
| MP4 | .mp4 | H.264 transcode, thumbnails |
| WebM | .webm | H.264 transcode, thumbnails |
| MOV | .mov | H.264 transcode, thumbnails |
| MKV | .mkv | H.264 transcode, thumbnails |
| AVI | .avi | H.264 transcode, thumbnails |
Audio
| Format | Extensions | Processing |
|---|---|---|
| MP3 | .mp3 | OPUS conversion |
| WAV | .wav | OPUS conversion |
| OGG | .ogg | OPUS conversion |
| FLAC | .flac | OPUS conversion |
| AAC | .aac | OPUS conversion |
| WebM Audio | .weba | OPUS conversion |
Documents
| Format | Extensions | Processing |
|---|---|---|
| PDF | .pdf | Page count extraction, first-page thumbnail |
Raw Files
Any file type not listed above can be uploaded using presets that allow the Raw variant class (e.g., archive, orig-only). Raw files are stored as-is with no processing beyond content-addressing.
Variant System
Cloudillo uses a two-level variant system with format <class>.<quality>:
Variant Classes
| Class | Code | Description | Source Types |
|---|---|---|---|
| Visual | vis | Static images | JPEG, PNG, WebP, AVIF, GIF, SVG |
| Video | vid | Video content | MP4, WebM, MKV, AVI, MOV |
| Audio | aud | Audio tracks | MP3, WAV, OGG, FLAC, AAC, OPUS |
| Document | doc | Documents | PDF |
| Raw | raw | Original file | Any (unprocessed) |
Quality Levels
| Quality | Code | Max Size / Bitrate | Use Case |
|---|---|---|---|
| Profile | pf | 80px (always AVIF) | Profile pictures |
| Thumbnail | tn | 256px | Small previews |
| Standard | sd | 720px / 1.5 Mbps / 64 kbps | Mobile/low bandwidth |
| Medium | md | 1280px / 3 Mbps / 128 kbps | Desktop viewing |
| High | hd | 1920px / 5 Mbps / 256 kbps | High quality |
| Extra | xd | 3840px / 15 Mbps | 4K/maximum quality |
| Original | orig | Unprocessed | Source file |
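To show how a pixel limit from the table might translate into output dimensions, here is a minimal sketch. It assumes the limit applies to the longer edge, that aspect ratio is preserved, and that smaller images are never upscaled; none of these rules is confirmed by the text, and `fit_dimensions` is a hypothetical name.

```python
# Pixel limits per quality code, taken from the table above.
MAX_EDGE = {"pf": 80, "tn": 256, "sd": 720, "md": 1280, "hd": 1920, "xd": 3840}


def fit_dimensions(width: int, height: int, quality: str) -> tuple:
    """Scale (width, height) so the longer edge fits the quality's limit."""
    limit = MAX_EDGE[quality]
    longest = max(width, height)
    if longest <= limit:
        return width, height  # assumption: never upscale

    def scaled(edge: int) -> int:
        # Integer arithmetic with rounding to avoid float artifacts.
        return max(1, (edge * limit + longest // 2) // longest)

    return scaled(width), scaled(height)
```

Under these assumptions a 4000x3000 source yields 720x540 for `sd` and 256x192 for `tn`.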
Variant Fallback
When a requested variant isn’t available, the system falls back to the next lower quality level that exists in the same class:
Request: vis.hd
Fallback chain: vis.md → vis.sd → vis.tn
File Descriptor Format
File descriptors encode all variant information:
d2,vis.tn:b1~abc123:f=webp:s=4096:r=256x192;vis.sd:b1~def456:f=webp:s=32768:r=720x540;vid.hd:b1~xyz789:f=mp4:s=5242880:r=1920x1080:dur=120.5:br=5000
| Component | Description |
|---|---|
| d2, | Descriptor version prefix |
| ; | Variant separator |
| vis.tn, vid.hd | Two-level variant code |
| b1~... | Blob ID (SHA-256 hash) |
| f= | Format (avif, webp, mp4, opus) |
| s= | Size in bytes |
| r= | Resolution (WxH) |
| dur= | Duration in seconds (video/audio) |
| br= | Bitrate in kbps (video/audio) |
| pg= | Page count (PDFs) |
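The descriptor layout lends itself to a short parser, and a parsed descriptor is also a convenient place to apply the variant fallback rule described earlier. The sketch below is illustrative only: the `Variant` dataclass, the function names, and the quality ordering used for fallback (`tn` < `sd` < `md` < `hd` < `xd`, leaving out `pf` and `orig`) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    code: str      # e.g. "vis.tn"
    blob_id: str   # e.g. "b1~abc123"
    attrs: dict = field(default_factory=dict)  # f, s, r, dur, br, pg as strings


def parse_descriptor(text: str) -> dict:
    """Split a 'd2,' descriptor into a mapping of variant code -> Variant."""
    version, _, body = text.partition(",")
    if version != "d2":
        raise ValueError(f"unsupported descriptor version: {version!r}")
    variants = {}
    for entry in body.split(";"):
        code, blob_id, *pairs = entry.split(":")
        attrs = dict(pair.split("=", 1) for pair in pairs)
        variants[code] = Variant(code, blob_id, attrs)
    return variants


# Assumed ordering from lowest to highest quality.
QUALITY_ORDER = ["tn", "sd", "md", "hd", "xd"]


def resolve(variants: dict, requested: str):
    """Return the requested variant, or the best lower-quality one, or None."""
    cls, quality = requested.split(".")
    start = QUALITY_ORDER.index(quality)
    for candidate in reversed(QUALITY_ORDER[: start + 1]):
        found = variants.get(f"{cls}.{candidate}")
        if found is not None:
            return found
    return None
```

Applied to the example descriptor above, a request for `vis.hd` resolves to `vis.sd`, because neither `vis.hd` nor `vis.md` is present.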
Processing Presets
Presets define which variants to generate for different use cases:
| Preset | Visual | Video | Audio | Use Case |
|---|---|---|---|---|
| default | vis.tn, vis.sd, vis.md, vis.hd | vid.sd, vid.md, vid.hd | aud.md | General uploads |
| profile-picture | vis.pf, vis.tn, vis.sd, vis.md, vis.hd | - | - | Profile images |
| cover | vis.tn, vis.sd, vis.md, vis.hd | - | - | Cover/banner images |
| high_quality | vis.tn, vis.sd, vis.md, vis.hd, vis.xd | vid.sd, vid.md, vid.hd, vid.xd | aud.md, aud.hd | Maximum quality |
| mobile | vis.tn, vis.sd, vis.md | vid.sd, vid.md | aud.sd | Optimized for mobile |
| archive | vis.tn only | - | - | Minimal (keeps original) |
| podcast | vis.tn | vid.sd | aud.sd, aud.md, aud.hd | Audio-focused |
| video | vis.tn, vis.sd, vis.md, vis.hd | vid.sd, vid.md, vid.hd | - | Video-focused |
| orig-only | - | - | - | Store original only, no processing |
| thumbnail-only | - | - | - | Generate thumbnail only, discard original |
| apkg | vis.pf (icon extraction) | - | - | App packages (zip) |
Presets that set store_original: true (default, high_quality, archive, podcast, video, orig-only, apkg) preserve the original file as orig. The profile-picture, cover, mobile, and thumbnail-only presets do not store the original.
The archive and orig-only presets also accept raw (unrecognized) file types. Other presets reject uploads with unsupported MIME types.
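A preset can be thought of as a small record listing quality codes per class plus the store_original flag. The sketch below models three rows of the table that way; the dictionary layout, the `plan_variants` name, and the use of a bare `orig` code for the original are assumptions, and it ignores cross-class outputs such as the thumbnail extracted from a video.

```python
# Three presets transcribed from the table; the structure itself is hypothetical.
PRESETS = {
    "default": {"vis": ["tn", "sd", "md", "hd"], "vid": ["sd", "md", "hd"],
                "aud": ["md"], "store_original": True},
    "mobile": {"vis": ["tn", "sd", "md"], "vid": ["sd", "md"],
               "aud": ["sd"], "store_original": False},
    "archive": {"vis": ["tn"], "vid": [], "aud": [], "store_original": True},
}


def plan_variants(preset: str, media_class: str) -> list:
    """List the variant codes a preset would produce for one media class."""
    cfg = PRESETS[preset]
    variants = [f"{media_class}.{quality}" for quality in cfg.get(media_class, [])]
    if cfg["store_original"]:
        variants.append("orig")
    return variants
```

For instance, a video uploaded with `default` plans `vid.sd`, `vid.md`, `vid.hd` and `orig`, while the same upload with `archive` keeps only `orig`.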
FFmpeg Integration
Video Transcoding
ffmpeg -i input.mov \
  -c:v libx264 -preset medium -crf 23 \
  -vf "scale=1280:720:force_original_aspect_ratio=decrease" \
  output.mp4
Audio Transcoding
ffmpeg -i input.mp3 \
  -c:a libopus -b:a 128k \
  output.opus
Thumbnail extraction seeks to 10% of video duration (min 3s) and extracts a single frame, which is then resized through the image processing pipeline.
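The seek rule can be sketched as below. The interpretation of "min 3s" as a lower bound on the seek position, and the fallback to a plain 10% for clips shorter than that bound, are assumptions; the function names are hypothetical. The command line uses only standard FFmpeg options (`-ss`, `-i`, `-frames:v`, `-y`) and is built but not executed here.

```python
def thumbnail_seek_seconds(duration: float) -> float:
    """Seek to 10% of the duration, but not earlier than 3 seconds."""
    seek = max(duration * 0.10, 3.0)
    if seek >= duration:
        # Assumption: very short clips fall back to a plain 10%.
        seek = duration * 0.10
    return seek


def frame_extract_args(src: str, dst: str, duration: float) -> list:
    """Build an ffmpeg argument list that grabs one frame at the seek point."""
    seek = thumbnail_seek_seconds(duration)
    return ["ffmpeg", "-y", "-ss", f"{seek:.3f}", "-i", src, "-frames:v", "1", dst]
```

The resulting list can be passed to `subprocess.run` on a host where `ffmpeg` is installed; the extracted frame would then go through the image resize step.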
Content-Addressing
All variants are content-addressed:
- Blob level: Raw bytes → b1~{SHA256(bytes)}
- Descriptor level: Descriptor string → f1~{SHA256(descriptor)}
This enables deduplication (identical files share blobs), integrity verification, and permanent caching of immutable content.
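The two hashing levels can be illustrated with `hashlib`. The text does not say how the digest is encoded after the `b1~` or `f1~` prefix, so the hexadecimal encoding used here is an assumption, as are the function names.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Blob level: hash the raw variant bytes (hex encoding is an assumption)."""
    return "b1~" + hashlib.sha256(data).hexdigest()


def file_id(descriptor: str) -> str:
    """Descriptor level: hash the UTF-8 bytes of the descriptor string."""
    return "f1~" + hashlib.sha256(descriptor.encode("utf-8")).hexdigest()
```

Because the ID is derived only from content, two uploads with identical bytes map to the same blob ID, which is what makes deduplication and integrity checks possible.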
Task Scheduling
File processing uses the task scheduler for asynchronous variant generation:
| Task Type | Description |
|---|---|
| image.resize | Resize image to target variant dimensions and format |
| video.transcode | Transcode video to target resolution and bitrate |
| audio.extract | Extract/transcode audio to OPUS at target bitrate |
| pdf.process | Extract page count (pdfinfo) and render first-page thumbnail (pdftoppm) |
| file.id_gen | Generate file descriptor after all variant tasks complete |
Dependencies ensure actions only reference fully processed files.
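The dependency relationship between variant tasks and file.id_gen can be modeled as a small graph. This sketch uses Python's standard `graphlib` (3.9+) purely to illustrate the ordering; it is not Cloudillo's scheduler, and the `task:variant` naming is made up for the example.

```python
from graphlib import TopologicalSorter


def execution_order(variant_tasks: list) -> list:
    """Return an order in which file.id_gen runs after every variant task."""
    graph = {task: set() for task in variant_tasks}  # variant tasks have no deps
    graph["file.id_gen"] = set(variant_tasks)        # depends on all of them
    return list(TopologicalSorter(graph).static_order())
```

Whatever order the variant tasks end up in, `file.id_gen` always comes last, so the descriptor is only generated once every blob it references exists.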
Federation Sync
When syncing files across instances, only file descriptors are synced initially. Variants are fetched on demand from the origin server and cached locally.
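Fetch-on-demand with local caching can be sketched as a small wrapper around a fetch function. Everything here is hypothetical: the `VariantCache` class, the in-memory dictionary standing in for local blob storage, and the hex-encoded hash check, which assumes blob IDs are `b1~` followed by the hex SHA-256 of the bytes.

```python
import hashlib
from typing import Callable


class VariantCache:
    """Fetch a blob from the origin on first use, then serve it locally."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin
        self._blobs = {}  # stand-in for local blob storage

    def get(self, blob_id: str) -> bytes:
        if blob_id not in self._blobs:
            data = self._fetch(blob_id)
            # Content addressing lets the receiver verify what it was sent.
            if "b1~" + hashlib.sha256(data).hexdigest() != blob_id:
                raise ValueError(f"hash mismatch for {blob_id}")
            self._blobs[blob_id] = data
        return self._blobs[blob_id]
```

Since blobs are immutable, a cached entry never needs revalidation against the origin.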
Error Handling
| Error | Action |
|---|---|
| Unknown format, preset allows Raw | Store as-is via raw handler |
| Unknown format, preset disallows Raw | Reject with “unsupported media type” error |
| FFmpeg failure | Log, mark task as failed, allow retry |
| Storage full | Queue for retry, alert admin |
| Timeout | Retry with extended timeout |
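The "retry with extended timeout" row can be illustrated with a generic helper. The attempt limit, the doubling factor, and the convention that a task signals a timeout by raising `TimeoutError` are all assumptions for this sketch.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_extended_timeout(task: Callable[[float], T], timeout: float,
                              max_attempts: int = 3, factor: float = 2.0) -> T:
    """Call task(timeout); on TimeoutError retry with a longer timeout."""
    for attempt in range(max_attempts):
        try:
            return task(timeout)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller mark the task failed
            timeout *= factor
    raise ValueError("max_attempts must be at least 1")
```

A task that times out twice with a 10-second budget would be retried with 20 and then 40 seconds before the error is propagated.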
See Also
- Files API - File upload endpoints
- Blob Storage - Storage layer
- Content-Addressing - Hash computation