Internals
How Yjs works under the hood: the mechanics behind CRDT synchronization.
The CRDT Model
Simplified Overview
This page provides a practical understanding of Yjs internals. For definitive details, consult the Yjs documentation.
Yjs is primarily operation-based: it stores and transmits operations (inserts, deletes). However, it also supports state encoding for snapshots and initial sync. This combination provides:
- Efficient sync - Only missing operations are transmitted based on vector clock comparison
- Full state snapshots - New clients can receive complete state without operation replay
- Compact updates - Ongoing changes are small binary operation deltas
Items and the Item List
Internally, all data is a linked list of “items”:
[item1] <-> [item2] <-> [item3] <-> [item4]
Each item contains:
- ID - Unique (clientId, clock) pair
- Content - The actual data
- Origin - Item this was inserted after
- Right Origin - Item this was inserted before
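The fields above can be pictured with a small TypeScript model. This is a simplified sketch, not Yjs's actual item class: the names `ItemId`, `SketchItem`, `makeItem`, `linkAfter`, and `readAll` are made up for illustration, and real items carry more state than shown here.

```typescript
// Simplified model of an item; NOT Yjs's real implementation.
interface ItemId {
  client: number // unique client ID
  clock: number  // that client's logical clock when the item was created
}

interface SketchItem {
  id: ItemId
  content: string
  origin: ItemId | null      // item this was inserted after, at creation time
  rightOrigin: ItemId | null // item this was inserted before, at creation time
  left: SketchItem | null    // current neighbours in the doubly linked list
  right: SketchItem | null
}

function makeItem(id: ItemId, content: string): SketchItem {
  return { id, content, origin: null, rightOrigin: null, left: null, right: null }
}

// Insert `item` directly after `left`, recording both origins.
function linkAfter(left: SketchItem, item: SketchItem): void {
  const right = left.right
  item.origin = left.id
  item.rightOrigin = right ? right.id : null
  item.left = left
  item.right = right
  left.right = item
  if (right) right.left = item
}

// Walk the list from a head item and join the contents.
function readAll(head: SketchItem): string {
  let out = ''
  for (let cur: SketchItem | null = head; cur !== null; cur = cur.right) {
    out += cur.content
  }
  return out
}

const a = makeItem({ client: 1, clock: 0 }, 'A')
const c = makeItem({ client: 1, clock: 1 }, 'C')
linkAfter(a, c)
const b = makeItem({ client: 2, clock: 0 }, 'B')
linkAfter(a, b) // client 2 inserts between A and C
```

In this model, `b.origin` points at `A` and `b.rightOrigin` points at `C`. The origins record the neighbours the inserting client saw, which is the information a merge needs when other clients insert in the same place.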
Client IDs and Clocks
Every client has a unique ID and logical clock. Item IDs are (clientId, clock) pairs, ensuring globally unique identifiers without coordination.
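As a rough illustration of why no coordination is needed, the sketch below models each client as a counter that stamps its own IDs. `SketchClient` and `nextId` are hypothetical names for this example only; in Yjs the client ID is available as `doc.clientID` and the clock bookkeeping is internal.

```typescript
// Hypothetical model: every client stamps items with (clientId, clock).
class SketchClient {
  readonly clientId: number
  private clock = 0

  constructor(clientId: number) {
    this.clientId = clientId
  }

  // Return the next ID and advance this client's clock.
  nextId(): { client: number; clock: number } {
    const id = { client: this.clientId, clock: this.clock }
    this.clock += 1
    return id
  }
}

const alice = new SketchClient(1)
const bob = new SketchClient(2)

// Both clients create items independently, with no communication.
const ids = [alice.nextId(), bob.nextId(), alice.nextId(), bob.nextId()]
const keys = ids.map(id => `${id.client}:${id.clock}`)
```

Uniqueness in this sketch rests on one assumption: no two clients share a client ID. Each client only ever increments its own clock, so the pairs it produces cannot collide with anyone else's.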
Vector Clocks
State vectors (the form of vector clock Yjs uses) track what each client has seen:
{
clientA: 15, // Has seen A's operations up to clock 15
clientB: 8, // Has seen B's operations up to clock 8
}
When syncing, only missing operations are sent based on vector clock comparison.
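The comparison can be sketched as a filter over operations. This is a toy model, not Yjs's wire format: `Op`, `stateVectorOf`, and `missingOps` are hypothetical names, and the sketch assumes the state vector stores the next clock a peer expects from each client.

```typescript
// Simplified model of state-vector-based sync; not Yjs's encoding.
type StateVector = Map<string, number> // client -> next clock this peer expects

interface Op {
  client: string
  clock: number
  payload: string
}

// Derive a state vector from the operations a peer holds.
// Assumes each client's clocks are contiguous and start at 0.
function stateVectorOf(ops: Op[]): StateVector {
  const sv: StateVector = new Map()
  for (const op of ops) {
    sv.set(op.client, Math.max(sv.get(op.client) ?? 0, op.clock + 1))
  }
  return sv
}

// Return only the operations the remote peer has not seen yet.
function missingOps(localOps: Op[], remote: StateVector): Op[] {
  return localOps.filter(op => op.clock >= (remote.get(op.client) ?? 0))
}

const peer1: Op[] = [
  { client: 'clientA', clock: 0, payload: 'h' },
  { client: 'clientA', clock: 1, payload: 'i' },
  { client: 'clientB', clock: 0, payload: '!' },
]
const peer2: Op[] = [{ client: 'clientA', clock: 0, payload: 'h' }]

// peer2 sends its state vector; peer1 answers with what is missing.
const toSend = missingOps(peer1, stateVectorOf(peer2))
```

Here `toSend` holds the second operation from `clientA` and the only operation from `clientB`. The operation both peers already share is never retransmitted.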
Conflict Resolution
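Y.Map: Each key holds a single value. Concurrent writes to the same key are resolved deterministically from item metadata (not wall-clock time), so every replica converges on the same winner.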
Y.Array: Concurrent insertions at same position are both preserved. Order determined by client ID.
Y.Text: Character-level merging. Concurrent insertions both appear; order by position and client ID.
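The array behaviour can be observed directly with two in-memory documents. This is a minimal sketch that assumes the `yjs` package is installed; the type name `'list'` and the variable names are arbitrary.

```typescript
import * as Y from 'yjs'

const docA = new Y.Doc()
const docB = new Y.Doc()

// Both clients insert at index 0 before seeing each other's change.
docA.getArray<string>('list').insert(0, ['from A'])
docB.getArray<string>('list').insert(0, ['from B'])

// Exchange full state in both directions.
const updateA = Y.encodeStateAsUpdate(docA)
const updateB = Y.encodeStateAsUpdate(docB)
Y.applyUpdate(docA, updateB)
Y.applyUpdate(docB, updateA)

const listA = docA.getArray<string>('list').toArray()
const listB = docB.getArray<string>('list').toArray()
// Both entries survive and both replicas agree on the order.
// Which entry comes first depends on the client IDs of the two documents.
```

The point of the example is convergence: neither insertion is lost, and the two documents end up with the same sequence regardless of the order in which the updates were applied.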
The Update Format
Changes are encoded as compact binary:
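import * as Y from 'yjs'
const yDoc = new Y.Doc()
// Emitted after each change, with the change encoded as a binary update
yDoc.on('update', (update: Uint8Array) => {
  // Send over network or store for persistence
})
// On the receiving side: apply an update that arrived from a peer
Y.applyUpdate(yDoc, receivedUpdate)
// Merge several stored updates (u1, u2, u3) into a single update
const merged = Y.mergeUpdates([u1, u2, u3])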
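These calls can be exercised end to end without a network. The sketch below assumes the `yjs` package is installed and uses an in-memory array (`stored`, a name chosen for this example) in place of a real persistence layer.

```typescript
import * as Y from 'yjs'

const source = new Y.Doc()
const stored: Uint8Array[] = []

// Collect each incremental update, as a persistence layer might.
source.on('update', (update: Uint8Array) => {
  stored.push(update)
})

source.getText('note').insert(0, 'hello')
source.getText('note').insert(5, ' world')

// Merge the stored updates into one and replay it into a fresh document.
const mergedUpdate = Y.mergeUpdates(stored)
const restored = new Y.Doc()
Y.applyUpdate(restored, mergedUpdate)

// A peer that is behind sends its state vector and gets back only the diff.
const peer = new Y.Doc()
const diff = Y.encodeStateAsUpdate(source, Y.encodeStateVector(peer))
Y.applyUpdate(peer, diff)
```

Both `restored` and `peer` end up with the same text as `source`. The first path replays stored updates; the second asks the source for whatever the peer's state vector shows it lacks.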
Garbage Collection
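Deleted items become tombstones, which are needed to resolve concurrent operations. With garbage collection enabled (the default), Yjs discards the content of deleted items and merges adjacent tombstones, but it keeps the metadata that records where the deletions happened. A tombstone can be dropped entirely only when its position no longer matters, for example when its parent type has been deleted.
GC Implications
Heavy editing accumulates tombstone metadata even with GC enabled. Very long-lived, heavily-edited documents may grow larger than expected.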
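One way to see the effect is to compare encoded sizes before and after a burst of edits that leaves no visible content. This is a minimal sketch assuming the `yjs` package is installed; exact byte counts depend on the Yjs version and are not shown.

```typescript
import * as Y from 'yjs'

const doc = new Y.Doc() // garbage collection is enabled by default
const text = doc.getText('body')

const emptySize = Y.encodeStateAsUpdate(doc).byteLength

// Type 1,000 characters one at a time, then delete all of them.
for (let i = 0; i < 1000; i++) {
  text.insert(i, 'x')
}
text.delete(0, text.length)

const finalSize = Y.encodeStateAsUpdate(doc).byteLength
// The visible text is empty again, but the encoded document is not
// back to its empty size: metadata about the deleted range remains.
```

Comparing `finalSize` with `emptySize` in a real document gives a rough measure of how much history it is carrying.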
Subdocuments
For large documents, split into subdocuments for lazy loading:
const mainDoc = new Y.Doc()
const subDoc = new Y.Doc({ guid: 'chapter-1' })
mainDoc.getMap('subdocs').set('chapter1', subDoc)
Performance Characteristics
The following are approximate complexities for typical use cases. Actual performance varies by implementation details, document structure, and operation history:
| Operation | Approximate Complexity |
|---|---|
| Map get/set | O(1) average |
| Array push | O(1) |
| Array insert at index | O(n) |
| Text insert | O(log n) typical* |
| Sync (diff) | O(changes) |
* Text insertion complexity depends on the document’s internal structure and edit history.
Space overhead: For typical documents, expect 2-10x the raw data size due to CRDT metadata and tombstones. Very long-lived, heavily-edited documents may accumulate more overhead. Documents with minimal edits will be closer to the lower bound.
See Also
- Conflict Resolution - Merging behavior
- Transactions - Batching operations