Distributed Transactions
tensor_chain implements distributed transactions using Two-Phase Commit (2PC) with semantic conflict detection.
Transaction Lifecycle
stateDiagram-v2
[*] --> Pending: begin()
Pending --> Preparing: prepare()
Preparing --> Prepared: all votes received
Prepared --> Committing: commit decision
Prepared --> Aborting: abort decision
Committing --> Committed: all acks
Aborting --> Aborted: all acks
Committed --> [*]
Aborted --> [*]
Two-Phase Commit
Phase 1: Prepare
- Coordinator sends
Prepareto all participants - Each participant:
- Acquires locks
- Validates constraints
- Writes to WAL
- Votes
YesorNo
Phase 2: Commit/Abort
- If all vote
Yes: Coordinator sendsCommit - If any vote
No: Coordinator sendsAbort - Participants apply or rollback
Message Types
| Message | Direction | Purpose |
|---|---|---|
TxPrepareMsg | Coordinator -> Participant | Start prepare phase |
TxVote | Participant -> Coordinator | Vote yes/no |
TxCommitMsg | Coordinator -> Participant | Commit decision |
TxAbortMsg | Coordinator -> Participant | Abort decision |
TxAck | Participant -> Coordinator | Acknowledge commit/abort |
Lock Management
Lock Types
| Lock | Compatibility | Use |
|---|---|---|
| Shared (S) | S-S compatible | Read operations |
| Exclusive (X) | Incompatible with all | Write operations |
Lock Acquisition
#![allow(unused)] fn main() { // Acquire lock with timeout let lock = lock_manager.acquire( tx_id, key, LockMode::Exclusive, Duration::from_secs(5), )?; }
Deadlock Detection
Wait-for graph analysis:
#![allow(unused)] fn main() { // Check for cycles before waiting if wait_graph.would_create_cycle(my_tx, blocking_tx) { // Abort to prevent deadlock return Err(DeadlockDetected); } // Register wait wait_graph.add_wait(my_tx, blocking_tx); }
Victim Selection
| Policy | Behavior |
|---|---|
| Youngest | Abort most recent transaction |
| Oldest | Abort longest-running |
| LowestPriority | Abort lowest priority |
| MostLocks | Abort holding most locks |
Semantic Conflict Detection
Beyond lock-based conflicts, tensor_chain detects semantic conflicts:
#![allow(unused)] fn main() { // Compute embedding deltas let delta_a = tx_a.compute_delta(); let delta_b = tx_b.compute_delta(); // Check for semantic overlap if delta_a.cosine_similarity(&delta_b) > CONFLICT_THRESHOLD { // Semantic conflict - need manual resolution return PrepareVote::Conflict { ... }; } }
Recovery
Coordinator Failure
- New coordinator queries participants for tx state
- If any committed: complete commit
- If all prepared: re-run commit decision
- Otherwise: abort
Participant Failure
- Participant replays WAL on restart
- For prepared transactions: query coordinator
- Apply commit or abort based on coordinator state
Configuration
#![allow(unused)] fn main() { pub struct DistributedTxConfig { /// Prepare phase timeout pub prepare_timeout_ms: u64, /// Commit phase timeout pub commit_timeout_ms: u64, /// Maximum concurrent transactions pub max_concurrent_tx: usize, /// Lock wait timeout pub lock_timeout_ms: u64, } }
Formal Verification
The 2PC protocol is formally specified in TwoPhaseCommit.tla and
exhaustively model-checked with TLC across 2.3M distinct states.
The model verifies Atomicity (all-or-nothing), NoOrphanedLocks,
ConsistentDecision, VoteIrrevocability, and DecisionStability.
See Formal Verification for full results.
Best Practices
- Keep transactions short: Long transactions increase conflict probability
- Order lock acquisition: Acquire locks in consistent order to prevent deadlocks
- Use appropriate isolation: Not all operations need serializable isolation
- Monitor deadlock rate: High rates indicate contention issues