
Capacity Planning

Resource Requirements

Memory

| Component | Formula | Example (1M entries) |
|---|---|---|
| Raft log (in-memory) | entries * avg_size * 2 | 1M * 1KB * 2 = 2 GB |
| Tensor store index | entries * 64 bytes | 1M * 64 = 64 MB |
| HNSW index | vectors * dim * 4 * ef | 1M * 128 * 4 * 16 = 8 GB |
| Codebook | centroids * dim * 4 | 1024 * 128 * 4 = 512 KB |
| Connection buffers | peers * buffer_size | 10 * 64KB = 640 KB |

Recommended minimum: 16 GB for production
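As a quick sanity check, the memory formulas above can be combined in a short shell calculation. The variable names below are illustrative only (not real configuration keys), and the inputs are the example values from the table:

```shell
# Rough memory estimate using the formulas from the table above.
ENTRIES=1000000        # 1M entries
AVG_SIZE=1024          # 1 KB average entry
DIM=128                # embedding dimension
EF=16                  # HNSW ef factor
CENTROIDS=1024         # codebook centroids
PEERS=10
BUFFER=65536           # 64 KB per-peer buffer

RAFT_LOG=$((ENTRIES * AVG_SIZE * 2))
INDEX=$((ENTRIES * 64))
HNSW=$((ENTRIES * DIM * 4 * EF))
CODEBOOK=$((CENTROIDS * DIM * 4))
BUFFERS=$((PEERS * BUFFER))

# Integer division truncates, so this is a floor estimate in GiB.
TOTAL_GB=$(( (RAFT_LOG + INDEX + HNSW + CODEBOOK + BUFFERS) / 1024 / 1024 / 1024 ))
echo "Estimated memory: ${TOTAL_GB} GiB plus OS overhead"
```

Note that the HNSW index dominates at this dimension and ef; doubling `DIM` or `EF` roughly doubles total memory.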

Disk

| Component | Formula | Example |
|---|---|---|
| WAL | entries * avg_size | 10M * 1KB = 10 GB |
| Snapshots | state_size * 2 | 5 GB * 2 = 10 GB |
| Mmap cold storage | cold_entries * avg_size | 100M * 1KB = 100 GB |

Recommended: 3x expected data size for growth
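The disk formulas and the 3x growth headroom can be sketched the same way; again the variable names are illustrative, with inputs taken from the example column:

```shell
# Floor estimate (GiB) of disk needs, including 3x growth headroom.
ENTRIES=10000000          # 10M WAL entries
AVG_SIZE=1024             # 1 KB average entry
STATE_GB=5                # snapshot source state size
COLD_ENTRIES=100000000    # 100M cold entries

WAL_GB=$((ENTRIES * AVG_SIZE / 1024 / 1024 / 1024))
SNAP_GB=$((STATE_GB * 2))
COLD_GB=$((COLD_ENTRIES * AVG_SIZE / 1024 / 1024 / 1024))

TOTAL_GB=$(( (WAL_GB + SNAP_GB + COLD_GB) * 3 ))   # 3x for growth
echo "Provision at least ${TOTAL_GB} GiB of disk"
```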

Network

| Traffic Type | Formula | Example (100 TPS) |
|---|---|---|
| Replication | TPS * entry_size * (replicas-1) | 100 * 1KB * 2 = 200 KB/s |
| Gossip | nodes * fanout * state_size / interval | 5 * 3 * 1KB / 1s = 15 KB/s |
| Client | TPS * (request + response) | 100 * 2KB = 200 KB/s |

Recommended: 1 Gbps minimum, 10 Gbps for high throughput
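Combining the three traffic types gives total steady-state bandwidth. The sketch below uses the table's example values (a 3-replica cluster at 100 TPS); variable names are illustrative:

```shell
# Steady-state bandwidth estimate in KB/s from the formulas above.
TPS=100
ENTRY=1024            # 1 KB entry
REPLICAS=3
NODES=5; FANOUT=3; GOSSIP_STATE=1024; INTERVAL=1   # gossip parameters
REQ_RESP=2048         # 1 KB request + 1 KB response

REPL=$((TPS * ENTRY * (REPLICAS - 1)))
GOSSIP=$((NODES * FANOUT * GOSSIP_STATE / INTERVAL))
CLIENT=$((TPS * REQ_RESP))

TOTAL_KB=$(( (REPL + GOSSIP + CLIENT) / 1024 ))
echo "Estimated steady-state traffic: ${TOTAL_KB} KB/s"
```

At these rates a 1 Gbps link is far from saturated; the headroom matters for snapshot transfer and recovery traffic, which are bursty.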

CPU

| Operation | Complexity | Cores Needed |
|---|---|---|
| Consensus | O(1) per entry | 1 core |
| Embedding computation | O(dim) | 1-2 cores |
| HNSW search | O(log N * ef) | 2-4 cores |
| Conflict detection | O(concurrent_txs^2) | 1 core |

Recommended: 8+ cores for production

Sizing Examples

Small (Dev/Test)

  • 3 nodes
  • 4 cores, 8 GB RAM, 100 GB SSD each
  • Up to 1M entries, 10 TPS

Medium (Production)

  • 5 nodes
  • 8 cores, 32 GB RAM, 500 GB NVMe each
  • Up to 100M entries, 1000 TPS

Large (High-Scale)

  • 7+ nodes
  • 16+ cores, 64+ GB RAM, 2 TB NVMe each
  • 1B+ entries, 10k+ TPS
  • Consider sharding
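The three tiers above can be turned into a small lookup helper. `size_tier` is a hypothetical function (not part of any shipped tooling) that maps expected entry count and TPS to a tier using the thresholds listed:

```shell
# Hypothetical helper: pick a sizing tier from expected load,
# using the thresholds from the tiers above.
size_tier() {
  local entries=$1 tps=$2
  if [ "$entries" -le 1000000 ] && [ "$tps" -le 10 ]; then
    echo "small"       # 3 nodes, 4 cores, 8 GB RAM
  elif [ "$entries" -le 100000000 ] && [ "$tps" -le 1000 ]; then
    echo "medium"      # 5 nodes, 8 cores, 32 GB RAM
  else
    echo "large"       # 7+ nodes, consider sharding
  fi
}

size_tier 50000000 500
```

A workload exceeding either threshold (entries or TPS) should be sized for the larger tier, since the two limits stress different resources (memory vs. consensus throughput).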

Scaling Strategies

Vertical Scaling

When to use:

  • A single node is the bottleneck (CPU or memory)
  • Tight read-latency requirements

Actions:

  • Add RAM for larger in-memory log
  • Add cores for parallel embedding computation
  • Upgrade to NVMe for faster snapshots

Horizontal Scaling

When to use:

  • Throughput limited by consensus
  • Fault tolerance requirements

Actions:

  • Add read replicas (they do not participate in consensus)
  • Add consensus members (keep the total odd to preserve quorum)
  • Implement sharding by key range

Monitoring for Capacity

# Prometheus alerts
- alert: HighMemoryUsage
  expr: tensor_chain_memory_usage_bytes / tensor_chain_memory_limit_bytes > 0.85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Memory usage above 85%"

- alert: DiskSpaceLow
  expr: tensor_chain_disk_free_bytes < 10737418240  # 10 GB
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Less than 10 GB disk space remaining"

- alert: HighCPUUsage
  expr: rate(tensor_chain_cpu_seconds_total[5m]) > 0.9
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "CPU usage above 90%"

Growth Projections

# Calculate daily growth
neumann-admin stats --since "7d ago" --format json | jq '.entries_per_day'

# Project storage needs
DAILY_GROWTH=100000  # entries
ENTRY_SIZE=1024      # bytes
DAYS=365
GROWTH=$((DAILY_GROWTH * ENTRY_SIZE * DAYS / 1024 / 1024 / 1024))  # integer division truncates
echo "Projected annual growth: ${GROWTH} GB"