Introduction

Neumann

Neumann is a unified tensor-based runtime that stores relational data, graph relationships, and vector embeddings in a single system. Instead of stitching together a SQL database, a graph store, and a vector index, Neumann gives you all three behind one query language.

Choose Your Path

I want to…                Go to
Try it in 5 minutes       Quick Start
Build a project with it   Five-Minute Tutorial
See what it can do        Use Cases
Understand the design     Architecture Overview
Use the Python SDK        Python Quickstart
Use the TypeScript SDK    TypeScript Quickstart
Look up a command         Query Language Reference

What Makes Neumann Different

One system, three engines. Store a table, connect entities in a graph, and search by vector similarity without moving data between systems.

-- Relational
CREATE TABLE documents (id INT PRIMARY KEY, title TEXT, author TEXT);
INSERT INTO documents VALUES (1, 'Intro to ML', 'Alice');

-- Graph
NODE CREATE topic { name: 'machine-learning' }
ENTITY CONNECT 'doc-1' -> 'topic-ml' : covers

-- Vector
EMBED STORE 'doc-1' [0.1, 0.2, 0.3, 0.4]

-- Cross-engine: find similar documents connected to a topic
SIMILAR 'doc-1' LIMIT 5 CONNECTED TO 'topic-ml'

Encrypted vault. Store secrets with AES-256-GCM encryption and graph-based access control.

LLM cache. Cache LLM responses with exact and semantic matching to reduce API costs.

Built-in consensus. Raft-based distributed consensus with 2PC transactions for multi-node deployments.

Architecture

                    +-------------------+
                    |   neumann_shell   |    Interactive CLI
                    |   neumann_server  |    gRPC server
                    +-------------------+
                             |
                    +-------------------+
                    |   query_router    |    Unified query execution
                    +-------------------+
                             |
        +----------+---------+---------+----------+
        |          |         |         |          |
   relational   graph    vector    tensor_    tensor_
    _engine    _engine  _engine     vault      cache
        |          |         |         |          |
        +----------+---------+---------+----------+
                             |
                    +-------------------+
                    |   tensor_store    |    Core storage (HNSW, sharded B-trees)
                    +-------------------+

Additional subsystems: tensor_blob (S3-style blob storage), tensor_chain (blockchain with Raft), tensor_checkpoint (snapshots), tensor_compress (tensor train decomposition).

Installation

Multiple installation methods are available depending on your needs.

The easiest way to install Neumann is using the install script:

curl -sSfL https://raw.githubusercontent.com/Shadylukin/Neumann/main/install.sh | bash

This script will:

  • Detect your platform (Linux x86_64, macOS x86_64, macOS ARM64)
  • Download a pre-built binary if available
  • Fall back to building from source if needed
  • Install to /usr/local/bin or ~/.local/bin
  • Install shell completions and man pages

Environment Variables

Variable                Description
NEUMANN_INSTALL_DIR     Custom installation directory
NEUMANN_VERSION         Install a specific version (e.g., v0.1.0)
NEUMANN_NO_MODIFY_PATH  Set to 1 to skip PATH modification
NEUMANN_SKIP_EXTRAS     Set to 1 to skip completions and man page installation

Homebrew (macOS/Linux)

brew tap Shadylukin/tap
brew install neumann

Cargo (crates.io)

If you have Rust installed:

cargo install neumann_shell

To install the gRPC server:

cargo install neumann_server

Docker

Interactive CLI

docker run -it shadylukinack/neumann:latest

Server Mode

docker run -d -p 9200:9200 -v neumann-data:/var/lib/neumann shadylukinack/neumann:server

Docker Compose

# Clone the repository
git clone https://github.com/Shadylukin/Neumann.git
cd Neumann

# Start the server
docker compose up -d neumann-server

# Run the CLI
docker compose run --rm neumann-cli

From Source

Requirements

  • Rust 1.75 or later
  • Cargo (included with Rust)
  • Git
  • protobuf compiler (for gRPC)

Build Steps

# Clone the repository
git clone https://github.com/Shadylukin/Neumann.git
cd Neumann

# Build in release mode
cargo build --release --package neumann_shell

# Install locally
cargo install --path neumann_shell

Run Tests

cargo test

SDK Installation

Python

pip install neumann-db

For embedded mode (in-process database):

pip install neumann-db[native]

TypeScript / JavaScript

npm install neumann-db

Or with yarn:

yarn add neumann-db

Verify Installation

neumann --version

Platform Support

Platform                     Binary  Homebrew  Docker  Source
Linux x86_64                 Yes     Yes       Yes     Yes
macOS x86_64                 Yes     Yes       Yes     Yes
macOS ARM64 (Apple Silicon)  Yes     Yes       Yes     Yes
Windows x86_64               No      No        Yes     Experimental

Troubleshooting

“command not found: neumann”

The binary may not be in your PATH. Try:

# Check where it was installed
which neumann || ls ~/.local/bin/neumann

# Add to PATH if needed
export PATH="$HOME/.local/bin:$PATH"

Build fails with protobuf errors

Install the protobuf compiler:

# macOS
brew install protobuf

# Ubuntu/Debian
sudo apt-get install protobuf-compiler

# Fedora
sudo dnf install protobuf-compiler

Permission denied during install

The installer tries /usr/local/bin first (requires sudo) then falls back to ~/.local/bin. You can specify a custom directory:

NEUMANN_INSTALL_DIR=~/bin \
  curl -sSfL https://raw.githubusercontent.com/Shadylukin/Neumann/main/install.sh | bash

Python SDK native module errors

If you get errors about the native module, ensure you have a Rust toolchain:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
pip install neumann-db[native]

Updating

Quick Install

Re-run the install script to get the latest version:

curl -sSfL https://raw.githubusercontent.com/Shadylukin/Neumann/main/install.sh | bash

Homebrew

brew upgrade neumann

Cargo

cargo install neumann_shell --force

Python SDK

pip install --upgrade neumann-db

TypeScript SDK

npm update neumann-db

Uninstalling

Quick Install / Cargo

rm $(which neumann)

Homebrew

brew uninstall neumann

Docker

docker rmi shadylukinack/neumann:latest shadylukinack/neumann:server
docker volume rm neumann-data

Python SDK

pip uninstall neumann-db

TypeScript SDK

npm uninstall neumann-db

Quick Start

Get up and running with Neumann in under 5 minutes. This guide walks you through relational queries, graph operations, vector search, and the cross-engine “wow” moment.

Start the Shell

# In-memory (data lost on exit)
neumann

# With persistence (recommended)
neumann --wal-dir ./data

You will see:

Neumann v0.1.0
Type 'help' for available commands.
neumann>

1. Relational Queries

Create a table and insert some data:

CREATE TABLE people (id INT PRIMARY KEY, name TEXT, role TEXT, team TEXT);

INSERT INTO people VALUES (1, 'Alice', 'Staff Engineer', 'Platform');
INSERT INTO people VALUES (2, 'Bob', 'Engineering Manager', 'Platform');
INSERT INTO people VALUES (3, 'Carol', 'Senior Engineer', 'ML');
INSERT INTO people VALUES (4, 'Dave', 'Junior Engineer', 'Platform');

Query it:

SELECT * FROM people WHERE team = 'Platform';
SELECT name, role FROM people ORDER BY name;
SELECT team, COUNT(*) AS headcount FROM people GROUP BY team;

2. Graph Operations

Create nodes with labels and properties:

NODE CREATE person { name: 'Alice', role: 'Staff Engineer' }
NODE CREATE person { name: 'Bob', role: 'Engineering Manager' }
NODE CREATE person { name: 'Carol', role: 'Senior Engineer' }
NODE CREATE person { name: 'Dave', role: 'Junior Engineer' }

List the nodes to see their auto-generated IDs:

NODE LIST person

Create edges (replace the IDs with the actual values from NODE LIST):

EDGE CREATE 'alice-node-id' -> 'bob-node-id' : reports_to
EDGE CREATE 'dave-node-id' -> 'bob-node-id' : reports_to
EDGE CREATE 'alice-node-id' -> 'dave-node-id' : mentors

Traverse the graph:

NEIGHBORS 'bob-node-id' INCOMING : reports_to
PATH SHORTEST 'dave-node-id' TO 'bob-node-id'

Run graph algorithms:

PAGERANK

3. Vector Search

Store embeddings with string keys:

EMBED STORE 'alice' [0.9, 0.4, 0.1, 0.7, 0.6, 0.3]
EMBED STORE 'bob' [0.6, 0.2, 0.1, 0.5, 0.3, 0.2]
EMBED STORE 'carol' [0.3, 0.9, 0.1, 0.4, 0.8, 0.1]
EMBED STORE 'dave' [0.4, 0.1, 0.2, 0.5, 0.2, 0.1]

Find similar items by key or by vector:

SIMILAR 'alice' LIMIT 3
SIMILAR [0.8, 0.5, 0.1, 0.6, 0.5, 0.2] LIMIT 3 METRIC COSINE

Check what is stored:

SHOW EMBEDDINGS
COUNT EMBEDDINGS

4. The Cross-Engine Moment

This is where Neumann shines. Combine graph traversal with vector similarity in a single query:

SIMILAR 'alice' LIMIT 3 CONNECTED TO 'bob-node-id'

This finds embeddings similar to Alice’s that are also connected to Bob in the graph. No joins across separate databases needed.

Search across all engines with FIND:

FIND NODE person WHERE name = 'Alice'

Create unified entities that span relational, graph, and vector storage:

ENTITY CREATE 'project-x' { name: 'Project X', status: 'active' } EMBEDDING [0.5, 0.3, 0.7, 0.2, 0.4, 0.1]
ENTITY GET 'project-x'

5. Persistence

Save a checkpoint:

CHECKPOINT 'my-first-checkpoint'
CHECKPOINTS

If you started with --wal-dir, your data persists across restarts. You can also save and load binary snapshots:

SAVE 'backup.bin'
LOAD 'backup.bin'

Sample Dataset

A ready-made dataset is available in samples/knowledge-base.nql. Load it with:

neumann --wal-dir ./data
neumann> \i samples/knowledge-base.nql

Five-Minute Tutorial: Build a Mini RAG System

This tutorial builds a retrieval-augmented generation (RAG) knowledge base using all three engines. By the end, you will have documents stored relationally, linked in a graph, searchable by embeddings, and protected with vault secrets.

Prerequisites

Start the shell with persistence:

neumann --wal-dir ./rag-data

Step 1: Create the Document Store

Use a relational table for structured document metadata:

CREATE TABLE documents (
    id INT PRIMARY KEY,
    title TEXT NOT NULL,
    category TEXT,
    author TEXT,
    created TEXT
);

INSERT INTO documents VALUES (1, 'Intro to Neural Networks', 'ml', 'Alice', '2024-01-15');
INSERT INTO documents VALUES (2, 'Transformer Architecture', 'ml', 'Bob', '2024-02-20');
INSERT INTO documents VALUES (3, 'Database Indexing', 'systems', 'Carol', '2024-03-10');
INSERT INTO documents VALUES (4, 'Vector Search at Scale', 'systems', 'Alice', '2024-04-05');
INSERT INTO documents VALUES (5, 'Fine-Tuning LLMs', 'ml', 'Dave', '2024-05-12');
INSERT INTO documents VALUES (6, 'Consensus Protocols', 'distributed', 'Eve', '2024-06-01');

Verify:

SELECT * FROM documents ORDER BY id;
SELECT category, COUNT(*) FROM documents GROUP BY category;

Step 2: Add Graph Relationships

Create nodes for documents and topics, then link them:

NODE CREATE document { title: 'Intro to Neural Networks', doc_id: 1 }
NODE CREATE document { title: 'Transformer Architecture', doc_id: 2 }
NODE CREATE document { title: 'Database Indexing', doc_id: 3 }
NODE CREATE document { title: 'Vector Search at Scale', doc_id: 4 }
NODE CREATE document { title: 'Fine-Tuning LLMs', doc_id: 5 }
NODE CREATE document { title: 'Consensus Protocols', doc_id: 6 }

NODE CREATE topic { name: 'machine-learning' }
NODE CREATE topic { name: 'systems' }
NODE CREATE topic { name: 'distributed-systems' }

List nodes to get IDs:

NODE LIST document
NODE LIST topic

Create edges linking documents to topics (use actual IDs from NODE LIST):

-- Documents 1, 2, 5 cover machine-learning
-- Documents 3, 4 cover systems
-- Documents 4, 6 cover distributed-systems
-- Document 2 references document 1
-- Document 5 references documents 1 and 2

Create citation relationships between documents:

EDGE CREATE 'doc2-id' -> 'doc1-id' : cites
EDGE CREATE 'doc5-id' -> 'doc1-id' : cites
EDGE CREATE 'doc5-id' -> 'doc2-id' : cites

Check the graph:

NEIGHBORS 'doc1-id' INCOMING : cites
PATH SHORTEST 'doc5-id' TO 'doc1-id'

Step 3: Store Embeddings

Each document gets a vector representing its content. In a real system, you would generate these with an embedding model (e.g., OpenAI, Cohere, or a local model). Here we use hand-crafted 6-dimensional vectors:

-- [neural-nets, transformers, databases, vectors, llms, distributed]
EMBED STORE 'doc-1' [0.9, 0.3, 0.1, 0.2, 0.2, 0.1]
EMBED STORE 'doc-2' [0.7, 0.95, 0.1, 0.1, 0.4, 0.1]
EMBED STORE 'doc-3' [0.1, 0.1, 0.95, 0.3, 0.0, 0.2]
EMBED STORE 'doc-4' [0.2, 0.1, 0.5, 0.9, 0.1, 0.4]
EMBED STORE 'doc-5' [0.6, 0.5, 0.1, 0.1, 0.95, 0.1]
EMBED STORE 'doc-6' [0.1, 0.1, 0.3, 0.2, 0.1, 0.9]

Search by similarity:

-- Find documents similar to "Intro to Neural Networks"
SIMILAR 'doc-1' LIMIT 3

-- Search with a custom query vector (someone asking about transformers + LLMs)
SIMILAR [0.5, 0.8, 0.1, 0.1, 0.7, 0.1] LIMIT 3 METRIC COSINE

Step 4: Graph-Aware Retrieval

Combine vector similarity with graph connectivity. Find documents similar to a query vector that are also connected to a specific topic node:

SIMILAR [0.8, 0.6, 0.1, 0.1, 0.5, 0.1] LIMIT 3 CONNECTED TO 'ml-topic-id'

This is the core RAG pattern: retrieve relevant documents using embedding similarity, then filter by graph relationships for context-aware results.

Step 5: Protect API Keys with Vault

Store the embedding API key securely:

VAULT SET 'openai_api_key' 'sk-proj-abc123...'
VAULT SET 'cohere_api_key' 'co-xyz789...'
VAULT GRANT 'alice' ON 'openai_api_key'

Retrieve when needed:

VAULT GET 'openai_api_key'
VAULT LIST

Step 6: Cache LLM Responses

Initialize the cache and store responses to avoid repeated API calls:

CACHE INIT

CACHE PUT 'what are transformers?' 'Transformers are a neural network architecture based on self-attention mechanisms...'

CACHE SEMANTIC PUT 'explain attention mechanism' 'The attention mechanism allows models to focus on relevant parts of the input...' EMBEDDING [0.6, 0.9, 0.1, 0.1, 0.3, 0.1]

On subsequent queries, check the cache first:

CACHE GET 'what are transformers?'
CACHE SEMANTIC GET 'how does attention work?' THRESHOLD 0.8

Step 7: Checkpoint

Save your work:

CHECKPOINT 'rag-setup-complete'
CHECKPOINTS

What You Built

You now have a working RAG knowledge base with:

  1. Structured metadata in a relational table (searchable with SQL)
  2. Semantic relationships in a graph (topics, citations, authorship)
  3. Vector embeddings for similarity search
  4. Graph-aware retrieval combining similarity and structure
  5. Encrypted secrets for API key management
  6. Response caching to reduce LLM API costs
  7. Checkpoint for safe rollback

Use Cases

Neumann is designed for applications that need two or more of: structured data, relationships, and semantic search. Here are concrete examples with query patterns.


RAG Application

Problem: Build a retrieval-augmented generation system that retrieves relevant context for an LLM.

Why Neumann: Traditional RAG uses a vector store for retrieval. Neumann adds graph relationships (document structure, citations, topics) and relational metadata (authors, dates, permissions) to improve retrieval quality.

Schema

CREATE TABLE documents (
    id INT PRIMARY KEY,
    title TEXT NOT NULL,
    source TEXT,
    created TEXT,
    chunk_count INT
);
NODE CREATE collection { name: 'engineering-docs' }
-- For each document:
NODE CREATE document { title: 'Design Doc: Auth System', doc_id: 1 }
-- Link to collection:
EDGE CREATE 'doc-node-id' -> 'collection-id' : belongs_to
-- Link related documents:
EDGE CREATE 'doc-a-id' -> 'doc-b-id' : references

Indexing

-- Store chunk embeddings (one per document chunk)
EMBED STORE 'doc-1-chunk-0' [0.1, 0.2, ...]
EMBED STORE 'doc-1-chunk-1' [0.3, 0.1, ...]

Retrieval

-- Basic: find similar chunks
SIMILAR [0.2, 0.3, ...] LIMIT 10 METRIC COSINE

-- Graph-aware: find chunks connected to a specific collection
SIMILAR [0.2, 0.3, ...] LIMIT 10 CONNECTED TO 'collection-id'

-- Check cache before calling LLM
CACHE SEMANTIC GET 'how does auth work?' THRESHOLD 0.85

Store LLM response

CACHE SEMANTIC PUT 'how does auth work?' 'The auth system uses JWT tokens...' EMBEDDING [0.2, 0.3, ...]

Agent Memory

Problem: Give an AI agent persistent memory that supports both exact recall and semantic search, with conversation structure.

Why Neumann: Agents need to recall specific facts (relational), navigate conversation history (graph), and find semantically related memories (vector).

Schema

CREATE TABLE memories (
    id INT PRIMARY KEY,
    content TEXT NOT NULL,
    memory_type TEXT,
    importance FLOAT,
    created TEXT
);
-- Create session nodes
NODE CREATE session { name: 'session-2024-01-15', user: 'alice' }

-- Create memory nodes
NODE CREATE memory { content: 'User prefers dark mode', type: 'preference' }

-- Link memories to sessions
EDGE CREATE 'memory-id' -> 'session-id' : observed_in

-- Link related memories
EDGE CREATE 'memory-a' -> 'memory-b' : related_to

Store a Memory

INSERT INTO memories VALUES (1, 'User prefers dark mode', 'preference', 0.8, '2024-01-15');
NODE CREATE memory { content: 'User prefers dark mode', memory_id: 1 }
EMBED STORE 'memory-1' [0.1, 0.3, ...]

Recall

-- Semantic recall: find memories similar to current context
SIMILAR [0.2, 0.3, ...] LIMIT 5

-- Structured recall: get high-importance memories
SELECT * FROM memories WHERE importance > 0.7 ORDER BY importance DESC LIMIT 10

-- Graph recall: get memories from a specific session
NEIGHBORS 'session-id' INCOMING : observed_in

Knowledge Graph

Problem: Build a knowledge graph of entities with properties, relationships, and similarity search.

Why Neumann: Knowledge graphs need rich entity properties (relational), typed relationships (graph), and entity similarity (vector) in a single queryable system.

Build the Graph

-- Create entity types
NODE CREATE company { name: 'Acme Corp', industry: 'Technology', founded: 2010 }
NODE CREATE person { name: 'Jane Smith', title: 'CEO' }
NODE CREATE product { name: 'Acme Cloud', category: 'IaaS' }

-- Create relationships
EDGE CREATE 'jane-id' -> 'acme-id' : works_at { role: 'CEO', since: '2018' }
EDGE CREATE 'acme-id' -> 'product-id' : produces

Add Embeddings

-- Embed entity descriptions for similarity search
EMBED STORE 'acme-corp' [0.8, 0.3, 0.1, ...]
EMBED STORE 'jane-smith' [0.2, 0.7, 0.5, ...]
EMBED STORE 'acme-cloud' [0.9, 0.4, 0.2, ...]

Query

-- Find entities similar to a description
SIMILAR [0.7, 0.3, 0.2, ...] LIMIT 5

-- Discover connections
NEIGHBORS 'acme-id' BOTH
PATH SHORTEST 'jane-id' TO 'product-id'

-- Graph analytics
PAGERANK
LOUVAIN
BETWEENNESS

Permission-Aware Search

Problem: Build a search system where different users see different results based on permissions.

Why Neumann: Vault stores access tokens, graph models the permission hierarchy, and vector search handles the retrieval. No external auth system needed.

Setup Permissions

-- Store API keys securely
VAULT SET 'admin_key' 'ak-admin-secret'
VAULT SET 'user_key' 'ak-user-secret'

-- Grant access based on roles
VAULT GRANT 'alice' ON 'admin_key'
VAULT GRANT 'bob' ON 'user_key'

-- Build permission graph
NODE CREATE role { name: 'admin' }
NODE CREATE role { name: 'viewer' }
NODE CREATE resource { name: 'confidential-docs' }

EDGE CREATE 'admin-role-id' -> 'resource-id' : can_access

Query with Access Check

-- Find documents similar to query
SIMILAR [0.3, 0.5, ...] LIMIT 10

-- Verify access through graph
NEIGHBORS 'admin-role-id' OUTGOING : can_access

-- Rotate keys periodically
VAULT ROTATE 'admin_key' 'ak-new-admin-secret'

Common Patterns

Checkpoint Before Migrations

CHECKPOINT 'before-schema-v2'
-- Run migration...
-- If something goes wrong:
ROLLBACK TO 'before-schema-v2'

Blob Attachments

-- Upload a file and link it to an entity
BLOB INIT
BLOB PUT 'report.pdf' FROM '/tmp/report.pdf' TAG 'quarterly' LINK 'entity-id'

-- Find all blobs for an entity
BLOBS FOR 'entity-id'

-- Find blobs by tag
BLOBS BY TAG 'quarterly'

Chain for Audit Trail

-- Start a chain transaction for auditable operations
BEGIN CHAIN TRANSACTION
INSERT INTO audit_log VALUES (1, 'user_created', 'alice', '2024-01-15');
COMMIT CHAIN

-- Verify integrity
CHAIN VERIFY
CHAIN HISTORY 'audit_log/1'

Building from Source

Development Requirements

  • Rust 1.75+ (stable)
  • Rust nightly (for fuzzing)
  • Git

Clone and Build

git clone https://github.com/Shadylukin/Neumann.git
cd Neumann

# Debug build
cargo build

# Release build (optimized)
cargo build --release

Running Tests

# Run all tests
cargo test

# Run tests for a specific crate
cargo test -p tensor_chain

# Run with output
cargo test -- --nocapture

Quality Checks

All code must pass the following checks before commit:

# Formatting
cargo fmt --check

# Lints (warnings as errors)
cargo clippy -- -D warnings

# Documentation builds
cargo doc --no-deps

Fuzzing

Requires nightly Rust:

# Install cargo-fuzz
cargo install cargo-fuzz

# Run a fuzz target
cd fuzz
cargo +nightly fuzz run parser_parse -- -max_total_time=60

Running the Shell

# Debug mode
cargo run -p neumann_shell

# Release mode
cargo run --release -p neumann_shell

IDE Setup

VS Code

Install the rust-analyzer extension. Recommended settings:

{
  "rust-analyzer.checkOnSave.command": "clippy",
  "rust-analyzer.cargo.features": "all"
}

IntelliJ/CLion

Install the Rust plugin. Enable clippy in settings.

Project Structure

Neumann/
├── tensor_store/       # Core storage layer
├── relational_engine/  # SQL-like tables
├── graph_engine/       # Graph operations
├── vector_engine/      # Embeddings
├── tensor_chain/       # Distributed consensus
├── neumann_parser/     # Query parsing
├── query_router/       # Query execution
├── neumann_shell/      # CLI interface
├── tensor_compress/    # Compression
├── tensor_vault/       # Encrypted storage
├── tensor_cache/       # LLM caching
├── tensor_blob/        # Blob storage
├── tensor_checkpoint/  # Snapshots
├── tensor_unified/     # Multi-engine facade
├── fuzz/               # Fuzz targets
└── docs/               # Documentation

Architecture Overview

Neumann is a unified tensor-based runtime that stores relational data, graph relationships, and vector embeddings in a single mathematical structure.

System Architecture

flowchart TB
    subgraph Client Layer
        Shell[neumann_shell]
    end

    subgraph Query Layer
        Router[query_router]
        Parser[neumann_parser]
    end

    subgraph Engine Layer
        RE[relational_engine]
        GE[graph_engine]
        VE[vector_engine]
    end

    subgraph Storage Layer
        TS[tensor_store]
        TC[tensor_compress]
    end

    subgraph Extended Modules
        Vault[tensor_vault]
        Cache[tensor_cache]
        Blob[tensor_blob]
        Check[tensor_checkpoint]
        Unified[tensor_unified]
        Chain[tensor_chain]
    end

    Shell --> Router
    Router --> Parser
    Router --> RE
    Router --> GE
    Router --> VE
    Router --> Vault
    Router --> Cache
    Router --> Blob
    Router --> Chain

    RE --> TS
    GE --> TS
    VE --> TS
    Vault --> TS
    Cache --> TS
    Blob --> TS
    Check --> TS
    Unified --> RE
    Unified --> GE
    Unified --> VE
    Chain --> TS
    Chain --> GE
    Chain --> Check

    TS --> TC

Module Dependencies

Module             Purpose                           Depends On
tensor_store       Key-value storage layer           tensor_compress
relational_engine  SQL-like tables with indexes      tensor_store
graph_engine       Graph nodes and edges             tensor_store
vector_engine      Embeddings and similarity search  tensor_store
tensor_compress    Compression algorithms            (none)
tensor_vault       Encrypted secret storage          tensor_store, graph_engine
tensor_cache       Semantic LLM response caching     tensor_store
tensor_blob        S3-style chunked blob storage     tensor_store
tensor_checkpoint  Atomic snapshot/restore           tensor_store
tensor_unified     Multi-engine unified storage      all engines
tensor_chain       Tensor-native blockchain          tensor_store, graph_engine, tensor_checkpoint
neumann_parser     Query tokenization and parsing    (none)
query_router       Unified query execution           all engines, parser
neumann_shell      Interactive CLI interface         query_router

Key Design Principles

Unified Data Model

All data is represented as tensors:

  • Scalars: Single values (int, float, string, bool)
  • Vectors: Dense or sparse embeddings
  • Pointers: References to other entities

Thread Safety

All engines use DashMap for concurrent access:

  • Sharded locks for write throughput
  • No lock poisoning
  • Read operations are lock-free

Composability

Engines can be composed:

  • Use relational_engine alone for SQL workloads
  • Combine with graph_engine for relationship queries
  • Add vector_engine for similarity search

Data Flow

  1. Query Parsing: neumann_parser tokenizes and parses input
  2. Query Routing: query_router dispatches to appropriate engine
  3. Execution: Engine performs operation using tensor_store
  4. Storage: tensor_store persists data with optional compression

Distributed Architecture (tensor_chain)

For distributed deployments:

flowchart LR
    subgraph Cluster
        L[Leader]
        F1[Follower 1]
        F2[Follower 2]
    end

    C[Client] --> L
    L --> F1
    L --> F2
    F1 -.-> L
    F2 -.-> L

  • Raft Consensus: Leader election and log replication
  • 2PC Transactions: Cross-shard atomic operations
  • SWIM Gossip: Membership and failure detection

Tensor Store Architecture

The tensor_store crate is the foundational storage layer for Neumann. It provides a unified tensor-based key-value store that holds all data - relational, graph, and vector - in a single mathematical structure. The store knows nothing about queries; it purely stores and retrieves tensors by key.

The architecture uses SlabRouter internally, which routes operations to specialized slabs based on key prefixes. This design eliminates hash table resize stalls by using BTreeMap-based storage, providing predictable O(log n) performance without the throughput cliffs caused by hash map resizing.

Core Types

TensorValue

Represents different types of values a tensor can hold.

Variant                Rust Type    Use Case
Scalar(ScalarValue)    enum         Properties (name, age, active)
Vector(Vec<f32>)       dense array  Embeddings for similarity search
Sparse(SparseVector)   compressed   Sparse embeddings (>70% zeros)
Pointer(String)        single ref   Single relationship to another tensor
Pointers(Vec<String>)  multi ref    Multiple relationships

Automatic Sparsification: Use TensorValue::from_embedding_auto(dense) to automatically choose between dense and sparse representation based on sparsity:

// Automatically uses Sparse if sparsity >= 70%
let val = TensorValue::from_embedding_auto(dense_vec);

// With custom thresholds (value_threshold, sparsity_threshold)
let val = TensorValue::from_embedding(dense_vec, 0.01, 0.8);
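As a concrete illustration of the sparsity rule, the dense-versus-sparse decision comes down to counting near-zero entries. This is a standalone sketch with assumed helper names (`sparsity`, `should_sparsify`), not the actual TensorValue internals:

```rust
/// Fraction of entries whose magnitude is at or below `value_threshold`.
/// A hypothetical standalone version of the sparsity check; the real
/// TensorValue methods may differ in detail.
fn sparsity(dense: &[f32], value_threshold: f32) -> f32 {
    if dense.is_empty() {
        return 0.0;
    }
    let zeros = dense.iter().filter(|v| v.abs() <= value_threshold).count();
    zeros as f32 / dense.len() as f32
}

/// Decide whether a vector should be stored sparse (sparsity >= 70%).
fn should_sparsify(dense: &[f32]) -> bool {
    sparsity(dense, 0.0) >= 0.7
}
```

A vector like [0.0, 0.0, 0.0, 1.0] is 75% zeros, so it crosses the 70% threshold and would be stored sparse.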

Vector Operations: TensorValue supports cross-format operations:

// Dot product works across Dense, Sparse, and mixed
let dot = tensor_a.dot(&tensor_b);

// Cosine similarity with automatic format handling
let sim = tensor_a.cosine_similarity(&tensor_b);
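For dense vectors, cosine similarity reduces to the dot product over the product of norms. A minimal standalone sketch (the real TensorValue method also handles sparse and mixed formats):

```rust
/// Cosine similarity of two dense vectors: dot(a, b) / (|a| * |b|).
/// Standalone sketch; zero vectors are defined to have similarity 0.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}
```

Orthogonal vectors score 0, parallel vectors score 1, which is why SIMILAR … METRIC COSINE ranks direction rather than magnitude.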

ScalarValue

Variant  Rust Type     Example
Null     (no payload)  Missing/undefined value
Bool     bool          true, false
Int      i64           42, -1
Float    f64           3.14159
String   String        "Alice"
Bytes    Vec<u8>       Raw binary data

TensorData

An entity that holds scalar properties, vector embeddings, and pointers to other tensors via a HashMap<String, TensorValue> internally.

Reserved Field Names

Field       Purpose                       Used By
_out        Outgoing graph edge pointers  GraphEngine
_in         Incoming graph edge pointers  GraphEngine
_embedding  Vector embedding              VectorEngine
_label      Entity type/label             GraphEngine
_type       Discriminator field           All engines
_from       Edge source                   GraphEngine
_to         Edge target                   GraphEngine
_edge_type  Edge relationship type        GraphEngine
_directed   Edge direction flag           GraphEngine
_table      Table membership              RelationalEngine
_id         Entity ID                     System

Architecture Diagram

TensorStore
  |
  +-- Arc<SlabRouter>
         |
         +-- MetadataSlab (general key-value, BTreeMap-based)
         +-- EntityIndex (sorted vocabulary + hash index)
         +-- EmbeddingSlab (dense f32 arrays)
         +-- GraphTensor (CSR format for edges)
         +-- RelationalSlab (columnar storage)
         +-- CacheRing (LRU/LFU eviction)
         +-- BlobLog (append-only blob storage)

SlabRouter Internals

SlabRouter is the core routing layer that directs operations to specialized storage backends based on key prefixes.

Key Routing Algorithm

flowchart TD
    A[put/get/delete key] --> B{Classify Key}
    B -->|emb:*| C[EmbeddingSlab + MetadataSlab]
    B -->|node:* / edge:*| D[GraphTensor via MetadataSlab]
    B -->|table:*| E[RelationalSlab via MetadataSlab]
    B -->|_cache:*| F[CacheRing]
    B -->|Everything else| G[MetadataSlab]

Key Classification

Prefix           Key Class  Slab                         Purpose
emb:*            Embedding  EmbeddingSlab + EntityIndex  Embedding vectors with stable ID assignment
node:*, edge:*   Graph      MetadataSlab                 Graph nodes and edges
table:*          Table      MetadataSlab                 Relational rows
_cache:*         Cache      CacheRing                    Cached data with eviction
_blob:*          Metadata   MetadataSlab                 Blob metadata (chunks stored separately)
Everything else  Metadata   MetadataSlab                 General key-value storage
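The classification above can be sketched as a simple prefix match. The function name mirrors the classify_key used in the operation-flow listings, though the real implementation may differ in detail:

```rust
#[derive(Debug, PartialEq)]
enum KeyClass {
    Embedding,
    Graph,
    Table,
    Cache,
    Metadata,
}

/// Route a key to its slab class by prefix, following the table above.
fn classify_key(key: &str) -> KeyClass {
    if key.starts_with("emb:") {
        KeyClass::Embedding
    } else if key.starts_with("node:") || key.starts_with("edge:") {
        KeyClass::Graph
    } else if key.starts_with("table:") {
        KeyClass::Table
    } else if key.starts_with("_cache:") {
        KeyClass::Cache
    } else {
        // _blob:* metadata and everything else land in MetadataSlab
        KeyClass::Metadata
    }
}
```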

SlabRouter Operation Flow

PUT Operation:

fn put(&self, key: &str, value: TensorData) {
    match classify_key(key) {
        KeyClass::Embedding => {
            // 1. Get or create stable entity ID
            let entity_id = self.index.get_or_create(key);
            // 2. Extract and store embedding vector
            if let Some(TensorValue::Vector(vec)) = value.get("_embedding") {
                self.embeddings.set(entity_id, vec);
            }
            // 3. Store full metadata
            self.metadata.set(key, value);
        }
        KeyClass::Cache => {
            let size = estimate_size(&value);
            self.cache.put(key, value, 1.0, size);
        }
        _ => self.metadata.set(key, value),
    }
}

GET Operation:

fn get(&self, key: &str) -> Result<TensorData> {
    match classify_key(key) {
        KeyClass::Embedding => {
            // Try to reconstruct from embedding slab + metadata
            if let Some(entity_id) = self.index.get(key) {
                if let Some(vector) = self.embeddings.get(entity_id) {
                    let mut data = self.metadata.get(key).unwrap_or_default();
                    data.set("_embedding", TensorValue::Vector(vector));
                    return Ok(data);
                }
            }
            self.metadata.get(key)
        }
        KeyClass::Cache => self.cache.get(key),
        _ => self.metadata.get(key),
    }
}

Specialized Slabs

Slab            Data Structure                              Purpose
MetadataSlab    RwLock<BTreeMap<String, TensorData>>        General key-value storage
EntityIndex     Sorted vocabulary + hash index              Stable ID assignment
EmbeddingSlab   Dense f32 arrays + BTreeMap                 Embedding vectors
GraphTensor     CSR format (row pointers + column indices)  Graph edges
RelationalSlab  Columnar storage                            Table rows
CacheRing       Ring buffer with LRU/LFU                    Fixed-size cache
BlobLog         Append-only segments                        Large binary data

Performance Characteristics

Operation Complexity

Operation        Time Complexity   Notes
put              O(log n)          BTreeMap insert
get              O(log n) + clone  Clone prevents reference issues
delete           O(log n)          BTreeMap remove
exists           O(log n)          BTreeMap lookup
scan             O(k + log n)      BTreeMap range, k = result count
scan_count       O(k + log n)      No allocation
scan_filter_map  O(k + log n)      Single-pass filter with selective cloning
len              O(1)              Cached count
clear            O(n)              Clears all data
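The O(k + log n) scan bound follows directly from BTreeMap range queries: one O(log n) seek to the first key carrying the prefix, then a walk over the k matching entries. A minimal stdlib sketch (not the slab code itself):

```rust
use std::collections::BTreeMap;

/// Collect all keys under `prefix`: one O(log n) seek plus k iterations.
fn scan_prefix(map: &BTreeMap<String, i64>, prefix: &str) -> Vec<String> {
    map.range(prefix.to_string()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, _)| k.clone())
        .collect()
}
```

Because BTreeMap keeps keys sorted, a prefix scan never touches entries outside the prefix range, unlike a hash map which would have to visit every bucket.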

Throughput Comparison

Metric                    SlabRouter        Previous (DashMap)
PUT throughput            3.1+ M ops/sec    2.5 M ops/sec
GET throughput            4.9+ M ops/sec    4.5 M ops/sec
Throughput variance (CV)  12% steady-state  222% during resize
Resize stalls             None              99.6% throughput drops

Optimized Scan Performance

Use scan_filter_map for selective queries to avoid cloning non-matching entries:

// Old path: 5000 clones for 5000 rows, ~2.6ms
let users = store.scan("users:");
let matches: Vec<_> = users.iter()
    .filter_map(|key| store.get(key).ok())
    .filter(|data| /* condition */)
    .collect();

// New path: 250 clones for 5% match rate, ~0.13ms (20x faster)
let matches = store.scan_filter_map("users:", |key, data| {
    if /* condition */ {
        Some(data.clone())
    } else {
        None
    }
});

Concurrency Model

TensorStore uses tensor-based structures instead of hash maps for predictable performance:

  • No Resize Stalls: BTreeMap and sorted arrays grow incrementally
  • Concurrent Reads: RwLock allows many simultaneous readers; reads only block during writes
  • Predictable Writes: O(log n) inserts, no amortized O(n) resizing
  • Clone on Read: get() returns cloned data to avoid holding references
  • Shareable Storage: TensorStore clones share the same underlying data via Arc

BloomFilter

The BloomFilter provides O(1) probabilistic rejection of non-existent keys, useful for sparse key spaces where most lookups are misses.

Mathematical Foundation

The Bloom filter uses optimal parameters calculated as:

Bit array size: m = -n * ln(p) / (ln(2)^2)

  • Where n = expected items, p = false positive rate

Number of hash functions: k = (m/n) * ln(2)

  • Clamped to range 1 to 16
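These formulas can be checked directly against the tuning table later in this section. A minimal sketch (the helper name `bloom_params` is illustrative, not part of the crate API):

```rust
/// Optimal Bloom filter parameters for `n` expected items and
/// target false-positive rate `p` (illustrative helper, not crate API).
fn bloom_params(n: f64, p: f64) -> (usize, usize) {
    let ln2 = std::f64::consts::LN_2;
    // m = -n * ln(p) / (ln 2)^2
    let m = (-n * p.ln() / (ln2 * ln2)).ceil() as usize;
    // k = (m / n) * ln 2, clamped to 1..=16
    let k = (((m as f64 / n) * ln2).round() as usize).clamp(1, 16);
    (m, k)
}

fn main() {
    // 10,000 items at 1% FP rate
    let (m, k) = bloom_params(10_000.0, 0.01);
    println!("m = {m} bits, k = {k} hashes"); // m = 95851 bits, k = 7 hashes
}
```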

Implementation Details

#![allow(unused)]
fn main() {
pub struct BloomFilter {
    bits: Box<[AtomicU64]>,  // Atomic u64 blocks for lock-free access
    num_bits: usize,
    num_hashes: usize,
}
}

Hash Function: Uses SipHash with different seeds for each hash function:

#![allow(unused)]
fn main() {
fn hash_index<K: Hash>(&self, key: &K, seed: usize) -> usize {
    let mut hasher = SipHasher::new_with_seed(seed as u64);
    key.hash(&mut hasher);
    (hasher.finish() as usize) % self.num_bits
}
}

Parameter Tuning Guide

| Expected Items | FP Rate | Bits | Hash Functions | Memory |
|---|---|---|---|---|
| 10,000 | 1% | 95,851 | 7 | ~12 KB |
| 10,000 | 0.1% | 143,776 | 10 | ~18 KB |
| 100,000 | 1% | 958,506 | 7 | ~117 KB |
| 1,000,000 | 1% | 9,585,059 | 7 | ~1.2 MB |

Gotchas:

  • Bloom filter state is not persisted in snapshots; rebuild after load
  • Thread-safe via AtomicU64 with Relaxed ordering (eventual consistency)
  • Cannot remove items (use counting bloom filter for that case)
  • False positive rate increases if more items than expected are inserted
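The last gotcha can be quantified with the standard approximation p ≈ (1 − e^(−kn/m))^k. A quick sketch (assumption: the filter was sized for 10,000 items at 1%, i.e. m = 95,851 and k = 7 from the table above) showing how the rate degrades at twice the expected load:

```rust
/// Approximate false-positive rate for a Bloom filter with `m` bits,
/// `k` hash functions, and `n` inserted items (standard approximation).
fn fp_rate(n: f64, m: f64, k: f64) -> f64 {
    (1.0 - (-k * n / m).exp()).powf(k)
}

fn main() {
    println!("at capacity:   {:.4}", fp_rate(10_000.0, 95_851.0, 7.0)); // ~0.0100
    println!("2x overloaded: {:.4}", fp_rate(20_000.0, 95_851.0, 7.0)); // ~0.16, 16x the design target
}
```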

HNSW Index

Hierarchical Navigable Small World index for approximate nearest neighbor search with O(log n) complexity.

Algorithm Overview

flowchart TD
    subgraph "HNSW Structure"
        L3[Layer 3: Entry Point] --> L2[Layer 2: Skip connections]
        L2 --> L1[Layer 1: More connections]
        L1 --> L0[Layer 0: All nodes, dense connections]
    end

    subgraph "Search Algorithm"
        S1[Start at entry point, top layer] --> S2[Greedy descent to layer 1]
        S2 --> S3[At layer 0: ef-search candidates]
        S3 --> S4[Return top-k results]
    end

Layer Selection

New nodes are assigned layers using exponential distribution:

#![allow(unused)]
fn main() {
fn random_level(&self) -> usize {
    let f = random_float_0_1();
    let level = (-f.ln() * self.config.ml).floor() as usize;
    level.min(32)  // Cap at 32 layers
}
}

Where ml = 1 / ln(m) and m = connections per layer.
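With m = 16 (so ml ≈ 0.36), each additional layer is reached with probability 1/m. A deterministic sketch of the mapping from the uniform draw to a level (the random draw is passed in explicitly here so the behavior is reproducible; `level_for` is an illustrative name):

```rust
/// Level assignment used by HNSW: level = floor(-ln(f) * ml),
/// where f is uniform in (0, 1] and ml = 1 / ln(m).
fn level_for(f: f64, m: f64) -> usize {
    let ml = 1.0 / m.ln();
    ((-f.ln() * ml).floor() as usize).min(32)
}

fn main() {
    let m = 16.0;
    // Draws above 1/16 land on layer 0; each factor of 16 smaller adds a layer.
    println!("{}", level_for(0.5, m));         // 0
    println!("{}", level_for(0.05, m));        // 1
    println!("{}", level_for(1.0 / 300.0, m)); // 2
}
```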

HNSWConfig Parameters

| Parameter | Default | Description |
|---|---|---|
| `m` | 16 | Max connections per node per layer |
| `m0` | 32 | Max connections at layer 0 (2*m) |
| `ef_construction` | 200 | Candidates during construction |
| `ef_search` | 50 | Candidates during search |
| `ml` | 1/ln(m) | Level multiplier |
| `sparsity_threshold` | 0.5 | Auto-sparse storage threshold |
| `max_nodes` | 10,000,000 | Capacity limit (prevents memory exhaustion) |

Configuration Presets

#![allow(unused)]
fn main() {
// High recall (slower, more accurate)
HNSWConfig::high_recall()  // m=32, m0=64, ef_construction=400, ef_search=200

// High speed (faster, lower recall)
HNSWConfig::high_speed()   // m=8, m0=16, ef_construction=100, ef_search=20

// Custom configuration
HNSWConfig {
    m: 24,
    m0: 48,
    ef_construction: 300,
    ef_search: 100,
    ..Default::default()
}
}

SIMD-Accelerated Distance

Dense vector operations use 8-wide SIMD (f32x8):

#![allow(unused)]
fn main() {
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    let chunks = a.len() / 8;
    let mut sum = f32x8::ZERO;

    for i in 0..chunks {
        let offset = i * 8;
        let va = f32x8::from(&a[offset..offset + 8]);
        let vb = f32x8::from(&b[offset..offset + 8]);
        sum += va * vb;
    }

    // Sum lanes, then add the scalar remainder
    let arr: [f32; 8] = sum.into();
    let mut result: f32 = arr.iter().sum();
    for i in (chunks * 8)..a.len() {
        result += a[i] * b[i];
    }
    result
}
}

Neighbor Compression

HNSW neighbor lists use delta-varint encoding for 3-8x compression:

#![allow(unused)]
fn main() {
struct CompressedNeighbors {
    compressed: Vec<u8>,  // Delta-varint encoded neighbor IDs
}

// Decompression: O(n) where n = neighbor count
fn get(&self) -> Vec<usize> {
    decompress_ids(&self.compressed)
}

// Compression: Sort + delta encode
fn set(&mut self, ids: &[usize]) {
    let mut sorted = ids.to_vec();
    sorted.sort_unstable();
    self.compressed = compress_ids(&sorted);
}
}
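A minimal sketch of the delta-varint scheme (LEB128-style continuation bytes; `compress_ids`/`decompress_ids` here are illustrative stand-ins for the internal helpers): because neighbor IDs are sorted first, most gaps fit in a single byte.

```rust
/// Encode sorted IDs as deltas, each delta as a LEB128-style varint.
fn compress_ids(sorted: &[usize]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0usize;
    for &id in sorted {
        let mut delta = id - prev;
        prev = id;
        loop {
            let byte = (delta & 0x7f) as u8;
            delta >>= 7;
            if delta == 0 {
                out.push(byte);
                break;
            }
            out.push(byte | 0x80); // continuation bit
        }
    }
    out
}

fn decompress_ids(bytes: &[u8]) -> Vec<usize> {
    let (mut out, mut prev) = (Vec::new(), 0usize);
    let (mut acc, mut shift) = (0usize, 0u32);
    for &b in bytes {
        acc |= ((b & 0x7f) as usize) << shift;
        if b & 0x80 == 0 {
            prev += acc;
            out.push(prev);
            (acc, shift) = (0, 0);
        } else {
            shift += 7;
        }
    }
    out
}

fn main() {
    let ids = vec![1_000, 1_003, 1_010, 1_500];
    let packed = compress_ids(&ids);
    assert_eq!(decompress_ids(&packed), ids);
    // 4 IDs x 8 bytes = 32 bytes raw vs the varint-packed size
    println!("{} bytes -> {} bytes", ids.len() * 8, packed.len());
}
```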

Storage Types

flowchart LR
    subgraph "EmbeddingStorage"
        D[Dense: Vec f32]
        S[Sparse: SparseVector]
        DV[Delta: DeltaVector]
        TT[TensorTrain: TTVectorCached]
    end

    D --> |"sparsity > 50%"| S
    D --> |"clusters around archetype"| DV
    D --> |"high-dim 768+"| TT

| Storage Type | Memory | Use Case | Distance Computation |
|---|---|---|---|
| Dense | 4 bytes/dim | General purpose | SIMD dot product |
| Sparse | 6 bytes/nnz | >50% zeros | Sparse-sparse O(nnz) |
| Delta | 6 bytes/diff | Clustered embeddings | Via archetype |
| TensorTrain | 8-10x compression | 768+ dimensions | Native TT or reconstruct |

Edge Cases and Gotchas

  1. Delta vectors cannot be inserted directly - they require an archetype registry for distance computation. Convert to Dense first.

  2. TensorTrain storage - While stored in TT format, HNSW reconstructs to dense for fast distance computation during search (native TT distance is O(r^4) per comparison).

  3. Capacity limits - Default max_nodes=10M prevents memory exhaustion from fuzzing/adversarial input. Use try_insert for graceful handling.

  4. Empty index - Entry point is usize::MAX when empty; search returns empty results.

SparseVector

Memory-efficient storage for vectors with many zeros, based on the philosophy that “zero represents absence of information, not a stored value.”

Internal Structure

#![allow(unused)]
fn main() {
pub struct SparseVector {
    dimension: usize,      // Total dimension (shell/boundary)
    positions: Vec<u32>,   // Sorted positions of non-zero values
    values: Vec<f32>,      // Corresponding values
}
}

Operation Complexity

| Operation | Complexity | Notes |
|---|---|---|
| `from_dense` | O(n) | Filters zeros |
| `to_dense` | O(n) | Reconstructs full vector |
| `get(index)` | O(log nnz) | Binary search |
| `set(index, value)` | O(nnz) | Insert/remove maintains sort |
| `dot(sparse)` | O(min(nnz_a, nnz_b)) | Merge-join on positions |
| `dot_dense(dense)` | O(nnz) | Only access stored positions |
| `add(sparse)` | O(nnz_a + nnz_b) | Merge-based |
| `cosine_similarity` | O(nnz) | Using cached magnitudes |
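The merge-join behind `dot(sparse)` can be sketched over raw position/value slices (illustrative, not the crate's actual signatures): advance whichever cursor points at the smaller position, and multiply only when positions match.

```rust
use std::cmp::Ordering;

/// Dot product of two sparse vectors stored as sorted (positions, values)
/// pairs: a single merge pass over both position lists.
fn sparse_dot(pa: &[u32], va: &[f32], pb: &[u32], vb: &[f32]) -> f32 {
    let (mut i, mut j, mut sum) = (0usize, 0usize, 0.0f32);
    while i < pa.len() && j < pb.len() {
        match pa[i].cmp(&pb[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                sum += va[i] * vb[j]; // positions match: multiply, advance both
                i += 1;
                j += 1;
            }
        }
    }
    sum
}

fn main() {
    // a = {2: 1.0, 5: 2.0, 9: 3.0}, b = {5: 4.0, 9: 0.5, 11: 7.0}
    let dot = sparse_dot(&[2, 5, 9], &[1.0, 2.0, 3.0], &[5, 9, 11], &[4.0, 0.5, 7.0]);
    println!("{dot}"); // 2.0*4.0 + 3.0*0.5 = 9.5
}
```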

Sparse Arithmetic Operations

#![allow(unused)]
fn main() {
// Create delta from before/after states (only stores differences)
let delta = SparseVector::from_diff(&before, &after, threshold);

// Subtraction: self - other
let diff = a.sub(&b);

// Weighted average: (w1 * a + w2 * b) / (w1 + w2)
let merged = a.weighted_average(&b, 0.7, 0.3);

// Project out conflicting component
let orthogonal = v.project_orthogonal(&conflict_direction);
}

Distance Metrics

| Metric | Range | Use Case |
|---|---|---|
| `cosine_similarity` | -1 to 1 | Directional similarity |
| `angular_distance` | 0 to PI | Linear for small angles |
| `geodesic_distance` | 0 to PI | Arc length on unit sphere |
| `jaccard_index` | 0 to 1 | Structural overlap (positions) |
| `overlap_coefficient` | 0 to 1 | Subset containment |
| `weighted_jaccard` | 0 to 1 | Value-weighted structural overlap |
| `euclidean_distance` | 0 to inf | L2 norm of difference |
| `manhattan_distance` | 0 to inf | L1 norm of difference |

Security: NaN/Inf Sanitization

All similarity metrics sanitize results to prevent consensus ordering issues:

#![allow(unused)]
fn main() {
pub fn cosine_similarity(&self, other: &SparseVector) -> f32 {
    // ... computation ...

    // SECURITY: Sanitize result to valid range
    if result.is_nan() || result.is_infinite() {
        0.0
    } else {
        result.clamp(-1.0, 1.0)
    }
}
}

Memory Efficiency

#![allow(unused)]
fn main() {
let sparse = SparseVector::from_dense(&dense_vec);

// Metrics
sparse.sparsity()           // Fraction of zeros (0.0 - 1.0)
sparse.memory_bytes()       // Actual memory used
sparse.dense_memory_bytes() // Memory if stored dense
sparse.compression_ratio()  // Dense / Sparse ratio
}

For a 1000-dim vector with 90% zeros:

  • Dense: 4000 bytes
  • Sparse: ~800 bytes (100 positions × 4 bytes + 100 values × 4 bytes)
  • Compression ratio: 5x

Delta Vectors and Archetype Registry

Delta encoding stores vectors as differences from reference “archetype” vectors, providing significant compression for clustered embeddings.

Concept

flowchart LR
    subgraph "Delta Encoding"
        A[Archetype Vector] --> |"+ Delta"| R[Reconstructed Vector]
        D[Delta: positions + values] --> R
    end

When many embeddings cluster around common patterns:

  • Identify archetype vectors (cluster centroids via k-means)
  • Store each embedding as: archetype_id + sparse_delta
  • Reconstruct on demand: archetype + delta = original
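Reconstruction is just a sparse patch applied on top of the archetype. A sketch over raw slices (illustrative; the real code goes through the `DeltaVector` API):

```rust
/// Rebuild the original vector: archetype + sparse delta.
fn reconstruct(archetype: &[f32], positions: &[u16], deltas: &[f32]) -> Vec<f32> {
    let mut v = archetype.to_vec();
    for (&p, &d) in positions.iter().zip(deltas) {
        v[p as usize] += d; // apply diff only where the vector departs from the archetype
    }
    v
}

fn main() {
    let archetype = [0.5, 0.5, 0.5, 0.5];
    // The original differed from the archetype at positions 1 and 3
    let rebuilt = reconstruct(&archetype, &[1, 3], &[0.25, -0.5]);
    println!("{rebuilt:?}"); // [0.5, 0.75, 0.5, 0.0]
}
```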

DeltaVector Structure

#![allow(unused)]
fn main() {
pub struct DeltaVector {
    archetype_id: usize,       // Reference archetype
    dimension: usize,          // For reconstruction
    positions: Vec<u16>,       // Diff positions (u16 for memory)
    deltas: Vec<f32>,          // Delta values
    cached_magnitude: Option<f32>,  // For fast cosine similarity
}
}

Optimized Dot Products

#![allow(unused)]
fn main() {
// With precomputed archetype dot query
// Total: O(nnz) instead of O(dimension)
let result = delta.dot_dense_with_precomputed(query, archetype_dot_query);

// Between two deltas from SAME archetype
// dot(A, B) = dot(R, R) + dot(R, delta_b) + dot(delta_a, R) + dot(delta_a, delta_b)
let result = a.dot_same_archetype(&b, archetype, archetype_magnitude_sq);
}
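The four-term expansion holds because A = R + delta_a and B = R + delta_b, so the dot product distributes over the sums. A quick numeric check of the identity with made-up vectors:

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let r = [1.0_f32, 2.0, 3.0];   // shared archetype R
    let da = [0.1_f32, 0.0, -0.2]; // delta_a
    let db = [0.0_f32, 0.5, 0.0];  // delta_b

    let a: Vec<f32> = r.iter().zip(&da).map(|(x, y)| x + y).collect();
    let b: Vec<f32> = r.iter().zip(&db).map(|(x, y)| x + y).collect();

    // dot(A, B) = dot(R, R) + dot(R, delta_b) + dot(delta_a, R) + dot(delta_a, delta_b)
    let expanded = dot(&r, &r) + dot(&r, &db) + dot(&da, &r) + dot(&da, &db);
    assert!((dot(&a, &b) - expanded).abs() < 1e-5);
    println!("dot = {}", dot(&a, &b));
}
```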

ArchetypeRegistry

#![allow(unused)]
fn main() {
// Create registry with max 16 archetypes
let mut registry = ArchetypeRegistry::new(16);

// Discover archetypes via k-means clustering
let config = KMeansConfig {
    max_iterations: 100,
    convergence_threshold: 1e-4,
    seed: 42,
    init_method: KMeansInit::KMeansPlusPlus,  // Better but slower
};
registry.discover_archetypes(&embeddings, 5, config);

// Encode vectors as deltas
let delta = registry.encode(&vector, threshold)?;

// Analyze coverage
let stats = registry.analyze_coverage(&vectors, 0.01);
// stats.avg_similarity, stats.avg_compression_ratio, stats.archetype_usage
}

Persistence

#![allow(unused)]
fn main() {
// Save to TensorStore
registry.save_to_store(&store)?;

// Load from TensorStore
let registry = ArchetypeRegistry::load_from_store(&store, 16)?;
}

Tiered Storage

Two-tier storage with hot (in-memory) and cold (mmap) layers for memory-efficient storage of large datasets.

Architecture

flowchart TD
    subgraph "Hot Tier (In-Memory)"
        H[MetadataSlab]
        I[ShardAccessTracker]
    end

    subgraph "Cold Tier (Mmap)"
        C[MmapStoreMut]
        CK[cold_keys HashSet]
    end

    GET --> H
    H -->|miss| CK
    CK -->|found| C
    C -->|promote| H

    PUT --> H
    H -->|migrate_cold| C

TieredConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `cold_dir` | `PathBuf` | `/tmp/tensor_cold` | Directory for cold storage files |
| `cold_capacity` | `usize` | 64MB | Initial cold file size |
| `sample_rate` | `u32` | 100 | Access tracking sampling (100 = 1%) |

Migration Algorithm

#![allow(unused)]
fn main() {
pub fn migrate_cold(&mut self, threshold_ms: u64) -> Result<usize> {
    // 1. Find shards not accessed within threshold
    let cold_shards = self.instrumentation.cold_shards(threshold_ms);

    // 2. Collect keys belonging to cold shards
    let keys_to_migrate: Vec<String> = self.hot.scan("")
        .filter(|(key, _)| {
            let shard = shard_for_key(key);
            cold_shards.contains(&shard)
        })
        .map(|(key, _)| key)
        .collect();

    // 3. Move to cold storage
    for key in keys_to_migrate {
        cold.insert(&key, &tensor)?;
        self.cold_keys.insert(key.clone());
        self.hot.delete(&key);
    }

    cold.flush()?;
}
}

Automatic Promotion

When cold data is accessed, it’s automatically promoted back to hot:

#![allow(unused)]
fn main() {
pub fn get(&mut self, key: &str) -> Result<TensorData> {
    // Try hot first
    if let Some(data) = self.hot.get(key) {
        return Ok(data);
    }

    // Try cold
    if self.cold_keys.contains(key) {
        let tensor = self.cold.get(key)?;

        // Promote to hot
        self.hot.set(key, tensor.clone());
        self.cold_keys.remove(key);
        self.migrations_to_hot.fetch_add(1, Ordering::Relaxed);

        return Ok(tensor);
    }

    Err(TensorStoreError::NotFound(key.to_string()))
}
}

Statistics

#![allow(unused)]
fn main() {
let stats = store.stats();
// stats.hot_count, stats.cold_count
// stats.hot_lookups, stats.cold_lookups, stats.cold_hits
// stats.migrations_to_cold, stats.migrations_to_hot
}

Access Instrumentation

Low-overhead tracking of shard access patterns for intelligent memory tiering.

ShardAccessTracker

#![allow(unused)]
fn main() {
pub struct ShardAccessTracker {
    shards: Box<[ShardStats]>,     // Per-shard counters
    shard_count: usize,            // Default: 16
    start_time: Instant,           // For last_access timestamps
    sample_rate: u32,              // 1 = every access, 100 = 1%
    sample_counter: AtomicU64,     // For sampling
}

// Sampling logic
fn should_sample(&self) -> bool {
    if self.sample_rate == 1 { return true; }
    self.sample_counter.fetch_add(1, Relaxed).is_multiple_of(self.sample_rate)
}
}
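The sampling counter can be exercised standalone. A sketch (written as a free function over an `AtomicU64` rather than the tracker method) showing that a rate of 100 records roughly 1% of accesses:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Record roughly 1 out of every `rate` accesses (rate = 1 records all).
fn should_sample(counter: &AtomicU64, rate: u64) -> bool {
    if rate == 1 {
        return true;
    }
    // fetch_add returns the previous value, so exactly every rate-th call matches
    counter.fetch_add(1, Ordering::Relaxed) % rate == 0
}

fn main() {
    let counter = AtomicU64::new(0);
    let sampled = (0..1_000).filter(|_| should_sample(&counter, 100)).count();
    println!("{sampled} of 1000 accesses sampled"); // 10 of 1000 accesses sampled
}
```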

Hot/Cold Detection

#![allow(unused)]
fn main() {
// Get shards sorted by access count (hottest first)
let hot = tracker.hot_shards(5);  // Top 5 hottest

// Get shards not accessed within threshold
let cold = tracker.cold_shards(30_000);  // Not accessed in 30s
}

HNSW Access Stats

Specialized instrumentation for HNSW index:

#![allow(unused)]
fn main() {
pub struct HNSWAccessStats {
    entry_point_accesses: AtomicU64,
    layer0_traversals: AtomicU64,
    upper_layer_traversals: AtomicU64,
    total_searches: AtomicU64,
    distance_calculations: AtomicU64,
}

// Snapshot metrics
let stats = hnsw.access_stats()?;
stats.layer0_ratio()          // Layer 0 work fraction
stats.avg_distances_per_search  // Distance calcs per search
stats.searches_per_second()   // Throughput
}

Configuration Options

SlabRouterConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `embedding_dim` | `usize` | 384 | Embedding dimension for EmbeddingSlab |
| `cache_capacity` | `usize` | 10,000 | Cache capacity for CacheRing |
| `cache_strategy` | `EvictionStrategy` | Default | Eviction strategy (LRU/LFU) |
| `blob_segment_size` | `usize` | 64MB | Segment size for BlobLog |
| `graph_merge_threshold` | `usize` | 10,000 | Merge threshold for GraphTensor |

Usage Examples

Basic Operations

#![allow(unused)]
fn main() {
let store = TensorStore::new();

// Store a tensor
let mut user = TensorData::new();
user.set("name", TensorValue::Scalar(ScalarValue::String("Alice".into())));
user.set("age", TensorValue::Scalar(ScalarValue::Int(30)));
user.set("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3, 0.4]));
store.put("user:1", user)?;

// Retrieve
let data = store.get("user:1")?;

// Scan by prefix
let user_keys = store.scan("user:");
let count = store.scan_count("user:");
}

With Bloom Filter

#![allow(unused)]
fn main() {
// Fast rejection of non-existent keys
let store = TensorStore::with_bloom_filter(10_000, 0.01);
store.put("key:1", tensor)?;

// O(1) rejection if key definitely doesn't exist
if store.exists("key:999") { /* ... */ }
}

With Instrumentation

#![allow(unused)]
fn main() {
// Enable access tracking with 1% sampling
let store = TensorStore::with_instrumentation(100);

// After operations, check access patterns
let snapshot = store.access_snapshot()?;
println!("Hot shards: {:?}", store.hot_shards(5)?);
println!("Cold shards: {:?}", store.cold_shards(30_000)?);
}

Shared Storage Across Engines

#![allow(unused)]
fn main() {
let store = TensorStore::new();

// Clone shares the same underlying Arc<SlabRouter>
let store_clone = store.clone();

// Both see the same data
store.put("user:1", user_data)?;
assert!(store_clone.exists("user:1"));

// Use with multiple engines
let vector_engine = VectorEngine::with_store(store.clone());
let graph_engine = GraphEngine::with_store(store.clone());
}

Persistence

#![allow(unused)]
fn main() {
// Save snapshot
store.save_snapshot("data.bin")?;

// Load snapshot
let store = TensorStore::load_snapshot("data.bin")?;

// Load with Bloom filter rebuild
let store = TensorStore::load_snapshot_with_bloom_filter(
    "data.bin",
    10_000,   // expected items
    0.01      // false positive rate
)?;

// Compressed snapshot
use tensor_compress::{CompressionConfig, QuantMode};
let config = CompressionConfig {
    vector_quantization: Some(QuantMode::Int8),  // 4x compression
    delta_encoding: true,
    rle_encoding: true,
};
store.save_snapshot_compressed("data.bin", config)?;
}

Tiered Storage

#![allow(unused)]
fn main() {
use tensor_store::{TieredStore, TieredConfig};

let config = TieredConfig {
    cold_dir: "/data/cold".into(),
    cold_capacity: 64 * 1024 * 1024,
    sample_rate: 100,
};

let mut store = TieredStore::new(config)?;
store.put("user:1", tensor);

// Migrate cold data (not accessed in 30s)
let migrated = store.migrate_cold(30_000)?;

// Check stats
let stats = store.stats();
println!("Hot: {}, Cold: {}", stats.hot_count, stats.cold_count);
}

HNSW Index

#![allow(unused)]
fn main() {
let index = HNSWIndex::with_config(HNSWConfig::default());

// Insert dense, sparse, or auto-select
index.insert(vec![0.1, 0.2, 0.3]);
index.insert_sparse(sparse_vec);
index.insert_auto(mixed_vec);  // Auto-selects dense/sparse

// With capacity checking
match index.try_insert(vec) {
    Ok(id) => println!("Inserted as node {}", id),
    Err(EmbeddingStorageError::CapacityExceeded { limit, current }) => {
        println!("Index full: {} / {}", current, limit);
    }
    Err(e) => println!("Insert failed: {}", e),
}

// Search with custom ef
let results = index.search_with_ef(&query, 10, 100);
for (id, similarity) in results {
    println!("Node {}: {:.4}", id, similarity);
}
}

Delta-Encoded Embeddings

#![allow(unused)]
fn main() {
let mut registry = ArchetypeRegistry::new(16);

// Discover archetypes from existing embeddings
registry.discover_archetypes(&embeddings, 5, KMeansConfig::default());

// Encode new vectors as deltas
let results = registry.encode_batch(&embeddings, 0.01);
for (delta, compression_ratio) in results {
    println!("Archetype {}, compression: {:.2}x",
             delta.archetype_id(), compression_ratio);
}
}

Error Types

TensorStoreError

| Error | Cause |
|---|---|
| `NotFound(key)` | `get` or `delete` on nonexistent key |

SnapshotError

| Error | Cause |
|---|---|
| `IoError(std::io::Error)` | File not found, permission denied, disk full |
| `SerializationError(String)` | Corrupted file, incompatible format |

TieredError

| Error | Cause |
|---|---|
| `Store(TensorStoreError)` | Underlying store error |
| `Mmap(MmapError)` | Memory-mapped file error |
| `Io(std::io::Error)` | I/O error |
| `NotConfigured` | Cold storage not configured |

EmbeddingStorageError

| Error | Cause |
|---|---|
| `DeltaRequiresRegistry` | Delta storage used without archetype registry |
| `ArchetypeNotFound(id)` | Referenced archetype not in registry |
| `CapacityExceeded { limit, current }` | HNSW index at `max_nodes` limit |
| `DeltaNotSupported` | Delta vectors inserted into HNSW (unsupported) |

| Module | Relationship |
|---|---|
| `relational_engine` | Uses TensorStore for table row storage |
| `graph_engine` | Uses TensorStore for node/edge storage |
| `vector_engine` | Uses TensorStore + HNSWIndex for embeddings |
| `tensor_compress` | Provides compression for snapshots |
| `tensor_checkpoint` | Uses TensorStore snapshots for atomic restore |
| `tensor_chain` | Uses TensorStore for blockchain state |

Dependencies

| Crate | Purpose |
|---|---|
| `serde` | Serialization |
| `bincode` | Binary snapshot format |
| `tensor_compress` | Compression algorithms |
| `wide` | SIMD operations (f32x8) |
| `memmap2` | Memory-mapped files |
| `fxhash` | Fast hashing |
| `parking_lot` | Efficient locks |
| `bitvec` | Bit vectors for bloom filter |

Relational Engine

The Relational Engine (Module 2) provides SQL-like table operations on top of the Tensor Store. It implements schema enforcement, composable condition predicates, SIMD-accelerated columnar filtering, and both hash and B-tree indexes for query acceleration.

Tables, rows, and indexes are stored as tensor data in the underlying Tensor Store, inheriting its thread-safety guarantees. The engine supports all standard CRUD operations, six SQL join types, aggregate functions, and batch operations for bulk inserts.

Architecture

flowchart TD
    subgraph RelationalEngine
        API[Public API]
        Schema[Schema Validation]
        Cond[Condition Evaluation]
        Hash[Hash Index]
        BTree[B-Tree Index]
        Columnar[Columnar SIMD]
    end

    API --> Schema
    API --> Cond
    Cond --> Hash
    Cond --> BTree
    Cond --> Columnar

    subgraph TensorStore
        Store[(Slab Storage)]
        Meta[Table Metadata]
        Rows[Row Data]
        Idx[Index Entries]
    end

    Schema --> Meta
    API --> Rows
    Hash --> Idx
    BTree --> Idx

Query Execution Flow

flowchart TD
    Query[SELECT Query] --> ParseCond[Parse Condition]
    ParseCond --> CheckIdx{Has Index?}

    CheckIdx -->|Hash Index + Eq| HashLookup[O(1) Hash Lookup]
    CheckIdx -->|BTree + Range| BTreeRange[O(log n) Range Scan]
    CheckIdx -->|No Index| FullScan[Full Table Scan]

    HashLookup --> FilterRows[Apply Remaining Conditions]
    BTreeRange --> FilterRows
    FullScan --> SIMDFilter{Columnar Data?}

    SIMDFilter -->|Yes| VectorFilter[SIMD Vectorized Filter]
    SIMDFilter -->|No| RowFilter[Row-by-Row Filter]

    VectorFilter --> Results[Build Result Set]
    RowFilter --> Results
    FilterRows --> Results

Key Types

| Type | Description |
|---|---|
| `RelationalEngine` | Main engine struct with TensorStore backend |
| `RelationalConfig` | Configuration for limits, timeouts, thresholds |
| `Schema` | Table schema with column definitions and constraints |
| `Column` | Column name, type, and nullability |
| `ColumnType` | Int, Float, String, Bool, Bytes, Json |
| `Value` | Typed value: `Null`, `Int(i64)`, `Float(f64)`, `String(String)`, `Bool(bool)`, `Bytes(Vec<u8>)`, `Json(Value)` |
| `Row` | Row with ID and ordered column values |
| `Condition` | Composable filter predicate tree |
| `Constraint` | Table constraint: PrimaryKey, Unique, ForeignKey, NotNull |
| `ForeignKeyConstraint` | Foreign key definition with referential actions |
| `ReferentialAction` | Restrict, Cascade, SetNull, SetDefault, NoAction |
| `RelationalError` | Error variants for table/column/index/constraint operations |
| `ColumnData` | Columnar storage for a single column with null bitmap |
| `SelectionVector` | Bitmap-based row selection for SIMD operations |
| `OrderedKey` | B-tree index key with total ordering semantics |
| `StreamingCursor` | Iterator for batch-based query result streaming |
| `CursorBuilder` | Builder for customizing streaming cursor options |
| `QueryMetrics` | Query execution metrics for observability |
| `IndexTracker` | Tracks index hits/misses to detect missing indexes |

Column Types

| Type | Rust Type | Storage Format | Description |
|---|---|---|---|
| `Int` | `i64` | 8-byte little-endian | 64-bit signed integer |
| `Float` | `f64` | 8-byte IEEE 754 | 64-bit floating point |
| `String` | `String` | Dictionary-encoded | UTF-8 string with deduplication |
| `Bool` | `bool` | Packed bitmap (64 values per u64) | Boolean |
| `Bytes` | `Vec<u8>` | Raw bytes | Binary data |
| `Json` | `serde_json::Value` | JSON string | JSON value |

Conditions

| Condition | Description | Index Support |
|---|---|---|
| `Condition::True` | Matches all rows | N/A |
| `Condition::Eq(col, val)` | Column equals value | Hash Index |
| `Condition::Ne(col, val)` | Column not equals value | None |
| `Condition::Lt(col, val)` | Column less than value | B-Tree Index |
| `Condition::Le(col, val)` | Column less than or equal | B-Tree Index |
| `Condition::Gt(col, val)` | Column greater than value | B-Tree Index |
| `Condition::Ge(col, val)` | Column greater than or equal | B-Tree Index |
| `Condition::And(a, b)` | Logical AND of two conditions | Partial (first indexable) |
| `Condition::Or(a, b)` | Logical OR of two conditions | None |

Conditions can be combined using .and() and .or() methods:

#![allow(unused)]
fn main() {
// age >= 18 AND age < 65
let condition = Condition::Ge("age".into(), Value::Int(18))
    .and(Condition::Lt("age".into(), Value::Int(65)));

// status = 'active' OR priority > 5
let condition = Condition::Eq("status".into(), Value::String("active".into()))
    .or(Condition::Gt("priority".into(), Value::Int(5)));
}

The special column _id filters by row ID and can be indexed.
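How such a predicate tree evaluates against a row can be sketched with a trimmed-down `Condition` (only the variants used here; the real enum carries more, and these are illustrative re-implementations, not the crate's types):

```rust
use std::collections::HashMap;

#[derive(PartialEq, PartialOrd)]
enum Value {
    Int(i64),
    Str(String),
}

enum Condition {
    Eq(String, Value),
    Lt(String, Value),
    Ge(String, Value),
    And(Box<Condition>, Box<Condition>),
    Or(Box<Condition>, Box<Condition>),
}

impl Condition {
    fn and(self, other: Condition) -> Condition {
        Condition::And(Box::new(self), Box::new(other))
    }

    fn or(self, other: Condition) -> Condition {
        Condition::Or(Box::new(self), Box::new(other))
    }

    /// Recursively evaluate the predicate tree against one row.
    fn matches(&self, row: &HashMap<String, Value>) -> bool {
        match self {
            Condition::Eq(c, v) => row.get(c) == Some(v),
            Condition::Lt(c, v) => row.get(c).is_some_and(|x| x < v),
            Condition::Ge(c, v) => row.get(c).is_some_and(|x| x >= v),
            Condition::And(a, b) => a.matches(row) && b.matches(row),
            Condition::Or(a, b) => a.matches(row) || b.matches(row),
        }
    }
}

fn main() {
    let mut row = HashMap::new();
    row.insert("age".to_string(), Value::Int(30));
    row.insert("status".to_string(), Value::Str("active".into()));

    // age >= 18 AND age < 65
    let in_range = Condition::Ge("age".into(), Value::Int(18))
        .and(Condition::Lt("age".into(), Value::Int(65)));
    println!("{}", in_range.matches(&row)); // true

    // status = 'active' OR age >= 99
    let either = Condition::Eq("status".into(), Value::Str("active".into()))
        .or(Condition::Ge("age".into(), Value::Int(99)));
    println!("{}", either.matches(&row)); // true
}
```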

Error Types

| Error | Cause |
|---|---|
| `TableNotFound` | Table does not exist |
| `TableAlreadyExists` | Creating duplicate table |
| `ColumnNotFound` | Update references unknown column |
| `ColumnAlreadyExists` | Column already exists in table |
| `TypeMismatch` | Value type does not match column type |
| `NullNotAllowed` | NULL in non-nullable column |
| `IndexAlreadyExists` | Creating duplicate index |
| `IndexNotFound` | Dropping non-existent index |
| `IndexCorrupted` | Index data is corrupted |
| `StorageError` | Underlying Tensor Store error |
| `InvalidName` | Invalid table or column name |
| `SchemaCorrupted` | Schema metadata is corrupted |
| `TransactionNotFound` | Transaction ID not found |
| `TransactionInactive` | Transaction already committed/aborted |
| `LockConflict` | Lock conflict with another transaction |
| `LockTimeout` | Lock acquisition timed out |
| `RollbackFailed` | Rollback operation failed |
| `ResultTooLarge` | Result set exceeds maximum size |
| `TooManyTables` | Maximum table count exceeded |
| `TooManyIndexes` | Maximum index count exceeded |
| `QueryTimeout` | Query execution timed out |
| `PrimaryKeyViolation` | Primary key constraint violated |
| `UniqueViolation` | Unique constraint violated |
| `ForeignKeyViolation` | Foreign key constraint violated on insert/update |
| `ForeignKeyRestrict` | Foreign key prevents delete/update |
| `ConstraintNotFound` | Constraint does not exist |
| `ConstraintAlreadyExists` | Constraint already exists |
| `ColumnHasConstraint` | Column has constraint preventing operation |
| `CannotAddColumn` | Cannot add column due to constraint |

Storage Model

Tables, rows, and indexes are stored in Tensor Store with specific key patterns:

| Key Pattern | Content |
|---|---|
| `_meta:table:{name}` | Schema metadata |
| `{table}:{row_id}` | Row data |
| `_idx:{table}:{column}` | Hash index metadata |
| `_idx:{table}:{column}:{hash}` | Hash index entries (list of row IDs) |
| `_btree:{table}:{column}` | B-tree index metadata |
| `_btree:{table}:{column}:{sortable_key}` | B-tree index entries |
| `_col:{table}:{column}:data` | Columnar data storage |
| `_col:{table}:{column}:ids` | Columnar row ID mapping |
| `_col:{table}:{column}:nulls` | Columnar null bitmap |
| `_col:{table}:{column}:meta` | Columnar metadata |

Schema metadata encodes:

  • _columns: Comma-separated column names
  • _col:{name}: Type and nullability for each column

Row Storage Format

Each row is stored as a TensorData object:

#![allow(unused)]
fn main() {
// Internal row structure
{
    "_id": Scalar(Int(row_id)),
    "name": Scalar(String("Alice")),
    "age": Scalar(Int(30)),
    "email": Scalar(String("alice@example.com"))
}
}

Usage Examples

Table Operations

#![allow(unused)]
fn main() {
let engine = RelationalEngine::new();

// Create table with schema
let schema = Schema::new(vec![
    Column::new("name", ColumnType::String),
    Column::new("age", ColumnType::Int),
    Column::new("email", ColumnType::String).nullable(),
]);
engine.create_table("users", schema)?;

// Check existence
engine.table_exists("users")?;  // -> bool

// List all tables
let tables = engine.list_tables();  // -> Vec<String>

// Get schema
let schema = engine.get_schema("users")?;

// Drop table (deletes all rows and indexes)
engine.drop_table("users")?;

// Row count
engine.row_count("users")?;  // -> usize
}

CRUD Operations

#![allow(unused)]
fn main() {
// INSERT
let mut values = HashMap::new();
values.insert("name".to_string(), Value::String("Alice".into()));
values.insert("age".to_string(), Value::Int(30));
let row_id = engine.insert("users", values)?;

// BATCH INSERT (59x faster for bulk inserts)
let rows: Vec<HashMap<String, Value>> = (0..1000)
    .map(|i| {
        let mut values = HashMap::new();
        values.insert("name".to_string(), Value::String(format!("User{}", i)));
        values.insert("age".to_string(), Value::Int(20 + i));
        values
    })
    .collect();
let row_ids = engine.batch_insert("users", rows)?;

// SELECT
let rows = engine.select("users", Condition::Eq("age".into(), Value::Int(30)))?;

// UPDATE
let mut updates = HashMap::new();
updates.insert("age".to_string(), Value::Int(31));
let count = engine.update(
    "users",
    Condition::Eq("name".into(), Value::String("Alice".into())),
    updates
)?;

// DELETE
let count = engine.delete_rows("users", Condition::Lt("age".into(), Value::Int(18)))?;
}

Constraints

The engine supports four constraint types for data integrity:

| Constraint | Description |
|---|---|
| `PrimaryKey` | Unique + not null, identifies rows uniquely |
| `Unique` | Values must be unique (NULLs allowed) |
| `ForeignKey` | References rows in another table |
| `NotNull` | Column cannot contain NULL values |

#![allow(unused)]
fn main() {
use relational_engine::{Constraint, ForeignKeyConstraint, ReferentialAction};

// Create table with constraints
let schema = Schema::with_constraints(
    vec![
        Column::new("id", ColumnType::Int),
        Column::new("email", ColumnType::String),
        Column::new("dept_id", ColumnType::Int).nullable(),
    ],
    vec![
        Constraint::primary_key("pk_users", vec!["id".to_string()]),
        Constraint::unique("uq_email", vec!["email".to_string()]),
    ],
);
engine.create_table("users", schema)?;

// Add constraint after table creation
engine.add_constraint("users", Constraint::not_null("nn_email", "email"))?;

// Add foreign key with referential actions
let fk = ForeignKeyConstraint::new(
    "fk_users_dept",
    vec!["dept_id".to_string()],
    "departments",
    vec!["id".to_string()],
)
.on_delete(ReferentialAction::SetNull)
.on_update(ReferentialAction::Cascade);
engine.add_constraint("users", Constraint::foreign_key(fk))?;

// Get constraints
let constraints = engine.get_constraints("users")?;

// Drop constraint
engine.drop_constraint("users", "uq_email")?;
}

Referential Actions

Foreign keys support these actions on delete/update of referenced rows:

| Action | Description |
|---|---|
| `Restrict` (default) | Prevent the operation |
| `Cascade` | Cascade to referencing rows |
| `SetNull` | Set referencing columns to NULL |
| `SetDefault` | Set referencing columns to default |
| `NoAction` | Same as Restrict, checked at commit |

ALTER TABLE Operations

#![allow(unused)]
fn main() {
// Add a new column (nullable or with default)
engine.add_column("users", Column::new("phone", ColumnType::String).nullable())?;

// Drop a column (fails if column has constraints)
engine.drop_column("users", "phone")?;

// Rename a column (updates constraints automatically)
engine.rename_column("users", "email", "email_address")?;
}

Joins

All six SQL join types are supported, implemented with a hash join algorithm (O(n+m)):

#![allow(unused)]
fn main() {
// INNER JOIN - Only matching rows from both tables
let joined = engine.join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Row, Row)>

// LEFT JOIN - All rows from left, matching from right (or None)
let joined = engine.left_join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Row, Option<Row>)>

// RIGHT JOIN - All rows from right, matching from left (or None)
let joined = engine.right_join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Option<Row>, Row)>

// FULL JOIN - All rows from both tables
let joined = engine.full_join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Option<Row>, Option<Row>)>

// CROSS JOIN (Cartesian product)
let joined = engine.cross_join("users", "posts")?;
// Returns: Vec<(Row, Row)> with n*m rows

// NATURAL JOIN (on common column names)
let joined = engine.natural_join("users", "user_profiles")?;
// Returns: Vec<(Row, Row)> matching on all common columns
}
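The O(n+m) bound comes from building a hash table on one input and probing it with the other. A stripped-down sketch over (key, payload) pairs (illustrative; the engine joins full rows):

```rust
use std::collections::HashMap;

/// Inner hash join: build on `left`, probe with `right`, emit matched pairs.
fn hash_join<'a>(
    left: &[(u64, &'a str)],
    right: &[(u64, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    // Build phase: O(n)
    let mut index: HashMap<u64, Vec<&str>> = HashMap::new();
    for &(key, payload) in left {
        index.entry(key).or_default().push(payload);
    }
    // Probe phase: O(m) lookups
    let mut out = Vec::new();
    for &(key, payload) in right {
        if let Some(matches) = index.get(&key) {
            for &l in matches {
                out.push((l, payload));
            }
        }
    }
    out
}

fn main() {
    let users = [(1, "alice"), (2, "bob")];
    let posts = [(1, "intro"), (1, "update"), (3, "orphan")];
    println!("{:?}", hash_join(&users, &posts));
    // [("alice", "intro"), ("alice", "update")]
}
```

A left join follows the same shape, emitting `(l, None)` for build-side rows that never match; the other variants differ only in which unmatched side is kept.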

Aggregate Functions

#![allow(unused)]
fn main() {
// COUNT(*) - count all rows
let count = engine.count("users", Condition::True)?;

// COUNT(column) - count non-null values
let count = engine.count_column("users", "email", Condition::True)?;

// SUM - returns f64
let total = engine.sum("orders", "amount", Condition::True)?;

// AVG - returns Option<f64> (None if no matching rows)
let avg = engine.avg("orders", "amount", Condition::True)?;

// MIN/MAX - returns Option<Value>
let min = engine.min("products", "price", Condition::True)?;
let max = engine.max("products", "price", Condition::True)?;
}

Indexes

Hash Indexes

Hash indexes provide O(1) equality lookups for Condition::Eq queries:

#![allow(unused)]
fn main() {
// Create hash index
engine.create_index("users", "age")?;

// Check existence
engine.has_index("users", "age");  // -> bool

// Get indexed columns
engine.get_indexed_columns("users");  // -> Vec<String>

// Drop index
engine.drop_index("users", "age")?;
}

Hash Index Implementation Details:

graph LR
    subgraph "Hash Index Structure"
        Value[Column Value] --> Hash["hash_key()"]
        Hash --> Bucket["_idx:table:col:hash"]
        Bucket --> IDs["Vec<row_id>"]
    end

The hash index uses value-specific hashing:

| Value Type | Hash Format | Example |
|------------|-------------|---------|
| Null       | "null"      | "null"  |
| Int(i)     | "i:{value}" | "i:42"  |
| Float(f)   | "f:{bits}"  | "f:4614253070214989087" |
| String(s)  | "s:{hash}"  | "s:a1b2c3d4" |
| Bool(b)    | "b:{value}" | "b:true" |

Hash index performance:

| Query Type                     | Without Index | With Index | Speedup |
|--------------------------------|---------------|------------|---------|
| Equality (2% match on 5K rows) | 5.96ms        | 126us      | 47x     |
| Single row by _id (5K rows)    | 5.59ms        | 3.5us      | 1,597x  |

B-Tree Indexes

B-tree indexes accelerate range queries (Lt, Le, Gt, Ge) with O(log n + m) complexity:

#![allow(unused)]
fn main() {
// Create B-tree index
engine.create_btree_index("users", "age")?;

// Check existence
engine.has_btree_index("users", "age");  // -> bool

// Get B-tree indexed columns
engine.get_btree_indexed_columns("users");  // -> Vec<String>

// Drop index
engine.drop_btree_index("users", "age")?;

// Range queries now use the index
engine.select("users", Condition::Ge("age".into(), Value::Int(18)))?;
}

B-Tree Index Implementation Details:

The B-tree index uses a dual-storage approach:

  1. In-memory BTreeMap: For O(log n) range operations
  2. Persistent TensorStore: For durability and recovery
#![allow(unused)]
fn main() {
// Internal B-tree index structure
btree_indexes: RwLock<HashMap<
    (String, String),           // (table, column)
    BTreeMap<OrderedKey, Vec<u64>>  // value -> row_ids
>>
}

OrderedKey for Total Ordering:

The OrderedKey enum provides correct ordering semantics:

#![allow(unused)]
fn main() {
pub enum OrderedKey {
    Null,                    // Sorts first
    Bool(bool),              // false < true
    Int(i64),                // Standard integer ordering
    Float(OrderedFloat),     // NaN < all other values
    String(String),          // Lexicographic ordering
}
}
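
Derived Ord on such an enum compares by variant order first, which is exactly what yields the Null-first total ordering. A minimal std-only sketch (the Key enum below is illustrative, with a String variant in place of OrderedFloat to keep it derive-friendly):

```rust
// Illustrative stand-in for OrderedKey: derived Ord compares by
// variant order first (Null < Bool < Int < Str), then by payload.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Key {
    Null,
    Bool(bool),
    Int(i64),
    Str(String),
}

fn main() {
    let mut keys = vec![
        Key::Str("alice".into()),
        Key::Int(-3),
        Key::Bool(true),
        Key::Null,
        Key::Int(7),
        Key::Bool(false),
    ];
    keys.sort();
    // Null sorts first, then false < true, then ints, then strings.
    assert_eq!(keys[0], Key::Null);
    assert_eq!(keys[1], Key::Bool(false));
    assert_eq!(keys[2], Key::Bool(true));
    assert_eq!(keys[3], Key::Int(-3));
    assert_eq!(keys[4], Key::Int(7));
    println!("{:?}", keys);
}
```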

Sortable Key Encoding:

For persistent storage, values are encoded to maintain lexicographic ordering:

| Type      | Encoding           | Example |
|-----------|--------------------|---------|
| Null      | "0"                | "0" |
| Int(i)    | "i{hex(i + 2^63)}" | "i8000000000000000" for 0 |
| Float(f)  | "f{sortable_bits}" | IEEE 754 with sign handling |
| String(s) | "s{s}"             | "sAlice" |
| Bool(b)   | "b0" or "b1"       | "b1" for true |

Integer encoding shifts the range from [-2^63, 2^63-1] to [0, 2^64-1] for correct lexicographic ordering of negative numbers.
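
That shift is equivalent to flipping the sign bit of the two's-complement representation. A small std-only sketch (encode_i64 is an illustrative name, not the engine's API):

```rust
// Order-preserving encoding of i64 as a fixed-width hex string:
// flipping the sign bit maps [-2^63, 2^63-1] onto [0, 2^64-1].
fn encode_i64(i: i64) -> String {
    format!("i{:016x}", (i as u64) ^ (1u64 << 63))
}

fn main() {
    // 0 encodes as the midpoint, matching the table above.
    assert_eq!(encode_i64(0), "i8000000000000000");
    // Lexicographic order of encodings matches numeric order of inputs.
    let vals = [i64::MIN, -100, -1, 0, 1, 42, i64::MAX];
    let encoded: Vec<String> = vals.iter().map(|&v| encode_i64(v)).collect();
    let mut sorted = encoded.clone();
    sorted.sort();
    assert_eq!(encoded, sorted);
    println!("{}", encode_i64(0));
}
```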

B-Tree Range Operations:

#![allow(unused)]
fn main() {
// Internal range lookup
// Internal range lookup (simplified: each arm collects the row ids
// from the matching side of the B-tree range)
fn btree_range_lookup(&self, table: &str, column: &str,
                      value: &Value, op: RangeOp) -> Option<Vec<u64>> {
    match op {
        RangeOp::Lt => btree.range(..target),
        RangeOp::Le => btree.range(..=target),
        RangeOp::Gt => btree.range((Excluded(target), Unbounded)),
        RangeOp::Ge => btree.range(target..),
    }
}
}
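
The same four bounds can be exercised against std::collections::BTreeMap directly; this self-contained sketch uses illustrative data rather than the engine's internal types:

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

fn main() {
    // value -> row ids, as in the engine's BTreeMap<OrderedKey, Vec<u64>>
    let mut btree: BTreeMap<i64, Vec<u64>> = BTreeMap::new();
    btree.insert(18, vec![1]);
    btree.insert(25, vec![2, 3]);
    btree.insert(40, vec![4]);

    let target = 25;
    // Ge: target..  (>= 25)
    let ge: Vec<u64> = btree.range(target..).flat_map(|(_, ids)| ids.clone()).collect();
    assert_eq!(ge, vec![2, 3, 4]);
    // Gt: (Excluded(target), Unbounded)  (> 25)
    let gt: Vec<u64> = btree.range((Excluded(target), Unbounded))
        .flat_map(|(_, ids)| ids.clone()).collect();
    assert_eq!(gt, vec![4]);
    // Lt: ..target  (< 25)
    let lt: Vec<u64> = btree.range(..target).flat_map(|(_, ids)| ids.clone()).collect();
    assert_eq!(lt, vec![1]);
    // Le: ..=target  (<= 25)
    let le: Vec<u64> = btree.range(..=target).flat_map(|(_, ids)| ids.clone()).collect();
    assert_eq!(le, vec![1, 2, 3]);
    println!("ge={:?} gt={:?} lt={:?} le={:?}", ge, gt, lt, le);
}
```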

Columnar Architecture

The engine uses columnar storage with SIMD-accelerated filtering:

Columnar Data Structures

graph TD
    subgraph "ColumnData"
        Name[name: String]
        RowIDs[row_ids: Vec<u64>]
        Nulls[nulls: NullBitmap]
        Values[values: ColumnValues]
    end

    subgraph "ColumnValues Variants"
        Int["Int(Vec<i64>)"]
        Float["Float(Vec<f64>)"]
        String["String { dict, indices }"]
        Bool["Bool(Vec<u64>)"]
    end

    subgraph "NullBitmap Variants"
        None["None (no nulls)"]
        Dense["Dense(Vec<u64>)"]
        Sparse["Sparse(Vec<u64>)"]
    end

    Values --> Int
    Values --> Float
    Values --> String
    Values --> Bool
    Nulls --> None
    Nulls --> Dense
    Nulls --> Sparse

Null Bitmap Selection:

  • None: When column has no null values
  • Sparse: When nulls are < 10% of rows (stores positions)
  • Dense: When nulls are >= 10% of rows (stores bitmap)
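
That selection rule can be sketched in a few lines of std-only Rust; the types and function name below are illustrative, not the engine's:

```rust
// Illustrative null-bitmap selection using the 10% threshold.
#[derive(Debug, PartialEq)]
enum NullBitmap {
    None,               // no nulls at all
    Sparse(Vec<u64>),   // positions of null rows
    Dense(Vec<u64>),    // packed 1-bit-per-row bitmap
}

fn build_null_bitmap(row_count: usize, null_positions: &[u64]) -> NullBitmap {
    if null_positions.is_empty() {
        NullBitmap::None
    } else if (null_positions.len() as f64) < 0.10 * row_count as f64 {
        NullBitmap::Sparse(null_positions.to_vec())
    } else {
        let mut bits = vec![0u64; (row_count + 63) / 64];
        for &p in null_positions {
            bits[(p / 64) as usize] |= 1u64 << (p % 64);
        }
        NullBitmap::Dense(bits)
    }
}

fn main() {
    assert_eq!(build_null_bitmap(100, &[]), NullBitmap::None);
    // 5 nulls out of 100 rows (5%) -> sparse positions
    assert_eq!(build_null_bitmap(100, &[1, 9, 33, 50, 97]),
               NullBitmap::Sparse(vec![1, 9, 33, 50, 97]));
    // 20 nulls out of 100 rows (20%) -> dense bitmap
    let nulls: Vec<u64> = (0..20).collect();
    match build_null_bitmap(100, &nulls) {
        NullBitmap::Dense(bits) => assert_eq!(bits[0].count_ones(), 20),
        other => panic!("expected dense, got {:?}", other),
    }
    println!("ok");
}
```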

SIMD Filtering

Column data is stored in contiguous arrays, enabling 4-wide SIMD vectorized comparisons using the wide crate:

#![allow(unused)]
fn main() {
// SIMD filter implementation using wide::i64x4
pub fn filter_lt_i64(values: &[i64], threshold: i64, result: &mut [u64]) {
    let chunks = values.len() / 4;
    let threshold_vec = i64x4::splat(threshold);

    for i in 0..chunks {
        let offset = i * 4;
        let v = i64x4::new([
            values[offset],
            values[offset + 1],
            values[offset + 2],
            values[offset + 3],
        ]);
        let cmp = v.cmp_lt(threshold_vec);
        let mask_arr: [i64; 4] = cmp.into();

        for (j, &m) in mask_arr.iter().enumerate() {
            if m != 0 {
                let bit_pos = offset + j;
                result[bit_pos / 64] |= 1u64 << (bit_pos % 64);
            }
        }
    }

    // Handle remainder with scalar fallback
    let start = chunks * 4;
    for i in start..values.len() {
        if values[i] < threshold {
            result[i / 64] |= 1u64 << (i % 64);
        }
    }
}
}
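
For reference, the SIMD kernel above is equivalent to this scalar version (same bitmap output, no wide dependency); the name filter_lt_i64_scalar is illustrative:

```rust
// Scalar reference version of filter_lt_i64: sets bit i of the
// result bitmap when values[i] < threshold.
fn filter_lt_i64_scalar(values: &[i64], threshold: i64, result: &mut [u64]) {
    for (i, &v) in values.iter().enumerate() {
        if v < threshold {
            result[i / 64] |= 1u64 << (i % 64);
        }
    }
}

fn main() {
    let values = [10, 50, 3, 99, 7];
    let mut bitmap = vec![0u64; (values.len() + 63) / 64];
    filter_lt_i64_scalar(&values, 20, &mut bitmap);
    // Rows 0 (10), 2 (3), and 4 (7) are below the threshold.
    assert_eq!(bitmap[0], 0b10101);
    println!("{:#b}", bitmap[0]);
}
```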

Available SIMD Filter Functions:

| Function      | Operation             | Types |
|---------------|-----------------------|-------|
| filter_lt_i64 | Less than             | i64   |
| filter_le_i64 | Less than or equal    | i64   |
| filter_gt_i64 | Greater than          | i64   |
| filter_ge_i64 | Greater than or equal | i64   |
| filter_eq_i64 | Equal                 | i64   |
| filter_ne_i64 | Not equal             | i64   |
| filter_lt_f64 | Less than             | f64   |
| filter_gt_f64 | Greater than          | f64   |
| filter_eq_f64 | Equal (with epsilon)  | f64   |

Bitmap Operations:

#![allow(unused)]
fn main() {
// AND two selection bitmaps
pub fn bitmap_and(a: &[u64], b: &[u64], result: &mut [u64])

// OR two selection bitmaps
pub fn bitmap_or(a: &[u64], b: &[u64], result: &mut [u64])

// Count set bits
pub fn popcount(bitmap: &[u64]) -> usize

// Extract selected indices
pub fn selected_indices(bitmap: &[u64], max_count: usize) -> Vec<usize>
}
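
These helpers are word-wise loops at heart; a std-only sketch with the same shapes (scalar code, illustrative rather than the engine's implementation):

```rust
// Word-wise AND of two selection bitmaps.
fn bitmap_and(a: &[u64], b: &[u64], result: &mut [u64]) {
    for ((r, &x), &y) in result.iter_mut().zip(a).zip(b) {
        *r = x & y;
    }
}

// Count set bits across all words.
fn popcount(bitmap: &[u64]) -> usize {
    bitmap.iter().map(|w| w.count_ones() as usize).sum()
}

// Extract up to max_count selected row indices.
fn selected_indices(bitmap: &[u64], max_count: usize) -> Vec<usize> {
    let mut out = Vec::new();
    for (word_idx, &word) in bitmap.iter().enumerate() {
        for bit in 0..64 {
            if word & (1u64 << bit) != 0 {
                out.push(word_idx * 64 + bit);
                if out.len() == max_count {
                    return out;
                }
            }
        }
    }
    out
}

fn main() {
    let a = [0b1011u64];
    let b = [0b0110u64];
    let mut r = [0u64];
    bitmap_and(&a, &b, &mut r);
    assert_eq!(r[0], 0b0010);          // rows selected by both filters
    assert_eq!(popcount(&a), 3);       // rows 0, 1, 3
    assert_eq!(selected_indices(&a, 2), vec![0, 1]); // capped at 2
    println!("{:?}", selected_indices(&a, 10));
}
```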

Selection Vectors

Query results use bitmap-based selection vectors to avoid copying data:

#![allow(unused)]
fn main() {
pub struct SelectionVector {
    bitmap: Vec<u64>,  // Packed bits indicating selected rows
    row_count: usize,
}

impl SelectionVector {
    // Create selection of all rows
    pub fn all(row_count: usize) -> Self;

    // Create empty selection
    pub fn none(row_count: usize) -> Self;

    // Check if row is selected
    pub fn is_selected(&self, idx: usize) -> bool;

    // Count selected rows
    pub fn count(&self) -> usize;

    // AND two selections (intersection)
    pub fn intersect(&self, other: &SelectionVector) -> SelectionVector;

    // OR two selections (union)
    pub fn union(&self, other: &SelectionVector) -> SelectionVector;
}
}

Columnar Select API

#![allow(unused)]
fn main() {
// Materialize columns for SIMD filtering
engine.materialize_columns("users", &["age", "name"])?;

// Check if columnar data exists
engine.has_columnar_data("users", "age");  // -> bool

// Select with columnar scan options
let options = ColumnarScanOptions {
    projection: Some(vec!["name".into()]),  // Only return these columns
    prefer_columnar: true,                   // Use SIMD when available
};

let rows = engine.select_columnar(
    "users",
    Condition::Gt("age".into(), Value::Int(50)),
    options
)?;

// Drop columnar data
engine.drop_columnar_data("users", "age")?;
}

Condition Evaluation

Two evaluation methods are available:

| Method | Input | Performance | Use Case |
|--------|-------|-------------|----------|
| evaluate(&row) | Row struct | Legacy, creates intermediate objects | Row-by-row filtering |
| evaluate_tensor(&tensor) | TensorData | 31% faster, no intermediate allocation | Direct tensor filtering |

The engine automatically chooses the optimal evaluation path:

flowchart TD
    Cond[Condition] --> CheckColumnar{Columnar Data Available?}
    CheckColumnar -->|Yes| CheckType{Int Column?}
    CheckColumnar -->|No| RowEval[evaluate_tensor per row]

    CheckType -->|Yes| SIMDEval[SIMD Vectorized Filter]
    CheckType -->|No| RowEval

    SIMDEval --> Bitmap[Selection Bitmap]
    RowEval --> Filter[Filter Matching Rows]

    Bitmap --> Materialize[Materialize Results]
    Filter --> Materialize

Join Algorithm Implementations

Hash Join (INNER, LEFT, RIGHT, FULL)

All equality joins use the hash join algorithm with O(n+m) complexity:

flowchart LR
    subgraph "Build Phase"
        RightTable[Right Table] --> BuildHash[Build Hash Index]
        BuildHash --> HashIndex["HashMap<hash, Vec<idx>>"]
    end

    subgraph "Probe Phase"
        LeftTable[Left Table] --> Probe[Probe Hash Index]
        Probe --> HashIndex
        HashIndex --> Match[Find Matching Rows]
    end

    Match --> Results[Join Results]

Hash Join Implementation:

#![allow(unused)]
fn main() {
pub fn join(&self, table_a: &str, table_b: &str,
            on_a: &str, on_b: &str) -> Result<Vec<(Row, Row)>> {
    let rows_a = self.select(table_a, Condition::True)?;
    let rows_b = self.select(table_b, Condition::True)?;

    // Build phase: index the right table
    let mut index: HashMap<String, Vec<usize>> = HashMap::with_capacity(rows_b.len());
    for (i, row) in rows_b.iter().enumerate() {
        if let Some(val) = row.get_with_id(on_b) {
            let hash = val.hash_key();
            index.entry(hash).or_default().push(i);
        }
    }

    // Probe phase: scan left table and probe index
    let mut results = Vec::with_capacity(min(rows_a.len(), rows_b.len()));
    for row_a in &rows_a {
        if let Some(val) = row_a.get_with_id(on_a) {
            let hash = val.hash_key();
            if let Some(indices) = index.get(&hash) {
                for &i in indices {
                    let row_b = &rows_b[i];
                    // Verify actual equality (handles hash collisions)
                    if row_b.get_with_id(on_b).as_ref() == Some(&val) {
                        results.push((row_a.clone(), row_b.clone()));
                    }
                }
            }
        }
    }
    Ok(results)
}
}
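
Stripped of engine types, the build/probe structure reduces to the following self-contained sketch over (key, payload) slices; names and data are illustrative:

```rust
use std::collections::HashMap;

// Hash join: build an index over the right side, probe with the left.
fn hash_join<'a>(
    left: &'a [(u64, &'a str)],
    right: &'a [(u64, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    // Build phase: key -> indices into `right`
    let mut index: HashMap<u64, Vec<usize>> = HashMap::with_capacity(right.len());
    for (i, (key, _)) in right.iter().enumerate() {
        index.entry(*key).or_default().push(i);
    }
    // Probe phase: one lookup per left row
    let mut out = Vec::new();
    for (key, l) in left {
        if let Some(idxs) = index.get(key) {
            for &i in idxs {
                out.push((*l, right[i].1));
            }
        }
    }
    out
}

fn main() {
    let users = [(1, "alice"), (2, "bob"), (3, "carol")];
    let posts = [(1, "intro"), (1, "update"), (3, "notes"), (9, "orphan")];
    let joined = hash_join(&posts, &users);
    // Each post with a matching user id produces one pair; id 9 matches nothing.
    assert_eq!(joined, vec![("intro", "alice"), ("update", "alice"), ("notes", "carol")]);
    println!("{:?}", joined);
}
```

With integer keys, probing by the key itself replaces the hash_key()/equality-recheck pair in the real implementation; the phases are otherwise the same.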

Parallel Join Optimization:

When the left table exceeds PARALLEL_THRESHOLD (1000 rows), joins use Rayon for parallel probing:

#![allow(unused)]
fn main() {
if rows_a.len() >= Self::PARALLEL_THRESHOLD {
    rows_a.par_iter()
        .flat_map(|row_a| {
            // Parallel probe of hash index
        })
        .collect()
}
}

Natural Join

Natural join finds all common column names and joins on their equality:

#![allow(unused)]
fn main() {
pub fn natural_join(&self, table_a: &str, table_b: &str) -> Result<Vec<(Row, Row)>> {
    let schema_a = self.get_schema(table_a)?;
    let schema_b = self.get_schema(table_b)?;

    // Find common columns
    let cols_a: HashSet<_> = schema_a.columns.iter().map(|c| c.name.as_str()).collect();
    let cols_b: HashSet<_> = schema_b.columns.iter().map(|c| c.name.as_str()).collect();
    let common_cols: Vec<_> = cols_a.intersection(&cols_b).copied().collect();

    // No common columns = cross join
    if common_cols.is_empty() {
        return self.cross_join(table_a, table_b);
    }

    // Build composite hash key from all common columns
    // ...
}
}

Aggregate Function Internals

Parallel Aggregation

For tables exceeding PARALLEL_THRESHOLD (1000 rows), aggregates use parallel reduction:

#![allow(unused)]
fn main() {
pub fn avg(&self, table: &str, column: &str, condition: Condition) -> Result<Option<f64>> {
    let rows = self.select(table, condition)?;

    let (total, count) = if rows.len() >= Self::PARALLEL_THRESHOLD {
        // Parallel map-reduce
        rows.par_iter()
            .map(|row| extract_numeric(row, column))
            .reduce(|| (0.0, 0u64), |(s1, c1), (s2, c2)| (s1 + s2, c1 + c2))
    } else {
        // Sequential accumulation
        let mut total = 0.0;
        let mut count = 0u64;
        for row in &rows {
            // accumulate...
        }
        (total, count)
    };

    if count == 0 { Ok(None) } else { Ok(Some(total / count as f64)) }
}
}

MIN/MAX with Parallel Reduction

#![allow(unused)]
fn main() {
pub fn min(&self, table: &str, column: &str, condition: Condition) -> Result<Option<Value>> {
    let rows = self.select(table, condition)?;

    let best = if rows.len() >= Self::PARALLEL_THRESHOLD {
        rows.par_iter()
            .filter_map(|row| row.get(column).filter(|v| !matches!(v, Value::Null)))
            .reduce_with(|a, b| {
                if a.partial_cmp_value(&b) == Some(Ordering::Less) { a } else { b }
            })
    } else {
        // Sequential scan over rows (elided)
        None
    };
    Ok(best)
}
}

Performance Characteristics

| Operation | Complexity | Notes |
|-----------|------------|-------|
| insert | O(1) + O(k) | Schema validation + store put + k index updates |
| batch_insert | O(n) + O(n*k) | Single schema lookup, 59x faster than n inserts |
| select (no index) | O(n) | Full table scan with SIMD filter |
| select (hash index) | O(1) | Direct lookup via hash index |
| select (btree range) | O(log n + m) | B-tree lookup + m matching rows |
| update | O(n) + O(k) | Scan + conditional update + index maintenance |
| delete_rows | O(n) + O(k) | Scan + conditional delete + index removal |
| join | O(n+m) | Hash join for all 6 join types |
| cross_join | O(n*m) | Cartesian product |
| count/sum/avg/min/max | O(n) | Single pass over matching rows |
| create_index | O(n) | Scan all rows to build index |
| materialize_columns | O(n) | Extract column to contiguous array |

Where k = the number of indexes on the table, n = the number of rows scanned (the left table for joins), and m = the number of matching rows (the right table for joins).

Parallel Threshold

Operations automatically switch to parallel execution when row count exceeds PARALLEL_THRESHOLD:

#![allow(unused)]
fn main() {
impl RelationalEngine {
    const PARALLEL_THRESHOLD: usize = 1000;
}
}

Parallel Operations:

  • delete_rows (parallel deletion via Rayon)
  • join (parallel probe phase)
  • sum, avg, min, max (parallel reduction)

Configuration

RelationalConfig

The engine can be configured with RelationalConfig:

#![allow(unused)]
fn main() {
let config = RelationalConfig {
    max_tables: Some(1000),              // Maximum tables allowed
    max_indexes_per_table: Some(10),     // Maximum indexes per table
    max_btree_entries: 10_000_000,       // Maximum B-tree index entries
    default_query_timeout_ms: Some(5000),// Default query timeout
    max_query_timeout_ms: Some(300_000), // Maximum allowed timeout (5 min)
    slow_query_threshold_ms: 100,        // Slow query warning threshold
    max_query_result_rows: Some(10_000), // Maximum rows per query
    transaction_timeout_secs: 60,        // Transaction timeout
    lock_timeout_secs: 30,               // Lock acquisition timeout
};
let engine = RelationalEngine::with_config(config);
}

| Option | Default | Description |
|--------|---------|-------------|
| max_tables | None (unlimited) | Maximum number of tables |
| max_indexes_per_table | None (unlimited) | Maximum indexes per table |
| max_btree_entries | 10,000,000 | Maximum B-tree index entries total |
| default_query_timeout_ms | None | Default timeout for queries |
| max_query_timeout_ms | 300,000 (5 min) | Maximum allowed query timeout |
| slow_query_threshold_ms | 100 | Threshold for slow query warnings |
| max_query_result_rows | None (unlimited) | Maximum rows returned per query |
| transaction_timeout_secs | 60 | Transaction timeout |
| lock_timeout_secs | 30 | Lock acquisition timeout |

Internal Constants

| Constant | Value | Description |
|----------|-------|-------------|
| PARALLEL_THRESHOLD | 1000 | Minimum rows for parallel operations |
| Null bitmap sparse threshold | 10% | Use sparse bitmap when nulls < 10% |
| SIMD vector width | 4 | i64x4/f64x4 operations |

Observability

The observability module provides query metrics, slow query detection, and index usage tracking.

Query Metrics

#![allow(unused)]
fn main() {
use relational_engine::observability::{QueryMetrics, check_slow_query};
use std::time::Duration;

let metrics = QueryMetrics::new("users", "select")
    .with_rows_scanned(10000)
    .with_rows_returned(50)
    .with_index("idx_user_id")
    .with_duration(Duration::from_millis(25));

// Log warning if query exceeds threshold
check_slow_query(&metrics, 100); // threshold in ms
}

Index Tracking

Track index usage to identify missing indexes:

#![allow(unused)]
fn main() {
use relational_engine::observability::IndexTracker;

let tracker = IndexTracker::new();

// Record when index is used
tracker.record_hit("users", "id");

// Record when index could have been used but wasn't
tracker.record_miss("users", "email");

// Get reports of columns needing indexes
let reports = tracker.report_misses();
for report in reports {
    println!(
        "Table {}, column {}: {} misses, {} hits",
        report.table, report.column, report.miss_count, report.hit_count
    );
}

// Aggregate statistics
let total_hits = tracker.total_hits();
let total_misses = tracker.total_misses();
}

Slow Query Warnings

The check_slow_query function logs a tracing::warn! when queries exceed the threshold:

#![allow(unused)]
fn main() {
use relational_engine::observability::{check_slow_query, warn_full_table_scan};

// Warn if query took > 100ms
check_slow_query(&metrics, 100);

// Warn about full table scans on large tables (> 1000 rows)
warn_full_table_scan("users", "select", 5000);
}

Streaming Cursor API

For large result sets, use streaming cursors to avoid loading all rows into memory at once. The cursor fetches rows in configurable batches.

Basic Usage

#![allow(unused)]
fn main() {
use relational_engine::{StreamingCursor, Condition};

// Create streaming cursor with default batch size (1000)
let cursor = engine.select_streaming("users", Condition::True);

// Iterate over results
for row_result in cursor {
    let row = row_result?;
    println!("User: {:?}", row);
}
}

Custom Options

#![allow(unused)]
fn main() {
// With custom batch size
let cursor = engine.select_streaming("users", Condition::True)
    .with_batch_size(100)
    .with_max_rows(5000);

// Using the builder
let cursor = engine.select_streaming_builder("users", Condition::True)
    .batch_size(100)
    .max_rows(5000)
    .build();

// Check cursor state
let mut cursor = engine.select_streaming("users", Condition::True);
while let Some(row) = cursor.next() {
    println!("Yielded so far: {}", cursor.rows_yielded());
}
println!("Exhausted: {}", cursor.is_exhausted());
}

Cursor Methods

| Method | Description |
|--------|-------------|
| with_batch_size(n) | Set rows fetched per batch (default: 1000) |
| with_max_rows(n) | Limit total rows returned |
| rows_yielded() | Number of rows returned so far |
| is_exhausted() | Whether cursor has no more rows |

Edge Cases and Gotchas

NULL Handling

  1. NULL in conditions: Comparisons with NULL columns return false:

    #![allow(unused)]
    fn main() {
    // If email is NULL, this returns false (not true!)
    Condition::Lt("email".into(), Value::String("z".into()))
    }

  2. NULL in joins: NULL values never match in join conditions:

    #![allow(unused)]
    fn main() {
    // A post with user_id = NULL will not join with any user
    engine.join("users", "posts", "_id", "user_id");
    }

  3. COUNT(*) vs COUNT(column):
    • count() counts all rows
    • count_column() counts non-null values only

Type Mismatches

Comparisons between incompatible types return false rather than error:

#![allow(unused)]
fn main() {
// Age is Int, comparing with String returns 0 matches (not error)
engine.select("users", Condition::Lt("age".into(), Value::String("30".into())));
}

Index Maintenance

Indexes are automatically maintained on INSERT, UPDATE, and DELETE:

#![allow(unused)]
fn main() {
// Creating index AFTER data exists
engine.insert("users", values)?;  // No index update
engine.create_index("users", "age")?;  // Scans all rows

// Creating index BEFORE data exists
engine.create_index("users", "age")?;  // Empty index
engine.insert("users", values)?;  // Updates index
}

Batch Insert Atomicity

batch_insert validates ALL rows upfront before inserting any:

#![allow(unused)]
fn main() {
let rows = vec![valid_row, invalid_row];
// Fails on validation - NO rows inserted (not partial insert)
engine.batch_insert("users", rows);
}

B-Tree Index Recovery

B-tree indexes maintain both in-memory and persistent state. The in-memory BTreeMap is rebuilt lazily on first access after restart.

Best Practices

Index Selection

| Query Pattern | Recommended Index |
|---------------|-------------------|
| WHERE col = value | Hash Index |
| WHERE col > value | B-Tree Index |
| WHERE col BETWEEN a AND b | B-Tree Index |
| WHERE col IN (...) | Hash Index |
| Unique lookups by ID | Hash Index on _id |

Columnar Materialization

Materialize columns when:

  • Performing many range scans on large tables
  • Query selectivity is low (scanning most rows)
  • Column data fits in memory
#![allow(unused)]
fn main() {
// Good: Materialize frequently-filtered columns
engine.materialize_columns("events", &["timestamp", "user_id"])?;

// Query uses SIMD acceleration
engine.select_columnar("events",
    Condition::Gt("timestamp".into(), Value::Int(cutoff)),
    ColumnarScanOptions { projection: None, prefer_columnar: true }
)?;
}

Batch Operations

Use batch_insert for bulk loading:

#![allow(unused)]
fn main() {
// Bad: 1000 individual inserts
for row in rows {
    engine.insert("table", row)?;  // 1000 schema lookups
}

// Good: Single batch insert
engine.batch_insert("table", rows)?;  // 1 schema lookup, 59x faster
}

SQL Features via Query Router

When using the relational engine through query_router, additional SQL features are available:

ORDER BY and OFFSET

SELECT * FROM users ORDER BY age ASC;
SELECT * FROM users ORDER BY department DESC, name ASC;
SELECT * FROM users ORDER BY email NULLS FIRST;
SELECT * FROM users ORDER BY created_at DESC LIMIT 10 OFFSET 20;

GROUP BY and HAVING

SELECT department, COUNT(*), AVG(salary) FROM employees GROUP BY department;

SELECT product, SUM(quantity) as total
FROM orders
GROUP BY product
HAVING SUM(quantity) > 100;

Related Modules

| Module | Relationship |
|--------|--------------|
| tensor_store | Storage backend for tables, rows, and indexes |
| query_router | Executes SQL queries using RelationalEngine |
| neumann_parser | Parses SQL statements into AST |
| tensor_unified | Multi-engine unified storage layer |

Feature Summary

Implemented

| Feature | Description |
|---------|-------------|
| Hash indexes | O(1) equality lookups |
| B-tree indexes | O(log n) range query acceleration |
| All 6 JOIN types | INNER, LEFT, RIGHT, FULL, CROSS, NATURAL |
| Aggregate functions | COUNT, SUM, AVG, MIN, MAX |
| ORDER BY | Multi-column sorting with ASC/DESC, NULLS FIRST/LAST |
| LIMIT/OFFSET | Pagination support |
| GROUP BY + HAVING | Row grouping with aggregate filtering |
| Columnar storage | SIMD-accelerated filtering with selection vectors |
| Batch operations | 59x faster bulk inserts |
| Parallel operations | Rayon-based parallelism for large tables |
| Dictionary encoding | String column compression |
| Transactions | Row-level ACID with undo log - see Transactions |
| Constraints | PRIMARY KEY, UNIQUE, FOREIGN KEY, NOT NULL |
| Foreign Keys | Full referential integrity with CASCADE/SET NULL/RESTRICT |
| ALTER TABLE | add_column, drop_column, rename_column |
| Streaming cursors | Memory-efficient iteration over large result sets |
| Observability | Query metrics, slow query detection, index tracking |

Future Considerations

| Feature | Status |
|---------|--------|
| Query Optimization | Not implemented |
| Subqueries | Not implemented |
| Window Functions | Not implemented |
| Composite Indexes | Not implemented |

Relational Engine Transactions

Local ACID transactions for single-shard relational operations. This module provides row-level locking, undo logging for rollback, and timeout-based deadlock prevention.

Transactions in the relational engine operate within a single shard and do not coordinate with other nodes. For distributed transactions across multiple shards, see Distributed Transactions.

Architecture

flowchart TD
    subgraph TransactionManager
        TxMap[Transaction Map]
        TxCounter[TX Counter]
        DefaultTimeout[Default Timeout]
    end

    subgraph RowLockManager
        LockMap["Locks: (table, row_id) -> RowLock"]
        TxLocks["TX Locks: tx_id -> Vec<(table, row_id)>"]
        LockTimeout[Lock Timeout]
    end

    subgraph Transaction
        TxId[Transaction ID]
        Phase[TxPhase State]
        UndoLog["Undo Log: Vec<UndoEntry>"]
        AffectedTables[Affected Tables]
        StartTime[Started At]
    end

    TransactionManager --> RowLockManager
    TransactionManager --> Transaction
    Transaction --> UndoLog

Component Relationships

+-------------------+
| RelationalEngine  |
+-------------------+
         |
         v
+-------------------+     +------------------+
| TransactionManager| --> | RowLockManager   |
| - begin()         |     | - try_lock()     |
| - commit()        |     | - release()      |
| - rollback()      |     | - cleanup_expired|
+-------------------+     +------------------+
         |
         v
+-------------------+
| Transaction       |
| - tx_id           |
| - phase           |
| - undo_log        |
+-------------------+

Transaction Lifecycle

Transactions follow a 5-state machine with well-defined transitions:

flowchart LR
    Active --> Committing
    Active --> Aborting
    Committing --> Committed
    Aborting --> Aborted

    style Committed fill:#9f9
    style Aborted fill:#f99

| Phase | Description | Valid Transitions |
|-------|-------------|-------------------|
| Active | Operations allowed, locks acquired | Committing, Aborting |
| Committing | Finalizing changes | Committed |
| Committed | Changes permanent (terminal) | None |
| Aborting | Rolling back via undo log | Aborted |
| Aborted | Changes reverted (terminal) | None |

Lifecycle Flow

#![allow(unused)]
fn main() {
// Begin transaction
let tx_id = tx_manager.begin();
// Phase: Active

// Acquire row locks for modifications
lock_manager.try_lock(tx_id, &[("users", 1), ("users", 2)])?;

// Record undo entries for rollback
tx_manager.record_undo(tx_id, UndoEntry::UpdatedRow { ... });

// Option 1: Commit
tx_manager.set_phase(tx_id, TxPhase::Committing);
// Apply changes...
tx_manager.set_phase(tx_id, TxPhase::Committed);
tx_manager.release_locks(tx_id);

// Option 2: Rollback
tx_manager.set_phase(tx_id, TxPhase::Aborting);
for entry in tx_manager.get_undo_log(tx_id).iter().rev() {
    // Apply undo entry...
}
tx_manager.set_phase(tx_id, TxPhase::Aborted);
tx_manager.release_locks(tx_id);
}

Undo Log

The undo log stores entries needed to reverse each operation on rollback. Entries are applied in reverse order during rollback.

UndoEntry Variants

| Variant | Action on Rollback | Stored Data |
|---------|--------------------|-------------|
| InsertedRow | Delete the row | table, slab_row_id, row_id, index_entries |
| UpdatedRow | Restore old values | table, slab_row_id, row_id, old_values, index_changes |
| DeletedRow | Re-insert the row | table, slab_row_id, row_id, old_values, index_entries |

Undo Log Structure

Transaction Undo Log:
+---------------------------------------------------+
| Entry 0: InsertedRow { table: "users", row_id: 5 }|
+---------------------------------------------------+
| Entry 1: UpdatedRow { table: "users", row_id: 3,  |
|          old_values: [Int(25)], ... }             |
+---------------------------------------------------+
| Entry 2: DeletedRow { table: "orders", row_id: 7, |
|          old_values: [...], index_entries: [...] }|
+---------------------------------------------------+

Rollback order: Entry 2 -> Entry 1 -> Entry 0
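
A self-contained sketch of reverse undo application over a toy table (a HashMap stands in for the store, and the variants carry only the fields needed here):

```rust
use std::collections::HashMap;

// Simplified undo entries: just enough to reverse each operation.
enum UndoEntry {
    InsertedRow { row_id: u64 },
    UpdatedRow { row_id: u64, old_value: i64 },
    DeletedRow { row_id: u64, old_value: i64 },
}

fn rollback(table: &mut HashMap<u64, i64>, undo_log: &[UndoEntry]) {
    // Entries are applied newest-first.
    for entry in undo_log.iter().rev() {
        match entry {
            UndoEntry::InsertedRow { row_id } => { table.remove(row_id); }
            UndoEntry::UpdatedRow { row_id, old_value } => { table.insert(*row_id, *old_value); }
            UndoEntry::DeletedRow { row_id, old_value } => { table.insert(*row_id, *old_value); }
        }
    }
}

fn main() {
    let mut table: HashMap<u64, i64> = HashMap::from([(3, 25), (7, 99)]);

    // Transaction: insert row 5, update row 3 (25 -> 30), delete row 7.
    table.insert(5, 1);
    table.insert(3, 30);
    table.remove(&7);
    let log = vec![
        UndoEntry::InsertedRow { row_id: 5 },
        UndoEntry::UpdatedRow { row_id: 3, old_value: 25 },
        UndoEntry::DeletedRow { row_id: 7, old_value: 99 },
    ];

    rollback(&mut table, &log);
    // All three effects are reversed.
    assert_eq!(table, HashMap::from([(3, 25), (7, 99)]));
    println!("{:?}", table);
}
```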

Index Change Tracking

Updates that modify indexed columns record IndexChange entries:

#![allow(unused)]
fn main() {
pub struct IndexChange {
    pub column: String,     // Column name
    pub old_value: Value,   // Value before update
    pub new_value: Value,   // Value after update
}
}

On rollback, index entries are reverted:

  1. Remove new index entry for the new value
  2. Restore old index entry for the old value

Row-Level Locking

The RowLockManager provides pessimistic row-level locking with atomic multi-row acquisition.

Lock Acquisition

flowchart TD
    Request[Lock Request] --> Check{All rows available?}
    Check -->|Yes| Acquire[Acquire all locks atomically]
    Check -->|No| Conflict[Return LockConflictInfo]
    Acquire --> Success[Success]

    style Conflict fill:#f99
    style Success fill:#9f9

Locks are acquired atomically to prevent partial lock acquisition:

#![allow(unused)]
fn main() {
// Atomic multi-row locking
let rows = vec![
    ("users".to_string(), 1),
    ("users".to_string(), 2),
    ("orders".to_string(), 5),
];

match lock_manager.try_lock(tx_id, &rows) {
    Ok(()) => {
        // All locks acquired
    }
    Err(conflict) => {
        // No locks acquired, conflict info provided
        println!("Blocked by tx {}", conflict.blocking_tx);
    }
}
}

Lock Semantics

| Property | Behavior |
|----------|----------|
| Granularity | Row-level: (table, row_id) |
| Acquisition | All-or-nothing atomic |
| Re-entry | Same transaction can re-acquire its locks |
| Timeout | Configurable, default 30 seconds |
| Expiration | Expired locks are treated as available |
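
The all-or-nothing acquisition rule falls out of a check-then-acquire two-pass loop. A minimal single-threaded sketch (simplified types, no timeouts or expiration):

```rust
use std::collections::HashMap;

type RowKey = (String, u64);

// Minimal all-or-nothing row lock table: (table, row_id) -> holding tx.
struct LockTable {
    locks: HashMap<RowKey, u64>,
}

impl LockTable {
    fn new() -> Self { Self { locks: HashMap::new() } }

    /// Acquire every requested lock or none; re-entrant for the same tx.
    fn try_lock(&mut self, tx_id: u64, rows: &[RowKey]) -> Result<(), RowKey> {
        // Pass 1: check availability before touching anything.
        for key in rows {
            if let Some(&holder) = self.locks.get(key) {
                if holder != tx_id {
                    return Err(key.clone()); // conflict: nothing acquired
                }
            }
        }
        // Pass 2: all free (or already ours), acquire together.
        for key in rows {
            self.locks.insert(key.clone(), tx_id);
        }
        Ok(())
    }
}

fn main() {
    let mut lt = LockTable::new();
    let rows = vec![("users".to_string(), 1), ("users".to_string(), 2)];
    assert!(lt.try_lock(10, &rows).is_ok());
    // Re-entry by the same transaction succeeds.
    assert!(lt.try_lock(10, &rows).is_ok());
    // Another transaction conflicts and acquires nothing.
    let mixed = vec![("orders".to_string(), 5), ("users".to_string(), 2)];
    assert_eq!(lt.try_lock(11, &mixed), Err(("users".to_string(), 2)));
    assert!(!lt.locks.contains_key(&("orders".to_string(), 5)));
    println!("locks held: {}", lt.locks.len());
}
```

In the real manager the two passes happen under one lock-map guard, which is what makes the multi-row acquisition atomic with respect to concurrent transactions.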

Lock Conflict Detection

When a lock conflict occurs, LockConflictInfo provides details:

#![allow(unused)]
fn main() {
pub struct LockConflictInfo {
    pub blocking_tx: u64,   // Transaction holding the lock
    pub table: String,      // Table name
    pub row_id: u64,        // Row ID
}
}

Expired Lock Handling

Locks automatically expire after the configured timeout:

#![allow(unused)]
fn main() {
// Check if lock is expired
if lock.is_expired() {
    // Lock can be acquired by another transaction
}

// Periodic cleanup of expired locks
let cleaned = lock_manager.cleanup_expired();
}

Deadline

The Deadline struct provides monotonic time-based timeout checking:

#![allow(unused)]
fn main() {
// Create deadline with timeout
let deadline = Deadline::from_timeout_ms(Some(5000));

// Check expiration
if deadline.is_expired() {
    return Err(TimeoutError);
}

// Get remaining time
if let Some(remaining) = deadline.remaining_ms() {
    println!("{} ms remaining", remaining);
}

// Never-expiring deadline
let no_deadline = Deadline::never();
}

Benefits of monotonic time (Instant):

  • Immune to system clock changes
  • Consistent timeout behavior
  • No backwards time jumps
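
A std-only sketch of such a monotonic deadline (illustrative, not necessarily the engine's exact type):

```rust
use std::time::{Duration, Instant};

// Monotonic deadline: immune to wall-clock adjustments.
struct Deadline {
    expires_at: Option<Instant>, // None = never expires
}

impl Deadline {
    fn from_timeout_ms(timeout_ms: Option<u64>) -> Self {
        Self { expires_at: timeout_ms.map(|ms| Instant::now() + Duration::from_millis(ms)) }
    }

    fn never() -> Self { Self { expires_at: None } }

    fn is_expired(&self) -> bool {
        self.expires_at.map_or(false, |t| Instant::now() >= t)
    }

    fn remaining_ms(&self) -> Option<u128> {
        self.expires_at.map(|t| t.saturating_duration_since(Instant::now()).as_millis())
    }
}

fn main() {
    let d = Deadline::from_timeout_ms(Some(0));
    assert!(d.is_expired()); // zero timeout expires immediately

    let d = Deadline::from_timeout_ms(Some(60_000));
    assert!(!d.is_expired());
    assert!(d.remaining_ms().unwrap() <= 60_000);

    let never = Deadline::never();
    assert!(!never.is_expired());
    assert_eq!(never.remaining_ms(), None);
    println!("ok");
}
```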

Configuration

Transaction Manager

| Parameter | Default | Description |
|-----------|---------|-------------|
| default_timeout | 60 seconds | Maximum transaction duration |

Row Lock Manager

| Parameter | Default | Description |
|-----------|---------|-------------|
| default_timeout | 30 seconds | Maximum time to hold a lock |

Custom Configuration

#![allow(unused)]
fn main() {
use std::time::Duration;

// Custom transaction timeout
let tx_manager = TransactionManager::with_timeout(
    Duration::from_secs(120)
);

// Custom lock timeout
let lock_manager = RowLockManager::with_default_timeout(
    Duration::from_secs(60)
);
}

Error Handling

Lock Errors

| Error | Cause | Recovery |
|-------|-------|----------|
| LockConflict | Row locked by another transaction | Retry with exponential backoff |
| Lock timeout | Could not acquire lock in time | Rollback and retry |

Transaction Errors

| Error | Cause | Recovery |
|-------|-------|----------|
| Transaction not found | Invalid transaction ID | Start new transaction |
| Transaction expired | Exceeded timeout | Transaction auto-aborted |
| Invalid phase transition | Illegal state change | Check transaction state |

Cleanup Operations

#![allow(unused)]
fn main() {
// Clean up expired transactions (releases their locks)
let expired_count = tx_manager.cleanup_expired();

// Clean up expired locks only
let expired_locks = lock_manager.cleanup_expired();
}

Comparison with Distributed Transactions

| Aspect | Relational Tx | Distributed Tx (2PC) |
|--------|---------------|----------------------|
| Scope | Single shard | Cross-shard |
| Protocol | Local locking | Prepare/Commit phases |
| Deadlock detection | Timeout-based | Wait-for graph analysis |
| Coordinator | None | DistributedTxCoordinator |
| Recovery | Undo log | WAL + 2PC recovery |
| Latency | Low (local) | Higher (network round-trips) |
| Isolation | Row-level locks | Key-level locks |

For cross-shard transactions, use tensor_chain’s Distributed Transactions.

Usage Examples

Basic Transaction

#![allow(unused)]
fn main() {
let engine = RelationalEngine::new();
let tx_manager = engine.tx_manager();

// Begin transaction
let tx_id = tx_manager.begin();

// Perform operations (simplified)
// engine.insert_tx(tx_id, "users", values)?;
// engine.update_tx(tx_id, "users", condition, updates)?;

// Commit
tx_manager.set_phase(tx_id, TxPhase::Committing);
// Apply pending changes...
tx_manager.set_phase(tx_id, TxPhase::Committed);
tx_manager.release_locks(tx_id);
tx_manager.remove(tx_id);
}

Rollback on Error

#![allow(unused)]
fn main() {
let tx_id = tx_manager.begin();

match perform_operations(tx_id) {
    Ok(()) => {
        tx_manager.set_phase(tx_id, TxPhase::Committed);
    }
    Err(e) => {
        tx_manager.set_phase(tx_id, TxPhase::Aborting);

        // Apply undo log in reverse
        if let Some(undo_log) = tx_manager.get_undo_log(tx_id) {
            for entry in undo_log.iter().rev() {
                apply_undo(entry);
            }
        }

        tx_manager.set_phase(tx_id, TxPhase::Aborted);
    }
}

tx_manager.release_locks(tx_id);
tx_manager.remove(tx_id);
}

Checking Lock Status

#![allow(unused)]
fn main() {
let lock_manager = tx_manager.lock_manager();

// Check if row is locked
if lock_manager.is_locked("users", 42) {
    println!("Row is locked");
}

// Get lock holder
if let Some(holder_tx) = lock_manager.lock_holder("users", 42) {
    println!("Locked by transaction {}", holder_tx);
}

// Count active locks
println!("{} active locks", lock_manager.active_lock_count());
println!("{} locks held by tx {}", lock_manager.locks_held_by(tx_id), tx_id);
}

Key Types

| Type | Description |
|---|---|
| `TxPhase` | 5-state transaction phase enum |
| `Transaction` | Active transaction state |
| `UndoEntry` | Undo log entry for rollback |
| `IndexChange` | Index modification record |
| `RowLock` | Row lock with timeout |
| `RowLockManager` | Row-level lock manager |
| `LockConflictInfo` | Lock conflict details |
| `Deadline` | Monotonic timeout checker |
| `TransactionManager` | Transaction lifecycle manager |
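
The `Deadline` type above is a monotonic timeout checker. A minimal sketch of what such a type looks like built on `std::time::Instant` (the field and method names here are illustrative; the actual `transaction.rs` definition may differ):

```rust
use std::time::{Duration, Instant};

/// Monotonic deadline: based on Instant, so it is unaffected
/// by wall-clock adjustments (NTP steps, manual clock changes).
struct Deadline {
    end: Instant,
}

impl Deadline {
    fn new(timeout: Duration) -> Self {
        Deadline { end: Instant::now() + timeout }
    }

    /// True once the timeout has elapsed.
    fn expired(&self) -> bool {
        Instant::now() >= self.end
    }
}
```

A lock manager can poll `expired()` during cleanup passes such as `cleanup_expired()` to decide which locks to release.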

Source References

  • relational_engine/src/transaction.rs - Transaction implementation
  • relational_engine/src/lib.rs - Integration with RelationalEngine

Graph Engine

The Graph Engine provides graph operations on top of the Tensor Store. It implements a labeled property graph model with support for both directed and undirected edges, BFS traversals, and shortest path finding. The engine inherits thread safety from TensorStore and supports cross-engine unified entity connections.

Design Principles

| Principle | Description |
|---|---|
| Layered Architecture | Depends only on Tensor Store for persistence |
| Direction-Aware | Supports both directed and undirected edges |
| BFS Traversal | Breadth-first search for shortest paths |
| Cycle-Safe | Handles cyclic graphs without infinite loops via visited set |
| Unified Entities | Edges can connect shared entities across engines |
| Thread Safety | Inherits from Tensor Store’s DashMap (~16 shards) |
| Serializable Types | All types implement serde Serialize/Deserialize |
| Parallel Optimization | High-degree node deletion uses rayon for parallelism |

Key Types

Core Types

| Type | Description |
|---|---|
| `GraphEngine` | Main entry point for graph operations |
| `Node` | Graph node with id, label, and properties |
| `Edge` | Graph edge with from/to nodes, type, properties, and direction flag |
| `Path` | Result of path finding containing node and edge sequences |
| `Direction` | Edge traversal direction (Outgoing, Incoming, Both) |
| `PropertyValue` | Node/edge property values (Null, Int, Float, String, Bool) |
| `GraphError` | Error types for graph operations |

PropertyValue Variants

| Variant | Rust Type | Description |
|---|---|---|
| `Null` | | NULL value |
| `Int` | `i64` | 64-bit signed integer |
| `Float` | `f64` | 64-bit floating point |
| `String` | `String` | UTF-8 string |
| `Bool` | `bool` | Boolean |

Error Types

| Error | Cause |
|---|---|
| `NodeNotFound(u64)` | Node with given ID does not exist |
| `EdgeNotFound(u64)` | Edge with given ID does not exist |
| `PathNotFound` | No path exists between the specified nodes |
| `StorageError(String)` | Underlying Tensor Store error |

Architecture

graph TB
    subgraph GraphEngine
        GE[GraphEngine]
        NC[Node Counter<br/>AtomicU64]
        EC[Edge Counter<br/>AtomicU64]
    end

    subgraph Storage["Storage Model"]
        NM["node:{id}"]
        NO["node:{id}:out"]
        NI["node:{id}:in"]
        EM["edge:{id}"]
    end

    subgraph Operations
        CreateNode[create_node]
        CreateEdge[create_edge]
        Neighbors[neighbors]
        Traverse[traverse]
        FindPath[find_path]
    end

    GE --> NC
    GE --> EC
    GE --> TS[TensorStore]

    CreateNode --> NM
    CreateNode --> NO
    CreateNode --> NI
    CreateEdge --> EM
    CreateEdge --> NO
    CreateEdge --> NI

    Neighbors --> NO
    Neighbors --> NI
    Traverse --> Neighbors
    FindPath --> NO

Internal Architecture

GraphEngine Struct

#![allow(unused)]
fn main() {
pub struct GraphEngine {
    store: TensorStore,           // Underlying key-value storage
    node_counter: AtomicU64,      // Atomic counter for node IDs
    edge_counter: AtomicU64,      // Atomic counter for edge IDs
}
}

The engine uses atomic counters (SeqCst ordering) to generate unique IDs:

  • Node IDs start at 1 and increment monotonically
  • Edge IDs are separate from node IDs
  • Both counters support concurrent ID allocation
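The allocation pattern can be sketched with std atomics alone (a standalone illustration; the real counters live as fields on `GraphEngine` rather than as a static):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Counter starts at 1, so the first allocated node ID is 1.
static NODE_COUNTER: AtomicU64 = AtomicU64::new(1);

// fetch_add returns the previous value, so each concurrent caller
// receives a unique ID; SeqCst matches the ordering the engine uses.
fn next_node_id() -> u64 {
    NODE_COUNTER.fetch_add(1, Ordering::SeqCst)
}
```

Because `fetch_add` is a single atomic read-modify-write, no lock is needed around ID allocation.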

Key Generation Functions

#![allow(unused)]
fn main() {
fn node_key(id: u64) -> String { format!("node:{}", id) }
fn edge_key(id: u64) -> String { format!("edge:{}", id) }
fn outgoing_edges_key(node_id: u64) -> String { format!("node:{}:out", node_id) }
fn incoming_edges_key(node_id: u64) -> String { format!("node:{}:in", node_id) }
}

Storage Model

Nodes and edges are stored in Tensor Store using the following key patterns:

| Key Pattern | Content | TensorData Fields |
|---|---|---|
| `node:{id}` | Node data | `_id`, `_type="node"`, `_label`, user properties |
| `node:{id}:out` | List of outgoing edge IDs | `e{edge_id}` fields |
| `node:{id}:in` | List of incoming edge IDs | `e{edge_id}` fields |
| `edge:{id}` | Edge data | `_id`, `_type="edge"`, `_from`, `_to`, `_edge_type`, `_directed`, user properties |

Edge List Storage Format

Edge lists are stored as TensorData with dynamically named fields:

#![allow(unused)]
fn main() {
// Each edge ID stored as: "e{edge_id}" -> edge_id
tensor.set("e1", TensorValue::Scalar(ScalarValue::Int(1)));
tensor.set("e5", TensorValue::Scalar(ScalarValue::Int(5)));
}

This format allows O(1) edge addition but O(n) edge listing. The edge retrieval scans all keys starting with ‘e’:

#![allow(unused)]
fn main() {
fn get_edge_list(&self, key: &str) -> Result<Vec<u64>> {
    let tensor = self.store.get(key)?;
    let mut edges = Vec::new();
    for k in tensor.keys() {
        if k.starts_with('e') {
            if let Some(TensorValue::Scalar(ScalarValue::Int(id))) = tensor.get(k) {
                edges.push(*id as u64);
            }
        }
    }
    Ok(edges)
}
}

API Reference

Engine Construction

#![allow(unused)]
fn main() {
// Create new engine with internal store
let engine = GraphEngine::new();

// Create engine with shared store (for cross-engine queries)
let store = TensorStore::new();
let engine = GraphEngine::with_store(store.clone());

// Access underlying store
let store = engine.store();
}

Node Operations

#![allow(unused)]
fn main() {
// Create node with properties
let mut props = HashMap::new();
props.insert("name".to_string(), PropertyValue::String("Alice".into()));
props.insert("age".to_string(), PropertyValue::Int(30));
let id = engine.create_node("Person", props)?;

// Get node by ID
let node = engine.get_node(id)?;

// Check node existence
let exists = engine.node_exists(id);

// Delete node (cascades to connected edges)
engine.delete_node(id)?;

// Count nodes in graph
let count = engine.node_count();
}

Edge Operations

#![allow(unused)]
fn main() {
// Create directed edge
let edge_id = engine.create_edge(from, to, "KNOWS", properties, true)?;

// Create undirected edge
let edge_id = engine.create_edge(from, to, "FRIENDS", properties, false)?;

// Get edge by ID
let edge = engine.get_edge(edge_id)?;
}

Undirected Edge Implementation

When an undirected edge is created, it is added to four edge lists to enable bidirectional traversal:

#![allow(unused)]
fn main() {
if !directed {
    // Add to both nodes' outgoing AND incoming lists
    self.add_edge_to_list(Self::outgoing_edges_key(to), id)?;
    self.add_edge_to_list(Self::incoming_edges_key(from), id)?;
}
}

This enables undirected edges to be traversed from either endpoint regardless of direction filter.

Traversal Operations

#![allow(unused)]
fn main() {
// Get neighbors (all edge types, both directions)
let neighbors = engine.neighbors(node_id, None, Direction::Both)?;

// Get neighbors filtered by edge type
let friends = engine.neighbors(node_id, Some("FRIENDS"), Direction::Both)?;

// BFS traversal with depth limit
let nodes = engine.traverse(start_id, Direction::Outgoing, max_depth, None)?;

// Traversal filtered by edge type
let deps = engine.traverse(start_id, Direction::Outgoing, 10, Some("DEPENDS_ON"))?;

// Find shortest path (BFS)
let path = engine.find_path(from_id, to_id)?;
}

Direction Enum

| Direction | Behavior |
|---|---|
| `Outgoing` | Follow edges away from the node |
| `Incoming` | Follow edges toward the node |
| `Both` | Follow edges in either direction |

BFS Traversal Algorithm

The traverse method implements breadth-first search with depth limiting and cycle detection:

flowchart TD
    Start[Start: traverse] --> Init[Initialize visited set<br/>Initialize result vec<br/>Initialize queue with start, depth=0]
    Init --> Check{Queue empty?}
    Check -- No --> Pop[Pop current_id, depth]
    Pop --> GetNode[Get node, add to result]
    GetNode --> DepthCheck{depth >= max_depth?}
    DepthCheck -- Yes --> Check
    DepthCheck -- No --> GetNeighbors[Get neighbor IDs]
    GetNeighbors --> ForEach[For each neighbor]
    ForEach --> Visited{Already visited?}
    Visited -- Yes --> ForEach
    Visited -- No --> Add[Add to visited<br/>Push to queue with depth+1]
    Add --> ForEach
    Check -- Yes --> Return[Return result]

Implementation Details

#![allow(unused)]
fn main() {
pub fn traverse(
    &self,
    start: u64,
    direction: Direction,
    max_depth: usize,
    edge_type: Option<&str>,
) -> Result<Vec<Node>> {
    if !self.node_exists(start) {
        return Err(GraphError::NodeNotFound(start));
    }

    let mut visited = HashSet::new();
    let mut result = Vec::new();
    let mut queue = VecDeque::new();

    queue.push_back((start, 0usize));
    visited.insert(start);

    while let Some((current_id, depth)) = queue.pop_front() {
        if let Ok(node) = self.get_node(current_id) {
            result.push(node);
        }

        if depth >= max_depth {
            continue;
        }

        let neighbors = self.get_neighbor_ids(current_id, edge_type, direction)?;
        for neighbor_id in neighbors {
            if !visited.contains(&neighbor_id) {
                visited.insert(neighbor_id);
                queue.push_back((neighbor_id, depth + 1));
            }
        }
    }

    Ok(result)
}
}

Key Properties

  • Cycle-Safe: The visited HashSet prevents revisiting nodes
  • Depth-Limited: The max_depth parameter bounds traversal depth
  • Level-Order: BFS naturally visits nodes in level order
  • Start Node Included: The starting node is always in the result at depth 0
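
These properties can be checked on a toy adjacency list with the same visited-set and `(node, depth)` queue shape as `traverse` (a standalone sketch, independent of `GraphEngine`):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Depth-limited BFS mirroring traverse(): the visited set gives cycle
// safety, the (node, depth) queue gives level order, and depth >= max_depth
// stops expansion while still including the node itself in the result.
fn bfs(adj: &HashMap<u64, Vec<u64>>, start: u64, max_depth: usize) -> Vec<u64> {
    let mut visited = HashSet::from([start]);
    let mut result = Vec::new();
    let mut queue = VecDeque::from([(start, 0usize)]);

    while let Some((node, depth)) = queue.pop_front() {
        result.push(node);
        if depth >= max_depth {
            continue;
        }
        for &n in adj.get(&node).into_iter().flatten() {
            if visited.insert(n) {
                queue.push_back((n, depth + 1));
            }
        }
    }
    result
}
```

On the cycle 1 → 2 → 3 → 1, a traversal from node 1 terminates and visits each node exactly once.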

Shortest Path Algorithm

The find_path method uses BFS to find the shortest (minimum hop) path between two nodes:

flowchart TD
    Start[Start: find_path] --> Validate[Validate from and to exist]
    Validate --> SameNode{from == to?}
    SameNode -- Yes --> ReturnSingle[Return path with single node]
    SameNode -- No --> InitBFS[Initialize BFS:<br/>visited set<br/>queue with from<br/>parent map]
    InitBFS --> BFSLoop{Queue empty?}
    BFSLoop -- Yes --> NotFound[Return PathNotFound]
    BFSLoop -- No --> Dequeue[Dequeue current node]
    Dequeue --> GetEdges[Get outgoing edges]
    GetEdges --> ForEdge[For each edge]
    ForEdge --> GetNeighbor[Determine neighbor<br/>considering direction]
    GetNeighbor --> VisitedCheck{Visited?}
    VisitedCheck -- Yes --> ForEdge
    VisitedCheck -- No --> MarkVisited[Mark visited<br/>Record parent + edge]
    MarkVisited --> FoundTarget{neighbor == to?}
    FoundTarget -- Yes --> Reconstruct[Reconstruct path]
    FoundTarget -- No --> Enqueue[Enqueue neighbor]
    Enqueue --> ForEdge
    ForEdge --> BFSLoop
    Reconstruct --> Return[Return Path]

Implementation Details

#![allow(unused)]
fn main() {
pub fn find_path(&self, from: u64, to: u64) -> Result<Path> {
    // Validate endpoints exist
    if !self.node_exists(from) {
        return Err(GraphError::NodeNotFound(from));
    }
    if !self.node_exists(to) {
        return Err(GraphError::NodeNotFound(to));
    }

    // Handle trivial case
    if from == to {
        return Ok(Path {
            nodes: vec![from],
            edges: vec![],
        });
    }

    // BFS for shortest path
    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    let mut parent: HashMap<u64, (u64, u64)> = HashMap::new(); // node -> (parent_node, edge_id)

    queue.push_back(from);
    visited.insert(from);

    while let Some(current) = queue.pop_front() {
        let out_edges = self.get_edge_list(&Self::outgoing_edges_key(current))?;

        for edge_id in out_edges {
            if let Ok(edge) = self.get_edge(edge_id) {
                let neighbor = if edge.from == current {
                    edge.to
                } else if !edge.directed && edge.to == current {
                    edge.from
                } else {
                    continue;
                };

                if !visited.contains(&neighbor) {
                    visited.insert(neighbor);
                    parent.insert(neighbor, (current, edge_id));

                    if neighbor == to {
                        return Ok(self.reconstruct_path(from, to, &parent));
                    }

                    queue.push_back(neighbor);
                }
            }
        }
    }

    Err(GraphError::PathNotFound)
}
}

Path Reconstruction

The path is reconstructed by following parent pointers backwards from the target to the source:

#![allow(unused)]
fn main() {
fn reconstruct_path(&self, from: u64, to: u64, parent: &HashMap<u64, (u64, u64)>) -> Path {
    let mut nodes = Vec::new();
    let mut edges = Vec::new();
    let mut current = to;

    // Walk backwards from target to source
    while current != from {
        nodes.push(current);
        if let Some((p, edge_id)) = parent.get(&current) {
            edges.push(*edge_id);
            current = *p;
        } else {
            break;
        }
    }
    nodes.push(from);

    // Reverse to get source-to-target order
    nodes.reverse();
    edges.reverse();

    Path { nodes, edges }
}
}

Parallel Deletion Optimization

High-degree nodes (>100 edges) use rayon’s parallel iterator for edge deletion:

#![allow(unused)]
fn main() {
const PARALLEL_THRESHOLD: usize = 100;

pub fn delete_node(&self, id: u64) -> Result<()> {
    if !self.node_exists(id) {
        return Err(GraphError::NodeNotFound(id));
    }

    // Collect all connected edges
    let out_edges = self.get_edge_list(&Self::outgoing_edges_key(id))?;
    let in_edges = self.get_edge_list(&Self::incoming_edges_key(id))?;
    let all_edges: Vec<u64> = out_edges.into_iter().chain(in_edges).collect();

    // Parallel deletion for high-degree nodes
    if all_edges.len() >= Self::PARALLEL_THRESHOLD {
        all_edges.par_iter().for_each(|edge_id| {
            let _ = self.store.delete(&Self::edge_key(*edge_id));
        });
    } else {
        for edge_id in all_edges {
            let _ = self.store.delete(&Self::edge_key(edge_id));
        }
    }

    // Delete node and edge lists
    self.store.delete(&Self::node_key(id))?;
    self.store.delete(&Self::outgoing_edges_key(id))?;
    self.store.delete(&Self::incoming_edges_key(id))?;

    Ok(())
}
}

Performance Characteristics

| Edge Count | Deletion Strategy | Benefit |
|---|---|---|
| < 100 | Sequential | Lower overhead for small nodes |
| >= 100 | Parallel (rayon) | ~2-4x speedup on multi-core systems |

Unified Entity API

The Unified Entity API connects any shared entities (not just graph nodes) for cross-engine queries. Entity edges use the _out and _in reserved fields in TensorData, enabling the same entity key to have relational fields, graph connections, and a vector embedding.

graph LR
    subgraph Entity["Entity (TensorData)"]
        Fields[User Fields<br/>name, age, etc.]
        Out["_out<br/>[edge keys]"]
        In["_in<br/>[edge keys]"]
        Emb["_embedding<br/>[vector]"]
    end

    subgraph Engines
        RE[Relational Engine]
        GE[Graph Engine]
        VE[Vector Engine]
    end

    Fields --> RE
    Out --> GE
    In --> GE
    Emb --> VE

Reserved Fields

| Field | Type | Purpose |
|---|---|---|
| `_out` | `Vec<String>` | Outgoing edge keys |
| `_in` | `Vec<String>` | Incoming edge keys |
| `_embedding` | `Vec<f32>` | Vector embedding |
| `_type` | `String` | Entity type |
| `_id` | `i64` | Entity numeric ID |
| `_label` | `String` | Entity label |

Entity Edge Key Format

Entity edges use a different key format from node-based edges:

edge:{edge_type}:{edge_id}

For example: edge:follows:42
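
The format reduces to a one-line helper (an illustrative sketch; the engine's internal helper may be named differently):

```rust
// Entity edge keys embed the edge type: edge:{edge_type}:{edge_id}.
fn entity_edge_key(edge_type: &str, edge_id: u64) -> String {
    format!("edge:{}:{}", edge_type, edge_id)
}
```

Embedding the type in the key means edges of one type can be scanned by prefix (`edge:follows:`) without deserializing unrelated edges.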

API Reference

#![allow(unused)]
fn main() {
// Create engine with shared store
let store = TensorStore::new();
let engine = GraphEngine::with_store(store.clone());

// Add directed edge between entities
let edge_key = engine.add_entity_edge("user:1", "user:2", "follows")?;

// Add undirected edge between entities
let edge_key = engine.add_entity_edge_undirected("user:1", "user:2", "friend")?;

// Get neighbors
let neighbors = engine.get_entity_neighbors("user:1")?;
let out_neighbors = engine.get_entity_neighbors_out("user:1")?;
let in_neighbors = engine.get_entity_neighbors_in("user:1")?;

// Get edge lists
let outgoing = engine.get_entity_outgoing("user:1")?;
let incoming = engine.get_entity_incoming("user:1")?;

// Get edge details
let (from, to, edge_type, directed) = engine.get_entity_edge(&edge_key)?;

// Check if entity has edges
let has_edges = engine.entity_has_edges("user:1");

// Delete edge
engine.delete_entity_edge(&edge_key)?;

// Scan for entities with edges
let entities = engine.scan_entities_with_edges();
}

Undirected Entity Edges

For undirected entity edges, both entities receive the edge in both _out and _in:

#![allow(unused)]
fn main() {
pub fn add_entity_edge_undirected(
    &self,
    key1: &str,
    key2: &str,
    edge_type: &str,
) -> Result<String> {
    // ... create edge data ...

    // Both entities get the edge in both directions
    let mut entity1 = self.get_or_create_entity(key1);
    entity1.add_outgoing_edge(edge_key.clone());
    entity1.add_incoming_edge(edge_key.clone());

    let mut entity2 = self.get_or_create_entity(key2);
    entity2.add_outgoing_edge(edge_key.clone());
    entity2.add_incoming_edge(edge_key.clone());

    Ok(edge_key)
}
}

Cross-Engine Integration

Query Router Integration

The Query Router provides unified queries combining graph traversal with vector similarity:

#![allow(unused)]
fn main() {
// Find entities similar to query that are connected to a specific entity
let items = router.find_similar_connected("query:entity", "connected_to:entity", top_k)?;

// Find graph neighbors sorted by embedding similarity
let items = router.find_neighbors_by_similarity("entity:key", &query_vector, top_k)?;
}

Tensor Vault Integration

Tensor Vault uses GraphEngine for access control relationships:

#![allow(unused)]
fn main() {
pub struct Vault {
    store: TensorStore,
    pub graph: Arc<GraphEngine>,  // Shared graph for access edges
    // ...
}
}

Access control edges connect principals to secrets with permission metadata.

Tensor Chain Integration

Tensor Chain uses GraphEngine for block linking:

#![allow(unused)]
fn main() {
pub struct Chain {
    graph: Arc<GraphEngine>,  // Stores blocks as nodes, links as edges
    // ...
}
}

Blocks are stored with chain:block:{height} keys and linked via graph edges with type chain_next.

Performance Characteristics

| Operation | Complexity | Notes |
|---|---|---|
| `create_node` | O(1) | Store put |
| `create_edge` | O(1) | Store put + edge list updates |
| `get_node` | O(1) | Store get |
| `get_edge` | O(1) | Store get |
| `neighbors` | O(e) | e = edges from node |
| `traverse` | O(n + e) | BFS over reachable nodes |
| `find_path` | O(n + e) | BFS shortest path |
| `delete_node` | O(e) | Parallel for e >= 100 |
| `node_count` | O(k) | k = total keys (scan-based) |
| `get_edge_list` | O(k) | k = keys in edge list |

Memory Characteristics

| Data | Storage |
|---|---|
| Node | ~50-200 bytes + properties |
| Edge | ~50-150 bytes + properties |
| Edge list entry | ~10 bytes per edge |

Edge Cases and Gotchas

Self-Loop Edges

Self-loops (edges from a node to itself) are valid but filtered from neighbor results:

#![allow(unused)]
fn main() {
#[test]
fn self_loop_edge() {
    let engine = GraphEngine::new();
    let n1 = engine.create_node("A", HashMap::new()).unwrap();
    engine.create_edge(n1, n1, "SELF", HashMap::new(), true).unwrap();

    // Self-loop doesn't appear in neighbors
    let neighbors = engine.neighbors(n1, None, Direction::Both).unwrap();
    assert_eq!(neighbors.len(), 0);
}
}

Same-Node Path

Finding a path from a node to itself returns a single-node path:

#![allow(unused)]
fn main() {
let path = engine.find_path(n1, n1)?;
assert_eq!(path.nodes, vec![n1]);
assert!(path.edges.is_empty());
}

Deleted Edge Orphans

When deleting a node, its connected edges are deleted from storage but may remain listed in other nodes’ edge lists. This is a known limitation; edge retrieval gracefully handles references to missing edges.

Bytes Property Conversion

ScalarValue::Bytes converts to PropertyValue::Null since PropertyValue doesn’t support binary data:

#![allow(unused)]
fn main() {
let bytes = ScalarValue::Bytes(vec![1, 2, 3]);
assert_eq!(PropertyValue::from_scalar(&bytes), PropertyValue::Null);
}

Node Count Calculation

The node_count method uses a formula based on scan counts to account for edge lists:

#![allow(unused)]
fn main() {
pub fn node_count(&self) -> usize {
    // Each node has 3 keys: node:{id}, node:{id}:out, node:{id}:in
    self.store.scan_count("node:") - self.store.scan_count("node:") / 3 * 2
}
}
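
The arithmetic works because each node contributes exactly three keys, so for k scanned keys the count is k - (k/3)*2 = k/3. A standalone check of the same formula:

```rust
// Each node owns 3 keys: node:{id}, node:{id}:out, node:{id}:in,
// so the node count is the scanned key count minus two thirds of it.
fn node_count_from_keys(scanned: usize) -> usize {
    scanned - scanned / 3 * 2
}
```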

Best Practices

Use Shared Store for Cross-Engine Queries

#![allow(unused)]
fn main() {
// Create shared store first
let store = TensorStore::new();

// Create engines with shared store
let graph = GraphEngine::with_store(store.clone());
let vector = VectorEngine::with_store(store.clone());

// Now entities can have both graph edges and embeddings
}

Prefer Entity API for Cross-Engine Data

Use the Unified Entity API when entities need to combine relational, graph, and vector data:

#![allow(unused)]
fn main() {
// Good: Entity API preserves all fields
engine.add_entity_edge("user:1", "user:2", "follows")?;

// Less flexible: Node API creates graph-only entities
engine.create_node("User", props)?;
}

Batch Edge Creation

When creating many edges, avoid creating them one at a time if possible. Consider the overhead of multiple store operations.

Choose Direction Wisely

  • Use Direction::Outgoing for forward-only traversals (dependency graphs)
  • Use Direction::Both for symmetric relationships (social graphs)
  • Use Direction::Incoming for reverse lookups (finding predecessors)

Set Appropriate Traversal Depth

BFS traversal can be expensive on dense graphs. Set max_depth based on expected graph diameter:

#![allow(unused)]
fn main() {
// For typical social networks, 3-6 hops is usually sufficient
let reachable = engine.traverse(start, Direction::Both, 4, None)?;
}

Usage Examples

Social Network

#![allow(unused)]
fn main() {
let engine = GraphEngine::new();

// Create users
let alice = engine.create_node("User", user_props("Alice"))?;
let bob = engine.create_node("User", user_props("Bob"))?;
let charlie = engine.create_node("User", user_props("Charlie"))?;

// Create friendships (undirected)
engine.create_edge(alice, bob, "FRIENDS", HashMap::new(), false)?;
engine.create_edge(bob, charlie, "FRIENDS", HashMap::new(), false)?;

// Find path from Alice to Charlie
let path = engine.find_path(alice, charlie)?;
// path.nodes = [alice, bob, charlie]

// Get Alice's friends
let friends = engine.neighbors(alice, Some("FRIENDS"), Direction::Both)?;
}

Dependency Graph

#![allow(unused)]
fn main() {
let engine = GraphEngine::new();

// Create packages
let app = engine.create_node("Package", package_props("app"))?;
let lib_a = engine.create_node("Package", package_props("lib-a"))?;
let lib_b = engine.create_node("Package", package_props("lib-b"))?;

// Create dependencies (directed)
engine.create_edge(app, lib_a, "DEPENDS_ON", HashMap::new(), true)?;
engine.create_edge(app, lib_b, "DEPENDS_ON", HashMap::new(), true)?;
engine.create_edge(lib_a, lib_b, "DEPENDS_ON", HashMap::new(), true)?;

// Find all dependencies of app
let deps = engine.traverse(app, Direction::Outgoing, 10, Some("DEPENDS_ON"))?;
}

Cross-Engine Unified Entities

#![allow(unused)]
fn main() {
// Shared store for multiple engines
let store = TensorStore::new();
let graph = GraphEngine::with_store(store.clone());

// Add graph edges between entities
graph.add_entity_edge("user:1", "post:1", "created")?;
graph.add_entity_edge("user:2", "post:1", "liked")?;

// Query relationships
let creators = graph.get_entity_neighbors_in("post:1")?;
}

High-Degree Node Operations

#![allow(unused)]
fn main() {
let engine = GraphEngine::new();

// Create a hub with many connections (will use parallel deletion)
let hub = engine.create_node("Hub", HashMap::new())?;
for i in 0..150 {
    let leaf = engine.create_node("Leaf", HashMap::new())?;
    engine.create_edge(hub, leaf, "CONNECTS", HashMap::new(), true)?;
}

// Deletion will use parallel processing (150 > 100 threshold)
engine.delete_node(hub)?;
}

Configuration

The Graph Engine has minimal configuration as it inherits behavior from TensorStore:

| Setting | Value | Description |
|---|---|---|
| Parallel threshold | 100 | Edge count triggering parallel deletion |
| ID ordering | `SeqCst` | Atomic ordering for ID generation |

Dependencies

| Crate | Purpose |
|---|---|
| `tensor_store` | Underlying key-value storage |
| `rayon` | Parallel iteration for high-degree node deletion |
| `serde` | Serialization of graph types |

| Module | Relationship |
|---|---|
| Tensor Store | Storage backend |
| Tensor Vault | Uses graph for access control |
| Tensor Chain | Uses graph for block linking |
| Query Router | Executes graph queries |

Vector Engine

Module 4 of Neumann. Provides embedding storage and similarity search with SIMD-accelerated distance computations.

The Vector Engine builds on tensor_store to provide k-NN search capabilities. It supports both brute-force O(n) search and HNSW O(log n) approximate search, with automatic sparse vector optimization for memory efficiency.

Design Principles

| Principle | Description |
|---|---|
| Layered Architecture | Depends only on Tensor Store for persistence |
| Multiple Distance Metrics | Cosine, Euclidean, and Dot Product similarity |
| SIMD Acceleration | 8-wide SIMD for dot products and magnitudes |
| Dual Search Modes | Brute-force O(n) or HNSW O(log n) |
| Unified Entities | Embeddings can be attached to shared entities |
| Thread Safety | Inherits from Tensor Store |
| Serializable Types | All types implement serde::Serialize/Deserialize |
| Automatic Sparsity Detection | Vectors with >50% zeros stored efficiently |
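
The >50% cutoff corresponds to the default `sparse_threshold` of 0.5; the detection itself is just a zero-ratio check, which can be sketched as (an illustration of the idea, not the engine's actual code):

```rust
// A vector qualifies for sparse storage when its fraction of
// zero components exceeds the configured threshold.
fn is_sparse(v: &[f32], threshold: f32) -> bool {
    let zeros = v.iter().filter(|&&x| x == 0.0).count();
    !v.is_empty() && zeros as f32 / v.len() as f32 > threshold
}
```

A sparse representation then stores only the nonzero (index, value) pairs instead of the full dense array.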

Architecture

graph TB
    subgraph VectorEngine
        VE[VectorEngine]
        SR[SearchResult]
        DM[DistanceMetric]
        VE --> |uses| SR
        VE --> |uses| DM
    end

    subgraph TensorStore
        TS[TensorStore]
        HNSW[HNSWIndex]
        SV[SparseVector]
        SIMD[SIMD Functions]
        ES[EmbeddingStorage]
    end

    VE --> |stores to| TS
    VE --> |builds| HNSW
    VE --> |uses| SV
    VE --> |uses| SIMD

    subgraph Storage
        EMB["emb:{key}"]
        ENT["entity:{key}._embedding"]
    end

    TS --> EMB
    TS --> ENT

Key Types

| Type | Description |
|---|---|
| `VectorEngine` | Main engine for storing and searching embeddings |
| `VectorEngineConfig` | Configuration for engine behavior and memory bounds |
| `SearchResult` | Result with key and similarity score |
| `DistanceMetric` | Enum: Cosine, Euclidean, DotProduct |
| `ExtendedDistanceMetric` | Extended metrics for HNSW (9+ variants) |
| `VectorError` | Error types for vector operations |
| `EmbeddingInput` | Input for batch store operations |
| `BatchResult` | Result of batch operations |
| `Pagination` | Parameters for paginated queries |
| `PagedResult<T>` | Paginated query result |
| `HNSWIndex` | Hierarchical navigable small world graph (re-exported from tensor_store) |
| `HNSWConfig` | HNSW index configuration (re-exported from tensor_store) |
| `SparseVector` | Memory-efficient sparse embedding storage |
| `FilterCondition` | Filter for metadata-based search (Eq, Ne, Lt, Gt, And, Or, In, etc.) |
| `FilterValue` | Value type for filters (Int, Float, String, Bool, Null) |
| `FilterStrategy` | Strategy selection (Auto, PreFilter, PostFilter) |
| `FilteredSearchConfig` | Configuration for filtered search behavior |
| `VectorCollectionConfig` | Configuration for vector collections |
| `MetadataValue` | Simplified value type for embedding metadata |
| `PersistentVectorIndex` | Serializable index for disk persistence |

VectorError Variants

| Variant | Description | When Triggered |
|---|---|---|
| `NotFound` | Embedding key doesn’t exist | `get_embedding`, `delete_embedding` |
| `DimensionMismatch` | Vectors have different dimensions | `compute_similarity`, exceeds `max_dimension` |
| `EmptyVector` | Empty vector provided | Any operation with `vec![]` |
| `InvalidTopK` | `top_k` is 0 | `search_similar`, `search_with_hnsw` |
| `StorageError` | Underlying Tensor Store error | Storage failures |
| `BatchValidationError` | Invalid input in batch | `batch_store_embeddings` validation |
| `BatchOperationError` | Operation failed in batch | `batch_store_embeddings` execution |
| `ConfigurationError` | Invalid configuration | `VectorEngineConfig::validate()` |
| `CollectionExists` | Collection already exists | `create_collection` with existing name |
| `CollectionNotFound` | Collection not found | Collection operations on missing collection |
| `IoError` | IO error during persistence | `save_to_file`, `load_from_file` |
| `SerializationError` | Serialization error | Index persistence operations |
| `SearchTimeout` | Search operation timed out | Search operations exceeding configured timeout |

Configuration

VectorEngineConfig

Configuration for the Vector Engine with memory bounds and performance tuning.

| Field | Type | Default | Description |
|---|---|---|---|
| `default_dimension` | `Option<usize>` | `None` | Expected embedding dimension |
| `sparse_threshold` | `f32` | 0.5 | Sparsity threshold (0.0-1.0) |
| `parallel_threshold` | `usize` | 5000 | Dataset size for parallel search |
| `default_metric` | `DistanceMetric` | Cosine | Default distance metric |
| `max_dimension` | `Option<usize>` | `None` | Maximum allowed dimension |
| `max_keys_per_scan` | `Option<usize>` | `None` | Limit for unbounded scans |
| `batch_parallel_threshold` | `usize` | 100 | Batch size for parallel processing |
| `search_timeout` | `Option<Duration>` | `None` | Search operation timeout |

Configuration Presets

| Preset | Description | Key Settings |
|---|---|---|
| `default()` | Balanced for most workloads | All defaults |
| `high_throughput()` | Optimized for write-heavy loads | `parallel_threshold: 1000` |
| `low_memory()` | Memory-constrained environments | `max_dimension: 4096`, `max_keys_per_scan: 10000`, `search_timeout: 30s` |

Builder Methods

All builder methods are const fn for compile-time configuration:

#![allow(unused)]
fn main() {
use std::time::Duration;

let config = VectorEngineConfig::default()
    .with_default_dimension(768)
    .with_sparse_threshold(0.7)
    .with_parallel_threshold(1000)
    .with_default_metric(DistanceMetric::Cosine)
    .with_max_dimension(4096)
    .with_max_keys_per_scan(50_000)
    .with_batch_parallel_threshold(200)
    .with_search_timeout(Duration::from_secs(5));

let engine = VectorEngine::with_config(config)?;
}

Memory Bounds

For production deployments, configure memory bounds to prevent resource exhaustion:

#![allow(unused)]
fn main() {
// Reject embeddings larger than 4096 dimensions
let config = VectorEngineConfig::default()
    .with_max_dimension(4096)
    .with_max_keys_per_scan(10_000);

let engine = VectorEngine::with_config(config)?;

// This will fail with DimensionMismatch
engine.store_embedding("too_big", vec![0.0; 5000])?; // Error!
}

Search Timeout

Configure a timeout for search operations to prevent runaway queries:

#![allow(unused)]
fn main() {
use std::time::Duration;
use vector_engine::{VectorEngine, VectorEngineConfig, VectorError};

let config = VectorEngineConfig::default()
    .with_search_timeout(Duration::from_secs(5));

let engine = VectorEngine::with_config(config)?;

match engine.search_similar(&query, 10) {
    Ok(results) => { /* process results */ },
    Err(VectorError::SearchTimeout { operation, timeout_ms }) => {
        eprintln!("Search '{}' timed out after {}ms", operation, timeout_ms);
    },
    Err(e) => { /* handle other errors */ },
}
}

The timeout applies to all search methods. When a timeout occurs, no partial results are returned, since a truncated result set could silently omit better matches.

Distance Metrics

| Metric | Formula | Score Range | Use Case | HNSW Support |
|---|---|---|---|---|
| Cosine | `a.b / (‖a‖ * ‖b‖)` | -1.0 to 1.0 | Semantic similarity | Yes |
| Euclidean | `1 / (1 + sqrt(sum((a-b)^2)))` | 0.0 to 1.0 | Spatial distance | No (brute-force) |
| DotProduct | `sum(a * b)` | unbounded | Magnitude-aware | No (brute-force) |

All metrics return higher scores for better matches; Euclidean distance is transformed into a similarity score so that smaller distances yield higher scores.
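
The three formulas in the table translate directly into scalar Rust (a reference sketch without the SIMD acceleration the engine applies):

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// Cosine: dot / (|a| * |b|); zero-magnitude inputs score 0.0.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let (ma, mb) = (dot(a, a).sqrt(), dot(b, b).sqrt());
    if ma == 0.0 || mb == 0.0 { 0.0 } else { dot(a, b) / (ma * mb) }
}

// Euclidean distance mapped into (0, 1]: identical vectors score 1.0,
// and the score decays toward 0.0 as the distance grows.
fn euclidean_score(a: &[f32], b: &[f32]) -> f32 {
    let d2: f32 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
    1.0 / (1.0 + d2.sqrt())
}
```

Because all three return higher-is-better scores, search code can rank results uniformly regardless of the configured metric.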

Extended Distance Metrics (HNSW)

The ExtendedDistanceMetric enum provides additional metrics for HNSW-based search via search_with_hnsw_and_metric():

| Metric | Description | Best For |
|---|---|---|
| Cosine | Angle-based similarity | Text embeddings, normalized vectors |
| Euclidean | L2 distance | Spatial data, absolute distances |
| Angular | Cosine converted to angular | When angle interpretation needed |
| Manhattan | L1 norm | Robust to outliers |
| Chebyshev | L-infinity (max diff) | When max deviation matters |
| Jaccard | Set similarity | Binary/sparse vectors, TF-IDF |
| Overlap | Minimum overlap coefficient | Partial matches |
| Geodesic | Spherical distance | Geographic coordinates |
| Composite | Weighted combination | Custom similarity functions |

#![allow(unused)]
fn main() {
use vector_engine::ExtendedDistanceMetric;

let (index, keys) = engine.build_hnsw_index_default()?;

// Search with Jaccard similarity for sparse vectors
let results = engine.search_with_hnsw_and_metric(
    &index,
    &keys,
    &query,
    10,
    ExtendedDistanceMetric::Jaccard,
)?;
}

Distance Metric Implementation Details

flowchart TD
    Query[Query Vector] --> MetricCheck{Which Metric?}

    MetricCheck -->|Cosine| CosMag[Pre-compute query magnitude]
    CosMag --> CosDot[SIMD dot product]
    CosDot --> CosDiv[Divide by magnitudes]
    CosDiv --> CosScore["Score: dot / (mag_a * mag_b)"]

    MetricCheck -->|Euclidean| EucDiff[Compute differences]
    EucDiff --> EucSum[Sum of squares]
    EucSum --> EucSqrt[Square root]
    EucSqrt --> EucScore["Score: 1 / (1 + distance)"]

    MetricCheck -->|DotProduct| DotSIMD[SIMD dot product]
    DotSIMD --> DotScore[Score: raw dot product]

Cosine Similarity Edge Cases

#![allow(unused)]
fn main() {
// Zero-magnitude vectors return 0.0 similarity
let zero = vec![0.0, 0.0, 0.0];
let normal = vec![1.0, 2.0, 3.0];
VectorEngine::compute_similarity(&zero, &normal)?; // Returns 0.0

// Identical vectors return 1.0
VectorEngine::compute_similarity(&normal, &normal)?; // Returns 1.0

// Opposite vectors return -1.0
let opposite = vec![-1.0, -2.0, -3.0];
VectorEngine::compute_similarity(&normal, &opposite)?; // Returns -1.0

// Orthogonal vectors return 0.0
let a = vec![1.0, 0.0];
let b = vec![0.0, 1.0];
VectorEngine::compute_similarity(&a, &b)?; // Returns 0.0
}

Euclidean Distance Transformation

The engine transforms Euclidean distance to similarity score using 1 / (1 + distance):

| Distance | Similarity Score |
|---|---|
| 0.0 | 1.0 (identical) |
| 1.0 | 0.5 |
| 2.0 | 0.333 |
| 9.0 | 0.1 |
| Infinity | 0.0 |
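
The mapping in the table can be checked directly with a self-contained sketch of the documented formula:

```rust
// similarity = 1 / (1 + distance), as documented above
fn distance_to_similarity(d: f32) -> f32 {
    1.0 / (1.0 + d)
}

fn main() {
    for d in [0.0f32, 1.0, 2.0, 9.0, f32::INFINITY] {
        println!("distance {:>8} -> similarity {:.3}", d, distance_to_similarity(d));
    }
}
```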

SIMD Implementation

The Vector Engine uses 8-wide SIMD operations via the wide crate for accelerated distance computations.

SIMD Dot Product Algorithm

#![allow(unused)]
fn main() {
// Simplified view of the SIMD implementation
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    let chunks = a.len() / 8;        // Process 8 floats at a time
    let remainder = a.len() % 8;

    let mut sum = f32x8::ZERO;

    // Process 8 elements at a time with SIMD
    for i in 0..chunks {
        let offset = i * 8;
        let va = f32x8::from(&a[offset..offset + 8]);
        let vb = f32x8::from(&b[offset..offset + 8]);
        sum += va * vb;  // Parallel multiply-add
    }

    // Sum SIMD lanes + handle remainder scalar
    let arr: [f32; 8] = sum.into();
    let mut result: f32 = arr.iter().sum();

    // Handle remainder with scalar operations
    let start = chunks * 8;
    for i in 0..remainder {
        result += a[start + i] * b[start + i];
    }

    result
}
}

SIMD Performance Characteristics

| Dimension | SIMD Speedup | Notes |
|---|---|---|
| 8 | 1x | Baseline (single SIMD operation) |
| 64 | 4-6x | Full pipeline utilization |
| 384 | 6-8x | Sentence Transformers size |
| 768 | 6-8x | BERT embedding size |
| 1536 | 6-8x | OpenAI ada-002 size |
| 3072 | 6-8x | OpenAI text-embedding-3-large |

SIMD operations are cache-friendly due to sequential memory access patterns.

API Reference

Basic Operations

#![allow(unused)]
fn main() {
let engine = VectorEngine::new();

// Store an embedding
engine.store_embedding("doc1", vec![0.1, 0.2, 0.3])?;

// Get an embedding
let vector = engine.get_embedding("doc1")?;

// Delete an embedding
engine.delete_embedding("doc1")?;

// Check existence
engine.exists("doc1");  // -> bool

// Count embeddings
engine.count();  // -> usize

// List all keys
let keys = engine.list_keys();

// Clear all embeddings
engine.clear()?;

// Get dimension (from first embedding)
engine.dimension();  // -> Option<usize>
}

Similarity Search

#![allow(unused)]
fn main() {
// Find top-k most similar (cosine by default)
let query = vec![0.1, 0.2, 0.3];
let results = engine.search_similar(&query, 5)?;

for result in results {
    println!("Key: {}, Score: {}", result.key, result.score);
}

// Search with specific metric
let results = engine.search_similar_with_metric(
    &query,
    5,
    DistanceMetric::Euclidean
)?;

// Direct similarity computation
let similarity = VectorEngine::compute_similarity(&vec_a, &vec_b)?;
}

Search with metadata filters to narrow results without post-processing:

#![allow(unused)]
fn main() {
use vector_engine::{FilterCondition, FilterValue, FilteredSearchConfig, FilterStrategy};

// Build a filter condition
let filter = FilterCondition::Eq("category".to_string(), FilterValue::String("science".to_string()))
    .and(FilterCondition::Gt("year".to_string(), FilterValue::Int(2020)));

// Search with filter (auto strategy)
let results = engine.search_similar_filtered(&query, 10, &filter, None)?;

// Search with explicit pre-filter strategy (best for selective filters)
let config = FilteredSearchConfig::pre_filter();
let results = engine.search_similar_filtered(&query, 10, &filter, Some(config))?;

// Search with post-filter and custom oversample
let config = FilteredSearchConfig::post_filter().with_oversample(5);
let results = engine.search_similar_filtered(&query, 10, &filter, Some(config))?;
}

Filter Conditions

| Condition | Description | Example |
|---|---|---|
| Eq(field, value) | Equality | category = "science" |
| Ne(field, value) | Not equal | status != "deleted" |
| Lt(field, value) | Less than | price < 100 |
| Le(field, value) | Less than or equal | price <= 100 |
| Gt(field, value) | Greater than | year > 2020 |
| Ge(field, value) | Greater than or equal | year >= 2020 |
| And(a, b) | Logical AND | Combined conditions |
| Or(a, b) | Logical OR | Alternative conditions |
| In(field, values) | Value in list | status IN ["active", "pending"] |
| Contains(field, substr) | String contains | title CONTAINS "rust" |
| StartsWith(field, prefix) | String prefix | name STARTS WITH "doc:" |
| Exists(field) | Field exists | HAS embedding |
| True | Always matches | No filter |
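
To make the semantics concrete, here is a hypothetical, stripped-down evaluator for a few of these conditions. The enum and helper names echo the engine's API but are illustrative sketches, not the real FilterCondition and FilterValue types:

```rust
use std::collections::HashMap;

// Illustrative subset of the filter conditions in the table above
enum Filter {
    Eq(String, String),
    Gt(String, i64),
    And(Box<Filter>, Box<Filter>),
    True,
}

enum Value {
    Str(String),
    Int(i64),
}

// Evaluate a filter against one embedding's metadata map
fn matches_filter(f: &Filter, meta: &HashMap<String, Value>) -> bool {
    match f {
        Filter::Eq(field, want) => matches!(meta.get(field), Some(Value::Str(s)) if s == want),
        Filter::Gt(field, min) => matches!(meta.get(field), Some(Value::Int(i)) if i > min),
        Filter::And(a, b) => matches_filter(a, meta) && matches_filter(b, meta),
        Filter::True => true,
    }
}

fn main() {
    let mut meta = HashMap::new();
    meta.insert("category".to_string(), Value::Str("science".to_string()));
    meta.insert("year".to_string(), Value::Int(2024));

    // category = "science" AND year > 2020
    let filter = Filter::And(
        Box::new(Filter::Eq("category".into(), "science".into())),
        Box::new(Filter::Gt("year".into(), 2020)),
    );
    println!("matches: {}", matches_filter(&filter, &meta)); // true
}
```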

Filter Strategies

| Strategy | When to Use | Behavior |
|---|---|---|
| Auto | Default | Estimates selectivity and chooses |
| PreFilter | < 10% matches | Filters first, then searches subset |
| PostFilter | > 10% matches | Searches with oversample, then filters |

flowchart TD
    Query[Query + Filter] --> Strategy{Which Strategy?}

    Strategy -->|Auto| Estimate[Estimate Selectivity]
    Estimate -->|< 10%| Pre[Pre-Filter]
    Estimate -->|>= 10%| Post[Post-Filter]

    Strategy -->|PreFilter| Pre
    Strategy -->|PostFilter| Post

    Pre --> Filter1[Filter all keys]
    Filter1 --> Search1[Search filtered subset]
    Search1 --> Result[Top-K Results]

    Post --> Search2[Search with oversample]
    Search2 --> Filter2[Filter candidates]
    Filter2 --> Result
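
The Auto decision above reduces to a selectivity check around the documented 10% threshold. A minimal sketch, where the exact threshold value and the oversample arithmetic are illustrative assumptions:

```rust
#[derive(Debug, PartialEq)]
enum Strategy {
    PreFilter,
    PostFilter,
}

// Auto: estimate selectivity, then choose a strategy around the 10% threshold
fn choose_strategy(matching: usize, total: usize) -> Strategy {
    let selectivity = matching as f64 / total as f64;
    if selectivity < 0.10 { Strategy::PreFilter } else { Strategy::PostFilter }
}

// Post-filter fetches k * oversample candidates before applying the filter
fn candidates_to_fetch(k: usize, oversample: usize) -> usize {
    k * oversample
}

fn main() {
    println!("{:?}", choose_strategy(50, 1000));  // 5% match -> PreFilter
    println!("{:?}", choose_strategy(500, 1000)); // 50% match -> PostFilter
    println!("fetch {} candidates", candidates_to_fetch(10, 5)); // 50
}
```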

Filter Helper Methods

Utilities for working with filters:

#![allow(unused)]
fn main() {
// Estimate how selective a filter is (0.0 = matches nothing, 1.0 = matches all)
let selectivity = engine.estimate_filter_selectivity(&filter);

// Count how many embeddings match a filter
let matching = engine.count_matching(&filter);

// Get keys of all matching embeddings
let keys = engine.list_keys_matching(&filter);
}

Metadata Storage

Store and retrieve metadata alongside embeddings:

#![allow(unused)]
fn main() {
use tensor_store::TensorValue;
use std::collections::HashMap;

// Store embedding with metadata
let mut metadata = HashMap::new();
metadata.insert("category".to_string(), TensorValue::from("science"));
metadata.insert("year".to_string(), TensorValue::from(2024i64));
metadata.insert("score".to_string(), TensorValue::from(0.95f64));

engine.store_embedding_with_metadata("doc1", vec![0.1, 0.2, 0.3], metadata)?;

// Get all metadata
let meta = engine.get_metadata("doc1")?;

// Get specific field
let category = engine.get_metadata_field("doc1", "category")?;

// Update metadata (merges with existing)
let mut updates = HashMap::new();
updates.insert("score".to_string(), TensorValue::from(0.98f64));
engine.update_metadata("doc1", updates)?;

// Check if metadata field exists
if engine.has_metadata_field("doc1", "category") {
    // Remove specific metadata field
    engine.remove_metadata_field("doc1", "category")?;
}
}

Batch Operations

For bulk insert and delete operations with parallel processing:

#![allow(unused)]
fn main() {
use vector_engine::EmbeddingInput;

// Batch store - validates all inputs first, then stores in parallel
let inputs = vec![
    EmbeddingInput::new("doc1", vec![0.1, 0.2, 0.3]),
    EmbeddingInput::new("doc2", vec![0.2, 0.3, 0.4]),
    EmbeddingInput::new("doc3", vec![0.3, 0.4, 0.5]),
];

let result = engine.batch_store_embeddings(inputs)?;
println!("Stored {} embeddings", result.count);  // -> 3

// Batch delete - returns count of successfully deleted
let keys = vec!["doc1".to_string(), "doc2".to_string()];
let deleted = engine.batch_delete_embeddings(keys)?;
println!("Deleted {} embeddings", deleted);  // -> 2
}

Batches larger than batch_parallel_threshold (default: 100) use parallel processing via rayon.

Pagination

For memory-efficient iteration over large datasets:

#![allow(unused)]
fn main() {
use vector_engine::Pagination;

// List keys with pagination
let page = Pagination::new(0, 100);  // skip=0, limit=100
let result = engine.list_keys_paginated(page);
println!("Items: {}, Has more: {}", result.items.len(), result.has_more);

// Get total count with pagination
let page = Pagination::new(0, 100).with_total();
let result = engine.list_keys_paginated(page);
println!("Total: {:?}", result.total_count);  // Some(total)

// Paginated similarity search
let page = Pagination::new(10, 5);  // skip first 10, return 5
let results = engine.search_similar_paginated(&query, 100, page)?;

// Paginated entity search
let results = engine.search_entities_paginated(&query, 100, page)?;
}

Use list_keys_bounded() for production to enforce max_keys_per_scan limits.
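
The skip/limit semantics can be sketched in a few self-contained lines (illustrative names, not the crate's Pagination type):

```rust
// Minimal page type mirroring the items/has_more shape described above
struct Page<T> {
    items: Vec<T>,
    has_more: bool,
}

// Slice out [skip, skip + limit) and report whether more items remain
fn paginate<T: Clone>(all: &[T], skip: usize, limit: usize) -> Page<T> {
    let end = (skip + limit).min(all.len());
    let items = all.get(skip..end).unwrap_or(&[]).to_vec();
    Page { has_more: end < all.len(), items }
}

fn main() {
    let keys: Vec<usize> = (0..12).collect();
    let page = paginate(&keys, 10, 5); // skip first 10, ask for 5
    println!("items = {:?}, has_more = {}", page.items, page.has_more);
    // items = [10, 11], has_more = false
}
```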

Search Flow Diagram

sequenceDiagram
    participant Client
    participant VE as VectorEngine
    participant TS as TensorStore
    participant SIMD

    Client->>VE: search_similar(query, k)
    VE->>VE: Validate query (non-empty, k > 0)
    VE->>SIMD: Pre-compute query magnitude
    VE->>TS: scan("emb:")
    TS-->>VE: List of embedding keys

    alt Dataset < 5000 vectors
        VE->>VE: Sequential search
    else Dataset >= 5000 vectors
        VE->>VE: Parallel search (rayon)
    end

    loop For each embedding
        VE->>TS: get(key)
        TS-->>VE: TensorData
        VE->>VE: Extract vector (dense or sparse)
        VE->>SIMD: cosine_similarity(query, stored)
        VE->>VE: Collect SearchResult
    end

    VE->>VE: Sort by score descending
    VE->>VE: Truncate to top k
    VE-->>Client: Vec<SearchResult>

HNSW Index

For large datasets, build an HNSW index for O(log n) search:

#![allow(unused)]
fn main() {
// Build index with default config
let (index, key_mapping) = engine.build_hnsw_index_default()?;

// Search using the index
let results = engine.search_with_hnsw(&index, &key_mapping, &query, 10)?;

// Build with custom config
let config = HNSWConfig::high_recall();
let (index, key_mapping) = engine.build_hnsw_index(config)?;

// Direct HNSW operations
let index = HNSWIndex::new();
index.insert(vec![1.0, 2.0, 3.0]);
let results = index.search(&query, 10);

// Search with custom ef (recall/speed tradeoff)
let results = index.search_with_ef(&query, 10, 200);
}

HNSW Search Flow

flowchart TD
    Query[Query Vector] --> Entry[Entry Point at Max Layer]

    Entry --> Greedy1[Greedy Search Layer L]
    Greedy1 --> |Find closest| Greedy2[Greedy Search Layer L-1]
    Greedy2 --> |...|GreedyN[Greedy Search until Layer 1]

    GreedyN --> Layer0[Full ef-Search at Layer 0]

    Layer0 --> Candidates[Candidate Pool]
    Candidates --> |BinaryHeap min-heap| Visit[Visit Neighbors]
    Visit --> Distance[Compute Distances]
    Distance --> |Update| Results[Result Pool]
    Results --> |BinaryHeap max-heap| Prune[Keep top ef]

    Prune --> |More candidates?| Visit
    Prune --> |Done| TopK[Return Top K]

Unified Entity Mode

Attach embeddings directly to entities for cross-engine queries:

#![allow(unused)]
fn main() {
let store = TensorStore::new();
let engine = VectorEngine::with_store(store.clone());

// Set embedding on an entity
engine.set_entity_embedding("user:1", vec![0.1, 0.2, 0.3])?;

// Get embedding from an entity
let embedding = engine.get_entity_embedding("user:1")?;

// Check if entity has embedding
engine.entity_has_embedding("user:1");  // -> bool

// Remove embedding (preserves other entity data)
engine.remove_entity_embedding("user:1")?;

// Search entities with embeddings
let results = engine.search_entities(&query, 5)?;

// Scan all entities with embeddings
let entity_keys = engine.scan_entities_with_embeddings();

// Count entities with embeddings
let count = engine.count_entities_with_embeddings();
}

Unified entity embeddings are stored in the _embedding field of the entity’s TensorData.

Collections

Collections provide isolated namespaces for organizing embeddings by type or purpose. Each collection can have its own dimension constraints and distance metric configuration.

Creating and Managing Collections

#![allow(unused)]
fn main() {
use vector_engine::{VectorEngine, VectorCollectionConfig, DistanceMetric};

let engine = VectorEngine::new();

// Create collection with custom config
let config = VectorCollectionConfig::default()
    .with_dimension(768)
    .with_metric(DistanceMetric::Cosine)
    .with_auto_index(5000);  // Auto-build HNSW at 5000 vectors

engine.create_collection("documents", config)?;

// List collections
let collections = engine.list_collections();

// Check if collection exists
engine.collection_exists("documents");  // -> true

// Get collection config
let config = engine.get_collection_config("documents");

// Delete collection (removes all vectors in it)
engine.delete_collection("documents")?;
}

Storing in Collections

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use tensor_store::TensorValue;

// Store vector in collection
engine.store_in_collection("documents", "doc1", vec![0.1, 0.2, 0.3])?;

// Store with metadata
let mut metadata = HashMap::new();
metadata.insert("title".to_string(), TensorValue::from("Introduction to Rust"));
metadata.insert("author".to_string(), TensorValue::from("Alice"));

engine.store_in_collection_with_metadata(
    "documents",
    "doc1",
    vec![0.1, 0.2, 0.3],
    metadata
)?;
}

Searching in Collections

#![allow(unused)]
fn main() {
use vector_engine::{FilterCondition, FilterValue};

// Basic search in collection
let results = engine.search_in_collection("documents", &query, 10)?;

// Filtered search in collection
let filter = FilterCondition::Eq("author".to_string(), FilterValue::String("Alice".to_string()));
let results = engine.search_filtered_in_collection(
    "documents",
    &query,
    10,
    &filter,
    None
)?;
}

Collection Key Isolation

Collections use prefixed storage keys to ensure isolation:

| Operation | Storage Key Pattern |
|---|---|
| Default embeddings | emb:{key} |
| Collection embeddings | coll:{collection}:emb:{key} |
| Entity embeddings | {entity_key}._embedding |

VectorCollectionConfig

| Field | Type | Default | Description |
|---|---|---|---|
| dimension | Option<usize> | None | Enforced dimension (rejects mismatches) |
| distance_metric | DistanceMetric | Cosine | Default metric for this collection |
| auto_index | bool | false | Auto-build HNSW on threshold |
| auto_index_threshold | usize | 1000 | Vector count to trigger auto-index |

Index Persistence

Save and restore vector indices for fast startup:

#![allow(unused)]
fn main() {
use std::path::Path;

// Save all collections to directory (one JSON file per collection)
let saved = engine.save_all_indices(Path::new("./vector_index"))?;

// Load all indices from directory
let loaded = engine.load_all_indices(Path::new("./vector_index"))?;

// Save single collection to JSON
engine.save_index("documents", Path::new("./documents.json"))?;

// Save single collection to compact binary format
engine.save_index_binary("documents", Path::new("./documents.bin"))?;

// Load single collection from JSON (returns collection name)
let collection = engine.load_index(Path::new("./documents.json"))?;

// Load single collection from binary
let collection = engine.load_index_binary(Path::new("./documents.bin"))?;

// Get a snapshot for manual serialization
let index: PersistentVectorIndex = engine.snapshot_collection("documents");
}

PersistentVectorIndex Format

| Field | Type | Description |
|---|---|---|
| collection | String | Collection name |
| config | VectorCollectionConfig | Collection configuration |
| vectors | Vec<VectorEntry> | All vectors with metadata |
| created_at | u64 | Unix timestamp |
| version | u32 | Format version (currently 1) |

Storage Model

| Key Pattern | Content | Use Case |
|---|---|---|
| emb:{key} | TensorData with "vector" field | Default collection embeddings |
| coll:{collection}:emb:{key} | TensorData with "vector" field | Named collection embeddings |
| {entity_key} | TensorData with "_embedding" field | Unified entities |

Automatic Sparse Storage

Vectors that are at least 50% zeros are automatically stored as sparse vectors:

#![allow(unused)]
fn main() {
// Detection threshold: nnz * 2 <= len (i.e., sparsity >= 50%)
fn should_use_sparse(vector: &[f32]) -> bool {
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    nnz * 2 <= vector.len()
}

// 97% sparse vector (3 non-zeros in 100 elements)
let mut sparse = vec![0.0f32; 100];
sparse[0] = 1.0;
sparse[50] = 2.0;
sparse[99] = 3.0;

// Stored efficiently as SparseVector
engine.store_embedding("sparse_doc", sparse)?;

// Retrieved as dense for computation
let dense = engine.get_embedding("sparse_doc")?;
}

Storage Format Comparison

| Format | Memory per Element | Best For |
|---|---|---|
| Dense | 4 bytes | Sparsity < 50% |
| Sparse | 8 bytes per non-zero (4 pos + 4 val) | Sparsity > 50% |

Example: 1000-dim vector with 100 non-zeros:

  • Dense: 4000 bytes
  • Sparse: 800 bytes (5x compression)
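
The example arithmetic follows directly from the per-element costs in the table; a minimal sketch:

```rust
// Dense storage: 4 bytes (f32) per element
fn dense_bytes(dim: usize) -> usize {
    dim * 4
}

// Sparse storage: 8 bytes per non-zero (4-byte u32 position + 4-byte f32 value)
fn sparse_bytes(nnz: usize) -> usize {
    nnz * 8
}

fn main() {
    let (dim, nnz) = (1000, 100); // 1000-dim vector with 100 non-zeros
    let dense = dense_bytes(dim);
    let sparse = sparse_bytes(nnz);
    println!("dense: {} bytes, sparse: {} bytes, {}x compression", dense, sparse, dense / sparse);
    // dense: 4000 bytes, sparse: 800 bytes, 5x compression
}
```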

Sparse Vector Operations

Memory Layout

SparseVector {
    dimension: usize,        // Total vector dimension
    positions: Vec<u32>,     // Sorted indices of non-zeros
    values: Vec<f32>,        // Corresponding values
}

Sparse Dot Product Algorithm

#![allow(unused)]
fn main() {
// O(min(nnz_a, nnz_b)) - only overlapping positions contribute
pub fn dot(&self, other: &SparseVector) -> f32 {
    use std::cmp::Ordering::{Equal, Greater, Less};
    let mut result = 0.0;
    let mut i = 0;
    let mut j = 0;

    // Merge-sort style traversal
    while i < self.positions.len() && j < other.positions.len() {
        match self.positions[i].cmp(&other.positions[j]) {
            Equal => {
                result += self.values[i] * other.values[j];
                i += 1; j += 1;
            },
            Less => i += 1,
            Greater => j += 1,
        }
    }
    result
}
}

Sparse-Dense Dot Product

#![allow(unused)]
fn main() {
// O(nnz) - only iterate over sparse non-zeros
pub fn dot_dense(&self, dense: &[f32]) -> f32 {
    self.positions.iter()
        .zip(&self.values)
        .map(|(&pos, &val)| val * dense[pos as usize])
        .sum()
}
}

Sparse Distance Metrics

| Metric | Complexity | Description |
|---|---|---|
| dot | O(min(nnz_a, nnz_b)) | Sparse-sparse dot product |
| dot_dense | O(nnz) | Sparse-dense dot product |
| cosine_similarity | O(min(nnz_a, nnz_b)) | Angle-based similarity |
| euclidean_distance | O(nnz_a + nnz_b) | L2 distance |
| manhattan_distance | O(nnz_a + nnz_b) | L1 distance |
| jaccard_index | O(min(nnz_a, nnz_b)) | Position overlap |
| angular_distance | O(min(nnz_a, nnz_b)) | Arc-cosine |
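
As an illustration of position-overlap similarity, here is a self-contained Jaccard index over two sorted position lists, using the same merge-style traversal as the sparse dot product above (a sketch, not the crate's implementation):

```rust
// Jaccard index = |intersection| / |union| over non-zero positions
fn jaccard_index(a: &[u32], b: &[u32]) -> f32 {
    let (mut i, mut j, mut inter) = (0usize, 0usize, 0usize);
    // Merge-sort style traversal over sorted position lists
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            std::cmp::Ordering::Equal => { inter += 1; i += 1; j += 1; }
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
        }
    }
    let union = a.len() + b.len() - inter;
    if union == 0 { 0.0 } else { inter as f32 / union as f32 }
}

fn main() {
    let a = [0u32, 3, 7, 9];
    let b = [3u32, 7, 8];
    println!("jaccard = {}", jaccard_index(&a, &b)); // 2 shared / 5 total = 0.4
}
```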

HNSW Configuration

Configuration Parameters

| Parameter | Default | Description |
|---|---|---|
| m | 16 | Max connections per node per layer |
| m0 | 32 | Max connections at layer 0 (2*m) |
| ef_construction | 200 | Candidates during index building |
| ef_search | 50 | Candidates during search |
| ml | 1/ln(m) | Level multiplier for layer selection |
| sparsity_threshold | 0.5 | Auto-sparse threshold |
| max_nodes | 10,000,000 | Capacity limit |

Presets

| Preset | m | m0 | ef_construction | ef_search | Use Case |
|---|---|---|---|---|---|
| default() | 16 | 32 | 200 | 50 | Balanced |
| high_recall() | 32 | 64 | 400 | 200 | Accuracy over speed |
| high_speed() | 8 | 16 | 100 | 20 | Speed over accuracy |

Tuning Guidelines

graph TD
    subgraph "Higher m / ef"
        A[More connections per node]
        B[Better recall]
        C[More memory]
        D[Slower insert]
    end

    subgraph "Lower m / ef"
        E[Fewer connections]
        F[Lower recall]
        G[Less memory]
        H[Faster insert]
    end

    A --> B
    A --> C
    A --> D

    E --> F
    E --> G
    E --> H

Workload-Specific Tuning

| Workload | Recommended Config | Rationale |
|---|---|---|
| RAG/Semantic Search | high_recall() | Accuracy critical |
| Real-time recommendations | high_speed() | Latency critical |
| Batch processing | default() | Balanced |
| Small dataset (<10K) | Brute-force | HNSW overhead not worth it |
| Large dataset (>100K) | default() with higher ef_search | Scale benefits |

Memory vs Recall Tradeoff

| Config | Memory/Node | Recall@10 | Search Time |
|---|---|---|---|
| high_speed | ~128 bytes | ~85% | 0.1ms |
| default | ~256 bytes | ~95% | 0.3ms |
| high_recall | ~512 bytes | ~99% | 1.0ms |

Performance Characteristics

| Operation | Complexity | Notes |
|---|---|---|
| store_embedding | O(1) | Single store put |
| get_embedding | O(1) | Single store get |
| delete_embedding | O(1) | Single store delete |
| search_similar | O(n * d) | Brute-force, n=count, d=dimension |
| search_with_hnsw | O(log n * ef * m) | Approximate nearest neighbor |
| build_hnsw_index | O(n * log n * ef_construction * m) | Index construction |
| count | O(n) | Scans all embeddings |
| list_keys | O(n) | Scans all embeddings |

Parallel Search Threshold

Searches automatically switch to parallel iteration for datasets of 5000 or more vectors:

#![allow(unused)]
fn main() {
const PARALLEL_THRESHOLD: usize = 5000;

if keys.len() >= PARALLEL_THRESHOLD {
    // Use rayon parallel iterator
    keys.par_iter().filter_map(...)
} else {
    // Use sequential iterator
    keys.iter().filter_map(...)
}
}

Benchmark Results

| Dataset Size | Brute-Force | With HNSW | Speedup |
|---|---|---|---|
| 200 vectors | 4.17s | 9.3us | 448,000x |
| 1,000 vectors | ~5ms | ~20us | 250x |
| 10,000 vectors | ~50ms | ~50us | 1000x |
| 100,000 vectors | ~500ms | ~100us | 5000x |

Supported Embedding Dimensions

| Model | Dimensions | Recommended Config |
|---|---|---|
| OpenAI text-embedding-ada-002 | 1536 | default |
| OpenAI text-embedding-3-small | 1536 | default |
| OpenAI text-embedding-3-large | 3072 | high_recall |
| BERT base | 768 | default |
| Sentence Transformers | 384-768 | default |
| Cohere embed-v3 | 1024 | default |
| Custom/small | <256 | high_speed |

Edge Cases and Gotchas

Zero-Magnitude Vectors

| Metric | Behavior | Rationale |
|---|---|---|
| Cosine | Returns empty results | Division by zero undefined |
| DotProduct | Returns empty results | Undefined direction |
| Euclidean | Works correctly | Finds vectors closest to origin |

Dimension Mismatch Handling

#![allow(unused)]
fn main() {
// Mismatched dimensions are silently skipped during search
engine.store_embedding("2d", vec![1.0, 0.0])?;
engine.store_embedding("3d", vec![1.0, 0.0, 0.0])?;

// Search with 2D query only matches 2D vectors
let results = engine.search_similar(&[1.0, 0.0], 10)?;
assert_eq!(results.len(), 1);  // Only "2d" matched
}

HNSW Limitations

| Limitation | Details | Workaround |
|---|---|---|
| Only cosine similarity | HNSW uses cosine distance internally | Use brute-force for other metrics |
| No deletion | Cannot remove vectors | Rebuild index |
| Static after build | Index doesn't update with new vectors | Rebuild periodically |
| Memory overhead | Graph structure adds ~2-4x | Use for large datasets only |

NaN/Infinity Handling

Sparse vector operations sanitize NaN/Inf results:

#![allow(unused)]
fn main() {
// cosine_similarity returns 0.0 for NaN/Inf
if result.is_nan() || result.is_infinite() {
    0.0
} else {
    result.clamp(-1.0, 1.0)
}

// cosine_distance_dense returns 1.0 (max distance) for NaN/Inf
if similarity.is_nan() || similarity.is_infinite() {
    1.0  // Maximum distance
} else {
    1.0 - similarity.clamp(-1.0, 1.0)
}
}

Best Practices

Memory Optimization

  1. Use sparse vectors for high-sparsity data: Automatic at >50% zeros
  2. Batch insert for HNSW: Build index once after all data loaded
  3. Choose appropriate HNSW config: Don’t over-provision m/ef
  4. Monitor memory with HNSWMemoryStats: Track dense vs sparse counts
#![allow(unused)]
fn main() {
let stats = index.memory_stats();
println!("Dense: {}, Sparse: {}, Total bytes: {}",
    stats.dense_count, stats.sparse_count, stats.embedding_bytes);
}

Search Performance

  1. Pre-compute query magnitude: Done automatically in search
  2. Use HNSW for >10K vectors: Brute-force for smaller sets
  3. Tune ef_search: Higher for recall, lower for speed
  4. Parallel threshold: Automatic at 5000 vectors

Unified Entity Best Practices

  1. Use for cross-engine queries: When embeddings relate to graph/relational data
  2. Entity key conventions: Use prefixes like user:, doc:, item:
  3. Separate embedding namespace: Use store_embedding for isolated vectors

Dependencies

| Crate | Purpose |
|---|---|
| tensor_store | Persistence, SparseVector, HNSWIndex, SIMD |
| rayon | Parallel iteration for large datasets |
| serde | Serialization of types |
| tracing | Instrumentation and observability |

Note: wide (SIMD f32x8 operations) is a transitive dependency via tensor_store.

Tensor Compress

Module 8 of Neumann. Provides tensor-native compression exploiting the mathematical structure of high-dimensional embeddings.

The primary compression method is Tensor Train (TT) decomposition, which decomposes vectors reshaped as tensors into a chain of smaller 3D cores using successive SVD truncations. This achieves 10-40x compression for 1024+ dimension vectors while enabling similarity computations directly in compressed space.

Design Principles

  1. Tensor Mathematics: Uses Tensor Train decomposition to exploit low-rank structure
  2. Higher Dimensions Are Lower: Decomposes vectors into products of smaller tensors
  3. Streaming I/O: Process large snapshots without loading entire dataset
  4. Incremental Updates: Delta snapshots for efficient replication
  5. Pure Rust: No external LAPACK/BLAS dependencies - fully portable

Key Types

| Type | Description |
|---|---|
| TTVector | Complete TT-decomposition of a vector with cores, shape, and ranks |
| TTCore | Single 3D tensor core (left_rank x mode_size x right_rank) |
| TTConfig | Configuration for TT decomposition (shape, max_rank, tolerance) |
| CompressionConfig | Snapshot compression settings (tensor mode, delta, RLE) |
| TensorMode | Compression mode enum (currently TensorTrain variant) |
| RleEncoded<T> | Run-length encoded data with values and run lengths |
| DeltaSnapshot | Snapshot containing only changes since a base snapshot |
| DeltaChain | Chain of deltas with efficient lookup and compaction |
| StreamingWriter | Memory-bounded incremental snapshot writer |
| StreamingReader | Iterator-based snapshot reader |
| StreamingTTWriter | Streaming TT-compressed vector writer |
| StreamingTTReader | Streaming TT-compressed vector reader |
| Matrix | Row-major matrix for SVD operations |
| SvdResult | Truncated SVD result (U, S, Vt matrices) |
| TensorView | Zero-copy logical view of tensor data |
| DeltaBuilder | Builder for creating delta snapshots |

Error Types

| Error | Description |
|---|---|
| TTError::ShapeMismatch | Vector dimension doesn't match reshape target |
| TTError::EmptyVector | Cannot decompose empty vector |
| TTError::InvalidRank | TT-rank must be >= 1 |
| TTError::IncompatibleShapes | TT vectors have different shapes for operation |
| TTError::InvalidShape | Shape contains zero or is empty |
| TTError::InvalidTolerance | Tolerance must be 0 < tol <= 1 |
| TTError::Decompose | SVD decomposition failed |
| FormatError::InvalidMagic | File magic bytes don't match expected |
| FormatError::UnsupportedVersion | Format version is newer than supported |
| FormatError::Serialization | Bincode serialization/deserialization error |
| DeltaError::BaseNotFound | Referenced base snapshot doesn't exist |
| DeltaError::SequenceGap | Delta sequence numbers have gaps |
| DeltaError::ChainTooLong | Delta chain exceeds maximum length |
| DecomposeError::EmptyMatrix | Cannot decompose empty matrix |
| DecomposeError::DimensionMismatch | Matrix dimensions don't match for operation |
| DecomposeError::SvdNotConverged | SVD iteration didn't converge |

Architecture

graph TD
    subgraph tensor_compress
        TT[tensor_train.rs<br/>TT-SVD decomposition]
        DC[decompose.rs<br/>SVD implementation]
        FMT[format.rs<br/>Snapshot format]
        STR[streaming.rs<br/>Streaming I/O]
        STT[streaming_tt.rs<br/>Streaming TT]
        INC[incremental.rs<br/>Delta snapshots]
        DLT[delta.rs<br/>Delta + varint encoding]
        RLE[rle.rs<br/>Run-length encoding]
    end

    TT --> DC
    FMT --> TT
    FMT --> DLT
    FMT --> RLE
    STR --> FMT
    STT --> TT
    INC --> FMT

Tensor Train Decomposition

Algorithm Overview

The TT-SVD algorithm (Oseledets 2011) decomposes a vector by:

  1. Reshape: Convert 1D vector to multi-dimensional tensor
  2. Left-to-right sweep: For each mode k from 1 to n-1:
    • Left-unfold the current tensor into a matrix
    • Compute truncated SVD: A = U S Vt
    • Store U as the k-th core
    • Multiply S * Vt to get the remainder for next iteration
  3. Final core: The last remainder becomes the final core
graph LR
    subgraph "TT-SVD Algorithm"
        V[Vector 4096-dim] --> R[Reshape to 8x8x8x8]
        R --> U1[Unfold mode 1<br/>64 x 64]
        U1 --> SVD1[SVD truncate<br/>rank=8]
        SVD1 --> C1[Core 1<br/>1x8x8]
        SVD1 --> R2[Remainder<br/>8x512]
        R2 --> SVD2[SVD truncate]
        SVD2 --> C2[Core 2<br/>8x8x8]
        SVD2 --> R3[Remainder]
        R3 --> SVD3[SVD truncate]
        SVD3 --> C3[Core 3<br/>8x8x8]
        SVD3 --> C4[Core 4<br/>8x8x1]
    end

Compression Example

For a 4096-dim embedding reshaped to (8, 8, 8, 8):

Original: 4096 floats = 16 KB
TT-cores: 1x8x8 + 8x8x8 + 8x8x8 + 8x8x1 = 64 + 512 + 512 + 64 = 1152 floats
With max_rank=4: 1x8x4 + 4x8x4 + 4x8x4 + 4x8x1 = 32 + 128 + 128 + 32 = 320 floats = 1.25 KB
Compression: 12.8x
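
The size arithmetic above can be reproduced by summing core sizes; a minimal sketch, assuming each core of shape (r_left, mode, r_right) stores r_left * mode * r_right floats:

```rust
// Total float count of a TT representation given its core shapes
fn tt_floats(cores: &[(usize, usize, usize)]) -> usize {
    cores.iter().map(|&(r1, n, r2)| r1 * n * r2).sum()
}

fn main() {
    // 4096-dim vector reshaped to (8, 8, 8, 8), ranks truncated to 4
    let cores = [(1, 8, 4), (4, 8, 4), (4, 8, 4), (4, 8, 1)];
    let compressed = tt_floats(&cores);
    println!("{} floats vs 4096 -> {:.1}x compression", compressed, 4096.0 / compressed as f64);
    // 320 floats vs 4096 -> 12.8x compression
}
```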

SVD Implementation Details

The module implements two SVD algorithms:

1. Power Iteration with Deflation (small matrices)

Used when matrix dimensions are <= 32 or rank is close to matrix size:

#![allow(unused)]
fn main() {
// Simplified power iteration: returns the dominant singular triplet (sigma, u, v)
fn power_iteration(a: &Matrix, max_iter: usize, tol: f32) -> (f32, Vec<f32>, Vec<f32>) {
    // Initialize v randomly (deterministic seed)
    let mut v: Vec<f32> = (0..cols).map(|i| ((i * 7 + 3) % 13) as f32 / 13.0 - 0.5).collect();
    normalize(&mut v);

    for _ in 0..max_iter {
        // u = A * v, then normalize
        u = matmul(a, v);
        new_sigma = normalize(&mut u);

        // v = A^T * u, then normalize
        v = matmul_transpose(a, u);
        normalize(&mut v);

        // Check convergence
        if (new_sigma - sigma).abs() < tol * sigma.max(1.0) {
            return (new_sigma, u, v);
        }
        sigma = new_sigma;
    }
}
}

After finding each singular triplet, the algorithm deflates: A = A - sigma * u * v^T

2. Randomized SVD (large matrices)

Uses the Halko-Martinsson-Tropp 2011 algorithm for matrices > 32 dimensions:

graph TD
    subgraph "Randomized SVD Pipeline"
        A[Input Matrix A<br/>m x n] --> OMEGA[Generate Gaussian<br/>Omega n x k+p]
        A --> SAMPLE[Y = A * Omega<br/>m x k+p]
        SAMPLE --> QR[QR decompose Y<br/>Q = orth basis]
        QR --> PROJECT[B = Q^T * A<br/>k+p x n]
        PROJECT --> SMALL_SVD[SVD of small B<br/>power iteration]
        SMALL_SVD --> RECONSTRUCT[U = Q * U_small]
    end

Key implementation details:

  • Gaussian matrix generation: Uses a Linear Congruential Generator (LCG) with Box-Muller transform for deterministic, portable random numbers
  • QR orthonormalization: Modified Gram-Schmidt for numerical stability
  • Oversampling: Adds 5 extra columns to improve accuracy
  • Convergence: 20 iterations max (sufficient for embedding vectors)
#![allow(unused)]
fn main() {
// LCG parameters from Numerical Recipes
fn lcg_next(state: &mut u64) -> u64 {
    *state = state.wrapping_mul(6_364_136_223_846_793_005)
                  .wrapping_add(1_442_695_040_888_963_407);
    *state
}

// Box-Muller transform for Gaussian
fn box_muller(u1: f32, u2: f32) -> (f32, f32) {
    use std::f32::consts::PI;
    let r = (-2.0 * u1.ln()).sqrt();
    let theta = 2.0 * PI * u2;
    (r * theta.cos(), r * theta.sin())
}
}

Optimal Shape Selection

The module includes hardcoded optimal shapes for common embedding dimensions:

| Dimension | Shape | Why |
|---|---|---|
| 64 | [4, 4, 4] | 3 balanced factors |
| 128 | [4, 4, 8] | Near-balanced |
| 256 | [4, 8, 8] | Near-balanced |
| 384 | [4, 8, 12] | all-MiniLM-L6-v2 |
| 512 | [8, 8, 8] | Perfect cube |
| 768 | [8, 8, 12] | BERT dimension |
| 1024 | [8, 8, 16] | Common LLM size |
| 1536 | [8, 12, 16] | OpenAI ada-002 |
| 2048 | [8, 16, 16] | Near-balanced |
| 3072 | [8, 16, 24] | Large models |
| 4096 | [8, 8, 8, 8] | 4D balanced |
| 8192 | [8, 8, 8, 16] | Extra large |

For non-standard dimensions, factorize_balanced finds factors close to the nth root:

#![allow(unused)]
fn main() {
fn factorize_balanced(n: usize) -> Vec<usize> {
    // Target 2-6 factors based on log2(n)
    let target_factors = ((n as f64).log2().ceil() as usize).clamp(2, 6);
    let target_size = (n as f64).powf(1.0 / target_factors as f64);

    // Greedily find factors close to target_size
    // ...
}
}
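A runnable sketch of the greedy idea: repeatedly peel off the divisor closest to the running nth root. The crate's exact search and factor-count heuristic may differ; this only illustrates the "factors close to n^(1/k)" goal, with the factor count passed in explicitly:

```rust
// Hedged sketch: peel off the divisor of n closest to n^(1/remaining)
// at each step, so the factors stay near-balanced.
fn factorize_balanced(mut n: usize, target_factors: usize) -> Vec<usize> {
    let mut factors = Vec::new();
    for remaining in (1..=target_factors).rev() {
        if remaining == 1 || n == 1 {
            factors.push(n);
            break;
        }
        let target = (n as f64).powf(1.0 / remaining as f64);
        // pick the divisor of n closest to the running root
        let best = (2..=n)
            .filter(|d| n % d == 0)
            .min_by(|a, b| {
                (*a as f64 - target)
                    .abs()
                    .partial_cmp(&(*b as f64 - target).abs())
                    .unwrap()
            })
            .unwrap_or(n);
        factors.push(best);
        n /= best;
    }
    factors
}

fn main() {
    assert_eq!(factorize_balanced(64, 3), vec![4, 4, 4]);
    // Factors always multiply back to the original dimension.
    assert_eq!(factorize_balanced(384, 3).iter().product::<usize>(), 384);
}
```

Note the sketch can produce a different (still valid) factorization than the hardcoded table, e.g. for 384; the table entries are tuned by hand.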

TT Operations

| Function | Description | Complexity |
|---|---|---|
| tt_decompose | Decompose vector to TT format | O(n d r^2) |
| tt_decompose_batch | Parallel batch decomposition (4+ vectors) | O(batch n d r^2 / threads) |
| tt_reconstruct | Reconstruct vector from TT | O(d^n r^2) |
| tt_dot_product | Dot product in TT space | O(n d r^4) |
| tt_dot_product_batch | Batch dot products | Parallel when >= 4 targets |
| tt_cosine_similarity | Cosine similarity in TT space | O(n d r^4) |
| tt_cosine_similarity_batch | Batch cosine similarities | Parallel when >= 4 targets |
| tt_euclidean_distance | Euclidean distance in TT space | O(n d r^4) |
| tt_euclidean_distance_batch | Batch Euclidean distances | Parallel when >= 4 targets |
| tt_norm | L2 norm of TT vector | O(n d r^4) |
| tt_scale | Scale TT vector by constant | O(cores[0].size) |

Where: n = number of modes, d = mode size, r = TT-rank
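The storage side of these complexities is easy to make concrete: a TT vector with mode sizes n_k and ranks r_0..r_d (with r_0 = r_d = 1) stores one core of r_{k-1} * n_k * r_k floats per mode. A small accounting sketch (the rank values are illustrative assumptions, not measured output):

```rust
// Sketch of TT storage accounting; ranks has len shape.len() + 1,
// with boundary ranks fixed at 1.
fn tt_storage(shape: &[usize], ranks: &[usize]) -> usize {
    assert_eq!(ranks.len(), shape.len() + 1);
    shape
        .iter()
        .enumerate()
        .map(|(k, &n)| ranks[k] * n * ranks[k + 1])
        .sum()
}

fn main() {
    // 768 = 8 * 8 * 12 at max_rank = 8 with no truncation: barely smaller.
    assert_eq!(tt_storage(&[8, 8, 12], &[1, 8, 8, 1]), 672);
    // If tolerance truncates all ranks to 2, 768 floats become 72,
    // i.e. roughly a 10.7x ratio.
    assert_eq!(tt_storage(&[8, 8, 12], &[1, 2, 2, 1]), 72);
}
```

This is why the headline compression ratios depend on tolerance-based rank truncation rather than on `max_rank` alone.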

TT Gram Matrix Computation

Computing dot products and norms in TT space uses the Gram matrix approach:

#![allow(unused)]
fn main() {
// Gram matrix propagation for dot product
fn tt_dot_product(a: &TTVector, b: &TTVector) -> f32 {
    let mut gram = vec![1.0f32];  // Start with 1x1 identity

    for (core_a, core_b) in a.cores.iter().zip(b.cores.iter()) {
        let (r1a, n, r2a) = core_a.shape;
        let (r1b, _, r2b) = core_b.shape;
        let mut new_gram = vec![0.0; r2a * r2b];

        // Contract: new_gram[a,b] = sum_{k,i,j} gram[i,j] * A[i,k,a] * B[j,k,b]
        for a_idx in 0..r2a {
            for b_idx in 0..r2b {
                for k in 0..n {
                    for ia in 0..r1a {
                        for ib in 0..r1b {
                            let g = gram[ia * r1b + ib];
                            new_gram[a_idx * r2b + b_idx] +=
                                g * core_a.get(ia, k, a_idx) * core_b.get(ib, k, b_idx);
                        }
                    }
                }
            }
        }
        gram = new_gram;
    }

    gram[0]  // Final 1x1 Gram matrix
}
}

Usage

#![allow(unused)]
fn main() {
use tensor_compress::{tt_decompose, tt_reconstruct, tt_cosine_similarity, TTConfig};

let embedding: Vec<f32> = get_embedding();  // 4096-dim
let config = TTConfig::for_dim(4096)?;

// Decompose
let tt = tt_decompose(&embedding, &config)?;
println!("Compression: {:.1}x", tt.compression_ratio());
println!("Storage: {} floats", tt.storage_size());
println!("Max rank: {}", tt.max_rank());

// Reconstruct
let restored = tt_reconstruct(&tt);

// Compute similarity without reconstruction
let tt2 = tt_decompose(&other_embedding, &config)?;
let sim = tt_cosine_similarity(&tt, &tt2)?;
}

Batch Operations

Batch operations use rayon for parallel processing when handling 4+ vectors:

#![allow(unused)]
fn main() {
use tensor_compress::{tt_decompose_batch, tt_cosine_similarity_batch, TTConfig};

let vectors: Vec<Vec<f32>> = load_embeddings();
let config = TTConfig::for_dim(4096)?;

// Batch decompose (parallel for 4+ vectors)
let refs: Vec<&[f32]> = vectors.iter().map(|v| v.as_slice()).collect();
let tts = tt_decompose_batch(&refs, &config)?;

// Batch similarity search
let query_tt = &tts[0];
let similarities = tt_cosine_similarity_batch(query_tt, &tts[1..])?;

// Find top-k
let mut indexed: Vec<_> = similarities.iter().enumerate().collect();
indexed.sort_by(|a, b| b.1.partial_cmp(a.1).unwrap());
let top_5: Vec<_> = indexed.iter().take(5).collect();
}

The parallel threshold constant is:

#![allow(unused)]
fn main() {
const PARALLEL_THRESHOLD: usize = 4;
}

Configuration

TTConfig Presets

| Preset | max_rank | tolerance | Use Case |
|---|---|---|---|
| for_dim(d) | 8 | 1e-4 | Balanced compression/accuracy |
| high_compression(d) | 4 | 1e-2 | Maximize compression (2-3x more) |
| high_accuracy(d) | 16 | 1e-6 | Maximize accuracy (<0.1% error) |

TTConfig Validation

#![allow(unused)]
fn main() {
impl TTConfig {
    pub fn validate(&self) -> Result<(), TTError> {
        if self.shape.is_empty() {
            return Err(TTError::InvalidShape("empty shape".into()));
        }
        if self.shape.contains(&0) {
            return Err(TTError::InvalidShape("shape contains zero".into()));
        }
        if self.max_rank < 1 {
            return Err(TTError::InvalidRank);
        }
        if self.tolerance <= 0.0 || self.tolerance > 1.0 || !self.tolerance.is_finite() {
            return Err(TTError::InvalidTolerance(self.tolerance));
        }
        Ok(())
    }
}
}

CompressionConfig

#![allow(unused)]
fn main() {
pub struct CompressionConfig {
    pub tensor_mode: Option<TensorMode>,  // TT compression for vectors
    pub delta_encoding: bool,             // For sorted ID lists
    pub rle_encoding: bool,               // For repeated values
}

// Presets
CompressionConfig::high_compression()  // max_rank=4, all encodings enabled
CompressionConfig::balanced(dim)       // max_rank=8, all encodings enabled
CompressionConfig::high_accuracy(dim)  // max_rank=16, all encodings enabled
}

Dimension Presets

| Constant | Value | Model |
|---|---|---|
| SMALL | 64 | MiniLM and small models |
| MEDIUM | 384 | all-MiniLM-L6-v2 |
| STANDARD | 768 | BERT, sentence-transformers |
| LARGE | 1536 | OpenAI text-embedding-ada-002 |
| XLARGE | 4096 | LLaMA and large models |

Streaming Operations

State Machine

stateDiagram-v2
    [*] --> Created: new()
    Created --> Writing: write_entry() / write_vector()
    Writing --> Writing: write_entry() / write_vector()
    Writing --> Finishing: finish()
    Finishing --> [*]: success

    note right of Created
        Magic bytes written
        entry_count = 0
    end note

    note right of Writing
        Length-prefixed entries
        entry_count incremented
    end note

    note right of Finishing
        Trailer written with:
        - entry_count
        - config
        - data_start offset
    end note

File Format

Uses a trailer-based layout: the header is written at the end of the file, since the entry count is only known after all entries have been streamed:

+------------------------+
| Magic (NEUS/NEUT)  4B  |  Identifies streaming snapshot/TT
+------------------------+
| Entry 1 length     4B  |  Little-endian u32
+------------------------+
| Entry 1 data      var  |  Bincode-serialized entry
+------------------------+
| Entry 2 length     4B  |
+------------------------+
| Entry 2 data      var  |
+------------------------+
| ...                    |
+------------------------+
| Trailer           var  |  Bincode-serialized header
+------------------------+
| Trailer size       8B  |  Little-endian u64
+------------------------+

Security limits:

  • Maximum trailer size: 1 MB (MAX_TRAILER_SIZE)
  • Maximum entry size: 100 MB (MAX_ENTRY_SIZE)

Usage

#![allow(unused)]
fn main() {
use tensor_compress::streaming::{StreamingWriter, StreamingReader};

// Write entries one at a time
let mut writer = StreamingWriter::new(file, config)?;
for entry in entries {
    writer.write_entry(&entry)?;
}
writer.finish()?;

// Read entries one at a time (iterator-based)
let reader = StreamingReader::open(file)?;
println!("Entry count: {}", reader.entry_count());
for entry in reader {
    process(entry?);
}
}

Streaming TT Operations

| Function | Description |
|---|---|
| StreamingTTWriter::new | Create TT streaming writer |
| StreamingTTWriter::write_vector | Decompose and write vector |
| StreamingTTWriter::write_tt | Write pre-decomposed TT |
| StreamingTTWriter::finish | Finalize with trailer |
| StreamingTTReader::open | Open TT streaming file |
| streaming_tt_similarity_search | Search streaming TT file |
| convert_vectors_to_streaming_tt | Batch convert vectors |
| read_streaming_tt_all | Load all TT vectors |

#![allow(unused)]
fn main() {
use tensor_compress::streaming_tt::{StreamingTTWriter, StreamingTTReader,
    streaming_tt_similarity_search};

// Create streaming TT file
let config = TTConfig::for_dim(768)?;
let mut writer = StreamingTTWriter::new(file, config.clone())?;

for vector in vectors {
    writer.write_vector(&vector)?;  // Decompose on-the-fly
}
writer.finish()?;

// Similarity search without loading all into memory
let query_tt = tt_decompose(&query, &config)?;
let top_10 = streaming_tt_similarity_search(file, &query_tt, 10)?;
// Returns Vec<(index, similarity)> sorted by descending similarity
}

Merge and Convert Operations

#![allow(unused)]
fn main() {
use tensor_compress::streaming::{convert_to_streaming, read_streaming_to_snapshot,
    merge_streaming};

// Convert non-streaming snapshot to streaming format
let count = convert_to_streaming(&snapshot, output_file)?;

// Read streaming format into full snapshot (for compatibility)
let snapshot = read_streaming_to_snapshot(file)?;

// Merge multiple streaming snapshots
let count = merge_streaming(vec![file1, file2, file3], output, config)?;
}

Incremental Updates

Delta Snapshot Architecture

graph TD
    subgraph "Delta Chain"
        BASE[Base Snapshot<br/>Seq 0] --> D1[Delta 1<br/>Seq 1-10]
        D1 --> D2[Delta 2<br/>Seq 11-25]
        D2 --> D3[Delta 3<br/>Seq 26-30]
    end

    subgraph "Compaction"
        BASE2[Base] --> COMPACT[Compacted<br/>Snapshot]
        D1_2[Delta 1] --> COMPACT
        D2_2[Delta 2] --> COMPACT
        D3_2[Delta 3] --> COMPACT
    end

Delta Entry Types

#![allow(unused)]
fn main() {
pub enum ChangeType {
    Put,    // Entry was added or updated
    Delete, // Entry was deleted
}

pub struct DeltaEntry {
    pub key: String,
    pub change: ChangeType,
    pub value: Option<CompressedEntry>,  // None for Delete
    pub sequence: u64,
}
}
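The semantics of applying these entries can be sketched against a plain map. The value type is simplified to `i64` here (the crate stores `CompressedEntry`), and the function name is illustrative:

```rust
use std::collections::HashMap;

// Minimal sketch of delta application: entries are applied in sequence
// order, so later changes to the same key win.
#[derive(Clone, Copy)]
enum ChangeType {
    Put,
    Delete,
}

struct DeltaEntry {
    key: String,
    change: ChangeType,
    value: Option<i64>, // None for Delete
    sequence: u64,
}

fn apply_entries(base: &mut HashMap<String, i64>, mut entries: Vec<DeltaEntry>) {
    entries.sort_by_key(|e| e.sequence);
    for e in entries {
        match e.change {
            ChangeType::Put => {
                base.insert(e.key, e.value.expect("Put carries a value"));
            }
            ChangeType::Delete => {
                base.remove(&e.key);
            }
        }
    }
}

fn main() {
    let mut base = HashMap::from([("key1".to_string(), 1)]);
    let entries = vec![
        DeltaEntry { key: "key2".into(), change: ChangeType::Put, value: Some(2), sequence: 1 },
        DeltaEntry { key: "key1".into(), change: ChangeType::Delete, value: None, sequence: 2 },
    ];
    apply_entries(&mut base, entries);
    assert_eq!(base.get("key1"), None);
    assert_eq!(base.get("key2"), Some(&2));
}
```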

Usage

#![allow(unused)]
fn main() {
use tensor_compress::incremental::{DeltaBuilder, DeltaChain, apply_delta,
    merge_deltas, diff_snapshots};

// Create delta
let mut builder = DeltaBuilder::new("base_snapshot_id", sequence);
builder.put("key1", entry1);
builder.delete("key2");
let delta = builder.build();

// Apply delta
let new_snapshot = apply_delta(&base, &delta)?;

// Chain management
let mut chain = DeltaChain::new(base_snapshot);
chain.push(delta1)?;
chain.push(delta2)?;
let value = chain.get("key1");  // Checks chain then base

// Compact when chain grows long
if chain.should_compact(10) {
    let compacted = chain.compact()?;
}

// Compare two snapshots
let delta = diff_snapshots(&old_snapshot, &new_snapshot, "old_id")?;

// Merge multiple deltas into one
let merged = merge_deltas(&[delta1, delta2, delta3])?;
}

Delta Operations

| Function | Description |
|---|---|
| DeltaBuilder::new | Create delta builder with base ID and start sequence |
| DeltaBuilder::put | Record a put (add/update) change |
| DeltaBuilder::delete | Record a delete change |
| DeltaBuilder::build | Build the delta snapshot |
| apply_delta | Apply delta to base snapshot |
| merge_deltas | Merge multiple deltas (keeps latest state per key) |
| diff_snapshots | Compute delta between two snapshots |
| DeltaChain::get | Get current state of key (checks chain then base) |
| DeltaChain::compact | Compact all deltas into new base |
| DeltaChain::should_compact | Check if compaction is recommended |

Delta Format

+------------------------+
| Magic (NEUD)       4B  |
+------------------------+
| Version            2B  |
+------------------------+
| Base ID           var  |  String (length-prefixed)
+------------------------+
| Sequence Range     16B |  (start, end) u64 pair
+------------------------+
| Change Count        8B |
+------------------------+
| Created At          8B |  Unix timestamp
+------------------------+
| Entries           var  |  Bincode-serialized Vec<DeltaEntry>
+------------------------+

Lossless Compression

Delta + Varint Encoding

For sorted integer sequences (node IDs, timestamps):

graph LR
    subgraph "Delta + Varint Pipeline"
        IDS[IDs: 100, 101, 102, 105, 110] --> DELTA[Delta encode:<br/>100, 1, 1, 3, 5]
        DELTA --> VARINT[Varint encode]
        VARINT --> OUT[Bytes: ~7 bytes<br/>vs 40 bytes raw]
    end

Algorithm:

#![allow(unused)]
fn main() {
// Delta encoding: store first value, then differences
pub fn delta_encode(ids: &[u64]) -> Vec<u64> {
    let mut result = vec![ids[0]];
    for window in ids.windows(2) {
        result.push(window[1].saturating_sub(window[0]));
    }
    result
}

// Varint encoding: 7 bits per byte, high bit = continuation
pub fn varint_encode(values: &[u64]) -> Vec<u8> {
    let mut result = Vec::with_capacity(values.len() * 2);
    for &value in values {
        let mut v = value;
        loop {
            let byte = (v & 0x7f) as u8;
            v >>= 7;
            if v == 0 {
                result.push(byte);  // Final byte (no continuation)
                break;
            }
            result.push(byte | 0x80);  // Continuation bit set
        }
    }
    result
}
}
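The round trip can be checked with the matching decoders. The encoder bodies below mirror the snippets above; the decode helpers are illustrative names (the crate exposes `compress_ids` / `decompress_ids` rather than these functions):

```rust
fn delta_encode(ids: &[u64]) -> Vec<u64> {
    let mut result = vec![ids[0]];
    for w in ids.windows(2) {
        result.push(w[1].saturating_sub(w[0]));
    }
    result
}

fn varint_encode(values: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    for &value in values {
        let mut v = value;
        loop {
            let byte = (v & 0x7f) as u8;
            v >>= 7;
            if v == 0 {
                out.push(byte);
                break;
            }
            out.push(byte | 0x80);
        }
    }
    out
}

// Inverse of varint_encode: accumulate 7-bit groups until a byte
// without the continuation bit ends the value.
fn varint_decode(bytes: &[u8]) -> Vec<u64> {
    let (mut out, mut value, mut shift) = (Vec::new(), 0u64, 0u32);
    for &b in bytes {
        value |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            out.push(value);
            value = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    out
}

// Inverse of delta_encode: running prefix sum, first value kept as-is.
fn delta_decode(deltas: &[u64]) -> Vec<u64> {
    let mut acc = 0u64;
    deltas
        .iter()
        .enumerate()
        .map(|(i, &d)| {
            acc = if i == 0 { d } else { acc + d };
            acc
        })
        .collect()
}

fn main() {
    let ids = vec![100u64, 101, 102, 105, 110];
    let bytes = varint_encode(&delta_encode(&ids));
    assert_eq!(bytes.len(), 5); // each delta fits in one byte here
    assert_eq!(delta_decode(&varint_decode(&bytes)), ids);
}
```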

Usage:

#![allow(unused)]
fn main() {
use tensor_compress::{compress_ids, decompress_ids};

let ids: Vec<u64> = (1000..2000).collect();
let compressed = compress_ids(&ids);  // ~1 KB vs 8 KB raw

let restored = decompress_ids(&compressed);
assert_eq!(ids, restored);
}

Varint byte sizes:

| Value Range | Bytes |
|---|---|
| 0 - 127 | 1 |
| 128 - 16,383 | 2 |
| 16,384 - 2,097,151 | 3 |
| 2,097,152 - 268,435,455 | 4 |
| … up to u64::MAX | 10 |
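The byte counts follow directly from 7 payload bits per byte, which can be sketched in one line (a convenience helper, assumed rather than part of the crate's API):

```rust
// A varint needs ceil(bits/7) bytes, minimum one for the value 0.
fn varint_len(v: u64) -> usize {
    (64 - v.leading_zeros() as usize).max(1).div_ceil(7)
}

fn main() {
    assert_eq!(varint_len(0), 1);
    assert_eq!(varint_len(127), 1);
    assert_eq!(varint_len(128), 2);
    assert_eq!(varint_len(2_097_151), 3);
    assert_eq!(varint_len(u64::MAX), 10);
}
```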

Run-Length Encoding

For repeated values:

#![allow(unused)]
fn main() {
use tensor_compress::{rle_encode, rle_decode};

let statuses = vec!["active"; 1000];
let encoded = rle_encode(&statuses);
assert_eq!(encoded.runs(), 1);  // Single run

// Storage: 1 string + 1 u32 = ~12 bytes vs 6000+ bytes
}

Internal representation:

#![allow(unused)]
fn main() {
pub struct RleEncoded<T: Eq> {
    pub values: Vec<T>,      // Unique values in order
    pub run_lengths: Vec<u32>, // Count for each value
}
}
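A minimal encoder into this representation (a sketch; the crate's `rle_encode` signature and bounds may differ):

```rust
// Runs of equal adjacent values collapse into (value, count) pairs.
#[derive(Debug, PartialEq)]
pub struct RleEncoded<T: Eq> {
    pub values: Vec<T>,
    pub run_lengths: Vec<u32>,
}

fn rle_encode<T: Eq + Clone>(data: &[T]) -> RleEncoded<T> {
    let mut values: Vec<T> = Vec::new();
    let mut run_lengths: Vec<u32> = Vec::new();
    for item in data {
        match values.last() {
            // extend the current run
            Some(last) if last == item => *run_lengths.last_mut().unwrap() += 1,
            // start a new run
            _ => {
                values.push(item.clone());
                run_lengths.push(1);
            }
        }
    }
    RleEncoded { values, run_lengths }
}

fn main() {
    let encoded = rle_encode(&[1, 1, 2, 2, 2, 3, 1, 1, 1, 1]);
    assert_eq!(encoded.values, vec![1, 2, 3, 1]);
    assert_eq!(encoded.run_lengths, vec![2, 3, 1, 4]); // 4 runs
}
```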

Compression scenarios:

| Data Pattern | Runs | Compression |
|---|---|---|
| [5, 5, 5, 5, 5] (1000x) | 1 | 500x |
| [1, 2, 3, 4, 5] (all different) | 5 | 0.8x (overhead) |
| [1, 1, 2, 2, 2, 3, 1, 1, 1, 1] | 4 | 2.5x |
| Status column (pending/active/done) | ~300 per 10000 | ~33x |

Sparse Vector Format

For vectors with >50% zeros:

#![allow(unused)]
fn main() {
use tensor_compress::{compress_sparse, compress_dense_as_sparse,
    should_use_sparse, should_use_sparse_threshold};

// Direct sparse compression
let positions = vec![0, 50, 99];
let values = vec![1.0, 2.0, 3.0];
let compressed = compress_sparse(100, &positions, &values);

// Auto-detect and compress
if should_use_sparse_threshold(&vector, 0.5) {
    let compressed = compress_dense_as_sparse(&vector);
}

// Check if sparse is beneficial
if should_use_sparse(dimension, non_zero_count) {
    // Use sparse format
}
}

Storage calculation:

#![allow(unused)]
fn main() {
// sparse_storage_size = 8 + 8 + nnz*2 + nnz*4 = 16 + nnz*6
// Dense storage = dimension * 4

// Sparse is better when: 16 + nnz*6 <= dimension*4
// Solving: nnz <= (dimension*4 - 16) / 6 ≈ dimension*0.67 - 2.67
}
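The break-even arithmetic as runnable code, reproducing the threshold values in the table below (the helper names are illustrative, though the crate documents a `should_use_sparse` check):

```rust
// Sparse is worthwhile when its storage does not exceed dense storage:
// 16 + nnz*6 <= dimension*4.
fn should_use_sparse(dimension: usize, nnz: usize) -> bool {
    16 + nnz * 6 <= dimension * 4
}

// Largest non-zero count for which sparse still wins.
fn max_sparse_nnz(dimension: usize) -> usize {
    (dimension * 4).saturating_sub(16) / 6
}

fn main() {
    assert_eq!(max_sparse_nnz(100), 64);
    assert_eq!(max_sparse_nnz(1000), 664);
    assert_eq!(max_sparse_nnz(4096), 2728);
    assert!(should_use_sparse(100, 64));
    assert!(!should_use_sparse(100, 65));
}
```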

| Dimension | Max NNZ for Sparse | Sparsity Threshold |
|---|---|---|
| 100 | 64 | 64% |
| 1000 | 664 | 66.4% |
| 4096 | 2728 | 66.6% |

Compressed Value Types

#![allow(unused)]
fn main() {
pub enum CompressedValue {
    Scalar(CompressedScalar),           // Int, Float, String, Bool, Null
    VectorRaw(Vec<f32>),                // Uncompressed
    VectorTT { cores, original_dim, shape, ranks },  // TT-compressed
    VectorSparse { dimension, positions, values },   // Sparse
    IdList(Vec<u8>),                    // Delta + varint encoded
    RleInt(RleEncoded<i64>),            // RLE encoded integers
    Pointer(String),                    // Single pointer
    Pointers(Vec<String>),              // Multiple pointers
}
}

Automatic Format Selection

#![allow(unused)]
fn main() {
pub fn compress_vector(vector: &[f32], key: &str, field_name: &str,
    config: &CompressionConfig) -> Result<CompressedValue, FormatError> {

    // 1. Check for embedding-like keys
    let is_embedding = key.starts_with("emb:") ||
                       field_name == "_embedding" ||
                       field_name == "vector";

    if is_embedding {
        if let Some(TensorMode::TensorTrain(tt_config)) = &config.tensor_mode {
            return Ok(CompressedValue::VectorTT { ... });
        }
    }

    // 2. Check for ID list pattern
    if config.delta_encoding && looks_like_id_list(vector, field_name) {
        return Ok(CompressedValue::IdList(...));
    }

    // 3. Fall back to raw
    Ok(CompressedValue::VectorRaw(vector.to_vec()))
}
}

Performance

Benchmarks on Apple M4 (aarch64, MacBook Air 24GB), release build:

| Dimension | Decompose | Reconstruct | Similarity | Compression |
|---|---|---|---|---|
| 64 | 6.2 us | 29.5 us | 1.1 us | 2.0x |
| 256 | 13.4 us | 113.0 us | 1.5 us | 4.6x |
| 768 | 26.9 us | 431.7 us | 2.4 us | 10.7x |
| 1536 | 62.0 us | 709.8 us | 2.0 us | 16.0x |
| 4096 | 464.5 us | 2142.2 us | 2.4 us | 42.7x |

Batch operations (768-dim, 1000 vectors):

| Operation | Time | Per-vector |
|---|---|---|
| tt_decompose_batch | 21 ms | 21.0 us |
| tt_cosine_similarity_batch | 11.3 ms | 11.4 us |

Throughput: 39,318 vectors/sec (768-dim decomposition)

Industry Comparison

| Method | Compression | Recall | Notes |
|---|---|---|---|
| Tensor Train (this) | 10-42x | ~99% | Similarity in compressed space |
| Scalar Quantization | 4x | 99%+ | Industry default |
| Product Quantization | 16-64x | 56-90% | Requires training |
| Binary Quantization | 32x | 80-95% | Speed-optimized |

Edge Cases and Gotchas

Vector Content Patterns

| Pattern | Compression | Reconstruction | Notes |
|---|---|---|---|
| Constant (all same) | Excellent (>5x) | Accurate | Rank-1 structure |
| All zeros | Good | Accurate | Degenerate case |
| Single spike | Poor | Moderate | No low-rank structure |
| Linear ramp | Good (>2x) | Good | Low-rank |
| Alternating +1/-1 | Poor | Moderate | High-frequency needs high rank |
| Random dense | Good | Good (>0.9 cosine) | Typical embeddings |
| 90% zeros | Consider sparse instead | n/a | Use compress_dense_as_sparse |

Numerical Edge Cases

#![allow(unused)]
fn main() {
// Very small values (denormalized floats)
let tiny: Vec<f32> = (0..64).map(|i| (i as f32) * 1e-38).collect();
// Works, but may lose precision

// Large values (1e6 range)
let large: Vec<f32> = (0..64).map(|i| (i as f32) * 1e6).collect();
// Works, no overflow

// Prime dimensions
let prime_127: Vec<f32> = (0..127).map(|i| (i as f32 * 0.1).sin()).collect();
// Works but may have poor compression
}

Streaming Gotchas

  1. Incomplete files: Magic bytes are written first, but entry count is in trailer. If writer crashes before finish(), the file is corrupt.

  2. Memory limits: MAX_ENTRY_SIZE = 100MB and MAX_TRAILER_SIZE = 1MB prevent allocation attacks. Exceeding these returns an error.

  3. Seek requirement: StreamingReader::open requires Seek to read the trailer. For non-seekable streams, use read_streaming_to_snapshot which buffers.

Delta Chain Gotchas

  1. Chain length: Default max_chain_len = 100. After this, push() returns ChainTooLong error. Call compact() periodically.

  2. Sequence gaps: Deltas should have contiguous sequences. The merge_deltas function only keeps the latest state per key.

  3. Base reference: Deltas store a base_id string but don’t validate it exists. Your application must track base snapshots.

Performance Tips and Best Practices

Choosing Configuration

#![allow(unused)]
fn main() {
// For search/retrieval (similarity queries)
let config = TTConfig::for_dim(dim)?;  // Balanced

// For archival/cold storage
let config = TTConfig::high_compression(dim)?;  // Smaller, slower queries

// For real-time applications
let config = TTConfig::high_accuracy(dim)?;  // Larger, faster queries
}

Batch Size Optimization

#![allow(unused)]
fn main() {
// Below parallel threshold (4), sequential is faster
// due to thread spawn overhead
let small_batch = tt_decompose_batch(&vectors[..3], &config);  // Sequential

// At threshold, parallel kicks in
let large_batch = tt_decompose_batch(&vectors, &config);  // Parallel if >= 4
}

Memory Efficiency

#![allow(unused)]
fn main() {
// Bad: Load all, then process
let all_vectors = read_streaming_tt_all(file)?;  // Loads all into memory

// Good: Stream process
for tt in StreamingTTReader::open(file)? {
    process(tt?);  // One at a time
}

// Best: Use streaming search
let results = streaming_tt_similarity_search(file, &query_tt, 10)?;
}

Delta Compaction Strategy

#![allow(unused)]
fn main() {
let mut chain = DeltaChain::new(base);

// After N deltas or M total changes
if chain.len() >= 10 || total_changes >= 10000 {
    let new_base = chain.compact()?;
    chain = DeltaChain::new(new_base);
}
}

Dependencies

  • serde: Serialization traits
  • bincode: Binary format
  • thiserror: Error types
  • rayon: Parallel batch operations

No external LAPACK/BLAS - pure Rust SVD implementation.

| Module | Relationship |
|---|---|
| tensor_store | Uses compression for snapshot I/O |
| tensor_chain | Delta compression for state replication |
| tensor_checkpoint | Snapshot format integration |

Tensor Vault

Tensor Vault provides secure secret storage with AES-256-GCM encryption and graph-based access control. Designed for multi-agent environments, it implements a zero-trust architecture where access is determined by graph topology rather than traditional ACLs.

All secrets are encrypted at rest with authenticated encryption. The vault maintains a permanent audit trail of all operations and supports features like rate limiting, TTL-based grants, and namespace isolation for multi-tenant deployments.

Design Principles

| Principle | Description |
|---|---|
| Encryption at Rest | All secrets encrypted with AES-256-GCM |
| Topological Access Control | Access determined by graph path, not ACLs |
| Zero Trust | No bypass mode; node:root is the only universal accessor |
| Memory Safety | Keys zeroized on drop via zeroize crate |
| Permanent Audit Trail | All operations logged with queryable API |
| Defense in Depth | Multiple obfuscation layers hide patterns |
| Multi-Tenant Ready | Namespace isolation and rate limiting for agent systems |

Key Types

Core Types

| Type | Description |
|---|---|
| Vault | Main API for encrypted secret storage with graph-based access control |
| VaultConfig | Configuration for key derivation, rate limiting, and versioning |
| VaultError | Error types (AccessDenied, NotFound, CryptoError, etc.) |
| Permission | Access levels: Read, Write, Admin |
| VersionInfo | Metadata about a secret version (version number, timestamp) |
| ScopedVault | Entity-bound view for simplified API usage |
| NamespacedVault | Namespace-prefixed view for multi-tenant isolation |

Cryptographic Types

| Type | Description |
|---|---|
| MasterKey | Derived encryption key with zeroize-on-drop (32 bytes) |
| Cipher | AES-256-GCM encryption wrapper |
| Obfuscator | HMAC-based key obfuscation and AEAD metadata encryption |
| PaddingSize | Padding buckets for length hiding (256B to 64KB) |

Access Control Types

| Type | Description |
|---|---|
| AccessController | BFS-based graph path verification |
| GrantTTLTracker | Min-heap tracking grant expirations with persistence |
| RateLimiter | Sliding window rate limiting per entity |
| RateLimitConfig | Configurable limits per operation type |

Audit Types

| Type | Description |
|---|---|
| AuditLog | Query interface for audit entries |
| AuditEntry | Single operation record (entity, key, operation, timestamp) |
| AuditOperation | Operation types: Get, Set, Delete, Rotate, Grant, Revoke, List |

Architecture

graph TB
    subgraph "Tensor Vault"
        API[Vault API]
        AC[AccessController]
        Cipher[Cipher<br/>AES-256-GCM]
        KDF[MasterKey<br/>Argon2id + HKDF]
        Obf[Obfuscator<br/>HMAC + Padding]
        Audit[AuditLog]
        TTL[GrantTTLTracker]
        RL[RateLimiter]
    end

    subgraph "Storage"
        TS[TensorStore]
        GE[GraphEngine]
    end

    API --> AC
    API --> Cipher
    API --> Obf
    API --> Audit
    API --> TTL
    API --> RL

    AC --> GE
    Cipher --> KDF
    Obf --> KDF

    API --> TS
    Audit --> TS

Data Flow

  1. Set Operation: Plaintext is padded, encrypted with random nonce, metadata obfuscated, stored via TensorStore
  2. Get Operation: Rate limit check, access path verified via BFS, ciphertext decrypted, padding removed, audit logged
  3. Grant Operation: Permission edge created in GraphEngine, TTL tracked if specified
  4. Revoke Operation: Permission edge deleted, expired grants cleaned up

Set Operation Flow

sequenceDiagram
    participant C as Client
    participant V as Vault
    participant RL as RateLimiter
    participant AC as AccessController
    participant O as Obfuscator
    participant Ci as Cipher
    participant TS as TensorStore
    participant GE as GraphEngine
    participant A as AuditLog

    C->>V: set(requester, key, value)
    V->>RL: check_rate_limit(requester, Set)
    alt Rate Limited
        RL-->>V: RateLimited error
        V-->>C: Error
    end

    alt New Secret
        V->>V: Check requester == ROOT
        alt Not Root
            V-->>C: AccessDenied
        end
    else Update
        V->>AC: check_path_with_permission(Write)
    end

    V->>O: pad_plaintext(value)
    O-->>V: padded_value
    V->>Ci: encrypt(padded_value)
    Ci-->>V: (ciphertext, nonce)

    V->>O: generate_storage_id(key, nonce)
    O-->>V: blob_key
    V->>TS: put(blob_key, ciphertext)

    V->>O: obfuscate_key(key)
    O-->>V: obfuscated_key
    V->>O: encrypt_metadata(creator)
    V->>O: encrypt_metadata(timestamp)
    V->>TS: put(_vk:obfuscated_key, metadata)

    alt New Secret
        V->>GE: add_entity_edge(ROOT, secret_node, VAULT_ACCESS_ADMIN)
    end

    V->>A: record(requester, key, Set)
    V-->>C: Ok(())

Access Control Model

Access is determined by graph topology using BFS traversal:

node:root ──VAULT_ACCESS_ADMIN──> vault_secret:api_key
                                          ^
user:alice ──VAULT_ACCESS_READ───────────┘
                                          ^
team:devs ──VAULT_ACCESS_WRITE───────────┘
      ^
user:bob ──MEMBER────────────────────────┘

| Requester | Path | Access |
|---|---|---|
| node:root | Always | Granted (Admin) |
| user:alice | Direct edge | Granted (Read only) |
| team:devs | Direct edge | Granted (Write) |
| user:bob | bob -> team:devs -> secret | Granted (Write via team) |
| user:carol | No path | Denied |

Permission Levels

| Level | Capabilities |
|---|---|
| Read | get(), list(), get_version(), list_versions() |
| Write | Read + set() (update), rotate(), rollback() |
| Admin | Write + delete(), grant(), revoke() |

Permission propagation follows graph paths. The effective permission is determined by the VAULT_ACCESS_* edge type at the end of the path.

Allowed Traversal Edges

Only these edge types can grant transitive access:

  • VAULT_ACCESS - Legacy edge type (treated as Admin for backward compatibility)
  • VAULT_ACCESS_READ - Read-only access
  • VAULT_ACCESS_WRITE - Read + Write access
  • VAULT_ACCESS_ADMIN - Full access including grant/revoke
  • MEMBER - Allows group membership traversal but does NOT grant permission directly
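The edge-to-permission mapping above can be sketched as follows. Deriving `Ord` lets a BFS keep the strongest permission seen on any path; the names mirror the document, but the exact implementation in `access.rs` may differ:

```rust
// Hedged sketch of mapping edge types to permission levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Permission {
    Read,
    Write,
    Admin,
}

fn permission_from_edge(edge_type: &str) -> Option<Permission> {
    match edge_type {
        // legacy edge type treated as Admin for backward compatibility
        "VAULT_ACCESS" | "VAULT_ACCESS_ADMIN" => Some(Permission::Admin),
        "VAULT_ACCESS_WRITE" => Some(Permission::Write),
        "VAULT_ACCESS_READ" => Some(Permission::Read),
        // MEMBER allows traversal only; it grants nothing
        _ => None,
    }
}

fn main() {
    assert_eq!(permission_from_edge("MEMBER"), None);
    assert_eq!(permission_from_edge("VAULT_ACCESS"), Some(Permission::Admin));
    assert!(Permission::Admin > Permission::Write);
    assert!(Permission::Write > Permission::Read);
}
```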

Access Control Algorithm

The AccessController uses BFS to find the best permission level along any path:

#![allow(unused)]
fn main() {
// Simplified algorithm from access.rs
pub fn get_permission_level(graph: &GraphEngine, source: &str, target: &str) -> Option<Permission> {
    if source == target {
        return Some(Permission::Admin);  // Self-access
    }

    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    let mut best_permission: Option<Permission> = None;

    queue.push_back(source.to_string());
    visited.insert(source.to_string());

    while let Some(current) = queue.pop_front() {
        for edge in graph.get_entity_outgoing(&current) {
            let (_, to, edge_type, _) = graph.get_entity_edge(&edge);

            // Only traverse allowed edge types
            if !is_allowed_edge_type(&edge_type) {
                continue;
            }

            // VAULT_ACCESS_* edges grant permission to target
            if edge_type.starts_with("VAULT_ACCESS") && to == target {
                if let Some(perm) = Permission::from_edge_type(&edge_type) {
                    best_permission = Some(best_permission.map_or(perm, |b| b.max(perm)));
                }
            } else if edge_type == "MEMBER" {
                // MEMBER edges allow traversal but NO permission grant
                if !visited.contains(&to) {
                    visited.insert(to.clone());
                    queue.push_back(to);
                }
            }
        }
    }

    best_permission
}
}

Security Note: MEMBER edges enable traversal through groups but do not grant permissions. Only VAULT_ACCESS_* edges grant actual permissions. This prevents privilege escalation via group membership.

Access Control Flow

flowchart TD
    Start([Check Access]) --> IsRoot{Is requester ROOT?}
    IsRoot -->|Yes| Granted([Access Granted - Admin])
    IsRoot -->|No| BFS[Start BFS from requester]

    BFS --> Queue{Queue empty?}
    Queue -->|Yes| CheckBest{Best permission found?}
    Queue -->|No| Pop[Pop next node]

    Pop --> GetEdges[Get outgoing edges]
    GetEdges --> ForEdge{For each edge}

    ForEdge --> IsAllowed{Edge type allowed?}
    IsAllowed -->|No| ForEdge
    IsAllowed -->|Yes| IsVaultAccess{VAULT_ACCESS_* ?}

    IsVaultAccess -->|Yes| IsTarget{Points to target?}
    IsTarget -->|Yes| UpdateBest[Update best permission]
    IsTarget -->|No| ForEdge
    UpdateBest --> ForEdge

    IsVaultAccess -->|No| IsMember{MEMBER edge?}
    IsMember -->|Yes| AddQueue[Add destination to queue]
    IsMember -->|No| ForEdge
    AddQueue --> ForEdge

    ForEdge -->|Done| Queue

    CheckBest -->|Yes| CheckLevel{Permission >= required?}
    CheckBest -->|No| Denied([Access Denied])
    CheckLevel -->|Yes| Granted2([Access Granted])
    CheckLevel -->|No| Insufficient([Insufficient Permission])

Storage Format

Secrets use a two-tier storage model for security:

Metadata Tensor

Storage key: _vk:{HMAC(key)} (key name obfuscated via HMAC-BLAKE2b)

| Field | Type | Description |
|---|---|---|
| _blob | Pointer | Reference to current version ciphertext blob |
| _nonce | Bytes | 12-byte encryption nonce for current version |
| _versions | Pointers | List of all version blob keys (oldest first) |
| _key_enc | Bytes | AES-GCM encrypted original key name |
| _key_nonce | Bytes | Nonce for key encryption |
| _creator_obf | Bytes | AEAD-encrypted creator (nonce prepended) |
| _created_obf | Bytes | AEAD-encrypted timestamp (nonce prepended) |
| _rotator_obf | Bytes | AEAD-encrypted last rotator (optional) |
| _rotated_obf | Bytes | AEAD-encrypted last rotation timestamp (optional) |

Ciphertext Blob

Storage key: _vs:{HMAC(key, nonce)} (random-looking storage ID)

| Field | Type | Description |
|---|---|---|
| _data | Bytes | Padded + encrypted secret |
| _nonce | Bytes | 12-byte encryption nonce |
| _ts | Int | Unix timestamp (seconds) when version was created |

Storage Key Structure

_vault:salt          - Persisted 16-byte salt for key derivation
_vk:<32-hex-chars>   - Metadata tensor (HMAC of secret key)
_vs:<24-hex-chars>   - Ciphertext blob (HMAC of key + nonce)
_va:<timestamp>:<counter> - Audit log entries
_vault_ttl_grants    - Persisted TTL grants (JSON)
vault_secret:<32-hex-chars> - Secret node for graph access control

Encryption

Key Derivation

Master key derived using Argon2id with HKDF-based subkey separation:

#![allow(unused)]
fn main() {
// From key.rs - Argon2id parameters
pub const SALT_SIZE: usize = 16;  // 128-bit salt
pub const KEY_SIZE: usize = 32;   // 256-bit key (AES-256)

// Default VaultConfig values:
// argon2_memory_cost: 65536 (64 MiB)
// argon2_time_cost: 3 (iterations)
// argon2_parallelism: 4 (threads)

// Argon2id configuration
let params = Params::new(
    config.argon2_memory_cost,  // Memory in KiB
    config.argon2_time_cost,    // Iterations
    config.argon2_parallelism,  // Parallelism
    Some(KEY_SIZE),             // Output length
)?;

let argon2 = Argon2::new(Algorithm::Argon2id, Version::V0x13, params);
argon2.hash_password_into(input, salt, &mut key)?;
}

Argon2id Security Properties:

  • Hybrid algorithm: Argon2i (side-channel resistant) + Argon2d (GPU resistant)
  • Memory-hard: Requires 64 MiB by default, defeating GPU/ASIC attacks
  • Time-hard: 3 iterations increase computation time
  • Parallelism: 4 threads to utilize modern CPUs

HKDF Subkey Derivation

Each purpose gets a cryptographically independent key via HKDF-SHA256:

#![allow(unused)]
fn main() {
// From key.rs - Domain-separated subkeys
impl MasterKey {
    pub fn derive_subkey(&self, domain: &[u8]) -> [u8; KEY_SIZE] {
        let hk = Hkdf::<Sha256>::new(None, &self.bytes);
        let mut output = [0u8; KEY_SIZE];
        hk.expand(domain, &mut output).expect("HKDF expand cannot fail for 32 bytes");
        output
    }

    pub fn encryption_key(&self) -> [u8; KEY_SIZE] {
        self.derive_subkey(b"neumann_vault_encryption_v1")
    }

    pub fn obfuscation_key(&self) -> [u8; KEY_SIZE] {
        self.derive_subkey(b"neumann_vault_obfuscation_v1")
    }

    pub fn metadata_key(&self) -> [u8; KEY_SIZE] {
        self.derive_subkey(b"neumann_vault_metadata_v1")
    }
}
}

Key Hierarchy:

Master Password + Salt
        │
        ▼ Argon2id
    MasterKey (32 bytes)
        │
        ├──▶ HKDF("encryption_v1") ──▶ AES-256-GCM key
        ├──▶ HKDF("obfuscation_v1") ──▶ HMAC key for obfuscation
        └──▶ HKDF("metadata_v1") ──▶ AES-256-GCM key for metadata

Salt Persistence

The vault automatically manages salt persistence:

#![allow(unused)]
fn main() {
// From lib.rs - Salt handling on vault creation
pub fn new(master_key: &[u8], graph: Arc<GraphEngine>, store: TensorStore, config: VaultConfig) -> Result<Self> {
    let derived = if config.salt.is_some() {
        // Explicit salt provided - use it directly
        let (key, _) = MasterKey::derive(master_key, &config)?;
        key
    } else if let Some(persisted_salt) = Self::load_salt(&store) {
        // Use persisted salt for consistency across reopens
        MasterKey::derive_with_salt(master_key, &persisted_salt, &config)?
    } else {
        // Generate new random salt and persist it
        let (key, new_salt) = MasterKey::derive(master_key, &config)?;
        Self::save_salt(&store, new_salt)?;
        key
    };
    // ...
}
}

Encryption Process

  1. Pad plaintext to fixed bucket size (256B, 1KB, 4KB, 16KB, 32KB, or 64KB)
  2. Generate random 12-byte nonce
  3. Encrypt with AES-256-GCM
  4. Store ciphertext and nonce separately
#![allow(unused)]
fn main() {
// From encryption.rs
pub const NONCE_SIZE: usize = 12;  // 96-bit nonce (AES-GCM standard)

impl Cipher {
    pub fn encrypt(&self, plaintext: &[u8]) -> Result<(Vec<u8>, [u8; NONCE_SIZE])> {
        let cipher = Aes256Gcm::new_from_slice(self.key.as_bytes())?;

        // Generate random nonce - CRITICAL for security
        let mut nonce_bytes = [0u8; NONCE_SIZE];
        rand::thread_rng().fill_bytes(&mut nonce_bytes);
        let nonce = Nonce::from_slice(&nonce_bytes);

        // AES-GCM provides authenticated encryption
        // Output: ciphertext || 16-byte authentication tag
        let ciphertext = cipher.encrypt(nonce, plaintext)?;

        Ok((ciphertext, nonce_bytes))
    }

    pub fn decrypt(&self, ciphertext: &[u8], nonce_bytes: &[u8]) -> Result<Vec<u8>> {
        if nonce_bytes.len() != NONCE_SIZE {
            return Err(VaultError::CryptoError("Invalid nonce size".into()));
        }

        let cipher = Aes256Gcm::new_from_slice(self.key.as_bytes())?;
        let nonce = Nonce::from_slice(nonce_bytes);

        // Decryption verifies the authentication tag and
        // fails if the ciphertext was tampered with
        Ok(cipher.decrypt(nonce, ciphertext)?)
    }
}
}

AES-256-GCM Security Properties:

  • Authenticated encryption: Detects tampering via 128-bit authentication tag
  • Nonce requirement: Each encryption MUST use a unique nonce
  • Ciphertext expansion: 16 bytes larger than plaintext (auth tag)

Obfuscation Layers

Layer                Purpose                  Implementation
Key Obfuscation      Hide secret names        HMAC-BLAKE2b hash of key name
Pointer Indirection  Hide storage patterns    Ciphertext in separate blob with random-looking key
Length Padding       Hide plaintext size      Pad to fixed bucket sizes
Metadata Encryption  Hide creator/timestamps  AES-GCM with per-record random nonces
Blind Indexes        Searchable encryption    HMAC-based indexes for pattern matching

Padding Bucket Sizes

#![allow(unused)]
fn main() {
// From obfuscation.rs
pub enum PaddingSize {
    Small = 256,        // API keys, tokens
    Medium = 1024,      // Certificates, small configs
    Large = 4096,       // Private keys, large configs
    ExtraLarge = 16384, // Very large secrets
    Huge = 32768,       // Oversized secrets
    Maximum = 65536,    // Maximum supported
}

// Bucket selection (includes 4-byte length prefix + 1 byte min padding)
pub fn for_length(len: usize) -> Option<Self> {
    let min_required = len + 5;  // length prefix + min padding

    if min_required <= 256 { Some(Self::Small) }
    else if min_required <= 1024 { Some(Self::Medium) }
    else if min_required <= 4096 { Some(Self::Large) }
    else if min_required <= 16384 { Some(Self::ExtraLarge) }
    else if min_required <= 32768 { Some(Self::Huge) }
    else if min_required <= 65536 { Some(Self::Maximum) }
    else { None }
}
}
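The bucket rule above can be exercised standalone. This is an illustrative, std-only replica of the selection logic (the function name `bucket_for` is ours, not the vault's API), returning the bucket size rather than the enum:

```rust
// Standalone sketch of the bucket-selection rule (illustrative only;
// the real implementation lives in obfuscation.rs).
fn bucket_for(len: usize) -> Option<usize> {
    // 4-byte length prefix + at least 1 byte of padding
    let min_required = len + 5;
    [256, 1024, 4096, 16384, 32768, 65536]
        .into_iter()
        .find(|&bucket| min_required <= bucket)
}

fn main() {
    assert_eq!(bucket_for(100), Some(256));   // small API key
    assert_eq!(bucket_for(252), Some(1024));  // 252 + 5 > 256, spills to next bucket
    assert_eq!(bucket_for(70000), None);      // exceeds the maximum bucket
}
```

Note the off-by-five boundary: a 252-byte secret no longer fits the 256-byte bucket because the prefix and minimum padding must also fit.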

Padding Format

+----------------+-------------------+------------------+
| Length (4B LE) | Plaintext (N B)   | Random Padding   |
+----------------+-------------------+------------------+
|<--------------- Bucket Size (256/1K/4K/...) -------->|
#![allow(unused)]
fn main() {
// From obfuscation.rs
pub fn pad_plaintext(plaintext: &[u8]) -> Result<Vec<u8>> {
    let target_size = PaddingSize::for_length(plaintext.len())
        .ok_or_else(|| VaultError::CryptoError("plaintext exceeds maximum bucket size".into()))? as usize;
    let padding_len = target_size - 4 - plaintext.len();  // 4 = length prefix

    let mut padded = Vec::with_capacity(target_size);

    // Store original length as u32 little-endian
    let len_bytes = (plaintext.len() as u32).to_le_bytes();
    padded.extend_from_slice(&len_bytes);

    // Original data
    padded.extend_from_slice(plaintext);

    // Random padding (not zeros - prevents padding oracle attacks)
    let mut rng_bytes = vec![0u8; padding_len];
    rand::thread_rng().fill_bytes(&mut rng_bytes);
    padded.extend_from_slice(&rng_bytes);

    Ok(padded)
}
}
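The inverse operation follows directly from the layout diagram: read the 4-byte little-endian length prefix, then take that many bytes. A minimal std-only sketch (the name `unpad_plaintext` is an assumption, not the module's actual API):

```rust
// Hypothetical inverse of pad_plaintext: recover the original plaintext
// from a padded bucket by reading the 4-byte little-endian length prefix.
fn unpad_plaintext(padded: &[u8]) -> Option<Vec<u8>> {
    if padded.len() < 4 {
        return None;
    }
    let len = u32::from_le_bytes([padded[0], padded[1], padded[2], padded[3]]) as usize;
    // The stored length must fit inside the bucket after the prefix
    if len > padded.len() - 4 {
        return None;
    }
    Some(padded[4..4 + len].to_vec())
}

fn main() {
    // Simulate a 256-byte bucket holding an 11-byte secret
    let secret = b"hello vault";
    let mut padded = Vec::with_capacity(256);
    padded.extend_from_slice(&(secret.len() as u32).to_le_bytes());
    padded.extend_from_slice(secret);
    padded.resize(256, 0xAB); // stand-in for random padding
    assert_eq!(unpad_plaintext(&padded).as_deref(), Some(&secret[..]));
}
```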

HMAC-BLAKE2b Construction

#![allow(unused)]
fn main() {
// From obfuscation.rs - HMAC construction for key obfuscation
fn hmac_hash(&self, data: &[u8], domain: &[u8]) -> [u8; 32] {
    // Inner hash: H((key XOR ipad) || domain || data)
    let mut inner_key = self.obfuscation_key;
    for byte in &mut inner_key {
        *byte ^= 0x36;  // ipad
    }
    let mut inner_hasher = Blake2b::<U32>::new();
    inner_hasher.update(inner_key);
    inner_hasher.update(domain);
    inner_hasher.update(data);
    let inner_hash = inner_hasher.finalize();

    // Outer hash: H((key XOR opad) || inner_hash)
    let mut outer_key = self.obfuscation_key;
    for byte in &mut outer_key {
        *byte ^= 0x5c;  // opad
    }
    let mut outer_hasher = Blake2b::<U32>::new();
    outer_hasher.update(outer_key);
    outer_hasher.update(inner_hash);

    outer_hasher.finalize().into()
}
}
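The ipad/opad structure can be demonstrated without any crypto crates. This toy uses std's DefaultHasher in place of BLAKE2b purely to show the two-pass key-XOR construction and the effect of domain separation; it is NOT cryptographically secure and is not the vault's implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy illustration of the HMAC ipad/opad structure, substituting std's
// (non-cryptographic!) DefaultHasher for BLAKE2b.
fn toy_hmac(key: &[u8; 32], domain: &[u8], data: &[u8]) -> u64 {
    // Inner pass: H((key XOR ipad) || domain || data)
    let mut inner_key = *key;
    for b in &mut inner_key { *b ^= 0x36; } // ipad
    let mut inner = DefaultHasher::new();
    inner_key.hash(&mut inner);
    domain.hash(&mut inner);
    data.hash(&mut inner);
    let inner_hash = inner.finish();

    // Outer pass: H((key XOR opad) || inner_hash)
    let mut outer_key = *key;
    for b in &mut outer_key { *b ^= 0x5c; } // opad
    let mut outer = DefaultHasher::new();
    outer_key.hash(&mut outer);
    inner_hash.hash(&mut outer);
    outer.finish()
}

fn main() {
    let key = [7u8; 32];
    // Deterministic for identical inputs...
    assert_eq!(toy_hmac(&key, b"index", b"api_key"), toy_hmac(&key, b"index", b"api_key"));
    // ...while a different domain yields an unrelated digest
    assert_ne!(toy_hmac(&key, b"index", b"api_key"), toy_hmac(&key, b"names", b"api_key"));
}
```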

Metadata AEAD Encryption

#![allow(unused)]
fn main() {
// From obfuscation.rs - Per-record AEAD encryption
pub fn encrypt_metadata(&self, data: &[u8]) -> Result<Vec<u8>> {
    let cipher = Aes256Gcm::new_from_slice(&self.metadata_key)?;

    // Random nonce for each encryption
    let mut nonce_bytes = [0u8; 12];
    rand::thread_rng().fill_bytes(&mut nonce_bytes);
    let nonce = Nonce::from_slice(&nonce_bytes);

    let ciphertext = cipher.encrypt(nonce, data)?;

    // Format: nonce || ciphertext
    let mut result = Vec::with_capacity(12 + ciphertext.len());
    result.extend_from_slice(&nonce_bytes);
    result.extend(ciphertext);
    Ok(result)
}
}

Rate Limiting

Rate limiting uses a sliding window algorithm to prevent brute-force attacks:

#![allow(unused)]
fn main() {
// From rate_limit.rs
pub struct RateLimiter {
    // (entity, operation) -> timestamps of recent requests
    history: DashMap<(String, String), VecDeque<Instant>>,
    config: RateLimitConfig,
}

impl RateLimiter {
    pub fn check_and_record(&self, entity: &str, op: Operation) -> Result<(), String> {
        let limit = op.limit(&self.config);
        if limit == u32::MAX {
            return Ok(());  // Unlimited
        }

        let key = (entity.to_string(), op.as_str().to_string());
        let now = Instant::now();
        let window_start = now - self.config.window;

        let mut entry = self.history.entry(key).or_default();
        let timestamps = entry.value_mut();

        // Remove expired entries outside window
        while let Some(front) = timestamps.front() {
            if *front < window_start {
                timestamps.pop_front();
            } else {
                break;
            }
        }

        let count = timestamps.len() as u32;
        if count >= limit {
            Err(format!("Rate limit exceeded: {} {} calls in {:?}", count, op.as_str(), self.config.window))
        } else {
            timestamps.push_back(now);  // Record this request
            Ok(())
        }
    }
}
}

Sliding Window Visualization

Window: 60 seconds
Limit: 5 requests

Timeline:
|--[req1]--[req2]---[req3]--[req4]---[req5]---|
|<------------------ Window ----------------->|
                                               ^
                                               Now (6th request blocked)

After 10 seconds:
                    |--[req2]---[req3]--[req4]---[req5]---|
   [req1] expired   |<------------------ Window --------->|
                                                           ^
                                                           Now (6th request allowed)
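The windowing behavior above can be sketched with std types alone. This is a single-key simplification (no DashMap, no per-operation limits) of the check-and-record logic, for illustration:

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// Minimal std-only sliding-window limiter mirroring the algorithm above.
struct Window {
    timestamps: VecDeque<Instant>,
    limit: usize,
    window: Duration,
}

impl Window {
    fn check_and_record(&mut self, now: Instant) -> bool {
        let start = now - self.window;
        // Drop requests that fell out of the window
        while self.timestamps.front().is_some_and(|&t| t < start) {
            self.timestamps.pop_front();
        }
        if self.timestamps.len() >= self.limit {
            false // over the limit: reject
        } else {
            self.timestamps.push_back(now); // record this request
            true
        }
    }
}

fn main() {
    let mut w = Window { timestamps: VecDeque::new(), limit: 5, window: Duration::from_secs(60) };
    let t0 = Instant::now();
    for i in 0..5 {
        assert!(w.check_and_record(t0 + Duration::from_secs(i)));
    }
    // 6th request inside the window is blocked
    assert!(!w.check_and_record(t0 + Duration::from_secs(5)));
    // Once req1 has aged out (61s after t0), a new request is allowed
    assert!(w.check_and_record(t0 + Duration::from_secs(61)));
}
```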

Rate Limit Configuration Presets

#![allow(unused)]
fn main() {
// Default configuration
impl Default for RateLimitConfig {
    fn default() -> Self {
        Self {
            max_gets: 60,    // 60 get() calls per minute
            max_lists: 10,   // 10 list() calls per minute
            max_sets: 30,    // 30 set() calls per minute
            max_grants: 20,  // 20 grant() calls per minute
            window: Duration::from_secs(60),
        }
    }
}

// Strict configuration for testing
pub fn strict() -> Self {
    Self {
        max_gets: 5,
        max_lists: 2,
        max_sets: 3,
        max_grants: 2,
        window: Duration::from_secs(60),
    }
}

// No rate limiting
pub fn unlimited() -> Self {
    Self {
        max_gets: u32::MAX,
        max_lists: u32::MAX,
        max_sets: u32::MAX,
        max_grants: u32::MAX,
        window: Duration::from_secs(60),
    }
}
}

Note: node:root is exempt from rate limiting.

TTL Grant Tracking

TTL grants use a min-heap for efficient expiration tracking:

#![allow(unused)]
fn main() {
// From ttl.rs
pub struct GrantTTLTracker {
    // Priority queue of expiration times (min-heap)
    heap: Mutex<BinaryHeap<GrantTTLEntry>>,
}

struct GrantTTLEntry {
    expires_at: Instant,
    entity: String,
    secret_key: String,
}

// Reverse ordering for min-heap (earliest expiration first)
impl Ord for GrantTTLEntry {
    fn cmp(&self, other: &Self) -> Ordering {
        other.expires_at.cmp(&self.expires_at)  // Reversed!
    }
}
}
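Because std's BinaryHeap is a max-heap, the reversed Ord above turns it into a min-heap. The same effect can be had with std::cmp::Reverse, which this small sketch uses to show the earliest-deadline-first ordering:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// std's BinaryHeap is a max-heap; wrapping entries in Reverse gives the
// min-heap behavior needed for expiration tracking (earliest deadline on top).
fn main() {
    let mut heap = BinaryHeap::new();
    heap.push(Reverse((30, "grant-c")));
    heap.push(Reverse((10, "grant-a")));
    heap.push(Reverse((20, "grant-b")));

    // Entries pop in ascending order of expiration time
    assert_eq!(heap.pop(), Some(Reverse((10, "grant-a"))));
    assert_eq!(heap.pop(), Some(Reverse((20, "grant-b"))));
    assert_eq!(heap.peek(), Some(&Reverse((30, "grant-c"))));
}
```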

TTL Operations

#![allow(unused)]
fn main() {
// Add a grant with TTL
pub fn add(&self, entity: &str, secret_key: &str, ttl: Duration) {
    let entry = GrantTTLEntry {
        expires_at: Instant::now() + ttl,
        entity: entity.to_string(),
        secret_key: secret_key.to_string(),
    };
    self.heap.lock().unwrap().push(entry);
}

// Efficient expiration check - O(1) to peek, O(log n) to pop
pub fn get_expired(&self) -> Vec<(String, String)> {
    let now = Instant::now();
    let mut expired = Vec::new();
    let mut heap = self.heap.lock().unwrap();

    // Pop all expired entries (they're at the top due to min-heap)
    while let Some(entry) = heap.peek() {
        if entry.expires_at <= now {
            if let Some(entry) = heap.pop() {
                expired.push((entry.entity, entry.secret_key));
            }
        } else {
            break;  // No more expired entries
        }
    }

    expired
}
}

TTL Persistence

TTL grants survive vault restarts via TensorStore persistence:

#![allow(unused)]
fn main() {
// From ttl.rs
const TTL_STORAGE_KEY: &str = "_vault_ttl_grants";

#[derive(Serialize, Deserialize)]
pub struct PersistedGrant {
    pub expires_at_ms: i64,  // Unix timestamp
    pub entity: String,
    pub secret_key: String,
}

pub fn persist(&self, store: &TensorStore) -> Result<()> {
    let grants: Vec<PersistedGrant> = self.heap.lock().unwrap()
        .iter()
        .map(|e| PersistedGrant {
            expires_at_ms: instant_to_unix_ms(e.expires_at),
            entity: e.entity.clone(),
            secret_key: e.secret_key.clone(),
        })
        .collect();

    let data = serde_json::to_vec(&grants)?;
    store.put(TTL_STORAGE_KEY, tensor_with_bytes(data))?;
    Ok(())
}

pub fn load(store: &TensorStore) -> Result<Self> {
    let tracker = Self::new();
    let grants: Vec<PersistedGrant> = load_from_store(store)?;

    for grant in grants {
        // Skip already expired grants
        if !grant.is_expired() {
            tracker.add_with_expiration(
                &grant.entity,
                &grant.secret_key,
                unix_ms_to_instant(grant.expires_at_ms),
            );
        }
    }

    Ok(tracker)
}
}

Cleanup Strategy

Expired grants are cleaned up opportunistically during get() operations:

#![allow(unused)]
fn main() {
// From lib.rs
pub fn get(&self, requester: &str, key: &str) -> Result<String> {
    // Opportunistic cleanup of expired grants
    self.cleanup_expired_grants();

    // ... rest of get operation
}

pub fn cleanup_expired_grants(&self) -> usize {
    let expired = self.ttl_tracker.get_expired();
    let mut revoked = 0;

    for (entity, key) in expired {
        let secret_node = self.secret_node_key(&key);

        // Delete the VAULT_ACCESS_* edge
        if let Ok(edges) = self.graph.get_entity_outgoing(&entity) {
            for edge_key in edges {
                if let Ok((_, to, edge_type, _)) = self.graph.get_entity_edge(&edge_key) {
                    if to == secret_node && edge_type.starts_with("VAULT_ACCESS") {
                        if self.graph.delete_entity_edge(&edge_key).is_ok() {
                            revoked += 1;
                        }
                    }
                }
            }
        }
    }

    revoked
}
}

Audit Logging

Audit Entry Storage

#![allow(unused)]
fn main() {
// From audit.rs
const AUDIT_PREFIX: &str = "_va:";
static AUDIT_COUNTER: AtomicU64 = AtomicU64::new(0);

pub fn record(&self, entity: &str, secret_key: &str, operation: &AuditOperation) {
    let timestamp = now_millis();
    let counter = AUDIT_COUNTER.fetch_add(1, Ordering::SeqCst);
    let key = format!("{AUDIT_PREFIX}{timestamp}:{counter}");

    let mut tensor = TensorData::new();
    tensor.set("_entity", entity);
    tensor.set("_secret", secret_key);  // Already obfuscated by caller
    tensor.set("_op", operation.as_str());
    tensor.set("_ts", timestamp);

    // Additional fields for grant/revoke
    match operation {
        AuditOperation::Grant { to, permission } => {
            tensor.set("_target", to);
            tensor.set("_permission", permission);
        },
        AuditOperation::Revoke { from } => {
            tensor.set("_target", from);
        },
        _ => {},
    }

    // Best effort - audit failures don't block operations
    let _ = self.store.put(&key, tensor);
}
}

Audit Query Methods

Method               Description               Time Complexity
by_secret(key)       All entries for a secret  O(n) scan + filter
by_entity(entity)    All entries by requester  O(n) scan + filter
since(timestamp)     Entries since timestamp   O(n) scan + filter
between(start, end)  Entries in time range     O(n) scan + filter
recent(limit)        Last N entries            O(n log n) sort + truncate

Note: Secret keys are obfuscated in audit logs to prevent leaking plaintext names.

Usage Examples

Basic Operations

#![allow(unused)]
fn main() {
use tensor_vault::{Vault, VaultConfig, Permission};
use graph_engine::GraphEngine;
use tensor_store::TensorStore;
use std::sync::Arc;

// Initialize vault
let graph = Arc::new(GraphEngine::new());
let store = TensorStore::new();
let vault = Vault::new(b"master_password", graph, store, VaultConfig::default())?;

// Store a secret (root only)
vault.set(Vault::ROOT, "api_key", "sk-secret123")?;

// Grant access with permission level
vault.grant_with_permission(Vault::ROOT, "user:alice", "api_key", Permission::Read)?;

// Retrieve secret
let value = vault.get("user:alice", "api_key")?;

// Revoke access
vault.revoke(Vault::ROOT, "user:alice", "api_key")?;
}

Permission-Based Access

#![allow(unused)]
fn main() {
// Grant different permission levels
vault.grant_with_permission(Vault::ROOT, "user:reader", "secret", Permission::Read)?;
vault.grant_with_permission(Vault::ROOT, "user:writer", "secret", Permission::Write)?;
vault.grant_with_permission(Vault::ROOT, "user:admin", "secret", Permission::Admin)?;

// Reader can only get/list
vault.get("user:reader", "secret")?;  // OK
vault.set("user:reader", "secret", "new")?;  // InsufficientPermission

// Writer can update
vault.rotate("user:writer", "secret", "new_value")?;  // OK
vault.delete("user:writer", "secret")?;  // InsufficientPermission

// Admin can do everything
vault.grant_with_permission("user:admin", "user:new", "secret", Permission::Read)?;  // OK
vault.delete("user:admin", "secret")?;  // OK
}

TTL Grants

#![allow(unused)]
fn main() {
use std::time::Duration;

// Grant temporary access (1 hour)
vault.grant_with_ttl(
    Vault::ROOT,
    "agent:temp",
    "api_key",
    Permission::Read,
    Duration::from_secs(3600),
)?;

// Access works during TTL
vault.get("agent:temp", "api_key")?;  // OK

// After 1 hour, access is automatically revoked
// (cleanup happens opportunistically on next vault operation)
}

Namespace Isolation

#![allow(unused)]
fn main() {
// Create namespaced vault for multi-tenant isolation
let backend = vault.namespace("team:backend", "user:alice");
let frontend = vault.namespace("team:frontend", "user:bob");

// Keys are automatically prefixed
backend.set("db_password", "secret1")?;   // Stored as "team:backend:db_password"
frontend.set("api_key", "secret2")?;      // Stored as "team:frontend:api_key"

// Cross-namespace access blocked
frontend.get("db_password")?;  // AccessDenied
}

Secret Versioning

#![allow(unused)]
fn main() {
// Each set/rotate creates a new version
vault.set(Vault::ROOT, "api_key", "v1")?;
vault.rotate(Vault::ROOT, "api_key", "v2")?;
vault.rotate(Vault::ROOT, "api_key", "v3")?;

// Get version info
let version = vault.current_version(Vault::ROOT, "api_key")?;  // 3
let versions = vault.list_versions(Vault::ROOT, "api_key")?;
// [VersionInfo { version: 1, created_at: ... }, ...]

// Get specific version
let old_value = vault.get_version(Vault::ROOT, "api_key", 1)?;  // "v1"

// Rollback (creates new version with old content)
vault.rollback(Vault::ROOT, "api_key", 1)?;
vault.get(Vault::ROOT, "api_key")?;  // "v1"
vault.current_version(Vault::ROOT, "api_key")?;  // 4 (rollback creates new version)
}

Audit Queries

#![allow(unused)]
fn main() {
// Query by secret
let entries = vault.audit_log("api_key");

// Query by entity
let alice_actions = vault.audit_by_entity("user:alice");

// Query by time
let recent = vault.audit_since(timestamp_millis);
let last_10 = vault.audit_recent(10);

// Audit entries include operation details
for entry in entries {
    match &entry.operation {
        AuditOperation::Grant { to, permission } => {
            println!("Granted {} to {} at {}", permission, to, entry.timestamp);
        },
        AuditOperation::Get => {
            println!("{} read secret at {}", entry.entity, entry.timestamp);
        },
        _ => {},
    }
}
}

Scoped Vault

#![allow(unused)]
fn main() {
// Create a scoped view for a specific entity
let alice = vault.scope("user:alice");

// All operations use alice as the requester
alice.get("api_key")?;  // Same as vault.get("user:alice", "api_key")
alice.list("*")?;       // Same as vault.list("user:alice", "*")
}

Configuration Options

VaultConfig

Field               Type                     Default  Description
salt                Option<[u8; 16]>         None     Salt for key derivation (random if not provided, persisted)
argon2_memory_cost  u32                      65536    Memory cost in KiB (64 MiB)
argon2_time_cost    u32                      3        Iteration count
argon2_parallelism  u32                      4        Thread count
rate_limit          Option<RateLimitConfig>  None     Rate limiting (disabled if None)
max_versions        usize                    5        Maximum versions to retain per secret

RateLimitConfig

Field       Type      Default  Description
max_gets    u32       60       Maximum get() calls per window
max_lists   u32       10       Maximum list() calls per window
max_sets    u32       30       Maximum set() calls per window
max_grants  u32       20       Maximum grant() calls per window
window      Duration  60s      Sliding window duration

Environment Variables

Variable           Description
NEUMANN_VAULT_KEY  Base64-encoded 32-byte master key

Shell Commands

VAULT INIT                              Initialize vault from NEUMANN_VAULT_KEY
VAULT IDENTITY 'node:alice'             Set current identity
VAULT NAMESPACE 'team:backend'          Set current namespace

VAULT SET 'api_key' 'sk-123'            Store encrypted secret
VAULT GET 'api_key'                     Retrieve secret
VAULT GET 'api_key' VERSION 2           Get specific version
VAULT DELETE 'api_key'                  Delete secret
VAULT LIST 'prefix:*'                   List accessible secrets
VAULT ROTATE 'api_key' 'new'            Rotate secret value
VAULT VERSIONS 'api_key'                List version history
VAULT ROLLBACK 'api_key' VERSION 2      Rollback to version

VAULT GRANT 'user:bob' ON 'api_key'              Grant admin access
VAULT GRANT 'user:bob' ON 'api_key' READ         Grant read-only access
VAULT GRANT 'user:bob' ON 'api_key' WRITE        Grant write access
VAULT GRANT 'user:bob' ON 'api_key' TTL 3600     Grant with 1-hour expiry
VAULT REVOKE 'user:bob' ON 'api_key'             Revoke access

VAULT AUDIT 'api_key'                   View audit log for secret
VAULT AUDIT BY 'user:alice'             View audit log for entity
VAULT AUDIT RECENT 10                   View last 10 operations

Security Considerations

Best Practices

  1. Use strong master passwords: At least 128 bits of entropy
  2. Rotate secrets regularly: Use rotate() to maintain version history
  3. Grant minimal permissions: Use Read when Write/Admin not needed
  4. Use TTL grants for temporary access: Prevents forgotten grants
  5. Enable rate limiting in production: Prevents brute-force attacks
  6. Use namespaces for multi-tenant: Enforces isolation
  7. Review audit logs: Monitor for suspicious access patterns

Edge Cases and Gotchas

Scenario                       Behavior
Grant to non-existent entity   Succeeds (edge created, entity may exist later)
Revoke non-existent grant      Succeeds silently (idempotent)
Get non-existent secret        Returns NotFound error
Set by non-root without Write  Returns AccessDenied or InsufficientPermission
TTL grant cleanup              Opportunistic on get() - may not be immediate
Version limit exceeded         Oldest versions automatically deleted
Plaintext > 64KB               Returns CryptoError
Invalid UTF-8 in secret        get() returns CryptoError
Concurrent modifications       Thread-safe via DashMap sharding
MEMBER edge to secret          Path exists but NO permission granted

Threat Model

Threat                     Mitigation
Password brute-force       Argon2id memory-hard KDF (64 MiB, 3 iterations)
Offline dictionary attack  Random 128-bit salt, stored in TensorStore
Ciphertext tampering       AES-GCM authentication tag (128-bit)
Nonce reuse                Random 96-bit nonce per encryption
Key leakage                Keys zeroized on drop, subkeys via HKDF
Pattern analysis           Key obfuscation, padding, metadata encryption
Access enumeration         Rate limiting, audit logging
Privilege escalation       MEMBER edges don’t grant permissions
Replay attacks             Per-operation nonces, timestamps in metadata

Performance

Operation                     Time    Notes
Key derivation (Argon2id)     ~80ms   64 MiB memory cost
set (1KB)                     ~29us   Includes encryption + versioning
get (1KB)                     ~24us   Includes decryption + audit
set (10KB)                    ~93us   Scales with data size
get (10KB)                    ~91us   Scales with data size
Access check (shallow)        ~6us    Direct edge
Access check (deep, 10 hops)  ~17us   BFS traversal
grant                         ~18us   Creates graph edge
revoke                        ~1.1ms  Edge deletion + TTL cleanup
list (100 secrets)            ~291us  Pattern matching + access check
list (1000 secrets)           ~2.7ms  Scales linearly

Integration

Module         Relationship
Tensor Store   Underlying key-value storage for encrypted secrets
Graph Engine   Access control edges and audit trail
Query Router   VAULT command execution
Neumann Shell  Interactive vault commands

Dependencies

Crate    Purpose
aes-gcm  AES-256-GCM encryption
argon2   Key derivation
hkdf     Subkey derivation
blake2   HMAC and obfuscation hashing
rand     Nonce generation
zeroize  Secure memory cleanup
dashmap  Concurrent rate limit tracking
serde    TTL grant persistence

Tensor Cache Architecture

Semantic caching for LLM responses with cost tracking and background eviction. Module 10 of Neumann.

The tensor_cache module provides multi-layer caching optimized for LLM workloads. It combines O(1) exact hash lookups with O(log n) semantic similarity search via HNSW indices. All cache entries are stored as TensorData in a shared TensorStore, following the tensor-native paradigm used by tensor_vault and tensor_blob.

Design Principles

Principle               Description
Multi-Layer Caching     Exact O(1), Semantic O(log n), Embedding O(1) lookups
Cost-Aware              Tracks tokens and estimates savings using tiktoken
Background Eviction     Async eviction with configurable strategies
TTL Expiration          Time-based entry expiration with min-heap tracking
Thread-Safe             All operations are concurrent via DashMap
Zero-Allocation Lookup  Embeddings stored inline, not as pointers
Sparse-Aware            Automatic sparse storage for vectors with >50% zeros

Key Types

Core Types

Type           Description
Cache          Main API - multi-layer LLM response cache
CacheConfig    Configuration (capacity, TTL, eviction, metrics)
CacheHit       Successful cache lookup result
CacheStats     Thread-safe statistics with atomic counters
StatsSnapshot  Point-in-time snapshot for reporting
CacheLayer     Enum: Exact, Semantic, Embedding
CacheError     Error types for cache operations

Configuration Types

Type              Description
EvictionStrategy  LRU, LFU, CostBased, Hybrid
EvictionManager   Background eviction task controller
EvictionScorer    Calculates eviction priority scores
EvictionHandle    Handle for controlling background eviction
EvictionConfig    Interval, batch size, and strategy settings

Token Counting

Type          Description
TokenCounter  GPT-4 compatible token counting via tiktoken
ModelPricing  Predefined pricing for GPT-4, Claude 3, etc.

Index Types (Internal)

Type               Description
CacheIndex         HNSW wrapper with key-to-node mapping
IndexSearchResult  Semantic search result with similarity score

Architecture Diagram

+--------------------------------------------------+
|                  Cache (Public API)               |
|   - get(prompt, embedding) -> CacheHit           |
|   - put(prompt, embedding, response, ...)        |
|   - stats(), evict(), clear()                    |
+--------------------------------------------------+
            |           |           |
    +-------+    +------+    +------+
    |            |           |
+--------+  +----------+  +-----------+
| Exact  |  | Semantic |  | Embedding |
| Cache  |  |  Cache   |  |   Cache   |
| O(1)   |  | O(log n) |  |   O(1)    |
+--------+  +----------+  +-----------+
    |            |           |
    +-------+----+----+------+
            |
    +------------------+
    |   CacheIndex     |
    |  (HNSW wrapper)  |
    +------------------+
            |
    +------------------+
    |   tensor_store   |
    |     hnsw.rs      |
    +------------------+

Multi-Layer Cache Lookup Algorithm

The cache lookup algorithm is designed to maximize hit rates while minimizing latency. It follows a hierarchical approach, checking faster layers first before falling back to more expensive operations.

Lookup Flow Diagram

flowchart TD
    A[get prompt, embedding] --> B{Exact Cache Hit?}
    B -->|Yes| C[Return CacheHit layer=Exact]
    B -->|No| D[Record Exact Miss]
    D --> E{Embedding Provided?}
    E -->|No| F[Return None]
    E -->|Yes| G{Auto-Select Metric?}
    G -->|Yes| H{Sparsity >= Threshold?}
    G -->|No| I[Use Configured Metric]
    H -->|Yes| J[Use Jaccard]
    H -->|No| I
    J --> K[HNSW Search with Metric]
    I --> K
    K --> L{Results Above Threshold?}
    L -->|No| M[Record Semantic Miss]
    M --> F
    L -->|Yes| N{Entry Expired?}
    N -->|Yes| M
    N -->|No| O[Return CacheHit layer=Semantic]

Exact Cache Lookup (O(1))

The exact cache uses a hash-based key derived from the prompt text:

#![allow(unused)]
fn main() {
// Key generation using DefaultHasher
fn exact_key(prompt: &str) -> String {
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    let hash = hasher.finish();
    format!("_cache:exact:{:016x}", hash)
}
}

The lookup sequence:

  1. Generate hash key from prompt
  2. Query TensorStore with key
  3. Check expiration timestamp
  4. Return hit or proceed to semantic lookup
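The key-generation step can be demonstrated directly with std types. The example below reproduces the hashing scheme shown above to illustrate its properties; keys are deterministic, fixed-width, and sensitive to any textual change (which is why the exact layer misses on near-duplicate prompts and the semantic layer exists):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Same scheme as exact_key above: hash the prompt, render 16 hex digits.
fn exact_key(prompt: &str) -> String {
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    format!("_cache:exact:{:016x}", hasher.finish())
}

fn main() {
    // Identical prompts always map to the same store key...
    assert_eq!(exact_key("What is Rust?"), exact_key("What is Rust?"));
    // ...while any textual difference (even whitespace) misses the exact layer
    assert_ne!(exact_key("What is Rust?"), exact_key("What is Rust? "));
    // Keys have a fixed shape: prefix + 16 hex digits
    assert_eq!(exact_key("x").len(), "_cache:exact:".len() + 16);
}
```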

Semantic Cache Lookup (O(log n))

The semantic cache uses HNSW (Hierarchical Navigable Small World) graphs for approximate nearest neighbor search:

flowchart LR
    A[Query Vector] --> B[HNSW Entry Point]
    B --> C[Layer 2: Coarse Search]
    C --> D[Layer 1: Refined Search]
    D --> E[Layer 0: Fine Search]
    E --> F[Top-k Candidates]
    F --> G[Re-score with Metric]
    G --> H[Filter by Threshold]
    H --> I[Return Best Match]

Re-scoring Strategy: The HNSW index retrieves candidates using cosine similarity, then re-scores them with the requested metric. This allows using different metrics without rebuilding the index:

#![allow(unused)]
fn main() {
// Retrieve more candidates than needed for re-scoring
let ef = (k * 3).max(10);
let candidates = index.search(query, ef);

// Re-score with specified metric
let similarity = match &embedding {
    EmbeddingStorage::Dense(dense) => {
        let stored_sparse = SparseVector::from_dense(dense);
        let raw = metric.compute(&query_sparse, &stored_sparse);
        metric.to_similarity(raw)
    }
    EmbeddingStorage::Sparse(sparse) => {
        let raw = metric.compute(&query_sparse, sparse);
        metric.to_similarity(raw)
    }
    // ...handles Delta and TensorTrain storage types
};
}

Automatic Metric Selection

When auto_select_metric is enabled, the cache automatically selects the optimal distance metric based on embedding sparsity:

#![allow(unused)]
fn main() {
fn select_metric(&self, embedding: &[f32]) -> DistanceMetric {
    if !self.config.auto_select_metric {
        return self.config.distance_metric.clone();
    }

    let sparse = SparseVector::from_dense(embedding);
    if sparse.sparsity() >= self.config.sparsity_metric_threshold {
        DistanceMetric::Jaccard  // Better for sparse vectors
    } else {
        self.config.distance_metric.clone()  // Default (usually Cosine)
    }
}
}

Cache Layers

Exact Cache (O(1))

Hash-based lookup for identical queries. Keys are generated from prompt text using DefaultHasher. Stored with prefix _cache:exact:.

When to use: Repetitive queries with exact same prompts (e.g., FAQ systems, chatbots with canned responses).

Semantic Cache (O(log n))

HNSW-based similarity search for semantically similar queries. Uses configurable distance metrics (Cosine, Jaccard, Euclidean, Angular). Stored with prefix _cache:sem:.

When to use: Natural language queries with variations (e.g., “What’s the weather?” vs “How’s the weather today?”).

Embedding Cache (O(1))

Stores precomputed embeddings to avoid redundant embedding API calls. Keys combine source and content hash. Stored with prefix _cache:emb:.

When to use: When embedding computation is expensive and the same content is embedded multiple times.

Storage Format

Cache entries are stored as TensorData with standardized fields:

Field           Type           Description
_response       String         Cached response text
_embedding      Vector/Sparse  Embedding (semantic/embedding layers)
_embedding_dim  Int            Embedding dimension
_input_tokens   Int            Input token count
_output_tokens  Int            Output token count
_model          String         Model identifier
_layer          String         Cache layer (exact/semantic/embedding)
_created_at     Int            Creation timestamp (millis)
_expires_at     Int            Expiration timestamp (millis)
_access_count   Int            Access count for LFU
_last_access    Int            Last access timestamp for LRU
_version        String         Optional version tag
_source         String         Embedding source identifier
_content_hash   Int            Content hash for deduplication

Sparse Storage Optimization

Embeddings with high sparsity (>50% zeros) are automatically stored in sparse format to reduce memory usage:

#![allow(unused)]
fn main() {
fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    // Use sparse if non-zero count <= half of total length
    nnz * 2 <= vector.len()
}
}
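The boundary case matters: a vector with exactly half its components at (or near) zero already qualifies for sparse storage. A standalone copy of the rule, for illustration:

```rust
/// Store sparse when non-zero components are at most half the vector length.
fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    // Components below the 1e-6 magnitude threshold count as zero.
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    nnz * 2 <= vector.len()
}
```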

Distance Metrics

Configurable distance metrics for semantic similarity:

| Metric | Best For | Range | Formula |
| --- | --- | --- | --- |
| Cosine | Dense embeddings (default) | -1 to 1 | dot(a,b) / (‖a‖ * ‖b‖) |
| Angular | Linear angle relationships | 0 to π | acos(cosine_sim) |
| Jaccard | Sparse/binary embeddings | 0 to 1 | ‖A ∩ B‖ / ‖A ∪ B‖ |
| Euclidean | Absolute distances | 0 to inf | sqrt(sum((a-b)^2)) |
| WeightedJaccard | Sparse with magnitudes | 0 to 1 | Weighted set similarity |

Auto-selection: When auto_select_metric is true, the cache automatically selects Jaccard for sparse embeddings (sparsity >= threshold, default 70%) and the configured metric otherwise.
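The selection rule can be sketched as follows. This is a simplified standalone version, not the library's actual code: `sparsity` here is taken as the fraction of near-zero components, and the enum is reduced to the two outcomes involved.

```rust
#[derive(Debug, PartialEq)]
enum Metric {
    Cosine,
    Jaccard,
}

/// Pick Jaccard for sufficiently sparse embeddings, else the configured default.
fn select_metric(vector: &[f32], threshold: f32, default: Metric) -> Metric {
    let zeros = vector.iter().filter(|&&v| v.abs() <= 1e-6).count();
    let sparsity = zeros as f32 / vector.len().max(1) as f32;
    if sparsity >= threshold {
        Metric::Jaccard
    } else {
        default
    }
}
```

With the default threshold of 0.7, a vector that is 80% zeros routes to Jaccard while a half-dense vector keeps the configured metric.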

Eviction Strategies

Strategy Comparison

| Strategy | Description | Score Formula | Best For |
| --- | --- | --- | --- |
| LRU | Evicts entries that haven't been accessed recently | -last_access_secs | General purpose |
| LFU | Evicts entries with lowest access count | access_count | Stable workloads |
| CostBased | Evicts entries with lowest cost savings per byte | cost_per_hit / size_bytes | Cost optimization |
| Hybrid | Combines all strategies with configurable weights | Weighted combination | Production systems |

Hybrid Eviction Score Algorithm

The Hybrid strategy combines recency, frequency, and cost factors:

#![allow(unused)]
fn main() {
pub fn score(
    &self,
    last_access_secs: f64,
    access_count: u64,
    cost_per_hit: f64,
    size_bytes: usize,
) -> f64 {
    match self.strategy {
        EvictionStrategy::LRU => -last_access_secs,
        EvictionStrategy::LFU => access_count as f64,
        EvictionStrategy::CostBased => {
            if size_bytes == 0 { 0.0 }
            else { cost_per_hit / size_bytes as f64 }
        }
        EvictionStrategy::Hybrid { lru_weight, lfu_weight, cost_weight } => {
            let total = f64::from(lru_weight) + f64::from(lfu_weight) + f64::from(cost_weight);
            let recency_w = f64::from(lru_weight) / total;
            let frequency_w = f64::from(lfu_weight) / total;
            let cost_w = f64::from(cost_weight) / total;

            let age_minutes = last_access_secs / 60.0;
            let recency_score = 1.0 / (1.0 + age_minutes);    // Decays with age
            let frequency_score = (1.0 + access_count as f64).log2();  // Log scale
            let cost_score = cost_per_hit;

            recency_score * recency_w + frequency_score * frequency_w + cost_score * cost_w
        }
    }
}
}

Lower scores are evicted first. The hybrid formula:

  • recency_score: Decays as 1/(1 + age_in_minutes) - newer entries score higher
  • frequency_score: Grows logarithmically with access count - frequently accessed entries score higher
  • cost_score: Direct cost per hit - higher cost savings score higher
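A worked example under the default 40/30/30 weights: an entry last accessed 2 minutes ago, hit 7 times, saving $0.50 per hit, scores 1/(1+2) * 0.4 + log2(8) * 0.3 + 0.5 * 0.3 ≈ 1.183. The hybrid branch extracted as a standalone function:

```rust
/// Hybrid eviction score: weighted blend of recency, frequency, and cost.
/// Higher scores survive eviction longer.
fn hybrid_score(
    last_access_secs: f64,
    access_count: u64,
    cost_per_hit: f64,
    lru_weight: u32,
    lfu_weight: u32,
    cost_weight: u32,
) -> f64 {
    let total = f64::from(lru_weight) + f64::from(lfu_weight) + f64::from(cost_weight);
    // Recency decays hyperbolically with age in minutes.
    let recency_score = 1.0 / (1.0 + last_access_secs / 60.0);
    // Frequency grows logarithmically with access count.
    let frequency_score = (1.0 + access_count as f64).log2();
    recency_score * f64::from(lru_weight) / total
        + frequency_score * f64::from(lfu_weight) / total
        + cost_per_hit * f64::from(cost_weight) / total
}
```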

Background Eviction Flow

flowchart TD
    A[EvictionManager::start] --> B[Spawn Tokio Task]
    B --> C[Initialize Interval Timer]
    C --> D{Select Event}
    D -->|Timer Tick| E[Call evict_fn batch_size]
    D -->|Shutdown Signal| F[Set running=false]
    E --> G{Evicted > 0?}
    G -->|Yes| H[Record Eviction Stats]
    G -->|No| D
    H --> D
    F --> I[Break Loop]

#![allow(unused)]
fn main() {
// Starting background eviction
let handle = manager.start(move |batch_size| {
    cache.evict(batch_size)
});

// Later: graceful shutdown
handle.shutdown().await;
}

Configuration

Default Configuration

#![allow(unused)]
fn main() {
CacheConfig {
    exact_capacity: 10_000,
    semantic_capacity: 5_000,
    embedding_capacity: 50_000,
    default_ttl: Duration::from_secs(3600),
    max_ttl: Duration::from_secs(86400),
    semantic_threshold: 0.92,
    embedding_dim: 1536,
    eviction_strategy: EvictionStrategy::Hybrid {
        lru_weight: 40,
        lfu_weight: 30,
        cost_weight: 30
    },
    eviction_interval: Duration::from_secs(60),
    eviction_batch_size: 100,
    input_cost_per_1k: 0.0015,
    output_cost_per_1k: 0.002,
    inline_threshold: 4096,
    distance_metric: DistanceMetric::Cosine,
    auto_select_metric: true,
    sparsity_metric_threshold: 0.7,
}
}

Configuration Presets

| Preset | Use Case | Exact Capacity | Semantic Capacity | Embedding Capacity | Eviction Batch |
| --- | --- | --- | --- | --- | --- |
| `default()` | General purpose | 10,000 | 5,000 | 50,000 | 100 |
| `high_throughput()` | High-traffic server | 50,000 | 20,000 | 100,000 | 500 |
| `low_memory()` | Memory-constrained | 1,000 | 500 | 5,000 | 50 |
| `development()` | Dev/testing | 100 | 50 | 200 | 10 |
| `sparse_embeddings()` | Sparse vectors | 10,000 | 5,000 | 50,000 | 100 |

Configuration Validation

The config validates on cache creation:

#![allow(unused)]
fn main() {
pub fn validate(&self) -> Result<(), String> {
    if self.semantic_threshold < 0.0 || self.semantic_threshold > 1.0 {
        return Err("semantic_threshold must be between 0.0 and 1.0".to_string());
    }
    if self.embedding_dim == 0 {
        return Err("embedding_dim must be greater than 0".to_string());
    }
    if self.eviction_batch_size == 0 {
        return Err("eviction_batch_size must be greater than 0".to_string());
    }
    if self.default_ttl > self.max_ttl {
        return Err("default_ttl cannot exceed max_ttl".to_string());
    }
    if self.sparsity_metric_threshold < 0.0 || self.sparsity_metric_threshold > 1.0 {
        return Err("sparsity_metric_threshold must be between 0.0 and 1.0".to_string());
    }
    Ok(())
}
}

Usage Examples

Basic Usage

#![allow(unused)]
fn main() {
use tensor_cache::{Cache, CacheConfig};

let mut config = CacheConfig::default();
config.embedding_dim = 3;
let cache = Cache::with_config(config).unwrap();

// Store a response
let embedding = vec![0.1, 0.2, 0.3];
cache.put("What is 2+2?", &embedding, "4", "gpt-4", None).unwrap();

// Look up (tries exact first, then semantic)
if let Some(hit) = cache.get("What is 2+2?", Some(&embedding)) {
    println!("Cached: {}", hit.response);
}
}

Explicit Metric Queries

#![allow(unused)]
fn main() {
use tensor_cache::DistanceMetric;

let hit = cache.get_with_metric(
    "query",
    Some(&embedding),
    Some(&DistanceMetric::Euclidean),
);

if let Some(hit) = hit {
    println!("Metric used: {:?}", hit.metric_used);
}
}

Embedding Cache with Compute Fallback

#![allow(unused)]
fn main() {
// Get cached embedding or compute on miss
let embedding = cache.get_or_compute_embedding(
    "openai",           // source
    "Hello, world!",    // content
    "text-embedding-3-small",  // model
    || {
        // Compute function called only on cache miss
        Ok(compute_embedding("Hello, world!"))
    }
)?;
}

Token Counting and Cost Estimation

#![allow(unused)]
fn main() {
use tensor_cache::{TokenCounter, ModelPricing};

// Count tokens in text
let tokens = TokenCounter::count("Hello, world!");

// Count tokens in chat messages (includes overhead)
let messages = vec![("user", "Hello"), ("assistant", "Hi there!")];
let total = TokenCounter::count_messages(&messages);

// Estimate cost with custom rates
let cost = TokenCounter::estimate_cost(1000, 500, 0.01, 0.03);

// Use predefined model pricing
let pricing = ModelPricing::GPT4O;
let cost = pricing.estimate(1000, 500);

// Lookup pricing by model name
if let Some(pricing) = ModelPricing::for_model("gpt-4o-mini") {
    println!("Cost: ${:.4}", pricing.estimate(1000, 500));
}
}

Statistics and Monitoring

#![allow(unused)]
fn main() {
let stats = cache.stats_snapshot();

// Hit rates by layer
println!("Exact hit rate: {:.2}%", stats.hit_rate(CacheLayer::Exact) * 100.0);
println!("Semantic hit rate: {:.2}%", stats.hit_rate(CacheLayer::Semantic) * 100.0);

// Tokens and cost saved
println!("Input tokens saved: {}", stats.tokens_saved_in);
println!("Output tokens saved: {}", stats.tokens_saved_out);
println!("Cost saved: ${:.2}", stats.cost_saved_dollars);

// Cache utilization
println!("Total entries: {}", stats.total_entries());
println!("Evictions: {}", stats.evictions);
println!("Expirations: {}", stats.expirations);
println!("Uptime: {} seconds", stats.uptime_secs);
}

Shared TensorStore Integration

#![allow(unused)]
fn main() {
use tensor_store::TensorStore;
use tensor_cache::{Cache, CacheConfig};

// Share store with other engines
let store = TensorStore::new();
let cache = Cache::with_store(store.clone(), CacheConfig::default())?;

// Other engines can use the same store
let vault = Vault::with_store(store.clone(), VaultConfig::default())?;
}

Token Counting Implementation

The TokenCounter uses tiktoken’s cl100k_base encoding, which is compatible with GPT-4, GPT-3.5-turbo, and text-embedding-ada-002.

Lazy Encoder Initialization

#![allow(unused)]
fn main() {
static CL100K_ENCODER: OnceLock<Option<CoreBPE>> = OnceLock::new();

impl TokenCounter {
    fn encoder() -> Option<&'static CoreBPE> {
        CL100K_ENCODER
            .get_or_init(|| tiktoken_rs::cl100k_base().ok())
            .as_ref()
    }
}
}

Fallback Estimation

If tiktoken is unavailable, falls back to character-based estimation (~4 chars per token for English text):

#![allow(unused)]
fn main() {
const fn estimate_tokens(text: &str) -> usize {
    text.len().div_ceil(4)
}
}
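For instance, the 13-character string "Hello, world!" estimates to ⌈13/4⌉ = 4 tokens. The same estimator, written to take a plain length so the boundary cases are easy to check:

```rust
/// Fallback token estimate: roughly 4 characters per token, rounded up.
fn estimate_tokens(text: &str) -> usize {
    text.len().div_ceil(4)
}
```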

Message Token Counting

Chat messages include overhead tokens per message (role markers, separators):

#![allow(unused)]
fn main() {
pub fn count_message(role: &str, content: &str) -> usize {
    Self::encoder().map_or_else(
        || Self::estimate_tokens(role) + Self::estimate_tokens(content) + 4,
        |enc| {
            let role_tokens = enc.encode_ordinary(role).len();
            let content_tokens = enc.encode_ordinary(content).len();
            role_tokens + content_tokens + 4  // 4 tokens overhead per message
        },
    )
}

pub fn count_messages(messages: &[(&str, &str)]) -> usize {
    let mut total = 0;
    for (role, content) in messages {
        total += Self::count_message(role, content);
    }
    total + 3  // 3 tokens for assistant reply priming
}
}
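Under the character-based fallback, a single ("user", "Hello") message estimates to ⌈4/4⌉ + ⌈5/4⌉ + 4 = 7 tokens, and a one-message conversation to 7 + 3 = 10. A sketch of just that fallback path, without the encoder:

```rust
fn estimate_tokens(text: &str) -> usize {
    text.len().div_ceil(4)
}

/// Fallback per-message count: role + content estimates plus 4 overhead tokens.
fn count_message_fallback(role: &str, content: &str) -> usize {
    estimate_tokens(role) + estimate_tokens(content) + 4
}

/// Conversation total: per-message counts plus 3 tokens priming the assistant reply.
fn count_messages_fallback(messages: &[(&str, &str)]) -> usize {
    messages
        .iter()
        .map(|(role, content)| count_message_fallback(role, content))
        .sum::<usize>()
        + 3
}
```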

Cost Calculation Formulas

#![allow(unused)]
fn main() {
// Basic cost calculation
pub fn estimate_cost(
    input_tokens: usize,
    output_tokens: usize,
    input_rate: f64,   // $/1000 tokens
    output_rate: f64,  // $/1000 tokens
) -> f64 {
    (input_tokens as f64 / 1000.0) * input_rate +
    (output_tokens as f64 / 1000.0) * output_rate
}

// For atomic operations (avoids floating point accumulation errors)
pub fn estimate_cost_microdollars(...) -> u64 {
    let dollars = Self::estimate_cost(...);
    (dollars * 1_000_000.0) as u64
}
}
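For example, at the default rates ($0.0015/1K input, $0.002/1K output), a cache hit saving 1,000 input and 500 output tokens saves $0.0025. The two formulas above, with the elided microdollar parameters filled in as an assumed mirror of `estimate_cost`:

```rust
/// Cost in dollars given token counts and per-1K-token rates.
fn estimate_cost(input_tokens: usize, output_tokens: usize, input_rate: f64, output_rate: f64) -> f64 {
    (input_tokens as f64 / 1000.0) * input_rate + (output_tokens as f64 / 1000.0) * output_rate
}

/// Same cost in whole microdollars, suitable for lock-free atomic accumulation.
fn estimate_cost_microdollars(input_tokens: usize, output_tokens: usize, input_rate: f64, output_rate: f64) -> u64 {
    (estimate_cost(input_tokens, output_tokens, input_rate, output_rate) * 1_000_000.0) as u64
}
```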

Model Pricing

| Model | Input/1K | Output/1K | Notes |
| --- | --- | --- | --- |
| GPT-4o | $0.005 | $0.015 | Best for complex tasks |
| GPT-4o mini | $0.00015 | $0.0006 | Cost-effective |
| GPT-4 Turbo | $0.01 | $0.03 | High capability |
| GPT-3.5 Turbo | $0.0005 | $0.0015 | Budget option |
| Claude 3 Opus | $0.015 | $0.075 | Highest quality |
| Claude 3 Sonnet | $0.003 | $0.015 | Balanced |
| Claude 3 Haiku | $0.00025 | $0.00125 | Fast and cheap |

Model Name Matching

#![allow(unused)]
fn main() {
pub fn for_model(model: &str) -> Option<Self> {
    let model_lower = model.to_lowercase();
    if model_lower.contains("gpt-4o-mini") {
        Some(Self::GPT4O_MINI)
    } else if model_lower.contains("gpt-4o") {
        Some(Self::GPT4O)
    } else if model_lower.contains("gpt-4-turbo") {
        Some(Self::GPT4_TURBO)
    } else if model_lower.contains("gpt-3.5") {
        Some(Self::GPT35_TURBO)
    } else if model_lower.contains("claude-3-opus") || model_lower.contains("claude-opus") {
        Some(Self::CLAUDE3_OPUS)
    } else if model_lower.contains("claude-3-sonnet") || model_lower.contains("claude-sonnet") {
        Some(Self::CLAUDE3_SONNET)
    } else if model_lower.contains("claude-3-haiku") || model_lower.contains("claude-haiku") {
        Some(Self::CLAUDE3_HAIKU)
    } else {
        None
    }
}
}

Semantic Search Index Internals

CacheIndex Structure

#![allow(unused)]
fn main() {
pub struct CacheIndex {
    index: RwLock<HNSWIndex>,           // HNSW graph
    config: HNSWConfig,                  // For recreation on clear
    key_to_node: DashMap<String, usize>, // Cache key -> HNSW node
    node_to_key: DashMap<usize, String>, // HNSW node -> Cache key
    dimension: usize,                    // Expected embedding dimension
    entry_count: AtomicUsize,            // Entry count
    distance_metric: DistanceMetric,     // Default metric
}
}

Insert Strategies

#![allow(unused)]
fn main() {
// Dense embedding insert
pub fn insert(&self, key: &str, embedding: &[f32]) -> Result<usize>;

// Sparse embedding insert (memory efficient)
pub fn insert_sparse(&self, key: &str, embedding: &SparseVector) -> Result<usize>;

// Auto-select based on sparsity threshold
pub fn insert_auto(
    &self,
    key: &str,
    embedding: &[f32],
    sparsity_threshold: f32,
) -> Result<usize>;
}

Key Orphaning on Re-insert

When a key is re-inserted, the old HNSW node is orphaned (not deleted) because HNSW doesn’t support efficient deletion:

#![allow(unused)]
fn main() {
let is_new = !self.key_to_node.contains_key(key);
if !is_new {
    // Remove mapping but leave HNSW node (will be ignored in search)
    self.key_to_node.remove(key);
}
}

Memory Statistics

#![allow(unused)]
fn main() {
pub fn memory_stats(&self) -> Option<HNSWMemoryStats> {
    self.index.read().ok().map(|index| index.memory_stats())
}
// Returns: dense_count, sparse_count, delta_count, embedding_bytes, etc.
}

Error Types

| Error | Description | Recovery |
| --- | --- | --- |
| `NotFound` | Cache entry not found | Check key exists |
| `DimensionMismatch` | Embedding dimension does not match config | Verify embedding size |
| `StorageError` | Underlying tensor store error | Check store health |
| `SerializationError` | Serialization/deserialization failed | Verify data format |
| `TokenizerError` | Token counting failed | Falls back to estimation |
| `CacheFull` | Cache capacity exceeded | Run eviction or increase capacity |
| `InvalidConfig` | Invalid configuration provided | Fix config values |
| `Cancelled` | Operation was cancelled | Retry operation |
| `LockPoisoned` | Internal lock was poisoned | Restart cache |

Error Conversion

#![allow(unused)]
fn main() {
impl From<tensor_store::TensorStoreError> for CacheError {
    fn from(e: TensorStoreError) -> Self {
        Self::StorageError(e.to_string())
    }
}

impl From<bitcode::Error> for CacheError {
    fn from(e: bitcode::Error) -> Self {
        Self::SerializationError(e.to_string())
    }
}
}

Performance

Benchmarks (10,000 entries, 128-dim embeddings)

| Operation | Time | Notes |
| --- | --- | --- |
| Exact lookup (hit) | ~50ns | Hash lookup + TensorStore get |
| Exact lookup (miss) | ~30ns | Hash lookup only |
| Semantic lookup | ~5us | HNSW search + re-scoring |
| Put (exact + semantic) | ~10us | Two stores + HNSW insert |
| Eviction (100 entries) | ~200us | Batch deletion |
| Clear (full index) | ~1ms | HNSW recreation |

Distance Metric Performance (128-dim, 1000 entries)

| Metric | Search Time | Notes |
| --- | --- | --- |
| Cosine | 21 us | Default, best for dense |
| Jaccard | 18 us | Best for sparse |
| Angular | 23 us | +acos overhead |
| Euclidean | 19 us | Absolute distance |

Auto-Selection Overhead

| Operation | Time |
| --- | --- |
| Sparsity check | ~50 ns |
| Metric selection | ~10 ns |

Memory Efficiency

| Storage Type | Memory per Entry | Best For |
| --- | --- | --- |
| Dense Vector | 4 * dim bytes | Low sparsity (<50% zeros) |
| Sparse Vector | 8 * nnz bytes | High sparsity (>50% zeros) |
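For a 1536-dimension embedding those formulas give 6,144 bytes dense, while the same vector with only 100 non-zeros costs 800 bytes sparse. The 8 bytes per non-zero is read here as an index plus a value per entry, which is an assumption about the layout:

```rust
/// Approximate per-entry storage cost for dense embeddings: one f32 per component.
fn dense_bytes(dim: usize) -> usize {
    4 * dim
}

/// Approximate per-entry storage cost for sparse embeddings:
/// assumed 4-byte index + 4-byte f32 value per non-zero component.
fn sparse_bytes(nnz: usize) -> usize {
    8 * nnz
}
```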

Edge Cases and Gotchas

TTL Behavior

  • Entries with expires_at = 0 never expire
  • Expired entries return None on lookup but remain in storage until cleanup
  • cleanup_expired() must be called explicitly or via background eviction

Capacity Limits

  • put() fails with CacheFull when capacity is reached
  • Capacity is checked per-layer (exact, semantic, embedding)
  • No automatic eviction on put - must be explicit

Hash Collisions

  • Extremely unlikely with 64-bit hashes (~1 in 18 quintillion)
  • If collision occurs, exact cache will return wrong response
  • Semantic cache provides fallback for semantically different queries

Metric Re-scoring

  • HNSW always uses cosine similarity for graph navigation
  • Re-scoring with different metrics may change result order
  • Retrieves 3x candidates to account for re-ranking

Sparse Storage Threshold

  • Uses sparse format when nnz * 2 <= len (at least 50% zeros)
  • Different from auto-metric selection threshold (default 70%)
  • Both thresholds are configurable

Performance Tips and Best Practices

Configuration Tuning

  1. Semantic Threshold: Start with 0.92, lower to 0.85 for fuzzy matching
  2. Eviction Weights: Increase cost_weight if API costs matter most
  3. Batch Size: Larger batches (500+) for high-throughput systems
  4. TTL: Match to your content freshness requirements

Memory Optimization

  1. Use sparse_embeddings() preset for sparse data
  2. Set inline_threshold based on typical response sizes
  3. Enable auto_select_metric for mixed workloads
  4. Monitor memory_stats() to track sparse vs dense ratio

Hit Rate Optimization

  1. Normalize prompts before caching (lowercase, trim whitespace)
  2. Use versioning for model/prompt template changes
  3. Set appropriate semantic threshold for your domain
  4. Consider domain-specific embeddings

Cost Tracking

  1. Use estimate_cost_microdollars() for atomic accumulation
  2. Record cost per cache hit for ROI analysis
  3. Compare tokens_saved against capacity costs

Shell Commands

CACHE INIT     Initialize semantic cache
CACHE STATS    Show cache statistics
CACHE CLEAR    Clear all cache entries

API Reference

Cache Methods

| Method | Description |
| --- | --- |
| `new()` | Create with default config |
| `with_config(config)` | Create with custom config |
| `with_store(store, config)` | Create with shared TensorStore |
| `get(prompt, embedding)` | Look up cached response |
| `get_with_metric(prompt, embedding, metric)` | Look up with explicit metric |
| `put(prompt, embedding, response, model, ttl)` | Store response |
| `get_embedding(source, content)` | Get cached embedding |
| `put_embedding(source, content, embedding, model)` | Store embedding |
| `get_or_compute_embedding(source, content, model, compute)` | Get or compute embedding |
| `get_simple(key)` | Simple key-value lookup |
| `put_simple(key, value)` | Simple key-value store |
| `invalidate(prompt)` | Remove exact entry |
| `invalidate_version(version)` | Remove entries by version |
| `invalidate_embeddings(source)` | Remove embeddings by source |
| `evict(count)` | Manually evict entries |
| `cleanup_expired()` | Remove expired entries |
| `clear()` | Clear all entries |
| `stats()` | Get statistics reference |
| `stats_snapshot()` | Get statistics snapshot |
| `config()` | Get configuration reference |
| `len()` | Total cached entries |
| `is_empty()` | Check if cache is empty |

CacheHit Fields

| Field | Type | Description |
| --- | --- | --- |
| `response` | String | Cached response text |
| `layer` | CacheLayer | Which layer matched |
| `similarity` | Option<f32> | Similarity score (semantic only) |
| `input_tokens` | usize | Input tokens saved |
| `output_tokens` | usize | Output tokens saved |
| `cost_saved` | f64 | Estimated cost saved (dollars) |
| `metric_used` | Option<DistanceMetric> | Metric used (semantic only) |

StatsSnapshot Fields

| Field | Type | Description |
| --- | --- | --- |
| `exact_hits` | u64 | Exact cache hits |
| `exact_misses` | u64 | Exact cache misses |
| `semantic_hits` | u64 | Semantic cache hits |
| `semantic_misses` | u64 | Semantic cache misses |
| `embedding_hits` | u64 | Embedding cache hits |
| `embedding_misses` | u64 | Embedding cache misses |
| `tokens_saved_in` | u64 | Total input tokens saved |
| `tokens_saved_out` | u64 | Total output tokens saved |
| `cost_saved_dollars` | f64 | Total cost saved |
| `evictions` | u64 | Total evictions |
| `expirations` | u64 | Total expirations |
| `exact_size` | usize | Current exact cache size |
| `semantic_size` | usize | Current semantic cache size |
| `embedding_size` | usize | Current embedding cache size |
| `uptime_secs` | u64 | Cache uptime in seconds |

Dependencies

| Crate | Purpose |
| --- | --- |
| `tensor_store` | HNSW index implementation, TensorStore |
| `tiktoken-rs` | GPT-compatible token counting |
| `dashmap` | Concurrent hash maps |
| `tokio` | Async runtime for background eviction |
| `uuid` | Unique ID generation |
| `thiserror` | Error type derivation |
| `serde` | Configuration serialization |
| `bincode` | Binary serialization |

  • tensor_store - Backing storage and HNSW index
  • query_router - Cache integration for query execution
  • neumann_shell - CLI commands for cache management

Tensor Blob Architecture

S3-style object storage for large artifacts using content-addressable chunked storage with tensor-native metadata. Artifacts are split into SHA-256 hashed chunks for automatic deduplication, with metadata stored in the tensor store for integration with graph, relational, and vector queries.

All I/O operations are async via Tokio. Large files are streamed through BlobWriter and BlobReader without loading entirely into memory. Background garbage collection removes orphaned chunks automatically.

Key Types

Core Types

| Type | Description |
| --- | --- |
| `BlobStore` | Main API for storing, retrieving, and managing artifacts |
| `BlobConfig` | Configuration for chunk size, GC intervals, and limits |
| `BlobWriter` | Streaming upload with incremental chunking and hash computation |
| `BlobReader` | Streaming download with chunk-by-chunk reads and verification |
| `Chunk` | Content-addressed data segment with SHA-256 hash |
| `Chunker` | Splits data into fixed-size content-addressable chunks |
| `StreamingHasher` | Incremental SHA-256 computation for large files |
| `GarbageCollector` | Background task for cleaning orphaned chunks |

Metadata Types

| Type | Description |
| --- | --- |
| `ArtifactMetadata` | Full metadata including filename, size, checksum, links, tags |
| `PutOptions` | Upload options: content type, creator, links, tags, custom metadata, embedding |
| `MetadataUpdates` | Partial updates for filename, content type, custom fields |
| `SimilarArtifact` | Search result with artifact ID, filename, and similarity score |
| `WriteState` | Internal state tracking artifact metadata during streaming upload |

Statistics Types

| Type | Description |
| --- | --- |
| `BlobStats` | Storage statistics: artifact count, chunk count, dedup ratio, orphaned chunks |
| `GcStats` | GC results: chunks deleted, bytes freed |
| `RepairStats` | Repair results: artifacts checked, chunks verified, refs fixed, orphans deleted |

Error Types

| Error | Description |
| --- | --- |
| `NotFound` | Artifact does not exist |
| `ChunkMissing` | Referenced chunk not found in storage |
| `ChecksumMismatch` | Data corruption detected during verification |
| `EmptyData` | Cannot store empty artifact |
| `InvalidConfig` | Invalid configuration parameter (e.g., zero chunk size) |
| `InvalidArtifactId` | Malformed artifact ID format |
| `StorageError` | Underlying tensor store error |
| `GraphError` | Graph engine integration error (feature-gated) |
| `VectorError` | Vector engine integration error (feature-gated) |
| `IoError` | I/O error during streaming operations |
| `GcError` | Garbage collection failure |
| `AlreadyExists` | Artifact with given ID already exists |
| `DimensionMismatch` | Embedding dimension mismatch |

Architecture Diagram

+--------------------------------------------------+
|                BlobStore (Public API)            |
|   - put, get, delete, exists                     |
|   - metadata, update_metadata                    |
|   - link, unlink, tag, untag                     |
|   - verify, repair, gc, full_gc                  |
+--------------------------------------------------+
            |              |              |
    +-------+      +-------+      +-------+
    |              |              |
+--------+   +-----------+   +----------+
| Writer |   |  Reader   |   |    GC    |
| Stream |   |  Stream   |   | (Tokio)  |
+--------+   +-----------+   +----------+
    |              |              |
    +-------+------+------+-------+
            |
    +------------------+
    |     Chunker      |
    |   SHA-256 hash   |
    +------------------+
            |
    +------------------+
    |   tensor_store   |
    | _blob:meta:*     |
    | _blob:chunk:*    |
    +------------------+

Storage Format

Artifact Metadata

Stored at _blob:meta:{artifact_id}:

| Field | Type | Description |
| --- | --- | --- |
| `_type` | String | Always `"blob_artifact"` |
| `_id` | String | Unique artifact identifier (UUID v4) |
| `_filename` | String | Original filename |
| `_content_type` | String | MIME type |
| `_size` | Int | Total size in bytes |
| `_checksum` | String | SHA-256 hash of full content (`sha256:{hex}`) |
| `_chunk_size` | Int | Size of each chunk (except possibly last) |
| `_chunk_count` | Int | Number of chunks |
| `_chunks` | Pointers | Ordered list of chunk keys |
| `_created` | Int | Unix timestamp (seconds) |
| `_modified` | Int | Unix timestamp (seconds) |
| `_created_by` | String | Creator identity |
| `_linked_to` | Pointers | Linked entity IDs |
| `_tags` | Pointers | Applied tags (prefixed with `tag:`) |
| `_meta:*` | String | Custom metadata fields |
| `_embedding` | Vector/Sparse | Optional embedding (sparse if >50% zeros) |
| `_embedded_model` | String | Embedding model name |

Chunk Data

Stored at _blob:chunk:sha256:{hex}:

| Field | Type | Description |
| --- | --- | --- |
| `_type` | String | Always `"blob_chunk"` |
| `_data` | Bytes | Raw chunk data |
| `_size` | Int | Chunk size in bytes |
| `_refs` | Int | Reference count for deduplication |
| `_created` | Int | Unix timestamp (seconds) |

Content-Addressable Chunking Algorithm

The chunker uses a fixed-size chunking strategy with SHA-256 content addressing:

flowchart TD
    A[Input Data] --> B[Split into fixed-size chunks]
    B --> C{For each chunk}
    C --> D[Compute SHA-256 hash]
    D --> E{Chunk exists?}
    E -->|Yes| F[Increment ref count]
    E -->|No| G[Store new chunk]
    F --> H[Record chunk key]
    G --> H
    H --> C
    C -->|Done| I[Compute full-file checksum]
    I --> J[Store metadata with chunk list]

Chunker Implementation

#![allow(unused)]
fn main() {
// Chunker splits data into fixed-size segments
pub struct Chunker {
    chunk_size: usize,  // Default: 1MB (1,048,576 bytes)
}

impl Chunker {
    // Split data into chunks using Rust's chunks() iterator
    pub fn chunk<'a>(&'a self, data: &'a [u8]) -> impl Iterator<Item = Chunk> + 'a {
        data.chunks(self.chunk_size).map(|chunk_data| {
            let hash = compute_hash(chunk_data);
            Chunk {
                hash,
                data: chunk_data.to_vec(),
                size: chunk_data.len(),
            }
        })
    }

    // Count chunks without allocating (useful for progress estimation)
    pub fn chunk_count(&self, data_len: usize) -> usize {
        if data_len == 0 { 0 } else { data_len.div_ceil(self.chunk_size) }
    }
}
}
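With the default 1 MB chunk size, a 10 MB artifact splits into exactly 10 chunks and a 2.5 MB artifact into 3, with the last chunk smaller. The count formula in isolation:

```rust
const MB: usize = 1_048_576;

/// Number of fixed-size chunks needed for `data_len` bytes (ceiling division).
fn chunk_count(data_len: usize, chunk_size: usize) -> usize {
    if data_len == 0 {
        0
    } else {
        data_len.div_ceil(chunk_size)
    }
}
```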

Chunk Key Format

Chunk keys follow a deterministic format for content addressing:

_blob:chunk:sha256:{64_hex_chars}

Example:

_blob:chunk:sha256:b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9

SHA-256 Checksum Computation

The system uses the sha2 crate for cryptographic hashing:

#![allow(unused)]
fn main() {
use sha2::{Digest, Sha256};

// Single-shot hash for chunk content
pub fn compute_hash(data: &[u8]) -> String {
    let mut hasher = Sha256::new();
    hasher.update(data);
    let result = hasher.finalize();
    format!("sha256:{:x}", result)  // Lowercase hex encoding
}

// Streaming hash for large files (used by BlobWriter)
pub struct StreamingHasher {
    hasher: Sha256,
}

impl StreamingHasher {
    pub fn new() -> Self {
        Self { hasher: Sha256::new() }
    }

    pub fn update(&mut self, data: &[u8]) {
        self.hasher.update(data);
    }

    pub fn finalize(self) -> String {
        let result = self.hasher.finalize();
        format!("sha256:{:x}", result)
    }
}

// Multi-segment hash (for verification)
pub fn compute_hash_streaming<'a>(segments: impl Iterator<Item = &'a [u8]>) -> String {
    let mut hasher = Sha256::new();
    for segment in segments {
        hasher.update(segment);
    }
    let result = hasher.finalize();
    format!("sha256:{:x}", result)
}
}

Content-Addressable Deduplication

Chunks are keyed by SHA-256 hash, enabling automatic deduplication:

  1. When writing data, the chunker splits it into fixed-size segments (default 1MB)
  2. Each chunk is hashed with SHA-256 to produce a unique key
  3. If the chunk already exists, only the reference count is incremented
  4. Identical data across different artifacts shares the same physical chunks

#![allow(unused)]
fn main() {
let data = vec![0u8; 10_000];

// Store same data twice
blob.put("file1.bin", &data, PutOptions::default()).await?;
blob.put("file2.bin", &data, PutOptions::default()).await?;

let stats = blob.stats().await?;
// stats.chunk_count = 1 (deduplicated)
// stats.dedup_ratio > 0.0
}

Deduplication Ratio Calculation

#![allow(unused)]
fn main() {
let dedup_ratio = if total_bytes > 0 {
    1.0 - (unique_bytes as f64 / total_bytes as f64)
} else {
    0.0
};
}

A ratio of 0.5 means 50% space savings through deduplication.
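For instance, storing two identical 5 MB files yields 10 MB of logical bytes but only 5 MB of unique chunks, giving a ratio of 0.5. The calculation as a standalone function:

```rust
/// Fraction of logical bytes saved by chunk-level deduplication.
/// 0.0 means no sharing; 0.5 means half the logical bytes are deduplicated.
fn dedup_ratio(unique_bytes: u64, total_bytes: u64) -> f64 {
    if total_bytes > 0 {
        1.0 - (unique_bytes as f64 / total_bytes as f64)
    } else {
        0.0
    }
}
```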

Streaming Upload State Machine

The BlobWriter manages incremental uploads with proper buffering:

stateDiagram-v2
    [*] --> Created: new()
    Created --> Buffering: write()
    Buffering --> Buffering: write() [buffer < chunk_size]
    Buffering --> ChunkReady: write() [buffer >= chunk_size]
    ChunkReady --> StoreChunk: drain buffer
    StoreChunk --> CheckExists: compute hash
    CheckExists --> IncrementRefs: chunk exists
    CheckExists --> StoreNew: chunk new
    IncrementRefs --> Buffering
    StoreNew --> Buffering
    Buffering --> FlushFinal: finish()
    FlushFinal --> StoreMetadata: store remaining buffer
    StoreMetadata --> [*]: return artifact_id

BlobWriter Internal State

#![allow(unused)]
fn main() {
pub struct BlobWriter {
    store: TensorStore,
    chunker: Chunker,
    state: WriteState,      // Artifact metadata (filename, content_type, etc.)
    chunks: Vec<String>,    // Ordered list of chunk keys
    total_size: usize,      // Running total of bytes written
    hasher: StreamingHasher, // Incremental full-file hash
    buffer: Vec<u8>,        // Incomplete chunk buffer
}
}

Write Operation Flow

#![allow(unused)]
fn main() {
pub async fn write(&mut self, data: &[u8]) -> Result<()> {
    if data.is_empty() { return Ok(()); }

    // 1. Update full-file hash (computed independently of chunking)
    self.hasher.update(data);
    self.total_size += data.len();

    // 2. Add to internal buffer
    self.buffer.extend_from_slice(data);

    // 3. Process complete chunks (may be multiple if large write)
    while self.buffer.len() >= self.chunker.chunk_size() {
        let chunk_data: Vec<u8> = self.buffer.drain(..self.chunker.chunk_size()).collect();
        let chunk = Chunk::new(chunk_data);
        self.store_chunk(chunk).await?;
    }

    Ok(())
}
}

Finish Operation

#![allow(unused)]
fn main() {
pub async fn finish(mut self) -> Result<String> {
    // 1. Flush remaining buffer as final (possibly smaller) chunk
    if !self.buffer.is_empty() {
        let chunk = Chunk::new(std::mem::take(&mut self.buffer));
        self.store_chunk(chunk).await?;
    }

    // 2. Finalize full-file checksum
    let checksum = self.hasher.finalize();

    // 3. Build and store metadata tensor
    let mut tensor = TensorData::new();
    tensor.set("_type", "blob_artifact");
    tensor.set("_id", self.state.artifact_id.clone());
    tensor.set("_checksum", checksum);
    tensor.set("_chunks", TensorValue::Pointers(self.chunks));
    // ... additional fields ...

    let meta_key = format!("_blob:meta:{}", self.state.artifact_id);
    self.store.put(&meta_key, tensor)?;

    Ok(self.state.artifact_id)
}
}

Streaming Download State Machine

The BlobReader manages incremental downloads with chunk-level iteration:

stateDiagram-v2
    [*] --> Initialized: new()
    Initialized --> LoadMetadata: read chunk list
    LoadMetadata --> Ready: chunks loaded
    Ready --> ReadChunk: next_chunk()
    ReadChunk --> ChunkLoaded: fetch from store
    ChunkLoaded --> Ready: return data
    Ready --> [*]: all chunks read
    Ready --> Verify: verify()
    Verify --> HashAll: reset and hash all chunks
    HashAll --> Compare: compare checksums
    Compare --> [*]: return bool

BlobReader Internal State

#![allow(unused)]
fn main() {
pub struct BlobReader {
    store: TensorStore,
    chunks: Vec<String>,       // Ordered list of chunk keys
    current_chunk: usize,      // Index of next chunk to read
    current_data: Option<Vec<u8>>, // Cached current chunk for read()
    current_offset: usize,     // Offset within current_data
    total_size: usize,         // Total artifact size
    bytes_read: usize,         // Bytes read so far
    checksum: String,          // Expected checksum for verification
}
}

Read Modes

#![allow(unused)]
fn main() {
// Mode 1: Chunk-at-a-time (best for processing in batches)
while let Some(chunk) = reader.next_chunk().await? {
    process_chunk(&chunk);
}

// Mode 2: Read all into memory (convenient for small files)
let data = reader.read_all().await?;

// Mode 3: Buffer-based reading (for streaming to other APIs)
let mut buf = vec![0u8; 4096];
loop {
    let n = reader.read(&mut buf).await?;
    if n == 0 { break; }
    output.write_all(&buf[..n])?;
}
}

Garbage Collection Reference Counting

The GC system uses reference counting with two operational modes:

Reference Count Management

#![allow(unused)]
fn main() {
// When storing a chunk (in BlobWriter::store_chunk)
if self.store.exists(&chunk_key) {
    // Chunk already exists - just increment ref count
    increment_chunk_refs(&self.store, &chunk_key)?;
} else {
    // New chunk - store with ref count of 1
    let mut tensor = TensorData::new();
    tensor.set("_refs", TensorValue::Scalar(ScalarValue::Int(1)));
    // ... store chunk data ...
}

// When deleting an artifact
pub fn delete_artifact(store: &TensorStore, artifact_id: &str) -> Result<()> {
    let meta_key = format!("_blob:meta:{artifact_id}");
    let tensor = store.get(&meta_key)?;
    if let Some(chunks) = get_pointers(&tensor, "_chunks") {
        for chunk_key in chunks {
            decrement_chunk_refs(store, &chunk_key)?;  // Saturating at 0
        }
    }
    store.delete(&meta_key)?;
    Ok(())
}
}

Incremental GC (gc())

Processes a limited batch of chunks per cycle, respecting age requirements:

flowchart TD
    A[Start GC Cycle] --> B[Scan chunk keys]
    B --> C{Take batch_size chunks}
    C --> D{For each chunk}
    D --> E{refs == 0?}
    E -->|No| D
    E -->|Yes| F{age > min_age?}
    F -->|No| D
    F -->|Yes| G[Delete chunk]
    G --> H[Track freed bytes]
    H --> D
    D -->|Done| I[Return GcStats]
#![allow(unused)]
fn main() {
pub async fn gc_cycle(&self) -> GcStats {
    let mut deleted = 0;
    let mut freed_bytes = 0;

    let now = current_timestamp();
    let min_created = now.saturating_sub(self.config.min_age.as_secs());

    let chunk_keys = self.store.scan("_blob:chunk:");

    for chunk_key in chunk_keys.into_iter().take(self.config.batch_size) {
        if let Ok(tensor) = self.store.get(&chunk_key) {
            let refs = get_int(&tensor, "_refs").unwrap_or(0);
            let created = get_int(&tensor, "_created").unwrap_or(0) as u64;

            // Zero refs AND old enough
            if refs == 0 && created < min_created {
                let size = get_int(&tensor, "_size").unwrap_or(0) as usize;
                if self.store.delete(&chunk_key).is_ok() {
                    deleted += 1;
                    freed_bytes += size;
                }
            }
        }
    }

    GcStats { deleted, freed_bytes }
}
}

Full GC (full_gc())

Rebuilds reference counts from scratch and deletes all unreferenced chunks:

flowchart TD
    A[Start Full GC] --> B[Build reference set from all artifacts]
    B --> C[Scan all artifact metadata]
    C --> D[Extract chunk lists]
    D --> E[Add to HashSet]
    E --> C
    C -->|Done| F[Scan all chunks]
    F --> G{Chunk in reference set?}
    G -->|Yes| F
    G -->|No| H[Delete chunk]
    H --> I[Track freed bytes]
    I --> F
    F -->|Done| J[Return GcStats]
#![allow(unused)]
fn main() {
pub async fn full_gc(&self) -> Result<GcStats> {
    // Phase 1: Build reference set from all artifacts
    let mut referenced: HashSet<String> = HashSet::new();
    for meta_key in self.store.scan("_blob:meta:") {
        if let Ok(tensor) = self.store.get(&meta_key) {
            if let Some(chunks) = get_pointers(&tensor, "_chunks") {
                referenced.extend(chunks);
            }
        }
    }

    // Phase 2: Delete unreferenced chunks (ignores age requirement)
    let mut deleted = 0;
    let mut freed_bytes = 0;
    for chunk_key in self.store.scan("_blob:chunk:") {
        if !referenced.contains(&chunk_key) {
            if let Ok(tensor) = self.store.get(&chunk_key) {
                let size = get_int(&tensor, "_size").unwrap_or(0) as usize;
                if self.store.delete(&chunk_key).is_ok() {
                    deleted += 1;
                    freed_bytes += size;
                }
            }
        }
    }

    Ok(GcStats { deleted, freed_bytes })
}
}

Background GC Task

#![allow(unused)]
fn main() {
pub fn start(self: Arc<Self>) -> JoinHandle<()> {
    let gc = Arc::clone(&self);
    tokio::spawn(async move {
        gc.run().await;
    })
}

async fn run(&self) {
    let mut interval = interval(self.config.check_interval);
    let mut shutdown_rx = self.shutdown_tx.subscribe();

    loop {
        tokio::select! {
            _ = interval.tick() => {
                let _ = self.gc_cycle().await;
            }
            _ = shutdown_rx.recv() => {
                break;
            }
        }
    }
}
}

Integrity Repair Algorithm

The repair operation fixes reference count inconsistencies and removes orphans:

flowchart TD
    A[Start Repair] --> B[Phase 1: Build true reference counts]
    B --> C[Scan all artifacts]
    C --> D[Count chunk references]
    D --> E[Build HashMap chunk -> count]
    E --> F[Phase 2: Verify and fix chunks]
    F --> G[Scan all chunks]
    G --> H{Current refs == expected?}
    H -->|Yes| I{Expected refs == 0?}
    H -->|No| J[Update refs to expected]
    J --> I
    I -->|Yes| K[Mark as orphan]
    I -->|No| G
    K --> G
    G -->|Done| L[Phase 3: Delete orphans]
    L --> M[Delete marked chunks]
    M --> N[Return RepairStats]

Repair Implementation

#![allow(unused)]
fn main() {
pub async fn repair(store: &TensorStore) -> Result<RepairStats> {
    let mut stats = RepairStats::default();

    // Phase 1: Build true reference counts from all artifacts
    let mut true_refs: HashMap<String, i64> = HashMap::new();
    for meta_key in store.scan("_blob:meta:") {
        stats.artifacts_checked += 1;
        if let Ok(tensor) = store.get(&meta_key) {
            if let Some(chunks) = get_pointers(&tensor, "_chunks") {
                for chunk_key in chunks {
                    *true_refs.entry(chunk_key).or_insert(0) += 1;
                }
            }
        }
    }

    // Phase 2: Verify and fix reference counts
    let mut orphan_keys = Vec::new();
    for chunk_key in store.scan("_blob:chunk:") {
        stats.chunks_verified += 1;
        if let Ok(mut tensor) = store.get(&chunk_key) {
            let current_refs = get_int(&tensor, "_refs").unwrap_or(0);
            let expected_refs = true_refs.get(&chunk_key).copied().unwrap_or(0);

            if current_refs != expected_refs {
                tensor.set("_refs", TensorValue::Scalar(ScalarValue::Int(expected_refs)));
                store.put(&chunk_key, tensor)?;
                stats.refs_fixed += 1;
            }

            if expected_refs == 0 {
                orphan_keys.push(chunk_key);
            }
        }
    }

    // Phase 3: Delete orphans
    for orphan_key in orphan_keys {
        if store.delete(&orphan_key).is_ok() {
            stats.orphans_deleted += 1;
        }
    }

    Ok(stats)
}
}

Artifact Verification

#![allow(unused)]
fn main() {
pub async fn verify_artifact(store: &TensorStore, artifact_id: &str) -> Result<bool> {
    let meta_key = format!("_blob:meta:{artifact_id}");
    let tensor = store.get(&meta_key)?;

    let expected_checksum = get_string(&tensor, "_checksum")?;
    let chunks = get_pointers(&tensor, "_chunks")?;

    // Recompute checksum by hashing all chunks in order
    let mut hasher = StreamingHasher::new();
    for chunk_key in &chunks {
        let chunk_tensor = store.get(chunk_key)?;
        let chunk_data = get_bytes(&chunk_tensor, "_data")?;
        hasher.update(&chunk_data);
    }

    let actual_checksum = hasher.finalize();
    Ok(actual_checksum == expected_checksum)
}

// Verify individual chunk integrity
pub fn verify_chunk(store: &TensorStore, chunk_key: &str) -> Result<bool> {
    let expected_hash = chunk_key.strip_prefix("_blob:chunk:")?;
    let tensor = store.get(chunk_key)?;
    let data = get_bytes(&tensor, "_data")?;
    let actual_hash = compute_hash(&data);
    Ok(actual_hash == expected_hash)
}
}

Usage Examples

Basic Storage

#![allow(unused)]
fn main() {
use tensor_blob::{BlobStore, BlobConfig, PutOptions};
use tensor_store::TensorStore;

let store = TensorStore::new();
let blob = BlobStore::new(store, BlobConfig::default()).await?;

// Store an artifact
let artifact_id = blob.put(
    "report.pdf",
    &file_bytes,
    PutOptions::new()
        .with_created_by("user:alice")
        .with_tag("quarterly")
        .with_link("task:123"),
).await?;

// Retrieve it
let data = blob.get(&artifact_id).await?;

// Get metadata
let meta = blob.metadata(&artifact_id).await?;
}

Streaming API

#![allow(unused)]
fn main() {
// Streaming upload (memory-efficient for large files)
let mut writer = blob.writer("large_file.bin", PutOptions::default()).await?;
for chunk in file_chunks {
    writer.write(&chunk).await?;
}
let artifact_id = writer.finish().await?;

// Streaming download
let mut reader = blob.reader(&artifact_id).await?;
while let Some(chunk) = reader.next_chunk().await? {
    process_chunk(&chunk);
}

// Verify integrity after download
let mut reader = blob.reader(&artifact_id).await?;
let valid = reader.verify().await?;
}

Entity Linking and Tagging

#![allow(unused)]
fn main() {
// Link artifact to entities
blob.link(&artifact_id, "user:alice").await?;
blob.link(&artifact_id, "task:123").await?;

// Find artifacts linked to an entity
let artifacts = blob.artifacts_for("user:alice").await?;

// Add tags
blob.tag(&artifact_id, "important").await?;

// Find artifacts by tag
let important_files = blob.by_tag("important").await?;
}

Semantic Search (with vector feature)

#![allow(unused)]
fn main() {
// Set embedding for artifact
blob.set_embedding(&artifact_id, embedding, "text-embedding-3-small").await?;

// Find similar artifacts
let similar = blob.similar(&artifact_id, 10).await?;
}

Configuration Options

| Option | Default | Description |
|---|---|---|
| `chunk_size` | 1 MB (1,048,576 bytes) | Size of each chunk in bytes |
| `max_artifact_size` | None (unlimited) | Maximum artifact size limit |
| `max_artifacts` | None (unlimited) | Maximum number of artifacts |
| `gc_interval` | 5 minutes (300 s) | Background GC check frequency |
| `gc_batch_size` | 100 | Chunks processed per GC cycle |
| `gc_min_age` | 1 minute (60 s) | Minimum age before GC eligible |
| `default_content_type` | `application/octet-stream` | Default MIME type |

#![allow(unused)]
fn main() {
let config = BlobConfig::new()
    .with_chunk_size(1024 * 1024)
    .with_gc_interval(Duration::from_secs(300))
    .with_gc_batch_size(100)
    .with_gc_min_age(Duration::from_secs(3600))
    .with_max_artifact_size(100 * 1024 * 1024);
}

Configuration Validation

#![allow(unused)]
fn main() {
// Configuration is validated on BlobStore::new()
pub fn validate(&self) -> Result<()> {
    if self.chunk_size == 0 {
        return Err(BlobError::InvalidConfig("chunk_size must be > 0"));
    }
    if self.gc_batch_size == 0 {
        return Err(BlobError::InvalidConfig("gc_batch_size must be > 0"));
    }
    Ok(())
}
}

Garbage Collection

Two GC modes are available:

| Method | Description | Age Requirement | Reference Source |
|---|---|---|---|
| `gc()` | Incremental GC: processes `batch_size` chunks per cycle | Respects `min_age` | Uses stored `_refs` field |
| `full_gc()` | Full GC: recounts all references from artifacts | Ignores age | Rebuilds from artifact metadata |

Background GC runs automatically when started:

#![allow(unused)]
fn main() {
blob.start().await?;     // Start background GC
// ... use blob store ...
blob.shutdown().await?;  // Graceful shutdown (waits for current cycle)
}

BlobStore API

| Method | Description |
|---|---|
| `new(store, config)` | Create with configuration (validates config) |
| `start()` | Start background GC task |
| `shutdown()` | Graceful shutdown (sends signal and awaits task) |
| `store()` | Get reference to underlying `TensorStore` |
| `put(filename, data, options)` | Store bytes, return artifact ID |
| `get(artifact_id)` | Retrieve all bytes |
| `delete(artifact_id)` | Delete artifact and decrement chunk refs |
| `exists(artifact_id)` | Check if artifact exists |
| `writer(filename, options)` | Create streaming upload writer |
| `reader(artifact_id)` | Create streaming download reader |
| `metadata(artifact_id)` | Get artifact metadata |
| `update_metadata(artifact_id, updates)` | Apply metadata updates |
| `set_meta(artifact_id, key, value)` | Set custom metadata field |
| `get_meta(artifact_id, key)` | Get custom metadata field |
| `link(artifact_id, entity)` | Link to entity |
| `unlink(artifact_id, entity)` | Remove link |
| `links(artifact_id)` | Get linked entities |
| `artifacts_for(entity)` | Find artifacts by linked entity |
| `tag(artifact_id, tag)` | Add tag |
| `untag(artifact_id, tag)` | Remove tag |
| `by_tag(tag)` | Find artifacts by tag |
| `list(prefix)` | List artifacts with optional prefix filter |
| `by_content_type(type)` | Find by content type |
| `by_creator(creator)` | Find by creator |
| `verify(artifact_id)` | Verify checksum integrity |
| `repair()` | Repair broken references |
| `gc()` | Run incremental GC |
| `full_gc()` | Run full GC |
| `stats()` | Get storage statistics |
| `set_embedding(id, vec, model)` | Set artifact embedding (feature-gated) |
| `similar(id, k)` | Find k similar artifacts (feature-gated) |
| `search_by_embedding(vec, k)` | Search by embedding vector (feature-gated) |

BlobWriter API

| Method | Description |
|---|---|
| `write(data)` | Write chunk of data (buffers until `chunk_size` reached) |
| `finish()` | Finalize, flush buffer, store metadata, return artifact ID |
| `bytes_written()` | Total bytes written so far |
| `chunks_written()` | Chunks stored so far (not including buffered data) |

BlobReader API

| Method | Description |
|---|---|
| `next_chunk()` | Read next chunk, returns `None` when done |
| `read_all()` | Read all remaining data into buffer |
| `read(buf)` | Read into buffer, returns bytes read (for streaming) |
| `verify()` | Verify checksum against stored value (resets read position) |
| `checksum()` | Get expected checksum |
| `total_size()` | Total artifact size |
| `bytes_read()` | Bytes read so far |
| `chunk_count()` | Number of chunks |

Shell Commands

BLOB PUT 'filename' 'data'              Store inline data
BLOB PUT 'filename' FROM 'path'         Store from file path
BLOB GET 'artifact_id'                  Retrieve data
BLOB GET 'artifact_id' TO 'path'        Write to file
BLOB DELETE 'artifact_id'               Delete artifact
BLOB INFO 'artifact_id'                 Show metadata
BLOB VERIFY 'artifact_id'               Verify integrity

BLOB LINK 'artifact_id' TO 'entity'     Link to entity
BLOB UNLINK 'artifact_id' FROM 'entity' Remove link
BLOB TAG 'artifact_id' 'tag'            Add tag
BLOB UNTAG 'artifact_id' 'tag'          Remove tag

BLOB META SET 'artifact_id' 'key' 'value'  Set custom metadata
BLOB META GET 'artifact_id' 'key'          Get custom metadata

BLOB GC                                 Run incremental GC
BLOB GC FULL                            Full garbage collection
BLOB REPAIR                             Repair broken references
BLOB STATS                              Show storage statistics

BLOBS                                   List all artifacts
BLOBS FOR 'entity'                      Find by linked entity
BLOBS BY TAG 'tag'                      Find by tag
BLOBS WHERE TYPE = 'content/type'       Find by content type
BLOBS SIMILAR TO 'artifact_id' LIMIT n  Find similar (requires embeddings)

Edge Cases and Gotchas

Empty Data

#![allow(unused)]
fn main() {
// Empty data is rejected
let result = blob.put("empty.txt", b"", PutOptions::default()).await;
assert!(matches!(result, Err(BlobError::EmptyData)));
}

Size Limits

#![allow(unused)]
fn main() {
// Exceeding max_artifact_size returns InvalidConfig error
let config = BlobConfig::new().with_max_artifact_size(1024);
let blob = BlobStore::new(store, config).await?;

let result = blob.put("large.bin", &vec![0u8; 2048], PutOptions::default()).await;
// Returns Err(BlobError::InvalidConfig("data size 2048 exceeds max 1024"))
}

Concurrent Deduplication

Chunk reference counting is not fully atomic. If two writers simultaneously store the same chunk:

  • Both may check exists() and find it missing
  • Both may store the chunk with refs = 1
  • One write will overwrite the other
  • Result: ref count may be 1 instead of 2

Mitigation: For high-concurrency scenarios, use full_gc() periodically to rebuild accurate reference counts.
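
A periodic `full_gc()` can be scheduled alongside normal operation; a minimal sketch using a tokio timer (the hourly period and helper name are illustrative choices, not project defaults):

```rust
#![allow(unused)]
fn main() {
use std::{sync::Arc, time::Duration};

// Periodically rebuild accurate reference counts to heal any
// increments lost to concurrent deduplicated writes.
fn spawn_periodic_full_gc(blob: Arc<BlobStore>) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        let mut ticker = tokio::time::interval(Duration::from_secs(3600));
        loop {
            ticker.tick().await;
            // Note: full_gc ignores min_age, so schedule it for windows
            // when no uploads are expected to be in flight.
            if let Ok(stats) = blob.full_gc().await {
                eprintln!("full_gc freed {} bytes", stats.freed_bytes);
            }
        }
    })
}
}
```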

GC Timing

  • Incremental GC respects min_age to avoid deleting chunks from in-progress uploads
  • A writer that takes longer than min_age to complete may have chunks collected
  • Recommendation: Set gc_min_age longer than your maximum expected upload time

Checksum vs Chunk Hash

  • Checksum (_checksum): SHA-256 of the entire file content
  • Chunk hash (in key): SHA-256 of individual chunk data
  • These are different values and cannot be compared directly
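
The two values answer different questions when diagnosing corruption; a sketch using the verification helpers shown earlier (the wiring is illustrative):

```rust
#![allow(unused)]
fn main() {
// Chunk hash: taken from the key itself, validates one chunk's bytes.
let chunk_ok = verify_chunk(&store, &chunk_key)?;

// Artifact checksum: recomputed across all chunks in order,
// validates the reassembled file.
let artifact_ok = verify_artifact(&store, &artifact_id).await?;

// Every chunk can be individually valid while the artifact is not
// (e.g. a reordered chunk list in metadata), so check both.
}
```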

Sparse Embedding Detection

#![allow(unused)]
fn main() {
// Embeddings with >50% zeros are stored in sparse format
pub(crate) fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() { return false; }
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    nnz * 2 <= vector.len()  // Use sparse if nnz <= 50%
}
}

Performance Tips and Best Practices

Chunk Size Selection

| Chunk Size | Best For | Trade-offs |
|---|---|---|
| 256 KB | Many small files, high dedup potential | More metadata overhead |
| 1 MB (default) | General purpose | Good balance |
| 4 MB | Large media files, sequential access | Less dedup, fewer chunks |

#![allow(unused)]
fn main() {
// Benchmark different chunk sizes for your workload
let config = BlobConfig::new().with_chunk_size(512 * 1024); // 512KB
}

Streaming for Large Files

#![allow(unused)]
fn main() {
// Bad: Loads entire file into memory
let data = std::fs::read("large_file.bin")?;
blob.put("large_file.bin", &data, PutOptions::default()).await?;

// Good: Streams file in chunks
let mut writer = blob.writer("large_file.bin", PutOptions::default()).await?;
let file = std::fs::File::open("large_file.bin")?;
let mut reader = std::io::BufReader::new(file);
let mut buffer = vec![0u8; 64 * 1024]; // 64KB read buffer
loop {
    let n = reader.read(&mut buffer)?;
    if n == 0 { break; }
    writer.write(&buffer[..n]).await?;
}
let artifact_id = writer.finish().await?;
}

GC Tuning

#![allow(unused)]
fn main() {
// High-throughput: More aggressive GC
let config = BlobConfig::new()
    .with_gc_interval(Duration::from_secs(60))   // Check every minute
    .with_gc_batch_size(500)                      // Process more per cycle
    .with_gc_min_age(Duration::from_secs(300));   // 5 minute grace period

// Low-priority background: Less aggressive
let config = BlobConfig::new()
    .with_gc_interval(Duration::from_secs(3600)) // Check hourly
    .with_gc_batch_size(50)                       // Small batches
    .with_gc_min_age(Duration::from_secs(86400)); // 24 hour grace period
}

Batch Operations

#![allow(unused)]
fn main() {
// For multiple related artifacts, batch metadata updates
for artifact_id in artifact_ids {
    blob.tag(&artifact_id, "batch-processed").await?;
}

// Use full_gc() after bulk deletions
for artifact_id in to_delete {
    blob.delete(&artifact_id).await?;
}
blob.full_gc().await?; // Clean up all orphans at once
}

Verification Strategy

#![allow(unused)]
fn main() {
// Verify on read (paranoid mode)
let mut reader = blob.reader(&artifact_id).await?;
let data = reader.read_all().await?;
if !reader.verify().await? {
    return Err("Corruption detected".into());
}

// Periodic verification (background task)
for artifact_id in blob.list(None).await? {
    if !blob.verify(&artifact_id).await? {
        log::warn!("Corruption in artifact: {}", artifact_id);
    }
}
}

| Module | Relationship |
|---|---|
| `tensor_store` | Underlying key-value storage for chunks and metadata |
| `query_router` | Executes BLOB commands from parsed queries |
| `neumann_shell` | Interactive CLI for blob operations |
| `vector_engine` | Optional semantic search via embeddings |
| `graph_engine` | Optional entity linking via graph edges |

Dependencies

| Crate | Purpose |
|---|---|
| `tensor_store` | Key-value storage layer |
| `tokio` | Async runtime for streaming and background GC |
| `sha2` | SHA-256 hashing for content addressing |
| `uuid` | Artifact ID generation (UUID v4) |

Tensor Checkpoint

Tensor Checkpoint provides point-in-time snapshots of the database state for recovery operations. It enables users to create manual checkpoints before important operations, checkpoint automatically before destructive operations, and roll back to any previous checkpoint. Checkpoints are stored as blob artifacts in tensor_blob for content-addressable storage with automatic deduplication.

The module integrates with the query router to provide SQL-like commands (CHECKPOINT, CHECKPOINTS, ROLLBACK TO) and supports interactive confirmation prompts for destructive operations with configurable retention policies.

Module Structure

tensor_checkpoint/
  src/
    lib.rs          # CheckpointManager, CheckpointConfig
    state.rs        # CheckpointState, DestructiveOp, metadata types
    storage.rs      # Blob storage integration
    retention.rs    # Count-based purge logic
    preview.rs      # Destructive operation previews
    error.rs        # Error types

Key Types

Core Types

| Type | Description |
|---|---|
| `CheckpointManager` | Main API for checkpoint operations |
| `CheckpointConfig` | Configuration (retention, auto-checkpoint, interactive mode) |
| `CheckpointState` | Full checkpoint data with snapshot and metadata |
| `CheckpointInfo` | Lightweight checkpoint listing info |
| `CheckpointTrigger` | Context for auto-checkpoints (command, operation, preview) |

State Types

| Type | Description |
|---|---|
| `DestructiveOp` | Enum of destructive operations that trigger auto-checkpoints |
| `OperationPreview` | Summary and sample data for confirmation prompts |
| `CheckpointMetadata` | Statistics for validation (tables, nodes, embeddings) |
| `RelationalMeta` | Table and row counts |
| `GraphMeta` | Node and edge counts |
| `VectorMeta` | Embedding count |

Error Types

| Variant | Description | Common Cause |
|---|---|---|
| `NotFound` | Checkpoint not found by ID or name | Typo in checkpoint name, or ID was pruned by retention |
| `Storage` | Blob storage error | Disk full, permissions issue |
| `Serialization` | Bincode serialization error | Corrupt in-memory state |
| `Deserialization` | Bincode deserialization error | Corrupt checkpoint file |
| `Blob` | Underlying blob store error | BlobStore not initialized |
| `Snapshot` | TensorStore snapshot error | Store locked or corrupted |
| `Cancelled` | Operation cancelled by user | User rejected confirmation prompt |
| `InvalidId` | Invalid checkpoint identifier | Empty or malformed ID string |
| `Retention` | Retention enforcement error | Failed to delete old checkpoints |

Architecture

flowchart TB
    subgraph Commands
        CP[CHECKPOINT]
        CPS[CHECKPOINTS]
        RB[ROLLBACK TO]
    end

    subgraph CheckpointManager
        Create[create / create_auto]
        List[list]
        Rollback[rollback]
        Delete[delete]
        Confirm[request_confirmation]
        Preview[generate_preview]
    end

    subgraph Storage Layer
        CS[CheckpointStorage]
        RM[RetentionManager]
        PG[PreviewGenerator]
    end

    subgraph Dependencies
        Blob[tensor_blob::BlobStore]
        Store[tensor_store::TensorStore]
    end

    CP --> Create
    CPS --> List
    RB --> Rollback

    Create --> CS
    Create --> RM
    List --> CS
    Rollback --> CS
    Delete --> CS

    Confirm --> PG
    Preview --> PG

    CS --> Blob
    Create --> Store
    Rollback --> Store

Checkpoint Creation Flow

sequenceDiagram
    participant User
    participant Manager as CheckpointManager
    participant Store as TensorStore
    participant Storage as CheckpointStorage
    participant Retention as RetentionManager
    participant Blob as BlobStore

    User->>Manager: create(name, store)
    Manager->>Manager: Generate UUID
    Manager->>Manager: collect_metadata(store)
    Manager->>Store: snapshot_bytes()
    Store-->>Manager: Vec<u8>
    Manager->>Manager: Create CheckpointState
    Manager->>Storage: store(state, blob)
    Storage->>Storage: bitcode::encode(state)
    Storage->>Blob: put(filename, data, options)
    Blob-->>Storage: artifact_id
    Storage-->>Manager: artifact_id
    Manager->>Retention: enforce(blob)
    Retention->>Storage: list(blob)
    Storage-->>Retention: Vec<CheckpointInfo>
    Retention->>Retention: Sort by created_at DESC
    Retention->>Storage: delete(oldest beyond limit)
    Retention-->>Manager: deleted_count
    Manager-->>User: checkpoint_id

Rollback Flow

sequenceDiagram
    participant User
    participant Manager as CheckpointManager
    participant Storage as CheckpointStorage
    participant Blob as BlobStore
    participant Store as TensorStore

    User->>Manager: rollback(id_or_name, store)
    Manager->>Storage: load(id_or_name, blob)
    Storage->>Storage: find_by_id_or_name()
    Storage->>Storage: list() and match
    Storage->>Blob: get(artifact_id)
    Blob-->>Storage: checkpoint_bytes
    Storage->>Storage: bitcode::decode()
    Storage-->>Manager: CheckpointState
    Manager->>Store: restore_from_bytes(state.store_snapshot)
    Store->>Store: SlabRouter::from_bytes()
    Store->>Store: clear() current data
    Store->>Store: copy all entries from new router
    Store-->>Manager: Ok(())
    Manager-->>User: Success

Storage Format

Checkpoints are stored as blob artifacts using content-addressable storage:

| Property | Value |
|---|---|
| Tag | `_system:checkpoint` |
| Content-Type | `application/x-neumann-checkpoint` |
| Format | bincode-serialized `CheckpointState` |
| Filename | `checkpoint_{id}.ncp` |
| Creator | `system:checkpoint` |
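
Because checkpoints are ordinary blob artifacts, they can be enumerated with the blob API directly; a sketch assuming the tag convention above (the `filename` metadata field is an assumption for illustration):

```rust
#![allow(unused)]
fn main() {
// Checkpoints are regular artifacts tagged `_system:checkpoint`,
// so the blob tag index finds them without going through the manager.
for artifact_id in blob.by_tag("_system:checkpoint").await? {
    let meta = blob.metadata(&artifact_id).await?;
    println!("{artifact_id}: {}", meta.filename); // checkpoint_{id}.ncp
}
}
```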

Checkpoint State Structure

The CheckpointState is serialized using bincode for efficient binary encoding:

#![allow(unused)]
fn main() {
#[derive(Serialize, Deserialize)]
pub struct CheckpointState {
    pub id: String,           // UUID v4
    pub name: String,         // User-provided or auto-generated
    pub created_at: u64,      // Unix timestamp (seconds)
    pub trigger: Option<CheckpointTrigger>,  // For auto-checkpoints
    pub store_snapshot: Vec<u8>,  // Serialized SlabRouterSnapshot
    pub metadata: CheckpointMetadata,
}
}

Snapshot Serialization Format

The store_snapshot field contains a V3 format snapshot:

#![allow(unused)]
fn main() {
// V3 snapshot structure (bincode serialized)
pub struct V3Snapshot {
    pub header: SnapshotHeader,     // Magic bytes, version, entry count
    pub router: SlabRouterSnapshot, // All slab data
}

pub struct SlabRouterSnapshot {
    pub index: EntityIndexSnapshot,      // Key-to-entity mapping
    pub embeddings: EmbeddingSlabSnapshot,
    pub graph: GraphTensorSnapshot,
    pub relations: RelationalSlabSnapshot,
    pub metadata: MetadataSlabSnapshot,
    pub cache: CacheRingSnapshot<TensorData>,
    pub blobs: BlobLogSnapshot,
}
}

Custom metadata stored with each artifact:

| Key | Type | Description |
|---|---|---|
| `checkpoint_id` | String | UUID identifier |
| `checkpoint_name` | String | User-provided or auto-generated name |
| `created_at` | String | Unix timestamp (parsed to u64) |
| `trigger` | String | Operation name (for auto-checkpoints only) |
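
Since these fields are ordinary custom blob metadata, they can be read back through `get_meta`; a sketch following the field names in the table above:

```rust
#![allow(unused)]
fn main() {
// Custom metadata fields round-trip through the blob API.
let name = blob.get_meta(&artifact_id, "checkpoint_name").await?;
let created: u64 = blob
    .get_meta(&artifact_id, "created_at")
    .await?
    .parse()
    .unwrap_or(0); // stored as a String, parsed to u64 per the table
}
```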

Metadata Collection Algorithm

When creating a checkpoint, metadata is collected by scanning the store:

#![allow(unused)]
fn main() {
fn collect_metadata(&self, store: &TensorStore) -> CheckpointMetadata {
    let store_key_count = store.len();

    // Count relational tables by scanning _schema: prefix
    let table_keys: Vec<_> = store.scan("_schema:");
    let table_count = table_keys.len();
    let mut total_rows = 0;
    for key in &table_keys {
        if let Some(table_name) = key.strip_prefix("_schema:") {
            total_rows += store.scan_count(&format!("{table_name}:"));
        }
    }

    // Count graph entities
    let node_count = store.scan_count("node:");
    let edge_count = store.scan_count("edge:");

    // Count embeddings
    let embedding_count = store.scan_count("_embed:");

    CheckpointMetadata::new(
        RelationalMeta::new(table_count, total_rows),
        GraphMeta::new(node_count, edge_count),
        VectorMeta::new(embedding_count),
        store_key_count,
    )
}
}

Configuration

CheckpointConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `max_checkpoints` | usize | 10 | Maximum checkpoints before pruning |
| `auto_checkpoint` | bool | true | Enable auto-checkpoints before destructive ops |
| `interactive_confirm` | bool | true | Require confirmation for destructive ops |
| `preview_sample_size` | usize | 5 | Number of sample rows in previews |

Builder Pattern

#![allow(unused)]
fn main() {
let config = CheckpointConfig::default()
    .with_max_checkpoints(20)
    .with_auto_checkpoint(true)
    .with_interactive_confirm(false)
    .with_preview_sample_size(10);
}

Configuration Presets

| Preset | max_checkpoints | auto_checkpoint | interactive_confirm | Use Case |
|---|---|---|---|---|
| Default | 10 | true | true | Interactive CLI usage |
| Automated | 20 | true | false | Batch processing scripts |
| Minimal | 3 | false | false | Memory-constrained environments |
| Safe | 50 | true | true | Production with high retention |
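
The presets are just combinations of the builder methods shown above; the "Automated" row, for example, corresponds to:

```rust
#![allow(unused)]
fn main() {
// "Automated" preset: more retention, no interactive prompts.
let config = CheckpointConfig::default()
    .with_max_checkpoints(20)
    .with_auto_checkpoint(true)
    .with_interactive_confirm(false);
}
```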

Destructive Operations

Operations that trigger auto-checkpoints when auto_checkpoint is enabled:

| Operation | Variant | Fields | Affected Count |
|---|---|---|---|
| DELETE | `Delete` | table, row_count | row_count |
| DROP TABLE | `DropTable` | table, row_count | row_count |
| DROP INDEX | `DropIndex` | table, column | 1 |
| NODE DELETE | `NodeDelete` | node_id, edge_count | 1 + edge_count |
| EMBED DELETE | `EmbedDelete` | key | 1 |
| VAULT DELETE | `VaultDelete` | key | 1 |
| BLOB DELETE | `BlobDelete` | artifact_id, size | 1 |
| CACHE CLEAR | `CacheClear` | entry_count | entry_count |

DestructiveOp Implementation

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum DestructiveOp {
    Delete { table: String, row_count: usize },
    DropTable { table: String, row_count: usize },
    DropIndex { table: String, column: String },
    NodeDelete { node_id: u64, edge_count: usize },
    EmbedDelete { key: String },
    VaultDelete { key: String },
    BlobDelete { artifact_id: String, size: usize },
    CacheClear { entry_count: usize },
}

impl DestructiveOp {
    pub fn operation_name(&self) -> &'static str {
        match self {
            DestructiveOp::Delete { .. } => "DELETE",
            DestructiveOp::DropTable { .. } => "DROP TABLE",
            // ... etc
        }
    }

    pub fn affected_count(&self) -> usize {
        match self {
            DestructiveOp::Delete { row_count, .. } => *row_count,
            DestructiveOp::NodeDelete { edge_count, .. } => 1 + edge_count,
            DestructiveOp::DropIndex { .. } => 1,
            // ... etc
        }
    }
}
}

SQL Commands

CHECKPOINT

-- Named checkpoint
CHECKPOINT 'before-migration'

-- Auto-generated name (checkpoint-{timestamp})
CHECKPOINT

CHECKPOINTS

-- List all checkpoints
CHECKPOINTS

-- List last N checkpoints
CHECKPOINTS LIMIT 10

Returns: ID, Name, Created, Type (manual/auto)

ROLLBACK TO

-- By name
ROLLBACK TO 'checkpoint-name'

-- By ID
ROLLBACK TO 'uuid-string'

API Reference

CheckpointManager

#![allow(unused)]
fn main() {
impl CheckpointManager {
    /// Create manager with blob storage and configuration
    pub async fn new(
        blob: Arc<Mutex<BlobStore>>,
        config: CheckpointConfig
    ) -> Self;

    /// Create a manual checkpoint
    pub async fn create(
        &self,
        name: Option<&str>,
        store: &TensorStore
    ) -> Result<String>;

    /// Create an auto-checkpoint before destructive operation
    pub async fn create_auto(
        &self,
        command: &str,
        op: DestructiveOp,
        preview: OperationPreview,
        store: &TensorStore
    ) -> Result<String>;

    /// Rollback to a checkpoint by ID or name
    pub async fn rollback(
        &self,
        id_or_name: &str,
        store: &TensorStore
    ) -> Result<()>;

    /// List checkpoints, most recent first
    pub async fn list(
        &self,
        limit: Option<usize>
    ) -> Result<Vec<CheckpointInfo>>;

    /// Delete a checkpoint by ID or name
    pub async fn delete(&self, id_or_name: &str) -> Result<()>;

    /// Generate preview for a destructive operation
    pub fn generate_preview(
        &self,
        op: &DestructiveOp,
        sample_data: Vec<String>
    ) -> OperationPreview;

    /// Request user confirmation for an operation
    pub fn request_confirmation(
        &self,
        op: &DestructiveOp,
        preview: &OperationPreview
    ) -> bool;

    /// Set custom confirmation handler
    pub fn set_confirmation_handler(
        &mut self,
        handler: Arc<dyn ConfirmationHandler>
    );

    /// Check if auto-checkpoint is enabled
    pub fn auto_checkpoint_enabled(&self) -> bool;

    /// Check if interactive confirmation is enabled
    pub fn interactive_confirm_enabled(&self) -> bool;

    /// Access the current configuration
    pub fn config(&self) -> &CheckpointConfig;
}
}

ConfirmationHandler

#![allow(unused)]
fn main() {
pub trait ConfirmationHandler: Send + Sync {
    fn confirm(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool;
}
}

Built-in implementations:

| Type | Behavior | Use Case |
|---|---|---|
| `AutoConfirm` | Always returns true | Automated scripts, testing |
| `AutoReject` | Always returns false | Testing cancellation paths |
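
For scripted runs, a built-in handler can be installed through `set_confirmation_handler`; a sketch assuming `AutoConfirm` is exported from the crate root:

```rust
#![allow(unused)]
fn main() {
use std::sync::Arc;
use tensor_checkpoint::AutoConfirm;

// Bypass interactive prompts: every destructive op is approved.
manager.set_confirmation_handler(Arc::new(AutoConfirm));
}
```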

CheckpointStorage

Internal storage layer for checkpoint persistence:

#![allow(unused)]
fn main() {
impl CheckpointStorage {
    /// Store a checkpoint state to blob storage
    pub async fn store(state: &CheckpointState, blob: &BlobStore) -> Result<String>;

    /// Load a checkpoint by ID or name
    pub async fn load(checkpoint_id: &str, blob: &BlobStore) -> Result<CheckpointState>;

    /// List all checkpoints (sorted by created_at descending)
    pub async fn list(blob: &BlobStore) -> Result<Vec<CheckpointInfo>>;

    /// Delete a checkpoint by artifact ID
    pub async fn delete(artifact_id: &str, blob: &BlobStore) -> Result<()>;
}
}

PreviewGenerator

Generates human-readable previews for destructive operations:

#![allow(unused)]
fn main() {
impl PreviewGenerator {
    pub fn new(sample_size: usize) -> Self;

    pub fn generate(&self, op: &DestructiveOp, sample_data: Vec<String>) -> OperationPreview;
}

// Utility functions
pub fn format_warning(op: &DestructiveOp) -> String;
pub fn format_confirmation_prompt(op: &DestructiveOp, preview: &OperationPreview) -> String;
}

Usage Examples

Basic Usage

#![allow(unused)]
fn main() {
use std::sync::Arc;
use tensor_checkpoint::{CheckpointManager, CheckpointConfig};
use tensor_blob::{BlobStore, BlobConfig};
use tensor_store::TensorStore;
use tokio::sync::Mutex;

// Initialize
let store = TensorStore::new();
let blob = BlobStore::new(store.clone(), BlobConfig::default()).await?;
let blob = Arc::new(Mutex::new(blob));

let config = CheckpointConfig::default();
let manager = CheckpointManager::new(blob, config).await;

// Create checkpoint
let id = manager.create(Some("before-migration"), &store).await?;

// ... make changes ...

// Rollback if needed
manager.rollback("before-migration", &store).await?;
}

With Query Router

#![allow(unused)]
fn main() {
use query_router::QueryRouter;

let mut router = QueryRouter::new();
router.init_blob()?;
router.init_checkpoint()?;

// Execute checkpoint commands via SQL
router.execute_parsed("CHECKPOINT 'backup'")?;
router.execute_parsed("CHECKPOINTS")?;
router.execute_parsed("ROLLBACK TO 'backup'")?;
}

Custom Confirmation Handler

#![allow(unused)]
fn main() {
use tensor_checkpoint::{ConfirmationHandler, DestructiveOp, OperationPreview};
use std::io::{self, Write};
use std::sync::Arc;

struct InteractiveHandler;

impl ConfirmationHandler for InteractiveHandler {
    fn confirm(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool {
        println!("{}", tensor_checkpoint::format_confirmation_prompt(op, preview));
        io::stdout().flush().unwrap();

        let mut input = String::new();
        io::stdin().read_line(&mut input).unwrap();
        input.trim().to_lowercase() == "yes"
    }
}

// Usage
manager.set_confirmation_handler(Arc::new(InteractiveHandler));
}

Auto-Checkpoint with Rejection

#![allow(unused)]
fn main() {
use std::sync::Arc;
use tensor_checkpoint::{AutoReject, CheckpointConfig, CheckpointManager};

// Create config with auto-checkpoint enabled
let config = CheckpointConfig::default()
    .with_auto_checkpoint(true)
    .with_interactive_confirm(true);

let mut manager = CheckpointManager::new(blob, config).await;
manager.set_confirmation_handler(Arc::new(AutoReject));

// DELETE will be rejected, no checkpoint created, operation cancelled
let result = router.execute("DELETE FROM users WHERE age > 50");
assert!(result.is_err());  // Operation cancelled by user
}

Retention Management

Checkpoints are automatically pruned when max_checkpoints is exceeded:

Retention Algorithm

#![allow(unused)]
fn main() {
pub async fn enforce(&self, blob: &BlobStore) -> Result<usize> {
    let checkpoints = CheckpointStorage::list(blob).await?;

    if checkpoints.len() <= self.max_checkpoints {
        return Ok(0);
    }

    let to_remove = checkpoints.len() - self.max_checkpoints;
    let mut removed = 0;

    // Checkpoints are sorted by created_at descending, oldest are at end
    for checkpoint in checkpoints.iter().rev().take(to_remove) {
        if CheckpointStorage::delete(&checkpoint.artifact_id, blob)
            .await
            .is_ok()
        {
            removed += 1;
        }
    }

    Ok(removed)
}
}

Retention Timing

Retention is enforced after every checkpoint creation:

  1. Create new checkpoint
  2. Store in blob storage
  3. Call retention.enforce()
  4. Return checkpoint ID

This ensures the checkpoint count never exceeds max_checkpoints + 1 at any point.

Retention Edge Cases

| Scenario | Behavior |
|----------|----------|
| Creation fails | Retention not called, count unchanged |
| Retention delete fails | Logged but not fatal; deletion continues |
| max_checkpoints = 0 | All checkpoints deleted after creation |
| max_checkpoints = 1 | Only the newest checkpoint retained |
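
These edge cases follow directly from the pruning rule in the retention algorithm above. A standalone sketch of just that rule, with the blob store abstracted to a newest-first list of checkpoint IDs (`ids_to_prune` is an illustrative name, not part of the crate's API):

```rust
// Given a newest-first list of checkpoint IDs and the retention limit,
// return the IDs that would be deleted (the oldest sit at the end).
fn ids_to_prune(checkpoints: &[&str], max_checkpoints: usize) -> Vec<String> {
    if checkpoints.len() <= max_checkpoints {
        return Vec::new();
    }
    let to_remove = checkpoints.len() - max_checkpoints;
    checkpoints
        .iter()
        .rev() // oldest first
        .take(to_remove)
        .map(|s| s.to_string())
        .collect()
}
```

With `max_checkpoints = 0` every checkpoint is pruned immediately after creation; with `max_checkpoints = 1` only the newest survives.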

Interactive Confirmation

When interactive_confirm is enabled, destructive operations display a preview:

WARNING: About to delete 5 row(s) from table 'users'
Will delete 5 row(s) from table 'users'

Affected data sample:
  1. id=1, name='Alice'
  2. id=2, name='Bob'
  ... and 3 more

Type 'yes' to proceed, anything else to cancel:

Preview Generation

The preview generator formats human-readable summaries:

#![allow(unused)]
fn main() {
fn format_summary(&self, op: &DestructiveOp) -> String {
    match op {
        DestructiveOp::Delete { table, row_count } => {
            format!("Will delete {row_count} row(s) from table '{table}'")
        },
        DestructiveOp::DropTable { table, row_count } => {
            format!("Will drop table '{table}' containing {row_count} row(s)")
        },
        DestructiveOp::BlobDelete { artifact_id, size } => {
            let size_str = format_bytes(*size);
            format!("Will delete blob artifact '{artifact_id}' ({size_str})")
        },
        // ... etc
    }
}
}

Size Formatting

Blob sizes are formatted for readability:

| Bytes | Display |
|-------|---------|
| < 1024 | "N bytes" |
| >= 1 KB | "N.NN KB" |
| >= 1 MB | "N.NN MB" |
| >= 1 GB | "N.NN GB" |
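
A hypothetical `format_bytes` matching the table; the crate's real helper may differ in rounding or thresholds:

```rust
// Format a byte count into the human-readable form shown in the table.
fn format_bytes(size: u64) -> String {
    const KB: u64 = 1024;
    const MB: u64 = KB * 1024;
    const GB: u64 = MB * 1024;
    match size {
        s if s >= GB => format!("{:.2} GB", s as f64 / GB as f64),
        s if s >= MB => format!("{:.2} MB", s as f64 / MB as f64),
        s if s >= KB => format!("{:.2} KB", s as f64 / KB as f64),
        s => format!("{s} bytes"),
    }
}
```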

Rollback Algorithm

The rollback process completely replaces the store contents:

Algorithm Steps

  1. Locate Checkpoint: Search by ID first, then by name
  2. Load State: Deserialize CheckpointState from blob storage
  3. Deserialize Snapshot: Convert store_snapshot bytes to SlabRouter
  4. Clear Current Data: Remove all entries from current store
  5. Copy Restored Data: Iterate and copy all entries from restored router

Rollback Implementation

#![allow(unused)]
fn main() {
pub async fn rollback(&self, id_or_name: &str, store: &TensorStore) -> Result<()> {
    let blob = self.blob.lock().await;
    let state = CheckpointStorage::load(id_or_name, &blob).await?;

    store
        .restore_from_bytes(&state.store_snapshot)
        .map_err(|e| CheckpointError::Snapshot(e.to_string()))?;

    Ok(())
}

// In TensorStore
pub fn restore_from_bytes(&self, bytes: &[u8]) -> SnapshotResult<()> {
    let new_router = SlabRouter::from_bytes(bytes)?;

    // Clear current and copy data from new router
    self.router.clear();
    for key in new_router.scan("") {
        if let Ok(value) = new_router.get(&key) {
            let _ = self.router.put(&key, value);
        }
    }

    Ok(())
}
}

Rollback Characteristics

| Aspect | Behavior |
|--------|----------|
| Atomicity | Not atomic; partial restore possible on failure |
| Isolation | No locking; concurrent operations may see partial state |
| Duration | O(n) where n = number of entries |
| Memory | Requires 2x memory during restore (old + new) |

Edge Cases and Gotchas

Name vs ID Lookup

Checkpoints can be referenced by either name or ID:

#![allow(unused)]
fn main() {
async fn find_by_id_or_name(id_or_name: &str, blob: &BlobStore) -> Result<String> {
    let checkpoints = Self::list(blob).await?;

    for cp in checkpoints {
        // Exact match on ID or name
        if cp.id == id_or_name || cp.name == id_or_name {
            return Ok(cp.artifact_id);
        }
    }

    Err(CheckpointError::NotFound(id_or_name.to_string()))
}
}

Gotcha: If a checkpoint is named with a valid UUID format, it may conflict with ID lookup.

Auto-Generated Names

When no name is provided:

#![allow(unused)]
fn main() {
let name = name.map(String::from).unwrap_or_else(|| {
    let now = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    format!("checkpoint-{now}")
});
}

Auto-checkpoint names follow the pattern: auto-before-{operation-name}

Timestamp Edge Cases

| Scenario | Behavior |
|----------|----------|
| System time before epoch | Timestamp becomes 0 |
| Rapid checkpoint creation | May produce identical second-granularity timestamps |
| Clock drift | Checkpoints may be listed out of order |

Blob Store Dependency

#![allow(unused)]
fn main() {
pub fn init_checkpoint(&mut self) -> Result<()> {
    self.init_checkpoint_with_config(CheckpointConfig::default())
}

pub fn init_checkpoint_with_config(&mut self, config: CheckpointConfig) -> Result<()> {
    let blob = self
        .blob
        .as_ref()
        .ok_or_else(|| {
            RouterError::CheckpointError(
                "Blob store must be initialized first".to_string()
            )
        })?;
    // ...
}
}

Gotcha: Always call init_blob() before init_checkpoint().

Performance Tips

Checkpoint Creation Performance

| Factor | Impact | Recommendation |
|--------|--------|----------------|
| Store size | O(n) serialization | Keep hot data separate |
| Retention limit | More deletions on creation | Set appropriate max_checkpoints |
| Blob storage | Network latency for remote | Use local storage for fast checkpoints |

Memory Considerations

  • Full snapshot is held in memory during creation
  • Rollback requires 2x memory temporarily
  • Large embeddings significantly increase checkpoint size

Optimization Strategies

  1. Incremental Checkpoints (not yet supported)

    • Currently full snapshots only
    • Future: delta-based checkpoints
  2. Selective Checkpointing

    • Use separate stores for hot vs. cold data
    • Only checkpoint critical data
  3. Compression

    • TensorStore supports compressed snapshots for file I/O
    • Checkpoint uses bincode (no compression)

Benchmarks

| Store Size | Checkpoint Time | Rollback Time | Memory |
|------------|-----------------|---------------|--------|
| 1K entries | ~5ms | ~3ms | ~100KB |
| 10K entries | ~50ms | ~30ms | ~1MB |
| 100K entries | ~500ms | ~300ms | ~10MB |
| 1M entries | ~5s | ~3s | ~100MB |

Related Modules

| Module | Relationship |
|--------|--------------|
| tensor_blob | Storage backend for checkpoint data |
| tensor_store | Source of snapshots and restore target |
| query_router | SQL command integration |
| neumann_shell | Interactive confirmation handling |

Limitations

  • Full snapshots only (no incremental checkpoints)
  • Single-node operation (no distributed checkpoints)
  • In-memory restore (entire snapshot loaded)
  • No automatic scheduling (manual or trigger-based only)
  • Not atomic (partial restore possible on failure)
  • No encryption (checkpoints stored in plaintext)
  • Bloom filter state not preserved (rebuilt on load if needed)

Tensor Unified

Cross-engine operations and unified entity management for Neumann. Provides a single interface for queries that span relational, graph, and vector engines with async-first design and thread safety inherited from TensorStore.

Design Principles

  1. Cross-Engine Abstraction: Single interface for operations spanning multiple engines
  2. Unified Entities: Entities can have relational fields, graph connections, and embeddings
  3. Composable Queries: Combine vector similarity with graph connectivity
  4. Async-First: All cross-engine operations support async execution
  5. Thread Safety: Inherits from underlying engines via TensorStore

Architecture

                    +------------------+
                    | UnifiedEngine    |
                    +------------------+
                           |
        +------------------+------------------+
        |                  |                  |
        v                  v                  v
+---------------+  +---------------+  +---------------+
|  Relational   |  |    Graph      |  |    Vector     |
|    Engine     |  |    Engine     |  |    Engine     |
+---------------+  +---------------+  +---------------+
        |                  |                  |
        +------------------+------------------+
                           |
                    +------v------+
                    | TensorStore |
                    +-------------+

All engines share the same TensorStore instance, enabling cross-engine queries without data duplication.

Internal Engine Coordination

sequenceDiagram
    participant Client
    participant UnifiedEngine
    participant VectorEngine
    participant GraphEngine
    participant TensorStore

    Client->>UnifiedEngine: create_entity("user:1", fields, embedding)
    UnifiedEngine->>VectorEngine: set_entity_embedding("user:1", embedding)
    VectorEngine->>TensorStore: put("user:1", TensorData{_embedding: ...})
    UnifiedEngine->>TensorStore: get("user:1")
    TensorStore-->>UnifiedEngine: TensorData
    UnifiedEngine->>TensorStore: put("user:1", TensorData{fields + _embedding})
    UnifiedEngine-->>Client: Ok(())

Key Types

| Type | Description |
|------|-------------|
| UnifiedEngine | Main entry point for cross-engine operations |
| UnifiedResult | Query result containing description and items |
| UnifiedItem | Single item with source, id, data, embedding, and score |
| UnifiedError | Error type wrapping engine-specific errors |
| FindPattern | Pattern for FIND queries (Nodes or Edges) |
| DistanceMetric | Similarity metric (Cosine, Euclidean, DotProduct) |
| EntityInput | Tuple type for batch operations: (key, fields, embedding) |
| Unified | Trait for converting engine types to UnifiedItem |
| FilterCondition | Re-exported from vector_engine for filtered search |
| FilterValue | Re-exported from vector_engine for filter values |
| VectorCollectionConfig | Re-exported from vector_engine for collection config |

UnifiedItem

#![allow(unused)]
fn main() {
pub struct UnifiedItem {
    pub source: String,                    // "relational", "graph", "vector", or combined
    pub id: String,                        // Entity key
    pub data: HashMap<String, String>,     // Entity fields
    pub embedding: Option<Vec<f32>>,       // Optional embedding
    pub score: Option<f32>,                // Similarity score if applicable
}
}

The source field indicates which engine(s) produced the result:

  • "graph" - Result from graph operations (nodes, edges)
  • "vector" - Result from vector similarity search
  • "unified" - Result from cross-engine entity retrieval
  • "vector+graph" - Result from find_similar_connected (similarity + connectivity)
  • "graph+vector" - Result from find_neighbors_by_similarity (connectivity + similarity)

UnifiedError

| Variant | Cause |
|---------|-------|
| RelationalError | Error from relational engine |
| GraphError | Error from graph engine |
| VectorError | Error from vector engine |
| NotFound | Entity not found |
| InvalidOperation | Invalid operation attempted |

Error conversion is automatic via From implementations:

#![allow(unused)]
fn main() {
impl From<graph_engine::GraphError> for UnifiedError {
    fn from(e: graph_engine::GraphError) -> Self {
        UnifiedError::GraphError(e.to_string())
    }
}

impl From<vector_engine::VectorError> for UnifiedError {
    fn from(e: vector_engine::VectorError) -> Self {
        UnifiedError::VectorError(e.to_string())
    }
}

impl From<relational_engine::RelationalError> for UnifiedError {
    fn from(e: relational_engine::RelationalError) -> Self {
        UnifiedError::RelationalError(e.to_string())
    }
}
}

Entity Storage Format

Unified entities use reserved field prefixes in TensorData to store cross-engine data within a single key-value entry:

| Field | Type | Description |
|-------|------|-------------|
| _out | Pointers(Vec<String>) | Outgoing edge keys |
| _in | Pointers(Vec<String>) | Incoming edge keys |
| _embedding | Vector(Vec<f32>) or Sparse(SparseVector) | Embedding vector |
| _label | Scalar(String) | Entity type/label |
| _type | Scalar(String) | Discriminator ("node", "edge", "row") |
| _id | Scalar(Int) | Numeric entity ID |
| _from | Scalar(String) | Edge source key |
| _to | Scalar(String) | Edge target key |
| _edge_type | Scalar(String) | Edge type |
| _directed | Scalar(Bool) | Whether edge is directed |
| _table | Scalar(String) | Table name for relational rows |

Entity Storage Example

Key: "user:alice"
TensorData:
  _embedding: Vector([0.1, 0.2, 0.3, 0.4])
  _out: Pointers(["edge:follows:1", "edge:likes:2"])
  _in: Pointers(["edge:follows:3"])
  name: Scalar(String("Alice"))
  role: Scalar(String("admin"))

Key: "edge:follows:1"
TensorData:
  _type: Scalar(String("edge"))
  _from: Scalar(String("user:alice"))
  _to: Scalar(String("user:bob"))
  _edge_type: Scalar(String("follows"))
  _directed: Scalar(Bool(true))

Sparse Vector Auto-Detection

Embeddings are automatically stored in sparse format when >50% of values are zero:

#![allow(unused)]
fn main() {
fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    // Sparse if nnz <= len/2
    nnz * 2 <= vector.len()
}
}

Initialization

#![allow(unused)]
fn main() {
use tensor_unified::UnifiedEngine;
use tensor_store::TensorStore;

// Create with new store
let engine = UnifiedEngine::new();

// Create with shared store
let store = TensorStore::new();
let engine = UnifiedEngine::with_store(store);

// Create with existing engines
let engine = UnifiedEngine::with_engines(store, relational, graph, vector);
}

Internal Structure

#![allow(unused)]
fn main() {
pub struct UnifiedEngine {
    store: TensorStore,
    relational: Arc<RelationalEngine>,
    graph: Arc<GraphEngine>,
    vector: Arc<VectorEngine>,
}
}

The Arc wrappers enable:

  • Thread-safe sharing across async tasks
  • Zero-copy cloning of the engine
  • Independent engine access when needed

Entity Operations

Creating Entities

#![allow(unused)]
fn main() {
use std::collections::HashMap;

// Create an entity with fields and optional embedding
let mut fields = HashMap::new();
fields.insert("name".to_string(), "Alice".to_string());
fields.insert("role".to_string(), "admin".to_string());

engine.create_entity(
    "user:1",
    fields,
    Some(vec![0.1, 0.2, 0.3, 0.4])  // Optional embedding
).await?;
}

Internal flow:

flowchart TD
    A[create_entity] --> B{Has embedding?}
    B -->|Yes| C[VectorEngine::set_entity_embedding]
    C --> D[Store to TensorData._embedding]
    B -->|No| E[Get existing TensorData or new]
    D --> E
    E --> F[For each field]
    F --> G[Set field as TensorValue::Scalar]
    G --> H[TensorStore::put]

Connecting Entities

#![allow(unused)]
fn main() {
// Connect entities via graph edge
let edge_id = engine.connect_entities("user:1", "user:2", "follows").await?;
}

Edge creation updates three TensorData entries:

  1. Creates new edge entry with _from, _to, _edge_type, _directed
  2. Adds edge key to source entity’s _out field
  3. Adds edge key to target entity’s _in field
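
The three updates can be sketched against a plain map standing in for TensorStore. `Store`, `Fields`, and the comma-joined edge lists are simplifications for illustration; the real store uses `Pointers` values:

```rust
use std::collections::HashMap;

type Fields = HashMap<String, String>;
struct Store(HashMap<String, Fields>);

impl Store {
    /// Simplified model of connect_entities: one edge entry plus
    /// updates to the source's _out and the target's _in lists.
    fn connect(&mut self, from: &str, to: &str, edge_type: &str) -> String {
        let edge_key = format!("edge:{edge_type}:{from}->{to}");
        // 1. Create the edge entry with its reserved fields.
        let mut edge = Fields::new();
        edge.insert("_from".into(), from.into());
        edge.insert("_to".into(), to.into());
        edge.insert("_edge_type".into(), edge_type.into());
        self.0.insert(edge_key.clone(), edge);
        // 2. Append the edge key to the source entity's _out list.
        self.append(from, "_out", &edge_key);
        // 3. Append the edge key to the target entity's _in list.
        self.append(to, "_in", &edge_key);
        edge_key
    }

    fn append(&mut self, key: &str, field: &str, value: &str) {
        let entry = self.0.entry(key.into()).or_default();
        let list = entry.entry(field.into()).or_default();
        if !list.is_empty() {
            list.push(',');
        }
        list.push_str(value);
    }
}
```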

Retrieving Entities

#![allow(unused)]
fn main() {
// Get entity with all data and embedding
let item = engine.get_entity("user:1").await?;
println!("Fields: {:?}", item.data);
println!("Embedding: {:?}", item.embedding);
}

Gotcha: Returns UnifiedError::NotFound if the entity has neither fields nor embedding.

Cross-Engine Queries

Find Similar Connected

Find entities similar to a query that are also connected via graph edges:

flowchart LR
    A[Query Entity] --> B[Get embedding]
    B --> C[Vector search top_k*2]
    C --> D[Get connected neighbors]
    D --> E[HashSet intersection]
    E --> F[Take top_k]
    F --> G[Return UnifiedItems]

#![allow(unused)]
fn main() {
// Find entities similar to query AND connected to target
let results = engine.find_similar_connected(
    "user:1",      // Query entity (uses its embedding)
    "hub:main",    // Find entities connected to this
    10             // Top-k results
).await?;
}

Algorithm details:

  1. Retrieves embedding from query_key via VectorEngine::get_entity_embedding
  2. Searches for top k*2 similar entities (over-fetches for filtering)
  3. Gets neighbors of connected_to via GraphEngine::get_entity_neighbors
  4. Builds HashSet of connected neighbors for O(1) lookup
  5. Filters similar results to only those in the neighbor set
  6. Returns top-k results with source "vector+graph"

Edge case: If query_key has no embedding, returns VectorError::NotFound.
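
The intersection and truncation steps (4-5) can be sketched in isolation; `intersect_top_k` is an illustrative name, not part of the crate:

```rust
use std::collections::HashSet;

// `similar` is the over-fetched similarity result (key, score), already
// sorted best-first; `neighbors` are the graph neighbor keys of the hub.
fn intersect_top_k(
    similar: Vec<(String, f32)>,
    neighbors: &[String],
    top_k: usize,
) -> Vec<(String, f32)> {
    // HashSet gives O(1) membership checks during filtering.
    let connected: HashSet<&str> = neighbors.iter().map(String::as_str).collect();
    similar
        .into_iter()
        .filter(|(key, _)| connected.contains(key.as_str()))
        .take(top_k)
        .collect()
}
```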

Find Similar Connected with Filter

Enhanced version that combines vector similarity, graph connectivity, and metadata filtering:

#![allow(unused)]
fn main() {
use vector_engine::{FilterCondition, FilterValue};

// Build a filter for metadata
let filter = FilterCondition::Eq(
    "category".to_string(),
    FilterValue::String("article".to_string())
);

// Find entities similar to query, connected to hub, matching filter
let results = engine.find_similar_connected_filtered(
    "user:1",      // Query entity (uses its embedding)
    "hub:main",    // Find entities connected to this
    Some(&filter), // Optional metadata filter
    10             // Top-k results
).await?;
}

Algorithm:

  1. Gets query embedding from query_key
  2. Gets connected neighbor keys from graph
  3. Builds combined filter: key IN neighbors AND user_filter
  4. Uses pre-filter strategy for high selectivity
  5. Returns filtered results with source "vector+graph"

The filtered version eliminates post-processing by pushing filters into the vector search, improving performance for selective queries.

Find Neighbors by Similarity

Find graph neighbors sorted by similarity to a query vector:

flowchart LR
    A[Entity Key] --> B[Get neighbors via graph]
    B --> C[For each neighbor]
    C --> D[Get embedding]
    D --> E{Dimension match?}
    E -->|Yes| F[Compute cosine similarity]
    E -->|No| G[Skip]
    F --> H[Collect results]
    H --> I[Sort by score desc]
    I --> J[Truncate to top_k]

#![allow(unused)]
fn main() {
// Find neighbors of an entity sorted by similarity to a vector
let results = engine.find_neighbors_by_similarity(
    "user:1",                    // Entity to get neighbors of
    &[0.1, 0.2, 0.3, 0.4],      // Query vector
    10                           // Top-k results
).await?;
}

Algorithm details:

  1. Gets all neighbors (both directions) via GraphEngine::get_entity_neighbors
  2. For each neighbor:
    • Attempts to get embedding via VectorEngine::get_entity_embedding
    • Skips if no embedding or dimension mismatch
    • Computes cosine similarity with query vector
  3. Sorts results by score descending
  4. Truncates to top-k
  5. Returns results with source "graph+vector"

Gotcha: Neighbors without embeddings are silently skipped.
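
Steps 2-4 reduce to a cosine score plus a descending sort. A minimal sketch with illustrative helper names (the crate's internal functions may differ):

```rust
// Cosine similarity between two equal-length vectors; callers are
// expected to skip dimension mismatches before reaching this point.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Sort scored neighbors by similarity descending and keep top_k.
fn rank_neighbors(mut scored: Vec<(String, f32)>, top_k: usize) -> Vec<(String, f32)> {
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(top_k);
    scored
}
```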

Unified Entity Storage

The unified entity storage methods provide a streamlined API for storing entity fields as vector metadata, eliminating double-storage overhead.

Creating Unified Entities

#![allow(unused)]
fn main() {
use std::collections::HashMap;

let mut fields = HashMap::new();
fields.insert("title".to_string(), "Introduction to Rust".to_string());
fields.insert("author".to_string(), "Alice".to_string());

// Store entity with fields as vector metadata
engine.create_entity_unified(
    "doc:1",
    fields.clone(),
    Some(vec![0.1, 0.2, 0.3, 0.4])
).await?;

// Without embedding, stores to TensorStore only
engine.create_entity_unified("doc:2", fields, None).await?;
}

When an embedding is provided, fields are stored as vector metadata alongside the embedding. This enables filtered search without requiring a separate storage lookup.

Retrieving Unified Entities

#![allow(unused)]
fn main() {
// Get entity with fields from vector metadata
let item = engine.get_entity_unified("doc:1").await?;

println!("Title: {:?}", item.data.get("title"));
println!("Embedding: {:?}", item.embedding);
}

The retrieval first attempts to load from vector metadata. If not found, it falls back to the standard TensorStore lookup.

Collection-Based Entity Organization

Collections provide type-based organization for entities, enabling scoped searches and dimension enforcement.

Creating Entities in Collections

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use tensor_unified::DistanceMetric;
use vector_engine::VectorCollectionConfig;

// Create a collection for documents
let config = VectorCollectionConfig::default()
    .with_dimension(768)
    .with_metric(DistanceMetric::Cosine);

engine.create_entity_collection("documents", config)?;

// Store entity in collection
let mut fields = HashMap::new();
fields.insert("title".to_string(), "ML Paper".to_string());

engine.create_entity_in_collection(
    "documents",
    "paper:1",
    fields,
    vec![0.1; 768]
).await?;
}

Searching in Collections

#![allow(unused)]
fn main() {
use vector_engine::FilterCondition;

// Basic search in collection
let results = engine.find_similar_in_collection(
    "documents",
    &query_embedding,
    None,  // No filter
    10
).await?;

// Filtered search in collection
let filter = FilterCondition::Eq("author".to_string(), "Alice".into());
let results = engine.find_similar_in_collection(
    "documents",
    &query_embedding,
    Some(&filter),
    10
).await?;
}

Managing Collections

#![allow(unused)]
fn main() {
// List all entity collections
let collections = engine.list_entity_collections();

// Delete a collection
engine.delete_entity_collection("documents")?;
}

Collection Isolation

Collections ensure entity isolation:

  • Each collection has its own key namespace
  • Dimension mismatches are rejected per-collection config
  • Searches only see entities within the specified collection
  • Deleting a collection removes all entities in it

Find Nodes and Edges

#![allow(unused)]
fn main() {
// Find all nodes with optional label filter
let nodes = engine.find_nodes(Some("person"), None).await?;

// Find all edges with optional type filter
let edges = engine.find_edges(Some("follows"), None).await?;

// Find with pattern and limit
let pattern = FindPattern::Nodes { label: Some("document".to_string()) };
let result = engine.find(&pattern, Some(10)).await?;
}

Find Pattern Matching Implementation

The find_nodes and find_edges methods scan the TensorStore for matching entities:

Node Scanning Algorithm

#![allow(unused)]
fn main() {
fn scan_nodes(&self, label_filter: Option<&str>) -> Result<Vec<Node>> {
    let keys = self.store.scan("node:");  // Prefix scan

    for key in keys {
        // Filter out edge lists (node:123:out, node:123:in)
        if key.contains(":out") || key.contains(":in") {
            continue;
        }

        // Parse node ID from key "node:{id}"
        if let Some(id_str) = key.strip_prefix("node:") {
            if let Ok(id) = id_str.parse::<u64>() {
                // Fetch and optionally filter by label
            }
        }
    }
}
}

Condition Matching

Conditions are evaluated against node/edge properties:

| Condition | Node Fields | Edge Fields |
|-----------|-------------|-------------|
| Eq("id", ...) | Matches node.id | Matches edge.id |
| Eq("label", ...) | Matches node.label | N/A |
| Eq("type", ...) | N/A | Matches edge.edge_type |
| Eq("edge_type", ...) | N/A | Matches edge.edge_type (alias) |
| Eq("from", ...) | N/A | Matches edge.from |
| Eq("to", ...) | N/A | Matches edge.to |
| Eq(property, ...) | Matches node.properties[property] | Matches edge.properties[property] |
| And(a, b) | Both must match | Both must match |
| Or(a, b) | Either must match | Either must match |
| Other conditions | Returns true (pass-through) | Returns true (pass-through) |

Gotcha: Conditions other than Eq, And, Or return true (not yet implemented for graph entities).
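
The table and gotcha can be modeled with a small evaluator. `Condition` and `eval` are illustrative stand-ins for the crate's condition types, with properties flattened to strings:

```rust
use std::collections::HashMap;

// Eq checks a named field, And/Or compose, and every other condition
// passes through as `true`, matching the gotcha above.
enum Condition {
    Eq(String, String),
    And(Box<Condition>, Box<Condition>),
    Or(Box<Condition>, Box<Condition>),
    Other,
}

fn eval(cond: &Condition, fields: &HashMap<String, String>) -> bool {
    match cond {
        Condition::Eq(key, value) => fields.get(key) == Some(value),
        Condition::And(a, b) => eval(a, fields) && eval(b, fields),
        Condition::Or(a, b) => eval(a, fields) || eval(b, fields),
        Condition::Other => true, // pass-through for unimplemented conditions
    }
}
```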

Batch Operations

#![allow(unused)]
fn main() {
// Store multiple embeddings
let items = vec![
    ("doc1".to_string(), vec![0.1, 0.2, 0.3]),
    ("doc2".to_string(), vec![0.4, 0.5, 0.6]),
];
let count = engine.embed_batch(items).await?;

// Create multiple entities
let entities: Vec<EntityInput> = vec![
    ("e1".to_string(), HashMap::from([("name".to_string(), "A".to_string())]), None),
    ("e2".to_string(), HashMap::from([("name".to_string(), "B".to_string())]), Some(vec![0.1, 0.2])),
];
let count = engine.create_entities_batch(entities).await?;
}

Note: Batch operations process sequentially (not parallel). Failed individual operations are counted as failures but don’t abort the batch.
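
That batch semantic (attempt each item in order, count successes, never abort) can be sketched generically; `batch_apply` is an illustrative helper, not the crate's API:

```rust
// Apply `op` to each item sequentially, returning the success count.
// A failure is counted but does not stop the remaining items.
fn batch_apply<T>(items: Vec<T>, mut op: impl FnMut(&T) -> Result<(), String>) -> usize {
    let mut ok = 0;
    for item in &items {
        if op(item).is_ok() {
            ok += 1;
        }
    }
    ok
}
```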

Unified Trait

Types implementing the Unified trait can be converted to UnifiedItem:

#![allow(unused)]
fn main() {
pub trait Unified {
    fn as_unified(&self) -> UnifiedItem;
    fn source_engine(&self) -> &'static str;
    fn unified_id(&self) -> String;
}
}

Implemented for:

  • graph_engine::Node - Converts label and properties to data fields
  • graph_engine::Edge - Converts from, to, type, and properties to data fields
  • vector_engine::SearchResult - Converts key and score

Implementation Examples

#![allow(unused)]
fn main() {
impl Unified for Node {
    fn as_unified(&self) -> UnifiedItem {
        let mut item = UnifiedItem::new("graph", self.id.to_string());
        item.set("label", &self.label);
        for (k, v) in &self.properties {
            item.set(k.clone(), format!("{:?}", v));  // Debug format for PropertyValue
        }
        item
    }

    fn source_engine(&self) -> &'static str { "graph" }
    fn unified_id(&self) -> String { self.id.to_string() }
}

impl Unified for SearchResult {
    fn as_unified(&self) -> UnifiedItem {
        UnifiedItem::new("vector", &self.key).with_score(self.score)
    }

    fn source_engine(&self) -> &'static str { "vector" }
    fn unified_id(&self) -> String { self.key.clone() }
}
}

Query Language

Cross-engine operations are exposed via the query language:

Entity Creation

-- Create entity with fields and embedding
ENTITY CREATE 'user:1' {name: 'Alice', role: 'admin'} EMBEDDING [0.1, 0.2, 0.3]

-- Create entity with fields only
ENTITY CREATE 'user:2' {name: 'Bob'}

-- Connect entities
ENTITY CONNECT 'user:1' -> 'user:2' : follows

Cross-Engine Similarity

-- Find similar entities that are also connected to a hub
SIMILAR 'query:key' CONNECTED TO 'hub:entity' LIMIT 10

-- Find neighbors sorted by similarity
NEIGHBORS 'entity:key' BY SIMILAR [0.1, 0.2, 0.3] LIMIT 10

QueryRouter Integration

QueryRouter integrates with UnifiedEngine for cross-engine operations. When created with with_shared_store(), the router automatically initializes an internal UnifiedEngine:

classDiagram
    class QueryRouter {
        -relational: Arc~RelationalEngine~
        -graph: Arc~GraphEngine~
        -vector: Arc~VectorEngine~
        -unified: Option~UnifiedEngine~
        -hnsw_index: Option~HNSWIndex~
        +with_shared_store(store) QueryRouter
        +unified() Option~UnifiedEngine~
        +find_similar_connected()
        +find_neighbors_by_similarity()
    }

    class UnifiedEngine {
        -store: TensorStore
        -relational: Arc~RelationalEngine~
        -graph: Arc~GraphEngine~
        -vector: Arc~VectorEngine~
    }

    QueryRouter --> UnifiedEngine : contains
    QueryRouter --> RelationalEngine : shares Arc
    QueryRouter --> GraphEngine : shares Arc
    QueryRouter --> VectorEngine : shares Arc
    UnifiedEngine --> RelationalEngine : shares Arc
    UnifiedEngine --> GraphEngine : shares Arc
    UnifiedEngine --> VectorEngine : shares Arc

#![allow(unused)]
fn main() {
use query_router::QueryRouter;
use tensor_store::TensorStore;

// Create router with shared store - this initializes UnifiedEngine
let store = TensorStore::new();
let router = QueryRouter::with_shared_store(store);

// Verify UnifiedEngine is available
assert!(router.unified().is_some());

// Cross-engine Rust API methods delegate to UnifiedEngine
let results = router.find_neighbors_by_similarity("entity:1", &[0.1, 0.2], 10)?;
let results = router.find_similar_connected("query:1", "hub:1", 5)?;

// Query language commands also use the integrated engines
router.execute_parsed("ENTITY CREATE 'doc:1' {title: 'Hello'} EMBEDDING [0.1, 0.2]")?;
router.execute_parsed("ENTITY CONNECT 'user:1' -> 'doc:1' : authored")?;
router.execute_parsed("SIMILAR 'query:doc' CONNECTED TO 'user:1' LIMIT 5")?;
}

HNSW Optimization Path

When QueryRouter has an HNSW index, find_similar_connected uses it instead of brute-force search:

#![allow(unused)]
fn main() {
// Use HNSW index if available, otherwise fall back to brute-force
let similar = if let Some((ref index, ref keys)) = self.hnsw_index {
    self.vector.search_with_hnsw(index, keys, &query_embedding, top_k * 2)
} else {
    self.vector.search_entities(&query_embedding, top_k * 2)
};
}

Performance

| Operation | Complexity | Notes |
|-----------|------------|-------|
| create_entity | O(1) | Single store put + optional embedding |
| connect_entities | O(1) | Three store operations (edge + 2 entity updates) |
| get_entity | O(1) | Single store get + optional embedding lookup |
| find_similar_connected | O(k log n) | HNSW search + graph intersection |
| find_similar_connected (brute) | O(n) | Linear scan when no HNSW index |
| find_similar_connected_filtered | O(m) | Pre-filter search, m = matching keys |
| create_entity_unified | O(1) | Single store with metadata |
| get_entity_unified | O(1) | Metadata lookup with fallback |
| create_entity_in_collection | O(1) | Collection-scoped store |
| find_similar_in_collection | O(c) | c = collection size |
| find_neighbors_by_similarity | O(d * k) | Neighbor fetch + k similarity computations |
| find_nodes | O(n) | Full scan with prefix filter |
| find_edges | O(e) | Full scan with prefix filter |
| embed_batch | O(b) | Sequential embedding storage |
| create_entities_batch | O(b) | Sequential entity creation |

Where:

  • n = number of entities with embeddings
  • d = average degree (number of neighbors)
  • k = top-k results requested
  • e = number of edges
  • b = batch size

Benchmarks

From tensor_unified_bench.rs:

| Operation | 10 items | 100 items | 1000 items |
|---|---|---|---|
| create_entity | ~50us | ~500us | ~5ms |
| embed_batch | ~30us | ~300us | ~3ms |
| find_nodes | ~10us | ~100us | ~1ms |
| UnifiedItem::new | ~50ns | | |
| UnifiedItem::with_data | ~200ns | | |

Thread Safety

UnifiedEngine is thread-safe via:

  • Arc<VectorEngine>, Arc<GraphEngine>, Arc<RelationalEngine>
  • All underlying engines share thread-safe TensorStore (DashMap)
  • No lock poisoning (parking_lot semantics)
impl Clone for UnifiedEngine {
    fn clone(&self) -> Self {
        Self {
            store: self.store.clone(),           // Arc<DashMap> clone
            relational: Arc::clone(&self.relational),
            graph: Arc::clone(&self.graph),
            vector: Arc::clone(&self.vector),
        }
    }
}

Safe concurrent patterns:

  • Multiple readers on same entity
  • Multiple writers on different entities
  • Mixed reads/writes (DashMap shard locking)

Gotcha: Concurrent writes to the same entity may interleave fields. Use transactions for atomicity.

Configuration

UnifiedEngine uses the configuration of its underlying engines:

  • TensorStore: Storage configuration
  • VectorEngine: HNSW index parameters, similarity metrics
  • GraphEngine: Graph traversal settings
  • RelationalEngine: Table and index configuration

Best Practices

Entity Key Naming

Use prefixed keys to distinguish entity types:

"user:123"       // User entities
"doc:456"        // Document entities
"hub:main"       // Hub/aggregate entities
"edge:follows:1" // Edge entities (auto-generated)

Embedding Dimensions

Ensure consistent embedding dimensions across entities:

// Good: All entities use 384-dimensional embeddings
engine.create_entity("doc:1", fields, Some(vec![0.0; 384])).await?;
engine.create_entity("doc:2", fields, Some(vec![0.0; 384])).await?;

// Bad: Dimension mismatch causes similarity search to skip entities
engine.create_entity("doc:1", fields, Some(vec![0.0; 384])).await?;
engine.create_entity("doc:2", fields, Some(vec![0.0; 768])).await?;  // Different dimension!

Cross-Engine Query Optimization

For find_similar_connected:

  1. Build HNSW index for large vector sets (>5000 entities)
  2. Ensure connected_to entity has edges (empty neighbors returns empty results)
  3. Note that the engine fetches top_k * 2 candidates internally to compensate for graph filtering

For find_neighbors_by_similarity:

  1. Ensure neighbors have embeddings (no embedding = skipped)
  2. Use same dimension for query vector as stored embeddings
  3. Consider degree distribution (high-degree nodes = more similarity computations)
| Module | Relationship |
|---|---|
| tensor_store | Shared storage backend, provides TensorData and fields constants |
| relational_engine | Relational data, conditions for filtering |
| graph_engine | Graph connectivity, entity edges, neighbor queries |
| vector_engine | Embeddings, similarity search, HNSW index, FilterCondition, FilterValue, VectorCollectionConfig |
| query_router | Query execution, language integration, HNSW optimization, re-exports filter types |

Dependencies

  • tensor_store: Core storage
  • relational_engine: Table operations
  • graph_engine: Graph operations
  • vector_engine: Vector search
  • tokio: Async runtime (multi-threaded)
  • futures: Async utilities
  • serde: Serialization for results and items
  • serde_json: JSON output for UnifiedResult

Example: Code Intelligence System

From examples/code_search.rs:

use std::collections::HashMap;
use tensor_unified::UnifiedEngine;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let engine = UnifiedEngine::new();

    // Store functions with embeddings representing semantic meaning
    let mut props = HashMap::new();
    props.insert("type".to_string(), "function".to_string());
    props.insert("language".to_string(), "rust".to_string());

    // "process_data" - embedding represents data processing semantics
    engine.create_entity(
        "func:process_data",
        props.clone(),
        Some(vec![1.0, 0.9, 0.0, 0.0])
    ).await?;

    // "validate_input" - embedding represents validation semantics
    engine.create_entity(
        "func:validate_input",
        props.clone(),
        Some(vec![0.0, 0.1, 0.9, 0.9])
    ).await?;

    // Create call graph relationship
    engine.connect_entities(
        "func:process_data",
        "func:validate_input",
        "CALLS"
    ).await?;

    // Find functions similar to "data processing" that call validate_input
    let results = engine.find_similar_connected(
        "func:process_data",   // Query by this function's embedding
        "func:validate_input", // Must be connected to validation
        5
    ).await?;

    for item in results {
        println!("Found: {} (Score: {:.4})", item.id, item.score.unwrap_or(0.0));
    }

    Ok(())
}

Tensor Chain Architecture

Tensor-native blockchain with semantic conflict detection, hierarchical codebook-based validation, and Tensor-Raft distributed consensus. This is the most complex module in Neumann, providing distributed transaction coordination across a cluster of nodes.

Tensor Chain treats transactions as geometric objects in embedding space. Changes are represented as delta vectors, enabling similarity-based conflict detection and automatic merging of orthogonal transactions. The module integrates Raft consensus for leader election, two-phase commit (2PC) for cross-shard transactions, SWIM gossip for failure detection, and wait-for graph analysis for deadlock detection.

Key Concepts

Raft Consensus

Tensor-Raft extends the standard Raft consensus protocol with tensor-native optimizations:

  • Similarity Fast-Path: Followers can skip full validation when block embeddings are similar (>0.95 cosine) to recent blocks from the same leader
  • Geometric Tie-Breaking: During elections with equal logs, candidates with similar state embeddings to the cluster centroid are preferred
  • Pre-Vote Phase: Prevents disruptive elections by requiring majority agreement before incrementing term
  • Automatic Heartbeat: Background task spawned on leader election maintains quorum

The leader replicates log entries containing blocks to followers. Entries are committed when a quorum (majority) acknowledges them. Committed entries are applied to the chain state machine.
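The commit rule above can be sketched as a small helper (hypothetical names, not the crate's actual API): sorting the per-node match indices in descending order, the value at position quorum - 1 is the highest log index replicated on a majority.

```rust
/// Sketch of commit-index advancement. `match_indices` is assumed to
/// include the leader's own last log index alongside the followers'
/// acknowledged indices (an assumption for illustration).
fn advance_commit_index(match_indices: &mut Vec<u64>, total_nodes: usize) -> u64 {
    let quorum = total_nodes / 2 + 1;
    // Sort descending: the quorum-th largest index is replicated on a majority.
    match_indices.sort_unstable_by(|a, b| b.cmp(a));
    match_indices[quorum - 1]
}
```

For a 3-node cluster where the leader is at index 5 and followers acknowledge indices 3 and 1, the quorum (2) has replicated up to index 3, so commit_index advances to 3.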

Raft State Machine

stateDiagram-v2
    [*] --> Follower: Node startup

    Follower --> Candidate: Election timeout
    Follower --> Follower: AppendEntries from leader
    Follower --> Follower: Higher term seen

    Candidate --> Leader: Received quorum votes
    Candidate --> Follower: Higher term seen
    Candidate --> Candidate: Election timeout (split vote)

    Leader --> Follower: Higher term seen
    Leader --> Follower: Lost quorum (heartbeat failure)

    note right of Follower
        Receives log entries
        Grants votes
        Resets election timer on heartbeat
    end note

    note right of Candidate
        Increments term
        Votes for self
        Requests votes from peers
    end note

    note right of Leader
        Proposes blocks
        Sends heartbeats
        Handles client requests
        Tracks replication progress
    end note

Pre-Vote Protocol

Pre-vote prevents disruptive elections from partitioned nodes:

Node A (partitioned, stale)              Healthy Cluster
    |                                         |
    |-- PreVote(term=5) --------------------->|
    |                                         |
    |<-- PreVoteResponse(granted=false) ------|
    |                                         |
    | Does NOT increment term                 |
    | (prevents term inflation)               |

A pre-vote is granted only if:

  1. Candidate’s term >= our term
  2. Election timeout has elapsed (no recent leader heartbeat)
  3. Candidate’s log is at least as up-to-date as ours
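The three grant conditions above combine into a single predicate. A minimal sketch with hypothetical types (the real RaftNode internals differ):

```rust
/// Simplified stand-in for the fields a pre-vote request carries.
struct PreVoteRequest {
    term: u64,
    last_log_index: u64,
    last_log_term: u64,
}

/// Grant a pre-vote only if the candidate's term is not behind ours,
/// we have not heard from a leader recently, and the candidate's log
/// is at least as up-to-date as ours (standard Raft log comparison).
fn grant_pre_vote(
    req: &PreVoteRequest,
    our_term: u64,
    our_last_index: u64,
    our_last_term: u64,
    election_timeout_elapsed: bool,
) -> bool {
    let log_up_to_date = req.last_log_term > our_last_term
        || (req.last_log_term == our_last_term && req.last_log_index >= our_last_index);
    req.term >= our_term && election_timeout_elapsed && log_up_to_date
}
```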

Log Replication Flow

sequenceDiagram
    participant C as Client
    participant L as Leader
    participant F1 as Follower 1
    participant F2 as Follower 2

    C->>L: propose(block)

    par Replicate to followers
        L->>F1: AppendEntries(entries, prev_index, commit)
        L->>F2: AppendEntries(entries, prev_index, commit)
    end

    F1->>L: AppendEntriesResponse(success, match_index)
    F2->>L: AppendEntriesResponse(success, match_index)

    Note over L: Quorum achieved (2/3)
    L->>L: Update commit_index
    L->>L: Apply to state machine

    par Notify commit
        L->>F1: AppendEntries(commit_index updated)
        L->>F2: AppendEntries(commit_index updated)
    end

    L->>C: commit_success

Quorum Calculation

Quorum requires a strict majority of voting members:

pub fn quorum_size(total_nodes: usize) -> usize {
    (total_nodes / 2) + 1
}

// Examples:
// 3 nodes: quorum = 2
// 5 nodes: quorum = 3
// 7 nodes: quorum = 4

Fast-Path Validation

When enabled, followers can skip expensive block validation for similar blocks:

pub struct FastPathValidator {
    similarity_threshold: f32,  // Default: 0.95
    min_history: usize,         // Default: 3 blocks
}

// Validation logic:
// 1. Check if we have enough history from this leader
// 2. Compute cosine similarity with recent embeddings
// 3. If similarity > threshold for all recent blocks:
//    - Skip full validation
//    - Record acceptance in stats
// 4. Otherwise: perform full validation

Two-Phase Commit (2PC)

Cross-shard distributed transactions use 2PC with delta-based conflict detection:

Phase 1 - PREPARE: Coordinator sends TxPrepareMsg to each participant shard. Participants acquire locks, compute delta embeddings, and vote Yes, No, or Conflict.

Phase 2 - COMMIT/ABORT: If all votes are Yes and cross-shard deltas are orthogonal (cosine < 0.1), coordinator sends TxCommitMsg. Otherwise, sends TxAbortMsg with retry.

Orthogonal transactions (operating on independent data dimensions) can commit in parallel without coordination, reducing contention.
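A minimal sketch of the orthogonality gate, using dense f32 slices in place of the crate's SparseVector (an assumption for illustration):

```rust
/// Cosine similarity over dense vectors; zero-magnitude inputs yield 0.0.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Cross-shard deltas may commit together only if |cosine| < 0.1,
/// i.e. the transactions touch (nearly) independent data dimensions.
fn deltas_orthogonal(d1: &[f32], d2: &[f32]) -> bool {
    cosine(d1, d2).abs() < 0.1
}
```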

2PC Coordinator State Machine

stateDiagram-v2
    [*] --> Preparing: begin()

    Preparing --> Prepared: All votes YES + deltas orthogonal
    Preparing --> Aborting: Any vote NO/Conflict
    Preparing --> Aborting: Timeout
    Preparing --> Aborting: Cross-shard conflict detected

    Prepared --> Committing: commit()
    Prepared --> Aborting: abort()

    Committing --> Committed: All ACKs received
    Committing --> Committed: Timeout (presumed commit)

    Aborting --> Aborted: All ACKs received
    Aborting --> Aborted: Timeout (presumed abort)

    Committed --> [*]
    Aborted --> [*]

2PC Participant State Machine

stateDiagram-v2
    [*] --> Idle

    Idle --> LockAcquiring: TxPrepareMsg received

    LockAcquiring --> Locked: Locks acquired
    LockAcquiring --> VoteNo: Lock conflict

    Locked --> ConflictCheck: Compute delta

    ConflictCheck --> VoteYes: No conflicts
    ConflictCheck --> VoteConflict: Semantic conflict

    VoteYes --> WaitingDecision: Send YES vote
    VoteNo --> [*]: Send NO vote
    VoteConflict --> [*]: Send CONFLICT vote

    WaitingDecision --> Committed: TxCommitMsg
    WaitingDecision --> Aborted: TxAbortMsg
    WaitingDecision --> Aborted: Timeout

    Committed --> [*]: Release locks, apply ops
    Aborted --> [*]: Release locks, rollback

Lock Ordering (Deadlock Prevention)

The coordinator follows strict lock ordering to prevent internal deadlocks:

Lock acquisition order:
1. pending                - Transaction state map
2. lock_manager.locks     - Key-level locks
3. lock_manager.tx_locks  - Per-transaction lock sets
4. pending_aborts         - Abort queue

CRITICAL: Never acquire pending_aborts while holding pending

WAL Recovery Protocol

The coordinator uses write-ahead logging for crash recovery:

// Recovery state machine:
// 1. Replay WAL to reconstruct pending transactions
// 2. For each transaction, determine recovery action:

match tx.phase {
    TxPhase::Preparing => {
        // Incomplete prepare - abort (presumed abort)
        tx.phase = TxPhase::Aborting;
    }
    TxPhase::Prepared => {
        // All YES votes recorded - check if can commit
        if all_yes_votes && deltas_orthogonal {
            tx.phase = TxPhase::Committing;
        } else {
            tx.phase = TxPhase::Aborting;
        }
    }
    TxPhase::Committing => {
        // Continue commit - presumed commit
        complete_commit(tx);
    }
    TxPhase::Aborting => {
        // Continue abort
        complete_abort(tx);
    }
}

SWIM Gossip Protocol

Scalable membership management replaces O(N) sequential pings with O(log N) epidemic propagation:

  • Peer Sampling: Select k peers per round (default: 3) using geometric routing
  • LWW-CRDT State: Last-Writer-Wins conflict resolution with Lamport timestamps
  • Suspicion Protocol: Direct ping failure triggers indirect probes via intermediaries. Suspicion timer (5s default) allows refutation before marking node as failed
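The peer-sampling step can be illustrated with a deterministic rotation. Note this is only a stand-in: the real implementation selects peers via geometric routing, not round-robin.

```rust
/// Pick `fanout` peers for one gossip round (default fanout: 3).
/// Rotating by round number guarantees every peer is eventually
/// contacted; a production sampler would randomize or route
/// geometrically instead.
fn sample_peers<'a>(peers: &'a [String], round: usize, fanout: usize) -> Vec<&'a String> {
    if peers.is_empty() {
        return Vec::new();
    }
    (0..fanout.min(peers.len()))
        .map(|i| &peers[(round * fanout + i) % peers.len()])
        .collect()
}
```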

Gossip Message Types

pub enum GossipMessage {
    /// Sync message with piggy-backed node states
    Sync {
        sender: NodeId,
        states: Vec<GossipNodeState>,
        sender_time: u64,  // Lamport timestamp
    },

    /// Suspect a node of failure
    Suspect {
        reporter: NodeId,
        suspect: NodeId,
        incarnation: u64,
    },

    /// Refute suspicion by proving aliveness
    Alive {
        node_id: NodeId,
        incarnation: u64,  // Incremented to refute
    },

    /// Indirect ping request (SWIM protocol)
    PingReq {
        origin: NodeId,
        target: NodeId,
        sequence: u64,
    },

    /// Indirect ping response
    PingAck {
        origin: NodeId,
        target: NodeId,
        sequence: u64,
        success: bool,
    },
}

LWW-CRDT State Merging

// State supersession rules:
impl GossipNodeState {
    pub fn supersedes(&self, other: &GossipNodeState) -> bool {
        // Incarnation takes precedence
        if self.incarnation != other.incarnation {
            self.incarnation > other.incarnation
        } else {
            // Same incarnation: higher timestamp wins
            self.timestamp > other.timestamp
        }
    }
}

// Merge algorithm:
pub fn merge(&mut self, incoming: &[GossipNodeState]) -> Vec<NodeId> {
    let mut changed = Vec::new();

    for state in incoming {
        match self.states.get(&state.node_id) {
            Some(existing) if state.supersedes(existing) => {
                self.states.insert(state.node_id.clone(), state.clone());
                changed.push(state.node_id.clone());
            }
            None => {
                self.states.insert(state.node_id.clone(), state.clone());
                changed.push(state.node_id.clone());
            }
            _ => {} // Existing state is newer, ignore
        }
    }

    // Sync Lamport time to max + 1
    if let Some(max_ts) = incoming.iter().map(|s| s.timestamp).max() {
        self.lamport_time = self.lamport_time.max(max_ts) + 1;
    }

    changed
}

SWIM Failure Detection Flow

sequenceDiagram
    participant A as Node A
    participant B as Node B (suspect)
    participant C as Node C (intermediary)
    participant D as Node D (intermediary)

    A->>B: Direct Ping
    Note over B: No response (timeout)

    par Indirect probes
        A->>C: PingReq(target=B)
        A->>D: PingReq(target=B)
    end

    C->>B: Ping (on behalf of A)
    D->>B: Ping (on behalf of A)

    alt B responds to C
        B->>C: Pong
        C->>A: PingAck(success=true)
        Note over A: B is healthy
    else All indirect pings fail
        C->>A: PingAck(success=false)
        D->>A: PingAck(success=false)
        Note over A: Start suspicion timer (5s)
        A->>A: Broadcast Suspect(B)

        alt B refutes within 5s
            B->>A: Alive(incarnation++)
            Note over A: Cancel suspicion
        else Timer expires
            Note over A: Mark B as Failed
        end
    end

Incarnation Number Protocol

Scenario: Node B receives Suspect about itself

B's current incarnation: 5
Suspect message incarnation: 5

B increments: incarnation = 6
B broadcasts: Alive { node_id: B, incarnation: 6 }

All nodes receiving Alive update B's state:
- incarnation: 6
- health: Healthy
- timestamp: <lamport_time++>
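The refutation rule in the scenario above can be sketched as follows (the struct is a hypothetical simplification of GossipNodeState):

```rust
/// Simplified stand-in for a node's own gossip state.
struct NodeState {
    incarnation: u64,
    healthy: bool,
}

/// On receiving a Suspect message about ourselves whose incarnation is
/// at least ours, bump past it so the Alive broadcast supersedes the
/// suspicion under the LWW rules. Returns the incarnation to broadcast.
fn refute_suspicion(our: &mut NodeState, suspect_incarnation: u64) -> u64 {
    if suspect_incarnation >= our.incarnation {
        our.incarnation = suspect_incarnation + 1;
    }
    our.healthy = true;
    our.incarnation
}
```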

Deadlock Detection

Wait-for graph tracks transaction dependencies for cycle detection:

  1. Edge A -> B added when transaction A blocks waiting for B to release locks
  2. Periodic DFS traversal detects cycles (deadlocks)
  3. Victim selected based on policy (youngest, oldest, lowest priority, or most locks)
  4. Victim transaction aborted to break the cycle

Wait-For Graph Structure

pub struct WaitForGraph {
    /// Maps tx_id -> set of tx_ids it is waiting for
    edges: HashMap<u64, HashSet<u64>>,

    /// Reverse edges for O(1) removal: holder -> waiters
    reverse_edges: HashMap<u64, HashSet<u64>>,

    /// Timestamp when wait started (for victim selection)
    wait_started: HashMap<u64, EpochMillis>,

    /// Priority values (lower = higher priority)
    priorities: HashMap<u64, u32>,
}

DFS Cycle Detection Algorithm

fn dfs_detect(
    &self,
    node: u64,
    edges: &HashMap<u64, HashSet<u64>>,
    visited: &mut HashSet<u64>,
    rec_stack: &mut HashSet<u64>,  // Current recursion path
    path: &mut Vec<u64>,           // Explicit path for extraction
    cycles: &mut Vec<Vec<u64>>,
) {
    visited.insert(node);
    rec_stack.insert(node);
    path.push(node);

    if let Some(neighbors) = edges.get(&node) {
        for &neighbor in neighbors {
            if !visited.contains(&neighbor) {
                // Continue DFS on unvisited
                self.dfs_detect(neighbor, edges, visited, rec_stack, path, cycles);
            } else if rec_stack.contains(&neighbor) {
                // Back-edge to ancestor = cycle found!
                if let Some(cycle_start) = path.iter().position(|&n| n == neighbor) {
                    cycles.push(path[cycle_start..].to_vec());
                }
            }
        }
    }

    path.pop();
    rec_stack.remove(&node);
}

Victim Selection Policies

| Policy | Selection Criteria | Trade-off |
|---|---|---|
| Youngest | Most recent wait start (highest timestamp) | Minimizes wasted work, may starve long transactions |
| Oldest | Earliest wait start (lowest timestamp) | Prevents starvation, wastes more completed work |
| LowestPriority | Highest priority value | Business-rule based, requires priority assignment |
| MostLocks | Transaction holding most locks | Maximizes freed resources, may abort complex transactions |
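As an illustration, the Youngest policy from the table reduces to a max over wait-start timestamps (simplified stand-in types for the crate's WaitForGraph fields):

```rust
use std::collections::HashMap;

/// Pick the victim under the Youngest policy: abort the transaction in
/// the cycle whose wait started most recently (highest timestamp).
/// Panics if the cycle is empty, which cycle detection never produces.
fn select_victim_youngest(cycle: &[u64], wait_started: &HashMap<u64, u64>) -> u64 {
    *cycle
        .iter()
        .max_by_key(|&&tx| wait_started.get(&tx).copied().unwrap_or(0))
        .expect("cycle is non-empty")
}
```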

Ed25519 Signing and Identity

Cryptographic identity binding ensures message authenticity and enables geometric routing:

Identity Generation and NodeId Derivation

pub struct Identity {
    signing_key: SigningKey,  // Ed25519 private key (zeroized on drop)
}

impl Identity {
    pub fn generate() -> Self {
        let signing_key = SigningKey::generate(&mut OsRng);
        Self { signing_key }
    }

    /// NodeId = BLAKE2b-128(domain || public_key)
    /// 16 bytes = 32 hex characters
    pub fn node_id(&self) -> NodeId {
        let mut hasher = Blake2b::<U16>::new();
        hasher.update(b"neumann_node_id_v1");
        hasher.update(self.signing_key.verifying_key().as_bytes());
        hex::encode(hasher.finalize())
    }

    /// Embedding = BLAKE2b-512(domain || public_key) -> 16 f32 coords
    /// Normalized to [-1, 1] for geometric operations
    pub fn to_embedding(&self) -> SparseVector {
        let mut hasher = Blake2b::<U64>::new();
        hasher.update(b"neumann_node_embedding_v1");
        hasher.update(self.signing_key.verifying_key().as_bytes());
        let hash = hasher.finalize();

        // 64 bytes -> 16 f32 coordinates
        let coords: Vec<f32> = hash.chunks(4)
            .map(|c| {
                let bits = u32::from_le_bytes([c[0], c[1], c[2], c[3]]);
                (bits as f64 / u32::MAX as f64 * 2.0 - 1.0) as f32
            })
            .collect();

        SparseVector::from_dense(&coords)
    }
}

Signed Message Envelope

pub struct SignedMessage {
    pub sender: NodeId,           // Derived from public key
    pub public_key: [u8; 32],     // Ed25519 verifying key
    pub payload: Vec<u8>,         // Message content
    pub signature: Vec<u8>,       // 64-byte Ed25519 signature
    pub sequence: u64,            // Replay protection
    pub timestamp_ms: u64,        // Freshness check
}

// Signature covers: sender || sequence || timestamp || payload
// This binds identity, ordering, and content together

Replay Protection

pub struct SequenceTracker {
    sequences: DashMap<NodeId, (u64, Instant)>,
    config: SequenceTrackerConfig,
}

impl SequenceTracker {
    pub fn check_and_record(
        &self,
        sender: &NodeId,
        sequence: u64,
        timestamp_ms: u64,
    ) -> Result<()> {
        // (now_ms = current wall-clock time in ms, now = Instant::now();
        // both elided here for brevity)

        // 1. Reject messages from the future (allow 1 min clock skew)
        if timestamp_ms > now_ms + 60_000 {
            return Err("message timestamp is in the future");
        }

        // 2. Reject stale messages (default: 5 min max age)
        if now_ms > timestamp_ms + self.config.max_age_ms {
            return Err("message too old");
        }

        // 3. Check sequence number is strictly increasing
        let entry = self.sequences.entry(sender.clone()).or_insert((0, now));
        if sequence <= entry.0 {
            return Err("replay detected: sequence <= last seen");
        }

        *entry = (sequence, now);
        Ok(())
    }
}

Message Validation Pipeline

All incoming messages pass through validation before processing:

flowchart TB
    subgraph Validation["Message Validation Pipeline"]
        Input["Incoming Message"]
        NodeIdCheck["Validate NodeId Format"]
        TypeDispatch["Dispatch by Type"]

        TermCheck["Validate Term Bounds"]
        ShardCheck["Validate Shard ID"]
        TimeoutCheck["Validate Timeout"]
        EmbeddingCheck["Validate Embedding"]
        SignatureCheck["Validate Signature"]

        Accept["Accept Message"]
        Reject["Reject with Error"]
    end

    Input --> NodeIdCheck
    NodeIdCheck -->|Invalid| Reject
    NodeIdCheck -->|Valid| TypeDispatch

    TypeDispatch -->|Raft| TermCheck
    TypeDispatch -->|2PC| ShardCheck
    TypeDispatch -->|Signed| SignatureCheck

    TermCheck -->|Invalid| Reject
    TermCheck -->|Valid| EmbeddingCheck

    ShardCheck -->|Invalid| Reject
    ShardCheck -->|Valid| TimeoutCheck

    TimeoutCheck -->|Invalid| Reject
    TimeoutCheck -->|Valid| EmbeddingCheck

    EmbeddingCheck -->|Invalid| Reject
    EmbeddingCheck -->|Valid| Accept

    SignatureCheck -->|Invalid| Reject
    SignatureCheck -->|Valid| Accept

Embedding Validation

pub struct EmbeddingValidator {
    max_dimension: usize,    // Default: 65,536
    max_magnitude: f32,      // Default: 1,000,000
}

impl EmbeddingValidator {
    pub fn validate(&self, embedding: &SparseVector, field: &str) -> Result<()> {
        // 1. Dimension bounds
        if embedding.dimension() == 0 {
            return Err("dimension cannot be zero");
        }
        if embedding.dimension() > self.max_dimension {
            return Err("dimension exceeds maximum");
        }

        // 2. NaN/Inf detection (prevents computation errors)
        for (i, value) in embedding.values().iter().enumerate() {
            if value.is_nan() {
                return Err(format!("NaN value at position {}", i));
            }
            if value.is_infinite() {
                return Err(format!("infinite value at position {}", i));
            }
        }

        // 3. Magnitude bounds (prevents DoS via huge vectors)
        if embedding.magnitude() > self.max_magnitude {
            return Err("magnitude exceeds maximum");
        }

        // 4. Position validity (sorted, within bounds)
        let positions = embedding.positions();
        for (i, &pos) in positions.iter().enumerate() {
            if pos as usize >= embedding.dimension() {
                return Err("position out of bounds");
            }
            if i > 0 && positions[i - 1] >= pos {
                return Err("positions not strictly sorted");
            }
        }

        Ok(())
    }
}

Semantic Conflict Detection

The consensus manager uses hybrid detection combining angular and structural similarity:

Conflict Classification Algorithm

pub fn detect_conflict(&self, d1: &DeltaVector, d2: &DeltaVector) -> ConflictResult {
    let cosine = d1.cosine_similarity(d2);
    let jaccard = d1.structural_similarity(d2);  // Jaccard index
    let overlapping_keys = d1.overlapping_keys(d2);
    let all_keys_overlap = overlapping_keys.len() == d1.affected_keys.len()
        && overlapping_keys.len() == d2.affected_keys.len();

    // Classification hierarchy:
    let (class, action) = if cosine >= 0.99 && all_keys_overlap {
        // Identical: same direction, same keys
        (ConflictClass::Identical, MergeAction::Deduplicate)

    } else if cosine <= -0.95 && all_keys_overlap {
        // Opposite: cancel out (A + (-A) = 0)
        (ConflictClass::Opposite, MergeAction::Cancel)

    } else if cosine.abs() < 0.1 && jaccard < 0.5 {
        // Truly orthogonal: different directions AND different positions
        (ConflictClass::Orthogonal, MergeAction::VectorAdd)

    } else if cosine >= 0.7 {
        // Angular conflict: pointing same direction
        (ConflictClass::Conflicting, MergeAction::Reject)

    } else if jaccard >= 0.5 {
        // Structural conflict: same positions modified
        // Catches conflicts that cosine misses
        (ConflictClass::Conflicting, MergeAction::Reject)

    } else if !overlapping_keys.is_empty() {
        // Key overlap without structural/angular conflict
        (ConflictClass::Ambiguous, MergeAction::Reject)

    } else {
        // Low conflict: merge with weighted average
        (ConflictClass::LowConflict, MergeAction::WeightedAverage {
            weight1: 50, weight2: 50
        })
    };

    ConflictResult { class, cosine, jaccard, overlapping_keys, action, .. }
}

Merge Operations

impl DeltaVector {
    /// Vector addition for orthogonal deltas
    pub fn add(&self, other: &DeltaVector) -> DeltaVector {
        let delta = self.delta.add(&other.delta);
        let keys = self.affected_keys.union(&other.affected_keys).cloned().collect();
        DeltaVector::from_sparse(delta, keys, 0)
    }

    /// Weighted average for low-conflict deltas
    pub fn weighted_average(&self, other: &DeltaVector, w1: f32, w2: f32) -> DeltaVector {
        let total = w1 + w2;
        if total == 0.0 {
            return DeltaVector::zero(0);
        }
        let delta = self.delta.weighted_average(&other.delta, w1, w2);
        let keys = self.affected_keys.union(&other.affected_keys).cloned().collect();
        DeltaVector::from_sparse(delta, keys, 0)
    }

    /// Project out conflicting component
    pub fn project_non_conflicting(&self, conflict_direction: &SparseVector) -> DeltaVector {
        let delta = self.delta.project_orthogonal(conflict_direction);
        DeltaVector::from_sparse(delta, self.affected_keys.clone(), self.tx_id)
    }
}

Types Reference

Core Types

| Type | Module | Description |
|---|---|---|
| TensorChain | lib.rs | Main API for chain operations, transaction management |
| Block | block.rs | Block structure with header, transactions, signatures |
| BlockHeader | block.rs | Height, prev_hash, delta_embedding, quantized_codes |
| Transaction | block.rs | Put, Delete, Update operations |
| ChainConfig | lib.rs | Node ID, max transactions, conflict threshold, auto-merge |
| ChainError | error.rs | Error types for all chain operations |
| ChainMetrics | lib.rs | Aggregated metrics from all components |

Consensus Types

| Type | Module | Description |
|---|---|---|
| RaftNode | raft.rs | Raft state machine with leader election, log replication |
| RaftState | raft.rs | Follower, Candidate, or Leader |
| RaftConfig | raft.rs | Election timeout, heartbeat interval, fast-path settings |
| RaftStats | raft.rs | Fast-path acceptance, heartbeat timing, quorum tracking |
| QuorumTracker | raft.rs | Tracks heartbeat responses to detect quorum loss |
| SnapshotMetadata | raft.rs | Log compaction point with hash and membership config |
| LogEntry | network.rs | Raft log entry with term, index, and data |
| ConsensusManager | consensus.rs | Semantic conflict detection and transaction merging |
| DeltaVector | consensus.rs | Sparse delta embedding with affected keys |
| ConflictClass | consensus.rs | Orthogonal, LowConflict, Ambiguous, Conflicting, Identical, Opposite |
| FastPathValidator | validation.rs | Block similarity validation for fast-path acceptance |
| FastPathState | raft.rs | Per-leader embedding history for fast-path |
| TransferState | raft.rs | Active leadership transfer tracking |
| HeartbeatStats | raft.rs | Heartbeat success/failure counters |

Distributed Transaction Types

| Type | Module | Description |
|---|---|---|
| DistributedTxCoordinator | distributed_tx.rs | 2PC coordinator with timeout and retry |
| DistributedTransaction | distributed_tx.rs | Transaction spanning multiple shards |
| TxPhase | distributed_tx.rs | Preparing, Prepared, Committing, Committed, Aborting, Aborted |
| PrepareVote | distributed_tx.rs | Yes (with lock handle), No (with reason), Conflict |
| LockManager | distributed_tx.rs | Key-level locking for transaction isolation |
| KeyLock | distributed_tx.rs | Lock on a key with timeout and handle |
| TxWal | tx_wal.rs | Write-ahead log for crash recovery |
| TxWalEntry | tx_wal.rs | WAL entry types: TxBegin, PrepareVote, PhaseChange, TxComplete |
| TxRecoveryState | tx_wal.rs | Reconstructed state from WAL replay |
| PrepareRequest | distributed_tx.rs | Request to prepare a transaction on a shard |
| CommitRequest | distributed_tx.rs | Request to commit a prepared transaction |
| AbortRequest | distributed_tx.rs | Request to abort a transaction |
| CoordinatorState | distributed_tx.rs | Serializable coordinator state for persistence |
| ParticipantState | distributed_tx.rs | Serializable participant state for persistence |

Gossip Types

| Type | Module | Description |
|---|---|---|
| GossipMembershipManager | gossip.rs | SWIM-style gossip with signing support |
| GossipConfig | gossip.rs | Fanout, interval, suspicion timeout, signature requirements |
| GossipMessage | gossip.rs | Sync, Suspect, Alive, PingReq, PingAck |
| GossipNodeState | gossip.rs | Node health, Lamport timestamp, incarnation |
| LWWMembershipState | gossip.rs | CRDT for conflict-free state merging |
| PendingSuspicion | gossip.rs | Suspicion timer tracking |
| HealProgress | gossip.rs | Recovery tracking for partitioned nodes |
| SignedGossipMessage | signing.rs | Gossip message with Ed25519 signature |

Deadlock Detection Types

| Type | Module | Description |
|---|---|---|
| DeadlockDetector | deadlock.rs | Cycle detection with configurable victim selection |
| WaitForGraph | deadlock.rs | Directed graph of transaction dependencies |
| DeadlockInfo | deadlock.rs | Detected cycle with selected victim |
| VictimSelectionPolicy | deadlock.rs | Youngest, Oldest, LowestPriority, MostLocks |
| DeadlockStats | deadlock.rs | Detection timing and cycle length statistics |
| WaitInfo | deadlock.rs | Lock conflict information for wait-graph edges |

Identity and Signing Types

| Type | Module | Description |
|---|---|---|
| Identity | signing.rs | Ed25519 private key (zeroized on drop) |
| PublicIdentity | signing.rs | Ed25519 public key for verification |
| SignedMessage | signing.rs | Message envelope with signature and replay protection |
| ValidatorRegistry | signing.rs | Registry of known validator public keys |
| SequenceTracker | signing.rs | Replay attack detection via sequence numbers |
| SequenceTrackerConfig | signing.rs | Max age, max entries, cleanup interval |

Message Validation Types

| Type | Module | Description |
| --- | --- | --- |
| MessageValidationConfig | message_validation.rs | Bounds for DoS prevention |
| CompositeValidator | message_validation.rs | Validates all message types |
| EmbeddingValidator | message_validation.rs | Checks dimension, magnitude, NaN/Inf |
| MessageValidator | message_validation.rs | Trait for pluggable validation |

Architecture Diagram

flowchart TB
    subgraph Client["Client Layer"]
        TensorChain["TensorChain API"]
        TransactionWorkspace["Transaction Workspace"]
    end

    subgraph Consensus["Consensus Layer"]
        RaftNode["Raft Node"]
        ConsensusManager["Consensus Manager"]
        FastPath["Fast-Path Validator"]
    end

    subgraph Network["Network Layer"]
        Transport["Transport Trait"]
        TcpTransport["TCP Transport"]
        MemoryTransport["Memory Transport"]
        MessageValidator["Message Validator"]
    end

    subgraph Membership["Membership Layer"]
        GossipManager["Gossip Manager"]
        MembershipManager["Membership Manager"]
        GeometricMembership["Geometric Membership"]
    end

    subgraph DistTx["Distributed Transactions"]
        Coordinator["2PC Coordinator"]
        LockManager["Lock Manager"]
        DeadlockDetector["Deadlock Detector"]
        TxWal["Transaction WAL"]
    end

    subgraph Storage["Storage Layer"]
        Chain["Chain (Graph Engine)"]
        Codebook["Codebook Manager"]
        RaftWal["Raft WAL"]
    end

    TensorChain --> TransactionWorkspace
    TensorChain --> ConsensusManager
    TensorChain --> Codebook

    TransactionWorkspace --> Chain

    RaftNode --> Transport
    RaftNode --> ConsensusManager
    RaftNode --> FastPath
    RaftNode --> RaftWal

    Transport --> TcpTransport
    Transport --> MemoryTransport
    TcpTransport --> MessageValidator

    GossipManager --> Transport
    GossipManager --> MembershipManager
    MembershipManager --> GeometricMembership

    Coordinator --> LockManager
    Coordinator --> DeadlockDetector
    Coordinator --> Transport
    Coordinator --> TxWal

    LockManager --> DeadlockDetector

    Chain --> Codebook

Subsystems

Consensus Subsystem

The Raft consensus implementation provides strong consistency guarantees:

State Machine:

  • Follower: Receives AppendEntries from leader, grants votes
  • Candidate: Requests votes after election timeout
  • Leader: Proposes blocks, sends heartbeats, handles client requests

Log Replication:

Leader:  propose(block) -> AppendEntries to followers
                        -> Wait for quorum acknowledgment
                        -> Update commit_index
                        -> Apply to state machine

Fast-Path Validation: When enabled and the block embedding's similarity to the previous block exceeds the threshold (default 0.95), followers skip full validation. This optimization assumes semantically similar blocks from the same leader are likely valid.
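As a rough sketch (not the crate's actual API), the follower-side decision reduces to a cosine-similarity check against the configured threshold; `take_fast_path` and the embedding arguments are illustrative names:

```rust
/// Cosine similarity between two equal-length embeddings.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Skip full validation when the new block is semantically close to the last one.
fn take_fast_path(last: &[f32], candidate: &[f32], threshold: f32) -> bool {
    cosine(last, candidate) >= threshold
}

fn main() {
    // Near-duplicate embedding: fast path taken at the default 0.95 threshold.
    assert!(take_fast_path(&[1.0, 0.0], &[0.99, 0.05], 0.95));
    // Orthogonal embedding: falls back to full validation.
    assert!(!take_fast_path(&[1.0, 0.0], &[0.0, 1.0], 0.95));
}
```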

Log Compaction: After snapshot_threshold entries (default 10,000), a snapshot captures the state machine at the commit point. Entries before the snapshot can be truncated, keeping only snapshot_trailing_logs entries for followers catching up.
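The retention arithmetic can be sketched as follows; `first_retained_index` is a hypothetical helper, not a function from the crate:

```rust
/// Index of the first log entry kept after compaction: everything up to the
/// snapshot point is truncated except `trailing` entries for slow followers.
fn first_retained_index(snapshot_index: u64, trailing: u64) -> u64 {
    snapshot_index.saturating_sub(trailing) + 1
}

fn main() {
    // With the defaults (threshold 10,000, trailing 100) a snapshot at 10,000
    // leaves entries 9,901..=10,000 in the log.
    assert_eq!(first_retained_index(10_000, 100), 9_901);
}
```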

Distributed Transactions Subsystem

Cross-shard coordination uses two-phase commit with tensor-native conflict detection:

Phase 1 - Prepare:

Coordinator                    Participant (per shard)
    |                                  |
    |--- TxPrepareMsg --------------->|
    |    (ops, delta_embedding)        |
    |                                  |-- acquire locks
    |                                  |-- compute local delta
    |                                  |-- check conflicts
    |<--- TxPrepareResponse ----------|
    |     (Yes/No/Conflict)            |

Phase 2 - Commit or Abort:

If all Yes AND deltas orthogonal:
    |--- TxCommitMsg ---------------->| -- release locks, apply ops
    |<--- TxAckMsg -------------------|

Otherwise:
    |--- TxAbortMsg ----------------->| -- release locks, rollback
    |<--- TxAckMsg -------------------|
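The coordinator's commit rule is the conjunction of the two conditions above; this is a hedged sketch with illustrative types, not the `distributed_tx.rs` API:

```rust
#[derive(PartialEq)]
enum Vote { Yes, No, Conflict }

/// Commit only if every participant voted Yes and the deltas are orthogonal.
fn should_commit(votes: &[Vote], deltas_orthogonal: bool) -> bool {
    deltas_orthogonal && votes.iter().all(|v| *v == Vote::Yes)
}

fn main() {
    assert!(should_commit(&[Vote::Yes, Vote::Yes], true));
    assert!(!should_commit(&[Vote::Yes, Vote::Conflict], true)); // any non-Yes vote aborts
    assert!(!should_commit(&[Vote::Yes, Vote::Yes], false));     // non-orthogonal deltas abort
}
```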

Conflict Detection: Uses hybrid detection combining cosine similarity (angular conflict) and Jaccard index (structural conflict):

| Cosine | Jaccard | Classification | Action |
| --- | --- | --- | --- |
| < 0.1 | < 0.5 | Orthogonal | Auto-merge (vector add) |
| 0.1-0.7 | < 0.5 | LowConflict | Weighted merge |
| >= 0.7 | any | Conflicting | Reject |
| any | >= 0.5 | Conflicting | Reject (structural) |
| >= 0.99 | all keys | Identical | Deduplicate |
| <= -0.95 | all keys | Opposite | Cancel (no-op) |
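The decision table can be read as a small classifier. The thresholds below follow the table, but the function and enum names are illustrative, not the crate's API:

```rust
#[derive(Debug, PartialEq)]
enum Classification { Orthogonal, LowConflict, Conflicting, Identical, Opposite }

/// `cos` is the cosine similarity of the two delta embeddings; `jac` is the
/// Jaccard index of their touched key sets (1.0 means identical key sets).
fn classify(cos: f32, jac: f32) -> Classification {
    if cos >= 0.99 && jac >= 1.0 { Classification::Identical }
    else if cos <= -0.95 && jac >= 1.0 { Classification::Opposite }
    else if cos >= 0.7 || jac >= 0.5 { Classification::Conflicting }
    else if cos < 0.1 { Classification::Orthogonal }
    else { Classification::LowConflict }
}

fn main() {
    assert_eq!(classify(0.05, 0.2), Classification::Orthogonal);  // auto-merge
    assert_eq!(classify(0.4, 0.3), Classification::LowConflict);  // weighted merge
    assert_eq!(classify(0.8, 0.0), Classification::Conflicting);  // angular reject
    assert_eq!(classify(0.05, 0.6), Classification::Conflicting); // structural reject
}
```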

Gossip Protocol Subsystem

SWIM-style failure detection with LWW-CRDT state:

Gossip Round:

1. Select k peers (fanout=3) using geometric routing
2. Send Sync message with piggybacked node states
3. Merge received states (higher incarnation wins)
4. Update Lamport time
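Steps 3 and 4 can be sketched as a merge function (illustrative types, not `gossip.rs` itself): higher incarnation wins, and the local Lamport clock advances past anything observed:

```rust
#[derive(Clone)]
struct NodeState { incarnation: u64, lamport: u64 }

/// LWW-style merge: adopt the remote state only if its incarnation is newer,
/// then bump the local Lamport clock to max(local, remote) + 1.
fn merge(local: &mut NodeState, remote: &NodeState, local_time: &mut u64) {
    if remote.incarnation > local.incarnation {
        *local = remote.clone();
    }
    *local_time = (*local_time).max(remote.lamport) + 1;
}

fn main() {
    let mut local = NodeState { incarnation: 3, lamport: 2 };
    let mut time = 2;
    let remote = NodeState { incarnation: 5, lamport: 7 };
    merge(&mut local, &remote, &mut time);
    assert_eq!(local.incarnation, 5); // newer incarnation wins
    assert_eq!(time, 8);              // max(2, 7) + 1
}
```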

Failure Detection:

Direct ping failed
    |
    v
Send PingReq to k intermediaries
    |
    v
All indirect pings failed?
    |-- No --> Mark healthy
    |-- Yes --> Start suspicion timer (5s)
                    |
                    v
                Timer expired without Alive?
                    |-- No --> Mark healthy (refuted)
                    |-- Yes --> Mark failed

Incarnation Numbers: When a node receives a Suspect about itself, it increments its incarnation and broadcasts Alive to refute the suspicion.

Deadlock Detection Subsystem

Wait-for graph analysis for cycle detection:

Graph Structure:

Edge: waiter_tx -> holder_tx
Meaning: waiter is blocked waiting for holder to release locks

Detection Algorithm (Tarjan’s DFS):

1. For each unvisited node, start DFS
2. Track recursion stack for back-edge detection
3. Back-edge to ancestor = cycle found
4. Extract cycle path for victim selection
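The steps above amount to a back-edge DFS over the wait-for graph, under the edge convention waiter -> holder; this is a teaching sketch, not the `deadlock.rs` implementation:

```rust
use std::collections::HashMap;

/// Returns true if any transaction is (transitively) waiting on itself.
fn has_deadlock(waits_for: &HashMap<u64, Vec<u64>>) -> bool {
    fn dfs(tx: u64, g: &HashMap<u64, Vec<u64>>, stack: &mut Vec<u64>, done: &mut Vec<u64>) -> bool {
        if stack.contains(&tx) { return true; } // back-edge to an ancestor: cycle
        if done.contains(&tx) { return false; }
        stack.push(tx);
        for &holder in g.get(&tx).into_iter().flatten() {
            if dfs(holder, g, stack, done) { return true; }
        }
        stack.pop();
        done.push(tx);
        false
    }
    let (mut stack, mut done) = (Vec::new(), Vec::new());
    waits_for.keys().any(|&tx| dfs(tx, waits_for, &mut stack, &mut done))
}

fn main() {
    let cycle = HashMap::from([(1, vec![2]), (2, vec![1])]); // 1 waits on 2, 2 waits on 1
    let chain = HashMap::from([(1, vec![2]), (2, vec![])]);
    assert!(has_deadlock(&cycle));
    assert!(!has_deadlock(&chain));
}
```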

Victim Selection Policies:

  • Youngest: Abort most recent transaction (minimize wasted work)
  • Oldest: Abort earliest transaction (prevent starvation)
  • LowestPriority: Abort transaction with highest priority value
  • MostLocks: Abort transaction holding most locks (minimize cascade)

Configuration Options

RaftConfig

| Field | Default | Description |
| --- | --- | --- |
| election_timeout | (150, 300) | Random timeout range in ms |
| heartbeat_interval | 50 | Heartbeat interval in ms |
| similarity_threshold | 0.95 | Fast-path similarity threshold |
| enable_fast_path | true | Enable fast-path validation |
| enable_pre_vote | true | Enable pre-vote phase |
| enable_geometric_tiebreak | true | Enable geometric tie-breaking |
| geometric_tiebreak_threshold | 0.3 | Minimum similarity for tiebreak |
| snapshot_threshold | 10,000 | Entries before compaction |
| snapshot_trailing_logs | 100 | Entries to keep after snapshot |
| snapshot_chunk_size | 1MB | Chunk size for snapshot transfer |
| transfer_timeout_ms | 1,000 | Leadership transfer timeout |
| compaction_check_interval | 10 | Ticks between compaction checks |
| compaction_cooldown_ms | 60,000 | Minimum time between compactions |
| snapshot_max_memory | 256MB | Max memory for snapshot buffering |
| auto_heartbeat | true | Spawn heartbeat task on leader election |
| max_heartbeat_failures | 3 | Failures before logging warning |

DistributedTxConfig

| Field | Default | Description |
| --- | --- | --- |
| max_concurrent | 100 | Maximum concurrent transactions |
| prepare_timeout_ms | 5,000 | Prepare phase timeout |
| commit_timeout_ms | 10,000 | Commit phase timeout |
| orthogonal_threshold | 0.1 | Cosine threshold for orthogonality |
| optimistic_locking | true | Enable semantic conflict detection |

GossipConfig

| Field | Default | Description |
| --- | --- | --- |
| fanout | 3 | Peers per gossip round |
| gossip_interval_ms | 200 | Interval between rounds |
| suspicion_timeout_ms | 5,000 | Time before failure declaration |
| max_states_per_message | 20 | State limit per message |
| geometric_routing | true | Use embedding-based peer selection |
| indirect_ping_count | 3 | Indirect pings on direct failure |
| indirect_ping_timeout_ms | 500 | Timeout for indirect pings |
| require_signatures | false | Require Ed25519 signatures |
| max_message_age_ms | 300,000 | Maximum signed message age |

DeadlockDetectorConfig

| Field | Default | Description |
| --- | --- | --- |
| enabled | true | Enable deadlock detection |
| detection_interval_ms | 100 | Detection cycle interval |
| victim_policy | Youngest | Victim selection policy |
| max_cycle_length | 100 | Maximum detectable cycle length |
| auto_abort_victim | true | Automatically abort victim |

MessageValidationConfig

| Field | Default | Description |
| --- | --- | --- |
| enabled | true | Enable validation |
| max_term | u64::MAX - 1 | Prevent overflow attacks |
| max_shard_id | 65,536 | Bound shard addressing |
| max_tx_timeout_ms | 300,000 | Maximum transaction timeout |
| max_node_id_len | 256 | Maximum node ID length |
| max_key_len | 4,096 | Maximum key length |
| max_embedding_dimension | 65,536 | Prevent huge allocations |
| max_embedding_magnitude | 1,000,000 | Detect invalid values |
| max_query_len | 1MB | Maximum query string length |
| max_message_age_ms | 300,000 | Reject stale/replayed messages |
| max_blocks_per_request | 1,000 | Limit block range requests |
| max_snapshot_chunk_size | 10MB | Limit snapshot chunk size |

Edge Cases and Gotchas

Raft Edge Cases

  1. Split Vote: When multiple candidates split the vote evenly, election timeout triggers new election. Randomized timeouts (150-300ms) reduce collision probability.

  2. Network Partition: During partition, minority side cannot commit (lacks quorum). Pre-vote prevents term inflation when partition heals.

  3. Stale Leader: A partitioned leader may not know it lost leadership. Quorum tracker detects heartbeat failures and steps down.

  4. Log Divergence: Followers with divergent logs are overwritten by leader’s log (consistency > availability).

  5. Snapshot During Election: Snapshot transfer continues even if leadership changes. New leader may need to resend snapshot.

2PC Edge Cases

  1. Coordinator Failure After Prepare: Participants holding locks may timeout. WAL recovery allows new coordinator to resume.

  2. Participant Failure: Coordinator times out waiting for vote. Transaction aborts, participant recovers from WAL on restart.

  3. Network Partition Between Phases: Commit messages may not reach all participants. Retry loop ensures eventual delivery.

  4. Lock Timeout vs Transaction Timeout: Lock timeout (30s) should exceed transaction timeout (5s) to prevent premature lock release.

  5. Orphaned Locks: Locks from crashed transactions are cleaned up by periodic cleanup_expired() or WAL recovery.

Gossip Edge Cases

  1. Incarnation Overflow: Theoretically possible with u64, but requires 2^64 restarts. Practically impossible.

  2. Clock Skew: Lamport timestamps are logical, not wall-clock. Sync messages update local Lamport time to max(local, remote) + 1.

  3. Signature Replay: Sequence numbers and timestamp freshness checks prevent replaying old signed messages.

  4. Rapid Restart: Node restarting rapidly may have lower incarnation than suspected state. New incarnation on restart resolves this.

Conflict Detection Edge Cases

  1. Zero Vector: Empty deltas (no changes) have undefined cosine similarity. Treated as orthogonal.

  2. Nearly Identical: Transactions with 0.99 < similarity < 1.0 may conflict. Use structural overlap (Jaccard) as secondary check.

  3. Large Dimension Mismatch: Deltas with different dimensions cannot be directly compared. Pad smaller to match larger.

Recovery Procedures

Raft Recovery from WAL

// 1. Open WAL and replay entries
let wal = RaftWal::open(wal_path)?;
let recovery = RaftRecoveryState::from_wal(&wal)?;

// 2. Restore term and voted_for
node.current_term = recovery.current_term;
node.voted_for = recovery.voted_for;

// 3. Validate snapshot if present
if let Some((meta, data)) = load_snapshot() {
    let computed_hash = sha256(&data);
    if computed_hash == meta.snapshot_hash {
        // Valid snapshot - restore state machine
        apply_snapshot(meta, data);
    } else {
        // Corrupted snapshot - ignore
        warn!("Snapshot hash mismatch, starting fresh");
    }
}

// 4. Start as follower
node.state = RaftState::Follower;

2PC Coordinator Recovery

// 1. Replay WAL to reconstruct pending transactions
let recovery = TxRecoveryState::from_wal(&wal)?;

// 2. Process each transaction based on phase
for tx in recovery.prepared_txs {
    // All YES votes - resume commit
    coordinator.pending.insert(tx.tx_id, restore_tx(tx, TxPhase::Prepared));
}

for tx in recovery.committing_txs {
    // Was committing - complete commit
    coordinator.complete_commit(tx.tx_id)?;
}

for tx in recovery.aborting_txs {
    // Was aborting - complete abort
    coordinator.complete_abort(tx.tx_id)?;
}

// 3. Timed out transactions default to abort (presumed abort)
for tx in recovery.timed_out_txs {
    coordinator.abort(tx.tx_id, "recovered - timeout")?;
}

Gossip State Recovery

// Gossip state is reconstructed via protocol, not WAL
// 1. Start with only local node in state
let mut state = LWWMembershipState::new();
state.update_local(local_node.clone(), NodeHealth::Healthy, 0);

// 2. Add known peers as Unknown
for peer in known_peers {
    state.merge(&[GossipNodeState::new(peer, NodeHealth::Unknown, 0, 0)]);
}

// 3. Gossip protocol will converge to correct state
// - Healthy nodes will respond to Sync
// - Failed nodes will be suspected and eventually marked failed

Operational Best Practices

Cluster Sizing

  • Minimum: 3 nodes (tolerates 1 failure)
  • Recommended: 5 nodes (tolerates 2 failures)
  • Large: 7 nodes (tolerates 3 failures)
  • Avoid even numbers (split-brain risk)
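The sizing rule follows directly from majority quorums; a quick sanity check with hypothetical helper names:

```rust
/// Votes needed for a majority of n nodes.
fn quorum(n: usize) -> usize { n / 2 + 1 }

/// Failures a cluster of n can survive while keeping a quorum.
fn tolerated_failures(n: usize) -> usize { (n - 1) / 2 }

fn main() {
    assert_eq!((quorum(3), tolerated_failures(3)), (2, 1));
    assert_eq!((quorum(5), tolerated_failures(5)), (3, 2));
    // An even-sized cluster pays for an extra node without gaining tolerance:
    assert_eq!(tolerated_failures(4), tolerated_failures(3));
}
```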

Timeout Tuning

// Network latency < 10ms (same datacenter)
RaftConfig {
    election_timeout: (150, 300),
    heartbeat_interval: 50,
}

// Network latency 10-50ms (cross-datacenter)
RaftConfig {
    election_timeout: (500, 1000),
    heartbeat_interval: 150,
}

// Network latency > 50ms (geo-distributed)
RaftConfig {
    election_timeout: (2000, 4000),
    heartbeat_interval: 500,
}

Monitoring

Key metrics to monitor:

| Metric | Warning Threshold | Critical Threshold |
| --- | --- | --- |
| heartbeat_success_rate | < 0.95 | < 0.80 |
| fast_path_rate | < 0.50 | < 0.20 |
| commit_rate | < 0.80 | < 0.50 |
| conflict_rate | > 0.10 | > 0.30 |
| deadlocks_detected | > 0/min | > 10/min |
| quorum_lost_events | > 0/hour | > 0/min |

Security Considerations

  1. Enable Message Signing: Set require_signatures: true in production
  2. Rotate Keys: Periodically generate new identities and update registry
  3. Network Isolation: Use TLS for transport, firewall cluster ports
  4. Audit Logging: Log all state transitions for forensic analysis

Formal Verification

The three core protocols (Raft, 2PC, SWIM gossip) are formally specified in TLA+ (specs/tla/) and exhaustively model-checked with TLC:

| Spec | Distinct States | Properties Verified |
| --- | --- | --- |
| Raft.tla | 18,268,659 | ElectionSafety, LogMatching, StateMachineSafety, LeaderCompleteness, VoteIntegrity, TermMonotonicity |
| TwoPhaseCommit.tla | 2,264,939 | Atomicity, NoOrphanedLocks, ConsistentDecision, VoteIrrevocability, DecisionStability |
| Membership.tla | 54,148 | NoFalsePositivesSafety, MonotonicEpochs, MonotonicIncarnations |

Model checking discovered protocol-level bugs (out-of-order message handling, self-message processing, heartbeat log truncation) that were fixed in both the specs and the Rust implementation. See Formal Verification for full results and the list of bugs found.

| Module | Relationship |
| --- | --- |
| tensor_store | Provides TensorStore for persistence, SparseVector for embeddings, ArchetypeRegistry for delta compression |
| graph_engine | Blocks linked via graph edges, chain structure built on graph |
| tensor_compress | Int8 quantization for delta embeddings (4x compression) |
| tensor_checkpoint | Snapshot persistence for crash recovery |

Performance Characteristics

| Operation | Time | Notes |
| --- | --- | --- |
| Transaction commit | ~50us | Single transaction block |
| Conflict detection | ~1us | Cosine + Jaccard calculation |
| Deadlock detection | ~350ns | DFS cycle detection |
| Gossip round | ~200ms | Configurable interval |
| Heartbeat | ~50ms | Leader to all followers |
| Fast-path validation | ~2us | Similarity check only |
| Full validation | ~50us | Complete block verification |
| Lock acquisition | ~100ns | Uncontended case |
| Lock acquisition (contended) | ~10us | With wait-graph update |
| Signature verification | ~50us | Ed25519 verify |
| Message validation | ~1us | Bounds checking |

Neumann Parser

The neumann_parser crate provides a hand-written recursive descent parser for the Neumann unified query language. It converts source text into an Abstract Syntax Tree (AST) that can be executed by the query router.

The parser is designed with zero external dependencies, full span tracking for error reporting, and support for SQL, graph, vector, and domain-specific operations in a single unified syntax.

Key Concepts

| Concept | Description |
| --- | --- |
| Recursive Descent | Top-down parsing where each grammar rule becomes a function |
| Pratt Parsing | Operator precedence parsing for expressions with correct associativity |
| Span Tracking | Every AST node carries source location for error messages |
| Case Insensitivity | Keywords are matched case-insensitively via uppercase conversion |
| Single Lookahead | Parser uses one-token lookahead with optional peek |
| Depth Limiting | Expression nesting is limited to 64 levels to prevent stack overflow |
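Case-insensitive keyword matching can be sketched as an uppercase conversion followed by a string match. The keywords shown are a small subset, and the return type is simplified relative to the crate's `keyword_from_str()` (which returns a TokenKind):

```rust
/// Returns the canonical keyword for an identifier, or None for a plain ident.
fn keyword_lookup(ident: &str) -> Option<&'static str> {
    match ident.to_ascii_uppercase().as_str() {
        "SELECT" => Some("SELECT"),
        "FROM" => Some("FROM"),
        "WHERE" => Some("WHERE"),
        _ => None, // falls through to an identifier token
    }
}

fn main() {
    assert_eq!(keyword_lookup("select"), Some("SELECT"));
    assert_eq!(keyword_lookup("SeLeCt"), Some("SELECT"));
    assert_eq!(keyword_lookup("users"), None);
}
```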

Architecture

flowchart LR
    subgraph Input
        Source["Source String"]
    end

    subgraph Lexer
        Chars["char iterator"] --> Tokenizer
        Tokenizer --> Tokens["Token Stream"]
    end

    subgraph Parser
        Tokens --> StatementParser["Statement Parser"]
        StatementParser --> ExprParser["Expression Parser (Pratt)"]
        ExprParser --> AST["Abstract Syntax Tree"]
    end

    Source --> Chars
    AST --> Output["Statement + Span"]

Detailed Parsing Flow

sequenceDiagram
    participant User
    participant parse()
    participant Parser
    participant Lexer
    participant ExprParser

    User->>parse(): "SELECT * FROM users"
    parse()->>Parser: new(source)
    Parser->>Lexer: new(source)
    Lexer-->>Parser: first token

    Parser->>Parser: parse_statement()
    Parser->>Parser: match on token kind
    Parser->>Parser: parse_select()
    Parser->>Parser: parse_select_body()

    loop For each select item
        Parser->>ExprParser: parse_expr()
        ExprParser->>ExprParser: parse_expr_bp(0)
        ExprParser->>ExprParser: parse_prefix_expr()
        ExprParser-->>Parser: Expr with span
    end

    Parser-->>parse(): Statement
    parse()-->>User: Result<Statement>

Source Files

| File | Purpose | Key Functions |
| --- | --- | --- |
| lib.rs | Public API exports | parse(), parse_all(), parse_expr(), tokenize() |
| lexer.rs | Tokenization (source to tokens) | Lexer::next_token(), scan_ident(), scan_number(), scan_string() |
| token.rs | Token definitions and keyword lookup | TokenKind, keyword_from_str() |
| parser.rs | Statement parsing (recursive descent) | Parser::parse_statement(), parse_select(), parse_insert() |
| expr.rs | Expression parsing (Pratt algorithm) | ExprParser::parse_expr(), parse_expr_bp(), infix_binding_power() |
| ast.rs | AST node definitions | Statement, StatementKind, Expr, ExprKind |
| span.rs | Source location tracking | BytePos, Span, line_col(), get_line() |
| error.rs | Error types with source context | ParseError, ParseErrorKind, format_with_source() |

Core Types

Token System

classDiagram
    class Token {
        +TokenKind kind
        +Span span
        +is_eof() bool
        +is_keyword() bool
    }

    class TokenKind {
        <<enumeration>>
        Ident(String)
        Integer(i64)
        Float(f64)
        String(String)
        Select
        From
        Where
        ...
        Error(String)
        Eof
    }

    class Span {
        +BytePos start
        +BytePos end
        +len() u32
        +merge(Span) Span
        +extract(str) str
    }

    Token --> TokenKind
    Token --> Span

| Type | Description |
| --- | --- |
| Token | A token with its kind and span |
| TokenKind | Enum of all token variants (130+ variants including keywords, literals, operators) |
| Lexer | Stateful tokenizer that produces tokens from source |

AST Structure

classDiagram
    class Statement {
        +StatementKind kind
        +Span span
    }

    class StatementKind {
        <<enumeration>>
        Select(SelectStmt)
        Insert(InsertStmt)
        Node(NodeStmt)
        Edge(EdgeStmt)
        Similar(SimilarStmt)
        Vault(VaultStmt)
        ...
    }

    class Expr {
        +ExprKind kind
        +Span span
        +boxed() Box~Expr~
    }

    class ExprKind {
        <<enumeration>>
        Literal(Literal)
        Ident(Ident)
        Binary(Box~Expr~, BinaryOp, Box~Expr~)
        Unary(UnaryOp, Box~Expr~)
        Call(FunctionCall)
        ...
    }

    Statement --> StatementKind
    StatementKind --> Expr
    Expr --> ExprKind

| Type | Description |
| --- | --- |
| Statement | Top-level parsed statement with span |
| StatementKind | Enum of all statement variants (30+ variants) |
| Expr | Expression node with span |
| ExprKind | Enum of expression variants (20+ variants) |
| Literal | Literal values (Null, Boolean, Integer, Float, String) |
| Ident | Identifier with name and span |
| BinaryOp | Binary operators with precedence (18 operators) |
| UnaryOp | Unary operators (Not, Neg, BitNot) |

Span Types

| Type | Description | Example |
| --- | --- | --- |
| BytePos | A byte offset into source text (u32) | BytePos(7) |
| Span | A range of bytes (start, end) | Span { start: 0, end: 6 } |
| Spanned<T> | A value paired with its source location | Spanned::new(42, span) |

// Span operations
let span1 = Span::from_offsets(0, 6);   // "SELECT"
let span2 = Span::from_offsets(7, 8);   // "*"
let merged = span1.merge(span2);         // "SELECT *"

// Extract source text
let source = "SELECT * FROM users";
let text = span1.extract(source);        // "SELECT"

// Line/column computation
let (line, col) = line_col(source, BytePos(7));  // (1, 8)

Error Types

| Type | Description |
| --- | --- |
| ParseError | Error with kind, span, and optional help message |
| ParseErrorKind | Enum of error variants (10 kinds) |
| ParseResult<T> | Result<T, ParseError> |
| Errors | Collection of parse errors with iteration support |

// Error kinds
pub enum ParseErrorKind {
    UnexpectedToken { found: TokenKind, expected: String },
    UnexpectedEof { expected: String },
    InvalidSyntax(String),
    InvalidNumber(String),
    UnterminatedString,
    UnknownCommand(String),
    DuplicateColumn(String),
    InvalidEscape(char),
    TooDeep,           // Expression nesting > 64 levels
    Custom(String),
}

Lexer Implementation

State Machine

The lexer is implemented as an iterator-based state machine with single-character lookahead:

stateDiagram-v2
    [*] --> Initial
    Initial --> Whitespace: is_whitespace
    Initial --> LineComment: --
    Initial --> BlockComment: /*
    Initial --> Identifier: a-zA-Z_
    Initial --> Number: 0-9
    Initial --> String: ' or "
    Initial --> Operator: +-*/etc
    Initial --> EOF: end

    Whitespace --> Initial: skip
    LineComment --> Initial: newline
    BlockComment --> Initial: */

    Identifier --> Token: non-alnum
    Number --> Token: non-digit
    String --> Token: closing quote
    Operator --> Token: complete

    Token --> Initial: emit token
    EOF --> [*]

Internal Structure

pub struct Lexer<'a> {
    source: &'a str,      // Original source text
    chars: Chars<'a>,     // Character iterator
    pos: u32,             // Current byte position
    peeked: Option<char>, // One-character lookahead
}

Character Classification

| Category | Characters | Handling |
| --- | --- | --- |
| Whitespace | space, tab, newline | Skipped |
| Line comment | -- to newline | Skipped |
| Block comment | /* */ (nestable) | Skipped, supports nesting |
| Identifier | [a-zA-Z_][a-zA-Z0-9_]* | Keyword lookup then Ident |
| Integer | [0-9]+ | Parse as i64 |
| Float | [0-9]+\.[0-9]+ or scientific | Parse as f64 |
| String | '...' or "..." | Handle escapes |

String Escape Sequences

| Escape | Result |
| --- | --- |
| \n | Newline |
| \r | Carriage return |
| \t | Tab |
| \\ | Backslash |
| \' | Single quote |
| \" | Double quote |
| \0 | Null character |
| '' | Single quote (SQL-style doubled) |
| \x | Unknown: preserved as \x |
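The backslash escapes above can be resolved in a single pass; this sketch covers only the backslash escapes (the SQL-style doubled quote is handled at quote level in the lexer) and is illustrative, not `scan_string()` itself:

```rust
/// Resolve backslash escape sequences; unknown escapes are preserved verbatim.
fn unescape(raw: &str) -> String {
    let mut out = String::new();
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n') => out.push('\n'),
                Some('r') => out.push('\r'),
                Some('t') => out.push('\t'),
                Some('\\') => out.push('\\'),
                Some('\'') => out.push('\''),
                Some('"') => out.push('"'),
                Some('0') => out.push('\0'),
                // Unknown escape: preserved as written.
                Some(other) => { out.push('\\'); out.push(other); }
                None => out.push('\\'),
            }
        } else {
            out.push(c);
        }
    }
    out
}

fn main() {
    assert_eq!(unescape(r"a\nb"), "a\nb"); // \n becomes a real newline
    assert_eq!(unescape(r"\q"), r"\q");    // unknown escape preserved
}
```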

Operator Recognition

Multi-character operators are recognized with lookahead:

// Lexer::next_token() operator matching
match c {
    '-' => if self.eat('>') { Arrow }      // ->
           else { Minus },                  // -
    '=' => if self.eat('>') { FatArrow }   // =>
           else { Eq },                     // =
    '<' => if self.eat('=') { Le }         // <=
           else if self.eat('>') { Ne }    // <>
           else if self.eat('<') { Shl }   // <<
           else { Lt },                     // <
    '>' => if self.eat('=') { Ge }         // >=
           else if self.eat('>') { Shr }   // >>
           else { Gt },                     // >
    '|' => if self.eat('|') { Concat }     // ||
           else { Pipe },                   // |
    '&' => if self.eat('&') { AmpAmp }     // &&
           else { Amp },                    // &
    ':' => if self.eat(':') { ColonColon } // ::
           else { Colon },                  // :
    // ...
}

Pratt Parser (Expression Parsing)

Algorithm Overview

The Pratt parser handles operator precedence through “binding power” - each operator has a left and right binding power that determines associativity and precedence.

flowchart TD
    A[parse_expr_bp\nmin_bp] --> B[parse_prefix]
    B --> C{More tokens?}
    C -->|No| D[Return lhs]
    C -->|Yes| E[parse_postfix]
    E --> F{Infix op?}
    F -->|No| D
    F -->|Yes| G{l_bp >= min_bp?}
    G -->|No| D
    G -->|Yes| H[Advance]
    H --> I[parse_expr_bp\nr_bp]
    I --> J[Build Binary node]
    J --> C

Binding Power Table

Each operator has left and right binding powers (l_bp, r_bp):

| Precedence | Operators | Binding Power (l, r) | Associativity |
| --- | --- | --- | --- |
| 1 (lowest) | OR | (1, 2) | Left |
| 2 | AND | (3, 4) | Left |
| 3 | =, !=, <, <=, >, >= | (5, 6) | Left |
| 4 | \| (bitwise OR) | (7, 8) | Left |
| 5 | ^ (bitwise XOR) | (9, 10) | Left |
| 6 | & (bitwise AND) | (11, 12) | Left |
| 7 | <<, >> | (13, 14) | Left |
| 8 | +, -, \|\| (concat) | (15, 16) | Left |
| 9 | *, /, % | (17, 18) | Left |
| 10 (highest) | NOT, -, ~ (unary) | 19 (prefix) | Right |

Left associativity is achieved by having r_bp > l_bp.
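To see the binding powers in action, here is a toy integer evaluator over + and * using the table's powers (15, 16) and (17, 18); it is a teaching sketch, not the crate's `parse_expr_bp`:

```rust
/// Evaluate a token slice by precedence climbing; `min_bp` is the minimum
/// left binding power an operator needs to be consumed at this level.
fn eval_bp(tokens: &[&str], pos: &mut usize, min_bp: u8) -> i64 {
    let mut lhs: i64 = tokens[*pos].parse().unwrap();
    *pos += 1;
    while *pos < tokens.len() {
        let (l_bp, r_bp) = match tokens[*pos] {
            "+" => (15, 16), // same powers as the table above
            "*" => (17, 18),
            _ => break,
        };
        if l_bp < min_bp { break; } // binds less tightly than current context
        let op = tokens[*pos];
        *pos += 1;
        let rhs = eval_bp(tokens, pos, r_bp); // r_bp > l_bp => left associative
        lhs = if op == "+" { lhs + rhs } else { lhs * rhs };
    }
    lhs
}

fn eval(tokens: &[&str]) -> i64 { eval_bp(tokens, &mut 0, 0) }

fn main() {
    assert_eq!(eval(&["1", "+", "2", "*", "3"]), 7); // * binds tighter than +
    assert_eq!(eval(&["2", "*", "3", "+", "4"]), 10);
}
```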

Implementation Details

const MAX_DEPTH: usize = 64;  // Prevent stack overflow

fn parse_expr_bp(&mut self, min_bp: u8) -> ParseResult<Expr> {
    self.depth += 1;
    if self.depth > MAX_DEPTH {
        return Err(ParseError::new(ParseErrorKind::TooDeep, self.current.span));
    }

    let mut lhs = self.parse_prefix()?;

    loop {
        // Handle postfix operators (IS NULL, IN, BETWEEN, LIKE, DOT)
        lhs = self.parse_postfix(lhs)?;

        // Check for infix operator
        let op = match self.current_binary_op() {
            Some(op) => op,
            None => break,
        };

        let (l_bp, r_bp) = infix_binding_power(op);
        if l_bp < min_bp {
            break;  // Operator binds less tightly than current context
        }

        self.advance();
        let rhs = self.parse_expr_bp(r_bp)?;

        let span = lhs.span.merge(rhs.span);
        lhs = Expr::new(ExprKind::Binary(Box::new(lhs), op, Box::new(rhs)), span);
    }

    self.depth -= 1;
    Ok(lhs)
}

Prefix Expression Handling

fn parse_prefix(&mut self) -> ParseResult<Expr> {
    match &self.current.kind {
        // Literals
        TokenKind::Integer(n) => { /* emit Literal(Integer) */ },
        TokenKind::Float(n) => { /* emit Literal(Float) */ },
        TokenKind::String(s) => { /* emit Literal(String) */ },
        TokenKind::True | TokenKind::False => { /* emit Literal(Boolean) */ },
        TokenKind::Null => { /* emit Literal(Null) */ },

        // Identifiers and function calls
        TokenKind::Ident(_) => self.parse_ident_or_call(),

        // Aggregate functions (COUNT, SUM, AVG, MIN, MAX)
        TokenKind::Count | TokenKind::Sum | ... => self.parse_aggregate_call(),

        // Wildcard
        TokenKind::Star => { /* emit Wildcard */ },

        // Parenthesized expression or tuple
        TokenKind::LParen => self.parse_paren_expr(),

        // Array literal
        TokenKind::LBracket => self.parse_array(),

        // Unary operators
        TokenKind::Minus => { /* parse operand with PREFIX_BP, emit Unary(Neg) */ },
        TokenKind::Not | TokenKind::Bang => { /* emit Unary(Not) */ },
        TokenKind::Tilde => { /* emit Unary(BitNot) */ },

        // Special expressions
        TokenKind::Case => self.parse_case(),
        TokenKind::Exists => self.parse_exists(),
        TokenKind::Cast => self.parse_cast(),

        // Contextual keywords as identifiers
        _ if token.kind.is_contextual_keyword() => self.parse_keyword_as_ident(),

        // Error
        _ => Err(ParseError::unexpected(...)),
    }
}

Postfix Expression Handling

Postfix operators bind tighter than any infix operator:

fn parse_postfix(&mut self, mut expr: Expr) -> ParseResult<Expr> {
    loop {
        // Handle NOT IN, NOT BETWEEN, NOT LIKE
        if self.check(&TokenKind::Not) {
            let next = self.peek().kind.clone();
            if next == TokenKind::In { /* parse NOT IN */ }
            else if next == TokenKind::Between { /* parse NOT BETWEEN */ }
            else if next == TokenKind::Like { /* parse NOT LIKE */ }
        }

        match self.current.kind {
            TokenKind::Is => {
                // IS [NOT] NULL
                self.advance();
                let negated = self.eat(&TokenKind::Not);
                self.expect(&TokenKind::Null)?;
                expr = Expr::new(ExprKind::IsNull { expr, negated }, span);
            },
            TokenKind::In => { /* parse IN (values) or IN (subquery) */ },
            TokenKind::Between => { /* parse BETWEEN low AND high */ },
            TokenKind::Like => { /* parse LIKE pattern */ },
            TokenKind::Dot => {
                // Qualified name: table.column or table.*
                self.advance();
                if self.eat(&TokenKind::Star) {
                    expr = Expr::new(ExprKind::QualifiedWildcard(ident), span);
                } else {
                    let field = self.expect_ident()?;
                    expr = Expr::new(ExprKind::Qualified(Box::new(expr), field), span);
                }
            },
            _ => return Ok(expr),
        }
    }
}

Statement Kinds

SQL Statements

| Statement | Example | AST Type |
| --- | --- | --- |
| Select | SELECT * FROM users WHERE id = 1 | SelectStmt |
| Insert | INSERT INTO users (name) VALUES ('Alice') | InsertStmt |
| Update | UPDATE users SET name = 'Bob' WHERE id = 1 | UpdateStmt |
| Delete | DELETE FROM users WHERE id = 1 | DeleteStmt |
| CreateTable | CREATE TABLE users (id INT PRIMARY KEY) | CreateTableStmt |
| DropTable | DROP TABLE IF EXISTS users CASCADE | DropTableStmt |
| CreateIndex | CREATE UNIQUE INDEX idx ON users(email) | CreateIndexStmt |
| DropIndex | DROP INDEX idx_name | DropIndexStmt |
| ShowTables | SHOW TABLES | unit |
| Describe | DESCRIBE TABLE users | DescribeStmt |

Graph Statements

| Statement | Example | AST Type |
| --- | --- | --- |
| Node | NODE CREATE person {name: 'Alice'} | NodeStmt |
| Edge | EDGE CREATE 1 -> 2 : FOLLOWS {since: 2023} | EdgeStmt |
| Neighbors | NEIGHBORS 'entity' OUTGOING follows | NeighborsStmt |
| Path | PATH SHORTEST 1 TO 5 | PathStmt |
| Find | FIND NODE person WHERE age > 18 | FindStmt |

Vector Statements

| Statement | Example | AST Type |
| --- | --- | --- |
| Embed | EMBED STORE 'key' [0.1, 0.2, 0.3] | EmbedStmt |
| Similar | SIMILAR 'query' LIMIT 10 COSINE | SimilarStmt |

Domain Statements

| Statement | Example | AST Type |
| --- | --- | --- |
| Vault | VAULT SET 'secret' 'value' | VaultStmt |
| Cache | CACHE STATS | CacheStmt |
| Blob | BLOB PUT 'file.txt' 'data' | BlobStmt |
| Blobs | BLOBS BY TAG 'important' | BlobsStmt |
| Chain | BEGIN CHAIN TRANSACTION | ChainStmt |
| Cluster | CLUSTER STATUS | ClusterStmt |
| Checkpoint | CHECKPOINT 'backup1' | CheckpointStmt |
| Entity | ENTITY CREATE 'key' {props} EMBEDDING [vec] | EntityStmt |

Expression Kinds

| Kind | Example | Notes |
| --- | --- | --- |
| Literal | 42, 3.14, 'hello', TRUE, NULL | Five literal types |
| Ident | column_name | Simple identifier |
| Qualified | table.column | Dot notation |
| Binary | a + b, x AND y | 18 binary operators |
| Unary | NOT flag, -value, ~bits | 3 unary operators |
| Call | COUNT(*), MAX(price) | Function with args |
| Case | CASE WHEN x THEN y ELSE z END | Simple and searched |
| Subquery | (SELECT ...) | Nested SELECT |
| Exists | EXISTS (SELECT 1 ...) | Existence test |
| In | x IN (1, 2, 3) or x IN (SELECT...) | Value list or subquery |
| Between | x BETWEEN 1 AND 10 | Range check |
| Like | name LIKE '%smith' | Pattern matching |
| IsNull | x IS NULL, y IS NOT NULL | NULL check |
| Array | [1, 2, 3] | Array literal |
| Tuple | (1, 2, 3) | Tuple/row literal |
| Cast | CAST(x AS INT) | Type conversion |
| Wildcard | * | All columns |
| QualifiedWildcard | table.* | All columns from table |

Span Tracking Implementation

BytePos and Span

#![allow(unused)]
fn main() {
/// A byte position in source code (u32 for memory efficiency).
pub struct BytePos(pub u32);

/// A span representing a range of source code.
pub struct Span {
    pub start: BytePos,  // Inclusive
    pub end: BytePos,    // Exclusive
}

impl Span {
    /// Combines two spans into one that covers both.
    pub const fn merge(self, other: Span) -> Span {
        let start = if self.start.0 < other.start.0 { self.start } else { other.start };
        let end = if self.end.0 > other.end.0 { self.end } else { other.end };
        Span { start, end }
    }
}
}
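Merging takes the outer bounds of the two spans, regardless of argument order. A standalone usage sketch (the types above are re-declared with derives so it compiles on its own):

```rust
// Standalone sketch of Span::merge; the crate's definitions are shown above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct BytePos(pub u32);

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Span {
    pub start: BytePos,
    pub end: BytePos,
}

impl Span {
    pub const fn merge(self, other: Span) -> Span {
        let start = if self.start.0 < other.start.0 { self.start } else { other.start };
        let end = if self.end.0 > other.end.0 { self.end } else { other.end };
        Span { start, end }
    }
}

fn main() {
    // Spans for `SELECT` (0..6) and `users` (14..19) merge to 0..19,
    // in either order.
    let kw = Span { start: BytePos(0), end: BytePos(6) };
    let ident = Span { start: BytePos(14), end: BytePos(19) };
    assert_eq!(kw.merge(ident), Span { start: BytePos(0), end: BytePos(19) });
    assert_eq!(ident.merge(kw), Span { start: BytePos(0), end: BytePos(19) });
}
```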

Line/Column Calculation

#![allow(unused)]
fn main() {
/// Computes line and column from a byte position.
pub fn line_col(source: &str, pos: BytePos) -> (usize, usize) {
    let offset = pos.as_usize().min(source.len());
    let mut line = 1;
    let mut col = 1;

    for (i, ch) in source.char_indices() {
        if i >= offset { break; }
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }

    (line, col)
}

/// Returns the line containing a position.
pub fn get_line(source: &str, pos: BytePos) -> &str {
    let offset = pos.as_usize().min(source.len());
    let line_start = source[..offset].rfind('\n').map(|i| i + 1).unwrap_or(0);
    let line_end = source[offset..].find('\n').map(|i| offset + i).unwrap_or(source.len());
    &source[line_start..line_end]
}
}
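A standalone variant of line_col over a plain byte offset (the crate's version takes a BytePos) shows the one-pass scan producing 1-based line and column:

```rust
// Same one-pass scan as above, with a plain usize offset for illustration.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let offset = offset.min(source.len());
    let mut line = 1;
    let mut col = 1;
    for (i, ch) in source.char_indices() {
        if i >= offset { break; }
        if ch == '\n' { line += 1; col = 1; } else { col += 1; }
    }
    (line, col)
}

fn main() {
    let src = "SELECT *\nFROM users";
    assert_eq!(line_col(src, 0), (1, 1));   // 'S' of SELECT
    assert_eq!(line_col(src, 9), (2, 1));   // 'F' of FROM
    assert_eq!(line_col(src, 14), (2, 6));  // 'u' of users
}
```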

Error Recovery

Error Creation Patterns

#![allow(unused)]
fn main() {
// Unexpected token
ParseError::unexpected(
    TokenKind::Comma,
    self.current.span,
    "column name"
)

// Unexpected EOF
ParseError::unexpected_eof(
    self.current.span,
    "expression"
)

// Invalid syntax
ParseError::invalid(
    "CASE requires at least one WHEN clause",
    self.current.span
)

// With help message
ParseError::invalid("unknown keyword SELCT", span)
    .with_help("did you mean SELECT?")
}

Error Formatting

#![allow(unused)]
fn main() {
pub fn format_with_source(&self, source: &str) -> String {
    let (line, col) = line_col(source, self.span.start);
    let line_text = get_line(source, self.span.start);

    // Build error message with source context
    let mut result = format!("error: {}\n", self.kind);
    result.push_str(&format!("  --> line {}:{}\n", line, col));
    result.push_str("   |\n");
    result.push_str(&format!("{:3} | {}\n", line, line_text));
    result.push_str("   | ");

    // Add carets under the error location
    for _ in 0..(col - 1) { result.push(' '); }
    let len = self.span.len().max(1) as usize;
    for _ in 0..len.min(line_text.len() - col + 1).max(1) {
        result.push('^');
    }
    result.push('\n');

    if let Some(help) = &self.help {
        result.push_str(&format!("   = help: {}\n", help));
    }

    result
}
}

Example Error Output

error: unexpected FROM, expected expression or '*' after SELECT
  --> line 1:8
   |
  1 | SELECT FROM users
   |        ^^^^

Usage Examples

Parse a Statement

#![allow(unused)]
fn main() {
use neumann_parser::parse;

let stmt = parse("SELECT * FROM users WHERE id = 1")?;

match &stmt.kind {
    StatementKind::Select(select) => {
        println!("Distinct: {}", select.distinct);
        println!("Columns: {}", select.columns.len());
        if let Some(from) = &select.from {
            println!("Table: {:?}", from.table.kind);
        }
        if let Some(where_clause) = &select.where_clause {
            println!("Has WHERE clause");
        }
    }
    _ => {}
}
}

Parse Multiple Statements

#![allow(unused)]
fn main() {
use neumann_parser::parse_all;

let stmts = parse_all("SELECT 1; SELECT 2; INSERT INTO t VALUES (1)")?;
assert_eq!(stmts.len(), 3);

for stmt in &stmts {
    println!("Statement at {:?}", stmt.span);
}
}

Parse an Expression

#![allow(unused)]
fn main() {
use neumann_parser::parse_expr;

let expr = parse_expr("1 + 2 * 3")?;
// Parses as: 1 + (2 * 3) due to precedence

if let ExprKind::Binary(lhs, BinaryOp::Add, rhs) = expr.kind {
    // lhs = Literal(1)
    // rhs = Binary(Literal(2), Mul, Literal(3))
}
}

Tokenize Source

#![allow(unused)]
fn main() {
use neumann_parser::tokenize;

let tokens = tokenize("SELECT * FROM users");
for token in tokens {
    println!("{:?} at {:?}", token.kind, token.span);
}

// Output:
// Select at 0..6
// Star at 7..8
// From at 9..13
// Ident("users") at 14..19
// Eof at 19..19
}

Working with the Parser Directly

#![allow(unused)]
fn main() {
use neumann_parser::Parser;

let mut parser = Parser::new("SELECT 1; SELECT 2");

// Parse first statement
let stmt1 = parser.parse_statement()?;
assert!(matches!(stmt1.kind, StatementKind::Select(_)));

// Parse second statement
let stmt2 = parser.parse_statement()?;
assert!(matches!(stmt2.kind, StatementKind::Select(_)));

// Third call returns Empty (EOF)
let stmt3 = parser.parse_statement()?;
assert!(matches!(stmt3.kind, StatementKind::Empty));
}

Error Handling

#![allow(unused)]
fn main() {
use neumann_parser::parse;

let result = parse("SELECT FROM users");
if let Err(e) = result {
    // Format error with source context
    let formatted = e.format_with_source("SELECT FROM users");
    println!("{}", formatted);

    // Access error details
    println!("Error kind: {:?}", e.kind);
    println!("Span: {:?}", e.span);
    if let Some(help) = &e.help {
        println!("Help: {}", help);
    }
}
}

Grammar Overview

SELECT Statement

SELECT [DISTINCT | ALL] columns
FROM table [alias]
[JOIN table ON condition | USING (cols)]...
[WHERE condition]
[GROUP BY exprs]
[HAVING condition]
[ORDER BY expr [ASC|DESC] [NULLS FIRST|LAST]]...
[LIMIT n]
[OFFSET n]

CREATE TABLE Statement

CREATE TABLE [IF NOT EXISTS] name (
    column type [NULL|NOT NULL] [PRIMARY KEY] [UNIQUE]
                [DEFAULT expr] [CHECK(expr)]
                [REFERENCES table(col) [ON DELETE action] [ON UPDATE action]]
    [, ...]
    [, PRIMARY KEY (cols)]
    [, UNIQUE (cols)]
    [, FOREIGN KEY (cols) REFERENCES table(cols)]
    [, CHECK (expr)]
)

Graph Operations

NODE CREATE label {properties}
NODE GET id
NODE DELETE id
NODE LIST [label]

EDGE CREATE from -> to : type {properties}
EDGE GET id
EDGE DELETE id
EDGE LIST [type]

NEIGHBORS node [OUTGOING|INCOMING|BOTH] [edge_type]
          [BY SIMILAR [vector] LIMIT n]

PATH [SHORTEST|ALL] from TO to [MAX depth]

Vector Operations

EMBED STORE key [vector]
EMBED GET key
EMBED DELETE key
EMBED BUILD INDEX
EMBED BATCH [(key, [vector]), ...]

SIMILAR key|[vector] [LIMIT k] [COSINE|EUCLIDEAN|DOT_PRODUCT]
        [CONNECTED TO entity]

Chain Operations

BEGIN CHAIN TRANSACTION
COMMIT CHAIN
ROLLBACK CHAIN TO height

CHAIN HEIGHT
CHAIN TIP
CHAIN BLOCK height
CHAIN VERIFY
CHAIN HISTORY key
CHAIN SIMILAR [embedding] LIMIT n
CHAIN DRIFT FROM height TO height

SHOW CODEBOOK GLOBAL
SHOW CODEBOOK LOCAL domain
ANALYZE CODEBOOK TRANSITIONS

Reserved Keywords

Keywords are case-insensitive. The lexer converts to uppercase for matching.
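A minimal sketch of the uppercase-then-match approach (the returned names are illustrative stand-ins for TokenKind variants; the real lexer matches the full keyword set):

```rust
// Case-insensitive keyword lookup: uppercase the identifier, then match.
fn keyword(ident: &str) -> Option<&'static str> {
    match ident.to_ascii_uppercase().as_str() {
        "SELECT" => Some("Select"),
        "FROM" => Some("From"),
        "WHERE" => Some("Where"),
        _ => None, // anything else lexes as an identifier
    }
}

fn main() {
    assert_eq!(keyword("select"), Some("Select"));
    assert_eq!(keyword("SeLeCt"), Some("Select"));
    assert_eq!(keyword("users"), None);
}
```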

SQL (70+ keywords): SELECT, DISTINCT, ALL, FROM, WHERE, INSERT, INTO, VALUES, UPDATE, SET, DELETE, CREATE, DROP, TABLE, INDEX, AND, OR, NOT, NULL, IS, IN, LIKE, BETWEEN, ORDER, BY, ASC, DESC, NULLS, FIRST, LAST, LIMIT, OFFSET, GROUP, HAVING, JOIN, INNER, LEFT, RIGHT, FULL, OUTER, CROSS, NATURAL, ON, USING, AS, PRIMARY, KEY, UNIQUE, REFERENCES, FOREIGN, CHECK, DEFAULT, CASCADE, RESTRICT, IF, EXISTS, SHOW, TABLES, UNION, INTERSECT, EXCEPT, CASE, WHEN, THEN, ELSE, END, CAST, ANY

Types (16 keywords): INT, INTEGER, BIGINT, SMALLINT, FLOAT, DOUBLE, REAL, DECIMAL, NUMERIC, VARCHAR, CHAR, TEXT, BOOLEAN, DATE, TIME, TIMESTAMP

Aggregates (5 keywords): COUNT, SUM, AVG, MIN, MAX

Graph (16 keywords): NODE, EDGE, NEIGHBORS, PATH, GET, LIST, STORE, OUTGOING, INCOMING, BOTH, SHORTEST, PROPERTIES, LABEL, VERTEX, VERTICES, EDGES

Vector (10 keywords): EMBED, SIMILAR, VECTOR, EMBEDDING, DIMENSION, DISTANCE, COSINE, EUCLIDEAN, DOT_PRODUCT, BUILD

Unified (6 keywords): FIND, WITH, RETURN, MATCH, ENTITY, CONNECTED

Domain (30+ keywords): VAULT, GRANT, REVOKE, ROTATE, CACHE, INIT, STATS, CLEAR, EVICT, PUT, SEMANTIC, THRESHOLD, CHECKPOINT, ROLLBACK, CHAIN, BEGIN, COMMIT, TRANSACTION, HISTORY, DRIFT, CODEBOOK, GLOBAL, LOCAL, ANALYZE, HEIGHT, TIP, BLOCK, CLUSTER, CONNECT, DISCONNECT, STATUS, NODES, LEADER, BLOB, BLOBS, INFO, LINK, TAG, VERIFY, GC, REPAIR

Contextual Keywords

These keywords can be used as identifiers in expression contexts (column names, etc.):

#![allow(unused)]
fn main() {
pub fn is_contextual_keyword(&self) -> bool {
    matches!(self,
        Status | Nodes | Leader | Connect | Disconnect | Cluster |
        Blobs | Info | Link | Unlink | Links | Tag | Untag | Verify | Gc | Repair |
        Height | Transitions | Tip | Block | Codebook | Global | Local | Drift |
        Begin | Commit | Transaction | ...
    )
}
}

Edge Cases and Gotchas

Ambiguous Token Sequences

  1. Minus vs Arrow: - vs -> distinguished by lookahead
  2. Less-than variants: < vs <= vs <> vs <<
  3. Pipe variants: | (bitwise) vs || (concat)
  4. Keyword as identifier: in SELECT status FROM orders, status is a contextual keyword used as a column name
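The minus-versus-arrow case can be sketched with a cloned-iterator peek (token names here are illustrative, not the crate's actual enum):

```rust
// One-character lookahead: after consuming '-', peek to see if '>' follows.
#[derive(Debug, PartialEq)]
enum Tok {
    Minus,
    Arrow,
}

// Called after the lexer has consumed '-'; peeks via a cheap iterator clone.
fn lex_minus(rest: &mut std::str::Chars<'_>) -> Tok {
    if rest.clone().next() == Some('>') {
        rest.next(); // commit: consume '>'
        Tok::Arrow
    } else {
        Tok::Minus
    }
}

fn main() {
    let mut it = "-> b".chars();
    it.next(); // consume '-'
    assert_eq!(lex_minus(&mut it), Tok::Arrow);

    let mut it = "- b".chars();
    it.next(); // consume '-'
    assert_eq!(lex_minus(&mut it), Tok::Minus);
}
```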

Number Parsing Edge Cases

#![allow(unused)]
fn main() {
// "3." is integer 3 followed by dot
tokens("3. ") // [Integer(3), Dot, Eof]

// "3.0" is float
tokens("3.0") // [Float(3.0), Eof]

// "3.x" is integer 3, dot, identifier x
tokens("3.x") // [Integer(3), Dot, Ident("x"), Eof]

// Scientific notation
tokens("1e10")   // [Float(1e10), Eof]
tokens("2.5E-3") // [Float(0.0025), Eof]
}

String Literal Edge Cases

#![allow(unused)]
fn main() {
// SQL-style doubled quotes
tokens("'it''s'") // [String("it's"), Eof]

// Unterminated string
tokens("'unterminated") // [Error("unterminated string literal"), Eof]

// String with newline (error)
tokens("'line1\nline2'") // Error - strings cannot span lines
}

Expression Depth Limit

#![allow(unused)]
fn main() {
// Deeply nested expressions hit the depth limit (64)
let mut expr = "x".to_string();
for _ in 0..70 {
    expr = format!("({})", expr);
}
parse_expr(&expr) // Err(ParseErrorKind::TooDeep)
}

BETWEEN Precedence

The AND in BETWEEN low AND high is part of the BETWEEN syntax, not a logical operator:

#![allow(unused)]
fn main() {
// "x BETWEEN 1 AND 10 AND y = 5" parses as:
// (x BETWEEN 1 AND 10) AND (y = 5)
// Not: x BETWEEN 1 AND (10 AND y = 5)
}

Qualified Wildcard Restriction

#![allow(unused)]
fn main() {
// Valid: table.*
parse_expr("users.*") // Ok(QualifiedWildcard)

// Invalid: (expr).*
parse_expr("(1 + 2).*") // Err("qualified wildcard requires identifier")
}

Performance

| Operation | Complexity | Notes |
|---|---|---|
| Tokenize | O(n) | Single pass, no backtracking |
| Parse | O(n) | Single pass, constant stack per token |
| Total | O(n) | Where n = input length |

Memory Usage

  • Lexer: O(1) - only stores position and one peeked character
  • Parser: O(1) - only stores current and peeked token
  • AST: O(n) - proportional to number of nodes
  • Span: 8 bytes per span (two u32 values)

Optimizations

  1. Keyword lookup: O(1) via match statement on uppercase string
  2. Token comparison: Uses std::mem::discriminant for enum comparison
  3. Span tracking: Constant-time merge operation
  4. No allocations during parsing: Identifiers and strings owned in tokens
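The discriminant trick can be illustrated with a reduced token enum (illustrative, not the crate's actual type): two tokens match by kind even when their payloads differ.

```rust
use std::mem::discriminant;

// Compare token kinds without comparing payloads, as the parser does
// when checking for an expected token.
#[derive(Debug)]
enum TokenKind {
    Ident(String),
    Integer(i64),
    Comma,
}

fn same_kind(a: &TokenKind, b: &TokenKind) -> bool {
    discriminant(a) == discriminant(b)
}

fn main() {
    assert!(same_kind(&TokenKind::Ident("users".into()), &TokenKind::Ident("orders".into())));
    assert!(!same_kind(&TokenKind::Integer(1), &TokenKind::Comma));
}
```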

| Module | Relationship |
|---|---|
| query_router | Consumes AST and executes queries against engines |
| neumann_shell | Uses parser for interactive REPL commands |
| tensor_chain | Chain statements (BEGIN, COMMIT, HISTORY) parsed here |
| tensor_vault | Vault statements (SET, GET, GRANT) parsed here |
| tensor_cache | Cache statements (INIT, STATS, PUT) parsed here |
| tensor_blob | Blob statements (PUT, GET, TAG) parsed here |

Testing

The parser has comprehensive test coverage including:

  • Unit tests in each module: Token, span, lexer, parser, expression tests
  • Integration tests: Complex SQL queries, multi-statement parsing
  • Edge case tests: Unterminated strings, deeply nested expressions, ambiguous operators
  • Fuzz targets: parser_parse, parser_parse_all, parser_parse_expr, parser_tokenize

# Run parser tests
cargo test -p neumann_parser

# Run parser fuzz targets
cargo +nightly fuzz run parser_parse -- -max_total_time=60
cargo +nightly fuzz run parser_tokenize -- -max_total_time=60

Query Router

Query Router is the unified query execution layer for Neumann. It parses shell commands, routes them to appropriate engines, and combines results. All query types (relational, graph, vector, unified) flow through the router, which provides a single entry point for the entire system.

The router supports both synchronous and asynchronous execution, optional result caching, and distributed query execution when cluster mode is enabled.

Key Types

| Type | Description |
|---|---|
| QueryRouter | Main router orchestrating queries across all engines |
| QueryResult | Unified result enum for all query types |
| RouterError | Error types for query routing failures |
| NodeResult | Graph node result with id, label, properties |
| EdgeResult | Graph edge result with id, from, to, label |
| SimilarResult | Vector similarity result with key and score |
| UnifiedResult | Cross-engine query result with description and items |
| ChainResult | Blockchain operation results |
| QueryPlanner | Plans distributed query execution across shards |
| ResultMerger | Merges results from multiple shards |
| ShardResult | Result from a single shard with timing and error info |
| DistributedQueryConfig | Configuration for distributed execution |
| DistributedQueryStats | Statistics tracking for distributed queries |
| FilterCondition | Re-exported from vector_engine for programmatic filter building |
| FilterValue | Re-exported from vector_engine for filter values |
| FilterStrategy | Re-exported from vector_engine for search strategy |
| FilteredSearchConfig | Re-exported from vector_engine for filtered search config |

QueryResult Variants

| Variant | Description | Typical Source |
|---|---|---|
| Empty | No result (CREATE, INSERT) | DDL, writes |
| Value(String) | Single value result | Scalar queries, DESCRIBE |
| Count(usize) | Count of affected rows/nodes/edges | UPDATE, DELETE |
| Ids(Vec<u64>) | List of IDs | INSERT |
| Rows(Vec<Row>) | Relational query results | SELECT |
| Nodes(Vec<NodeResult>) | Graph node results | NODE queries |
| Edges(Vec<EdgeResult>) | Graph edge results | EDGE queries |
| Path(Vec<u64>) | Graph traversal path | PATH queries |
| Similar(Vec<SimilarResult>) | Vector similarity results | SIMILAR queries |
| Unified(UnifiedResult) | Cross-engine query results | FIND queries |
| TableList(Vec<String>) | List of table names | SHOW TABLES |
| Blob(Vec<u8>) | Blob data bytes | BLOB GET |
| ArtifactInfo(ArtifactInfoResult) | Blob artifact metadata | BLOB INFO |
| ArtifactList(Vec<String>) | List of artifact IDs | BLOBS LIST |
| BlobStats(BlobStatsResult) | Blob storage statistics | BLOB STATS |
| CheckpointList(Vec<CheckpointInfo>) | List of checkpoints | CHECKPOINTS |
| Chain(ChainResult) | Chain operation result | CHAIN queries |
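A reduced stand-in for the enum (payloads simplified for illustration) shows how a caller typically dispatches on the variant:

```rust
// Reduced QueryResult stand-in, for illustration only.
#[derive(Debug)]
enum QueryResult {
    Empty,
    Count(usize),
    Rows(Vec<Vec<String>>),
}

fn summarize(result: &QueryResult) -> String {
    match result {
        QueryResult::Empty => "ok".to_string(),
        QueryResult::Count(n) => format!("{n} affected"),
        QueryResult::Rows(rows) => format!("{} row(s)", rows.len()),
    }
}

fn main() {
    assert_eq!(summarize(&QueryResult::Count(3)), "3 affected");
    assert_eq!(summarize(&QueryResult::Rows(vec![vec!["1".into()]])), "1 row(s)");
    assert_eq!(summarize(&QueryResult::Empty), "ok");
}
```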

RouterError Types

| Error | Cause | Recovery |
|---|---|---|
| ParseError | Invalid query syntax | Fix query syntax |
| UnknownCommand | Unknown command or keyword | Check command spelling |
| RelationalError | Error from relational engine | Check table/column names |
| GraphError | Error from graph engine | Verify node/edge IDs |
| VectorError | Error from vector engine | Check embedding dimensions |
| VaultError | Error from vault | Verify permissions |
| CacheError | Error from cache | Check cache configuration |
| BlobError | Error from blob storage | Verify artifact exists |
| CheckpointError | Error from checkpoint system | Check blob store initialized |
| ChainError | Error from chain system | Verify chain initialized |
| InvalidArgument | Invalid argument value | Check argument types |
| MissingArgument | Missing required argument | Provide required args |
| TypeMismatch | Type mismatch in query | Check value types |
| AuthenticationRequired | Vault operations require identity | Call SET IDENTITY first |

Error Propagation

The router implements From traits to convert engine-specific errors:

#![allow(unused)]
fn main() {
// Errors from underlying engines are automatically converted
impl From<RelationalError> for RouterError {
    fn from(e: RelationalError) -> Self {
        RouterError::RelationalError(e.to_string())
    }
}

impl From<GraphError> for RouterError { ... }
impl From<VectorError> for RouterError { ... }
impl From<VaultError> for RouterError { ... }
impl From<CacheError> for RouterError { ... }
impl From<BlobError> for RouterError { ... }
impl From<CheckpointError> for RouterError { ... }
impl From<ChainError> for RouterError { ... }
impl From<UnifiedError> for RouterError { ... }
}

This allows using the ? operator throughout execution methods:

#![allow(unused)]
fn main() {
fn exec_select(&self, select: &SelectStmt) -> Result<QueryResult> {
    // RelationalError automatically converts to RouterError
    let rows = self.relational.select_columnar(table_name, condition, options)?;
    Ok(QueryResult::Rows(rows))
}
}

Architecture

graph TB
    subgraph QueryRouter
        Execute[execute_parsed]
        ExecuteAsync[execute_parsed_async]
        Distributed[try_execute_distributed]
        Cache[Query Cache]
        Statement[execute_statement]
        StatementAsync[execute_statement_async]
    end

    Execute --> Distributed
    ExecuteAsync --> StatementAsync
    Distributed -->|cluster active| ScatterGather[Scatter-Gather]
    Distributed -->|local| Cache
    Cache -->|cache hit| Return[Return Result]
    Cache -->|cache miss| Statement

    Statement --> Relational[RelationalEngine]
    Statement --> Graph[GraphEngine]
    Statement --> Vector[VectorEngine]
    Statement --> Vault[Vault]
    Statement --> CacheOps[Cache Operations]
    Statement --> Blob[BlobStore]
    Statement --> Checkpoint[CheckpointManager]
    Statement --> Chain[TensorChain]
    Statement --> Cluster[ClusterOrchestrator]

    subgraph Engines
        Relational
        Graph
        Vector
    end

    subgraph Optional Services
        Vault
        CacheOps
        Blob
        Checkpoint
        Chain
        Cluster
    end

    Relational --> Store[TensorStore]
    Graph --> Store
    Vector --> Store

Internal Router Structure

#![allow(unused)]
fn main() {
pub struct QueryRouter {
    // Core engines (always initialized)
    relational: Arc<RelationalEngine>,
    graph: Arc<GraphEngine>,
    vector: Arc<VectorEngine>,

    // Unified engine for cross-engine queries (lazily initialized)
    unified: Option<UnifiedEngine>,

    // Optional services (require explicit initialization)
    vault: Option<Arc<Vault>>,
    cache: Option<Arc<Cache>>,
    blob: Option<Arc<tokio::sync::Mutex<BlobStore>>>,
    blob_runtime: Option<Arc<Runtime>>,
    checkpoint: Option<Arc<tokio::sync::Mutex<CheckpointManager>>>,
    chain: Option<Arc<TensorChain>>,

    // Cluster mode
    cluster: Option<Arc<ClusterOrchestrator>>,
    cluster_runtime: Option<Arc<Runtime>>,
    distributed_planner: Option<Arc<QueryPlanner>>,
    distributed_config: DistributedQueryConfig,
    local_shard_id: ShardId,

    // Authentication state
    current_identity: Option<String>,

    // Vector index for fast similarity search
    hnsw_index: Option<(HNSWIndex, Vec<String>)>,
}
}

Initialization

#![allow(unused)]
fn main() {
use query_router::QueryRouter;
use tensor_store::TensorStore;

// Create with independent engines
let router = QueryRouter::new();

// Create with existing engines
let router = QueryRouter::with_engines(relational, graph, vector);

// Create with shared storage (enables unified entities)
let store = TensorStore::new();
let router = QueryRouter::with_shared_store(store);
}

Constructor Comparison

| Constructor | UnifiedEngine | Use Case |
|---|---|---|
| new() | No | Simple single-engine queries |
| with_engines(...) | No | Custom engine configuration |
| with_shared_store(...) | Yes | Cross-engine unified queries |

Shared Store Benefits

When using with_shared_store(), all engines share the same underlying TensorStore:

#![allow(unused)]
fn main() {
pub fn with_shared_store(store: TensorStore) -> Self {
    let relational = Arc::new(RelationalEngine::with_store(store.clone()));
    let graph = Arc::new(GraphEngine::with_store(store.clone()));
    let vector = Arc::new(VectorEngine::with_store(store.clone()));
    let unified = UnifiedEngine::with_engines(
        store,
        Arc::clone(&relational),
        Arc::clone(&graph),
        Arc::clone(&vector),
    );
    // ...
}
}

This enables:

  • Cross-engine queries via UnifiedEngine
  • Entity-level operations spanning all modalities
  • Consistent view of data across engines

Query Execution

Execution Methods

| Method | Parser | Async | Distributed | Cache |
|---|---|---|---|---|
| execute(command) | Regex (legacy) | No | No | No |
| execute_parsed(command) | AST | No | Yes | Yes |
| execute_parsed_async(command) | AST | Yes | No | Yes |
| execute_statement(stmt) | Pre-parsed | No | No | No |
| execute_statement_async(stmt) | Pre-parsed | Yes | No | No |

Execution Flow

flowchart TD
    A[execute_parsed] --> B{Cluster Active?}
    B -->|Yes| C[try_execute_distributed]
    B -->|No| D[Parse Command]

    C --> E{Plan Type}
    E -->|Local| D
    E -->|Remote| F[execute_on_shard]
    E -->|ScatterGather| G[execute_scatter_gather]

    D --> H{Cacheable?}
    H -->|Yes| I{Cache Hit?}
    H -->|No| J[execute_statement]

    I -->|Yes| K[Return Cached]
    I -->|No| J

    J --> L[Engine Dispatch]
    L --> M{Write Op?}
    M -->|Yes| N[Invalidate Cache]
    M -->|No| O[Cache Result]

    O --> P[Return Result]
    N --> P
    K --> P
    F --> P
    G --> P

Detailed Execution Steps

  1. Distributed Check: If cluster is active, try_execute_distributed plans query execution
  2. Parse: Convert command string to AST via neumann_parser
  3. Cache Check: For cacheable queries (SELECT, SIMILAR, NEIGHBORS, PATH), check cache first
  4. Execute: Dispatch to appropriate engine based on StatementKind
  5. Cache Update: Store result for cacheable queries (as JSON via serde)
  6. Invalidate: Clear entire cache on write operations (INSERT, UPDATE, DELETE, DDL)

#![allow(unused)]
fn main() {
// Synchronous execution
let result = router.execute_parsed("SELECT * FROM users")?;

// Async execution
let result = router.execute_parsed_async("SELECT * FROM users").await?;

// Concurrent queries
let (users, posts, similar) = tokio::join!(
    router.execute_parsed_async("SELECT * FROM users"),
    router.execute_parsed_async("SELECT * FROM posts"),
    router.execute_parsed_async("SIMILAR 'doc:1' LIMIT 10"),
);
}

Cache Key Generation

#![allow(unused)]
fn main() {
fn cache_key_for_query(command: &str) -> String {
    format!("query:{}", command.trim().to_lowercase())
}
}

This normalizes queries for cache lookup by trimming whitespace and lowercasing.
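Since the function is pure, its effect is easy to check directly: two spellings of the same query map to one cache key.

```rust
// Same normalization as above: trim whitespace, lowercase, prefix.
fn cache_key_for_query(command: &str) -> String {
    format!("query:{}", command.trim().to_lowercase())
}

fn main() {
    let a = cache_key_for_query("  SELECT * FROM users ");
    let b = cache_key_for_query("select * from users");
    assert_eq!(a, b);
    assert_eq!(a, "query:select * from users");
}
```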

Statement Routing

The router dispatches statements based on their StatementKind:

flowchart LR
    subgraph StatementKind
        SQL[Select/Insert/Update/Delete]
        DDL[CreateTable/DropTable/CreateIndex/DropIndex]
        Graph[Node/Edge/Neighbors/Path]
        Vector[Embed/Similar]
        Unified[Find/Entity]
        Services[Vault/Cache/Blob/Checkpoint/Chain/Cluster]
    end

    SQL --> RE[RelationalEngine]
    DDL --> RE
    Graph --> GE[GraphEngine]
    Vector --> VE[VectorEngine]
    Unified --> UE[UnifiedEngine]
    Services --> Svc[Optional Services]

Complete Statement Routing Table

| Statement Type | Engine | Handler Method | Operations |
|---|---|---|---|
| Select | Relational | exec_select | Table queries with WHERE, JOIN, GROUP BY, ORDER BY |
| Insert | Relational | exec_insert | Single/multi-row insert, INSERT…SELECT |
| Update | Relational | exec_update | Row updates with conditions |
| Delete | Relational | exec_delete | Row deletion with protection |
| CreateTable | Relational | exec_create_table | Table DDL |
| DropTable | Relational | inline | Table removal with protection |
| CreateIndex | Relational | inline | Index creation |
| DropIndex | Relational | inline | Index removal with protection |
| ShowTables | Relational | inline | List tables |
| Describe | Multiple | exec_describe | Schema/node/edge info |
| Node | Graph | exec_node | CREATE/GET/DELETE/LIST/UPDATE |
| Edge | Graph | exec_edge | CREATE/GET/DELETE/LIST/UPDATE |
| Neighbors | Graph | exec_neighbors | Neighbor traversal |
| Path | Graph | exec_path | Path finding |
| Embed | Vector | exec_embed | Embedding storage, batch, delete |
| Similar | Vector | exec_similar | k-NN search |
| ShowEmbeddings | Vector | inline | List embedding keys |
| CountEmbeddings | Vector | inline | Count embeddings |
| Find | Unified | exec_find | Cross-engine queries |
| Entity | Unified | exec_entity | Entity CRUD |
| Vault | Vault | exec_vault | Secret management |
| Cache | Cache | exec_cache | LLM response cache |
| Blob | BlobStore | exec_blob | Artifact operations |
| Blobs | BlobStore | exec_blobs | Artifact listing |
| Checkpoint | Checkpoint | exec_checkpoint | Create snapshot |
| Rollback | Checkpoint | exec_rollback | Restore snapshot |
| Checkpoints | Checkpoint | exec_checkpoints | List snapshots |
| Chain | TensorChain | exec_chain | Blockchain operations |
| Cluster | Orchestrator | exec_cluster | Cluster management |
| Empty | n/a | inline | No-op |

Statement Handler Pattern

Each handler follows a consistent pattern:

#![allow(unused)]
fn main() {
fn exec_<statement>(&self, stmt: &<Statement>Stmt) -> Result<QueryResult> {
    // 1. Validate/extract parameters
    let param = self.eval_string_expr(&stmt.field)?;

    // 2. Check service availability (for optional services)
    let service = self.service.as_ref()
        .ok_or_else(|| RouterError::ServiceError("Service not initialized".to_string()))?;

    // 3. For destructive ops, check protection
    if is_destructive {
        match self.protect_destructive_op(...)? {
            ProtectedOpResult::Cancelled => return Err(...),
            ProtectedOpResult::Proceed => {},
        }
    }

    // 4. Execute operation
    let result = service.operation(...)?;

    // 5. Convert to QueryResult
    Ok(QueryResult::Variant(result))
}
}

Supported Queries

Relational Operations

-- DDL
CREATE TABLE users (id INT, name VARCHAR(100), email VARCHAR(255))
DROP TABLE users

-- DML
INSERT INTO users (id, name, email) VALUES (1, 'Alice', 'alice@example.com')
INSERT INTO users SELECT * FROM temp_users
UPDATE users SET name = 'Bob' WHERE id = 1
DELETE FROM users WHERE id = 1

-- Queries
SELECT * FROM users WHERE id = 1
SELECT id, name FROM users ORDER BY name ASC LIMIT 10 OFFSET 5
SELECT COUNT(*), AVG(age) FROM users WHERE active = true GROUP BY dept HAVING COUNT(*) > 5

-- JOINs
SELECT * FROM users u INNER JOIN orders o ON u.id = o.user_id
SELECT * FROM users u LEFT JOIN profiles p ON u.id = p.user_id
SELECT * FROM a CROSS JOIN b
SELECT * FROM a NATURAL JOIN b

Aggregate Functions

| Function | Description | Null Handling |
|---|---|---|
| COUNT(*) | Count all rows | Counts nulls |
| COUNT(col) | Count non-null values | Excludes nulls |
| SUM(col) | Sum numeric values | Skips nulls |
| AVG(col) | Average numeric values | Skips nulls, returns NULL if no values |
| MIN(col) | Minimum value | Skips nulls |
| MAX(col) | Maximum value | Skips nulls |
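AVG's NULL handling can be sketched with Option values standing in for SQL NULL (illustrative, not the engine's implementation): NULLs are skipped, and an all-NULL input yields NULL.

```rust
// AVG over nullable values: skip None, return None when nothing remains.
fn avg(values: &[Option<f64>]) -> Option<f64> {
    let nums: Vec<f64> = values.iter().filter_map(|v| *v).collect();
    if nums.is_empty() {
        None
    } else {
        Some(nums.iter().sum::<f64>() / nums.len() as f64)
    }
}

fn main() {
    assert_eq!(avg(&[Some(2.0), None, Some(4.0)]), Some(3.0));
    assert_eq!(avg(&[None, None]), None);
}
```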

Graph Operations

-- Node operations
NODE CREATE person {name: 'Alice', age: 30}
NODE GET 123
NODE DELETE 123
NODE LIST person LIMIT 100
NODE UPDATE 123 {name: 'Alice Smith'}

-- Edge operations
EDGE CREATE person:1 friend person:2 {since: 2020}
EDGE GET 456
EDGE DELETE 456
EDGE LIST friend LIMIT 50

-- Traversals
NEIGHBORS person:1 friend OUTGOING
NEIGHBORS 123 * BOTH
PATH person:1 TO person:5 VIA friend

Vector Operations

-- Single embedding
EMBED doc1 [0.1, 0.2, 0.3, 0.4]
EMBED DELETE doc1

-- Batch embedding
EMBED BATCH [('key1', [0.1, 0.2]), ('key2', [0.3, 0.4])]

-- Similarity search
SIMILAR 'doc1' LIMIT 5
SIMILAR 'doc1' LIMIT 5 EUCLIDEAN
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 COSINE

-- Listing
SHOW EMBEDDINGS LIMIT 100
COUNT EMBEDDINGS

Distance Metrics

| Metric | Description | Use Case | Formula |
|---|---|---|---|
| COSINE | Cosine similarity (default) | Semantic similarity | 1 - (a.b) / (‖a‖ * ‖b‖) |
| EUCLIDEAN | Euclidean distance (L2) | Spatial distance | sqrt(sum((a_i - b_i)^2)) |
| DOT_PRODUCT | Dot product | Magnitude-aware similarity | sum(a_i * b_i) |
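The three formulas, computed on raw vectors (a sketch, not the engine's API):

```rust
// Dot product: sum(a_i * b_i)
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// Euclidean (L2) distance: sqrt(sum((a_i - b_i)^2))
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

// Cosine distance: 1 - (a.b) / (‖a‖ * ‖b‖)
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    let norm = |v: &[f64]| dot(v, v).sqrt();
    1.0 - dot(a, b) / (norm(a) * norm(b))
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    assert_eq!(dot(&a, &b), 0.0);
    assert!((euclidean(&a, &b) - 2.0_f64.sqrt()).abs() < 1e-12);
    assert!((cosine_distance(&a, &b) - 1.0).abs() < 1e-12); // orthogonal vectors
}
```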

Unified Entity Operations

-- Create entity with all modalities
ENTITY CREATE 'user:1' {name: 'Alice'} EMBEDDING [0.1, 0.2, 0.3]

-- Connect entities
ENTITY CONNECT 'user:1' -> 'doc:1' : authored

-- Combined similarity + graph search
SIMILAR 'query:key' CONNECTED TO 'hub:entity' LIMIT 10

Cross-Engine Queries

Cross-engine queries combine graph relationships with vector similarity:

#![allow(unused)]
fn main() {
let store = TensorStore::new();
let mut router = QueryRouter::with_shared_store(store);

// Set up entities with embeddings
router.vector().set_entity_embedding("user:1", vec![0.1, 0.2, 0.3])?;
router.vector().set_entity_embedding("user:2", vec![0.15, 0.25, 0.35])?;

// Connect via graph edges
router.connect_entities("user:1", "user:2", "follows")?;

// Build HNSW index for O(log n) similarity search
router.build_vector_index()?;

// Find neighbors sorted by similarity
let results = router.find_neighbors_by_similarity("user:1", &query_vec, 10)?;

// Find similar AND connected entities
let results = router.find_similar_connected("user:1", "user:2", 5)?;
}

Cross-Engine Methods

| Method | Description | Complexity |
|---|---|---|
| build_vector_index() | Build HNSW index for O(log n) search | O(n log n) |
| connect_entities(from, to, type) | Add graph edge between entities | O(1) |
| find_neighbors_by_similarity(key, query, k) | Neighbors sorted by vector similarity | O(k * log n) with HNSW |
| find_similar_connected(query, connected_to, k) | Similar AND connected entities | O(k * log n) + O(neighbors) |
| create_unified_entity(key, fields, embedding) | Create entity with all modalities | O(1) |

Implementation Details

The find_similar_connected method combines vector and graph operations:

#![allow(unused)]
fn main() {
pub fn find_similar_connected(
    &self,
    query_key: &str,
    connected_to: &str,
    top_k: usize,
) -> Result<Vec<UnifiedItem>> {
    let query_embedding = self.vector.get_entity_embedding(query_key)?;

    // Use HNSW index if available, otherwise brute-force
    let similar = if let Some((ref index, ref keys)) = self.hnsw_index {
        self.vector.search_with_hnsw(index, keys, &query_embedding, top_k * 2)?
    } else {
        self.vector.search_entities(&query_embedding, top_k * 2)?
    };

    // Get graph neighbors of connected_to entity
    let connected_neighbors: HashSet<String> = self.graph
        .get_entity_neighbors(connected_to)
        .unwrap_or_default()
        .into_iter()
        .collect();

    // Filter to entities that are both similar AND connected
    let items: Vec<UnifiedItem> = similar
        .into_iter()
        .filter(|s| connected_neighbors.contains(&s.key))
        .take(top_k)
        .map(|s| UnifiedItem::new("vector+graph", &s.key).with_score(s.score))
        .collect();

    Ok(items)
}
}

Optional Services

Services are lazily initialized and can be enabled as needed:

flowchart TD
    subgraph Initialization Order
        A[QueryRouter::new] --> B[Core Engines Ready]
        B --> C{Need Vault?}
        C -->|Yes| D[init_vault]
        B --> E{Need Cache?}
        E -->|Yes| F[init_cache]
        B --> G{Need Blob?}
        G -->|Yes| H[init_blob]
        H --> I{Need Checkpoint?}
        I -->|Yes| J[init_checkpoint]
        B --> K{Need Chain?}
        K -->|Yes| L[init_chain]
        B --> M{Need Cluster?}
        M -->|Yes| N[init_cluster]
    end

    style J fill:#ffcccc
    note[Checkpoint requires Blob]

Vault

#![allow(unused)]
fn main() {
// Initialize with master key
router.init_vault(master_key)?;

// Or auto-initialize from NEUMANN_VAULT_KEY env var
router.ensure_vault()?;

// Set identity for access control
router.set_identity("user:alice");
}

Vault requires authentication for all operations:

#![allow(unused)]
fn main() {
fn exec_vault(&self, stmt: &VaultStmt) -> Result<QueryResult> {
    let vault = self.vault.as_ref()
        .ok_or_else(|| RouterError::VaultError("Vault not initialized".to_string()))?;

    // SECURITY: Require explicit authentication
    let identity = self.require_identity()?;

    match &stmt.operation {
        VaultOp::Get { key } => {
            let value = vault.get(identity, &key)?;
            Ok(QueryResult::Value(value))
        },
        // ...
    }
}
}

Cache

#![allow(unused)]
fn main() {
// Default configuration
router.init_cache();

// Custom configuration
router.init_cache_with_config(CacheConfig::default())?;

// Auto-initialize
router.ensure_cache();
}

Cache operations are available through queries:

CACHE INIT
CACHE STATS
CACHE CLEAR
CACHE EVICT 100
CACHE GET 'key'
CACHE PUT 'key' 'value'
CACHE SEMANTIC GET 'query' THRESHOLD 0.9
CACHE SEMANTIC PUT 'query' 'response' [0.1, 0.2, 0.3]

Blob Storage

#![allow(unused)]
fn main() {
// Initialize blob store
router.init_blob()?;
router.start_blob()?;  // Start GC

// Graceful shutdown
router.shutdown_blob()?;
}

Blob operations use async execution internally:

#![allow(unused)]
fn main() {
fn exec_blob(&self, stmt: &BlobStmt) -> Result<QueryResult> {
    let blob = self.blob.as_ref()
        .ok_or_else(|| RouterError::BlobError("Blob store not initialized".to_string()))?;
    let runtime = self.blob_runtime.as_ref()
        .ok_or_else(|| RouterError::BlobError("Blob runtime not initialized".to_string()))?;

    match &stmt.operation {
        BlobOp::Put { filename, data, .. } => {
            let artifact_id = runtime.block_on(async {
                let blob_guard = blob.lock().await;
                blob_guard.put(&filename, &data, options).await
            })?;
            Ok(QueryResult::Value(artifact_id))
        },
        // ...
    }
}
}

Checkpoint

#![allow(unused)]
fn main() {
// Requires blob storage
router.init_blob()?;
router.init_checkpoint()?;

// Set confirmation handler for destructive ops
router.set_confirmation_handler(handler)?;
}

Checkpoint provides automatic protection for destructive operations:

#![allow(unused)]
fn main() {
fn protect_destructive_op(
    &self,
    command: &str,
    op: DestructiveOp,
    sample_data: Vec<String>,
) -> Result<ProtectedOpResult> {
    let Some(checkpoint) = self.checkpoint.as_ref() else {
        return Ok(ProtectedOpResult::Proceed);
    };

    runtime.block_on(async {
        let cp = checkpoint.lock().await;

        if !cp.auto_checkpoint_enabled() {
            return Ok(ProtectedOpResult::Proceed);
        }

        let preview = cp.generate_preview(&op, sample_data);

        if !cp.request_confirmation(&op, &preview) {
            return Ok(ProtectedOpResult::Cancelled);
        }

        // Create auto-checkpoint before operation
        cp.create_auto(command, op, preview, store).await?;

        Ok(ProtectedOpResult::Proceed)
    })
}
}

Protected operations include:

  • DELETE (relational rows)
  • DROP TABLE
  • DROP INDEX
  • NODE DELETE
  • EMBED DELETE
  • VAULT DELETE
  • BLOB DELETE
  • CACHE CLEAR

Chain

#![allow(unused)]
fn main() {
// Initialize tensor chain
router.init_chain("node_1")?;

// Auto-initialize with default node ID
router.ensure_chain()?;
}

Chain operations are available through queries:

CHAIN BEGIN
CHAIN COMMIT
CHAIN ROLLBACK 100
CHAIN HISTORY 'key'
CHAIN HEIGHT
CHAIN TIP
CHAIN BLOCK 42
CHAIN VERIFY
CHAIN SHOW CODEBOOK GLOBAL
CHAIN SHOW CODEBOOK LOCAL 'domain'
CHAIN ANALYZE TRANSITIONS

Cluster

#![allow(unused)]
fn main() {
// Initialize cluster mode
router.init_cluster("node_1", bind_addr, &peers)?;

// Check cluster status
if router.is_cluster_active() {
    // Distributed queries enabled
}

// Graceful shutdown
router.shutdown_cluster()?;
}

Cluster initialization creates:

  1. ClusterOrchestrator for Raft consensus
  2. ConsistentHashPartitioner for key-based routing
  3. QueryPlanner for distributed execution
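The ConsistentHashPartitioner places keys on a hash ring so each key has exactly one owning shard. Below is a minimal sketch of the idea, with virtual nodes for balance; the `Ring` type and hashing scheme are illustrative assumptions, not the crate's actual implementation:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal consistent-hash ring: each shard gets several virtual
/// points on the ring; a key routes to the first point at or after
/// its own hash, wrapping around to the start if none follows.
struct Ring {
    points: BTreeMap<u64, String>, // hash -> shard id
}

impl Ring {
    fn new(shards: &[&str], vnodes: usize) -> Self {
        let mut points = BTreeMap::new();
        for shard in shards {
            for i in 0..vnodes {
                points.insert(hash(&format!("{shard}#{i}")), shard.to_string());
            }
        }
        Ring { points }
    }

    fn shard_for(&self, key: &str) -> &str {
        let h = hash(key);
        // First virtual node clockwise from the key's hash.
        self.points
            .range(h..)
            .next()
            .or_else(|| self.points.iter().next())
            .map(|(_, s)| s.as_str())
            .unwrap()
    }
}

fn hash(s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    s.hash(&mut h);
    h.finish()
}
```

With virtual nodes, adding or removing a shard only remaps the keys adjacent to its points on the ring, rather than rehashing every key.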

Distributed Query Execution

When cluster mode is active, queries are automatically distributed:

flowchart TD
    A[Query] --> B[QueryPlanner]
    B --> C{classify_query}

    C -->|GET key| D{partition key}
    D -->|Local| E[QueryPlan::Local]
    D -->|Remote| F[QueryPlan::Remote]

    C -->|SIMILAR| G[QueryPlan::ScatterGather]
    C -->|SELECT *| G
    C -->|COUNT| H[QueryPlan::ScatterGather + Aggregate]
    C -->|Unknown| E

    F --> I[execute_on_shard]
    G --> J[execute_scatter_gather]
    H --> J

    J --> K[ResultMerger::merge]
    K --> L[QueryResult]

Query Classification

The QueryPlanner classifies queries based on text pattern matching:

#![allow(unused)]
fn main() {
fn classify_query(&self, query: &str) -> QueryType {
    let query_upper = query.to_uppercase();

    // Point lookups
    if query_upper.starts_with("GET ")
       || query_upper.starts_with("NODE GET ")
       || query_upper.starts_with("ENTITY GET ") {
        if let Some(key) = self.extract_key(query) {
            return QueryType::PointLookup { key };
        }
    }

    // Similarity search
    if query_upper.starts_with("SIMILAR ") {
        let k = self.extract_top_k(query).unwrap_or(10);
        return QueryType::SimilaritySearch { k };
    }

    // Table scans with aggregates
    if query_upper.starts_with("SELECT ") {
        if query_upper.contains("COUNT(") {
            return QueryType::Aggregate { func: AggregateFunction::Count };
        }
        if query_upper.contains("SUM(") {
            return QueryType::Aggregate { func: AggregateFunction::Sum };
        }
        return QueryType::TableScan;
    }

    QueryType::Unknown
}
}

Query Plans

Plan          | When Used                          | Example                  | Shards Contacted
--------------+------------------------------------+--------------------------+-----------------
Local         | Point lookups on local shard       | GET user:1 (local key)   | 1
Remote        | Point lookups on remote shard      | GET user:2 (remote key)  | 1
ScatterGather | Full scans, aggregates, similarity | SELECT *, SIMILAR, COUNT | All
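The routing decision behind these plans can be sketched as an enum plus a key-ownership check. The names and the toy partitioner below are assumptions for illustration, not the actual crate types:

```rust
/// Illustrative plan shapes (names assumed, not the crate's real types).
#[derive(Debug, PartialEq)]
enum Plan {
    Local,                   // key lives on this shard
    Remote { shard: usize }, // key lives on one other shard
    ScatterGather,           // query must visit every shard
}

/// Route a point lookup by key ownership; everything else fans out.
fn plan_for(key: Option<&str>, local_shard: usize, shards: usize) -> Plan {
    match key {
        Some(k) => {
            // Toy partitioner: map the key's bytes onto a shard index.
            let owner = k.bytes().map(|b| b as usize).sum::<usize>() % shards;
            if owner == local_shard {
                Plan::Local
            } else {
                Plan::Remote { shard: owner }
            }
        }
        // No single key (SELECT *, SIMILAR, COUNT): contact all shards.
        None => Plan::ScatterGather,
    }
}
```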

Merge Strategies

Strategy        | Description               | Use Case             | Algorithm
----------------+---------------------------+----------------------+------------------------------
Union           | Combine all results       | SELECT, NODE queries | Concatenate rows/nodes/edges
TopK(k)         | Keep top K by score       | SIMILAR queries      | Sort by score desc, truncate
Aggregate(func) | SUM, COUNT, AVG, MAX, MIN | Aggregate queries    | Combine partial aggregates
FirstNonEmpty   | First result found        | Point lookups        | Short-circuit on first result
Concat          | Concatenate in order      | Ordered results      | Same as Union

Result Merger Implementation

#![allow(unused)]
fn main() {
impl ResultMerger {
    pub fn merge(results: Vec<ShardResult>, strategy: &MergeStrategy) -> Result<QueryResult> {
        // Filter out errors if not fail-fast
        let successful: Vec<_> = results.into_iter()
            .filter(|r| r.error.is_none())
            .collect();

        if successful.is_empty() {
            return Ok(QueryResult::Empty);
        }

        match strategy {
            MergeStrategy::Union => Self::merge_union(successful),
            MergeStrategy::TopK(k) => Self::merge_top_k(successful, *k),
            MergeStrategy::Aggregate(func) => Self::merge_aggregate(successful, *func),
            MergeStrategy::FirstNonEmpty => Self::merge_first_non_empty(successful),
            MergeStrategy::Concat => Self::merge_concat(successful),
        }
    }

    fn merge_top_k(results: Vec<ShardResult>, k: usize) -> Result<QueryResult> {
        let mut all_similar: Vec<SimilarResult> = Vec::new();

        for shard_result in results {
            if let QueryResult::Similar(similar) = shard_result.result {
                all_similar.extend(similar);
            }
        }

        // Sort by score descending
        all_similar.sort_by(|a, b|
            b.score.partial_cmp(&a.score).unwrap_or(std::cmp::Ordering::Equal)
        );

        // Take top K
        all_similar.truncate(k);

        Ok(QueryResult::Similar(all_similar))
    }
}
}

Distributed Query Configuration

#![allow(unused)]
fn main() {
pub struct DistributedQueryConfig {
    /// Maximum concurrent shard queries (default: 10)
    pub max_concurrent: usize,
    /// Query timeout per shard in milliseconds (default: 5000)
    pub shard_timeout_ms: u64,
    /// Retry count for failed shards (default: 2)
    pub retry_count: usize,
    /// Whether to fail fast on first shard error (default: false)
    pub fail_fast: bool,
}
}
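The documented defaults could be wired up through `Default` as sketched below; this is an assumption about the construction style, and only the field names and default values come from the definition above:

```rust
/// Mirror of the documented fields with the documented defaults.
pub struct DistributedQueryConfig {
    pub max_concurrent: usize,
    pub shard_timeout_ms: u64,
    pub retry_count: usize,
    pub fail_fast: bool,
}

impl Default for DistributedQueryConfig {
    fn default() -> Self {
        Self {
            max_concurrent: 10,     // documented default
            shard_timeout_ms: 5000, // 5 second per-shard timeout
            retry_count: 2,
            fail_fast: false,
        }
    }
}
```

Callers can then override individual fields with struct-update syntax, e.g. `DistributedQueryConfig { fail_fast: true, ..Default::default() }`.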

Semantic Routing

For embedding-aware routing, use plan_with_embedding:

#![allow(unused)]
fn main() {
pub fn plan_with_embedding(&self, query: &str, embedding: &[f32]) -> QueryPlan {
    // Get semantically relevant shards
    let relevant_shards = self.shards_for_embedding(embedding);

    if relevant_shards.is_empty() {
        return self.plan(query);  // Fallback to all shards
    }

    // Route similarity search to relevant shards only
    match self.classify_query(query) {
        QueryType::SimilaritySearch { k } => QueryPlan::ScatterGather {
            shards: relevant_shards,
            query: query.to_string(),
            merge: MergeStrategy::TopK(k),
        },
        _ => self.plan(query),
    }
}
}

Performance Characteristics

Operation              | Complexity        | Notes
-----------------------+-------------------+-------------------------------
Parse                  | O(n)              | n = query length
SELECT                 | O(m)              | m = rows in table
SELECT with index      | O(log m + k)      | k = matching rows
INSERT                 | O(1)              | Single row insert
NODE                   | O(1)              | Single node create
EDGE                   | O(1)              | Single edge create
PATH                   | O(V+E)            | BFS traversal
SIMILAR (brute-force)  | O(n*d)            | n = embeddings, d = dimensions
SIMILAR (HNSW)         | O(log n * d)      | After build_vector_index()
find_similar_connected | O(log n) or O(n)  | Uses HNSW if index built
Distributed query      | O(query) / shards | Parallelized across shards
Result merge (Union)   | O(total results)  | Linear in combined size
Result merge (TopK)    | O(n log n)        | Sort all results, then truncate

HNSW Index Performance

Entities | Brute-force | With HNSW | Speedup
---------+-------------+-----------+---------
200      | 4.17s       | 9.3us     | 448,000x

Distributed Query Overhead

Operation            | Overhead
---------------------+-------------------------------------
Query planning       | ~1-5 us
Network round-trip   | ~1-10 ms (depends on network)
Result serialization | ~10-100 us (depends on result size)
Result merging       | ~1-10 us (TopK), O(n) for Union

Query Caching

When a cache is configured, results of cacheable statements are stored automatically:

  • Cacheable: SELECT, SIMILAR, NEIGHBORS, PATH
  • Write operations (INSERT, UPDATE, DELETE, and DDL) invalidate the entire cache

#![allow(unused)]
fn main() {
fn is_cacheable_statement(stmt: &Statement) -> bool {
    matches!(&stmt.kind,
        StatementKind::Select(_)
        | StatementKind::Similar(_)
        | StatementKind::Neighbors(_)
        | StatementKind::Path(_)
    )
}

fn is_write_statement(stmt: &Statement) -> bool {
    matches!(&stmt.kind,
        StatementKind::Insert(_)
        | StatementKind::Update(_)
        | StatementKind::Delete(_)
        | StatementKind::CreateTable(_)
        | StatementKind::DropTable(_)
        | StatementKind::CreateIndex(_)
        | StatementKind::DropIndex(_)
    )
}
}

Cache Usage Example

#![allow(unused)]
fn main() {
// Enable caching
router.init_cache();

// First call executes and caches (JSON serialization)
let result1 = router.execute_parsed("SELECT * FROM users")?;

// Second call returns cached result (JSON deserialization)
let result2 = router.execute_parsed("SELECT * FROM users")?;

// Write operations invalidate entire cache
router.execute_parsed("INSERT INTO users VALUES (2, 'Bob')")?;
// Cache is now empty
}

Cache Gotchas

  1. Full cache invalidation: Any write operation clears the entire cache. No table-level tracking.
  2. Case sensitivity: Cache keys are lowercased, so SELECT and select hit the same entry.
  3. Whitespace normalization: Queries are trimmed but not fully normalized.
  4. No TTL: Cached entries persist until invalidated by writes or explicit CACHE CLEAR.
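Gotchas 2 and 3 both follow from how the cache key is derived. A minimal sketch of that normalization, assuming trim-plus-lowercase with interior whitespace left untouched (as the gotchas describe):

```rust
/// Derive a cache key: trim the ends and lowercase, but do not
/// collapse interior whitespace (gotcha 3).
fn cache_key(query: &str) -> String {
    query.trim().to_lowercase()
}
```

This is why `SELECT` and `select` hit the same entry, while an extra interior space produces a distinct one.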

Best Practices

Service Initialization Order

#![allow(unused)]
fn main() {
// Initialize in dependency order
let mut router = QueryRouter::with_shared_store(store);

// Optional services (no dependencies)
router.init_vault(key)?;
router.init_cache();

// Blob first (required for checkpoint)
router.init_blob()?;
router.start_blob()?;

// Checkpoint depends on blob
router.init_checkpoint()?;
router.set_confirmation_handler(handler)?;

// Chain is independent
router.init_chain("node_1")?;

// Cluster is independent but typically last
router.init_cluster("node_1", addr, &peers)?;
}

Identity Management

#![allow(unused)]
fn main() {
// Always set identity before vault operations
router.set_identity("user:alice");

// Check authentication status
if !router.is_authenticated() {
    return Err(RouterError::AuthenticationRequired);
}

// Identity persists across queries
router.execute_parsed("VAULT GET 'secret'")?;  // Uses alice's identity
}

Error Handling

#![allow(unused)]
fn main() {
match router.execute_parsed(query) {
    Ok(result) => handle_result(result),
    Err(RouterError::ParseError(msg)) => println!("Invalid query: {}", msg),
    Err(RouterError::AuthenticationRequired) => println!("Please run SET IDENTITY first"),
    Err(RouterError::RelationalError(msg)) if msg.contains("not found") => {
        println!("Table not found");
    },
    Err(e) => println!("Error: {}", e),
}
}

Async vs Sync

#![allow(unused)]
fn main() {
// Use sync for simple scripts
let result = router.execute_parsed("SELECT * FROM users")?;

// Use async for concurrent operations
async fn parallel_queries(router: &QueryRouter) -> Result<()> {
    let (users, orders) = tokio::join!(
        router.execute_parsed_async("SELECT * FROM users"),
        router.execute_parsed_async("SELECT * FROM orders"),
    );
    // Both queries execute concurrently
    Ok(())
}

// Note: async execution doesn't support distributed routing yet
}

Building Vector Index

#![allow(unused)]
fn main() {
// Build index after loading embeddings
for (key, embedding) in embeddings {
    router.vector().set_entity_embedding(&key, embedding)?;
}

// Build HNSW index for fast similarity search
router.build_vector_index()?;

// Now SIMILAR queries use O(log n) search
let results = router.execute_parsed("SIMILAR 'query' LIMIT 10")?;
}

Module            | Relationship
------------------+--------------------------
Tensor Store      | Underlying storage layer
Relational Engine | Table operations
Graph Engine      | Node/edge operations
Vector Engine     | Embedding operations
Tensor Unified    | Cross-engine queries
Neumann Parser    | Query parsing
Tensor Vault      | Secret storage
Tensor Cache      | LLM response caching
Tensor Blob       | Artifact storage
Tensor Checkpoint | Snapshots
Tensor Chain      | Blockchain
Neumann Shell     | CLI interface

Neumann Shell Architecture

The Neumann Shell (neumann_shell) provides an interactive CLI interface for the Neumann database. It is a thin layer that delegates query execution to the Query Router while providing readline-based input handling, command history, output formatting, and crash recovery via write-ahead logging.

The shell follows four design principles:

  • Human-first interface: readable prompts, formatted output, command history
  • Thin layer: minimal logic, delegates to the Query Router
  • Graceful handling: Ctrl+C does not exit, errors are displayed cleanly
  • Zero configuration: works out of the box with sensible defaults

Key Types

Type                     | Description
-------------------------+---------------------------------------------------------------
Shell                    | Main shell struct holding router, config, and WAL state
ShellConfig              | Configuration for history file, history size, and prompt
CommandResult            | Result enum: Output, Exit, Help, Empty, Error
LoopAction               | Action after command: Continue or Exit
ShellError               | Error type for initialization failures
Wal                      | Internal write-ahead log for crash recovery
RouterExecutor           | Wrapper implementing QueryExecutor trait for cluster operations
ShellConfirmationHandler | Interactive confirmation handler for destructive operations

Shell Configuration

Field        | Type            | Default            | Description
-------------+-----------------+--------------------+-----------------------------
history_file | Option<PathBuf> | ~/.neumann_history | Path for persistent history
history_size | usize           | 1000               | Maximum history entries
prompt       | String          | "> "               | Input prompt string

The default history file location is determined by reading the HOME environment variable:

#![allow(unused)]
fn main() {
fn dirs_home() -> Option<PathBuf> {
    std::env::var_os("HOME").map(PathBuf::from)
}
}

Command Result Types

Variant        | Description                             | REPL Behavior
---------------+-----------------------------------------+--------------------------------
Output(String) | Query executed successfully with output | Print to stdout, continue loop
Exit           | Shell should exit                       | Print "Goodbye!", break loop
Help(String)   | Help text to display                    | Print to stdout, continue loop
Empty          | Empty input (no-op)                     | Continue loop silently
Error(String)  | Error occurred                          | Print to stderr, continue loop

REPL Loop Implementation

The shell implements a Read-Eval-Print Loop (REPL) using the rustyline crate for readline functionality. Here is the complete control flow:

flowchart TD
    A[Start run] --> B[Create Editor]
    B --> C[Load history file]
    C --> D[Set max history size]
    D --> E[Set confirmation handler if checkpoint available]
    E --> F[Print version banner]
    F --> G[readline with prompt]
    G --> H{Input result?}
    H -->|Ok line| I{Line empty?}
    I -->|No| J[Add to history]
    I -->|Yes| G
    J --> K[execute command]
    K --> L[process_result]
    L --> M{LoopAction?}
    M -->|Continue| G
    M -->|Exit| N[Save history]
    H -->|Ctrl+C| O[Print ^C]
    O --> G
    H -->|Ctrl+D EOF| P[Print Goodbye!]
    P --> N
    H -->|Error| Q[Print error]
    Q --> N
    N --> R[End]

Initialization Sequence

#![allow(unused)]
fn main() {
pub fn run(&mut self) -> Result<(), ShellError> {
    // 1. Create rustyline editor
    let editor: Editor<(), DefaultHistory> =
        DefaultEditor::new().map_err(|e| ShellError::Init(e.to_string()))?;
    let editor = Arc::new(Mutex::new(editor));

    // 2. Load existing history
    {
        let mut ed = editor.lock();
        if let Some(ref path) = self.config.history_file {
            let _ = ed.load_history(path);
        }
        ed.history_mut()
            .set_max_len(self.config.history_size)
            .map_err(|e| ShellError::Init(e.to_string()))?;
    }

    // 3. Set up confirmation handler for destructive operations
    {
        let router = self.router.read();
        if router.has_checkpoint() {
            let handler = Arc::new(ShellConfirmationHandler::new(Arc::clone(&editor)));
            drop(router);
            let router = self.router.write();
            if let Err(e) = router.set_confirmation_handler(handler) {
                eprintln!("Warning: Failed to set confirmation handler: {e}");
            }
        }
    }

    println!("Neumann Database Shell v{}", Self::version());
    println!("Type 'help' for available commands.\n");

    // 4. Main REPL loop
    loop {
        let readline_result = {
            let mut ed = editor.lock();
            ed.readline(&self.config.prompt)
        };

        match readline_result {
            Ok(line) => {
                if !line.trim().is_empty() {
                    let mut ed = editor.lock();
                    let _ = ed.add_history_entry(line.trim());
                }
                if Self::process_result(&self.execute(&line)) == LoopAction::Exit {
                    break;
                }
            },
            Err(ReadlineError::Interrupted) => println!("^C"),
            Err(ReadlineError::Eof) => {
                println!("Goodbye!");
                break;
            },
            Err(err) => {
                eprintln!("Error: {err}");
                break;
            },
        }
    }

    // 5. Save history on exit
    if let Some(ref path) = self.config.history_file {
        let mut ed = editor.lock();
        let _ = ed.save_history(path);
    }
    Ok(())
}
}

Command Execution Flow

flowchart TD
    A[execute input] --> B{Trim empty?}
    B -->|Yes| C[Return Empty]
    B -->|No| D[Convert to lowercase]
    D --> E{Built-in command?}
    E -->|exit/quit/\q| F[Return Exit]
    E -->|help/\h/\?| G[Return Help]
    E -->|tables/\dt| H[list_tables]
    E -->|clear/\c| I[Return ANSI clear]
    E -->|wal status| J[handle_wal_status]
    E -->|wal truncate| K[handle_wal_truncate]
    E -->|No match| L{Prefix match?}
    L -->|save compressed| M[handle_save_compressed]
    L -->|save| N[handle_save]
    L -->|load| O[handle_load]
    L -->|vault init| P[handle_vault_init]
    L -->|vault identity| Q[handle_vault_identity]
    L -->|cache init| R[handle_cache_init]
    L -->|cluster connect| S[handle_cluster_connect]
    L -->|cluster disconnect| T[handle_cluster_disconnect]
    L -->|None| U[router.execute_parsed]
    U --> V{Result?}
    V -->|Ok| W{is_write_command?}
    W -->|Yes| X{WAL active?}
    X -->|Yes| Y[wal.append]
    Y --> Z[Return Output]
    X -->|No| Z
    W -->|No| Z
    V -->|Err| AA[Return Error]

Usage Examples

Shell Creation

#![allow(unused)]
fn main() {
use neumann_shell::{Shell, ShellConfig};

// Default configuration
let shell = Shell::new();

// Custom configuration
let config = ShellConfig {
    history_file: Some("/custom/path/.neumann_history".into()),
    history_size: 500,
    prompt: "neumann> ".to_string(),
};
let shell = Shell::with_config(config);
}

Running the REPL

#![allow(unused)]
fn main() {
shell.run()?;
}

Programmatic Execution

#![allow(unused)]
fn main() {
use neumann_shell::CommandResult;

match shell.execute("SELECT * FROM users") {
    CommandResult::Output(text) => println!("{}", text),
    CommandResult::Error(err) => eprintln!("Error: {}", err),
    CommandResult::Exit => println!("Goodbye!"),
    CommandResult::Help(text) => println!("{}", text),
    CommandResult::Empty => {},
}
}

Direct Router Access

The shell provides thread-safe access to the underlying Query Router:

#![allow(unused)]
fn main() {
// Read-only access
let router_guard = shell.router();
let tables = router_guard.list_tables();

// Mutable access
let mut router_guard = shell.router_mut();
router_guard.init_vault(&key)?;

// Get Arc clone for shared ownership
let router_arc = shell.router_arc();
}

Built-in Commands

Command                | Aliases  | Description
-----------------------+----------+--------------------------------------------------------------
help                   | \h, \?   | Show help message
exit                   | quit, \q | Exit the shell
tables                 | \dt      | List all tables
clear                  | \c       | Clear the screen (ANSI escape: \x1B[2J\x1B[H)
save 'path'            |          | Save database snapshot to file
save compressed 'path' |          | Save compressed snapshot (int8 quantization)
load 'path'            |          | Load database snapshot from file (auto-detects format)
wal status             |          | Show write-ahead log status
wal truncate           |          | Clear the write-ahead log
vault init             |          | Initialize vault from NEUMANN_VAULT_KEY environment variable
vault identity 'name'  |          | Set current identity for vault access control
cache init             |          | Initialize semantic cache with default configuration
cluster connect        |          | Connect to cluster with specified node addresses
cluster disconnect     |          | Disconnect from cluster

Command Parsing Details

All built-in commands are case-insensitive. The shell first converts input to lowercase before matching:

#![allow(unused)]
fn main() {
let lower = trimmed.to_lowercase();
match lower.as_str() {
    "exit" | "quit" | "\\q" => return CommandResult::Exit,
    "help" | "\\h" | "\\?" => return CommandResult::Help(Self::help_text()),
    "tables" | "\\dt" => return self.list_tables(),
    "clear" | "\\c" => return CommandResult::Output("\x1B[2J\x1B[H".to_string()),
    "wal status" => return self.handle_wal_status(),
    "wal truncate" => return self.handle_wal_truncate(),
    _ => {},
}
}

Path Extraction

The extract_path function handles both quoted and unquoted paths:

#![allow(unused)]
fn main() {
fn extract_path(input: &str, command: &str) -> Option<String> {
    let rest = input[command.len()..].trim();
    if rest.is_empty() {
        return None;
    }

    // Handle quoted path (single or double quotes)
    if (rest.starts_with('\'') && rest.ends_with('\''))
        || (rest.starts_with('"') && rest.ends_with('"'))
    {
        if rest.len() > 2 {
            return Some(rest[1..rest.len() - 1].to_string());
        }
        return None;
    }

    // Handle unquoted path
    Some(rest.to_string())
}
}

Examples:

  • save 'foo.bin' -> Some("foo.bin")
  • LOAD "bar.bin" -> Some("bar.bin")
  • save /path/to/file.bin -> Some("/path/to/file.bin")
  • save '' -> None
  • save -> None

Query Support

The shell supports all query types from the Query Router:

Relational (SQL)

CREATE TABLE users (id INT, name TEXT, email TEXT)
INSERT INTO users VALUES (1, 'Alice', 'alice@example.com')
SELECT * FROM users WHERE id = 1
UPDATE users SET name = 'Bob' WHERE id = 1
DELETE FROM users WHERE id = 1
DROP TABLE users

Graph

NODE CREATE person {name: 'Alice', age: 30}
NODE LIST [label]
NODE GET id
EDGE CREATE node1 -> node2 : label [{props}]
EDGE LIST [type]
EDGE GET id
NEIGHBORS node_id OUTGOING|INCOMING|BOTH [: label]
PATH node1 -> node2 [LIMIT n]

Vector

EMBED STORE 'key' [vector values]
EMBED GET 'key'
EMBED DELETE 'key'
SIMILAR 'key' [COSINE|EUCLIDEAN|DOT_PRODUCT] LIMIT n

Unified (Cross-Engine)

FIND NODE [label] [WHERE condition] [LIMIT n]
FIND EDGE [type] [WHERE condition] [LIMIT n]

Blob Storage

BLOB PUT 'path' [CHUNK size] [TAGS 'a','b'] [FOR 'entity']
BLOB GET 'id' TO 'path'
BLOB DELETE 'id'
BLOB INFO 'id'
BLOB LINK 'id' TO 'entity'
BLOB UNLINK 'id' FROM 'entity'
BLOBS
BLOBS FOR 'entity'
BLOBS BY TAG 'tag'

Vault (Secrets)

VAULT INIT
VAULT IDENTITY 'node:name'
VAULT SET 'key' 'value'
VAULT GET 'key'
VAULT DELETE 'key'
VAULT LIST 'pattern'
VAULT ROTATE 'key' 'new'
VAULT GRANT 'entity' ON 'key'
VAULT REVOKE 'entity' ON 'key'

Cache (LLM Responses)

CACHE INIT
CACHE STATS
CACHE CLEAR
CACHE EVICT [n]
CACHE GET 'key'
CACHE PUT 'key' 'value'

Checkpoints (Rollback)

CHECKPOINT
CHECKPOINT 'name'
CHECKPOINTS
CHECKPOINTS LIMIT n
ROLLBACK TO 'name-or-id'

Write-Ahead Log (WAL)

The shell includes a write-ahead log for crash recovery. When enabled, all write commands are logged to a file that can be replayed after loading a snapshot.

WAL Data Structure

#![allow(unused)]
fn main() {
struct Wal {
    file: File,    // Open file handle for appending
    path: PathBuf, // Path to WAL file (derived from snapshot: data.bin -> data.log)
}

impl Wal {
    fn open_append(path: &Path) -> std::io::Result<Self>;
    fn append(&mut self, cmd: &str) -> std::io::Result<()>;  // Writes line + flush
    fn truncate(&mut self) -> std::io::Result<()>;           // Recreates empty file
    fn path(&self) -> &Path;
    fn size(&self) -> std::io::Result<u64>;
}
}

WAL File Format

The WAL is a simple text file with one command per line. Each command is written verbatim followed by a newline and an immediate flush:

INSERT INTO users VALUES (1, 'Alice')
NODE CREATE person {name: 'Bob'}
EMBED STORE 'doc1' [0.1, 0.2, 0.3]

Format details:

  • Line-delimited plain text
  • UTF-8 encoded
  • Each line is the exact command string
  • Flushed immediately after each write for durability
  • Empty lines are skipped during replay
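The append path can be sketched with std file APIs: write the command plus a newline, then flush before returning. This is a sketch consistent with the format details above, not the actual Wal implementation:

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Append one command to a line-delimited WAL and flush immediately.
/// (For fsync-level durability the real implementation would also
/// need sync_data; flush here empties the userspace buffer.)
fn wal_append(path: &Path, cmd: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{cmd}")?; // exact command string + newline
    file.flush()?;            // flush before returning to the caller
    Ok(())
}
```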

WAL Lifecycle

stateDiagram-v2
    [*] --> Inactive: Shell created
    Inactive --> Active: LOAD 'snapshot.bin'
    Active --> Active: Write command logged
    Active --> Active: Read command (no log)
    Active --> Empty: SAVE 'snapshot.bin'
    Empty --> Active: Write command
    Active --> Empty: WAL TRUNCATE
    Active --> [*]: Shell exits

Write Command Detection

The is_write_command function determines which commands should be logged to the WAL:

#![allow(unused)]
fn main() {
fn is_write_command(cmd: &str) -> bool {
    let upper = cmd.to_uppercase();
    let first_word = upper.split_whitespace().next().unwrap_or("");

    match first_word {
        "INSERT" | "UPDATE" | "DELETE" | "CREATE" | "DROP" => true,
        "NODE" => !upper.contains("NODE GET"),
        "EDGE" => !upper.contains("EDGE GET"),
        "EMBED" => upper.contains("EMBED STORE") || upper.contains("EMBED DELETE"),
        "VAULT" => {
            upper.contains("VAULT SET")
                || upper.contains("VAULT DELETE")
                || upper.contains("VAULT ROTATE")
                || upper.contains("VAULT GRANT")
                || upper.contains("VAULT REVOKE")
        },
        "CACHE" => upper.contains("CACHE CLEAR"),
        "BLOB" => {
            upper.contains("BLOB PUT")
                || upper.contains("BLOB DELETE")
                || upper.contains("BLOB LINK")
                || upper.contains("BLOB UNLINK")
                || upper.contains("BLOB TAG")
                || upper.contains("BLOB UNTAG")
                || upper.contains("BLOB GC")
                || upper.contains("BLOB REPAIR")
                || upper.contains("BLOB META SET")
        },
        _ => false,
    }
}
}

Write commands logged to WAL:

Category   | Commands
-----------+-----------------------------------------------------------------
Relational | INSERT, UPDATE, DELETE, CREATE, DROP
Graph      | NODE CREATE, NODE DELETE, EDGE CREATE, EDGE DELETE
Vector     | EMBED STORE, EMBED DELETE
Vault      | VAULT SET, VAULT DELETE, VAULT ROTATE, VAULT GRANT, VAULT REVOKE
Cache      | CACHE CLEAR
Blob       | BLOB PUT, BLOB DELETE, BLOB LINK, BLOB UNLINK, BLOB TAG, BLOB UNTAG, BLOB GC, BLOB REPAIR, BLOB META SET

WAL Replay Algorithm

#![allow(unused)]
fn main() {
fn replay_wal(&self, wal_path: &Path) -> Result<usize, String> {
    let file = File::open(wal_path).map_err(|e| format!("Failed to open WAL: {e}"))?;
    let reader = BufReader::new(file);

    let mut count = 0;
    for (line_num, line) in reader.lines().enumerate() {
        let cmd = line.map_err(|e| format!("Failed to read WAL line {}: {e}", line_num + 1))?;
        let cmd = cmd.trim();

        if cmd.is_empty() {
            continue;  // Skip empty lines
        }

        let result = self.router.read().execute_parsed(cmd);
        if let Err(e) = result {
            return Err(format!("WAL replay failed at line {}: {e}", line_num + 1));
        }
        count += 1;
    }

    Ok(count)
}
}

Example Session

> LOAD 'data.bin'
Loaded snapshot from: data.bin

> INSERT INTO users VALUES (1, 'Alice')
1 row affected

> -- If the shell crashes here, the INSERT is saved in data.log

> -- On next load, the WAL is automatically replayed:
> LOAD 'data.bin'
Loaded snapshot from: data.bin
Replayed 1 commands from WAL

> WAL STATUS
WAL enabled
  Path: data.log
  Size: 42 bytes

> SAVE 'data.bin'
Saved snapshot to: data.bin

> WAL STATUS
WAL enabled
  Path: data.log
  Size: 0 bytes

WAL Behavior Summary:

  • WAL is activated after LOAD (stored as <snapshot>.log)
  • All write commands (INSERT, UPDATE, DELETE, NODE CREATE, etc.) are logged
  • On subsequent LOAD, the snapshot is loaded first, then WAL is replayed
  • SAVE truncates the WAL (snapshot now contains all data)
  • WAL TRUNCATE manually clears the log without saving

Persistence Commands

Save and Load

> CREATE TABLE users (id INT, name TEXT)
OK

> INSERT INTO users VALUES (1, 'Alice')
1 row affected

> SAVE 'backup.bin'
Saved snapshot to: backup.bin

> SAVE COMPRESSED 'backup_compressed.bin'
Saved compressed snapshot to: backup_compressed.bin

> LOAD 'backup.bin'
Loaded snapshot from: backup.bin

Auto-Detection of Embedding Dimension

For compressed snapshots, the shell auto-detects the embedding dimension by sampling stored vectors:

#![allow(unused)]
fn main() {
fn detect_embedding_dimension(store: &TensorStore) -> usize {
    // Sample vectors to find dimension
    let keys = store.scan("");
    for key in keys.iter().take(100) {
        if let Ok(tensor) = store.get(key) {
            for field in tensor.keys() {
                match tensor.get(field) {
                    Some(TensorValue::Vector(v)) => return v.len(),
                    Some(TensorValue::Sparse(s)) => return s.dimension(),
                    _ => {},
                }
            }
        }
    }

    // Default to standard BERT dimension if no vectors found
    tensor_compress::CompressionDefaults::STANDARD  // 768
}
}

Compression Options:

  • SAVE: Uncompressed bincode format
  • SAVE COMPRESSED: Uses int8 quantization (4x smaller), delta encoding, and RLE
  • LOAD: Auto-detects format (works with both compressed and uncompressed)
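The int8 quantization behind SAVE COMPRESSED can be illustrated with symmetric scaling: each f32 maps to an i8 via a per-vector scale factor, cutting 4 bytes down to 1. This is a sketch of the general technique, not the tensor_compress implementation:

```rust
/// Symmetric int8 quantization: scale = max|x| / 127.
fn quantize(values: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = values.iter().map(|v| (v / scale).round() as i8).collect();
    (q, scale)
}

/// Recover approximate f32 values from the quantized form.
fn dequantize(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| f32::from(v) * scale).collect()
}
```

The round trip is lossy: values come back within about half a quantization step (scale / 2) of the originals, which is the source of the "4x smaller" trade-off.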

Output Formatting

The shell converts QueryResult variants into human-readable strings through the format_result function:

#![allow(unused)]
fn main() {
fn format_result(result: &QueryResult) -> String {
    match result {
        QueryResult::Empty => "OK".to_string(),
        QueryResult::Value(s) => s.clone(),
        QueryResult::Count(n) => format_count(*n),
        QueryResult::Ids(ids) => format_ids(ids),
        QueryResult::Rows(rows) => format_rows(rows),
        QueryResult::Nodes(nodes) => format_nodes(nodes),
        QueryResult::Edges(edges) => format_edges(edges),
        QueryResult::Path(path) => format_path(path),
        QueryResult::Similar(results) => format_similar(results),
        QueryResult::Unified(unified) => unified.description.clone(),
        QueryResult::TableList(tables) => format_table_list(tables),
        QueryResult::Blob(data) => format_blob(data),
        QueryResult::ArtifactInfo(info) => format_artifact_info(info),
        QueryResult::ArtifactList(ids) => format_artifact_list(ids),
        QueryResult::BlobStats(stats) => format_blob_stats(stats),
        QueryResult::CheckpointList(checkpoints) => format_checkpoint_list(checkpoints),
        QueryResult::Chain(chain) => format_chain_result(chain),
    }
}
}

Table Formatting Algorithm (ASCII Tables)

The format_rows function implements dynamic column width calculation:

#![allow(unused)]
fn main() {
fn format_rows(rows: &[Row]) -> String {
    if rows.is_empty() {
        return "(0 rows)".to_string();
    }

    // Get column names from first row
    let columns: Vec<&String> = rows[0].values.iter().map(|(k, _)| k).collect();
    if columns.is_empty() {
        return "(0 rows)".to_string();
    }

    // Convert rows to string values
    let string_rows: Vec<Vec<String>> = rows
        .iter()
        .map(|row| {
            columns
                .iter()
                .map(|col| row.get(col).map(|v| format!("{v:?}")).unwrap_or_default())
                .collect()
        })
        .collect();

    // Calculate column widths (max of header and all cell widths)
    let mut widths: Vec<usize> = columns.iter().map(|c| c.len()).collect();
    for row in &string_rows {
        for (i, cell) in row.iter().enumerate() {
            if i < widths.len() {
                widths[i] = widths[i].max(cell.len());
            }
        }
    }

    // Build output with header, separator, and data rows
    let mut output = String::new();

    // Header
    let header: Vec<String> = columns
        .iter()
        .zip(&widths)
        .map(|(col, &w)| format!("{col:w$}"))
        .collect();
    output.push_str(&header.join(" | "));
    output.push('\n');

    // Separator
    let sep: Vec<String> = widths.iter().map(|&w| "-".repeat(w)).collect();
    output.push_str(&sep.join("-+-"));
    output.push('\n');

    // Data rows
    for row in &string_rows {
        let formatted: Vec<String> = row
            .iter()
            .zip(&widths)
            .map(|(cell, &w)| format!("{cell:w$}"))
            .collect();
        output.push_str(&formatted.join(" | "));
        output.push('\n');
    }

    let _ = write!(output, "({} rows)", rows.len());
    output
}
}

Output example:

name  | age | email
------+-----+------------------
Alice | 30  | alice@example.com
Bob   | 25  | bob@example.com
(2 rows)

Node Formatting

#![allow(unused)]
fn main() {
fn format_nodes(nodes: &[NodeResult]) -> String {
    if nodes.is_empty() {
        "(0 nodes)".to_string()
    } else {
        let lines: Vec<String> = nodes
            .iter()
            .map(|n| {
                let props: Vec<String> = n
                    .properties
                    .iter()
                    .map(|(k, v)| format!("{k}: {v}"))
                    .collect();
                if props.is_empty() {
                    format!("  [{}] {} {{}}", n.id, n.label)
                } else {
                    format!("  [{}] {} {{{}}}", n.id, n.label, props.join(", "))
                }
            })
            .collect();
        format!("Nodes:\n{}\n({} nodes)", lines.join("\n"), nodes.len())
    }
}
}

Output example:

Nodes:
  [1] person {name: Alice, age: 30}
  [2] person {name: Bob, age: 25}
(2 nodes)

Edge Formatting

Edges:
  [1] 1 -> 2 : knows
(1 edges)

Path Formatting

Path: 1 -> 3 -> 5 -> 7
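The path formatter itself is not shown above; a minimal sketch consistent with the sample output (the function name and the u64 node-id type are assumptions):

```rust
fn format_path(node_ids: &[u64]) -> String {
    if node_ids.is_empty() {
        return "(empty path)".to_string();
    }
    // Join node IDs with the arrow separator seen in the sample output.
    let hops: Vec<String> = node_ids.iter().map(|id| id.to_string()).collect();
    format!("Path: {}", hops.join(" -> "))
}
```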

Similar Embeddings Formatting

Similar:
  1. doc1 (similarity: 0.9800)
  2. doc2 (similarity: 0.9500)

Blob Formatting

Binary data handling with size threshold:

#![allow(unused)]
fn main() {
fn format_blob(data: &[u8]) -> String {
    let size = data.len();
    if size <= 256 {
        // Try to display as UTF-8 if valid
        if let Ok(s) = std::str::from_utf8(data) {
            if s.chars().all(|c| !c.is_control() || c == '\n' || c == '\t') {
                return s.to_string();
            }
        }
    }
    // Show summary for binary/large data
    format!("<binary data: {size} bytes>")
}
}

Timestamp Formatting

Relative time formatting for better readability:

#![allow(unused)]
fn main() {
fn format_timestamp(unix_secs: u64) -> String {
    let now = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);

    if unix_secs == 0 {
        return "unknown".to_string();
    }

    let diff = now.saturating_sub(unix_secs);

    if diff < 60 {
        format!("{diff}s ago")
    } else if diff < 3600 {
        let mins = diff / 60;
        format!("{mins}m ago")
    } else if diff < 86400 {
        let hours = diff / 3600;
        format!("{hours}h ago")
    } else {
        let days = diff / 86400;
        format!("{days}d ago")
    }
}
}

Destructive Operation Confirmation

The shell integrates with the checkpoint system to provide interactive confirmation for destructive operations:

#![allow(unused)]
fn main() {
struct ShellConfirmationHandler {
    editor: Arc<Mutex<Editor<(), DefaultHistory>>>,
}

impl ConfirmationHandler for ShellConfirmationHandler {
    fn confirm(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool {
        let prompt = format_confirmation_prompt(op, preview);

        // Print the warning with sample data
        println!("\n{prompt}");

        // Ask for confirmation using readline
        let mut editor = self.editor.lock();
        editor
            .readline("Type 'yes' to proceed: ")
            .is_ok_and(|input| input.trim().eq_ignore_ascii_case("yes"))
    }
}
}

Supported destructive operations:

Operation   | Warning Message
Delete      | WARNING: About to delete N row(s) from table 'name'
DropTable   | WARNING: About to drop table 'name' with N row(s)
DropIndex   | WARNING: About to drop index on 'column' in table 'name'
NodeDelete  | WARNING: About to delete node N and M connected edge(s)
EmbedDelete | WARNING: About to delete embedding 'key'
VaultDelete | WARNING: About to delete vault secret 'key'
BlobDelete  | WARNING: About to delete blob 'id' (size)
CacheClear  | WARNING: About to clear cache with N entries

Keyboard Shortcuts

Provided by rustyline:

Shortcut | Action
Up/Down  | Navigate history
Ctrl+C   | Cancel current input (prints ^C, continues loop)
Ctrl+D   | Exit shell (EOF)
Ctrl+L   | Clear screen
Ctrl+A   | Move to start of line
Ctrl+E   | Move to end of line
Ctrl+W   | Delete word backward
Ctrl+U   | Delete to start of line

Error Handling

Error Type         | Example                                       | Output Stream
Parse error        | Error: unexpected token 'FORM' at position 12 | stderr
Table not found    | Error: table 'users' not found                | stderr
Invalid query      | Error: unsupported operation                  | stderr
WAL write failure  | Command succeeded but WAL write failed: ...   | Returned as Error
WAL replay failure | WAL replay failed at line N: ...              | Returned as Error

Errors are printed to stderr and do not exit the shell. The process_result function routes output appropriately:

#![allow(unused)]
fn main() {
pub fn process_result(result: &CommandResult) -> LoopAction {
    match result {
        CommandResult::Output(text) | CommandResult::Help(text) => {
            println!("{text}");
            LoopAction::Continue
        },
        CommandResult::Error(text) => {
            eprintln!("{text}");
            LoopAction::Continue
        },
        CommandResult::Exit => {
            println!("Goodbye!");
            LoopAction::Exit
        },
        CommandResult::Empty => LoopAction::Continue,
    }
}
}

Cluster Connectivity

Connect Command Syntax

CLUSTER CONNECT 'node_id@bind_addr' ['peer_id@peer_addr', ...]

Example:

> CLUSTER CONNECT 'node1@127.0.0.1:8001' 'node2@127.0.0.1:8002'
Cluster initialized: node1 @ 127.0.0.1:8001 with 1 peer(s)

Address Parsing

#![allow(unused)]
fn main() {
fn parse_node_address(s: &str) -> Result<(String, SocketAddr), String> {
    let parts: Vec<&str> = s.splitn(2, '@').collect();
    if parts.len() != 2 {
        return Err("Expected format 'node_id@host:port'".to_string());
    }

    let node_id = parts[0].to_string();
    let addr: SocketAddr = parts[1]
        .parse()
        .map_err(|e| format!("Invalid address '{}': {}", parts[1], e))?;

    Ok((node_id, addr))
}
}
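A quick standalone exercise of the accepted format (this sketch uses split_once rather than splitn, but accepts the same inputs):

```rust
use std::net::SocketAddr;

fn parse_node_address(s: &str) -> Result<(String, SocketAddr), String> {
    // Split on the first '@'; everything after it must parse as host:port.
    let (node_id, addr) = s
        .split_once('@')
        .ok_or_else(|| "Expected format 'node_id@host:port'".to_string())?;
    let addr: SocketAddr = addr
        .parse()
        .map_err(|e| format!("Invalid address '{addr}': {e}"))?;
    Ok((node_id.to_string(), addr))
}
```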

Cluster Query Execution

The shell wraps the router for distributed query execution:

#![allow(unused)]
fn main() {
struct RouterExecutor(Arc<RwLock<QueryRouter>>);

impl QueryExecutor for RouterExecutor {
    fn execute(&self, query: &str) -> Result<Vec<u8>, String> {
        let router = self.0.read();
        router.execute_for_cluster(query)
    }
}
}

Performance Characteristics

Operation         | Time
Empty input       | 2.3 ns
Help command      | 43 ns
SELECT (100 rows) | 17.8 us
Format 1000 rows  | 267 us

The shell adds negligible overhead to query execution.

Edge Cases and Gotchas

  1. Empty quoted paths: save '' returns an error, not an empty path.

  2. WAL not active by default: The WAL only becomes active after LOAD. New shells have no WAL.

  3. Case sensitivity: Built-in commands are case-insensitive, but query strings preserve case for data.

  4. History persistence: History is only saved when the shell exits normally (not on crash).

  5. ANSI codes: The clear command outputs ANSI escape sequences (\x1B[2J\x1B[H), which may not work on all terminals.

  6. Confirmation handler: Only active if checkpoint module is available when shell starts.

  7. WAL replay stops on first error: If any command fails during replay, the entire replay stops.

  8. Missing columns: When formatting rows with inconsistent columns, missing values show as empty strings.

  9. Binary blob display: Blobs over 256 bytes or with control characters show as <binary data: N bytes>.

  10. Timestamp sentinel: A timestamp of 0 (the representation used for unknown or pre-1970 times) displays as "unknown".

User Experience Tips

  1. Use compressed snapshots for large datasets: SAVE COMPRESSED reduces file size by ~4x with minimal precision loss.

  2. Check WAL status before critical operations: Run WAL STATUS to verify recovery capability.

  3. Use tab completion: Rustyline provides filename completion in some contexts.

  4. Ctrl+C is safe: It only cancels the current line, not the entire session.

  5. History survives sessions: Previous commands are available across shell restarts.

  6. For scripts, use programmatic API: shell.execute() returns structured results for automation.

  7. Cluster connect before distributed operations: Ensure CLUSTER CONNECT succeeds before running distributed transactions.

Dependencies

Crate             | Purpose
query_router      | Query execution
relational_engine | Row type for formatting
tensor_store      | Snapshot persistence (save/load)
tensor_compress   | Compressed snapshot support
tensor_checkpoint | Checkpoint confirmation handling
tensor_chain      | Cluster query executor trait
rustyline         | Readline functionality (history, shortcuts, Ctrl+C)
parking_lot       | Mutex and RwLock for thread-safe router access
base64            | Vault key decoding
  • query_router: The Query Router executes all queries. The shell delegates all query parsing and execution to this module.
  • tensor_store: Provides the underlying storage layer and snapshot functionality.
  • tensor_compress: Handles compressed snapshot format with int8 quantization.
  • tensor_checkpoint: Provides checkpoint/rollback functionality with confirmation prompts.
  • tensor_chain: Provides cluster connectivity and distributed transaction support.

Neumann Server Architecture

The Neumann Server (neumann_server) provides a gRPC server that exposes the Neumann database over the network. It serves as the network gateway for remote clients, wrapping the Query Router with authentication, TLS encryption, and streaming support for large result sets and blob storage.

The server follows four design principles: zero-configuration startup (works out of the box with sensible defaults), security-first (API key authentication with constant-time comparison, TLS support), streaming-native (all large operations use gRPC streaming), and health monitoring (automatic failure tracking with configurable thresholds).

Architecture Overview

flowchart TD
    subgraph Clients
        CLI[neumann_client]
        gRPC[gRPC Clients]
        Web[gRPC-Web Browsers]
    end

    subgraph NeumannServer
        QS[QueryService]
        BS[BlobService]
        HS[HealthService]
        RS[ReflectionService]
        Auth[Auth Middleware]
        TLS[TLS Layer]
    end

    subgraph Backend
        QR[QueryRouter]
        Blob[BlobStore]
    end

    CLI --> TLS
    gRPC --> TLS
    Web --> TLS
    TLS --> Auth
    Auth --> QS
    Auth --> BS
    Auth --> HS
    QS --> QR
    BS --> Blob
    RS --> |Service Discovery| gRPC

Key Types

Type              | Description
NeumannServer     | Main server struct with router, blob store, and configuration
ServerConfig      | Configuration for bind address, TLS, auth, and limits
TlsConfig         | TLS certificate paths and client certificate settings
AuthConfig        | API key list, header name, and anonymous access control
ApiKey            | Individual API key with identity and optional description
QueryServiceImpl  | gRPC service for query execution with streaming
BlobServiceImpl   | gRPC service for artifact storage with streaming
HealthServiceImpl | gRPC service for health checks
HealthState       | Shared health state across services
ServerError       | Error type for server operations

Server Configuration

Field                   | Type               | Default        | Description
bind_addr               | SocketAddr         | 127.0.0.1:9200 | Server bind address
tls                     | Option<TlsConfig>  | None           | TLS configuration
auth                    | Option<AuthConfig> | None           | Authentication configuration
max_message_size        | usize              | 64 MB          | Maximum gRPC message size
max_upload_size         | usize              | 512 MB         | Maximum blob upload size
enable_grpc_web         | bool               | true           | Enable gRPC-web for browsers
enable_reflection       | bool               | true           | Enable service reflection
blob_chunk_size         | usize              | 64 KB          | Chunk size for blob streaming
stream_channel_capacity | usize              | 32             | Bounded channel capacity for backpressure

Configuration Builder

#![allow(unused)]
fn main() {
use neumann_server::{ServerConfig, TlsConfig, AuthConfig, ApiKey};
use std::path::PathBuf;

let config = ServerConfig::new()
    .with_bind_addr("0.0.0.0:9443".parse()?)
    .with_tls(TlsConfig::new(
        PathBuf::from("server.crt"),
        PathBuf::from("server.key"),
    ))
    .with_auth(
        AuthConfig::new()
            .with_api_key(ApiKey::new(
                "sk-prod-key-12345678".to_string(),
                "service:backend".to_string(),
            ))
            .with_anonymous(false)
    )
    .with_max_message_size(128 * 1024 * 1024)
    .with_grpc_web(true)
    .with_reflection(true);
}

TLS Configuration

Field               | Type            | Default  | Description
cert_path           | PathBuf         | Required | Path to certificate file (PEM)
key_path            | PathBuf         | Required | Path to private key file (PEM)
ca_cert_path        | Option<PathBuf> | None     | CA certificate for client auth
require_client_cert | bool            | false    | Require client certificates

TLS Setup Example

#![allow(unused)]
fn main() {
use neumann_server::TlsConfig;
use std::path::PathBuf;

// Basic TLS
let tls = TlsConfig::new(
    PathBuf::from("/etc/neumann/server.crt"),
    PathBuf::from("/etc/neumann/server.key"),
);

// Mutual TLS (mTLS)
let tls = TlsConfig::new(
    PathBuf::from("/etc/neumann/server.crt"),
    PathBuf::from("/etc/neumann/server.key"),
)
.with_ca_cert(PathBuf::from("/etc/neumann/ca.crt"))
.with_required_client_cert(true);
}

Authentication

AuthConfig Options

Field           | Type        | Default   | Description
api_keys        | Vec<ApiKey> | Empty     | List of valid API keys
api_key_header  | String      | x-api-key | Header name for API key
allow_anonymous | bool        | false     | Allow unauthenticated access

API Key Validation

The server uses constant-time comparison to prevent timing attacks. All keys are checked regardless of match status to avoid leaking information about valid prefixes:

#![allow(unused)]
fn main() {
// Internal validation logic
fn validate_key(&self, key: &str) -> Option<&str> {
    let key_bytes = key.as_bytes();
    let mut found_identity: Option<&str> = None;

    for api_key in &self.api_keys {
        let stored_bytes = api_key.key.as_bytes();
        let max_len = stored_bytes.len().max(key_bytes.len());

        let mut matches: u8 = 1;
        for i in 0..max_len {
            let stored_byte = stored_bytes.get(i).copied().unwrap_or(0);
            let key_byte = key_bytes.get(i).copied().unwrap_or(0);
            matches &= u8::from(stored_byte == key_byte);
        }

        let lengths_match = u8::from(stored_bytes.len() == key_bytes.len());
        matches &= lengths_match;

        if matches == 1 {
            found_identity = Some(api_key.identity.as_str());
        }
    }

    found_identity
}
}
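The comparison can be pulled out and exercised in isolation. A standalone sketch of the same idea (not the server's exact code): every byte position is visited regardless of where the first mismatch occurs, so timing does not reveal how long a matching prefix was:

```rust
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    let max_len = a.len().max(b.len());
    let mut matches: u8 = 1;
    for i in 0..max_len {
        // Out-of-range positions compare as 0, so both inputs are walked fully.
        let x = a.get(i).copied().unwrap_or(0);
        let y = b.get(i).copied().unwrap_or(0);
        matches &= u8::from(x == y);
    }
    // Length must also match; folded in the same branch-free style.
    matches &= u8::from(a.len() == b.len());
    matches == 1
}
```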

Authentication Flow

flowchart TD
    A[Request arrives] --> B{Auth configured?}
    B -->|No| C[Allow with no identity]
    B -->|Yes| D{API key header present?}
    D -->|No| E{Anonymous allowed?}
    E -->|Yes| C
    E -->|No| F[Return UNAUTHENTICATED]
    D -->|Yes| G{Key valid?}
    G -->|Yes| H[Allow with identity from key]
    G -->|No| F

gRPC Services

QueryService

The QueryService provides query execution with three RPC methods:

Method        | Type             | Description
Execute       | Unary            | Execute single query, return full result
ExecuteStream | Server streaming | Execute query, stream results chunk by chunk
ExecuteBatch  | Unary            | Execute multiple queries, return all results

Execute RPC

rpc Execute(QueryRequest) returns (QueryResponse);

message QueryRequest {
    string query = 1;
    optional string identity = 2;
}

message QueryResponse {
    oneof result {
        EmptyResult empty = 1;
        CountResult count = 2;
        RowsResult rows = 3;
        NodesResult nodes = 4;
        EdgesResult edges = 5;
        PathResult path = 6;
        SimilarResult similar = 7;
        TableListResult table_list = 8;
        BlobResult blob = 9;
        IdsResult ids = 10;
    }
    optional ErrorInfo error = 15;
}

ExecuteStream RPC

For large result sets (rows, nodes, edges, similar items, blobs), the streaming RPC sends results one item at a time:

rpc ExecuteStream(QueryRequest) returns (stream QueryResponseChunk);

message QueryResponseChunk {
    oneof chunk {
        RowChunk row = 1;
        NodeChunk node = 2;
        EdgeChunk edge = 3;
        SimilarChunk similar_item = 4;
        bytes blob_data = 5;
        ErrorInfo error = 15;
    }
    bool is_final = 16;
}

ExecuteBatch RPC

rpc ExecuteBatch(BatchQueryRequest) returns (BatchQueryResponse);

message BatchQueryRequest {
    repeated QueryRequest queries = 1;
}

message BatchQueryResponse {
    repeated QueryResponse results = 1;
}

Security Note: In batch execution, the authenticated request identity is always used. Per-query identity fields are ignored to prevent privilege escalation attacks.

BlobService

The BlobService provides artifact storage with streaming upload/download:

Method      | Type             | Description
Upload      | Client streaming | Upload artifact with metadata
Download    | Server streaming | Download artifact in chunks
Delete      | Unary            | Delete artifact
GetMetadata | Unary            | Get artifact metadata

Upload Protocol

sequenceDiagram
    participant C as Client
    participant S as BlobService

    C->>S: UploadMetadata (filename, content_type, tags)
    C->>S: Chunk 1
    C->>S: Chunk 2
    C->>S: ...
    C->>S: Chunk N (end stream)
    S->>C: UploadResponse (artifact_id, size, checksum)

The first message must be metadata, followed by data chunks:

rpc Upload(stream BlobUploadRequest) returns (BlobUploadResponse);

message BlobUploadRequest {
    oneof request {
        UploadMetadata metadata = 1;
        bytes chunk = 2;
    }
}

message UploadMetadata {
    string filename = 1;
    optional string content_type = 2;
    repeated string tags = 3;
}

message BlobUploadResponse {
    string artifact_id = 1;
    uint64 size = 2;
    string checksum = 3;
}

Download Protocol

rpc Download(BlobDownloadRequest) returns (stream BlobDownloadChunk);

message BlobDownloadRequest {
    string artifact_id = 1;
}

message BlobDownloadChunk {
    bytes data = 1;
    bool is_final = 2;
}

HealthService

The HealthService follows the gRPC health checking protocol:

rpc Check(HealthCheckRequest) returns (HealthCheckResponse);

message HealthCheckRequest {
    optional string service = 1;
}

message HealthCheckResponse {
    ServingStatus status = 1;
}

enum ServingStatus {
    UNSPECIFIED = 0;
    SERVING = 1;
    NOT_SERVING = 2;
}

Health Check Targets

Service Name            | Checks
Empty or ""             | Overall server health (all services)
neumann.v1.QueryService | Query service health
neumann.v1.BlobService  | Blob service health
Unknown service         | Returns UNSPECIFIED

Automatic Health Tracking

The QueryService tracks consecutive failures and marks itself unhealthy after reaching the threshold (default: 5 failures):

#![allow(unused)]
fn main() {
const FAILURE_THRESHOLD: u32 = 5;

fn record_failure(&self) {
    let failures = self.consecutive_failures.fetch_add(1, Ordering::SeqCst) + 1;
    if failures >= FAILURE_THRESHOLD {
        if let Some(ref health) = self.health_state {
            health.set_query_service_healthy(false);
        }
    }
}

fn record_success(&self) {
    self.consecutive_failures.store(0, Ordering::SeqCst);
    if let Some(ref health) = self.health_state {
        health.set_query_service_healthy(true);
    }
}
}
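The counter logic above can be exercised standalone; HealthTracker below is a stand-in for the real service state (an AtomicU32 streak counter plus a shared health flag):

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

const FAILURE_THRESHOLD: u32 = 5;

struct HealthTracker {
    consecutive_failures: AtomicU32,
    healthy: AtomicBool,
}

impl HealthTracker {
    fn new() -> Self {
        Self {
            consecutive_failures: AtomicU32::new(0),
            healthy: AtomicBool::new(true),
        }
    }

    fn record_failure(&self) {
        // fetch_add returns the previous value, so add 1 for the new count.
        let failures = self.consecutive_failures.fetch_add(1, Ordering::SeqCst) + 1;
        if failures >= FAILURE_THRESHOLD {
            self.healthy.store(false, Ordering::SeqCst);
        }
    }

    fn record_success(&self) {
        // Any success resets the streak and restores health.
        self.consecutive_failures.store(0, Ordering::SeqCst);
        self.healthy.store(true, Ordering::SeqCst);
    }

    fn is_healthy(&self) -> bool {
        self.healthy.load(Ordering::SeqCst)
    }
}
```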

Server Lifecycle

Startup Sequence

flowchart TD
    A[Create NeumannServer] --> B[Validate configuration]
    B --> C{TLS configured?}
    C -->|Yes| D[Load certificates]
    C -->|No| E[Plain TCP]
    D --> F[Build TLS config]
    F --> G[Create services]
    E --> G
    G --> H{gRPC-web enabled?}
    H -->|Yes| I[Add gRPC-web layer]
    H -->|No| J[Standard gRPC]
    I --> K{Reflection enabled?}
    J --> K
    K -->|Yes| L[Add reflection service]
    K -->|No| M[Start serving]
    L --> M
    M --> N[Accept connections]

Basic Server Setup

use neumann_server::{NeumannServer, ServerConfig};
use query_router::QueryRouter;
use std::sync::Arc;
use parking_lot::RwLock;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create router
    let router = Arc::new(RwLock::new(QueryRouter::new()));

    // Create server with default config
    let server = NeumannServer::new(router, ServerConfig::default());

    // Start serving (blocks until shutdown)
    server.serve().await?;

    Ok(())
}

Server with Shared Storage

For applications that need both query and blob services sharing the same storage:

use neumann_server::{NeumannServer, ServerConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ServerConfig::default();

    // Creates QueryRouter and BlobStore sharing the same TensorStore
    let server = NeumannServer::with_shared_storage(config).await?;

    server.serve().await?;

    Ok(())
}

Graceful Shutdown

use tokio::signal;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let server = NeumannServer::with_shared_storage(ServerConfig::default()).await?;

    // Shutdown on Ctrl+C
    server.serve_with_shutdown(async { let _ = signal::ctrl_c().await; }).await?;

    Ok(())
}

Error Handling

Server Errors

Error            | Cause                   | gRPC Status
Config           | Invalid configuration   | INVALID_ARGUMENT
Transport        | Network/TLS failure     | UNAVAILABLE
Query            | Query execution failed  | INVALID_ARGUMENT
Auth             | Authentication failed   | UNAUTHENTICATED
Blob             | Blob operation failed   | INTERNAL
Internal         | Unexpected server error | INTERNAL
InvalidArgument  | Bad request data        | INVALID_ARGUMENT
NotFound         | Resource not found      | NOT_FOUND
PermissionDenied | Access denied           | PERMISSION_DENIED
Io               | I/O error               | INTERNAL

Error Conversion

#![allow(unused)]
fn main() {
impl From<ServerError> for Status {
    fn from(err: ServerError) -> Self {
        match &err {
            ServerError::Config(msg) => Status::invalid_argument(msg),
            ServerError::Transport(e) => Status::unavailable(e.to_string()),
            ServerError::Query(msg) => Status::invalid_argument(msg),
            ServerError::Auth(msg) => Status::unauthenticated(msg),
            ServerError::Blob(msg) => Status::internal(msg),
            ServerError::Internal(msg) => Status::internal(msg),
            ServerError::InvalidArgument(msg) => Status::invalid_argument(msg),
            ServerError::NotFound(msg) => Status::not_found(msg),
            ServerError::PermissionDenied(msg) => Status::permission_denied(msg),
            ServerError::Io(e) => Status::internal(e.to_string()),
        }
    }
}
}

Backpressure and Flow Control

Streaming Backpressure

The server uses bounded channels for streaming responses to prevent memory exhaustion:

#![allow(unused)]
fn main() {
// Default: 32 items buffered
let (tx, rx) = mpsc::channel(self.stream_channel_capacity);

tokio::spawn(async move {
    for item in results {
        // send().await suspends when the channel is full, providing backpressure
        if tx.send(Ok(item)).await.is_err() {
            // Receiver dropped, stop sending
            return;
        }
    }
});
}
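The same effect can be observed with the standard library's bounded channel: std::sync::mpsc::sync_channel blocks the sending thread when the buffer is full, where tokio's mpsc::channel suspends the sending task instead.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Producer sends 10 items through a capacity-2 channel; send() blocks
// whenever the buffer is full, so the producer can never run more than
// two items ahead of the consumer.
fn bounded_pipeline() -> Vec<i32> {
    let (tx, rx) = sync_channel::<i32>(2);
    let producer = thread::spawn(move || {
        for i in 0..10 {
            tx.send(i).unwrap(); // blocks until the consumer makes room
        }
        // tx dropped here, which ends the receiver's iterator.
    });
    let received: Vec<i32> = rx.iter().collect();
    producer.join().unwrap();
    received
}
```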

Upload Size Limits

The BlobService enforces upload size limits:

#![allow(unused)]
fn main() {
if data.len().saturating_add(chunk.len()) > max_size {
    return Err(Status::resource_exhausted(format!(
        "upload exceeds maximum size of {max_size} bytes"
    )));
}
}

Production Deployment

#![allow(unused)]
fn main() {
let config = ServerConfig::new()
    .with_bind_addr("0.0.0.0:9443".parse()?)
    .with_tls(TlsConfig::new(
        PathBuf::from("/etc/neumann/tls/server.crt"),
        PathBuf::from("/etc/neumann/tls/server.key"),
    ))
    .with_auth(
        AuthConfig::new()
            .with_api_key(ApiKey::new(
                std::env::var("NEUMANN_API_KEY")?,
                "service:default".to_string(),
            ))
            .with_anonymous(false)
    )
    .with_max_message_size(64 * 1024 * 1024)
    .with_max_upload_size(1024 * 1024 * 1024)  // 1GB
    .with_stream_channel_capacity(64)
    .with_grpc_web(true)
    .with_reflection(false);  // Disable in production
}

Health Check Integration

Use health checks with load balancers:

# grpcurl health check
grpcurl -plaintext localhost:9200 neumann.v1.Health/Check

# With service name
grpcurl -plaintext -d '{"service":"neumann.v1.QueryService"}' \
    localhost:9200 neumann.v1.Health/Check

Logging

The server uses the tracing crate for structured logging:

#![allow(unused)]
fn main() {
use tracing_subscriber::FmtSubscriber;

let subscriber = FmtSubscriber::builder()
    .with_max_level(tracing::Level::INFO)
    .finish();
tracing::subscriber::set_global_default(subscriber)?;

// Server logs connection info and errors
// INFO: Starting Neumann gRPC server with TLS on 0.0.0.0:9443
// ERROR: Query execution error: table 'users' not found
}

Dependencies

Crate            | Purpose
query_router     | Query execution backend
tensor_blob      | Blob storage backend
tensor_store     | Shared storage for both query and blob
tonic            | gRPC server framework
tonic-web        | gRPC-web layer for browser support
tonic-reflection | Service reflection for debugging
tokio            | Async runtime
parking_lot      | Thread-safe router access
tracing          | Structured logging
thiserror        | Error type derivation

Module         | Relationship
neumann_client | Client SDK for connecting to this server
query_router   | Query execution backend
tensor_blob    | Blob storage backend
neumann_shell  | Interactive CLI (alternative interface)

Neumann Client Architecture

The Neumann Client (neumann_client) provides a Rust SDK for interacting with the Neumann database. It supports two modes: embedded mode for in-process database access via the Query Router, and remote mode for network access via gRPC to a Neumann Server.

The client follows four design principles: dual-mode flexibility (same API for embedded and remote), security-first (API keys are zeroized on drop), async-native (built on tokio for remote operations), and zero-copy where possible (streaming results for large datasets).

Architecture Overview

flowchart TD
    subgraph Application
        App[User Application]
    end

    subgraph NeumannClient
        Client[NeumannClient]
        Builder[ClientBuilder]
        Config[ClientConfig]
    end

    subgraph EmbeddedMode
        Router[QueryRouter]
        Store[TensorStore]
    end

    subgraph RemoteMode
        gRPC[gRPC Channel]
        TLS[TLS Layer]
    end

    subgraph NeumannServer
        Server[NeumannServer]
    end

    App --> Builder
    Builder --> Client
    Client -->|embedded| Router
    Router --> Store
    Client -->|remote| gRPC
    gRPC --> TLS
    TLS --> Server

Key Types

Type              | Description
NeumannClient     | Main client struct supporting both embedded and remote modes
ClientBuilder     | Fluent builder for remote client connections
ClientConfig      | Configuration for remote connections (address, API key, TLS)
ClientMode        | Enum: Embedded or Remote
ClientError       | Error type for client operations
RemoteQueryResult | Wrapper for proto query response with typed accessors
QueryResult       | Re-export of query_router result type (embedded mode)

Client Modes

Mode     | Feature Flag     | Use Case
Embedded | embedded         | In-process database, unit testing, CLI tools
Remote   | remote (default) | Production gRPC connections to server
Full     | full             | Both modes available

Feature Flags

[dependencies]
# Remote only (default)
neumann_client = "0.1"

# Embedded only
neumann_client = { version = "0.1", default-features = false, features = ["embedded"] }

# Both modes
neumann_client = { version = "0.1", features = ["full"] }

Client Configuration

Field      | Type           | Default        | Description
address    | String         | localhost:9200 | Server address (host:port)
api_key    | Option<String> | None           | API key for authentication
tls        | bool           | false          | Enable TLS encryption
timeout_ms | u64            | 30000          | Request timeout in milliseconds

Security: API Key Zeroization

API keys are automatically zeroed in memory when the configuration is dropped, preventing credential leakage:

#![allow(unused)]
fn main() {
impl Drop for ClientConfig {
    fn drop(&mut self) {
        if let Some(ref mut key) = self.api_key {
            key.zeroize();  // Overwrites memory with zeros
        }
    }
}
}
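Without the zeroize crate, the effect can be approximated by overwriting the buffer in place before it goes out of scope. A sketch only: zeroize additionally uses compiler fences so the writes cannot be optimized away.

```rust
/// Overwrite a secret buffer in place before it is dropped.
fn wipe(secret: &mut [u8]) {
    for b in secret.iter_mut() {
        *b = 0;
    }
}
```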

Remote Mode

Connection Builder

The ClientBuilder provides a fluent API for configuring remote connections:

#![allow(unused)]
fn main() {
use neumann_client::NeumannClient;
use std::time::Duration;

// Minimal connection
let client = NeumannClient::connect("localhost:9200")
    .build()
    .await?;

// Full configuration
let client = NeumannClient::connect("db.example.com:9443")
    .api_key("sk-production-key")
    .with_tls()
    .timeout_ms(60_000)
    .build()
    .await?;
}

Connection Flow

sequenceDiagram
    participant App as Application
    participant Builder as ClientBuilder
    participant Client as NeumannClient
    participant Channel as gRPC Channel
    participant Server as NeumannServer

    App->>Builder: connect("address")
    App->>Builder: api_key("key")
    App->>Builder: with_tls()
    App->>Builder: build().await
    Builder->>Channel: Create endpoint
    Channel->>Server: TCP/TLS handshake
    Server-->>Channel: Connection established
    Channel-->>Builder: Channel ready
    Builder-->>Client: NeumannClient created
    Client-->>App: Ready for queries

Query Execution

#![allow(unused)]
fn main() {
// Single query
let result = client.execute("SELECT * FROM users").await?;

// With identity (for vault access)
let result = client
    .execute_with_identity("VAULT GET 'secret'", Some("service:backend"))
    .await?;

// Batch queries
let results = client
    .execute_batch(&[
        "CREATE TABLE orders (id:int, total:float)",
        "INSERT orders id=1, total=99.99",
        "SELECT orders",
    ])
    .await?;
}

RemoteQueryResult Accessors

The RemoteQueryResult wrapper provides typed access to query results:

#![allow(unused)]
fn main() {
let result = client.execute("SELECT * FROM users").await?;

// Check for errors
if result.has_error() {
    eprintln!("Error: {}", result.error_message().unwrap());
    return Err(...);
}

// Check result type
if result.is_empty() {
    println!("No results");
}

// Access typed data
if let Some(count) = result.count() {
    println!("Count: {}", count);
}

if let Some(rows) = result.rows() {
    for row in rows {
        println!("Row ID: {}", row.id);
    }
}

if let Some(nodes) = result.nodes() {
    for node in nodes {
        println!("Node: {} ({})", node.id, node.label);
    }
}

if let Some(edges) = result.edges() {
    for edge in edges {
        println!("Edge: {} -> {}", edge.from, edge.to);
    }
}

if let Some(similar) = result.similar() {
    for item in similar {
        println!("{}: {:.4}", item.key, item.score);
    }
}

// Access raw proto response
let proto = result.into_inner();
}

Blocking Connection

For synchronous contexts, use the blocking builder:

#![allow(unused)]
fn main() {
let client = NeumannClient::connect("localhost:9200")
    .api_key("test-key")
    .build_blocking()?;  // Creates temporary tokio runtime
}

Embedded Mode

Creating an Embedded Client

#![allow(unused)]
fn main() {
use neumann_client::NeumannClient;

// New embedded database
let client = NeumannClient::embedded()?;

// With custom router (for shared state)
use query_router::QueryRouter;
use std::sync::Arc;
use parking_lot::RwLock;

let router = Arc::new(RwLock::new(QueryRouter::new()));
let client = NeumannClient::with_router(router);
}

Synchronous Query Execution

Embedded mode provides synchronous execution for simpler code flow:

#![allow(unused)]
fn main() {
use neumann_client::QueryResult;

// Create table
let result = client.execute_sync("CREATE TABLE users (name:string, age:int)")?;
assert!(matches!(result, QueryResult::Empty));

// Insert data
let result = client.execute_sync("INSERT users name=\"Alice\", age=30")?;

// Query data
let result = client.execute_sync("SELECT users")?;
match result {
    QueryResult::Rows(rows) => {
        for row in rows {
            println!("{:?}", row);
        }
    }
    _ => {}
}
}

With Identity

#![allow(unused)]
fn main() {
// Set identity for vault access control
let result = client.execute_sync_with_identity(
    "VAULT GET 'api_secret'",
    Some("service:backend"),
)?;
}

Error Handling

Error Types

| Error | Code | Retryable | Description |
|---|---|---|---|
| Connection | 6 | Yes | Failed to connect to server |
| Query | 9 | No | Query execution failed |
| Authentication | 5 | No | Invalid API key |
| PermissionDenied | 3 | No | Access denied |
| NotFound | 2 | No | Resource not found |
| InvalidArgument | 1 | No | Bad request data |
| Parse | 8 | No | Query parse error |
| Internal | 7 | No | Server internal error |
| Timeout | 6 | Yes | Request timed out |
| Unavailable | 6 | Yes | Server unavailable |

Error Methods

#![allow(unused)]
fn main() {
let err = ClientError::Connection("connection refused".to_string());

// Get error code
let code = err.code();  // 6

// Check if retryable
if err.is_retryable() {
    // Retry with exponential backoff
}

// Display error
eprintln!("Error: {}", err);  // "connection error: connection refused"
}

Error Handling Pattern

#![allow(unused)]
fn main() {
use neumann_client::ClientError;

match client.execute("SELECT * FROM users").await {
    Ok(result) => {
        if result.has_error() {
            // Query-level error (e.g., table not found)
            eprintln!("Query error: {}", result.error_message().unwrap());
        } else {
            // Process results
        }
    }
    Err(ClientError::Connection(msg)) => {
        // Network error - maybe retry
        eprintln!("Connection failed: {}", msg);
    }
    Err(ClientError::Authentication(msg)) => {
        // Bad credentials - check API key
        eprintln!("Auth failed: {}", msg);
    }
    Err(ClientError::Timeout(msg)) => {
        // Request too slow - maybe retry with longer timeout
        eprintln!("Timeout: {}", msg);
    }
    Err(e) => {
        eprintln!("Unexpected error: {}", e);
    }
}
}

Conversion from gRPC Status

Remote errors are automatically converted from tonic Status:

#![allow(unused)]
fn main() {
impl From<tonic::Status> for ClientError {
    fn from(status: tonic::Status) -> Self {
        match status.code() {
            Code::InvalidArgument => Self::InvalidArgument(status.message().to_string()),
            Code::NotFound => Self::NotFound(status.message().to_string()),
            Code::PermissionDenied => Self::PermissionDenied(status.message().to_string()),
            Code::Unauthenticated => Self::Authentication(status.message().to_string()),
            Code::Unavailable => Self::Unavailable(status.message().to_string()),
            Code::DeadlineExceeded => Self::Timeout(status.message().to_string()),
            _ => Self::Internal(status.message().to_string()),
        }
    }
}
}

Connection Management

Connection State

#![allow(unused)]
fn main() {
let client = NeumannClient::connect("localhost:9200")
    .build()
    .await?;

// Check mode
match client.mode() {
    ClientMode::Embedded => println!("In-process"),
    ClientMode::Remote => println!("Connected to server"),
}

// Check connection status
if client.is_connected() {
    // Ready for queries
}
}

Closing Connections

#![allow(unused)]
fn main() {
let mut client = NeumannClient::connect("localhost:9200")
    .build()
    .await?;

// Explicit close
client.close();

// Or automatic on drop
drop(client);  // Connection closed, API key zeroized
}

Usage Examples

Complete Remote Example

use neumann_client::{NeumannClient, ClientError};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to server
    let client = NeumannClient::connect("localhost:9200")
        .api_key(std::env::var("NEUMANN_API_KEY")?)
        .with_tls()
        .timeout_ms(30_000)
        .build()
        .await?;

    // Create schema
    client.execute("CREATE TABLE products (name:string, price:float)").await?;

    // Insert data
    client.execute("INSERT products name=\"Widget\", price=9.99").await?;
    client.execute("INSERT products name=\"Gadget\", price=19.99").await?;

    // Query data
    let result = client.execute("SELECT products WHERE price > 10").await?;

    if let Some(rows) = result.rows() {
        for row in rows {
            println!("Product: {:?}", row);
        }
    }

    Ok(())
}

Complete Embedded Example

use neumann_client::{NeumannClient, QueryResult};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create embedded client
    let client = NeumannClient::embedded()?;

    // Create schema
    client.execute_sync("CREATE TABLE events (name:string, timestamp:int)")?;

    // Insert data
    client.execute_sync("INSERT events name=\"login\", timestamp=1700000000")?;

    // Query data
    match client.execute_sync("SELECT events")? {
        QueryResult::Rows(rows) => {
            println!("Found {} events", rows.len());
            for row in rows {
                println!("  {:?}", row);
            }
        }
        _ => println!("Unexpected result type"),
    }

    Ok(())
}

Testing with Embedded Mode

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use neumann_client::{NeumannClient, QueryResult};

    #[test]
    fn test_user_creation() {
        let client = NeumannClient::embedded().unwrap();

        // Setup
        client
            .execute_sync("CREATE TABLE users (email:string, active:bool)")
            .unwrap();

        // Test
        client
            .execute_sync("INSERT users email=\"test@example.com\", active=true")
            .unwrap();

        // Verify
        let result = client.execute_sync("SELECT users").unwrap();
        match result {
            QueryResult::Rows(rows) => {
                assert_eq!(rows.len(), 1);
            }
            _ => panic!("Expected rows"),
        }
    }
}
}

Shared Router Between Clients

#![allow(unused)]
fn main() {
use neumann_client::NeumannClient;
use query_router::QueryRouter;
use std::sync::Arc;
use parking_lot::RwLock;

// Create shared router
let router = Arc::new(RwLock::new(QueryRouter::new()));

// Create multiple clients sharing same state
let client1 = NeumannClient::with_router(Arc::clone(&router));
let client2 = NeumannClient::with_router(Arc::clone(&router));

// Changes from client1 visible to client2
client1.execute_sync("CREATE TABLE shared (x:int)")?;
let result = client2.execute_sync("SELECT shared")?;  // Works!
}

Best Practices

Connection Reuse

Create one client and reuse it for multiple queries:

#![allow(unused)]
fn main() {
// Good: Reuse client
let client = NeumannClient::connect("localhost:9200").build().await?;
for query in queries {
    client.execute(&query).await?;
}

// Bad: New connection per query
for query in queries {
    let client = NeumannClient::connect("localhost:9200").build().await?;
    client.execute(&query).await?;
}  // Connection overhead for each query
}

Timeout Configuration

Set appropriate timeouts based on query complexity:

#![allow(unused)]
fn main() {
// Quick queries
let client = NeumannClient::connect("localhost:9200")
    .timeout_ms(5_000)  // 5 seconds
    .build()
    .await?;

// Complex analytics
let client = NeumannClient::connect("localhost:9200")
    .timeout_ms(300_000)  // 5 minutes
    .build()
    .await?;
}

API Key Security

Never hardcode API keys:

#![allow(unused)]
fn main() {
// Good: Environment variable
let api_key = std::env::var("NEUMANN_API_KEY")?;
let client = NeumannClient::connect("localhost:9200")
    .api_key(api_key)
    .build()
    .await?;

// Bad: Hardcoded key
let client = NeumannClient::connect("localhost:9200")
    .api_key("sk-secret-key-12345")  // Will be in binary!
    .build()
    .await?;
}

Dependencies

| Crate | Purpose | Feature |
|---|---|---|
| query_router | Embedded mode query execution | embedded |
| tonic | gRPC client | remote |
| tokio | Async runtime | remote |
| parking_lot | Thread-safe router access | embedded |
| zeroize | Secure memory clearing | Always |
| thiserror | Error type derivation | Always |
| tracing | Structured logging | Always |

| Module | Relationship |
|---|---|
| neumann_server | Server counterpart for remote mode |
| query_router | Query execution backend for embedded mode |
| neumann_shell | Alternative interactive interface |

TypeScript SDK Architecture

The TypeScript SDK (@neumann/client) provides a TypeScript/JavaScript client for the Neumann database with support for both Node.js (gRPC) and browser (gRPC-Web) environments.

The SDK follows four design principles: environment agnostic (same API for Node.js and browsers via dynamic imports), type-safe (full TypeScript support with discriminated unions for results), streaming-first (async iterators for large result sets), and zero dependencies in core types (proto converters are separate from type definitions).

Architecture Overview

flowchart TD
    subgraph Application
        App[TypeScript Application]
    end

    subgraph SDK["@neumann/client"]
        Client[NeumannClient]
        Types[Type Definitions]
        Errors[Error Classes]
        Converters[Proto Converters]
    end

    subgraph NodeJS[Node.js Environment]
        gRPC["@grpc/grpc-js"]
    end

    subgraph Browser[Browser Environment]
        gRPCWeb[grpc-web]
    end

    App --> Client
    Client --> Types
    Client --> Errors
    Client --> Converters
    Client -->|connect| gRPC
    Client -->|connectWeb| gRPCWeb
    gRPC --> Server[NeumannServer]
    gRPCWeb --> Server

Installation

# npm
npm install @neumann/client

# yarn
yarn add @neumann/client

# pnpm
pnpm add @neumann/client

For Node.js, also install the gRPC package:

npm install @grpc/grpc-js

For browsers, install gRPC-Web:

npm install grpc-web

Key Types

| Type | Description |
|---|---|
| NeumannClient | Main client class for database operations |
| ConnectOptions | Options for server connection (API key, TLS, metadata) |
| QueryOptions | Options for query execution (identity) |
| ClientMode | Client mode: 'remote' or 'embedded' |
| QueryResult | Discriminated union of all result types |
| Value | Typed scalar value with type tag |
| Row | Relational row with column values |
| Node | Graph node with label and properties |
| Edge | Graph edge with type, source, target, properties |
| Path | Graph path as list of segments |
| SimilarItem | Vector similarity search result |
| ArtifactInfo | Blob artifact metadata |
| NeumannError | Base error class with error code |

Connection Options

| Field | Type | Default | Description |
|---|---|---|---|
| apiKey | string? | undefined | API key for authentication |
| tls | boolean? | false | Enable TLS encryption |
| metadata | Record<string, string>? | undefined | Custom metadata headers |

Query Options

| Field | Type | Default | Description |
|---|---|---|---|
| identity | string? | undefined | Identity for vault access control |

Connection Methods

Node.js Connection

import { NeumannClient } from '@neumann/client';

// Basic connection
const client = await NeumannClient.connect('localhost:9200');

// With authentication and TLS
const client = await NeumannClient.connect('db.example.com:9443', {
  apiKey: process.env.NEUMANN_API_KEY,
  tls: true,
  metadata: { 'x-request-id': 'abc123' },
});

Browser Connection (gRPC-Web)

import { NeumannClient } from '@neumann/client';

// Connect via gRPC-Web
const client = await NeumannClient.connectWeb('https://api.example.com', {
  apiKey: 'your-api-key',
});

Query Execution

Single Query

const result = await client.execute('SELECT users');

// With identity for vault access
const result = await client.execute('VAULT GET "secret"', {
  identity: 'service:backend',
});

Streaming Query

For large result sets, use streaming to receive results incrementally:

for await (const chunk of client.executeStream('SELECT large_table')) {
  if (chunk.type === 'rows') {
    for (const row of chunk.rows) {
      console.log(rowToObject(row));
    }
  }
}

Batch Query

Execute multiple queries in a single request:

const results = await client.executeBatch([
  'CREATE TABLE orders (id:int, total:float)',
  'INSERT orders id=1, total=99.99',
  'SELECT orders',
]);

for (const result of results) {
  console.log(result.type);
}

Query Result Types

The QueryResult type is a discriminated union. Use the type field to determine which result type you have:

| Type | Fields | Description |
|---|---|---|
| 'empty' | - | No result (DDL operations) |
| 'value' | value: string | Single value result |
| 'count' | count: number | Row count |
| 'rows' | rows: Row[] | Relational query rows |
| 'nodes' | nodes: Node[] | Graph nodes |
| 'edges' | edges: Edge[] | Graph edges |
| 'paths' | paths: Path[] | Graph paths |
| 'similar' | items: SimilarItem[] | Vector similarity results |
| 'ids' | ids: string[] | List of IDs |
| 'tableList' | names: string[] | Table names |
| 'blob' | data: Uint8Array | Binary blob data |
| 'blobInfo' | info: ArtifactInfo | Blob metadata |
| 'error' | code: number, message: string | Error response |

Type Guards

Use the provided type guards for type-safe result handling:

import {
  isRowsResult,
  isNodesResult,
  isErrorResult,
  rowToObject,
} from '@neumann/client';

const result = await client.execute('SELECT users');

if (isErrorResult(result)) {
  console.error(`Error ${result.code}: ${result.message}`);
} else if (isRowsResult(result)) {
  for (const row of result.rows) {
    console.log(rowToObject(row));
  }
}

Result Pattern Matching

const result = await client.execute(query);

switch (result.type) {
  case 'empty':
    console.log('OK');
    break;
  case 'count':
    console.log(`${result.count} rows affected`);
    break;
  case 'rows':
    console.table(result.rows.map(rowToObject));
    break;
  case 'nodes':
    result.nodes.forEach((n) => console.log(`[${n.id}] ${n.label}`));
    break;
  case 'similar':
    result.items.forEach((s) => console.log(`${s.key}: ${s.score.toFixed(4)}`));
    break;
  case 'error':
    throw new Error(result.message);
}

Value Types

Values use a tagged union pattern for type safety:

import {
  Value,
  nullValue,
  intValue,
  floatValue,
  stringValue,
  boolValue,
  bytesValue,
  valueToNative,
  valueFromNative,
} from '@neumann/client';

// Create typed values
const v1: Value = nullValue();
const v2: Value = intValue(42);
const v3: Value = floatValue(3.14);
const v4: Value = stringValue('hello');
const v5: Value = boolValue(true);
const v6: Value = bytesValue(new Uint8Array([1, 2, 3]));

// Convert to native JavaScript types
const native = valueToNative(v2); // 42

// Create from native values (auto-detects type)
const auto = valueFromNative(42); // { type: 'int', data: 42 }

Conversion Utilities

Row Conversion

import { rowToObject } from '@neumann/client';

const result = await client.execute('SELECT users');
if (result.type === 'rows') {
  const objects = result.rows.map(rowToObject);
  // [{ name: 'Alice', age: 30 }, { name: 'Bob', age: 25 }]
}

Node Conversion

import { nodeToObject } from '@neumann/client';

const result = await client.execute('NODE LIST');
if (result.type === 'nodes') {
  const objects = result.nodes.map(nodeToObject);
  // [{ id: '1', label: 'person', properties: { name: 'Alice' } }]
}

Edge Conversion

import { edgeToObject } from '@neumann/client';

const result = await client.execute('EDGE LIST');
if (result.type === 'edges') {
  const objects = result.edges.map(edgeToObject);
  // [{ id: '1', type: 'knows', source: '1', target: '2', properties: {} }]
}

Error Handling

Error Codes

| Code | Name | Description |
|---|---|---|
| 0 | UNKNOWN | Unknown error |
| 1 | INVALID_ARGUMENT | Bad request data |
| 2 | NOT_FOUND | Resource not found |
| 3 | PERMISSION_DENIED | Access denied |
| 4 | ALREADY_EXISTS | Resource already exists |
| 5 | UNAUTHENTICATED | Authentication failed |
| 6 | UNAVAILABLE | Server unavailable |
| 7 | INTERNAL | Internal server error |
| 8 | PARSE_ERROR | Query parse error |
| 9 | QUERY_ERROR | Query execution error |

Error Classes

import {
  NeumannError,
  ConnectionError,
  AuthenticationError,
  PermissionDeniedError,
  NotFoundError,
  InvalidArgumentError,
  ParseError,
  QueryError,
  InternalError,
  errorFromCode,
} from '@neumann/client';

try {
  await client.execute('SELECT nonexistent');
} catch (e) {
  if (e instanceof ConnectionError) {
    console.error('Connection failed:', e.message);
  } else if (e instanceof AuthenticationError) {
    console.error('Auth failed - check API key');
  } else if (e instanceof ParseError) {
    console.error('Query syntax error:', e.message);
  } else if (e instanceof NeumannError) {
    console.error(`[${e.code}] ${e.message}`);
  }
}

Error Factory

Create errors from numeric codes:

import { errorFromCode, ErrorCode } from '@neumann/client';

const error = errorFromCode(ErrorCode.NOT_FOUND, 'Table not found');
// Returns NotFoundError instance

Client Lifecycle

// Create client
const client = await NeumannClient.connect('localhost:9200');

// Check connection status
console.log(client.isConnected); // true
console.log(client.clientMode); // 'remote'

// Execute queries
const result = await client.execute('SELECT users');

// Close connection when done
client.close();
console.log(client.isConnected); // false

Usage Examples

Complete CRUD Example

import { NeumannClient, isRowsResult, rowToObject } from '@neumann/client';

async function main() {
  const client = await NeumannClient.connect('localhost:9200', {
    apiKey: process.env.NEUMANN_API_KEY,
  });

  try {
    // Create table
    await client.execute('CREATE TABLE products (name:string, price:float)');

    // Insert data
    await client.execute('INSERT products name="Widget", price=9.99');
    await client.execute('INSERT products name="Gadget", price=19.99');

    // Query data
    const result = await client.execute('SELECT products WHERE price > 10');

    if (isRowsResult(result)) {
      const products = result.rows.map(rowToObject);
      console.log('Products over $10:', products);
    }

    // Update data
    await client.execute('UPDATE products SET price=24.99 WHERE name="Gadget"');

    // Delete data
    await client.execute('DELETE products WHERE price < 15');

    // Drop table
    await client.execute('DROP TABLE products');
  } finally {
    client.close();
  }
}

Graph Operations

import { nodeToObject } from '@neumann/client';

const client = await NeumannClient.connect('localhost:9200');

// Create nodes
await client.execute('NODE CREATE person {name: "Alice", age: 30}');
await client.execute('NODE CREATE person {name: "Bob", age: 25}');

// Create edge
await client.execute('EDGE CREATE 1 -> 2 : knows {since: 2020}');

// Query nodes
const nodes = await client.execute('NODE LIST person');
if (nodes.type === 'nodes') {
  nodes.nodes.forEach((n) => {
    console.log(`[${n.id}] ${n.label}:`, nodeToObject(n).properties);
  });
}

// Find path
const path = await client.execute('PATH 1 -> 2');
if (path.type === 'paths' && path.paths.length > 0) {
  const nodeIds = path.paths[0].segments.map((s) => s.node.id);
  console.log('Path:', nodeIds.join(' -> '));
}

Vector Similarity Search

const client = await NeumannClient.connect('localhost:9200');

// Store embeddings
await client.execute('EMBED STORE "doc1" [0.1, 0.2, 0.3, 0.4]');
await client.execute('EMBED STORE "doc2" [0.15, 0.25, 0.35, 0.45]');
await client.execute('EMBED STORE "doc3" [0.9, 0.8, 0.7, 0.6]');

// Find similar
const result = await client.execute('SIMILAR "doc1" COSINE LIMIT 2');
if (result.type === 'similar') {
  result.items.forEach((item) => {
    console.log(`${item.key}: ${item.score.toFixed(4)}`);
  });
}

Browser Usage with React

import { useState, useEffect } from 'react';
import { NeumannClient, QueryResult } from '@neumann/client';

function useNeumannQuery(query: string) {
  const [result, setResult] = useState<QueryResult | null>(null);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<Error | null>(null);

  useEffect(() => {
    let cancelled = false;

    async function fetchData() {
      try {
        const client = await NeumannClient.connectWeb('/api/neumann');
        const data = await client.execute(query);
        if (!cancelled) {
          setResult(data);
          setLoading(false);
        }
        client.close();
      } catch (e) {
        if (!cancelled) {
          setError(e as Error);
          setLoading(false);
        }
      }
    }

    fetchData();
    return () => {
      cancelled = true;
    };
  }, [query]);

  return { result, loading, error };
}

Proto Conversion

The SDK includes utilities for converting protobuf messages to typed objects:

| Function | Description |
|---|---|
| convertProtoValue | Convert proto Value to typed Value |
| convertProtoRow | Convert proto Row to Row |
| convertProtoNode | Convert proto Node to Node |
| convertProtoEdge | Convert proto Edge to Edge |
| convertProtoPath | Convert proto Path to Path |
| convertProtoSimilarItem | Convert proto SimilarItem to SimilarItem |
| convertProtoArtifactInfo | Convert proto ArtifactInfo to ArtifactInfo |

These are used internally but exported for custom integrations.

Dependencies

| Package | Purpose | Environment |
|---|---|---|
| @grpc/grpc-js | gRPC client | Node.js |
| grpc-web | gRPC-Web client | Browser |

The SDK uses dynamic imports to load the appropriate gRPC library based on the connection method used.

| Module | Relationship |
|---|---|
| neumann_server | Server that this SDK connects to |
| neumann_client | Rust SDK with same capabilities |
| neumann-py | Python SDK with same API design |

Python SDK Architecture

The Python SDK (neumann-db) provides a Python client for the Neumann database with support for both embedded mode (via PyO3 bindings) and remote mode (via gRPC). It includes async support and integrations for pandas and numpy.

The SDK follows four design principles: Pythonic API (context managers, type hints, dataclasses), dual-mode (same API for embedded and remote), async-first (native asyncio support), and ecosystem integration (pandas DataFrame and numpy array support).

Architecture Overview

flowchart TD
    subgraph Application
        App[Python Application]
    end

    subgraph SDK[neumann-db]
        Client[NeumannClient]
        AsyncClient[AsyncNeumannClient]
        Tx[Transaction]
        Types[Data Types]
        Errors[Error Classes]
    end

    subgraph Integrations
        Pandas[pandas Integration]
        Numpy[numpy Integration]
    end

    subgraph EmbeddedMode[Embedded Mode]
        PyO3[_native PyO3 Module]
        Router[QueryRouter]
    end

    subgraph RemoteMode[Remote Mode]
        gRPC[grpcio]
        Proto[Proto Stubs]
    end

    App --> Client
    App --> AsyncClient
    Client --> Tx
    Client --> Types
    Client --> Errors
    Client --> Pandas
    Client --> Numpy
    Client -->|embedded| PyO3
    PyO3 --> Router
    Client -->|remote| gRPC
    AsyncClient --> gRPC
    gRPC --> Proto
    Proto --> Server[NeumannServer]

Installation

# Basic installation (remote mode only)
pip install neumann-db

# With native module for embedded mode
pip install neumann-db[native]

# With pandas integration
pip install neumann-db[pandas]

# With numpy integration
pip install neumann-db[numpy]

# Full installation
pip install neumann-db[full]

Key Types

| Type | Description |
|---|---|
| NeumannClient | Synchronous client supporting both modes |
| AsyncNeumannClient | Async client for remote mode |
| Transaction | Transaction context manager |
| QueryResult | Query result with typed accessors |
| QueryResultType | Enum of result types |
| Value | Typed scalar value |
| ScalarType | Enum of scalar types |
| Row | Relational row with typed column accessors |
| Node | Graph node with properties |
| Edge | Graph edge with properties |
| Path | Graph path as list of segments |
| PathSegment | Path segment (node + optional edge) |
| SimilarItem | Vector similarity result |
| ArtifactInfo | Blob artifact metadata |
| NeumannError | Base exception class |

Client Modes

| Mode | Class Method | Requirements | Use Case |
|---|---|---|---|
| Embedded | NeumannClient.embedded() | neumann-db[native] | Testing, CLI tools |
| Remote | NeumannClient.connect() | grpcio | Production |
| Async Remote | AsyncNeumannClient.connect() | grpcio | Async applications |

Synchronous Client

Embedded Mode

from neumann import NeumannClient

# In-memory database
client = NeumannClient.embedded()

# Persistent storage
client = NeumannClient.embedded(path="/path/to/data")

# Use as context manager
with NeumannClient.embedded() as client:
    client.execute("CREATE TABLE users (name:string)")

Remote Mode

from neumann import NeumannClient

# Basic connection
client = NeumannClient.connect("localhost:9200")

# With authentication and TLS
client = NeumannClient.connect(
    "db.example.com:9443",
    api_key="your-api-key",
    tls=True,
)

# Context manager
with NeumannClient.connect("localhost:9200") as client:
    result = client.execute("SELECT users")

Query Execution

# Single query
result = client.execute("SELECT users")

# With identity for vault access
result = client.execute(
    "VAULT GET 'secret'",
    identity="service:backend",
)

# Streaming query
for chunk in client.execute_stream("SELECT large_table"):
    for row in chunk.rows:
        print(row.to_dict())

# Batch execution
results = client.execute_batch([
    "CREATE TABLE orders (id:int, total:float)",
    "INSERT orders id=1, total=99.99",
    "SELECT orders",
])

Async Client

The async client supports remote mode only (PyO3 has threading limitations):

from neumann.aio import AsyncNeumannClient

# Connect
client = await AsyncNeumannClient.connect(
    "localhost:9200",
    api_key="your-api-key",
)

# Execute query
result = await client.execute("SELECT users")

# Streaming
async for chunk in client.execute_stream("SELECT large_table"):
    for row in chunk.rows:
        print(row.to_dict())

# Batch
results = await client.execute_batch(queries)

# Close
await client.close()

Async Context Manager

async with await AsyncNeumannClient.connect("localhost:9200") as client:
    result = await client.execute("SELECT users")
    for row in result.rows:
        print(row.to_dict())

Run Embedded in Async Context

Use run_in_executor to run embedded-mode (blocking) queries from async code without stalling the event loop:

async def query_embedded():
    client = await AsyncNeumannClient.connect("localhost:9200")
    # This runs the embedded client in a thread pool
    result = await client.run_in_executor("SELECT users")
    return result

Transaction Support

Transactions provide automatic commit/rollback with context managers:

from neumann import NeumannClient, Transaction

client = NeumannClient.connect("localhost:9200")

# Using Transaction directly
tx = Transaction(client)
tx.begin()
try:
    tx.execute("INSERT users name='Alice'")
    tx.execute("INSERT users name='Bob'")
    tx.commit()
except Exception:
    tx.rollback()
    raise

# Using context manager (preferred)
with Transaction(client) as tx:
    tx.execute("INSERT users name='Alice'")
    tx.execute("INSERT users name='Bob'")
    # Commits automatically on success, rolls back on exception

Transaction Properties

| Property | Type | Description |
|---|---|---|
| is_active | bool | True if transaction is active |

Transaction Methods

| Method | Description |
|---|---|
| begin() | Start the transaction |
| commit() | Commit the transaction |
| rollback() | Rollback the transaction |
| execute(query) | Execute query within transaction |
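The auto-commit/rollback behaviour follows Python's standard context-manager protocol: `__exit__` receives any exception raised inside the with-block and can branch on it. A minimal sketch of the pattern (illustrative only, not the SDK's actual implementation; `SketchTransaction` is a made-up name):

```python
class SketchTransaction:
    """Illustrative transaction wrapper: commit on clean exit, roll back on error."""

    def __init__(self) -> None:
        self.state = "new"

    def __enter__(self) -> "SketchTransaction":
        self.state = "active"  # corresponds to begin()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # exc_type is None only when the with-block exited cleanly.
        self.state = "rolled_back" if exc_type else "committed"
        return False  # do not swallow the exception

tx = SketchTransaction()
with tx:
    pass
print(tx.state)  # committed

tx2 = SketchTransaction()
try:
    with tx2:
        raise ValueError("boom")
except ValueError:
    pass
print(tx2.state)  # rolled_back
```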

Query Result Types

The QueryResult class provides typed access to query results:

| Property | Return Type | Description |
|---|---|---|
| type | QueryResultType | Result type enum |
| is_empty | bool | True if empty result |
| is_error | bool | True if error result |
| value | str or None | Single value result |
| count | int or None | Row count |
| rows | list[Row] | Relational rows |
| nodes | list[Node] | Graph nodes |
| edges | list[Edge] | Graph edges |
| paths | list[Path] | Graph paths |
| similar_items | list[SimilarItem] | Similarity results |
| ids | list[str] | ID list |
| table_names | list[str] | Table names |
| blob_data | bytes or None | Binary data |
| blob_info | ArtifactInfo or None | Blob metadata |
| error_message | str or None | Error message |

Result Type Enum

from neumann import QueryResultType

result = client.execute(query)

match result.type:
    case QueryResultType.EMPTY:
        print("OK")
    case QueryResultType.COUNT:
        print(f"{result.count} rows affected")
    case QueryResultType.ROWS:
        for row in result.rows:
            print(row.to_dict())
    case QueryResultType.NODES:
        for node in result.nodes:
            print(f"[{node.id}] {node.label}")
    case QueryResultType.SIMILAR:
        for item in result.similar_items:
            print(f"{item.key}: {item.score:.4f}")
    case QueryResultType.ERROR:
        raise Exception(result.error_message)

Data Types

Value

Immutable typed scalar value:

from neumann import Value, ScalarType

# Create values
v1 = Value.null()
v2 = Value.int_(42)
v3 = Value.float_(3.14)
v4 = Value.string("hello")
v5 = Value.bool_(True)
v6 = Value.bytes_(b"data")

# Access type and data
print(v2.type)  # ScalarType.INT
print(v2.data)  # 42

# Convert to Python native type
native = v2.as_python()  # 42

Row

Relational row with typed accessors:

from neumann import Row

row = result.rows[0]

# Get raw Value
val = row.get("name")

# Get typed values
name: str | None = row.get_string("name")
age: int | None = row.get_int("age")
score: float | None = row.get_float("score")
active: bool | None = row.get_bool("active")

# Convert to dict
data = row.to_dict()  # {"name": "Alice", "age": 30}

Node

Graph node with properties:

from neumann import Node

node = result.nodes[0]

print(node.id)      # "1"
print(node.label)   # "person"

# Get property
name = node.get_property("name")

# Convert to dict
data = node.to_dict()
# {"id": "1", "label": "person", "properties": {"name": "Alice"}}

Edge

Graph edge with properties:

from neumann import Edge

edge = result.edges[0]

print(edge.id)         # "1"
print(edge.edge_type)  # "knows"
print(edge.source)     # "1"
print(edge.target)     # "2"

# Get property
since = edge.get_property("since")

# Convert to dict
data = edge.to_dict()
# {"id": "1", "type": "knows", "source": "1", "target": "2", "properties": {}}

Path

Graph path as segments:

from neumann import Path

path = result.paths[0]

# Get all nodes in path
nodes = path.nodes  # [Node, Node, ...]

# Get all edges in path
edges = path.edges  # [Edge, Edge, ...]

# Path length
length = len(path)

# Iterate segments
for segment in path.segments:
    print(f"Node: {segment.node.id}")
    if segment.edge:
        print(f"  -> via edge {segment.edge.id}")

SimilarItem

Vector similarity result:

from neumann import SimilarItem

for item in result.similar_items:
    print(f"Key: {item.key}")
    print(f"Score: {item.score:.4f}")
    if item.metadata:
        print(f"Metadata: {item.metadata}")

ArtifactInfo

Blob artifact metadata:

from neumann import ArtifactInfo

info = result.blob_info
print(f"ID: {info.artifact_id}")
print(f"Filename: {info.filename}")
print(f"Size: {info.size} bytes")
print(f"Checksum: {info.checksum}")
print(f"Content-Type: {info.content_type}")
print(f"Created: {info.created_at}")
print(f"Tags: {info.tags}")

Error Handling

Error Codes

| Code | Name | Description |
|---|---|---|
| 0 | UNKNOWN | Unknown error |
| 1 | INVALID_ARGUMENT | Bad request data |
| 2 | NOT_FOUND | Resource not found |
| 3 | PERMISSION_DENIED | Access denied |
| 4 | ALREADY_EXISTS | Resource exists |
| 5 | UNAUTHENTICATED | Auth failed |
| 6 | UNAVAILABLE | Server unavailable |
| 7 | INTERNAL | Internal error |
| 8 | PARSE_ERROR | Query parse error |
| 9 | QUERY_ERROR | Query execution error |

Error Classes

from neumann import (
    NeumannError,
    ConnectionError,
    AuthenticationError,
    PermissionError,
    NotFoundError,
    InvalidArgumentError,
    ParseError,
    QueryError,
    InternalError,
    ErrorCode,
)

try:
    result = client.execute("SELECT nonexistent")
except ConnectionError as e:
    print(f"Connection failed: {e.message}")
except AuthenticationError:
    print("Check your API key")
except ParseError as e:
    print(f"Query syntax error: {e.message}")
except NeumannError as e:
    print(f"[{e.code.name}] {e.message}")

Error Factory

from neumann.errors import error_from_code, ErrorCode

# Create error from code
error = error_from_code(ErrorCode.NOT_FOUND, "Table 'users' not found")
# Returns NotFoundError instance
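Conceptually the factory is a lookup from numeric code to exception class, falling back to the base class for unknown codes. A hedged sketch of that idea (the real error_from_code may differ; the Sketch* names are made up):

```python
class SketchNeumannError(Exception):
    code = 0  # mirrors UNKNOWN

class SketchNotFoundError(SketchNeumannError):
    code = 2

class SketchQueryError(SketchNeumannError):
    code = 9

# Map each concrete class by its numeric code.
_BY_CODE = {cls.code: cls for cls in (SketchNotFoundError, SketchQueryError)}

def sketch_error_from_code(code: int, message: str) -> SketchNeumannError:
    # Unknown codes fall back to the base class.
    return _BY_CODE.get(code, SketchNeumannError)(message)

err = sketch_error_from_code(2, "Table 'users' not found")
print(type(err).__name__)  # SketchNotFoundError
```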

Pandas Integration

Convert query results to pandas DataFrames:

from neumann.integrations.pandas import (
    result_to_dataframe,
    rows_to_dataframe,
    dataframe_to_inserts,
)

# Result to DataFrame
result = client.execute("SELECT users")
df = result_to_dataframe(result)

# Rows to DataFrame
df = rows_to_dataframe(result.rows)

# DataFrame to INSERT statements
inserts = dataframe_to_inserts(
    df,
    table="users",
    column_mapping={"user_name": "name"},  # Optional column rename
)

# Execute inserts
for query in inserts:
    client.execute(query)

NumPy Integration

Work with vectors using numpy arrays:

from neumann.integrations.numpy import (
    vector_to_insert,
    vectors_to_inserts,
    parse_embedding,
    cosine_similarity,
    euclidean_distance,
    normalize_vectors,
)
import numpy as np

# Single vector to INSERT
query = vector_to_insert("doc1", np.array([0.1, 0.2, 0.3]))
client.execute(query)

# Multiple vectors
vectors = {
    "doc1": np.array([0.1, 0.2, 0.3]),
    "doc2": np.array([0.4, 0.5, 0.6]),
}
queries = vectors_to_inserts(vectors, normalize=True)
for q in queries:
    client.execute(q)

# Parse embedding from result
embedding = parse_embedding("[0.1, 0.2, 0.3]")

# Distance calculations
sim = cosine_similarity(vec1, vec2)
dist = euclidean_distance(vec1, vec2)

# Batch normalization
normalized = normalize_vectors(np.array([vec1, vec2, vec3]))

Usage Examples

Complete CRUD Example

from neumann import NeumannClient

with NeumannClient.connect("localhost:9200") as client:
    # Create table
    client.execute("CREATE TABLE products (name:string, price:float)")

    # Insert data
    client.execute('INSERT products name="Widget", price=9.99')
    client.execute('INSERT products name="Gadget", price=19.99')

    # Query data
    result = client.execute("SELECT products WHERE price > 10")
    for row in result.rows:
        print(row.to_dict())

    # Update
    client.execute('UPDATE products SET price=24.99 WHERE name="Gadget"')

    # Delete
    client.execute("DELETE products WHERE price < 15")

    # Drop table
    client.execute("DROP TABLE products")

Graph Operations

from neumann import NeumannClient

with NeumannClient.connect("localhost:9200") as client:
    # Create nodes
    client.execute('NODE CREATE person {name: "Alice", age: 30}')
    client.execute('NODE CREATE person {name: "Bob", age: 25}')

    # Create edge
    client.execute("EDGE CREATE 1 -> 2 : knows {since: 2020}")

    # List nodes
    result = client.execute("NODE LIST person")
    for node in result.nodes:
        print(f"[{node.id}] {node.label}: {node.to_dict()['properties']}")

    # Find neighbors
    result = client.execute("NEIGHBORS 1 OUTGOING")

    # Find path
    result = client.execute("PATH 1 -> 2")
    if result.paths:
        path = result.paths[0]
        print(" -> ".join(n.id for n in path.nodes))

Vector Search with NumPy

from neumann import NeumannClient
from neumann.integrations.numpy import vector_to_insert, normalize_vectors
import numpy as np

with NeumannClient.connect("localhost:9200") as client:
    # Generate and store embeddings
    embeddings = np.random.randn(100, 768).astype(np.float32)
    embeddings = normalize_vectors(embeddings)

    for i, emb in enumerate(embeddings):
        query = vector_to_insert(f"doc{i}", emb)
        client.execute(query)

    # Query vector
    query_vec = np.random.randn(768).astype(np.float32)
    query_str = vector_to_insert("query", query_vec)
    client.execute(query_str)

    # Find similar
    result = client.execute('SIMILAR "query" COSINE LIMIT 10')
    for item in result.similar_items:
        print(f"{item.key}: {item.score:.4f}")

Async Web Application

from fastapi import FastAPI
from neumann.aio import AsyncNeumannClient

app = FastAPI()
client: AsyncNeumannClient | None = None

@app.on_event("startup")
async def startup():
    global client
    client = await AsyncNeumannClient.connect(
        "localhost:9200",
        api_key="your-api-key",
    )

@app.on_event("shutdown")
async def shutdown():
    if client:
        await client.close()

@app.get("/users")
async def get_users():
    result = await client.execute("SELECT users")
    return [row.to_dict() for row in result.rows]

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    result = await client.execute(f"SELECT users WHERE id = {user_id}")
    if result.rows:
        return result.rows[0].to_dict()
    return {"error": "Not found"}

Dependencies

Package         Purpose            Extra
grpcio          gRPC client        Default
protobuf        Protocol buffers   Default
neumann-native  PyO3 bindings      [native]
pandas          DataFrame support  [pandas]
numpy           Array support      [numpy]

Module           Relationship
neumann_server   Server that this SDK connects to
neumann_client   Rust SDK with same capabilities
@neumann/client  TypeScript SDK with same API design

TCP Transport

The TCP transport layer provides reliable, secure node-to-node communication for the tensor_chain distributed system. It implements connection pooling, TLS security, rate limiting, compression, and automatic reconnection.

Overview

The TcpTransport implements the Transport trait, providing:

  • Connection pooling for efficient peer communication
  • TLS encryption with mutual authentication support
  • Rate limiting using token bucket algorithm
  • LZ4 compression for bandwidth efficiency
  • Automatic reconnection with exponential backoff

flowchart TD
    A[Application] --> B[TcpTransport]
    B --> C[ConnectionManager]
    C --> D1[ConnectionPool Node A]
    C --> D2[ConnectionPool Node B]
    C --> D3[ConnectionPool Node C]
    D1 --> E1[TLS Stream]
    D2 --> E2[TLS Stream]
    D3 --> E3[TLS Stream]

Connection Architecture

Connection Manager

The ConnectionManager maintains connection pools for each peer. Each pool can hold multiple connections for load distribution.

#![allow(unused)]
fn main() {
let config = TcpTransportConfig::new("node1", "0.0.0.0:9100".parse()?);
let transport = TcpTransport::new(config);
transport.start().await?;
}

Connection Lifecycle

stateDiagram-v2
    [*] --> Connecting
    Connecting --> Handshaking: TCP Connected
    Handshaking --> Active: Handshake Success
    Handshaking --> Failed: Handshake Failed
    Active --> Reading: Message Available
    Active --> Writing: Send Request
    Reading --> Active: Message Processed
    Writing --> Active: Message Sent
    Active --> Reconnecting: Connection Lost
    Reconnecting --> Connecting: Backoff Complete
    Reconnecting --> Failed: Max Retries
    Failed --> [*]

Configuration

Parameter                Default  Description
pool_size                2        Connections per peer
connect_timeout_ms       5000     Connection timeout in milliseconds
io_timeout_ms            30000    Read/write timeout in milliseconds
max_message_size         16 MB    Maximum message size in bytes
keepalive                true     Enable TCP keepalive
keepalive_interval_secs  30       Keepalive probe interval
max_pending_messages     1000     Outbound queue size per peer
recv_buffer_size         1000     Incoming message channel size

TLS Security

The transport supports four security modes to accommodate different deployment scenarios.

Security Modes

Mode         TLS  mTLS  NodeId Verify  Use Case
Strict       Yes  Yes   Yes            Production deployments
Permissive   Yes  No    No             Gradual TLS rollout
Development  No   No    No             Local testing only
Legacy       No   No    No             Migration from older versions

NodeId Verification

NodeId verification ensures the peer’s identity matches their TLS certificate:

Mode            Description
None            Trust NodeId from handshake (testing only)
CommonName      NodeId must match certificate CN
SubjectAltName  NodeId must match a SAN DNS entry

TLS Configuration

[tls]
cert_path = "/etc/neumann/node.crt"
key_path = "/etc/neumann/node.key"
ca_cert_path = "/etc/neumann/ca.crt"
require_client_auth = true
node_id_verification = "CommonName"

mTLS Handshake

sequenceDiagram
    participant C as Client Node
    participant S as Server Node

    C->>S: TCP Connect
    C->>S: TLS ClientHello
    S->>C: TLS ServerHello + Certificate
    S->>C: CertificateRequest
    C->>S: Client Certificate
    C->>S: CertificateVerify
    C->>S: Finished
    S->>C: Finished
    Note over C,S: TLS Established
    C->>S: Handshake(node_id, capabilities)
    S->>C: Handshake(node_id, capabilities)
    Note over C,S: Connection Ready

Rate Limiting

Per-peer rate limiting uses the token bucket algorithm to prevent any single peer from overwhelming the system.

Token Bucket Algorithm

flowchart LR
    A[Refill Timer] -->|tokens/sec| B[Token Bucket]
    B -->|check| C{Tokens > 0?}
    C -->|Yes| D[Allow Message]
    C -->|No| E[Reject Message]
    D --> F[Consume Token]
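
The check-and-consume flow above can be sketched in Python. This is an illustrative model of the algorithm, not the Rust implementation in tensor_chain/src/tcp/rate_limit.rs:

```python
import time

class TokenBucket:
    """Token bucket: refill at a steady rate, consume one token per message."""

    def __init__(self, bucket_size: int = 100, refill_rate: float = 50.0):
        self.capacity = float(bucket_size)
        self.tokens = float(bucket_size)   # bucket starts full
        self.refill_rate = refill_rate     # tokens per second
        self.last_refill = time.monotonic()

    def try_acquire(self) -> bool:
        """Allow the message if a token is available; otherwise reject."""
        now = time.monotonic()
        elapsed = now - self.last_refill
        # Refill proportionally to elapsed time, capped at bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a bucket of size 2 and refill disabled, the first two messages pass and the third is rejected until tokens refill.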

Configuration Presets

Preset      Bucket Size  Refill Rate  Description
Default     100          50/sec       Balanced throughput
Aggressive  50           25/sec       Lower burst, tighter limit
Permissive  200          100/sec      Higher throughput allowed
Disabled    -            -            No rate limiting

Configuration Example

[rate_limit]
enabled = true
bucket_size = 100
refill_rate = 50.0

Compression

Frame-level LZ4 compression reduces bandwidth usage for larger messages. Compression is negotiated during the handshake.

Frame Format

+--------+--------+------------+
| Length | Flags  | Payload    |
| 4 bytes| 1 byte | N bytes    |
+--------+--------+------------+

Flags byte:
  bit 0: 1 = LZ4 compressed, 0 = uncompressed
  bits 1-7: reserved (must be 0)
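
The frame layout can be modeled in Python with the struct module. Byte order and the assumption that the length field covers only the payload are illustrative choices here; the authoritative codec is in tensor_chain/src/tcp/framing.rs:

```python
import struct

FLAG_LZ4 = 0x01  # bit 0: payload is LZ4-compressed

def encode_frame(payload: bytes, compressed: bool = False) -> bytes:
    """4-byte length + 1-byte flags + payload (big-endian length assumed)."""
    flags = FLAG_LZ4 if compressed else 0
    return struct.pack(">IB", len(payload), flags) + payload

def decode_frame(frame: bytes) -> tuple[bytes, bool]:
    """Return (payload, was_compressed); reserved flag bits must be zero."""
    length, flags = struct.unpack(">IB", frame[:5])
    assert flags & ~FLAG_LZ4 == 0, "bits 1-7 are reserved"
    return frame[5:5 + length], bool(flags & FLAG_LZ4)
```

A round trip preserves both the payload and the compression flag.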

Configuration

Parameter  Default  Description
enabled    true     Enable compression
method     Lz4      Compression algorithm
min_size   256      Minimum payload size to compress

Messages smaller than min_size are sent uncompressed to avoid overhead.

[compression]
enabled = true
method = "Lz4"
min_size = 256

Reconnection

Automatic reconnection uses exponential backoff with jitter to recover from transient failures without overwhelming the network.

Backoff Calculation

backoff = min(initial * multiplier^attempt, max_backoff)
jitter = backoff * random(-jitter_factor, +jitter_factor)
final_delay = backoff + jitter
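
The calculation above, as a Python sketch with the defaults from the table below:

```python
import random

def backoff_delay(attempt: int, initial_ms: float = 100.0,
                  multiplier: float = 2.0, max_ms: float = 30_000.0,
                  jitter: float = 0.1, rng=None) -> float:
    """Exponential backoff with symmetric jitter, in milliseconds."""
    base = min(initial_ms * multiplier ** attempt, max_ms)
    rng = rng or random.Random()
    # Jitter spreads retries out so peers don't reconnect in lockstep.
    return base + base * rng.uniform(-jitter, jitter)
```

With jitter set to zero the schedule is the deterministic 100ms, 200ms, 400ms, ... sequence, capped at 30 seconds.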

Configuration

Parameter           Default  Description
enabled             true     Enable auto-reconnection
initial_backoff_ms  100      Initial backoff delay
max_backoff_ms      30000    Maximum backoff delay
multiplier          2.0      Exponential multiplier
max_attempts        None     Max retries (None = infinite)
jitter              0.1      Jitter factor (0.0 to 1.0)

Backoff Example

Attempt  Base Delay  With 10% Jitter
0        100ms       90-110ms
1        200ms       180-220ms
2        400ms       360-440ms
3        800ms       720-880ms
8+       30000ms     27000-33000ms

Metrics

The transport exposes statistics through TransportStats:

Metric             Description
messages_sent      Total messages sent
messages_received  Total messages received
bytes_sent         Total bytes sent
bytes_received     Total bytes received
peer_count         Number of connected peers
connection_count   Total active connections

#![allow(unused)]
fn main() {
let stats = transport.stats();
println!("Messages sent: {}", stats.messages_sent);
println!("Connected peers: {}", stats.peer_count);
}

Error Handling

Error            Cause                          Recovery
Timeout          Operation exceeded timeout     Retry with backoff
PeerNotFound     No pool for peer               Establish connection first
HandshakeFailed  Protocol mismatch or bad cert  Check configuration
TlsRequired      TLS needed but not configured  Configure TLS
MtlsRequired     mTLS needed but not enabled    Enable client auth
RateLimited      Token bucket exhausted         Wait for refill
Compression      Decompression failed           Check for data corruption

Usage Example

#![allow(unused)]
fn main() {
use tensor_chain::tcp::{
    TcpTransport, TcpTransportConfig, TlsConfig, SecurityMode,
    RateLimitConfig, CompressionConfig,
};

// Create secure production configuration
let tls = TlsConfig::new_secure(
    "/etc/neumann/node.crt",
    "/etc/neumann/node.key",
    "/etc/neumann/ca.crt",
);

let config = TcpTransportConfig::new("node1", "0.0.0.0:9100".parse()?)
    .with_tls(tls)
    .with_security_mode(SecurityMode::Strict)
    .with_rate_limit(RateLimitConfig::default())
    .with_compression(CompressionConfig::default())
    .with_pool_size(4);

// Validate security before starting
config.validate_security()?;

// Start transport
let transport = TcpTransport::new(config);
transport.start().await?;

// Connect to peer
transport.connect(&PeerConfig {
    node_id: "node2".to_string(),
    address: "10.0.1.2:9100".to_string(),
}).await?;

// Send message
transport.send(&"node2".to_string(), Message::Ping { term: 1 }).await?;

// Receive messages
let (from, msg) = transport.recv().await?;
}

Source Reference

  • tensor_chain/src/tcp/config.rs - Configuration types
  • tensor_chain/src/tcp/transport.rs - Transport implementation
  • tensor_chain/src/tcp/tls.rs - TLS wrapper
  • tensor_chain/src/tcp/rate_limit.rs - Token bucket rate limiter
  • tensor_chain/src/tcp/compression.rs - LZ4 compression
  • tensor_chain/src/tcp/framing.rs - Wire protocol codec
  • tensor_chain/src/tcp/connection.rs - Connection pool

Snapshot Streaming

The snapshot streaming system provides memory-efficient serialization and transfer of Raft log snapshots. It enables handling of large snapshots containing millions of log entries without exhausting heap memory.

Overview

Key features:

  • Incremental writing: Entries serialized one at a time via SnapshotWriter
  • Lazy reading: Entries deserialized on-demand via SnapshotReader iterator
  • Memory bounded: Automatic disk spill via SnapshotBuffer
  • Backwards compatible: Falls back to legacy format for old snapshots

flowchart TD
    A[Raft State Machine] -->|write_entry| B[SnapshotWriter]
    B -->|finish| C[SnapshotBuffer]
    C -->|memory/file| D{Size Check}
    D -->|< threshold| E[Memory Mode]
    D -->|> threshold| F[File Mode + mmap]
    E --> G[Chunk Transfer]
    F --> G
    G -->|network| H[SnapshotReader]
    H -->|iterator| I[Follower Node]
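
The memory-to-file spill in the size check above can be illustrated with a simplified Python sketch. This models the behavior only; the real SnapshotBuffer uses mmap and a configurable temp directory:

```python
import io
import tempfile

class AdaptiveBuffer:
    """Buffer writes in memory; spill to a temp file past a size threshold."""

    def __init__(self, max_memory_bytes: int = 256 * 1024 * 1024):
        self.threshold = max_memory_bytes
        self.backing = io.BytesIO()
        self.on_disk = False
        self.size = 0

    def write(self, chunk: bytes) -> None:
        self.size += len(chunk)
        if not self.on_disk and self.size >= self.threshold:
            # Spill: copy everything buffered so far into a temp file,
            # then keep writing there instead of in memory.
            spill = tempfile.TemporaryFile()
            spill.write(self.backing.getvalue())
            self.backing = spill
            self.on_disk = True
        self.backing.write(chunk)
```

Writers never see the mode switch; reads and writes go through the same interface in either mode.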

Wire Format

The streaming format uses length-prefixed entries for efficient parsing.

Header Structure

+---------+---------+-------------+
| Magic   | Version | Entry Count |
| 4 bytes | 4 bytes | 8 bytes     |
+---------+---------+-------------+
| "SNAP"  | 1       | u64 LE      |
+---------+---------+-------------+
Total: 16 bytes

Entry Structure

+---------+--------------------+
| Length  | Bincode-serialized |
| 4 bytes | LogEntry           |
+---------+--------------------+
| u32 LE  | variable           |
+---------+--------------------+

Complete Snapshot Layout

+--------+--------+--------+--------+--------+--------+
| SNAP   | Ver(1) | Count  | Len1   | Entry1 | Len2   | ...
| 4B     | 4B     | 8B     | 4B     | N bytes| 4B     | ...
+--------+--------+--------+--------+--------+--------+
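
A Python sketch of this layout, using raw byte blobs in place of bincode-serialized LogEntry values:

```python
import struct

MAGIC = b"SNAP"
VERSION = 1

def write_snapshot(entries: list[bytes]) -> bytes:
    """16-byte header, then one length-prefixed blob per entry (u32 LE)."""
    out = MAGIC + struct.pack("<IQ", VERSION, len(entries))
    for e in entries:
        out += struct.pack("<I", len(e)) + e
    return out

def read_snapshot(data: bytes) -> list[bytes]:
    """Validate the header, then walk the length-prefixed entries."""
    assert data[:4] == MAGIC, "bad magic"
    version, count = struct.unpack_from("<IQ", data, 4)
    assert version == VERSION, "unknown format version"
    entries, offset = [], 16
    for _ in range(count):
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4
        entries.append(data[offset:offset + length])
        offset += length
    return entries
```

An empty snapshot is exactly the 16-byte header, and a round trip reproduces the entry list.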

Architecture

Leader-to-Follower Flow

sequenceDiagram
    participant L as Leader
    participant W as SnapshotWriter
    participant B as SnapshotBuffer
    participant N as Network
    participant R as SnapshotReader
    participant F as Follower

    L->>W: write_entry(entry)
    W->>B: serialize + write
    Note over B: Memory or File mode
    L->>W: finish()
    W->>B: finalize()
    B->>N: chunk transfer
    N->>R: received chunks
    R->>F: iterator.next()
    F->>F: apply(entry)

SnapshotBuffer State Transitions

stateDiagram-v2
    [*] --> Memory: new()
    Memory --> Memory: write() [size < threshold]
    Memory --> File: write() [size >= threshold]
    File --> File: write() [grow if needed]
    Memory --> Finalized: finalize()
    File --> Finalized: finalize() + fsync
    Finalized --> [*]: drop (cleanup)

SnapshotBuffer

The SnapshotBuffer provides adaptive memory/disk storage with bounded memory usage.

Configuration

Parameter              Default  Description
max_memory_bytes       256 MB   Threshold before disk spill
temp_dir               System   Directory for temp files
initial_file_capacity  64 MB    Initial file size when spilling

Configuration Example

#![allow(unused)]
fn main() {
use tensor_chain::snapshot_buffer::SnapshotBufferConfig;

let config = SnapshotBufferConfig::default()
    .with_max_memory(512 * 1024 * 1024)  // 512 MB
    .with_temp_dir("/var/lib/neumann/snapshots");
}

Performance Characteristics

Operation     Memory Mode     File Mode
write()       O(1) amortized  O(1) + possible mmap resize
as_slice()    O(1)            O(1) zero-copy via mmap
read_chunk()  O(n) copy       O(n) copy
finalize()    O(1)            O(1) + fsync

SnapshotWriter

The SnapshotWriter serializes log entries incrementally using the length-prefixed format.

Usage

#![allow(unused)]
fn main() {
use tensor_chain::snapshot_streaming::SnapshotWriter;
use tensor_chain::snapshot_buffer::SnapshotBufferConfig;

let config = SnapshotBufferConfig::default();
let mut writer = SnapshotWriter::new(config)?;

// Write entries incrementally
for entry in log_entries {
    writer.write_entry(&entry)?;
}

// Check progress
println!("Entries: {}", writer.entry_count());
println!("Bytes: {}", writer.bytes_written());
println!("Last index: {}", writer.last_index());

// Finalize and get buffer
let buffer = writer.finish()?;
}

Progress Tracking

Method           Description
entry_count()    Number of entries written
bytes_written()  Total bytes including header
last_index()     Index of last entry written
last_term()      Term of last entry written

SnapshotReader

The SnapshotReader deserializes entries on-demand using an iterator interface.

Usage

#![allow(unused)]
fn main() {
use tensor_chain::snapshot_streaming::SnapshotReader;

// Create reader (validates header)
let reader = SnapshotReader::new(&buffer)?;

println!("Entry count: {}", reader.entry_count());

// Read via iterator
for result in reader {
    let entry = result?;
    state_machine.apply(entry);
}
}

Iterator Protocol

sequenceDiagram
    participant A as Application
    participant R as SnapshotReader
    participant B as Buffer

    loop For each entry
        A->>R: next()
        R->>B: read 4 bytes (length)
        R->>B: read N bytes (entry)
        R->>A: Some(Ok(LogEntry))
    end
    A->>R: next()
    R->>A: None (end)

Progress Tracking

Method          Description
entry_count()   Total entries in snapshot
entries_read()  Entries read so far
remaining()     Entries not yet read

Chunk Transfer

For network transfer, the buffer supports chunked reading with resume capability.

Resume Protocol

sequenceDiagram
    participant L as Leader
    participant F as Follower

    L->>F: Chunk 0 (offset=0, len=64KB)
    F->>F: Store chunk
    Note over F: Network interruption
    F->>L: Resume (offset=64KB)
    L->>F: Chunk 1 (offset=64KB, len=64KB)
    F->>F: Append chunk
    L->>F: Chunk 2 (offset=128KB, len=32KB)
    F->>F: Complete snapshot

Bandwidth Configuration

Chunk Size  Use Case
16 KB       High-latency networks
64 KB       Default, balanced
256 KB      Low-latency, high-bandwidth
1 MB        Local/datacenter transfers

Error Handling

Error Type     Cause                          Recovery
Io             File/mmap operation failed     Check disk space/perms
Buffer         Out of bounds read             Verify offset/length
Serialization  Bincode encode/decode failed   Check data integrity
InvalidFormat  Wrong magic, version, or size  Verify snapshot source
UnexpectedEof  Truncated data or count error  Re-transfer snapshot

Security Limits

Limit               Value   Purpose
Max entry size      100 MB  Prevent memory exhaustion
Max header version  1       Reject unknown formats

Legacy Compatibility

The system automatically handles legacy (non-streaming) snapshots.

Format Detection

#![allow(unused)]
fn main() {
use tensor_chain::snapshot_streaming::deserialize_entries;

// Automatically detects format
let entries = deserialize_entries(snapshot_bytes)?;

// Works with:
// - Streaming format (magic = "SNAP")
// - Legacy bincode Vec<LogEntry>
}

Usage Example

Complete Leader Workflow

#![allow(unused)]
fn main() {
use tensor_chain::snapshot_streaming::{SnapshotWriter, serialize_entries};
use tensor_chain::snapshot_buffer::SnapshotBufferConfig;

// Create optimized config for large snapshots
let config = SnapshotBufferConfig::default()
    .with_max_memory(256 * 1024 * 1024);

// Serialize incrementally
let mut writer = SnapshotWriter::new(config)?;
for entry in state_machine.log_entries() {
    writer.write_entry(&entry)?;
}
let buffer = writer.finish()?;

// Serve chunks to followers
let total_len = buffer.total_len();
let chunk_size = 64 * 1024;
let mut offset = 0;

while offset < total_len {
    let len = (total_len - offset).min(chunk_size as u64) as usize;
    let chunk = buffer.as_slice(offset, len)?;
    send_chunk_to_follower(offset, chunk)?;
    offset += len as u64;
}
}

Complete Follower Workflow

#![allow(unused)]
fn main() {
use tensor_chain::snapshot_streaming::SnapshotReader;
use tensor_chain::snapshot_buffer::SnapshotBuffer;

// Receive and assemble chunks
let mut buffer = SnapshotBuffer::with_defaults()?;
while let Some(chunk) = receive_chunk() {
    buffer.write(&chunk)?;
}
buffer.finalize()?;

// Verify integrity
let expected_hash = received_hash;
let actual_hash = buffer.hash();
assert_eq!(expected_hash, actual_hash);

// Apply entries
let reader = SnapshotReader::new(&buffer)?;
for result in reader {
    let entry = result?;
    state_machine.apply(entry)?;
}
}

Source Reference

  • tensor_chain/src/snapshot_streaming.rs - Streaming protocol
  • tensor_chain/src/snapshot_buffer.rs - Adaptive buffer implementation

Transaction Workspace

The transaction workspace system provides ACID transaction semantics for tensor_chain operations. It enables isolated execution with snapshot-based reads and atomic commits via delta tracking.

Overview

Key features:

  • Snapshot isolation: Reads see consistent state from transaction start
  • Delta tracking: Changes tracked as semantic embeddings for conflict detection
  • Atomic commit: All-or-nothing via block append
  • Cross-shard coordination: Two-phase commit (2PC) for distributed transactions

flowchart TD
    A[Client] -->|begin| B[TransactionWorkspace]
    B -->|snapshot| C[Checkpoint]
    B -->|add_operation| D[Operations]
    B -->|compute_delta| E[EmbeddingState]
    E --> F{Conflict Check}
    F -->|orthogonal| G[Commit]
    F -->|conflicting| H[Rollback]
    H -->|restore| C

Workspace Lifecycle

State Machine

stateDiagram-v2
    [*] --> Active: begin()
    Active --> Active: add_operation()
    Active --> Committing: mark_committing()
    Committing --> Committed: mark_committed()
    Committing --> Failed: error
    Active --> RolledBack: rollback()
    Failed --> [*]
    Committed --> [*]
    RolledBack --> [*]

State Descriptions

State       Description
Active      Operations can be added
Committing  Commit in progress, no more operations
Committed   Successfully committed to the chain
RolledBack  Rolled back, state restored from checkpoint
Failed      Error during commit, requires manual resolution

Transaction Operations

Basic Usage

#![allow(unused)]
fn main() {
use tensor_chain::{TensorStore, TransactionWorkspace};
use tensor_chain::block::Transaction;

let store = TensorStore::new();

// Begin transaction
let workspace = TransactionWorkspace::begin(&store)?;

// Add operations
workspace.add_operation(Transaction::Put {
    key: "user:1".to_string(),
    data: vec![1, 2, 3],
})?;

workspace.add_operation(Transaction::Put {
    key: "user:2".to_string(),
    data: vec![4, 5, 6],
})?;

// Check affected keys
let keys = workspace.affected_keys();
assert!(keys.contains("user:1"));

// Commit or rollback
workspace.mark_committing()?;
workspace.mark_committed();
}

Operation Types

Operation  Description           Affected Key
Put        Insert or update key  The key itself
Delete     Remove key            The key itself
Update     Modify existing key   The key itself

Delta Tracking

The workspace tracks changes as semantic embeddings using the EmbeddingState machine. This enables conflict detection based on vector similarity.

Before/After Embedding Flow

flowchart LR
    A[begin] -->|capture| B[before embedding]
    B --> C[operations]
    C -->|compute| D[after embedding]
    D --> E[delta = after - before]
    E --> F[DeltaVector]

Computing Deltas

#![allow(unused)]
fn main() {
// Set the before-state embedding at transaction start
workspace.set_before_embedding(before_embedding);

// ... execute operations ...

// Compute delta at commit time
workspace.compute_delta(after_embedding);

// Get delta for conflict detection
let delta_vector = workspace.to_delta_vector();
}

Delta Vector Structure

Field          Type             Description
embedding      Vec<f32>         Semantic change vector
affected_keys  HashSet<String>  Keys modified by transaction
tx_id          u64              Transaction identifier

Isolation Levels

The workspace provides snapshot isolation by default. All reads within a transaction see the state captured at begin().

Level               Dirty Reads  Non-Repeatable  Phantom Reads
Snapshot (default)  No           No              No

Snapshot Mechanism

  1. begin() captures the store state as a binary checkpoint
  2. All reads within the transaction see this snapshot
  3. rollback() restores the snapshot if needed
  4. Checkpoint is discarded after commit/rollback
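
The four steps above can be modeled in a few lines of Python, using a dict as the store. This is a conceptual sketch, not the TransactionWorkspace API:

```python
import copy

class Workspace:
    """Snapshot isolation sketch: checkpoint on begin, restore on rollback."""

    def __init__(self, store: dict):
        self.store = store
        self.checkpoint = copy.deepcopy(store)  # step 1: capture state at begin()

    def put(self, key, value):
        self.store[key] = value

    def rollback(self):
        # Step 3: restore the snapshot captured at begin().
        self.store.clear()
        self.store.update(self.checkpoint)

    def commit(self):
        self.checkpoint = None  # step 4: checkpoint discarded after commit
```

After a rollback the store is byte-for-byte the state it had when the workspace was opened.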

Lock Management

For distributed transactions, the LockManager provides key-level locking with deadlock prevention.

Lock Ordering

To prevent deadlocks, always acquire locks in this order:

  1. pending - Transaction state map
  2. lock_manager.locks - Key-level locks
  3. lock_manager.tx_locks - Per-transaction lock sets
  4. pending_aborts - Abort queue
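
Ordered acquisition is the classic deadlock-prevention trick: take only the locks you need, but always walk them in the global order. A Python illustration, with names mirroring the list above:

```python
import threading

# Global acquisition order; never lock out of sequence.
LOCK_ORDER = ["pending", "locks", "tx_locks", "pending_aborts"]
_locks = {name: threading.Lock() for name in LOCK_ORDER}

def acquire_in_order(needed: set[str]) -> list[str]:
    """Acquire only the needed locks, always in LOCK_ORDER."""
    acquired = []
    for name in LOCK_ORDER:
        if name in needed:
            _locks[name].acquire()
            acquired.append(name)
    return acquired

def release(acquired: list[str]) -> None:
    for name in reversed(acquired):
        _locks[name].release()
```

Because every thread climbs the same ordering, no cycle of waiting threads can form.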

Lock Configuration

Parameter        Default  Description
default_timeout  30s      Lock expiration time
timeout_ms       5000     Transaction timeout

Lock Acquisition

#![allow(unused)]
fn main() {
// Try to acquire locks for multiple keys
match lock_manager.try_lock(tx_id, &keys) {
    Ok(lock_handle) => {
        // Locks acquired successfully
    }
    Err(conflicting_tx) => {
        // Another transaction holds a lock
    }
}
}

Cross-Shard Coordination

Distributed transactions use two-phase commit (2PC) for cross-shard coordination.

2PC Protocol

sequenceDiagram
    participant C as Coordinator
    participant S1 as Shard 1
    participant S2 as Shard 2

    Note over C: Phase 1: Prepare
    C->>S1: TxPrepare(ops, delta)
    C->>S2: TxPrepare(ops, delta)
    S1->>S1: acquire locks
    S2->>S2: acquire locks
    S1->>S1: check conflicts
    S2->>S2: check conflicts
    S1->>C: Vote(Yes, delta)
    S2->>C: Vote(Yes, delta)

    Note over C: Phase 2: Commit
    C->>S1: TxCommit
    C->>S2: TxCommit
    S1->>C: Ack
    S2->>C: Ack

Transaction Phases

Phase       Description
Preparing   Acquiring locks, computing deltas
Prepared    All participants voted YES
Committing  Finalizing the commit
Committed   Successfully committed
Aborting    Rolling back due to NO vote or timeout
Aborted     Successfully aborted

Prepare Vote Types

Vote      Description                        Action
Yes       Ready to commit, locks acquired    Proceed to Phase 2
No        Cannot commit (validation failed)  Abort
Conflict  Detected semantic conflict         Abort
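
The protocol reduces to: commit only on unanimous YES votes, abort otherwise. A Python sketch with a hypothetical Shard participant (not the real coordinator in distributed_tx.rs):

```python
class Shard:
    """Hypothetical 2PC participant that always votes the same way."""

    def __init__(self, vote: str):
        self.vote, self.state = vote, "idle"

    def prepare(self) -> str:   # Phase 1: lock, validate, vote
        return self.vote

    def commit(self):           # Phase 2 on unanimous YES
        self.state = "committed"

    def abort(self):            # Phase 2 on any No/Conflict vote
        self.state = "aborted"

def two_phase_commit(participants: list[Shard]) -> str:
    votes = [p.prepare() for p in participants]      # Phase 1: collect votes
    if all(v == "yes" for v in votes):
        for p in participants:
            p.commit()                               # Phase 2: commit everywhere
        return "committed"
    for p in participants:
        p.abort()                                    # one dissent aborts all
    return "aborted"
```

A single Conflict vote forces every shard, including those that voted YES, to abort.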

Conflict Detection

The workspace uses delta embeddings to detect conflicts based on vector similarity.

Orthogonality Check

Two transactions are considered orthogonal (non-conflicting) if their delta vectors have low cosine similarity:

similarity = cos(delta_A, delta_B)
orthogonal = abs(similarity) < threshold
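
A direct Python translation of the check:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_orthogonal(delta_a, delta_b, threshold: float = 0.1) -> bool:
    """Deltas conflict only if their directions overlap beyond the threshold."""
    return abs(cosine_similarity(delta_a, delta_b)) < threshold
```

Deltas touching unrelated parts of the state point in near-perpendicular directions, so both transactions can commit.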

Configuration

Parameter             Default  Description
orthogonal_threshold  0.1      Max similarity for orthogonal
merge_window_ms       60000    Window for merge candidates

Merge Candidates

The TransactionManager can find transactions eligible for parallel commit:

#![allow(unused)]
fn main() {
// Find orthogonal transactions that can be merged
let candidates = manager.find_merge_candidates(
    &workspace,
    0.1,      // orthogonal threshold
    60_000,   // merge window (60s)
);

// Candidates are sorted by similarity (most orthogonal first)
for candidate in candidates {
    println!("Tx {} similarity: {}", candidate.workspace.id(), candidate.similarity);
}
}

Error Handling

Error              Cause                              Recovery
TransactionFailed  Operation on non-active workspace  Check state first
WorkspaceError     Snapshot/restore failed            Check store health
LockConflict       Another tx holds the lock          Retry with backoff
Timeout            Transaction exceeded timeout       Increase timeout

Usage Example

Complete Transaction Flow

#![allow(unused)]
fn main() {
use tensor_chain::{TensorStore, TransactionManager};
use tensor_chain::block::Transaction;

// Create manager
let store = TensorStore::new();
let manager = TransactionManager::new();

// Begin transaction
let workspace = manager.begin(&store)?;

// Add operations
workspace.add_operation(Transaction::Put {
    key: "account:1".to_string(),
    data: serialize(&Account { balance: 100 }),
})?;

workspace.add_operation(Transaction::Put {
    key: "account:2".to_string(),
    data: serialize(&Account { balance: 200 }),
})?;

// Set embeddings for conflict detection
workspace.set_before_embedding(vec![0.0; 128]);
workspace.compute_delta(compute_state_embedding(&store));

// Check for conflicts with other active transactions
let candidates = manager.find_merge_candidates(&workspace, 0.1, 60_000);
if candidates.is_empty() {
    // No orthogonal transactions, commit alone
    workspace.mark_committing()?;
    workspace.mark_committed();
} else {
    // Can merge with orthogonal transactions
    // ... merge logic ...
}

// Remove from manager
manager.remove(workspace.id());
}

Source Reference

  • tensor_chain/src/transaction.rs - TransactionWorkspace, TransactionManager
  • tensor_chain/src/distributed_tx.rs - 2PC coordinator, LockManager
  • tensor_chain/src/embedding.rs - EmbeddingState machine
  • tensor_chain/src/consensus.rs - DeltaVector, conflict detection

Python SDK Quickstart

The Neumann Python SDK provides both synchronous and asynchronous clients for querying a Neumann server, with optional embedded mode via PyO3 bindings.

Installation

pip install neumann

Connect

Remote (gRPC)

from neumann import NeumannClient

client = NeumannClient.connect("localhost:50051", api_key="your-api-key")

Embedded (no server needed)

client = NeumannClient.embedded(path="/tmp/neumann-data")

Async

from neumann.aio import AsyncNeumannClient

async with await AsyncNeumannClient.connect("localhost:50051") as client:
    result = await client.query("SELECT * FROM users")

Execute Queries

# Single query
result = client.query("SELECT * FROM users WHERE age > 25")

# Batch queries
results = client.execute_batch([
    "INSERT INTO users VALUES (1, 'Alice', 30)",
    "INSERT INTO users VALUES (2, 'Bob', 25)",
])

# Streaming results
for chunk in client.execute_stream("SELECT * FROM large_table"):
    process(chunk)

Handle Results

Results are typed by QueryResultType. Check the type before accessing data:

from neumann import QueryResultType

result = client.query("SELECT * FROM users")

if result.type == QueryResultType.ROWS:
    for row in result.rows:
        print(row.get_string("name"), row.get_int("age"))
        # Or convert to dict:
        print(row.to_dict())

elif result.type == QueryResultType.COUNT:
    print(f"Count: {result.count}")

elif result.type == QueryResultType.NODES:
    for node in result.nodes:
        print(node.id, node.label, node.properties)

elif result.type == QueryResultType.SIMILAR:
    for item in result.similar_items:
        print(f"{item.key}: {item.score:.4f}")

Result types

Type        Field                 Description
ROWS        result.rows           Relational query results
NODES       result.nodes          Graph nodes
EDGES       result.edges          Graph edges
PATHS       result.paths          Graph paths
SIMILAR     result.similar_items  Vector similarity results
COUNT       result.count          Integer count
VALUE       result.value          Single scalar value
TABLE_LIST  result.tables         Available tables
EMPTY       -                     No result

Vector Operations

For dedicated vector operations, use VectorClient:

from neumann import VectorClient, VectorPoint

vectors = VectorClient.connect("localhost:50051", api_key="your-key")

# Create a collection
vectors.create_collection("documents", dimension=384, distance="cosine")

# Upsert points
vectors.upsert_points("documents", [
    VectorPoint(id="doc1", vector=[0.1, 0.2, ...], payload={"title": "Hello"}),
    VectorPoint(id="doc2", vector=[0.3, 0.4, ...], payload={"title": "World"}),
])

# Query similar points
results = vectors.query_points(
    "documents",
    query_vector=[0.15, 0.25, ...],
    limit=10,
    score_threshold=0.8,
    with_payload=True,
)

for point in results:
    print(f"{point.id}: {point.score:.4f} - {point.payload}")

# Manage collections
names = vectors.list_collections()
info = vectors.get_collection("documents")
count = vectors.count_points("documents")

vectors.close()

Pandas Integration

Convert query results to DataFrames:

from neumann.integrations.pandas import result_to_dataframe, dataframe_to_inserts

# Query to DataFrame
result = client.query("SELECT * FROM users")
df = result_to_dataframe(result)
print(df.head())

# DataFrame to INSERT queries
queries = dataframe_to_inserts(df, "users_backup")
client.execute_batch(queries)

NumPy Integration

Work with vectors as NumPy arrays:

import numpy as np
from neumann.integrations.numpy import (
    vector_to_insert,
    vectors_to_inserts,
    cosine_similarity,
    normalize_vectors,
)

# Single vector insert
query = vector_to_insert("doc1", np.array([0.1, 0.2, 0.3]), normalize=True)
client.query(query)

# Batch insert
vectors_dict = {"doc1": np.array([0.1, 0.2]), "doc2": np.array([0.3, 0.4])}
queries = vectors_to_inserts(vectors_dict)
client.execute_batch(queries)

# Compute similarity locally (vec1/vec2 are NumPy arrays)
vec1, vec2 = np.array([0.1, 0.2]), np.array([0.3, 0.4])
sim = cosine_similarity(vec1, vec2)

Error Handling

from neumann import (
    NeumannError,
    ConnectionError,
    AuthenticationError,
    NotFoundError,
    QueryError,
    ParseError,
)

try:
    result = client.query("SELECT * FROM nonexistent")
except NotFoundError as e:
    print(f"Not found: {e.message}")
except ParseError as e:
    print(f"Syntax error: {e.message}")
except ConnectionError as e:
    print(f"Connection failed: {e}")
except NeumannError as e:
    print(f"Error [{e.code.name}]: {e.message}")

Configuration

Fine-tune timeouts, retries, and keepalive:

from neumann import ClientConfig, TimeoutConfig, RetryConfig

config = ClientConfig(
    timeout=TimeoutConfig(
        default_timeout_s=30.0,
        connect_timeout_s=10.0,
    ),
    retry=RetryConfig(
        max_attempts=3,
        initial_backoff_ms=100,
        max_backoff_ms=10000,
        backoff_multiplier=2.0,
    ),
)

client = NeumannClient.connect("localhost:50051", config=config)

# Preset configurations
config = ClientConfig.fast_fail()       # 5s timeout, 1 attempt
config = ClientConfig.no_retry()        # Default timeout, 1 attempt
config = ClientConfig.high_latency()    # 120s timeout, 5 attempts

Next Steps

TypeScript SDK Quickstart

The Neumann TypeScript SDK provides a fully typed client for Node.js (gRPC) and browser (gRPC-Web) environments.

Installation

npm install @neumann/client
# or
yarn add @neumann/client

Connect

Node.js (gRPC)

import { NeumannClient } from '@neumann/client';

const client = await NeumannClient.connect('localhost:9200', {
  apiKey: 'your-api-key',
  tls: false,
});

Browser (gRPC-Web)

const client = await NeumannClient.connectWeb('http://localhost:9200');

Execute Queries

// Single query
const result = await client.query('SELECT * FROM users WHERE age > 25');

// Batch queries
const results = await client.executeBatch([
  "INSERT INTO users VALUES (1, 'Alice', 30)",
  "INSERT INTO users VALUES (2, 'Bob', 25)",
]);

// Streaming results
for await (const chunk of client.executeStream('SELECT * FROM large_table')) {
  process(chunk);
}

// Paginated results
const page = await client.executePaginated('SELECT * FROM users', {
  pageSize: 100,
  countTotal: true,
});
console.log(`Total: ${page.totalCount}, Has more: ${page.hasMore}`);

Handle Results

Results use discriminated unions with type guard functions:

import {
  isRowsResult,
  isNodesResult,
  isSimilarResult,
  isCountResult,
  rowToObject,
} from '@neumann/client';

const result = await client.query('SELECT * FROM users');

if (isRowsResult(result)) {
  for (const row of result.rows) {
    const obj = rowToObject(row);
    console.log(obj.name, obj.age);
  }
}

if (isNodesResult(result)) {
  for (const node of result.nodes) {
    console.log(node.id, node.label, node.properties);
  }
}

if (isSimilarResult(result)) {
  for (const item of result.items) {
    console.log(`${item.key}: ${item.score.toFixed(4)}`);
  }
}

if (isCountResult(result)) {
  console.log(`Count: ${result.count}`);
}

Result type guards

Guard             Result Field    Description
isRowsResult      result.rows     Relational query results
isNodesResult     result.nodes    Graph nodes
isEdgesResult     result.edges    Graph edges
isSimilarResult   result.items    Vector similarity results
isCountResult     result.count    Integer count
isValueResult     result.value    Single scalar value
isErrorResult     result.error    Error message

Transactions

// Automatic commit/rollback
const result = await client.withTransaction(async (tx) => {
  await tx.execute("INSERT INTO users VALUES (1, 'Alice', 30)");
  await tx.execute("INSERT INTO users VALUES (2, 'Bob', 25)");
  return 'inserted';
});
// Transaction is committed on success, rolled back on error

// Manual control
const tx = client.beginTransaction();
await tx.begin();
await tx.execute("INSERT INTO users VALUES (3, 'Carol', 28)");
await tx.commit(); // or tx.rollback()

Vector Operations

For dedicated vector operations, use VectorClient:

import { VectorClient } from '@neumann/client';

const vectors = await VectorClient.connect('localhost:9200');

// Create a collection
await vectors.createCollection('documents', 384, 'cosine');

// Upsert points
await vectors.upsertPoints('documents', [
  { id: 'doc1', vector: [0.1, 0.2, ...], payload: { title: 'Hello' } },
  { id: 'doc2', vector: [0.3, 0.4, ...], payload: { title: 'World' } },
]);

// Query similar points
const results = await vectors.queryPoints('documents', [0.15, 0.25, ...], {
  limit: 10,
  scoreThreshold: 0.8,
  withPayload: true,
});

for (const point of results) {
  console.log(`${point.id}: ${point.score.toFixed(4)} - ${JSON.stringify(point.payload)}`);
}

// Scroll through all points
for await (const point of vectors.scrollAllPoints('documents')) {
  console.log(point.id);
}

// Manage collections
const names = await vectors.listCollections();
const info = await vectors.getCollection('documents');
const count = await vectors.countPoints('documents');

vectors.close();

Blob Operations

Upload and download binary artifacts:

import { BlobClient } from '@neumann/client';

// Connect (mirrors VectorClient.connect above)
const blob = await BlobClient.connect('localhost:9200');

// Upload from buffer (fileData holds the source bytes)
const result = await blob.uploadBlob('document.pdf', Buffer.from(fileData), {
  contentType: 'application/pdf',
  tags: ['quarterly', 'report'],
  linkedTo: ['entity-id'],
});
console.log(`Uploaded: ${result.artifactId}`);

// Download as buffer
const data = await blob.downloadBlobFull(result.artifactId);

// Stream download
for await (const chunk of blob.downloadBlob(result.artifactId)) {
  process(chunk);
}

// Metadata
const metadata = await blob.getBlobMetadata(result.artifactId);
console.log(`Size: ${metadata.size}, Type: ${metadata.contentType}`);

Error Handling

import {
  NeumannError,
  ConnectionError,
  AuthenticationError,
  NotFoundError,
  ParseError,
  ErrorCode,
} from '@neumann/client';

try {
  const result = await client.query('SELECT * FROM nonexistent');
} catch (err) {
  if (err instanceof NotFoundError) {
    console.log(`Not found: ${err.message}`);
  } else if (err instanceof ParseError) {
    console.log(`Syntax error: ${err.message}`);
  } else if (err instanceof ConnectionError) {
    console.log(`Connection failed: ${err.message}`);
  } else if (err instanceof NeumannError) {
    console.log(`Error [${ErrorCode[err.code]}]: ${err.message}`);
  }
}

Configuration

import {
  ClientConfig,
  mergeClientConfig,
  noRetryConfig,
  fastFailConfig,
  highLatencyConfig,
} from '@neumann/client';

const config: ClientConfig = {
  timeout: {
    defaultTimeoutS: 30,
    connectTimeoutS: 10,
  },
  retry: {
    maxAttempts: 3,
    initialBackoffMs: 100,
    maxBackoffMs: 10000,
    backoffMultiplier: 2.0,
  },
};

const client = await NeumannClient.connect('localhost:9200', { config });

// Preset configurations
const fast = fastFailConfig();        // 5s timeout, 1 attempt
const noRetry = noRetryConfig();      // Default timeout, 1 attempt
const highLat = highLatencyConfig();   // 120s timeout, 5 attempts

Pagination

Iterate through large result sets:

// Single page
const page = await client.executePaginated('SELECT * FROM users', {
  pageSize: 100,
  countTotal: true,
  cursorTtlSecs: 300,
});

// Iterate all pages
for await (const result of client.executeAllPages('SELECT * FROM users')) {
  if (isRowsResult(result)) {
    for (const row of result.rows) {
      process(rowToObject(row));
    }
  }
}

// Clean up cursor
if (page.nextCursor) {
  await client.closeCursor(page.nextCursor);
}

Next Steps

Query Language Reference

Neumann uses a SQL-inspired query language extended with graph, vector, blob, vault, cache, and chain commands. All commands are case-insensitive.


Relational Commands

SELECT

SELECT [DISTINCT] columns
FROM table [alias]
[JOIN table ON condition | USING (columns)]
[WHERE condition]
[GROUP BY columns]
[HAVING condition]
[ORDER BY columns [ASC|DESC] [NULLS FIRST|LAST]]
[LIMIT n]
[OFFSET n]

Columns can be *, expressions, or expr AS alias. Supports subqueries in FROM and WHERE clauses.

Join types: INNER, LEFT, RIGHT, FULL, CROSS, NATURAL.

SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.total > 100
ORDER BY o.total DESC
LIMIT 10

INSERT

INSERT INTO table [(columns)] VALUES (values), ...
INSERT INTO table [(columns)] SELECT ...
INSERT INTO users (id, name, age) VALUES (1, 'Alice', 30)
INSERT INTO users VALUES (2, 'Bob', 25), (3, 'Carol', 28)

UPDATE

UPDATE table SET column = value, ... [WHERE condition]
UPDATE users SET age = 31 WHERE name = 'Alice'

DELETE

DELETE FROM table [WHERE condition]
DELETE FROM users WHERE age < 18

CREATE TABLE

CREATE TABLE [IF NOT EXISTS] name (
    column type [constraints],
    ...
    [table_constraints]
)

Column types: INT, INTEGER, BIGINT, SMALLINT, FLOAT, DOUBLE, REAL, DECIMAL(p,s), NUMERIC(p,s), VARCHAR(n), CHAR(n), TEXT, BOOLEAN, DATE, TIME, TIMESTAMP, BLOB.

Column constraints: NOT NULL, NULL, UNIQUE, PRIMARY KEY, DEFAULT expr, CHECK(expr), REFERENCES table(column) [ON DELETE|UPDATE action].

Table constraints: PRIMARY KEY (columns), UNIQUE (columns), FOREIGN KEY (columns) REFERENCES table(column), CHECK(expr).

Referential actions: CASCADE, RESTRICT, SET NULL, SET DEFAULT, NO ACTION.

CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    total FLOAT DEFAULT 0.0,
    created TIMESTAMP,
    UNIQUE (user_id, created)
)

DROP TABLE

DROP TABLE [IF EXISTS] name [CASCADE]
DROP TABLE IF EXISTS orders CASCADE

CREATE INDEX

CREATE [UNIQUE] INDEX [IF NOT EXISTS] name ON table (columns)
CREATE INDEX idx_users_name ON users (name)
CREATE UNIQUE INDEX idx_email ON users (email)

DROP INDEX

DROP INDEX [IF EXISTS] name
DROP INDEX ON table(column)
DROP INDEX idx_users_name
DROP INDEX ON users(email)

SHOW TABLES

SHOW TABLES

Lists all relational tables.

DESCRIBE

DESCRIBE TABLE name
DESCRIBE NODE label
DESCRIBE EDGE type

Shows the schema of a table, node label, or edge type.

DESCRIBE TABLE users
DESCRIBE NODE person
DESCRIBE EDGE reports_to

Graph Commands

NODE CREATE

NODE CREATE label { key: value, ... }

Creates a node with the given label and properties.

NODE CREATE person { name: 'Alice', role: 'Engineer', team: 'Platform' }

NODE GET

NODE GET id

Retrieves a node by its ID.

NODE GET 'abc-123'

NODE DELETE

NODE DELETE id

Deletes a node by its ID.

NODE DELETE 'abc-123'

NODE LIST

NODE LIST [label] [LIMIT n] [OFFSET m]

Lists nodes, optionally filtered by label.

NODE LIST person LIMIT 10
NODE LIST

EDGE CREATE

EDGE CREATE from_id -> to_id : edge_type [{ key: value, ... }]

Creates a directed edge between two nodes.

EDGE CREATE 'alice-id' -> 'bob-id' : reports_to { since: '2024-01' }

EDGE GET

EDGE GET id

Retrieves an edge by its ID.

EDGE DELETE

EDGE DELETE id

Deletes an edge by its ID.

EDGE LIST

EDGE LIST [type] [LIMIT n] [OFFSET m]

Lists edges, optionally filtered by type.

EDGE LIST reports_to LIMIT 20

NEIGHBORS

NEIGHBORS id [OUTGOING|INCOMING|BOTH] [: edge_type]
    [BY SIMILARITY [vector] LIMIT n]

Finds neighbors of a node. The optional BY SIMILARITY clause enables cross-engine queries that combine graph traversal with vector similarity.

NEIGHBORS 'alice-id' OUTGOING : reports_to
NEIGHBORS 'node-1' BOTH BY SIMILARITY [0.1, 0.2, 0.3] LIMIT 5

PATH

PATH [SHORTEST|ALL|WEIGHTED|ALL_WEIGHTED|VARIABLE] from_id TO to_id
    [MAX_DEPTH n] [MIN_DEPTH n] [WEIGHT property]

Finds paths between two nodes.

PATH SHORTEST 'alice-id' TO 'ceo-id'
PATH WEIGHTED 'a' TO 'b' WEIGHT cost MAX_DEPTH 5
PATH ALL 'start' TO 'end' MIN_DEPTH 2 MAX_DEPTH 4
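PATH SHORTEST is, conceptually, a breadth-first search over the directed edge set. A minimal Python sketch of that traversal (the edge-list format and the shortest_path helper are illustrative, not the engine's internals):

```python
from collections import deque

def shortest_path(edges, start, goal):
    """Breadth-first search over a directed (from, to) edge list.

    Returns the list of node ids on a shortest path, or None if
    no path exists.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

MAX_DEPTH and MIN_DEPTH would translate to bounds on `len(path)` in this loop.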

Graph Algorithms

PAGERANK

PAGERANK [DAMPING d] [TOLERANCE t] [MAX_ITERATIONS n]
    [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]

Computes PageRank scores for all nodes.

PAGERANK DAMPING 0.85 MAX_ITERATIONS 100
PAGERANK EDGE_TYPE collaborates

BETWEENNESS

BETWEENNESS [SAMPLING_RATIO r]
    [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]

Computes betweenness centrality for all nodes.

BETWEENNESS SAMPLING_RATIO 0.5

CLOSENESS

CLOSENESS [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]

Computes closeness centrality for all nodes.

EIGENVECTOR

EIGENVECTOR [MAX_ITERATIONS n] [TOLERANCE t]
    [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]

Computes eigenvector centrality for all nodes.

LOUVAIN

LOUVAIN [RESOLUTION r] [MAX_PASSES n]
    [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]

Detects communities using the Louvain algorithm.

LOUVAIN RESOLUTION 1.0 MAX_PASSES 10

LABEL_PROPAGATION

LABEL_PROPAGATION [MAX_ITERATIONS n]
    [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]

Detects communities using label propagation.


Graph Constraints

GRAPH CONSTRAINT CREATE

GRAPH CONSTRAINT CREATE name ON NODE|EDGE [(label)] property UNIQUE|EXISTS|TYPE 'type'

Creates a property constraint on nodes or edges.

GRAPH CONSTRAINT CREATE unique_email ON NODE (person) email UNIQUE
GRAPH CONSTRAINT CREATE requires_name ON NODE name EXISTS

GRAPH CONSTRAINT DROP

GRAPH CONSTRAINT DROP name

GRAPH CONSTRAINT LIST

GRAPH CONSTRAINT LIST

Lists all graph constraints.

GRAPH CONSTRAINT GET

GRAPH CONSTRAINT GET name

Graph Indexes

GRAPH INDEX CREATE

GRAPH INDEX CREATE NODE PROPERTY property
GRAPH INDEX CREATE EDGE PROPERTY property
GRAPH INDEX CREATE LABEL
GRAPH INDEX CREATE EDGE_TYPE

Creates a graph property or label index.

GRAPH INDEX DROP

GRAPH INDEX DROP NODE property
GRAPH INDEX DROP EDGE property

GRAPH INDEX SHOW

GRAPH INDEX SHOW NODE
GRAPH INDEX SHOW EDGE

Graph Aggregation

COUNT NODES / COUNT EDGES

GRAPH AGGREGATE COUNT NODES [label]
GRAPH AGGREGATE COUNT EDGES [type]
GRAPH AGGREGATE COUNT NODES person
GRAPH AGGREGATE COUNT EDGES reports_to

AGGREGATE property

GRAPH AGGREGATE SUM|AVG|MIN|MAX|COUNT NODE property [label] [WHERE condition]
GRAPH AGGREGATE SUM|AVG|MIN|MAX|COUNT EDGE property [type] [WHERE condition]
GRAPH AGGREGATE AVG NODE age person
GRAPH AGGREGATE SUM EDGE weight collaborates WHERE weight > 0.5

Graph Pattern Matching

PATTERN MATCH

GRAPH PATTERN MATCH (pattern) [LIMIT n]
GRAPH PATTERN COUNT (pattern)
GRAPH PATTERN EXISTS (pattern)

Matches structural patterns in the graph.

GRAPH PATTERN MATCH (a:person)-[:reports_to]->(b:person) LIMIT 10
GRAPH PATTERN EXISTS (a:person)-[:mentors]->(b:person)

Graph Batch Operations

GRAPH BATCH CREATE NODES

GRAPH BATCH CREATE NODES [(label { props }), ...]

GRAPH BATCH CREATE EDGES

GRAPH BATCH CREATE EDGES [(from -> to : type { props }), ...]

GRAPH BATCH DELETE NODES

GRAPH BATCH DELETE NODES [id1, id2, ...]

GRAPH BATCH DELETE EDGES

GRAPH BATCH DELETE EDGES [id1, id2, ...]

GRAPH BATCH UPDATE NODES

GRAPH BATCH UPDATE NODES [(id { props }), ...]

Vector Commands

EMBED STORE

EMBED STORE key [vector] [IN collection]

Stores a vector embedding with an associated key.

EMBED STORE 'doc1' [0.1, 0.2, 0.3, 0.4]
EMBED STORE 'doc2' [0.5, 0.6, 0.7, 0.8] IN my_collection

EMBED GET

EMBED GET key [IN collection]

Retrieves a stored embedding.

EMBED GET 'doc1'

EMBED DELETE

EMBED DELETE key [IN collection]

Deletes a stored embedding.

EMBED BUILD INDEX

EMBED BUILD INDEX [IN collection]

Builds or rebuilds the HNSW index for similarity search.

EMBED BATCH

EMBED BATCH [('key1', [v1, v2, ...]), ('key2', [v1, v2, ...])] [IN collection]

Stores multiple embeddings in a single operation.

EMBED BATCH [('doc1', [0.1, 0.2]), ('doc2', [0.3, 0.4])]

SIMILAR

SIMILAR key|[vector] [LIMIT n] [METRIC COSINE|EUCLIDEAN|DOT_PRODUCT]
    [CONNECTED TO node_id] [IN collection] [WHERE condition]

Finds similar embeddings by key or vector. The optional CONNECTED TO clause combines vector similarity with graph connectivity for cross-engine queries.

SIMILAR 'doc1' LIMIT 5
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC COSINE
SIMILAR [0.1, 0.2, 0.3] LIMIT 5 CONNECTED TO 'alice-id'
SIMILAR 'doc1' LIMIT 10 IN my_collection WHERE score > 0.8

SHOW EMBEDDINGS

SHOW EMBEDDINGS [LIMIT n]

Lists stored embeddings.

SHOW VECTOR INDEX

SHOW VECTOR INDEX

Shows information about the HNSW index.

COUNT EMBEDDINGS

COUNT EMBEDDINGS

Returns the number of stored embeddings.


Unified Entity Commands

ENTITY CREATE

ENTITY CREATE key { properties } [EMBEDDING [vector]]

Creates a unified entity with optional embedding. A unified entity spans all engines: it is stored as relational data, as a graph node, and optionally as a vector embedding.

ENTITY CREATE 'alice' { name: 'Alice', role: 'Engineer' } EMBEDDING [0.1, 0.2, 0.3]

ENTITY GET

ENTITY GET key

Retrieves a unified entity with all its data across engines.

ENTITY UPDATE

ENTITY UPDATE key { properties } [EMBEDDING [vector]]

Updates an existing unified entity.

ENTITY UPDATE 'alice' { role: 'Senior Engineer' } EMBEDDING [0.15, 0.25, 0.35]

ENTITY DELETE

ENTITY DELETE key

Deletes a unified entity from all engines.

ENTITY CONNECT

ENTITY CONNECT from_key -> to_key : edge_type

Creates a relationship between two unified entities.

ENTITY CONNECT 'alice' -> 'bob' : reports_to

ENTITY BATCH

ENTITY BATCH CREATE [{ key: 'k1', props... }, { key: 'k2', props... }]

Creates multiple unified entities in a single operation.

FIND

FIND NODE [label] [WHERE condition] [LIMIT n]
FIND EDGE [type] [WHERE condition] [LIMIT n]
FIND ROWS FROM table [WHERE condition] [LIMIT n]
FIND PATH from_label -[edge_type]-> to_label [WHERE condition] [LIMIT n]

Cross-engine search that queries across relational, graph, and vector engines.

FIND NODE person WHERE name = 'Alice'
FIND EDGE reports_to LIMIT 10
FIND ROWS FROM users WHERE age > 25
FIND PATH person -[reports_to]-> person LIMIT 5

Vault Commands

VAULT SET

VAULT SET key value

Stores an encrypted secret.

VAULT SET 'api_key' 'sk-abc123'

VAULT GET

VAULT GET key

Retrieves a decrypted secret (requires appropriate access).

VAULT GET 'api_key'

VAULT DELETE

VAULT DELETE key

Deletes a secret.

VAULT LIST

VAULT LIST [pattern]

Lists secrets, optionally filtered by pattern.

VAULT LIST
VAULT LIST 'api_*'

VAULT ROTATE

VAULT ROTATE key new_value

Rotates a secret to a new value while maintaining the same key.

VAULT ROTATE 'api_key' 'sk-new456'

VAULT GRANT

VAULT GRANT entity ON key

Grants an entity access to a secret.

VAULT GRANT 'alice' ON 'api_key'

VAULT REVOKE

VAULT REVOKE entity ON key

Revokes an entity’s access to a secret.

VAULT REVOKE 'bob' ON 'api_key'

Cache Commands

CACHE INIT

CACHE INIT

Initializes the LLM response cache.

CACHE STATS

CACHE STATS

Shows cache hit/miss statistics.

CACHE CLEAR

CACHE CLEAR

Clears all cache entries.

CACHE EVICT

CACHE EVICT [n]

Evicts the least recently used entries. If n is provided, evicts that many.

CACHE GET

CACHE GET key

Retrieves a cached response by exact key.

CACHE GET 'what is machine learning?'

CACHE PUT

CACHE PUT key value

Stores a response in the cache.

CACHE PUT 'what is ML?' 'Machine learning is...'

CACHE SEMANTIC GET

CACHE SEMANTIC GET query [THRESHOLD n]

Performs a semantic similarity lookup in the cache. Returns the closest matching cached response if it exceeds the similarity threshold.

CACHE SEMANTIC GET 'explain machine learning' THRESHOLD 0.85

CACHE SEMANTIC PUT

CACHE SEMANTIC PUT query response EMBEDDING [vector]

Stores a response with its embedding for semantic matching.

CACHE SEMANTIC PUT 'what is ML?' 'Machine learning is...' EMBEDDING [0.1, 0.2, 0.3]
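The semantic cache can be pictured as a nearest-neighbor lookup over stored embeddings that returns a hit only when similarity clears the THRESHOLD. A toy Python model of that behaviour (the class and method names are invented for illustration):

```python
import math

class SemanticCache:
    """Toy semantic cache: nearest-embedding lookup with a threshold."""

    def __init__(self):
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    def get(self, embedding, threshold=0.85):
        """Return the closest cached response at or above the threshold."""
        best_score, best_response = -1.0, None
        for cached, response in self.entries:
            score = self._cosine(embedding, cached)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= threshold else None

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0
```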

Blob Storage Commands

BLOB INIT

BLOB INIT

Initializes the blob storage engine.

BLOB PUT

BLOB PUT filename [DATA value | FROM path]
    [TYPE content_type] [BY creator] [LINK entity, ...] [TAG tag, ...]

Uploads a blob with optional metadata.

BLOB PUT 'report.pdf' FROM '/tmp/report.pdf' TYPE 'application/pdf' TAG 'quarterly'
BLOB PUT 'config.json' DATA '{"key": "value"}' BY 'admin'

BLOB GET

BLOB GET artifact_id [TO path]

Downloads a blob. If TO is specified, writes to the given file path.

BLOB GET 'art-123'
BLOB GET 'art-123' TO '/tmp/download.pdf'

BLOB DELETE

BLOB DELETE artifact_id

Deletes a blob.

BLOB INFO

BLOB INFO artifact_id

Shows metadata for a blob (size, checksum, creation date, tags, links).

BLOB LINK

BLOB LINK artifact_id TO entity

Links a blob to an entity.

BLOB LINK 'art-123' TO 'alice'

BLOB UNLINK

BLOB UNLINK artifact_id FROM entity

Removes a link between a blob and an entity.

BLOB LINKS

BLOB LINKS artifact_id

Lists all entities linked to a blob.

BLOB TAG

BLOB TAG artifact_id tag

Adds a tag to a blob.

BLOB TAG 'art-123' 'important'

BLOB UNTAG

BLOB UNTAG artifact_id tag

Removes a tag from a blob.

BLOB VERIFY

BLOB VERIFY artifact_id

Verifies the integrity of a blob by checking its checksum.
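Integrity verification amounts to recomputing the content hash and comparing it against the stored checksum. A sketch assuming a SHA-256 checksum (the engine's actual hash algorithm may differ):

```python
import hashlib

def verify_blob(data: bytes, stored_checksum: str) -> bool:
    """Recompute the content hash and compare it to the stored checksum.

    SHA-256 is an assumption for illustration.
    """
    return hashlib.sha256(data).hexdigest() == stored_checksum
```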

BLOB GC

BLOB GC [FULL]

Runs garbage collection on blob storage. FULL performs a thorough sweep.

BLOB REPAIR

BLOB REPAIR

Repairs blob storage by fixing inconsistencies.

BLOB STATS

BLOB STATS

Shows blob storage statistics (total count, size, etc.).

BLOB META SET

BLOB META SET artifact_id key value

Sets a custom metadata key-value pair on a blob.

BLOB META SET 'art-123' 'department' 'engineering'

BLOB META GET

BLOB META GET artifact_id key

Gets a custom metadata value from a blob.

BLOBS

BLOBS [pattern]

Lists all blobs, optionally filtered by filename pattern.

BLOBS FOR

BLOBS FOR entity

Lists blobs linked to a specific entity.

BLOBS FOR 'alice'

BLOBS BY TAG

BLOBS BY TAG tag

Lists blobs with a specific tag.

BLOBS BY TAG 'quarterly'

BLOBS WHERE TYPE

BLOBS WHERE TYPE = content_type

Lists blobs with a specific content type.

BLOBS WHERE TYPE = 'application/pdf'

BLOBS SIMILAR TO

BLOBS SIMILAR TO artifact_id [LIMIT n]

Finds blobs similar to a given blob.


Checkpoint Commands

CHECKPOINT

CHECKPOINT [name]

Creates a named checkpoint (snapshot) of the current state.

CHECKPOINT 'before-migration'
CHECKPOINT

CHECKPOINTS

CHECKPOINTS [LIMIT n]

Lists all available checkpoints.

ROLLBACK TO

ROLLBACK TO checkpoint_id

Restores the database to a previous checkpoint.

ROLLBACK TO 'before-migration'

Chain Commands

The chain subsystem provides a tensor-native blockchain with Raft consensus.

BEGIN CHAIN TRANSACTION

BEGIN CHAIN TRANSACTION

Starts a new chain transaction. All subsequent mutations are buffered until commit.

COMMIT CHAIN

COMMIT CHAIN

Commits the current chain transaction, appending a new block.

ROLLBACK CHAIN TO

ROLLBACK CHAIN TO height

Rolls back the chain to a specific block height.

CHAIN HEIGHT

CHAIN HEIGHT

Returns the current chain height (number of blocks).

CHAIN TIP

CHAIN TIP

Returns the most recent block.

CHAIN BLOCK

CHAIN BLOCK height

Retrieves a block at the given height.

CHAIN BLOCK 42

CHAIN VERIFY

CHAIN VERIFY

Verifies the integrity of the entire chain.
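Chain verification conceptually walks the blocks and checks that each block's recorded predecessor hash matches a freshly computed hash of the previous block. A toy sketch (the dict-based block layout and the SHA-256 hash are assumptions, not the engine's actual block format):

```python
import hashlib
import json

def verify_chain(blocks):
    """Check each block's prev_hash against a recomputed hash of its
    predecessor. Blocks are dicts with 'height', 'prev_hash', 'data'."""
    def block_hash(block):
        payload = json.dumps(block, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

    for prev, current in zip(blocks, blocks[1:]):
        if current["prev_hash"] != block_hash(prev):
            return False
    return True
```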

CHAIN HISTORY

CHAIN HISTORY key

Gets the history of changes for a specific key across all blocks.

CHAIN HISTORY 'users/alice'

CHAIN SIMILAR

CHAIN SIMILAR [embedding] [LIMIT n]

Searches the chain by embedding similarity.

CHAIN SIMILAR [0.1, 0.2, 0.3] LIMIT 5

CHAIN DRIFT

CHAIN DRIFT FROM height TO height

Computes drift metrics between two chain heights.

CHAIN DRIFT FROM 10 TO 50

SHOW CODEBOOK GLOBAL

SHOW CODEBOOK GLOBAL

Shows the global codebook used for tensor compression.

SHOW CODEBOOK LOCAL

SHOW CODEBOOK LOCAL domain

Shows the local codebook for a specific domain.

SHOW CODEBOOK LOCAL 'embeddings'

ANALYZE CODEBOOK TRANSITIONS

ANALYZE CODEBOOK TRANSITIONS

Analyzes transitions between codebook states.


Cluster Commands

CLUSTER CONNECT

CLUSTER CONNECT address

Connects to a cluster node.

CLUSTER CONNECT 'node2@192.168.1.10:7000'

CLUSTER DISCONNECT

CLUSTER DISCONNECT

Disconnects from the cluster.

CLUSTER STATUS

CLUSTER STATUS

Shows the current cluster status (membership, leader, term).

CLUSTER NODES

CLUSTER NODES

Lists all cluster nodes and their states.

CLUSTER LEADER

CLUSTER LEADER

Shows the current cluster leader.


Cypher Commands (Experimental)

Neumann includes experimental support for Cypher-style graph queries.

MATCH

[OPTIONAL] MATCH pattern [WHERE condition]
RETURN items [ORDER BY items] [SKIP n] [LIMIT n]

Pattern matching query with Cypher syntax.

MATCH (p:Person)-[:REPORTS_TO]->(m:Person)
RETURN p.name, m.name

MATCH (a:Person)-[:KNOWS*1..3]->(b:Person)
WHERE a.name = 'Alice'
RETURN b.name, COUNT(*) AS depth
ORDER BY depth
LIMIT 10

Relationship patterns: -[r:TYPE]-> (outgoing), <-[r:TYPE]- (incoming), -[r:TYPE]- (undirected). Variable-length: -[*1..5]->.

CYPHER CREATE

CREATE (pattern)

Creates nodes and relationships.

CREATE (p:Person { name: 'Dave', role: 'Designer' })
CREATE (a)-[:KNOWS]->(b)

CYPHER DELETE

[DETACH] DELETE variables

Deletes nodes or relationships. DETACH DELETE first removes all of a node's relationships, then the node itself.

MERGE

MERGE (pattern) [ON CREATE SET ...] [ON MATCH SET ...]

Upsert: matches an existing pattern or creates it.

MERGE (p:Person { name: 'Alice' })
ON CREATE SET p.created = '2024-01-01'
ON MATCH SET p.updated = '2024-06-01'

Shell Commands

These commands are available in the interactive shell but are not part of the query language.

Command       Description
help          Show available commands
exit / quit   Exit the shell
clear         Clear the screen
tables        Alias for SHOW TABLES
save 'path'   Save data to a binary file
load 'path'   Load data from a binary file

Persistence

Start the shell with WAL (write-ahead log) for durability:

neumann --wal-dir ./data

Data Types

Neumann has two layers of types: scalar values for individual fields and tensor values for composite storage.


Scalar Types

ScalarValue represents a single value in the system.

Type     Description             Examples
Null     Absence of a value      NULL
Bool     Boolean                 TRUE, FALSE
Int      64-bit signed integer   42, -1, 0
Float    64-bit floating point   3.14, -0.5, 1e10
String   UTF-8 text              'hello', 'Alice'
Bytes    Raw binary data         (used internally for blob content)

Literals

  • Strings: Single-quoted: 'hello world'
  • Integers: Unquoted numbers: 42, -7
  • Floats: Numbers with decimal or exponent: 3.14, 1e-5
  • Booleans: TRUE or FALSE (case-insensitive)
  • Null: NULL
  • Arrays: Square brackets: [1, 2, 3] or [0.1, 0.2, 0.3]

Tensor Types

TensorValue wraps scalar values with vector and pointer types for the unified data model.

Type                    Description                   Use Case
Scalar(ScalarValue)     Single scalar value           Table columns, node properties
Vector(Vec<f32>)        Dense float vector            Embeddings for similarity search
Sparse(SparseVector)    Sparse vector                 Memory-efficient high-dimensional embeddings
Pointer(String)         Reference to another entity   Graph edges, foreign keys
Pointers(Vec<String>)   Multiple references           Multi-valued relationships

Column Types in CREATE TABLE

When creating relational tables, use SQL-style type names. These map to internal scalar types.

SQL Type       Internal Type   Notes
INT, INTEGER   Int             64-bit signed integer
BIGINT         Int             Same as INT (64-bit)
SMALLINT       Int             Same as INT (64-bit)
FLOAT          Float           64-bit floating point
DOUBLE         Float           Same as FLOAT
REAL           Float           Same as FLOAT
DECIMAL(p,s)   Float           Precision and scale are advisory
NUMERIC(p,s)   Float           Same as DECIMAL
VARCHAR(n)     String          Max length is advisory
CHAR(n)        String          Fixed-width (padded)
TEXT           String          Unlimited length
BOOLEAN        Bool            TRUE or FALSE
DATE           String          Stored as ISO-8601 string
TIME           String          Stored as ISO-8601 string
TIMESTAMP      String          Stored as ISO-8601 string
BLOB           Bytes           Raw binary data
custom name    String          Any unrecognized type stores as String

Type Coercion

Neumann performs implicit type coercion in comparisons:

  • Int and Float in arithmetic: Int is promoted to Float
  • String comparisons: lexicographic ordering
  • Null propagation: any operation with NULL yields NULL
  • Boolean context: only Bool values are truthy/falsy (no implicit conversion from Int)
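The promotion and NULL-propagation rules above can be modelled in a few lines. This illustrative helper mirrors the documented behaviour for addition (it is not the engine's evaluator):

```python
def add(a, b):
    """Addition with the documented coercion rules: NULL propagates,
    and Int is promoted to Float when mixed with Float."""
    if a is None or b is None:
        return None  # any operation with NULL yields NULL
    if isinstance(a, float) or isinstance(b, float):
        return float(a) + float(b)  # Int promoted to Float
    return a + b  # pure Int arithmetic stays Int
```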

Vector Representation

Dense vectors are stored as Vec<f32> and used for similarity search via HNSW indexes. All vectors in a collection must have the same dimensionality.

EMBED STORE 'doc1' [0.1, 0.2, 0.3, 0.4]

Sparse vectors use a compact representation storing only non-zero indices and values, making them efficient for high-dimensional data (e.g., 30,000+ dimensions for bag-of-words models).


Identifiers

Identifiers (table names, column names, labels) follow these rules:

  • Start with a letter or underscore
  • Contain letters, digits, and underscores
  • Case-insensitive for keywords, case-preserving for identifiers
  • No quoting required for simple names
  • Use single quotes for string values: 'value'

Functions Reference


Aggregate Functions

These functions operate on groups of rows in SELECT queries with GROUP BY.

Function        Description             Example
COUNT(*)        Count all rows          SELECT COUNT(*) FROM users
COUNT(column)   Count non-null values   SELECT COUNT(name) FROM users
SUM(column)     Sum numeric values      SELECT SUM(total) FROM orders
AVG(column)     Average numeric values  SELECT AVG(age) FROM users
MIN(column)     Minimum value           SELECT MIN(created) FROM orders
MAX(column)     Maximum value           SELECT MAX(total) FROM orders

SELECT team, COUNT(*) AS headcount, AVG(age) AS avg_age
FROM employees
GROUP BY team
HAVING COUNT(*) > 5
ORDER BY headcount DESC
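The same query can be expressed in plain Python to make the GROUP BY / HAVING / ORDER BY semantics concrete (the helper and its row format are illustrative):

```python
def headcount_by_team(employees, min_size=5):
    """Mirror of the GROUP BY / HAVING example above: group rows by team,
    compute COUNT(*) and AVG(age), keep groups with more than min_size
    rows, and sort by headcount descending."""
    groups = {}
    for row in employees:
        groups.setdefault(row["team"], []).append(row["age"])
    result = [
        {"team": team, "headcount": len(ages), "avg_age": sum(ages) / len(ages)}
        for team, ages in groups.items()
        if len(ages) > min_size  # HAVING COUNT(*) > min_size
    ]
    return sorted(result, key=lambda r: r["headcount"], reverse=True)
```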

Graph Algorithm Functions

These are invoked as top-level commands, not as SQL functions. See the Query Language Reference for full syntax.

Algorithm           Description                     Returns
PAGERANK            Link analysis ranking           Node scores (0.0-1.0)
BETWEENNESS         Bridge node importance          Node scores
CLOSENESS           Average distance to all nodes   Node scores
EIGENVECTOR         Influence-based ranking         Node scores
LOUVAIN             Community detection             Community assignments
LABEL_PROPAGATION   Community detection             Community assignments

Graph Aggregate Functions

Used with the GRAPH AGGREGATE command on node/edge properties.

Function   Description
COUNT      Count nodes/edges matching criteria
SUM        Sum of property values
AVG        Average of property values
MIN        Minimum property value
MAX        Maximum property value

GRAPH AGGREGATE COUNT NODES person
GRAPH AGGREGATE AVG NODE age person WHERE age > 20
GRAPH AGGREGATE SUM EDGE weight collaborates

Distance Metrics

Used with SIMILAR and EMBED commands for vector similarity search.

Metric               Keyword       Range                   Best For
Cosine similarity    COSINE        -1.0 to 1.0             Text embeddings, normalized vectors
Euclidean distance   EUCLIDEAN     0.0 to infinity         Spatial data, image features
Dot product          DOT_PRODUCT   -infinity to infinity   Pre-normalized vectors, recommendation

SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC COSINE
SIMILAR 'doc1' LIMIT 5 METRIC EUCLIDEAN

The default metric is COSINE when not specified.
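For reference, the three metrics compute as follows. This pure-Python sketch mirrors the standard definitions (function names are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean(a, b):
    """Euclidean (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    """Unnormalized inner product."""
    return sum(x * y for x, y in zip(a, b))
```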


Expression Operators

Arithmetic

Operator   Description
+          Addition
-          Subtraction
*          Multiplication
/          Division
%          Modulo

Comparison

Operator   Description
=          Equal
!= or <>   Not equal
<          Less than
<=         Less than or equal
>          Greater than
>=         Greater than or equal

Logical

Operator   Description
AND        Logical AND
OR         Logical OR
NOT        Logical NOT

Special Predicates

Predicate           Description          Example
IS NULL             Test for null        WHERE name IS NULL
IS NOT NULL         Test for non-null    WHERE name IS NOT NULL
IN (list)           Set membership       WHERE id IN (1, 2, 3)
NOT IN (list)       Set non-membership   WHERE id NOT IN (1, 2)
BETWEEN a AND b     Range check          WHERE age BETWEEN 18 AND 65
LIKE pattern        Pattern matching     WHERE name LIKE 'A%'
NOT LIKE pattern    Negative pattern     WHERE name NOT LIKE '%test%'
EXISTS (subquery)   Subquery existence   WHERE EXISTS (SELECT ...)

CASE Expression

CASE
    WHEN condition THEN result
    [WHEN condition THEN result ...]
    [ELSE default]
END
SELECT name,
    CASE
        WHEN age < 18 THEN 'minor'
        WHEN age < 65 THEN 'adult'
        ELSE 'senior'
    END AS category
FROM users

CAST

CAST(expression AS type)
SELECT CAST(age AS FLOAT) / 10 AS decade FROM users

Tensor Data Model

Neumann uses a unified tensor-based data model that represents all data types as mathematical tensors.

Core Types

TensorValue

The fundamental value type:

Variant                 Description           Example
Scalar(ScalarValue)     Single value          42, "hello", true
Vector(Vec<f32>)        Dense embedding       [0.1, 0.2, 0.3]
Pointer(String)         Reference to entity   "user_123"
Pointers(Vec<String>)   Multiple references   ["a", "b", "c"]

ScalarValue

Primitive values:

Variant          Rust Type        Example
Int(i64)         64-bit integer   42
Float(f64)       64-bit float     3.14
String(String)   UTF-8 string     "hello"
Bool(bool)       Boolean          true
Bytes(Vec<u8>)   Binary data      [0x01, 0x02]
Null             Null value       NULL

TensorData

A map of field names to TensorValues:

// Conceptually: HashMap<String, TensorValue>
let user = TensorData::new()
    .with("id", TensorValue::Scalar(ScalarValue::Int(1)))
    .with("name", TensorValue::Scalar(ScalarValue::String("Alice".into())))
    .with("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3]));

Sparse Vectors

For high-dimensional sparse data:

#![allow(unused)]
fn main() {
// Only stores non-zero values
let sparse = SparseVector::new(1000)  // 1000 dimensions
    .with_value(42, 0.5)
    .with_value(100, 0.3)
    .with_value(500, 0.8);
}

Operations

Operation            Description
cosine_similarity    Cosine similarity between vectors
euclidean_distance   L2 distance
dot_product          Inner product
weighted_average     Blend multiple vectors
project_orthogonal   Remove component

Type Mapping

Relational Engine

SQL Type    TensorValue
INT         Scalar(Int)
FLOAT       Scalar(Float)
STRING      Scalar(String)
BOOL        Scalar(Bool)
VECTOR(n)   Vector

Graph Engine

Graph Element   TensorValue
Node ID         Scalar(String)
Edge target     Pointer
Properties      TensorData

Vector Engine

Vector Type   TensorValue
Dense         Vector
Sparse        SparseVector (internal)

Storage Layout

Data is stored in TensorStore as key-value pairs:

Key: "users/1"
Value: TensorData {
    "id": Scalar(Int(1)),
    "name": Scalar(String("Alice")),
    "embedding": Vector([0.1, 0.2, ...])
}

Sparse Vectors

Sparse vectors are a memory-efficient representation for high-dimensional data where most values are zero.

When to Use Sparse Vectors

Use Case                  Dense              Sparse
Low dimensions (<100)     Preferred          Overhead
High dimensions (>1000)   Memory intensive   Preferred
Most values non-zero      Preferred          Overhead
<10% values non-zero      Wasteful           Preferred

SparseVector Type

#![allow(unused)]
fn main() {
pub struct SparseVector {
    dimension: usize,
    indices: Vec<usize>,
    values: Vec<f32>,
}
}

Memory Comparison

For a 10,000-dimensional vector with 100 non-zero values:

Representation   Memory
Dense Vec<f32>   40,000 bytes
Sparse           ~800 bytes
Savings          98%
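
The arithmetic behind the table can be sketched directly. This is an illustrative estimate, assuming 4 bytes per f32 value and indices compressed to 4 bytes (e.g., u32); the actual SparseVector layout may differ.

```rust
// Dense storage: one f32 per dimension.
fn dense_bytes(dim: usize) -> usize {
    dim * 4
}

// Sparse storage: one 4-byte index plus one 4-byte value per non-zero.
// Assumption: indices fit in u32 (true up to ~4 billion dimensions).
fn sparse_bytes(nnz: usize) -> usize {
    nnz * (4 + 4)
}

fn main() {
    let (dim, nnz) = (10_000, 100);
    assert_eq!(dense_bytes(dim), 40_000); // 40 KB dense
    assert_eq!(sparse_bytes(nnz), 800);   // ~800 bytes sparse
}
```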

Operations

Creation

#![allow(unused)]
fn main() {
// From dense
let sparse = SparseVector::from_dense(&[0.0, 0.5, 0.0, 0.3, 0.0]);

// Incremental
let mut sparse = SparseVector::new(1000);
sparse.set(42, 0.5);
sparse.set(100, 0.3);
}

Arithmetic

#![allow(unused)]
fn main() {
// Subtraction (for deltas)
let delta = new_state.sub(&old_state);

// Weighted average
let blended = SparseVector::weighted_average(&[
    (&vec_a, 0.7),
    (&vec_b, 0.3),
]);

// Orthogonal projection
let residual = vec.project_orthogonal(&basis);
}

Similarity Metrics

Metric      Formula               Range
Cosine      a.b / (‖a‖ * ‖b‖)     -1 to 1
Euclidean   sqrt(sum((a-b)^2))    0 to inf
Jaccard     |A ∩ B| / |A ∪ B|     0 to 1
Angular     acos(cosine) / pi     0 to 1
#![allow(unused)]
fn main() {
let sim = vec_a.cosine_similarity(&vec_b);
let dist = vec_a.euclidean_distance(&vec_b);
let jacc = vec_a.jaccard_index(&vec_b);
}
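
For intuition, cosine similarity over sparse vectors only needs to touch indices present in both operands. This standalone sketch models a sparse vector as an index-to-value map; it is not the library's SparseVector type, just the same math.

```rust
use std::collections::HashMap;

// Cosine similarity over sparse (index, value) maps.
// The dot product only iterates indices shared by both vectors.
fn cosine(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    let dot: f32 = a
        .iter()
        .filter_map(|(i, va)| b.get(i).map(|vb| va * vb))
        .sum();
    let norm = |v: &HashMap<usize, f32>| v.values().map(|x| x * x).sum::<f32>().sqrt();
    let (na, nb) = (norm(a), norm(b));
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let a: HashMap<usize, f32> = [(42, 1.0)].into_iter().collect();
    let b: HashMap<usize, f32> = [(42, 2.0)].into_iter().collect();
    // Parallel vectors: similarity 1.0 regardless of magnitude.
    assert!((cosine(&a, &b) - 1.0).abs() < 1e-6);

    let c: HashMap<usize, f32> = [(7, 1.0)].into_iter().collect();
    // Disjoint indices: no overlap, similarity 0.
    assert_eq!(cosine(&a, &c), 0.0);
}
```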

HNSW Index

Hierarchical Navigable Small World for approximate nearest neighbor search:

#![allow(unused)]
fn main() {
let mut index = HNSWIndex::new(HNSWConfig::default());

// Insert
index.insert("doc_1", sparse_vec_1);
index.insert("doc_2", sparse_vec_2);

// Search
let results = index.search(&query_vec, 10); // top 10
}

Configuration

Parameter         Default   Description
m                 16        Max connections per layer
ef_construction   200       Build-time search width
ef_search         50        Query-time search width

Delta Encoding

For tracking state changes:

#![allow(unused)]
fn main() {
// Compute delta between states
let delta = DeltaVector::from_diff(&old_embedding, &new_embedding);

// Apply delta
let new_state = old_state.add(&delta.to_sparse());

// Check if orthogonal (non-conflicting)
if delta_a.is_orthogonal(&delta_b) {
    // Can merge automatically
}
}

Compression

Sparse vectors compress well:

Method                Ratio   Speed
Varint indices        2-4x    Fast
Quantization (int8)   4x      Fast
Binary quantization   32x     Very fast
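
The "varint indices" row works because sorted indices can be stored as gaps, and small gaps fit in one LEB128-style byte. A minimal sketch of the idea (not the library's actual wire format):

```rust
// Write n as a little-endian base-128 varint: 7 payload bits per byte,
// high bit set on all but the last byte.
fn varint_encode(mut n: usize, out: &mut Vec<u8>) {
    loop {
        let byte = (n & 0x7f) as u8;
        n >>= 7;
        if n == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

// Delta-encode sorted indices, then varint each gap.
fn encode_indices(indices: &[usize]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0;
    for &i in indices {
        varint_encode(i - prev, &mut out); // store gaps, not absolute values
        prev = i;
    }
    out
}

fn main() {
    // Gaps 42, 58, 400: two one-byte varints plus one two-byte varint.
    // Raw usize storage would take 24 bytes; this takes 4.
    let bytes = encode_indices(&[42, 100, 500]);
    assert_eq!(bytes.len(), 4);
}
```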

Semantic Operations

Semantic operations in Neumann leverage vector embeddings to perform meaning-aware computations.

Core Concepts

Embeddings

Embeddings map data to vector space where similar items are close:

"cat" -> [0.2, 0.8, 0.1, ...]
"dog" -> [0.3, 0.7, 0.2, ...]  (close to cat)
"car" -> [0.9, 0.1, 0.5, ...]  (far from cat)

Similarity Search

Find items similar to a query:

SELECT * FROM documents
WHERE SIMILAR(embedding, query_vec, 0.8)
LIMIT 10;

Operations

Conflict Detection

In tensor_chain, semantic operations detect conflicts:

#![allow(unused)]
fn main() {
// Two changes conflict if their deltas overlap
let conflict = delta_a.cosine_similarity(&delta_b) > threshold;

// Orthogonal changes can be merged
if delta_a.is_orthogonal(&delta_b) {
    let merged = delta_a.add(&delta_b);
}
}

Auto-Merge

Non-conflicting changes merge automatically:

flowchart LR
    A[State S0] --> B[Change A: fields 1-10]
    A --> C[Change B: fields 11-20]
    B --> D[Merged: fields 1-20]
    C --> D

Semantic Conflict Resolution

When changes overlap:

Scenario          Detection                 Resolution
Orthogonal        similarity < 0.1          Auto-merge
Partial overlap   0.1 <= similarity < 0.5   Manual review
Direct conflict   similarity >= 0.5         Reject newer
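
The table above maps directly onto a small decision function. The enum and function names here are illustrative, not the tensor_chain API:

```rust
// Resolution outcomes from the conflict table (hypothetical names).
#[derive(Debug, PartialEq)]
enum Resolution {
    AutoMerge,    // orthogonal changes
    ManualReview, // partial overlap
    RejectNewer,  // direct conflict
}

fn resolve(similarity: f32) -> Resolution {
    if similarity < 0.1 {
        Resolution::AutoMerge
    } else if similarity < 0.5 {
        Resolution::ManualReview
    } else {
        Resolution::RejectNewer
    }
}

fn main() {
    assert_eq!(resolve(0.05), Resolution::AutoMerge);
    assert_eq!(resolve(0.3), Resolution::ManualReview);
    assert_eq!(resolve(0.9), Resolution::RejectNewer);
}
```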

Codebook Quantization

For efficient similarity comparisons:

Global Codebook

Static centroids for consensus validation:

#![allow(unused)]
fn main() {
let codebook = GlobalCodebook::new(1024, 128); // 1024 centroids, 128 dims
let quantized = codebook.quantize(&embedding);
}

Local Codebook

Adaptive centroids per domain:

#![allow(unused)]
fn main() {
let mut codebook = LocalCodebook::new(256, 128);
codebook.update(&new_embeddings, 0.1); // EMA update
}

Distance Metrics

Metric      Use Case                Properties
Cosine      Text similarity         Scale-invariant
Euclidean   Spatial data            Absolute distance
Angular     Normalized comparison   0 to 1 range
Geodesic    Manifold data           Curvature-aware

The LLM cache (tensor_cache) uses semantic similarity for lookups:

#![allow(unused)]
fn main() {
// Exact match first
if let Some(hit) = cache.get_exact(&prompt_hash) {
    return hit;
}

// Then semantic search
if let Some(hit) = cache.search_similar(&prompt_embedding, 0.95) {
    return hit;
}
}

Embedding State Machine

tensor_chain tracks embedding lifecycle:

stateDiagram-v2
    [*] --> Initial: new transaction
    Initial --> Computed: compute_embedding()
    Computed --> Validated: validate()
    Validated --> Committed: commit()
#![allow(unused)]
fn main() {
pub enum EmbeddingState {
    Initial,                    // No embedding yet
    Computed(SparseVector),     // Computed, not validated
    Validated(SparseVector),    // Passed validation
}
}

Distributed Transactions

tensor_chain implements distributed transactions using Two-Phase Commit (2PC) with semantic conflict detection.

Transaction Lifecycle

stateDiagram-v2
    [*] --> Pending: begin()
    Pending --> Preparing: prepare()
    Preparing --> Prepared: all votes received
    Prepared --> Committing: commit decision
    Prepared --> Aborting: abort decision
    Committing --> Committed: all acks
    Aborting --> Aborted: all acks
    Committed --> [*]
    Aborted --> [*]

Two-Phase Commit

Phase 1: Prepare

  1. Coordinator sends Prepare to all participants
  2. Each participant:
    • Acquires locks
    • Validates constraints
    • Writes to WAL
    • Votes Yes or No

Phase 2: Commit/Abort

  1. If all vote Yes: Coordinator sends Commit
  2. If any vote No: Coordinator sends Abort
  3. Participants apply or rollback
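
The commit rule in Phase 2 is unanimous consent: a single No vote aborts the transaction. A minimal sketch, with an illustrative vote type rather than the actual TxVote message:

```rust
// Hypothetical vote type standing in for the participant's 2PC response.
#[derive(Clone, Copy, PartialEq)]
enum Vote {
    Yes,
    No,
}

// Coordinator decision: commit only if every participant voted Yes.
fn decide(votes: &[Vote]) -> bool {
    !votes.is_empty() && votes.iter().all(|&v| v == Vote::Yes)
}

fn main() {
    assert!(decide(&[Vote::Yes, Vote::Yes, Vote::Yes]));
    assert!(!decide(&[Vote::Yes, Vote::No, Vote::Yes])); // one No aborts all
    assert!(!decide(&[])); // no participants, nothing to commit
}
```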

Message Types

Message        Direction                    Purpose
TxPrepareMsg   Coordinator -> Participant   Start prepare phase
TxVote         Participant -> Coordinator   Vote yes/no
TxCommitMsg    Coordinator -> Participant   Commit decision
TxAbortMsg     Coordinator -> Participant   Abort decision
TxAck          Participant -> Coordinator   Acknowledge commit/abort

Lock Management

Lock Types

Lock            Compatibility           Use
Shared (S)      S-S compatible          Read operations
Exclusive (X)   Incompatible with all   Write operations

Lock Acquisition

#![allow(unused)]
fn main() {
// Acquire lock with timeout
let lock = lock_manager.acquire(
    tx_id,
    key,
    LockMode::Exclusive,
    Duration::from_secs(5),
)?;
}

Deadlock Detection

Wait-for graph analysis:

#![allow(unused)]
fn main() {
// Check for cycles before waiting
if wait_graph.would_create_cycle(my_tx, blocking_tx) {
    // Abort to prevent deadlock
    return Err(DeadlockDetected);
}

// Register wait
wait_graph.add_wait(my_tx, blocking_tx);
}
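
The cycle check reduces to reachability: adding the edge my_tx -> blocking_tx closes a cycle exactly when blocking_tx can already reach my_tx in the wait-for graph. A standalone sketch of that reachability test (the real wait-graph type is internal to tensor_chain):

```rust
use std::collections::{HashMap, HashSet};

// Depth-first reachability over a wait-for graph:
// graph[tx] lists the transactions tx is waiting on.
fn reaches(graph: &HashMap<u64, Vec<u64>>, from: u64, to: u64) -> bool {
    let mut stack = vec![from];
    let mut seen = HashSet::new();
    while let Some(n) = stack.pop() {
        if n == to {
            return true;
        }
        if seen.insert(n) {
            if let Some(next) = graph.get(&n) {
                stack.extend(next);
            }
        }
    }
    false
}

fn main() {
    let mut g: HashMap<u64, Vec<u64>> = HashMap::new();
    g.insert(2, vec![1]); // tx 2 already waits on tx 1

    // Adding the wait edge 1 -> 2 would create a cycle,
    // because tx 2 can reach tx 1.
    assert!(reaches(&g, 2, 1));
    // The reverse edge is safe: tx 1 has no path back to tx 2.
    assert!(!reaches(&g, 1, 2));
}
```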

Victim Selection

Policy           Behavior
Youngest         Abort most recent transaction
Oldest           Abort longest-running
LowestPriority   Abort lowest priority
MostLocks        Abort holding most locks

Semantic Conflict Detection

Beyond lock-based conflicts, tensor_chain detects semantic conflicts:

#![allow(unused)]
fn main() {
// Compute embedding deltas
let delta_a = tx_a.compute_delta();
let delta_b = tx_b.compute_delta();

// Check for semantic overlap
if delta_a.cosine_similarity(&delta_b) > CONFLICT_THRESHOLD {
    // Semantic conflict - need manual resolution
    return PrepareVote::Conflict { ... };
}
}

Recovery

Coordinator Failure

  1. New coordinator queries participants for tx state
  2. If any committed: complete commit
  3. If all prepared: re-run commit decision
  4. Otherwise: abort

Participant Failure

  1. Participant replays WAL on restart
  2. For prepared transactions: query coordinator
  3. Apply commit or abort based on coordinator state

Configuration

#![allow(unused)]
fn main() {
pub struct DistributedTxConfig {
    /// Prepare phase timeout
    pub prepare_timeout_ms: u64,
    /// Commit phase timeout
    pub commit_timeout_ms: u64,
    /// Maximum concurrent transactions
    pub max_concurrent_tx: usize,
    /// Lock wait timeout
    pub lock_timeout_ms: u64,
}
}

Formal Verification

The 2PC protocol is formally specified in TwoPhaseCommit.tla and exhaustively model-checked with TLC across 2.3M distinct states. The model verifies Atomicity (all-or-nothing), NoOrphanedLocks, ConsistentDecision, VoteIrrevocability, and DecisionStability. See Formal Verification for full results.

Best Practices

  1. Keep transactions short: Long transactions increase conflict probability
  2. Order lock acquisition: Acquire locks in consistent order to prevent deadlocks
  3. Use appropriate isolation: Not all operations need serializable isolation
  4. Monitor deadlock rate: High rates indicate contention issues

Consensus Protocols

tensor_chain uses Raft consensus with SWIM gossip for membership management.

Raft Consensus

Overview

Raft provides:

  • Leader election
  • Log replication
  • Safety (never returns incorrect results)
  • Availability (operational if majority alive)

Node States

stateDiagram-v2
    [*] --> Follower
    Follower --> Candidate: election timeout
    Candidate --> Leader: wins election
    Candidate --> Follower: discovers leader
    Leader --> Follower: discovers higher term
    Candidate --> Candidate: split vote

Terms

Time is divided into terms, each with at most one leader:

Term 1: [Leader A] -----> [Follower timeout]
Term 2: [Election] -> [Leader B] -----> ...

Log Replication

sequenceDiagram
    participant C as Client
    participant L as Leader
    participant F1 as Follower 1
    participant F2 as Follower 2

    C->>L: Write request
    L->>L: Append to log
    par Replicate
        L->>F1: AppendEntries
        L->>F2: AppendEntries
    end
    F1->>L: Success
    F2->>L: Success
    L->>L: Commit (majority)
    L->>C: Success
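
The "Commit (majority)" step can be sketched as a pure function: with the leader's own log length included, the median-from-the-top of the sorted matchIndex values is the highest index replicated on a majority. This simplified version omits Raft's current-term check (the NeverCommitEntryPrevTerms rule), so it is a sketch of the quorum arithmetic only:

```rust
// Highest log index replicated on a majority of nodes.
// match_indices includes the leader's own last log index.
fn commit_index(mut match_indices: Vec<u64>) -> u64 {
    match_indices.sort_unstable();
    // After sorting ascending, the element at (n-1)/2 has at least
    // ceil(n/2) values >= it, i.e. a majority of nodes carry it.
    match_indices[(match_indices.len() - 1) / 2]
}

fn main() {
    // Leader at 10, followers at 10 and 7: index 10 is on 2 of 3 nodes.
    assert_eq!(commit_index(vec![10, 10, 7]), 10);
    // Leader at 10, followers at 7 and 5: only index 7 has a majority.
    assert_eq!(commit_index(vec![10, 7, 5]), 7);
}
```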

Configuration

Parameter                Default   Description
election_timeout_min     150ms     Min election timeout
election_timeout_max     300ms     Max election timeout
heartbeat_interval       50ms      Leader heartbeat frequency
max_entries_per_append   100       Batch size for replication

SWIM Gossip

Overview

Scalable Weakly-consistent Infection-style Membership:

  • O(log N) failure detection
  • Distributed membership view
  • No single point of failure

Protocol

sequenceDiagram
    participant A as Node A
    participant B as Node B (target)
    participant C as Node C

    A->>B: Ping
    Note over B: No response
    A->>C: PingReq(B)
    C->>B: Ping
    alt B responds
        B->>C: Ack
        C->>A: Ack (indirect)
    else B down
        C->>A: Nack
        A->>A: Mark B suspect
    end

Node States

State     Description          Transition
Healthy   Responding normally  -
Suspect   Failed direct ping   After timeout
Failed    Confirmed down       After indirect ping failure

LWW-CRDT Membership

Last-Writer-Wins with incarnation numbers:

#![allow(unused)]
fn main() {
// State comparison
fn supersedes(&self, other: &Self) -> bool {
    (self.incarnation, self.timestamp) > (other.incarnation, other.timestamp)
}

// Merge takes winner per node
fn merge(&mut self, other: &Self) {
    for (node_id, state) in &other.states {
        if state.supersedes(&self.states[node_id]) {
            self.states.insert(node_id.clone(), state.clone());
        }
    }
}
}

Configuration

Parameter             Default   Description
ping_interval         1s        Direct ping frequency
ping_timeout          500ms     Time to wait for response
suspect_timeout       3s        Time before marking failed
indirect_ping_count   3         Number of indirect pings

Hybrid Logical Clocks

Combine physical time with logical counters:

#![allow(unused)]
fn main() {
pub struct HybridTimestamp {
    wall_ms: u64,    // Physical time (milliseconds)
    logical: u16,    // Logical counter
}
}

Properties

  • Monotonic: Always increases
  • Bounded drift: Stays close to wall clock
  • Causality: If A happens-before B, then ts(A) < ts(B)

Usage

#![allow(unused)]
fn main() {
let hlc = HybridLogicalClock::new(node_id);

// Local event
let ts = hlc.now();

// Receive message with timestamp
let ts = hlc.receive(message_ts);
}
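
To make the monotonicity and causality properties concrete, here is a minimal self-contained HLC sketch. It assumes the standard HLC update rules (advance wall time when the physical clock moves forward, otherwise bump the logical counter; fold in remote timestamps on receive); the real HybridLogicalClock type may differ in detail:

```rust
// Timestamp ordered by (wall_ms, logical) via derived Ord.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Ts {
    wall_ms: u64,
    logical: u16,
}

struct Hlc {
    last: Ts,
}

impl Hlc {
    // Local event: take the physical clock if it moved forward,
    // otherwise bump the logical counter.
    fn tick(&mut self, physical_ms: u64) -> Ts {
        self.last = if physical_ms > self.last.wall_ms {
            Ts { wall_ms: physical_ms, logical: 0 }
        } else {
            Ts { wall_ms: self.last.wall_ms, logical: self.last.logical + 1 }
        };
        self.last
    }

    // Receive: fold in the remote timestamp so ts(send) < ts(receive).
    fn receive(&mut self, physical_ms: u64, remote: Ts) -> Ts {
        let base = self.last.max(remote);
        self.last = if physical_ms > base.wall_ms {
            Ts { wall_ms: physical_ms, logical: 0 }
        } else {
            Ts { wall_ms: base.wall_ms, logical: base.logical + 1 }
        };
        self.last
    }
}

fn main() {
    let mut hlc = Hlc { last: Ts { wall_ms: 0, logical: 0 } };
    let a = hlc.tick(100);
    let b = hlc.tick(100); // same millisecond: logical counter bumps
    assert!(a < b); // monotonic

    // A remote timestamp ahead of our clock still orders after it.
    let c = hlc.receive(100, Ts { wall_ms: 250, logical: 3 });
    assert!(b < c && c.wall_ms == 250 && c.logical == 4); // causality
}
```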

Formal Verification

Both protocols are formally specified in TLA+ and exhaustively model-checked with TLC:

  • Raft.tla verifies ElectionSafety, LogMatching, StateMachineSafety, LeaderCompleteness, VoteIntegrity, and TermMonotonicity across 18.3M distinct states.
  • Membership.tla verifies NoFalsePositivesSafety, MonotonicEpochs, and MonotonicIncarnations across 54K distinct states.

Model checking found and led to fixes for protocol bugs including out-of-order message handling in Raft log replication and an invalid fairness formula in the gossip spec. See Formal Verification for full results.

Integration

Raft and SWIM work together:

  1. SWIM detects node failures quickly
  2. Raft handles leader election and log consistency
  3. HLC provides ordering across the cluster
flowchart TB
    subgraph Membership Layer
        SWIM[SWIM Gossip]
    end

    subgraph Consensus Layer
        Raft[Raft Consensus]
    end

    subgraph Time Layer
        HLC[Hybrid Logical Clock]
    end

    SWIM -->|failure notifications| Raft
    HLC -->|timestamps| SWIM
    HLC -->|timestamps| Raft

Embedding State Machine

The EmbeddingState provides type-safe state transitions for transaction embedding lifecycle. It eliminates Option ceremony and ensures correct API usage at compile time.

Overview

Transaction embeddings track the semantic change from before-state to after-state. The state machine ensures:

  • Before embedding is always available
  • Delta is only accessible after computation
  • Dimension mismatches are caught early
  • Double-computation is prevented

State Diagram

stateDiagram-v2
    [*] --> Initial: new(before)
    Initial --> Computed: compute(after)
    Computed --> Computed: access only
    Initial --> Initial: access only

State      Description                                Available Data
Initial    Transaction started, before captured       before
Computed   Delta computed, ready for conflict check   before, after, delta

API Reference

Construction Methods

Method               Description                             Result State
new(before)          Create from sparse vector               Initial
from_dense(&[f32])   Create from dense slice                 Initial
empty(dim)           Create zero vector of given dimension   Initial
default()            Create empty (dimension 0)              Initial

State Query Methods

Method          Initial         Computed
before()        &SparseVector   &SparseVector
after()         None            Some(&SparseVector)
delta()         None            Some(&SparseVector)
is_computed()   false           true
dimension()     dimension       dimension

Transition Methods

Method                                     From      To         Error Conditions
compute(after)                             Initial   Computed   AlreadyComputed, DimensionMismatch
compute_from_dense(&[f32])                 Initial   Computed   AlreadyComputed, DimensionMismatch
compute_with_threshold(after, threshold)   Initial   Computed   AlreadyComputed, DimensionMismatch

Threshold Configuration

The compute_with_threshold method creates sparse deltas by ignoring small changes. This reduces memory usage for high-dimensional embeddings.

Threshold Effects

Threshold   Effect                        Use Case
0.0         All differences captured      Exact tracking
0.001       Ignore floating-point noise   General use
0.01        Ignore minor changes          Dimensionality reduction
0.1         Only major changes            Coarse conflict detection
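
The thresholding itself is simple to picture: keep only dimensions whose change exceeds the threshold. This is an illustrative sketch of the assumed behavior, not the compute_with_threshold implementation:

```rust
// Sparse delta between two dense states, dropping changes
// with |after - before| <= threshold.
fn sparse_delta(before: &[f32], after: &[f32], threshold: f32) -> Vec<(usize, f32)> {
    before
        .iter()
        .zip(after)
        .enumerate()
        .filter_map(|(i, (b, a))| {
            let d = a - b;
            (d.abs() > threshold).then_some((i, d))
        })
        .collect()
}

fn main() {
    let before = [1.0, 0.0, 0.5, 0.2];
    let after = [1.0, 0.5, 0.5005, 0.2];
    // With threshold 0.01, the 0.0005 change in dimension 2 is noise.
    let delta = sparse_delta(&before, &after, 0.01);
    assert_eq!(delta, vec![(1, 0.5)]); // one meaningful non-zero entry
}
```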

Example

#![allow(unused)]
fn main() {
let state = EmbeddingState::from_dense(&before);

// Only capture differences > 0.01
let computed = state.compute_with_threshold(&after, 0.01)?;

// Sparse delta - fewer non-zero entries
let delta = computed.delta().unwrap();
println!("Non-zero entries: {}", delta.nnz());
}

Error Handling

Error Types

Error               Cause                                  Prevention
NotComputed         Accessing delta before compute         Check is_computed()
AlreadyComputed     Calling compute twice                  Check is_computed()
DimensionMismatch   Before and after have different dims   Validate dimensions

Error Display

#![allow(unused)]
fn main() {
// NotComputed
"delta not yet computed"

// AlreadyComputed
"delta already computed"

// DimensionMismatch
"dimension mismatch: before=128, after=64"
}

Example Usage

Basic Workflow

#![allow(unused)]
fn main() {
use tensor_chain::embedding::EmbeddingState;
use tensor_store::SparseVector;

// 1. Capture before-state at transaction start
let before = SparseVector::from_dense(&[1.0, 0.0, 0.0, 0.0]);
let state = EmbeddingState::new(before);

// 2. State is Initial - delta not available
assert!(!state.is_computed());
assert!(state.delta().is_none());

// 3. Compute delta at commit time
let after = SparseVector::from_dense(&[1.0, 0.5, 0.0, 0.0]);
let computed = state.compute(after)?;

// 4. State is Computed - delta available
assert!(computed.is_computed());
let delta = computed.delta().unwrap();

// Delta is [0.0, 0.5, 0.0, 0.0]
assert_eq!(delta.nnz(), 1);  // Only one non-zero
}

Using delta_or_zero

For code that needs a dense vector regardless of state:

#![allow(unused)]
fn main() {
// Safe to call in any state
let dense_delta = state.delta_or_zero();

// Returns zeros if Initial
// Returns actual delta if Computed
}

Delta Magnitude

#![allow(unused)]
fn main() {
// Check if transaction made significant changes
let magnitude = state.delta_magnitude();

if magnitude < 0.001 {
    println!("No meaningful changes");
} else {
    println!("Change magnitude: {}", magnitude);
}
}

Integration with Consensus

The embedding state integrates with the consensus layer for conflict detection.

Delta to DeltaVector

#![allow(unused)]
fn main() {
use tensor_chain::consensus::DeltaVector;

let state = EmbeddingState::from_dense(&before);
let computed = state.compute_from_dense(&after)?;

// Create DeltaVector for conflict detection
let delta_vec = DeltaVector::new(
    computed.delta_or_zero(),
    affected_keys,
    tx_id,
);

// Check orthogonality with another transaction
let similarity = delta_vec.cosine_similarity(&other_delta);
if similarity.abs() < 0.1 {
    println!("Transactions are orthogonal - can merge");
}
}

Conflict Classification

Similarity   Classification   Action
< 0.1        Orthogonal       Can merge
0.1 - 0.5    Low conflict     Merge possible
0.5 - 0.9    Conflicting      Needs resolution
> 0.9        Parallel         Must serialize

Serialization

The state machine supports bitcode serialization for persistence:

#![allow(unused)]
fn main() {
// Serialize
let bytes = bitcode::encode(&state);

// Deserialize
let restored: EmbeddingState = bitcode::decode(&bytes)?;

// State is preserved
assert_eq!(state.is_computed(), restored.is_computed());
}

Source Reference

  • tensor_chain/src/embedding.rs - EmbeddingState implementation
  • tensor_store/src/lib.rs - SparseVector type

Codebook Manager

The codebook system provides vector quantization for mapping continuous tensor states to a finite vocabulary of valid states. It enables state validation and efficient consensus through hierarchical quantization.

Overview

The system consists of two levels:

  • GlobalCodebook: Static codebook shared across all nodes for consensus
  • LocalCodebook: Adaptive codebook per domain that captures residuals
flowchart TD
    A[Input Vector] --> B[Global Codebook]
    B --> C{Residual > threshold?}
    C -->|No| D[Global code only]
    C -->|Yes| E[Local Codebook]
    E --> F[Global + Local codes]
    D --> G[Final Quantization]
    F --> G

Global Codebook

The global codebook provides consensus-safe quantization using static centroids shared by all nodes.

Initialization Methods

Method                           Description
new(dimension)                   Empty codebook
from_centroids(Vec<Vec<f32>>)    From pre-computed centroids
from_centroids_with_labels       Centroids with semantic labels
from_kmeans(vectors, k, iters)   Initialize via k-means clustering

Quantization

#![allow(unused)]
fn main() {
use tensor_chain::codebook::GlobalCodebook;

// Initialize from training data
let codebook = GlobalCodebook::from_kmeans(&training_vectors, 256, 100);

// Quantize a vector
if let Some((entry_id, similarity)) = codebook.quantize(&vector) {
    println!("Nearest entry: {}, similarity: {}", entry_id, similarity);
}

// Compute residual for hierarchical quantization
if let Some((id, residual)) = codebook.compute_residual(&vector) {
    // residual = vector - centroid[id]
}
}

Local Codebook

Local codebooks adapt to domain-specific patterns using exponential moving average (EMA) updates. They capture residuals that the global codebook misses.

EMA Update Formula

When an observation matches an existing entry:

centroid_new = alpha * observation + (1 - alpha) * centroid_old

where alpha controls the learning rate (default: 0.1).
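
The formula above, written as a standalone update over a centroid (a direct transcription, independent of the LocalCodebook type):

```rust
// EMA update: drift the centroid toward the observation by factor alpha.
// centroid_new = alpha * observation + (1 - alpha) * centroid_old
fn ema_update(centroid: &mut [f32], observation: &[f32], alpha: f32) {
    for (c, o) in centroid.iter_mut().zip(observation) {
        *c = alpha * o + (1.0 - alpha) * *c;
    }
}

fn main() {
    let mut centroid = vec![1.0_f32, 0.0];
    ema_update(&mut centroid, &[0.0, 1.0], 0.1);
    // With alpha = 0.1 the centroid moves 10% toward the observation.
    assert!((centroid[0] - 0.9).abs() < 1e-6);
    assert!((centroid[1] - 0.1).abs() < 1e-6);
}
```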

Configuration

Parameter             Default   Description
max_entries           256       Maximum entries in the codebook
ema_alpha             0.1       EMA learning rate
min_usage_for_prune   2         Minimum accesses before pruning
pruning_strategy      Hybrid    How to select entries for removal

Pruning Strategies

Strategy   Description             Score Formula
LRU        Least Recently Used     last_access
LFU        Least Frequently Used   access_count
Hybrid     Weighted combination    w1*recency + w2*frequency
#![allow(unused)]
fn main() {
use tensor_chain::codebook::{LocalCodebook, PruningStrategy};

let mut local = LocalCodebook::new("transactions", 128, 256, 0.1);

// Use LRU pruning
local.set_pruning_strategy(PruningStrategy::LRU);

// Or hybrid with custom weights
local.set_pruning_strategy(PruningStrategy::Hybrid {
    recency_weight: 0.7,
    frequency_weight: 0.3,
});
}

CodebookManager

The CodebookManager coordinates hierarchical quantization across global and local codebooks.

Configuration

#![allow(unused)]
fn main() {
use tensor_chain::codebook::{CodebookManager, CodebookConfig, GlobalCodebook};

let config = CodebookConfig {
    local_capacity: 256,        // Max entries per local codebook
    ema_alpha: 0.1,             // EMA learning rate
    similarity_threshold: 0.9,   // Match threshold for local updates
    residual_threshold: 0.05,    // Min residual for local quantization
    validity_threshold: 0.8,     // State validity threshold
};

let global = GlobalCodebook::from_kmeans(&training_data, 512, 100);
let manager = CodebookManager::new(global, config);
}

Hierarchical Quantization

sequenceDiagram
    participant V as Input Vector
    participant G as Global Codebook
    participant L as Local Codebook
    participant R as Result

    V->>G: quantize()
    G->>G: Find nearest centroid
    G->>V: (entry_id, similarity)
    V->>V: residual = vector - centroid
    alt residual > threshold
        V->>L: quantize_and_update(residual)
        L->>L: EMA update or insert
        L->>R: codes = [global_id, local_id]
    else residual <= threshold
        V->>R: codes = [global_id]
    end

Usage

#![allow(unused)]
fn main() {
// Quantize a transaction embedding
let result = manager.quantize("transactions", &embedding)?;

println!("Global entry: {}", result.global_entry_id);
println!("Global similarity: {}", result.global_similarity);

if let Some(local_id) = result.local_entry_id {
    println!("Local entry: {}", local_id);
    println!("Local similarity: {}", result.local_similarity.unwrap());
}

// Final codes for storage/transmission
println!("Codes: {:?}", result.codes);
}

State Validation

The codebook system validates states against known-good patterns.

Validation Methods

#![allow(unused)]
fn main() {
// Check if a state is valid (matches any codebook entry)
let is_valid = manager.is_valid_state("transactions", &state);

// Check if a transition is valid
let is_valid_transition = manager.is_valid_transition(
    "transactions",
    &from_state,
    &to_state,
    0.5,  // max allowed distance
);
}

Validation Flow

Check                 Threshold            Outcome
Global match          validity_threshold   Valid if similarity >= threshold
Local match           validity_threshold   Valid if similarity >= threshold
Transition distance   max_distance         Valid if euclidean <= max

Worked Example

Training and Runtime Quantization

#![allow(unused)]
fn main() {
use tensor_chain::codebook::{CodebookManager, CodebookConfig, GlobalCodebook};

// Phase 1: Training - build global codebook
let training_embeddings: Vec<Vec<f32>> = collect_training_data();
let global = GlobalCodebook::from_kmeans(&training_embeddings, 512, 100);

// Phase 2: Runtime - create manager
let config = CodebookConfig::default();
let manager = CodebookManager::new(global, config);

// Phase 3: Quantize incoming transactions
for tx in transactions {
    let embedding = compute_embedding(&tx);

    // Hierarchical quantization
    let quant = manager.quantize("transactions", &embedding)?;

    // Validate the state
    if !manager.is_valid_state("transactions", &embedding) {
        warn!("Unusual transaction state: {:?}", tx);
    }

    // Store codes for consensus
    tx.set_quantization_codes(quant.codes);
}

// Local codebook learns domain-specific patterns over time
manager.with_local("transactions", |local| {
    let stats = local.stats();
    println!("Local entries: {}", stats.entry_count);
    println!("Total updates: {}", stats.total_updates);
});
}

k-Means Initialization

The global codebook uses k-means++ initialization for optimal centroid placement.

flowchart TD
    A[Training Vectors] --> B[Random First Centroid]
    B --> C[Compute Distances]
    C --> D[Weighted Probability Selection]
    D --> E{k centroids?}
    E -->|No| C
    E -->|Yes| F[Lloyd's Iteration]
    F --> G{Converged?}
    G -->|No| F
    G -->|Yes| H[Final Centroids]

Configuration

#![allow(unused)]
fn main() {
use tensor_store::{KMeans, KMeansConfig};

let config = KMeansConfig {
    max_iterations: 100,
    tolerance: 1e-4,
    init_method: InitMethod::KMeansPlusPlus,
};

let kmeans = KMeans::new(config);
let centroids = kmeans.fit(&vectors, 512);
}

Statistics and Monitoring

Local Codebook Stats

Metric             Description
entry_count        Current number of entries
total_updates      EMA updates performed
total_lookups      Quantization queries
total_prunes       Entries removed due to capacity
total_insertions   New entries added
#![allow(unused)]
fn main() {
manager.with_local("transactions", |local| {
    let stats = local.stats();
    let hit_rate = 1.0 - (stats.total_insertions as f64 / stats.total_lookups as f64);
    println!("Cache hit rate: {:.2}%", hit_rate * 100.0);
});
}

Source Reference

  • tensor_chain/src/codebook.rs - Codebook implementations
  • tensor_store/src/lib.rs - KMeans clustering

Formal Verification

Neumann’s distributed protocols are formally specified in TLA+ and exhaustively model-checked with the TLC model checker. The specifications live in specs/tla/ and cover the three critical protocol layers in tensor_chain.

What Is Model Checked

TLC explores every reachable state of a bounded model, checking safety invariants and temporal properties at each state. Unlike testing (which samples executions), model checking is exhaustive: if TLC reports no errors, the properties hold for every possible interleaving within the model bounds.

Raft Consensus (Raft.tla)

Models leader election, log replication, and commit advancement for the Tensor-Raft protocol implemented in tensor_chain/src/raft.rs. Three configurations exercise different aspects of the protocol.

Properties verified (14):

Property                        Type        What It Means
ElectionSafety                  Invariant   At most one leader per term
LogMatching                     Invariant   Same index + term implies same entry
StateMachineSafety              Invariant   No divergent committed entries
LeaderCompleteness              Invariant   Committed entries survive leader changes
VoteIntegrity                   Invariant   Each node votes at most once per term
PreVoteSafety                   Invariant   Pre-vote does not disrupt existing leaders
ReplicationInv                  Invariant   Every committed entry exists on a quorum
TermMonotonicity                Temporal    Terms never decrease
CommittedLogAppendOnlyProp      Temporal    Committed entries never retracted
MonotonicCommitIndexProp        Temporal    commitIndex never decreases
MonotonicMatchIndexProp         Temporal    matchIndex monotonic per leader term
NeverCommitEntryPrevTermsProp   Temporal    Only current-term entries committed
StateTransitionsProp            Temporal    Valid state machine transitions
PermittedLogChangesProp         Temporal    Log changes only via valid paths

Result (Raft.cfg, 3 nodes): 6,641,341 states generated, 1,338,669 distinct states, depth 42, 2 min 24s. Zero errors.

Two-Phase Commit (TwoPhaseCommit.tla)

Models the 2PC protocol for cross-shard distributed transactions implemented in tensor_chain/src/distributed_tx.rs. Includes a fault model with message loss and participant crash/recovery.

Properties verified (5):

Property             Type        What It Means
Atomicity            Invariant   All participants commit or all abort
NoOrphanedLocks      Invariant   Completed transactions release locks
ConsistentDecision   Invariant   Coordinator decision matches outcomes
VoteIrrevocability   Temporal    Prepared votes cannot be retracted without coordinator
DecisionStability    Temporal    Coordinator decision never changes

Fault model: DropMessage (network loss) and ParticipantRestart (crash with WAL-backed lock recovery).

Result: 1,869,429,350 states generated, 190,170,601 distinct states, depth 29, 2 hr 55 min. Zero errors. Every reachable state under message loss and crash/recovery satisfies all properties.

SWIM Gossip Membership (Membership.tla)

Models the SWIM-based gossip protocol for cluster membership and failure detection implemented in tensor_chain/src/gossip.rs and tensor_chain/src/membership.rs.

Properties verified (3):

Property                 Type        What It Means
NoFalsePositivesSafety   Invariant   No node marked Failed above its own incarnation
MonotonicEpochs          Temporal    Lamport timestamps never decrease
MonotonicIncarnations    Temporal    Incarnation numbers never decrease

Result (2-node): 136,097 states generated, 54,148 distinct states, depth 17. Zero errors. Result (3-node): 16,513 states generated, 5,992 distinct states, depth 13. Zero errors.

Bugs Found by Model Checking

TLC discovered real protocol bugs that would be extremely difficult to find through testing alone:

  1. matchIndex response reporting (Raft): Follower reported matchIndex = Len(log) instead of prevLogIndex + Len(entries). A heartbeat response would falsely claim the full log matched the leader’s, enabling incorrect commits. Caught by ReplicationInv.

  2. Out-of-order matchIndex regression (Raft): Leader unconditionally set matchIndex from responses. A stale heartbeat response arriving after a replication response would regress the value. Fixed by taking the max. Caught by MonotonicMatchIndexProp.

  3. inPreVote not reset on step-down (Raft): When stepping down to a higher term, the inPreVote flag was not cleared. A node could remain in pre-vote state as a Follower. Caught by PreVoteSafety.

  4. Self-message processing (Raft): A leader could process its own AppendEntries heartbeat, truncating its own log.

  5. Heartbeat log wipe (Raft): Empty heartbeat messages with prevLogIndex = 0 computed an empty new log, destroying committed entries.

  6. Out-of-order AppendEntries (Raft): Stale messages could overwrite entries from newer messages. Fixed with proper Raft Section 5.3 conflict-resolution.

  7. Gossip fairness formula (Membership): Quantification over messages (a state variable) inside WF_vars is semantically invalid in TLA+.

How to Run

cd specs/tla

# Fast CI check (~3 minutes total)
make ci

# All configs including extensions
make all

# Individual specs
java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
  -deadlock -workers auto -config Raft.cfg Raft.tla

The -deadlock flag suppresses false deadlock reports on terminal states in bounded models. The -workers auto flag enables multi-threaded checking.

Relationship to Testing

| Technique | Coverage | Finds |
|---|---|---|
| Unit tests | Specific scenarios | Implementation bugs |
| Integration tests | Cross-crate workflows | Wiring bugs |
| Fuzz testing | Random inputs | Crash/panic bugs |
| Model checking | All interleavings | Protocol design bugs |

Model checking complements testing. It verifies the protocol design is correct (no possible interleaving violates safety), while tests verify the Rust implementation matches the design. Together they provide high confidence that the distributed protocols behave correctly.

Further Reading

Worked Examples

This tutorial demonstrates tensor_chain’s conflict detection, deadlock resolution, and orthogonal transaction merging through detailed scenarios.

Prerequisites

  • Understanding of transaction workspaces
  • Familiarity with delta embeddings
  • Basic knowledge of distributed transactions

Scenario 1: Semantic Conflict Detection

Two transactions modify overlapping data. The system detects the conflict using delta embedding similarity.

Setup

use tensor_chain::{TensorStore, TransactionManager};
use tensor_chain::block::Transaction;

let store = TensorStore::new();
let manager = TransactionManager::new();

// Initialize account data
store.put("account:1", serialize(&Account { balance: 1000 }))?;
store.put("account:2", serialize(&Account { balance: 2000 }))?;

Transaction Execution

// Transaction A: Transfer from account:1 to account:2
let tx_a = manager.begin(&store)?;
tx_a.add_operation(Transaction::Put {
    key: "account:1".to_string(),
    data: serialize(&Account { balance: 900 }),  // -100
})?;
tx_a.add_operation(Transaction::Put {
    key: "account:2".to_string(),
    data: serialize(&Account { balance: 2100 }), // +100
})?;

// Transaction B: Transfer from account:1 to account:3 (conflicts on account:1)
let tx_b = manager.begin(&store)?;
tx_b.add_operation(Transaction::Put {
    key: "account:1".to_string(),
    data: serialize(&Account { balance: 800 }),  // -200
})?;
tx_b.add_operation(Transaction::Put {
    key: "account:3".to_string(),
    data: serialize(&Account { balance: 200 }),  // +200
})?;

Conflict Detection Flow

sequenceDiagram
    participant A as Transaction A
    participant B as Transaction B
    participant CM as ConsensusManager

    A->>A: compute_delta()
    Note over A: delta_a = [0.8, 0.2, 0.0, 0.0]
    B->>B: compute_delta()
    Note over B: delta_b = [0.9, 0.0, 0.1, 0.0]

    A->>CM: prepare(delta_a)
    B->>CM: prepare(delta_b)

    CM->>CM: cosine_similarity(delta_a, delta_b)
    Note over CM: similarity = 0.72 (HIGH)

    CM->>A: Vote::Yes
    CM->>B: Vote::Conflict(similarity=0.72, tx=A)

    A->>A: commit()
    B->>B: abort() + retry

Classification Table

| Similarity Range | Classification | Action |
|---|---|---|
| 0.0 - 0.1 | Orthogonal | Parallel commit OK |
| 0.1 - 0.5 | Low overlap | Merge possible |
| 0.5 - 0.9 | Conflicting | Serialize execution |
| 0.9 - 1.0 | Near-identical | Abort one |
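
The classification reduces to a cosine-similarity check over the two delta embeddings. An illustrative sketch that mirrors the table's thresholds (function names are assumptions, not the tensor_chain API):

```rust
// Cosine similarity between two delta embeddings.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

// Map a similarity score onto the table's four bands.
fn classify(similarity: f32) -> &'static str {
    match similarity {
        s if s < 0.1 => "orthogonal: parallel commit OK",
        s if s < 0.5 => "low overlap: merge possible",
        s if s < 0.9 => "conflicting: serialize execution",
        _ => "near-identical: abort one",
    }
}
```

For example, the 0.72 similarity from the sequence diagram above falls in the conflicting band, so one transaction is serialized behind the other.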

Application Retry Logic

// Retry with exponential backoff
let mut attempt = 0;
let max_attempts = 5;

loop {
    let workspace = manager.begin(&store)?;

    // Re-read current state
    let account: Account = deserialize(&store.get("account:1")?)?;

    // Apply changes
    workspace.add_operation(Transaction::Put {
        key: "account:1".to_string(),
        data: serialize(&Account {
            balance: account.balance - 200,
        }),
    })?;

    // Try to commit
    match commit_with_conflict_check(&workspace, &manager) {
        Ok(()) => break,
        Err(ConflictError { similarity, .. }) => {
            attempt += 1;
            if attempt >= max_attempts {
                return Err("max retries exceeded".into());
            }

            // Exponential backoff with jitter
            let backoff = (100 * 2u64.pow(attempt)) + rand::random::<u64>() % 50;
            std::thread::sleep(Duration::from_millis(backoff));
        }
    }
}

Scenario 2: Deadlock Detection and Resolution

Two transactions wait on each other’s locks, creating a cycle in the wait-for graph.

Setup

Transaction T1: needs locks on [key_A, key_B]
Transaction T2: needs locks on [key_B, key_A]

Timeline:
  T1 acquires key_A
  T2 acquires key_B
  T1 waits for key_B (held by T2)
  T2 waits for key_A (held by T1)
  -> DEADLOCK

Wait-For Graph

flowchart LR
    T1((T1)) -->|waits for key_B| T2((T2))
    T2 -->|waits for key_A| T1
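
Cycle detection over a wait-for graph like this can be sketched with a depth-first search. A minimal illustrative version (not the WaitForGraph API), where edges map each waiting transaction to the transactions it waits on:

```rust
use std::collections::{HashMap, HashSet};

// Returns true if the wait-for graph contains a cycle (a deadlock).
fn has_cycle(edges: &HashMap<u64, Vec<u64>>) -> bool {
    fn visit(
        n: u64,
        edges: &HashMap<u64, Vec<u64>>,
        path: &mut HashSet<u64>,
        done: &mut HashSet<u64>,
    ) -> bool {
        if path.contains(&n) {
            return true; // back edge: n is already on the current path
        }
        if done.contains(&n) {
            return false; // fully explored earlier, no cycle through n
        }
        path.insert(n);
        for &m in edges.get(&n).into_iter().flatten() {
            if visit(m, edges, path, done) {
                return true;
            }
        }
        path.remove(&n);
        done.insert(n);
        false
    }
    let mut done = HashSet::new();
    edges.keys().any(|&n| visit(n, edges, &mut HashSet::new(), &mut done))
}
```

For the T1/T2 scenario above, the edges {T1 -> [T2], T2 -> [T1]} form a cycle and the detector fires.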

Detection Flow

sequenceDiagram
    participant T1 as Transaction 1
    participant T2 as Transaction 2
    participant LM as LockManager
    participant WG as WaitForGraph
    participant DD as DeadlockDetector

    T1->>LM: try_lock(key_A)
    LM->>T1: Ok(handle_1)

    T2->>LM: try_lock(key_B)
    LM->>T2: Ok(handle_2)

    T1->>LM: try_lock(key_B)
    LM->>WG: add_wait(T1, T2)
    LM->>T1: Err(blocked by T2)

    T2->>LM: try_lock(key_A)
    LM->>WG: add_wait(T2, T1)
    LM->>T2: Err(blocked by T1)

    DD->>WG: detect_cycle()
    WG->>DD: Some([T1, T2])

    DD->>DD: select_victim(T2)
    DD->>T2: abort()
    DD->>LM: release(T2)
    DD->>WG: remove(T2)

    T1->>LM: try_lock(key_B)
    LM->>T1: Ok(handle_3)

Victim Selection

| Criterion | Weight | Description |
|---|---|---|
| Lock count | 0.3 | Fewer locks = preferred victim |
| Transaction age | 0.3 | Younger = preferred victim |
| Priority | 0.4 | Lower priority = preferred victim |

fn select_victim(
    cycle: &[u64],
    priorities: &HashMap<u64, u32>,
    lock_manager: &LockManager,
) -> u64 {
    cycle
        .iter()
        .min_by_key(|&&tx_id| {
            let priority = priorities.get(&tx_id).copied().unwrap_or(0);
            let lock_count = lock_manager.lock_count_for_transaction(tx_id);
            (priority, lock_count)
        })
        .copied()
        .unwrap()
}

Configuration

[deadlock]
detection_interval_ms = 100
max_cycle_length = 10
victim_selection = "youngest"  # or "lowest_priority", "fewest_locks"

Scenario 3: Orthogonal Transaction Merging

Two transactions modify non-overlapping data with orthogonal delta embeddings. They can be committed in parallel.

Setup

// Transaction A: Update user preferences
let tx_a = manager.begin(&store)?;
tx_a.add_operation(Transaction::Put {
    key: "user:1:prefs".to_string(),
    data: serialize(&Preferences { theme: "dark" }),
})?;
tx_a.set_before_embedding(vec![0.0; 128]);
tx_a.compute_delta(vec![1.0, 0.0, 0.0, 0.0]);  // Direction: X

// Transaction B: Update product inventory
let tx_b = manager.begin(&store)?;
tx_b.add_operation(Transaction::Put {
    key: "product:42:stock".to_string(),
    data: serialize(&Stock { quantity: 100 }),
})?;
tx_b.set_before_embedding(vec![0.0; 128]);
tx_b.compute_delta(vec![0.0, 1.0, 0.0, 0.0]);  // Direction: Y

Parallel Commit

sequenceDiagram
    participant A as Transaction A
    participant B as Transaction B
    participant CM as ConsensusManager
    participant C as Chain

    par Prepare Phase
        A->>CM: prepare(delta_a)
        B->>CM: prepare(delta_b)
    end

    CM->>CM: similarity = 0.0 (ORTHOGONAL)

    par Commit Phase
        CM->>A: Vote::Yes
        CM->>B: Vote::Yes
        A->>C: append(block_a)
        B->>C: append(block_b)
    end

    Note over C: Both blocks committed

Orthogonality Analysis

| Transaction A | Transaction B | Overlap | Similarity | Can Merge? |
|---|---|---|---|---|
| user:1:prefs | product:42:stock | None | 0.00 | Yes |
| user:1:balance | user:2:balance | None | 0.15 | Yes |
| user:1:balance | user:1:prefs | user:1 | 0.30 | Maybe |
| account:1 | account:1 | Full | 0.95 | No |

Merge Implementation

// Find merge candidates
let candidates = manager.find_merge_candidates(
    &tx_a,
    0.1,      // orthogonal threshold
    60_000,   // merge window (60s)
);

if !candidates.is_empty() {
    // Create merged block with multiple transactions
    let mut merged_ops = tx_a.operations();
    for candidate in &candidates {
        merged_ops.extend(candidate.workspace.operations());
    }

    // Compute merged delta
    let mut merged_delta = tx_a.to_delta_vector();
    for candidate in &candidates {
        merged_delta = merged_delta.add(&candidate.delta);
    }

    // Single block contains both transactions
    let block = Block::new(header, merged_ops);
    chain.append(block)?;

    // Mark all as committed
    tx_a.mark_committed();
    for candidate in candidates {
        candidate.workspace.mark_committed();
    }
}

Summary

| Scenario | Detection Method | Resolution |
|---|---|---|
| Conflict | Delta similarity > 0.5 | Serialize, retry loser |
| Deadlock | Wait-for graph cycle | Abort victim, retry |
| Orthogonal | Delta similarity < 0.1 | Parallel commit/merge |

Further Reading

Deployment

Single Node

For development and testing:

neumann --data-dir ./data

Cluster Deployment

Prerequisites

  • 3, 5, or 7 nodes (odd number for quorum)
  • Network connectivity between nodes
  • Synchronized clocks (NTP)

Configuration

Each node needs a config file:

# /etc/neumann/config.toml

[node]
id = "node1"
data_dir = "/var/lib/neumann"
bind_address = "0.0.0.0:7878"

[cluster]
peers = [
    "node2:7878",
    "node3:7878",
]

[raft]
election_timeout_min_ms = 150
election_timeout_max_ms = 300
heartbeat_interval_ms = 50

[gossip]
bind_address = "0.0.0.0:7879"
ping_interval_ms = 1000

Starting the Cluster

# Start first node (will become leader)
neumann --config /etc/neumann/config.toml --bootstrap

# Start remaining nodes
neumann --config /etc/neumann/config.toml

Verify Cluster Health

# Check cluster status
curl http://node1:9090/health

# View membership
neumann-admin cluster-status

Docker Compose

version: '3.8'
services:
  node1:
    image: neumann/neumann:latest
    environment:
      - NEUMANN_NODE_ID=node1
      - NEUMANN_PEERS=node2:7878,node3:7878
    ports:
      - "7878:7878"
      - "9090:9090"
    volumes:
      - node1-data:/var/lib/neumann

  node2:
    image: neumann/neumann:latest
    environment:
      - NEUMANN_NODE_ID=node2
      - NEUMANN_PEERS=node1:7878,node3:7878
    volumes:
      - node2-data:/var/lib/neumann

  node3:
    image: neumann/neumann:latest
    environment:
      - NEUMANN_NODE_ID=node3
      - NEUMANN_PEERS=node1:7878,node2:7878
    volumes:
      - node3-data:/var/lib/neumann

volumes:
  node1-data:
  node2-data:
  node3-data:

Kubernetes

See the Helm chart in deploy/helm/neumann/.

helm install neumann ./deploy/helm/neumann \
  --set replicas=3 \
  --set persistence.size=100Gi

Production Checklist

  • Odd number of nodes (3, 5, or 7)
  • Nodes in separate availability zones
  • NTP configured and synchronized
  • Firewall rules for ports 7878, 7879, 9090
  • Monitoring and alerting configured
  • Backup strategy in place
  • Resource limits set appropriately

Configuration

Configuration Sources

Configuration is loaded in order (later overrides earlier):

  1. Default values
  2. Config file (/etc/neumann/config.toml)
  3. Environment variables (NEUMANN_*)
  4. Command-line flags
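
The precedence rules above can be sketched as a chain of fallbacks. The resolve function below is an illustrative sketch, not the actual loader:

```rust
// Later sources override earlier ones: flag beats env, env beats file,
// file beats the built-in default.
fn resolve(default: &str, file: Option<&str>, env: Option<&str>, flag: Option<&str>) -> String {
    flag.or(env).or(file).unwrap_or(default).to_string()
}
```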

Config File Format

[node]
id = "node1"
data_dir = "/var/lib/neumann"
bind_address = "0.0.0.0:7878"

[cluster]
peers = ["node2:7878", "node3:7878"]

[raft]
election_timeout_min_ms = 150
election_timeout_max_ms = 300
heartbeat_interval_ms = 50
max_entries_per_append = 100
snapshot_interval = 10000

[gossip]
bind_address = "0.0.0.0:7879"
ping_interval_ms = 1000
ping_timeout_ms = 500
suspect_timeout_ms = 3000
indirect_ping_count = 3

[transaction]
prepare_timeout_ms = 5000
commit_timeout_ms = 5000
lock_timeout_ms = 5000
max_concurrent_tx = 1000

[deadlock]
enabled = true
detection_interval_ms = 100
victim_policy = "youngest"
auto_abort_victim = true

[storage]
max_memory_mb = 1024
wal_sync_mode = "fsync"
compression = "lz4"

[metrics]
enabled = true
bind_address = "0.0.0.0:9090"

Environment Variables

| Variable | Config Path | Example |
|---|---|---|
| NEUMANN_NODE_ID | node.id | node1 |
| NEUMANN_DATA_DIR | node.data_dir | /var/lib/neumann |
| NEUMANN_PEERS | cluster.peers | node2:7878,node3:7878 |
| NEUMANN_LOG_LEVEL | | info |

Command-Line Flags

neumann \
  --config /etc/neumann/config.toml \
  --node-id node1 \
  --data-dir /var/lib/neumann \
  --bind 0.0.0.0:7878 \
  --bootstrap \
  --log-level debug

Key Parameters

Raft Tuning

| Parameter | Default | Tuning |
|---|---|---|
| election_timeout_min_ms | 150 | Increase for high-latency networks |
| election_timeout_max_ms | 300 | Should be 2x min |
| heartbeat_interval_ms | 50 | Lower for faster failure detection |
| snapshot_interval | 10000 | Higher for less I/O, slower recovery |
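
The min/max pair exists because each node draws a fresh random timeout from that range, so followers rarely time out simultaneously and split the vote. An illustrative sketch (a tiny deterministic generator stands in for a real RNG; this is not the tensor_chain implementation):

```rust
use std::time::Duration;

// Draw an election timeout in [min_ms, max_ms) from a caller-supplied seed.
fn election_timeout(min_ms: u64, max_ms: u64, seed: u64) -> Duration {
    // Linear congruential step; a real implementation would use a proper RNG.
    let r = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    Duration::from_millis(min_ms + r % (max_ms - min_ms))
}
```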

Transaction Tuning

| Parameter | Default | Tuning |
|---|---|---|
| prepare_timeout_ms | 5000 | Increase for slow networks |
| lock_timeout_ms | 5000 | Lower to fail fast on contention |
| max_concurrent_tx | 1000 | Based on memory and CPU |

Storage Tuning

| Parameter | Default | Tuning |
|---|---|---|
| max_memory_mb | 1024 | Based on available RAM |
| wal_sync_mode | fsync | none for speed (data loss risk) |
| compression | lz4 | none for speed, zstd for ratio |

Monitoring

Metrics Endpoint

Prometheus metrics are exposed at http://node:9090/metrics.

Key Metrics

Raft Consensus

| Metric | Type | Description |
|---|---|---|
| tensor_chain_raft_state | Gauge | Current state (follower=0, candidate=1, leader=2) |
| tensor_chain_term | Gauge | Current Raft term |
| tensor_chain_commit_index | Gauge | Highest committed log index |
| tensor_chain_applied_index | Gauge | Highest applied log index |
| tensor_chain_elections_total | Counter | Total elections started |
| tensor_chain_append_entries_total | Counter | Total AppendEntries RPCs |

Transactions

| Metric | Type | Description |
|---|---|---|
| tensor_chain_tx_active | Gauge | Currently active transactions |
| tensor_chain_tx_commits_total | Counter | Total committed transactions |
| tensor_chain_tx_aborts_total | Counter | Total aborted transactions |
| tensor_chain_tx_latency_seconds | Histogram | Transaction latency |

Deadlock Detection

| Metric | Type | Description |
|---|---|---|
| tensor_chain_deadlocks_total | Counter | Total deadlocks detected |
| tensor_chain_deadlock_victims_total | Counter | Transactions aborted as victims |
| tensor_chain_wait_graph_size | Gauge | Current wait-for graph size |

Gossip

| Metric | Type | Description |
|---|---|---|
| tensor_chain_gossip_members | Gauge | Known cluster members |
| tensor_chain_gossip_healthy | Gauge | Healthy members |
| tensor_chain_gossip_suspect | Gauge | Suspect members |
| tensor_chain_gossip_failed | Gauge | Failed members |

Storage

| Metric | Type | Description |
|---|---|---|
| tensor_chain_entries_total | Gauge | Total stored entries |
| tensor_chain_memory_bytes | Gauge | Memory usage |
| tensor_chain_disk_bytes | Gauge | Disk usage |
| tensor_chain_wal_size_bytes | Gauge | WAL file size |

Prometheus Configuration

scrape_configs:
  - job_name: 'neumann'
    static_configs:
      - targets:
        - 'node1:9090'
        - 'node2:9090'
        - 'node3:9090'

Grafana Dashboard

Import the dashboard from deploy/grafana/neumann-dashboard.json.

Panels include:

  • Cluster overview (leader, term, members)
  • Transaction throughput and latency
  • Replication lag
  • Memory and disk usage
  • Deadlock rate

Alerting Rules

See docs/book/src/operations/runbooks/ for alert definitions.

groups:
  - name: neumann
    rules:
      - alert: NoLeader
        expr: sum(tensor_chain_raft_state{state="leader"}) == 0
        for: 30s
        labels:
          severity: critical

      - alert: HighReplicationLag
        expr: tensor_chain_commit_index - tensor_chain_applied_index > 1000
        for: 1m
        labels:
          severity: warning

      - alert: HighDeadlockRate
        expr: rate(tensor_chain_deadlocks_total[5m]) > 1
        for: 5m
        labels:
          severity: warning

Health Endpoint

curl http://node:9090/health

Response:

{
  "status": "healthy",
  "raft_state": "leader",
  "term": 42,
  "commit_index": 12345,
  "members": 3,
  "healthy_members": 3
}

Logging

Configure log level:

RUST_LOG=tensor_chain=debug neumann

Log levels: error, warn, info, debug, trace

Troubleshooting

Common Issues

Node Won’t Start

Symptom: Node exits immediately or fails to bind

Check:

# Port already in use
lsof -i :7878
lsof -i :9090

# Permissions
ls -la /var/lib/neumann

# Config syntax
neumann --config /etc/neumann/config.toml --validate

Solutions:

  • Kill conflicting process
  • Fix directory permissions: chown -R neumann:neumann /var/lib/neumann
  • Fix config syntax errors

Can’t Connect to Cluster

Symptom: Client connections timeout

Check:

# Network connectivity
nc -zv node1 7878

# Firewall rules
iptables -L -n | grep 7878

# Node health
curl http://node1:9090/health

Solutions:

  • Open firewall ports 7878, 7879, 9090
  • Check DNS resolution
  • Verify node is running

Slow Performance

Symptom: High latency, low throughput

Check:

# Metrics
curl http://node1:9090/metrics | grep -E "(latency|throughput)"

# Disk I/O
iostat -x 1

# Memory
free -h

# CPU
top -p $(pgrep neumann)

Solutions:

  • Increase memory allocation
  • Use faster storage (NVMe)
  • Tune Raft parameters
  • Add more nodes for read scaling

Data Inconsistency

Symptom: Different nodes return different data

Check:

# Compare commit indices
for node in node1 node2 node3; do
  curl -s http://$node:9090/metrics | grep commit_index
done

# Check for partitions
neumann-admin cluster-status

Solutions:

  • Wait for replication to catch up
  • Check network connectivity
  • Follow split-brain runbook if partitioned

High Memory Usage

Symptom: OOM kills, swap usage

Check:

# Memory breakdown
curl http://node1:9090/metrics | grep memory

# Process memory
ps aux | grep neumann

Solutions:

  • Increase max_memory_mb config
  • Trigger snapshot to reduce log size
  • Add more nodes to distribute load

WAL Growing Too Large

Symptom: Disk filling up

Check:

# WAL size
du -sh /var/lib/neumann/wal/

# Snapshot status
ls -la /var/lib/neumann/snapshots/

Solutions:

  • Trigger manual snapshot: curl -X POST http://node:9090/admin/snapshot
  • Reduce snapshot_interval
  • Add more disk space

Debug Logging

Enable detailed logging:

RUST_LOG=tensor_chain=debug,tower=warn neumann

For specific modules:

RUST_LOG=tensor_chain::raft=trace,tensor_chain::gossip=debug neumann

Getting Help

  1. Check the runbooks for specific scenarios
  2. Search GitHub issues
  3. Open a new issue with:
    • Neumann version
    • Configuration (redact secrets)
    • Relevant logs
    • Steps to reproduce

Example Configurations

This page provides complete configuration examples for different deployment scenarios.

Development (Single Node)

Minimal configuration for local development and testing.

[node]
id = "dev-node"
data_dir = "./data"

[cluster]
# Single node cluster - no seeds needed
seeds = []
port = 9100

# Disable TLS for development
[tls]
enabled = false

# Minimal rate limiting
[rate_limit]
enabled = false

# No compression for easier debugging
[compression]
enabled = false

# Shorter timeouts for faster feedback
[transactions]
timeout_ms = 1000
lock_timeout_ms = 500

# Verbose logging
[logging]
level = "debug"
format = "pretty"

Production (3-Node Cluster)

Standard production configuration with TLS, rate limiting, and tuned timeouts.

# === Node Configuration ===
[node]
id = "node1"
data_dir = "/var/lib/neumann/data"
# Bind to all interfaces
bind_address = "0.0.0.0"

# === Cluster Configuration ===
[cluster]
seeds = ["node1.example.com:9100", "node2.example.com:9100", "node3.example.com:9100"]
port = 9100
# Cluster name for identification
name = "production"

# === TLS Configuration ===
[tls]
enabled = true
cert_path = "/etc/neumann/node1.crt"
key_path = "/etc/neumann/node1.key"
ca_cert_path = "/etc/neumann/ca.crt"
# Require mutual TLS
require_client_auth = true
# Verify node identity matches certificate
node_id_verification = "CommonName"

# === TCP Transport ===
[tcp]
# Connections per peer
pool_size = 4
# Connection timeout
connect_timeout_ms = 5000
# Read/write timeout
io_timeout_ms = 30000
# Enable keepalive
keepalive = true
keepalive_interval_secs = 30
# Maximum message size (16 MB)
max_message_size = 16777216
# Outbound queue size
max_pending_messages = 1000

# === Rate Limiting ===
[rate_limit]
enabled = true
# Burst capacity
bucket_size = 100
# Tokens per second
refill_rate = 50.0

# === Compression ===
[compression]
enabled = true
method = "Lz4"
# Only compress messages > 256 bytes
min_size = 256

# === Transactions ===
[transactions]
# Transaction timeout
timeout_ms = 5000
# Lock timeout
lock_timeout_ms = 30000
# Default embedding dimension
embedding_dimension = 128

# === Conflict Detection ===
[consensus]
# Similarity threshold for conflict
conflict_threshold = 0.5
# Threshold for orthogonal merge
orthogonal_threshold = 0.1
# Merge window
merge_window_ms = 60000

# === Deadlock Detection ===
[deadlock]
enabled = true
detection_interval_ms = 100
max_cycle_length = 10

# === Snapshots ===
[snapshots]
# Memory threshold before disk spill
max_memory_bytes = 268435456  # 256 MB
# Snapshot interval
interval_secs = 3600
# Retention count
retain_count = 3

# === Metrics ===
[metrics]
enabled = true
# Prometheus endpoint
endpoint = "0.0.0.0:9090"
# Include detailed histograms
detailed = true

# === Logging ===
[logging]
level = "info"
format = "json"
# Log to file
file = "/var/log/neumann/neumann.log"
# Rotate logs
max_size_mb = 100
max_files = 10

High-Throughput (5-Node)

Optimized configuration for maximum write throughput.

[node]
id = "node1"
data_dir = "/var/lib/neumann/data"

[cluster]
seeds = [
    "node1.example.com:9100",
    "node2.example.com:9100",
    "node3.example.com:9100",
    "node4.example.com:9100",
    "node5.example.com:9100",
]
port = 9100
name = "high-throughput"

# === TLS (same as production) ===
[tls]
enabled = true
cert_path = "/etc/neumann/node1.crt"
key_path = "/etc/neumann/node1.key"
ca_cert_path = "/etc/neumann/ca.crt"
require_client_auth = true

# === TCP - Optimized for throughput ===
[tcp]
# More connections for parallelism
pool_size = 8
# Shorter timeouts for faster failover
connect_timeout_ms = 2000
io_timeout_ms = 10000
keepalive = true
keepalive_interval_secs = 15
# Larger message size for batching
max_message_size = 67108864  # 64 MB
# Larger queues for buffering
max_pending_messages = 5000
recv_buffer_size = 5000

# === Rate Limiting - Permissive ===
[rate_limit]
enabled = true
bucket_size = 500
refill_rate = 250.0

# === Compression - Aggressive ===
[compression]
enabled = true
method = "Lz4"
# Compress even small messages
min_size = 64

# === Transactions - Fast ===
[transactions]
timeout_ms = 2000
lock_timeout_ms = 5000
embedding_dimension = 64  # Smaller for speed

# === Consensus - Optimized ===
[consensus]
# Higher thresholds permit more parallel commits and merging
conflict_threshold = 0.7
orthogonal_threshold = 0.2
merge_window_ms = 30000

# === Deadlock - Frequent checks ===
[deadlock]
enabled = true
detection_interval_ms = 50

# === Raft - Tuned for throughput ===
[raft]
# Batch more entries
max_entries_per_append = 1000
# Shorter election timeout
election_timeout_ms = 500
# Faster heartbeats
heartbeat_interval_ms = 100

# === Snapshots - Less frequent ===
[snapshots]
max_memory_bytes = 536870912  # 512 MB
interval_secs = 7200
retain_count = 2

Geo-Distributed (Multi-Region)

Configuration for clusters spanning multiple geographic regions with higher latency tolerance.

[node]
id = "node1-us-east"
data_dir = "/var/lib/neumann/data"
region = "us-east-1"

[cluster]
seeds = [
    "node1-us-east.example.com:9100",
    "node2-us-west.example.com:9100",
    "node3-eu-west.example.com:9100",
]
port = 9100
name = "geo-distributed"

# === TLS (same as production) ===
[tls]
enabled = true
cert_path = "/etc/neumann/node1-us-east.crt"
key_path = "/etc/neumann/node1-us-east.key"
ca_cert_path = "/etc/neumann/ca.crt"
require_client_auth = true

# === TCP - WAN optimized ===
[tcp]
pool_size = 4
# Longer timeouts for cross-region latency
connect_timeout_ms = 10000
io_timeout_ms = 60000
keepalive = true
# More frequent keepalives to detect failures
keepalive_interval_secs = 10
max_message_size = 16777216

# === Rate Limiting - Standard ===
[rate_limit]
enabled = true
bucket_size = 100
refill_rate = 50.0

# === Compression - Always on for WAN ===
[compression]
enabled = true
method = "Lz4"
min_size = 128

# === Transactions - Longer timeouts ===
[transactions]
# Higher timeout for cross-region coordination
timeout_ms = 15000
lock_timeout_ms = 60000
embedding_dimension = 128

# === Consensus - Relaxed for latency ===
[consensus]
conflict_threshold = 0.5
orthogonal_threshold = 0.1
# Longer merge window for slow convergence
merge_window_ms = 120000

# === Deadlock - Less frequent for WAN ===
[deadlock]
enabled = true
detection_interval_ms = 500

# === Raft - WAN tuned ===
[raft]
max_entries_per_append = 100
# Longer election timeout for WAN latency
election_timeout_ms = 3000
heartbeat_interval_ms = 500
# Enable pre-vote to prevent disruption during partitions
pre_vote = true

# === Snapshots ===
[snapshots]
max_memory_bytes = 268435456
interval_secs = 3600
retain_count = 5

# === Reconnection - Aggressive ===
[reconnection]
enabled = true
initial_backoff_ms = 500
max_backoff_ms = 60000
multiplier = 2.0
jitter = 0.2

# === Region awareness ===
[region]
# Prefer local reads
local_read_preference = true
# Region priority for leader election
priority = 1
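
The [reconnection] parameters above describe an exponential backoff schedule. A sketch of the growth curve (jitter omitted for determinism; this mirrors the parameter names but is not the real implementation):

```rust
use std::time::Duration;

// Backoff for the nth reconnection attempt: geometric growth from the
// initial delay, capped at the maximum.
fn backoff_for_attempt(attempt: u32, initial_ms: u64, max_ms: u64, multiplier: f64) -> Duration {
    let raw = initial_ms as f64 * multiplier.powi(attempt as i32);
    Duration::from_millis(raw.min(max_ms as f64) as u64)
}
```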

Configuration Reference

Environment Variables

All configuration values can be overridden with environment variables:

NEUMANN_NODE_ID=node1
NEUMANN_CLUSTER_PORT=9100
NEUMANN_TLS_ENABLED=true
NEUMANN_LOGGING_LEVEL=debug

Configuration Precedence

  1. Command-line arguments (highest)
  2. Environment variables
  3. Configuration file
  4. Default values (lowest)

See Also

Runbooks

Operational runbooks for managing Neumann clusters, focusing on tensor_chain distributed operations.

Available Runbooks

| Runbook | Scenario | Severity |
|---|---|---|
| Leader Election | Cluster has no leader | Critical |
| Split-Brain Recovery | Network partition healed | Critical |
| Node Recovery | Node crash or disk failure | High |
| Backup and Restore | Data backup and disaster recovery | High |
| Capacity Planning | Resource sizing and scaling | Medium |
| Deadlock Resolution | Transaction deadlocks | Medium |

How to Use These Runbooks

  1. Identify the symptom from alerts or monitoring
  2. Find the matching runbook in the table above
  3. Follow the diagnostic steps to confirm root cause
  4. Execute the resolution steps in order
  5. Verify recovery using the provided checks

Alerting Rules

Each runbook includes Prometheus alerting rules. Deploy them to your monitoring stack:

# Copy alerting rules
cp docs/book/src/operations/alerting-rules.yml /etc/prometheus/rules/neumann.yml

# Reload Prometheus
curl -X POST http://prometheus:9090/-/reload

Emergency Contacts

For production incidents:

  1. Page the on-call engineer
  2. Start an incident channel
  3. Follow the relevant runbook
  4. Document actions taken
  5. Schedule post-incident review

Node Management

This runbook covers adding and removing nodes from a tensor_chain cluster.

Adding a Node

Prerequisites Checklist

  • New node has network connectivity to existing cluster members
  • TLS certificates are configured (if using TLS)
  • Node has sufficient disk space for snapshot transfer
  • Firewall rules allow traffic on cluster port (default: 9100)
  • DNS/hostname resolution configured for the new node

Symptoms (Why Add a Node)

  • Cluster capacity insufficient for workload
  • Need additional replicas for fault tolerance
  • Geographic distribution requirements
  • Performance scaling requirements

Procedure

Step 1: Prepare the new node

# Install Neumann on the new node
cargo install neumann --version X.Y.Z

# Create configuration directory
mkdir -p /etc/neumann
mkdir -p /var/lib/neumann/data

# Copy TLS certificates (if using TLS)
scp admin@existing-node:/etc/neumann/ca.crt /etc/neumann/
# Generate node-specific certificates
./scripts/generate-node-cert.sh node4

Step 2: Configure the new node

Create /etc/neumann/config.toml:

[node]
id = "node4"
data_dir = "/var/lib/neumann/data"

[cluster]
# Existing cluster members for initial discovery
seeds = ["node1:9100", "node2:9100", "node3:9100"]
port = 9100

[tls]
cert_path = "/etc/neumann/node4.crt"
key_path = "/etc/neumann/node4.key"
ca_cert_path = "/etc/neumann/ca.crt"

Step 3: Join the cluster

# Start the node in join mode
neumann start --join

# Monitor the join process
neumann status --watch

Step 4: Verify cluster membership

# On any existing node
neumann cluster members

# Expected output:
# ID     ADDRESS       STATE     ROLE
# node1  10.0.1.1:9100 healthy   leader
# node2  10.0.1.2:9100 healthy   follower
# node3  10.0.1.3:9100 healthy   follower
# node4  10.0.1.4:9100 healthy   follower  <-- new node

Post-Addition Verification

# Verify snapshot transfer completed
neumann status node4 --verbose

# Check replication lag
neumann metrics node4 | grep replication_lag

# Verify the node participates in consensus
neumann raft status

Removing a Node

Prerequisites Checklist

  • Cluster will maintain quorum after removal
  • Node is not the current leader (trigger election first)
  • Data has been replicated to other nodes
  • No in-flight transactions involving this node

Symptoms (Why Remove a Node)

  • Hardware failure requiring decommission
  • Cluster right-sizing
  • Node relocation to different region
  • Maintenance requiring extended downtime

Pre-Removal Verification

# Check current cluster state
neumann cluster members

# Verify quorum will be maintained
# For N nodes, quorum = (N/2) + 1
# 5 nodes -> quorum = 3, can remove 2
# 3 nodes -> quorum = 2, can remove 1

Procedure

Step 1: Drain the node (graceful removal)

# Mark node as draining (stops accepting new requests)
neumann node drain node3

# Wait for in-flight transactions to complete
neumann node wait-drain node3 --timeout 300

Step 2: Transfer leadership if necessary

# Check if node is leader
neumann raft status

# If leader, trigger election
neumann raft transfer-leadership --to node1

Step 3: Remove from cluster

# Remove the node from cluster configuration
neumann cluster remove node3

# Verify removal
neumann cluster members

Step 4: Stop the node

# On the removed node
neumann stop

# Clean up data (optional)
rm -rf /var/lib/neumann/data/*

Post-Removal Verification

# Verify cluster health
neumann cluster health

# Check that remaining nodes have correct membership
neumann cluster members

# Verify no pending transactions for removed node
neumann transactions pending

Emergency Removal

Use emergency removal only when a node is unresponsive and cannot be drained gracefully.

Symptoms

  • Node is unreachable (network partition, hardware failure)
  • Node is unresponsive (hung process, resource exhaustion)
  • Need to restore quorum quickly

Procedure

# Force remove unresponsive node
neumann cluster remove node3 --force

# The cluster will:
# 1. Remove node from membership
# 2. Abort any transactions involving the node
# 3. Re-elect leader if necessary

Resolution

After emergency removal:

  1. Investigate root cause of node failure
  2. Repair or replace hardware if needed
  3. Re-add node using the addition procedure above

Prevention

  • Monitor node health with alerting
  • Configure appropriate timeouts
  • Maintain sufficient cluster size for fault tolerance

Quorum Considerations

| Cluster Size | Quorum | Fault Tolerance | Notes |
|---|---|---|---|
| 1 | 1 | 0 | Development only |
| 2 | 2 | 0 | Not recommended |
| 3 | 2 | 1 | Minimum for production |
| 5 | 3 | 2 | Recommended for HA |
| 7 | 4 | 3 | Maximum practical size |

Quorum Formula

quorum = (cluster_size / 2) + 1
fault_tolerance = cluster_size - quorum
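
The formula translates directly to code; a trivial sketch:

```rust
// Majority quorum: more than half the nodes must agree.
fn quorum(cluster_size: u32) -> u32 {
    cluster_size / 2 + 1
}

// Nodes that can fail while the cluster still reaches quorum.
fn fault_tolerance(cluster_size: u32) -> u32 {
    cluster_size - quorum(cluster_size)
}
```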

Best Practices

  • Always maintain odd number of nodes
  • Never remove nodes if it would violate quorum
  • Plan node additions/removals during low-traffic periods
  • Test failover scenarios regularly

See Also

Cluster Upgrade

This runbook covers upgrading tensor_chain clusters with minimal downtime.

Upgrade Types

| Type | Downtime | Complexity | Use Case |
|---|---|---|---|
| Rolling | None | Low | Minor version upgrades |
| Blue-Green | Minimal | Medium | Major version upgrades |
| Canary | None | High | Risk-sensitive environments |

Rolling Upgrade

Upgrade nodes one at a time while maintaining cluster availability.

Prerequisites

  • Cluster has 3+ nodes for quorum during upgrades
  • New version is backwards compatible with current version
  • Upgrade tested in staging environment
  • Backup of cluster state completed

Symptoms (Why Upgrade)

  • Security patches available
  • New features required
  • Bug fixes needed
  • Performance improvements available

Upgrade Sequence

sequenceDiagram
    participant F1 as Follower 1
    participant F2 as Follower 2
    participant L as Leader
    participant A as Admin

    Note over A: Start rolling upgrade
    A->>F1: upgrade
    F1->>F1: restart with new version
    F1->>L: rejoin cluster
    Note over F1: Follower 1 upgraded

    A->>F2: upgrade
    F2->>F2: restart with new version
    F2->>L: rejoin cluster
    Note over F2: Follower 2 upgraded

    A->>L: transfer leadership
    L->>F1: leadership transferred
    A->>L: upgrade (now follower)
    L->>F1: rejoin cluster
    Note over L: All nodes upgraded

Procedure

Step 1: Pre-upgrade checks

# Verify cluster health
neumann cluster health

# Check current versions
neumann cluster versions

# Verify backup is current
neumann backup status

Step 2: Upgrade followers first

# For each follower node:

# 1. Drain the node
neumann node drain node2

# 2. Stop the service
ssh node2 "systemctl stop neumann"

# 3. Upgrade the binary
ssh node2 "cargo install neumann --version X.Y.Z"

# 4. Start the service
ssh node2 "systemctl start neumann"

# 5. Verify rejoin
neumann cluster members

# 6. Wait for replication catch-up
neumann metrics node2 | grep replication_lag

Step 3: Upgrade the leader

# Transfer leadership to an upgraded follower
neumann raft transfer-leadership --to node2

# Verify leadership transferred
neumann raft status

# Now upgrade the old leader (same steps as followers)
neumann node drain node1
ssh node1 "systemctl stop neumann"
ssh node1 "cargo install neumann --version X.Y.Z"
ssh node1 "systemctl start neumann"

Step 4: Post-upgrade verification

# Verify all nodes on new version
neumann cluster versions

# Expected output:
# ID     VERSION
# node1  X.Y.Z
# node2  X.Y.Z
# node3  X.Y.Z

# Run health checks
neumann cluster health

# Verify functionality with test transactions
neumann test-transaction

Version Compatibility

Compatibility Matrix

| From Version | To Version | Compatible | Notes |
|---|---|---|---|
| 0.9.x | 0.10.x | Yes | Rolling upgrade supported |
| 0.10.x | 0.11.x | Yes | Rolling upgrade supported |
| 0.8.x | 0.10.x | No | Blue-green required |
| 0.x.x | 1.0.x | No | Blue-green required |

Version Skew Policy

  • Maximum skew: 1 minor version during rolling upgrades
  • Leader version: Must be >= follower versions
  • Upgrade order: Always followers first, then leader
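
The skew rule can be checked mechanically before starting a rolling upgrade. A minimal sketch — the version strings are hypothetical placeholders; in practice they would come from `neumann cluster versions`:

```shell
# Compare minor versions of two nodes; skew > 1 means rolling upgrade is unsafe
current="0.10.2"
target="0.11.0"
m1=$(echo "$current" | cut -d. -f2)
m2=$(echo "$target" | cut -d. -f2)
skew=$(( m2 - m1 )); [ "$skew" -lt 0 ] && skew=$(( -skew ))
if [ "$skew" -le 1 ]; then
  echo "rolling upgrade ok"
else
  echo "blue-green required"
fi
# prints: rolling upgrade ok
```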

Rollback Procedure

If issues are discovered after upgrade:

Symptoms Requiring Rollback

  • Transaction failures after upgrade
  • Performance degradation
  • Consensus failures
  • Data corruption detected

Rollback Steps

# 1. Stop accepting new requests
neumann cluster pause

# 2. Identify problematic nodes
neumann cluster health --verbose

# 3. Rollback affected nodes
ssh node1 "cargo install neumann --version X.Y.Z-OLD"
ssh node1 "systemctl restart neumann"

# 4. Verify rollback
neumann cluster versions

# 5. Resume operations
neumann cluster resume

Rollback Limitations

  • Cannot rollback if schema changes were applied
  • Cannot rollback if new features were used
  • Always test rollback in staging first

Canary Upgrade

For risk-sensitive environments, upgrade a single node first and monitor.

Procedure

# 1. Select canary node (typically a follower)
CANARY=node3

# 2. Upgrade canary
neumann node drain $CANARY
ssh $CANARY "cargo install neumann --version X.Y.Z"
ssh $CANARY "systemctl restart neumann"

# 3. Monitor canary for 24-48 hours
neumann metrics $CANARY --watch

# 4. Compare metrics with non-canary nodes
neumann metrics compare $CANARY node1

# 5. If healthy, proceed with rolling upgrade
# If unhealthy, rollback canary

Canary Success Criteria

| Metric | Threshold | Action if Exceeded |
|---|---|---|
| Error rate | < 0.1% | Rollback |
| Latency p99 | < 2x baseline | Investigate |
| Replication lag | < 100ms | Investigate |
| Memory usage | < 1.5x baseline | Investigate |

Automated Upgrade Script

#!/bin/bash
# rolling-upgrade.sh - Automated rolling upgrade script

set -e

NEW_VERSION=$1
NODES=$(neumann cluster members --format json | jq -r '.[] | .id')
LEADER=$(neumann raft status --format json | jq -r '.leader')

echo "Upgrading cluster to version $NEW_VERSION"

# Upgrade followers first
for node in $NODES; do
    if [ "$node" == "$LEADER" ]; then
        continue
    fi

    echo "Upgrading follower: $node"
    neumann node drain $node
    ssh $node "cargo install neumann --version $NEW_VERSION"
    ssh $node "systemctl restart neumann"

    # Wait for rejoin
    sleep 10
    neumann cluster wait-healthy --timeout 120
done

# Transfer leadership and upgrade old leader
echo "Transferring leadership from $LEADER"
NEW_LEADER=$(echo $NODES | tr ' ' '\n' | grep -v $LEADER | head -1)
neumann raft transfer-leadership --to $NEW_LEADER

sleep 5

echo "Upgrading old leader: $LEADER"
neumann node drain $LEADER
ssh $LEADER "cargo install neumann --version $NEW_VERSION"
ssh $LEADER "systemctl restart neumann"

# Final verification
neumann cluster wait-healthy --timeout 120
neumann cluster versions

echo "Upgrade complete"

Leader Election Failures

Symptoms

  • NoLeader alert firing
  • Continuous election attempts in logs
  • Client requests timing out with “no leader” errors
  • tensor_chain_elections_total metric increasing rapidly

Diagnostic Commands

Check Raft State

# Query each node's state
for node in node1 node2 node3; do
  curl -s http://$node:9090/metrics | grep tensor_chain_raft_state
done

Inspect Logs

# Look for election-related entries
grep -E "(election|vote|term)" /var/log/neumann/tensor_chain.log | tail -100

Verify Network Connectivity

# From each node, verify connectivity to peers
for peer in node1 node2 node3; do
  nc -zv "$peer" 7878 || echo "FAIL: $peer"
done

Root Causes

1. Network Partition

Diagnosis: Nodes can’t reach each other

Solution:

  • Check firewall rules for port 7878 (Raft) and 7879 (gossip)
  • Verify network routes between nodes
  • Check for packet loss: ping -c 100 peer_node

2. Clock Skew

Diagnosis: Election timeouts inconsistent across nodes

Solution:

  • Ensure NTP is running: timedatectl status
  • Max recommended skew: 500ms
  • Sync clocks: chronyc makestep

3. Quorum Loss

Diagnosis: Fewer than (n/2)+1 nodes available

Solution:

  • For 3-node cluster: need 2 nodes
  • For 5-node cluster: need 3 nodes
  • Bring failed nodes back online or add new nodes

4. Election Timeout Too Aggressive

Diagnosis: Frequent elections even with healthy network

Solution:

[raft]
election_timeout_min_ms = 300   # Increase from default 150
election_timeout_max_ms = 600   # Increase from default 300

Resolution Steps

  1. Identify partitioned nodes using gossip membership view
  2. Restore connectivity if network issue
  3. If quorum lost, follow disaster recovery procedure
  4. Monitor tensor_chain_raft_state{state="leader"} for leader emergence

Alerting Rule

- alert: NoLeader
  expr: sum(tensor_chain_raft_state{state="leader"}) == 0
  for: 30s
  labels:
    severity: critical
  annotations:
    summary: "No Raft leader elected in cluster"
    runbook_url: "https://docs.neumann.io/operations/runbooks/leader-election"

Prevention

  • Deploy odd number of nodes (3, 5, 7)
  • Use separate availability zones
  • Monitor tensor_chain_elections_total rate
  • Set up network monitoring between nodes

Split-Brain Recovery

What is Split-Brain?

A network partition where multiple nodes believe they are the leader, potentially accepting conflicting writes.

Symptoms

  • Multiple nodes reporting raft_state="leader" in metrics
  • Clients seeing different data depending on which node they connect to
  • tensor_chain_partition_detected metric > 0
  • Gossip reporting different membership views

How tensor_chain Prevents Split-Brain

Raft consensus requires majority quorum:

  • 3 nodes: 2 required (only 1 partition can have leader)
  • 5 nodes: 3 required

Split-brain can only occur with symmetric partition where old leader is isolated but doesn’t realize it.

Automatic Recovery (Partition Merge Protocol)

When partitions heal, tensor_chain automatically reconciles:

Phase 1: Detection

  • Gossip detects new reachable nodes
  • Compare Raft terms and log lengths

Phase 2: Leader Resolution

  • Higher term wins
  • If same term, longer log wins
  • Losing leader steps down

Phase 3: State Reconciliation

  • Semantic conflict detection on divergent entries
  • Orthogonal changes: vector-add merge
  • Conflicting changes: reject newer (requires manual resolution)

Phase 4: Log Synchronization

  • Follower truncates divergent suffix
  • Leader replicates correct entries

Phase 5: Membership Merge

  • Gossip merges LWW membership states
  • Higher incarnation wins for each node
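
The last-writer-wins merge in Phase 5 can be illustrated with plain text tools (a toy sketch, not the actual gossip wire format): each line is `node incarnation`, and for each node the higher incarnation survives.

```shell
# Two divergent membership views (toy data)
view_a="node1 5
node2 3"
view_b="node1 4
node2 7"

# Sort by node, then incarnation descending; keep the first (highest) per node
printf '%s\n%s\n' "$view_a" "$view_b" | sort -k1,1 -k2,2nr | awk '!seen[$1]++'
# prints:
# node1 5
# node2 7
```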

Phase 6: Checkpoint

  • Create snapshot post-merge for fast recovery

Manual Intervention (When Automatic Fails)

Scenario: Conflicting Writes

# 1. Identify conflicts
neumann-admin conflicts list --since "2h ago"

# 2. Export conflicting transactions
neumann-admin conflicts export --tx-id 12345 --output conflict.json

# 3. Choose resolution
neumann-admin conflicts resolve --tx-id 12345 --keep-version A

# 4. Or merge manually
neumann-admin conflicts resolve --tx-id 12345 --merge-custom merge.json

Scenario: Completely Diverged State

# 1. Stop all nodes
systemctl stop neumann

# 2. Identify authoritative node (longest log, highest term)
for node in node1 node2 node3; do
  ssh $node "neumann-admin raft-info"
done

# 3. On non-authoritative nodes, clear state
rm -rf /var/lib/neumann/raft/*

# 4. Restart the authoritative node first (run on that node)
systemctl start neumann

# 5. Then restart the remaining nodes one at a time (they sync from the leader)
systemctl start neumann

Post-Recovery Verification

# Verify single leader
curl -s http://node1:9090/metrics | grep 'raft_state{state="leader"}'

# Verify all nodes in sync
neumann-admin cluster-status

# Check for unresolved conflicts
neumann-admin conflicts list

# Verify recent transactions
neumann-admin tx-log --last 100

Prevention

  1. Network design: Avoid symmetric partitions
  2. Monitoring: Alert on partition detection
  3. Testing: Regularly run chaos engineering tests
  4. Backups: Regular snapshots enable point-in-time recovery

Node Recovery

Recovery Scenarios

| Scenario | Recovery Method | Data Loss Risk |
|---|---|---|
| Process crash | WAL replay | None |
| Node reboot | WAL replay | None |
| Disk failure | Snapshot + log from leader | Possible (uncommitted) |
| Data corruption | Snapshot from leader | Possible (uncommitted) |

Automatic Recovery Flow

flowchart TD
    A[Node Starts] --> B{WAL Exists?}
    B -->|Yes| C[Replay WAL]
    B -->|No| D[Request Snapshot]
    C --> E{Caught Up?}
    E -->|Yes| F[Join as Follower]
    E -->|No| D
    D --> G[Install Snapshot]
    G --> H[Replay Logs After Snapshot]
    H --> F
    F --> I[Healthy]

Manual Recovery Steps

1. Crash Recovery (WAL Intact)

# Just restart - WAL replay is automatic
systemctl start neumann

# Monitor recovery
journalctl -u neumann -f | grep -E "(recovery|replay|caught_up)"

2. Recovery from Snapshot

# 1. Stop node
systemctl stop neumann

# 2. Clear corrupted state
rm -rf /var/lib/neumann/raft/wal/*

# 3. Keep or clear snapshots (keep if valid)
ls -la /var/lib/neumann/raft/snapshots/

# 4. Restart - will fetch snapshot from leader
systemctl start neumann

# 5. Monitor snapshot transfer
watch -n1 'curl -s localhost:9090/metrics | grep snapshot_transfer'

3. Full State Rebuild

# 1. Stop node
systemctl stop neumann

# 2. Clear all Raft state
rm -rf /var/lib/neumann/raft/*

# 3. Clear tensor store (will be rebuilt)
rm -rf /var/lib/neumann/store/*

# 4. Restart
systemctl start neumann

Monitoring Recovery Progress

# Check sync status
curl -s localhost:9090/metrics | grep -E "(commit_index|applied_index|leader_commit)"

# Calculate lag
LEADER_COMMIT=$(curl -s http://leader:9090/metrics | grep tensor_chain_commit_index | awk '{print $2}')
MY_APPLIED=$(curl -s localhost:9090/metrics | grep tensor_chain_applied_index | awk '{print $2}')
echo "Lag: $((LEADER_COMMIT - MY_APPLIED)) entries"

# Estimated time to catch up (entries/sec)
watch -n5 'curl -s localhost:9090/metrics | grep tensor_chain_applied_index'

Troubleshooting

Recovery Stuck

Symptom: Node not catching up, applied_index not increasing

Causes:

  1. Network issue to leader
  2. Leader overloaded
  3. Snapshot transfer failing

Solution:

# Check leader connectivity
curl -v http://leader:7878/health

# Check snapshot transfer errors
grep "snapshot" /var/log/neumann/tensor_chain.log | grep -i error

# Manually trigger snapshot
curl -X POST http://leader:9090/admin/snapshot

Repeated Crashes During Recovery

Symptom: Node crashes while replaying WAL

Causes:

  1. Corrupted WAL entry
  2. Out of memory during replay
  3. Incompatible schema

Solution:

# Skip corrupted entries (data loss!)
neumann-admin wal-repair --skip-corrupted

# Or full rebuild
rm -rf /var/lib/neumann/raft/*
systemctl start neumann

Backup and Restore

Backup Strategy

| Type | Frequency | Retention | RPO | RTO |
|---|---|---|---|---|
| Snapshots | Every 10k entries | 7 days | Minutes | Minutes |
| Full backup | Daily | 30 days | 24 hours | Hours |
| Off-site | Weekly | 1 year | 1 week | Hours |

Creating Backups

Snapshot Backup (Hot)

# Trigger snapshot on leader
curl -X POST http://leader:9090/admin/snapshot

# Wait for completion
watch 'curl -s http://leader:9090/metrics | grep snapshot'

# Copy snapshot files
rsync -av /var/lib/neumann/raft/snapshots/ backup:/backups/neumann/snapshots/

# Include metadata
neumann-admin cluster-info | ssh backup "cat > /backups/neumann/metadata.json"

Full Backup (Cold)

# 1. Stop writes (or accept slightly inconsistent backup)
neumann-admin pause-writes

# 2. Create snapshot
curl -X POST http://leader:9090/admin/snapshot
sleep 10

# 3. Backup all state
tar -czf neumann-backup-$(date +%Y%m%d).tar.gz \
  /var/lib/neumann/raft/snapshots/ \
  /var/lib/neumann/store/ \
  /etc/neumann/

# 4. Resume writes
neumann-admin resume-writes

# 5. Verify backup integrity
tar -tzf neumann-backup-*.tar.gz | head

Continuous WAL Archiving

# In config.toml
[wal]
archive_command = "aws s3 cp %p s3://backups/neumann/wal/%f"
archive_timeout = 60  # seconds

# Or to local storage
archive_command = "cp %p /mnt/backup/wal/%f"
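
The `%p`/`%f` placeholders follow the usual WAL-archiving convention — `%p` for the full segment path and `%f` for its file name (an assumption here, not confirmed by a config reference). A dry-run sketch of the substitution, with a hypothetical segment path:

```shell
# Simulate archive_command expansion for one WAL segment (hypothetical path)
p="/var/lib/neumann/raft/wal/segment-000042.wal"
f=$(basename "$p")
echo "cp $p /mnt/backup/wal/$f"
# prints: cp /var/lib/neumann/raft/wal/segment-000042.wal /mnt/backup/wal/segment-000042.wal
```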

Restore Procedures

Point-in-Time Recovery

# 1. Stop all nodes
ansible all -m systemd -a "name=neumann state=stopped"

# 2. Clear current state
ansible all -m shell -a "rm -rf /var/lib/neumann/raft/*"

# 3. Restore snapshot to one node
scp backup:/backups/neumann/snapshots/latest/* node1:/var/lib/neumann/raft/snapshots/

# 4. Replay WAL up to desired point
neumann-admin wal-replay \
  --wal-dir backup:/backups/neumann/wal/ \
  --until "2024-01-15T10:30:00Z"

# 5. Start first node
ssh node1 systemctl start neumann

# 6. Start remaining nodes (will sync from node1)
ansible "node2,node3" -m systemd -a "name=neumann state=started"

Full Cluster Restore

# 1. Extract backup
tar -xzf neumann-backup-20240115.tar.gz -C /tmp/restore/

# 2. Stop cluster
systemctl stop neumann

# 3. Restore files
rsync -av /tmp/restore/raft/ /var/lib/neumann/raft/
rsync -av /tmp/restore/store/ /var/lib/neumann/store/

# 4. Fix permissions
chown -R neumann:neumann /var/lib/neumann/

# 5. Start cluster
systemctl start neumann

Disaster Recovery (Complete Loss)

# 1. Provision new infrastructure

# 2. Install Neumann on all nodes

# 3. Restore from off-site backup
aws s3 cp s3://backups/neumann/latest.tar.gz /tmp/
tar -xzf /tmp/latest.tar.gz -C /var/lib/neumann/

# 4. Update config with new node addresses
vim /etc/neumann/config.toml

# 5. Initialize cluster
neumann-admin init-cluster --bootstrap

# 6. Verify
neumann-admin cluster-status

Verification

# Check data integrity
neumann-admin verify-checksums

# Compare entry counts
neumann-admin stats | grep total_entries

# Spot check recent data
neumann-admin query "SELECT COUNT(*) FROM ..."

Retention Policy

# Cron job for local snapshot cleanup
0 2 * * * find /var/lib/neumann/raft/snapshots -mtime +7 -delete

# For S3 WAL archives, use a bucket lifecycle rule to expire objects after 30 days
# (aws s3 rm has no age-based filter)

Capacity Planning

Resource Requirements

Memory

| Component | Formula | Example (1M entries) |
|---|---|---|
| Raft log (in-memory) | entries * avg_size * 2 | 1M * 1KB * 2 = 2 GB |
| Tensor store index | entries * 64 bytes | 1M * 64 = 64 MB |
| HNSW index | vectors * dim * 4 * ef | 1M * 128 * 4 * 16 = 8 GB |
| Codebook | centroids * dim * 4 | 1024 * 128 * 4 = 512 KB |
| Connection buffers | peers * buffer_size | 10 * 64KB = 640 KB |

Recommended minimum: 16 GB for production
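
The dominant term above, the HNSW index, can be sanity-checked with shell arithmetic (values taken from the example column):

```shell
# HNSW memory estimate: vectors * dim * 4 bytes (f32) * ef
vectors=1000000; dim=128; bytes_per_f32=4; ef=16
bytes=$(( vectors * dim * bytes_per_f32 * ef ))
echo "HNSW index: $(( bytes / 1000000000 )) GB"
# prints: HNSW index: 8 GB
```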

Disk

| Component | Formula | Example |
|---|---|---|
| WAL | entries * avg_size | 10M * 1KB = 10 GB |
| Snapshots | state_size * 2 | 5 GB * 2 = 10 GB |
| Mmap cold storage | cold_entries * avg_size | 100M * 1KB = 100 GB |

Recommended: 3x expected data size for growth

Network

| Traffic Type | Formula | Example (100 TPS) |
|---|---|---|
| Replication | TPS * entry_size * (replicas-1) | 100 * 1KB * 2 = 200 KB/s |
| Gossip | nodes * fanout * state_size / interval | 5 * 3 * 1KB / 1s = 15 KB/s |
| Client | TPS * (request + response) | 100 * 2KB = 200 KB/s |

Recommended: 1 Gbps minimum, 10 Gbps for high throughput
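
The replication row can be reproduced the same way (values from the example column; 3 replicas assumed, so each entry ships to 2 followers):

```shell
# Replication bandwidth: TPS * entry_size * (replicas - 1)
tps=100; entry_bytes=1024; replicas=3
echo "replication: $(( tps * entry_bytes * (replicas - 1) / 1024 )) KB/s"
# prints: replication: 200 KB/s
```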

CPU

| Operation | Complexity | Cores Needed |
|---|---|---|
| Consensus | O(1) per entry | 1 core |
| Embedding computation | O(dim) | 1-2 cores |
| HNSW search | O(log N * ef) | 2-4 cores |
| Conflict detection | O(concurrent_txs^2) | 1 core |

Recommended: 8+ cores for production

Sizing Examples

Small (Dev/Test)

  • 3 nodes
  • 4 cores, 8 GB RAM, 100 GB SSD each
  • Up to 1M entries, 10 TPS

Medium (Production)

  • 5 nodes
  • 8 cores, 32 GB RAM, 500 GB NVMe each
  • Up to 100M entries, 1000 TPS

Large (High-Scale)

  • 7+ nodes
  • 16+ cores, 64+ GB RAM, 2 TB NVMe each
  • 1B+ entries, 10k+ TPS
  • Consider sharding

Scaling Strategies

Vertical Scaling

When to use:

  • Single-node bottleneck (CPU, memory)
  • Read latency requirements

Actions:

  • Add RAM for larger in-memory log
  • Add cores for parallel embedding computation
  • Upgrade to NVMe for faster snapshots

Horizontal Scaling

When to use:

  • Throughput limited by consensus
  • Fault tolerance requirements

Actions:

  • Add read replicas (don’t participate in consensus)
  • Add consensus members (odd numbers only)
  • Implement sharding by key range

Monitoring for Capacity

# Prometheus alerts
- alert: HighMemoryUsage
  expr: tensor_chain_memory_usage_bytes / tensor_chain_memory_limit_bytes > 0.85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Memory usage above 85%"

- alert: DiskSpaceLow
  expr: tensor_chain_disk_free_bytes < 10737418240  # 10 GB
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Less than 10 GB disk space remaining"

- alert: HighCPUUsage
  expr: rate(tensor_chain_cpu_seconds_total[5m]) > 0.9
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "CPU usage above 90%"

Growth Projections

# Calculate daily growth
neumann-admin stats --since "7d ago" --format json | jq '.entries_per_day'

# Project storage needs
DAILY_GROWTH=100000  # entries
ENTRY_SIZE=1024      # bytes
DAYS=365
GROWTH=$((DAILY_GROWTH * ENTRY_SIZE * DAYS / 1024 / 1024 / 1024))
echo "Projected annual growth: ${GROWTH} GB"

Deadlock Resolution

Overview

tensor_chain automatically detects and resolves deadlocks in distributed transactions using wait-for graph analysis. This runbook covers monitoring, tuning, and manual intervention.

Automatic Detection

Deadlocks are detected within detection_interval_ms (default: 100ms) and resolved by aborting a victim transaction based on configured policy.

Monitoring

Metrics

# Deadlock rate
rate(tensor_chain_deadlocks_total[5m])

# Detection latency
histogram_quantile(0.99, tensor_chain_deadlock_detection_seconds_bucket)

# Victim aborts by policy
tensor_chain_deadlock_victims_total{policy="youngest"}

Logs

grep "deadlock" /var/log/neumann/tensor_chain.log

# Example output:
# [WARN] Deadlock detected: cycle=[tx_123, tx_456, tx_789], victim=tx_789
# [INFO] Aborted transaction tx_789 (youngest in cycle)

Tuning

Detection Interval

[deadlock]
detection_interval_ms = 100  # Lower = faster detection, higher CPU

Trade-off:

  • Lower interval: Faster detection, but more CPU overhead
  • Higher interval: Less overhead, but longer deadlock duration

Victim Selection Policy

[deadlock]
victim_policy = "youngest"  # Options: youngest, oldest, lowest_priority, most_locks

| Policy | Use Case |
|---|---|
| youngest | Minimize wasted work (default) |
| oldest | Prevent starvation of long transactions |
| lowest_priority | Business-critical transactions survive |
| most_locks | Maximize system throughput |

Transaction Priorities

// Set priority when starting transaction
let tx = coordinator.begin_with_priority(Priority::High)?;

Manual Intervention

Force Abort Specific Transaction

neumann-admin tx abort --tx-id 12345 --reason "manual deadlock resolution"

Clear All Pending Transactions

# Emergency only - will lose in-flight work
neumann-admin tx clear-pending --confirm

Disable Auto-Resolution

[deadlock]
auto_abort_victim = false  # Require manual intervention

Then manually resolve:

# List detected deadlocks
neumann-admin deadlock list

# Resolve specific deadlock
neumann-admin deadlock resolve --cycle-id abc123 --abort tx_789

Prevention

Lock Ordering

Acquire locks in consistent order across all transactions:

// Good: always lock in sorted key order
let mut keys = vec!["key_b", "key_a", "key_c"];
keys.sort();
for key in keys {
    tx.lock(key)?;
}

Timeout-Based Prevention

[transaction]
lock_timeout_ms = 5000  # Abort if can't acquire lock within 5s

Reduce Lock Scope

// Bad: lock entire table
tx.lock("users/*")?;

// Good: lock specific keys
tx.lock("users/123")?;
tx.lock("users/456")?;

Troubleshooting

High Deadlock Rate

Cause: Hot keys with many concurrent transactions

Solution:

  1. Identify hot keys: neumann-admin lock-stats --top 10
  2. Consider sharding hot keys
  3. Batch operations to reduce lock duration

Detection Latency Spikes

Cause: Large wait-for graph from many concurrent transactions

Solution:

  1. Increase max_concurrent_transactions
  2. Reduce transaction duration
  3. Consider optimistic concurrency for read-heavy workloads

False Positives

Cause: Network delays causing timeout-based false waits

Solution:

  1. Increase lock_wait_threshold_ms
  2. Verify network latency between nodes
  3. Check for GC pauses

Benchmarks

This section provides performance benchmarks for all Neumann crates, measured using Criterion.rs.

Running Benchmarks

# Run all benchmarks
cargo bench

# Run benchmarks for a specific crate
cargo bench --package tensor_store
cargo bench --package relational_engine
cargo bench --package graph_engine
cargo bench --package vector_engine
cargo bench --package neumann_parser
cargo bench --package query_router
cargo bench --package neumann_shell
cargo bench --package tensor_compress
cargo bench --package tensor_vault
cargo bench --package tensor_cache
cargo bench --package tensor_chain

Benchmark reports are generated in target/criterion/ with HTML visualizations.

Performance Summary

In-Memory Operations

| Component | Key Metric | Performance |
|---|---|---|
| tensor_store | Concurrent writes | 7.5M/sec @ 1M entities |
| relational_engine | Indexed lookup | 2.9us (1,604x vs scan) |
| graph_engine | BFS traversal | 3us/node |
| vector_engine | HNSW search | 150us @ 10K vectors |
| tensor_compress | TT decompose | 10-20x compression |
| tensor_vault | AES-256-GCM | 24us get, 29us set |
| tensor_cache | Exact lookup | 208ns hit |
| tensor_chain | Conflict detection | 52M pairs/sec @ 99% sparse |
| neumann_parser | Query parsing | 1.9M queries/sec |
| query_router | Mixed workload | 455 queries/sec |

Durable Storage (WAL)

| Operation | Key Metric | Performance |
|---|---|---|
| WAL writes | Durable PUT (128d embeddings) | 1.4M ops/sec |
| WAL recovery | Replay 10K records | ~400us (25M records/sec) |

All engines (RelationalEngine, GraphEngine, VectorEngine) support optional durability via open_durable() with full crash consistency.

Hardware Notes

Benchmarks run on:

  • Apple M-series (ARM64) or Intel x86_64
  • Results may vary based on CPU cache sizes, memory bandwidth, and core count

For consistent benchmarking:

# Disable CPU frequency scaling (Linux)
sudo cpupower frequency-set --governor performance

# Run with minimal background activity
cargo bench -- --noplot  # Skip HTML report generation for faster runs

Benchmark Categories

Storage Layer

Engines

Extended Modules

Distributed Systems

Query Layer

tensor_store Benchmarks

The tensor store uses DashMap (sharded concurrent HashMap) for thread-safe key-value storage.

Core Operations

| Operation | 100 items | 1,000 items | 10,000 items |
|---|---|---|---|
| put | 40us (2.5M/s) | 447us (2.2M/s) | 7ms (1.4M/s) |
| get | 33us (3.0M/s) | 320us (3.1M/s) | 3ms (3.3M/s) |

Scan Operations (10k total items, parallel)

| Operation | Time |
|---|---|
| scan 1k keys | 191us |
| scan_count 1k keys | 41us |

Concurrent Write Performance

| Threads | Disjoint Keys | High Contention (100 keys) |
|---|---|---|
| 2 | 795us | 974us |
| 4 | 1.59ms | 1.48ms |
| 8 | 4.6ms | 2.33ms |

Mixed Workload

| Configuration | Time |
|---|---|
| 4 readers + 2 writers | 579us |

Analysis

  • Read vs Write: Reads are ~20% faster than writes due to DashMap’s read-optimized design
  • Scaling: Near-linear scaling up to 10k items; slight degradation at scale due to hash table growth
  • Concurrency: DashMap’s 16-shard design provides excellent concurrent performance
  • Contention: Under high contention, performance actually improves at 8 threads vs 4 (lock sharding distributes load)
  • Parallel scans: Uses rayon for >1000 keys (25-53% faster)
  • scan_count vs scan: Count-only is ~5x faster (avoids string cloning)

Bloom Filter (optional)

| Operation | Time |
|---|---|
| add | 68 ns |
| might_contain (hit) | 46 ns |
| might_contain (miss) | 63 ns |

Sparse Lookups (1K keys in store)

| Query Type | Without Bloom | With Bloom |
|---|---|---|
| Negative lookup | 52 ns | 68 ns |
| Positive lookup | 45 ns | 60 ns |
| Sparse workload (90% miss) | 52 ns | 67 ns |

Note: Bloom filter adds ~15ns overhead for in-memory DashMap stores. It’s designed for scenarios where the backing store is slower (disk, network, remote database), where the early rejection of non-existent keys avoids expensive I/O.

Snapshot Persistence (bincode)

| Operation | 100 items | 1,000 items | 10,000 items |
|---|---|---|---|
| save | 100 us (1.0M/s) | 927 us (1.08M/s) | 12.6 ms (791K/s) |
| load | 74 us (1.35M/s) | 826 us (1.21M/s) | 10.7 ms (936K/s) |
| load_with_bloom | 81 us (1.23M/s) | 840 us (1.19M/s) | 11.0 ms (908K/s) |

Each item is a TensorData with 3 fields: id (i64), name (String), embedding (128-dim Vec<f32>).

Snapshot File Sizes

| Items | File Size | Per Item |
|---|---|---|
| 100 | ~60 KB | ~600 bytes |
| 1,000 | ~600 KB | ~600 bytes |
| 10,000 | ~6 MB | ~600 bytes |

Snapshot Analysis

  • Throughput: ~1M items/second for both save and load
  • Atomicity: Uses temp file + rename for crash-safe writes
  • Bloom filter overhead: ~3-5% slower to rebuild filter during load
  • Scaling: Near-linear with dataset size
  • File size: ~600 bytes per item with 128-dim embeddings (dominated by vector data)

Write-Ahead Log (WAL)

WAL provides crash-consistent durability with minimal performance overhead. Benchmarks use same payload as in-memory tests (128-dim embeddings).

WAL Writes

| Records | Time | Throughput |
|---|---|---|
| 100 | 152 us | 657K ops/s |
| 1,000 | 753 us | 1.33M ops/s |
| 10,000 | 6.95 ms | 1.44M ops/s |

WAL Recovery

| Records | Time | Throughput |
|---|---|---|
| 100 | 382 us | 261K elem/s |
| 1,000 | 394 us | 2.5M elem/s |
| 10,000 | 391 us | 25.6M elem/s |

WAL Analysis

  • Near constant recovery time: Recovery is dominated by file open overhead (~400us), not record count
  • Sequential I/O: WAL replay reads sequentially, hitting 25M records/sec
  • Durable vs in-memory: WAL writes at 1.4M ops/sec vs 2.0M ops/sec in-memory (72% of in-memory speed)
  • Use case: Production deployments requiring crash consistency

All engines support WAL via open_durable():

// Durable graph engine
let engine = GraphEngine::open_durable("data/graph.wal", WalConfig::default())?;

// Recovery after crash
let engine = GraphEngine::recover("data/graph.wal", &WalConfig::default(), None)?;

Sparse Vectors

SparseVector provides memory-efficient storage for high-sparsity embeddings by storing only non-zero values.

Construction (768d)

| Sparsity | Time | Throughput |
|---|---|---|
| 50% | 1.2 us | 640K/s |
| 90% | 890 ns | 870K/s |
| 99% | 650 ns | 1.18M/s |

Dot Product (768d)

| Sparsity | Sparse-Sparse | Sparse-Dense | Dense-Dense | Sparse Speedup |
|---|---|---|---|---|
| 50% | 2.1 us | 1.8 us | 580 ns | 0.3x (slower) |
| 90% | 380 ns | 290 ns | 580 ns | 1.5-2x |
| 99% | 38 ns | 26 ns | 580 ns | 15-22x |

Memory Compression

| Dimension | Sparsity | Dense Size | Sparse Size | Ratio |
|---|---|---|---|---|
| 768 | 90% | 3,072 B | 1,024 B | 3x |
| 768 | 99% | 3,072 B | 96 B | 32x |
| 1536 | 99% | 6,144 B | 184 B | 33x |

Sparse Vector Analysis

  • High sparsity sweet spot: At 99% sparsity, dot products are 15-22x faster than dense
  • Memory scaling: value-count ratio = 1 / (1 - sparsity), so 99% sparse stores ~100x fewer values; with per-value index overhead the measured size ratio is ~32x (see table)
  • Construction overhead: Negligible (~1us per vector)
  • Use case: Embeddings from sparse models, one-hot encodings, pruned representations
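
The table's size figures are consistent with storing one (index, value) pair per nonzero. A back-of-envelope check — the 4-byte index, 4-byte f32 value, and 32-byte fixed header are assumptions about the layout, not confirmed internals:

```shell
# Sparse size estimate at 99% sparsity, 768 dims (~8 nonzeros)
dim=768; nnz=8
dense=$(( dim * 4 ))           # 3072 B of dense f32
sparse=$(( nnz * 8 + 32 ))     # (index + value) pairs plus assumed 32 B header
echo "dense=${dense}B sparse=${sparse}B ratio=$(( dense / sparse ))x"
# prints: dense=3072B sparse=96B ratio=32x
```

This reproduces the 96 B / 32x row of the compression table above.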

Delta Vectors

DeltaVector stores embeddings as differences from reference “archetype” vectors, ideal for clustered embeddings.

Construction (768d, 5% delta)

| Dimension | Time | Throughput |
|---|---|---|
| 128 | 1.9 us | 526K/s |
| 768 | 12.3 us | 81K/s |
| 1536 | 25.1 us | 40K/s |

Dot Product (768d, precomputed archetype dot)

| Method | Time | vs Dense |
|---|---|---|
| Delta precomputed | 89 ns | 6.5x faster |
| Delta full | 620 ns | ~same |
| Dense baseline | 580 ns | 1x |

Same-Archetype Dot Product (768d)

| Method | Time | Speedup |
|---|---|---|
| Delta-delta | 145 ns | 4x |
| Dense baseline | 580 ns | 1x |

Delta Memory (768d)

| Delta Fraction | Dense Size | Delta Size | Ratio |
|---|---|---|---|
| 1% diff | 3,072 B | 120 B | 25x |
| 5% diff | 3,072 B | 360 B | 8.5x |
| 10% diff | 3,072 B | 680 B | 4.5x |

Archetype Registry (8 archetypes, 768d)

| Operation | Time |
|---|---|
| find_best_archetype | 4.2 us |
| encode | 14 us |
| decode | 1.1 us |

Delta Vector Analysis

  • Precomputed speedup: With archetype dot products cached, 6.5x faster than dense
  • Cluster-friendly: Similar vectors share archetypes, deltas are sparse
  • Use case: Semantic embeddings that cluster (documents, user profiles, products)

K-means Clustering

K-means discovers archetype vectors automatically from embedding collections.

K-means fit (128d, k=5)

| Vectors | Time | Throughput |
|---|---|---|
| 100 | 50 us | 2.0M elem/s |
| 500 | 241 us | 2.1M elem/s |
| 1000 | 482 us | 2.1M elem/s |

Varying k (1000 vectors, 128d)

| k | Time | Throughput |
|---|---|---|
| 2 | 183 us | 5.5M elem/s |
| 5 | 482 us | 2.1M elem/s |
| 10 | 984 us | 1.0M elem/s |
| 20 | 14.5 ms | 69K elem/s |

K-means Analysis

  • K-means++ is faster: Better initial centroids mean fewer iterations to converge
  • Linear with n: Doubling vectors roughly doubles time
  • Quadratic with k at high k: Each iteration is O(n*k), and more clusters need more iterations
  • Use case: Auto-discover archetypes for delta encoding, cluster analysis, centroid-based search

relational_engine Benchmarks

The relational engine provides SQL-like operations on top of tensor_store, with optional hash indexes for accelerated equality lookups and tensor-native condition evaluation.

Row Insertion

| Count | Time | Throughput |
|---|---|---|
| 100 | 462us | 216K rows/s |
| 1,000 | 3.1ms | 319K rows/s |
| 5,000 | 15.6ms | 320K rows/s |

Batch Insertion

| Count | Time | Throughput |
|---|---|---|
| 100 | 282us | 355K rows/s |
| 1,000 | 1.45ms | 688K rows/s |
| 5,000 | 7.26ms | 688K rows/s |

Select Full Scan

| Rows | Time | Throughput |
|---|---|---|
| 100 | 119us | 841K rows/s |
| 1,000 | 995us | 1.01M rows/s |
| 5,000 | 5.27ms | 949K rows/s |

Select with Index vs Without (5,000 rows)

| Query Type | With Index | Without Index | Speedup |
|---|---|---|---|
| Equality (2% match) | 105us | 4.23ms | 40x |
| By _id (single row) | 2.93us | 4.70ms | 1,604x |

Select Filtered - No Index (5,000 rows)

| Filter Type | Time |
|---|---|
| Range (20% match) | 4.16ms |
| Compound AND | 4.42ms |

Index Creation (parallel)

| Rows | Time |
|---|---|
| 100 | 554us |
| 1,000 | 2.75ms |
| 5,000 | 12.3ms |

Update/Delete (1,000 rows, 10% affected)

| Operation | Time |
|---|---|
| Update | 1.74ms |
| Delete | 2.14ms |

Join Performance (hash join)

| Tables | Result Rows | Time |
|---|---|---|
| 50 users x 500 posts | 500 | 1.78ms |
| 100 users x 1000 posts | 1,000 | 1.50ms |
| 100 users x 5000 posts | 5,000 | 32.2ms |

JOIN Types (10K x 10K rows)

| JOIN Type | Time | Throughput |
|---|---|---|
| INNER JOIN | 45ms | 2.2M rows/s |
| LEFT JOIN | 52ms | 1.9M rows/s |
| RIGHT JOIN | 51ms | 1.9M rows/s |
| FULL JOIN | 68ms | 1.5M rows/s |
| CROSS JOIN | 180ms | 555K rows/s |
| NATURAL JOIN | 48ms | 2.1M rows/s |

Aggregate Functions (1M rows, SIMD-accelerated)

| Function | Time | Notes |
|---|---|---|
| COUNT(*) | 2.1ms | O(1) via counter |
| SUM(col) | 8.5ms | SIMD i64x4 |
| AVG(col) | 8.7ms | SIMD i64x4 |
| MIN(col) | 12ms | Full scan |
| MAX(col) | 12ms | Full scan |

GROUP BY Performance (100K rows)

| Groups | Time | Notes |
|---|---|---|
| 10 | 15ms | Parallel aggregation |
| 100 | 18ms | Hash-based grouping |
| 1,000 | 25ms | Low per-group overhead |
| 10,000 | 45ms | High cardinality |

Row Count

| Rows | Time |
|---|---|
| 100 | 49us |
| 1,000 | 462us |
| 5,000 | 2.95ms |

Analysis

  • Index acceleration: Hash indexes provide O(1) lookup for equality conditions
    • 40x speedup for equality queries matching 2% of rows
    • 1,604x speedup for single-row _id lookups
  • Full scan cost: Without an index, every query is O(n) (parallelized for >1000 rows)

  • Batch insert: 2x faster than individual inserts (688K/s vs 320K/s)
  • Tensor-native evaluation: evaluate_tensor() evaluates conditions directly on TensorData, avoiding Row conversion for non-matching rows
  • Parallel operations: update/delete/create_index use rayon for condition evaluation
  • Index maintenance: Small overhead on insert/update/delete to maintain indexes
  • Join complexity: O(n+m) hash join for INNER/LEFT/RIGHT/NATURAL; O(n*m) for CROSS
  • Aggregate functions: SUM/AVG use SIMD i64x4 vectors for 4x throughput improvement
  • GROUP BY: Hash-based grouping with parallel per-group aggregation
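
The O(n+m) hash join cited above can be sketched as a build/probe loop. This is a standalone illustration with made-up row types, not the engine's actual API; it shows why INNER JOIN cost is linear in both inputs rather than O(n*m):

```rust
use std::collections::HashMap;

/// O(n + m) inner hash join on a key column: build a hash table over one
/// side, then probe it once per row of the other side.
fn hash_join<'a>(
    users: &[(u32, &'a str)], // (user_id, name) -- illustrative schema
    posts: &[(u32, &'a str)], // (author_id, title)
) -> Vec<(&'a str, &'a str)> {
    // Build phase: index users by id (one pass, O(n)).
    let mut index: HashMap<u32, &'a str> = HashMap::new();
    for &(id, name) in users {
        index.insert(id, name);
    }
    // Probe phase: one O(1) lookup per post (one pass, O(m)).
    let mut out = Vec::new();
    for &(author, title) in posts {
        if let Some(&name) = index.get(&author) {
            out.push((name, title));
        }
    }
    out
}
```

A CROSS join has no key to hash on, which is why it stays O(n*m) in the table above.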

Competitor Comparison

| Operation | Neumann | SQLite | DuckDB | Notes |
|---|---|---|---|---|
| Point lookup (indexed) | 2.9us | ~3us | ~30us | B-tree optimized |
| Full scan (5K rows) | 5.3ms | ~15ms | ~2ms | DuckDB columnar wins |
| Aggregation (1M rows) | 8.5ms | ~200ms | ~12ms | SIMD-accelerated |
| Hash join (10Kx10K) | 45ms | ~500ms | ~35ms | Parallel execution |
| Insert (single row) | 3.1us | ~2us | ~5us | SQLite B-tree optimal |
| Batch insert (1K rows) | 1.5ms | ~8ms | ~3ms | Neumann batch-optimized |

Design Trade-offs

  • vs SQLite: Neumann trades SQLite’s proven stability for tensor-native storage and SIMD acceleration. SQLite wins on point lookups; Neumann wins on analytics.
  • vs DuckDB: Similar columnar design. DuckDB has more mature query optimizer; Neumann has tighter tensor integration and lower memory footprint.
  • Unique to Neumann: Unified tensor storage enables cross-engine queries (relational + graph + vector) without data movement.

graph_engine Benchmarks

The graph engine stores nodes and edges as tensors, using adjacency lists for neighbor lookups.

Node Creation

| Count | Time | Per Node |
|---|---|---|
| 100 | 107us | 1.07us |
| 1,000 | 1.67ms | 1.67us |
| 5,000 | 9.4ms | 1.88us |

Edge Creation (1,000 edges)

| Type | Time | Per Edge |
|---|---|---|
| Directed | 2.4ms | 2.4us |
| Undirected | 3.6ms | 3.6us |

Neighbor Lookup (star graph)

| Fan-out | Time | Per Neighbor |
|---|---|---|
| 10 | 16us | 1.6us |
| 50 | 79us | 1.6us |
| 100 | 178us | 1.8us |

BFS Traversal (binary tree)

| Depth | Nodes | Time | Per Node |
|---|---|---|---|
| 5 | 31 | 110us | 3.5us |
| 7 | 127 | 442us | 3.5us |
| 9 | 511 | 1.5ms | 2.9us |

Shortest Path (BFS)

| Graph Type | Size | Time |
|---|---|---|
| Chain | 10 nodes | 8.2us |
| Chain | 50 nodes | 44us |
| Chain | 100 nodes | 96us |
| Grid | 5x5 | 55us |
| Grid | 10x10 | 265us |

Analysis

  • Undirected edges: ~50% slower than directed (stores reverse edge internally)
  • Traversal: Consistent ~3us per node visited, reflecting an efficient BFS implementation
  • Path finding: Near-linear with path length in chains; grid explores more nodes
  • Parallel delete_node: Uses rayon for high-degree nodes (>100 edges)
  • Memory overhead: Each node/edge is a full TensorData (~5-10 allocations)

Storage Model

graph_engine stores each node and edge as a separate tensor:

node:{id} -> TensorData { label, properties... }
edge:{id} -> TensorData { from, to, label, directed, properties... }
adj:{node_id}:out -> TensorData { edge_ids: [...] }
adj:{node_id}:in -> TensorData { edge_ids: [...] }

Trade-offs

  • Pro: Flexible property storage, consistent with tensor model
  • Con: More key lookups than traditional adjacency list
  • Pro: Each component independently updatable

Complexity

| Operation | Time Complexity | Notes |
|---|---|---|
| create_node | O(1) | Hash insert |
| create_edge | O(1) | Hash insert + adjacency update |
| get_neighbors | O(degree) | Adjacency list lookup |
| bfs | O(V + E) | Standard BFS |
| shortest_path | O(V + E) | BFS-based |
| delete_node | O(degree) | Removes all edges |
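
The O(V + E) shortest_path behavior can be illustrated with a minimal BFS over an adjacency map. This is a generic sketch (a plain HashMap adjacency list, not the engine's tensor-backed representation); each node is enqueued at most once, which gives the linear bound:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// BFS shortest path: O(V + E), returning the node sequence if one exists.
fn shortest_path(adj: &HashMap<u32, Vec<u32>>, from: u32, to: u32) -> Option<Vec<u32>> {
    let mut prev: HashMap<u32, u32> = HashMap::new();
    let mut seen = HashSet::from([from]);
    let mut queue = VecDeque::from([from]);
    while let Some(node) = queue.pop_front() {
        if node == to {
            // Walk the predecessor chain back to the start.
            let mut path = vec![to];
            let mut cur = to;
            while let Some(&p) = prev.get(&cur) {
                path.push(p);
                cur = p;
            }
            path.reverse();
            return Some(path);
        }
        for &next in adj.get(&node).into_iter().flatten() {
            if seen.insert(next) {
                prev.insert(next, node);
                queue.push_back(next);
            }
        }
    }
    None
}
```

On a chain graph this visits each node once, matching the near-linear chain timings in the table above.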

vector_engine Benchmarks

The vector engine stores embeddings and performs k-nearest neighbor search using cosine similarity.

Store Embedding

| Dimension | Time | Throughput |
|---|---|---|
| 128 | 366 ns | 2.7M/s |
| 768 | 892 ns | 1.1M/s |
| 1536 | 969 ns | 1.0M/s |

Get Embedding

| Dimension | Time |
|---|---|
| 768 | 287 ns |

Delete Embedding

| Operation | Time |
|---|---|
| delete | 806 ns |

Similarity Search (top 10, SIMD + adaptive parallel)

| Dataset | Time | Per Vector | Mode |
|---|---|---|---|
| 1,000 x 128d | 242 us | 242 ns | Sequential |
| 1,000 x 768d | 367 us | 367 ns | Sequential |
| 10,000 x 128d | 1.93 ms | 193 ns | Parallel |

Cosine Similarity Computation (SIMD-accelerated)

| Dimension | Time |
|---|---|
| 128 | 26 ns |
| 768 | 165 ns |
| 1536 | 369 ns |

Analysis

  • SIMD acceleration: 8-wide f32 SIMD (via wide crate) provides 3-9x speedup for cosine similarity
  • Adaptive parallelism: Uses rayon for parallel search when >5000 vectors (1.6x speedup at 10K)
  • Linear scaling with dimension: Cosine similarity is O(d) where d is vector dimension
  • Linear scaling with dataset size: Brute-force search is O(n*d) for n vectors
  • Memory bound: For 768d vectors, ~3 KB per embedding (768 * 4 bytes)
  • Search throughput: ~4M vector comparisons/second at 128d (with SIMD)
  • Store/Get performance: Sub-microsecond for typical embedding sizes

Complexity

| Operation | Time Complexity | Notes |
|---|---|---|
| store_embedding | O(d) | Vector copy + hash insert |
| get_embedding | O(d) | Hash lookup + vector clone |
| delete_embedding | O(1) | Hash removal |
| search_similar | O(n*d) | Brute-force scan |
| compute_similarity | O(d) | Dot product + 2 magnitude calculations |
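
The O(d) similarity row corresponds to a single fused pass over both vectors. The scalar reference form below defines the semantics; the engine's SIMD version computes the same three accumulators 8 lanes at a time:

```rust
/// Cosine similarity in one O(d) pass: accumulate the dot product and both
/// squared magnitudes together, then combine at the end.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimensions must match");
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y; // dot product accumulator
        na += x * x;  // |a|^2 accumulator
        nb += y * y;  // |b|^2 accumulator
    }
    // Guard against zero vectors rather than dividing by zero.
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na.sqrt() * nb.sqrt()) }
}
```

Brute-force search is then just this function applied n times plus a top-k selection, giving the O(n*d) search_similar bound.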

HNSW Index (Approximate Nearest Neighbor)

HNSW provides O(log n) search complexity instead of O(n) brute force.

| Configuration | Search Time (5K, 128d) |
|---|---|
| high_speed | ~50 us |
| default | ~100 us |
| high_recall | ~200 us |

HNSW vs Brute Force (10K vectors, 128d)

| Method | Search Time | Speedup |
|---|---|---|
| Brute force | ~2 ms | 1x |
| HNSW default | ~150 us | ~13x |

| Corpus Size | Approach | Rationale |
|---|---|---|
| < 10K | Brute force | Fast enough, pure tensor |
| 10K - 100K | HNSW | Pragmatic, 5-13x faster |
| > 100K | HNSW | Necessary for latency |

Scaling Projections (HNSW for >10K vectors)

| Vectors | Dimension | Search Time (est.) |
|---|---|---|
| 10K | 768 | ~200 us |
| 100K | 768 | ~500 us |
| 1M | 768 | ~1 ms |

For production workloads at extreme scale (>1M vectors), consider:

  • Sharded HNSW across multiple nodes
  • Dimensionality reduction (PCA)
  • Quantization (int8, binary)

Storage Model

vector_engine stores each embedding as a tensor:

emb:{key} -> TensorData { vector: [...] }

Trade-offs

  • Pro: Simple storage model, consistent with tensor abstraction
  • Pro: Sub-microsecond store/get operations
  • Pro: HNSW index for O(log n) approximate nearest neighbor search
  • Con: Brute-force O(n*d) for exact search (use HNSW for approximate)

tensor_compress Benchmarks

The tensor_compress crate provides compression algorithms optimized for tensor data: Tensor Train decomposition, delta encoding, sparse vectors, and run-length encoding.

Tensor Train Decomposition (primary compression method)

| Operation | Time | Peak RAM |
|---|---|---|
| tt_decompose_256d | ~50 us | 41.8 KB |
| tt_decompose_1024d | ~80 us | 60.9 KB |
| tt_decompose_4096d | ~120 us | 137.5 KB |
| tt_reconstruct_4096d | ~1.2 ms | 67.9 KB |
| tt_dot_product_4096d | ~400 ns | 69.2 KB |
| tt_cosine_similarity_4096d | ~1 us | 69.2 KB |

Delta Encoding (10K sequential IDs)

| Operation | Time | Throughput | Peak RAM |
|---|---|---|---|
| compress_ids | 8.0 us | 1.25M IDs/s | ~210 KB |
| decompress_ids | 33 us | 303K IDs/s | ~100 KB |

Run-Length Encoding (100K values)

| Operation | Time | Throughput | Peak RAM |
|---|---|---|---|
| rle_encode | 29 us | 3.4M values/s | ~445 KB |
| rle_decode | 38 us | 2.6M values/s | ~833 KB |

Compression Ratios

| Data Type | Technique | Ratio | Lossless |
|---|---|---|---|
| 4096-dim embeddings | Tensor Train | 10-20x | No (<1% error) |
| 1024-dim embeddings | Tensor Train | 4-8x | No (<1% error) |
| Sparse vectors | Native sparse | 3-32x | Yes |
| Sequential IDs | Delta + varint | 4-8x | Yes |
| Repeated values | RLE | 2-100x | Yes |

Analysis

  • TT decomposition: Achieves 10-20x compression for high-dimensional embeddings (4096+)
  • TT operations in compressed space: Dot product and cosine similarity computed directly in TT format without full reconstruction
  • Delta encoding: Asymmetric; compression is ~4x faster than decompression
  • Sparse format: Efficient for vectors with >50% zeros, stores only non-zero positions/values
  • RLE: Best for highly repeated data (status columns, category IDs)
  • Memory efficiency: All operations use < 1 MB for typical data sizes
  • Integration: Use SAVE COMPRESSED in shell or save_snapshot_compressed() API
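
The delta + varint scheme for sequential IDs can be sketched as follows. This is an illustrative LEB128-style implementation, not the crate's actual code: sorted IDs are stored as differences from their predecessor, so dense sequences cost roughly one byte per ID instead of eight:

```rust
/// Delta + varint compression for sorted u64 IDs.
fn compress_ids(ids: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u64;
    for &id in ids {
        let mut delta = id - prev; // requires sorted input
        prev = id;
        // LEB128-style varint: 7 payload bits per byte, high bit = "more".
        loop {
            let byte = (delta & 0x7f) as u8;
            delta >>= 7;
            if delta == 0 { out.push(byte); break; }
            out.push(byte | 0x80);
        }
    }
    out
}

fn decompress_ids(bytes: &[u8]) -> Vec<u64> {
    let (mut out, mut prev) = (Vec::new(), 0u64);
    let (mut acc, mut shift) = (0u64, 0u32);
    for &b in bytes {
        acc |= ((b & 0x7f) as u64) << shift;
        if b & 0x80 != 0 { shift += 7; continue; } // continuation bit set
        prev += acc; // undo the delta
        out.push(prev);
        acc = 0; shift = 0;
    }
    out
}
```

Decompression must both decode varints and re-accumulate prefix sums, which is consistent with the asymmetry noted above.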

Usage Recommendations

| Data Characteristics | Recommended Compression |
|---|---|
| High-dimensional embeddings (1024+) | Tensor Train |
| Sparse embeddings (>50% zeros) | Native sparse format |
| Sequential IDs (node IDs, row IDs) | Delta + varint |
| Categorical columns with repeats | RLE |
| Mixed data snapshots | Composite (auto-detect) |

tensor_vault Benchmarks

The tensor_vault crate provides AES-256-GCM encrypted secret storage with graph-based access control, permission levels, TTL grants, rate limiting, namespace isolation, audit logging, and secret versioning.

Key Derivation (Argon2id)

| Operation | Time | Peak RAM |
|---|---|---|
| argon2id_derivation | 80 ms | ~64 MB |

Note: Argon2id is intentionally slow to resist brute-force attacks. The 64MB memory cost is configurable via VaultConfig.

Encryption/Decryption (AES-256-GCM)

| Operation | Time | Peak RAM |
|---|---|---|
| set_1kb | 29 us | ~3 KB |
| get_1kb | 24 us | ~3 KB |
| set_10kb | 93 us | ~25 KB |
| get_10kb | 91 us | ~25 KB |

Note: set includes versioning overhead (storing previous version pointers). get includes audit logging.

Access Control (Graph Path Verification)

| Operation | Time | Peak RAM |
|---|---|---|
| check_shallow (1 hop) | 6 us | ~2 KB |
| check_deep (10 hops) | 17 us | ~3 KB |
| grant | 18 us | ~1 KB |
| revoke | 1.07 ms | ~1 KB |

Secret Listing

| Operation | Time | Peak RAM |
|---|---|---|
| list_100_secrets | 291 us | ~4 KB |
| list_1000_secrets | 2.7 ms | ~40 KB |

Note: List includes access control checks and key name decryption for pattern matching.

Analysis

  • Key derivation: Argon2id dominates vault initialization (~80ms). This is by design for security.
  • Access check improved: Path verification is now ~6us for shallow, ~17us for deep (85% faster than before).
  • Versioning overhead: set is ~2x slower due to version tracking (stores pointer array).
  • Audit overhead: Every operation logs to audit store (adds ~5-10us per operation).
  • Revoke performance: ~1ms due to edge deletion, TTL tracker cleanup, and audit logging.
  • List scaling: ~2.7us per secret at 1000 (includes decryption for pattern matching).

Feature Performance Overhead

| Feature | Overhead |
|---|---|
| Permission check | ~1 us (edge type comparison) |
| Rate limit check | ~100 ns (DashMap lookup) |
| TTL check | ~50 ns (heap peek) |
| Audit log write | ~5 us (tensor store put) |
| Version tracking | ~10 us (pointer array update) |

Security vs Performance Trade-offs

| Configuration | Key Derivation | Security |
|---|---|---|
| Default (64MB, 3 iter) | ~80 ms | High |
| Fast (16MB, 1 iter) | ~25 ms | Medium |
| Paranoid (256MB, 10 iter) | ~800 ms | Very High |

Recommendations

  • Development: Use Fast configuration for quicker iteration
  • Production: Use Default or Paranoid based on threat model
  • High-throughput: Cache access decisions where possible
  • Audit compliance: Accept ~5us overhead for complete audit trail

tensor_cache Benchmarks

The tensor_cache crate provides LLM response caching with exact, semantic (HNSW), and embedding caches.

Exact Cache (Hash-based O(1))

| Operation | Time |
|---|---|
| lookup_hit | 208 ns |
| lookup_miss | 102 ns |

Semantic Cache (HNSW-based O(log n))

| Operation | Time |
|---|---|
| lookup_hit | 21 us |

Put (Exact + Semantic + HNSW insert)

| Entries | Time |
|---|---|
| 100 | 49 us |
| 1,000 | 47 us |
| 10,000 | 53 us |

Embedding Cache

| Operation | Time |
|---|---|
| lookup_hit | 230 ns |
| lookup_miss | 110 ns |

Eviction (batch processing)

| Entries in Cache | Time |
|---|---|
| 1,000 | 3.3 us |
| 5,000 | 4.0 us |
| 10,000 | 8.4 us |

Distance Metrics (raw computation, 128d)

| Metric | Time | Notes |
|---|---|---|
| Jaccard | 73 ns | Fastest, best for sparse |
| Euclidean | 105 ns | Good for spatial data |
| Cosine | 186 ns | Default, best for dense |
| Angular | 193 ns | Alternative to cosine |

Semantic Lookup by Metric (1000 entries)

| Metric | Time |
|---|---|
| Jaccard | 28.6 us |
| Euclidean | 27.8 us |
| Cosine | 28.4 us |

Sparse vs Dense (80% sparsity)

| Configuration | Time | Improvement |
|---|---|---|
| Dense lookup | 28.8 us | baseline |
| Sparse lookup | 24.1 us | 16% faster |

Auto-Metric Selection

| Operation | Time |
|---|---|
| Sparsity check | 0.66 ns |
| Auto-select dense | 13.4 us |
| Auto-select sparse | 16.5 us |

Redis Comparison

| System | In-Process | Over TCP |
|---|---|---|
| Redis | ~60 ns | ~143 us |
| tensor_cache (exact) | 208 ns | ~143 us* |
| tensor_cache (semantic) | 21 us | N/A |

*Estimated: network latency dominates (99.9% of time).

Key Insight: For embedded use (no network), Redis is 3.5x faster for exact lookups. Over TCP (typical deployment), both are network-bound at ~143us. Our differentiator is semantic search (21us) which Redis cannot provide.

Analysis

  • Exact cache: Hash-based O(1) lookup provides sub-microsecond hit/miss detection
  • Semantic cache: HNSW index provides O(log n) similarity search (~21us for hit)
  • Embedding cache: Fast O(1) lookup for precomputed embeddings
  • Put performance: Consistent ~50us regardless of cache size (HNSW insert is O(log n))
  • Eviction: Efficient batch eviction with LRU/LFU/Cost/Hybrid strategies
  • Distance metrics: Auto-selection based on sparsity (>=70% sparse uses Jaccard)
  • Token counting: tiktoken cl100k_base encoding for accurate GPT-4 token counts
  • Cost tracking: Estimates cost savings based on model pricing tables
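
The exact-cache layer's sub-microsecond hit/miss numbers follow from its shape: hash the prompt to a fixed-size key, then do one hash-map probe. A minimal sketch (illustrative names; the real cache uses a cryptographic hash and tracks tokens/cost, and std's DefaultHasher stands in here to stay dependency-free):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Exact LLM-response cache sketch: O(1) lookup keyed by (model, prompt).
struct ExactCache {
    entries: HashMap<u64, String>, // hashed key -> cached response
}

impl ExactCache {
    fn new() -> Self {
        ExactCache { entries: HashMap::new() }
    }
    /// Collapse (model, prompt) to a fixed-size key so lookup cost does not
    /// depend on prompt length after hashing.
    fn key(model: &str, prompt: &str) -> u64 {
        let mut h = DefaultHasher::new();
        (model, prompt).hash(&mut h);
        h.finish()
    }
    fn put(&mut self, model: &str, prompt: &str, response: String) {
        self.entries.insert(Self::key(model, prompt), response);
    }
    fn get(&self, model: &str, prompt: &str) -> Option<&String> {
        self.entries.get(&Self::key(model, prompt))
    }
}
```

The semantic layer sits behind this: only on an exact miss does the request pay for an embedding plus an HNSW probe.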

Cache Layers

| Layer | Complexity | Use Case |
|---|---|---|
| Exact | O(1) | Identical prompts |
| Semantic | O(log n) | Similar prompts |
| Embedding | O(1) | Precomputed embeddings |

Eviction Strategies

| Strategy | Description |
|---|---|
| LRU | Evict least recently accessed |
| LFU | Evict least frequently accessed |
| CostBased | Evict lowest cost efficiency |
| Hybrid | Weighted combination (recommended) |

Metric Selection Guide

| Embedding Type | Recommended Metric |
|---|---|
| OpenAI/Cohere (dense) | Cosine (default) |
| Sparse (>=70% zeros) | Jaccard (auto-selected) |
| Spatial/geographic | Euclidean |
| Custom binary | Jaccard |
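
The auto-selection rule above (>=70% zeros triggers Jaccard) can be sketched as a sparsity check plus a set-overlap score over non-zero positions. This is an illustrative stand-in, not the crate's implementation:

```rust
/// Fraction of zero components -- the cheap check that drives metric choice.
fn sparsity(v: &[f32]) -> f32 {
    v.iter().filter(|x| **x == 0.0).count() as f32 / v.len() as f32
}

/// Jaccard index over non-zero positions: |A intersect B| / |A union B|.
/// Ignores magnitudes entirely, which is why it suits sparse/binary data.
fn jaccard(a: &[f32], b: &[f32]) -> f32 {
    let (mut inter, mut union) = (0usize, 0usize);
    for (x, y) in a.iter().zip(b) {
        let (xa, yb) = (*x != 0.0, *y != 0.0);
        if xa && yb { inter += 1; }
        if xa || yb { union += 1; }
    }
    if union == 0 { 0.0 } else { inter as f32 / union as f32 }
}
```

Dense embeddings keep cosine because their information lives in component magnitudes, which Jaccard discards.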

tensor_blob Benchmarks

The tensor_blob crate provides S3-style chunked blob storage with content-addressable chunks, garbage collection, and integrity verification.

Overview

tensor_blob focuses on correctness and durability over raw throughput. Performance characteristics depend heavily on:

  • Chunk size configuration
  • Storage backend (memory vs disk)
  • Network conditions for streaming operations

Expected Performance Characteristics

| Operation | Complexity | Notes |
|---|---|---|
| Put (upload) | O(size / chunk_size) | Linear with data size |
| Get (download) | O(size / chunk_size) | Linear with data size |
| Delete | O(chunk_count) | Removes metadata + orphan detection |
| GC | O(total_chunks) | Full chunk scan |
| Verify | O(size) | Re-hash entire blob |
| Repair | O(corrupted_chunks) | Only processes damaged chunks |

Chunk Deduplication

Identical content shares chunks via SHA-256 content addressing:

  • Duplicate blobs: Store once, reference count tracked
  • Partial overlap: Shared chunks deduplicated at chunk boundaries
  • Storage savings: Depends on data redundancy
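
Content-addressable deduplication can be sketched as "hash each chunk, store it once, count references." The sketch below is illustrative: the real store uses SHA-256, while std's DefaultHasher stands in here to keep the example dependency-free (it is not collision-resistant and must not be used for real content addressing):

```rust
use std::collections::hash_map::{DefaultHasher, Entry};
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Chunk store sketch: identical chunks map to the same key and are stored
/// once, with a reference count tracking how many blobs share them.
struct ChunkStore {
    chunks: HashMap<u64, (Vec<u8>, usize)>, // hash -> (bytes, refcount)
}

impl ChunkStore {
    fn new() -> Self {
        ChunkStore { chunks: HashMap::new() }
    }
    /// Split `data` into fixed-size chunks and store each, returning the
    /// chunk-key list that represents the blob.
    fn put(&mut self, data: &[u8], chunk_size: usize) -> Vec<u64> {
        data.chunks(chunk_size)
            .map(|chunk| {
                let mut h = DefaultHasher::new();
                chunk.hash(&mut h);
                let key = h.finish();
                match self.chunks.entry(key) {
                    Entry::Occupied(mut e) => e.get_mut().1 += 1, // dedup hit
                    Entry::Vacant(e) => { e.insert((chunk.to_vec(), 1)); }
                }
                key
            })
            .collect()
    }
}
```

GC then amounts to deleting chunks whose reference count has dropped to zero, which is why its cost scans total chunks rather than blobs.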

Garbage Collection

| Operation | Behavior |
|---|---|
| gc() | Returns GcStats { deleted, freed_bytes } |
| Orphan detection | Marks unreferenced chunks |
| Active upload protection | GC skips in-progress uploads |

Streaming Operations

| API | Use Case |
|---|---|
| BlobWriter | Streaming upload, bounded memory |
| BlobReader::next_chunk() | Streaming download, chunk-by-chunk |
| get_full() | Small blobs (<10MB), loads to memory |

Configuration Impact

| Setting | Impact |
|---|---|
| Larger chunk_size | Fewer chunks, less overhead, less dedup |
| Smaller chunk_size | More chunks, more overhead, better dedup |
| Recommended | 1-4 MB chunks for most workloads |

Integration Notes

  • Blob store persists to TensorStore
  • Metadata includes checksum, size, creation time
  • Links enable blob-to-graph entity relationships
  • Tags support blob categorization and search

Benchmarking Blob Operations

# Run blob-specific benchmarks (if available)
cargo bench --package tensor_blob

# For custom benchmarking, use the streaming API:
# - Measure upload throughput with BlobWriter
# - Measure download throughput with BlobReader
# - Test GC performance with various orphan ratios

tensor_chain Benchmarks

The tensor_chain crate provides a tensor-native blockchain with semantic consensus, Raft replication, 2PC distributed transactions, and sparse delta encoding.

Block Creation

| Configuration | Time | Per Transaction |
|---|---|---|
| empty_block | 171 ns | — |
| block_10_txns | 13.4 us | 1.34 us |
| block_100_txns | 111 us | 1.11 us |

Transaction Commit

| Operation | Time | Throughput |
|---|---|---|
| single_put | 432 us | 2.3K/s |
| multi_put_10 | 480 us | 20.8K ops/s |

Batch Transactions

| Count | Time | Throughput |
|---|---|---|
| 10 | 822 us | 12.2K/s |
| 100 | 21.5 ms | 4.7K/s |
| 1000 | 1.6 s | 607/s |

Consensus Validation

| Operation | Time | Notes |
|---|---|---|
| conflict_detection_pair | 279 ns | Hybrid cosine + Jaccard |
| cosine_similarity | 187 ns | Sparse vector |
| merge_pair | 448 ns | Orthogonal merge |
| merge_all_10 | 632 ns | Batch merge |
| find_merge_order_10 | 9 us | Optimal ordering |

Codebook Operations

| Operation | Time | Notes |
|---|---|---|
| global_quantize_128d | 854 ns | State validation |
| global_compute_residual | 925 ns | Delta compression |
| global_is_valid_state | 1.28 us | State machine check |
| local_quantize_128d | 145 ns | EMA-adaptive |
| local_quantize_and_update | 177 ns | With EMA update |
| manager_quantize_128d | 1.2 us | Full pipeline |

Delta Vector Operations

| Operation | Time | Improvement |
|---|---|---|
| cosine_similarity_128d | 196 ns | 35% faster |
| add_128d | 975 ns | 44% faster |
| scale_128d | 163 ns | 35% faster |
| weighted_average_128d | 982 ns | 26% faster |
| overlaps_with | 8.4 ns | 35% faster |
| cosine_similarity_768d | 1.96 us | 10% faster |
| add_768d | 2.6 us | 27% faster |

Chain Query Operations

| Operation | Time | Improvement |
|---|---|---|
| get_block_by_height | 1.19 us | 38% faster |
| get_tip | 1.06 us | 45% faster |
| get_genesis | 852 ns | 53% faster |
| height | 0.87 ns | 50% faster |
| tip_hash | 11.4 ns | 32% faster |
| history_key | 163 us | 15% faster |
| verify_chain_100_blocks | 276 us | — |

Chain Iteration

| Operation | Time | Improvement |
|---|---|---|
| iterate_50_blocks | 88 us | 10% faster |
| get_blocks_range_0_25 | 35 us | 27% faster |

K-means Codebook Training

| Configuration | Time |
|---|---|
| 100 vectors, 8 clusters | 123 us |
| 1000 vectors, 16 clusters | 8.4 ms |

Sparse Vector Performance

Conflict Detection by Sparsity Level (50 deltas, 128d)

| Sparsity | Time | Throughput | vs Dense |
|---|---|---|---|
| 10% (dense) | 389 us | 3.1M pairs/s | 1x |
| 50% | 261 us | 4.6M pairs/s | 1.5x |
| 90% | 57 us | 21.5M pairs/s | 6.8x |
| 99% | 23 us | 52.3M pairs/s | 16.9x |

Individual Sparse Operations (vs previous dense implementation)

| Operation | Sparse Time | Improvement |
|---|---|---|
| cosine_similarity | 16.5 ns | 76% faster |
| angular_distance | 28.5 ns | 64% faster |
| jaccard_index | 10.4 ns | 58% faster |
| euclidean_distance | 13.6 ns | 71% faster |
| overlapping_keys | 89 ns | 45% faster |
| add | 688 ns | 19% faster |
| weighted_average | 674 ns | 12% faster |
| project_orthogonal | 624 ns | 42% faster |
| detect_conflict_full | 53 ns | 33% faster |

High Dimension Sparse Performance

| Dimension | Cosine Time | Batch Detect (20 deltas) | Improvement |
|---|---|---|---|
| 128d | 10.3 ns | 8.9 us | 57% faster |
| 256d | 19 ns | 9.5 us | 55% faster |
| 512d | 41 ns | 17.2 us | 49-75% faster |
| 768d | 62.5 ns | 24 us | 55-77% faster |

Real Transaction Delta Sparsity Analysis

Measurement of actual delta sparsity for different transaction patterns (128d embeddings):

| Pattern | Avg NNZ | Sparsity | Estimated Speedup |
|---|---|---|---|
| Single Key Update | 4.0 | 96.9% | ~10x |
| Multi-Field Update | 11.3 | 91.2% | ~3x |
| New Record Insert | 29.5 | 77.0% | ~1x |
| Counter Increment | 1.0 | 99.2% | ~10x |
| Bulk Migration | 59.5 | 53.5% | ~1x |
| Graph Edge | 7.0 | 94.5% | ~3x |

Realistic Workload Mix (70% single-key, 20% multi-field, 10% other):

  • Average NNZ: 7.1 / 128 dimensions
  • Average Sparsity: 94.5%
  • Expected speedup: 3-10x for typical workloads

Analysis

  • Sparse advantage: Real transaction deltas are 90-99% sparse, providing 3-10x speedup
  • Hybrid conflict detection: Cosine + Jaccard catches both angular and structural conflicts
  • Memory savings: Sparse DeltaVector uses 8-32x less memory than dense for typical deltas
  • Network bandwidth: Sparse serialization reduces replication bandwidth by 8-10x
  • High dimension scaling: Benefits increase with dimension (768d: 4-5x faster than dense)
  • Common operations optimized: Single-key updates (most common) are 96.9% sparse
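
The sparse speedups above come from operating only on non-zero entries. A minimal sketch of sparse cosine similarity over sorted (index, value) pairs (illustrative representation, not the crate's DeltaVector type):

```rust
use std::cmp::Ordering;

/// Cosine similarity over sparse vectors stored as sorted (index, value)
/// pairs: cost is O(nnz_a + nnz_b) instead of O(d), which is the source of
/// the speedup when deltas are 90-99% sparse.
fn sparse_cosine(a: &[(u32, f32)], b: &[(u32, f32)]) -> f32 {
    // Merge-walk the two sorted index lists; only matching indices
    // contribute to the dot product.
    let (mut i, mut j, mut dot) = (0usize, 0usize, 0.0f32);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                dot += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    let na: f32 = a.iter().map(|(_, v)| v * v).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|(_, v)| v * v).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}
```

A single-key update with 4 non-zeros against a 128d peer touches at most ~8 entries instead of 128, matching the ~10x estimate in the table.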

Distributed Systems Benchmarks

Raft Consensus Operations

| Operation | Time | Throughput |
|---|---|---|
| raft_node_create | 545 ns | 1.8M/sec |
| raft_become_leader | 195 ns | 5.1M/sec |
| raft_heartbeat_stats_snapshot | 4.2 ns | 238M/sec |
| raft_log_length | 3.7 ns | 270M/sec |
| raft_stats_snapshot | 416 ps | 2.4B/sec |

2PC Distributed Transaction Operations

| Operation | Time | Throughput |
|---|---|---|
| lock_manager_acquire | 256 ns | 3.9M/sec |
| lock_manager_release | 139 ns | 7.2M/sec |
| lock_manager_is_locked | 31 ns | 32M/sec |
| coordinator_create | 46 ns | 21.7M/sec |
| coordinator_stats | 418 ps | 2.4B/sec |
| participant_create | 11 ns | 91M/sec |

Gossip Protocol Operations

| Operation | Time | Throughput |
|---|---|---|
| lww_state_create | 4.2 ns | 238M/sec |
| lww_state_merge | 169 ns | 5.9M/sec |
| gossip_node_state_create | 16 ns | 62M/sec |
| gossip_message_serialize | 36 ns | 28M/sec |
| gossip_message_deserialize | 81 ns | 12M/sec |

Snapshot Operations

| Operation | Time | Throughput |
|---|---|---|
| snapshot_metadata_create | 131 ns | 7.6M/sec |
| snapshot_metadata_serialize | 76 ns | 13M/sec |
| snapshot_metadata_deserialize | 246 ns | 4.1M/sec |
| raft_membership_config_create | 102 ns | 9.8M/sec |
| raft_with_store_create | 948 ns | 1.1M/sec |

Membership Operations

| Operation | Time | Throughput |
|---|---|---|
| membership_manager_create | 526 ns | 1.9M/sec |
| membership_view | 152 ns | 6.6M/sec |
| membership_partition_status | 19 ns | 52M/sec |
| membership_node_status | 46 ns | 21.7M/sec |
| membership_stats_snapshot | 2.9 ns | 344M/sec |
| membership_peer_ids | 71 ns | 14M/sec |

Deadlock Detection

| Operation | Time | Throughput |
|---|---|---|
| wait_graph_add_edge | 372 ns | 2.7M/sec |
| wait_graph_detect_no_cycle | 374 ns | 2.7M/sec |
| wait_graph_detect_with_cycle | 302 ns | 3.3M/sec |
| deadlock_detector_detect | 392 ns | 2.6M/sec |

Distributed Systems Analysis

  • Lock operations are fast: Lock acquisition at 256ns and lock checks at 31ns support high-throughput 2PC
  • Gossip is lightweight: State creation <5ns, merges ~169ns - suitable for high-frequency protocol rounds
  • Stats access is near-free: Sub-nanosecond stats snapshots (416ps) mean monitoring adds no overhead
  • Deadlock detection is efficient: Cycle detection in ~300-400ns allows frequent checks without blocking
  • Node/manager creation is slower (500-950ns), as expected for initialization that allocates internal data structures
  • Snapshot deserialization at 246ns is acceptable for fast recovery
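
The wait-graph cycle check benchmarked above boils down to a DFS over "transaction A waits on transaction B" edges; a back edge to a node on the current DFS path means deadlock. A minimal O(V + E) sketch (generic adjacency map, not the detector's actual data structures):

```rust
use std::collections::{HashMap, HashSet};

/// Detect a cycle in a wait-for graph: any cycle means deadlock.
fn has_deadlock(waits: &HashMap<u32, Vec<u32>>) -> bool {
    fn dfs(
        n: u32,
        waits: &HashMap<u32, Vec<u32>>,
        on_path: &mut HashSet<u32>, // nodes on the current DFS stack
        done: &mut HashSet<u32>,    // nodes already fully explored
    ) -> bool {
        if on_path.contains(&n) {
            return true; // back edge => cycle
        }
        if !done.insert(n) {
            return false; // explored before, no cycle through n
        }
        on_path.insert(n);
        let cycle = waits
            .get(&n)
            .into_iter()
            .flatten()
            .any(|&m| dfs(m, waits, on_path, done));
        on_path.remove(&n);
        cycle
    }
    let (mut on_path, mut done) = (HashSet::new(), HashSet::new());
    waits.keys().any(|&n| dfs(n, waits, &mut on_path, &mut done))
}
```

Because wait graphs are tiny (a handful of blocked transactions), the whole check fits in a few hundred nanoseconds, making it cheap to run on every lock wait.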

neumann_parser Benchmarks

The parser is a hand-written recursive descent parser with Pratt expression parsing for operator precedence.

Tokenization

| Query Type | Time | Throughput |
|---|---|---|
| simple_select | 182 ns | 99 MiB/s |
| select_where | 640 ns | 88 MiB/s |
| complex_select | 986 ns | 95 MiB/s |
| insert | 493 ns | 120 MiB/s |
| update | 545 ns | 91 MiB/s |
| node | 625 ns | 98 MiB/s |
| edge | 585 ns | 94 MiB/s |
| path | 486 ns | 75 MiB/s |
| embed | 407 ns | 138 MiB/s |
| similar | 185 ns | 118 MiB/s |

Parsing (tokenize + parse)

| Query Type | Time | Throughput |
|---|---|---|
| simple_select | 235 ns | 77 MiB/s |
| select_where | 1.19 us | 47 MiB/s |
| complex_select | 1.89 us | 50 MiB/s |
| insert | 688 ns | 86 MiB/s |
| update | 806 ns | 61 MiB/s |
| delete | 464 ns | 62 MiB/s |
| create_table | 856 ns | 80 MiB/s |
| node | 837 ns | 81 MiB/s |
| edge | 750 ns | 74 MiB/s |
| neighbors | 520 ns | 55 MiB/s |
| path | 380 ns | 58 MiB/s |
| embed_store | 650 ns | 86 MiB/s |
| similar | 290 ns | 76 MiB/s |

Expression Complexity

| Expression Type | Time |
|---|---|
| simple (a = 1) | 350 ns |
| binary_and | 580 ns |
| binary_or | 570 ns |
| nested_and_or | 950 ns |
| deep_nesting | 1.5 us |
| arithmetic | 720 ns |
| comparison_chain | 1.3 us |

Batch Parsing Throughput

| Batch Size | Time | Queries/s |
|---|---|---|
| 10 | 5.2 us | 1.9M/s |
| 100 | 52 us | 1.9M/s |
| 1,000 | 520 us | 1.9M/s |

Large Query Parsing

| Query Type | Time |
|---|---|
| INSERT 100 rows | 45 us |
| EMBED 768-dim vector | 38 us |
| WHERE 20 conditions | 8.5 us |

Analysis

  • Zero dependencies: Hand-written lexer and parser with no external crates
  • Consistent throughput: ~75-120 MiB/s across query types
  • Expression complexity: Linear scaling with expression depth
  • Batch performance: Consistent 1.9M queries/second regardless of batch size
  • Large vectors: 768-dim embedding parsing in ~38us (~20M dimensions/second)

Complexity

| Operation | Time Complexity | Notes |
|---|---|---|
| Tokenization | O(n) | Linear scan of input |
| Parsing | O(n) | Single pass, no backtracking |
| Expression parsing | O(n * d) | n = tokens, d = nesting depth |
| Error recovery | O(1) | Immediate error on invalid syntax |

Parser Design

  • Lexer: Character-by-character tokenization with lookahead
  • Parser: Recursive descent with Pratt parsing for expressions
  • AST: Zero-copy where possible, spans track source locations
  • Errors: Rich error messages with span highlighting
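
Pratt parsing handles operator precedence with a single recursive function parameterized by a minimum binding power. The toy evaluator below (a two-operator grammar invented for illustration, nothing from the actual parser) shows the core loop:

```rust
/// Minimal Pratt-style expression evaluator over integers, `+`, and `*`.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Tok {
    Num(i64),
    Plus,
    Star,
}

/// Binding power: `*` binds tighter than `+`.
fn binding_power(t: Tok) -> Option<u8> {
    match t {
        Tok::Plus => Some(1),
        Tok::Star => Some(2),
        _ => None,
    }
}

fn parse_expr(toks: &[Tok], pos: &mut usize, min_bp: u8) -> i64 {
    // A primary expression is just a number in this toy grammar.
    let mut lhs = match toks[*pos] {
        Tok::Num(n) => {
            *pos += 1;
            n
        }
        t => panic!("expected number, got {:?}", t),
    };
    // Consume operators while they bind at least as tightly as min_bp.
    while *pos < toks.len() {
        let op = toks[*pos];
        let bp = match binding_power(op) {
            Some(bp) if bp >= min_bp => bp,
            _ => break, // looser operator: let the caller handle it
        };
        *pos += 1;
        let rhs = parse_expr(toks, pos, bp + 1); // +1 => left-associative
        lhs = match op {
            Tok::Plus => lhs + rhs,
            Tok::Star => lhs * rhs,
            _ => unreachable!(),
        };
    }
    lhs
}
```

Each token is consumed exactly once, which is why expression parsing stays linear in token count even for deeply nested WHERE clauses.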

query_router Benchmarks

The query router integrates all engines and routes queries based on parsed AST type.

Relational Operations

| Operation | Time |
|---|---|
| SELECT * (100 rows) | 17 us |
| SELECT WHERE | 17 us |
| INSERT | 290 us |
| UPDATE | 6.5 ms |

Graph Operations

| Operation | Time |
|---|---|
| NODE CREATE | 2.3 us |
| EDGE CREATE | 3.5 us |
| NEIGHBORS | 1.8 us |
| PATH (1 -> 10) | 85 us |
| FIND NODE | 1.2 us |

Vector Operations

| Operation | Time |
|---|---|
| EMBED STORE (128d) | 28 us |
| EMBED GET | 1.5 us |
| SIMILAR LIMIT 5 (100 vectors) | 10 ms |
| SIMILAR LIMIT 10 (100 vectors) | 10 ms |

Mixed Workload

| Configuration | Time | Queries/s |
|---|---|---|
| 5 mixed queries (SELECT, NEIGHBORS, SIMILAR, INSERT, NODE) | 11 ms | 455/s |

Insert Throughput

| Batch Size | Time | Rows/s |
|---|---|---|
| 100 | 29 ms | 3.4K/s |
| 500 | 145 ms | 3.4K/s |
| 1,000 | 290 ms | 3.4K/s |

Analysis

  • Parse overhead: Parser adds ~200ns-2us per query (negligible vs execution)
  • Routing overhead: AST-based routing is O(1) pattern matching
  • Relational: SELECT is fast (17us); UPDATE scans all rows (6.5ms for 100 rows)
  • Graph: Node/edge creation ~2-3us; path finding scales with path length
  • Vector: Similarity search dominates mixed workloads (~10ms for 100 vectors)
  • Bottleneck identification: SIMILAR queries are the slowest operation; use HNSW index for large vector stores

Query Routing Flow

Query String
    │
    ▼
┌─────────┐
│ Parser  │  ~500ns
└────┬────┘
     │
     ▼
┌─────────┐
│   AST   │
└────┬────┘
     │
     ▼
┌─────────────┐
│   Router    │  O(1) dispatch
└──────┬──────┘
       │
       ├──► RelationalEngine
       ├──► GraphEngine
       ├──► VectorEngine
       ├──► Vault
       ├──► Cache
       └──► BlobStore

Performance Recommendations

| Query Type | Optimization |
|---|---|
| High SELECT volume | Create hash indexes on filter columns |
| Large vector search | Build HNSW index |
| Graph traversals | Use NEIGHBORS with LIMIT |
| Batch inserts | Use batch_insert() API |
| Mixed workloads | Profile to identify bottlenecks |

Stress Tests

Comprehensive stress testing infrastructure for Neumann targeting 1M entity scale with extensive coverage of concurrency, data volume, and sustained load.

Quick Start

# Run all stress tests (45+ min total)
cargo test --release -p stress_tests -- --ignored --nocapture

# Run specific test suite
cargo test --release -p stress_tests --test hnsw_stress -- --ignored --nocapture

# Run with custom duration (30s instead of default)
STRESS_DURATION=30 cargo test --release -p stress_tests -- --ignored --nocapture

Test Suites

| Suite | Tests | Description |
|---|---|---|
| HNSW Stress | 4 | 1M vector indexing, concurrent builds |
| TieredStore Stress | 3 | Hot/cold migration under load |
| Mixed Workload | 2 | All engines concurrent, realistic patterns |
| TensorStore Stress | 4 | 1M entities, high contention |
| BloomFilter Stress | 3 | 1M keys, bit-level concurrency |
| QueryRouter Stress | 3 | Concurrent queries, sustained writes |
| Duration Stress | 3 | Long-running stability, memory leaks |

Key Performance Findings

TensorStore (DashMap)

  • 7.5M writes/sec at 1M entities
  • Sub-microsecond median latency
  • Handles 16:1 contention ratio with 2.5M ops/sec

HNSW Index

  • 3,372 vectors/sec insert rate at 1M scale
  • 0.11ms search latency (p50)
  • 99.8% recall@10 under concurrent load

BloomFilter

  • 0.88% FP rate at 1M keys (target 1%)
  • 15M+ ops/sec bit-level operations
  • Thread-safe with AtomicU64
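
The AtomicU64 bit-level concurrency noted above can be sketched as a lock-free Bloom filter: k probe positions per key, each set with an atomic fetch_or so concurrent inserters never need a lock. The hash mixing constants below are illustrative, not the tested implementation:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Lock-free Bloom filter sketch over a shared array of AtomicU64 words.
struct Bloom {
    bits: Vec<AtomicU64>,
    k: u64, // number of probe positions per key
}

impl Bloom {
    fn new(words: usize, k: u64) -> Self {
        Bloom { bits: (0..words).map(|_| AtomicU64::new(0)).collect(), k }
    }
    /// Double hashing: probe i lands at h1 + i*h2 (mod filter size).
    fn probes(&self, key: u64) -> impl Iterator<Item = (usize, u64)> {
        let h1 = key.wrapping_mul(0x9E37_79B9_7F4A_7C15);
        let h2 = key.wrapping_mul(0xC2B2_AE3D_27D4_EB4F) | 1;
        let nbits = (self.bits.len() * 64) as u64;
        (0..self.k).map(move |i| {
            let bit = h1.wrapping_add(i.wrapping_mul(h2)) % nbits;
            ((bit / 64) as usize, bit % 64) // (word index, bit within word)
        })
    }
    fn insert(&self, key: u64) {
        for (word, bit) in self.probes(key) {
            // fetch_or makes concurrent bit-sets race-free without a lock.
            self.bits[word].fetch_or(1 << bit, Ordering::Relaxed);
        }
    }
    fn contains(&self, key: u64) -> bool {
        self.probes(key)
            .all(|(word, bit)| (self.bits[word].load(Ordering::Relaxed) >> bit) & 1 == 1)
    }
}
```

fetch_or on a shared word is the entire synchronization story, which is why the stress test sustains 15M+ ops/sec: there is no lock to contend on, only cache-line traffic.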

Mixed Workloads

  • All engines can operate concurrently
  • Graph operations (5us p50) and vector ops (< 1us p50) are fastest
  • Relational engine adds ~12ms p50 overhead due to schema operations

Configuration

Environment Variables

| Variable | Default | Description |
|---|---|---|
| STRESS_DURATION | 30 (quick) / 600 (full) | Test duration in seconds |
| STRESS_THREADS | 16 | Thread count for tests |

Config Presets

// Quick mode (CI): 100K entities, 4 threads, 30s
let config = quick_config();

// Full mode (local): 1M entities, 16 threads, 10min
let config = full_config();

// Endurance mode: 500K entities, 8 threads, 1 hour
let config = endurance_config();

Latency Metrics

All tests report percentile latencies using HdrHistogram:

| Metric | Description |
|---|---|
| p50 | Median latency |
| p99 | 99th percentile |
| p999 | 99.9th percentile |
| max | Maximum observed |
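
For reference, a percentile over recorded latency samples can be computed with the nearest-rank method. This is a simplified stand-in for HdrHistogram's quantile lookup (which buckets values instead of sorting them), shown only to pin down what p50/p99/p999 mean:

```rust
/// Nearest-rank percentile: the smallest sample such that at least a
/// fraction `q` of all samples are <= it.
fn percentile(samples: &mut [u64], q: f64) -> u64 {
    assert!(!samples.is_empty() && (0.0..=1.0).contains(&q));
    samples.sort_unstable();
    // Rank of the covering sample, clamped to at least 1.
    let rank = ((q * samples.len() as f64).ceil() as usize).max(1);
    samples[rank - 1]
}
```

HdrHistogram trades a small bounded relative error for O(1) recording and constant memory, which matters when a stress test records millions of samples.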

Running in CI

For CI pipelines, use quick_config with limited duration:

- name: Run stress tests
  run: |
    STRESS_DURATION=30 cargo test --release -p stress_tests -- --ignored --nocapture
  timeout-minutes: 15

HNSW Stress Tests

Stress tests for the Hierarchical Navigable Small World (HNSW) index, targeting 1M vector scale.

Test Suite

| Test | Scale | Description |
|---|---|---|
| stress_hnsw_1m_vectors | 1M 128d vectors | Build 1M vector index |
| stress_hnsw_100k_concurrent_build | 100K vectors, 16 threads | Concurrent index construction |
| stress_hnsw_search_during_insert | 50K vectors, 4+4 threads | Concurrent search during insert |
| stress_hnsw_recall_under_load | 10K vectors | Verify recall@10 under load |

Results

| Test | Key Metric | Result |
|---|---|---|
| 1M vectors | Insert throughput | 3,372 vectors/sec |
| 1M vectors | Search latency (p50) | 0.11ms |
| 100K concurrent | Insert throughput | 1,155 vectors/sec |
| Recall@10 | Average recall | 99.8% (min 90%) |

Running

# Run all HNSW stress tests
cargo test --release -p stress_tests --test hnsw_stress -- --ignored --nocapture

# Run specific test
cargo test --release -p stress_tests stress_hnsw_1m_vectors -- --ignored --nocapture

1M Vector Index Build

Tests building an HNSW index with 1 million 128-dimensional vectors.

What it validates:

  • Memory efficiency at scale
  • Index build time scalability
  • Search accuracy after large insertions

Expected behavior:

  • Linear memory growth with vector count
  • Sub-linear search time (O(log n))
  • Recall@10 > 95%

Concurrent Index Build

Tests building an HNSW index with 16 concurrent writer threads.

What it validates:

  • Thread-safety of HNSW insert operations
  • Performance under contention
  • Correctness with concurrent modifications

Expected behavior:

  • All inserted vectors are findable
  • No panics or data races
  • Throughput scales with thread count (with diminishing returns)

Search During Insert

Tests searching the index while new vectors are being inserted concurrently.

What it validates:

  • Read/write concurrency safety
  • Search accuracy with ongoing modifications
  • Latency stability under load

Expected behavior:

  • Searches return valid results
  • No stale or corrupted results
  • Latency remains bounded

Recall Under Load

Tests search recall accuracy under sustained concurrent load.

What it validates:

  • HNSW recall guarantees under stress
  • Accuracy with high query volume
  • Configuration impact on recall

Expected behavior:

  • Average recall@10 > 95%
  • Minimum recall@10 > 90%
  • High_recall config > default config recall

Performance Tuning

HNSW Configuration Impact

| Config | Insert Speed | Search Speed | Recall | Memory |
|---|---|---|---|---|
| high_speed | Fastest | Fastest | Lower | Lower |
| default | Medium | Medium | Good | Medium |
| high_recall | Slowest | Slowest | Highest | Higher |

Scaling Recommendations

| Scale | Recommendation |
|---|---|
| < 100K | Use default config |
| 100K - 1M | Consider high_speed if latency-critical |
| > 1M | Shard across multiple indexes |

TieredStore Stress Tests

Stress tests for the two-tier hot/cold storage system with automatic data migration.

Test Suite

| Test | Scale | Description |
|---|---|---|
| stress_tiered_hot_only_scale | 1M entities | Hot-only tier at scale |
| stress_tiered_migration_under_load | 100K entities | Hot/cold migration with concurrent load |
| stress_tiered_hot_read_latency | 100K entities | Random access read latency |

Results

| Test | Key Metric | Result |
|---|---|---|
| Hot-only 1M | Throughput | 689K entities/sec |
| Migration | Concurrent access | Works correctly |
| Read latency | p50 | < 3us |
| Read latency | p99 | < 500us |

Running

# Run all TieredStore stress tests
cargo test --release -p stress_tests --test tiered_store_stress -- --ignored --nocapture

# Run specific test
cargo test --release -p stress_tests stress_tiered_hot_only_scale -- --ignored --nocapture

Hot-Only Scale Test

Tests TieredStore performance with only hot tier active (no cold storage).

What it validates:

  • In-memory performance at scale
  • DashMap + instrumentation overhead
  • Memory usage patterns

Expected behavior:

  • Throughput > 500K entities/sec
  • Linear memory growth
  • Consistent latency distribution

Migration Under Load

Tests hot-to-cold data migration while concurrent reads/writes continue.

What it validates:

  • Migration correctness during active use
  • No data loss during tier transitions
  • Read consistency during migration

Expected behavior:

  • All data accessible before and after migration
  • Reads don’t block on migration
  • Writes to migrated keys work correctly

Hot Read Latency

Tests random access read latency for hot tier data.

What it validates:

  • Read latency distribution
  • Hot path optimization
  • Cache efficiency

Expected behavior:

  • p50 latency < 3us
  • p99 latency < 500us
  • No extreme outliers (p999 < 10ms)

Architecture

TieredStore
    │
    ├── Hot Tier (DashMap)
    │   ├── Fast in-memory access
    │   ├── Access instrumentation
    │   └── Automatic hot shard tracking
    │
    └── Cold Tier (mmap)
        ├── Disk-backed storage
        ├── Memory-efficient for large datasets
        └── Transparent promotion on access

Migration Strategies

| Strategy | Trigger | Use Case |
|---|---|---|
| Time-based | Entries older than threshold | Aging data |
| Access-based | Cold shards (low access) | Infrequent data |
| Memory-based | Hot tier size limit | Memory pressure |

Configuration

let config = TieredConfig {
    cold_dir: PathBuf::from("/var/lib/neumann/cold"),
    cold_capacity: 1_000_000,  // Max cold entries
    sample_rate: 0.01,         // 1% access sampling
};

Mixed Workload Stress Tests

Stress tests that exercise all Neumann engines simultaneously with realistic workload patterns.

Test Suite

| Test | Scale | Description |
|------|-------|-------------|
| stress_all_engines_concurrent | 25K ops/thread, 12 threads | All engines under concurrent load |
| stress_realistic_workload | 30s duration | Mixed OLTP + search + traversal |

Results

| Test | Key Metric | Result |
|------|------------|--------|
| All engines | Combined throughput | 841 ops/sec |
| All engines | Relational p50 | 12ms |
| All engines | Graph p50 | 5us |
| All engines | Vector p50 | < 1us |
| Realistic workload | Mixed throughput | 232 ops/sec |
| Realistic workload | Read rate | 91 reads/sec |
| Realistic workload | Write rate | 68 writes/sec |
| Realistic workload | Search rate | 72 searches/sec |

Running

# Run all mixed workload stress tests
cargo test --release -p stress_tests --test mixed_workload_stress -- --ignored --nocapture

# Run specific test
cargo test --release -p stress_tests stress_all_engines_concurrent -- --ignored --nocapture

All Engines Concurrent

Tests all engines (relational, graph, vector) under simultaneous heavy load from 12 threads.

What it validates:

  • Cross-engine concurrency safety
  • Shared TensorStore contention handling
  • No deadlocks or livelocks
  • Correct results under maximum stress

Workload distribution per thread:

  • Relational: INSERT, SELECT, UPDATE
  • Graph: NODE, EDGE, NEIGHBORS
  • Vector: EMBED, SIMILAR

Expected behavior:

  • No panics or assertion failures
  • All operations complete (no hangs)
  • Data consistency verified post-test

Realistic Workload

Simulates a production-like mixed workload over 30 seconds.

What it validates:

  • Sustained throughput over time
  • Memory stability (no leaks)
  • Latency consistency

Workload pattern:

  • 40% Reads (SELECT, GET, NEIGHBORS)
  • 30% Writes (INSERT, UPDATE, NODE)
  • 30% Searches (SIMILAR, PATH)

Expected behavior:

  • Throughput variance < 20%
  • Memory usage stable
  • No degradation over time
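The 40/30/30 mix above can be scheduled deterministically; a minimal sketch (a counter-based schedule standing in for the test's RNG — names are illustrative):

```rust
// Pick an operation class per the 40/30/30 workload mix.
fn op_for(tick: u64) -> &'static str {
    match tick % 10 {
        0..=3 => "read",   // 40%: SELECT, GET, NEIGHBORS
        4..=6 => "write",  // 30%: INSERT, UPDATE, NODE
        _ => "search",     // 30%: SIMILAR, PATH
    }
}

fn main() {
    let mut counts = [0u64; 3];
    for tick in 0..10_000 {
        match op_for(tick) {
            "read" => counts[0] += 1,
            "write" => counts[1] += 1,
            _ => counts[2] += 1,
        }
    }
    // Exact 40/30/30 split over a whole number of cycles.
    assert_eq!(counts, [4_000, 3_000, 3_000]);
    println!("reads={} writes={} searches={}", counts[0], counts[1], counts[2]);
}
```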

Engine Latency Breakdown

| Engine | Operation | Typical p50 | Notes |
|--------|-----------|-------------|-------|
| Relational | SELECT | 1-10ms | Schema lookup overhead |
| Relational | INSERT | 3-15ms | Index maintenance |
| Graph | NEIGHBORS | 5-50us | Adjacency list lookup |
| Graph | PATH | 100us-5ms | Scales with path length |
| Vector | EMBED | 1-5us | Hash insert |
| Vector | SIMILAR | 1-100ms | Scales with corpus size |

Bottleneck Identification

When mixed workload throughput is lower than expected:

  1. Vector search dominates: Use HNSW index for SIMILAR queries
  2. Relational scans: Add hash/B-tree indexes on filter columns
  3. Graph traversals: Add LIMIT to NEIGHBORS/PATH queries
  4. Contention: Check hot shards with instrumentation

Scaling Considerations

| Bottleneck | Solution |
|------------|----------|
| CPU-bound | Add more cores, enable rayon parallelism |
| Memory-bound | Enable tiered storage, use sparse vectors |
| I/O-bound | Use NVMe storage, increase buffer sizes |
| Network-bound | Batch operations, use local cache |

Integration Tests

The integration test suite validates cross-engine functionality, data flow, and system behavior. All tests use a shared TensorStore to verify that relational, graph, and vector engines work correctly together.

Test Count: 267+ tests across 22 files

Running Tests

# Run all integration tests
cargo test --package integration_tests

# Run specific test file
cargo test --package integration_tests --test persistence

# Run single test
cargo test --package integration_tests test_snapshot_preserves_all_data

# Run with output
cargo test --package integration_tests -- --nocapture

Test Categories

| Category | Tests | Description |
|----------|-------|-------------|
| Persistence | 9 | Snapshot/restore across all engines |
| Concurrency | 10 | Multi-threaded and async operations |
| Cross-Engine | 10 | Data flow between engines |
| Error Handling | 10 | Proper error messages |
| Delete Operations | 7 | Cleanup and consistency |
| Cache Invalidation | 7 | Cache behavior on writes |
| FIND Command | 7 | Unified query syntax |
| Blob Lifecycle | 7 | GC, repair, streaming |
| Cache Advanced | 6 | TTL, semantic, eviction |
| Vault Advanced | 8 | Grants, audit, namespacing |
| Edge Cases | 10 | Boundary conditions |
| Tensor Compress | 10 | Quantization, delta, RLE encoding |
| Join Operations | 10 | Hash-based relational JOINs |
| HNSW Index | 13 | Approximate nearest neighbor search |
| Vault Versioning | 17 | Secret history and rollback |
| Index Operations | 18 | Hash and B-tree indexes |
| Columnar Storage | 20 | Columnar scans, batch insert, projection |
| Entity Graph API | 18 | String-keyed entity edge operations |
| Sparse Vectors | 22 | Sparse vector creation and similarity |
| Store Instrumentation | 15 | Access pattern tracking |
| Tiered Storage | 16 | Hot/cold data migration |
| Distance Metrics | 17 | COSINE, EUCLIDEAN, DOT_PRODUCT similarity |

Test Helpers

Available in integration_tests/src/lib.rs:

| Helper Function | Purpose |
|-----------------|---------|
| create_shared_router() | Creates QueryRouter with shared TensorStore |
| create_router_with_vault(master_key) | Router with vault initialized |
| create_router_with_cache() | Router with cache initialized |
| create_router_with_blob() | Router with blob store initialized |
| create_router_with_all_features(master_key) | Router with vault, cache, and blob |
| sample_embeddings(count, dim) | Generates deterministic test embeddings using sin() |
| get_store_from_router(router) | Extracts TensorStore from router |
| create_shared_engines() | Creates (store, relational, graph, vector) tuple |
| create_shared_engines_arc() | Same as above but wrapped in Arc for concurrency |

Key Test Suites

Persistence Tests

Tests snapshot/restore functionality across all engines.

| Test | What It Tests |
|------|---------------|
| test_snapshot_preserves_all_data | All engine data survives snapshot/restore |
| test_snapshot_during_writes | Concurrent writes don't corrupt snapshot |
| test_restore_to_fresh_store | Snapshot loads into new TensorStore |
| test_compressed_snapshot_roundtrip | Compression works for vector data |
| test_snapshot_includes_vault_secrets | Vault secrets persist in snapshot |

Lessons Learned

  • Cache is intentionally ephemeral (internal DashMaps)
  • Vault secrets ARE persisted (encrypted in TensorStore)
  • Bloom filter must be re-initialized with same parameters on restore
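The Bloom filter lesson can be made concrete with a toy filter (this is not Neumann's implementation; every name here is illustrative). Membership bits are only meaningful relative to the filter's size and hash count, so a restore must rebuild with the same parameters:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy Bloom filter: `size` bits, `hashes` probes per key.
struct Bloom {
    bits: Vec<bool>,
    hashes: u64,
}

impl Bloom {
    fn new(size: usize, hashes: u64) -> Self {
        Bloom { bits: vec![false; size], hashes }
    }

    // Bit position for probe i of a key; depends on bits.len().
    fn index(&self, key: &str, i: u64) -> usize {
        let mut h = DefaultHasher::new();
        (key, i).hash(&mut h);
        (h.finish() as usize) % self.bits.len()
    }

    fn insert(&mut self, key: &str) {
        for i in 0..self.hashes {
            let idx = self.index(key, i);
            self.bits[idx] = true;
        }
    }

    fn contains(&self, key: &str) -> bool {
        (0..self.hashes).all(|i| self.bits[self.index(key, i)])
    }
}

fn main() {
    let mut saved = Bloom::new(1024, 3);
    saved.insert("user:1");
    assert!(saved.contains("user:1"));

    // Restoring the saved bits into a filter with the SAME (size, hashes)
    // keeps lookups valid; a different size or hash count would map the
    // same key to different bit positions, invalidating every answer.
    let mut restored = Bloom::new(1024, 3);
    restored.bits = saved.bits.clone();
    assert!(restored.contains("user:1"));
}
```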

Concurrency Tests

Tests multi-threaded and async access patterns.

| Test | What It Tests |
|------|---------------|
| test_concurrent_writes_all_engines | 6 threads write to relational/graph/vector simultaneously |
| test_shared_store_contention | 4 threads write same key 1000 times each |
| test_reader_writer_isolation | Reads during heavy writes |
| test_blob_parallel_uploads | 10 concurrent blob uploads with barrier sync |

Lessons Learned

  • DashMap provides excellent concurrent write performance
  • Node IDs are NOT guaranteed sequential - must capture actual IDs
  • Blob operations require tokio::sync::Mutex for shared access
  • HNSW search is thread-safe during concurrent writes

Cross-Engine Tests

Tests data flow and operations across multiple engines.

| Test | What It Tests |
|------|---------------|
| test_unified_entity_across_engines | Single entity with data in all 3 engines |
| test_graph_nodes_with_embeddings | Graph nodes linked to vector embeddings |
| test_insert_embed_search_cycle | INSERT -> EMBED -> SIMILAR workflow |
| test_query_router_cross_engine_operations | Router executes across all engines |

Lessons Learned

  • execute() uses col:type syntax; execute_parsed() uses SQL syntax
  • NEIGHBORS command returns QueryResult::Ids, not QueryResult::Nodes
  • Node IDs must be captured and reused, not assumed to be 0, 1, 2…

Sparse Vector Tests (22 tests)

Tests sparse vector creation, storage, and similarity operations.

Key APIs

  • TensorValue::from_embedding(dense, value_threshold, sparsity_threshold)
  • TensorValue::from_embedding_auto(dense) - Auto thresholds (0.01 value, 0.7 sparsity)
  • TensorValue::dot(other) - Dot product (sparse-sparse, sparse-dense, dense-dense)
  • TensorValue::cosine_similarity(other) - Cosine similarity
  • TensorValue::to_dense() - Convert back to dense
  • TensorValue::dimension() - Get vector dimension
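The core of the sparse dot-product path can be sketched over sorted (index, value) pairs — a std-only illustration of the technique, not the `TensorValue` implementation (the representation is assumed):

```rust
// Sparse-sparse dot product: merge two sorted (index, value) lists and
// multiply only where indices coincide.
fn sparse_dot(a: &[(usize, f32)], b: &[(usize, f32)]) -> f32 {
    let (mut i, mut j, mut sum) = (0, 0, 0.0);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
            std::cmp::Ordering::Equal => {
                sum += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    sum
}

fn main() {
    // Dense [0.0, 0.5, 0.0, 0.3] vs [0.0, 2.0, 0.0, 1.0]
    // -> 0.5 * 2.0 + 0.3 * 1.0 = 1.3
    let a = [(1, 0.5), (3, 0.3)];
    let b = [(1, 2.0), (3, 1.0)];
    assert!((sparse_dot(&a, &b) - 1.3).abs() < 1e-6);
}
```

Only the shared non-zero indices contribute, which is why sparse storage saves both memory and multiply work.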

Distance Metrics Tests (17 tests)

Tests SIMILAR queries with different distance metrics.

Key Syntax

-- Metric goes AFTER LIMIT clause
SIMILAR 'key' LIMIT 10 EUCLIDEAN
SIMILAR [0.1, 0.2] LIMIT 5 DOT_PRODUCT

Known Issues

  • Metric keyword must be AFTER LIMIT (not METRIC EUCLIDEAN)
  • COSINE/DOT_PRODUCT return empty for zero-magnitude queries
  • EUCLIDEAN correctly handles zero vectors
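The zero-magnitude behavior follows directly from the cosine formula: it divides by the vector norms, which is undefined when either norm is zero. A std-only sketch (not the engine's code):

```rust
// Cosine similarity, returning None when either norm is zero --
// the case where COSINE/DOT_PRODUCT queries come back empty.
fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        None
    } else {
        Some(dot / (na * nb))
    }
}

fn main() {
    // Zero-magnitude query: similarity is undefined.
    assert_eq!(cosine(&[0.0, 0.0], &[1.0, 2.0]), None);
    // Identical vectors: similarity ~ 1.0.
    assert!(cosine(&[1.0, 0.0], &[1.0, 0.0]).unwrap() > 0.99);
}
```

Euclidean distance has no such division, which is why it handles zero vectors cleanly.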

Coverage Summary

| Category | Files | Tests | Key Validations |
|----------|-------|-------|-----------------|
| Storage | 4 | 50+ | Persistence, tiering, instrumentation |
| Engines | 5 | 60+ | Relational, graph, vector operations |
| Security | 2 | 25+ | Vault, access control, versioning |
| Caching | 2 | 13 | Exact, semantic, invalidation |
| Advanced | 6 | 80+ | Compression, joins, indexes, sparse |
| Total | 17 | 267+ | |

Code Style

This guide covers the coding standards for Neumann. All contributions must follow these guidelines.

Rust Idioms

  • Prefer iterators over loops
  • Use ? for error propagation
  • Keep functions small and focused
  • Prefer composition over inheritance patterns

Formatting

All code must pass cargo fmt:

cargo fmt --check

Lints

All code must pass clippy with warnings as errors:

cargo clippy -- -D warnings

Comments Policy

Doc comments (///) are for rustdoc generation. Use them sparingly.

DO Document

  • Types (structs, enums) - explain purpose and invariants
  • Non-obvious behavior - when a method does something unexpected
  • Complex algorithms - when the “why” isn’t clear from code

DO NOT Document

  • Methods with self-explanatory names (get, set, new, len, is_empty)
  • Trivial implementations
  • Anything where the doc would just repeat the function name

Examples

#![allow(unused)]
fn main() {
// BAD - restates the obvious
/// Get a field value
pub fn get(&self, key: &str) -> Option<&TensorValue>

// GOOD - no comment needed, name is clear
pub fn get(&self, key: &str) -> Option<&TensorValue>

// GOOD - explains non-obvious behavior
/// Returns cloned data to ensure thread safety.
/// For zero-copy access, use get_ref().
pub fn get(&self, key: &str) -> Result<TensorData>
}

Inline comments (//) should explain “why”, never “what”.

Naming

  • Types: PascalCase
  • Functions and variables: snake_case
  • Constants: SCREAMING_SNAKE_CASE
  • Modules: snake_case

Error Handling

  • Use Result for fallible operations
  • Define error types with thiserror
  • Provide context with error messages
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum MyError {
    #[error("failed to parse config: {0}")]
    ConfigParse(String),

    #[error("connection failed: {source}")]
    Connection {
        #[from]
        source: std::io::Error,
    },
}
}

Concurrency

  • Use DashMap for concurrent hash maps
  • Avoid Mutex where possible (use parking_lot if needed)
  • Document thread-safety in type docs
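To see why the DashMap guidance matters, compare against the std-only alternative: a single `RwLock<HashMap>` makes every writer queue on one lock, which is exactly the contention DashMap's per-shard locking avoids (sketch for illustration only):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Fill a shared map from several threads through one RwLock and return
// the final entry count. Every insert takes the single write lock.
fn concurrent_fill(threads: usize, per_thread: usize) -> usize {
    let map = Arc::new(RwLock::new(HashMap::new()));
    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                for i in 0..per_thread {
                    map.write().unwrap().insert((t, i), i);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let len = map.read().unwrap().len();
    len
}

fn main() {
    assert_eq!(concurrent_fill(4, 100), 400);
}
```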

Testing

  • Unit tests in the same file as code (#[cfg(test)] module)
  • Test the public API, not implementation details
  • Use descriptive names: test_<function>_<scenario>_<expected>

Commits

  • Write clear, imperative commit messages
  • No emoji in commits
  • Reference issue numbers when applicable
  • Keep commits atomic - one logical change per commit

Testing

Test Philosophy

  • Test the public API, not implementation details
  • Include edge cases: empty inputs, boundaries, error conditions
  • Performance tests for operations that must scale (10k+ entities)
  • Concurrent tests for thread-safe code

Running Tests

# All tests
cargo test

# Specific crate
cargo test -p tensor_chain

# Specific test
cargo test test_raft_election

# With output
cargo test -- --nocapture

# Run ignored tests (slow/integration)
cargo test -- --ignored

Test Organization

Unit tests live in the same file:

#![allow(unused)]
fn main() {
pub fn process(data: &str) -> Result<Output> {
    // implementation
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_process_valid_input() {
        let result = process("valid").unwrap();
        assert_eq!(result.status, "ok");
    }

    #[test]
    fn test_process_empty_input() {
        let result = process("");
        assert!(result.is_err());
    }
}
}

Test Naming

Use the pattern: test_<function>_<scenario>_<expected>

#![allow(unused)]
fn main() {
#[test]
fn test_insert_duplicate_key_returns_error() { }

#[test]
fn test_search_empty_index_returns_empty() { }

#[test]
fn test_commit_after_abort_fails() { }
}

Concurrent Tests

For thread-safe code:

#![allow(unused)]
fn main() {
#[test]
fn test_store_concurrent_writes() {
    let store = Arc::new(TensorStore::new());
    let data = TensorData::new();  // sample payload shared across threads
    let handles: Vec<_> = (0..10)
        .map(|i| {
            let store = Arc::clone(&store);
            let data = data.clone();
            std::thread::spawn(move || {
                for j in 0..1000 {
                    store.put(format!("key_{i}_{j}"), data.clone());
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    assert_eq!(store.len(), 10000);
}
}

Performance Tests

Mark slow tests with #[ignore]:

#![allow(unused)]
fn main() {
#[test]
#[ignore]
fn test_hnsw_search_10k_vectors() {
    let mut index = HNSWIndex::new(config);
    for i in 0..10_000 {
        index.insert(format!("vec_{i}"), random_vector(128));
    }

    let start = Instant::now();
    for _ in 0..100 {
        index.search(&query, 10);
    }
    let elapsed = start.elapsed();

    assert!(elapsed < Duration::from_secs(1));
}
}

Run with: cargo test -- --ignored

Integration Tests

Located in integration_tests/:

cargo test -p integration_tests

These test cross-crate behavior and full workflows.

Coverage

Check coverage with cargo-llvm-cov:

cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html
open target/llvm-cov/html/index.html

Target coverage thresholds:

  • shell: 88%
  • parser: 91%
  • blob: 91%
  • router: 92%
  • chain: 95%

Model Checking (TLA+)

Distributed protocol changes must be verified against the TLA+ specifications in specs/tla/:

cd specs/tla

# Run TLC on all three specs
java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
  -deadlock -workers auto -config Raft.cfg Raft.tla

java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
  -deadlock -workers auto \
  -config TwoPhaseCommit.cfg TwoPhaseCommit.tla

java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
  -deadlock -workers auto \
  -config Membership.cfg Membership.tla

When modifying Raft, 2PC, or gossip protocols:

  1. Update the corresponding .tla spec
  2. Run TLC and verify zero errors
  3. Save output to specs/tla/tlc-results/

See Formal Verification for background on what model checking covers.

Mocking

Use trait objects for dependency injection:

#![allow(unused)]
fn main() {
pub trait Transport: Send + Sync {
    fn send(&self, msg: Message) -> Result<()>;
}

// In tests
struct MockTransport {
    sent: Mutex<Vec<Message>>,
}

impl Transport for MockTransport {
    fn send(&self, msg: Message) -> Result<()> {
        self.sent.lock().unwrap().push(msg);
        Ok(())
    }
}
}

Documentation

Documentation Structure

Neumann documentation consists of:

  1. mdBook (docs/book/) - Conceptual docs, tutorials, operations
  2. rustdoc - API reference generated from source
  3. README.md per crate - Quick overview

Writing mdBook Pages

File Location

docs/book/src/
├── SUMMARY.md          # Table of contents
├── introduction.md     # Landing page
├── getting-started/    # Tutorials
├── architecture/       # Module deep dives
├── concepts/           # Cross-cutting concepts
├── operations/         # Deployment, monitoring
└── contributing/       # Contribution guides

Page Structure

# Page Title

Brief introduction (1-2 paragraphs).

## Section 1

Content with examples.

### Subsection

More detail.

## Section 2

Use tables for structured data:

| Column 1 | Column 2 |
|----------|----------|
| Value 1  | Value 2  |

Use mermaid for diagrams:

\`\`\`mermaid
flowchart LR
    A --> B --> C
\`\`\`

Admonitions

Use mdbook-admonish syntax:

```admonish note
This is a note.
```

```admonish warning
This is a warning.
```

```admonish danger
This is dangerous.
```


## Writing Rustdoc

### Module Documentation

```rust
//! # Module Name
//!
//! Brief description (one line).
//!
//! ## Overview
//!
//! Longer explanation of purpose and design decisions.
//!
//! ## Example
//!
//! ```rust
//! // Example code
//! ```

Type Documentation

#![allow(unused)]
fn main() {
/// Brief description of the type.
///
/// Longer explanation if needed.
///
/// # Example
///
/// ```rust
/// let value = MyType::new();
/// ```
pub struct MyType { ... }
}

When to Document

Document:

  • All public types
  • Non-obvious behavior
  • Complex algorithms

Don’t document:

  • Self-explanatory methods (get, set, new)
  • Trivial implementations

Building Documentation

mdBook

cd docs/book
mdbook build
mdbook serve  # Local preview at localhost:3000

rustdoc

cargo doc --workspace --no-deps --open

Full Build

# mdBook
cd docs/book && mdbook build

# rustdoc
cargo doc --workspace --no-deps

# Combine
cp -r target/doc docs/book-output/api/
cd docs/book
mdbook-linkcheck --standalone

Adding Mermaid Diagrams

Supported diagram types:

  • flowchart - Flow diagrams
  • sequenceDiagram - Sequence diagrams
  • stateDiagram-v2 - State machines
  • classDiagram - Class diagrams
  • gantt - Gantt charts

Example:

\`\`\`mermaid
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: Request
    S->>C: Response
\`\`\`

Fuzzing

Neumann uses cargo-fuzz (libFuzzer-based) for coverage-guided fuzzing.

Setup

# Install cargo-fuzz (requires nightly)
cargo install cargo-fuzz

# List available targets
cd fuzz && cargo +nightly fuzz list

Running Fuzz Targets

# Run a specific target for 60 seconds
cargo +nightly fuzz run parser_parse -- -max_total_time=60

# Run without sanitizer (2x faster for safe Rust)
cargo +nightly fuzz run parser_parse --sanitizer none

# Reproduce a crash
cargo +nightly fuzz run parser_parse artifacts/parser_parse/crash-xxx

Available Targets

| Target | Module | What it tests |
|--------|--------|---------------|
| parser_parse | neumann_parser | Statement parsing |
| parser_parse_all | neumann_parser | Multi-statement parsing |
| parser_parse_expr | neumann_parser | Expression parsing |
| parser_tokenize | neumann_parser | Lexer/tokenization |
| compress_ids | tensor_compress | Varint ID compression |
| compress_rle | tensor_compress | RLE encode/decode |
| compress_snapshot | tensor_compress | Snapshot serialization |
| vault_cipher | tensor_vault | AES-256-GCM roundtrip |
| checkpoint_state | tensor_checkpoint | Checkpoint bincode |
| storage_sparse_vector | tensor_store | Sparse vector roundtrip |
| slab_entity_index | tensor_store | EntityIndex operations |
| consistent_hash | tensor_store | Consistent hash partitioner |
| tcp_framing | tensor_chain | TCP wire protocol codec |
| membership | tensor_chain | Cluster config serialization |

Adding a New Fuzz Target

  1. Create target file in fuzz/fuzz_targets/<name>.rs:

#![no_main]

use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Your fuzzing code here
    if let Ok(input) = std::str::from_utf8(data) {
        let _ = my_crate::parse(input);
    }
});
  2. Add entry to fuzz/Cargo.toml:
[[bin]]
name = "my_target"
path = "fuzz_targets/my_target.rs"
test = false
doc = false
bench = false
  3. Add seed corpus files to fuzz/corpus/<name>/:
mkdir -p fuzz/corpus/my_target
echo "valid input 1" > fuzz/corpus/my_target/seed1
echo "valid input 2" > fuzz/corpus/my_target/seed2
  4. Update CI matrix in .github/workflows/fuzz.yml

Structured Fuzzing

For complex input types, use arbitrary:

#![allow(unused)]
fn main() {
use arbitrary::Arbitrary;

#[derive(Arbitrary, Debug)]
struct MyInput {
    field1: u32,
    field2: String,
}

fuzz_target!(|input: MyInput| {
    let _ = my_crate::process(&input);
});
}

Investigating Crashes

# View crash input
xxd artifacts/my_target/crash-xxx

# Minimize crash
cargo +nightly fuzz tmin my_target artifacts/my_target/crash-xxx

# Debug
cargo +nightly fuzz run my_target artifacts/my_target/crash-xxx -- -verbosity=2

CI Integration

Fuzz tests run in CI for 60 seconds per target. See .github/workflows/fuzz.yml.

Best Practices

  1. Add corpus seeds: Real-world inputs help fuzzer find paths
  2. Use structured fuzzing: For complex inputs
  3. Run locally: Before pushing changes to fuzzed code
  4. Minimize crashes: Smaller inputs are easier to debug
  5. Keep targets focused: One functionality per target

API Reference

This document provides detailed public API documentation for all Neumann crates. For auto-generated rustdoc, see Building Locally.


tensor_store

Core key-value storage with HNSW indexing, sparse vectors, and tiered storage.

Core Types

| Type | Description |
|------|-------------|
| TensorStore | Thread-safe key-value store with slab routing |
| TensorData | HashMap-based entity with typed fields |
| TensorValue | Field value: Scalar, Vector, Sparse, Pointer(s) |
| ScalarValue | Null, Bool, Int, Float, String, Bytes |

TensorStore

#![allow(unused)]
fn main() {
use tensor_store::{TensorStore, TensorData, TensorValue, ScalarValue};

let store = TensorStore::new();

// Store entity
let mut data = TensorData::new();
data.set("name", TensorValue::Scalar(ScalarValue::String("Alice".into())));
data.set("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3]));
store.put("user:1", data)?;

// Retrieve
let entity = store.get("user:1")?;
assert!(entity.has("name"));

// Check existence
store.exists("user:1");  // -> bool

// Delete
store.delete("user:1")?;

// Scan by prefix
let count = store.scan_count("user:");
}

TensorData

#![allow(unused)]
fn main() {
let mut data = TensorData::new();

// Set fields
data.set("field", TensorValue::Scalar(ScalarValue::Int(42)));

// Get fields
let value = data.get("field");  // -> Option<&TensorValue>

// Check field existence
data.has("field");  // -> bool

// Field names
let fields: Vec<&str> = data.keys().collect();
}

HNSW Index

Hierarchical Navigable Small World graph for approximate nearest neighbor search.

#![allow(unused)]
fn main() {
use tensor_store::{HNSWIndex, HNSWConfig, DistanceMetric};

// Create with config
let config = HNSWConfig {
    m: 16,              // Connections per node
    ef_construction: 200,
    ef_search: 50,
    max_elements: 10000,
    distance_metric: DistanceMetric::Cosine,
    ..Default::default()
};
let index = HNSWIndex::new(128, config);  // 128 dimensions

// Insert vector
index.insert("doc:1", &embedding)?;

// Search
let results = index.search(&query_vector, 10)?;
for (key, distance) in results {
    println!("{}: {}", key, distance);
}
}

Sparse Vectors

Memory-efficient sparse embeddings with 15+ distance metrics.

#![allow(unused)]
fn main() {
use tensor_store::SparseVector;

// Create from dense (auto-detects sparsity)
let sparse = SparseVector::from_dense(&[0.0, 0.5, 0.0, 0.3, 0.0]);

// Create from indices and values
let sparse = SparseVector::new(vec![1, 3], vec![0.5, 0.3], 5)?;

// Operations
let dense = sparse.to_dense();
let dot = sparse.dot(&other_sparse);
let cosine = sparse.cosine_similarity(&other_sparse);
}

Tiered Storage

Automatic hot/cold storage with mmap backing.

#![allow(unused)]
fn main() {
use tensor_store::{TieredStore, TieredConfig};
use std::path::Path;

let config = TieredConfig {
    hot_capacity: 10000,
    cold_path: Path::new("/data/cold").to_path_buf(),
    migration_threshold: 0.8,
    ..Default::default()
};
let store = TieredStore::new(config)?;

// Automatic migration based on access patterns
store.put("key", data)?;
let value = store.get("key")?;
}

Cache Ring

Fixed-size eviction cache with multiple strategies.

#![allow(unused)]
fn main() {
use tensor_store::{CacheRing, EvictionStrategy};

let cache = CacheRing::new(1000, EvictionStrategy::LRU);

cache.put("key", value);
let hit = cache.get("key");  // -> Option<V>

// Statistics
let stats = cache.stats();
println!("Hit rate: {:.2}%", stats.hit_rate() * 100.0);
}

Consistent Hash Partitioner

Partition routing with virtual nodes.

#![allow(unused)]
fn main() {
use tensor_store::{ConsistentHashPartitioner, ConsistentHashConfig};

let config = ConsistentHashConfig {
    virtual_nodes: 150,
    replication_factor: 3,
};
let partitioner = ConsistentHashPartitioner::new(config);

partitioner.add_node("node1");
partitioner.add_node("node2");

let partition = partitioner.partition("user:123");
let replicas = partitioner.replicas("user:123");
}

relational_engine

SQL-like table operations with SIMD-accelerated filtering.

Core Types

| Type | Description |
|------|-------------|
| RelationalEngine | Main engine with TensorStore backend |
| Schema | Table schema with column definitions |
| Column | Column name, type, nullability |
| ColumnType | Int, Float, String, Bool, Bytes, Json |
| Value | Typed query value |
| Condition | Composable filter predicate |
| Row | Row with ID and values |

Table Operations

#![allow(unused)]
fn main() {
use relational_engine::{RelationalEngine, Schema, Column, ColumnType};

let engine = RelationalEngine::new();

// Create table
let schema = Schema::new(vec![
    Column::new("name", ColumnType::String),
    Column::new("age", ColumnType::Int),
    Column::new("email", ColumnType::String).nullable(),
]);
engine.create_table("users", schema)?;

// Check existence
engine.table_exists("users")?;  // -> bool

// List tables
let tables = engine.list_tables();  // -> Vec<String>

// Get schema
let schema = engine.get_schema("users")?;

// Row count
engine.row_count("users")?;  // -> usize

// Drop table
engine.drop_table("users")?;
}

CRUD Operations

#![allow(unused)]
fn main() {
use relational_engine::{Condition, Value};
use std::collections::HashMap;

// INSERT
let mut values = HashMap::new();
values.insert("name".to_string(), Value::String("Alice".into()));
values.insert("age".to_string(), Value::Int(30));
let row_id = engine.insert("users", values)?;

// BATCH INSERT (59x faster)
let rows: Vec<HashMap<String, Value>> = vec![/* ... */];
let row_ids = engine.batch_insert("users", rows)?;

// SELECT with condition
let rows = engine.select("users",
    Condition::Ge("age".into(), Value::Int(18)))?;

// UPDATE
let mut updates = HashMap::new();
updates.insert("age".to_string(), Value::Int(31));
let count = engine.update("users",
    Condition::Eq("name".into(), Value::String("Alice".into())),
    updates)?;

// DELETE
let count = engine.delete_rows("users",
    Condition::Lt("age".into(), Value::Int(18)))?;
}

Conditions

#![allow(unused)]
fn main() {
use relational_engine::{Condition, Value};

// Simple conditions
Condition::True                           // Match all
Condition::Eq("col".into(), Value::Int(1))  // col = 1
Condition::Ne("col".into(), Value::Int(1))  // col != 1
Condition::Lt("col".into(), Value::Int(10)) // col < 10
Condition::Le("col".into(), Value::Int(10)) // col <= 10
Condition::Gt("col".into(), Value::Int(0))  // col > 0
Condition::Ge("col".into(), Value::Int(0))  // col >= 0

// Compound conditions
let cond = Condition::Ge("age".into(), Value::Int(18))
    .and(Condition::Lt("age".into(), Value::Int(65)));

let cond = Condition::Eq("status".into(), Value::String("active".into()))
    .or(Condition::Gt("priority".into(), Value::Int(5)));
}

Indexes

#![allow(unused)]
fn main() {
// Hash index (O(1) equality)
engine.create_index("users", "email")?;
engine.has_index("users", "email");  // -> bool
engine.drop_index("users", "email")?;

// B-tree index (O(log n) range)
engine.create_btree_index("users", "age")?;
engine.has_btree_index("users", "age");  // -> bool
engine.drop_btree_index("users", "age")?;

// List indexed columns
engine.get_indexed_columns("users");        // -> Vec<String>
engine.get_btree_indexed_columns("users");  // -> Vec<String>
}

Joins

#![allow(unused)]
fn main() {
// INNER JOIN
let joined = engine.join("users", "posts", "_id", "user_id")?;
// -> Vec<(Row, Row)>

// LEFT JOIN
let joined = engine.left_join("users", "posts", "_id", "user_id")?;
// -> Vec<(Row, Option<Row>)>

// RIGHT JOIN
let joined = engine.right_join("users", "posts", "_id", "user_id")?;
// -> Vec<(Option<Row>, Row)>

// FULL JOIN
let joined = engine.full_join("users", "posts", "_id", "user_id")?;
// -> Vec<(Option<Row>, Option<Row>)>

// CROSS JOIN
let joined = engine.cross_join("users", "posts")?;
// -> Vec<(Row, Row)>

// NATURAL JOIN
let joined = engine.natural_join("users", "profiles")?;
// -> Vec<(Row, Row)>
}

Aggregates

#![allow(unused)]
fn main() {
// COUNT
let count = engine.count("users", Condition::True)?;
let count = engine.count_column("users", "email", Condition::True)?;

// SUM
let total = engine.sum("orders", "amount", Condition::True)?;

// AVG
let avg = engine.avg("orders", "amount", Condition::True)?;  // Option<f64>

// MIN/MAX
let min = engine.min("products", "price", Condition::True)?;  // Option<Value>
let max = engine.max("products", "price", Condition::True)?;
}

Transactions

#![allow(unused)]
fn main() {
use relational_engine::{TransactionManager, TxPhase};

let tx_manager = TransactionManager::new();

// Begin transaction
let tx_id = tx_manager.begin();

// Check state
tx_manager.is_active(tx_id);  // -> bool
tx_manager.get(tx_id);        // -> Option<TxPhase>

// Acquire row locks
tx_manager.lock_manager().try_lock(tx_id, &[
    ("users".to_string(), 1),
    ("users".to_string(), 2),
])?;

// Commit or rollback
tx_manager.set_phase(tx_id, TxPhase::Committed);
tx_manager.release_locks(tx_id);
tx_manager.remove(tx_id);
}

graph_engine

Directed graph operations with BFS traversal and shortest path.

Core Types

TypeDescription
GraphEngineMain engine with TensorStore backend
NodeGraph node with label and properties
EdgeDirected edge with type and properties
DirectionOutgoing, Incoming, Both
PropertyValueNull, Int, Float, String, Bool
PathSequence of nodes and edges

Node Operations

#![allow(unused)]
fn main() {
use graph_engine::{GraphEngine, PropertyValue};
use std::collections::HashMap;

let engine = GraphEngine::new();

// Create node
let mut props = HashMap::new();
props.insert("name".to_string(), PropertyValue::String("Alice".into()));
let node_id = engine.create_node("person", props)?;

// Get node
let node = engine.get_node(node_id)?;
println!("{}: {:?}", node.label, node.properties);

// Update node
let mut updates = HashMap::new();
updates.insert("age".to_string(), PropertyValue::Int(30));
engine.update_node(node_id, updates)?;

// Delete node
engine.delete_node(node_id)?;

// Find nodes by label
let people = engine.find_nodes_by_label("person")?;
}

Edge Operations

#![allow(unused)]
fn main() {
use graph_engine::Direction;

// Create edge
let edge_id = engine.create_edge(from_id, to_id, "follows", HashMap::new())?;

// Get edge
let edge = engine.get_edge(edge_id)?;

// Get neighbors
let neighbors = engine.neighbors(node_id, Direction::Outgoing)?;
let neighbors = engine.neighbors(node_id, Direction::Incoming)?;
let neighbors = engine.neighbors(node_id, Direction::Both)?;

// Get edges
let edges = engine.edges(node_id, Direction::Outgoing)?;

// Delete edge
engine.delete_edge(edge_id)?;
}

Traversal

#![allow(unused)]
fn main() {
// BFS traversal
let visited = engine.bfs(start_id, |node| {
    // Return true to continue traversal
    true
})?;

// Shortest path (Dijkstra)
let path = engine.shortest_path(from_id, to_id)?;
if let Some(path) = path {
    for node_id in path.nodes {
        println!("-> {}", node_id);
    }
}
}

Property Indexes

#![allow(unused)]
fn main() {
use graph_engine::{IndexTarget, RangeOp};

// Create index on node property
engine.create_property_index(IndexTarget::Node, "age")?;

// Create index on edge property
engine.create_property_index(IndexTarget::Edge, "weight")?;

// Range query using index
let results = engine.find_by_range(
    IndexTarget::Node,
    "age",
    &PropertyValue::Int(18),
    RangeOp::Ge,
)?;
}

vector_engine

Embedding storage and similarity search.

Core Types

| Type | Description |
|------|-------------|
| VectorEngine | Main engine for embedding operations |
| SearchResult | Key and similarity score |
| DistanceMetric | Cosine, Euclidean, DotProduct |
| FilterCondition | Metadata filter (Eq, Ne, Lt, Gt, And, Or, In, etc.) |
| FilterValue | Filter value type (Int, Float, String, Bool, Null) |
| FilterStrategy | Filter strategy (Auto, PreFilter, PostFilter) |
| FilteredSearchConfig | Configuration for filtered search |
| VectorCollectionConfig | Configuration for collections |
| MetadataValue | Simplified metadata value type |

Basic Operations

#![allow(unused)]
fn main() {
use vector_engine::{VectorEngine, DistanceMetric};

let engine = VectorEngine::new();

// Store embedding (auto-detects sparse)
engine.store_embedding("doc:1", vec![0.1, 0.2, 0.3])?;

// Get embedding
let vector = engine.get_embedding("doc:1")?;

// Check existence
engine.exists("doc:1");  // -> bool

// Delete
engine.delete_embedding("doc:1")?;

// Count embeddings
engine.count();  // -> usize
}

Similarity Search

#![allow(unused)]
fn main() {
// Search similar embeddings
let query = vec![0.1, 0.2, 0.3];
let results = engine.search_similar(&query, 10)?;

for result in results {
    println!("{}: {:.4}", result.key, result.score);
}

// Search with metric
let results = engine.search_similar_with_metric(
    &query,
    10,
    DistanceMetric::Euclidean,
)?;
}
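For intuition, the three DistanceMetric variants score a pair of vectors differently: cosine compares direction, Euclidean compares position, and dot product rewards both alignment and magnitude. A std-only sketch of the math (not the engine's actual code):

```rust
// Illustrative metric implementations; the engine's internals may differ.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot(a, b) / (norm(a) * norm(b))
}

fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    // Orthogonal unit vectors: cosine similarity 0, Euclidean distance sqrt(2).
    assert!(cosine(&a, &b).abs() < 1e-6);
    assert!((euclidean(&a, &b) - 2f32.sqrt()).abs() < 1e-6);
    assert!((dot(&a, &a) - 1.0).abs() < 1e-6);
}
```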

Filtered Search

#![allow(unused)]
fn main() {
use vector_engine::{FilterCondition, FilterValue, FilteredSearchConfig};

// Build filter
let filter = FilterCondition::Eq("category".into(), FilterValue::String("science".into()))
    .and(FilterCondition::Gt("year".into(), FilterValue::Int(2020)));

// Search with filter
let results = engine.search_similar_filtered(&query, 10, &filter, None)?;

// With explicit strategy
let config = FilteredSearchConfig::pre_filter();
let results = engine.search_similar_filtered(&query, 10, &filter, Some(config))?;
}

Metadata Storage

#![allow(unused)]
fn main() {
use tensor_store::TensorValue;
use std::collections::HashMap;

// Store embedding with metadata
let mut metadata = HashMap::new();
metadata.insert("category".into(), TensorValue::from("science"));
metadata.insert("year".into(), TensorValue::from(2024i64));
engine.store_embedding_with_metadata("doc:1", vec![0.1, 0.2], metadata)?;

// Get metadata
let meta = engine.get_metadata("doc:1")?;

// Update metadata
let mut updates = HashMap::new();
updates.insert("year".into(), TensorValue::from(2025i64));
engine.update_metadata("doc:1", updates)?;
}

Collections

#![allow(unused)]
fn main() {
use vector_engine::VectorCollectionConfig;

// Create collection
let config = VectorCollectionConfig::default()
    .with_dimension(768)
    .with_metric(DistanceMetric::Cosine);
engine.create_collection("documents", config)?;

// Store in collection
engine.store_in_collection("documents", "doc:1", vec![0.1; 768])?;

// Search in collection
let results = engine.search_in_collection("documents", &query, 10)?;

// List/delete collections
let collections = engine.list_collections();
engine.delete_collection("documents")?;
}

tensor_chain

Distributed consensus with Raft and 2PC transactions.

Core Types

Type                        Description
Chain                       Block chain with graph-based linking
Block                       Block with header and transactions
Transaction                 Put, Delete, Update operations
RaftNode                    Raft consensus state machine
DistributedTxCoordinator    2PC transaction coordinator

Chain Operations

#![allow(unused)]
fn main() {
use tensor_chain::{Chain, Transaction, Block};
use graph_engine::GraphEngine;
use std::sync::Arc;

let graph = Arc::new(GraphEngine::new());
let chain = Chain::new(graph, "node1".to_string());
chain.initialize()?;

// Create block
let builder = chain.new_block()
    .add_transaction(Transaction::Put {
        key: "user:1".into(),
        data: vec![1, 2, 3],
    })
    .add_transaction(Transaction::Delete {
        key: "user:0".into(),
    });

let block = builder.build();
chain.append(block)?;

// Query chain
let height = chain.height();
let block = chain.get_block(1)?;
}

Raft Consensus

#![allow(unused)]
fn main() {
use tensor_chain::{RaftNode, RaftConfig, RaftState};

let config = RaftConfig {
    election_timeout_min: 150,
    election_timeout_max: 300,
    heartbeat_interval: 50,
    ..Default::default()
};

let raft = RaftNode::new("node1".into(), config);

// State queries
raft.is_leader();     // -> bool
raft.current_term();  // -> u64
raft.state();         // -> RaftState

// Statistics
let stats = raft.stats();
}

Distributed Transactions

#![allow(unused)]
fn main() {
use tensor_chain::{DistributedTxCoordinator, DistributedTxConfig};

let config = DistributedTxConfig {
    prepare_timeout_ms: 5000,
    commit_timeout_ms: 5000,
    max_retries: 3,
    ..Default::default()
};

let coordinator = DistributedTxCoordinator::new(config);

// Begin distributed transaction
let tx_id = coordinator.begin()?;

// Prepare phase
coordinator.prepare(tx_id, keys, participants).await?;

// Commit phase
coordinator.commit(tx_id).await?;

// Or abort
coordinator.abort(tx_id).await?;
}
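The two phases above follow classic two-phase commit: the transaction commits only if every participant votes yes during prepare; any no vote (or timeout) aborts it. A toy illustration of that decision rule (the coordinator's real protocol adds timeouts and retries):

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    Committed,
    Aborted,
}

// Classic 2PC decision rule: all participants must vote yes to commit.
fn decide(votes: &[bool]) -> Outcome {
    if votes.iter().all(|&v| v) {
        Outcome::Committed
    } else {
        Outcome::Aborted
    }
}

fn main() {
    assert_eq!(decide(&[true, true, true]), Outcome::Committed);
    assert_eq!(decide(&[true, false, true]), Outcome::Aborted); // one no vote aborts
}
```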

Membership Management

#![allow(unused)]
fn main() {
use tensor_chain::{MembershipManager, ClusterConfig, HealthConfig, LocalNodeConfig};

let config = ClusterConfig {
    local: LocalNodeConfig { id: "node1".into(), addr: "127.0.0.1:9000".parse()? },
    peers: vec![],
    health: HealthConfig::default(),
};

let membership = MembershipManager::new(config);

// Add/remove nodes
membership.add_node("node2", "127.0.0.1:9001".parse()?)?;
membership.remove_node("node2")?;

// Health status
let health = membership.node_health("node2");
let status = membership.partition_status();
}

neumann_parser

Hand-written recursive descent parser for the Neumann query language.

Core Types

Type             Description
Statement        Parsed statement with span
StatementKind    Select, Insert, Update, Delete, Node, Edge, etc.
Expr             Expression AST node
Token            Lexer token with span
ParseError       Error with source location

Parsing

#![allow(unused)]
fn main() {
use neumann_parser::{parse, parse_all, parse_expr, tokenize};

// Parse single statement
let stmt = parse("SELECT * FROM users WHERE id = 1")?;

// Parse multiple statements
let stmts = parse_all("SELECT 1; SELECT 2")?;

// Parse expression only
let expr = parse_expr("1 + 2 * 3")?;

// Tokenize
let tokens = tokenize("SELECT id, name FROM users");
}

Error Handling

#![allow(unused)]
fn main() {
let result = parse("SELCT * FROM users");
if let Err(err) = result {
    // Format with source context
    let formatted = err.format_with_source("SELCT * FROM users");
    eprintln!("{}", formatted);

    // Access error details
    println!("Line: {}", err.line());
    println!("Column: {}", err.column());
}
}

Span Utilities

#![allow(unused)]
fn main() {
use neumann_parser::{line_number, line_col, get_line, BytePos};

let source = "SELECT\nFROM\nWHERE";

// Get line number (1-indexed)
let line = line_number(source, BytePos(7));  // -> 2

// Get line and column
let (line, col) = line_col(source, BytePos(7));  // -> (2, 1)

// Get line text
let text = get_line(source, BytePos(7));  // -> "FROM"
}
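These utilities can be pictured as simple scans over the source text. A sketch of the assumed semantics of line_col, counting newlines before the byte offset (1-indexed lines and columns, matching the example above):

```rust
// Assumed semantics: line = newlines before `pos` + 1,
// column = offset from the start of the current line + 1.
fn line_col(source: &str, pos: usize) -> (usize, usize) {
    let before = &source[..pos];
    let line = before.matches('\n').count() + 1;
    let col = pos - before.rfind('\n').map_or(0, |i| i + 1) + 1;
    (line, col)
}

fn main() {
    let source = "SELECT\nFROM\nWHERE";
    assert_eq!(line_col(source, 7), (2, 1)); // the 'F' of FROM
    assert_eq!(line_col(source, 0), (1, 1));
    assert_eq!(line_col(source, 12), (3, 1)); // the 'W' of WHERE
}
```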

query_router

Unified query routing across all engines.

Core Types

Type           Description
QueryRouter    Main router handling all query types
QueryResult    Result variants for different query types
RouterError    Error types from all engines

Query Execution

#![allow(unused)]
fn main() {
use query_router::{QueryRouter, QueryResult};

let router = QueryRouter::new();

// Execute query
let result = router.execute("SELECT * FROM users")?;

match result {
    QueryResult::Rows(rows) => { /* relational result */ }
    QueryResult::Nodes(nodes) => { /* graph result */ }
    QueryResult::Similar(results) => { /* vector result */ }
    QueryResult::Success(msg) => { /* command result */ }
    _ => {}
}
}

Distributed Queries

#![allow(unused)]
fn main() {
use query_router::{QueryPlanner, MergeStrategy, ResultMerger};

let planner = QueryPlanner::new(partitioner);

// Plan distributed query
let plan = planner.plan("SELECT * FROM users WHERE region = 'us'")?;

// Execute on shards
let shard_results = execute_on_shards(&plan).await?;

// Merge results
let merger = ResultMerger::new(MergeStrategy::Union);
let final_result = merger.merge(shard_results)?;
}

tensor_cache

LLM response cache with exact and semantic matching.

Core Types

Type                Description
Cache               Multi-layer LLM response cache
CacheConfig         Configuration for cache behavior
CacheHit            Successful lookup result
CacheLayer          Exact, Semantic, Embedding
EvictionStrategy    LRU, LFU, CostBased, Hybrid

Operations

#![allow(unused)]
fn main() {
use tensor_cache::{Cache, CacheConfig, EvictionStrategy};

let mut config = CacheConfig::default();
config.embedding_dim = 384;
config.eviction_strategy = EvictionStrategy::Hybrid;
let cache = Cache::with_config(config)?;

// Store response
let embedding = vec![0.1, 0.2, /* ... */];
cache.put(
    "What is 2+2?",
    &embedding,
    "The answer is 4.",
    "gpt-4",
    None,  // version
)?;

// Lookup (tries exact, then semantic)
if let Some(hit) = cache.get("What is 2+2?", Some(&embedding)) {
    println!("Response: {}", hit.response);
    println!("Layer: {:?}", hit.layer);
    println!("Cost saved: ${:.4}", hit.cost_saved);
}

// Statistics
let stats = cache.stats();
println!("Hit rate: {:.2}%", stats.hit_rate() * 100.0);
}
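The layered lookup above (exact match first, then semantic) can be sketched with a map plus a cosine-similarity threshold. Illustrative only; the real cache's layers, scoring, and eviction are richer:

```rust
use std::collections::HashMap;

struct MiniCache {
    entries: HashMap<String, (Vec<f32>, String)>, // prompt -> (embedding, response)
    threshold: f32,
}

impl MiniCache {
    // Exact match first; otherwise the nearest embedding above the threshold.
    fn get(&self, prompt: &str, embedding: &[f32]) -> Option<&str> {
        if let Some((_, r)) = self.entries.get(prompt) {
            return Some(r.as_str());
        }
        self.entries
            .values()
            .map(|(e, r)| (cosine(e, embedding), r.as_str()))
            .filter(|(score, _)| *score >= self.threshold)
            .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap())
            .map(|(_, r)| r)
    }
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

fn main() {
    let mut entries = HashMap::new();
    entries.insert("What is 2+2?".to_string(), (vec![1.0, 0.0], "4".to_string()));
    let cache = MiniCache { entries, threshold: 0.9 };
    assert_eq!(cache.get("What is 2+2?", &[0.0, 1.0]), Some("4")); // exact hit
    assert_eq!(cache.get("what's two plus two", &[0.99, 0.05]), Some("4")); // semantic hit
    assert_eq!(cache.get("unrelated", &[0.0, 1.0]), None); // miss
}
```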

tensor_vault

Encrypted secret storage with graph-based access control.

Core Types

Type           Description
Vault          Main vault API
VaultConfig    Configuration for security settings
Permission     Read, Write, Admin
MasterKey      Derived encryption key

Operations

#![allow(unused)]
fn main() {
use tensor_vault::{Vault, VaultConfig, Permission};

let config = VaultConfig::default();
let vault = Vault::new(config)?;

// Store secret
vault.set("requester", "db/password", b"secret123", Permission::Admin)?;

// Get secret
let secret = vault.get("requester", "db/password")?;

// Grant access
vault.grant("admin", "user", "db/password", Permission::Read)?;

// Revoke access
vault.revoke("admin", "user", "db/password")?;

// List secrets
let secrets = vault.list("requester", "db/")?;

// Delete secret
vault.delete("admin", "db/password")?;
}
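Grant and revoke above behave like adding and removing access edges between a principal and a secret path. A minimal model of the check (the vault's real graph-based access control is richer, with permission hierarchies and transitive grants):

```rust
use std::collections::HashSet;

// Access modeled as (principal, secret-path, permission) edges.
struct Acl(HashSet<(String, String, String)>);

impl Acl {
    fn grant(&mut self, who: &str, path: &str, perm: &str) {
        self.0.insert((who.into(), path.into(), perm.into()));
    }
    fn revoke(&mut self, who: &str, path: &str, perm: &str) {
        self.0.remove(&(who.into(), path.into(), perm.into()));
    }
    fn allowed(&self, who: &str, path: &str, perm: &str) -> bool {
        self.0.contains(&(who.into(), path.into(), perm.into()))
    }
}

fn main() {
    let mut acl = Acl(HashSet::new());
    acl.grant("user", "db/password", "Read");
    assert!(acl.allowed("user", "db/password", "Read"));
    acl.revoke("user", "db/password", "Read");
    assert!(!acl.allowed("user", "db/password", "Read"));
}
```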

tensor_blob

S3-style object storage with content-addressable chunks.

Core Types

Type                Description
BlobStore           Main blob storage API
BlobConfig          Configuration for chunk size, GC
PutOptions          Options for storing artifacts
ArtifactMetadata    Metadata for stored artifacts
BlobWriter          Streaming upload
BlobReader          Streaming download

Operations

#![allow(unused)]
fn main() {
use tensor_blob::{BlobStore, BlobConfig, PutOptions};

let config = BlobConfig::default();
let store = BlobStore::new(tensor_store, config).await?;

// Store artifact
let artifact_id = store.put(
    "report.pdf",
    &file_bytes,
    PutOptions::new()
        .with_created_by("user:alice")
        .with_tag("quarterly"),
).await?;

// Get artifact
let data = store.get(&artifact_id).await?;

// Streaming upload
let mut writer = store.writer("large-file.bin", PutOptions::new()).await?;
writer.write(&chunk1).await?;
writer.write(&chunk2).await?;
let artifact_id = writer.finish().await?;

// Streaming download
let mut reader = store.reader(&artifact_id).await?;
let chunk = reader.read(1024).await?;

// Delete
store.delete(&artifact_id).await?;

// Metadata
let metadata = store.metadata(&artifact_id).await?;
}
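Content-addressable storage keys each chunk by a hash of its bytes, so identical chunks are stored only once. A rough std-only sketch of the idea (the store's real chunking and hashing are internal details):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Split data into fixed-size chunks and key each by its content hash;
// duplicate chunks dedupe automatically.
fn store_chunks(data: &[u8], chunk_size: usize, store: &mut HashMap<u64, Vec<u8>>) -> Vec<u64> {
    data.chunks(chunk_size)
        .map(|c| {
            let mut h = DefaultHasher::new();
            c.hash(&mut h);
            let key = h.finish();
            store.entry(key).or_insert_with(|| c.to_vec());
            key
        })
        .collect()
}

fn main() {
    let mut store = HashMap::new();
    // Two identical 4-byte chunks followed by a distinct one.
    let data = [1u8, 2, 3, 4, 1, 2, 3, 4, 9, 9, 9, 9];
    let keys = store_chunks(&data, 4, &mut store);
    assert_eq!(keys.len(), 3);    // three chunks referenced
    assert_eq!(keys[0], keys[1]); // identical chunks share a key
    assert_eq!(store.len(), 2);   // ...and are stored only once
}
```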

tensor_checkpoint

Snapshot and rollback system.

Core Types

Type                   Description
CheckpointManager      Main checkpoint API
CheckpointConfig       Configuration for checkpoints
DestructiveOp          Delete, Update operations
OperationPreview       Preview of affected data
ConfirmationHandler    Custom confirmation logic

Operations

#![allow(unused)]
fn main() {
use tensor_checkpoint::{CheckpointManager, CheckpointConfig, AutoConfirm};
use std::sync::Arc;

let config = CheckpointConfig::new()
    .with_max_checkpoints(10)
    .with_auto_checkpoint(true);

let manager = CheckpointManager::new(blob_store, config).await;
manager.set_confirmation_handler(Arc::new(AutoConfirm));

// Create checkpoint
let checkpoint_id = manager.create(Some("before-migration"), &store).await?;

// List checkpoints
let checkpoints = manager.list().await?;

// Restore from checkpoint
manager.restore(&checkpoint_id, &mut store).await?;

// Delete checkpoint
manager.delete(&checkpoint_id).await?;
}

Common Patterns

Error Handling

All crates use the Result type with crate-specific error enums:

#![allow(unused)]
fn main() {
use relational_engine::{RelationalEngine, RelationalError};

let result = engine.create_table("users", schema);
match result {
    Ok(()) => println!("Table created"),
    Err(RelationalError::TableAlreadyExists) => println!("Already exists"),
    Err(e) => eprintln!("Error: {}", e),
}
}

Thread Safety

All engines use parking_lot and DashMap for concurrent access:

#![allow(unused)]
fn main() {
use std::sync::Arc;
use std::thread;

let engine = Arc::new(RelationalEngine::new());

let handles: Vec<_> = (0..4).map(|_| {
    let engine = Arc::clone(&engine);
    thread::spawn(move || {
        engine.insert("users", values).unwrap();
    })
}).collect();

for handle in handles {
    handle.join().unwrap();
}
}

Async Operations

tensor_blob, tensor_cache, and tensor_checkpoint use async APIs:

#![allow(unused)]
fn main() {
use tokio::runtime::Runtime;

let rt = Runtime::new()?;
rt.block_on(async {
    let store = BlobStore::new(tensor_store, config).await?;
    store.put("file.txt", &data, options).await?;
    Ok(())
})?;
}

Building Locally

Generate documentation from source:

# Basic documentation
cargo doc --workspace --no-deps --open

# With all features and private items
cargo doc --workspace --no-deps --all-features --document-private-items

# With scraped examples (nightly)
RUSTDOCFLAGS="--cfg docsrs" cargo +nightly doc \
  -Zunstable-options \
  -Zrustdoc-scrape-examples \
  --all-features

Crate Documentation

After generating docs locally with cargo doc, you can browse documentation for:

  • tensor_store - Core storage layer
  • relational_engine - SQL-like tables
  • graph_engine - Graph operations
  • vector_engine - Vector similarity search
  • tensor_chain - Distributed consensus
  • neumann_parser - Query language parser
  • query_router - Unified query execution
  • tensor_cache - Multi-layer caching
  • tensor_vault - Encrypted secrets
  • tensor_blob - Blob storage
  • tensor_checkpoint - Snapshot/restore