Introduction
Neumann is a unified tensor-based runtime that stores relational data, graph relationships, and vector embeddings in a single system. Instead of stitching together a SQL database, a graph store, and a vector index, Neumann gives you all three behind one query language.
Choose Your Path
| I want to… | Go to |
|---|---|
| Try it in 5 minutes | Quick Start |
| Build a project with it | Five-Minute Tutorial |
| See what it can do | Use Cases |
| Understand the design | Architecture Overview |
| Use the Python SDK | Python Quickstart |
| Use the TypeScript SDK | TypeScript Quickstart |
| Look up a command | Query Language Reference |
What Makes Neumann Different
One system, three engines. Store a table, connect entities in a graph, and search by vector similarity without moving data between systems.
-- Relational
CREATE TABLE documents (id INT PRIMARY KEY, title TEXT, author TEXT);
INSERT INTO documents VALUES (1, 'Intro to ML', 'Alice');
-- Graph
NODE CREATE topic { name: 'machine-learning' }
ENTITY CONNECT 'doc-1' -> 'topic-ml' : covers
-- Vector
EMBED STORE 'doc-1' [0.1, 0.2, 0.3, 0.4]
-- Cross-engine: find similar documents connected to a topic
SIMILAR 'doc-1' LIMIT 5 CONNECTED TO 'topic-ml'
Encrypted vault. Store secrets with AES-256-GCM encryption and graph-based access control.
LLM cache. Cache LLM responses with exact and semantic matching to reduce API costs.
Built-in consensus. Raft-based distributed consensus with 2PC transactions for multi-node deployments.
Architecture
+-------------------+
| neumann_shell | Interactive CLI
| neumann_server | gRPC server
+-------------------+
|
+-------------------+
| query_router | Unified query execution
+-------------------+
|
+----------+---------+---------+----------+
| | | | |
relational graph vector tensor_ tensor_
_engine _engine _engine _vault _cache
| | | | |
+----------+---------+---------+----------+
|
+-------------------+
| tensor_store | Core storage (HNSW, sharded B-trees)
+-------------------+
Additional subsystems: tensor_blob (S3-style blob storage), tensor_chain
(blockchain with Raft), tensor_checkpoint (snapshots), tensor_compress
(tensor train decomposition).
Getting Started
- Installation – Install Neumann
- Quick Start – Your first queries
- Five-Minute Tutorial – Build a mini RAG system
- Use Cases – Real-world applications
- Building from Source – Compile from source
Reference
- Query Language – Full command reference
- Data Types – Scalar, vector, and sparse types
- Functions – Aggregates, distance metrics, operators
- API Reference – Rustdoc output
Installation
Multiple installation methods are available depending on your needs.
Quick Install (Recommended)
The easiest way to install Neumann is using the install script:
curl -sSfL https://raw.githubusercontent.com/Shadylukin/Neumann/main/install.sh | bash
This script will:
- Detect your platform (Linux x86_64, macOS x86_64, macOS ARM64)
- Download a pre-built binary if available
- Fall back to building from source if needed
- Install to /usr/local/bin or ~/.local/bin
- Install shell completions and man pages
Environment Variables
| Variable | Description |
|---|---|
| NEUMANN_INSTALL_DIR | Custom installation directory |
| NEUMANN_VERSION | Install a specific version (e.g., v0.1.0) |
| NEUMANN_NO_MODIFY_PATH | Set to 1 to skip PATH modification |
| NEUMANN_SKIP_EXTRAS | Set to 1 to skip completions and man page installation |
Homebrew (macOS/Linux)
brew tap Shadylukin/tap
brew install neumann
Cargo (crates.io)
If you have Rust installed:
cargo install neumann_shell
To install the gRPC server:
cargo install neumann_server
Docker
Interactive CLI
docker run -it shadylukinack/neumann:latest
Server Mode
docker run -d -p 9200:9200 -v neumann-data:/var/lib/neumann shadylukinack/neumann:server
Docker Compose
# Clone the repository
git clone https://github.com/Shadylukin/Neumann.git
cd Neumann
# Start the server
docker compose up -d neumann-server
# Run the CLI
docker compose run --rm neumann-cli
From Source
Requirements
- Rust 1.75 or later
- Cargo (included with Rust)
- Git
- protobuf compiler (for gRPC)
Build Steps
# Clone the repository
git clone https://github.com/Shadylukin/Neumann.git
cd Neumann
# Build in release mode
cargo build --release --package neumann_shell
# Install locally
cargo install --path neumann_shell
Run Tests
cargo test
SDK Installation
Python
pip install neumann-db
For embedded mode (in-process database):
pip install neumann-db[native]
TypeScript / JavaScript
npm install neumann-db
Or with yarn:
yarn add neumann-db
Verify Installation
neumann --version
Platform Support
| Platform | Binary | Homebrew | Docker | Source |
|---|---|---|---|---|
| Linux x86_64 | Yes | Yes | Yes | Yes |
| macOS x86_64 | Yes | Yes | Yes | Yes |
| macOS ARM64 (Apple Silicon) | Yes | Yes | Yes | Yes |
| Windows x86_64 | No | No | Yes | Experimental |
Troubleshooting
“command not found: neumann”
The binary may not be in your PATH. Try:
# Check where it was installed
which neumann || ls ~/.local/bin/neumann
# Add to PATH if needed
export PATH="$HOME/.local/bin:$PATH"
Build fails with protobuf errors
Install the protobuf compiler:
# macOS
brew install protobuf
# Ubuntu/Debian
sudo apt-get install protobuf-compiler
# Fedora
sudo dnf install protobuf-compiler
Permission denied during install
The installer tries /usr/local/bin first (requires sudo) then falls back
to ~/.local/bin. You can specify a custom directory:
NEUMANN_INSTALL_DIR=~/bin \
curl -sSfL https://raw.githubusercontent.com/Shadylukin/Neumann/main/install.sh | bash
Python SDK native module errors
If you get errors about the native module, ensure you have a Rust toolchain:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
pip install neumann-db[native]
Updating
Quick Install
Re-run the install script to get the latest version:
curl -sSfL https://raw.githubusercontent.com/Shadylukin/Neumann/main/install.sh | bash
Homebrew
brew upgrade neumann
Cargo
cargo install neumann_shell --force
Python SDK
pip install --upgrade neumann-db
TypeScript SDK
npm update neumann-db
Uninstalling
Quick Install / Cargo
rm $(which neumann)
Homebrew
brew uninstall neumann
Docker
docker rmi shadylukinack/neumann:latest shadylukinack/neumann:server
docker volume rm neumann-data
Python SDK
pip uninstall neumann-db
TypeScript SDK
npm uninstall neumann-db
Next Steps
- Quick Start - Run your first queries
- Building from Source - Development setup
Quick Start
Get up and running with Neumann in under 5 minutes. This guide walks you through relational queries, graph operations, vector search, and the cross-engine “wow” moment.
Start the Shell
# In-memory (data lost on exit)
neumann
# With persistence (recommended)
neumann --wal-dir ./data
You will see:
Neumann v0.1.0
Type 'help' for available commands.
neumann>
1. Relational Queries
Create a table and insert some data:
CREATE TABLE people (id INT PRIMARY KEY, name TEXT, role TEXT, team TEXT);
INSERT INTO people VALUES (1, 'Alice', 'Staff Engineer', 'Platform');
INSERT INTO people VALUES (2, 'Bob', 'Engineering Manager', 'Platform');
INSERT INTO people VALUES (3, 'Carol', 'Senior Engineer', 'ML');
INSERT INTO people VALUES (4, 'Dave', 'Junior Engineer', 'Platform');
Query it:
SELECT * FROM people WHERE team = 'Platform';
SELECT name, role FROM people ORDER BY name;
SELECT team, COUNT(*) AS headcount FROM people GROUP BY team;
2. Graph Operations
Create nodes with labels and properties:
NODE CREATE person { name: 'Alice', role: 'Staff Engineer' }
NODE CREATE person { name: 'Bob', role: 'Engineering Manager' }
NODE CREATE person { name: 'Carol', role: 'Senior Engineer' }
NODE CREATE person { name: 'Dave', role: 'Junior Engineer' }
List the nodes to see their auto-generated IDs:
NODE LIST person
Create edges (replace the IDs with the actual values from NODE LIST):
EDGE CREATE 'alice-node-id' -> 'bob-node-id' : reports_to
EDGE CREATE 'dave-node-id' -> 'bob-node-id' : reports_to
EDGE CREATE 'alice-node-id' -> 'dave-node-id' : mentors
Traverse the graph:
NEIGHBORS 'bob-node-id' INCOMING : reports_to
PATH SHORTEST 'dave-node-id' TO 'bob-node-id'
Run graph algorithms:
PAGERANK
3. Vector Search
Store embeddings with string keys:
EMBED STORE 'alice' [0.9, 0.4, 0.1, 0.7, 0.6, 0.3]
EMBED STORE 'bob' [0.6, 0.2, 0.1, 0.5, 0.3, 0.2]
EMBED STORE 'carol' [0.3, 0.9, 0.1, 0.4, 0.8, 0.1]
EMBED STORE 'dave' [0.4, 0.1, 0.2, 0.5, 0.2, 0.1]
Find similar items by key or by vector:
SIMILAR 'alice' LIMIT 3
SIMILAR [0.8, 0.5, 0.1, 0.6, 0.5, 0.2] LIMIT 3 METRIC COSINE
Check what is stored:
SHOW EMBEDDINGS
COUNT EMBEDDINGS
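As a mental model for what `SIMILAR … METRIC COSINE` does with the embeddings above, here is an illustrative Python sketch (not the engine's implementation) that ranks the stored vectors against a query vector by cosine similarity:

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# The same four embeddings stored with EMBED STORE above
embeddings = {
    'alice': [0.9, 0.4, 0.1, 0.7, 0.6, 0.3],
    'bob':   [0.6, 0.2, 0.1, 0.5, 0.3, 0.2],
    'carol': [0.3, 0.9, 0.1, 0.4, 0.8, 0.1],
    'dave':  [0.4, 0.1, 0.2, 0.5, 0.2, 0.1],
}

# The query vector from the SIMILAR example above
query = [0.8, 0.5, 0.1, 0.6, 0.5, 0.2]
ranked = sorted(embeddings, key=lambda k: cosine(query, embeddings[k]), reverse=True)
top3 = ranked[:3]  # -> ['alice', 'bob', 'dave']
```

The real index uses HNSW for approximate search rather than an exhaustive scan, but the ranking criterion is the same.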
4. The Cross-Engine Moment
This is where Neumann shines. Combine graph traversal with vector similarity in a single query:
SIMILAR 'alice' LIMIT 3 CONNECTED TO 'bob-node-id'
This finds embeddings similar to Alice’s that are also connected to Bob in the graph. No joins across separate databases needed.
Search across all engines with FIND:
FIND NODE person WHERE name = 'Alice'
Create unified entities that span relational, graph, and vector storage:
ENTITY CREATE 'project-x' { name: 'Project X', status: 'active' } EMBEDDING [0.5, 0.3, 0.7, 0.2, 0.4, 0.1]
ENTITY GET 'project-x'
5. Persistence
Save a checkpoint:
CHECKPOINT 'my-first-checkpoint'
CHECKPOINTS
If you started with --wal-dir, your data persists across restarts. You can also
save and load binary snapshots:
SAVE 'backup.bin'
LOAD 'backup.bin'
Next Steps
- Five-Minute Tutorial – Build a mini RAG system
- Use Cases – Real-world application patterns
- Query Language Reference – Full command list
- Architecture Overview – How it works under the hood
- Python SDK – Use Neumann from Python
- TypeScript SDK – Use Neumann from TypeScript
Sample Dataset
A ready-made dataset is available in samples/knowledge-base.nql. Load it with:
neumann --wal-dir ./data
neumann> \i samples/knowledge-base.nql
Five-Minute Tutorial: Build a Mini RAG System
This tutorial builds a retrieval-augmented generation (RAG) knowledge base using all three engines. By the end, you will have documents stored relationally, linked in a graph, searchable by embeddings, and protected with vault secrets.
Prerequisites
Start the shell with persistence:
neumann --wal-dir ./rag-data
Step 1: Create the Document Store
Use a relational table for structured document metadata:
CREATE TABLE documents (
id INT PRIMARY KEY,
title TEXT NOT NULL,
category TEXT,
author TEXT,
created TEXT
);
INSERT INTO documents VALUES (1, 'Intro to Neural Networks', 'ml', 'Alice', '2024-01-15');
INSERT INTO documents VALUES (2, 'Transformer Architecture', 'ml', 'Bob', '2024-02-20');
INSERT INTO documents VALUES (3, 'Database Indexing', 'systems', 'Carol', '2024-03-10');
INSERT INTO documents VALUES (4, 'Vector Search at Scale', 'systems', 'Alice', '2024-04-05');
INSERT INTO documents VALUES (5, 'Fine-Tuning LLMs', 'ml', 'Dave', '2024-05-12');
INSERT INTO documents VALUES (6, 'Consensus Protocols', 'distributed', 'Eve', '2024-06-01');
Verify:
SELECT * FROM documents ORDER BY id;
SELECT category, COUNT(*) FROM documents GROUP BY category;
Step 2: Add Graph Relationships
Create nodes for documents and topics, then link them:
NODE CREATE document { title: 'Intro to Neural Networks', doc_id: 1 }
NODE CREATE document { title: 'Transformer Architecture', doc_id: 2 }
NODE CREATE document { title: 'Database Indexing', doc_id: 3 }
NODE CREATE document { title: 'Vector Search at Scale', doc_id: 4 }
NODE CREATE document { title: 'Fine-Tuning LLMs', doc_id: 5 }
NODE CREATE document { title: 'Consensus Protocols', doc_id: 6 }
NODE CREATE topic { name: 'machine-learning' }
NODE CREATE topic { name: 'systems' }
NODE CREATE topic { name: 'distributed-systems' }
List nodes to get IDs:
NODE LIST document
NODE LIST topic
Create edges linking documents to topics (use actual IDs from NODE LIST):
-- Documents 1, 2, 5 cover machine-learning
-- Documents 3, 4 cover systems
-- Documents 4, 6 cover distributed-systems
-- Document 2 references document 1
-- Document 5 references documents 1 and 2
Create citation relationships between documents:
EDGE CREATE 'doc2-id' -> 'doc1-id' : cites
EDGE CREATE 'doc5-id' -> 'doc1-id' : cites
EDGE CREATE 'doc5-id' -> 'doc2-id' : cites
Check the graph:
NEIGHBORS 'doc1-id' INCOMING : cites
PATH SHORTEST 'doc5-id' TO 'doc1-id'
Step 3: Store Embeddings
Each document gets a vector representing its content. In a real system, you would generate these with an embedding model (e.g., OpenAI, Cohere, or a local model). Here we use hand-crafted 6-dimensional vectors:
-- [neural-nets, transformers, databases, vectors, llms, distributed]
EMBED STORE 'doc-1' [0.9, 0.3, 0.1, 0.2, 0.2, 0.1]
EMBED STORE 'doc-2' [0.7, 0.95, 0.1, 0.1, 0.4, 0.1]
EMBED STORE 'doc-3' [0.1, 0.1, 0.95, 0.3, 0.0, 0.2]
EMBED STORE 'doc-4' [0.2, 0.1, 0.5, 0.9, 0.1, 0.4]
EMBED STORE 'doc-5' [0.6, 0.5, 0.1, 0.1, 0.95, 0.1]
EMBED STORE 'doc-6' [0.1, 0.1, 0.3, 0.2, 0.1, 0.9]
Search by similarity:
-- Find documents similar to "Intro to Neural Networks"
SIMILAR 'doc-1' LIMIT 3
-- Search with a custom query vector (someone asking about transformers + LLMs)
SIMILAR [0.5, 0.8, 0.1, 0.1, 0.7, 0.1] LIMIT 3 METRIC COSINE
Step 4: Graph-Aware Semantic Search
Combine vector similarity with graph connectivity. Find documents similar to a query vector that are also connected to a specific topic node:
SIMILAR [0.8, 0.6, 0.1, 0.1, 0.5, 0.1] LIMIT 3 CONNECTED TO 'ml-topic-id'
This is the core RAG pattern: retrieve relevant documents using embedding similarity, then filter by graph relationships for context-aware results.
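Conceptually, the `SIMILAR … CONNECTED TO` query ranks by similarity and then keeps only graph-connected results. A minimal Python sketch of that pattern, using hypothetical in-memory stand-ins (`covers`, `graph_aware_search`) rather than Neumann's API:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Stand-in for the vector engine: the embeddings from Step 3
embeddings = {
    'doc-1': [0.9, 0.3, 0.1, 0.2, 0.2, 0.1],
    'doc-2': [0.7, 0.95, 0.1, 0.1, 0.4, 0.1],
    'doc-3': [0.1, 0.1, 0.95, 0.3, 0.0, 0.2],
    'doc-5': [0.6, 0.5, 0.1, 0.1, 0.95, 0.1],
}
# Stand-in for the graph engine: doc -> topic edges
covers = {'doc-1': 'ml', 'doc-2': 'ml', 'doc-3': 'systems', 'doc-5': 'ml'}

def graph_aware_search(query, topic, limit):
    # 1. Rank every embedding by similarity to the query vector
    scored = sorted(embeddings, key=lambda k: cosine(query, embeddings[k]), reverse=True)
    # 2. Keep only documents connected to the requested topic node
    return [doc for doc in scored if covers.get(doc) == topic][:limit]

results = graph_aware_search([0.8, 0.6, 0.1, 0.1, 0.5, 0.1], 'ml', 3)
```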
Step 5: Protect API Keys with Vault
Store the embedding API key securely:
VAULT SET 'openai_api_key' 'sk-proj-abc123...'
VAULT SET 'cohere_api_key' 'co-xyz789...'
VAULT GRANT 'alice' ON 'openai_api_key'
Retrieve when needed:
VAULT GET 'openai_api_key'
VAULT LIST
Step 6: Cache LLM Responses
Initialize the cache and store responses to avoid repeated API calls:
CACHE INIT
CACHE PUT 'what are transformers?' 'Transformers are a neural network architecture based on self-attention mechanisms...'
CACHE SEMANTIC PUT 'explain attention mechanism' 'The attention mechanism allows models to focus on relevant parts of the input...' EMBEDDING [0.6, 0.9, 0.1, 0.1, 0.3, 0.1]
On subsequent queries, check the cache first:
CACHE GET 'what are transformers?'
CACHE SEMANTIC GET 'how does attention work?' THRESHOLD 0.8
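The `THRESHOLD 0.8` clause means a cached answer is returned only if the stored query embedding is at least that similar to the new one. The logic, sketched in Python with a hypothetical `semantic_get` helper (not the cache's actual implementation):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Stand-in cache: (query embedding, cached response) pairs
cache = [
    ([0.6, 0.9, 0.1, 0.1, 0.3, 0.1],
     'The attention mechanism allows models to focus on relevant parts of the input...'),
]

def semantic_get(query_embedding, threshold):
    # Return the closest cached answer if it clears the threshold, else None (cache miss)
    best = max(cache, key=lambda entry: cosine(query_embedding, entry[0]))
    return best[1] if cosine(query_embedding, best[0]) >= threshold else None

# A near-duplicate question hits; an unrelated one misses
hit = semantic_get([0.55, 0.85, 0.1, 0.1, 0.35, 0.1], 0.8)
miss = semantic_get([0.0, 0.1, 0.9, 0.8, 0.0, 0.1], 0.8)
```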
Step 7: Checkpoint
Save your work:
CHECKPOINT 'rag-setup-complete'
CHECKPOINTS
What You Built
You now have a working RAG knowledge base with:
- Structured metadata in a relational table (searchable with SQL)
- Semantic relationships in a graph (topics, citations, authorship)
- Vector embeddings for similarity search
- Graph-aware retrieval combining similarity and structure
- Encrypted secrets for API key management
- Response caching to reduce LLM API costs
- Checkpoint for safe rollback
Next Steps
- Use Cases – More application patterns
- Query Language Reference – All available commands
- Python SDK – Build this in Python
- Architecture Overview – How the engines work together
Use Cases
Neumann is designed for applications that need two or more of: structured data, relationships, and semantic search. Here are concrete examples with query patterns.
RAG Application
Problem: Build a retrieval-augmented generation system that retrieves relevant context for an LLM.
Why Neumann: Traditional RAG uses a vector store for retrieval. Neumann adds graph relationships (document structure, citations, topics) and relational metadata (authors, dates, permissions) to improve retrieval quality.
Schema
CREATE TABLE documents (
id INT PRIMARY KEY,
title TEXT NOT NULL,
source TEXT,
created TEXT,
chunk_count INT
);
NODE CREATE collection { name: 'engineering-docs' }
-- For each document:
NODE CREATE document { title: 'Design Doc: Auth System', doc_id: 1 }
-- Link to collection:
EDGE CREATE 'doc-node-id' -> 'collection-id' : belongs_to
-- Link related documents:
EDGE CREATE 'doc-a-id' -> 'doc-b-id' : references
Indexing
-- Store chunk embeddings (one per document chunk)
EMBED STORE 'doc-1-chunk-0' [0.1, 0.2, ...]
EMBED STORE 'doc-1-chunk-1' [0.3, 0.1, ...]
Retrieval
-- Basic: find similar chunks
SIMILAR [0.2, 0.3, ...] LIMIT 10 METRIC COSINE
-- Graph-aware: find chunks connected to a specific collection
SIMILAR [0.2, 0.3, ...] LIMIT 10 CONNECTED TO 'collection-id'
-- Check cache before calling LLM
CACHE SEMANTIC GET 'how does auth work?' THRESHOLD 0.85
-- Store LLM response
CACHE SEMANTIC PUT 'how does auth work?' 'The auth system uses JWT tokens...' EMBEDDING [0.2, 0.3, ...]
Agent Memory
Problem: Give an AI agent persistent memory that supports both exact recall and semantic search, with conversation structure.
Why Neumann: Agents need to recall specific facts (relational), navigate conversation history (graph), and find semantically related memories (vector).
Schema
CREATE TABLE memories (
id INT PRIMARY KEY,
content TEXT NOT NULL,
memory_type TEXT,
importance FLOAT,
created TEXT
);
-- Create session nodes
NODE CREATE session { name: 'session-2024-01-15', user: 'alice' }
-- Create memory nodes
NODE CREATE memory { content: 'User prefers dark mode', type: 'preference' }
-- Link memories to sessions
EDGE CREATE 'memory-id' -> 'session-id' : observed_in
-- Link related memories
EDGE CREATE 'memory-a' -> 'memory-b' : related_to
Store a Memory
INSERT INTO memories VALUES (1, 'User prefers dark mode', 'preference', 0.8, '2024-01-15');
NODE CREATE memory { content: 'User prefers dark mode', memory_id: 1 }
EMBED STORE 'memory-1' [0.1, 0.3, ...]
Recall
-- Semantic recall: find memories similar to current context
SIMILAR [0.2, 0.3, ...] LIMIT 5
-- Structured recall: get high-importance memories
SELECT * FROM memories WHERE importance > 0.7 ORDER BY importance DESC LIMIT 10
-- Graph recall: get memories from a specific session
NEIGHBORS 'session-id' INCOMING : observed_in
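The three recall modes can also be blended into one ranking: score each memory by semantic similarity to the current context, weighted by its stored importance. An illustrative Python sketch (the weights, `recall`, and the tiny 2-d embeddings are all hypothetical):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Stand-ins: memory id -> importance, and memory id -> embedding
importance = {1: 0.8, 2: 0.3, 3: 0.95}
embeddings = {1: [0.1, 0.3], 2: [0.9, 0.1], 3: [0.2, 0.4]}

def recall(context, k=2, w_sim=0.7, w_imp=0.3):
    # Blend semantic similarity with stored importance for the final ranking
    def score(m):
        return w_sim * cosine(context, embeddings[m]) + w_imp * importance[m]
    return sorted(importance, key=score, reverse=True)[:k]

top = recall([0.1, 0.5], k=2)
```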
Knowledge Graph
Problem: Build a knowledge graph of entities with properties, relationships, and similarity search.
Why Neumann: Knowledge graphs need rich entity properties (relational), typed relationships (graph), and entity similarity (vector) in a single queryable system.
Build the Graph
-- Create entity types
NODE CREATE company { name: 'Acme Corp', industry: 'Technology', founded: 2010 }
NODE CREATE person { name: 'Jane Smith', title: 'CEO' }
NODE CREATE product { name: 'Acme Cloud', category: 'IaaS' }
-- Create relationships
EDGE CREATE 'jane-id' -> 'acme-id' : works_at { role: 'CEO', since: '2018' }
EDGE CREATE 'acme-id' -> 'product-id' : produces
Add Embeddings
-- Embed entity descriptions for similarity search
EMBED STORE 'acme-corp' [0.8, 0.3, 0.1, ...]
EMBED STORE 'jane-smith' [0.2, 0.7, 0.5, ...]
EMBED STORE 'acme-cloud' [0.9, 0.4, 0.2, ...]
Query
-- Find entities similar to a description
SIMILAR [0.7, 0.3, 0.2, ...] LIMIT 5
-- Discover connections
NEIGHBORS 'acme-id' BOTH
PATH SHORTEST 'jane-id' TO 'product-id'
-- Graph analytics
PAGERANK
LOUVAIN
BETWEENNESS
Access-Controlled Search
Problem: Build a search system where different users see different results based on permissions.
Why Neumann: Vault stores access tokens, graph models the permission hierarchy, and vector search handles the retrieval. No external auth system needed.
Setup Permissions
-- Store API keys securely
VAULT SET 'admin_key' 'ak-admin-secret'
VAULT SET 'user_key' 'ak-user-secret'
-- Grant access based on roles
VAULT GRANT 'alice' ON 'admin_key'
VAULT GRANT 'bob' ON 'user_key'
-- Build permission graph
NODE CREATE role { name: 'admin' }
NODE CREATE role { name: 'viewer' }
NODE CREATE resource { name: 'confidential-docs' }
EDGE CREATE 'admin-role-id' -> 'resource-id' : can_access
Query with Access Check
-- Find documents similar to query
SIMILAR [0.3, 0.5, ...] LIMIT 10
-- Verify access through graph
NEIGHBORS 'admin-role-id' OUTGOING : can_access
-- Rotate keys periodically
VAULT ROTATE 'admin_key' 'ak-new-admin-secret'
Common Patterns
Checkpoint Before Migrations
CHECKPOINT 'before-schema-v2'
-- Run migration...
-- If something goes wrong:
ROLLBACK TO 'before-schema-v2'
Blob Attachments
-- Upload a file and link it to an entity
BLOB INIT
BLOB PUT 'report.pdf' FROM '/tmp/report.pdf' TAG 'quarterly' LINK 'entity-id'
-- Find all blobs for an entity
BLOBS FOR 'entity-id'
-- Find blobs by tag
BLOBS BY TAG 'quarterly'
Chain for Audit Trail
-- Start a chain transaction for auditable operations
BEGIN CHAIN TRANSACTION
INSERT INTO audit_log VALUES (1, 'user_created', 'alice', '2024-01-15');
COMMIT CHAIN
-- Verify integrity
CHAIN VERIFY
CHAIN HISTORY 'audit_log/1'
Building from Source
Development Requirements
- Rust 1.75+ (stable)
- Rust nightly (for fuzzing)
- Git
Clone and Build
git clone https://github.com/Shadylukin/Neumann.git
cd Neumann
# Debug build
cargo build
# Release build (optimized)
cargo build --release
Running Tests
# Run all tests
cargo test
# Run tests for a specific crate
cargo test -p tensor_chain
# Run with output
cargo test -- --nocapture
Quality Checks
All code must pass before commit:
# Formatting
cargo fmt --check
# Lints (warnings as errors)
cargo clippy -- -D warnings
# Documentation builds
cargo doc --no-deps
Fuzzing
Requires nightly Rust:
# Install cargo-fuzz
cargo install cargo-fuzz
# Run a fuzz target
cd fuzz
cargo +nightly fuzz run parser_parse -- -max_total_time=60
Running the Shell
# Debug mode
cargo run -p neumann_shell
# Release mode
cargo run --release -p neumann_shell
IDE Setup
VS Code
Install the rust-analyzer extension. Recommended settings:
{
"rust-analyzer.checkOnSave.command": "clippy",
"rust-analyzer.cargo.features": "all"
}
IntelliJ/CLion
Install the Rust plugin. Enable clippy in settings.
Project Structure
Neumann/
├── tensor_store/ # Core storage layer
├── relational_engine/ # SQL-like tables
├── graph_engine/ # Graph operations
├── vector_engine/ # Embeddings
├── tensor_chain/ # Distributed consensus
├── neumann_parser/ # Query parsing
├── query_router/ # Query execution
├── neumann_shell/ # CLI interface
├── tensor_compress/ # Compression
├── tensor_vault/ # Encrypted storage
├── tensor_cache/ # LLM caching
├── tensor_blob/ # Blob storage
├── tensor_checkpoint/ # Snapshots
├── tensor_unified/ # Multi-engine facade
├── fuzz/ # Fuzz targets
└── docs/ # Documentation
Architecture Overview
Neumann is a unified tensor-based runtime that stores relational data, graph relationships, and vector embeddings in a single mathematical structure.
System Architecture
flowchart TB
subgraph Client Layer
Shell[neumann_shell]
end
subgraph Query Layer
Router[query_router]
Parser[neumann_parser]
end
subgraph Engine Layer
RE[relational_engine]
GE[graph_engine]
VE[vector_engine]
end
subgraph Storage Layer
TS[tensor_store]
TC[tensor_compress]
end
subgraph Extended Modules
Vault[tensor_vault]
Cache[tensor_cache]
Blob[tensor_blob]
Check[tensor_checkpoint]
Unified[tensor_unified]
Chain[tensor_chain]
end
Shell --> Router
Router --> Parser
Router --> RE
Router --> GE
Router --> VE
Router --> Vault
Router --> Cache
Router --> Blob
Router --> Chain
RE --> TS
GE --> TS
VE --> TS
Vault --> TS
Cache --> TS
Blob --> TS
Check --> TS
Unified --> RE
Unified --> GE
Unified --> VE
Chain --> TS
Chain --> GE
Chain --> Check
TS --> TC
Module Dependencies
| Module | Purpose | Depends On |
|---|---|---|
| tensor_store | Key-value storage layer | tensor_compress |
| relational_engine | SQL-like tables with indexes | tensor_store |
| graph_engine | Graph nodes and edges | tensor_store |
| vector_engine | Embeddings and similarity search | tensor_store |
| tensor_compress | Compression algorithms | — |
| tensor_vault | Encrypted secret storage | tensor_store, graph_engine |
| tensor_cache | Semantic LLM response caching | tensor_store |
| tensor_blob | S3-style chunked blob storage | tensor_store |
| tensor_checkpoint | Atomic snapshot/restore | tensor_store |
| tensor_unified | Multi-engine unified storage | all engines |
| tensor_chain | Tensor-native blockchain | tensor_store, graph_engine, tensor_checkpoint |
| neumann_parser | Query tokenization and parsing | — |
| query_router | Unified query execution | all engines, parser |
| neumann_shell | Interactive CLI interface | query_router |
Key Design Principles
Unified Data Model
All data is represented as tensors:
- Scalars: Single values (int, float, string, bool)
- Vectors: Dense or sparse embeddings
- Pointers: References to other entities
Thread Safety
All engines use DashMap for concurrent access:
- Sharded locks for write throughput
- No lock poisoning
- Read operations are lock-free
Composability
Engines can be composed:
- Use relational_engine alone for SQL workloads
- Combine with graph_engine for relationship queries
- Add vector_engine for similarity search
Data Flow
- Query Parsing: neumann_parser tokenizes and parses input
- Query Routing: query_router dispatches to appropriate engine
- Execution: Engine performs operation using tensor_store
- Storage: tensor_store persists data with optional compression
Distributed Architecture (tensor_chain)
For distributed deployments:
flowchart LR
subgraph Cluster
L[Leader]
F1[Follower 1]
F2[Follower 2]
end
C[Client] --> L
L --> F1
L --> F2
F1 -.-> L
F2 -.-> L
- Raft Consensus: Leader election and log replication
- 2PC Transactions: Cross-shard atomic operations
- SWIM Gossip: Membership and failure detection
Tensor Store Architecture
The tensor_store crate is the foundational storage layer for Neumann. It provides a unified tensor-based key-value store that holds all data - relational, graph, and vector - in a single mathematical structure. The store knows nothing about queries; it purely stores and retrieves tensors by key.
The architecture uses SlabRouter internally, which routes operations to specialized slabs based on key prefixes. This design eliminates hash table resize stalls by using BTreeMap-based storage, providing predictable O(log n) performance without the throughput cliffs caused by hash map resizing.
Core Types
TensorValue
Represents different types of values a tensor can hold.
| Variant | Rust Type | Use Case |
|---|---|---|
| Scalar(ScalarValue) | enum | Properties (name, age, active) |
| Vector(Vec&lt;f32&gt;) | dense array | Embeddings for similarity search |
| Sparse(SparseVector) | compressed | Sparse embeddings (>70% zeros) |
| Pointer(String) | single ref | Single relationship to another tensor |
| Pointers(Vec&lt;String&gt;) | multi ref | Multiple relationships |
Automatic Sparsification: Use TensorValue::from_embedding_auto(dense) to
automatically choose between dense and sparse representation based on sparsity:
// Automatically uses Sparse if sparsity >= 70%
let val = TensorValue::from_embedding_auto(dense_vec);

// With custom thresholds (value_threshold, sparsity_threshold)
let val = TensorValue::from_embedding(dense_vec, 0.01, 0.8);
Vector Operations: TensorValue supports cross-format operations:
// Dot product works across Dense, Sparse, and mixed
let dot = tensor_a.dot(&tensor_b);

// Cosine similarity with automatic format handling
let sim = tensor_a.cosine_similarity(&tensor_b);
ScalarValue
| Variant | Rust Type | Example |
|---|---|---|
| Null | — | Missing/undefined value |
| Bool | bool | true, false |
| Int | i64 | 42, -1 |
| Float | f64 | 3.14159 |
| String | String | "Alice" |
| Bytes | Vec&lt;u8&gt; | Raw binary data |
TensorData
An entity that holds scalar properties, vector embeddings, and pointers to other
tensors via a HashMap<String, TensorValue> internally.
Reserved Field Names
| Field | Purpose | Used By |
|---|---|---|
_out | Outgoing graph edge pointers | GraphEngine |
_in | Incoming graph edge pointers | GraphEngine |
_embedding | Vector embedding | VectorEngine |
_label | Entity type/label | GraphEngine |
_type | Discriminator field | All engines |
_from | Edge source | GraphEngine |
_to | Edge target | GraphEngine |
_edge_type | Edge relationship type | GraphEngine |
_directed | Edge direction flag | GraphEngine |
_table | Table membership | RelationalEngine |
_id | Entity ID | System |
Architecture Diagram
TensorStore
|
+-- Arc<SlabRouter>
|
+-- MetadataSlab (general key-value, BTreeMap-based)
+-- EntityIndex (sorted vocabulary + hash index)
+-- EmbeddingSlab (dense f32 arrays)
+-- GraphTensor (CSR format for edges)
+-- RelationalSlab (columnar storage)
+-- CacheRing (LRU/LFU eviction)
+-- BlobLog (append-only blob storage)
SlabRouter Internals
SlabRouter is the core routing layer that directs operations to specialized storage backends based on key prefixes.
Key Routing Algorithm
flowchart TD
A[put/get/delete key] --> B{Classify Key}
B -->|emb:*| C[EmbeddingSlab + MetadataSlab]
B -->|node:* / edge:*| D[GraphTensor via MetadataSlab]
B -->|table:*| E[RelationalSlab via MetadataSlab]
B -->|_cache:*| F[CacheRing]
B -->|Everything else| G[MetadataSlab]
Key Classification
| Prefix | KeyClass | Slab | Purpose |
|---|---|---|---|
| emb:* | Embedding | EmbeddingSlab + EntityIndex | Embedding vectors with stable ID assignment |
| node:*, edge:* | Graph | MetadataSlab | Graph nodes and edges |
| table:* | Table | MetadataSlab | Relational rows |
| _cache:* | Cache | CacheRing | Cached data with eviction |
| _blob:* | Metadata | MetadataSlab | Blob metadata (chunks stored separately) |
| Everything else | Metadata | MetadataSlab | General key-value storage |
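The routing rules above reduce to a first-match prefix test. A Python sketch of the classification step (`classify_key` here is illustrative; the real implementation lives in Rust inside SlabRouter):

```python
def classify_key(key: str) -> str:
    # Mirror of the prefix rules in the table above (first match wins)
    if key.startswith('emb:'):
        return 'Embedding'
    if key.startswith(('node:', 'edge:')):
        return 'Graph'
    if key.startswith('table:'):
        return 'Table'
    if key.startswith('_cache:'):
        return 'Cache'
    # Includes _blob:* metadata and everything else
    return 'Metadata'
```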
SlabRouter Operation Flow
PUT Operation:
fn put(&self, key: &str, value: TensorData) {
    match classify_key(key) {
        KeyClass::Embedding => {
            // 1. Get or create stable entity ID
            let entity_id = self.index.get_or_create(key);
            // 2. Extract and store embedding vector
            if let Some(TensorValue::Vector(vec)) = value.get("_embedding") {
                self.embeddings.set(entity_id, vec);
            }
            // 3. Store full metadata
            self.metadata.set(key, value);
        }
        KeyClass::Cache => {
            let size = estimate_size(&value);
            self.cache.put(key, value, 1.0, size);
        }
        _ => self.metadata.set(key, value),
    }
}
GET Operation:
fn get(&self, key: &str) -> Result<TensorData> {
    match classify_key(key) {
        KeyClass::Embedding => {
            // Try to reconstruct from embedding slab + metadata
            if let Some(entity_id) = self.index.get(key) {
                if let Some(vector) = self.embeddings.get(entity_id) {
                    let mut data = self.metadata.get(key).unwrap_or_default();
                    data.set("_embedding", TensorValue::Vector(vector));
                    return Ok(data);
                }
            }
            self.metadata.get(key)
        }
        KeyClass::Cache => self.cache.get(key),
        _ => self.metadata.get(key),
    }
}
Specialized Slabs
| Slab | Data Structure | Purpose |
|---|---|---|
| MetadataSlab | RwLock&lt;BTreeMap&lt;String, TensorData&gt;&gt; | General key-value storage |
| EntityIndex | Sorted vocabulary + hash index | Stable ID assignment |
| EmbeddingSlab | Dense f32 arrays + BTreeMap | Embedding vectors |
| GraphTensor | CSR format (row pointers + column indices) | Graph edges |
| RelationalSlab | Columnar storage | Table rows |
| CacheRing | Ring buffer with LRU/LFU | Fixed-size cache |
| BlobLog | Append-only segments | Large binary data |
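GraphTensor's CSR (compressed sparse row) layout packs all edges into two flat arrays, so a node's neighbors are a contiguous slice. A minimal sketch with hypothetical data (not the crate's API):

```python
# CSR adjacency for 3 nodes:
#   row_ptr[i]..row_ptr[i+1] is the span of node i's outgoing edges in col_idx
row_ptr = [0, 2, 3, 3]   # node 0 has 2 edges, node 1 has 1, node 2 has none
col_idx = [1, 2, 2]      # node 0 -> {1, 2}, node 1 -> {2}

def neighbors(node: int) -> list[int]:
    # Neighbor lookup is a contiguous slice: O(1) to locate, O(degree) to read
    return col_idx[row_ptr[node]:row_ptr[node + 1]]
```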
Performance Characteristics
Operation Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
| put | O(log n) | BTreeMap insert |
| get | O(log n) + clone | Clone prevents reference issues |
| delete | O(log n) | BTreeMap remove |
| exists | O(log n) | BTreeMap lookup |
| scan | O(k + log n) | BTreeMap range, k = result count |
| scan_count | O(k + log n) | No allocation |
| scan_filter_map | O(k + log n) | Single-pass filter with selective cloning |
| len | O(1) | Cached count |
| clear | O(n) | Clears all data |
Throughput Comparison
| Metric | SlabRouter | Previous (DashMap) |
|---|---|---|
| PUT throughput | 3.1+ M ops/sec | 2.5 M ops/sec |
| GET throughput | 4.9+ M ops/sec | 4.5 M ops/sec |
| Throughput variance (CV) | 12% steady-state | 222% during resize |
| Resize stalls | None | 99.6% throughput drops |
Optimized Scan Performance
Use scan_filter_map for selective queries to avoid cloning non-matching
entries:
// Old path: 5000 clones for 5000 rows, ~2.6ms
let users = store.scan("users:");
let matches: Vec<_> = users.iter()
    .filter_map(|key| store.get(key).ok())
    .filter(|data| /* condition */)
    .collect();

// New path: 250 clones for 5% match rate, ~0.13ms (20x faster)
let matches = store.scan_filter_map("users:", |key, data| {
    if /* condition */ { Some(data.clone()) } else { None }
});
Concurrency Model
TensorStore uses tensor-based structures instead of hash maps for predictable performance:
- No Resize Stalls: BTreeMap and sorted arrays grow incrementally
- Lock-free Reads: RwLock allows many concurrent readers
- Predictable Writes: O(log n) inserts, no amortized O(n) resizing
- Clone on Read: get() returns cloned data to avoid holding references
- Shareable Storage: TensorStore clones share the same underlying data via Arc
BloomFilter
The BloomFilter provides O(1) probabilistic rejection of non-existent keys, useful for sparse key spaces where most lookups are misses.
Mathematical Foundation
The Bloom filter uses optimal parameters calculated as:
- Bit array size: m = -n * ln(p) / ln(2)^2, where n = expected items and p = target false positive rate
- Number of hash functions: k = (m / n) * ln(2), clamped to the range 1 to 16
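As a sketch, the two formulas can be evaluated directly. Rounding up reproduces the figures in the tuning table below; `optimal_params` is an illustrative helper, not the crate's API:

```rust
/// Illustrative helper (not the crate's API): optimal Bloom filter
/// parameters for n expected items and false-positive rate p.
fn optimal_params(n: usize, p: f64) -> (usize, usize) {
    let ln2 = std::f64::consts::LN_2;
    // m = -n * ln(p) / ln(2)^2
    let m = (-(n as f64) * p.ln() / (ln2 * ln2)).ceil() as usize;
    // k = (m / n) * ln(2), clamped to 1..=16
    let k = (((m as f64) / (n as f64)) * ln2).ceil() as usize;
    (m, k.clamp(1, 16))
}

fn main() {
    // Matches the "10,000 items, 1% FP" row of the tuning table.
    let (m, k) = optimal_params(10_000, 0.01);
    assert_eq!((m, k), (95_851, 7));
    println!("m = {m} bits (~{} KB), k = {k}", m / 8 / 1024);
}
```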
Implementation Details
```rust
pub struct BloomFilter {
    bits: Box<[AtomicU64]>, // Atomic u64 blocks for lock-free access
    num_bits: usize,
    num_hashes: usize,
}
```
Hash Function: Uses SipHash with different seeds for each hash function:
```rust
fn hash_index<K: Hash>(&self, key: &K, seed: usize) -> usize {
    let mut hasher = SipHasher::new_with_seed(seed as u64);
    key.hash(&mut hasher);
    (hasher.finish() as usize) % self.num_bits
}
```
Parameter Tuning Guide
| Expected Items | FP Rate | Bits | Hash Functions | Memory |
|---|---|---|---|---|
| 10,000 | 1% | 95,851 | 7 | ~12 KB |
| 10,000 | 0.1% | 143,776 | 10 | ~18 KB |
| 100,000 | 1% | 958,506 | 7 | ~117 KB |
| 1,000,000 | 1% | 9,585,059 | 7 | ~1.2 MB |
Gotchas:
- Bloom filter state is not persisted in snapshots; rebuild after load
- Thread-safe via AtomicU64 with Relaxed ordering (eventual consistency)
- Cannot remove items (use counting bloom filter for that case)
- False positive rate increases if more items than expected are inserted
HNSW Index
Hierarchical Navigable Small World index for approximate nearest neighbor search with O(log n) complexity.
Algorithm Overview
```mermaid
flowchart TD
    subgraph "HNSW Structure"
        L3[Layer 3: Entry Point] --> L2[Layer 2: Skip connections]
        L2 --> L1[Layer 1: More connections]
        L1 --> L0[Layer 0: All nodes, dense connections]
    end
    subgraph "Search Algorithm"
        S1[Start at entry point, top layer] --> S2[Greedy descent to layer 1]
        S2 --> S3[At layer 0: ef-search candidates]
        S3 --> S4[Return top-k results]
    end
```
Layer Selection
New nodes are assigned layers using exponential distribution:
```rust
fn random_level(&self) -> usize {
    let f = random_float_0_1();
    let level = (-f.ln() * self.config.ml).floor() as usize;
    level.min(32) // Cap at 32 layers
}
```
Where ml = 1 / ln(m) and m = connections per layer.
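With ml = 1 / ln(m), the probability that a node reaches layer l or higher is m^(-l), so for m = 16 roughly 1 in 16 nodes rises above layer 0. A standalone sketch of the level function (the uniform draw is passed in as a parameter here; the real implementation draws internally):

```rust
/// Sketch of HNSW layer assignment: u is a uniform draw in (0, 1],
/// ml = 1 / ln(m). P(level >= l) = m^(-l).
fn random_level(u: f64, ml: f64) -> usize {
    (((-u.ln()) * ml).floor() as usize).min(32) // Cap at 32 layers
}

fn main() {
    let ml = 1.0 / (16.0f64).ln(); // ~0.3607 for m = 16
    // Most draws land on layer 0; only u <= 1/16 reaches layer 1.
    assert_eq!(random_level(0.5, ml), 0);
    assert_eq!(random_level(0.05, ml), 1);
    assert_eq!(random_level(1.0, ml), 0); // -ln(1) = 0
}
```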
HNSWConfig Parameters
| Parameter | Default | Description |
|---|---|---|
| m | 16 | Max connections per node per layer |
| m0 | 32 | Max connections at layer 0 (2*m) |
| ef_construction | 200 | Candidates during construction |
| ef_search | 50 | Candidates during search |
| ml | 1/ln(m) | Level multiplier |
| sparsity_threshold | 0.5 | Auto-sparse storage threshold |
| max_nodes | 10,000,000 | Capacity limit (prevents memory exhaustion) |
Configuration Presets
```rust
// High recall (slower, more accurate)
HNSWConfig::high_recall() // m=32, m0=64, ef_construction=400, ef_search=200

// High speed (faster, lower recall)
HNSWConfig::high_speed()  // m=8, m0=16, ef_construction=100, ef_search=20

// Custom configuration
HNSWConfig {
    m: 24,
    m0: 48,
    ef_construction: 300,
    ef_search: 100,
    ..Default::default()
}
```
SIMD-Accelerated Distance
Dense vector operations use 8-wide SIMD (f32x8):
```rust
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    let chunks = a.len() / 8;
    let mut sum = f32x8::ZERO;
    for i in 0..chunks {
        let offset = i * 8;
        let va = f32x8::from(&a[offset..offset + 8]);
        let vb = f32x8::from(&b[offset..offset + 8]);
        sum += va * vb;
    }
    // Sum lanes and handle remainder
    let arr: [f32; 8] = sum.into();
    let mut result: f32 = arr.iter().sum();
    // ... scalar remainder handling
}
```
Neighbor Compression
HNSW neighbor lists use delta-varint encoding for 3-8x compression:
```rust
struct CompressedNeighbors {
    compressed: Vec<u8>, // Delta-varint encoded neighbor IDs
}

// Decompression: O(n) where n = neighbor count
fn get(&self) -> Vec<usize> {
    decompress_ids(&self.compressed)
}

// Compression: sort + delta encode
fn set(&mut self, ids: &[usize]) {
    let mut sorted = ids.to_vec();
    sorted.sort_unstable();
    self.compressed = compress_ids(&sorted);
}
```
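The compression works because neighbor IDs in a sorted list have small gaps, and small integers fit in one varint byte. A minimal standalone sketch of `compress_ids` / `decompress_ids` (illustrative; the crate's exact wire format may differ):

```rust
/// Delta-varint sketch: sort IDs, delta-encode, LEB128-varint each delta.
fn compress_ids(sorted: &[usize]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0usize;
    for &id in sorted {
        let mut delta = id - prev; // gaps are small for clustered neighbors
        prev = id;
        loop {
            let byte = (delta & 0x7F) as u8;
            delta >>= 7;
            if delta == 0 {
                out.push(byte);
                break;
            }
            out.push(byte | 0x80); // continuation bit: more bytes follow
        }
    }
    out
}

fn decompress_ids(bytes: &[u8]) -> Vec<usize> {
    let mut ids = Vec::new();
    let (mut acc, mut shift, mut prev) = (0usize, 0u32, 0usize);
    for &b in bytes {
        acc |= ((b & 0x7F) as usize) << shift;
        if b & 0x80 != 0 {
            shift += 7; // continuation: accumulate next 7 bits
        } else {
            prev += acc; // undo delta encoding
            ids.push(prev);
            acc = 0;
            shift = 0;
        }
    }
    ids
}

fn main() {
    let ids = vec![100, 103, 110, 260, 1_000_000];
    let packed = compress_ids(&ids);
    assert_eq!(decompress_ids(&packed), ids);
    // 5 IDs pack into far fewer bytes than 5 * 8 raw.
    assert!(packed.len() < ids.len() * 8);
}
```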
Storage Types
```mermaid
flowchart LR
    subgraph "EmbeddingStorage"
        D[Dense: Vec f32]
        S[Sparse: SparseVector]
        DV[Delta: DeltaVector]
        TT[TensorTrain: TTVectorCached]
    end
    D --> |"sparsity > 50%"| S
    D --> |"clusters around archetype"| DV
    D --> |"high-dim 768+"| TT
```
| Storage Type | Memory | Use Case | Distance Computation |
|---|---|---|---|
| Dense | 4 bytes/dim | General purpose | SIMD dot product |
| Sparse | 6 bytes/nnz | >50% zeros | Sparse-sparse O(nnz) |
| Delta | 6 bytes/diff | Clustered embeddings | Via archetype |
| TensorTrain | 8-10x compression | 768+ dimensions | Native TT or reconstruct |
Edge Cases and Gotchas
- Delta vectors cannot be inserted directly - they require the archetype registry for distance computation. Convert to Dense first.
- TensorTrain storage - while stored in TT format, HNSW reconstructs to dense for fast distance computation during search (native TT distance is O(r^4) per comparison).
- Capacity limits - the default max_nodes=10M prevents memory exhaustion from fuzzing/adversarial input. Use try_insert for graceful handling.
- Empty index - the entry point is usize::MAX when empty; search returns empty results.
SparseVector
Memory-efficient storage for vectors with many zeros, based on the philosophy that “zero represents absence of information, not a stored value.”
Internal Structure
```rust
pub struct SparseVector {
    dimension: usize,    // Total dimension (shell/boundary)
    positions: Vec<u32>, // Sorted positions of non-zero values
    values: Vec<f32>,    // Corresponding values
}
```
Operation Complexity
| Operation | Complexity | Notes |
|---|---|---|
| from_dense | O(n) | Filters zeros |
| to_dense | O(n) | Reconstructs full vector |
| get(index) | O(log nnz) | Binary search |
| set(index, value) | O(nnz) | Insert/remove maintains sort |
| dot(sparse) | O(min(nnz_a, nnz_b)) | Merge-join on positions |
| dot_dense(dense) | O(nnz) | Only accesses stored positions |
| add(sparse) | O(nnz_a + nnz_b) | Merge-based |
| cosine_similarity | O(nnz) | Uses cached magnitudes |
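The sparse-sparse dot product is a merge-join over the two sorted position lists: positions present in only one vector contribute zero and are skipped. A plain two-pointer walk (shown below, standalone) is O(nnz_a + nnz_b); the tighter O(min(nnz_a, nnz_b)) bound in the table suggests the implementation additionally skips runs via galloping/binary search, which is omitted here for clarity:

```rust
/// Merge-join sparse dot product over two (sorted positions, values) pairs.
fn sparse_dot(pos_a: &[u32], val_a: &[f32], pos_b: &[u32], val_b: &[f32]) -> f32 {
    let (mut i, mut j, mut sum) = (0usize, 0usize, 0.0f32);
    while i < pos_a.len() && j < pos_b.len() {
        match pos_a[i].cmp(&pos_b[j]) {
            std::cmp::Ordering::Less => i += 1,    // position only in a: contributes 0
            std::cmp::Ordering::Greater => j += 1, // position only in b: contributes 0
            std::cmp::Ordering::Equal => {
                sum += val_a[i] * val_b[j];        // overlapping position: multiply
                i += 1;
                j += 1;
            }
        }
    }
    sum
}

fn main() {
    // a = {0: 1.0, 3: 2.0, 7: 4.0}, b = {3: 0.5, 7: 1.0, 9: 3.0}
    let dot = sparse_dot(&[0, 3, 7], &[1.0, 2.0, 4.0], &[3, 7, 9], &[0.5, 1.0, 3.0]);
    assert_eq!(dot, 5.0); // only positions 3 and 7 overlap: 2.0*0.5 + 4.0*1.0
}
```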
Sparse Arithmetic Operations
```rust
// Create delta from before/after states (only stores differences)
let delta = SparseVector::from_diff(&before, &after, threshold);

// Subtraction: self - other
let diff = a.sub(&b);

// Weighted average: (w1 * a + w2 * b) / (w1 + w2)
let merged = a.weighted_average(&b, 0.7, 0.3);

// Project out conflicting component
let orthogonal = v.project_orthogonal(&conflict_direction);
```
Distance Metrics
| Metric | Range | Use Case |
|---|---|---|
| cosine_similarity | -1 to 1 | Directional similarity |
| angular_distance | 0 to PI | Linear for small angles |
| geodesic_distance | 0 to PI | Arc length on unit sphere |
| jaccard_index | 0 to 1 | Structural overlap (positions) |
| overlap_coefficient | 0 to 1 | Subset containment |
| weighted_jaccard | 0 to 1 | Value-weighted structural overlap |
| euclidean_distance | 0 to inf | L2 norm of difference |
| manhattan_distance | 0 to inf | L1 norm of difference |
Security: NaN/Inf Sanitization
All similarity metrics sanitize results to prevent consensus ordering issues:
```rust
pub fn cosine_similarity(&self, other: &SparseVector) -> f32 {
    // ... computation ...

    // SECURITY: Sanitize result to valid range
    if result.is_nan() || result.is_infinite() {
        0.0
    } else {
        result.clamp(-1.0, 1.0)
    }
}
```
Memory Efficiency
```rust
let sparse = SparseVector::from_dense(&dense_vec);

// Metrics
sparse.sparsity()           // Fraction of zeros (0.0 - 1.0)
sparse.memory_bytes()       // Actual memory used
sparse.dense_memory_bytes() // Memory if stored dense
sparse.compression_ratio()  // Dense / Sparse ratio
```
For a 1000-dim vector with 90% zeros:
- Dense: 4000 bytes
- Sparse: ~800 bytes (100 positions × 4 bytes + 100 values × 4 bytes)
- Compression ratio: 5x
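The figures above follow directly from the struct layout (u32 position + f32 value per stored entry, ignoring the Vec headers):

```rust
/// Back-of-envelope check of the sparse memory figures above.
fn main() {
    let dim = 1000usize;
    let nnz = 100usize; // 90% zeros -> 100 stored entries
    let dense_bytes = dim * 4;        // 4 bytes per f32
    let sparse_bytes = nnz * (4 + 4); // u32 position + f32 value
    assert_eq!(dense_bytes, 4000);
    assert_eq!(sparse_bytes, 800);
    assert_eq!(dense_bytes / sparse_bytes, 5); // 5x compression
}
```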
Delta Vectors and Archetype Registry
Delta encoding stores vectors as differences from reference “archetype” vectors, providing significant compression for clustered embeddings.
Concept
```mermaid
flowchart LR
    subgraph "Delta Encoding"
        A[Archetype Vector] --> |"+ Delta"| R[Reconstructed Vector]
        D[Delta: positions + values] --> R
    end
```
When many embeddings cluster around common patterns:
- Identify archetype vectors (cluster centroids via k-means)
- Store each embedding as: archetype_id + sparse_delta
- Reconstruct on demand: archetype + delta = original
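The reconstruction step above can be sketched in a few lines: start from the archetype and apply only the stored differences. This is a standalone illustration; the crate's DeltaVector holds the same information with u16 positions and a cached magnitude:

```rust
/// Sketch of delta reconstruction: original = archetype + sparse delta.
fn reconstruct(archetype: &[f32], positions: &[u16], deltas: &[f32]) -> Vec<f32> {
    let mut v = archetype.to_vec(); // start from the cluster centroid
    for (&p, &d) in positions.iter().zip(deltas) {
        v[p as usize] += d; // apply only the stored differences
    }
    v
}

fn main() {
    let archetype = vec![1.0, 2.0, 3.0, 4.0];
    // The original differed from the archetype at positions 1 and 3 only.
    let (positions, deltas) = (vec![1u16, 3u16], vec![0.5f32, -1.0f32]);
    assert_eq!(reconstruct(&archetype, &positions, &deltas), vec![1.0, 2.5, 3.0, 3.0]);
}
```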
DeltaVector Structure
```rust
pub struct DeltaVector {
    archetype_id: usize,           // Reference archetype
    dimension: usize,              // For reconstruction
    positions: Vec<u16>,           // Diff positions (u16 for memory)
    deltas: Vec<f32>,              // Delta values
    cached_magnitude: Option<f32>, // For fast cosine similarity
}
```
Optimized Dot Products
```rust
// With precomputed archetype dot query
// Total: O(nnz) instead of O(dimension)
let result = delta.dot_dense_with_precomputed(query, archetype_dot_query);

// Between two deltas from the SAME archetype
// dot(A, B) = dot(R, R) + dot(R, delta_b) + dot(delta_a, R) + dot(delta_a, delta_b)
let result = a.dot_same_archetype(&b, archetype, archetype_magnitude_sq);
```
ArchetypeRegistry
```rust
// Create registry with max 16 archetypes
let mut registry = ArchetypeRegistry::new(16);

// Discover archetypes via k-means clustering
let config = KMeansConfig {
    max_iterations: 100,
    convergence_threshold: 1e-4,
    seed: 42,
    init_method: KMeansInit::KMeansPlusPlus, // Better but slower
};
registry.discover_archetypes(&embeddings, 5, config);

// Encode vectors as deltas
let delta = registry.encode(&vector, threshold)?;

// Analyze coverage
let stats = registry.analyze_coverage(&vectors, 0.01);
// stats.avg_similarity, stats.avg_compression_ratio, stats.archetype_usage
```
Persistence
```rust
// Save to TensorStore
registry.save_to_store(&store)?;

// Load from TensorStore
let registry = ArchetypeRegistry::load_from_store(&store, 16)?;
```
Tiered Storage
Two-tier storage with hot (in-memory) and cold (mmap) layers for memory-efficient storage of large datasets.
Architecture
```mermaid
flowchart TD
    subgraph "Hot Tier (In-Memory)"
        H[MetadataSlab]
        I[ShardAccessTracker]
    end
    subgraph "Cold Tier (Mmap)"
        C[MmapStoreMut]
        CK[cold_keys HashSet]
    end
    GET --> H
    H -->|miss| CK
    CK -->|found| C
    C -->|promote| H
    PUT --> H
    H -->|migrate_cold| C
```
TieredConfig
| Field | Type | Default | Description |
|---|---|---|---|
| cold_dir | PathBuf | /tmp/tensor_cold | Directory for cold storage files |
| cold_capacity | usize | 64MB | Initial cold file size |
| sample_rate | u32 | 100 | Access tracking sampling (100 = 1%) |
Migration Algorithm
```rust
pub fn migrate_cold(&mut self, threshold_ms: u64) -> Result<usize> {
    // 1. Find shards not accessed within threshold
    let cold_shards = self.instrumentation.cold_shards(threshold_ms);

    // 2. Collect keys belonging to cold shards
    let keys_to_migrate: Vec<String> = self.hot.scan("")
        .filter(|(key, _)| {
            let shard = shard_for_key(key);
            cold_shards.contains(&shard)
        })
        .map(|(key, _)| key)
        .collect();

    // 3. Move to cold storage
    for key in keys_to_migrate {
        cold.insert(&key, &tensor)?;
        self.cold_keys.insert(key.clone());
        self.hot.delete(&key);
    }
    cold.flush()?;
}
```
Automatic Promotion
When cold data is accessed, it’s automatically promoted back to hot:
```rust
pub fn get(&mut self, key: &str) -> Result<TensorData> {
    // Try hot first
    if let Some(data) = self.hot.get(key) {
        return Ok(data);
    }

    // Try cold
    if self.cold_keys.contains(key) {
        let tensor = self.cold.get(key)?;

        // Promote to hot
        self.hot.set(key, tensor.clone());
        self.cold_keys.remove(key);
        self.migrations_to_hot.fetch_add(1, Ordering::Relaxed);
        return Ok(tensor);
    }

    Err(TensorStoreError::NotFound(key))
}
```
Statistics
```rust
let stats = store.stats();
// stats.hot_count, stats.cold_count
// stats.hot_lookups, stats.cold_lookups, stats.cold_hits
// stats.migrations_to_cold, stats.migrations_to_hot
```
Access Instrumentation
Low-overhead tracking of shard access patterns for intelligent memory tiering.
ShardAccessTracker
```rust
pub struct ShardAccessTracker {
    shards: Box<[ShardStats]>, // Per-shard counters
    shard_count: usize,        // Default: 16
    start_time: Instant,       // For last_access timestamps
    sample_rate: u32,          // 1 = every access, 100 = 1% sampling
    sample_counter: AtomicU64, // For sampling
}

// Sampling logic
fn should_sample(&self) -> bool {
    if self.sample_rate == 1 {
        return true;
    }
    self.sample_counter.fetch_add(1, Relaxed).is_multiple_of(self.sample_rate)
}
```
Hot/Cold Detection
```rust
// Get shards sorted by access count (hottest first)
let hot = tracker.hot_shards(5); // Top 5 hottest

// Get shards not accessed within threshold
let cold = tracker.cold_shards(30_000); // Not accessed in 30s
```
HNSW Access Stats
Specialized instrumentation for HNSW index:
```rust
pub struct HNSWAccessStats {
    entry_point_accesses: AtomicU64,
    layer0_traversals: AtomicU64,
    upper_layer_traversals: AtomicU64,
    total_searches: AtomicU64,
    distance_calculations: AtomicU64,
}

// Snapshot metrics
let stats = hnsw.access_stats()?;
stats.layer0_ratio()           // Layer 0 work fraction
stats.avg_distances_per_search // Distance calcs per search
stats.searches_per_second()    // Throughput
```
Configuration Options
SlabRouterConfig
| Field | Type | Default | Description |
|---|---|---|---|
| embedding_dim | usize | 384 | Embedding dimension for EmbeddingSlab |
| cache_capacity | usize | 10,000 | Cache capacity for CacheRing |
| cache_strategy | EvictionStrategy | Default | Eviction strategy (LRU/LFU) |
| blob_segment_size | usize | 64MB | Segment size for BlobLog |
| graph_merge_threshold | usize | 10,000 | Merge threshold for GraphTensor |
Usage Examples
Basic Operations
```rust
let store = TensorStore::new();

// Store a tensor
let mut user = TensorData::new();
user.set("name", TensorValue::Scalar(ScalarValue::String("Alice".into())));
user.set("age", TensorValue::Scalar(ScalarValue::Int(30)));
user.set("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3, 0.4]));
store.put("user:1", user)?;

// Retrieve
let data = store.get("user:1")?;

// Scan by prefix
let user_keys = store.scan("user:");
let count = store.scan_count("user:");
```
With Bloom Filter
```rust
// Fast rejection of non-existent keys
let store = TensorStore::with_bloom_filter(10_000, 0.01);
store.put("key:1", tensor)?;

// O(1) rejection if key definitely doesn't exist
if store.exists("key:999") { /* ... */ }
```
With Instrumentation
```rust
// Enable access tracking with 1% sampling
let store = TensorStore::with_instrumentation(100);

// After operations, check access patterns
let snapshot = store.access_snapshot()?;
println!("Hot shards: {:?}", store.hot_shards(5)?);
println!("Cold shards: {:?}", store.cold_shards(30_000)?);
```
Shared Storage Across Engines
```rust
let store = TensorStore::new();

// Clone shares the same underlying Arc<SlabRouter>
let store_clone = store.clone();

// Both see the same data
store.put("user:1", user_data)?;
assert!(store_clone.exists("user:1"));

// Use with multiple engines
let vector_engine = VectorEngine::with_store(store.clone());
let graph_engine = GraphEngine::with_store(store.clone());
```
Persistence
```rust
// Save snapshot
store.save_snapshot("data.bin")?;

// Load snapshot
let store = TensorStore::load_snapshot("data.bin")?;

// Load with Bloom filter rebuild
let store = TensorStore::load_snapshot_with_bloom_filter(
    "data.bin",
    10_000, // expected items
    0.01,   // false positive rate
)?;

// Compressed snapshot
use tensor_compress::{CompressionConfig, QuantMode};
let config = CompressionConfig {
    vector_quantization: Some(QuantMode::Int8), // 4x compression
    delta_encoding: true,
    rle_encoding: true,
};
store.save_snapshot_compressed("data.bin", config)?;
```
Tiered Storage
```rust
use tensor_store::{TieredStore, TieredConfig};

let config = TieredConfig {
    cold_dir: "/data/cold".into(),
    cold_capacity: 64 * 1024 * 1024,
    sample_rate: 100,
};
let mut store = TieredStore::new(config)?;
store.put("user:1", tensor);

// Migrate cold data (not accessed in 30s)
let migrated = store.migrate_cold(30_000)?;

// Check stats
let stats = store.stats();
println!("Hot: {}, Cold: {}", stats.hot_count, stats.cold_count);
```
HNSW Index
```rust
let index = HNSWIndex::with_config(HNSWConfig::default());

// Insert dense, sparse, or auto-select
index.insert(vec![0.1, 0.2, 0.3]);
index.insert_sparse(sparse_vec);
index.insert_auto(mixed_vec); // Auto-selects dense/sparse

// With capacity checking
match index.try_insert(vec) {
    Ok(id) => println!("Inserted as node {}", id),
    Err(EmbeddingStorageError::CapacityExceeded { limit, current }) => {
        println!("Index full: {} / {}", current, limit);
    }
}

// Search with custom ef
let results = index.search_with_ef(&query, 10, 100);
for (id, similarity) in results {
    println!("Node {}: {:.4}", id, similarity);
}
```
Delta-Encoded Embeddings
```rust
let mut registry = ArchetypeRegistry::new(16);

// Discover archetypes from existing embeddings
registry.discover_archetypes(&embeddings, 5, KMeansConfig::default());

// Encode new vectors as deltas
let results = registry.encode_batch(&embeddings, 0.01);
for (delta, compression_ratio) in results {
    println!("Archetype {}, compression: {:.2}x",
        delta.archetype_id(), compression_ratio);
}
```
Error Types
TensorStoreError
| Error | Cause |
|---|---|
| NotFound(key) | get or delete on a nonexistent key |
SnapshotError
| Error | Cause |
|---|---|
| IoError(std::io::Error) | File not found, permission denied, disk full |
| SerializationError(String) | Corrupted file, incompatible format |
TieredError
| Error | Cause |
|---|---|
| Store(TensorStoreError) | Underlying store error |
| Mmap(MmapError) | Memory-mapped file error |
| Io(std::io::Error) | I/O error |
| NotConfigured | Cold storage not configured |
EmbeddingStorageError
| Error | Cause |
|---|---|
| DeltaRequiresRegistry | Delta storage used without archetype registry |
| ArchetypeNotFound(id) | Referenced archetype not in registry |
| CapacityExceeded { limit, current } | HNSW index at max_nodes limit |
| DeltaNotSupported | Delta vectors inserted into HNSW (unsupported) |
Related Modules
| Module | Relationship |
|---|---|
| relational_engine | Uses TensorStore for table row storage |
| graph_engine | Uses TensorStore for node/edge storage |
| vector_engine | Uses TensorStore + HNSWIndex for embeddings |
| tensor_compress | Provides compression for snapshots |
| tensor_checkpoint | Uses TensorStore snapshots for atomic restore |
| tensor_chain | Uses TensorStore for blockchain state |
Dependencies
| Crate | Purpose |
|---|---|
| serde | Serialization |
| bincode | Binary snapshot format |
| tensor_compress | Compression algorithms |
| wide | SIMD operations (f32x8) |
| memmap2 | Memory-mapped files |
| fxhash | Fast hashing |
| parking_lot | Efficient locks |
| bitvec | Bit vectors for bloom filter |
Relational Engine
The Relational Engine (Module 2) provides SQL-like table operations on top of the Tensor Store. It implements schema enforcement, composable condition predicates, SIMD-accelerated columnar filtering, and both hash and B-tree indexes for query acceleration.
Tables, rows, and indexes are stored as tensor data in the underlying Tensor Store, inheriting its thread-safety guarantees. The engine supports all standard CRUD operations, six SQL join types, aggregate functions, and batch operations for bulk inserts.
Architecture
```mermaid
flowchart TD
    subgraph RelationalEngine
        API[Public API]
        Schema[Schema Validation]
        Cond[Condition Evaluation]
        Hash[Hash Index]
        BTree[B-Tree Index]
        Columnar[Columnar SIMD]
    end
    API --> Schema
    API --> Cond
    Cond --> Hash
    Cond --> BTree
    Cond --> Columnar
    subgraph TensorStore
        Store[(DashMap Storage)]
        Meta[Table Metadata]
        Rows[Row Data]
        Idx[Index Entries]
    end
    Schema --> Meta
    API --> Rows
    Hash --> Idx
    BTree --> Idx
```
Query Execution Flow
```mermaid
flowchart TD
    Query[SELECT Query] --> ParseCond[Parse Condition]
    ParseCond --> CheckIdx{Has Index?}
    CheckIdx -->|Hash Index + Eq| HashLookup["O(1) Hash Lookup"]
    CheckIdx -->|BTree + Range| BTreeRange["O(log n) Range Scan"]
    CheckIdx -->|No Index| FullScan[Full Table Scan]
    HashLookup --> FilterRows[Apply Remaining Conditions]
    BTreeRange --> FilterRows
    FullScan --> SIMDFilter{Columnar Data?}
    SIMDFilter -->|Yes| VectorFilter[SIMD Vectorized Filter]
    SIMDFilter -->|No| RowFilter[Row-by-Row Filter]
    VectorFilter --> Results[Build Result Set]
    RowFilter --> Results
    FilterRows --> Results
```
Key Types
| Type | Description |
|---|---|
| RelationalEngine | Main engine struct with TensorStore backend |
| RelationalConfig | Configuration for limits, timeouts, thresholds |
| Schema | Table schema with column definitions and constraints |
| Column | Column name, type, and nullability |
| ColumnType | Int, Float, String, Bool, Bytes, Json |
| Value | Typed value: Null, Int(i64), Float(f64), String(String), Bool(bool), Bytes(Vec<u8>), Json(Value) |
| Row | Row with ID and ordered column values |
| Condition | Composable filter predicate tree |
| Constraint | Table constraint: PrimaryKey, Unique, ForeignKey, NotNull |
| ForeignKeyConstraint | Foreign key definition with referential actions |
| ReferentialAction | Restrict, Cascade, SetNull, SetDefault, NoAction |
| RelationalError | Error variants for table/column/index/constraint operations |
| ColumnData | Columnar storage for a single column with null bitmap |
| SelectionVector | Bitmap-based row selection for SIMD operations |
| OrderedKey | B-tree index key with total ordering semantics |
| StreamingCursor | Iterator for batch-based query result streaming |
| CursorBuilder | Builder for customizing streaming cursor options |
| QueryMetrics | Query execution metrics for observability |
| IndexTracker | Tracks index hits/misses to detect missing indexes |
Column Types
| Type | Rust Type | Storage Format | Description |
|---|---|---|---|
| Int | i64 | 8-byte little-endian | 64-bit signed integer |
| Float | f64 | 8-byte IEEE 754 | 64-bit floating point |
| String | String | Dictionary-encoded | UTF-8 string with deduplication |
| Bool | bool | Packed bitmap (64 values per u64) | Boolean |
| Bytes | Vec<u8> | Raw bytes | Binary data |
| Json | serde_json::Value | JSON string | JSON value |
Conditions
| Condition | Description | Index Support |
|---|---|---|
| Condition::True | Matches all rows | N/A |
| Condition::Eq(col, val) | Column equals value | Hash Index |
| Condition::Ne(col, val) | Column not equals value | None |
| Condition::Lt(col, val) | Column less than value | B-Tree Index |
| Condition::Le(col, val) | Column less than or equal | B-Tree Index |
| Condition::Gt(col, val) | Column greater than value | B-Tree Index |
| Condition::Ge(col, val) | Column greater than or equal | B-Tree Index |
| Condition::And(a, b) | Logical AND of two conditions | Partial (first indexable) |
| Condition::Or(a, b) | Logical OR of two conditions | None |
Conditions can be combined using .and() and .or() methods:
```rust
// age >= 18 AND age < 65
let condition = Condition::Ge("age".into(), Value::Int(18))
    .and(Condition::Lt("age".into(), Value::Int(65)));

// status = 'active' OR priority > 5
let condition = Condition::Eq("status".into(), Value::String("active".into()))
    .or(Condition::Gt("priority".into(), Value::Int(5)));
```
The special column _id filters by row ID and can be indexed.
Error Types
| Error | Cause |
|---|---|
| TableNotFound | Table does not exist |
| TableAlreadyExists | Creating duplicate table |
| ColumnNotFound | Update references unknown column |
| ColumnAlreadyExists | Column already exists in table |
| TypeMismatch | Value type does not match column type |
| NullNotAllowed | NULL in non-nullable column |
| IndexAlreadyExists | Creating duplicate index |
| IndexNotFound | Dropping non-existent index |
| IndexCorrupted | Index data is corrupted |
| StorageError | Underlying Tensor Store error |
| InvalidName | Invalid table or column name |
| SchemaCorrupted | Schema metadata is corrupted |
| TransactionNotFound | Transaction ID not found |
| TransactionInactive | Transaction already committed/aborted |
| LockConflict | Lock conflict with another transaction |
| LockTimeout | Lock acquisition timed out |
| RollbackFailed | Rollback operation failed |
| ResultTooLarge | Result set exceeds maximum size |
| TooManyTables | Maximum table count exceeded |
| TooManyIndexes | Maximum index count exceeded |
| QueryTimeout | Query execution timed out |
| PrimaryKeyViolation | Primary key constraint violated |
| UniqueViolation | Unique constraint violated |
| ForeignKeyViolation | Foreign key constraint violated on insert/update |
| ForeignKeyRestrict | Foreign key prevents delete/update |
| ConstraintNotFound | Constraint does not exist |
| ConstraintAlreadyExists | Constraint already exists |
| ColumnHasConstraint | Column has constraint preventing operation |
| CannotAddColumn | Cannot add column due to constraint |
Storage Model
Tables, rows, and indexes are stored in Tensor Store with specific key patterns:
| Key Pattern | Content |
|---|---|
| _meta:table:{name} | Schema metadata |
| {table}:{row_id} | Row data |
| _idx:{table}:{column} | Hash index metadata |
| _idx:{table}:{column}:{hash} | Hash index entries (list of row IDs) |
| _btree:{table}:{column} | B-tree index metadata |
| _btree:{table}:{column}:{sortable_key} | B-tree index entries |
| _col:{table}:{column}:data | Columnar data storage |
| _col:{table}:{column}:ids | Columnar row ID mapping |
| _col:{table}:{column}:nulls | Columnar null bitmap |
| _col:{table}:{column}:meta | Columnar metadata |
Schema metadata encodes:
- _columns: Comma-separated column names
- _col:{name}: Type and nullability for each column
Row Storage Format
Each row is stored as a TensorData object:
```rust
// Internal row structure
{
    "_id": Scalar(Int(row_id)),
    "name": Scalar(String("Alice")),
    "age": Scalar(Int(30)),
    "email": Scalar(String("alice@example.com"))
}
```
Usage Examples
Table Operations
```rust
let engine = RelationalEngine::new();

// Create table with schema
let schema = Schema::new(vec![
    Column::new("name", ColumnType::String),
    Column::new("age", ColumnType::Int),
    Column::new("email", ColumnType::String).nullable(),
]);
engine.create_table("users", schema)?;

// Check existence
engine.table_exists("users")?; // -> bool

// List all tables
let tables = engine.list_tables(); // -> Vec<String>

// Get schema
let schema = engine.get_schema("users")?;

// Drop table (deletes all rows and indexes)
engine.drop_table("users")?;

// Row count
engine.row_count("users")?; // -> usize
```
CRUD Operations
```rust
// INSERT
let mut values = HashMap::new();
values.insert("name".to_string(), Value::String("Alice".into()));
values.insert("age".to_string(), Value::Int(30));
let row_id = engine.insert("users", values)?;

// BATCH INSERT (59x faster for bulk inserts)
let rows: Vec<HashMap<String, Value>> = (0..1000)
    .map(|i| {
        let mut values = HashMap::new();
        values.insert("name".to_string(), Value::String(format!("User{}", i)));
        values.insert("age".to_string(), Value::Int(20 + i));
        values
    })
    .collect();
let row_ids = engine.batch_insert("users", rows)?;

// SELECT
let rows = engine.select("users", Condition::Eq("age".into(), Value::Int(30)))?;

// UPDATE
let mut updates = HashMap::new();
updates.insert("age".to_string(), Value::Int(31));
let count = engine.update(
    "users",
    Condition::Eq("name".into(), Value::String("Alice".into())),
    updates,
)?;

// DELETE
let count = engine.delete_rows("users", Condition::Lt("age".into(), Value::Int(18)))?;
```
Constraints
The engine supports four constraint types for data integrity:
| Constraint | Description |
|---|---|
| PrimaryKey | Unique + not null, identifies rows uniquely |
| Unique | Values must be unique (NULLs allowed) |
| ForeignKey | References rows in another table |
| NotNull | Column cannot contain NULL values |
```rust
use relational_engine::{Constraint, ForeignKeyConstraint, ReferentialAction};

// Create table with constraints
let schema = Schema::with_constraints(
    vec![
        Column::new("id", ColumnType::Int),
        Column::new("email", ColumnType::String),
        Column::new("dept_id", ColumnType::Int).nullable(),
    ],
    vec![
        Constraint::primary_key("pk_users", vec!["id".to_string()]),
        Constraint::unique("uq_email", vec!["email".to_string()]),
    ],
);
engine.create_table("users", schema)?;

// Add constraint after table creation
engine.add_constraint("users", Constraint::not_null("nn_email", "email"))?;

// Add foreign key with referential actions
let fk = ForeignKeyConstraint::new(
    "fk_users_dept",
    vec!["dept_id".to_string()],
    "departments",
    vec!["id".to_string()],
)
.on_delete(ReferentialAction::SetNull)
.on_update(ReferentialAction::Cascade);
engine.add_constraint("users", Constraint::foreign_key(fk))?;

// Get constraints
let constraints = engine.get_constraints("users")?;

// Drop constraint
engine.drop_constraint("users", "uq_email")?;
```
Referential Actions
Foreign keys support these actions on delete/update of referenced rows:
| Action | Description |
|---|---|
| Restrict (default) | Prevent the operation |
| Cascade | Cascade to referencing rows |
| SetNull | Set referencing columns to NULL |
| SetDefault | Set referencing columns to default |
| NoAction | Same as Restrict, checked at commit |
ALTER TABLE Operations
```rust
// Add a new column (nullable or with default)
engine.add_column("users", Column::new("phone", ColumnType::String).nullable())?;

// Drop a column (fails if column has constraints)
engine.drop_column("users", "phone")?;

// Rename a column (updates constraints automatically)
engine.rename_column("users", "email", "email_address")?;
```
Joins
All six SQL join types are supported using hash join algorithm (O(n+m)):
```rust
// INNER JOIN - only matching rows from both tables
let joined = engine.join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Row, Row)>

// LEFT JOIN - all rows from left, matching from right (or None)
let joined = engine.left_join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Row, Option<Row>)>

// RIGHT JOIN - all rows from right, matching from left (or None)
let joined = engine.right_join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Option<Row>, Row)>

// FULL JOIN - all rows from both tables
let joined = engine.full_join("users", "posts", "_id", "user_id")?;
// Returns: Vec<(Option<Row>, Option<Row>)>

// CROSS JOIN (Cartesian product)
let joined = engine.cross_join("users", "posts")?;
// Returns: Vec<(Row, Row)> with n*m rows

// NATURAL JOIN (on common column names)
let joined = engine.natural_join("users", "user_profiles")?;
// Returns: Vec<(Row, Row)> matching on all common columns
```
Aggregate Functions
```rust
// COUNT(*) - count all rows
let count = engine.count("users", Condition::True)?;

// COUNT(column) - count non-null values
let count = engine.count_column("users", "email", Condition::True)?;

// SUM - returns f64
let total = engine.sum("orders", "amount", Condition::True)?;

// AVG - returns Option<f64> (None if no matching rows)
let avg = engine.avg("orders", "amount", Condition::True)?;

// MIN/MAX - returns Option<Value>
let min = engine.min("products", "price", Condition::True)?;
let max = engine.max("products", "price", Condition::True)?;
```
Indexes
Hash Indexes
Hash indexes provide O(1) equality lookups for Condition::Eq queries:
```rust
// Create hash index
engine.create_index("users", "age")?;

// Check existence
engine.has_index("users", "age"); // -> bool

// Get indexed columns
engine.get_indexed_columns("users"); // -> Vec<String>

// Drop index
engine.drop_index("users", "age")?;
```
Hash Index Implementation Details:
```mermaid
graph LR
    subgraph "Hash Index Structure"
        Value[Column Value] --> Hash["hash_key()"]
        Hash --> Bucket["_idx:table:col:hash"]
        Bucket --> IDs["Vec<row_id>"]
    end
```
The hash index uses value-specific hashing:
| Value Type | Hash Format | Example |
|---|---|---|
| Null | "null" | "null" |
| Int(i) | "i:{value}" | "i:42" |
| Float(f) | "f:{bits}" | "f:4614253070214989087" |
| String(s) | "s:{hash}" | "s:a1b2c3d4" |
| Bool(b) | "b:{value}" | "b:true" |
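The table above can be sketched as a standalone function. This is an illustrative model only: the `Value` enum and `hash_key` here are hypothetical stand-ins, and the `String` arm uses Rust's default hasher, whereas the engine's actual string hash may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical mirror of the engine's value types.
enum Value { Null, Int(i64), Float(f64), Str(String), Bool(bool) }

// Produce a type-tagged hash key so values of different types never collide.
fn hash_key(v: &Value) -> String {
    match v {
        Value::Null => "null".to_string(),
        Value::Int(i) => format!("i:{}", i),
        // Floats hash by bit pattern, so identical bit patterns collide exactly
        Value::Float(f) => format!("f:{}", f.to_bits()),
        Value::Str(s) => {
            let mut h = DefaultHasher::new();
            s.hash(&mut h);
            format!("s:{:x}", h.finish())
        }
        Value::Bool(b) => format!("b:{}", b),
    }
}

fn main() {
    assert_eq!(hash_key(&Value::Int(42)), "i:42");
    assert_eq!(hash_key(&Value::Bool(true)), "b:true");
    assert_eq!(hash_key(&Value::Null), "null");
    println!("ok");
}
```

The type prefix (`i:`, `f:`, `s:`, `b:`) guarantees that, for example, `Int(1)` and `Bool(true)` land in different buckets.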
Hash index performance:
| Query Type | Without Index | With Index | Speedup |
|---|---|---|---|
| Equality (2% match on 5K rows) | 5.96ms | 126us | 47x |
| Single row by _id (5K rows) | 5.59ms | 3.5us | 1,597x |
B-Tree Indexes
B-tree indexes accelerate range queries (Lt, Le, Gt, Ge) with O(log n +
m) complexity:
```rust
// Create B-tree index
engine.create_btree_index("users", "age")?;

// Check existence
engine.has_btree_index("users", "age"); // -> bool

// Get B-tree indexed columns
engine.get_btree_indexed_columns("users"); // -> Vec<String>

// Drop index
engine.drop_btree_index("users", "age")?;

// Range queries now use the index
engine.select("users", Condition::Ge("age".into(), Value::Int(18)))?;
```
B-Tree Index Implementation Details:
The B-tree index uses a dual-storage approach:
- In-memory BTreeMap: For O(log n) range operations
- Persistent TensorStore: For durability and recovery
```rust
// Internal B-tree index structure
btree_indexes: RwLock<HashMap<
    (String, String),               // (table, column)
    BTreeMap<OrderedKey, Vec<u64>>  // value -> row_ids
>>
```
OrderedKey for Total Ordering:
The OrderedKey enum provides correct ordering semantics:
```rust
pub enum OrderedKey {
    Null,                 // Sorts first
    Bool(bool),           // false < true
    Int(i64),             // Standard integer ordering
    Float(OrderedFloat),  // NaN < all other values
    String(String),       // Lexicographic ordering
}
```
Sortable Key Encoding:
For persistent storage, values are encoded to maintain lexicographic ordering:
| Type | Encoding | Example |
|---|---|---|
| Null | "0" | "0" |
| Int(i) | "i{hex(i + 2^63)}" | "i8000000000000000" for 0 |
| Float(f) | "f{sortable_bits}" | IEEE 754 with sign handling |
| String(s) | "s{s}" | "sAlice" |
| Bool(b) | "b0" or "b1" | "b1" for true |
Integer encoding shifts the range from [-2^63, 2^63-1] to [0, 2^64-1] for
correct lexicographic ordering of negative numbers.
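The bias trick can be demonstrated in a few lines. This is a standalone sketch (the `sortable_encode` name is hypothetical; the actual encoder lives in the engine): adding 2^63 before hex-encoding makes string order agree with numeric order, including for negative values.

```rust
// Encode an i64 as a fixed-width hex string whose lexicographic
// order matches numeric order, by shifting into the u64 range.
fn sortable_encode(i: i64) -> String {
    let biased = (i as i128 + (1i128 << 63)) as u64;
    format!("i{:016x}", biased)
}

fn main() {
    let keys: Vec<String> = [-5i64, -1, 0, 1, 42]
        .iter()
        .map(|&i| sortable_encode(i))
        .collect();
    let mut sorted = keys.clone();
    sorted.sort(); // lexicographic string sort
    // String order agrees with the original numeric order
    assert_eq!(sorted, keys);
    assert_eq!(sortable_encode(0), "i8000000000000000");
    println!("ok");
}
```

Without the bias, `-1` would encode with a higher bit pattern than `1` and sort after it; the shift maps the most negative value to `i0000000000000000` and the most positive to `iffffffffffffffff`.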
B-Tree Range Operations:
```rust
// Internal range lookup (dispatch sketch)
fn btree_range_lookup(&self, table: &str, column: &str,
                      value: &Value, op: RangeOp) -> Option<Vec<u64>> {
    match op {
        RangeOp::Lt => btree.range(..target),
        RangeOp::Le => btree.range(..=target),
        RangeOp::Gt => btree.range((Excluded(target), Unbounded)),
        RangeOp::Ge => btree.range(target..),
    }
}
```
Columnar Architecture
The engine uses columnar storage with SIMD-accelerated filtering:
Columnar Data Structures
```mermaid
graph TD
    subgraph "ColumnData"
        Name[name: String]
        RowIDs["row_ids: Vec<u64>"]
        Nulls[nulls: NullBitmap]
        Values[values: ColumnValues]
    end
    subgraph "ColumnValues Variants"
        Int["Int(Vec<i64>)"]
        Float["Float(Vec<f64>)"]
        String["String { dict, indices }"]
        Bool["Bool(Vec<u64>)"]
    end
    subgraph "NullBitmap Variants"
        None["None (no nulls)"]
        Dense["Dense(Vec<u64>)"]
        Sparse["Sparse(Vec<u64>)"]
    end
    Values --> Int
    Values --> Float
    Values --> String
    Values --> Bool
    Nulls --> None
    Nulls --> Dense
    Nulls --> Sparse
```
Null Bitmap Selection:
- None: when the column has no null values
- Sparse: when nulls are < 10% of rows (stores positions)
- Dense: when nulls are >= 10% of rows (stores bitmap)
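The selection heuristic can be modeled in isolation. A minimal sketch, assuming a simplified `NullBitmap` and a hypothetical `choose_bitmap` helper (the engine's actual structures carry more detail):

```rust
// Simplified model of the three null-bitmap representations.
enum NullBitmap {
    None,
    Sparse(Vec<u64>), // positions of null rows
    Dense(Vec<u64>),  // packed 1-bit-per-row bitmap
}

// Pick a representation based on the 10% threshold described above.
fn choose_bitmap(null_positions: &[u64], row_count: usize) -> NullBitmap {
    if null_positions.is_empty() {
        NullBitmap::None
    } else if (null_positions.len() as f64) < 0.10 * row_count as f64 {
        NullBitmap::Sparse(null_positions.to_vec())
    } else {
        // Dense: set one bit per null row in a packed u64 array
        let mut bits = vec![0u64; (row_count + 63) / 64];
        for &p in null_positions {
            bits[(p / 64) as usize] |= 1u64 << (p % 64);
        }
        NullBitmap::Dense(bits)
    }
}

fn main() {
    assert!(matches!(choose_bitmap(&[], 100), NullBitmap::None));
    assert!(matches!(choose_bitmap(&[3], 100), NullBitmap::Sparse(_)));
    let many: Vec<u64> = (0..11).collect(); // 11% nulls
    assert!(matches!(choose_bitmap(&many, 100), NullBitmap::Dense(_)));
    println!("ok");
}
```

Sparse wins when nulls are rare because storing a handful of positions is smaller than a full bitmap; Dense wins once positions would cost more than one bit per row.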
SIMD Filtering
Column data is stored in contiguous arrays enabling 4-wide SIMD vectorized
comparisons using the wide crate:
```rust
// SIMD filter implementation using wide::i64x4
pub fn filter_lt_i64(values: &[i64], threshold: i64, result: &mut [u64]) {
    let chunks = values.len() / 4;
    let threshold_vec = i64x4::splat(threshold);
    for i in 0..chunks {
        let offset = i * 4;
        let v = i64x4::new([
            values[offset],
            values[offset + 1],
            values[offset + 2],
            values[offset + 3],
        ]);
        let cmp = v.cmp_lt(threshold_vec);
        let mask_arr: [i64; 4] = cmp.into();
        for (j, &m) in mask_arr.iter().enumerate() {
            if m != 0 {
                let bit_pos = offset + j;
                result[bit_pos / 64] |= 1u64 << (bit_pos % 64);
            }
        }
    }
    // Handle remainder with scalar fallback
    let start = chunks * 4;
    for i in start..values.len() {
        if values[i] < threshold {
            result[i / 64] |= 1u64 << (i % 64);
        }
    }
}
```
Available SIMD Filter Functions:
| Function | Operation | Types |
|---|---|---|
| filter_lt_i64 | Less than | i64 |
| filter_le_i64 | Less than or equal | i64 |
| filter_gt_i64 | Greater than | i64 |
| filter_ge_i64 | Greater than or equal | i64 |
| filter_eq_i64 | Equal | i64 |
| filter_ne_i64 | Not equal | i64 |
| filter_lt_f64 | Less than | f64 |
| filter_gt_f64 | Greater than | f64 |
| filter_eq_f64 | Equal (with epsilon) | f64 |
Bitmap Operations:
```rust
// AND two selection bitmaps
pub fn bitmap_and(a: &[u64], b: &[u64], result: &mut [u64]);

// OR two selection bitmaps
pub fn bitmap_or(a: &[u64], b: &[u64], result: &mut [u64]);

// Count set bits
pub fn popcount(bitmap: &[u64]) -> usize;

// Extract selected indices
pub fn selected_indices(bitmap: &[u64], max_count: usize) -> Vec<usize>;
```
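These signatures can be implemented in a few lines each. The following is a standalone sketch (not the crate's actual code) showing the expected behavior of three of them:

```rust
// AND two selection bitmaps word-by-word.
fn bitmap_and(a: &[u64], b: &[u64], result: &mut [u64]) {
    for ((r, &x), &y) in result.iter_mut().zip(a).zip(b) {
        *r = x & y;
    }
}

// Count set bits across all words.
fn popcount(bitmap: &[u64]) -> usize {
    bitmap.iter().map(|w| w.count_ones() as usize).sum()
}

// Extract up to max_count selected row indices from the bitmap.
fn selected_indices(bitmap: &[u64], max_count: usize) -> Vec<usize> {
    let mut out = Vec::new();
    for (w_idx, &word) in bitmap.iter().enumerate() {
        let mut w = word;
        while w != 0 && out.len() < max_count {
            let bit = w.trailing_zeros() as usize;
            out.push(w_idx * 64 + bit);
            w &= w - 1; // clear lowest set bit
        }
    }
    out
}

fn main() {
    let a = [0b1010u64]; // rows 1 and 3 selected
    let b = [0b0110u64]; // rows 1 and 2 selected
    let mut r = [0u64];
    bitmap_and(&a, &b, &mut r);
    assert_eq!(r[0], 0b0010); // intersection: row 1
    assert_eq!(popcount(&r), 1);
    assert_eq!(selected_indices(&a, 10), vec![1, 3]);
    println!("ok");
}
```

The `w &= w - 1` idiom walks set bits without scanning all 64 positions, which keeps index extraction cheap on sparse selections.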
Selection Vectors
Query results use bitmap-based selection vectors to avoid copying data:
```rust
pub struct SelectionVector {
    bitmap: Vec<u64>, // Packed bits indicating selected rows
    row_count: usize,
}

impl SelectionVector {
    // Create selection of all rows
    pub fn all(row_count: usize) -> Self;

    // Create empty selection
    pub fn none(row_count: usize) -> Self;

    // Check if row is selected
    pub fn is_selected(&self, idx: usize) -> bool;

    // Count selected rows
    pub fn count(&self) -> usize;

    // AND two selections (intersection)
    pub fn intersect(&self, other: &SelectionVector) -> SelectionVector;

    // OR two selections (union)
    pub fn union(&self, other: &SelectionVector) -> SelectionVector;
}
```
Columnar Select API
```rust
// Materialize columns for SIMD filtering
engine.materialize_columns("users", &["age", "name"])?;

// Check if columnar data exists
engine.has_columnar_data("users", "age"); // -> bool

// Select with columnar scan options
let options = ColumnarScanOptions {
    projection: Some(vec!["name".into()]), // Only return these columns
    prefer_columnar: true,                 // Use SIMD when available
};
let rows = engine.select_columnar(
    "users",
    Condition::Gt("age".into(), Value::Int(50)),
    options,
)?;

// Drop columnar data
engine.drop_columnar_data("users", "age")?;
```
Condition Evaluation
Two evaluation methods are available:
| Method | Input | Performance | Use Case |
|---|---|---|---|
| evaluate(&row) | Row struct | Legacy, creates intermediate objects | Row-by-row filtering |
| evaluate_tensor(&tensor) | TensorData | 31% faster, no intermediate allocation | Direct tensor filtering |
The engine automatically chooses the optimal evaluation path:
```mermaid
flowchart TD
    Cond[Condition] --> CheckColumnar{Columnar Data Available?}
    CheckColumnar -->|Yes| CheckType{Int Column?}
    CheckColumnar -->|No| RowEval[evaluate_tensor per row]
    CheckType -->|Yes| SIMDEval[SIMD Vectorized Filter]
    CheckType -->|No| RowEval
    SIMDEval --> Bitmap[Selection Bitmap]
    RowEval --> Filter[Filter Matching Rows]
    Bitmap --> Materialize[Materialize Results]
    Filter --> Materialize
```
Join Algorithm Implementations
Hash Join (INNER, LEFT, RIGHT, FULL)
All equality joins use the hash join algorithm with O(n+m) complexity:
```mermaid
flowchart LR
    subgraph "Build Phase"
        RightTable[Right Table] --> BuildHash[Build Hash Index]
        BuildHash --> HashIndex["HashMap<hash, Vec<idx>>"]
    end
    subgraph "Probe Phase"
        LeftTable[Left Table] --> Probe[Probe Hash Index]
        Probe --> HashIndex
        HashIndex --> Match[Find Matching Rows]
    end
    Match --> Results[Join Results]
```
Hash Join Implementation:
```rust
pub fn join(&self, table_a: &str, table_b: &str,
            on_a: &str, on_b: &str) -> Result<Vec<(Row, Row)>> {
    let rows_a = self.select(table_a, Condition::True)?;
    let rows_b = self.select(table_b, Condition::True)?;

    // Build phase: index the right table
    let mut index: HashMap<String, Vec<usize>> =
        HashMap::with_capacity(rows_b.len());
    for (i, row) in rows_b.iter().enumerate() {
        if let Some(val) = row.get_with_id(on_b) {
            let hash = val.hash_key();
            index.entry(hash).or_default().push(i);
        }
    }

    // Probe phase: scan left table and probe index
    let mut results = Vec::with_capacity(min(rows_a.len(), rows_b.len()));
    for row_a in &rows_a {
        if let Some(val) = row_a.get_with_id(on_a) {
            let hash = val.hash_key();
            if let Some(indices) = index.get(&hash) {
                for &i in indices {
                    let row_b = &rows_b[i];
                    // Verify actual equality (handles hash collisions)
                    if row_b.get_with_id(on_b).as_ref() == Some(&val) {
                        results.push((row_a.clone(), row_b.clone()));
                    }
                }
            }
        }
    }
    Ok(results)
}
```
Parallel Join Optimization:
When left table exceeds PARALLEL_THRESHOLD (1000 rows), joins use Rayon for
parallel probing:
```rust
if rows_a.len() >= Self::PARALLEL_THRESHOLD {
    rows_a.par_iter()
        .flat_map(|row_a| {
            // Parallel probe of hash index
        })
        .collect()
}
```
Natural Join
Natural join finds all common column names and joins on their equality:
```rust
pub fn natural_join(&self, table_a: &str, table_b: &str)
    -> Result<Vec<(Row, Row)>>
{
    let schema_a = self.get_schema(table_a)?;
    let schema_b = self.get_schema(table_b)?;

    // Find common columns
    let cols_a: HashSet<_> =
        schema_a.columns.iter().map(|c| c.name.as_str()).collect();
    let cols_b: HashSet<_> =
        schema_b.columns.iter().map(|c| c.name.as_str()).collect();
    let common_cols: Vec<_> = cols_a.intersection(&cols_b).copied().collect();

    // No common columns = cross join
    if common_cols.is_empty() {
        return self.cross_join(table_a, table_b);
    }

    // Build composite hash key from all common columns
    // ...
}
```
Aggregate Function Internals
Parallel Aggregation
For tables exceeding PARALLEL_THRESHOLD (1000 rows), aggregates use parallel
reduction:
```rust
pub fn avg(&self, table: &str, column: &str, condition: Condition)
    -> Result<Option<f64>>
{
    let rows = self.select(table, condition)?;
    let (total, count) = if rows.len() >= Self::PARALLEL_THRESHOLD {
        // Parallel map-reduce
        rows.par_iter()
            .map(|row| extract_numeric(row, column))
            .reduce(|| (0.0, 0u64), |(s1, c1), (s2, c2)| (s1 + s2, c1 + c2))
    } else {
        // Sequential accumulation
        let mut total = 0.0;
        let mut count = 0u64;
        for row in &rows {
            // accumulate...
        }
        (total, count)
    };
    if count == 0 {
        Ok(None)
    } else {
        Ok(Some(total / count as f64))
    }
}
```
MIN/MAX with Parallel Reduction
```rust
pub fn min(&self, table: &str, column: &str, condition: Condition)
    -> Result<Option<Value>>
{
    let rows = self.select(table, condition)?;
    if rows.len() >= Self::PARALLEL_THRESHOLD {
        rows.par_iter()
            .filter_map(|row| row.get(column).filter(|v| !matches!(v, Value::Null)))
            .reduce_with(|a, b| {
                if a.partial_cmp_value(&b) == Some(Ordering::Less) { a } else { b }
            })
    } else {
        // Sequential scan
    }
}
```
Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| insert | O(1) + O(k) | Schema validation + store put + k index updates |
| batch_insert | O(n) + O(n*k) | Single schema lookup, 59x faster than n inserts |
| select (no index) | O(n) | Full table scan with SIMD filter |
| select (hash index) | O(1) | Direct lookup via hash index |
| select (btree range) | O(log n + m) | B-tree lookup + m matching rows |
| update | O(n) + O(k) | Scan + conditional update + index maintenance |
| delete_rows | O(n) + O(k) | Scan + conditional delete + index removal |
| join | O(n+m) | Hash join for all 6 join types |
| cross_join | O(n*m) | Cartesian product |
| count/sum/avg/min/max | O(n) | Single pass over matching rows |
| create_index | O(n) | Scan all rows to build index |
| materialize_columns | O(n) | Extract column to contiguous array |
Where k = number of indexes on the table, n = rows in left table, m = rows in right table.
Parallel Threshold
Operations automatically switch to parallel execution when row count exceeds
PARALLEL_THRESHOLD:
```rust
impl RelationalEngine {
    const PARALLEL_THRESHOLD: usize = 1000;
}
```
Parallel Operations:
- delete_rows (parallel deletion via Rayon)
- join (parallel probe phase)
- sum, avg, min, max (parallel reduction)
Configuration
RelationalConfig
The engine can be configured with RelationalConfig:
```rust
let config = RelationalConfig {
    max_tables: Some(1000),               // Maximum tables allowed
    max_indexes_per_table: Some(10),      // Maximum indexes per table
    max_btree_entries: 10_000_000,        // Maximum B-tree index entries
    default_query_timeout_ms: Some(5000), // Default query timeout
    max_query_timeout_ms: Some(300_000),  // Maximum allowed timeout (5 min)
    slow_query_threshold_ms: 100,         // Slow query warning threshold
    max_query_result_rows: Some(10_000),  // Maximum rows per query
    transaction_timeout_secs: 60,         // Transaction timeout
    lock_timeout_secs: 30,                // Lock acquisition timeout
};
let engine = RelationalEngine::with_config(config);
```
| Option | Default | Description |
|---|---|---|
| max_tables | None (unlimited) | Maximum number of tables |
| max_indexes_per_table | None (unlimited) | Maximum indexes per table |
| max_btree_entries | 10,000,000 | Maximum B-tree index entries total |
| default_query_timeout_ms | None | Default timeout for queries |
| max_query_timeout_ms | 300,000 (5 min) | Maximum allowed query timeout |
| slow_query_threshold_ms | 100 | Threshold for slow query warnings |
| max_query_result_rows | None (unlimited) | Maximum rows returned per query |
| transaction_timeout_secs | 60 | Transaction timeout |
| lock_timeout_secs | 30 | Lock acquisition timeout |
Internal Constants
| Constant | Value | Description |
|---|---|---|
| PARALLEL_THRESHOLD | 1000 | Minimum rows for parallel operations |
| Null bitmap sparse threshold | 10% | Use sparse bitmap when nulls < 10% |
| SIMD vector width | 4 | i64x4/f64x4 operations |
Observability
The observability module provides query metrics, slow query detection, and
index usage tracking.
Query Metrics
```rust
use relational_engine::observability::{QueryMetrics, check_slow_query};
use std::time::Duration;

let metrics = QueryMetrics::new("users", "select")
    .with_rows_scanned(10000)
    .with_rows_returned(50)
    .with_index("idx_user_id")
    .with_duration(Duration::from_millis(25));

// Log warning if query exceeds threshold
check_slow_query(&metrics, 100); // threshold in ms
```
Index Tracking
Track index usage to identify missing indexes:
```rust
use relational_engine::observability::IndexTracker;

let tracker = IndexTracker::new();

// Record when index is used
tracker.record_hit("users", "id");

// Record when index could have been used but wasn't
tracker.record_miss("users", "email");

// Get reports of columns needing indexes
let reports = tracker.report_misses();
for report in reports {
    println!(
        "Table {}, column {}: {} misses, {} hits",
        report.table, report.column, report.miss_count, report.hit_count
    );
}

// Aggregate statistics
let total_hits = tracker.total_hits();
let total_misses = tracker.total_misses();
```
Slow Query Warnings
The check_slow_query function logs a tracing::warn! when queries exceed
the threshold:
```rust
use relational_engine::observability::{check_slow_query, warn_full_table_scan};

// Warn if query took > 100ms
check_slow_query(&metrics, 100);

// Warn about full table scans on large tables (> 1000 rows)
warn_full_table_scan("users", "select", 5000);
```
Streaming Cursor API
For large result sets, use streaming cursors to avoid loading all rows into memory at once. The cursor fetches rows in configurable batches.
Basic Usage
```rust
use relational_engine::{StreamingCursor, Condition};

// Create streaming cursor with default batch size (1000)
let cursor = engine.select_streaming("users", Condition::True);

// Iterate over results
for row_result in cursor {
    let row = row_result?;
    println!("User: {:?}", row);
}
```
Custom Options
```rust
// With custom batch size
let cursor = engine.select_streaming("users", Condition::True)
    .with_batch_size(100)
    .with_max_rows(5000);

// Using the builder
let cursor = engine.select_streaming_builder("users", Condition::True)
    .batch_size(100)
    .max_rows(5000)
    .build();

// Check cursor state
let mut cursor = engine.select_streaming("users", Condition::True);
while let Some(row) = cursor.next() {
    println!("Yielded so far: {}", cursor.rows_yielded());
}
println!("Exhausted: {}", cursor.is_exhausted());
```
Cursor Methods
| Method | Description |
|---|---|
| with_batch_size(n) | Set rows fetched per batch (default: 1000) |
| with_max_rows(n) | Limit total rows returned |
| rows_yielded() | Number of rows returned so far |
| is_exhausted() | Whether cursor has no more rows |
Edge Cases and Gotchas
NULL Handling
1. **NULL in conditions**: Comparisons with NULL columns return false:

   ```rust
   // If email is NULL, this returns false (not true!)
   Condition::Lt("email".into(), Value::String("z".into()))
   ```

2. **NULL in joins**: NULL values never match in join conditions:

   ```rust
   // Post with user_id = NULL will not join with any user
   engine.join("users", "posts", "_id", "user_id")
   ```

3. **COUNT(*) vs COUNT(column)**:
   - count() counts all rows
   - count_column() counts non-null values only
Type Mismatches
Comparisons between incompatible types return false rather than error:
```rust
// age is Int; comparing with a String returns 0 matches (not an error)
engine.select("users", Condition::Lt("age".into(), Value::String("30".into())));
```
Index Maintenance
Indexes are automatically maintained on INSERT, UPDATE, and DELETE:
```rust
// Creating index AFTER data exists
engine.insert("users", values)?;      // No index update
engine.create_index("users", "age")?; // Scans all rows

// Creating index BEFORE data exists
engine.create_index("users", "age")?; // Empty index
engine.insert("users", values)?;      // Updates index
```
Batch Insert Atomicity
batch_insert validates ALL rows upfront before inserting any:
```rust
let rows = vec![valid_row, invalid_row];

// Fails on validation - NO rows inserted (not a partial insert)
engine.batch_insert("users", rows);
```
B-Tree Index Recovery
B-tree indexes maintain both in-memory and persistent state. The in-memory BTreeMap is rebuilt lazily on first access after restart.
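The lazy-rebuild pattern can be sketched with `std::sync::OnceLock`. This is illustrative only: `LazyBTreeIndex` and its fields are hypothetical, and the engine's actual recovery path differs.

```rust
use std::collections::BTreeMap;
use std::sync::OnceLock;

// Hypothetical index that defers rebuilding its in-memory BTreeMap
// until the first range query after a restart.
struct LazyBTreeIndex {
    // Persisted (key, row_id) pairs as they would be read back from storage.
    persisted: Vec<(i64, u64)>,
    tree: OnceLock<BTreeMap<i64, Vec<u64>>>,
}

impl LazyBTreeIndex {
    fn tree(&self) -> &BTreeMap<i64, Vec<u64>> {
        // The rebuild runs exactly once, on first access
        self.tree.get_or_init(|| {
            let mut t: BTreeMap<i64, Vec<u64>> = BTreeMap::new();
            for &(k, id) in &self.persisted {
                t.entry(k).or_default().push(id);
            }
            t
        })
    }

    fn range_ge(&self, min: i64) -> Vec<u64> {
        self.tree()
            .range(min..)
            .flat_map(|(_, ids)| ids.iter().copied())
            .collect()
    }
}

fn main() {
    let idx = LazyBTreeIndex {
        persisted: vec![(18, 1), (30, 2), (25, 3)],
        tree: OnceLock::new(),
    };
    // First range query triggers the rebuild; keys come back in order
    assert_eq!(idx.range_ge(20), vec![3, 2]); // 25 -> row 3, 30 -> row 2
    println!("ok");
}
```

Deferring the rebuild keeps restart fast when an index is never queried, at the cost of a one-time rebuild latency on the first range scan.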
Best Practices
Index Selection
| Query Pattern | Recommended Index |
|---|---|
| WHERE col = value | Hash Index |
| WHERE col > value | B-Tree Index |
| WHERE col BETWEEN a AND b | B-Tree Index |
| WHERE col IN (...) | Hash Index |
| Unique lookups by ID | Hash Index on _id |
Columnar Materialization
Materialize columns when:
- Performing many range scans on large tables
- Query selectivity is low (scanning most rows)
- Column data fits in memory
```rust
// Good: materialize frequently-filtered columns
engine.materialize_columns("events", &["timestamp", "user_id"])?;

// Query uses SIMD acceleration
engine.select_columnar(
    "events",
    Condition::Gt("timestamp".into(), Value::Int(cutoff)),
    ColumnarScanOptions { prefer_columnar: true, .. },
)?;
```
Batch Operations
Use batch_insert for bulk loading:
```rust
// Bad: 1000 individual inserts
for row in rows {
    engine.insert("table", row)?; // 1000 schema lookups
}

// Good: single batch insert
engine.batch_insert("table", rows)?; // 1 schema lookup, 59x faster
```
SQL Features via Query Router
When using the relational engine through query_router, additional SQL features
are available:
ORDER BY and OFFSET
```sql
SELECT * FROM users ORDER BY age ASC;
SELECT * FROM users ORDER BY department DESC, name ASC;
SELECT * FROM users ORDER BY email NULLS FIRST;
SELECT * FROM users ORDER BY created_at DESC LIMIT 10 OFFSET 20;
```
GROUP BY and HAVING
```sql
SELECT department, COUNT(*), AVG(salary) FROM employees GROUP BY department;

SELECT product, SUM(quantity) AS total
FROM orders
GROUP BY product
HAVING SUM(quantity) > 100;
```
Related Modules
| Module | Relationship |
|---|---|
| tensor_store | Storage backend for tables, rows, and indexes |
| query_router | Executes SQL queries using RelationalEngine |
| neumann_parser | Parses SQL statements into AST |
| tensor_unified | Multi-engine unified storage layer |
Feature Summary
Implemented
| Feature | Description |
|---|---|
| Hash indexes | O(1) equality lookups |
| B-tree indexes | O(log n) range query acceleration |
| All 6 JOIN types | INNER, LEFT, RIGHT, FULL, CROSS, NATURAL |
| Aggregate functions | COUNT, SUM, AVG, MIN, MAX |
| ORDER BY | Multi-column sorting with ASC/DESC, NULLS FIRST/LAST |
| LIMIT/OFFSET | Pagination support |
| GROUP BY + HAVING | Row grouping with aggregate filtering |
| Columnar storage | SIMD-accelerated filtering with selection vectors |
| Batch operations | 59x faster bulk inserts |
| Parallel operations | Rayon-based parallelism for large tables |
| Dictionary encoding | String column compression |
| Transactions | Row-level ACID with undo log - see Transactions |
| Constraints | PRIMARY KEY, UNIQUE, FOREIGN KEY, NOT NULL |
| Foreign Keys | Full referential integrity with CASCADE/SET NULL/RESTRICT |
| ALTER TABLE | add_column, drop_column, rename_column |
| Streaming cursors | Memory-efficient iteration over large result sets |
| Observability | Query metrics, slow query detection, index tracking |
Future Considerations
| Feature | Status |
|---|---|
| Query Optimization | Not implemented |
| Subqueries | Not implemented |
| Window Functions | Not implemented |
| Composite Indexes | Not implemented |
Relational Engine Transactions
Local ACID transactions for single-shard relational operations. This module provides row-level locking, undo logging for rollback, and timeout-based deadlock prevention.
Transactions in the relational engine operate within a single shard and do not coordinate with other nodes. For distributed transactions across multiple shards, see Distributed Transactions.
Architecture
```mermaid
flowchart TD
    subgraph TransactionManager
        TxMap[Transaction Map]
        TxCounter[TX Counter]
        DefaultTimeout[Default Timeout]
    end
    subgraph RowLockManager
        LockMap["Locks: (table, row_id) -> RowLock"]
        TxLocks["TX Locks: tx_id -> Vec<(table, row_id)>"]
        LockTimeout[Lock Timeout]
    end
    subgraph Transaction
        TxId[Transaction ID]
        Phase[TxPhase State]
        UndoLog["Undo Log: Vec<UndoEntry>"]
        AffectedTables[Affected Tables]
        StartTime[Started At]
    end
    TransactionManager --> RowLockManager
    TransactionManager --> Transaction
    Transaction --> UndoLog
```
Component Relationships
```text
+-------------------+
| RelationalEngine  |
+-------------------+
         |
         v
+-------------------+      +------------------+
| TransactionManager| -->  | RowLockManager   |
| - begin()         |      | - try_lock()     |
| - commit()        |      | - release()      |
| - rollback()      |      | - cleanup_expired|
+-------------------+      +------------------+
         |
         v
+-------------------+
| Transaction       |
| - tx_id           |
| - phase           |
| - undo_log        |
+-------------------+
```
Transaction Lifecycle
Transactions follow a 5-state machine with well-defined transitions:
```mermaid
flowchart LR
    Active --> Committing
    Active --> Aborting
    Committing --> Committed
    Aborting --> Aborted
    style Committed fill:#9f9
    style Aborted fill:#f99
```
| Phase | Description | Valid Transitions |
|---|---|---|
| Active | Operations allowed, locks acquired | Committing, Aborting |
| Committing | Finalizing changes | Committed |
| Committed | Changes permanent (terminal) | None |
| Aborting | Rolling back via undo log | Aborted |
| Aborted | Changes reverted (terminal) | None |
Lifecycle Flow
```rust
// Begin transaction
let tx_id = tx_manager.begin(); // Phase: Active

// Acquire row locks for modifications
lock_manager.try_lock(tx_id, &[("users", 1), ("users", 2)])?;

// Record undo entries for rollback
tx_manager.record_undo(tx_id, UndoEntry::UpdatedRow { /* ... */ });

// Option 1: Commit
tx_manager.set_phase(tx_id, TxPhase::Committing);
// Apply changes...
tx_manager.set_phase(tx_id, TxPhase::Committed);
tx_manager.release_locks(tx_id);

// Option 2: Rollback
tx_manager.set_phase(tx_id, TxPhase::Aborting);
for entry in tx_manager.get_undo_log(tx_id).iter().rev() {
    // Apply undo entry...
}
tx_manager.set_phase(tx_id, TxPhase::Aborted);
tx_manager.release_locks(tx_id);
```
Undo Log
The undo log stores entries needed to reverse each operation on rollback. Entries are applied in reverse order during rollback.
UndoEntry Variants
| Variant | Action on Rollback | Stored Data |
|---|---|---|
| InsertedRow | Delete the row | table, slab_row_id, row_id, index_entries |
| UpdatedRow | Restore old values | table, slab_row_id, row_id, old_values, index_changes |
| DeletedRow | Re-insert the row | table, slab_row_id, row_id, old_values, index_entries |
Undo Log Structure
Transaction Undo Log:

```text
+---------------------------------------------------+
| Entry 0: InsertedRow { table: "users", row_id: 5 }|
+---------------------------------------------------+
| Entry 1: UpdatedRow { table: "users", row_id: 3,  |
|          old_values: [Int(25)], ... }             |
+---------------------------------------------------+
| Entry 2: DeletedRow { table: "orders", row_id: 7, |
|          old_values: [...], index_entries: [...] }|
+---------------------------------------------------+

Rollback order: Entry 2 -> Entry 1 -> Entry 0
```
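Reverse-order replay can be illustrated with a minimal standalone model. The types here are simplified stand-ins (a single-column table as a map, a pared-down `Undo` enum); the real `UndoEntry` carries more fields.

```rust
use std::collections::HashMap;

// Minimal model of undo entries for a single-column table.
enum Undo {
    Inserted { row_id: u64 },
    Updated { row_id: u64, old_value: i64 },
    Deleted { row_id: u64, old_value: i64 },
}

// Apply the log in reverse order, undoing each operation.
fn rollback(table: &mut HashMap<u64, i64>, log: &[Undo]) {
    for entry in log.iter().rev() {
        match entry {
            Undo::Inserted { row_id } => { table.remove(row_id); }
            Undo::Updated { row_id, old_value } => { table.insert(*row_id, *old_value); }
            Undo::Deleted { row_id, old_value } => { table.insert(*row_id, *old_value); }
        }
    }
}

fn main() {
    // Transaction inserted row 5, updated row 3 (25 -> 30), deleted row 7 (was 99)
    let mut table = HashMap::from([(3u64, 30i64), (5, 1)]);
    let log = vec![
        Undo::Inserted { row_id: 5 },
        Undo::Updated { row_id: 3, old_value: 25 },
        Undo::Deleted { row_id: 7, old_value: 99 },
    ];
    rollback(&mut table, &log);
    assert_eq!(table.get(&3), Some(&25)); // old value restored
    assert_eq!(table.get(&5), None);      // insert undone
    assert_eq!(table.get(&7), Some(&99)); // delete undone
    println!("ok");
}
```

Reverse order matters when a transaction touches the same row more than once: undoing the latest change first guarantees each earlier undo sees the state it expects.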
Index Change Tracking
Updates that modify indexed columns record IndexChange entries:
```rust
pub struct IndexChange {
    pub column: String,   // Column name
    pub old_value: Value, // Value before update
    pub new_value: Value, // Value after update
}
```
On rollback, index entries are reverted:
- Remove new index entry for the new value
- Restore old index entry for the old value
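The two-step revert can be sketched against a toy hash index. This is a hypothetical model (an `i64`-keyed map standing in for the real index, with a pared-down `IndexChange`), not the engine's code:

```rust
use std::collections::HashMap;

// Toy hash index: value -> row ids holding that value.
type Index = HashMap<i64, Vec<u64>>;

struct IndexChange { old_value: i64, new_value: i64 }

// Revert an indexed-column update for `row_id`:
// remove the entry for the new value, restore the entry for the old value.
fn revert(index: &mut Index, row_id: u64, change: &IndexChange) {
    if let Some(ids) = index.get_mut(&change.new_value) {
        ids.retain(|&id| id != row_id);
        if ids.is_empty() {
            index.remove(&change.new_value);
        }
    }
    index.entry(change.old_value).or_default().push(row_id);
}

fn main() {
    // Row 7's age was updated 25 -> 30; the index currently reflects 30.
    let mut index: Index = HashMap::from([(30, vec![7])]);
    revert(&mut index, 7, &IndexChange { old_value: 25, new_value: 30 });
    assert_eq!(index.get(&30), None);         // new entry removed
    assert_eq!(index.get(&25), Some(&vec![7])); // old entry restored
    println!("ok");
}
```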
Row-Level Locking
The RowLockManager provides pessimistic row-level locking with atomic
multi-row acquisition.
Lock Acquisition
```mermaid
flowchart TD
    Request[Lock Request] --> Check{All rows available?}
    Check -->|Yes| Acquire[Acquire all locks atomically]
    Check -->|No| Conflict[Return LockConflictInfo]
    Acquire --> Success[Success]
    style Conflict fill:#f99
    style Success fill:#9f9
```
Locks are acquired atomically to prevent partial lock acquisition:
```rust
// Atomic multi-row locking
let rows = vec![
    ("users".to_string(), 1),
    ("users".to_string(), 2),
    ("orders".to_string(), 5),
];

match lock_manager.try_lock(tx_id, &rows) {
    Ok(()) => {
        // All locks acquired
    }
    Err(conflict) => {
        // No locks acquired; conflict info provided
        println!("Blocked by tx {}", conflict.blocking_tx);
    }
}
```
Lock Semantics
| Property | Behavior |
|---|---|
| Granularity | Row-level: (table, row_id) |
| Acquisition | All-or-nothing atomic |
| Re-entry | Same transaction can re-acquire its locks |
| Timeout | Configurable, default 30 seconds |
| Expiration | Expired locks are treated as available |
Lock Conflict Detection
When a lock conflict occurs, LockConflictInfo provides details:
```rust
pub struct LockConflictInfo {
    pub blocking_tx: u64, // Transaction holding the lock
    pub table: String,    // Table name
    pub row_id: u64,      // Row ID
}
```
Expired Lock Handling
Locks automatically expire after the configured timeout:
```rust
// Check if lock is expired
if lock.is_expired() {
    // Lock can be acquired by another transaction
}

// Periodic cleanup of expired locks
let cleaned = lock_manager.cleanup_expired();
```
Deadline
The Deadline struct provides monotonic time-based timeout checking:
```rust
// Create deadline with timeout
let deadline = Deadline::from_timeout_ms(Some(5000));

// Check expiration
if deadline.is_expired() {
    return Err(TimeoutError);
}

// Get remaining time
if let Some(remaining) = deadline.remaining_ms() {
    println!("{} ms remaining", remaining);
}

// Never-expiring deadline
let no_deadline = Deadline::never();
```
Benefits of monotonic time (Instant):
- Immune to system clock changes
- Consistent timeout behavior
- No backwards time jumps
Configuration
Transaction Manager
| Parameter | Default | Description |
|---|---|---|
| default_timeout | 60 seconds | Maximum transaction duration |
Row Lock Manager
| Parameter | Default | Description |
|---|---|---|
| default_timeout | 30 seconds | Maximum time to hold a lock |
Custom Configuration
```rust
use std::time::Duration;

// Custom transaction timeout
let tx_manager = TransactionManager::with_timeout(Duration::from_secs(120));

// Custom lock timeout
let lock_manager = RowLockManager::with_default_timeout(Duration::from_secs(60));
```
Error Handling
Lock Errors
| Error | Cause | Recovery |
|---|---|---|
| LockConflict | Row locked by another transaction | Retry with exponential backoff |
| Lock timeout | Could not acquire lock in time | Rollback and retry |
Transaction Errors
| Error | Cause | Recovery |
|---|---|---|
| Transaction not found | Invalid transaction ID | Start new transaction |
| Transaction expired | Exceeded timeout | Transaction auto-aborted |
| Invalid phase transition | Illegal state change | Check transaction state |
Cleanup Operations
```rust
// Clean up expired transactions (releases their locks)
let expired_count = tx_manager.cleanup_expired();

// Clean up expired locks only
let expired_locks = lock_manager.cleanup_expired();
```
Comparison with Distributed Transactions
| Aspect | Relational Tx | Distributed Tx (2PC) |
|---|---|---|
| Scope | Single shard | Cross-shard |
| Protocol | Local locking | Prepare/Commit phases |
| Deadlock detection | Timeout-based | Wait-for graph analysis |
| Coordinator | None | DistributedTxCoordinator |
| Recovery | Undo log | WAL + 2PC recovery |
| Latency | Low (local) | Higher (network round-trips) |
| Isolation | Row-level locks | Key-level locks |
For cross-shard transactions, use tensor_chain’s
Distributed Transactions.
Usage Examples
Basic Transaction
```rust
let engine = RelationalEngine::new();
let tx_manager = engine.tx_manager();

// Begin transaction
let tx_id = tx_manager.begin();

// Perform operations (simplified)
// engine.insert_tx(tx_id, "users", values)?;
// engine.update_tx(tx_id, "users", condition, updates)?;

// Commit
tx_manager.set_phase(tx_id, TxPhase::Committing);
// Apply pending changes...
tx_manager.set_phase(tx_id, TxPhase::Committed);
tx_manager.release_locks(tx_id);
tx_manager.remove(tx_id);
```
Rollback on Error
```rust
let tx_id = tx_manager.begin();

match perform_operations(tx_id) {
    Ok(()) => {
        tx_manager.set_phase(tx_id, TxPhase::Committed);
    }
    Err(e) => {
        tx_manager.set_phase(tx_id, TxPhase::Aborting);
        // Apply undo log in reverse
        if let Some(undo_log) = tx_manager.get_undo_log(tx_id) {
            for entry in undo_log.iter().rev() {
                apply_undo(entry);
            }
        }
        tx_manager.set_phase(tx_id, TxPhase::Aborted);
    }
}

tx_manager.release_locks(tx_id);
tx_manager.remove(tx_id);
```
Checking Lock Status
```rust
let lock_manager = tx_manager.lock_manager();

// Check if row is locked
if lock_manager.is_locked("users", 42) {
    println!("Row is locked");
}

// Get lock holder
if let Some(holder_tx) = lock_manager.lock_holder("users", 42) {
    println!("Locked by transaction {}", holder_tx);
}

// Count active locks
println!("{} active locks", lock_manager.active_lock_count());
println!("{} locks held by tx {}", lock_manager.locks_held_by(tx_id), tx_id);
```
Key Types
| Type | Description |
|---|---|
| TxPhase | 5-state transaction phase enum |
| Transaction | Active transaction state |
| UndoEntry | Undo log entry for rollback |
| IndexChange | Index modification record |
| RowLock | Row lock with timeout |
| RowLockManager | Row-level lock manager |
| LockConflictInfo | Lock conflict details |
| Deadline | Monotonic timeout checker |
| TransactionManager | Transaction lifecycle manager |
Source References
- relational_engine/src/transaction.rs - Transaction implementation
- relational_engine/src/lib.rs - Integration with RelationalEngine
Graph Engine
The Graph Engine provides graph operations on top of the Tensor Store. It implements a labeled property graph model with support for both directed and undirected edges, BFS traversals, and shortest path finding. The engine inherits thread safety from TensorStore and supports cross-engine unified entity connections.
Design Principles
| Principle | Description |
|---|---|
| Layered Architecture | Depends only on Tensor Store for persistence |
| Direction-Aware | Supports both directed and undirected edges |
| BFS Traversal | Breadth-first search for shortest paths |
| Cycle-Safe | Handles cyclic graphs without infinite loops via visited set |
| Unified Entities | Edges can connect shared entities across engines |
| Thread Safety | Inherits from Tensor Store’s DashMap (~16 shards) |
| Serializable Types | All types implement serde Serialize/Deserialize |
| Parallel Optimization | High-degree node deletion uses rayon for parallelism |
Key Types
Core Types
| Type | Description |
|---|---|
| `GraphEngine` | Main entry point for graph operations |
| `Node` | Graph node with id, label, and properties |
| `Edge` | Graph edge with from/to nodes, type, properties, and direction flag |
| `Path` | Result of path finding containing node and edge sequences |
| `Direction` | Edge traversal direction (Outgoing, Incoming, Both) |
| `PropertyValue` | Node/edge property values (Null, Int, Float, String, Bool) |
| `GraphError` | Error types for graph operations |
PropertyValue Variants
| Variant | Rust Type | Description |
|---|---|---|
| `Null` | — | NULL value |
| `Int` | `i64` | 64-bit signed integer |
| `Float` | `f64` | 64-bit floating point |
| `String` | `String` | UTF-8 string |
| `Bool` | `bool` | Boolean |
Error Types
| Error | Cause |
|---|---|
| `NodeNotFound(u64)` | Node with given ID does not exist |
| `EdgeNotFound(u64)` | Edge with given ID does not exist |
| `PathNotFound` | No path exists between the specified nodes |
| `StorageError(String)` | Underlying Tensor Store error |
Architecture
```mermaid
graph TB
    subgraph GraphEngine
        GE[GraphEngine]
        NC[Node Counter<br/>AtomicU64]
        EC[Edge Counter<br/>AtomicU64]
    end
    subgraph Storage["Storage Model"]
        NM["node:{id}"]
        NO["node:{id}:out"]
        NI["node:{id}:in"]
        EM["edge:{id}"]
    end
    subgraph Operations
        CreateNode[create_node]
        CreateEdge[create_edge]
        Neighbors[neighbors]
        Traverse[traverse]
        FindPath[find_path]
    end
    GE --> NC
    GE --> EC
    GE --> TS[TensorStore]
    CreateNode --> NM
    CreateNode --> NO
    CreateNode --> NI
    CreateEdge --> EM
    CreateEdge --> NO
    CreateEdge --> NI
    Neighbors --> NO
    Neighbors --> NI
    Traverse --> Neighbors
    FindPath --> NO
```
Internal Architecture
GraphEngine Struct
```rust
pub struct GraphEngine {
    store: TensorStore,       // Underlying key-value storage
    node_counter: AtomicU64,  // Atomic counter for node IDs
    edge_counter: AtomicU64,  // Atomic counter for edge IDs
}
```
The engine uses atomic counters (SeqCst ordering) to generate unique IDs:
- Node IDs start at 1 and increment monotonically
- Edge IDs are separate from node IDs
- Both counters support concurrent ID allocation
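A minimal std-only sketch of this allocation scheme (names assumed, not the engine's actual code): `fetch_add(1, SeqCst)` returns the previous value, so adding 1 yields IDs starting at 1, and two independent counters keep node and edge ID spaces separate even under concurrency.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Sketch of the counter scheme: separate node/edge counters,
/// SeqCst ordering, IDs starting at 1.
pub struct IdGen {
    node_counter: AtomicU64,
    edge_counter: AtomicU64,
}

impl IdGen {
    pub fn new() -> Self {
        Self {
            node_counter: AtomicU64::new(0),
            edge_counter: AtomicU64::new(0),
        }
    }

    // fetch_add returns the previous value, so +1 yields IDs from 1
    pub fn next_node_id(&self) -> u64 {
        self.node_counter.fetch_add(1, Ordering::SeqCst) + 1
    }

    pub fn next_edge_id(&self) -> u64 {
        self.edge_counter.fetch_add(1, Ordering::SeqCst) + 1
    }
}

fn main() {
    let ids = Arc::new(IdGen::new());

    // Concurrent allocation from 8 threads never duplicates an ID
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let g = Arc::clone(&ids);
            thread::spawn(move || (0..1000).map(|_| g.next_node_id()).collect::<Vec<u64>>())
        })
        .collect();

    let mut allocated: Vec<u64> = handles
        .into_iter()
        .flat_map(|h| h.join().unwrap())
        .collect();
    allocated.sort_unstable();
    allocated.dedup();
    assert_eq!(allocated.len(), 8000); // all 8000 IDs unique

    // Edge counter is independent of the node counter
    assert_eq!(ids.next_edge_id(), 1);
}
```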
Key Generation Functions
```rust
fn node_key(id: u64) -> String {
    format!("node:{}", id)
}

fn edge_key(id: u64) -> String {
    format!("edge:{}", id)
}

fn outgoing_edges_key(node_id: u64) -> String {
    format!("node:{}:out", node_id)
}

fn incoming_edges_key(node_id: u64) -> String {
    format!("node:{}:in", node_id)
}
```
Storage Model
Nodes and edges are stored in Tensor Store using the following key patterns:
| Key Pattern | Content | TensorData Fields |
|---|---|---|
| `node:{id}` | Node data | `_id`, `_type="node"`, `_label`, user properties |
| `node:{id}:out` | List of outgoing edge IDs | `e{edge_id}` fields |
| `node:{id}:in` | List of incoming edge IDs | `e{edge_id}` fields |
| `edge:{id}` | Edge data | `_id`, `_type="edge"`, `_from`, `_to`, `_edge_type`, `_directed`, user properties |
Edge List Storage Format
Edge lists are stored as TensorData with dynamically named fields:
```rust
// Each edge ID stored as: "e{edge_id}" -> edge_id
tensor.set("e1", TensorValue::Scalar(ScalarValue::Int(1)));
tensor.set("e5", TensorValue::Scalar(ScalarValue::Int(5)));
```
This format allows O(1) edge addition but O(n) edge listing. Edge retrieval scans every field whose name starts with `e`:
```rust
fn get_edge_list(&self, key: &str) -> Result<Vec<u64>> {
    let tensor = self.store.get(key)?;
    let mut edges = Vec::new();
    for k in tensor.keys() {
        if k.starts_with('e') {
            if let Some(TensorValue::Scalar(ScalarValue::Int(id))) = tensor.get(k) {
                edges.push(*id as u64);
            }
        }
    }
    Ok(edges)
}
```
API Reference
Engine Construction
```rust
// Create new engine with internal store
let engine = GraphEngine::new();

// Create engine with shared store (for cross-engine queries)
let store = TensorStore::new();
let engine = GraphEngine::with_store(store.clone());

// Access underlying store
let store = engine.store();
```
Node Operations
```rust
// Create node with properties
let mut props = HashMap::new();
props.insert("name".to_string(), PropertyValue::String("Alice".into()));
props.insert("age".to_string(), PropertyValue::Int(30));
let id = engine.create_node("Person", props)?;

// Get node by ID
let node = engine.get_node(id)?;

// Check node existence
let exists = engine.node_exists(id);

// Delete node (cascades to connected edges)
engine.delete_node(id)?;

// Count nodes in graph
let count = engine.node_count();
```
Edge Operations
```rust
// Create directed edge
let edge_id = engine.create_edge(from, to, "KNOWS", properties, true)?;

// Create undirected edge
let edge_id = engine.create_edge(from, to, "FRIENDS", properties, false)?;

// Get edge by ID
let edge = engine.get_edge(edge_id)?;
```
Undirected Edge Implementation
When an undirected edge is created, it is added to four edge lists to enable bidirectional traversal:
```rust
if !directed {
    // Add to both nodes' outgoing AND incoming lists
    self.add_edge_to_list(Self::outgoing_edges_key(to), id)?;
    self.add_edge_to_list(Self::incoming_edges_key(from), id)?;
}
```
This enables undirected edges to be traversed from either endpoint regardless of direction filter.
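The bookkeeping can be illustrated with a plain in-memory map in place of the store (a hypothetical helper, not the engine's code): once an undirected edge sits in all four lists, a neighbor lookup finds it from either endpoint under any direction filter.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory edge lists keyed like the storage model:
/// "node:{id}:out" / "node:{id}:in" -> edge IDs (illustration only).
fn add_undirected(lists: &mut HashMap<String, Vec<u64>>, from: u64, to: u64, edge_id: u64) {
    // Directed bookkeeping: from's out list, to's in list
    lists.entry(format!("node:{from}:out")).or_default().push(edge_id);
    lists.entry(format!("node:{to}:in")).or_default().push(edge_id);

    // Undirected extra bookkeeping: the reverse pair as well
    lists.entry(format!("node:{to}:out")).or_default().push(edge_id);
    lists.entry(format!("node:{from}:in")).or_default().push(edge_id);
}

fn main() {
    let mut lists = HashMap::new();
    add_undirected(&mut lists, 1, 2, 7);

    // The edge is visible from both endpoints in both directions
    for node in [1u64, 2] {
        assert!(lists[&format!("node:{node}:out")].contains(&7));
        assert!(lists[&format!("node:{node}:in")].contains(&7));
    }
}
```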
Traversal Operations
```rust
// Get neighbors (all edge types, both directions)
let neighbors = engine.neighbors(node_id, None, Direction::Both)?;

// Get neighbors filtered by edge type
let friends = engine.neighbors(node_id, Some("FRIENDS"), Direction::Both)?;

// BFS traversal with depth limit
let nodes = engine.traverse(start_id, Direction::Outgoing, max_depth, None)?;

// Traversal filtered by edge type
let deps = engine.traverse(start_id, Direction::Outgoing, 10, Some("DEPENDS_ON"))?;

// Find shortest path (BFS)
let path = engine.find_path(from_id, to_id)?;
```
Direction Enum
| Direction | Behavior |
|---|---|
| `Outgoing` | Follow edges away from the node |
| `Incoming` | Follow edges toward the node |
| `Both` | Follow edges in either direction |
BFS Traversal Algorithm
The `traverse` method implements breadth-first search with depth limiting and cycle detection:
```mermaid
flowchart TD
    Start[Start: traverse] --> Init[Initialize visited set<br/>Initialize result vec<br/>Initialize queue with start, depth=0]
    Init --> Check{Queue empty?}
    Check -- No --> Pop[Pop current_id, depth]
    Pop --> GetNode[Get node, add to result]
    GetNode --> DepthCheck{depth >= max_depth?}
    DepthCheck -- Yes --> Check
    DepthCheck -- No --> GetNeighbors[Get neighbor IDs]
    GetNeighbors --> ForEach[For each neighbor]
    ForEach --> Visited{Already visited?}
    Visited -- Yes --> ForEach
    Visited -- No --> Add[Add to visited<br/>Push to queue with depth+1]
    Add --> ForEach
    Check -- Yes --> Return[Return result]
```
Implementation Details
```rust
pub fn traverse(
    &self,
    start: u64,
    direction: Direction,
    max_depth: usize,
    edge_type: Option<&str>,
) -> Result<Vec<Node>> {
    if !self.node_exists(start) {
        return Err(GraphError::NodeNotFound(start));
    }

    let mut visited = HashSet::new();
    let mut result = Vec::new();
    let mut queue = VecDeque::new();

    queue.push_back((start, 0usize));
    visited.insert(start);

    while let Some((current_id, depth)) = queue.pop_front() {
        if let Ok(node) = self.get_node(current_id) {
            result.push(node);
        }
        if depth >= max_depth {
            continue;
        }
        let neighbors = self.get_neighbor_ids(current_id, edge_type, direction)?;
        for neighbor_id in neighbors {
            if !visited.contains(&neighbor_id) {
                visited.insert(neighbor_id);
                queue.push_back((neighbor_id, depth + 1));
            }
        }
    }

    Ok(result)
}
```
Key Properties
- Cycle-Safe: The `visited` HashSet prevents revisiting nodes
- Depth-Limited: The `max_depth` parameter bounds traversal depth
- Level-Order: BFS naturally visits nodes in level order
- Start Node Included: The starting node is always in the result at depth 0
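To make the cycle safety and depth limiting concrete, here is the same algorithm detached from the engine's storage, running over a plain adjacency map (std only; `bfs` is an illustrative helper, not part of the API):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Depth-limited BFS over an adjacency map; same shape as `traverse`,
/// but over plain u64 IDs instead of stored nodes.
fn bfs(adj: &HashMap<u64, Vec<u64>>, start: u64, max_depth: usize) -> Vec<u64> {
    let mut visited = HashSet::from([start]);
    let mut result = Vec::new();
    let mut queue = VecDeque::from([(start, 0usize)]);

    while let Some((id, depth)) = queue.pop_front() {
        result.push(id);
        if depth >= max_depth {
            continue;
        }
        for &n in adj.get(&id).map(Vec::as_slice).unwrap_or(&[]) {
            // insert() returns false for already-visited nodes: cycle safety
            if visited.insert(n) {
                queue.push_back((n, depth + 1));
            }
        }
    }
    result
}

fn main() {
    // Cycle: 1 -> 2 -> 3 -> 1, plus 3 -> 4
    let adj = HashMap::from([(1, vec![2]), (2, vec![3]), (3, vec![1, 4])]);

    // Cycle-safe: terminates, each node appears once, in level order
    assert_eq!(bfs(&adj, 1, 10), vec![1, 2, 3, 4]);

    // Depth-limited: depth 1 reaches only the start and its neighbors
    assert_eq!(bfs(&adj, 1, 1), vec![1, 2]);

    // Start node is always included, even at depth 0
    assert_eq!(bfs(&adj, 1, 0), vec![1]);
}
```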
Shortest Path Algorithm
The `find_path` method uses BFS to find the shortest (minimum hop) path between two nodes:
```mermaid
flowchart TD
    Start[Start: find_path] --> Validate[Validate from and to exist]
    Validate --> SameNode{from == to?}
    SameNode -- Yes --> ReturnSingle[Return path with single node]
    SameNode -- No --> InitBFS[Initialize BFS:<br/>visited set<br/>queue with from<br/>parent map]
    InitBFS --> BFSLoop{Queue empty?}
    BFSLoop -- Yes --> NotFound[Return PathNotFound]
    BFSLoop -- No --> Dequeue[Dequeue current node]
    Dequeue --> GetEdges[Get outgoing edges]
    GetEdges --> ForEdge[For each edge]
    ForEdge --> GetNeighbor[Determine neighbor<br/>considering direction]
    GetNeighbor --> VisitedCheck{Visited?}
    VisitedCheck -- Yes --> ForEdge
    VisitedCheck -- No --> MarkVisited[Mark visited<br/>Record parent + edge]
    MarkVisited --> FoundTarget{neighbor == to?}
    FoundTarget -- Yes --> Reconstruct[Reconstruct path]
    FoundTarget -- No --> Enqueue[Enqueue neighbor]
    Enqueue --> ForEdge
    ForEdge --> BFSLoop
    Reconstruct --> Return[Return Path]
```
Implementation Details
```rust
pub fn find_path(&self, from: u64, to: u64) -> Result<Path> {
    // Validate endpoints exist
    if !self.node_exists(from) {
        return Err(GraphError::NodeNotFound(from));
    }
    if !self.node_exists(to) {
        return Err(GraphError::NodeNotFound(to));
    }

    // Handle trivial case
    if from == to {
        return Ok(Path { nodes: vec![from], edges: vec![] });
    }

    // BFS for shortest path
    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    // node -> (parent_node, edge_id)
    let mut parent: HashMap<u64, (u64, u64)> = HashMap::new();

    queue.push_back(from);
    visited.insert(from);

    while let Some(current) = queue.pop_front() {
        let out_edges = self.get_edge_list(&Self::outgoing_edges_key(current))?;
        for edge_id in out_edges {
            if let Ok(edge) = self.get_edge(edge_id) {
                let neighbor = if edge.from == current {
                    edge.to
                } else if !edge.directed && edge.to == current {
                    edge.from
                } else {
                    continue;
                };
                if !visited.contains(&neighbor) {
                    visited.insert(neighbor);
                    parent.insert(neighbor, (current, edge_id));
                    if neighbor == to {
                        return Ok(self.reconstruct_path(from, to, &parent));
                    }
                    queue.push_back(neighbor);
                }
            }
        }
    }

    Err(GraphError::PathNotFound)
}
```
Path Reconstruction
The path is reconstructed by following parent pointers backwards from the target to the source:
```rust
fn reconstruct_path(&self, from: u64, to: u64, parent: &HashMap<u64, (u64, u64)>) -> Path {
    let mut nodes = Vec::new();
    let mut edges = Vec::new();
    let mut current = to;

    // Walk backwards from target to source
    while current != from {
        nodes.push(current);
        if let Some((p, edge_id)) = parent.get(&current) {
            edges.push(*edge_id);
            current = *p;
        } else {
            break;
        }
    }
    nodes.push(from);

    // Reverse to get source-to-target order
    nodes.reverse();
    edges.reverse();

    Path { nodes, edges }
}
```
Parallel Deletion Optimization
High-degree nodes (>100 edges) use rayon’s parallel iterator for edge deletion:
```rust
const PARALLEL_THRESHOLD: usize = 100;

pub fn delete_node(&self, id: u64) -> Result<()> {
    if !self.node_exists(id) {
        return Err(GraphError::NodeNotFound(id));
    }

    // Collect all connected edges
    let out_edges = self.get_edge_list(&Self::outgoing_edges_key(id))?;
    let in_edges = self.get_edge_list(&Self::incoming_edges_key(id))?;
    let all_edges: Vec<u64> = out_edges.into_iter().chain(in_edges).collect();

    // Parallel deletion for high-degree nodes
    if all_edges.len() >= Self::PARALLEL_THRESHOLD {
        all_edges.par_iter().for_each(|edge_id| {
            let _ = self.store.delete(&Self::edge_key(*edge_id));
        });
    } else {
        for edge_id in all_edges {
            let _ = self.store.delete(&Self::edge_key(edge_id));
        }
    }

    // Delete node and edge lists
    self.store.delete(&Self::node_key(id))?;
    self.store.delete(&Self::outgoing_edges_key(id))?;
    self.store.delete(&Self::incoming_edges_key(id))?;
    Ok(())
}
```
Performance Characteristics
| Edge Count | Deletion Strategy | Benefit |
|---|---|---|
| < 100 | Sequential | Lower overhead for small nodes |
| >= 100 | Parallel (rayon) | ~2-4x speedup on multi-core systems |
Unified Entity API
The Unified Entity API connects any shared entities (not just graph nodes) for cross-engine queries. Entity edges use the `_out` and `_in` reserved fields in TensorData, enabling the same entity key to have relational fields, graph connections, and a vector embedding.
```mermaid
graph LR
    subgraph Entity["Entity (TensorData)"]
        Fields[User Fields<br/>name, age, etc.]
        Out["_out<br/>[edge keys]"]
        In["_in<br/>[edge keys]"]
        Emb["_embedding<br/>[vector]"]
    end
    subgraph Engines
        RE[Relational Engine]
        GE[Graph Engine]
        VE[Vector Engine]
    end
    Fields --> RE
    Out --> GE
    In --> GE
    Emb --> VE
```
Reserved Fields
| Field | Type | Purpose |
|---|---|---|
| `_out` | `Vec<String>` | Outgoing edge keys |
| `_in` | `Vec<String>` | Incoming edge keys |
| `_embedding` | `Vec<f32>` | Vector embedding |
| `_type` | `String` | Entity type |
| `_id` | `i64` | Entity numeric ID |
| `_label` | `String` | Entity label |
Entity Edge Key Format
Entity edges use a different key format from node-based edges:
`edge:{edge_type}:{edge_id}`
For example: `edge:follows:42`
API Reference
```rust
// Create engine with shared store
let store = TensorStore::new();
let engine = GraphEngine::with_store(store.clone());

// Add directed edge between entities
let edge_key = engine.add_entity_edge("user:1", "user:2", "follows")?;

// Add undirected edge between entities
let edge_key = engine.add_entity_edge_undirected("user:1", "user:2", "friend")?;

// Get neighbors
let neighbors = engine.get_entity_neighbors("user:1")?;
let out_neighbors = engine.get_entity_neighbors_out("user:1")?;
let in_neighbors = engine.get_entity_neighbors_in("user:1")?;

// Get edge lists
let outgoing = engine.get_entity_outgoing("user:1")?;
let incoming = engine.get_entity_incoming("user:1")?;

// Get edge details
let (from, to, edge_type, directed) = engine.get_entity_edge(&edge_key)?;

// Check if entity has edges
let has_edges = engine.entity_has_edges("user:1");

// Delete edge
engine.delete_entity_edge(&edge_key)?;

// Scan for entities with edges
let entities = engine.scan_entities_with_edges();
```
Undirected Entity Edges
For undirected entity edges, both entities receive the edge in both _out and
_in:
```rust
pub fn add_entity_edge_undirected(
    &self,
    key1: &str,
    key2: &str,
    edge_type: &str,
) -> Result<String> {
    // ... create edge data ...

    // Both entities get the edge in both directions
    let mut entity1 = self.get_or_create_entity(key1);
    entity1.add_outgoing_edge(edge_key.clone());
    entity1.add_incoming_edge(edge_key.clone());

    let mut entity2 = self.get_or_create_entity(key2);
    entity2.add_outgoing_edge(edge_key.clone());
    entity2.add_incoming_edge(edge_key.clone());

    Ok(edge_key)
}
```
Cross-Engine Integration
Query Router Integration
The Query Router provides unified queries combining graph traversal with vector similarity:
```rust
// Find entities similar to query that are connected to a specific entity
let items = router.find_similar_connected("query:entity", "connected_to:entity", top_k)?;

// Find graph neighbors sorted by embedding similarity
let items = router.find_neighbors_by_similarity("entity:key", &query_vector, top_k)?;
```
Tensor Vault Integration
Tensor Vault uses GraphEngine for access control relationships:
```rust
pub struct Vault {
    store: TensorStore,
    pub graph: Arc<GraphEngine>, // Shared graph for access edges
    // ...
}
```
Access control edges connect principals to secrets with permission metadata.
Tensor Chain Integration
Tensor Chain uses GraphEngine for block linking:
```rust
pub struct Chain {
    graph: Arc<GraphEngine>, // Stores blocks as nodes, links as edges
    // ...
}
```
Blocks are stored with `chain:block:{height}` keys and linked via graph edges with type `chain_next`.
Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| `create_node` | O(1) | Store put |
| `create_edge` | O(1) | Store put + edge list updates |
| `get_node` | O(1) | Store get |
| `get_edge` | O(1) | Store get |
| `neighbors` | O(e) | e = edges from node |
| `traverse` | O(n + e) | BFS over reachable nodes |
| `find_path` | O(n + e) | BFS shortest path |
| `delete_node` | O(e) | Parallel for e >= 100 |
| `node_count` | O(k) | k = total keys (scan-based) |
| `get_edge_list` | O(k) | k = keys in edge list |
Memory Characteristics
| Data | Storage |
|---|---|
| Node | ~50-200 bytes + properties |
| Edge | ~50-150 bytes + properties |
| Edge list entry | ~10 bytes per edge |
Edge Cases and Gotchas
Self-Loop Edges
Self-loops (edges from a node to itself) are valid but filtered from neighbor results:
```rust
#[test]
fn self_loop_edge() {
    let engine = GraphEngine::new();
    let n1 = engine.create_node("A", HashMap::new()).unwrap();
    engine.create_edge(n1, n1, "SELF", HashMap::new(), true).unwrap();

    // Self-loop doesn't appear in neighbors
    let neighbors = engine.neighbors(n1, None, Direction::Both).unwrap();
    assert_eq!(neighbors.len(), 0);
}
```
Same-Node Path
Finding a path from a node to itself returns a single-node path:
```rust
let path = engine.find_path(n1, n1)?;
assert_eq!(path.nodes, vec![n1]);
assert!(path.edges.is_empty());
```
Deleted Edge Orphans
When deleting a node, connected edges are deleted from storage but may remain referenced in other nodes' edge lists. This is a known limitation: edge retrieval gracefully skips the resulting missing edges.
Bytes Property Conversion
`ScalarValue::Bytes` converts to `PropertyValue::Null` since `PropertyValue` doesn't support binary data:
```rust
let bytes = ScalarValue::Bytes(vec![1, 2, 3]);
assert_eq!(PropertyValue::from_scalar(&bytes), PropertyValue::Null);
```
Node Count Calculation
The `node_count` method uses a formula based on scan counts to account for edge lists:
```rust
pub fn node_count(&self) -> usize {
    // Each node has 3 keys: node:{id}, node:{id}:out, node:{id}:in
    self.store.scan_count("node:") - self.store.scan_count("node:") / 3 * 2
}
```
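Since each node contributes exactly 3 keys with the `node:` prefix, `k - k/3*2` (with integer division) recovers the node count from the raw key count `k`. A quick check of the arithmetic, with the helper name assumed:

```rust
/// The node_count formula applied to a raw "node:"-prefixed key count k:
/// each node owns 3 keys, so k - k/3*2 == k/3 whenever k = 3n.
fn node_count_from_keys(k: usize) -> usize {
    k - k / 3 * 2
}

fn main() {
    // 1 node -> 3 keys (node:{id}, node:{id}:out, node:{id}:in)
    assert_eq!(node_count_from_keys(3), 1);
    // 5 nodes -> 15 keys
    assert_eq!(node_count_from_keys(15), 5);
    // Empty graph
    assert_eq!(node_count_from_keys(0), 0);
}
```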
Best Practices
Use Shared Store for Cross-Engine Queries
```rust
// Create shared store first
let store = TensorStore::new();

// Create engines with shared store
let graph = GraphEngine::with_store(store.clone());
let vector = VectorEngine::with_store(store.clone());

// Now entities can have both graph edges and embeddings
```
Prefer Entity API for Cross-Engine Data
Use the Unified Entity API when entities need to combine relational, graph, and vector data:
```rust
// Good: Entity API preserves all fields
engine.add_entity_edge("user:1", "user:2", "follows")?;

// Less flexible: Node API creates graph-only entities
engine.create_node("User", props)?;
```
Batch Edge Creation
Each edge creation performs a store put plus two edge-list updates, so creating many edges one at a time multiplies store overhead. Group edge creation where possible.
Choose Direction Wisely
- Use `Direction::Outgoing` for forward-only traversals (dependency graphs)
- Use `Direction::Both` for symmetric relationships (social graphs)
- Use `Direction::Incoming` for reverse lookups (finding predecessors)
Set Appropriate Traversal Depth
BFS traversal can be expensive on dense graphs. Set `max_depth` based on expected graph diameter:
```rust
// For typical social networks, 3-6 hops is usually sufficient
let reachable = engine.traverse(start, Direction::Both, 4, None)?;
```
Usage Examples
Social Network
```rust
let engine = GraphEngine::new();

// Create users
let alice = engine.create_node("User", user_props("Alice"))?;
let bob = engine.create_node("User", user_props("Bob"))?;
let charlie = engine.create_node("User", user_props("Charlie"))?;

// Create friendships (undirected)
engine.create_edge(alice, bob, "FRIENDS", HashMap::new(), false)?;
engine.create_edge(bob, charlie, "FRIENDS", HashMap::new(), false)?;

// Find path from Alice to Charlie
let path = engine.find_path(alice, charlie)?;
// path.nodes = [alice, bob, charlie]

// Get Alice's friends
let friends = engine.neighbors(alice, Some("FRIENDS"), Direction::Both)?;
```
Dependency Graph
```rust
let engine = GraphEngine::new();

// Create packages
let app = engine.create_node("Package", package_props("app"))?;
let lib_a = engine.create_node("Package", package_props("lib-a"))?;
let lib_b = engine.create_node("Package", package_props("lib-b"))?;

// Create dependencies (directed)
engine.create_edge(app, lib_a, "DEPENDS_ON", HashMap::new(), true)?;
engine.create_edge(app, lib_b, "DEPENDS_ON", HashMap::new(), true)?;
engine.create_edge(lib_a, lib_b, "DEPENDS_ON", HashMap::new(), true)?;

// Find all dependencies of app
let deps = engine.traverse(app, Direction::Outgoing, 10, Some("DEPENDS_ON"))?;
```
Cross-Engine Unified Entities
```rust
// Shared store for multiple engines
let store = TensorStore::new();
let graph = GraphEngine::with_store(store.clone());

// Add graph edges between entities
graph.add_entity_edge("user:1", "post:1", "created")?;
graph.add_entity_edge("user:2", "post:1", "liked")?;

// Query relationships
let creators = graph.get_entity_neighbors_in("post:1")?;
```
High-Degree Node Operations
```rust
let engine = GraphEngine::new();

// Create a hub with many connections (will use parallel deletion)
let hub = engine.create_node("Hub", HashMap::new())?;
for _ in 0..150 {
    let leaf = engine.create_node("Leaf", HashMap::new())?;
    engine.create_edge(hub, leaf, "CONNECTS", HashMap::new(), true)?;
}

// Deletion uses parallel processing (150 >= 100 threshold)
engine.delete_node(hub)?;
```
Configuration
The Graph Engine has minimal configuration as it inherits behavior from TensorStore:
| Setting | Value | Description |
|---|---|---|
| Parallel threshold | 100 | Edge count triggering parallel deletion |
| ID ordering | SeqCst | Atomic ordering for ID generation |
Dependencies
| Crate | Purpose |
|---|---|
| `tensor_store` | Underlying key-value storage |
| `rayon` | Parallel iteration for high-degree node deletion |
| `serde` | Serialization of graph types |
Related Modules
| Module | Relationship |
|---|---|
| Tensor Store | Storage backend |
| Tensor Vault | Uses graph for access control |
| Tensor Chain | Uses graph for block linking |
| Query Router | Executes graph queries |
Vector Engine
Module 4 of Neumann provides embedding storage and similarity search with SIMD-accelerated distance computations.
The Vector Engine builds on `tensor_store` to provide k-NN search capabilities. It supports both brute-force O(n) search and HNSW O(log n) approximate search, with automatic sparse vector optimization for memory efficiency.
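As a reference point for the brute-force mode, here is a minimal O(n) cosine k-NN over an in-memory map (std only, illustrative names; the real engine adds SIMD, sparsity handling, and parallelism on top of this shape):

```rust
use std::collections::HashMap;

/// Plain cosine similarity; zero-magnitude vectors score 0.0.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let ma = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let mb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if ma == 0.0 || mb == 0.0 { 0.0 } else { dot / (ma * mb) }
}

/// Brute-force top-k: score every stored vector, sort descending.
fn search_similar(
    store: &HashMap<String, Vec<f32>>,
    query: &[f32],
    top_k: usize,
) -> Vec<(String, f32)> {
    let mut scored: Vec<_> = store
        .iter()
        .map(|(k, v)| (k.clone(), cosine(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(top_k);
    scored
}

fn main() {
    let store = HashMap::from([
        ("a".to_string(), vec![1.0, 0.0]),
        ("b".to_string(), vec![0.7, 0.7]),
        ("c".to_string(), vec![0.0, 1.0]),
    ]);
    let results = search_similar(&store, &[1.0, 0.0], 2);
    assert_eq!(results[0].0, "a"); // exact direction match scores highest
    assert_eq!(results[1].0, "b"); // 45 degrees away
    assert!((results[0].1 - 1.0).abs() < 1e-6);
}
```

Every stored vector is visited once, which is why brute-force search is O(n) per query and why HNSW is worth building once the dataset grows.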
Design Principles
| Principle | Description |
|---|---|
| Layered Architecture | Depends only on Tensor Store for persistence |
| Multiple Distance Metrics | Cosine, Euclidean, and Dot Product similarity |
| SIMD Acceleration | 8-wide SIMD for dot products and magnitudes |
| Dual Search Modes | Brute-force O(n) or HNSW O(log n) |
| Unified Entities | Embeddings can be attached to shared entities |
| Thread Safety | Inherits from Tensor Store |
| Serializable Types | All types implement serde::Serialize/Deserialize |
| Automatic Sparsity Detection | Vectors with >50% zeros stored efficiently |
Architecture
```mermaid
graph TB
    subgraph VectorEngine
        VE[VectorEngine]
        SR[SearchResult]
        DM[DistanceMetric]
        VE --> |uses| SR
        VE --> |uses| DM
    end
    subgraph TensorStore
        TS[TensorStore]
        HNSW[HNSWIndex]
        SV[SparseVector]
        SIMD[SIMD Functions]
        ES[EmbeddingStorage]
    end
    VE --> |stores to| TS
    VE --> |builds| HNSW
    VE --> |uses| SV
    VE --> |uses| SIMD
    subgraph Storage
        EMB["emb:{key}"]
        ENT["entity:{key}._embedding"]
    end
    TS --> EMB
    TS --> ENT
```
Key Types
| Type | Description |
|---|---|
| `VectorEngine` | Main engine for storing and searching embeddings |
| `VectorEngineConfig` | Configuration for engine behavior and memory bounds |
| `SearchResult` | Result with key and similarity score |
| `DistanceMetric` | Enum: Cosine, Euclidean, DotProduct |
| `ExtendedDistanceMetric` | Extended metrics for HNSW (9+ variants) |
| `VectorError` | Error types for vector operations |
| `EmbeddingInput` | Input for batch store operations |
| `BatchResult` | Result of batch operations |
| `Pagination` | Parameters for paginated queries |
| `PagedResult<T>` | Paginated query result |
| `HNSWIndex` | Hierarchical navigable small world graph (re-exported from tensor_store) |
| `HNSWConfig` | HNSW index configuration (re-exported from tensor_store) |
| `SparseVector` | Memory-efficient sparse embedding storage |
| `FilterCondition` | Filter for metadata-based search (Eq, Ne, Lt, Gt, And, Or, In, etc.) |
| `FilterValue` | Value type for filters (Int, Float, String, Bool, Null) |
| `FilterStrategy` | Strategy selection (Auto, PreFilter, PostFilter) |
| `FilteredSearchConfig` | Configuration for filtered search behavior |
| `VectorCollectionConfig` | Configuration for vector collections |
| `MetadataValue` | Simplified value type for embedding metadata |
| `PersistentVectorIndex` | Serializable index for disk persistence |
VectorError Variants
| Variant | Description | When Triggered |
|---|---|---|
| `NotFound` | Embedding key doesn't exist | `get_embedding`, `delete_embedding` |
| `DimensionMismatch` | Vectors have different dimensions | `compute_similarity`, exceeds `max_dimension` |
| `EmptyVector` | Empty vector provided | Any operation with `vec![]` |
| `InvalidTopK` | `top_k` is 0 | `search_similar`, `search_with_hnsw` |
| `StorageError` | Underlying Tensor Store error | Storage failures |
| `BatchValidationError` | Invalid input in batch | `batch_store_embeddings` validation |
| `BatchOperationError` | Operation failed in batch | `batch_store_embeddings` execution |
| `ConfigurationError` | Invalid configuration | `VectorEngineConfig::validate()` |
| `CollectionExists` | Collection already exists | `create_collection` with existing name |
| `CollectionNotFound` | Collection not found | Collection operations on missing collection |
| `IoError` | IO error during persistence | `save_to_file`, `load_from_file` |
| `SerializationError` | Serialization error | Index persistence operations |
| `SearchTimeout` | Search operation timed out | Search operations exceeding configured timeout |
Configuration
VectorEngineConfig
Configuration for the Vector Engine with memory bounds and performance tuning.
| Field | Type | Default | Description |
|---|---|---|---|
| `default_dimension` | `Option<usize>` | `None` | Expected embedding dimension |
| `sparse_threshold` | `f32` | 0.5 | Sparsity threshold (0.0-1.0) |
| `parallel_threshold` | `usize` | 5000 | Dataset size for parallel search |
| `default_metric` | `DistanceMetric` | Cosine | Default distance metric |
| `max_dimension` | `Option<usize>` | `None` | Maximum allowed dimension |
| `max_keys_per_scan` | `Option<usize>` | `None` | Limit for unbounded scans |
| `batch_parallel_threshold` | `usize` | 100 | Batch size for parallel processing |
| `search_timeout` | `Option<Duration>` | `None` | Search operation timeout |
Configuration Presets
| Preset | Description | Key Settings |
|---|---|---|
| `default()` | Balanced for most workloads | All defaults |
| `high_throughput()` | Optimized for write-heavy loads | `parallel_threshold: 1000` |
| `low_memory()` | Memory-constrained environments | `max_dimension: 4096`, `max_keys_per_scan: 10000`, `search_timeout: 30s` |
Builder Methods
All builder methods are `const fn` for compile-time configuration:
```rust
use std::time::Duration;

let config = VectorEngineConfig::default()
    .with_default_dimension(768)
    .with_sparse_threshold(0.7)
    .with_parallel_threshold(1000)
    .with_default_metric(DistanceMetric::Cosine)
    .with_max_dimension(4096)
    .with_max_keys_per_scan(50_000)
    .with_batch_parallel_threshold(200)
    .with_search_timeout(Duration::from_secs(5));

let engine = VectorEngine::with_config(config)?;
```
Memory Bounds
For production deployments, configure memory bounds to prevent resource exhaustion:
```rust
// Reject embeddings larger than 4096 dimensions
let config = VectorEngineConfig::default()
    .with_max_dimension(4096)
    .with_max_keys_per_scan(10_000);

let engine = VectorEngine::with_config(config)?;

// This will fail with DimensionMismatch
engine.store_embedding("too_big", vec![0.0; 5000])?; // Error!
```
Search Timeout
Configure a timeout for search operations to prevent runaway queries:
```rust
use std::time::Duration;
use vector_engine::{VectorEngine, VectorEngineConfig, VectorError};

let config = VectorEngineConfig::default()
    .with_search_timeout(Duration::from_secs(5));
let engine = VectorEngine::with_config(config)?;

match engine.search_similar(&query, 10) {
    Ok(results) => { /* process results */ }
    Err(VectorError::SearchTimeout { operation, timeout_ms }) => {
        eprintln!("Search '{}' timed out after {}ms", operation, timeout_ms);
    }
    Err(e) => { /* handle other errors */ }
}
```
The timeout applies to all search methods. When a timeout occurs, no partial results are returned, since a truncated result set could silently omit better matches.
Distance Metrics
| Metric | Formula | Score Range | Use Case | HNSW Support |
|---|---|---|---|---|
| Cosine | a·b / (‖a‖ * ‖b‖) | -1.0 to 1.0 | Semantic similarity | Yes |
| Euclidean | 1 / (1 + sqrt(sum((a-b)^2))) | 0.0 to 1.0 | Spatial distance | No (brute-force) |
| DotProduct | sum(a * b) | unbounded | Magnitude-aware | No (brute-force) |
All metrics return higher scores for better matches. Euclidean distance is transformed to similarity score.
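A std-only sketch of the three scoring rules (illustrative helper names, not the SIMD implementation), showing that each one assigns the higher score to the closer match:

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn mag(v: &[f32]) -> f32 {
    dot(v, v).sqrt()
}

// Cosine: a·b / (‖a‖ ‖b‖), in [-1, 1]
fn cosine_score(a: &[f32], b: &[f32]) -> f32 {
    dot(a, b) / (mag(a) * mag(b))
}

// Euclidean distance transformed to a similarity in (0, 1]
fn euclidean_score(a: &[f32], b: &[f32]) -> f32 {
    let d = a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt();
    1.0 / (1.0 + d)
}

// Dot product: raw, unbounded
fn dot_score(a: &[f32], b: &[f32]) -> f32 {
    dot(a, b)
}

fn main() {
    let q = [1.0, 2.0];
    let near = [1.0, 2.1];
    let far = [-3.0, 0.5];

    // Every metric ranks the closer vector higher
    assert!(cosine_score(&q, &near) > cosine_score(&q, &far));
    assert!(euclidean_score(&q, &near) > euclidean_score(&q, &far));
    assert!(dot_score(&q, &near) > dot_score(&q, &far));
}
```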
Extended Distance Metrics (HNSW)
The `ExtendedDistanceMetric` enum provides additional metrics for HNSW-based search via `search_with_hnsw_and_metric()`:
| Metric | Description | Best For |
|---|---|---|
| `Cosine` | Angle-based similarity | Text embeddings, normalized vectors |
| `Euclidean` | L2 distance | Spatial data, absolute distances |
| `Angular` | Cosine converted to angular | When angle interpretation needed |
| `Manhattan` | L1 norm | Robust to outliers |
| `Chebyshev` | L-infinity (max diff) | When max deviation matters |
| `Jaccard` | Set similarity | Binary/sparse vectors, TF-IDF |
| `Overlap` | Minimum overlap coefficient | Partial matches |
| `Geodesic` | Spherical distance | Geographic coordinates |
| `Composite` | Weighted combination | Custom similarity functions |
```rust
use vector_engine::ExtendedDistanceMetric;

let (index, keys) = engine.build_hnsw_index_default()?;

// Search with Jaccard similarity for sparse vectors
let results = engine.search_with_hnsw_and_metric(
    &index,
    &keys,
    &query,
    10,
    ExtendedDistanceMetric::Jaccard,
)?;
```
Distance Metric Implementation Details
```mermaid
flowchart TD
    Query[Query Vector] --> MetricCheck{Which Metric?}
    MetricCheck -->|Cosine| CosMag[Pre-compute query magnitude]
    CosMag --> CosDot[SIMD dot product]
    CosDot --> CosDiv[Divide by magnitudes]
    CosDiv --> CosScore[Score: dot / mag_a * mag_b]
    MetricCheck -->|Euclidean| EucDiff[Compute differences]
    EucDiff --> EucSum[Sum of squares]
    EucSum --> EucSqrt[Square root]
    EucSqrt --> EucScore[Score: 1 / 1 + distance]
    MetricCheck -->|DotProduct| DotSIMD[SIMD dot product]
    DotSIMD --> DotScore[Score: raw dot product]
```
Cosine Similarity Edge Cases
```rust
// Zero-magnitude vectors return 0.0 similarity
let zero = vec![0.0, 0.0, 0.0];
let normal = vec![1.0, 2.0, 3.0];
VectorEngine::compute_similarity(&zero, &normal)?; // Returns 0.0

// Identical vectors return 1.0
VectorEngine::compute_similarity(&normal, &normal)?; // Returns 1.0

// Opposite vectors return -1.0
let opposite = vec![-1.0, -2.0, -3.0];
VectorEngine::compute_similarity(&normal, &opposite)?; // Returns -1.0

// Orthogonal vectors return 0.0
let a = vec![1.0, 0.0];
let b = vec![0.0, 1.0];
VectorEngine::compute_similarity(&a, &b)?; // Returns 0.0
```
Euclidean Distance Transformation
The engine transforms Euclidean distance to similarity score using 1 / (1 + distance):
| Distance | Similarity Score |
|---|---|
| 0.0 | 1.0 (identical) |
| 1.0 | 0.5 |
| 2.0 | 0.333 |
| 9.0 | 0.1 |
| Infinity | 0.0 |
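The table values follow directly from the formula. A minimal standalone check (not the engine's code):

```rust
// Map Euclidean distance to a similarity score in (0, 1]: s = 1 / (1 + d)
fn euclidean_to_similarity(distance: f32) -> f32 {
    1.0 / (1.0 + distance)
}

fn main() {
    assert_eq!(euclidean_to_similarity(0.0), 1.0); // identical vectors
    assert_eq!(euclidean_to_similarity(1.0), 0.5);
    assert!((euclidean_to_similarity(2.0) - 0.333).abs() < 1e-3);
    assert!((euclidean_to_similarity(9.0) - 0.1).abs() < 1e-6);
}
```

The transform is monotonically decreasing, so similarity rankings are identical to distance rankings; only the score scale changes.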
SIMD Implementation
The Vector Engine uses 8-wide SIMD operations via the `wide` crate for accelerated distance computations.
SIMD Dot Product Algorithm
```rust
use wide::f32x8;

// Simplified view of the SIMD implementation
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    let chunks = a.len() / 8; // Process 8 floats at a time
    let remainder = a.len() % 8;
    let mut sum = f32x8::ZERO;

    // Process 8 elements at a time with SIMD
    for i in 0..chunks {
        let offset = i * 8;
        let va = f32x8::from(&a[offset..offset + 8]);
        let vb = f32x8::from(&b[offset..offset + 8]);
        sum += va * vb; // Parallel multiply-add
    }

    // Sum SIMD lanes + handle remainder scalar
    let arr: [f32; 8] = sum.into();
    let mut result: f32 = arr.iter().sum();

    // Handle remainder with scalar operations
    let start = chunks * 8;
    for i in 0..remainder {
        result += a[start + i] * b[start + i];
    }
    result
}
```
SIMD Performance Characteristics
| Dimension | SIMD Speedup | Notes |
|---|---|---|
| 8 | 1x | Baseline (single SIMD operation) |
| 64 | 4-6x | Full pipeline utilization |
| 384 | 6-8x | Sentence Transformers size |
| 768 | 6-8x | BERT embedding size |
| 1536 | 6-8x | OpenAI ada-002 size |
| 3072 | 6-8x | OpenAI text-embedding-3-large |
SIMD operations are cache-friendly due to sequential memory access patterns.
API Reference
Basic Operations
```rust
let engine = VectorEngine::new();

// Store an embedding
engine.store_embedding("doc1", vec![0.1, 0.2, 0.3])?;

// Get an embedding
let vector = engine.get_embedding("doc1")?;

// Delete an embedding
engine.delete_embedding("doc1")?;

// Check existence
engine.exists("doc1"); // -> bool

// Count embeddings
engine.count(); // -> usize

// List all keys
let keys = engine.list_keys();

// Clear all embeddings
engine.clear()?;

// Get dimension (from first embedding)
engine.dimension(); // -> Option<usize>
```
Similarity Search
```rust
// Find top-k most similar (cosine by default)
let query = vec![0.1, 0.2, 0.3];
let results = engine.search_similar(&query, 5)?;
for result in results {
    println!("Key: {}, Score: {}", result.key, result.score);
}

// Search with specific metric
let results = engine.search_similar_with_metric(&query, 5, DistanceMetric::Euclidean)?;

// Direct similarity computation
let similarity = VectorEngine::compute_similarity(&vec_a, &vec_b)?;
```
Filtered Search
Search with metadata filters to narrow results without post-processing:
```rust
use vector_engine::{FilterCondition, FilterValue, FilteredSearchConfig, FilterStrategy};

// Build a filter condition
let filter = FilterCondition::Eq("category".to_string(), FilterValue::String("science".to_string()))
    .and(FilterCondition::Gt("year".to_string(), FilterValue::Int(2020)));

// Search with filter (auto strategy)
let results = engine.search_similar_filtered(&query, 10, &filter, None)?;

// Search with explicit pre-filter strategy (best for selective filters)
let config = FilteredSearchConfig::pre_filter();
let results = engine.search_similar_filtered(&query, 10, &filter, Some(config))?;

// Search with post-filter and custom oversample
let config = FilteredSearchConfig::post_filter().with_oversample(5);
let results = engine.search_similar_filtered(&query, 10, &filter, Some(config))?;
```
Filter Conditions
| Condition | Description | Example |
|---|---|---|
| `Eq(field, value)` | Equality | `category = "science"` |
| `Ne(field, value)` | Not equal | `status != "deleted"` |
| `Lt(field, value)` | Less than | `price < 100` |
| `Le(field, value)` | Less than or equal | `price <= 100` |
| `Gt(field, value)` | Greater than | `year > 2020` |
| `Ge(field, value)` | Greater than or equal | `year >= 2020` |
| `And(a, b)` | Logical AND | Combined conditions |
| `Or(a, b)` | Logical OR | Alternative conditions |
| `In(field, values)` | Value in list | `status IN ["active", "pending"]` |
| `Contains(field, substr)` | String contains | `title CONTAINS "rust"` |
| `StartsWith(field, prefix)` | String prefix | `name STARTS WITH "doc:"` |
| `Exists(field)` | Field exists | `HAS embedding` |
| `True` | Always matches | No filter |
Filter Strategies
| Strategy | When to Use | Behavior |
|---|---|---|
| `Auto` | Default | Estimates selectivity and chooses |
| `PreFilter` | < 10% matches | Filters first, then searches subset |
| `PostFilter` | > 10% matches | Searches with oversample, then filters |
```mermaid
flowchart TD
    Query[Query + Filter] --> Strategy{Which Strategy?}
    Strategy -->|Auto| Estimate[Estimate Selectivity]
    Estimate -->|< 10%| Pre[Pre-Filter]
    Estimate -->|>= 10%| Post[Post-Filter]
    Strategy -->|PreFilter| Pre
    Strategy -->|PostFilter| Post
    Pre --> Filter1[Filter all keys]
    Filter1 --> Search1[Search filtered subset]
    Search1 --> Result[Top-K Results]
    Post --> Search2[Search with oversample]
    Search2 --> Filter2[Filter candidates]
    Filter2 --> Result
```
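The Auto branch reduces to a selectivity cutoff. A minimal sketch of that decision (the 10% threshold comes from the table above; the function name is illustrative, not the engine's API):

```rust
#[derive(Debug, PartialEq)]
enum FilterStrategy {
    PreFilter,
    PostFilter,
}

// Auto strategy: selective filters (< 10% of vectors match) filter first
// and search the small subset; broad filters search with oversampling
// and discard non-matching candidates afterwards.
fn choose_strategy(selectivity: f64) -> FilterStrategy {
    if selectivity < 0.10 {
        FilterStrategy::PreFilter
    } else {
        FilterStrategy::PostFilter
    }
}

fn main() {
    assert_eq!(choose_strategy(0.02), FilterStrategy::PreFilter);
    assert_eq!(choose_strategy(0.50), FilterStrategy::PostFilter);
}
```

The intuition: pre-filtering pays a full key scan but searches few vectors, while post-filtering keeps the fast index search and only pays for oversampling.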
Filter Helper Methods
Utilities for working with filters:
```rust
// Estimate how selective a filter is (0.0 = matches nothing, 1.0 = matches all)
let selectivity = engine.estimate_filter_selectivity(&filter);

// Count how many embeddings match a filter
let matching = engine.count_matching(&filter);

// Get keys of all matching embeddings
let keys = engine.list_keys_matching(&filter);
```
Metadata Storage
Store and retrieve metadata alongside embeddings:
```rust
use tensor_store::TensorValue;
use std::collections::HashMap;

// Store embedding with metadata
let mut metadata = HashMap::new();
metadata.insert("category".to_string(), TensorValue::from("science"));
metadata.insert("year".to_string(), TensorValue::from(2024i64));
metadata.insert("score".to_string(), TensorValue::from(0.95f64));
engine.store_embedding_with_metadata("doc1", vec![0.1, 0.2, 0.3], metadata)?;

// Get all metadata
let meta = engine.get_metadata("doc1")?;

// Get specific field
let category = engine.get_metadata_field("doc1", "category")?;

// Update metadata (merges with existing)
let mut updates = HashMap::new();
updates.insert("score".to_string(), TensorValue::from(0.98f64));
engine.update_metadata("doc1", updates)?;

// Check if metadata field exists
if engine.has_metadata_field("doc1", "category") {
    // Remove specific metadata field
    engine.remove_metadata_field("doc1", "category")?;
}
```
Batch Operations
For bulk insert and delete operations with parallel processing:
```rust
use vector_engine::EmbeddingInput;

// Batch store - validates all inputs first, then stores in parallel
let inputs = vec![
    EmbeddingInput::new("doc1", vec![0.1, 0.2, 0.3]),
    EmbeddingInput::new("doc2", vec![0.2, 0.3, 0.4]),
    EmbeddingInput::new("doc3", vec![0.3, 0.4, 0.5]),
];
let result = engine.batch_store_embeddings(inputs)?;
println!("Stored {} embeddings", result.count); // -> 3

// Batch delete - returns count of successfully deleted
let keys = vec!["doc1".to_string(), "doc2".to_string()];
let deleted = engine.batch_delete_embeddings(keys)?;
println!("Deleted {} embeddings", deleted); // -> 2
```
Batches larger than `batch_parallel_threshold` (default: 100) use parallel processing via `rayon`.
Pagination
For memory-efficient iteration over large datasets:
```rust
use vector_engine::Pagination;

// List keys with pagination
let page = Pagination::new(0, 100); // skip=0, limit=100
let result = engine.list_keys_paginated(page);
println!("Items: {}, Has more: {}", result.items.len(), result.has_more);

// Get total count with pagination
let page = Pagination::new(0, 100).with_total();
let result = engine.list_keys_paginated(page);
println!("Total: {:?}", result.total_count); // Some(total)

// Paginated similarity search
let page = Pagination::new(10, 5); // skip first 10, return 5
let results = engine.search_similar_paginated(&query, 100, page)?;

// Paginated entity search
let results = engine.search_entities_paginated(&query, 100, page)?;
```
Use `list_keys_bounded()` in production to enforce `max_keys_per_scan` limits.
Search Flow Diagram
```mermaid
sequenceDiagram
    participant Client
    participant VE as VectorEngine
    participant TS as TensorStore
    participant SIMD
    Client->>VE: search_similar(query, k)
    VE->>VE: Validate query (non-empty, k > 0)
    VE->>SIMD: Pre-compute query magnitude
    VE->>TS: scan("emb:")
    TS-->>VE: List of embedding keys
    alt Dataset < 5000 vectors
        VE->>VE: Sequential search
    else Dataset >= 5000 vectors
        VE->>VE: Parallel search (rayon)
    end
    loop For each embedding
        VE->>TS: get(key)
        TS-->>VE: TensorData
        VE->>VE: Extract vector (dense or sparse)
        VE->>SIMD: cosine_similarity(query, stored)
        VE->>VE: Collect SearchResult
    end
    VE->>VE: Sort by score descending
    VE->>VE: Truncate to top k
    VE-->>Client: Vec<SearchResult>
```
HNSW Index
For large datasets, build an HNSW index for O(log n) search:
```rust
// Build index with default config
let (index, key_mapping) = engine.build_hnsw_index_default()?;

// Search using the index
let results = engine.search_with_hnsw(&index, &key_mapping, &query, 10)?;

// Build with custom config
let config = HNSWConfig::high_recall();
let (index, key_mapping) = engine.build_hnsw_index(config)?;

// Direct HNSW operations
let index = HNSWIndex::new();
index.insert(vec![1.0, 2.0, 3.0]);
let results = index.search(&query, 10);

// Search with custom ef (recall/speed tradeoff)
let results = index.search_with_ef(&query, 10, 200);
```
HNSW Search Flow
```mermaid
flowchart TD
    Query[Query Vector] --> Entry[Entry Point at Max Layer]
    Entry --> Greedy1[Greedy Search Layer L]
    Greedy1 -->|Find closest| Greedy2[Greedy Search Layer L-1]
    Greedy2 -->|...| GreedyN[Greedy Search until Layer 1]
    GreedyN --> Layer0[Full ef-Search at Layer 0]
    Layer0 --> Candidates[Candidate Pool]
    Candidates -->|BinaryHeap min-heap| Visit[Visit Neighbors]
    Visit --> Distance[Compute Distances]
    Distance -->|Update| Results[Result Pool]
    Results -->|BinaryHeap max-heap| Prune[Keep top ef]
    Prune -->|More candidates?| Visit
    Prune -->|Done| TopK[Return Top K]
```
Unified Entity Mode
Attach embeddings directly to entities for cross-engine queries:
```rust
let store = TensorStore::new();
let engine = VectorEngine::with_store(store.clone());

// Set embedding on an entity
engine.set_entity_embedding("user:1", vec![0.1, 0.2, 0.3])?;

// Get embedding from an entity
let embedding = engine.get_entity_embedding("user:1")?;

// Check if entity has embedding
engine.entity_has_embedding("user:1"); // -> bool

// Remove embedding (preserves other entity data)
engine.remove_entity_embedding("user:1")?;

// Search entities with embeddings
let results = engine.search_entities(&query, 5)?;

// Scan all entities with embeddings
let entity_keys = engine.scan_entities_with_embeddings();

// Count entities with embeddings
let count = engine.count_entities_with_embeddings();
```
Unified entity embeddings are stored in the `_embedding` field of the entity's `TensorData`.
Collections
Collections provide isolated namespaces for organizing embeddings by type or purpose. Each collection can have its own dimension constraints and distance metric configuration.
Creating and Managing Collections
```rust
use vector_engine::{VectorEngine, VectorCollectionConfig, DistanceMetric};

let engine = VectorEngine::new();

// Create collection with custom config
let config = VectorCollectionConfig::default()
    .with_dimension(768)
    .with_metric(DistanceMetric::Cosine)
    .with_auto_index(5000); // Auto-build HNSW at 5000 vectors
engine.create_collection("documents", config)?;

// List collections
let collections = engine.list_collections();

// Check if collection exists
engine.collection_exists("documents"); // -> true

// Get collection config
let config = engine.get_collection_config("documents");

// Delete collection (removes all vectors in it)
engine.delete_collection("documents")?;
```
Storing in Collections
```rust
use std::collections::HashMap;
use tensor_store::TensorValue;

// Store vector in collection
engine.store_in_collection("documents", "doc1", vec![0.1, 0.2, 0.3])?;

// Store with metadata
let mut metadata = HashMap::new();
metadata.insert("title".to_string(), TensorValue::from("Introduction to Rust"));
metadata.insert("author".to_string(), TensorValue::from("Alice"));
engine.store_in_collection_with_metadata(
    "documents",
    "doc1",
    vec![0.1, 0.2, 0.3],
    metadata,
)?;
```
Searching in Collections
```rust
use vector_engine::{FilterCondition, FilterValue};

// Basic search in collection
let results = engine.search_in_collection("documents", &query, 10)?;

// Filtered search in collection
let filter = FilterCondition::Eq("author".to_string(), FilterValue::String("Alice".to_string()));
let results = engine.search_filtered_in_collection("documents", &query, 10, &filter, None)?;
```
Collection Key Isolation
Collections use prefixed storage keys to ensure isolation:
| Operation | Storage Key Pattern |
|---|---|
| Default embeddings | emb:{key} |
| Collection embeddings | coll:{collection}:emb:{key} |
| Entity embeddings | {entity_key}._embedding |
VectorCollectionConfig
| Field | Type | Default | Description |
|---|---|---|---|
| `dimension` | `Option<usize>` | `None` | Enforced dimension (rejects mismatches) |
| `distance_metric` | `DistanceMetric` | `Cosine` | Default metric for this collection |
| `auto_index` | `bool` | `false` | Auto-build HNSW on threshold |
| `auto_index_threshold` | `usize` | 1000 | Vector count to trigger auto-index |
Index Persistence
Save and restore vector indices for fast startup:
```rust
use std::path::Path;

// Save all collections to directory (one JSON file per collection)
let saved = engine.save_all_indices(Path::new("./vector_index"))?;

// Load all indices from directory
let loaded = engine.load_all_indices(Path::new("./vector_index"))?;

// Save single collection to JSON
engine.save_index("documents", Path::new("./documents.json"))?;

// Save single collection to compact binary format
engine.save_index_binary("documents", Path::new("./documents.bin"))?;

// Load single collection from JSON (returns collection name)
let collection = engine.load_index(Path::new("./documents.json"))?;

// Load single collection from binary
let collection = engine.load_index_binary(Path::new("./documents.bin"))?;

// Get a snapshot for manual serialization
let index: PersistentVectorIndex = engine.snapshot_collection("documents");
```
PersistentVectorIndex Format
| Field | Type | Description |
|---|---|---|
| `collection` | `String` | Collection name |
| `config` | `VectorCollectionConfig` | Collection configuration |
| `vectors` | `Vec<VectorEntry>` | All vectors with metadata |
| `created_at` | `u64` | Unix timestamp |
| `version` | `u32` | Format version (currently 1) |
Storage Model
| Key Pattern | Content | Use Case |
|---|---|---|
| `emb:{key}` | TensorData with "vector" field | Default collection embeddings |
| `coll:{collection}:emb:{key}` | TensorData with "vector" field | Named collection embeddings |
| `{entity_key}` | TensorData with "_embedding" field | Unified entities |
Automatic Sparse Storage
Vectors with >50% zeros are automatically stored as sparse vectors:
```rust
// Detection threshold: nnz * 2 <= len (i.e., sparsity >= 50%)
fn should_use_sparse(vector: &[f32]) -> bool {
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    nnz * 2 <= vector.len()
}

// 97% sparse vector (3 non-zeros in 100 elements)
let mut sparse = vec![0.0f32; 100];
sparse[0] = 1.0;
sparse[50] = 2.0;
sparse[99] = 3.0;

// Stored efficiently as SparseVector
engine.store_embedding("sparse_doc", sparse)?;

// Retrieved as dense for computation
let dense = engine.get_embedding("sparse_doc")?;
```
Storage Format Comparison
| Format | Memory per Element | Best For |
|---|---|---|
| Dense | 4 bytes | Sparsity < 50% |
| Sparse | 8 bytes per non-zero (4 pos + 4 val) | Sparsity > 50% |
Example: 1000-dim vector with 100 non-zeros:
- Dense: 4000 bytes
- Sparse: 800 bytes (5x compression)
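The byte counts above are easy to verify (a standalone calculation, assuming the 4-byte positions and values from the table):

```rust
// Dense: 4 bytes (f32) per element.
fn dense_bytes(dimension: usize) -> usize {
    dimension * 4
}

// Sparse: 4-byte position (u32) + 4-byte value (f32) per non-zero.
fn sparse_bytes(nnz: usize) -> usize {
    nnz * (4 + 4)
}

fn main() {
    // 1000-dim vector with 100 non-zeros
    assert_eq!(dense_bytes(1000), 4000);
    assert_eq!(sparse_bytes(100), 800); // 5x compression
    // Break-even: sparse costs 8 bytes per non-zero, so it wins
    // once fewer than half the elements are non-zero
    assert!(sparse_bytes(500) == dense_bytes(1000));
}
```

This also explains the 50% detection threshold: at exactly half non-zeros, both formats cost the same.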
Sparse Vector Operations
Memory Layout
```rust
SparseVector {
    dimension: usize,    // Total vector dimension
    positions: Vec<u32>, // Sorted indices of non-zeros
    values: Vec<f32>,    // Corresponding values
}
```
Sparse Dot Product Algorithm
```rust
use std::cmp::Ordering::{Equal, Greater, Less};

// O(min(nnz_a, nnz_b)) - only overlapping positions contribute
pub fn dot(&self, other: &SparseVector) -> f32 {
    let mut result = 0.0;
    let mut i = 0;
    let mut j = 0;
    // Merge-sort style traversal
    while i < self.positions.len() && j < other.positions.len() {
        match self.positions[i].cmp(&other.positions[j]) {
            Equal => {
                result += self.values[i] * other.values[j];
                i += 1;
                j += 1;
            }
            Less => i += 1,
            Greater => j += 1,
        }
    }
    result
}
```
Sparse-Dense Dot Product
```rust
// O(nnz) - only iterate over sparse non-zeros
pub fn dot_dense(&self, dense: &[f32]) -> f32 {
    self.positions.iter()
        .zip(&self.values)
        .map(|(&pos, &val)| val * dense[pos as usize])
        .sum()
}
```
Sparse Distance Metrics
| Metric | Complexity | Description |
|---|---|---|
| `dot` | O(min(nnz_a, nnz_b)) | Sparse-sparse dot product |
| `dot_dense` | O(nnz) | Sparse-dense dot product |
| `cosine_similarity` | O(min(nnz_a, nnz_b)) | Angle-based similarity |
| `euclidean_distance` | O(nnz_a + nnz_b) | L2 distance |
| `manhattan_distance` | O(nnz_a + nnz_b) | L1 distance |
| `jaccard_index` | O(min(nnz_a, nnz_b)) | Position overlap |
| `angular_distance` | O(min(nnz_a, nnz_b)) | Arc-cosine |
HNSW Configuration
Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| `m` | 16 | Max connections per node per layer |
| `m0` | 32 | Max connections at layer 0 (2*m) |
| `ef_construction` | 200 | Candidates during index building |
| `ef_search` | 50 | Candidates during search |
| `ml` | 1/ln(m) | Level multiplier for layer selection |
| `sparsity_threshold` | 0.5 | Auto-sparse threshold |
| `max_nodes` | 10,000,000 | Capacity limit |
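The `ml` default follows directly from `m`, and in the standard HNSW construction (Malkov and Yashunin) a node's layer is drawn as `floor(-ln(u) * ml)` for uniform `u`. A standalone sketch of both (illustrative, not the engine's internals):

```rust
// Level multiplier: ml = 1 / ln(m)
fn level_multiplier(m: usize) -> f64 {
    1.0 / (m as f64).ln()
}

// Standard HNSW layer draw: floor(-ln(u) * ml) for u in (0, 1]
fn random_level(u: f64, ml: f64) -> usize {
    (-u.ln() * ml).floor() as usize
}

fn main() {
    let ml = level_multiplier(16);
    assert!((ml - 0.3607).abs() < 1e-3); // 1 / ln(16)
    // Most draws land on layer 0; only rare small u values climb higher
    assert_eq!(random_level(0.9, ml), 0);
    assert!(random_level(1e-6, ml) >= 4);
}
```

Larger `m` shrinks `ml`, producing fewer layers with denser connectivity per layer.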
Presets
| Preset | m | m0 | ef_construction | ef_search | Use Case |
|---|---|---|---|---|---|
| `default()` | 16 | 32 | 200 | 50 | Balanced |
| `high_recall()` | 32 | 64 | 400 | 200 | Accuracy over speed |
| `high_speed()` | 8 | 16 | 100 | 20 | Speed over accuracy |
Tuning Guidelines
```mermaid
graph TD
    subgraph "Higher m / ef"
        A[More connections per node]
        B[Better recall]
        C[More memory]
        D[Slower insert]
    end
    subgraph "Lower m / ef"
        E[Fewer connections]
        F[Lower recall]
        G[Less memory]
        H[Faster insert]
    end
    A --> B
    A --> C
    A --> D
    E --> F
    E --> G
    E --> H
```
Workload-Specific Tuning
| Workload | Recommended Config | Rationale |
|---|---|---|
| RAG/Semantic Search | high_recall() | Accuracy critical |
| Real-time recommendations | high_speed() | Latency critical |
| Batch processing | default() | Balanced |
| Small dataset (<10K) | Brute-force | HNSW overhead not worth it |
| Large dataset (>100K) | default() with higher ef_search | Scale benefits |
Memory vs Recall Tradeoff
| Config | Memory/Node | Recall@10 | Search Time |
|---|---|---|---|
| high_speed | ~128 bytes | ~85% | 0.1ms |
| default | ~256 bytes | ~95% | 0.3ms |
| high_recall | ~512 bytes | ~99% | 1.0ms |
Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| `store_embedding` | O(1) | Single store put |
| `get_embedding` | O(1) | Single store get |
| `delete_embedding` | O(1) | Single store delete |
| `search_similar` | O(n * d) | Brute-force, n=count, d=dimension |
| `search_with_hnsw` | O(log n * ef * m) | Approximate nearest neighbor |
| `build_hnsw_index` | O(n * log n * ef_construction * m) | Index construction |
| `count` | O(n) | Scans all embeddings |
| `list_keys` | O(n) | Scans all embeddings |
Parallel Search Threshold
Automatic parallel iteration for datasets >5000 vectors:
```rust
const PARALLEL_THRESHOLD: usize = 5000;

if keys.len() >= PARALLEL_THRESHOLD {
    // Use rayon parallel iterator
    keys.par_iter().filter_map(...)
} else {
    // Use sequential iterator
    keys.iter().filter_map(...)
}
```
Benchmark Results
| Dataset Size | Brute-Force | With HNSW | Speedup |
|---|---|---|---|
| 200 vectors | 4.17s | 9.3us | 448,000x |
| 1,000 vectors | ~5ms | ~20us | 250x |
| 10,000 vectors | ~50ms | ~50us | 1000x |
| 100,000 vectors | ~500ms | ~100us | 5000x |
Supported Embedding Dimensions
| Model | Dimensions | Recommended Config |
|---|---|---|
| OpenAI text-embedding-ada-002 | 1536 | default |
| OpenAI text-embedding-3-small | 1536 | default |
| OpenAI text-embedding-3-large | 3072 | high_recall |
| BERT base | 768 | default |
| Sentence Transformers | 384-768 | default |
| Cohere embed-v3 | 1024 | default |
| Custom/small | <256 | high_speed |
Edge Cases and Gotchas
Zero-Magnitude Vectors
| Metric | Behavior | Rationale |
|---|---|---|
| Cosine | Returns empty results | Division by zero undefined |
| DotProduct | Returns empty results | Undefined direction |
| Euclidean | Works correctly | Finds vectors closest to origin |
Dimension Mismatch Handling
```rust
// Mismatched dimensions are silently skipped during search
engine.store_embedding("2d", vec![1.0, 0.0])?;
engine.store_embedding("3d", vec![1.0, 0.0, 0.0])?;

// Search with 2D query only matches 2D vectors
let results = engine.search_similar(&[1.0, 0.0], 10)?;
assert_eq!(results.len(), 1); // Only "2d" matched
```
HNSW Limitations
| Limitation | Details | Workaround |
|---|---|---|
| Only cosine similarity | HNSW uses cosine distance internally | Use brute-force for other metrics |
| No deletion | Cannot remove vectors | Rebuild index |
| Static after build | Index doesn’t update with new vectors | Rebuild periodically |
| Memory overhead | Graph structure adds ~2-4x | Use for large datasets only |
NaN/Infinity Handling
Sparse vector operations sanitize NaN/Inf results:
```rust
// cosine_similarity returns 0.0 for NaN/Inf
if result.is_nan() || result.is_infinite() {
    0.0
} else {
    result.clamp(-1.0, 1.0)
}

// cosine_distance_dense returns 1.0 (max distance) for NaN/Inf
if similarity.is_nan() || similarity.is_infinite() {
    1.0 // Maximum distance
} else {
    1.0 - similarity.clamp(-1.0, 1.0)
}
```
Best Practices
Memory Optimization
- Use sparse vectors for high-sparsity data: Automatic at >50% zeros
- Batch insert for HNSW: Build the index once after all data is loaded
- Choose an appropriate HNSW config: Don't over-provision m/ef
- Monitor memory with `HNSWMemoryStats`: Track dense vs sparse counts
```rust
let stats = index.memory_stats();
println!("Dense: {}, Sparse: {}, Total bytes: {}",
    stats.dense_count, stats.sparse_count, stats.embedding_bytes);
```
Search Performance
- Pre-compute query magnitude: Done automatically in search
- Use HNSW for >10K vectors: Brute-force for smaller sets
- Tune ef_search: Higher for recall, lower for speed
- Parallel threshold: Automatic at 5000 vectors
Unified Entity Best Practices
- Use for cross-engine queries: When embeddings relate to graph/relational data
- Entity key conventions: Use prefixes like `user:`, `doc:`, `item:`
- Separate embedding namespace: Use `store_embedding` for isolated vectors
Dependencies
| Crate | Purpose |
|---|---|
| `tensor_store` | Persistence, SparseVector, HNSWIndex, SIMD |
| `rayon` | Parallel iteration for large datasets |
| `serde` | Serialization of types |
| `tracing` | Instrumentation and observability |

Note: `wide` (SIMD f32x8 operations) is a transitive dependency via `tensor_store`.
Related Modules
- Tensor Store - Underlying storage and HNSW implementation
- Query Router - Executes SIMILAR queries using VectorEngine
- Tensor Cache - Uses vector similarity for semantic caching
Tensor Compress
Module 8 of Neumann. Provides tensor-native compression exploiting the mathematical structure of high-dimensional embeddings.
The primary compression method is Tensor Train (TT) decomposition, which decomposes vectors reshaped as tensors into a chain of smaller 3D cores using successive SVD truncations. This achieves 10-40x compression for 1024+ dimension vectors while enabling similarity computations directly in compressed space.
Design Principles
- Tensor Mathematics: Uses Tensor Train decomposition to exploit low-rank structure
- Higher Dimensions Are Lower: Decomposes vectors into products of smaller tensors
- Streaming I/O: Process large snapshots without loading entire dataset
- Incremental Updates: Delta snapshots for efficient replication
- Pure Rust: No external LAPACK/BLAS dependencies - fully portable
Key Types
| Type | Description |
|---|---|
| `TTVector` | Complete TT-decomposition of a vector with cores, shape, and ranks |
| `TTCore` | Single 3D tensor core (left_rank x mode_size x right_rank) |
| `TTConfig` | Configuration for TT decomposition (shape, max_rank, tolerance) |
| `CompressionConfig` | Snapshot compression settings (tensor mode, delta, RLE) |
| `TensorMode` | Compression mode enum (currently TensorTrain variant) |
| `RleEncoded<T>` | Run-length encoded data with values and run lengths |
| `DeltaSnapshot` | Snapshot containing only changes since a base snapshot |
| `DeltaChain` | Chain of deltas with efficient lookup and compaction |
| `StreamingWriter` | Memory-bounded incremental snapshot writer |
| `StreamingReader` | Iterator-based snapshot reader |
| `StreamingTTWriter` | Streaming TT-compressed vector writer |
| `StreamingTTReader` | Streaming TT-compressed vector reader |
| `Matrix` | Row-major matrix for SVD operations |
| `SvdResult` | Truncated SVD result (U, S, Vt matrices) |
| `TensorView` | Zero-copy logical view of tensor data |
| `DeltaBuilder` | Builder for creating delta snapshots |
Error Types
| Error | Description |
|---|---|
| `TTError::ShapeMismatch` | Vector dimension doesn't match reshape target |
| `TTError::EmptyVector` | Cannot decompose empty vector |
| `TTError::InvalidRank` | TT-rank must be >= 1 |
| `TTError::IncompatibleShapes` | TT vectors have different shapes for operation |
| `TTError::InvalidShape` | Shape contains zero or is empty |
| `TTError::InvalidTolerance` | Tolerance must be 0 < tol <= 1 |
| `TTError::Decompose` | SVD decomposition failed |
| `FormatError::InvalidMagic` | File magic bytes don't match expected |
| `FormatError::UnsupportedVersion` | Format version is newer than supported |
| `FormatError::Serialization` | Bincode serialization/deserialization error |
| `DeltaError::BaseNotFound` | Referenced base snapshot doesn't exist |
| `DeltaError::SequenceGap` | Delta sequence numbers have gaps |
| `DeltaError::ChainTooLong` | Delta chain exceeds maximum length |
| `DecomposeError::EmptyMatrix` | Cannot decompose empty matrix |
| `DecomposeError::DimensionMismatch` | Matrix dimensions don't match for operation |
| `DecomposeError::SvdNotConverged` | SVD iteration didn't converge |
Architecture
```mermaid
graph TD
    subgraph tensor_compress
        TT[tensor_train.rs<br/>TT-SVD decomposition]
        DC[decompose.rs<br/>SVD implementation]
        FMT[format.rs<br/>Snapshot format]
        STR[streaming.rs<br/>Streaming I/O]
        STT[streaming_tt.rs<br/>Streaming TT]
        INC[incremental.rs<br/>Delta snapshots]
        DLT[delta.rs<br/>Delta + varint encoding]
        RLE[rle.rs<br/>Run-length encoding]
    end
    TT --> DC
    FMT --> TT
    FMT --> DLT
    FMT --> RLE
    STR --> FMT
    STT --> TT
    INC --> FMT
```
Tensor Train Decomposition
Algorithm Overview
The TT-SVD algorithm (Oseledets 2011) decomposes a vector by:
- Reshape: Convert the 1D vector to a multi-dimensional tensor
- Left-to-right sweep: For each mode k from 1 to n-1:
  - Left-unfold the current tensor into a matrix
  - Compute truncated SVD: A = U * S * Vt
  - Store U as the k-th core
  - Multiply S * Vt to get the remainder for the next iteration
- Final core: The last remainder becomes the final core
```mermaid
graph LR
    subgraph "TT-SVD Algorithm"
        V[Vector 4096-dim] --> R[Reshape to 8x8x8x8]
        R --> U1[Unfold mode 1<br/>64 x 64]
        U1 --> SVD1[SVD truncate<br/>rank=8]
        SVD1 --> C1[Core 1<br/>1x8x8]
        SVD1 --> R2[Remainder<br/>8x512]
        R2 --> SVD2[SVD truncate]
        SVD2 --> C2[Core 2<br/>8x8x8]
        SVD2 --> R3[Remainder]
        R3 --> SVD3[SVD truncate]
        SVD3 --> C3[Core 3<br/>8x8x8]
        SVD3 --> C4[Core 4<br/>8x8x1]
    end
```
Compression Example
For a 4096-dim embedding reshaped to (8, 8, 8, 8):
```text
Original:          4096 floats = 16 KB
TT-cores (rank 8): 1x8x8 + 8x8x8 + 8x8x8 + 8x8x1 = 64 + 512 + 512 + 64 = 1152 floats
With max_rank=4:   1x8x4 + 4x8x4 + 4x8x4 + 4x8x1 = 32 + 128 + 128 + 32 = 320 floats = 1.25 KB
Compression:       4096 / 320 = 12.8x
```
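The core-size arithmetic generalizes: for shape (n_1, ..., n_k) and ranks (r_0=1, r_1, ..., r_k=1), storage is the sum of r_{i-1} * n_i * r_i floats. A standalone check of the example above (the helper name is illustrative, not the module's API):

```rust
// Total floats stored by a TT decomposition:
// sum over cores of left_rank * mode_size * right_rank.
fn tt_storage_floats(shape: &[usize], ranks: &[usize]) -> usize {
    assert_eq!(ranks.len(), shape.len() + 1); // boundary ranks are always 1
    shape
        .iter()
        .enumerate()
        .map(|(i, &n)| ranks[i] * n * ranks[i + 1])
        .sum()
}

fn main() {
    // 4096-dim vector reshaped to (8, 8, 8, 8)
    assert_eq!(tt_storage_floats(&[8, 8, 8, 8], &[1, 8, 8, 8, 1]), 1152);
    // Truncated to max_rank = 4
    let floats = tt_storage_floats(&[8, 8, 8, 8], &[1, 4, 4, 4, 1]);
    assert_eq!(floats, 320); // 32 + 128 + 128 + 32
    let ratio = 4096.0 / floats as f64;
    assert!((ratio - 12.8).abs() < 1e-9);
}
```

Storage grows linearly in the number of modes but quadratically in rank, which is why halving the rank roughly quadruples the compression of the interior cores.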
SVD Implementation Details
The module implements two SVD algorithms:
1. Power Iteration with Deflation (small matrices)
Used when matrix dimensions are <= 32 or rank is close to matrix size:
```rust
// Simplified power iteration (pseudocode; helper functions elided)
fn power_iteration(a: &Matrix, max_iter: usize, tol: f32) -> (f32, Vec<f32>, Vec<f32>) {
    // Initialize v deterministically (fixed pseudo-random pattern)
    let mut v: Vec<f32> = (0..cols)
        .map(|i| ((i * 7 + 3) % 13) as f32 / 13.0 - 0.5)
        .collect();
    normalize(&mut v);
    let mut sigma = 0.0;
    let mut u = Vec::new();
    for _ in 0..max_iter {
        // u = A * v, then normalize (norm is the singular value estimate)
        u = matmul(a, &v);
        let new_sigma = normalize(&mut u);
        // v = A^T * u, then normalize
        v = matmul_transpose(a, &u);
        normalize(&mut v);
        // Check convergence
        if (new_sigma - sigma).abs() < tol * sigma.max(1.0) {
            return (new_sigma, u, v);
        }
        sigma = new_sigma;
    }
    (sigma, u, v)
}
```
After finding each singular triplet, the algorithm deflates: `A <- A - sigma * u * v^T`
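On a small matrix the loop above converges quickly. A self-contained demonstration using plain arrays (not the module's `Matrix` type):

```rust
// Power iteration for the dominant singular value of a 2x2 matrix.
fn top_singular_value(a: [[f32; 2]; 2], max_iter: usize) -> f32 {
    let mut v = [0.6f32, 0.8]; // deterministic unit-length start
    let mut sigma = 0.0;
    for _ in 0..max_iter {
        // u = A * v; its norm estimates sigma
        let mut u = [
            a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1],
        ];
        sigma = (u[0] * u[0] + u[1] * u[1]).sqrt();
        if sigma == 0.0 {
            return 0.0;
        }
        u[0] /= sigma;
        u[1] /= sigma;
        // v = A^T * u, renormalized
        let w = [
            a[0][0] * u[0] + a[1][0] * u[1],
            a[0][1] * u[0] + a[1][1] * u[1],
        ];
        let n = (w[0] * w[0] + w[1] * w[1]).sqrt();
        v = [w[0] / n, w[1] / n];
    }
    sigma
}

fn main() {
    // diag(3, 1) has singular values {3, 1}; the iteration finds 3
    let sigma = top_singular_value([[3.0, 0.0], [0.0, 1.0]], 50);
    assert!((sigma - 3.0).abs() < 1e-3);
}
```

Convergence is geometric in the ratio of the two largest singular values, so well-separated spectra (typical of embedding unfoldings) need few iterations.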
2. Randomized SVD (large matrices)
Uses the Halko-Martinsson-Tropp 2011 algorithm for matrices > 32 dimensions:
```mermaid
graph TD
    subgraph "Randomized SVD Pipeline"
        A[Input Matrix A<br/>m x n] --> OMEGA[Generate Gaussian<br/>Omega n x k+p]
        A --> SAMPLE[Y = A * Omega<br/>m x k+p]
        SAMPLE --> QR[QR decompose Y<br/>Q = orth basis]
        QR --> PROJECT[B = Q^T * A<br/>k+p x n]
        PROJECT --> SMALL_SVD[SVD of small B<br/>power iteration]
        SMALL_SVD --> RECONSTRUCT[U = Q * U_small]
    end
```
Key implementation details:
- Gaussian matrix generation: Uses a Linear Congruential Generator (LCG) with Box-Muller transform for deterministic, portable random numbers
- QR orthonormalization: Modified Gram-Schmidt for numerical stability
- Oversampling: Adds 5 extra columns to improve accuracy
- Convergence: 20 iterations max (sufficient for embedding vectors)
```rust
// LCG parameters from Numerical Recipes
fn lcg_next(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6_364_136_223_846_793_005)
        .wrapping_add(1_442_695_040_888_963_407);
    *state
}

// Box-Muller transform for Gaussian
fn box_muller(u1: f32, u2: f32) -> (f32, f32) {
    let r = (-2.0 * u1.ln()).sqrt();
    let theta = 2.0 * PI * u2;
    (r * theta.cos(), r * theta.sin())
}
```
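Because the generator is a fixed LCG, the Gaussian stream is fully reproducible across runs and platforms. A standalone sketch wiring the two pieces together (the uniform mapping here is an assumption for illustration, not necessarily the module's exact one):

```rust
const PI: f32 = std::f32::consts::PI;

// LCG step (Numerical Recipes constants, as in the snippet above)
fn lcg_next(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6_364_136_223_846_793_005)
        .wrapping_add(1_442_695_040_888_963_407);
    *state
}

// Map the high 24 bits to a uniform in (0, 1] so ln() is always defined
fn uniform(state: &mut u64) -> f32 {
    (((lcg_next(state) >> 40) + 1) as f32) / (1u64 << 24) as f32
}

// Two uniforms -> two independent Gaussians via Box-Muller
fn gaussian_pair(state: &mut u64) -> (f32, f32) {
    let (u1, u2) = (uniform(state), uniform(state));
    let r = (-2.0 * u1.ln()).sqrt();
    let theta = 2.0 * PI * u2;
    (r * theta.cos(), r * theta.sin())
}

fn main() {
    // Same seed -> identical Gaussian sequence (deterministic, portable)
    let (mut s1, mut s2) = (42u64, 42u64);
    for _ in 0..100 {
        assert_eq!(gaussian_pair(&mut s1), gaussian_pair(&mut s2));
    }
}
```

Using the high bits of the LCG state matters: the low bits of a power-of-two-modulus LCG have short periods and would bias the Gaussians.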
Optimal Shape Selection
The module includes hardcoded optimal shapes for common embedding dimensions:
| Dimension | Shape | Why |
|---|---|---|
| 64 | [4, 4, 4] | 3 balanced factors |
| 128 | [4, 4, 8] | Near-balanced |
| 256 | [4, 8, 8] | Near-balanced |
| 384 | [4, 8, 12] | all-MiniLM-L6-v2 |
| 512 | [8, 8, 8] | Perfect cube |
| 768 | [8, 8, 12] | BERT dimension |
| 1024 | [8, 8, 16] | Common LLM size |
| 1536 | [8, 12, 16] | OpenAI ada-002 |
| 2048 | [8, 16, 16] | Near-balanced |
| 3072 | [8, 16, 24] | Large models |
| 4096 | [8, 8, 8, 8] | 4D balanced |
| 8192 | [8, 8, 8, 16] | Extra large |
For non-standard dimensions, `factorize_balanced` finds factors close to the nth root:
```rust
// Sketch of the balanced factorization (body elided)
fn factorize_balanced(n: usize) -> Vec<usize> {
    // Target 2-6 factors based on log2(n)
    let target_factors = ((n as f64).ln() / 2f64.ln()).ceil().clamp(2.0, 6.0) as usize;
    // Ideal factor size: the target_factors-th root of n
    let target_size = (n as f64).powf(1.0 / target_factors as f64);
    // Greedily find factors close to target_size
    // ...
}
```
TT Operations
| Function | Description | Complexity |
|---|---|---|
| `tt_decompose` | Decompose vector to TT format | O(n * d * r^2) |
| `tt_decompose_batch` | Parallel batch decomposition (4+ vectors) | O(batch * n * d * r^2 / threads) |
| `tt_reconstruct` | Reconstruct vector from TT | O(d^n * r^2) |
| `tt_dot_product` | Dot product in TT space | O(n * d * r^4) |
| `tt_dot_product_batch` | Batch dot products | Parallel when >= 4 targets |
| `tt_cosine_similarity` | Cosine similarity in TT space | O(n * d * r^4) |
| `tt_cosine_similarity_batch` | Batch cosine similarities | Parallel when >= 4 targets |
| `tt_euclidean_distance` | Euclidean distance in TT space | O(n * d * r^4) |
| `tt_euclidean_distance_batch` | Batch Euclidean distances | Parallel when >= 4 targets |
| `tt_norm` | L2 norm of TT vector | O(n * d * r^4) |
| `tt_scale` | Scale TT vector by constant | O(cores[0].size) |
Where: n = number of modes, d = mode size, r = TT-rank
TT Gram Matrix Computation
Computing dot products and norms in TT space uses the Gram matrix approach:
```rust
// Gram matrix propagation for dot product
fn tt_dot_product(a: &TTVector, b: &TTVector) -> f32 {
    let mut gram = vec![1.0f32]; // Start with 1x1 identity
    for (core_a, core_b) in a.cores.iter().zip(b.cores.iter()) {
        let (r1a, n, r2a) = core_a.shape;
        let (r1b, _, r2b) = core_b.shape;
        let mut new_gram = vec![0.0; r2a * r2b];
        // Contract: new_gram[a,b] = sum_{k,i,j} gram[i,j] * A[i,k,a] * B[j,k,b]
        for a_idx in 0..r2a {
            for b_idx in 0..r2b {
                for k in 0..n {
                    for ia in 0..r1a {
                        for ib in 0..r1b {
                            let g = gram[ia * r1b + ib];
                            new_gram[a_idx * r2b + b_idx] +=
                                g * core_a.get(ia, k, a_idx) * core_b.get(ib, k, b_idx);
                        }
                    }
                }
            }
        }
        gram = new_gram;
    }
    gram[0] // Final 1x1 Gram matrix
}
```
Usage
```rust
use tensor_compress::{tt_decompose, tt_reconstruct, tt_cosine_similarity, TTConfig};

let embedding: Vec<f32> = get_embedding(); // 4096-dim
let config = TTConfig::for_dim(4096)?;

// Decompose
let tt = tt_decompose(&embedding, &config)?;
println!("Compression: {:.1}x", tt.compression_ratio());
println!("Storage: {} floats", tt.storage_size());
println!("Max rank: {}", tt.max_rank());

// Reconstruct
let restored = tt_reconstruct(&tt);

// Compute similarity without reconstruction
let tt2 = tt_decompose(&other_embedding, &config)?;
let sim = tt_cosine_similarity(&tt, &tt2)?;
```
Batch Operations
Batch operations use rayon for parallel processing when handling 4+ vectors:
```rust
use tensor_compress::{tt_decompose_batch, tt_cosine_similarity_batch, TTConfig};

let vectors: Vec<Vec<f32>> = load_embeddings();
let config = TTConfig::for_dim(4096)?;

// Batch decompose (parallel for 4+ vectors)
let refs: Vec<&[f32]> = vectors.iter().map(|v| v.as_slice()).collect();
let tts = tt_decompose_batch(&refs, &config)?;

// Batch similarity search
let query_tt = &tts[0];
let similarities = tt_cosine_similarity_batch(query_tt, &tts[1..])?;

// Find top-k
let mut indexed: Vec<_> = similarities.iter().enumerate().collect();
indexed.sort_by(|a, b| b.1.partial_cmp(a.1).unwrap());
let top_5: Vec<_> = indexed.iter().take(5).collect();
```
The parallel threshold constant is:
```rust
const PARALLEL_THRESHOLD: usize = 4;
```
Configuration
TTConfig Presets
| Preset | max_rank | tolerance | Use Case |
|---|---|---|---|
| `for_dim(d)` | 8 | 1e-4 | Balanced compression/accuracy |
| `high_compression(d)` | 4 | 1e-2 | Maximize compression (2-3x more) |
| `high_accuracy(d)` | 16 | 1e-6 | Maximize accuracy (<0.1% error) |
TTConfig Validation
```rust
impl TTConfig {
    pub fn validate(&self) -> Result<(), TTError> {
        if self.shape.is_empty() {
            return Err(TTError::InvalidShape("empty shape".into()));
        }
        if self.shape.contains(&0) {
            return Err(TTError::InvalidShape("shape contains zero".into()));
        }
        if self.max_rank < 1 {
            return Err(TTError::InvalidRank);
        }
        if self.tolerance <= 0.0 || self.tolerance > 1.0 || !self.tolerance.is_finite() {
            return Err(TTError::InvalidTolerance(self.tolerance));
        }
        Ok(())
    }
}
```
CompressionConfig
```rust
pub struct CompressionConfig {
    pub tensor_mode: Option<TensorMode>, // TT compression for vectors
    pub delta_encoding: bool,            // For sorted ID lists
    pub rle_encoding: bool,              // For repeated values
}

// Presets
CompressionConfig::high_compression() // max_rank=4, all encodings enabled
CompressionConfig::balanced(dim)      // max_rank=8, all encodings enabled
CompressionConfig::high_accuracy(dim) // max_rank=16, all encodings enabled
```
Dimension Presets
| Constant | Value | Model |
|---|---|---|
| `SMALL` | 64 | MiniLM and small models |
| `MEDIUM` | 384 | all-MiniLM-L6-v2 |
| `STANDARD` | 768 | BERT, sentence-transformers |
| `LARGE` | 1536 | OpenAI text-embedding-ada-002 |
| `XLARGE` | 4096 | LLaMA and large models |
Streaming Operations
State Machine
stateDiagram-v2
[*] --> Created: new()
Created --> Writing: write_entry() / write_vector()
Writing --> Writing: write_entry() / write_vector()
Writing --> Finishing: finish()
Finishing --> [*]: success
note right of Created
Magic bytes written
entry_count = 0
end note
note right of Writing
Length-prefixed entries
entry_count incremented
end note
note right of Finishing
Trailer written with:
- entry_count
- config
- data_start offset
end note
File Format
Uses a trailer-based header so entry count is known at the end:
+------------------------+
| Magic (NEUS/NEUT) 4B | Identifies streaming snapshot/TT
+------------------------+
| Entry 1 length 4B | Little-endian u32
+------------------------+
| Entry 1 data var | Bincode-serialized entry
+------------------------+
| Entry 2 length 4B |
+------------------------+
| Entry 2 data var |
+------------------------+
| ... |
+------------------------+
| Trailer var | Bincode-serialized header
+------------------------+
| Trailer size 8B | Little-endian u64
+------------------------+
Security limits:
- Maximum trailer size: 1 MB (`MAX_TRAILER_SIZE`)
- Maximum entry size: 100 MB (`MAX_ENTRY_SIZE`)
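Reading a trailer-based file starts from the end: read the final 8-byte length, then seek backwards to the trailer itself. A minimal sketch of that locate-and-read logic, based on the layout and the 1 MB limit described above (the helper name is hypothetical, not the library's API):

```rust
use std::io::{self, Read, Seek, SeekFrom};

const MAX_TRAILER_SIZE: u64 = 1024 * 1024; // 1 MB guard against allocation attacks

/// Locate and read the trailer: the last 8 bytes hold the trailer length
/// as a little-endian u64, and the trailer sits immediately before them.
fn read_trailer<R: Read + Seek>(r: &mut R) -> io::Result<Vec<u8>> {
    let end = r.seek(SeekFrom::End(0))?;
    if end < 8 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file too short"));
    }
    r.seek(SeekFrom::End(-8))?;
    let mut len_buf = [0u8; 8];
    r.read_exact(&mut len_buf)?;
    let trailer_len = u64::from_le_bytes(len_buf);
    // Enforce the size limit before allocating
    if trailer_len > MAX_TRAILER_SIZE || trailer_len + 8 > end {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "bad trailer size"));
    }
    r.seek(SeekFrom::End(-8 - trailer_len as i64))?;
    let mut trailer = vec![0u8; trailer_len as usize];
    r.read_exact(&mut trailer)?;
    Ok(trailer)
}
```

This is also why the reader needs `Seek`: the entry count is unknown until the trailer has been located from the end of the file.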
Usage
```rust
use tensor_compress::streaming::{StreamingWriter, StreamingReader};

// Write entries one at a time
let mut writer = StreamingWriter::new(file, config)?;
for entry in entries {
    writer.write_entry(&entry)?;
}
writer.finish()?;

// Read entries one at a time (iterator-based)
let reader = StreamingReader::open(file)?;
println!("Entry count: {}", reader.entry_count());
for entry in reader {
    process(entry?);
}
```
Streaming TT Operations
| Function | Description |
|---|---|
| `StreamingTTWriter::new` | Create TT streaming writer |
| `StreamingTTWriter::write_vector` | Decompose and write vector |
| `StreamingTTWriter::write_tt` | Write pre-decomposed TT |
| `StreamingTTWriter::finish` | Finalize with trailer |
| `StreamingTTReader::open` | Open TT streaming file |
| `streaming_tt_similarity_search` | Search streaming TT file |
| `convert_vectors_to_streaming_tt` | Batch convert vectors |
| `read_streaming_tt_all` | Load all TT vectors |
```rust
use tensor_compress::streaming_tt::{StreamingTTWriter, StreamingTTReader, streaming_tt_similarity_search};

// Create streaming TT file
let config = TTConfig::for_dim(768)?;
let mut writer = StreamingTTWriter::new(file, config.clone())?;
for vector in vectors {
    writer.write_vector(&vector)?; // Decompose on-the-fly
}
writer.finish()?;

// Similarity search without loading all into memory
let query_tt = tt_decompose(&query, &config)?;
let top_10 = streaming_tt_similarity_search(file, &query_tt, 10)?;
// Returns Vec<(index, similarity)> sorted by descending similarity
```
Merge and Convert Operations
```rust
use tensor_compress::streaming::{convert_to_streaming, read_streaming_to_snapshot, merge_streaming};

// Convert non-streaming snapshot to streaming format
let count = convert_to_streaming(&snapshot, output_file)?;

// Read streaming format into full snapshot (for compatibility)
let snapshot = read_streaming_to_snapshot(file)?;

// Merge multiple streaming snapshots
let count = merge_streaming(vec![file1, file2, file3], output, config)?;
```
Incremental Updates
Delta Snapshot Architecture
graph TD
subgraph "Delta Chain"
BASE[Base Snapshot<br/>Seq 0] --> D1[Delta 1<br/>Seq 1-10]
D1 --> D2[Delta 2<br/>Seq 11-25]
D2 --> D3[Delta 3<br/>Seq 26-30]
end
subgraph "Compaction"
BASE2[Base] --> COMPACT[Compacted<br/>Snapshot]
D1_2[Delta 1] --> COMPACT
D2_2[Delta 2] --> COMPACT
D3_2[Delta 3] --> COMPACT
end
Delta Entry Types
```rust
pub enum ChangeType {
    Put,    // Entry was added or updated
    Delete, // Entry was deleted
}

pub struct DeltaEntry {
    pub key: String,
    pub change: ChangeType,
    pub value: Option<CompressedEntry>, // None for Delete
    pub sequence: u64,
}
```
Usage
```rust
use tensor_compress::incremental::{DeltaBuilder, DeltaChain, apply_delta, merge_deltas, diff_snapshots};

// Create delta
let mut builder = DeltaBuilder::new("base_snapshot_id", sequence);
builder.put("key1", entry1);
builder.delete("key2");
let delta = builder.build();

// Apply delta
let new_snapshot = apply_delta(&base, &delta)?;

// Chain management
let mut chain = DeltaChain::new(base_snapshot);
chain.push(delta1)?;
chain.push(delta2)?;
let value = chain.get("key1"); // Checks chain then base

// Compact when chain grows long
if chain.should_compact(10) {
    let compacted = chain.compact()?;
}

// Compare two snapshots
let delta = diff_snapshots(&old_snapshot, &new_snapshot, "old_id")?;

// Merge multiple deltas into one
let merged = merge_deltas(&[delta1, delta2, delta3])?;
```
Delta Operations
| Function | Description |
|---|---|
| `DeltaBuilder::new` | Create delta builder with base ID and start sequence |
| `DeltaBuilder::put` | Record a put (add/update) change |
| `DeltaBuilder::delete` | Record a delete change |
| `DeltaBuilder::build` | Build the delta snapshot |
| `apply_delta` | Apply delta to base snapshot |
| `merge_deltas` | Merge multiple deltas (keeps latest state per key) |
| `diff_snapshots` | Compute delta between two snapshots |
| `DeltaChain::get` | Get current state of key (checks chain then base) |
| `DeltaChain::compact` | Compact all deltas into new base |
| `DeltaChain::should_compact` | Check if compaction is recommended |
Delta Format
+------------------------+
| Magic (NEUD) 4B |
+------------------------+
| Version 2B |
+------------------------+
| Base ID var | String (length-prefixed)
+------------------------+
| Sequence Range 16B | (start, end) u64 pair
+------------------------+
| Change Count 8B |
+------------------------+
| Created At 8B | Unix timestamp
+------------------------+
| Entries var | Bincode-serialized Vec<DeltaEntry>
+------------------------+
Lossless Compression
Delta + Varint Encoding
For sorted integer sequences (node IDs, timestamps):
graph LR
subgraph "Delta + Varint Pipeline"
IDS[IDs: 100, 101, 102, 105, 110] --> DELTA[Delta encode:<br/>100, 1, 1, 3, 5]
DELTA --> VARINT[Varint encode]
VARINT --> OUT[Bytes: ~7 bytes<br/>vs 40 bytes raw]
end
Algorithm:
```rust
// Delta encoding: store first value, then differences
pub fn delta_encode(ids: &[u64]) -> Vec<u64> {
    let mut result = vec![ids[0]];
    for window in ids.windows(2) {
        result.push(window[1].saturating_sub(window[0]));
    }
    result
}

// Varint encoding: 7 bits per byte, high bit = continuation
pub fn varint_encode(values: &[u64]) -> Vec<u8> {
    let mut result = Vec::with_capacity(values.len() * 2);
    for &value in values {
        let mut v = value;
        loop {
            let byte = (v & 0x7f) as u8;
            v >>= 7;
            if v == 0 {
                result.push(byte); // Final byte (no continuation)
                break;
            }
            result.push(byte | 0x80); // Continuation bit set
        }
    }
    result
}
```
Usage:
```rust
use tensor_compress::{compress_ids, decompress_ids};

let ids: Vec<u64> = (1000..2000).collect();
let compressed = compress_ids(&ids); // ~100 bytes vs 8000
let restored = decompress_ids(&compressed);
assert_eq!(ids, restored);
```
Varint byte sizes:
| Value Range | Bytes |
|---|---|
| 0 - 127 | 1 |
| 128 - 16,383 | 2 |
| 16,384 - 2,097,151 | 3 |
| 2,097,152 - 268,435,455 | 4 |
| … up to u64::MAX | 10 |
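Decoding reverses the pipeline: parse each varint (7 payload bits per byte, high bit as continuation), then undo the delta encoding with a running sum. A sketch of the inverse direction, assuming exactly the encoding shown above:

```rust
/// Parse little-endian base-128 varints back into u64 values.
fn varint_decode(bytes: &[u8]) -> Vec<u64> {
    let mut out = Vec::new();
    let (mut value, mut shift) = (0u64, 0u32);
    for &b in bytes {
        value |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            // Final byte of this value: emit and reset
            out.push(value);
            value = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    out
}

/// Undo delta encoding: first value is absolute, the rest are gaps.
fn delta_decode(deltas: &[u64]) -> Vec<u64> {
    let mut out = Vec::with_capacity(deltas.len());
    let mut acc = 0u64;
    for (i, &d) in deltas.iter().enumerate() {
        acc = if i == 0 { d } else { acc + d };
        out.push(acc);
    }
    out
}
```

Running `delta_decode(&varint_decode(...))` on the pipeline's output recovers the original sorted ID list.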
Run-Length Encoding
For repeated values:
```rust
use tensor_compress::{rle_encode, rle_decode};

let statuses = vec!["active"; 1000];
let encoded = rle_encode(&statuses);
assert_eq!(encoded.runs(), 1); // Single run
// Storage: 1 string + 1 u32 = ~12 bytes vs 6000+ bytes
```
Internal representation:
```rust
pub struct RleEncoded<T: Eq> {
    pub values: Vec<T>,        // Unique values in order
    pub run_lengths: Vec<u32>, // Count for each value
}
```
Compression scenarios:
| Data Pattern | Runs | Compression |
|---|---|---|
| `[5, 5, 5, 5, 5]` (1000x) | 1 | 500x |
| `[1, 2, 3, 4, 5]` (all different) | 5 | 0.8x (overhead) |
| `[1, 1, 2, 2, 2, 3, 1, 1, 1, 1]` | 4 | 2.5x |
| Status column (pending/active/done) | ~300 per 10000 | ~33x |
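The run counts in this table follow from a simple scan. A minimal encoder producing the `RleEncoded` representation shown above (illustrative sketch, not the crate's `rle_encode`):

```rust
/// Minimal run-length encoder: returns (run values, run lengths),
/// mirroring the RleEncoded { values, run_lengths } layout.
fn rle_encode<T: Eq + Clone>(items: &[T]) -> (Vec<T>, Vec<u32>) {
    let mut values = Vec::new();
    let mut run_lengths: Vec<u32> = Vec::new();
    for item in items {
        match values.last() {
            // Same value as the current run: extend it
            Some(last) if last == item => *run_lengths.last_mut().unwrap() += 1,
            // New run starts here
            _ => {
                values.push(item.clone());
                run_lengths.push(1);
            }
        }
    }
    (values, run_lengths)
}
```

Note that values may repeat non-adjacently: `[1, 1, 2, 2, 2, 3, 1, 1, 1, 1]` yields values `[1, 2, 3, 1]` with lengths `[2, 3, 1, 4]`, matching the 4 runs in the table.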
Sparse Vector Format
For vectors with >50% zeros:
```rust
use tensor_compress::{compress_sparse, compress_dense_as_sparse, should_use_sparse, should_use_sparse_threshold};

// Direct sparse compression
let positions = vec![0, 50, 99];
let values = vec![1.0, 2.0, 3.0];
let compressed = compress_sparse(100, &positions, &values);

// Auto-detect and compress
if should_use_sparse_threshold(&vector, 0.5) {
    let compressed = compress_dense_as_sparse(&vector);
}

// Check if sparse is beneficial
if should_use_sparse(dimension, non_zero_count) {
    // Use sparse format
}
```
Storage calculation:
```rust
// sparse_storage_size = 8 + 8 + nnz*2 + nnz*4 = 16 + nnz*6
// Dense storage = dimension * 4
// Sparse is better when: 16 + nnz*6 < dimension*4
// Solving: nnz < (dimension*4 - 16) / 6 = dimension*0.67 - 2.67
```
| Dimension | Max NNZ for Sparse | Sparsity Threshold |
|---|---|---|
| 100 | 64 | 64% |
| 1000 | 664 | 66.4% |
| 4096 | 2728 | 66.6% |
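These thresholds follow directly from the storage formula. A sketch of the break-even check (a hypothetical reimplementation that reproduces the table's values, assuming the break-even point is inclusive; the crate's actual `should_use_sparse` may differ in details):

```rust
/// Sparse pays off while its storage (16 + nnz*6 bytes) does not
/// exceed dense storage (dimension * 4 bytes).
fn should_use_sparse(dimension: usize, nnz: usize) -> bool {
    16 + nnz * 6 <= dimension * 4
}

/// Largest non-zero count for which the sparse format still wins.
fn max_nnz_for_sparse(dimension: usize) -> usize {
    (dimension * 4).saturating_sub(16) / 6
}
```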
Compressed Value Types
```rust
pub enum CompressedValue {
    Scalar(CompressedScalar),   // Int, Float, String, Bool, Null
    VectorRaw(Vec<f32>),        // Uncompressed
    VectorTT { cores, original_dim, shape, ranks }, // TT-compressed
    VectorSparse { dimension, positions, values },  // Sparse
    IdList(Vec<u8>),            // Delta + varint encoded
    RleInt(RleEncoded<i64>),    // RLE encoded integers
    Pointer(String),            // Single pointer
    Pointers(Vec<String>),      // Multiple pointers
}
```
Automatic Format Selection
```rust
pub fn compress_vector(
    vector: &[f32],
    key: &str,
    field_name: &str,
    config: &CompressionConfig,
) -> Result<CompressedValue, FormatError> {
    // 1. Check for embedding-like keys
    let is_embedding = key.starts_with("emb:")
        || field_name == "_embedding"
        || field_name == "vector";
    if is_embedding {
        if let Some(TensorMode::TensorTrain(tt_config)) = &config.tensor_mode {
            return Ok(CompressedValue::VectorTT { ... });
        }
    }

    // 2. Check for ID list pattern
    if config.delta_encoding && looks_like_id_list(vector, field_name) {
        return Ok(CompressedValue::IdList(...));
    }

    // 3. Fall back to raw
    Ok(CompressedValue::VectorRaw(vector.to_vec()))
}
```
Performance
Benchmarks on Apple M4 (aarch64, MacBook Air 24GB), release build:
| Dimension | Decompose | Reconstruct | Similarity | Compression |
|---|---|---|---|---|
| 64 | 6.2 us | 29.5 us | 1.1 us | 2.0x |
| 256 | 13.4 us | 113.0 us | 1.5 us | 4.6x |
| 768 | 26.9 us | 431.7 us | 2.4 us | 10.7x |
| 1536 | 62.0 us | 709.8 us | 2.0 us | 16.0x |
| 4096 | 464.5 us | 2142.2 us | 2.4 us | 42.7x |
Batch operations (768-dim, 1000 vectors):
| Operation | Time | Per-vector |
|---|---|---|
| `tt_decompose_batch` | 21 ms | 21.0 us |
| `tt_cosine_similarity_batch` | 11.3 ms | 11.4 us |
Throughput: 39,318 vectors/sec (768-dim decomposition)
Industry Comparison
| Method | Compression | Recall | Notes |
|---|---|---|---|
| Tensor Train (this) | 10-42x | ~99% | Similarity in compressed space |
| Scalar Quantization | 4x | 99%+ | Industry default |
| Product Quantization | 16-64x | 56-90% | Requires training |
| Binary Quantization | 32x | 80-95% | Speed-optimized |
Edge Cases and Gotchas
Vector Content Patterns
| Pattern | Compression | Reconstruction | Notes |
|---|---|---|---|
| Constant (all same) | Excellent (>5x) | Accurate | Rank-1 structure |
| All zeros | Good | Accurate | Degenerate case |
| Single spike | Poor | Moderate | No low-rank structure |
| Linear ramp | Good (>2x) | Good | Low-rank |
| Alternating +1/-1 | Poor | Moderate | High-frequency needs high rank |
| Random dense | Good | Good (>0.9 cosine) | Typical embeddings |
| 90% zeros | Consider sparse instead | n/a | Use `compress_dense_as_sparse` |
Numerical Edge Cases
```rust
// Very small values (denormalized floats)
let tiny: Vec<f32> = (0..64).map(|i| (i as f32) * 1e-38).collect();
// Works, but may lose precision

// Large values (1e6 range)
let large: Vec<f32> = (0..64).map(|i| (i as f32) * 1e6).collect();
// Works, no overflow

// Prime dimensions
let prime_127: Vec<f32> = (0..127).map(|i| (i as f32 * 0.1).sin()).collect();
// Works but may have poor compression
```
Streaming Gotchas
- **Incomplete files**: Magic bytes are written first, but the entry count lives in the trailer. If the writer crashes before `finish()`, the file is corrupt.
- **Memory limits**: `MAX_ENTRY_SIZE` (100 MB) and `MAX_TRAILER_SIZE` (1 MB) prevent allocation attacks. Exceeding these returns an error.
- **Seek requirement**: `StreamingReader::open` requires `Seek` to read the trailer. For non-seekable streams, use `read_streaming_to_snapshot`, which buffers.
Delta Chain Gotchas
-
Chain length: Default
max_chain_len = 100. After this,push()returnsChainTooLongerror. Callcompact()periodically. -
Sequence gaps: Deltas should have contiguous sequences. The
merge_deltasfunction only keeps the latest state per key. -
Base reference: Deltas store a
base_idstring but don’t validate it exists. Your application must track base snapshots.
Performance Tips and Best Practices
Choosing Configuration
```rust
// For search/retrieval (similarity queries)
let config = TTConfig::for_dim(dim)?;          // Balanced

// For archival/cold storage
let config = TTConfig::high_compression(dim)?; // Smallest storage, lowest accuracy

// For accuracy-critical applications
let config = TTConfig::high_accuracy(dim)?;    // Largest storage, highest accuracy
```
Batch Size Optimization
```rust
// Below the parallel threshold (4), sequential is faster
// due to thread spawn overhead
let small_batch = tt_decompose_batch(&vectors[..3], &config); // Sequential

// At the threshold, parallel kicks in
let large_batch = tt_decompose_batch(&vectors, &config); // Parallel if >= 4
```
Memory Efficiency
```rust
// Bad: Load all, then process
let all_vectors = read_streaming_tt_all(file)?; // Loads all into memory

// Good: Stream process
for tt in StreamingTTReader::open(file)? {
    process(tt?); // One at a time
}

// Best: Use streaming search
let results = streaming_tt_similarity_search(file, &query_tt, 10)?;
```
Delta Compaction Strategy
```rust
let mut chain = DeltaChain::new(base);

// After N deltas or M total changes
if chain.len() >= 10 || total_changes >= 10000 {
    let new_base = chain.compact()?;
    chain = DeltaChain::new(new_base);
}
```
Dependencies
- `serde`: Serialization traits
- `bincode`: Binary format
- `thiserror`: Error types
- `rayon`: Parallel batch operations
No external LAPACK/BLAS - pure Rust SVD implementation.
Related Modules
| Module | Relationship |
|---|---|
| tensor_store | Uses compression for snapshot I/O |
| tensor_chain | Delta compression for state replication |
| tensor_checkpoint | Snapshot format integration |
Tensor Vault
Tensor Vault provides secure secret storage with AES-256-GCM encryption and graph-based access control. Designed for multi-agent environments, it implements a zero-trust architecture where access is determined by graph topology rather than traditional ACLs.
All secrets are encrypted at rest with authenticated encryption. The vault maintains a permanent audit trail of all operations and supports features like rate limiting, TTL-based grants, and namespace isolation for multi-tenant deployments.
Design Principles
| Principle | Description |
|---|---|
| Encryption at Rest | All secrets encrypted with AES-256-GCM |
| Topological Access Control | Access determined by graph path, not ACLs |
| Zero Trust | No bypass mode; node:root is the only universal accessor |
| Memory Safety | Keys zeroized on drop via zeroize crate |
| Permanent Audit Trail | All operations logged with queryable API |
| Defense in Depth | Multiple obfuscation layers hide patterns |
| Multi-Tenant Ready | Namespace isolation and rate limiting for agent systems |
Key Types
Core Types
| Type | Description |
|---|---|
| `Vault` | Main API for encrypted secret storage with graph-based access control |
| `VaultConfig` | Configuration for key derivation, rate limiting, and versioning |
| `VaultError` | Error types (AccessDenied, NotFound, CryptoError, etc.) |
| `Permission` | Access levels: Read, Write, Admin |
| `VersionInfo` | Metadata about a secret version (version number, timestamp) |
| `ScopedVault` | Entity-bound view for simplified API usage |
| `NamespacedVault` | Namespace-prefixed view for multi-tenant isolation |
Cryptographic Types
| Type | Description |
|---|---|
| `MasterKey` | Derived encryption key with zeroize-on-drop (32 bytes) |
| `Cipher` | AES-256-GCM encryption wrapper |
| `Obfuscator` | HMAC-based key obfuscation and AEAD metadata encryption |
| `PaddingSize` | Padding buckets for length hiding (256 B to 64 KB) |
Access Control Types
| Type | Description |
|---|---|
| `AccessController` | BFS-based graph path verification |
| `GrantTTLTracker` | Min-heap tracking grant expirations with persistence |
| `RateLimiter` | Sliding window rate limiting per entity |
| `RateLimitConfig` | Configurable limits per operation type |
Audit Types
| Type | Description |
|---|---|
| `AuditLog` | Query interface for audit entries |
| `AuditEntry` | Single operation record (entity, key, operation, timestamp) |
| `AuditOperation` | Operation types: Get, Set, Delete, Rotate, Grant, Revoke, List |
Architecture
graph TB
subgraph "Tensor Vault"
API[Vault API]
AC[AccessController]
Cipher[Cipher<br/>AES-256-GCM]
KDF[MasterKey<br/>Argon2id + HKDF]
Obf[Obfuscator<br/>HMAC + Padding]
Audit[AuditLog]
TTL[GrantTTLTracker]
RL[RateLimiter]
end
subgraph "Storage"
TS[TensorStore]
GE[GraphEngine]
end
API --> AC
API --> Cipher
API --> Obf
API --> Audit
API --> TTL
API --> RL
AC --> GE
Cipher --> KDF
Obf --> KDF
API --> TS
Audit --> TS
Data Flow
- Set Operation: Plaintext is padded, encrypted with random nonce, metadata obfuscated, stored via TensorStore
- Get Operation: Rate limit check, access path verified via BFS, ciphertext decrypted, padding removed, audit logged
- Grant Operation: Permission edge created in GraphEngine, TTL tracked if specified
- Revoke Operation: Permission edge deleted, expired grants cleaned up
Set Operation Flow
sequenceDiagram
participant C as Client
participant V as Vault
participant RL as RateLimiter
participant AC as AccessController
participant O as Obfuscator
participant Ci as Cipher
participant TS as TensorStore
participant GE as GraphEngine
participant A as AuditLog
C->>V: set(requester, key, value)
V->>RL: check_rate_limit(requester, Set)
alt Rate Limited
RL-->>V: RateLimited error
V-->>C: Error
end
alt New Secret
V->>V: Check requester == ROOT
alt Not Root
V-->>C: AccessDenied
end
else Update
V->>AC: check_path_with_permission(Write)
end
V->>O: pad_plaintext(value)
O-->>V: padded_value
V->>Ci: encrypt(padded_value)
Ci-->>V: (ciphertext, nonce)
V->>O: generate_storage_id(key, nonce)
O-->>V: blob_key
V->>TS: put(blob_key, ciphertext)
V->>O: obfuscate_key(key)
O-->>V: obfuscated_key
V->>O: encrypt_metadata(creator)
V->>O: encrypt_metadata(timestamp)
V->>TS: put(_vk:obfuscated_key, metadata)
alt New Secret
V->>GE: add_entity_edge(ROOT, secret_node, VAULT_ACCESS_ADMIN)
end
V->>A: record(requester, key, Set)
V-->>C: Ok(())
Access Control Model
Access is determined by graph topology using BFS traversal:
node:root ──VAULT_ACCESS_ADMIN──> vault_secret:api_key
^
user:alice ──VAULT_ACCESS_READ───────────┘
^
team:devs ──VAULT_ACCESS_WRITE───────────┘
^
user:bob ──MEMBER────────────────────────┘
| Requester | Path | Access |
|---|---|---|
| `node:root` | Always | Granted (Admin) |
| `user:alice` | Direct edge | Granted (Read only) |
| `team:devs` | Direct edge | Granted (Write) |
| `user:bob` | bob -> team:devs -> secret | Granted (Write via team) |
| `user:carol` | No path | Denied |
Permission Levels
| Level | Capabilities |
|---|---|
| Read | `get()`, `list()`, `get_version()`, `list_versions()` |
| Write | Read + `set()` (update), `rotate()`, `rollback()` |
| Admin | Write + `delete()`, `grant()`, `revoke()` |
Permission propagation follows graph paths. The effective permission is
determined by the VAULT_ACCESS_* edge type at the end of the path.
Allowed Traversal Edges
Only these edge types can grant transitive access:
- `VAULT_ACCESS` - Legacy edge type (treated as Admin for backward compatibility)
- `VAULT_ACCESS_READ` - Read-only access
- `VAULT_ACCESS_WRITE` - Read + Write access
- `VAULT_ACCESS_ADMIN` - Full access including grant/revoke
- `MEMBER` - Allows group membership traversal but does NOT grant permission directly
Access Control Algorithm
The AccessController uses BFS to find the best permission level along any
path:
```rust
// Simplified algorithm from access.rs
pub fn get_permission_level(graph: &GraphEngine, source: &str, target: &str) -> Option<Permission> {
    if source == target {
        return Some(Permission::Admin); // Self-access
    }
    let mut visited = HashSet::new();
    let mut queue = VecDeque::new();
    let mut best_permission: Option<Permission> = None;
    queue.push_back(source.to_string());
    visited.insert(source.to_string());

    while let Some(current) = queue.pop_front() {
        for edge in graph.get_entity_outgoing(&current) {
            let (_, to, edge_type, _) = graph.get_entity_edge(&edge);

            // Only traverse allowed edge types
            if !is_allowed_edge_type(&edge_type) {
                continue;
            }

            // VAULT_ACCESS_* edges grant permission to target
            if edge_type.starts_with("VAULT_ACCESS") && to == target {
                if let Some(perm) = Permission::from_edge_type(&edge_type) {
                    // Keep the strongest permission found on any path
                    best_permission = Some(best_permission.map_or(perm, |best| best.max(perm)));
                }
            } else if edge_type == "MEMBER" {
                // MEMBER edges allow traversal but NO permission grant
                if !visited.contains(&to) {
                    visited.insert(to.clone());
                    queue.push_back(to);
                }
            }
        }
    }
    best_permission
}
```
Security Note: MEMBER edges enable traversal through groups but do not
grant permissions. Only VAULT_ACCESS_* edges grant actual permissions. This
prevents privilege escalation via group membership.
Access Control Flow
flowchart TD
Start([Check Access]) --> IsRoot{Is requester ROOT?}
IsRoot -->|Yes| Granted([Access Granted - Admin])
IsRoot -->|No| BFS[Start BFS from requester]
BFS --> Queue{Queue empty?}
Queue -->|Yes| CheckBest{Best permission found?}
Queue -->|No| Pop[Pop next node]
Pop --> GetEdges[Get outgoing edges]
GetEdges --> ForEdge{For each edge}
ForEdge --> IsAllowed{Edge type allowed?}
IsAllowed -->|No| ForEdge
IsAllowed -->|Yes| IsVaultAccess{VAULT_ACCESS_* ?}
IsVaultAccess -->|Yes| IsTarget{Points to target?}
IsTarget -->|Yes| UpdateBest[Update best permission]
IsTarget -->|No| ForEdge
UpdateBest --> ForEdge
IsVaultAccess -->|No| IsMember{MEMBER edge?}
IsMember -->|Yes| AddQueue[Add destination to queue]
IsMember -->|No| ForEdge
AddQueue --> ForEdge
ForEdge -->|Done| Queue
CheckBest -->|Yes| CheckLevel{Permission >= required?}
CheckBest -->|No| Denied([Access Denied])
CheckLevel -->|Yes| Granted2([Access Granted])
CheckLevel -->|No| Insufficient([Insufficient Permission])
Storage Format
Secrets use a two-tier storage model for security:
Metadata Tensor
Storage key: _vk:{HMAC(key)} (key name obfuscated via HMAC-BLAKE2b)
| Field | Type | Description |
|---|---|---|
| `_blob` | Pointer | Reference to current version ciphertext blob |
| `_nonce` | Bytes | 12-byte encryption nonce for current version |
| `_versions` | Pointers | List of all version blob keys (oldest first) |
| `_key_enc` | Bytes | AES-GCM encrypted original key name |
| `_key_nonce` | Bytes | Nonce for key encryption |
| `_creator_obf` | Bytes | AEAD-encrypted creator (nonce prepended) |
| `_created_obf` | Bytes | AEAD-encrypted timestamp (nonce prepended) |
| `_rotator_obf` | Bytes | AEAD-encrypted last rotator (optional) |
| `_rotated_obf` | Bytes | AEAD-encrypted last rotation timestamp (optional) |
Ciphertext Blob
Storage key: _vs:{HMAC(key, nonce)} (random-looking storage ID)
| Field | Type | Description |
|---|---|---|
| `_data` | Bytes | Padded + encrypted secret |
| `_nonce` | Bytes | 12-byte encryption nonce |
| `_ts` | Int | Unix timestamp (seconds) when version was created |
Storage Key Structure
- `_vault:salt` - Persisted 16-byte salt for key derivation
- `_vk:<32-hex-chars>` - Metadata tensor (HMAC of secret key)
- `_vs:<24-hex-chars>` - Ciphertext blob (HMAC of key + nonce)
- `_va:<timestamp>:<counter>` - Audit log entries
- `_vault_ttl_grants` - Persisted TTL grants (JSON)
- `vault_secret:<32-hex-chars>` - Secret node for graph access control
Encryption
Key Derivation
Master key derived using Argon2id with HKDF-based subkey separation:
```rust
// From key.rs - Argon2id parameters
pub const SALT_SIZE: usize = 16; // 128-bit salt
pub const KEY_SIZE: usize = 32;  // 256-bit key (AES-256)

// Default VaultConfig values:
//   argon2_memory_cost: 65536 (64 MiB)
//   argon2_time_cost: 3 (iterations)
//   argon2_parallelism: 4 (threads)

// Argon2id configuration
let params = Params::new(
    config.argon2_memory_cost, // Memory in KiB
    config.argon2_time_cost,   // Iterations
    config.argon2_parallelism, // Parallelism
    Some(KEY_SIZE),            // Output length
)?;
let argon2 = Argon2::new(Algorithm::Argon2id, Version::V0x13, params);
argon2.hash_password_into(input, salt, &mut key)?;
```
Argon2id Security Properties:
- Hybrid algorithm: Argon2i (side-channel resistant) + Argon2d (GPU resistant)
- Memory-hard: Requires 64 MiB by default, defeating GPU/ASIC attacks
- Time-hard: 3 iterations increase computation time
- Parallelism: 4 threads to utilize modern CPUs
HKDF Subkey Derivation
Each purpose gets a cryptographically independent key via HKDF-SHA256:
```rust
// From key.rs - Domain-separated subkeys
impl MasterKey {
    pub fn derive_subkey(&self, domain: &[u8]) -> [u8; KEY_SIZE] {
        let hk = Hkdf::<Sha256>::new(None, &self.bytes);
        let mut output = [0u8; KEY_SIZE];
        hk.expand(domain, &mut output)
            .expect("HKDF expand cannot fail for 32 bytes");
        output
    }

    pub fn encryption_key(&self) -> [u8; KEY_SIZE] {
        self.derive_subkey(b"neumann_vault_encryption_v1")
    }

    pub fn obfuscation_key(&self) -> [u8; KEY_SIZE] {
        self.derive_subkey(b"neumann_vault_obfuscation_v1")
    }

    pub fn metadata_key(&self) -> [u8; KEY_SIZE] {
        self.derive_subkey(b"neumann_vault_metadata_v1")
    }
}
```
Key Hierarchy:
Master Password + Salt
│
▼ Argon2id
MasterKey (32 bytes)
│
├──▶ HKDF("encryption_v1") ──▶ AES-256-GCM key
├──▶ HKDF("obfuscation_v1") ──▶ HMAC key for obfuscation
└──▶ HKDF("metadata_v1") ──▶ AES-256-GCM key for metadata
Salt Persistence
The vault automatically manages salt persistence:
```rust
// From lib.rs - Salt handling on vault creation
pub fn new(
    master_key: &[u8],
    graph: Arc<GraphEngine>,
    store: TensorStore,
    config: VaultConfig,
) -> Result<Self> {
    let derived = if config.salt.is_some() {
        // Explicit salt provided - use it directly
        let (key, _) = MasterKey::derive(master_key, &config)?;
        key
    } else if let Some(persisted_salt) = Self::load_salt(&store) {
        // Use persisted salt for consistency across reopens
        MasterKey::derive_with_salt(master_key, &persisted_salt, &config)?
    } else {
        // Generate new random salt and persist it
        let (key, new_salt) = MasterKey::derive(master_key, &config)?;
        Self::save_salt(&store, new_salt)?;
        key
    };
    // ...
}
```
Encryption Process
- Pad plaintext to fixed bucket size (256B, 1KB, 4KB, 16KB, 32KB, or 64KB)
- Generate random 12-byte nonce
- Encrypt with AES-256-GCM
- Store ciphertext and nonce separately
```rust
// From encryption.rs
pub const NONCE_SIZE: usize = 12; // 96-bit nonce (AES-GCM standard)

impl Cipher {
    pub fn encrypt(&self, plaintext: &[u8]) -> Result<(Vec<u8>, [u8; NONCE_SIZE])> {
        let cipher = Aes256Gcm::new_from_slice(self.key.as_bytes())?;

        // Generate random nonce - CRITICAL for security
        let mut nonce_bytes = [0u8; NONCE_SIZE];
        rand::thread_rng().fill_bytes(&mut nonce_bytes);
        let nonce = Nonce::from_slice(&nonce_bytes);

        // AES-GCM provides authenticated encryption
        // Output: ciphertext || 16-byte authentication tag
        let ciphertext = cipher.encrypt(nonce, plaintext)?;
        Ok((ciphertext, nonce_bytes))
    }

    pub fn decrypt(&self, ciphertext: &[u8], nonce_bytes: &[u8]) -> Result<Vec<u8>> {
        if nonce_bytes.len() != NONCE_SIZE {
            return Err(VaultError::CryptoError("Invalid nonce size"));
        }
        let cipher = Aes256Gcm::new_from_slice(self.key.as_bytes())?;
        let nonce = Nonce::from_slice(nonce_bytes);

        // Decryption verifies the authentication tag;
        // fails if the ciphertext was tampered with
        cipher.decrypt(nonce, ciphertext)
    }
}
```
AES-256-GCM Security Properties:
- Authenticated encryption: Detects tampering via 128-bit authentication tag
- Nonce requirement: Each encryption MUST use a unique nonce
- Ciphertext expansion: 16 bytes larger than plaintext (auth tag)
Obfuscation Layers
| Layer | Purpose | Implementation |
|---|---|---|
| Key Obfuscation | Hide secret names | HMAC-BLAKE2b hash of key name |
| Pointer Indirection | Hide storage patterns | Ciphertext in separate blob with random-looking key |
| Length Padding | Hide plaintext size | Pad to fixed bucket sizes |
| Metadata Encryption | Hide creator/timestamps | AES-GCM with per-record random nonces |
| Blind Indexes | Searchable encryption | HMAC-based indexes for pattern matching |
Padding Bucket Sizes
```rust
// From obfuscation.rs
pub enum PaddingSize {
    Small = 256,        // API keys, tokens
    Medium = 1024,      // Certificates, small configs
    Large = 4096,       // Private keys, large configs
    ExtraLarge = 16384, // Very large secrets
    Huge = 32768,       // Oversized secrets
    Maximum = 65536,    // Maximum supported
}

impl PaddingSize {
    // Bucket selection (includes 4-byte length prefix + 1 byte min padding)
    pub fn for_length(len: usize) -> Option<Self> {
        let min_required = len + 5; // length prefix + min padding
        if min_required <= 256 { Some(Self::Small) }
        else if min_required <= 1024 { Some(Self::Medium) }
        else if min_required <= 4096 { Some(Self::Large) }
        else if min_required <= 16384 { Some(Self::ExtraLarge) }
        else if min_required <= 32768 { Some(Self::Huge) }
        else if min_required <= 65536 { Some(Self::Maximum) }
        else { None }
    }
}
```
Padding Format
+----------------+-------------------+------------------+
| Length (4B LE) | Plaintext (N B) | Random Padding |
+----------------+-------------------+------------------+
|<--------------- Bucket Size (256/1K/4K/...) -------->|
```rust
// From obfuscation.rs
pub fn pad_plaintext(plaintext: &[u8]) -> Result<Vec<u8>> {
    let target_size = PaddingSize::for_length(plaintext.len())
        .ok_or_else(|| VaultError::CryptoError("plaintext exceeds maximum bucket".into()))?
        as usize;
    let padding_len = target_size - 4 - plaintext.len(); // 4 = length prefix

    let mut padded = Vec::with_capacity(target_size);

    // Store the original length as u32 little-endian
    let len_bytes = (plaintext.len() as u32).to_le_bytes();
    padded.extend_from_slice(&len_bytes);

    // Original data
    padded.extend_from_slice(plaintext);

    // Random padding (not zeros), so padding bytes are
    // indistinguishable from real data
    let mut rng_bytes = vec![0u8; padding_len];
    rand::thread_rng().fill_bytes(&mut rng_bytes);
    padded.extend_from_slice(&rng_bytes);

    Ok(padded)
}
```
HMAC-BLAKE2b Construction
```rust
// From obfuscation.rs - HMAC construction for key obfuscation
fn hmac_hash(&self, data: &[u8], domain: &[u8]) -> [u8; 32] {
    // Inner hash: H((key XOR ipad) || domain || data)
    let mut inner_key = self.obfuscation_key;
    for byte in &mut inner_key {
        *byte ^= 0x36; // ipad
    }
    let mut inner_hasher = Blake2b::<U32>::new();
    inner_hasher.update(inner_key);
    inner_hasher.update(domain);
    inner_hasher.update(data);
    let inner_hash = inner_hasher.finalize();

    // Outer hash: H((key XOR opad) || inner_hash)
    let mut outer_key = self.obfuscation_key;
    for byte in &mut outer_key {
        *byte ^= 0x5c; // opad
    }
    let mut outer_hasher = Blake2b::<U32>::new();
    outer_hasher.update(outer_key);
    outer_hasher.update(inner_hash);
    outer_hasher.finalize().into()
}
```
Metadata AEAD Encryption
```rust
// From obfuscation.rs - Per-record AEAD encryption
pub fn encrypt_metadata(&self, data: &[u8]) -> Result<Vec<u8>> {
    let cipher = Aes256Gcm::new_from_slice(&self.metadata_key)?;

    // Fresh random nonce for each encryption
    let mut nonce_bytes = [0u8; 12];
    rand::thread_rng().fill_bytes(&mut nonce_bytes);
    let nonce = Nonce::from_slice(&nonce_bytes);

    let ciphertext = cipher.encrypt(nonce, data)?;

    // Format: nonce || ciphertext
    let mut result = Vec::with_capacity(12 + ciphertext.len());
    result.extend_from_slice(&nonce_bytes);
    result.extend(ciphertext);
    Ok(result)
}
```
Rate Limiting
Rate limiting uses a sliding window algorithm to prevent brute-force attacks:
```rust
// From rate_limit.rs
pub struct RateLimiter {
    // (entity, operation) -> timestamps of recent requests
    history: DashMap<(String, String), VecDeque<Instant>>,
    config: RateLimitConfig,
}

impl RateLimiter {
    pub fn check_and_record(&self, entity: &str, op: Operation) -> Result<(), String> {
        let limit = op.limit(&self.config);
        if limit == u32::MAX {
            return Ok(()); // Unlimited
        }

        let key = (entity.to_string(), op.as_str().to_string());
        let now = Instant::now();
        let window_start = now - self.config.window;

        let mut entry = self.history.entry(key).or_default();
        let timestamps = entry.value_mut();

        // Remove expired entries outside the window
        while let Some(front) = timestamps.front() {
            if *front < window_start {
                timestamps.pop_front();
            } else {
                break;
            }
        }

        let count = timestamps.len() as u32;
        if count >= limit {
            Err(format!(
                "Rate limit exceeded: {} {} calls in {:?}",
                count, op, self.config.window
            ))
        } else {
            timestamps.push_back(now); // Record this request
            Ok(())
        }
    }
}
```
Sliding Window Visualization
Window: 60 seconds
Limit: 5 requests
Timeline:
|--[req1]--[req2]---[req3]--[req4]---[req5]---|
|<------------------ Window ----------------->|
^
Now (6th request blocked)
After 10 seconds:
|--[req2]---[req3]--[req4]---[req5]---|
[req1] expired |<------------------ Window --------->|
^
Now (6th request allowed)
Rate Limit Configuration Presets
```rust
// Default configuration
impl Default for RateLimitConfig {
    fn default() -> Self {
        Self {
            max_gets: 60,   // 60 get() calls per minute
            max_lists: 10,  // 10 list() calls per minute
            max_sets: 30,   // 30 set() calls per minute
            max_grants: 20, // 20 grant() calls per minute
            window: Duration::from_secs(60),
        }
    }
}

impl RateLimitConfig {
    // Strict configuration for testing
    pub fn strict() -> Self {
        Self {
            max_gets: 5,
            max_lists: 2,
            max_sets: 3,
            max_grants: 2,
            window: Duration::from_secs(60),
        }
    }

    // No rate limiting
    pub fn unlimited() -> Self {
        Self {
            max_gets: u32::MAX,
            max_lists: u32::MAX,
            max_sets: u32::MAX,
            max_grants: u32::MAX,
            window: Duration::from_secs(60),
        }
    }
}
```
Note: node:root is exempt from rate limiting.
TTL Grant Tracking
TTL grants use a min-heap for efficient expiration tracking:
```rust
// From ttl.rs
pub struct GrantTTLTracker {
    // Priority queue of expiration times (min-heap)
    heap: Mutex<BinaryHeap<GrantTTLEntry>>,
}

struct GrantTTLEntry {
    expires_at: Instant,
    entity: String,
    secret_key: String,
}

// Reverse ordering turns BinaryHeap's max-heap into a min-heap
// (earliest expiration first)
impl Ord for GrantTTLEntry {
    fn cmp(&self, other: &Self) -> Ordering {
        other.expires_at.cmp(&self.expires_at) // Reversed!
    }
}
```
TTL Operations
```rust
// Add a grant with TTL
pub fn add(&self, entity: &str, secret_key: &str, ttl: Duration) {
    let entry = GrantTTLEntry {
        expires_at: Instant::now() + ttl,
        entity: entity.to_string(),
        secret_key: secret_key.to_string(),
    };
    self.heap.lock().unwrap().push(entry);
}

// Efficient expiration check - O(1) to peek, O(log n) to pop
pub fn get_expired(&self) -> Vec<(String, String)> {
    let now = Instant::now();
    let mut expired = Vec::new();
    let mut heap = self.heap.lock().unwrap();

    // Pop all expired entries (they sit at the top of the min-heap)
    while let Some(entry) = heap.peek() {
        if entry.expires_at <= now {
            if let Some(entry) = heap.pop() {
                expired.push((entry.entity, entry.secret_key));
            }
        } else {
            break; // No more expired entries
        }
    }
    expired
}
```
TTL Persistence
TTL grants survive vault restarts via TensorStore persistence:
```rust
// From ttl.rs
const TTL_STORAGE_KEY: &str = "_vault_ttl_grants";

#[derive(Serialize, Deserialize)]
pub struct PersistedGrant {
    pub expires_at_ms: i64, // Unix timestamp
    pub entity: String,
    pub secret_key: String,
}

pub fn persist(&self, store: &TensorStore) -> Result<()> {
    let grants: Vec<PersistedGrant> = self.heap.lock().unwrap()
        .iter()
        .map(|e| PersistedGrant {
            expires_at_ms: instant_to_unix_ms(e.expires_at),
            entity: e.entity.clone(),
            secret_key: e.secret_key.clone(),
        })
        .collect();
    let data = serde_json::to_vec(&grants)?;
    store.put(TTL_STORAGE_KEY, tensor_with_bytes(data))?;
    Ok(())
}

pub fn load(store: &TensorStore) -> Result<Self> {
    let tracker = Self::new();
    let grants: Vec<PersistedGrant> = load_from_store(store)?;
    for grant in grants {
        // Skip grants that expired while the vault was down
        if !grant.is_expired() {
            tracker.add_with_expiration(
                &grant.entity,
                &grant.secret_key,
                unix_ms_to_instant(grant.expires_at_ms),
            );
        }
    }
    Ok(tracker)
}
```
Cleanup Strategy
Expired grants are cleaned up opportunistically during get() operations:
```rust
// From lib.rs
pub fn get(&self, requester: &str, key: &str) -> Result<String> {
    // Opportunistic cleanup of expired grants
    self.cleanup_expired_grants();
    // ... rest of get operation
}

pub fn cleanup_expired_grants(&self) -> usize {
    let expired = self.ttl_tracker.get_expired();
    let mut revoked = 0;
    for (entity, key) in expired {
        let secret_node = self.secret_node_key(&key);
        // Delete the VAULT_ACCESS_* edge
        if let Ok(edges) = self.graph.get_entity_outgoing(&entity) {
            for edge_key in edges {
                if let Ok((_, to, edge_type, _)) = self.graph.get_entity_edge(&edge_key) {
                    if to == secret_node && edge_type.starts_with("VAULT_ACCESS") {
                        if self.graph.delete_entity_edge(&edge_key).is_ok() {
                            revoked += 1;
                        }
                    }
                }
            }
        }
    }
    revoked
}
```
Audit Logging
Audit Entry Storage
```rust
// From audit.rs
const AUDIT_PREFIX: &str = "_va:";
static AUDIT_COUNTER: AtomicU64 = AtomicU64::new(0);

pub fn record(&self, entity: &str, secret_key: &str, operation: &AuditOperation) {
    let timestamp = now_millis();
    let counter = AUDIT_COUNTER.fetch_add(1, Ordering::SeqCst);
    let key = format!("{AUDIT_PREFIX}{timestamp}:{counter}");

    let mut tensor = TensorData::new();
    tensor.set("_entity", entity);
    tensor.set("_secret", secret_key); // Already obfuscated by caller
    tensor.set("_op", operation.as_str());
    tensor.set("_ts", timestamp);

    // Additional fields for grant/revoke
    match operation {
        AuditOperation::Grant { to, permission } => {
            tensor.set("_target", to);
            tensor.set("_permission", permission);
        }
        AuditOperation::Revoke { from } => {
            tensor.set("_target", from);
        }
        _ => {}
    }

    // Best effort - audit failures don't block operations
    let _ = self.store.put(&key, tensor);
}
```
Audit Query Methods
| Method | Description | Time Complexity |
|---|---|---|
| `by_secret(key)` | All entries for a secret | O(n) scan + filter |
| `by_entity(entity)` | All entries by requester | O(n) scan + filter |
| `since(timestamp)` | Entries since timestamp | O(n) scan + filter |
| `between(start, end)` | Entries in time range | O(n) scan + filter |
| `recent(limit)` | Last N entries | O(n log n) sort + truncate |
Note: Secret keys are obfuscated in audit logs to prevent leaking plaintext names.
Usage Examples
Basic Operations
```rust
use tensor_vault::{Vault, VaultConfig, Permission};
use graph_engine::GraphEngine;
use tensor_store::TensorStore;
use std::sync::Arc;

// Initialize the vault
let graph = Arc::new(GraphEngine::new());
let store = TensorStore::new();
let vault = Vault::new(b"master_password", graph, store, VaultConfig::default())?;

// Store a secret (root only)
vault.set(Vault::ROOT, "api_key", "sk-secret123")?;

// Grant access with a permission level
vault.grant_with_permission(Vault::ROOT, "user:alice", "api_key", Permission::Read)?;

// Retrieve the secret
let value = vault.get("user:alice", "api_key")?;

// Revoke access
vault.revoke(Vault::ROOT, "user:alice", "api_key")?;
```
Permission-Based Access
```rust
// Grant different permission levels
vault.grant_with_permission(Vault::ROOT, "user:reader", "secret", Permission::Read)?;
vault.grant_with_permission(Vault::ROOT, "user:writer", "secret", Permission::Write)?;
vault.grant_with_permission(Vault::ROOT, "user:admin", "secret", Permission::Admin)?;

// Reader can only get/list
vault.get("user:reader", "secret")?;        // OK
vault.set("user:reader", "secret", "new")?; // InsufficientPermission

// Writer can update but not delete
vault.rotate("user:writer", "secret", "new_value")?; // OK
vault.delete("user:writer", "secret")?;              // InsufficientPermission

// Admin can do everything
vault.grant_with_permission("user:admin", "user:new", "secret", Permission::Read)?; // OK
vault.delete("user:admin", "secret")?;                                              // OK
```
TTL Grants
```rust
use std::time::Duration;

// Grant temporary access (1 hour)
vault.grant_with_ttl(
    Vault::ROOT,
    "agent:temp",
    "api_key",
    Permission::Read,
    Duration::from_secs(3600),
)?;

// Access works during the TTL
vault.get("agent:temp", "api_key")?; // OK

// After 1 hour, access is automatically revoked
// (cleanup happens opportunistically on the next vault operation)
```
Namespace Isolation
```rust
// Create namespaced vaults for multi-tenant isolation
let backend = vault.namespace("team:backend", "user:alice");
let frontend = vault.namespace("team:frontend", "user:bob");

// Keys are automatically prefixed
backend.set("db_password", "secret1")?; // Stored as "team:backend:db_password"
frontend.set("api_key", "secret2")?;    // Stored as "team:frontend:api_key"

// Cross-namespace access is blocked
frontend.get("db_password")?; // AccessDenied
```
Secret Versioning
```rust
// Each set/rotate creates a new version
vault.set(Vault::ROOT, "api_key", "v1")?;
vault.rotate(Vault::ROOT, "api_key", "v2")?;
vault.rotate(Vault::ROOT, "api_key", "v3")?;

// Get version info
let version = vault.current_version(Vault::ROOT, "api_key")?; // 3
let versions = vault.list_versions(Vault::ROOT, "api_key")?;
// [VersionInfo { version: 1, created_at: ... }, ...]

// Get a specific version
let old_value = vault.get_version(Vault::ROOT, "api_key", 1)?; // "v1"

// Rollback (creates a new version with the old content)
vault.rollback(Vault::ROOT, "api_key", 1)?;
vault.get(Vault::ROOT, "api_key")?;             // "v1"
vault.current_version(Vault::ROOT, "api_key")?; // 4 (rollback creates a new version)
```
Audit Queries
```rust
// Query by secret
let entries = vault.audit_log("api_key");

// Query by entity
let alice_actions = vault.audit_by_entity("user:alice");

// Query by time
let recent = vault.audit_since(timestamp_millis);
let last_10 = vault.audit_recent(10);

// Audit entries include operation details
for entry in entries {
    match &entry.operation {
        AuditOperation::Grant { to, permission } => {
            println!("Granted {} to {} at {}", permission, to, entry.timestamp);
        }
        AuditOperation::Get => {
            println!("{} read secret at {}", entry.entity, entry.timestamp);
        }
        _ => {}
    }
}
```
Scoped Vault
```rust
// Create a scoped view for a specific entity
let alice = vault.scope("user:alice");

// All operations use alice as the requester
alice.get("api_key")?; // Same as vault.get("user:alice", "api_key")
alice.list("*")?;      // Same as vault.list("user:alice", "*")
```
Configuration Options
VaultConfig
| Field | Type | Default | Description |
|---|---|---|---|
| `salt` | `Option<[u8; 16]>` | `None` | Salt for key derivation (random if not provided, persisted) |
| `argon2_memory_cost` | `u32` | 65536 | Memory cost in KiB (64 MiB) |
| `argon2_time_cost` | `u32` | 3 | Iteration count |
| `argon2_parallelism` | `u32` | 4 | Thread count |
| `rate_limit` | `Option<RateLimitConfig>` | `None` | Rate limiting (disabled if `None`) |
| `max_versions` | `usize` | 5 | Maximum versions to retain per secret |
RateLimitConfig
| Field | Type | Default | Description |
|---|---|---|---|
| `max_gets` | `u32` | 60 | Maximum `get()` calls per window |
| `max_lists` | `u32` | 10 | Maximum `list()` calls per window |
| `max_sets` | `u32` | 30 | Maximum `set()` calls per window |
| `max_grants` | `u32` | 20 | Maximum `grant()` calls per window |
| `window` | `Duration` | 60s | Sliding window duration |
Environment Variables
| Variable | Description |
|---|---|
| `NEUMANN_VAULT_KEY` | Base64-encoded 32-byte master key |
Shell Commands
| Command | Description |
|---|---|
| `VAULT INIT` | Initialize vault from `NEUMANN_VAULT_KEY` |
| `VAULT IDENTITY 'node:alice'` | Set current identity |
| `VAULT NAMESPACE 'team:backend'` | Set current namespace |
| `VAULT SET 'api_key' 'sk-123'` | Store encrypted secret |
| `VAULT GET 'api_key'` | Retrieve secret |
| `VAULT GET 'api_key' VERSION 2` | Get specific version |
| `VAULT DELETE 'api_key'` | Delete secret |
| `VAULT LIST 'prefix:*'` | List accessible secrets |
| `VAULT ROTATE 'api_key' 'new'` | Rotate secret value |
| `VAULT VERSIONS 'api_key'` | List version history |
| `VAULT ROLLBACK 'api_key' VERSION 2` | Roll back to a version |
| `VAULT GRANT 'user:bob' ON 'api_key'` | Grant admin access |
| `VAULT GRANT 'user:bob' ON 'api_key' READ` | Grant read-only access |
| `VAULT GRANT 'user:bob' ON 'api_key' WRITE` | Grant write access |
| `VAULT GRANT 'user:bob' ON 'api_key' TTL 3600` | Grant with 1-hour expiry |
| `VAULT REVOKE 'user:bob' ON 'api_key'` | Revoke access |
| `VAULT AUDIT 'api_key'` | View audit log for secret |
| `VAULT AUDIT BY 'user:alice'` | View audit log for entity |
| `VAULT AUDIT RECENT 10` | View last 10 operations |
Security Considerations
Best Practices
- Use strong master passwords: At least 128 bits of entropy
- Rotate secrets regularly: Use `rotate()` to maintain version history
- Grant minimal permissions: Use Read when Write/Admin is not needed
- Use TTL grants for temporary access: Prevents forgotten grants
- Enable rate limiting in production: Prevents brute-force attacks
- Use namespaces for multi-tenant: Enforces isolation
- Review audit logs: Monitor for suspicious access patterns
Edge Cases and Gotchas
| Scenario | Behavior |
|---|---|
| Grant to non-existent entity | Succeeds (edge created, entity may exist later) |
| Revoke non-existent grant | Succeeds silently (idempotent) |
| Get non-existent secret | Returns NotFound error |
| Set by non-root without Write | Returns AccessDenied or InsufficientPermission |
| TTL grant cleanup | Opportunistic on get() - may not be immediate |
| Version limit exceeded | Oldest versions automatically deleted |
| Plaintext > 64KB | Returns CryptoError |
| Invalid UTF-8 in secret | get() returns CryptoError |
| Concurrent modifications | Thread-safe via DashMap sharding |
| MEMBER edge to secret | Path exists but NO permission granted |
Threat Model
| Threat | Mitigation |
|---|---|
| Password brute-force | Argon2id memory-hard KDF (64MB, 3 iterations) |
| Offline dictionary attack | Random 128-bit salt, stored in TensorStore |
| Ciphertext tampering | AES-GCM authentication tag (128-bit) |
| Nonce reuse | Random 96-bit nonce per encryption |
| Key leakage | Keys zeroized on drop, subkeys via HKDF |
| Pattern analysis | Key obfuscation, padding, metadata encryption |
| Access enumeration | Rate limiting, audit logging |
| Privilege escalation | MEMBER edges don’t grant permissions |
| Replay attacks | Per-operation nonces, timestamps in metadata |
Performance
| Operation | Time | Notes |
|---|---|---|
| Key derivation (Argon2id) | ~80ms | 64MB memory cost |
| set (1KB) | ~29us | Includes encryption + versioning |
| get (1KB) | ~24us | Includes decryption + audit |
| set (10KB) | ~93us | Scales with data size |
| get (10KB) | ~91us | Scales with data size |
| Access check (shallow) | ~6us | Direct edge |
| Access check (deep, 10 hops) | ~17us | BFS traversal |
| grant | ~18us | Creates graph edge |
| revoke | ~1.1ms | Edge deletion + TTL cleanup |
| list (100 secrets) | ~291us | Pattern matching + access check |
| list (1000 secrets) | ~2.7ms | Scales linearly |
Related Modules
| Module | Relationship |
|---|---|
| Tensor Store | Underlying key-value storage for encrypted secrets |
| Graph Engine | Access control edges and audit trail |
| Query Router | VAULT command execution |
| Neumann Shell | Interactive vault commands |
Dependencies
| Crate | Purpose |
|---|---|
| `aes-gcm` | AES-256-GCM encryption |
| `argon2` | Key derivation |
| `hkdf` | Subkey derivation |
| `blake2` | HMAC and obfuscation hashing |
| `rand` | Nonce generation |
| `zeroize` | Secure memory cleanup |
| `dashmap` | Concurrent rate limit tracking |
| `serde` | TTL grant persistence |
Tensor Cache Architecture
Semantic caching for LLM responses with cost tracking and background eviction. Module 10 of Neumann.
The tensor_cache module provides multi-layer caching optimized for LLM
workloads. It combines O(1) exact hash lookups with O(log n) semantic similarity
search via HNSW indices. All cache entries are stored as TensorData in a
shared TensorStore, following the tensor-native paradigm used by
tensor_vault and tensor_blob.
Design Principles
| Principle | Description |
|---|---|
| Multi-Layer Caching | Exact O(1), Semantic O(log n), Embedding O(1) lookups |
| Cost-Aware | Tracks tokens and estimates savings using tiktoken |
| Background Eviction | Async eviction with configurable strategies |
| TTL Expiration | Time-based entry expiration with min-heap tracking |
| Thread-Safe | All operations are concurrent via DashMap |
| Zero Allocation Lookup | Embeddings stored inline, not as pointers |
| Sparse-Aware | Automatic sparse storage for vectors with >50% zeros |
Key Types
Core Types
| Type | Description |
|---|---|
| `Cache` | Main API - multi-layer LLM response cache |
| `CacheConfig` | Configuration (capacity, TTL, eviction, metrics) |
| `CacheHit` | Successful cache lookup result |
| `CacheStats` | Thread-safe statistics with atomic counters |
| `StatsSnapshot` | Point-in-time snapshot for reporting |
| `CacheLayer` | Enum: Exact, Semantic, Embedding |
| `CacheError` | Error types for cache operations |
Configuration Types
| Type | Description |
|---|---|
| `EvictionStrategy` | LRU, LFU, CostBased, Hybrid |
| `EvictionManager` | Background eviction task controller |
| `EvictionScorer` | Calculates eviction priority scores |
| `EvictionHandle` | Handle for controlling background eviction |
| `EvictionConfig` | Interval, batch size, and strategy settings |
Token Counting
| Type | Description |
|---|---|
| `TokenCounter` | GPT-4 compatible token counting via tiktoken |
| `ModelPricing` | Predefined pricing for GPT-4, Claude 3, etc. |
Index Types (Internal)
| Type | Description |
|---|---|
| `CacheIndex` | HNSW wrapper with key-to-node mapping |
| `IndexSearchResult` | Semantic search result with similarity score |
Architecture Diagram
+--------------------------------------------------+
| Cache (Public API) |
| - get(prompt, embedding) -> CacheHit |
| - put(prompt, embedding, response, ...) |
| - stats(), evict(), clear() |
+--------------------------------------------------+
| | |
+-------+ +------+ +------+
| | |
+--------+ +----------+ +-----------+
| Exact | | Semantic | | Embedding |
| Cache | | Cache | | Cache |
| O(1) | | O(log n) | | O(1) |
+--------+ +----------+ +-----------+
| | |
+-------+----+----+------+
|
+------------------+
| CacheIndex |
| (HNSW wrapper) |
+------------------+
|
+------------------+
| tensor_store |
| hnsw.rs |
+------------------+
Multi-Layer Cache Lookup Algorithm
The cache lookup algorithm is designed to maximize hit rates while minimizing latency. It follows a hierarchical approach, checking faster layers first before falling back to more expensive operations.
Lookup Flow Diagram
```mermaid
flowchart TD
    A[get prompt, embedding] --> B{Exact Cache Hit?}
    B -->|Yes| C[Return CacheHit layer=Exact]
    B -->|No| D[Record Exact Miss]
    D --> E{Embedding Provided?}
    E -->|No| F[Return None]
    E -->|Yes| G{Auto-Select Metric?}
    G -->|Yes| H{Sparsity >= Threshold?}
    G -->|No| I[Use Configured Metric]
    H -->|Yes| J[Use Jaccard]
    H -->|No| I
    J --> K[HNSW Search with Metric]
    I --> K
    K --> L{Results Above Threshold?}
    L -->|No| M[Record Semantic Miss]
    M --> F
    L -->|Yes| N{Entry Expired?}
    N -->|Yes| M
    N -->|No| O[Return CacheHit layer=Semantic]
```
Exact Cache Lookup (O(1))
The exact cache uses a hash-based key derived from the prompt text:
```rust
// Key generation using DefaultHasher
fn exact_key(prompt: &str) -> String {
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    let hash = hasher.finish();
    format!("_cache:exact:{:016x}", hash)
}
```
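A stdlib-only, runnable version of the same derivation shows the key property the exact layer relies on: identical prompts map to identical keys within a process, while any textual difference produces a different key. (Note that `DefaultHasher` output is not guaranteed stable across Rust versions, so keys should not be assumed portable between builds.)

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Same construction as exact_key above, using only the standard library
fn exact_key(prompt: &str) -> String {
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    format!("_cache:exact:{:016x}", hasher.finish())
}

fn main() {
    // Identical prompts map to the same key (exact-layer hit)...
    assert_eq!(exact_key("What is 2+2?"), exact_key("What is 2+2?"));
    // ...while any textual difference yields a different key (miss)
    assert_ne!(exact_key("What is 2+2?"), exact_key("what is 2+2?"));
}
```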
The lookup sequence:
- Generate a hash key from the prompt
- Query the `TensorStore` with the key
- Check the expiration timestamp
- Return a hit, or proceed to the semantic lookup
Semantic Cache Lookup (O(log n))
The semantic cache uses HNSW (Hierarchical Navigable Small World) graphs for approximate nearest neighbor search:
```mermaid
flowchart LR
    A[Query Vector] --> B[HNSW Entry Point]
    B --> C[Layer 2: Coarse Search]
    C --> D[Layer 1: Refined Search]
    D --> E[Layer 0: Fine Search]
    E --> F[Top-k Candidates]
    F --> G[Re-score with Metric]
    G --> H[Filter by Threshold]
    H --> I[Return Best Match]
```
Re-scoring Strategy: The HNSW index retrieves candidates using cosine similarity, then re-scores them with the requested metric. This allows using different metrics without rebuilding the index:
```rust
// Retrieve more candidates than needed for re-scoring
let ef = (k * 3).max(10);
let candidates = index.search(query, ef);

// Re-score with the specified metric
let similarity = match &embedding {
    EmbeddingStorage::Dense(dense) => {
        let stored_sparse = SparseVector::from_dense(dense);
        let raw = metric.compute(&query_sparse, &stored_sparse);
        metric.to_similarity(raw)
    }
    EmbeddingStorage::Sparse(sparse) => {
        let raw = metric.compute(&query_sparse, sparse);
        metric.to_similarity(raw)
    }
    // ...handles Delta and TensorTrain storage types
};
```
Automatic Metric Selection
When auto_select_metric is enabled, the cache automatically selects the
optimal distance metric based on embedding sparsity:
```rust
fn select_metric(&self, embedding: &[f32]) -> DistanceMetric {
    if !self.config.auto_select_metric {
        return self.config.distance_metric.clone();
    }
    let sparse = SparseVector::from_dense(embedding);
    if sparse.sparsity() >= self.config.sparsity_metric_threshold {
        DistanceMetric::Jaccard // Better for sparse vectors
    } else {
        self.config.distance_metric.clone() // Default (usually Cosine)
    }
}
```
Cache Layers
Exact Cache (O(1))
Hash-based lookup for identical queries. Keys are generated from the prompt text
using `DefaultHasher` and stored with the prefix `_cache:exact:`.
When to use: Repetitive queries with exact same prompts (e.g., FAQ systems, chatbots with canned responses).
Semantic Cache (O(log n))
HNSW-based similarity search for semantically similar queries. Uses configurable
distance metrics (Cosine, Jaccard, Euclidean, Angular). Stored with the prefix
`_cache:sem:`.
When to use: Natural language queries with variations (e.g., “What’s the weather?” vs “How’s the weather today?”).
Embedding Cache (O(1))
Stores precomputed embeddings to avoid redundant embedding API calls. Keys
combine the source and a content hash. Stored with the prefix `_cache:emb:`.
When to use: When embedding computation is expensive and the same content is embedded multiple times.
Storage Format
Cache entries are stored as TensorData with standardized fields:
| Field | Type | Description |
|---|---|---|
| `_response` | String | Cached response text |
| `_embedding` | Vector/Sparse | Embedding (semantic/embedding layers) |
| `_embedding_dim` | Int | Embedding dimension |
| `_input_tokens` | Int | Input token count |
| `_output_tokens` | Int | Output token count |
| `_model` | String | Model identifier |
| `_layer` | String | Cache layer (exact/semantic/embedding) |
| `_created_at` | Int | Creation timestamp (millis) |
| `_expires_at` | Int | Expiration timestamp (millis) |
| `_access_count` | Int | Access count for LFU |
| `_last_access` | Int | Last access timestamp for LRU |
| `_version` | String | Optional version tag |
| `_source` | String | Embedding source identifier |
| `_content_hash` | Int | Content hash for deduplication |
Sparse Storage Optimization
Embeddings with high sparsity (>50% zeros) are automatically stored in sparse format to reduce memory usage:
```rust
fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    // Use sparse storage if the non-zero count is at most half the length
    nnz * 2 <= vector.len()
}
```
Distance Metrics
Configurable distance metrics for semantic similarity:
| Metric | Best For | Range | Formula |
|---|---|---|---|
| Cosine | Dense embeddings (default) | -1 to 1 | dot(a,b) / (‖a‖ * ‖b‖) |
| Angular | Linear angle relationships | 0 to PI | acos(cosine_sim) |
| Jaccard | Sparse/binary embeddings | 0 to 1 | ‖A ∩ B‖ / ‖A ∪ B‖ |
| Euclidean | Absolute distances | 0 to inf | sqrt(sum((a-b)^2)) |
| WeightedJaccard | Sparse with magnitudes | 0 to 1 | Weighted set similarity |
Auto-selection: When auto_select_metric is true, the cache automatically
selects Jaccard for sparse embeddings (sparsity >= threshold, default 70%) and
the configured metric otherwise.
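The sparsity test behind this selection can be sketched with a stdlib-only example (the free function `sparsity` and the inline threshold stand in for `SparseVector::sparsity()` and the config field):

```rust
// Fraction of near-zero components, mirroring the auto-selection check
fn sparsity(v: &[f32]) -> f64 {
    let zeros = v.iter().filter(|&&x| x.abs() <= 1e-6).count();
    zeros as f64 / v.len() as f64
}

fn main() {
    let threshold = 0.7; // default sparsity_metric_threshold
    let sparse = vec![0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0];
    let dense = vec![0.3; 10];

    // 80% zeros: above the threshold, so Jaccard would be selected
    assert!(sparsity(&sparse) >= threshold);
    // 0% zeros: below the threshold, so the configured metric is kept
    assert!(sparsity(&dense) < threshold);
}
```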
Eviction Strategies
Strategy Comparison
| Strategy | Description | Score Formula | Best For |
|---|---|---|---|
| LRU | Evicts entries that haven’t been accessed recently | -last_access_secs | General purpose |
| LFU | Evicts entries with lowest access count | access_count | Stable workloads |
| CostBased | Evicts entries with lowest cost savings per byte | cost_per_hit / size_bytes | Cost optimization |
| Hybrid | Combines all strategies with configurable weights | Weighted combination | Production systems |
Hybrid Eviction Score Algorithm
The Hybrid strategy combines recency, frequency, and cost factors:
```rust
pub fn score(
    &self,
    last_access_secs: f64,
    access_count: u64,
    cost_per_hit: f64,
    size_bytes: usize,
) -> f64 {
    match self.strategy {
        EvictionStrategy::LRU => -last_access_secs,
        EvictionStrategy::LFU => access_count as f64,
        EvictionStrategy::CostBased => {
            if size_bytes == 0 {
                0.0
            } else {
                cost_per_hit / size_bytes as f64
            }
        }
        EvictionStrategy::Hybrid { lru_weight, lfu_weight, cost_weight } => {
            let total = f64::from(lru_weight) + f64::from(lfu_weight) + f64::from(cost_weight);
            let recency_w = f64::from(lru_weight) / total;
            let frequency_w = f64::from(lfu_weight) / total;
            let cost_w = f64::from(cost_weight) / total;

            let age_minutes = last_access_secs / 60.0;
            let recency_score = 1.0 / (1.0 + age_minutes);            // Decays with age
            let frequency_score = (1.0 + access_count as f64).log2(); // Log scale
            let cost_score = cost_per_hit;

            recency_score * recency_w + frequency_score * frequency_w + cost_score * cost_w
        }
    }
}
```
Lower scores are evicted first. In the hybrid formula:
- `recency_score`: decays as `1 / (1 + age_in_minutes)`, so newer entries score higher
- `frequency_score`: grows logarithmically with access count, so frequently accessed entries score higher
- `cost_score`: the direct cost per hit, so entries that save more cost score higher
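A worked numeric example with the default 40/30/30 weights (the entry profiles below are illustrative):

```rust
// Hybrid score for one entry, matching the formula above
fn hybrid_score(last_access_secs: f64, access_count: u64, cost_per_hit: f64) -> f64 {
    let (lru_w, lfu_w, cost_w) = (40.0, 30.0, 30.0);
    let total = lru_w + lfu_w + cost_w;
    let recency = 1.0 / (1.0 + last_access_secs / 60.0);
    let frequency = (1.0 + access_count as f64).log2();
    recency * (lru_w / total) + frequency * (lfu_w / total) + cost_per_hit * (cost_w / total)
}

fn main() {
    let hot = hybrid_score(30.0, 100, 0.02);    // accessed 30s ago, 100 hits
    let stale = hybrid_score(3600.0, 2, 0.001); // accessed 1h ago, 2 hits
    // The stale, rarely-hit entry has the lower score, so it is evicted first
    assert!(hot > stale);
}
```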
Background Eviction Flow
```mermaid
flowchart TD
    A[EvictionManager::start] --> B[Spawn Tokio Task]
    B --> C[Initialize Interval Timer]
    C --> D{Select Event}
    D -->|Timer Tick| E[Call evict_fn batch_size]
    D -->|Shutdown Signal| F[Set running=false]
    E --> G{Evicted > 0?}
    G -->|Yes| H[Record Eviction Stats]
    G -->|No| D
    H --> D
    F --> I[Break Loop]
```
```rust
// Starting background eviction
let handle = manager.start(move |batch_size| cache.evict(batch_size));

// Later: graceful shutdown
handle.shutdown().await;
```
Configuration
Default Configuration
```rust
CacheConfig {
    exact_capacity: 10_000,
    semantic_capacity: 5_000,
    embedding_capacity: 50_000,
    default_ttl: Duration::from_secs(3600),
    max_ttl: Duration::from_secs(86400),
    semantic_threshold: 0.92,
    embedding_dim: 1536,
    eviction_strategy: EvictionStrategy::Hybrid {
        lru_weight: 40,
        lfu_weight: 30,
        cost_weight: 30,
    },
    eviction_interval: Duration::from_secs(60),
    eviction_batch_size: 100,
    input_cost_per_1k: 0.0015,
    output_cost_per_1k: 0.002,
    inline_threshold: 4096,
    distance_metric: DistanceMetric::Cosine,
    auto_select_metric: true,
    sparsity_metric_threshold: 0.7,
}
```
Configuration Presets
| Preset | Use Case | Exact Capacity | Semantic Capacity | Embedding Capacity | Eviction Batch |
|---|---|---|---|---|---|
| `default()` | General purpose | 10,000 | 5,000 | 50,000 | 100 |
| `high_throughput()` | High-traffic server | 50,000 | 20,000 | 100,000 | 500 |
| `low_memory()` | Memory-constrained | 1,000 | 500 | 5,000 | 50 |
| `development()` | Dev/testing | 100 | 50 | 200 | 10 |
| `sparse_embeddings()` | Sparse vectors | 10,000 | 5,000 | 50,000 | 100 |
Configuration Validation
The config validates on cache creation:
```rust
pub fn validate(&self) -> Result<(), String> {
    if self.semantic_threshold < 0.0 || self.semantic_threshold > 1.0 {
        return Err("semantic_threshold must be between 0.0 and 1.0".into());
    }
    if self.embedding_dim == 0 {
        return Err("embedding_dim must be greater than 0".into());
    }
    if self.eviction_batch_size == 0 {
        return Err("eviction_batch_size must be greater than 0".into());
    }
    if self.default_ttl > self.max_ttl {
        return Err("default_ttl cannot exceed max_ttl".into());
    }
    if self.sparsity_metric_threshold < 0.0 || self.sparsity_metric_threshold > 1.0 {
        return Err("sparsity_metric_threshold must be between 0.0 and 1.0".into());
    }
    Ok(())
}
```
Usage Examples
Basic Usage
```rust
use tensor_cache::{Cache, CacheConfig};

let mut config = CacheConfig::default();
config.embedding_dim = 3;
let cache = Cache::with_config(config).unwrap();

// Store a response
let embedding = vec![0.1, 0.2, 0.3];
cache.put("What is 2+2?", &embedding, "4", "gpt-4", None).unwrap();

// Look up (tries exact first, then semantic)
if let Some(hit) = cache.get("What is 2+2?", Some(&embedding)) {
    println!("Cached: {}", hit.response);
}
```
Explicit Metric Queries
```rust
use tensor_cache::DistanceMetric;

let hit = cache.get_with_metric(
    "query",
    Some(&embedding),
    Some(&DistanceMetric::Euclidean),
);
if let Some(hit) = hit {
    println!("Metric used: {:?}", hit.metric_used);
}
```
Embedding Cache with Compute Fallback
```rust
// Get a cached embedding, or compute it on a miss
let embedding = cache.get_or_compute_embedding(
    "openai",                 // source
    "Hello, world!",          // content
    "text-embedding-3-small", // model
    || {
        // Compute function, called only on a cache miss
        Ok(compute_embedding("Hello, world!"))
    },
)?;
```
Token Counting and Cost Estimation
#![allow(unused)] fn main() { use tensor_cache::{TokenCounter, ModelPricing}; // Count tokens in text let tokens = TokenCounter::count("Hello, world!"); // Count tokens in chat messages (includes overhead) let messages = vec![("user", "Hello"), ("assistant", "Hi there!")]; let total = TokenCounter::count_messages(&messages); // Estimate cost with custom rates let cost = TokenCounter::estimate_cost(1000, 500, 0.01, 0.03); // Use predefined model pricing let pricing = ModelPricing::GPT4O; let cost = pricing.estimate(1000, 500); // Lookup pricing by model name if let Some(pricing) = ModelPricing::for_model("gpt-4o-mini") { println!("Cost: ${:.4}", pricing.estimate(1000, 500)); } }
Statistics and Monitoring
#![allow(unused)] fn main() { let stats = cache.stats_snapshot(); // Hit rates by layer println!("Exact hit rate: {:.2}%", stats.hit_rate(CacheLayer::Exact) * 100.0); println!("Semantic hit rate: {:.2}%", stats.hit_rate(CacheLayer::Semantic) * 100.0); // Tokens and cost saved println!("Input tokens saved: {}", stats.tokens_saved_in); println!("Output tokens saved: {}", stats.tokens_saved_out); println!("Cost saved: ${:.2}", stats.cost_saved_dollars); // Cache utilization println!("Total entries: {}", stats.total_entries()); println!("Evictions: {}", stats.evictions); println!("Expirations: {}", stats.expirations); println!("Uptime: {} seconds", stats.uptime_secs); }
Shared TensorStore Integration
#![allow(unused)] fn main() { use tensor_store::TensorStore; use tensor_cache::{Cache, CacheConfig}; // Share store with other engines let store = TensorStore::new(); let cache = Cache::with_store(store.clone(), CacheConfig::default())?; // Other engines can use the same store let vault = Vault::with_store(store.clone(), VaultConfig::default())?; }
Token Counting Implementation
The TokenCounter uses tiktoken’s cl100k_base encoding, which is compatible
with GPT-4, GPT-3.5-turbo, and text-embedding-ada-002.
Lazy Encoder Initialization
    static CL100K_ENCODER: OnceLock<Option<CoreBPE>> = OnceLock::new();

    impl TokenCounter {
        fn encoder() -> Option<&'static CoreBPE> {
            CL100K_ENCODER
                .get_or_init(|| tiktoken_rs::cl100k_base().ok())
                .as_ref()
        }
    }
Fallback Estimation
If the tiktoken encoder fails to initialize, the counter falls back to a character-based estimate (roughly 4 characters per token for English text):

    const fn estimate_tokens(text: &str) -> usize {
        text.len().div_ceil(4)
    }
Message Token Counting
Chat messages include overhead tokens per message (role markers, separators):
#![allow(unused)] fn main() { pub fn count_message(role: &str, content: &str) -> usize { Self::encoder().map_or_else( || Self::estimate_tokens(role) + Self::estimate_tokens(content) + 4, |enc| { let role_tokens = enc.encode_ordinary(role).len(); let content_tokens = enc.encode_ordinary(content).len(); role_tokens + content_tokens + 4 // 4 tokens overhead per message }, ) } pub fn count_messages(messages: &[(&str, &str)]) -> usize { let mut total = 0; for (role, content) in messages { total += Self::count_message(role, content); } total + 3 // 3 tokens for assistant reply priming } }
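As a worked example of the overhead arithmetic above, the snippet below re-derives the message totals using the character-based fallback path, so the numbers are reproducible without tiktoken installed:

```rust
// Fallback estimator: ceil(len / 4) per string.
fn est(text: &str) -> usize {
    text.len().div_ceil(4)
}

// Per-message cost: role + content + 4 overhead tokens.
fn count_message(role: &str, content: &str) -> usize {
    est(role) + est(content) + 4
}

fn main() {
    let messages = [("user", "Hello"), ("assistant", "Hi there!")];
    // ("user","Hello")       -> 1 + 2 + 4 = 7
    // ("assistant","Hi there!") -> 3 + 3 + 4 = 10
    // +3 tokens for assistant reply priming
    let total: usize = messages.iter().map(|(r, c)| count_message(r, c)).sum::<usize>() + 3;
    assert_eq!(total, 20);
    println!("estimated total: {total} tokens");
}
```

With the real encoder the per-string counts differ, but the `+4` per message and `+3` priming terms are applied the same way.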
Cost Calculation Formulas
#![allow(unused)] fn main() { // Basic cost calculation pub fn estimate_cost( input_tokens: usize, output_tokens: usize, input_rate: f64, // $/1000 tokens output_rate: f64, // $/1000 tokens ) -> f64 { (input_tokens as f64 / 1000.0) * input_rate + (output_tokens as f64 / 1000.0) * output_rate } // For atomic operations (avoids floating point accumulation errors) pub fn estimate_cost_microdollars(...) -> u64 { let dollars = Self::estimate_cost(...); (dollars * 1_000_000.0) as u64 } }
Model Pricing
| Model | Input/1K | Output/1K | Notes |
|---|---|---|---|
| GPT-4o | $0.005 | $0.015 | Best for complex tasks |
| GPT-4o mini | $0.00015 | $0.0006 | Cost-effective |
| GPT-4 Turbo | $0.01 | $0.03 | High capability |
| GPT-3.5 Turbo | $0.0005 | $0.0015 | Budget option |
| Claude 3 Opus | $0.015 | $0.075 | Highest quality |
| Claude 3 Sonnet | $0.003 | $0.015 | Balanced |
| Claude 3 Haiku | $0.00025 | $0.00125 | Fast and cheap |
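Plugging the GPT-4 Turbo row into the per-1K-token formula from the previous section gives a quick sanity check (the helper below re-derives `estimate_cost` for illustration):

```rust
// Cost = (input / 1000) * input_rate + (output / 1000) * output_rate,
// with rates expressed in dollars per 1,000 tokens.
fn estimate_cost(input_tokens: usize, output_tokens: usize, input_rate: f64, output_rate: f64) -> f64 {
    (input_tokens as f64 / 1000.0) * input_rate + (output_tokens as f64 / 1000.0) * output_rate
}

fn main() {
    // GPT-4 Turbo: $0.01 in / $0.03 out per 1K tokens.
    // 1000 input + 500 output -> 1.0 * 0.01 + 0.5 * 0.03 = $0.025
    let cost = estimate_cost(1000, 500, 0.01, 0.03);
    assert!((cost - 0.025).abs() < 1e-9);
    println!("cost = ${cost:.3}");
}
```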
Model Name Matching
#![allow(unused)] fn main() { pub fn for_model(model: &str) -> Option<Self> { let model_lower = model.to_lowercase(); if model_lower.contains("gpt-4o-mini") { Some(Self::GPT4O_MINI) } else if model_lower.contains("gpt-4o") { Some(Self::GPT4O) } else if model_lower.contains("gpt-4-turbo") { Some(Self::GPT4_TURBO) } else if model_lower.contains("gpt-3.5") { Some(Self::GPT35_TURBO) } else if model_lower.contains("claude-3-opus") || model_lower.contains("claude-opus") { Some(Self::CLAUDE3_OPUS) } else if model_lower.contains("claude-3-sonnet") || model_lower.contains("claude-sonnet") { Some(Self::CLAUDE3_SONNET) } else if model_lower.contains("claude-3-haiku") || model_lower.contains("claude-haiku") { Some(Self::CLAUDE3_HAIKU) } else { None } } }
Semantic Search Index Internals
CacheIndex Structure
    pub struct CacheIndex {
        index: RwLock<HNSWIndex>,             // HNSW graph
        config: HNSWConfig,                   // For recreation on clear
        key_to_node: DashMap<String, usize>,  // Cache key -> HNSW node
        node_to_key: DashMap<usize, String>,  // HNSW node -> Cache key
        dimension: usize,                     // Expected embedding dimension
        entry_count: AtomicUsize,             // Entry count
        distance_metric: DistanceMetric,      // Default metric
    }
Insert Strategies
    // Dense embedding insert
    pub fn insert(&self, key: &str, embedding: &[f32]) -> Result<usize>;

    // Sparse embedding insert (memory efficient)
    pub fn insert_sparse(&self, key: &str, embedding: &SparseVector) -> Result<usize>;

    // Auto-select based on sparsity threshold
    pub fn insert_auto(
        &self,
        key: &str,
        embedding: &[f32],
        sparsity_threshold: f32,
    ) -> Result<usize>;
Key Orphaning on Re-insert
When a key is re-inserted, the old HNSW node is orphaned (not deleted) because HNSW doesn’t support efficient deletion:
    let is_new = !self.key_to_node.contains_key(key);
    if !is_new {
        // Remove mapping but leave HNSW node (will be ignored in search)
        self.key_to_node.remove(key);
    }
Memory Statistics
    pub fn memory_stats(&self) -> Option<HNSWMemoryStats> {
        self.index.read().ok().map(|index| index.memory_stats())
    }
    // Returns: dense_count, sparse_count, delta_count, embedding_bytes, etc.
Error Types
| Error | Description | Recovery |
|---|---|---|
| `NotFound` | Cache entry not found | Check key exists |
| `DimensionMismatch` | Embedding dimension does not match config | Verify embedding size |
| `StorageError` | Underlying tensor store error | Check store health |
| `SerializationError` | Serialization/deserialization failed | Verify data format |
| `TokenizerError` | Token counting failed | Falls back to estimation |
| `CacheFull` | Cache capacity exceeded | Run eviction or increase capacity |
| `InvalidConfig` | Invalid configuration provided | Fix config values |
| `Cancelled` | Operation was cancelled | Retry operation |
| `LockPoisoned` | Internal lock was poisoned | Restart cache |
Error Conversion
    impl From<tensor_store::TensorStoreError> for CacheError {
        fn from(e: TensorStoreError) -> Self {
            Self::StorageError(e.to_string())
        }
    }

    impl From<bitcode::Error> for CacheError {
        fn from(e: bitcode::Error) -> Self {
            Self::SerializationError(e.to_string())
        }
    }
Performance
Benchmarks (10,000 entries, 128-dim embeddings)
| Operation | Time | Notes |
|---|---|---|
| Exact lookup (hit) | ~50ns | Hash lookup + TensorStore get |
| Exact lookup (miss) | ~30ns | Hash lookup only |
| Semantic lookup | ~5us | HNSW search + re-scoring |
| Put (exact + semantic) | ~10us | Two stores + HNSW insert |
| Eviction (100 entries) | ~200us | Batch deletion |
| Clear (full index) | ~1ms | HNSW recreation |
Distance Metric Performance (128-dim, 1000 entries)
| Metric | Search Time | Notes |
|---|---|---|
| Cosine | 21 us | Default, best for dense |
| Jaccard | 18 us | Best for sparse |
| Angular | 23 us | +acos overhead |
| Euclidean | 19 us | Absolute distance |
Auto-Selection Overhead
| Operation | Time |
|---|---|
| Sparsity check | ~50 ns |
| Metric selection | ~10 ns |
Memory Efficiency
| Storage Type | Memory per Entry | Best For |
|---|---|---|
| Dense Vector | 4 * dim bytes | Low sparsity (<50% zeros) |
| Sparse Vector | 8 * nnz bytes | High sparsity (>50% zeros) |
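The break-even point follows directly from the table: `8 * nnz` bytes beats `4 * dim` bytes exactly when fewer than half the values are non-zero. A quick check:

```rust
// Per-entry footprint from the table above.
fn dense_bytes(dim: usize) -> usize {
    4 * dim // 4 bytes per f32
}
fn sparse_bytes(nnz: usize) -> usize {
    8 * nnz // index + value per non-zero
}

fn main() {
    let dim = 128;
    assert_eq!(dense_bytes(dim), 512);
    // ~90% zeros (nnz = 13): roughly 5x smaller than dense.
    assert!(sparse_bytes(13) < dense_bytes(dim));
    // Exactly 50% zeros is the break-even point: 8 * 64 == 4 * 128.
    assert_eq!(sparse_bytes(dim / 2), dense_bytes(dim));
}
```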
Edge Cases and Gotchas
TTL Behavior
- Entries with `expires_at = 0` never expire
- Expired entries return `None` on lookup but remain in storage until cleanup
- `cleanup_expired()` must be called explicitly or run via background eviction
Capacity Limits
- `put()` fails with `CacheFull` when capacity is reached
- Capacity is checked per layer (exact, semantic, embedding)
- No automatic eviction on put; eviction must be requested explicitly
Hash Collisions
- Extremely unlikely with 64-bit hashes (~1 in 18 quintillion)
- If collision occurs, exact cache will return wrong response
- Semantic cache provides fallback for semantically different queries
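The first claim can be made concrete with the birthday bound: for 64-bit hashes, the probability of any collision among n distinct prompts is roughly n^2 / 2^65. This is an approximation computed here purely for illustration:

```rust
// Birthday-bound approximation: P(collision) ~ n^2 / 2^65 for 64-bit hashes.
fn collision_probability(n: f64) -> f64 {
    n * n / 2f64.powi(65)
}

fn main() {
    // Even at 100 million cached prompts the odds stay below 0.1%.
    assert!(collision_probability(1e8) < 1e-3);
    // At a million prompts they are below one in ten million.
    assert!(collision_probability(1e6) < 1e-7);
}
```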
Metric Re-scoring
- HNSW always uses cosine similarity for graph navigation
- Re-scoring with different metrics may change result order
- Retrieves 3x candidates to account for re-ranking
Sparse Storage Threshold
- Uses sparse format when `nnz * 2 <= len` (at least 50% zeros)
- Different from the auto-metric selection threshold (default 70%)
- Both thresholds are configurable
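The `nnz * 2 <= len` rule can be sketched as a standalone predicate (a hypothetical helper, not part of the tensor_cache API):

```rust
// Store sparse when non-zero count * 2 <= length, i.e. at least 50% zeros.
fn use_sparse_storage(v: &[f32]) -> bool {
    let nnz = v.iter().filter(|x| **x != 0.0).count();
    nnz * 2 <= v.len()
}

fn main() {
    assert!(use_sparse_storage(&[0.0, 0.0, 0.0, 1.0]));  // 75% zeros -> sparse
    assert!(use_sparse_storage(&[0.0, 1.0, 0.0, 2.0]));  // exactly 50% zeros -> sparse
    assert!(!use_sparse_storage(&[1.0, 2.0, 3.0, 0.0])); // 25% zeros -> dense
}
```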
Performance Tips and Best Practices
Configuration Tuning
- Semantic Threshold: start with 0.92; lower toward 0.85 for fuzzier matching
- Eviction Weights: increase `cost_weight` if API costs matter most
- Batch Size: use larger batches (500+) for high-throughput systems
- TTL: match TTLs to your content freshness requirements
Memory Optimization
- Use the `sparse_embeddings()` preset for sparse data
- Set `inline_threshold` based on typical response sizes
- Enable `auto_select_metric` for mixed workloads
- Monitor `memory_stats()` to track the sparse vs dense ratio
Hit Rate Optimization
- Normalize prompts before caching (lowercase, trim whitespace)
- Use versioning for model/prompt template changes
- Set appropriate semantic threshold for your domain
- Consider domain-specific embeddings
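A minimal normalizer along the lines of the first suggestion might look like this (illustrative helper, not part of the documented API): lowercase, trim, and collapse internal whitespace so equivalent phrasings map to the same exact-cache key.

```rust
// Canonicalize a prompt before using it as a cache key.
fn normalize_prompt(prompt: &str) -> String {
    prompt
        .to_lowercase()
        .split_whitespace() // also trims leading/trailing whitespace
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    assert_eq!(normalize_prompt("  What is  2+2? "), "what is 2+2?");
    // Equivalent phrasings now produce identical keys.
    assert_eq!(normalize_prompt("WHAT IS 2+2?"), normalize_prompt("what is 2+2?"));
}
```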
Cost Tracking
- Use `estimate_cost_microdollars()` for atomic accumulation
- Record cost per cache hit for ROI analysis
- Compare `tokens_saved` against capacity costs
Shell Commands
| Command | Description |
|---|---|
| `CACHE INIT` | Initialize semantic cache |
| `CACHE STATS` | Show cache statistics |
| `CACHE CLEAR` | Clear all cache entries |
API Reference
Cache Methods
| Method | Description |
|---|---|
| `new()` | Create with default config |
| `with_config(config)` | Create with custom config |
| `with_store(store, config)` | Create with shared TensorStore |
| `get(prompt, embedding)` | Look up cached response |
| `get_with_metric(prompt, embedding, metric)` | Look up with explicit metric |
| `put(prompt, embedding, response, model, ttl)` | Store response |
| `get_embedding(source, content)` | Get cached embedding |
| `put_embedding(source, content, embedding, model)` | Store embedding |
| `get_or_compute_embedding(source, content, model, compute)` | Get or compute embedding |
| `get_simple(key)` | Simple key-value lookup |
| `put_simple(key, value)` | Simple key-value store |
| `invalidate(prompt)` | Remove exact entry |
| `invalidate_version(version)` | Remove entries by version |
| `invalidate_embeddings(source)` | Remove embeddings by source |
| `evict(count)` | Manually evict entries |
| `cleanup_expired()` | Remove expired entries |
| `clear()` | Clear all entries |
| `stats()` | Get statistics reference |
| `stats_snapshot()` | Get statistics snapshot |
| `config()` | Get configuration reference |
| `len()` | Total cached entries |
| `is_empty()` | Check if cache is empty |
CacheHit Fields
| Field | Type | Description |
|---|---|---|
| `response` | `String` | Cached response text |
| `layer` | `CacheLayer` | Which layer matched |
| `similarity` | `Option<f32>` | Similarity score (semantic only) |
| `input_tokens` | `usize` | Input tokens saved |
| `output_tokens` | `usize` | Output tokens saved |
| `cost_saved` | `f64` | Estimated cost saved (dollars) |
| `metric_used` | `Option<DistanceMetric>` | Metric used (semantic only) |
StatsSnapshot Fields
| Field | Type | Description |
|---|---|---|
| `exact_hits` | `u64` | Exact cache hits |
| `exact_misses` | `u64` | Exact cache misses |
| `semantic_hits` | `u64` | Semantic cache hits |
| `semantic_misses` | `u64` | Semantic cache misses |
| `embedding_hits` | `u64` | Embedding cache hits |
| `embedding_misses` | `u64` | Embedding cache misses |
| `tokens_saved_in` | `u64` | Total input tokens saved |
| `tokens_saved_out` | `u64` | Total output tokens saved |
| `cost_saved_dollars` | `f64` | Total cost saved |
| `evictions` | `u64` | Total evictions |
| `expirations` | `u64` | Total expirations |
| `exact_size` | `usize` | Current exact cache size |
| `semantic_size` | `usize` | Current semantic cache size |
| `embedding_size` | `usize` | Current embedding cache size |
| `uptime_secs` | `u64` | Cache uptime in seconds |
Dependencies
| Crate | Purpose |
|---|---|
| `tensor_store` | HNSW index implementation, TensorStore |
| `tiktoken-rs` | GPT-compatible token counting |
| `dashmap` | Concurrent hash maps |
| `tokio` | Async runtime for background eviction |
| `uuid` | Unique ID generation |
| `thiserror` | Error type derivation |
| `serde` | Configuration serialization |
| `bincode` | Binary serialization |
Related Modules
- `tensor_store` - Backing storage and HNSW index
- `query_router` - Cache integration for query execution
- `neumann_shell` - CLI commands for cache management
Tensor Blob Architecture
S3-style object storage for large artifacts using content-addressable chunked storage with tensor-native metadata. Artifacts are split into SHA-256 hashed chunks for automatic deduplication, with metadata stored in the tensor store for integration with graph, relational, and vector queries.
All I/O operations are async via Tokio. Large files are streamed through
BlobWriter and BlobReader without loading entirely into memory. Background
garbage collection removes orphaned chunks automatically.
Key Types
Core Types
| Type | Description |
|---|---|
| `BlobStore` | Main API for storing, retrieving, and managing artifacts |
| `BlobConfig` | Configuration for chunk size, GC intervals, and limits |
| `BlobWriter` | Streaming upload with incremental chunking and hash computation |
| `BlobReader` | Streaming download with chunk-by-chunk reads and verification |
| `Chunk` | Content-addressed data segment with SHA-256 hash |
| `Chunker` | Splits data into fixed-size content-addressable chunks |
| `StreamingHasher` | Incremental SHA-256 computation for large files |
| `GarbageCollector` | Background task for cleaning orphaned chunks |
Metadata Types
| Type | Description |
|---|---|
| `ArtifactMetadata` | Full metadata including filename, size, checksum, links, tags |
| `PutOptions` | Upload options: content type, creator, links, tags, custom metadata, embedding |
| `MetadataUpdates` | Partial updates for filename, content type, custom fields |
| `SimilarArtifact` | Search result with artifact ID, filename, and similarity score |
| `WriteState` | Internal state tracking artifact metadata during streaming upload |
Statistics Types
| Type | Description |
|---|---|
| `BlobStats` | Storage statistics: artifact count, chunk count, dedup ratio, orphaned chunks |
| `GcStats` | GC results: chunks deleted, bytes freed |
| `RepairStats` | Repair results: artifacts checked, chunks verified, refs fixed, orphans deleted |
Error Types
| Error | Description |
|---|---|
| `NotFound` | Artifact does not exist |
| `ChunkMissing` | Referenced chunk not found in storage |
| `ChecksumMismatch` | Data corruption detected during verification |
| `EmptyData` | Cannot store empty artifact |
| `InvalidConfig` | Invalid configuration parameter (e.g., zero chunk size) |
| `InvalidArtifactId` | Malformed artifact ID format |
| `StorageError` | Underlying tensor store error |
| `GraphError` | Graph engine integration error (feature-gated) |
| `VectorError` | Vector engine integration error (feature-gated) |
| `IoError` | I/O error during streaming operations |
| `GcError` | Garbage collection failure |
| `AlreadyExists` | Artifact with given ID already exists |
| `DimensionMismatch` | Embedding dimension mismatch |
Architecture Diagram
+--------------------------------------------------+
| BlobStore (Public API) |
| - put, get, delete, exists |
| - metadata, update_metadata |
| - link, unlink, tag, untag |
| - verify, repair, gc, full_gc |
+--------------------------------------------------+
| | |
+-------+ +-------+ +-------+
| | |
+--------+ +-----------+ +----------+
| Writer | | Reader | | GC |
| Stream | | Stream | | (Tokio) |
+--------+ +-----------+ +----------+
| | |
+-------+------+------+-------+
|
+------------------+
| Chunker |
| SHA-256 hash |
+------------------+
|
+------------------+
| tensor_store |
| _blob:meta:* |
| _blob:chunk:* |
+------------------+
Storage Format
Artifact Metadata
Stored at _blob:meta:{artifact_id}:
| Field | Type | Description |
|---|---|---|
| `_type` | String | Always "blob_artifact" |
| `_id` | String | Unique artifact identifier (UUID v4) |
| `_filename` | String | Original filename |
| `_content_type` | String | MIME type |
| `_size` | Int | Total size in bytes |
| `_checksum` | String | SHA-256 hash of full content (`sha256:{hex}`) |
| `_chunk_size` | Int | Size of each chunk (except possibly the last) |
| `_chunk_count` | Int | Number of chunks |
| `_chunks` | Pointers | Ordered list of chunk keys |
| `_created` | Int | Unix timestamp (seconds) |
| `_modified` | Int | Unix timestamp (seconds) |
| `_created_by` | String | Creator identity |
| `_linked_to` | Pointers | Linked entity IDs |
| `_tags` | Pointers | Applied tags (prefixed with `tag:`) |
| `_meta:*` | String | Custom metadata fields |
| `_embedding` | Vector/Sparse | Optional embedding (sparse if >50% zeros) |
| `_embedded_model` | String | Embedding model name |
Chunk Data
Stored at _blob:chunk:sha256:{hex}:
| Field | Type | Description |
|---|---|---|
| `_type` | String | Always "blob_chunk" |
| `_data` | Bytes | Raw chunk data |
| `_size` | Int | Chunk size in bytes |
| `_refs` | Int | Reference count for deduplication |
| `_created` | Int | Unix timestamp (seconds) |
Content-Addressable Chunking Algorithm
The chunker uses a fixed-size chunking strategy with SHA-256 content addressing:
flowchart TD
A[Input Data] --> B[Split into fixed-size chunks]
B --> C{For each chunk}
C --> D[Compute SHA-256 hash]
D --> E{Chunk exists?}
E -->|Yes| F[Increment ref count]
E -->|No| G[Store new chunk]
F --> H[Record chunk key]
G --> H
H --> C
C -->|Done| I[Compute full-file checksum]
I --> J[Store metadata with chunk list]
Chunker Implementation
#![allow(unused)] fn main() { // Chunker splits data into fixed-size segments pub struct Chunker { chunk_size: usize, // Default: 1MB (1,048,576 bytes) } impl Chunker { // Split data into chunks using Rust's chunks() iterator pub fn chunk<'a>(&'a self, data: &'a [u8]) -> impl Iterator<Item = Chunk> + 'a { data.chunks(self.chunk_size).map(|chunk_data| { let hash = compute_hash(chunk_data); Chunk { hash, data: chunk_data.to_vec(), size: chunk_data.len(), } }) } // Count chunks without allocating (useful for progress estimation) pub fn chunk_count(&self, data_len: usize) -> usize { if data_len == 0 { 0 } else { data_len.div_ceil(self.chunk_size) } } } }
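The `chunk_count` arithmetic above can be spot-checked in isolation:

```rust
// ceil(data_len / chunk_size), with the zero-length special case.
fn chunk_count(data_len: usize, chunk_size: usize) -> usize {
    if data_len == 0 { 0 } else { data_len.div_ceil(chunk_size) }
}

fn main() {
    let mb = 1_048_576; // default 1 MB chunk size
    assert_eq!(chunk_count(0, mb), 0);          // empty input -> no chunks
    assert_eq!(chunk_count(mb, mb), 1);         // exactly one chunk
    assert_eq!(chunk_count(mb + 1, mb), 2);     // one full chunk + 1-byte tail
    assert_eq!(chunk_count(5 * mb / 2, mb), 3); // 2.5 MB -> 3 chunks
}
```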
Chunk Key Format
Chunk keys follow a deterministic format for content addressing:
_blob:chunk:sha256:{64_hex_chars}
Example:
_blob:chunk:sha256:b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
SHA-256 Checksum Computation
The system uses the sha2 crate for cryptographic hashing:
#![allow(unused)] fn main() { use sha2::{Digest, Sha256}; // Single-shot hash for chunk content pub fn compute_hash(data: &[u8]) -> String { let mut hasher = Sha256::new(); hasher.update(data); let result = hasher.finalize(); format!("sha256:{:x}", result) // Lowercase hex encoding } // Streaming hash for large files (used by BlobWriter) pub struct StreamingHasher { hasher: Sha256, } impl StreamingHasher { pub fn new() -> Self { Self { hasher: Sha256::new() } } pub fn update(&mut self, data: &[u8]) { self.hasher.update(data); } pub fn finalize(self) -> String { let result = self.hasher.finalize(); format!("sha256:{:x}", result) } } // Multi-segment hash (for verification) pub fn compute_hash_streaming<'a>(segments: impl Iterator<Item = &'a [u8]>) -> String { let mut hasher = Sha256::new(); for segment in segments { hasher.update(segment); } let result = hasher.finalize(); format!("sha256:{:x}", result) } }
Content-Addressable Deduplication
Chunks are keyed by SHA-256 hash, enabling automatic deduplication:
- When writing data, the chunker splits it into fixed-size segments (default 1MB)
- Each chunk is hashed with SHA-256 to produce a unique key
- If the chunk already exists, only the reference count is incremented
- Identical data across different artifacts shares the same physical chunks
    let data = vec![0u8; 10_000];

    // Store same data twice
    blob.put("file1.bin", &data, PutOptions::default()).await?;
    blob.put("file2.bin", &data, PutOptions::default()).await?;

    let stats = blob.stats().await?;
    // stats.chunk_count = 1 (deduplicated)
    // stats.dedup_ratio > 0.0
Deduplication Ratio Calculation
    let dedup_ratio = if total_bytes > 0 {
        1.0 - (unique_bytes as f64 / total_bytes as f64)
    } else {
        0.0
    };
A ratio of 0.5 means 50% space savings through deduplication.
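A quick check of the formula against the two-identical-files scenario from the previous section:

```rust
// Dedup ratio: fraction of logical bytes that did not need unique storage.
fn dedup_ratio(unique_bytes: u64, total_bytes: u64) -> f64 {
    if total_bytes > 0 {
        1.0 - (unique_bytes as f64 / total_bytes as f64)
    } else {
        0.0
    }
}

fn main() {
    // Two identical 10 KB artifacts: 20 KB logical, 10 KB unique chunks.
    assert_eq!(dedup_ratio(10_000, 20_000), 0.5); // 50% space savings
    assert_eq!(dedup_ratio(10_000, 10_000), 0.0); // no duplication
    assert_eq!(dedup_ratio(0, 0), 0.0);           // empty store
}
```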
Streaming Upload State Machine
The BlobWriter manages incremental uploads with proper buffering:
stateDiagram-v2
[*] --> Created: new()
Created --> Buffering: write()
Buffering --> Buffering: write() [buffer < chunk_size]
Buffering --> ChunkReady: write() [buffer >= chunk_size]
ChunkReady --> StoreChunk: drain buffer
StoreChunk --> CheckExists: compute hash
CheckExists --> IncrementRefs: chunk exists
CheckExists --> StoreNew: chunk new
IncrementRefs --> Buffering
StoreNew --> Buffering
Buffering --> FlushFinal: finish()
FlushFinal --> StoreMetadata: store remaining buffer
StoreMetadata --> [*]: return artifact_id
BlobWriter Internal State
    pub struct BlobWriter {
        store: TensorStore,
        chunker: Chunker,
        state: WriteState,        // Artifact metadata (filename, content_type, etc.)
        chunks: Vec<String>,      // Ordered list of chunk keys
        total_size: usize,        // Running total of bytes written
        hasher: StreamingHasher,  // Incremental full-file hash
        buffer: Vec<u8>,          // Incomplete chunk buffer
    }
Write Operation Flow
#![allow(unused)] fn main() { pub async fn write(&mut self, data: &[u8]) -> Result<()> { if data.is_empty() { return Ok(()); } // 1. Update full-file hash (computed independently of chunking) self.hasher.update(data); self.total_size += data.len(); // 2. Add to internal buffer self.buffer.extend_from_slice(data); // 3. Process complete chunks (may be multiple if large write) while self.buffer.len() >= self.chunker.chunk_size() { let chunk_data: Vec<u8> = self.buffer.drain(..self.chunker.chunk_size()).collect(); let chunk = Chunk::new(chunk_data); self.store_chunk(chunk).await?; } Ok(()) } }
Finish Operation
#![allow(unused)] fn main() { pub async fn finish(mut self) -> Result<String> { // 1. Flush remaining buffer as final (possibly smaller) chunk if !self.buffer.is_empty() { let chunk = Chunk::new(std::mem::take(&mut self.buffer)); self.store_chunk(chunk).await?; } // 2. Finalize full-file checksum let checksum = self.hasher.finalize(); // 3. Build and store metadata tensor let mut tensor = TensorData::new(); tensor.set("_type", "blob_artifact"); tensor.set("_id", self.state.artifact_id.clone()); tensor.set("_checksum", checksum); tensor.set("_chunks", TensorValue::Pointers(self.chunks)); // ... additional fields ... let meta_key = format!("_blob:meta:{}", self.state.artifact_id); self.store.put(&meta_key, tensor)?; Ok(self.state.artifact_id) } }
Streaming Download State Machine
The BlobReader manages incremental downloads with chunk-level iteration:
stateDiagram-v2
[*] --> Initialized: new()
Initialized --> LoadMetadata: read chunk list
LoadMetadata --> Ready: chunks loaded
Ready --> ReadChunk: next_chunk()
ReadChunk --> ChunkLoaded: fetch from store
ChunkLoaded --> Ready: return data
Ready --> [*]: all chunks read
Ready --> Verify: verify()
Verify --> HashAll: reset and hash all chunks
HashAll --> Compare: compare checksums
Compare --> [*]: return bool
BlobReader Internal State
    pub struct BlobReader {
        store: TensorStore,
        chunks: Vec<String>,            // Ordered list of chunk keys
        current_chunk: usize,           // Index of next chunk to read
        current_data: Option<Vec<u8>>,  // Cached current chunk for read()
        current_offset: usize,          // Offset within current_data
        total_size: usize,              // Total artifact size
        bytes_read: usize,              // Bytes read so far
        checksum: String,               // Expected checksum for verification
    }
Read Modes
#![allow(unused)] fn main() { // Mode 1: Chunk-at-a-time (best for processing in batches) while let Some(chunk) = reader.next_chunk().await? { process_chunk(&chunk); } // Mode 2: Read all into memory (convenient for small files) let data = reader.read_all().await?; // Mode 3: Buffer-based reading (for streaming to other APIs) let mut buf = vec![0u8; 4096]; loop { let n = reader.read(&mut buf).await?; if n == 0 { break; } output.write_all(&buf[..n])?; } }
Garbage Collection Reference Counting
The GC system uses reference counting with two operational modes:
Reference Count Management
#![allow(unused)] fn main() { // When storing a chunk (in BlobWriter::store_chunk) if self.store.exists(&chunk_key) { // Chunk already exists - just increment ref count increment_chunk_refs(&self.store, &chunk_key)?; } else { // New chunk - store with ref count of 1 let mut tensor = TensorData::new(); tensor.set("_refs", TensorValue::Scalar(ScalarValue::Int(1))); // ... store chunk data ... } // When deleting an artifact pub fn delete_artifact(store: &TensorStore, artifact_id: &str) -> Result<()> { let tensor = store.get(&meta_key)?; if let Some(chunks) = get_pointers(&tensor, "_chunks") { for chunk_key in chunks { decrement_chunk_refs(store, &chunk_key)?; // Saturating at 0 } } store.delete(&meta_key)?; Ok(()) } }
Incremental GC (gc())
Processes a limited batch of chunks per cycle, respecting age requirements:
flowchart TD
A[Start GC Cycle] --> B[Scan chunk keys]
B --> C{Take batch_size chunks}
C --> D{For each chunk}
D --> E{refs == 0?}
E -->|No| D
E -->|Yes| F{age > min_age?}
F -->|No| D
F -->|Yes| G[Delete chunk]
G --> H[Track freed bytes]
H --> D
D -->|Done| I[Return GcStats]
#![allow(unused)] fn main() { pub async fn gc_cycle(&self) -> GcStats { let mut deleted = 0; let mut freed_bytes = 0; let now = current_timestamp(); let min_created = now.saturating_sub(self.config.min_age.as_secs()); let chunk_keys = self.store.scan("_blob:chunk:"); for chunk_key in chunk_keys.into_iter().take(self.config.batch_size) { if let Ok(tensor) = self.store.get(&chunk_key) { let refs = get_int(&tensor, "_refs").unwrap_or(0); let created = get_int(&tensor, "_created").unwrap_or(0) as u64; // Zero refs AND old enough if refs == 0 && created < min_created { let size = get_int(&tensor, "_size").unwrap_or(0) as usize; if self.store.delete(&chunk_key).is_ok() { deleted += 1; freed_bytes += size; } } } } GcStats { deleted, freed_bytes } } }
Full GC (full_gc())
Rebuilds reference counts from scratch and deletes all unreferenced chunks:
flowchart TD
A[Start Full GC] --> B[Build reference set from all artifacts]
B --> C[Scan all artifact metadata]
C --> D[Extract chunk lists]
D --> E[Add to HashSet]
E --> C
C -->|Done| F[Scan all chunks]
F --> G{Chunk in reference set?}
G -->|Yes| F
G -->|No| H[Delete chunk]
H --> I[Track freed bytes]
I --> F
F -->|Done| J[Return GcStats]
#![allow(unused)] fn main() { pub async fn full_gc(&self) -> Result<GcStats> { // Phase 1: Build reference set from all artifacts let mut referenced: HashSet<String> = HashSet::new(); for meta_key in self.store.scan("_blob:meta:") { if let Ok(tensor) = self.store.get(&meta_key) { if let Some(chunks) = get_pointers(&tensor, "_chunks") { referenced.extend(chunks); } } } // Phase 2: Delete unreferenced chunks (ignores age requirement) let mut deleted = 0; let mut freed_bytes = 0; for chunk_key in self.store.scan("_blob:chunk:") { if !referenced.contains(&chunk_key) { if let Ok(tensor) = self.store.get(&chunk_key) { let size = get_int(&tensor, "_size").unwrap_or(0) as usize; if self.store.delete(&chunk_key).is_ok() { deleted += 1; freed_bytes += size; } } } } Ok(GcStats { deleted, freed_bytes }) } }
Background GC Task
#![allow(unused)] fn main() { pub fn start(self: Arc<Self>) -> JoinHandle<()> { let gc = Arc::clone(&self); tokio::spawn(async move { gc.run().await; }) } async fn run(&self) { let mut interval = interval(self.config.check_interval); let mut shutdown_rx = self.shutdown_tx.subscribe(); loop { tokio::select! { _ = interval.tick() => { let _ = self.gc_cycle().await; } _ = shutdown_rx.recv() => { break; } } } } }
Integrity Repair Algorithm
The repair operation fixes reference count inconsistencies and removes orphans:
flowchart TD
A[Start Repair] --> B[Phase 1: Build true reference counts]
B --> C[Scan all artifacts]
C --> D[Count chunk references]
D --> E[Build HashMap chunk -> count]
E --> F[Phase 2: Verify and fix chunks]
F --> G[Scan all chunks]
G --> H{Current refs == expected?}
H -->|Yes| I{Expected refs == 0?}
H -->|No| J[Update refs to expected]
J --> I
I -->|Yes| K[Mark as orphan]
I -->|No| G
K --> G
G -->|Done| L[Phase 3: Delete orphans]
L --> M[Delete marked chunks]
M --> N[Return RepairStats]
Repair Implementation
#![allow(unused)] fn main() { pub async fn repair(store: &TensorStore) -> Result<RepairStats> { let mut stats = RepairStats::default(); // Phase 1: Build true reference counts from all artifacts let mut true_refs: HashMap<String, i64> = HashMap::new(); for meta_key in store.scan("_blob:meta:") { stats.artifacts_checked += 1; if let Ok(tensor) = store.get(&meta_key) { if let Some(chunks) = get_pointers(&tensor, "_chunks") { for chunk_key in chunks { *true_refs.entry(chunk_key).or_insert(0) += 1; } } } } // Phase 2: Verify and fix reference counts let mut orphan_keys = Vec::new(); for chunk_key in store.scan("_blob:chunk:") { stats.chunks_verified += 1; if let Ok(mut tensor) = store.get(&chunk_key) { let current_refs = get_int(&tensor, "_refs").unwrap_or(0); let expected_refs = true_refs.get(&chunk_key).copied().unwrap_or(0); if current_refs != expected_refs { tensor.set("_refs", TensorValue::Scalar(ScalarValue::Int(expected_refs))); store.put(&chunk_key, tensor)?; stats.refs_fixed += 1; } if expected_refs == 0 { orphan_keys.push(chunk_key); } } } // Phase 3: Delete orphans for orphan_key in orphan_keys { if store.delete(&orphan_key).is_ok() { stats.orphans_deleted += 1; } } Ok(stats) } }
Artifact Verification
#![allow(unused)] fn main() { pub async fn verify_artifact(store: &TensorStore, artifact_id: &str) -> Result<bool> { let meta_key = format!("_blob:meta:{artifact_id}"); let tensor = store.get(&meta_key)?; let expected_checksum = get_string(&tensor, "_checksum")?; let chunks = get_pointers(&tensor, "_chunks")?; // Recompute checksum by hashing all chunks in order let mut hasher = StreamingHasher::new(); for chunk_key in &chunks { let chunk_tensor = store.get(chunk_key)?; let chunk_data = get_bytes(&chunk_tensor, "_data")?; hasher.update(&chunk_data); } let actual_checksum = hasher.finalize(); Ok(actual_checksum == expected_checksum) } // Verify individual chunk integrity pub fn verify_chunk(store: &TensorStore, chunk_key: &str) -> Result<bool> { let expected_hash = chunk_key.strip_prefix("_blob:chunk:")?; let tensor = store.get(chunk_key)?; let data = get_bytes(&tensor, "_data")?; let actual_hash = compute_hash(&data); Ok(actual_hash == expected_hash) } }
Usage Examples
Basic Storage
#![allow(unused)] fn main() { use tensor_blob::{BlobStore, BlobConfig, PutOptions}; use tensor_store::TensorStore; let store = TensorStore::new(); let blob = BlobStore::new(store, BlobConfig::default()).await?; // Store an artifact let artifact_id = blob.put( "report.pdf", &file_bytes, PutOptions::new() .with_created_by("user:alice") .with_tag("quarterly") .with_link("task:123"), ).await?; // Retrieve it let data = blob.get(&artifact_id).await?; // Get metadata let meta = blob.metadata(&artifact_id).await?; }
Streaming API
#![allow(unused)] fn main() { // Streaming upload (memory-efficient for large files) let mut writer = blob.writer("large_file.bin", PutOptions::default()).await?; for chunk in file_chunks { writer.write(&chunk).await?; } let artifact_id = writer.finish().await?; // Streaming download let mut reader = blob.reader(&artifact_id).await?; while let Some(chunk) = reader.next_chunk().await? { process_chunk(&chunk); } // Verify integrity after download let mut reader = blob.reader(&artifact_id).await?; let valid = reader.verify().await?; }
Entity Linking and Tagging
```rust
// Link artifact to entities
blob.link(&artifact_id, "user:alice").await?;
blob.link(&artifact_id, "task:123").await?;

// Find artifacts linked to an entity
let artifacts = blob.artifacts_for("user:alice").await?;

// Add tags
blob.tag(&artifact_id, "important").await?;

// Find artifacts by tag
let important_files = blob.by_tag("important").await?;
```
Semantic Search (with vector feature)
```rust
// Set embedding for artifact
blob.set_embedding(&artifact_id, embedding, "text-embedding-3-small").await?;

// Find similar artifacts
let similar = blob.similar(&artifact_id, 10).await?;
```
Configuration Options
| Option | Default | Description |
|---|---|---|
| chunk_size | 1 MB (1,048,576 bytes) | Size of each chunk in bytes |
| max_artifact_size | None (unlimited) | Maximum artifact size limit |
| max_artifacts | None (unlimited) | Maximum number of artifacts |
| gc_interval | 5 minutes (300s) | Background GC check frequency |
| gc_batch_size | 100 | Chunks processed per GC cycle |
| gc_min_age | 1 minute (60s) | Minimum age before GC eligible |
| default_content_type | application/octet-stream | Default MIME type |
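The relationship between chunk_size and artifact size is a simple ceiling division: an artifact of N bytes occupies ⌈N / chunk_size⌉ chunks, with only the final chunk partially filled. A minimal std-only sketch (`chunk_count` is a hypothetical helper, not part of the tensor_blob API):

```rust
/// Number of chunks needed to store `size` bytes at the given chunk size.
fn chunk_count(size: u64, chunk_size: u64) -> u64 {
    assert!(chunk_size > 0, "chunk_size must be > 0");
    // Ceiling division: the last chunk may be partially filled.
    (size + chunk_size - 1) / chunk_size
}

fn main() {
    let cs = 1_048_576; // 1 MB default
    assert_eq!(chunk_count(0, cs), 0);           // empty input, no chunks
    assert_eq!(chunk_count(cs, cs), 1);          // exactly one full chunk
    assert_eq!(chunk_count(10 * cs + 1, cs), 11); // ten full chunks plus a 1-byte tail
}
```

Smaller chunk sizes therefore mean more chunk-key metadata per artifact, which is the overhead trade-off noted in the chunk-size table below.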
```rust
let config = BlobConfig::new()
    .with_chunk_size(1024 * 1024)
    .with_gc_interval(Duration::from_secs(300))
    .with_gc_batch_size(100)
    .with_gc_min_age(Duration::from_secs(3600))
    .with_max_artifact_size(100 * 1024 * 1024);
```
Configuration Validation
```rust
// Configuration is validated on BlobStore::new()
pub fn validate(&self) -> Result<()> {
    if self.chunk_size == 0 {
        return Err(BlobError::InvalidConfig("chunk_size must be > 0"));
    }
    if self.gc_batch_size == 0 {
        return Err(BlobError::InvalidConfig("gc_batch_size must be > 0"));
    }
    Ok(())
}
```
Garbage Collection
Two GC modes are available:
| Method | Description | Age Requirement | Reference Source |
|---|---|---|---|
| gc() | Incremental GC: processes batch_size chunks per cycle | Respects min_age | Uses stored _refs field |
| full_gc() | Full GC: recounts all references from artifacts | Ignores age | Rebuilds from artifact metadata |
Background GC runs automatically when started:
```rust
blob.start().await?;     // Start background GC
// ... use blob store ...
blob.shutdown().await?;  // Graceful shutdown (waits for current cycle)
```
BlobStore API
| Method | Description |
|---|---|
| new(store, config) | Create with configuration (validates config) |
| start() | Start background GC task |
| shutdown() | Graceful shutdown (sends signal and awaits task) |
| store() | Get reference to underlying TensorStore |
| put(filename, data, options) | Store bytes, return artifact ID |
| get(artifact_id) | Retrieve all bytes |
| delete(artifact_id) | Delete artifact and decrement chunk refs |
| exists(artifact_id) | Check if artifact exists |
| writer(filename, options) | Create streaming upload writer |
| reader(artifact_id) | Create streaming download reader |
| metadata(artifact_id) | Get artifact metadata |
| update_metadata(artifact_id, updates) | Apply metadata updates |
| set_meta(artifact_id, key, value) | Set custom metadata field |
| get_meta(artifact_id, key) | Get custom metadata field |
| link(artifact_id, entity) | Link to entity |
| unlink(artifact_id, entity) | Remove link |
| links(artifact_id) | Get linked entities |
| artifacts_for(entity) | Find artifacts by linked entity |
| tag(artifact_id, tag) | Add tag |
| untag(artifact_id, tag) | Remove tag |
| by_tag(tag) | Find artifacts by tag |
| list(prefix) | List artifacts with optional prefix filter |
| by_content_type(type) | Find by content type |
| by_creator(creator) | Find by creator |
| verify(artifact_id) | Verify checksum integrity |
| repair() | Repair broken references |
| gc() | Run incremental GC |
| full_gc() | Run full GC |
| stats() | Get storage statistics |
| set_embedding(id, vec, model) | Set artifact embedding (feature-gated) |
| similar(id, k) | Find k similar artifacts (feature-gated) |
| search_by_embedding(vec, k) | Search by embedding vector (feature-gated) |
BlobWriter API
| Method | Description |
|---|---|
| write(data) | Write chunk of data (buffers until chunk_size reached) |
| finish() | Finalize, flush buffer, store metadata, return artifact ID |
| bytes_written() | Total bytes written so far |
| chunks_written() | Chunks stored so far (not including buffered data) |
BlobReader API
| Method | Description |
|---|---|
| next_chunk() | Read next chunk, returns None when done |
| read_all() | Read all remaining data into buffer |
| read(buf) | Read into buffer, returns bytes read (for streaming) |
| verify() | Verify checksum against stored value (resets read position) |
| checksum() | Get expected checksum |
| total_size() | Total artifact size |
| bytes_read() | Bytes read so far |
| chunk_count() | Number of chunks |
Shell Commands
BLOB PUT 'filename' 'data' Store inline data
BLOB PUT 'filename' FROM 'path' Store from file path
BLOB GET 'artifact_id' Retrieve data
BLOB GET 'artifact_id' TO 'path' Write to file
BLOB DELETE 'artifact_id' Delete artifact
BLOB INFO 'artifact_id' Show metadata
BLOB VERIFY 'artifact_id' Verify integrity
BLOB LINK 'artifact_id' TO 'entity' Link to entity
BLOB UNLINK 'artifact_id' FROM 'entity' Remove link
BLOB TAG 'artifact_id' 'tag' Add tag
BLOB UNTAG 'artifact_id' 'tag' Remove tag
BLOB META SET 'artifact_id' 'key' 'value' Set custom metadata
BLOB META GET 'artifact_id' 'key' Get custom metadata
BLOB GC Run incremental GC
BLOB GC FULL Full garbage collection
BLOB REPAIR Repair broken references
BLOB STATS Show storage statistics
BLOBS List all artifacts
BLOBS FOR 'entity' Find by linked entity
BLOBS BY TAG 'tag' Find by tag
BLOBS WHERE TYPE = 'content/type' Find by content type
BLOBS SIMILAR TO 'artifact_id' LIMIT n Find similar (requires embeddings)
Edge Cases and Gotchas
Empty Data
```rust
// Empty data is rejected
let result = blob.put("empty.txt", b"", PutOptions::default()).await;
assert!(matches!(result, Err(BlobError::EmptyData)));
```
Size Limits
```rust
// Exceeding max_artifact_size returns an InvalidConfig error
let config = BlobConfig::new().with_max_artifact_size(1024);
let blob = BlobStore::new(store, config).await?;

let result = blob.put("large.bin", &vec![0u8; 2048], PutOptions::default()).await;
// Returns Err(BlobError::InvalidConfig("data size 2048 exceeds max 1024"))
```
Concurrent Deduplication
The reference counting is not fully atomic. If two writers simultaneously store the same chunk:
- Both may check `exists()` and find it missing
- Both may store the chunk with `refs = 1`
- One write will overwrite the other
- Result: ref count may be 1 instead of 2

Mitigation: for high-concurrency scenarios, run `full_gc()` periodically to rebuild accurate reference counts.
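The problematic interleaving can be reproduced deterministically by running both writers' check-then-store steps in sequence. This is a std-only simulation (the chunk key and `simulate_race` helper are hypothetical), not tensor_blob code:

```rust
use std::collections::HashMap;

/// Simulate two writers racing through the non-atomic
/// check-then-store sequence on the same content-addressed chunk.
fn simulate_race() -> u32 {
    let mut refs: HashMap<&str, u32> = HashMap::new();
    let key = "_blob:chunk:abc123"; // hypothetical chunk key

    // Both writers run the exists() check before either has stored:
    let a_sees = refs.contains_key(key); // false
    let b_sees = refs.contains_key(key); // false

    // Each, believing the chunk is new, stores it with refs = 1;
    // the second insert overwrites the first.
    if !a_sees { refs.insert(key, 1); }
    if !b_sees { refs.insert(key, 1); }

    refs[key] // 1, even though two artifacts now reference the chunk
}

fn main() {
    assert_eq!(simulate_race(), 1);
}
```

Because `full_gc()` recounts references from artifact metadata rather than trusting the stored `_refs` field, it repairs exactly this kind of undercount.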
GC Timing
- Incremental GC respects `min_age` to avoid deleting chunks from in-progress uploads
- A writer that takes longer than `min_age` to complete may have chunks collected
- Recommendation: set `gc_min_age` longer than your maximum expected upload time
Checksum vs Chunk Hash
- Checksum (`_checksum`): SHA-256 of the entire file content
- Chunk hash (in key): SHA-256 of individual chunk data
- These are different values and cannot be compared directly
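The distinction is that the checksum streams over every byte across chunk boundaries, while each chunk hash covers only that chunk's bytes. The following sketch illustrates this with std's `DefaultHasher` as a toy stand-in for SHA-256 (the real module uses sha2; `toy_hash` is purely illustrative):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for SHA-256, purely to illustrate the distinction.
fn toy_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

fn main() {
    let file = b"hello world, split across two chunks";
    let (a, b) = file.split_at(16);

    // File checksum: one hash over every byte, streamed across chunk boundaries.
    let checksum = toy_hash(file);

    // Chunk hashes: each chunk hashed independently (these become chunk keys).
    let chunk_a = toy_hash(a);
    let chunk_b = toy_hash(b);

    // Neither chunk hash matches the file checksum; they are different values.
    assert_ne!(chunk_a, checksum);
    assert_ne!(chunk_b, checksum);
}
```

So `verify()` (whole-file checksum) and `verify_chunk()` (per-chunk key hash) answer different questions and cannot substitute for one another.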
Sparse Embedding Detection
```rust
// Embeddings with >50% zeros are stored in sparse format
pub(crate) fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    nnz * 2 <= vector.len() // Use sparse if nnz <= 50%
}
```
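The predicate above is self-contained, so its threshold behaviour can be exercised stand-alone (reproduced here without the `pub(crate)` visibility):

```rust
fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    // Count entries whose magnitude exceeds the zero tolerance.
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    nnz * 2 <= vector.len() // sparse when at most 50% of entries are non-zero
}

fn main() {
    assert!(!should_use_sparse(&[]));                    // empty: dense
    assert!(should_use_sparse(&[0.0, 0.0, 0.0, 1.0]));   // 25% non-zero: sparse
    assert!(!should_use_sparse(&[0.5, -0.5, 0.1, 0.0])); // 75% non-zero: dense
    assert!(should_use_sparse(&[0.0, 1.0]));             // exactly 50%: sparse
}
```

Note the boundary: a vector that is exactly half zeros still qualifies as sparse, and values below the 1e-6 tolerance count as zeros.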
Performance Tips and Best Practices
Chunk Size Selection
| Chunk Size | Best For | Trade-offs |
|---|---|---|
| 256 KB | Many small files, high dedup potential | More metadata overhead |
| 1 MB (default) | General purpose | Good balance |
| 4 MB | Large media files, sequential access | Less dedup, fewer chunks |
```rust
// Benchmark different chunk sizes for your workload
let config = BlobConfig::new().with_chunk_size(512 * 1024); // 512 KB
```
Streaming for Large Files
```rust
use std::io::Read;

// Bad: loads the entire file into memory
let data = std::fs::read("large_file.bin")?;
blob.put("large_file.bin", &data, PutOptions::default()).await?;

// Good: streams the file in chunks
let mut writer = blob.writer("large_file.bin", PutOptions::default()).await?;
let file = std::fs::File::open("large_file.bin")?;
let mut reader = std::io::BufReader::new(file);
let mut buffer = vec![0u8; 64 * 1024]; // 64 KB read buffer
loop {
    let n = reader.read(&mut buffer)?;
    if n == 0 {
        break;
    }
    writer.write(&buffer[..n]).await?;
}
let artifact_id = writer.finish().await?;
```
GC Tuning
```rust
// High-throughput: more aggressive GC
let config = BlobConfig::new()
    .with_gc_interval(Duration::from_secs(60))    // Check every minute
    .with_gc_batch_size(500)                      // Process more per cycle
    .with_gc_min_age(Duration::from_secs(300));   // 5 minute grace period

// Low-priority background: less aggressive
let config = BlobConfig::new()
    .with_gc_interval(Duration::from_secs(3600))  // Check hourly
    .with_gc_batch_size(50)                       // Small batches
    .with_gc_min_age(Duration::from_secs(86400)); // 24 hour grace period
```
Batch Operations
```rust
// For multiple related artifacts, batch metadata updates
for artifact_id in artifact_ids {
    blob.tag(&artifact_id, "batch-processed").await?;
}

// Use full_gc() after bulk deletions
for artifact_id in to_delete {
    blob.delete(&artifact_id).await?;
}
blob.full_gc().await?; // Clean up all orphans at once
```
Verification Strategy
```rust
// Verify on read (paranoid mode)
let mut reader = blob.reader(&artifact_id).await?;
let data = reader.read_all().await?;
if !reader.verify().await? {
    return Err("Corruption detected".into());
}

// Periodic verification (background task)
for artifact_id in blob.list(None).await? {
    if !blob.verify(&artifact_id).await? {
        log::warn!("Corruption in artifact: {}", artifact_id);
    }
}
```
Related Modules
| Module | Relationship |
|---|---|
| tensor_store | Underlying key-value storage for chunks and metadata |
| query_router | Executes BLOB commands from parsed queries |
| neumann_shell | Interactive CLI for blob operations |
| vector_engine | Optional semantic search via embeddings |
| graph_engine | Optional entity linking via graph edges |
Dependencies
| Crate | Purpose |
|---|---|
| tensor_store | Key-value storage layer |
| tokio | Async runtime for streaming and background GC |
| sha2 | SHA-256 hashing for content addressing |
| uuid | Artifact ID generation (UUID v4) |
Tensor Checkpoint
Tensor Checkpoint provides point-in-time snapshots of the database state for recovery. It lets users create manual checkpoints before important operations, automatically checkpoints before destructive operations, and can roll back to any previous checkpoint. Checkpoints are stored as blob artifacts in tensor_blob, which provides content-addressable storage with automatic deduplication.
The module integrates with the query router to provide SQL-like commands
(`CHECKPOINT`, `CHECKPOINTS`, `ROLLBACK TO`), supports interactive
confirmation prompts for destructive operations, and enforces configurable
retention policies.
Module Structure
tensor_checkpoint/
src/
lib.rs # CheckpointManager, CheckpointConfig
state.rs # CheckpointState, DestructiveOp, metadata types
storage.rs # Blob storage integration
retention.rs # Count-based purge logic
preview.rs # Destructive operation previews
error.rs # Error types
Key Types
Core Types
| Type | Description |
|---|---|
| CheckpointManager | Main API for checkpoint operations |
| CheckpointConfig | Configuration (retention, auto-checkpoint, interactive mode) |
| CheckpointState | Full checkpoint data with snapshot and metadata |
| CheckpointInfo | Lightweight checkpoint listing info |
| CheckpointTrigger | Context for auto-checkpoints (command, operation, preview) |
State Types
| Type | Description |
|---|---|
DestructiveOp | Enum of destructive operations that trigger auto-checkpoints |
OperationPreview | Summary and sample data for confirmation prompts |
CheckpointMetadata | Statistics for validation (tables, nodes, embeddings) |
RelationalMeta | Table and row counts |
GraphMeta | Node and edge counts |
VectorMeta | Embedding count |
Error Types
| Variant | Description | Common Cause |
|---|---|---|
| NotFound | Checkpoint not found by ID or name | Typo in checkpoint name, or ID was pruned by retention |
| Storage | Blob storage error | Disk full, permissions issue |
| Serialization | Bincode serialization error | Corrupt in-memory state |
| Deserialization | Bincode deserialization error | Corrupt checkpoint file |
| Blob | Underlying blob store error | BlobStore not initialized |
| Snapshot | TensorStore snapshot error | Store locked or corrupted |
| Cancelled | Operation cancelled by user | User rejected confirmation prompt |
| InvalidId | Invalid checkpoint identifier | Empty or malformed ID string |
| Retention | Retention enforcement error | Failed to delete old checkpoints |
Architecture
```mermaid
flowchart TB
    subgraph Commands
        CP[CHECKPOINT]
        CPS[CHECKPOINTS]
        RB["ROLLBACK TO"]
    end
    subgraph CheckpointManager
        Create["create / create_auto"]
        List[list]
        Rollback[rollback]
        Delete[delete]
        Confirm[request_confirmation]
        Preview[generate_preview]
    end
    subgraph SL["Storage Layer"]
        CS[CheckpointStorage]
        RM[RetentionManager]
        PG[PreviewGenerator]
    end
    subgraph Dependencies
        Blob["tensor_blob::BlobStore"]
        Store["tensor_store::TensorStore"]
    end
    CP --> Create
    CPS --> List
    RB --> Rollback
    Create --> CS
    Create --> RM
    List --> CS
    Rollback --> CS
    Delete --> CS
    Confirm --> PG
    Preview --> PG
    CS --> Blob
    Create --> Store
    Rollback --> Store
```
Checkpoint Creation Flow
```mermaid
sequenceDiagram
    participant User
    participant Manager as CheckpointManager
    participant Store as TensorStore
    participant Storage as CheckpointStorage
    participant Retention as RetentionManager
    participant Blob as BlobStore

    User->>Manager: create(name, store)
    Manager->>Manager: Generate UUID
    Manager->>Manager: collect_metadata(store)
    Manager->>Store: snapshot_bytes()
    Store-->>Manager: Vec<u8>
    Manager->>Manager: Create CheckpointState
    Manager->>Storage: store(state, blob)
    Storage->>Storage: bincode::serialize(state)
    Storage->>Blob: put(filename, data, options)
    Blob-->>Storage: artifact_id
    Storage-->>Manager: artifact_id
    Manager->>Retention: enforce(blob)
    Retention->>Storage: list(blob)
    Storage-->>Retention: Vec<CheckpointInfo>
    Retention->>Retention: Sort by created_at DESC
    Retention->>Storage: delete(oldest beyond limit)
    Retention-->>Manager: deleted_count
    Manager-->>User: checkpoint_id
```
Rollback Flow
```mermaid
sequenceDiagram
    participant User
    participant Manager as CheckpointManager
    participant Storage as CheckpointStorage
    participant Blob as BlobStore
    participant Store as TensorStore

    User->>Manager: rollback(id_or_name, store)
    Manager->>Storage: load(id_or_name, blob)
    Storage->>Storage: find_by_id_or_name()
    Storage->>Storage: list() and match
    Storage->>Blob: get(artifact_id)
    Blob-->>Storage: checkpoint_bytes
    Storage->>Storage: bincode::deserialize()
    Storage-->>Manager: CheckpointState
    Manager->>Store: restore_from_bytes(state.store_snapshot)
    Store->>Store: SlabRouter::from_bytes()
    Store->>Store: clear() current data
    Store->>Store: copy all entries from new router
    Store-->>Manager: Ok(())
    Manager-->>User: Success
```
Storage Format
Checkpoints are stored as blob artifacts using content-addressable storage:
| Property | Value |
|---|---|
| Tag | _system:checkpoint |
| Content-Type | application/x-neumann-checkpoint |
| Format | bincode-serialized CheckpointState |
| Filename | checkpoint_{id}.ncp |
| Creator | system:checkpoint |
Checkpoint State Structure
The CheckpointState is serialized using bincode for efficient binary encoding:
```rust
#[derive(Serialize, Deserialize)]
pub struct CheckpointState {
    pub id: String,                         // UUID v4
    pub name: String,                       // User-provided or auto-generated
    pub created_at: u64,                    // Unix timestamp (seconds)
    pub trigger: Option<CheckpointTrigger>, // For auto-checkpoints
    pub store_snapshot: Vec<u8>,            // Serialized SlabRouterSnapshot
    pub metadata: CheckpointMetadata,
}
```
Snapshot Serialization Format
The store_snapshot field contains a V3 format snapshot:
```rust
// V3 snapshot structure (bincode serialized)
pub struct V3Snapshot {
    pub header: SnapshotHeader,     // Magic bytes, version, entry count
    pub router: SlabRouterSnapshot, // All slab data
}

pub struct SlabRouterSnapshot {
    pub index: EntityIndexSnapshot, // Key-to-entity mapping
    pub embeddings: EmbeddingSlabSnapshot,
    pub graph: GraphTensorSnapshot,
    pub relations: RelationalSlabSnapshot,
    pub metadata: MetadataSlabSnapshot,
    pub cache: CacheRingSnapshot<TensorData>,
    pub blobs: BlobLogSnapshot,
}
```
Custom metadata stored with each artifact:
| Key | Type | Description |
|---|---|---|
| checkpoint_id | String | UUID identifier |
| checkpoint_name | String | User-provided or auto-generated name |
| created_at | String | Unix timestamp (parsed to u64) |
| trigger | String | Operation name (for auto-checkpoints only) |
Metadata Collection Algorithm
When creating a checkpoint, metadata is collected by scanning the store:
```rust
fn collect_metadata(&self, store: &TensorStore) -> CheckpointMetadata {
    let store_key_count = store.len();

    // Count relational tables by scanning the _schema: prefix
    let table_keys: Vec<_> = store.scan("_schema:");
    let table_count = table_keys.len();
    let mut total_rows = 0;
    for key in &table_keys {
        if let Some(table_name) = key.strip_prefix("_schema:") {
            total_rows += store.scan_count(&format!("{table_name}:"));
        }
    }

    // Count graph entities
    let node_count = store.scan_count("node:");
    let edge_count = store.scan_count("edge:");

    // Count embeddings
    let embedding_count = store.scan_count("_embed:");

    CheckpointMetadata::new(
        RelationalMeta::new(table_count, total_rows),
        GraphMeta::new(node_count, edge_count),
        VectorMeta::new(embedding_count),
        store_key_count,
    )
}
```
Configuration
CheckpointConfig
| Field | Type | Default | Description |
|---|---|---|---|
| max_checkpoints | usize | 10 | Maximum checkpoints before pruning |
| auto_checkpoint | bool | true | Enable auto-checkpoints before destructive ops |
| interactive_confirm | bool | true | Require confirmation for destructive ops |
| preview_sample_size | usize | 5 | Number of sample rows in previews |
Builder Pattern
```rust
let config = CheckpointConfig::default()
    .with_max_checkpoints(20)
    .with_auto_checkpoint(true)
    .with_interactive_confirm(false)
    .with_preview_sample_size(10);
```
Configuration Presets
| Preset | max_checkpoints | auto_checkpoint | interactive_confirm | Use Case |
|---|---|---|---|---|
| Default | 10 | true | true | Interactive CLI usage |
| Automated | 20 | true | false | Batch processing scripts |
| Minimal | 3 | false | false | Memory-constrained environments |
| Safe | 50 | true | true | Production with high retention |
Destructive Operations
Operations that trigger auto-checkpoints when auto_checkpoint is enabled:
| Operation | Variant | Fields | Affected Count |
|---|---|---|---|
| DELETE | Delete | table, row_count | row_count |
| DROP TABLE | DropTable | table, row_count | row_count |
| DROP INDEX | DropIndex | table, column | 1 |
| NODE DELETE | NodeDelete | node_id, edge_count | 1 + edge_count |
| EMBED DELETE | EmbedDelete | key | 1 |
| VAULT DELETE | VaultDelete | key | 1 |
| BLOB DELETE | BlobDelete | artifact_id, size | 1 |
| CACHE CLEAR | CacheClear | entry_count | entry_count |
DestructiveOp Implementation
```rust
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum DestructiveOp {
    Delete { table: String, row_count: usize },
    DropTable { table: String, row_count: usize },
    DropIndex { table: String, column: String },
    NodeDelete { node_id: u64, edge_count: usize },
    EmbedDelete { key: String },
    VaultDelete { key: String },
    BlobDelete { artifact_id: String, size: usize },
    CacheClear { entry_count: usize },
}

impl DestructiveOp {
    pub fn operation_name(&self) -> &'static str {
        match self {
            DestructiveOp::Delete { .. } => "DELETE",
            DestructiveOp::DropTable { .. } => "DROP TABLE",
            // ... etc
        }
    }

    pub fn affected_count(&self) -> usize {
        match self {
            DestructiveOp::Delete { row_count, .. } => *row_count,
            DestructiveOp::NodeDelete { edge_count, .. } => 1 + edge_count,
            DestructiveOp::DropIndex { .. } => 1,
            // ... etc
        }
    }
}
```
SQL Commands
CHECKPOINT
-- Named checkpoint
CHECKPOINT 'before-migration'
-- Auto-generated name (checkpoint-{timestamp})
CHECKPOINT
CHECKPOINTS
-- List all checkpoints
CHECKPOINTS
-- List last N checkpoints
CHECKPOINTS LIMIT 10
Returns: ID, Name, Created, Type (manual/auto)
ROLLBACK TO
-- By name
ROLLBACK TO 'checkpoint-name'
-- By ID
ROLLBACK TO 'uuid-string'
API Reference
CheckpointManager
```rust
impl CheckpointManager {
    /// Create manager with blob storage and configuration
    pub async fn new(blob: Arc<Mutex<BlobStore>>, config: CheckpointConfig) -> Self;

    /// Create a manual checkpoint
    pub async fn create(&self, name: Option<&str>, store: &TensorStore) -> Result<String>;

    /// Create an auto-checkpoint before a destructive operation
    pub async fn create_auto(
        &self,
        command: &str,
        op: DestructiveOp,
        preview: OperationPreview,
        store: &TensorStore,
    ) -> Result<String>;

    /// Rollback to a checkpoint by ID or name
    pub async fn rollback(&self, id_or_name: &str, store: &TensorStore) -> Result<()>;

    /// List checkpoints, most recent first
    pub async fn list(&self, limit: Option<usize>) -> Result<Vec<CheckpointInfo>>;

    /// Delete a checkpoint by ID or name
    pub async fn delete(&self, id_or_name: &str) -> Result<()>;

    /// Generate preview for a destructive operation
    pub fn generate_preview(
        &self,
        op: &DestructiveOp,
        sample_data: Vec<String>,
    ) -> OperationPreview;

    /// Request user confirmation for an operation
    pub fn request_confirmation(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool;

    /// Set custom confirmation handler
    pub fn set_confirmation_handler(&mut self, handler: Arc<dyn ConfirmationHandler>);

    /// Check if auto-checkpoint is enabled
    pub fn auto_checkpoint_enabled(&self) -> bool;

    /// Check if interactive confirmation is enabled
    pub fn interactive_confirm_enabled(&self) -> bool;

    /// Access the current configuration
    pub fn config(&self) -> &CheckpointConfig;
}
```
ConfirmationHandler
```rust
pub trait ConfirmationHandler: Send + Sync {
    fn confirm(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool;
}
```
Built-in implementations:
| Type | Behavior | Use Case |
|---|---|---|
| AutoConfirm | Always returns true | Automated scripts, testing |
| AutoReject | Always returns false | Testing cancellation paths |
CheckpointStorage
Internal storage layer for checkpoint persistence:
```rust
impl CheckpointStorage {
    /// Store a checkpoint state to blob storage
    pub async fn store(state: &CheckpointState, blob: &BlobStore) -> Result<String>;

    /// Load a checkpoint by ID or name
    pub async fn load(checkpoint_id: &str, blob: &BlobStore) -> Result<CheckpointState>;

    /// List all checkpoints (sorted by created_at descending)
    pub async fn list(blob: &BlobStore) -> Result<Vec<CheckpointInfo>>;

    /// Delete a checkpoint by artifact ID
    pub async fn delete(artifact_id: &str, blob: &BlobStore) -> Result<()>;
}
```
PreviewGenerator
Generates human-readable previews for destructive operations:
```rust
impl PreviewGenerator {
    pub fn new(sample_size: usize) -> Self;
    pub fn generate(&self, op: &DestructiveOp, sample_data: Vec<String>) -> OperationPreview;
}

// Utility functions
pub fn format_warning(op: &DestructiveOp) -> String;
pub fn format_confirmation_prompt(op: &DestructiveOp, preview: &OperationPreview) -> String;
```
Usage Examples
Basic Usage
```rust
use tensor_checkpoint::{CheckpointManager, CheckpointConfig};
use tensor_blob::{BlobStore, BlobConfig};
use tensor_store::TensorStore;

// Initialize
let store = TensorStore::new();
let blob = BlobStore::new(store.clone(), BlobConfig::default()).await?;
let blob = Arc::new(Mutex::new(blob));
let config = CheckpointConfig::default();
let manager = CheckpointManager::new(blob, config).await;

// Create checkpoint
let id = manager.create(Some("before-migration"), &store).await?;

// ... make changes ...

// Rollback if needed
manager.rollback("before-migration", &store).await?;
```
With Query Router
```rust
use query_router::QueryRouter;

let mut router = QueryRouter::new();
router.init_blob()?;
router.init_checkpoint()?;

// Execute checkpoint commands via SQL
router.execute_parsed("CHECKPOINT 'backup'")?;
router.execute_parsed("CHECKPOINTS")?;
router.execute_parsed("ROLLBACK TO 'backup'")?;
```
Custom Confirmation Handler
```rust
use tensor_checkpoint::{ConfirmationHandler, DestructiveOp, OperationPreview};
use std::io::{self, Write};

struct InteractiveHandler;

impl ConfirmationHandler for InteractiveHandler {
    fn confirm(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool {
        println!("{}", tensor_checkpoint::format_confirmation_prompt(op, preview));
        io::stdout().flush().unwrap();
        let mut input = String::new();
        io::stdin().read_line(&mut input).unwrap();
        input.trim().to_lowercase() == "yes"
    }
}

// Usage
manager.set_confirmation_handler(Arc::new(InteractiveHandler));
```
Auto-Checkpoint with Rejection
```rust
use tensor_checkpoint::{AutoReject, CheckpointConfig};

// Create config with auto-checkpoint enabled
let config = CheckpointConfig::default()
    .with_auto_checkpoint(true)
    .with_interactive_confirm(true);

let mut manager = CheckpointManager::new(blob, config).await;
manager.set_confirmation_handler(Arc::new(AutoReject));

// DELETE will be rejected, no checkpoint created, operation cancelled
let result = router.execute("DELETE FROM users WHERE age > 50");
assert!(result.is_err()); // Operation cancelled by user
```
Retention Management
Checkpoints are automatically pruned when max_checkpoints is exceeded:
Retention Algorithm
```rust
pub async fn enforce(&self, blob: &BlobStore) -> Result<usize> {
    let checkpoints = CheckpointStorage::list(blob).await?;
    if checkpoints.len() <= self.max_checkpoints {
        return Ok(0);
    }

    let to_remove = checkpoints.len() - self.max_checkpoints;
    let mut removed = 0;

    // Checkpoints are sorted by created_at descending; the oldest are at the end
    for checkpoint in checkpoints.iter().rev().take(to_remove) {
        if CheckpointStorage::delete(&checkpoint.artifact_id, blob)
            .await
            .is_ok()
        {
            removed += 1;
        }
    }
    Ok(removed)
}
```
Retention Timing
Retention is enforced after every checkpoint creation:
1. Create new checkpoint
2. Store in blob storage
3. Call `retention.enforce()`
4. Return checkpoint ID
This ensures the checkpoint count never exceeds max_checkpoints + 1 at any
point.
Retention Edge Cases
| Scenario | Behavior |
|---|---|
| Creation fails | Retention not called, count unchanged |
| Retention delete fails | Logged but not fatal, continues deleting |
| max_checkpoints = 0 | All checkpoints deleted after creation |
| max_checkpoints = 1 | Only newest checkpoint retained |
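The pruning decision itself is pure list arithmetic: keep the newest `max_checkpoints` entries, delete the rest. A std-only sketch over a newest-first list (`to_prune` is a hypothetical helper, not the RetentionManager API) covers the edge cases from the table:

```rust
/// Given checkpoint ids sorted newest-first, return the ids that
/// count-based retention would delete (everything beyond the limit).
fn to_prune(sorted_newest_first: &[&str], max_checkpoints: usize) -> Vec<String> {
    sorted_newest_first
        .iter()
        .skip(max_checkpoints) // keep the newest `max_checkpoints`
        .map(|s| s.to_string())
        .collect()
}

fn main() {
    let cps = ["cp4", "cp3", "cp2", "cp1"]; // newest first

    assert_eq!(to_prune(&cps, 10), Vec::<String>::new()); // under the limit: nothing
    assert_eq!(to_prune(&cps, 2), vec!["cp2", "cp1"]);    // oldest two pruned
    assert_eq!(to_prune(&cps, 1), vec!["cp3", "cp2", "cp1"]); // only newest retained
    assert_eq!(to_prune(&cps, 0), vec!["cp4", "cp3", "cp2", "cp1"]); // limit 0: all deleted
}
```

The `max_checkpoints = 0` case falls straight out of the arithmetic: the just-created checkpoint is itself beyond the limit and is deleted immediately after creation.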
Interactive Confirmation
When interactive_confirm is enabled, destructive operations display a preview:
WARNING: About to delete 5 row(s) from table 'users'
Will delete 5 row(s) from table 'users'
Affected data sample:
1. id=1, name='Alice'
2. id=2, name='Bob'
... and 3 more
Type 'yes' to proceed, anything else to cancel:
Preview Generation
The preview generator formats human-readable summaries:
```rust
fn format_summary(&self, op: &DestructiveOp) -> String {
    match op {
        DestructiveOp::Delete { table, row_count } => {
            format!("Will delete {row_count} row(s) from table '{table}'")
        }
        DestructiveOp::DropTable { table, row_count } => {
            format!("Will drop table '{table}' containing {row_count} row(s)")
        }
        DestructiveOp::BlobDelete { artifact_id, size } => {
            let size_str = format_bytes(*size);
            format!("Will delete blob artifact '{artifact_id}' ({size_str})")
        }
        // ... etc
    }
}
```
Size Formatting
Blob sizes are formatted for readability:
| Bytes | Display |
|---|---|
| < 1024 | “N bytes” |
| >= 1 KB | “N.NN KB” |
| >= 1 MB | “N.NN MB” |
| >= 1 GB | “N.NN GB” |
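A possible implementation of that table, std-only; the exact thresholds and two-decimal precision are assumptions inferred from the table, not the crate's verbatim code:

```rust
/// Format a byte count per the size-formatting table above (sketch).
fn format_bytes(size: usize) -> String {
    const KB: f64 = 1024.0;
    let s = size as f64;
    if s < KB {
        format!("{size} bytes")
    } else if s < KB * KB {
        format!("{:.2} KB", s / KB)
    } else if s < KB * KB * KB {
        format!("{:.2} MB", s / (KB * KB))
    } else {
        format!("{:.2} GB", s / (KB * KB * KB))
    }
}

fn main() {
    assert_eq!(format_bytes(512), "512 bytes");
    assert_eq!(format_bytes(2048), "2.00 KB");
    assert_eq!(format_bytes(5 * 1024 * 1024), "5.00 MB");
    assert_eq!(format_bytes(3 * 1024 * 1024 * 1024), "3.00 GB");
}
```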
Rollback Algorithm
The rollback process completely replaces the store contents:
Algorithm Steps
1. Locate Checkpoint: Search by ID first, then by name
2. Load State: Deserialize `CheckpointState` from blob storage
3. Deserialize Snapshot: Convert `store_snapshot` bytes to `SlabRouter`
4. Clear Current Data: Remove all entries from the current store
5. Copy Restored Data: Iterate and copy all entries from the restored router
Rollback Implementation
```rust
pub async fn rollback(&self, id_or_name: &str, store: &TensorStore) -> Result<()> {
    let blob = self.blob.lock().await;
    let state = CheckpointStorage::load(id_or_name, &blob).await?;
    store
        .restore_from_bytes(&state.store_snapshot)
        .map_err(|e| CheckpointError::Snapshot(e.to_string()))?;
    Ok(())
}

// In TensorStore
pub fn restore_from_bytes(&self, bytes: &[u8]) -> SnapshotResult<()> {
    let new_router = SlabRouter::from_bytes(bytes)?;
    // Clear current data and copy entries from the new router
    self.router.clear();
    for key in new_router.scan("") {
        if let Ok(value) = new_router.get(&key) {
            let _ = self.router.put(&key, value);
        }
    }
    Ok(())
}
```
Rollback Characteristics
| Aspect | Behavior |
|---|---|
| Atomicity | Not atomic - partial restore possible on failure |
| Isolation | No locking - concurrent operations may see partial state |
| Duration | O(n) where n = number of entries |
| Memory | Requires 2x memory during restore (old + new) |
Edge Cases and Gotchas
Name vs ID Lookup
Checkpoints can be referenced by either name or ID:
```rust
async fn find_by_id_or_name(id_or_name: &str, blob: &BlobStore) -> Result<String> {
    let checkpoints = Self::list(blob).await?;
    for cp in checkpoints {
        // Exact match on ID or name
        if cp.id == id_or_name || cp.name == id_or_name {
            return Ok(cp.artifact_id);
        }
    }
    Err(CheckpointError::NotFound(id_or_name.to_string()))
}
```
Gotcha: If a checkpoint is named with a valid UUID format, it may conflict with ID lookup.
Auto-Generated Names
When no name is provided:
```rust
let name = name.map(String::from).unwrap_or_else(|| {
    let now = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    format!("checkpoint-{now}")
});
```
Auto-checkpoint names follow the pattern `auto-before-{operation-name}`.
Timestamp Edge Cases
| Scenario | Behavior |
|---|---|
| System time before epoch | Timestamp becomes 0 |
| Rapid checkpoint creation | May have same second timestamp |
| Clock drift | Checkpoints may be out of order |
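The "rapid creation" row follows directly from the naming scheme above: names carry only second resolution, so two checkpoints created within the same second get identical auto-generated names, and only the UUID id distinguishes them. A std-only sketch (`auto_name` is a hypothetical helper mirroring the `checkpoint-{now}` pattern):

```rust
/// Mirror the auto-generated naming pattern at second resolution.
fn auto_name(now_secs: u64) -> String {
    format!("checkpoint-{now_secs}")
}

fn main() {
    // Two checkpoints created within the same second collide on name...
    let t = 1_700_000_000_u64; // some fixed Unix timestamp
    assert_eq!(auto_name(t), auto_name(t));
    assert_eq!(auto_name(t), "checkpoint-1700000000");

    // ...so the UUID id, not the name, is the unique handle; name-based
    // lookup of such checkpoints resolves to whichever entry matches first.
    assert_ne!(auto_name(t), auto_name(t + 1));
}
```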
Blob Store Dependency
```rust
pub fn init_checkpoint(&mut self) -> Result<()> {
    self.init_checkpoint_with_config(CheckpointConfig::default())
}

pub fn init_checkpoint_with_config(&mut self, config: CheckpointConfig) -> Result<()> {
    let blob = self.blob.as_ref().ok_or_else(|| {
        RouterError::CheckpointError("Blob store must be initialized first".to_string())
    })?;
    // ...
}
```
Gotcha: Always call init_blob() before init_checkpoint().
Performance Tips
Checkpoint Creation Performance
| Factor | Impact | Recommendation |
|---|---|---|
| Store size | O(n) serialization | Keep hot data separate |
| Retention limit | More deletions on creation | Set appropriate max_checkpoints |
| Blob storage | Network latency for remote | Use local storage for fast checkpoints |
Memory Considerations
- Full snapshot is held in memory during creation
- Rollback requires 2x memory temporarily
- Large embeddings significantly increase checkpoint size
Optimization Strategies
- Incremental Checkpoints (not yet supported)
  - Currently full snapshots only
  - Future: delta-based checkpoints
- Selective Checkpointing
  - Use separate stores for hot vs. cold data
  - Only checkpoint critical data
- Compression
  - TensorStore supports compressed snapshots for file I/O
  - Checkpoint uses bincode (no compression)
Benchmarks
| Store Size | Checkpoint Time | Rollback Time | Memory |
|---|---|---|---|
| 1K entries | ~5ms | ~3ms | ~100KB |
| 10K entries | ~50ms | ~30ms | ~1MB |
| 100K entries | ~500ms | ~300ms | ~10MB |
| 1M entries | ~5s | ~3s | ~100MB |
Related Modules
| Module | Relationship |
|---|---|
| `tensor_blob` | Storage backend for checkpoint data |
| `tensor_store` | Source of snapshots and restore target |
| `query_router` | SQL command integration |
| `neumann_shell` | Interactive confirmation handling |
Limitations
- Full snapshots only (no incremental checkpoints)
- Single-node operation (no distributed checkpoints)
- In-memory restore (entire snapshot loaded)
- No automatic scheduling (manual or trigger-based only)
- Not atomic (partial restore possible on failure)
- No encryption (checkpoints stored in plaintext)
- Bloom filter state not preserved (rebuilt on load if needed)
Tensor Unified
Cross-engine operations and unified entity management for Neumann. Provides a single interface for queries that span relational, graph, and vector engines with async-first design and thread safety inherited from TensorStore.
Design Principles
- Cross-Engine Abstraction: Single interface for operations spanning multiple engines
- Unified Entities: Entities can have relational fields, graph connections, and embeddings
- Composable Queries: Combine vector similarity with graph connectivity
- Async-First: All cross-engine operations support async execution
- Thread Safety: Inherits from underlying engines via TensorStore
Architecture
```
            +------------------+
            |  UnifiedEngine   |
            +------------------+
                     |
  +------------------+------------------+
  |                  |                  |
  v                  v                  v
+---------------+ +---------------+ +---------------+
|  Relational   | |     Graph     | |    Vector     |
|    Engine     | |    Engine     | |    Engine     |
+---------------+ +---------------+ +---------------+
  |                  |                  |
  +------------------+------------------+
                     |
              +------v------+
              | TensorStore |
              +-------------+
```
All engines share the same TensorStore instance, enabling cross-engine queries without data duplication.
Internal Engine Coordination
```mermaid
sequenceDiagram
    participant Client
    participant UnifiedEngine
    participant VectorEngine
    participant GraphEngine
    participant TensorStore

    Client->>UnifiedEngine: create_entity("user:1", fields, embedding)
    UnifiedEngine->>VectorEngine: set_entity_embedding("user:1", embedding)
    VectorEngine->>TensorStore: put("user:1", TensorData{_embedding: ...})
    UnifiedEngine->>TensorStore: get("user:1")
    TensorStore-->>UnifiedEngine: TensorData
    UnifiedEngine->>TensorStore: put("user:1", TensorData{fields + _embedding})
    UnifiedEngine-->>Client: Ok(())
```
Key Types
| Type | Description |
|---|---|
| `UnifiedEngine` | Main entry point for cross-engine operations |
| `UnifiedResult` | Query result containing description and items |
| `UnifiedItem` | Single item with source, id, data, embedding, and score |
| `UnifiedError` | Error type wrapping engine-specific errors |
| `FindPattern` | Pattern for FIND queries (Nodes or Edges) |
| `DistanceMetric` | Similarity metric (Cosine, Euclidean, DotProduct) |
| `EntityInput` | Tuple type for batch operations: `(key, fields, embedding)` |
| `Unified` | Trait for converting engine types to `UnifiedItem` |
| `FilterCondition` | Re-exported from vector_engine for filtered search |
| `FilterValue` | Re-exported from vector_engine for filter values |
| `VectorCollectionConfig` | Re-exported from vector_engine for collection config |
UnifiedItem
```rust
pub struct UnifiedItem {
    pub source: String,                // "relational", "graph", "vector", or combined
    pub id: String,                    // Entity key
    pub data: HashMap<String, String>, // Entity fields
    pub embedding: Option<Vec<f32>>,   // Optional embedding
    pub score: Option<f32>,            // Similarity score if applicable
}
```
The source field indicates which engine(s) produced the result:
- `"graph"` - Result from graph operations (nodes, edges)
- `"vector"` - Result from vector similarity search
- `"unified"` - Result from cross-engine entity retrieval
- `"vector+graph"` - Result from `find_similar_connected` (similarity + connectivity)
- `"graph+vector"` - Result from `find_neighbors_by_similarity` (connectivity + similarity)
UnifiedError
| Variant | Cause |
|---|---|
| `RelationalError` | Error from relational engine |
| `GraphError` | Error from graph engine |
| `VectorError` | Error from vector engine |
| `NotFound` | Entity not found |
| `InvalidOperation` | Invalid operation attempted |
Error conversion is automatic via From implementations:
```rust
impl From<graph_engine::GraphError> for UnifiedError {
    fn from(e: graph_engine::GraphError) -> Self {
        UnifiedError::GraphError(e.to_string())
    }
}

impl From<vector_engine::VectorError> for UnifiedError {
    fn from(e: vector_engine::VectorError) -> Self {
        UnifiedError::VectorError(e.to_string())
    }
}

impl From<relational_engine::RelationalError> for UnifiedError {
    fn from(e: relational_engine::RelationalError) -> Self {
        UnifiedError::RelationalError(e.to_string())
    }
}
```
Entity Storage Format
Unified entities use reserved field prefixes in TensorData to store
cross-engine data within a single key-value entry:
| Field | Type | Description |
|---|---|---|
| `_out` | `Pointers(Vec<String>)` | Outgoing edge keys |
| `_in` | `Pointers(Vec<String>)` | Incoming edge keys |
| `_embedding` | `Vector(Vec<f32>)` or `Sparse(SparseVector)` | Embedding vector |
| `_label` | `Scalar(String)` | Entity type/label |
| `_type` | `Scalar(String)` | Discriminator ("node", "edge", "row") |
| `_id` | `Scalar(Int)` | Numeric entity ID |
| `_from` | `Scalar(String)` | Edge source key |
| `_to` | `Scalar(String)` | Edge target key |
| `_edge_type` | `Scalar(String)` | Edge type |
| `_directed` | `Scalar(Bool)` | Whether the edge is directed |
| `_table` | `Scalar(String)` | Table name for relational rows |
Entity Storage Example
```
Key: "user:alice"
TensorData:
  _embedding: Vector([0.1, 0.2, 0.3, 0.4])
  _out:       Pointers(["edge:follows:1", "edge:likes:2"])
  _in:        Pointers(["edge:follows:3"])
  name:       Scalar(String("Alice"))
  role:       Scalar(String("admin"))

Key: "edge:follows:1"
TensorData:
  _type:      Scalar(String("edge"))
  _from:      Scalar(String("user:alice"))
  _to:        Scalar(String("user:bob"))
  _edge_type: Scalar(String("follows"))
  _directed:  Scalar(Bool(true))
```
Sparse Vector Auto-Detection
Embeddings are automatically stored in sparse format when >50% of values are zero:
```rust
fn should_use_sparse(vector: &[f32]) -> bool {
    if vector.is_empty() {
        return false;
    }
    let nnz = vector.iter().filter(|&&v| v.abs() > 1e-6).count();
    // Sparse if nnz <= len/2
    nnz * 2 <= vector.len()
}
```
Initialization
```rust
use tensor_unified::UnifiedEngine;
use tensor_store::TensorStore;

// Create with a new store
let engine = UnifiedEngine::new();

// Create with a shared store
let store = TensorStore::new();
let engine = UnifiedEngine::with_store(store);

// Create with existing engines
let engine = UnifiedEngine::with_engines(store, relational, graph, vector);
```
Internal Structure
```rust
pub struct UnifiedEngine {
    store: TensorStore,
    relational: Arc<RelationalEngine>,
    graph: Arc<GraphEngine>,
    vector: Arc<VectorEngine>,
}
```
The Arc wrappers enable:
- Thread-safe sharing across async tasks
- Zero-copy cloning of the engine
- Independent engine access when needed
Entity Operations
Creating Entities
```rust
use std::collections::HashMap;

// Create an entity with fields and optional embedding
let mut fields = HashMap::new();
fields.insert("name".to_string(), "Alice".to_string());
fields.insert("role".to_string(), "admin".to_string());

engine.create_entity(
    "user:1",
    fields,
    Some(vec![0.1, 0.2, 0.3, 0.4]), // Optional embedding
).await?;
```
Internal flow:
```mermaid
flowchart TD
    A[create_entity] --> B{Has embedding?}
    B -->|Yes| C[VectorEngine::set_entity_embedding]
    C --> D[Store to TensorData._embedding]
    B -->|No| E[Get existing TensorData or new]
    D --> E
    E --> F[For each field]
    F --> G[Set field as TensorValue::Scalar]
    G --> H[TensorStore::put]
```
Connecting Entities
```rust
// Connect entities via a graph edge
let edge_id = engine.connect_entities("user:1", "user:2", "follows").await?;
```
Edge creation updates three TensorData entries:
1. Creates a new edge entry with `_from`, `_to`, `_edge_type`, `_directed`
2. Adds the edge key to the source entity's `_out` field
3. Adds the edge key to the target entity's `_in` field
Retrieving Entities
```rust
// Get entity with all data and embedding
let item = engine.get_entity("user:1").await?;
println!("Fields: {:?}", item.data);
println!("Embedding: {:?}", item.embedding);
```
Gotcha: Returns UnifiedError::NotFound if the entity has neither fields
nor embedding.
Cross-Engine Queries
Find Similar Connected
Find entities similar to a query that are also connected via graph edges:
```mermaid
flowchart LR
    A[Query Entity] --> B[Get embedding]
    B --> C[Vector search top_k*2]
    C --> D[Get connected neighbors]
    D --> E[HashSet intersection]
    E --> F[Take top_k]
    F --> G[Return UnifiedItems]
```
```rust
// Find entities similar to query AND connected to target
let results = engine.find_similar_connected(
    "user:1",   // Query entity (uses its embedding)
    "hub:main", // Find entities connected to this
    10,         // Top-k results
).await?;
```
Algorithm details:
1. Retrieves the embedding for `query_key` via `VectorEngine::get_entity_embedding`
2. Searches for the top `k*2` similar entities (over-fetches for filtering)
3. Gets neighbors of `connected_to` via `GraphEngine::get_entity_neighbors`
4. Builds a HashSet of connected neighbors for O(1) lookup
5. Filters similar results to only those in the neighbor set
6. Returns top-k results with source `"vector+graph"`
Edge case: If query_key has no embedding, returns VectorError::NotFound.
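The filter-and-truncate step of this algorithm can be sketched as a plain function. `intersect_similar_with_neighbors` below is an illustrative stand-in for the internal logic, not part of the module's API:

```rust
use std::collections::HashSet;

// `similar` holds (key, score) pairs as a vector search would return them,
// best first; `neighbors` holds keys connected to the hub entity.
fn intersect_similar_with_neighbors(
    similar: Vec<(String, f32)>, // over-fetched top k*2 candidates
    neighbors: &HashSet<String>,
    top_k: usize,
) -> Vec<(String, f32)> {
    similar
        .into_iter()
        .filter(|(key, _)| neighbors.contains(key)) // O(1) membership test
        .take(top_k)
        .collect()
}

fn main() {
    let similar = vec![
        ("doc:1".to_string(), 0.98),
        ("doc:2".to_string(), 0.91),
        ("doc:3".to_string(), 0.88),
    ];
    let neighbors: HashSet<String> =
        ["doc:2", "doc:3"].iter().map(|s| s.to_string()).collect();
    // "doc:1" is dropped (not connected); only the best connected hit survives.
    let hits = intersect_similar_with_neighbors(similar, &neighbors, 1);
    assert_eq!(hits, vec![("doc:2".to_string(), 0.91)]);
}
```

The over-fetch factor of 2 matters here: if most similar candidates are not connected, a plain top-k search would return fewer than k hits after filtering.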
Find Similar Connected with Filter
Enhanced version that combines vector similarity, graph connectivity, and metadata filtering:
```rust
use vector_engine::{FilterCondition, FilterValue};

// Build a filter for metadata
let filter = FilterCondition::Eq(
    "category".to_string(),
    FilterValue::String("article".to_string()),
);

// Find entities similar to query, connected to hub, matching filter
let results = engine.find_similar_connected_filtered(
    "user:1",      // Query entity (uses its embedding)
    "hub:main",    // Find entities connected to this
    Some(&filter), // Optional metadata filter
    10,            // Top-k results
).await?;
```
Algorithm:
1. Gets the query embedding from `query_key`
2. Gets connected neighbor keys from the graph
3. Builds a combined filter: `key IN neighbors AND user_filter`
4. Uses the pre-filter strategy for high selectivity
5. Returns filtered results with source `"vector+graph"`
The filtered version eliminates post-processing by pushing filters into the vector search, improving performance for selective queries.
Find Neighbors by Similarity
Find graph neighbors sorted by similarity to a query vector:
```mermaid
flowchart LR
    A[Entity Key] --> B[Get neighbors via graph]
    B --> C[For each neighbor]
    C --> D[Get embedding]
    D --> E{Dimension match?}
    E -->|Yes| F[Compute cosine similarity]
    E -->|No| G[Skip]
    F --> H[Collect results]
    H --> I[Sort by score desc]
    I --> J[Truncate to top_k]
```
```rust
// Find neighbors of an entity sorted by similarity to a vector
let results = engine.find_neighbors_by_similarity(
    "user:1",              // Entity to get neighbors of
    &[0.1, 0.2, 0.3, 0.4], // Query vector
    10,                    // Top-k results
).await?;
```
Algorithm details:
1. Gets all neighbors (both directions) via `GraphEngine::get_entity_neighbors`
2. For each neighbor:
   - Attempts to get the embedding via `VectorEngine::get_entity_embedding`
   - Skips it if there is no embedding or the dimensions mismatch
   - Computes cosine similarity with the query vector
3. Sorts results by score descending
4. Truncates to top-k
5. Returns results with source `"graph+vector"`
Gotcha: Neighbors without embeddings are silently skipped.
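The per-neighbor similarity step can be sketched as follows. `cosine_similarity` is an illustrative helper, written so that it returns `None` in exactly the cases the engine skips (dimension mismatch or degenerate input), rather than the module's actual code:

```rust
// Returns None on dimension mismatch or zero-magnitude input, mirroring the
// "silently skipped" behavior described above.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return None;
    }
    Some(dot / (na * nb))
}

fn main() {
    // Identical direction: similarity 1.0
    assert_eq!(cosine_similarity(&[1.0, 0.0], &[1.0, 0.0]), Some(1.0));
    // Dimension mismatch: skipped, not an error
    assert_eq!(cosine_similarity(&[1.0, 0.0], &[1.0, 0.0, 0.0]), None);
}
```

Treating mismatches as `None` instead of an error is what makes the silent-skip gotcha easy to miss: a neighbor set of the wrong dimension simply yields an empty result list.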
Unified Entity Storage
The unified entity storage methods provide a streamlined API for storing entity fields as vector metadata, eliminating double-storage overhead.
Creating Unified Entities
```rust
use std::collections::HashMap;

let mut fields = HashMap::new();
fields.insert("title".to_string(), "Introduction to Rust".to_string());
fields.insert("author".to_string(), "Alice".to_string());

// Store entity with fields as vector metadata
engine.create_entity_unified(
    "doc:1",
    fields.clone(), // clone so the map can be reused below
    Some(vec![0.1, 0.2, 0.3, 0.4]),
).await?;

// Without embedding, stores to TensorStore only
engine.create_entity_unified("doc:2", fields, None).await?;
```
When an embedding is provided, fields are stored as vector metadata alongside the embedding. This enables filtered search without requiring a separate storage lookup.
Retrieving Unified Entities
```rust
// Get entity with fields from vector metadata
let item = engine.get_entity_unified("doc:1").await?;
println!("Title: {:?}", item.data.get("title"));
println!("Embedding: {:?}", item.embedding);
```
The retrieval first attempts to load from vector metadata. If not found, it falls back to the standard TensorStore lookup.
Collection-Based Entity Organization
Collections provide type-based organization for entities, enabling scoped searches and dimension enforcement.
Creating Entities in Collections
```rust
use std::collections::HashMap;
use tensor_unified::DistanceMetric;
use vector_engine::VectorCollectionConfig;

// Create a collection for documents
let config = VectorCollectionConfig::default()
    .with_dimension(768)
    .with_metric(DistanceMetric::Cosine);
engine.create_entity_collection("documents", config)?;

// Store entity in collection
let mut fields = HashMap::new();
fields.insert("title".to_string(), "ML Paper".to_string());
engine.create_entity_in_collection(
    "documents",
    "paper:1",
    fields,
    vec![0.1; 768],
).await?;
```
Searching in Collections
```rust
use vector_engine::FilterCondition;

// Basic search in collection
let results = engine.find_similar_in_collection(
    "documents",
    &query_embedding,
    None, // No filter
    10,
).await?;

// Filtered search in collection
let filter = FilterCondition::Eq("author".to_string(), "Alice".into());
let results = engine.find_similar_in_collection(
    "documents",
    &query_embedding,
    Some(&filter),
    10,
).await?;
```
Managing Collections
```rust
// List all entity collections
let collections = engine.list_entity_collections();

// Delete a collection
engine.delete_entity_collection("documents")?;
```
Collection Isolation
Collections ensure entity isolation:
- Each collection has its own key namespace
- Dimension mismatches are rejected per-collection config
- Searches only see entities within the specified collection
- Deleting a collection removes all entities in it
Find Nodes and Edges
```rust
// Find all nodes with optional label filter
let nodes = engine.find_nodes(Some("person"), None).await?;

// Find all edges with optional type filter
let edges = engine.find_edges(Some("follows"), None).await?;

// Find with pattern and limit
let pattern = FindPattern::Nodes { label: Some("document".to_string()) };
let result = engine.find(&pattern, Some(10)).await?;
```
Find Pattern Matching Implementation
The find_nodes and find_edges methods scan the TensorStore for matching
entities:
Node Scanning Algorithm
```rust
fn scan_nodes(&self, label_filter: Option<&str>) -> Result<Vec<Node>> {
    let keys = self.store.scan("node:"); // Prefix scan
    for key in keys {
        // Filter out edge lists (node:123:out, node:123:in)
        if key.contains(":out") || key.contains(":in") {
            continue;
        }
        // Parse node ID from key "node:{id}"
        if let Some(id_str) = key.strip_prefix("node:") {
            if let Ok(id) = id_str.parse::<u64>() {
                // Fetch and optionally filter by label
            }
        }
    }
    // ...
}
```
Condition Matching
Conditions are evaluated against node/edge properties:
| Condition | Node Fields | Edge Fields |
|---|---|---|
| `Eq("id", ...)` | Matches `node.id` | Matches `edge.id` |
| `Eq("label", ...)` | Matches `node.label` | N/A |
| `Eq("type", ...)` | N/A | Matches `edge.edge_type` |
| `Eq("edge_type", ...)` | N/A | Matches `edge.edge_type` (alias) |
| `Eq("from", ...)` | N/A | Matches `edge.from` |
| `Eq("to", ...)` | N/A | Matches `edge.to` |
| `Eq(property, ...)` | Matches `node.properties[property]` | Matches `edge.properties[property]` |
| `And(a, b)` | Both must match | Both must match |
| `Or(a, b)` | Either must match | Either must match |
| Other conditions | Returns true (pass-through) | Returns true (pass-through) |
Gotcha: Conditions other than Eq, And, Or return true (not yet
implemented for graph entities).
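The matching rules above, including the pass-through default, can be modeled with a simplified condition type. The `Cond` enum and `matches_node` below are illustrative stand-ins, not the engine's actual `Condition` type:

```rust
use std::collections::HashMap;

// Simplified stand-in for the engine's condition type; only the variants
// the table documents are modeled here.
enum Cond {
    Eq(String, String),
    And(Box<Cond>, Box<Cond>),
    Or(Box<Cond>, Box<Cond>),
    Other, // any unimplemented condition passes through
}

fn matches_node(cond: &Cond, label: &str, props: &HashMap<String, String>) -> bool {
    match cond {
        Cond::Eq(field, value) if field == "label" => label == value,
        Cond::Eq(field, value) => props.get(field).map_or(false, |v| v == value),
        Cond::And(a, b) => matches_node(a, label, props) && matches_node(b, label, props),
        Cond::Or(a, b) => matches_node(a, label, props) || matches_node(b, label, props),
        Cond::Other => true, // pass-through, as the gotcha notes
    }
}

fn main() {
    let mut props = HashMap::new();
    props.insert("name".to_string(), "Alice".to_string());
    let c = Cond::And(
        Box::new(Cond::Eq("label".into(), "person".into())),
        Box::new(Cond::Eq("name".into(), "Alice".into())),
    );
    assert!(matches_node(&c, "person", &props));
    // Unimplemented conditions do not filter anything out.
    assert!(matches_node(&Cond::Other, "person", &props));
}
```

The pass-through default means an unsupported condition widens results rather than erroring, which is worth keeping in mind when composing filters.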
Batch Operations
```rust
use std::collections::HashMap;

// Store multiple embeddings
let items = vec![
    ("doc1".to_string(), vec![0.1, 0.2, 0.3]),
    ("doc2".to_string(), vec![0.4, 0.5, 0.6]),
];
let count = engine.embed_batch(items).await?;

// Create multiple entities
let entities: Vec<EntityInput> = vec![
    ("e1".to_string(), HashMap::from([("name".to_string(), "A".to_string())]), None),
    ("e2".to_string(), HashMap::from([("name".to_string(), "B".to_string())]), Some(vec![0.1, 0.2])),
];
let count = engine.create_entities_batch(entities).await?;
```
Note: Batch operations process sequentially (not parallel). Failed individual operations are counted as failures but don’t abort the batch.
Unified Trait
Types implementing the Unified trait can be converted to UnifiedItem:
```rust
pub trait Unified {
    fn as_unified(&self) -> UnifiedItem;
    fn source_engine(&self) -> &'static str;
    fn unified_id(&self) -> String;
}
```
Implemented for:
- `graph_engine::Node` - Converts label and properties to data fields
- `graph_engine::Edge` - Converts from, to, type, and properties to data fields
- `vector_engine::SearchResult` - Converts key and score
Implementation Examples
```rust
impl Unified for Node {
    fn as_unified(&self) -> UnifiedItem {
        let mut item = UnifiedItem::new("graph", self.id.to_string());
        item.set("label", &self.label);
        for (k, v) in &self.properties {
            item.set(k.clone(), format!("{:?}", v)); // Debug format for PropertyValue
        }
        item
    }

    fn source_engine(&self) -> &'static str {
        "graph"
    }

    fn unified_id(&self) -> String {
        self.id.to_string()
    }
}

impl Unified for SearchResult {
    fn as_unified(&self) -> UnifiedItem {
        UnifiedItem::new("vector", &self.key).with_score(self.score)
    }

    fn source_engine(&self) -> &'static str {
        "vector"
    }

    fn unified_id(&self) -> String {
        self.key.clone()
    }
}
```
Query Language
Cross-engine operations are exposed via the query language:
Entity Creation
-- Create entity with fields and embedding
ENTITY CREATE 'user:1' {name: 'Alice', role: 'admin'} EMBEDDING [0.1, 0.2, 0.3]
-- Create entity with fields only
ENTITY CREATE 'user:2' {name: 'Bob'}
-- Connect entities
ENTITY CONNECT 'user:1' -> 'user:2' : follows
Cross-Engine Similarity
-- Find similar entities that are also connected to a hub
SIMILAR 'query:key' CONNECTED TO 'hub:entity' LIMIT 10
-- Find neighbors sorted by similarity
NEIGHBORS 'entity:key' BY SIMILAR [0.1, 0.2, 0.3] LIMIT 10
QueryRouter Integration
QueryRouter integrates with UnifiedEngine for cross-engine operations. When
created with with_shared_store(), the router automatically initializes an
internal UnifiedEngine:
```mermaid
classDiagram
    class QueryRouter {
        -relational: Arc~RelationalEngine~
        -graph: Arc~GraphEngine~
        -vector: Arc~VectorEngine~
        -unified: Option~UnifiedEngine~
        -hnsw_index: Option~HNSWIndex~
        +with_shared_store(store) QueryRouter
        +unified() Option~UnifiedEngine~
        +find_similar_connected()
        +find_neighbors_by_similarity()
    }
    class UnifiedEngine {
        -store: TensorStore
        -relational: Arc~RelationalEngine~
        -graph: Arc~GraphEngine~
        -vector: Arc~VectorEngine~
    }
    QueryRouter --> UnifiedEngine : contains
    QueryRouter --> RelationalEngine : shares Arc
    QueryRouter --> GraphEngine : shares Arc
    QueryRouter --> VectorEngine : shares Arc
    UnifiedEngine --> RelationalEngine : shares Arc
    UnifiedEngine --> GraphEngine : shares Arc
    UnifiedEngine --> VectorEngine : shares Arc
```
```rust
use query_router::QueryRouter;
use tensor_store::TensorStore;

// Create router with shared store - this initializes UnifiedEngine
let store = TensorStore::new();
let router = QueryRouter::with_shared_store(store);

// Verify UnifiedEngine is available
assert!(router.unified().is_some());

// Cross-engine Rust API methods delegate to UnifiedEngine
let results = router.find_neighbors_by_similarity("entity:1", &[0.1, 0.2], 10)?;
let results = router.find_similar_connected("query:1", "hub:1", 5)?;

// Query language commands also use the integrated engines
router.execute_parsed("ENTITY CREATE 'doc:1' {title: 'Hello'} EMBEDDING [0.1, 0.2]")?;
router.execute_parsed("ENTITY CONNECT 'user:1' -> 'doc:1' : authored")?;
router.execute_parsed("SIMILAR 'query:doc' CONNECTED TO 'user:1' LIMIT 5")?;
```
HNSW Optimization Path
When QueryRouter has an HNSW index, find_similar_connected uses it instead of
brute-force search:
```rust
// Use HNSW index if available, otherwise fall back to brute-force
let similar = if let Some((ref index, ref keys)) = self.hnsw_index {
    self.vector.search_with_hnsw(index, keys, &query_embedding, top_k * 2)
} else {
    self.vector.search_entities(&query_embedding, top_k * 2)
};
```
Performance
| Operation | Complexity | Notes |
|---|---|---|
| `create_entity` | O(1) | Single store put + optional embedding |
| `connect_entities` | O(1) | Three store operations (edge + 2 entity updates) |
| `get_entity` | O(1) | Single store get + optional embedding lookup |
| `find_similar_connected` | O(k log n) | HNSW search + graph intersection |
| `find_similar_connected` (brute) | O(n) | Linear scan when no HNSW index |
| `find_similar_connected_filtered` | O(m) | Pre-filter search, m = matching keys |
| `create_entity_unified` | O(1) | Single store with metadata |
| `get_entity_unified` | O(1) | Metadata lookup with fallback |
| `create_entity_in_collection` | O(1) | Collection-scoped store |
| `find_similar_in_collection` | O(c) | c = collection size |
| `find_neighbors_by_similarity` | O(d * k) | Neighbor fetch + k similarity computations |
| `find_nodes` | O(n) | Full scan with prefix filter |
| `find_edges` | O(e) | Full scan with prefix filter |
| `embed_batch` | O(b) | Sequential embedding storage |
| `create_entities_batch` | O(b) | Sequential entity creation |
Where:
- n = number of entities with embeddings
- d = average degree (number of neighbors)
- k = top-k results requested
- e = number of edges
- b = batch size
Benchmarks
From tensor_unified_bench.rs:
| Operation | 10 items | 100 items | 1000 items |
|---|---|---|---|
| `create_entity` | ~50us | ~500us | ~5ms |
| `embed_batch` | ~30us | ~300us | ~3ms |
| `find_nodes` | ~10us | ~100us | ~1ms |
| `UnifiedItem::new` | ~50ns | — | — |
| `UnifiedItem::with_data` | ~200ns | — | — |
Thread Safety
UnifiedEngine is thread-safe via:
- `Arc<VectorEngine>`, `Arc<GraphEngine>`, `Arc<RelationalEngine>`
- All underlying engines share the thread-safe TensorStore (DashMap)
- No lock poisoning (parking_lot semantics)
```rust
impl Clone for UnifiedEngine {
    fn clone(&self) -> Self {
        Self {
            store: self.store.clone(), // Arc<DashMap> clone
            relational: Arc::clone(&self.relational),
            graph: Arc::clone(&self.graph),
            vector: Arc::clone(&self.vector),
        }
    }
}
```
Safe concurrent patterns:
- Multiple readers on same entity
- Multiple writers on different entities
- Mixed reads/writes (DashMap shard locking)
Gotcha: Concurrent writes to the same entity may interleave fields. Use transactions for atomicity.
Configuration
UnifiedEngine uses the configuration of its underlying engines:
- `TensorStore`: Storage configuration
- `VectorEngine`: HNSW index parameters, similarity metrics
- `GraphEngine`: Graph traversal settings
- `RelationalEngine`: Table and index configuration
Best Practices
Entity Key Naming
Use prefixed keys to distinguish entity types:
```rust
"user:123"       // User entities
"doc:456"        // Document entities
"hub:main"       // Hub/aggregate entities
"edge:follows:1" // Edge entities (auto-generated)
```
Embedding Dimensions
Ensure consistent embedding dimensions across entities:
```rust
// Good: All entities use 384-dimensional embeddings
engine.create_entity("doc:1", fields, Some(vec![0.0; 384])).await?;
engine.create_entity("doc:2", fields, Some(vec![0.0; 384])).await?;

// Bad: Dimension mismatch causes similarity search to skip entities
engine.create_entity("doc:1", fields, Some(vec![0.0; 384])).await?;
engine.create_entity("doc:2", fields, Some(vec![0.0; 768])).await?; // Different dimension!
```
Cross-Engine Query Optimization
For find_similar_connected:
- Build HNSW index for large vector sets (>5000 entities)
- Ensure the `connected_to` entity has edges (empty neighbors return empty results)
- Note that `top_k * 2` candidates are fetched internally to account for filtering
For find_neighbors_by_similarity:
- Ensure neighbors have embeddings (no embedding = skipped)
- Use same dimension for query vector as stored embeddings
- Consider degree distribution (high-degree nodes = more similarity computations)
Related Modules
| Module | Relationship |
|---|---|
| `tensor_store` | Shared storage backend, provides TensorData and fields constants |
| `relational_engine` | Relational data, conditions for filtering |
| `graph_engine` | Graph connectivity, entity edges, neighbor queries |
| `vector_engine` | Embeddings, similarity search, HNSW index, FilterCondition, FilterValue, VectorCollectionConfig |
| `query_router` | Query execution, language integration, HNSW optimization, re-exports filter types |
Dependencies
- `tensor_store`: Core storage
- `relational_engine`: Table operations
- `graph_engine`: Graph operations
- `vector_engine`: Vector search
- `tokio`: Async runtime (multi-threaded)
- `futures`: Async utilities
- `serde`: Serialization for results and items
- `serde_json`: JSON output for `UnifiedResult`
Example: Code Intelligence System
From examples/code_search.rs:
```rust
use std::collections::HashMap;
use tensor_unified::UnifiedEngine;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let engine = UnifiedEngine::new();

    // Store functions with embeddings representing semantic meaning
    let mut props = HashMap::new();
    props.insert("type".to_string(), "function".to_string());
    props.insert("language".to_string(), "rust".to_string());

    // "process_data" - embedding represents data processing semantics
    engine.create_entity(
        "func:process_data",
        props.clone(),
        Some(vec![1.0, 0.9, 0.0, 0.0]),
    ).await?;

    // "validate_input" - embedding represents validation semantics
    engine.create_entity(
        "func:validate_input",
        props.clone(),
        Some(vec![0.0, 0.1, 0.9, 0.9]),
    ).await?;

    // Create call graph relationship
    engine.connect_entities("func:process_data", "func:validate_input", "CALLS").await?;

    // Find functions similar to "data processing" that call validate_input
    let results = engine.find_similar_connected(
        "func:process_data",   // Query by this function's embedding
        "func:validate_input", // Must be connected to validation
        5,
    ).await?;

    for item in results {
        println!("Found: {} (Score: {:.4})", item.id, item.score.unwrap_or(0.0));
    }

    Ok(())
}
```
Tensor Chain Architecture
Tensor-native blockchain with semantic conflict detection, hierarchical codebook-based validation, and Tensor-Raft distributed consensus. This is the most complex module in Neumann, providing distributed transaction coordination across a cluster of nodes.
Tensor Chain treats transactions as geometric objects in embedding space. Changes are represented as delta vectors, enabling similarity-based conflict detection and automatic merging of orthogonal transactions. The module integrates Raft consensus for leader election, two-phase commit (2PC) for cross-shard transactions, SWIM gossip for failure detection, and wait-for graph analysis for deadlock detection.
Key Concepts
Raft Consensus
Tensor-Raft extends the standard Raft consensus protocol with tensor-native optimizations:
- Similarity Fast-Path: Followers can skip full validation when block embeddings are similar (>0.95 cosine) to recent blocks from the same leader
- Geometric Tie-Breaking: During elections with equal logs, candidates with similar state embeddings to the cluster centroid are preferred
- Pre-Vote Phase: Prevents disruptive elections by requiring majority agreement before incrementing term
- Automatic Heartbeat: Background task spawned on leader election maintains quorum
The leader replicates log entries containing blocks to followers. Entries are committed when a quorum (majority) acknowledges them. Committed entries are applied to the chain state machine.
Raft State Machine
```mermaid
stateDiagram-v2
    [*] --> Follower: Node startup
    Follower --> Candidate: Election timeout
    Follower --> Follower: AppendEntries from leader
    Follower --> Follower: Higher term seen
    Candidate --> Leader: Received quorum votes
    Candidate --> Follower: Higher term seen
    Candidate --> Candidate: Election timeout (split vote)
    Leader --> Follower: Higher term seen
    Leader --> Follower: Lost quorum (heartbeat failure)

    note right of Follower
        Receives log entries
        Grants votes
        Resets election timer on heartbeat
    end note
    note right of Candidate
        Increments term
        Votes for self
        Requests votes from peers
    end note
    note right of Leader
        Proposes blocks
        Sends heartbeats
        Handles client requests
        Tracks replication progress
    end note
```
Pre-Vote Protocol
Pre-vote prevents disruptive elections from partitioned nodes:
```
Node A (partitioned, stale)              Healthy Cluster
     |                                        |
     |-- PreVote(term=5) -------------------->|
     |                                        |
     |<-- PreVoteResponse(granted=false) -----|
     |                                        |
     | Does NOT increment term                |
     | (prevents term inflation)              |
```
A pre-vote is granted only if:
- Candidate’s term >= our term
- Election timeout has elapsed (no recent leader heartbeat)
- Candidate’s log is at least as up-to-date as ours
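Putting the three grant conditions together, a decision function might look like the sketch below. The names, types, and signature are illustrative, not the module's actual API; the log comparison follows standard Raft up-to-date rules (higher last term wins, ties compare index):

```rust
// Illustrative request shape for a pre-vote.
struct PreVoteRequest {
    term: u64,
    last_log_index: u64,
    last_log_term: u64,
}

fn grant_pre_vote(
    req: &PreVoteRequest,
    our_term: u64,
    our_last_index: u64,
    our_last_term: u64,
    heard_from_leader_recently: bool,
) -> bool {
    // Condition 1: candidate's term >= ours
    let term_ok = req.term >= our_term;
    // Condition 2: our election timeout has elapsed (no recent heartbeat)
    let timeout_elapsed = !heard_from_leader_recently;
    // Condition 3: candidate's log is at least as up-to-date as ours
    let log_up_to_date = req.last_log_term > our_last_term
        || (req.last_log_term == our_last_term && req.last_log_index >= our_last_index);
    term_ok && timeout_elapsed && log_up_to_date
}

fn main() {
    // Stale partitioned candidate: shorter log term, leader still heartbeating.
    assert!(!grant_pre_vote(
        &PreVoteRequest { term: 5, last_log_index: 10, last_log_term: 4 },
        5, 20, 5, true,
    ));
    // Healthy candidate after a leader failure.
    assert!(grant_pre_vote(
        &PreVoteRequest { term: 5, last_log_index: 20, last_log_term: 5 },
        5, 20, 5, false,
    ));
}
```

Because the candidate never increments its term during pre-vote, a rejected request leaves the cluster's term untouched, which is exactly the term-inflation protection the diagram shows.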
Log Replication Flow
```mermaid
sequenceDiagram
    participant C as Client
    participant L as Leader
    participant F1 as Follower 1
    participant F2 as Follower 2

    C->>L: propose(block)
    par Replicate to followers
        L->>F1: AppendEntries(entries, prev_index, commit)
        L->>F2: AppendEntries(entries, prev_index, commit)
    end
    F1->>L: AppendEntriesResponse(success, match_index)
    F2->>L: AppendEntriesResponse(success, match_index)
    Note over L: Quorum achieved (2/3)
    L->>L: Update commit_index
    L->>L: Apply to state machine
    par Notify commit
        L->>F1: AppendEntries(commit_index updated)
        L->>F2: AppendEntries(commit_index updated)
    end
    L->>C: commit_success
```
Quorum Calculation
Quorum requires a strict majority of voting members:
```rust
pub fn quorum_size(total_nodes: usize) -> usize {
    (total_nodes / 2) + 1
}

// Examples:
// 3 nodes: quorum = 2
// 5 nodes: quorum = 3
// 7 nodes: quorum = 4
```
Fast-Path Validation
When enabled, followers can skip expensive block validation for similar blocks:
```rust
pub struct FastPathValidator {
    similarity_threshold: f32, // Default: 0.95
    min_history: usize,        // Default: 3 blocks
}

// Validation logic:
// 1. Check if we have enough history from this leader
// 2. Compute cosine similarity with recent embeddings
// 3. If similarity > threshold for all recent blocks:
//    - Skip full validation
//    - Record acceptance in stats
// 4. Otherwise: perform full validation
```
Two-Phase Commit (2PC)
Cross-shard distributed transactions use 2PC with delta-based conflict detection:
Phase 1 - PREPARE: Coordinator sends TxPrepareMsg to each participant
shard. Participants acquire locks, compute delta embeddings, and vote Yes,
No, or Conflict.
Phase 2 - COMMIT/ABORT: If all votes are Yes and cross-shard deltas are
orthogonal (cosine < 0.1), coordinator sends TxCommitMsg. Otherwise, sends
TxAbortMsg with retry.
Orthogonal transactions (operating on independent data dimensions) can commit in parallel without coordination, reducing contention.
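The orthogonality test reduces to a cosine check against the 0.1 threshold quoted above. The threshold is taken from the text; everything else in this sketch (plain `f32` delta vectors, function names) is illustrative, not the module's implementation:

```rust
// Cosine of the angle between two delta embeddings.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Deltas are "orthogonal" when |cosine| < threshold, meaning the two
// transactions touch largely independent dimensions.
fn deltas_orthogonal(a: &[f32], b: &[f32], threshold: f32) -> bool {
    cosine(a, b).abs() < threshold
}

fn main() {
    // Transactions writing disjoint dimensions: safe to commit in parallel.
    assert!(deltas_orthogonal(&[1.0, 0.0, 0.0], &[0.0, 1.0, 0.0], 0.1));
    // Overlapping writes: cosine near 1.0, so the coordinator must abort/retry.
    assert!(!deltas_orthogonal(&[1.0, 0.0, 0.0], &[0.9, 0.1, 0.0], 0.1));
}
```

Taking the absolute value also rejects strongly anti-correlated deltas (cosine near -1), which still indicate the transactions contend on the same dimensions.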
2PC Coordinator State Machine
```mermaid
stateDiagram-v2
    [*] --> Preparing: begin()
    Preparing --> Prepared: All votes YES + deltas orthogonal
    Preparing --> Aborting: Any vote NO/Conflict
    Preparing --> Aborting: Timeout
    Preparing --> Aborting: Cross-shard conflict detected
    Prepared --> Committing: commit()
    Prepared --> Aborting: abort()
    Committing --> Committed: All ACKs received
    Committing --> Committed: Timeout (presumed commit)
    Aborting --> Aborted: All ACKs received
    Aborting --> Aborted: Timeout (presumed abort)
    Committed --> [*]
    Aborted --> [*]
```
2PC Participant State Machine
```mermaid
stateDiagram-v2
    [*] --> Idle
    Idle --> LockAcquiring: TxPrepareMsg received
    LockAcquiring --> Locked: Locks acquired
    LockAcquiring --> VoteNo: Lock conflict
    Locked --> ConflictCheck: Compute delta
    ConflictCheck --> VoteYes: No conflicts
    ConflictCheck --> VoteConflict: Semantic conflict
    VoteYes --> WaitingDecision: Send YES vote
    VoteNo --> [*]: Send NO vote
    VoteConflict --> [*]: Send CONFLICT vote
    WaitingDecision --> Committed: TxCommitMsg
    WaitingDecision --> Aborted: TxAbortMsg
    WaitingDecision --> Aborted: Timeout
    Committed --> [*]: Release locks, apply ops
    Aborted --> [*]: Release locks, rollback
```
Lock Ordering (Deadlock Prevention)
The coordinator follows strict lock ordering to prevent internal deadlocks:
Lock acquisition order:
1. pending - Transaction state map
2. lock_manager.locks - Key-level locks
3. lock_manager.tx_locks - Per-transaction lock sets
4. pending_aborts - Abort queue
CRITICAL: Never acquire pending_aborts while holding pending
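A minimal sketch of that discipline (the field types here are illustrative, not the coordinator's actual state): drop the `pending` guard before touching `pending_aborts`, so the two locks are never held together:

```rust
use std::sync::Mutex;

// Illustrative coordinator state; the real struct holds richer types.
struct Coordinator {
    pending: Mutex<Vec<u64>>,        // acquired first
    pending_aborts: Mutex<Vec<u64>>, // never acquired while holding `pending`
}

impl Coordinator {
    fn queue_abort(&self, tx_id: u64) {
        // Scope the `pending` guard so it is released before we
        // take `pending_aborts` -- the CRITICAL rule above.
        {
            let mut pending = self.pending.lock().unwrap();
            pending.retain(|&id| id != tx_id);
        } // `pending` guard dropped here
        self.pending_aborts.lock().unwrap().push(tx_id);
    }
}

fn main() {
    let c = Coordinator {
        pending: Mutex::new(vec![1, 2, 3]),
        pending_aborts: Mutex::new(vec![]),
    };
    c.queue_abort(2);
    assert_eq!(*c.pending.lock().unwrap(), vec![1, 3]);
    assert_eq!(*c.pending_aborts.lock().unwrap(), vec![2]);
}
```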
WAL Recovery Protocol
The coordinator uses write-ahead logging for crash recovery:
```rust
// Recovery state machine:
// 1. Replay WAL to reconstruct pending transactions
// 2. For each transaction, determine recovery action:
match tx.phase {
    TxPhase::Preparing => {
        // Incomplete prepare - abort (presumed abort)
        tx.phase = TxPhase::Aborting;
    }
    TxPhase::Prepared => {
        // All YES votes recorded - check if we can commit
        if all_yes_votes && deltas_orthogonal {
            tx.phase = TxPhase::Committing;
        } else {
            tx.phase = TxPhase::Aborting;
        }
    }
    TxPhase::Committing => {
        // Continue commit - presumed commit
        complete_commit(tx);
    }
    TxPhase::Aborting => {
        // Continue abort
        complete_abort(tx);
    }
}
```
SWIM Gossip Protocol
Scalable membership management replaces O(N) sequential pings with O(log N) epidemic propagation:
- Peer Sampling: Select k peers per round (default: 3) using geometric routing
- LWW-CRDT State: Last-Writer-Wins conflict resolution with Lamport timestamps
- Suspicion Protocol: Direct ping failure triggers indirect probes via intermediaries. Suspicion timer (5s default) allows refutation before marking node as failed
Gossip Message Types
```rust
pub enum GossipMessage {
    /// Sync message with piggy-backed node states
    Sync {
        sender: NodeId,
        states: Vec<GossipNodeState>,
        sender_time: u64, // Lamport timestamp
    },
    /// Suspect a node of failure
    Suspect {
        reporter: NodeId,
        suspect: NodeId,
        incarnation: u64,
    },
    /// Refute suspicion by proving aliveness
    Alive {
        node_id: NodeId,
        incarnation: u64, // Incremented to refute
    },
    /// Indirect ping request (SWIM protocol)
    PingReq {
        origin: NodeId,
        target: NodeId,
        sequence: u64,
    },
    /// Indirect ping response
    PingAck {
        origin: NodeId,
        target: NodeId,
        sequence: u64,
        success: bool,
    },
}
```
LWW-CRDT State Merging
```rust
// State supersession rules:
impl GossipNodeState {
    pub fn supersedes(&self, other: &GossipNodeState) -> bool {
        // Incarnation takes precedence
        if self.incarnation != other.incarnation {
            self.incarnation > other.incarnation
        } else {
            // Same incarnation: higher timestamp wins
            self.timestamp > other.timestamp
        }
    }
}

// Merge algorithm:
pub fn merge(&mut self, incoming: &[GossipNodeState]) -> Vec<NodeId> {
    let mut changed = Vec::new();
    for state in incoming {
        match self.states.get(&state.node_id) {
            Some(existing) if state.supersedes(existing) => {
                self.states.insert(state.node_id.clone(), state.clone());
                changed.push(state.node_id.clone());
            }
            None => {
                self.states.insert(state.node_id.clone(), state.clone());
                changed.push(state.node_id.clone());
            }
            _ => {} // Existing state is newer, ignore
        }
    }
    // Sync Lamport time to max + 1
    if let Some(max_ts) = incoming.iter().map(|s| s.timestamp).max() {
        self.lamport_time = self.lamport_time.max(max_ts) + 1;
    }
    changed
}
```
SWIM Failure Detection Flow
```mermaid
sequenceDiagram
    participant A as Node A
    participant B as Node B (suspect)
    participant C as Node C (intermediary)
    participant D as Node D (intermediary)
    A->>B: Direct Ping
    Note over B: No response (timeout)
    par Indirect probes
        A->>C: PingReq(target=B)
        A->>D: PingReq(target=B)
    end
    C->>B: Ping (on behalf of A)
    D->>B: Ping (on behalf of A)
    alt B responds to C
        B->>C: Pong
        C->>A: PingAck(success=true)
        Note over A: B is healthy
    else All indirect pings fail
        C->>A: PingAck(success=false)
        D->>A: PingAck(success=false)
        Note over A: Start suspicion timer (5s)
        A->>A: Broadcast Suspect(B)
        alt B refutes within 5s
            B->>A: Alive(incarnation++)
            Note over A: Cancel suspicion
        else Timer expires
            Note over A: Mark B as Failed
        end
    end
```
Incarnation Number Protocol
```
Scenario: Node B receives Suspect about itself

  B's current incarnation:     5
  Suspect message incarnation: 5

  B increments:  incarnation = 6
  B broadcasts:  Alive { node_id: B, incarnation: 6 }

  All nodes receiving Alive update B's state:
    - incarnation: 6
    - health: Healthy
    - timestamp: <lamport_time++>
```
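The refutation step can be sketched as follows; the type and field names here are illustrative, not the `gossip.rs` API:

```rust
// Hypothetical sketch of SWIM incarnation refutation.
#[derive(Debug, PartialEq)]
enum Health { Healthy, Suspected }

struct LocalNode {
    incarnation: u64,
    health: Health,
}

impl LocalNode {
    /// On receiving a Suspect about ourselves at `suspect_inc`,
    /// refute by jumping past it; the returned incarnation is
    /// broadcast in an Alive message.
    fn refute(&mut self, suspect_inc: u64) -> u64 {
        if suspect_inc >= self.incarnation {
            self.incarnation = suspect_inc + 1; // supersede the suspicion
        }
        self.health = Health::Healthy;
        self.incarnation
    }
}

fn main() {
    // Matches the scenario above: incarnation 5 suspected at 5,
    // refuted with incarnation 6.
    let mut b = LocalNode { incarnation: 5, health: Health::Suspected };
    let alive = b.refute(5);
    assert_eq!(alive, 6);
    assert_eq!(b.health, Health::Healthy);
}
```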
Deadlock Detection
Wait-for graph tracks transaction dependencies for cycle detection:
- Edge `A -> B` added when transaction A blocks waiting for B to release locks
- Periodic DFS traversal detects cycles (deadlocks)
- Victim selected based on policy (youngest, oldest, lowest priority, or most locks)
- Victim transaction aborted to break the cycle
Wait-For Graph Structure
```rust
pub struct WaitForGraph {
    /// Maps tx_id -> set of tx_ids it is waiting for
    edges: HashMap<u64, HashSet<u64>>,
    /// Reverse edges for O(1) removal: holder -> waiters
    reverse_edges: HashMap<u64, HashSet<u64>>,
    /// Timestamp when wait started (for victim selection)
    wait_started: HashMap<u64, EpochMillis>,
    /// Priority values (lower = higher priority)
    priorities: HashMap<u64, u32>,
}
```
DFS Cycle Detection Algorithm
```rust
fn dfs_detect(
    &self,
    node: u64,
    edges: &HashMap<u64, HashSet<u64>>,
    visited: &mut HashSet<u64>,
    rec_stack: &mut HashSet<u64>, // Current recursion path
    path: &mut Vec<u64>,          // Explicit path for extraction
    cycles: &mut Vec<Vec<u64>>,
) {
    visited.insert(node);
    rec_stack.insert(node);
    path.push(node);

    if let Some(neighbors) = edges.get(&node) {
        for &neighbor in neighbors {
            if !visited.contains(&neighbor) {
                // Continue DFS on unvisited
                self.dfs_detect(neighbor, edges, visited, rec_stack, path, cycles);
            } else if rec_stack.contains(&neighbor) {
                // Back-edge to ancestor = cycle found!
                if let Some(cycle_start) = path.iter().position(|&n| n == neighbor) {
                    cycles.push(path[cycle_start..].to_vec());
                }
            }
        }
    }

    path.pop();
    rec_stack.remove(&node);
}
```
Victim Selection Policies
| Policy | Selection Criteria | Trade-off |
|---|---|---|
| Youngest | Most recent wait start (highest timestamp) | Minimizes wasted work, may starve long transactions |
| Oldest | Earliest wait start (lowest timestamp) | Prevents starvation, wastes more completed work |
| LowestPriority | Highest priority value | Business-rule based, requires priority assignment |
| MostLocks | Transaction holding most locks | Maximizes freed resources, may abort complex transactions |
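A hedged sketch of selection over a detected cycle; the real `DeadlockDetector` in `deadlock.rs` carries more state, and these names are illustrative:

```rust
// Illustrative victim selection for three of the policies above.
#[derive(Clone, Copy)]
enum Policy { Youngest, Oldest, MostLocks }

struct TxInfo {
    tx_id: u64,
    wait_started: u64, // epoch millis when the wait began
    locks_held: usize,
}

fn select_victim(cycle: &[TxInfo], policy: Policy) -> u64 {
    let pick = match policy {
        Policy::Youngest => cycle.iter().max_by_key(|t| t.wait_started),
        Policy::Oldest => cycle.iter().min_by_key(|t| t.wait_started),
        Policy::MostLocks => cycle.iter().max_by_key(|t| t.locks_held),
    };
    pick.expect("a detected cycle is never empty").tx_id
}

fn main() {
    let cycle = [
        TxInfo { tx_id: 1, wait_started: 100, locks_held: 4 },
        TxInfo { tx_id: 2, wait_started: 250, locks_held: 1 },
    ];
    assert_eq!(select_victim(&cycle, Policy::Youngest), 2);
    assert_eq!(select_victim(&cycle, Policy::Oldest), 1);
    assert_eq!(select_victim(&cycle, Policy::MostLocks), 1);
}
```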
Ed25519 Signing and Identity
Cryptographic identity binding ensures message authenticity and enables geometric routing:
Identity Generation and NodeId Derivation
```rust
pub struct Identity {
    signing_key: SigningKey, // Ed25519 private key (zeroized on drop)
}

impl Identity {
    pub fn generate() -> Self {
        let signing_key = SigningKey::generate(&mut OsRng);
        Self { signing_key }
    }

    /// NodeId = BLAKE2b-128(domain || public_key)
    /// 16 bytes = 32 hex characters
    pub fn node_id(&self) -> NodeId {
        let mut hasher = Blake2b::<U16>::new();
        hasher.update(b"neumann_node_id_v1");
        hasher.update(self.signing_key.verifying_key().as_bytes());
        hex::encode(hasher.finalize())
    }

    /// Embedding = BLAKE2b-512(domain || public_key) -> 16 f32 coords
    /// Normalized to [-1, 1] for geometric operations
    pub fn to_embedding(&self) -> SparseVector {
        let mut hasher = Blake2b::<U64>::new();
        hasher.update(b"neumann_node_embedding_v1");
        hasher.update(self.signing_key.verifying_key().as_bytes());
        let hash = hasher.finalize();

        // 64 bytes -> 16 f32 coordinates
        let coords: Vec<f32> = hash.chunks(4)
            .map(|c| {
                let bits = u32::from_le_bytes([c[0], c[1], c[2], c[3]]);
                (bits as f64 / u32::MAX as f64 * 2.0 - 1.0) as f32
            })
            .collect();
        SparseVector::from_dense(&coords)
    }
}
```
Signed Message Envelope
```rust
pub struct SignedMessage {
    pub sender: NodeId,       // Derived from public key
    pub public_key: [u8; 32], // Ed25519 verifying key
    pub payload: Vec<u8>,     // Message content
    pub signature: Vec<u8>,   // 64-byte Ed25519 signature
    pub sequence: u64,        // Replay protection
    pub timestamp_ms: u64,    // Freshness check
}

// Signature covers: sender || sequence || timestamp || payload
// This binds identity, ordering, and content together
```
Replay Protection
```rust
pub struct SequenceTracker {
    sequences: DashMap<NodeId, (u64, Instant)>,
    config: SequenceTrackerConfig,
}

impl SequenceTracker {
    pub fn check_and_record(
        &self,
        sender: &NodeId,
        sequence: u64,
        timestamp_ms: u64,
    ) -> Result<()> {
        // 1. Reject messages from the future (allow 1 min clock skew)
        if timestamp_ms > now_ms + 60_000 {
            return Err("message timestamp is in the future");
        }
        // 2. Reject stale messages (default: 5 min max age)
        if now_ms > timestamp_ms + self.config.max_age_ms {
            return Err("message too old");
        }
        // 3. Check sequence number is strictly increasing
        let entry = self.sequences.entry(sender.clone()).or_insert((0, now));
        if sequence <= entry.0 {
            return Err("replay detected: sequence <= last seen");
        }
        *entry = (sequence, now);
        Ok(())
    }
}
```
Message Validation Pipeline
All incoming messages pass through validation before processing:
```mermaid
flowchart TB
    subgraph Validation["Message Validation Pipeline"]
        Input["Incoming Message"]
        NodeIdCheck["Validate NodeId Format"]
        TypeDispatch["Dispatch by Type"]
        TermCheck["Validate Term Bounds"]
        ShardCheck["Validate Shard ID"]
        TimeoutCheck["Validate Timeout"]
        EmbeddingCheck["Validate Embedding"]
        SignatureCheck["Validate Signature"]
        Accept["Accept Message"]
        Reject["Reject with Error"]
    end
    Input --> NodeIdCheck
    NodeIdCheck -->|Invalid| Reject
    NodeIdCheck -->|Valid| TypeDispatch
    TypeDispatch -->|Raft| TermCheck
    TypeDispatch -->|2PC| ShardCheck
    TypeDispatch -->|Signed| SignatureCheck
    TermCheck -->|Invalid| Reject
    TermCheck -->|Valid| EmbeddingCheck
    ShardCheck -->|Invalid| Reject
    ShardCheck -->|Valid| TimeoutCheck
    TimeoutCheck -->|Invalid| Reject
    TimeoutCheck -->|Valid| EmbeddingCheck
    EmbeddingCheck -->|Invalid| Reject
    EmbeddingCheck -->|Valid| Accept
    SignatureCheck -->|Invalid| Reject
    SignatureCheck -->|Valid| Accept
```
Embedding Validation
```rust
pub struct EmbeddingValidator {
    max_dimension: usize, // Default: 65,536
    max_magnitude: f32,   // Default: 1,000,000
}

impl EmbeddingValidator {
    pub fn validate(&self, embedding: &SparseVector, field: &str) -> Result<()> {
        // 1. Dimension bounds
        if embedding.dimension() == 0 {
            return Err("dimension cannot be zero");
        }
        if embedding.dimension() > self.max_dimension {
            return Err("dimension exceeds maximum");
        }
        // 2. NaN/Inf detection (prevents computation errors)
        for (i, value) in embedding.values().iter().enumerate() {
            if value.is_nan() {
                return Err(format!("NaN value at position {}", i));
            }
            if value.is_infinite() {
                return Err(format!("infinite value at position {}", i));
            }
        }
        // 3. Magnitude bounds (prevents DoS via huge vectors)
        if embedding.magnitude() > self.max_magnitude {
            return Err("magnitude exceeds maximum");
        }
        // 4. Position validity (sorted, within bounds)
        let positions = embedding.positions();
        for (i, &pos) in positions.iter().enumerate() {
            if pos as usize >= embedding.dimension() {
                return Err("position out of bounds");
            }
            if i > 0 && positions[i - 1] >= pos {
                return Err("positions not strictly sorted");
            }
        }
        Ok(())
    }
}
```
Semantic Conflict Detection
The consensus manager uses hybrid detection combining angular and structural similarity:
Conflict Classification Algorithm
```rust
pub fn detect_conflict(&self, d1: &DeltaVector, d2: &DeltaVector) -> ConflictResult {
    let cosine = d1.cosine_similarity(d2);
    let jaccard = d1.structural_similarity(d2); // Jaccard index
    let overlapping_keys = d1.overlapping_keys(d2);
    let all_keys_overlap = overlapping_keys.len() == d1.affected_keys.len()
        && overlapping_keys.len() == d2.affected_keys.len();

    // Classification hierarchy:
    let (class, action) = if cosine >= 0.99 && all_keys_overlap {
        // Identical: same direction, same keys
        (ConflictClass::Identical, MergeAction::Deduplicate)
    } else if cosine <= -0.95 && all_keys_overlap {
        // Opposite: cancel out (A + (-A) = 0)
        (ConflictClass::Opposite, MergeAction::Cancel)
    } else if cosine.abs() < 0.1 && jaccard < 0.5 {
        // Truly orthogonal: different directions AND different positions
        (ConflictClass::Orthogonal, MergeAction::VectorAdd)
    } else if cosine >= 0.7 {
        // Angular conflict: pointing same direction
        (ConflictClass::Conflicting, MergeAction::Reject)
    } else if jaccard >= 0.5 {
        // Structural conflict: same positions modified
        // Catches conflicts that cosine misses
        (ConflictClass::Conflicting, MergeAction::Reject)
    } else if !overlapping_keys.is_empty() {
        // Key overlap without structural/angular conflict
        (ConflictClass::Ambiguous, MergeAction::Reject)
    } else {
        // Low conflict: merge with weighted average
        (ConflictClass::LowConflict, MergeAction::WeightedAverage { weight1: 50, weight2: 50 })
    };

    ConflictResult { class, cosine, jaccard, overlapping_keys, action, .. }
}
```
Merge Operations
```rust
impl DeltaVector {
    /// Vector addition for orthogonal deltas
    pub fn add(&self, other: &DeltaVector) -> DeltaVector {
        let delta = self.delta.add(&other.delta);
        let keys = self.affected_keys.union(&other.affected_keys).cloned().collect();
        DeltaVector::from_sparse(delta, keys, 0)
    }

    /// Weighted average for low-conflict deltas
    pub fn weighted_average(&self, other: &DeltaVector, w1: f32, w2: f32) -> DeltaVector {
        let total = w1 + w2;
        if total == 0.0 {
            return DeltaVector::zero(0);
        }
        let delta = self.delta.weighted_average(&other.delta, w1, w2);
        let keys = self.affected_keys.union(&other.affected_keys).cloned().collect();
        DeltaVector::from_sparse(delta, keys, 0)
    }

    /// Project out conflicting component
    pub fn project_non_conflicting(&self, conflict_direction: &SparseVector) -> DeltaVector {
        let delta = self.delta.project_orthogonal(conflict_direction);
        DeltaVector::from_sparse(delta, self.affected_keys.clone(), self.tx_id)
    }
}
```
Types Reference
Core Types
| Type | Module | Description |
|---|---|---|
| TensorChain | lib.rs | Main API for chain operations, transaction management |
| Block | block.rs | Block structure with header, transactions, signatures |
| BlockHeader | block.rs | Height, prev_hash, delta_embedding, quantized_codes |
| Transaction | block.rs | Put, Delete, Update operations |
| ChainConfig | lib.rs | Node ID, max transactions, conflict threshold, auto-merge |
| ChainError | error.rs | Error types for all chain operations |
| ChainMetrics | lib.rs | Aggregated metrics from all components |
Consensus Types
| Type | Module | Description |
|---|---|---|
| RaftNode | raft.rs | Raft state machine with leader election, log replication |
| RaftState | raft.rs | Follower, Candidate, or Leader |
| RaftConfig | raft.rs | Election timeout, heartbeat interval, fast-path settings |
| RaftStats | raft.rs | Fast-path acceptance, heartbeat timing, quorum tracking |
| QuorumTracker | raft.rs | Tracks heartbeat responses to detect quorum loss |
| SnapshotMetadata | raft.rs | Log compaction point with hash and membership config |
| LogEntry | network.rs | Raft log entry with term, index, and data |
| ConsensusManager | consensus.rs | Semantic conflict detection and transaction merging |
| DeltaVector | consensus.rs | Sparse delta embedding with affected keys |
| ConflictClass | consensus.rs | Orthogonal, LowConflict, Ambiguous, Conflicting, Identical, Opposite |
| FastPathValidator | validation.rs | Block similarity validation for fast-path acceptance |
| FastPathState | raft.rs | Per-leader embedding history for fast-path |
| TransferState | raft.rs | Active leadership transfer tracking |
| HeartbeatStats | raft.rs | Heartbeat success/failure counters |
Distributed Transaction Types
| Type | Module | Description |
|---|---|---|
| DistributedTxCoordinator | distributed_tx.rs | 2PC coordinator with timeout and retry |
| DistributedTransaction | distributed_tx.rs | Transaction spanning multiple shards |
| TxPhase | distributed_tx.rs | Preparing, Prepared, Committing, Committed, Aborting, Aborted |
| PrepareVote | distributed_tx.rs | Yes (with lock handle), No (with reason), Conflict |
| LockManager | distributed_tx.rs | Key-level locking for transaction isolation |
| KeyLock | distributed_tx.rs | Lock on a key with timeout and handle |
| TxWal | tx_wal.rs | Write-ahead log for crash recovery |
| TxWalEntry | tx_wal.rs | WAL entry types: TxBegin, PrepareVote, PhaseChange, TxComplete |
| TxRecoveryState | tx_wal.rs | Reconstructed state from WAL replay |
| PrepareRequest | distributed_tx.rs | Request to prepare a transaction on a shard |
| CommitRequest | distributed_tx.rs | Request to commit a prepared transaction |
| AbortRequest | distributed_tx.rs | Request to abort a transaction |
| CoordinatorState | distributed_tx.rs | Serializable coordinator state for persistence |
| ParticipantState | distributed_tx.rs | Serializable participant state for persistence |
Gossip Types
| Type | Module | Description |
|---|---|---|
| GossipMembershipManager | gossip.rs | SWIM-style gossip with signing support |
| GossipConfig | gossip.rs | Fanout, interval, suspicion timeout, signature requirements |
| GossipMessage | gossip.rs | Sync, Suspect, Alive, PingReq, PingAck |
| GossipNodeState | gossip.rs | Node health, Lamport timestamp, incarnation |
| LWWMembershipState | gossip.rs | CRDT for conflict-free state merging |
| PendingSuspicion | gossip.rs | Suspicion timer tracking |
| HealProgress | gossip.rs | Recovery tracking for partitioned nodes |
| SignedGossipMessage | signing.rs | Gossip message with Ed25519 signature |
Deadlock Detection Types
| Type | Module | Description |
|---|---|---|
| DeadlockDetector | deadlock.rs | Cycle detection with configurable victim selection |
| WaitForGraph | deadlock.rs | Directed graph of transaction dependencies |
| DeadlockInfo | deadlock.rs | Detected cycle with selected victim |
| VictimSelectionPolicy | deadlock.rs | Youngest, Oldest, LowestPriority, MostLocks |
| DeadlockStats | deadlock.rs | Detection timing and cycle length statistics |
| WaitInfo | deadlock.rs | Lock conflict information for wait-graph edges |
Identity and Signing Types
| Type | Module | Description |
|---|---|---|
| Identity | signing.rs | Ed25519 private key (zeroized on drop) |
| PublicIdentity | signing.rs | Ed25519 public key for verification |
| SignedMessage | signing.rs | Message envelope with signature and replay protection |
| ValidatorRegistry | signing.rs | Registry of known validator public keys |
| SequenceTracker | signing.rs | Replay attack detection via sequence numbers |
| SequenceTrackerConfig | signing.rs | Max age, max entries, cleanup interval |
Message Validation Types
| Type | Module | Description |
|---|---|---|
| MessageValidationConfig | message_validation.rs | Bounds for DoS prevention |
| CompositeValidator | message_validation.rs | Validates all message types |
| EmbeddingValidator | message_validation.rs | Checks dimension, magnitude, NaN/Inf |
| MessageValidator | message_validation.rs | Trait for pluggable validation |
Architecture Diagram
```mermaid
flowchart TB
    subgraph Client["Client Layer"]
        TensorChain["TensorChain API"]
        TransactionWorkspace["Transaction Workspace"]
    end
    subgraph Consensus["Consensus Layer"]
        RaftNode["Raft Node"]
        ConsensusManager["Consensus Manager"]
        FastPath["Fast-Path Validator"]
    end
    subgraph Network["Network Layer"]
        Transport["Transport Trait"]
        TcpTransport["TCP Transport"]
        MemoryTransport["Memory Transport"]
        MessageValidator["Message Validator"]
    end
    subgraph Membership["Membership Layer"]
        GossipManager["Gossip Manager"]
        MembershipManager["Membership Manager"]
        GeometricMembership["Geometric Membership"]
    end
    subgraph DistTx["Distributed Transactions"]
        Coordinator["2PC Coordinator"]
        LockManager["Lock Manager"]
        DeadlockDetector["Deadlock Detector"]
        TxWal["Transaction WAL"]
    end
    subgraph Storage["Storage Layer"]
        Chain["Chain (Graph Engine)"]
        Codebook["Codebook Manager"]
        RaftWal["Raft WAL"]
    end
    TensorChain --> TransactionWorkspace
    TensorChain --> ConsensusManager
    TensorChain --> Codebook
    TransactionWorkspace --> Chain
    RaftNode --> Transport
    RaftNode --> ConsensusManager
    RaftNode --> FastPath
    RaftNode --> RaftWal
    Transport --> TcpTransport
    Transport --> MemoryTransport
    TcpTransport --> MessageValidator
    GossipManager --> Transport
    GossipManager --> MembershipManager
    MembershipManager --> GeometricMembership
    Coordinator --> LockManager
    Coordinator --> DeadlockDetector
    Coordinator --> Transport
    Coordinator --> TxWal
    LockManager --> DeadlockDetector
    Chain --> Codebook
```
Subsystems
Consensus Subsystem
The Raft consensus implementation provides strong consistency guarantees:
State Machine:
- Follower: Receives AppendEntries from leader, grants votes
- Candidate: Requests votes after election timeout
- Leader: Proposes blocks, sends heartbeats, handles client requests
Log Replication:
```
Leader: propose(block) -> AppendEntries to followers
                       -> Wait for quorum acknowledgment
                       -> Update commit_index
                       -> Apply to state machine
```
Fast-Path Validation: When enabled and block embedding similarity exceeds threshold (default 0.95), followers skip full validation. This optimization assumes semantically similar blocks from the same leader are likely valid.
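The fast-path gate can be sketched as a pure function (illustrative names, dense vectors instead of the real sparse embeddings): skip full validation only when enough history exists and every recent embedding from this leader clears the threshold:

```rust
// Hedged sketch of the fast-path decision; FastPathValidator in
// validation.rs tracks per-leader history and richer stats.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

/// True when full validation can be skipped for `block`.
fn fast_path_eligible(
    history: &[Vec<f32>], // recent embeddings from this leader
    block: &[f32],        // new block's delta embedding
    threshold: f32,       // default 0.95
    min_history: usize,   // default 3
) -> bool {
    history.len() >= min_history
        && history.iter().all(|h| cosine(h, block) > threshold)
}

fn main() {
    let history = vec![vec![1.0, 0.0], vec![0.99, 0.05], vec![0.98, 0.1]];
    // Similar block: fast path taken
    assert!(fast_path_eligible(&history, &[1.0, 0.02], 0.95, 3));
    // Dissimilar block: falls back to full validation
    assert!(!fast_path_eligible(&history, &[0.0, 1.0], 0.95, 3));
}
```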
Log Compaction:
After snapshot_threshold entries (default 10,000), a snapshot captures the
state machine at the commit point. Entries before the snapshot can be truncated,
keeping only snapshot_trailing_logs entries for followers catching up.
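A simplified sketch of the truncation step, assuming a plain `Vec` log and illustrative names (the real implementation compacts WAL segments rather than an in-memory vector):

```rust
/// Drop entries before (commit_index - trailing), keeping
/// `trailing` entries behind the snapshot point for followers
/// that are catching up.
fn truncate_log(log: &mut Vec<u64>, commit_index: usize, trailing: usize) {
    let cut = commit_index.saturating_sub(trailing);
    log.drain(..cut.min(log.len()));
}

fn main() {
    // 10,050 entries, snapshot taken at commit_index 10,000,
    // snapshot_trailing_logs = 100.
    let mut log: Vec<u64> = (0..10_050).collect();
    truncate_log(&mut log, 10_000, 100);
    assert_eq!(log.len(), 150); // 100 trailing + 50 after the snapshot
    assert_eq!(log[0], 9_900);  // first retained entry
}
```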
Distributed Transactions Subsystem
Cross-shard coordination uses two-phase commit with tensor-native conflict detection:
Phase 1 - Prepare:
```
Coordinator                         Participant (per shard)
     |                                   |
     |--- TxPrepareMsg ----------------->|
     |    (ops, delta_embedding)         |
     |                                   |-- acquire locks
     |                                   |-- compute local delta
     |                                   |-- check conflicts
     |<--- TxPrepareResponse ------------|
     |     (Yes/No/Conflict)             |
```
Phase 2 - Commit or Abort:
```
If all Yes AND deltas orthogonal:
     |--- TxCommitMsg ------------------>| -- release locks, apply ops
     |<--- TxAckMsg ---------------------|
Otherwise:
     |--- TxAbortMsg ------------------->| -- release locks, rollback
     |<--- TxAckMsg ---------------------|
```
Conflict Detection: Uses hybrid detection combining cosine similarity (angular conflict) and Jaccard index (structural conflict):
| Cosine | Jaccard | Classification | Action |
|---|---|---|---|
| < 0.1 | < 0.5 | Orthogonal | Auto-merge (vector add) |
| 0.1-0.7 | < 0.5 | LowConflict | Weighted merge |
| >= 0.7 | any | Conflicting | Reject |
| any | >= 0.5 | Conflicting | Reject (structural) |
| >= 0.99 | all keys | Identical | Deduplicate |
| <= -0.95 | all keys | Opposite | Cancel (no-op) |
Gossip Protocol Subsystem
SWIM-style failure detection with LWW-CRDT state:
Gossip Round:
1. Select k peers (fanout=3) using geometric routing
2. Send Sync message with piggybacked node states
3. Merge received states (higher incarnation wins)
4. Update Lamport time
Failure Detection:
```
Direct ping failed
        |
        v
Send PingReq to k intermediaries
        |
        v
All indirect pings failed?
  |-- No  --> Mark healthy
  |-- Yes --> Start suspicion timer (5s)
        |
        v
Timer expired without Alive?
  |-- No  --> Mark healthy (refuted)
  |-- Yes --> Mark failed
```
Incarnation Numbers: When a node receives a Suspect about itself, it increments its incarnation and broadcasts Alive to refute the suspicion.
Deadlock Detection Subsystem
Wait-for graph analysis for cycle detection:
Graph Structure:
Edge: waiter_tx -> holder_tx
Meaning: waiter is blocked waiting for holder to release locks
Detection Algorithm (DFS back-edge search):
1. For each unvisited node, start DFS
2. Track recursion stack for back-edge detection
3. Back-edge to ancestor = cycle found
4. Extract cycle path for victim selection
Victim Selection Policies:
- Youngest: Abort most recent transaction (minimize wasted work)
- Oldest: Abort earliest transaction (prevent starvation)
- LowestPriority: Abort transaction with highest priority value
- MostLocks: Abort transaction holding most locks (minimize cascade)
Configuration Options
RaftConfig
| Field | Default | Description |
|---|---|---|
| election_timeout | (150, 300) | Random timeout range in ms |
| heartbeat_interval | 50 | Heartbeat interval in ms |
| similarity_threshold | 0.95 | Fast-path similarity threshold |
| enable_fast_path | true | Enable fast-path validation |
| enable_pre_vote | true | Enable pre-vote phase |
| enable_geometric_tiebreak | true | Enable geometric tie-breaking |
| geometric_tiebreak_threshold | 0.3 | Minimum similarity for tiebreak |
| snapshot_threshold | 10,000 | Entries before compaction |
| snapshot_trailing_logs | 100 | Entries to keep after snapshot |
| snapshot_chunk_size | 1MB | Chunk size for snapshot transfer |
| transfer_timeout_ms | 1,000 | Leadership transfer timeout |
| compaction_check_interval | 10 | Ticks between compaction checks |
| compaction_cooldown_ms | 60,000 | Minimum time between compactions |
| snapshot_max_memory | 256MB | Max memory for snapshot buffering |
| auto_heartbeat | true | Spawn heartbeat task on leader election |
| max_heartbeat_failures | 3 | Failures before logging warning |
DistributedTxConfig
| Field | Default | Description |
|---|---|---|
| max_concurrent | 100 | Maximum concurrent transactions |
| prepare_timeout_ms | 5,000 | Prepare phase timeout |
| commit_timeout_ms | 10,000 | Commit phase timeout |
| orthogonal_threshold | 0.1 | Cosine threshold for orthogonality |
| optimistic_locking | true | Enable semantic conflict detection |
GossipConfig
| Field | Default | Description |
|---|---|---|
| fanout | 3 | Peers per gossip round |
| gossip_interval_ms | 200 | Interval between rounds |
| suspicion_timeout_ms | 5,000 | Time before failure declaration |
| max_states_per_message | 20 | State limit per message |
| geometric_routing | true | Use embedding-based peer selection |
| indirect_ping_count | 3 | Indirect pings on direct failure |
| indirect_ping_timeout_ms | 500 | Timeout for indirect pings |
| require_signatures | false | Require Ed25519 signatures |
| max_message_age_ms | 300,000 | Maximum signed message age |
DeadlockDetectorConfig
| Field | Default | Description |
|---|---|---|
| enabled | true | Enable deadlock detection |
| detection_interval_ms | 100 | Detection cycle interval |
| victim_policy | Youngest | Victim selection policy |
| max_cycle_length | 100 | Maximum detectable cycle length |
| auto_abort_victim | true | Automatically abort victim |
MessageValidationConfig
| Field | Default | Description |
|---|---|---|
| enabled | true | Enable validation |
| max_term | u64::MAX - 1 | Prevent overflow attacks |
| max_shard_id | 65,536 | Bound shard addressing |
| max_tx_timeout_ms | 300,000 | Maximum transaction timeout |
| max_node_id_len | 256 | Maximum node ID length |
| max_key_len | 4,096 | Maximum key length |
| max_embedding_dimension | 65,536 | Prevent huge allocations |
| max_embedding_magnitude | 1,000,000 | Detect invalid values |
| max_query_len | 1MB | Maximum query string length |
| max_message_age_ms | 300,000 | Reject stale/replayed messages |
| max_blocks_per_request | 1,000 | Limit block range requests |
| max_snapshot_chunk_size | 10MB | Limit snapshot chunk size |
Edge Cases and Gotchas
Raft Edge Cases
- Split Vote: When multiple candidates split the vote evenly, election timeout triggers a new election. Randomized timeouts (150-300ms) reduce collision probability.
- Network Partition: During a partition, the minority side cannot commit (lacks quorum). Pre-vote prevents term inflation when the partition heals.
- Stale Leader: A partitioned leader may not know it lost leadership. The quorum tracker detects heartbeat failures and steps down.
- Log Divergence: Followers with divergent logs are overwritten by the leader's log (consistency > availability).
- Snapshot During Election: Snapshot transfer continues even if leadership changes. The new leader may need to resend the snapshot.
2PC Edge Cases
- Coordinator Failure After Prepare: Participants holding locks may time out. WAL recovery allows a new coordinator to resume.
- Participant Failure: Coordinator times out waiting for the vote. The transaction aborts; the participant recovers from its WAL on restart.
- Network Partition Between Phases: Commit messages may not reach all participants. A retry loop ensures eventual delivery.
- Lock Timeout vs Transaction Timeout: Lock timeout (30s) should exceed transaction timeout (5s) to prevent premature lock release.
- Orphaned Locks: Locks from crashed transactions are cleaned up by periodic `cleanup_expired()` or WAL recovery.
Gossip Edge Cases
- Incarnation Overflow: Theoretically possible with u64, but requires 2^64 restarts. Practically impossible.
- Clock Skew: Lamport timestamps are logical, not wall-clock. Sync messages update local Lamport time to `max(local, remote) + 1`.
- Signature Replay: Sequence numbers and timestamp freshness checks prevent replaying old signed messages.
- Rapid Restart: A node restarting rapidly may have a lower incarnation than its suspected state. A new incarnation on restart resolves this.
Conflict Detection Edge Cases
- Zero Vector: Empty deltas (no changes) have undefined cosine similarity. Treated as orthogonal.
- Nearly Identical: Transactions with 0.99 < similarity < 1.0 may conflict. Use structural overlap (Jaccard) as a secondary check.
- Large Dimension Mismatch: Deltas with different dimensions cannot be directly compared. Pad the smaller to match the larger.
Recovery Procedures
Raft Recovery from WAL
```rust
// 1. Open WAL and replay entries
let wal = RaftWal::open(wal_path)?;
let recovery = RaftRecoveryState::from_wal(&wal)?;

// 2. Restore term and voted_for
node.current_term = recovery.current_term;
node.voted_for = recovery.voted_for;

// 3. Validate snapshot if present
if let Some((meta, data)) = load_snapshot() {
    let computed_hash = sha256(&data);
    if computed_hash == meta.snapshot_hash {
        // Valid snapshot - restore state machine
        apply_snapshot(meta, data);
    } else {
        // Corrupted snapshot - ignore
        warn!("Snapshot hash mismatch, starting fresh");
    }
}

// 4. Start as follower
node.state = RaftState::Follower;
```
2PC Coordinator Recovery
```rust
// 1. Replay WAL to reconstruct pending transactions
let recovery = TxRecoveryState::from_wal(&wal)?;

// 2. Process each transaction based on phase
for tx in recovery.prepared_txs {
    // All YES votes - resume commit
    coordinator.pending.insert(tx.tx_id, restore_tx(tx, TxPhase::Prepared));
}
for tx in recovery.committing_txs {
    // Was committing - complete commit
    coordinator.complete_commit(tx.tx_id)?;
}
for tx in recovery.aborting_txs {
    // Was aborting - complete abort
    coordinator.complete_abort(tx.tx_id)?;
}

// 3. Timed out transactions default to abort (presumed abort)
for tx in recovery.timed_out_txs {
    coordinator.abort(tx.tx_id, "recovered - timeout")?;
}
```
Gossip State Recovery
```rust
// Gossip state is reconstructed via protocol, not WAL

// 1. Start with only local node in state
let mut state = LWWMembershipState::new();
state.update_local(local_node.clone(), NodeHealth::Healthy, 0);

// 2. Add known peers as Unknown
for peer in known_peers {
    state.merge(&[GossipNodeState::new(peer, NodeHealth::Unknown, 0, 0)]);
}

// 3. Gossip protocol will converge to correct state
//    - Healthy nodes will respond to Sync
//    - Failed nodes will be suspected and eventually marked failed
```
Operational Best Practices
Cluster Sizing
- Minimum: 3 nodes (tolerates 1 failure)
- Recommended: 5 nodes (tolerates 2 failures)
- Large: 7 nodes (tolerates 3 failures)
- Avoid even node counts: an even cluster needs a larger quorum but tolerates no additional failures (e.g. 4 nodes still tolerate only 1)
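These sizing rules follow directly from `quorum_size` (shown in the Consensus section): a cluster tolerates `total - quorum` failures, which is why adding a fourth or sixth node buys nothing:

```rust
pub fn quorum_size(total_nodes: usize) -> usize {
    (total_nodes / 2) + 1
}

/// Failures survivable while still reaching quorum.
pub fn tolerated_failures(total_nodes: usize) -> usize {
    total_nodes - quorum_size(total_nodes)
}

fn main() {
    assert_eq!(tolerated_failures(3), 1);
    assert_eq!(tolerated_failures(4), 1); // even node added: no gain
    assert_eq!(tolerated_failures(5), 2);
    assert_eq!(tolerated_failures(6), 2); // even node added: no gain
    assert_eq!(tolerated_failures(7), 3);
}
```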
Timeout Tuning
```rust
// Network latency < 10ms (same datacenter)
RaftConfig {
    election_timeout: (150, 300),
    heartbeat_interval: 50,
}

// Network latency 10-50ms (cross-datacenter)
RaftConfig {
    election_timeout: (500, 1000),
    heartbeat_interval: 150,
}

// Network latency > 50ms (geo-distributed)
RaftConfig {
    election_timeout: (2000, 4000),
    heartbeat_interval: 500,
}
```
Monitoring
Key metrics to monitor:
| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| heartbeat_success_rate | < 0.95 | < 0.80 |
| fast_path_rate | < 0.50 | < 0.20 |
| commit_rate | < 0.80 | < 0.50 |
| conflict_rate | > 0.10 | > 0.30 |
| deadlocks_detected | > 0/min | > 10/min |
| quorum_lost_events | > 0/hour | > 0/min |
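A minimal sketch of how an alerting loop might apply these thresholds — names and types here are illustrative, not part of Neumann. Note that rate metrics such as heartbeat_success_rate alert when they fall, while conflict_rate and deadlocks_detected alert when they rise:

```rust
// Illustrative only: `Severity` and `classify` are our names, not Neumann's.
#[derive(Debug, PartialEq)]
enum Severity {
    Ok,
    Warning,
    Critical,
}

// `higher_is_better` distinguishes metrics that alert when they fall
// (heartbeat_success_rate, commit_rate) from those that alert when
// they rise (conflict_rate, deadlocks_detected).
fn classify(value: f64, warning: f64, critical: f64, higher_is_better: bool) -> Severity {
    let breached = |threshold: f64| {
        if higher_is_better { value < threshold } else { value > threshold }
    };
    if breached(critical) {
        Severity::Critical
    } else if breached(warning) {
        Severity::Warning
    } else {
        Severity::Ok
    }
}

fn main() {
    // heartbeat_success_rate: warn below 0.95, critical below 0.80
    assert_eq!(classify(0.90, 0.95, 0.80, true), Severity::Warning);
    // conflict_rate: warn above 0.10, critical above 0.30
    assert_eq!(classify(0.35, 0.10, 0.30, false), Severity::Critical);
}
```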
Security Considerations
- Enable Message Signing: Set `require_signatures: true` in production
- Rotate Keys: Periodically generate new identities and update the registry
- Network Isolation: Use TLS for transport, firewall cluster ports
- Audit Logging: Log all state transitions for forensic analysis
Formal Verification
The three core protocols (Raft, 2PC, SWIM gossip) are formally
specified in TLA+ (specs/tla/) and exhaustively model-checked
with TLC:
| Spec | Distinct States | Properties Verified |
|---|---|---|
| Raft.tla | 18,268,659 | ElectionSafety, LogMatching, StateMachineSafety, LeaderCompleteness, VoteIntegrity, TermMonotonicity |
| TwoPhaseCommit.tla | 2,264,939 | Atomicity, NoOrphanedLocks, ConsistentDecision, VoteIrrevocability, DecisionStability |
| Membership.tla | 54,148 | NoFalsePositivesSafety, MonotonicEpochs, MonotonicIncarnations |
Model checking discovered protocol-level bugs (out-of-order message handling, self-message processing, heartbeat log truncation) that were fixed in both the specs and the Rust implementation. See Formal Verification for full results and the list of bugs found.
Related Modules
| Module | Relationship |
|---|---|
| tensor_store | Provides TensorStore for persistence, SparseVector for embeddings, ArchetypeRegistry for delta compression |
| graph_engine | Blocks linked via graph edges, chain structure built on graph |
| tensor_compress | Int8 quantization for delta embeddings (4x compression) |
| tensor_checkpoint | Snapshot persistence for crash recovery |
Performance Characteristics
| Operation | Time | Notes |
|---|---|---|
| Transaction commit | ~50us | Single transaction block |
| Conflict detection | ~1us | Cosine + Jaccard calculation |
| Deadlock detection | ~350ns | DFS cycle detection |
| Gossip round | ~200ms | Configurable interval |
| Heartbeat | ~50ms | Leader to all followers |
| Fast-path validation | ~2us | Similarity check only |
| Full validation | ~50us | Complete block verification |
| Lock acquisition | ~100ns | Uncontended case |
| Lock acquisition (contended) | ~10us | With wait-graph update |
| Signature verification | ~50us | Ed25519 verify |
| Message validation | ~1us | Bounds checking |
Neumann Parser
The neumann_parser crate provides a hand-written recursive descent parser for
the Neumann unified query language. It converts source text into an Abstract
Syntax Tree (AST) that can be executed by the query router.
The parser is designed with zero external dependencies, full span tracking for error reporting, and support for SQL, graph, vector, and domain-specific operations in a single unified syntax.
Key Concepts
| Concept | Description |
|---|---|
| Recursive Descent | Top-down parsing where each grammar rule becomes a function |
| Pratt Parsing | Operator precedence parsing for expressions with correct associativity |
| Span Tracking | Every AST node carries source location for error messages |
| Case Insensitivity | Keywords are matched case-insensitively via uppercase conversion |
| Single Lookahead | Parser uses one-token lookahead with optional peek |
| Depth Limiting | Expression nesting is limited to 64 levels to prevent stack overflow |
Architecture
flowchart LR
subgraph Input
Source["Source String"]
end
subgraph Lexer
Chars["char iterator"] --> Tokenizer
Tokenizer --> Tokens["Token Stream"]
end
subgraph Parser
Tokens --> StatementParser["Statement Parser"]
StatementParser --> ExprParser["Expression Parser (Pratt)"]
ExprParser --> AST["Abstract Syntax Tree"]
end
Source --> Chars
AST --> Output["Statement + Span"]
Detailed Parsing Flow
sequenceDiagram
participant User
participant parse()
participant Parser
participant Lexer
participant ExprParser
User->>parse(): "SELECT * FROM users"
parse()->>Parser: new(source)
Parser->>Lexer: new(source)
Lexer-->>Parser: first token
Parser->>Parser: parse_statement()
Parser->>Parser: match on token kind
Parser->>Parser: parse_select()
Parser->>Parser: parse_select_body()
loop For each select item
Parser->>ExprParser: parse_expr()
ExprParser->>ExprParser: parse_expr_bp(0)
ExprParser->>ExprParser: parse_prefix_expr()
ExprParser-->>Parser: Expr with span
end
Parser-->>parse(): Statement
parse()-->>User: Result<Statement>
Source Files
| File | Purpose | Key Functions |
|---|---|---|
| lib.rs | Public API exports | parse(), parse_all(), parse_expr(), tokenize() |
| lexer.rs | Tokenization (source to tokens) | Lexer::next_token(), scan_ident(), scan_number(), scan_string() |
| token.rs | Token definitions and keyword lookup | TokenKind, keyword_from_str() |
| parser.rs | Statement parsing (recursive descent) | Parser::parse_statement(), parse_select(), parse_insert() |
| expr.rs | Expression parsing (Pratt algorithm) | ExprParser::parse_expr(), parse_expr_bp(), infix_binding_power() |
| ast.rs | AST node definitions | Statement, StatementKind, Expr, ExprKind |
| span.rs | Source location tracking | BytePos, Span, line_col(), get_line() |
| error.rs | Error types with source context | ParseError, ParseErrorKind, format_with_source() |
Core Types
Token System
classDiagram
class Token {
+TokenKind kind
+Span span
+is_eof() bool
+is_keyword() bool
}
class TokenKind {
<<enumeration>>
Ident(String)
Integer(i64)
Float(f64)
String(String)
Select
From
Where
...
Error(String)
Eof
}
class Span {
+BytePos start
+BytePos end
+len() u32
+merge(Span) Span
+extract(str) str
}
Token --> TokenKind
Token --> Span
| Type | Description |
|---|---|
| Token | A token with its kind and span |
| TokenKind | Enum of all token variants (130+ variants including keywords, literals, operators) |
| Lexer | Stateful tokenizer that produces tokens from source |
AST Structure
classDiagram
class Statement {
+StatementKind kind
+Span span
}
class StatementKind {
<<enumeration>>
Select(SelectStmt)
Insert(InsertStmt)
Node(NodeStmt)
Edge(EdgeStmt)
Similar(SimilarStmt)
Vault(VaultStmt)
...
}
class Expr {
+ExprKind kind
+Span span
+boxed() Box~Expr~
}
class ExprKind {
<<enumeration>>
Literal(Literal)
Ident(Ident)
Binary(Box~Expr~, BinaryOp, Box~Expr~)
Unary(UnaryOp, Box~Expr~)
Call(FunctionCall)
...
}
Statement --> StatementKind
StatementKind --> Expr
Expr --> ExprKind
| Type | Description |
|---|---|
| Statement | Top-level parsed statement with span |
| StatementKind | Enum of all statement variants (30+ variants) |
| Expr | Expression node with span |
| ExprKind | Enum of expression variants (20+ variants) |
| Literal | Literal values (Null, Boolean, Integer, Float, String) |
| Ident | Identifier with name and span |
| BinaryOp | Binary operators with precedence (18 operators) |
| UnaryOp | Unary operators (Not, Neg, BitNot) |
Span Types
| Type | Description | Example |
|---|---|---|
| BytePos | A byte offset into source text (u32) | BytePos(7) |
| Span | A range of bytes (start, end) | Span { start: 0, end: 6 } |
| `Spanned<T>` | A value paired with its source location | Spanned::new(42, span) |
```rust
// Span operations
let span1 = Span::from_offsets(0, 6); // "SELECT"
let span2 = Span::from_offsets(7, 8); // "*"
let merged = span1.merge(span2);      // "SELECT *"

// Extract source text
let source = "SELECT * FROM users";
let text = span1.extract(source);     // "SELECT"

// Line/column computation
let (line, col) = line_col(source, BytePos(7)); // (1, 8)
```
Error Types
| Type | Description |
|---|---|
| ParseError | Error with kind, span, and optional help message |
| ParseErrorKind | Enum of error variants (10 kinds) |
| `ParseResult<T>` | `Result<T, ParseError>` |
| Errors | Collection of parse errors with iteration support |
```rust
// Error kinds
pub enum ParseErrorKind {
    UnexpectedToken { found: TokenKind, expected: String },
    UnexpectedEof { expected: String },
    InvalidSyntax(String),
    InvalidNumber(String),
    UnterminatedString,
    UnknownCommand(String),
    DuplicateColumn(String),
    InvalidEscape(char),
    TooDeep, // Expression nesting > 64 levels
    Custom(String),
}
```
Lexer Implementation
State Machine
The lexer is implemented as an iterator-based state machine with single-character lookahead:
stateDiagram-v2
[*] --> Initial
Initial --> Whitespace: is_whitespace
Initial --> LineComment: --
Initial --> BlockComment: /*
Initial --> Identifier: a-zA-Z_
Initial --> Number: 0-9
Initial --> String: ' or "
Initial --> Operator: +-*/etc
Initial --> EOF: end
Whitespace --> Initial: skip
LineComment --> Initial: newline
BlockComment --> Initial: */
Identifier --> Token: non-alnum
Number --> Token: non-digit
String --> Token: closing quote
Operator --> Token: complete
Token --> Initial: emit token
EOF --> [*]
Internal Structure
```rust
pub struct Lexer<'a> {
    source: &'a str,      // Original source text
    chars: Chars<'a>,     // Character iterator
    pos: u32,             // Current byte position
    peeked: Option<char>, // One-character lookahead
}
```
Character Classification
| Category | Characters | Handling |
|---|---|---|
| Whitespace | space, tab, newline | Skipped |
| Line comment | -- to newline | Skipped |
| Block comment | /* */ (nestable) | Skipped, supports nesting |
| Identifier | [a-zA-Z_][a-zA-Z0-9_]* | Keyword lookup then Ident |
| Integer | [0-9]+ | Parse as i64 |
| Float | [0-9]+\.[0-9]+ or scientific | Parse as f64 |
| String | '...' or "..." | Handle escapes |
String Escape Sequences
| Escape | Result |
|---|---|
| `\n` | Newline |
| `\r` | Carriage return |
| `\t` | Tab |
| `\\` | Backslash |
| `\'` | Single quote |
| `\"` | Double quote |
| `\0` | Null character |
| `''` | Single quote (SQL-style doubled) |
| `\x` | Unknown: preserved as `\x` |
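As a self-contained sketch of the mapping in the table — simplified, since the real scan_string() also handles quote termination, doubled quotes, and lexer errors:

```rust
// Simplified sketch of the escape table above; not the real scan_string().
fn unescape(c: char) -> Option<char> {
    match c {
        'n' => Some('\n'),
        'r' => Some('\r'),
        't' => Some('\t'),
        '\\' => Some('\\'),
        '\'' => Some('\''),
        '"' => Some('"'),
        '0' => Some('\0'),
        // Unknown escape: None tells the caller to keep the
        // backslash and character verbatim.
        _ => None,
    }
}

fn main() {
    assert_eq!(unescape('n'), Some('\n'));
    assert_eq!(unescape('0'), Some('\0'));
    assert_eq!(unescape('q'), None); // preserved as \q by the caller
}
```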
Operator Recognition
Multi-character operators are recognized with lookahead:
```rust
// Lexer::next_token() operator matching
match c {
    '-' => if self.eat('>') { Arrow }      // ->
           else { Minus },                 // -
    '=' => if self.eat('>') { FatArrow }   // =>
           else { Eq },                    // =
    '<' => if self.eat('=') { Le }         // <=
           else if self.eat('>') { Ne }    // <>
           else if self.eat('<') { Shl }   // <<
           else { Lt },                    // <
    '>' => if self.eat('=') { Ge }         // >=
           else if self.eat('>') { Shr }   // >>
           else { Gt },                    // >
    '|' => if self.eat('|') { Concat }     // ||
           else { Pipe },                  // |
    '&' => if self.eat('&') { AmpAmp }     // &&
           else { Amp },                   // &
    ':' => if self.eat(':') { ColonColon } // ::
           else { Colon },                 // :
    // ...
}
```
Pratt Parser (Expression Parsing)
Algorithm Overview
The Pratt parser handles operator precedence through "binding power": each operator has a left and a right binding power that together determine its precedence and associativity.
flowchart TD
A[parse_expr_bp\nmin_bp] --> B[parse_prefix]
B --> C{More tokens?}
C -->|No| D[Return lhs]
C -->|Yes| E[parse_postfix]
E --> F{Infix op?}
F -->|No| D
F -->|Yes| G{l_bp >= min_bp?}
G -->|No| D
G -->|Yes| H[Advance]
H --> I[parse_expr_bp\nr_bp]
I --> J[Build Binary node]
J --> C
Binding Power Table
Each operator has left and right binding powers (l_bp, r_bp):
| Precedence | Operators | Binding Power (l, r) | Associativity |
|---|---|---|---|
| 1 (lowest) | OR | (1, 2) | Left |
| 2 | AND | (3, 4) | Left |
| 3 | =, !=, <, <=, >, >= | (5, 6) | Left |
| 4 | \| (bitwise OR) | (7, 8) | Left |
| 5 | ^ (bitwise XOR) | (9, 10) | Left |
| 6 | & (bitwise AND) | (11, 12) | Left |
| 7 | <<, >> | (13, 14) | Left |
| 8 | +, -, \|\| (concat) | (15, 16) | Left |
| 9 | *, /, % | (17, 18) | Left |
| 10 (highest) | NOT, -, ~ (unary) | 19 (prefix) | Right |
Left associativity is achieved by having r_bp > l_bp.
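To see the binding powers in action, here is a toy Pratt evaluator over single-digit arithmetic that uses the same (l_bp, r_bp) pairs as the table. It is a standalone illustration, not the real ExprParser:

```rust
// Toy Pratt evaluator using the (l_bp, r_bp) scheme from the table:
// '+'/'-' get (15, 16), '*'/'/' get (17, 18).
fn binding_power(op: char) -> Option<(u8, u8)> {
    match op {
        '+' | '-' => Some((15, 16)),
        '*' | '/' => Some((17, 18)),
        _ => None,
    }
}

fn eval_bp(tokens: &[char], pos: &mut usize, min_bp: u8) -> i64 {
    // Prefix step: a single-digit operand.
    let mut lhs = tokens[*pos].to_digit(10).unwrap() as i64;
    *pos += 1;
    while *pos < tokens.len() {
        let op = tokens[*pos];
        let Some((l_bp, r_bp)) = binding_power(op) else { break };
        if l_bp < min_bp {
            break; // operator binds less tightly than current context
        }
        *pos += 1;
        let rhs = eval_bp(tokens, pos, r_bp);
        lhs = match op {
            '+' => lhs + rhs,
            '-' => lhs - rhs,
            '*' => lhs * rhs,
            _ => lhs / rhs,
        };
    }
    lhs
}

fn main() {
    let plus_times: Vec<char> = "1+2*3".chars().collect();
    let mut pos = 0;
    // '*' (l_bp 17) out-binds '+' (r_bp 16): parses as 1 + (2 * 3).
    assert_eq!(eval_bp(&plus_times, &mut pos, 0), 7);

    let left_assoc: Vec<char> = "9-2-3".chars().collect();
    let mut pos = 0;
    // r_bp > l_bp forces left grouping: (9 - 2) - 3.
    assert_eq!(eval_bp(&left_assoc, &mut pos, 0), 4);
}
```

The second assertion shows why r_bp > l_bp yields left associativity: the recursive call for the right operand uses min_bp = 16, so a second `-` (l_bp 15) cannot extend it and control returns to the outer loop.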
Implementation Details
```rust
const MAX_DEPTH: usize = 64; // Prevent stack overflow

fn parse_expr_bp(&mut self, min_bp: u8) -> ParseResult<Expr> {
    self.depth += 1;
    if self.depth > MAX_DEPTH {
        return Err(ParseError::new(ParseErrorKind::TooDeep, self.current.span));
    }

    let mut lhs = self.parse_prefix()?;

    loop {
        // Handle postfix operators (IS NULL, IN, BETWEEN, LIKE, DOT)
        lhs = self.parse_postfix(lhs)?;

        // Check for infix operator
        let op = match self.current_binary_op() {
            Some(op) => op,
            None => break,
        };

        let (l_bp, r_bp) = infix_binding_power(op);
        if l_bp < min_bp {
            break; // Operator binds less tightly than current context
        }

        self.advance();
        let rhs = self.parse_expr_bp(r_bp)?;
        let span = lhs.span.merge(rhs.span);
        lhs = Expr::new(ExprKind::Binary(Box::new(lhs), op, Box::new(rhs)), span);
    }

    self.depth -= 1;
    Ok(lhs)
}
```
Prefix Expression Handling
```rust
fn parse_prefix(&mut self) -> ParseResult<Expr> {
    match &self.current.kind {
        // Literals
        TokenKind::Integer(n) => { /* emit Literal(Integer) */ },
        TokenKind::Float(n) => { /* emit Literal(Float) */ },
        TokenKind::String(s) => { /* emit Literal(String) */ },
        TokenKind::True | TokenKind::False => { /* emit Literal(Boolean) */ },
        TokenKind::Null => { /* emit Literal(Null) */ },

        // Identifiers and function calls
        TokenKind::Ident(_) => self.parse_ident_or_call(),

        // Aggregate functions (COUNT, SUM, AVG, MIN, MAX)
        TokenKind::Count | TokenKind::Sum | ... => self.parse_aggregate_call(),

        // Wildcard
        TokenKind::Star => { /* emit Wildcard */ },

        // Parenthesized expression or tuple
        TokenKind::LParen => self.parse_paren_expr(),

        // Array literal
        TokenKind::LBracket => self.parse_array(),

        // Unary operators
        TokenKind::Minus => { /* parse operand with PREFIX_BP, emit Unary(Neg) */ },
        TokenKind::Not | TokenKind::Bang => { /* emit Unary(Not) */ },
        TokenKind::Tilde => { /* emit Unary(BitNot) */ },

        // Special expressions
        TokenKind::Case => self.parse_case(),
        TokenKind::Exists => self.parse_exists(),
        TokenKind::Cast => self.parse_cast(),

        // Contextual keywords as identifiers
        _ if token.kind.is_contextual_keyword() => self.parse_keyword_as_ident(),

        // Error
        _ => Err(ParseError::unexpected(...)),
    }
}
```
Postfix Expression Handling
Postfix operators bind tighter than any infix operator:
```rust
fn parse_postfix(&mut self, mut expr: Expr) -> ParseResult<Expr> {
    loop {
        // Handle NOT IN, NOT BETWEEN, NOT LIKE
        if self.check(&TokenKind::Not) {
            let next = self.peek().kind.clone();
            if next == TokenKind::In { /* parse NOT IN */ }
            else if next == TokenKind::Between { /* parse NOT BETWEEN */ }
            else if next == TokenKind::Like { /* parse NOT LIKE */ }
        }

        match self.current.kind {
            TokenKind::Is => {
                // IS [NOT] NULL
                self.advance();
                let negated = self.eat(&TokenKind::Not);
                self.expect(&TokenKind::Null)?;
                expr = Expr::new(ExprKind::IsNull { expr, negated }, span);
            },
            TokenKind::In => { /* parse IN (values) or IN (subquery) */ },
            TokenKind::Between => { /* parse BETWEEN low AND high */ },
            TokenKind::Like => { /* parse LIKE pattern */ },
            TokenKind::Dot => {
                // Qualified name: table.column or table.*
                self.advance();
                if self.eat(&TokenKind::Star) {
                    expr = Expr::new(ExprKind::QualifiedWildcard(ident), span);
                } else {
                    let field = self.expect_ident()?;
                    expr = Expr::new(ExprKind::Qualified(Box::new(expr), field), span);
                }
            },
            _ => return Ok(expr),
        }
    }
}
```
Statement Kinds
SQL Statements
| Statement | Example | AST Type |
|---|---|---|
| Select | SELECT * FROM users WHERE id = 1 | SelectStmt |
| Insert | INSERT INTO users (name) VALUES ('Alice') | InsertStmt |
| Update | UPDATE users SET name = 'Bob' WHERE id = 1 | UpdateStmt |
| Delete | DELETE FROM users WHERE id = 1 | DeleteStmt |
| CreateTable | CREATE TABLE users (id INT PRIMARY KEY) | CreateTableStmt |
| DropTable | DROP TABLE IF EXISTS users CASCADE | DropTableStmt |
| CreateIndex | CREATE UNIQUE INDEX idx ON users(email) | CreateIndexStmt |
| DropIndex | DROP INDEX idx_name | DropIndexStmt |
| ShowTables | SHOW TABLES | unit |
| Describe | DESCRIBE TABLE users | DescribeStmt |
Graph Statements
| Statement | Example | AST Type |
|---|---|---|
| Node | NODE CREATE person {name: 'Alice'} | NodeStmt |
| Edge | EDGE CREATE 1 -> 2 : FOLLOWS {since: 2023} | EdgeStmt |
| Neighbors | NEIGHBORS 'entity' OUTGOING follows | NeighborsStmt |
| Path | PATH SHORTEST 1 TO 5 | PathStmt |
| Find | FIND NODE person WHERE age > 18 | FindStmt |
Vector Statements
| Statement | Example | AST Type |
|---|---|---|
| Embed | EMBED STORE 'key' [0.1, 0.2, 0.3] | EmbedStmt |
| Similar | SIMILAR 'query' LIMIT 10 COSINE | SimilarStmt |
Domain Statements
| Statement | Example | AST Type |
|---|---|---|
| Vault | VAULT SET 'secret' 'value' | VaultStmt |
| Cache | CACHE STATS | CacheStmt |
| Blob | BLOB PUT 'file.txt' 'data' | BlobStmt |
| Blobs | BLOBS BY TAG 'important' | BlobsStmt |
| Chain | BEGIN CHAIN TRANSACTION | ChainStmt |
| Cluster | CLUSTER STATUS | ClusterStmt |
| Checkpoint | CHECKPOINT 'backup1' | CheckpointStmt |
| Entity | ENTITY CREATE 'key' {props} EMBEDDING [vec] | EntityStmt |
Expression Kinds
| Kind | Example | Notes |
|---|---|---|
| Literal | 42, 3.14, 'hello', TRUE, NULL | Five literal types |
| Ident | column_name | Simple identifier |
| Qualified | table.column | Dot notation |
| Binary | a + b, x AND y | 18 binary operators |
| Unary | NOT flag, -value, ~bits | 3 unary operators |
| Call | COUNT(*), MAX(price) | Function with args |
| Case | CASE WHEN x THEN y ELSE z END | Simple and searched |
| Subquery | (SELECT ...) | Nested SELECT |
| Exists | EXISTS (SELECT 1 ...) | Existence test |
| In | x IN (1, 2, 3) or x IN (SELECT...) | Value list or subquery |
| Between | x BETWEEN 1 AND 10 | Range check |
| Like | name LIKE '%smith' | Pattern matching |
| IsNull | x IS NULL, y IS NOT NULL | NULL check |
| Array | [1, 2, 3] | Array literal |
| Tuple | (1, 2, 3) | Tuple/row literal |
| Cast | CAST(x AS INT) | Type conversion |
| Wildcard | * | All columns |
| QualifiedWildcard | table.* | All columns from table |
Span Tracking Implementation
BytePos and Span
```rust
/// A byte position in source code (u32 for memory efficiency).
pub struct BytePos(pub u32);

/// A span representing a range of source code.
pub struct Span {
    pub start: BytePos, // Inclusive
    pub end: BytePos,   // Exclusive
}

impl Span {
    /// Combines two spans into one that covers both.
    pub const fn merge(self, other: Span) -> Span {
        let start = if self.start.0 < other.start.0 { self.start } else { other.start };
        let end = if self.end.0 > other.end.0 { self.end } else { other.end };
        Span { start, end }
    }
}
```
Line/Column Calculation
```rust
/// Computes line and column from a byte position.
pub fn line_col(source: &str, pos: BytePos) -> (usize, usize) {
    let offset = pos.as_usize().min(source.len());
    let mut line = 1;
    let mut col = 1;
    for (i, ch) in source.char_indices() {
        if i >= offset {
            break;
        }
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}

/// Returns the line containing a position.
pub fn get_line(source: &str, pos: BytePos) -> &str {
    let offset = pos.as_usize().min(source.len());
    let line_start = source[..offset].rfind('\n').map(|i| i + 1).unwrap_or(0);
    let line_end = source[offset..].find('\n').map(|i| offset + i).unwrap_or(source.len());
    &source[line_start..line_end]
}
```
Error Recovery
Error Creation Patterns
```rust
// Unexpected token
ParseError::unexpected(
    TokenKind::Comma,
    self.current.span,
    "column name"
)

// Unexpected EOF
ParseError::unexpected_eof(
    self.current.span,
    "expression"
)

// Invalid syntax
ParseError::invalid(
    "CASE requires at least one WHEN clause",
    self.current.span
)

// With help message
ParseError::invalid("unknown keyword SELCT", span)
    .with_help("did you mean SELECT?")
```
Error Formatting
```rust
pub fn format_with_source(&self, source: &str) -> String {
    let (line, col) = line_col(source, self.span.start);
    let line_text = get_line(source, self.span.start);

    // Build error message with source context
    let mut result = format!("error: {}\n", self.kind);
    result.push_str(&format!(" --> line {}:{}\n", line, col));
    result.push_str(" |\n");
    result.push_str(&format!("{:3} | {}\n", line, line_text));
    result.push_str(" | ");

    // Add carets under the error location
    for _ in 0..(col - 1) {
        result.push(' ');
    }
    let len = self.span.len().max(1) as usize;
    for _ in 0..len.min(line_text.len() - col + 1).max(1) {
        result.push('^');
    }
    result.push('\n');

    if let Some(help) = &self.help {
        result.push_str(&format!(" = help: {}\n", help));
    }
    result
}
```
Example Error Output
error: unexpected FROM, expected expression or '*' after SELECT
--> line 1:8
|
1 | SELECT FROM users
| ^^^^
Usage Examples
Parse a Statement
```rust
use neumann_parser::parse;

let stmt = parse("SELECT * FROM users WHERE id = 1")?;

match &stmt.kind {
    StatementKind::Select(select) => {
        println!("Distinct: {}", select.distinct);
        println!("Columns: {}", select.columns.len());
        if let Some(from) = &select.from {
            println!("Table: {:?}", from.table.kind);
        }
        if let Some(where_clause) = &select.where_clause {
            println!("Has WHERE clause");
        }
    }
    _ => {}
}
```
Parse Multiple Statements
```rust
use neumann_parser::parse_all;

let stmts = parse_all("SELECT 1; SELECT 2; INSERT INTO t VALUES (1)")?;
assert_eq!(stmts.len(), 3);

for stmt in &stmts {
    println!("Statement at {:?}", stmt.span);
}
```
Parse an Expression
```rust
use neumann_parser::parse_expr;

let expr = parse_expr("1 + 2 * 3")?;
// Parses as: 1 + (2 * 3) due to precedence

if let ExprKind::Binary(lhs, BinaryOp::Add, rhs) = expr.kind {
    // lhs = Literal(1)
    // rhs = Binary(Literal(2), Mul, Literal(3))
}
```
Tokenize Source
```rust
use neumann_parser::tokenize;

let tokens = tokenize("SELECT * FROM users");
for token in tokens {
    println!("{:?} at {:?}", token.kind, token.span);
}
// Output:
// Select at 0..6
// Star at 7..8
// From at 9..13
// Ident("users") at 14..19
// Eof at 19..19
```
Working with the Parser Directly
```rust
use neumann_parser::Parser;

let mut parser = Parser::new("SELECT 1; SELECT 2");

// Parse first statement
let stmt1 = parser.parse_statement()?;
assert!(matches!(stmt1.kind, StatementKind::Select(_)));

// Parse second statement
let stmt2 = parser.parse_statement()?;
assert!(matches!(stmt2.kind, StatementKind::Select(_)));

// Third call returns Empty (EOF)
let stmt3 = parser.parse_statement()?;
assert!(matches!(stmt3.kind, StatementKind::Empty));
```
Error Handling
```rust
use neumann_parser::parse;

let result = parse("SELECT FROM users");
if let Err(e) = result {
    // Format error with source context
    let formatted = e.format_with_source("SELECT FROM users");
    println!("{}", formatted);

    // Access error details
    println!("Error kind: {:?}", e.kind);
    println!("Span: {:?}", e.span);
    if let Some(help) = &e.help {
        println!("Help: {}", help);
    }
}
```
Grammar Overview
SELECT Statement
SELECT [DISTINCT | ALL] columns
FROM table [alias]
[JOIN table ON condition | USING (cols)]...
[WHERE condition]
[GROUP BY exprs]
[HAVING condition]
[ORDER BY expr [ASC|DESC] [NULLS FIRST|LAST]]...
[LIMIT n]
[OFFSET n]
CREATE TABLE Statement
CREATE TABLE [IF NOT EXISTS] name (
column type [NULL|NOT NULL] [PRIMARY KEY] [UNIQUE]
[DEFAULT expr] [CHECK(expr)]
[REFERENCES table(col) [ON DELETE action] [ON UPDATE action]]
[, ...]
[, PRIMARY KEY (cols)]
[, UNIQUE (cols)]
[, FOREIGN KEY (cols) REFERENCES table(cols)]
[, CHECK (expr)]
)
Graph Operations
NODE CREATE label {properties}
NODE GET id
NODE DELETE id
NODE LIST [label]
EDGE CREATE from -> to : type {properties}
EDGE GET id
EDGE DELETE id
EDGE LIST [type]
NEIGHBORS node [OUTGOING|INCOMING|BOTH] [edge_type]
[BY SIMILAR [vector] LIMIT n]
PATH [SHORTEST|ALL] from TO to [MAX depth]
Vector Operations
EMBED STORE key [vector]
EMBED GET key
EMBED DELETE key
EMBED BUILD INDEX
EMBED BATCH [(key, [vector]), ...]
SIMILAR key|[vector] [LIMIT k] [COSINE|EUCLIDEAN|DOT_PRODUCT]
[CONNECTED TO entity]
Chain Operations
BEGIN CHAIN TRANSACTION
COMMIT CHAIN
ROLLBACK CHAIN TO height
CHAIN HEIGHT
CHAIN TIP
CHAIN BLOCK height
CHAIN VERIFY
CHAIN HISTORY key
CHAIN SIMILAR [embedding] LIMIT n
CHAIN DRIFT FROM height TO height
SHOW CODEBOOK GLOBAL
SHOW CODEBOOK LOCAL domain
ANALYZE CODEBOOK TRANSITIONS
Reserved Keywords
Keywords are case-insensitive. The lexer converts to uppercase for matching.
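A simplified sketch of the uppercase-conversion approach (the real keyword_from_str() in token.rs covers the full keyword set, and the enum here is a stand-in for TokenKind):

```rust
// Illustrative three-keyword subset; not the real keyword_from_str().
#[derive(Debug, PartialEq)]
enum Kw {
    Select,
    From,
    Where,
}

fn keyword_from_str(ident: &str) -> Option<Kw> {
    // Uppercase once, then match: keywords become case-insensitive.
    match ident.to_ascii_uppercase().as_str() {
        "SELECT" => Some(Kw::Select),
        "FROM" => Some(Kw::From),
        "WHERE" => Some(Kw::Where),
        // Anything else falls through to an Ident token in the lexer.
        _ => None,
    }
}

fn main() {
    assert_eq!(keyword_from_str("select"), Some(Kw::Select));
    assert_eq!(keyword_from_str("SeLeCt"), Some(Kw::Select));
    assert_eq!(keyword_from_str("users"), None);
}
```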
SQL (70+ keywords): SELECT, DISTINCT, ALL, FROM, WHERE, INSERT, INTO, VALUES, UPDATE, SET, DELETE, CREATE, DROP, TABLE, INDEX, AND, OR, NOT, NULL, IS, IN, LIKE, BETWEEN, ORDER, BY, ASC, DESC, NULLS, FIRST, LAST, LIMIT, OFFSET, GROUP, HAVING, JOIN, INNER, LEFT, RIGHT, FULL, OUTER, CROSS, NATURAL, ON, USING, AS, PRIMARY, KEY, UNIQUE, REFERENCES, FOREIGN, CHECK, DEFAULT, CASCADE, RESTRICT, IF, EXISTS, SHOW, TABLES, UNION, INTERSECT, EXCEPT, CASE, WHEN, THEN, ELSE, END, CAST, ANY
Types (16 keywords): INT, INTEGER, BIGINT, SMALLINT, FLOAT, DOUBLE, REAL, DECIMAL, NUMERIC, VARCHAR, CHAR, TEXT, BOOLEAN, DATE, TIME, TIMESTAMP
Aggregates (5 keywords): COUNT, SUM, AVG, MIN, MAX
Graph (16 keywords): NODE, EDGE, NEIGHBORS, PATH, GET, LIST, STORE, OUTGOING, INCOMING, BOTH, SHORTEST, PROPERTIES, LABEL, VERTEX, VERTICES, EDGES
Vector (10 keywords): EMBED, SIMILAR, VECTOR, EMBEDDING, DIMENSION, DISTANCE, COSINE, EUCLIDEAN, DOT_PRODUCT, BUILD
Unified (6 keywords): FIND, WITH, RETURN, MATCH, ENTITY, CONNECTED
Domain (30+ keywords): VAULT, GRANT, REVOKE, ROTATE, CACHE, INIT, STATS, CLEAR, EVICT, PUT, SEMANTIC, THRESHOLD, CHECKPOINT, ROLLBACK, CHAIN, BEGIN, COMMIT, TRANSACTION, HISTORY, DRIFT, CODEBOOK, GLOBAL, LOCAL, ANALYZE, HEIGHT, TIP, BLOCK, CLUSTER, CONNECT, DISCONNECT, STATUS, NODES, LEADER, BLOB, BLOBS, INFO, LINK, TAG, VERIFY, GC, REPAIR
Contextual Keywords
These keywords can be used as identifiers in expression contexts (column names, etc.):
```rust
pub fn is_contextual_keyword(&self) -> bool {
    matches!(self,
        Status | Nodes | Leader | Connect | Disconnect | Cluster
        | Blobs | Info | Link | Unlink | Links | Tag | Untag
        | Verify | Gc | Repair | Height | Transitions | Tip | Block
        | Codebook | Global | Local | Drift | Begin | Commit
        | Transaction | ...
    )
}
```
Edge Cases and Gotchas
Ambiguous Token Sequences
- Minus vs Arrow: `-` vs `->`, distinguished by lookahead
- Less-than variants: `<` vs `<=` vs `<>` vs `<<`
- Pipe variants: `|` (bitwise) vs `||` (concat)
- Keyword as identifier: `SELECT status FROM orders` works because `status` is a contextual keyword
Number Parsing Edge Cases
```rust
// "3." is integer 3 followed by dot
tokens("3. ")    // [Integer(3), Dot, Eof]

// "3.0" is float
tokens("3.0")    // [Float(3.0), Eof]

// "3.x" is integer 3, dot, identifier x
tokens("3.x")    // [Integer(3), Dot, Ident("x"), Eof]

// Scientific notation
tokens("1e10")   // [Float(1e10), Eof]
tokens("2.5E-3") // [Float(0.0025), Eof]
```
String Literal Edge Cases
```rust
// SQL-style doubled quotes
tokens("'it''s'")        // [String("it's"), Eof]

// Unterminated string
tokens("'unterminated")  // [Error("unterminated string literal"), Eof]

// String with newline (error)
tokens("'line1\nline2'") // Error - strings cannot span lines
```
Expression Depth Limit
```rust
// Deeply nested expressions hit the depth limit (64)
let mut expr = "x".to_string();
for _ in 0..70 {
    expr = format!("({})", expr);
}
parse_expr(&expr) // Err(ParseErrorKind::TooDeep)
```
BETWEEN Precedence
The AND in BETWEEN low AND high is part of the BETWEEN syntax, not a logical
operator:
```rust
// "x BETWEEN 1 AND 10 AND y = 5" parses as:
//   (x BETWEEN 1 AND 10) AND (y = 5)
// Not: x BETWEEN 1 AND (10 AND y = 5)
```
Qualified Wildcard Restriction
```rust
// Valid: table.*
parse_expr("users.*")   // Ok(QualifiedWildcard)

// Invalid: (expr).*
parse_expr("(1 + 2).*") // Err("qualified wildcard requires identifier")
```
Performance
| Operation | Complexity | Notes |
|---|---|---|
| Tokenize | O(n) | Single pass, no backtracking |
| Parse | O(n) | Single pass, constant stack per token |
| Total | O(n) | Where n = input length |
Memory Usage
- Lexer: O(1) - only stores position and one peeked character
- Parser: O(1) - only stores current and peeked token
- AST: O(n) - proportional to number of nodes
- Span: 8 bytes per span (two u32 values)
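The 8-bytes-per-span figure can be checked with std::mem::size_of against a local mirror of the two-field layout (the real types live in span.rs):

```rust
use std::mem::size_of;

// Local mirror of the span layout, for size checking only.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct BytePos(u32);

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Span {
    start: BytePos, // inclusive
    end: BytePos,   // exclusive
}

fn main() {
    // Two u32 fields, no padding: 8 bytes total.
    assert_eq!(size_of::<Span>(), 8);
    assert_eq!(size_of::<BytePos>(), 4);
}
```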
Optimizations
- Keyword lookup: O(1) via match statement on uppercase string
- Token comparison: Uses `std::mem::discriminant` for enum comparison
- Span tracking: Constant-time merge operation
- No allocations during parsing: identifier and string text is allocated once during lexing and owned by the tokens
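The discriminant-based comparison matters because TokenKind variants carry payloads, and token matching should ignore them. A standalone illustration with a simplified enum (not the real TokenKind):

```rust
use std::mem::discriminant;

// Simplified stand-in for TokenKind: just enough variants to show
// why discriminant comparison ignores payloads.
#[allow(dead_code)]
#[derive(Debug)]
enum TokenKind {
    Ident(String),
    Integer(i64),
    Select,
}

fn same_kind(a: &TokenKind, b: &TokenKind) -> bool {
    // Compares which variant each value is, not the payload inside.
    discriminant(a) == discriminant(b)
}

fn main() {
    let a = TokenKind::Ident("users".into());
    let b = TokenKind::Ident("orders".into());
    assert!(same_kind(&a, &b)); // same variant, different payload
    assert!(!same_kind(&a, &TokenKind::Select));
}
```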
Related Modules
| Module | Relationship |
|---|---|
| query_router | Consumes AST and executes queries against engines |
| neumann_shell | Uses parser for interactive REPL commands |
| tensor_chain | Chain statements (BEGIN, COMMIT, HISTORY) parsed here |
| tensor_vault | Vault statements (SET, GET, GRANT) parsed here |
| tensor_cache | Cache statements (INIT, STATS, PUT) parsed here |
| tensor_blob | Blob statements (PUT, GET, TAG) parsed here |
Testing
The parser has comprehensive test coverage including:
- Unit tests in each module: Token, span, lexer, parser, expression tests
- Integration tests: Complex SQL queries, multi-statement parsing
- Edge case tests: Unterminated strings, deeply nested expressions, ambiguous operators
- Fuzz targets: `parser_parse`, `parser_parse_all`, `parser_parse_expr`, `parser_tokenize`
```bash
# Run parser tests
cargo test -p neumann_parser

# Run parser fuzz targets
cargo +nightly fuzz run parser_parse -- -max_total_time=60
cargo +nightly fuzz run parser_tokenize -- -max_total_time=60
```
Query Router
Query Router is the unified query execution layer for Neumann. It parses shell commands, routes them to appropriate engines, and combines results. All query types (relational, graph, vector, unified) flow through the router, which provides a single entry point for the entire system.
The router supports both synchronous and asynchronous execution, optional result caching, and distributed query execution when cluster mode is enabled.
Key Types
| Type | Description |
|---|---|
| QueryRouter | Main router orchestrating queries across all engines |
| QueryResult | Unified result enum for all query types |
| RouterError | Error types for query routing failures |
| NodeResult | Graph node result with id, label, properties |
| EdgeResult | Graph edge result with id, from, to, label |
| SimilarResult | Vector similarity result with key and score |
| UnifiedResult | Cross-engine query result with description and items |
| ChainResult | Blockchain operation results |
| QueryPlanner | Plans distributed query execution across shards |
| ResultMerger | Merges results from multiple shards |
| ShardResult | Result from a single shard with timing and error info |
| DistributedQueryConfig | Configuration for distributed execution |
| DistributedQueryStats | Statistics tracking for distributed queries |
| FilterCondition | Re-exported from vector_engine for programmatic filter building |
| FilterValue | Re-exported from vector_engine for filter values |
| FilterStrategy | Re-exported from vector_engine for search strategy |
| FilteredSearchConfig | Re-exported from vector_engine for filtered search config |
QueryResult Variants
| Variant | Description | Typical Source |
|---|---|---|
Empty | No result (CREATE, INSERT) | DDL, writes |
Value(String) | Single value result | Scalar queries, DESCRIBE |
Count(usize) | Count of affected rows/nodes/edges | UPDATE, DELETE |
Ids(Vec<u64>) | List of IDs | INSERT |
Rows(Vec<Row>) | Relational query results | SELECT |
Nodes(Vec<NodeResult>) | Graph node results | NODE queries |
Edges(Vec<EdgeResult>) | Graph edge results | EDGE queries |
Path(Vec<u64>) | Graph traversal path | PATH queries |
Similar(Vec<SimilarResult>) | Vector similarity results | SIMILAR queries |
Unified(UnifiedResult) | Cross-engine query results | FIND queries |
TableList(Vec<String>) | List of table names | SHOW TABLES |
Blob(Vec<u8>) | Blob data bytes | BLOB GET |
ArtifactInfo(ArtifactInfoResult) | Blob artifact metadata | BLOB INFO |
ArtifactList(Vec<String>) | List of artifact IDs | BLOBS LIST |
BlobStats(BlobStatsResult) | Blob storage statistics | BLOB STATS |
CheckpointList(Vec<CheckpointInfo>) | List of checkpoints | CHECKPOINTS |
Chain(ChainResult) | Chain operation result | CHAIN queries |
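For illustration, here is the kind of match a caller typically writes over the result enum, using a trimmed-down stand-in with only three of the variants above (the real QueryResult lives in query_router and has many more variants and richer row types):

```rust
// Trimmed-down illustration, not the real QueryResult.
enum QueryResult {
    Empty,
    Count(usize),
    Rows(Vec<Vec<String>>),
}

fn summarize(result: &QueryResult) -> String {
    match result {
        QueryResult::Empty => "ok".to_string(),
        QueryResult::Count(n) => format!("{} affected", n),
        QueryResult::Rows(rows) => format!("{} row(s)", rows.len()),
    }
}

fn main() {
    assert_eq!(summarize(&QueryResult::Empty), "ok");
    assert_eq!(summarize(&QueryResult::Count(3)), "3 affected");
    let rows = QueryResult::Rows(vec![vec!["1".to_string(), "Alice".to_string()]]);
    assert_eq!(summarize(&rows), "1 row(s)");
}
```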
RouterError Types
| Error | Cause | Recovery |
|---|---|---|
| ParseError | Invalid query syntax | Fix query syntax |
| UnknownCommand | Unknown command or keyword | Check command spelling |
| RelationalError | Error from relational engine | Check table/column names |
| GraphError | Error from graph engine | Verify node/edge IDs |
| VectorError | Error from vector engine | Check embedding dimensions |
| VaultError | Error from vault | Verify permissions |
| CacheError | Error from cache | Check cache configuration |
| BlobError | Error from blob storage | Verify artifact exists |
| CheckpointError | Error from checkpoint system | Check blob store initialized |
| ChainError | Error from chain system | Verify chain initialized |
| InvalidArgument | Invalid argument value | Check argument types |
| MissingArgument | Missing required argument | Provide required args |
| TypeMismatch | Type mismatch in query | Check value types |
| AuthenticationRequired | Vault operations require identity | Call SET IDENTITY first |
Error Propagation
The router implements From traits to convert engine-specific errors:
```rust
// Errors from underlying engines are automatically converted
impl From<RelationalError> for RouterError {
    fn from(e: RelationalError) -> Self {
        RouterError::RelationalError(e.to_string())
    }
}

impl From<GraphError> for RouterError { /* ... */ }
impl From<VectorError> for RouterError { /* ... */ }
impl From<VaultError> for RouterError { /* ... */ }
impl From<CacheError> for RouterError { /* ... */ }
impl From<BlobError> for RouterError { /* ... */ }
impl From<CheckpointError> for RouterError { /* ... */ }
impl From<ChainError> for RouterError { /* ... */ }
impl From<UnifiedError> for RouterError { /* ... */ }
```
This allows using the ? operator throughout execution methods:
```rust
fn exec_select(&self, select: &SelectStmt) -> Result<QueryResult> {
    // RelationalError automatically converts to RouterError
    let rows = self.relational.select_columnar(table_name, condition, options)?;
    Ok(QueryResult::Rows(rows))
}
```
Architecture
```mermaid
graph TB
    subgraph QueryRouter
        Execute[execute_parsed]
        ExecuteAsync[execute_parsed_async]
        Distributed[try_execute_distributed]
        Cache[Query Cache]
        Statement[execute_statement]
        StatementAsync[execute_statement_async]
    end
    Execute --> Distributed
    ExecuteAsync --> StatementAsync
    Distributed -->|cluster active| ScatterGather[Scatter-Gather]
    Distributed -->|local| Cache
    Cache -->|cache hit| Return[Return Result]
    Cache -->|cache miss| Statement
    Statement --> Relational[RelationalEngine]
    Statement --> Graph[GraphEngine]
    Statement --> Vector[VectorEngine]
    Statement --> Vault[Vault]
    Statement --> CacheOps[Cache Operations]
    Statement --> Blob[BlobStore]
    Statement --> Checkpoint[CheckpointManager]
    Statement --> Chain[TensorChain]
    Statement --> Cluster[ClusterOrchestrator]
    subgraph Engines
        Relational
        Graph
        Vector
    end
    subgraph Optional Services
        Vault
        CacheOps
        Blob
        Checkpoint
        Chain
        Cluster
    end
    Relational --> Store[TensorStore]
    Graph --> Store
    Vector --> Store
```
Internal Router Structure
```rust
pub struct QueryRouter {
    // Core engines (always initialized)
    relational: Arc<RelationalEngine>,
    graph: Arc<GraphEngine>,
    vector: Arc<VectorEngine>,

    // Unified engine for cross-engine queries (lazily initialized)
    unified: Option<UnifiedEngine>,

    // Optional services (require explicit initialization)
    vault: Option<Arc<Vault>>,
    cache: Option<Arc<Cache>>,
    blob: Option<Arc<tokio::sync::Mutex<BlobStore>>>,
    blob_runtime: Option<Arc<Runtime>>,
    checkpoint: Option<Arc<tokio::sync::Mutex<CheckpointManager>>>,
    chain: Option<Arc<TensorChain>>,

    // Cluster mode
    cluster: Option<Arc<ClusterOrchestrator>>,
    cluster_runtime: Option<Arc<Runtime>>,
    distributed_planner: Option<Arc<QueryPlanner>>,
    distributed_config: DistributedQueryConfig,
    local_shard_id: ShardId,

    // Authentication state
    current_identity: Option<String>,

    // Vector index for fast similarity search
    hnsw_index: Option<(HNSWIndex, Vec<String>)>,
}
```
Initialization
```rust
use query_router::QueryRouter;
use tensor_store::TensorStore;

// Create with independent engines
let router = QueryRouter::new();

// Create with existing engines
let router = QueryRouter::with_engines(relational, graph, vector);

// Create with shared storage (enables unified entities)
let store = TensorStore::new();
let router = QueryRouter::with_shared_store(store);
```
Constructor Comparison
| Constructor | UnifiedEngine | Use Case |
|---|---|---|
| `new()` | No | Simple single-engine queries |
| `with_engines(...)` | No | Custom engine configuration |
| `with_shared_store(...)` | Yes | Cross-engine unified queries |
Shared Store Benefits
When using with_shared_store(), all engines share the same underlying
TensorStore:
```rust
pub fn with_shared_store(store: TensorStore) -> Self {
    let relational = Arc::new(RelationalEngine::with_store(store.clone()));
    let graph = Arc::new(GraphEngine::with_store(store.clone()));
    let vector = Arc::new(VectorEngine::with_store(store.clone()));
    let unified = UnifiedEngine::with_engines(
        store,
        Arc::clone(&relational),
        Arc::clone(&graph),
        Arc::clone(&vector),
    );
    // ...
}
```
This enables:
- Cross-engine queries via `UnifiedEngine`
- Entity-level operations spanning all modalities
- Consistent view of data across engines
Query Execution
Execution Methods
| Method | Parser | Async | Distributed | Cache |
|---|---|---|---|---|
| `execute(command)` | Regex (legacy) | No | No | No |
| `execute_parsed(command)` | AST | No | Yes | Yes |
| `execute_parsed_async(command)` | AST | Yes | No | Yes |
| `execute_statement(stmt)` | Pre-parsed | No | No | No |
| `execute_statement_async(stmt)` | Pre-parsed | Yes | No | No |
Execution Flow
```mermaid
flowchart TD
    A[execute_parsed] --> B{Cluster Active?}
    B -->|Yes| C[try_execute_distributed]
    B -->|No| D[Parse Command]
    C --> E{Plan Type}
    E -->|Local| D
    E -->|Remote| F[execute_on_shard]
    E -->|ScatterGather| G[execute_scatter_gather]
    D --> H{Cacheable?}
    H -->|Yes| I{Cache Hit?}
    H -->|No| J[execute_statement]
    I -->|Yes| K[Return Cached]
    I -->|No| J
    J --> L[Engine Dispatch]
    L --> M{Write Op?}
    M -->|Yes| N[Invalidate Cache]
    M -->|No| O[Cache Result]
    O --> P[Return Result]
    N --> P
    K --> P
    F --> P
    G --> P
```
Detailed Execution Steps
- Distributed Check: If cluster is active, `try_execute_distributed` plans query execution
- Parse: Convert command string to AST via `neumann_parser`
- Cache Check: For cacheable queries (`SELECT`, `SIMILAR`, `NEIGHBORS`, `PATH`), check cache first
- Execute: Dispatch to appropriate engine based on `StatementKind`
- Cache Update: Store result for cacheable queries (as JSON via serde)
- Invalidate: Clear entire cache on write operations (INSERT, UPDATE, DELETE, DDL)
```rust
// Synchronous execution
let result = router.execute_parsed("SELECT * FROM users")?;

// Async execution
let result = router.execute_parsed_async("SELECT * FROM users").await?;

// Concurrent queries
let (users, posts, similar) = tokio::join!(
    router.execute_parsed_async("SELECT * FROM users"),
    router.execute_parsed_async("SELECT * FROM posts"),
    router.execute_parsed_async("SIMILAR 'doc:1' LIMIT 10"),
);
```
Cache Key Generation
```rust
fn cache_key_for_query(command: &str) -> String {
    format!("query:{}", command.trim().to_lowercase())
}
```
This normalizes queries for cache lookup by trimming whitespace and lowercasing.
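A quick check of that behavior (the function body is copied from above; the assertions are illustrative):

```rust
// Same normalization as shown above: trim outer whitespace, lowercase.
fn cache_key_for_query(command: &str) -> String {
    format!("query:{}", command.trim().to_lowercase())
}

fn main() {
    // Leading/trailing whitespace and case differences collapse to one key...
    assert_eq!(
        cache_key_for_query("  SELECT * FROM users  "),
        cache_key_for_query("select * from users"),
    );
    // ...but internal whitespace does not, so these are distinct cache entries.
    assert_ne!(
        cache_key_for_query("SELECT  *  FROM users"),
        cache_key_for_query("SELECT * FROM users"),
    );
}
```

The second assertion is worth noting: queries that differ only in internal spacing are cached separately.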
Statement Routing
The router dispatches statements based on their StatementKind:
```mermaid
flowchart LR
    subgraph StatementKind
        SQL[Select/Insert/Update/Delete]
        DDL[CreateTable/DropTable/CreateIndex/DropIndex]
        Graph[Node/Edge/Neighbors/Path]
        Vector[Embed/Similar]
        Unified[Find/Entity]
        Services[Vault/Cache/Blob/Checkpoint/Chain/Cluster]
    end
    SQL --> RE[RelationalEngine]
    DDL --> RE
    Graph --> GE[GraphEngine]
    Vector --> VE[VectorEngine]
    Unified --> UE[UnifiedEngine]
    Services --> Svc[Optional Services]
```
Complete Statement Routing Table
| Statement Type | Engine | Handler Method | Operations |
|---|---|---|---|
| `Select` | Relational | `exec_select` | Table queries with WHERE, JOIN, GROUP BY, ORDER BY |
| `Insert` | Relational | `exec_insert` | Single/multi-row insert, INSERT…SELECT |
| `Update` | Relational | `exec_update` | Row updates with conditions |
| `Delete` | Relational | `exec_delete` | Row deletion with protection |
| `CreateTable` | Relational | `exec_create_table` | Table DDL |
| `DropTable` | Relational | inline | Table removal with protection |
| `CreateIndex` | Relational | inline | Index creation |
| `DropIndex` | Relational | inline | Index removal with protection |
| `ShowTables` | Relational | inline | List tables |
| `Describe` | Multiple | `exec_describe` | Schema/node/edge info |
| `Node` | Graph | `exec_node` | CREATE/GET/DELETE/LIST/UPDATE |
| `Edge` | Graph | `exec_edge` | CREATE/GET/DELETE/LIST/UPDATE |
| `Neighbors` | Graph | `exec_neighbors` | Neighbor traversal |
| `Path` | Graph | `exec_path` | Path finding |
| `Embed` | Vector | `exec_embed` | Embedding storage, batch, delete |
| `Similar` | Vector | `exec_similar` | k-NN search |
| `ShowEmbeddings` | Vector | inline | List embedding keys |
| `CountEmbeddings` | Vector | inline | Count embeddings |
| `Find` | Unified | `exec_find` | Cross-engine queries |
| `Entity` | Unified | `exec_entity` | Entity CRUD |
| `Vault` | Vault | `exec_vault` | Secret management |
| `Cache` | Cache | `exec_cache` | LLM response cache |
| `Blob` | BlobStore | `exec_blob` | Artifact operations |
| `Blobs` | BlobStore | `exec_blobs` | Artifact listing |
| `Checkpoint` | Checkpoint | `exec_checkpoint` | Create snapshot |
| `Rollback` | Checkpoint | `exec_rollback` | Restore snapshot |
| `Checkpoints` | Checkpoint | `exec_checkpoints` | List snapshots |
| `Chain` | TensorChain | `exec_chain` | Blockchain operations |
| `Cluster` | Orchestrator | `exec_cluster` | Cluster management |
| `Empty` | — | inline | No-op |
Statement Handler Pattern
Each handler follows a consistent pattern:
```rust
fn exec_<statement>(&self, stmt: &<Statement>Stmt) -> Result<QueryResult> {
    // 1. Validate/extract parameters
    let param = self.eval_string_expr(&stmt.field)?;

    // 2. Check service availability (for optional services)
    let service = self.service.as_ref()
        .ok_or_else(|| RouterError::ServiceError("Service not initialized".to_string()))?;

    // 3. For destructive ops, check protection
    if is_destructive {
        match self.protect_destructive_op(...)? {
            ProtectedOpResult::Cancelled => return Err(...),
            ProtectedOpResult::Proceed => {},
        }
    }

    // 4. Execute operation
    let result = service.operation(...)?;

    // 5. Convert to QueryResult
    Ok(QueryResult::Variant(result))
}
```
Supported Queries
Relational Operations
```sql
-- DDL
CREATE TABLE users (id INT, name VARCHAR(100), email VARCHAR(255))
DROP TABLE users

-- DML
INSERT INTO users (id, name, email) VALUES (1, 'Alice', 'alice@example.com')
INSERT INTO users SELECT * FROM temp_users
UPDATE users SET name = 'Bob' WHERE id = 1
DELETE FROM users WHERE id = 1

-- Queries
SELECT * FROM users WHERE id = 1
SELECT id, name FROM users ORDER BY name ASC LIMIT 10 OFFSET 5
SELECT COUNT(*), AVG(age) FROM users WHERE active = true GROUP BY dept HAVING COUNT(*) > 5

-- JOINs
SELECT * FROM users u INNER JOIN orders o ON u.id = o.user_id
SELECT * FROM users u LEFT JOIN profiles p ON u.id = p.user_id
SELECT * FROM a CROSS JOIN b
SELECT * FROM a NATURAL JOIN b
```
Aggregate Functions
| Function | Description | Null Handling |
|---|---|---|
| `COUNT(*)` | Count all rows | Counts nulls |
| `COUNT(col)` | Count non-null values | Excludes nulls |
| `SUM(col)` | Sum numeric values | Skips nulls |
| `AVG(col)` | Average numeric values | Skips nulls, returns NULL if no values |
| `MIN(col)` | Minimum value | Skips nulls |
| `MAX(col)` | Maximum value | Skips nulls |
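The null-handling column can be made concrete with `Option<f64>` standing in for a nullable column. This is an illustrative sketch of the semantics in the table, not engine code:

```rust
// COUNT(*) counts every row, nulls included.
fn count_star(col: &[Option<f64>]) -> usize {
    col.len()
}

// COUNT(col) counts only non-null values.
fn count_col(col: &[Option<f64>]) -> usize {
    col.iter().flatten().count()
}

// AVG(col) skips nulls and returns NULL (None) when no values remain.
fn avg(col: &[Option<f64>]) -> Option<f64> {
    let vals: Vec<f64> = col.iter().flatten().copied().collect();
    if vals.is_empty() {
        None
    } else {
        Some(vals.iter().sum::<f64>() / vals.len() as f64)
    }
}

fn main() {
    let col = [Some(10.0), None, Some(20.0)];
    assert_eq!(count_star(&col), 3);       // nulls counted
    assert_eq!(count_col(&col), 2);        // nulls excluded
    assert_eq!(avg(&col), Some(15.0));     // (10 + 20) / 2, null skipped
    assert_eq!(avg(&[None, None]), None);  // no values -> NULL
}
```

The practical consequence: `AVG(col)` and `SUM(col) / COUNT(*)` can disagree whenever the column contains nulls.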
Graph Operations
```sql
-- Node operations
NODE CREATE person {name: 'Alice', age: 30}
NODE GET 123
NODE DELETE 123
NODE LIST person LIMIT 100
NODE UPDATE 123 {name: 'Alice Smith'}

-- Edge operations
EDGE CREATE person:1 friend person:2 {since: 2020}
EDGE GET 456
EDGE DELETE 456
EDGE LIST friend LIMIT 50

-- Traversals
NEIGHBORS person:1 friend OUTGOING
NEIGHBORS 123 * BOTH
PATH person:1 TO person:5 VIA friend
```
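`PATH` returns a hop-by-hop list of node IDs (the `Path(Vec<u64>)` result variant), and the performance table later in this chapter describes it as a BFS traversal. A self-contained sketch of those assumed semantics, a shortest hop-count path over directed edges, is below; it is not the engine's actual code:

```rust
use std::collections::{HashMap, VecDeque};

// BFS shortest path by hop count over directed (from, to) edges.
// Returns the full node path, mirroring QueryResult::Path(Vec<u64>).
fn shortest_path(edges: &[(u64, u64)], from: u64, to: u64) -> Option<Vec<u64>> {
    let mut adj: HashMap<u64, Vec<u64>> = HashMap::new();
    for &(a, b) in edges {
        adj.entry(a).or_default().push(b);
    }

    let mut prev: HashMap<u64, u64> = HashMap::new(); // child -> parent
    let mut queue: VecDeque<u64> = VecDeque::new();
    queue.push_back(from);

    while let Some(node) = queue.pop_front() {
        if node == to {
            // Walk predecessors back to the start to rebuild the path.
            let mut path = vec![to];
            let mut cur = to;
            while cur != from {
                cur = prev[&cur];
                path.push(cur);
            }
            path.reverse();
            return Some(path);
        }
        for &next in adj.get(&node).into_iter().flatten() {
            if next != from && !prev.contains_key(&next) {
                prev.insert(next, node);
                queue.push_back(next);
            }
        }
    }
    None // no path: BFS exhausted all reachable nodes
}

fn main() {
    // Direct route 1 -> 2 -> 5, plus a longer detour 1 -> 3 -> 4 -> 5.
    let edges = [(1, 2), (2, 5), (1, 3), (3, 4), (4, 5)];
    assert_eq!(shortest_path(&edges, 1, 5), Some(vec![1, 2, 5]));
    assert_eq!(shortest_path(&edges, 5, 1), None); // edges are directed
}
```

BFS guarantees the fewest hops, which is why `PATH` cost is listed as O(V+E) rather than depending on path length.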
Vector Operations
```sql
-- Single embedding
EMBED doc1 [0.1, 0.2, 0.3, 0.4]
EMBED DELETE doc1

-- Batch embedding
EMBED BATCH [('key1', [0.1, 0.2]), ('key2', [0.3, 0.4])]

-- Similarity search
SIMILAR 'doc1' LIMIT 5
SIMILAR 'doc1' LIMIT 5 EUCLIDEAN
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 COSINE

-- Listing
SHOW EMBEDDINGS LIMIT 100
COUNT EMBEDDINGS
```
Distance Metrics
| Metric | Description | Use Case | Formula |
|---|---|---|---|
| `COSINE` | Cosine similarity (default) | Semantic similarity | 1 - (a·b) / (‖a‖ * ‖b‖) |
| `EUCLIDEAN` | Euclidean distance (L2) | Spatial distance | sqrt(sum((a_i - b_i)^2)) |
| `DOT_PRODUCT` | Dot product | Magnitude-aware similarity | sum(a_i * b_i) |
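Written out directly, the three formulas are only a few lines of Rust. This is an illustrative sketch, not the engine's implementation; note that the `COSINE` formula in the table is a distance, so 0 means identical direction:

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// 1 - (a.b) / (|a| * |b|): 0 for same direction, 1 for orthogonal, 2 for opposite.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let norm = |v: &[f32]| dot(v, v).sqrt();
    1.0 - dot(a, b) / (norm(a) * norm(b))
}

// sqrt(sum((a_i - b_i)^2))
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    assert_eq!(dot(&a, &b), 0.0);                               // orthogonal
    assert!((cosine_distance(&a, &b) - 1.0).abs() < 1e-6);      // max angular distance
    assert!((euclidean(&a, &b) - 2.0f32.sqrt()).abs() < 1e-6);  // unit square diagonal
}
```

Cosine ignores vector magnitude, which is why it suits normalized text embeddings, while `DOT_PRODUCT` rewards longer vectors.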
Unified Entity Operations
```sql
-- Create entity with all modalities
ENTITY CREATE 'user:1' {name: 'Alice'} EMBEDDING [0.1, 0.2, 0.3]

-- Connect entities
ENTITY CONNECT 'user:1' -> 'doc:1' : authored

-- Combined similarity + graph search
SIMILAR 'query:key' CONNECTED TO 'hub:entity' LIMIT 10
```
Cross-Engine Queries
Cross-engine queries combine graph relationships with vector similarity:
```rust
let store = TensorStore::new();
let mut router = QueryRouter::with_shared_store(store);

// Set up entities with embeddings
router.vector().set_entity_embedding("user:1", vec![0.1, 0.2, 0.3])?;
router.vector().set_entity_embedding("user:2", vec![0.15, 0.25, 0.35])?;

// Connect via graph edges
router.connect_entities("user:1", "user:2", "follows")?;

// Build HNSW index for O(log n) similarity search
router.build_vector_index()?;

// Find neighbors sorted by similarity
let results = router.find_neighbors_by_similarity("user:1", &query_vec, 10)?;

// Find similar AND connected entities
let results = router.find_similar_connected("user:1", "user:2", 5)?;
```
Cross-Engine Methods
| Method | Description | Complexity |
|---|---|---|
| `build_vector_index()` | Build HNSW index for O(log n) search | O(n log n) |
| `connect_entities(from, to, type)` | Add graph edge between entities | O(1) |
| `find_neighbors_by_similarity(key, query, k)` | Neighbors sorted by vector similarity | O(k * log n) with HNSW |
| `find_similar_connected(query, connected_to, k)` | Similar AND connected entities | O(k * log n) + O(neighbors) |
| `create_unified_entity(key, fields, embedding)` | Create entity with all modalities | O(1) |
Implementation Details
The find_similar_connected method combines vector and graph operations:
```rust
pub fn find_similar_connected(
    &self,
    query_key: &str,
    connected_to: &str,
    top_k: usize,
) -> Result<Vec<UnifiedItem>> {
    let query_embedding = self.vector.get_entity_embedding(query_key)?;

    // Use HNSW index if available, otherwise brute-force
    let similar = if let Some((ref index, ref keys)) = self.hnsw_index {
        self.vector.search_with_hnsw(index, keys, &query_embedding, top_k * 2)?
    } else {
        self.vector.search_entities(&query_embedding, top_k * 2)?
    };

    // Get graph neighbors of connected_to entity
    let connected_neighbors: HashSet<String> = self.graph
        .get_entity_neighbors(connected_to)
        .unwrap_or_default()
        .into_iter()
        .collect();

    // Filter to entities that are both similar AND connected
    let items: Vec<UnifiedItem> = similar
        .into_iter()
        .filter(|s| connected_neighbors.contains(&s.key))
        .take(top_k)
        .map(|s| UnifiedItem::new("vector+graph", &s.key).with_score(s.score))
        .collect();

    Ok(items)
}
```
Optional Services
Services are lazily initialized and can be enabled as needed:
```mermaid
flowchart TD
    subgraph Initialization Order
        A[QueryRouter::new] --> B[Core Engines Ready]
        B --> C{Need Vault?}
        C -->|Yes| D[init_vault]
        B --> E{Need Cache?}
        E -->|Yes| F[init_cache]
        B --> G{Need Blob?}
        G -->|Yes| H[init_blob]
        H --> I{Need Checkpoint?}
        I -->|Yes| J[init_checkpoint]
        B --> K{Need Chain?}
        K -->|Yes| L[init_chain]
        B --> M{Need Cluster?}
        M -->|Yes| N[init_cluster]
    end
    style J fill:#ffcccc
    note[Checkpoint requires Blob]
```
Vault
```rust
// Initialize with master key
router.init_vault(master_key)?;

// Or auto-initialize from NEUMANN_VAULT_KEY env var
router.ensure_vault()?;

// Set identity for access control
router.set_identity("user:alice");
```
Vault requires authentication for all operations:
```rust
fn exec_vault(&self, stmt: &VaultStmt) -> Result<QueryResult> {
    let vault = self.vault.as_ref()
        .ok_or_else(|| RouterError::VaultError("Vault not initialized".to_string()))?;

    // SECURITY: Require explicit authentication
    let identity = self.require_identity()?;

    match &stmt.operation {
        VaultOp::Get { key } => {
            let value = vault.get(identity, &key)?;
            Ok(QueryResult::Value(value))
        },
        // ...
    }
}
```
Cache
```rust
// Default configuration
router.init_cache();

// Custom configuration
router.init_cache_with_config(CacheConfig::default())?;

// Auto-initialize
router.ensure_cache();
```
Cache operations are available through queries:
```sql
CACHE INIT
CACHE STATS
CACHE CLEAR
CACHE EVICT 100
CACHE GET 'key'
CACHE PUT 'key' 'value'
CACHE SEMANTIC GET 'query' THRESHOLD 0.9
CACHE SEMANTIC PUT 'query' 'response' [0.1, 0.2, 0.3]
```
Blob Storage
```rust
// Initialize blob store
router.init_blob()?;
router.start_blob()?; // Start GC

// Graceful shutdown
router.shutdown_blob()?;
```
Blob operations use async execution internally:
```rust
fn exec_blob(&self, stmt: &BlobStmt) -> Result<QueryResult> {
    let blob = self.blob.as_ref()
        .ok_or_else(|| RouterError::BlobError("Blob store not initialized".to_string()))?;
    let runtime = self.blob_runtime.as_ref()
        .ok_or_else(|| RouterError::BlobError("Blob runtime not initialized".to_string()))?;

    match &stmt.operation {
        BlobOp::Put { filename, data, ... } => {
            let artifact_id = runtime.block_on(async {
                let blob_guard = blob.lock().await;
                blob_guard.put(&filename, &data, options).await
            })?;
            Ok(QueryResult::Value(artifact_id))
        },
        // ...
    }
}
```
Checkpoint
```rust
// Requires blob storage
router.init_blob()?;
router.init_checkpoint()?;

// Set confirmation handler for destructive ops
router.set_confirmation_handler(handler)?;
```
Checkpoint provides automatic protection for destructive operations:
```rust
fn protect_destructive_op(
    &self,
    command: &str,
    op: DestructiveOp,
    sample_data: Vec<String>,
) -> Result<ProtectedOpResult> {
    let Some(checkpoint) = self.checkpoint.as_ref() else {
        return Ok(ProtectedOpResult::Proceed);
    };

    runtime.block_on(async {
        let cp = checkpoint.lock().await;
        if !cp.auto_checkpoint_enabled() {
            return Ok(ProtectedOpResult::Proceed);
        }

        let preview = cp.generate_preview(&op, sample_data);
        if !cp.request_confirmation(&op, &preview) {
            return Ok(ProtectedOpResult::Cancelled);
        }

        // Create auto-checkpoint before operation
        cp.create_auto(command, op, preview, store).await?;
        Ok(ProtectedOpResult::Proceed)
    })
}
```
Protected operations include:
- `DELETE` (relational rows)
- `DROP TABLE`
- `DROP INDEX`
- `NODE DELETE`
- `EMBED DELETE`
- `VAULT DELETE`
- `BLOB DELETE`
- `CACHE CLEAR`
Chain
```rust
// Initialize tensor chain
router.init_chain("node_1")?;

// Auto-initialize with default node ID
router.ensure_chain()?;
```
Chain operations available through queries:
```sql
CHAIN BEGIN
CHAIN COMMIT
CHAIN ROLLBACK 100
CHAIN HISTORY 'key'
CHAIN HEIGHT
CHAIN TIP
CHAIN BLOCK 42
CHAIN VERIFY
CHAIN SHOW CODEBOOK GLOBAL
CHAIN SHOW CODEBOOK LOCAL 'domain'
CHAIN ANALYZE TRANSITIONS
```
Cluster
```rust
// Initialize cluster mode
router.init_cluster("node_1", bind_addr, &peers)?;

// Check cluster status
if router.is_cluster_active() {
    // Distributed queries enabled
}

// Graceful shutdown
router.shutdown_cluster()?;
```
Cluster initialization creates:
- `ClusterOrchestrator` for Raft consensus
- `ConsistentHashPartitioner` for key-based routing
- `QueryPlanner` for distributed execution
Distributed Query Execution
When cluster mode is active, queries are automatically distributed:
```mermaid
flowchart TD
    A[Query] --> B[QueryPlanner]
    B --> C{classify_query}
    C -->|GET key| D{partition key}
    D -->|Local| E[QueryPlan::Local]
    D -->|Remote| F[QueryPlan::Remote]
    C -->|SIMILAR| G[QueryPlan::ScatterGather]
    C -->|SELECT *| G
    C -->|COUNT| H[QueryPlan::ScatterGather + Aggregate]
    C -->|Unknown| E
    F --> I[execute_on_shard]
    G --> J[execute_scatter_gather]
    H --> J
    J --> K[ResultMerger::merge]
    K --> L[QueryResult]
```
Query Classification
The QueryPlanner classifies queries based on text pattern matching:
```rust
fn classify_query(&self, query: &str) -> QueryType {
    let query_upper = query.to_uppercase();

    // Point lookups
    if query_upper.starts_with("GET ")
        || query_upper.starts_with("NODE GET ")
        || query_upper.starts_with("ENTITY GET ")
    {
        if let Some(key) = self.extract_key(query) {
            return QueryType::PointLookup { key };
        }
    }

    // Similarity search
    if query_upper.starts_with("SIMILAR ") {
        let k = self.extract_top_k(query).unwrap_or(10);
        return QueryType::SimilaritySearch { k };
    }

    // Table scans with aggregates
    if query_upper.starts_with("SELECT ") {
        if query_upper.contains("COUNT(") {
            return QueryType::Aggregate { func: AggregateFunction::Count };
        }
        if query_upper.contains("SUM(") {
            return QueryType::Aggregate { func: AggregateFunction::Sum };
        }
        return QueryType::TableScan;
    }

    QueryType::Unknown
}
```
Query Plans
| Plan | When Used | Example | Shards Contacted |
|---|---|---|---|
| `Local` | Point lookups on local shard | `GET user:1` (local key) | 1 |
| `Remote` | Point lookups on remote shard | `GET user:2` (remote key) | 1 |
| `ScatterGather` | Full scans, aggregates, similarity | `SELECT *`, `SIMILAR`, `COUNT` | All |
Merge Strategies
| Strategy | Description | Use Case | Algorithm |
|---|---|---|---|
| `Union` | Combine all results | SELECT, NODE queries | Concatenate rows/nodes/edges |
| `TopK(k)` | Keep top K by score | SIMILAR queries | Sort by score desc, truncate |
| `Aggregate(func)` | SUM, COUNT, AVG, MAX, MIN | Aggregate queries | Combine partial aggregates |
| `FirstNonEmpty` | First result found | Point lookups | Short-circuit on first result |
| `Concat` | Concatenate in order | Ordered results | Same as Union |
Result Merger Implementation
```rust
impl ResultMerger {
    pub fn merge(results: Vec<ShardResult>, strategy: &MergeStrategy) -> Result<QueryResult> {
        // Filter out errors if not fail-fast
        let successful: Vec<_> = results.into_iter()
            .filter(|r| r.error.is_none())
            .collect();

        if successful.is_empty() {
            return Ok(QueryResult::Empty);
        }

        match strategy {
            MergeStrategy::Union => Self::merge_union(successful),
            MergeStrategy::TopK(k) => Self::merge_top_k(successful, *k),
            MergeStrategy::Aggregate(func) => Self::merge_aggregate(successful, *func),
            MergeStrategy::FirstNonEmpty => Self::merge_first_non_empty(successful),
            MergeStrategy::Concat => Self::merge_concat(successful),
        }
    }

    fn merge_top_k(results: Vec<ShardResult>, k: usize) -> Result<QueryResult> {
        let mut all_similar: Vec<SimilarResult> = Vec::new();
        for shard_result in results {
            if let QueryResult::Similar(similar) = shard_result.result {
                all_similar.extend(similar);
            }
        }

        // Sort by score descending
        all_similar.sort_by(|a, b| {
            b.score.partial_cmp(&a.score).unwrap_or(std::cmp::Ordering::Equal)
        });

        // Take top K
        all_similar.truncate(k);
        Ok(QueryResult::Similar(all_similar))
    }
}
```
Distributed Query Configuration
```rust
pub struct DistributedQueryConfig {
    /// Maximum concurrent shard queries (default: 10)
    pub max_concurrent: usize,
    /// Query timeout per shard in milliseconds (default: 5000)
    pub shard_timeout_ms: u64,
    /// Retry count for failed shards (default: 2)
    pub retry_count: usize,
    /// Whether to fail fast on first shard error (default: false)
    pub fail_fast: bool,
}
```
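Since every field documents its default, the struct pairs naturally with a `Default` impl. The sketch below assumes that shape (the real crate may derive or hand-write it differently) and shows struct-update syntax for overriding a single field:

```rust
// Assumed mirror of the config struct above, with its documented defaults.
pub struct DistributedQueryConfig {
    pub max_concurrent: usize,
    pub shard_timeout_ms: u64,
    pub retry_count: usize,
    pub fail_fast: bool,
}

impl Default for DistributedQueryConfig {
    fn default() -> Self {
        Self {
            max_concurrent: 10,
            shard_timeout_ms: 5000,
            retry_count: 2,
            fail_fast: false,
        }
    }
}

fn main() {
    // Override one field, keep the documented defaults for the rest.
    let cfg = DistributedQueryConfig { retry_count: 5, ..Default::default() };
    assert_eq!(cfg.max_concurrent, 10);
    assert_eq!(cfg.shard_timeout_ms, 5000);
    assert_eq!(cfg.retry_count, 5);
    assert!(!cfg.fail_fast);
}
```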
Semantic Routing
For embedding-aware routing, use plan_with_embedding:
```rust
pub fn plan_with_embedding(&self, query: &str, embedding: &[f32]) -> QueryPlan {
    // Get semantically relevant shards
    let relevant_shards = self.shards_for_embedding(embedding);
    if relevant_shards.is_empty() {
        return self.plan(query); // Fallback to all shards
    }

    // Route similarity search to relevant shards only
    match self.classify_query(query) {
        QueryType::SimilaritySearch { k } => QueryPlan::ScatterGather {
            shards: relevant_shards,
            query: query.to_string(),
            merge: MergeStrategy::TopK(k),
        },
        _ => self.plan(query),
    }
}
```
Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| Parse | O(n) | n = query length |
| SELECT | O(m) | m = rows in table |
| SELECT with index | O(log m + k) | k = matching rows |
| INSERT | O(1) | Single row insert |
| NODE | O(1) | Single node create |
| EDGE | O(1) | Single edge create |
| PATH | O(V+E) | BFS traversal |
| SIMILAR (brute-force) | O(n*d) | n = embeddings, d = dimensions |
| SIMILAR (HNSW) | O(log n * d) | After build_vector_index() |
| `find_similar_connected` | O(log n) or O(n) | Uses HNSW if index built |
| Distributed query | O(query) / shards | Parallelized across shards |
| Result merge (Union) | O(total results) | Linear in combined size |
| Result merge (TopK) | O(n log k) | Sort + truncate |
HNSW Index Performance
| Entities | Brute-force | With HNSW | Speedup |
|---|---|---|---|
| 200 | 4.17s | 9.3us | 448,000x |
Distributed Query Overhead
| Operation | Overhead |
|---|---|
| Query planning | ~1-5 us |
| Network round-trip | ~1-10 ms (depends on network) |
| Result serialization | ~10-100 us (depends on result size) |
| Result merging | ~1-10 us (TopK), O(n) for Union |
Query Caching
Cacheable statements are automatically cached when a cache is configured:
- Cacheable: `SELECT`, `SIMILAR`, `NEIGHBORS`, `PATH`
- Write operations: `INSERT`, `UPDATE`, `DELETE`, and DDL invalidate the cache
```rust
fn is_cacheable_statement(stmt: &Statement) -> bool {
    matches!(&stmt.kind,
        StatementKind::Select(_)
            | StatementKind::Similar(_)
            | StatementKind::Neighbors(_)
            | StatementKind::Path(_)
    )
}

fn is_write_statement(stmt: &Statement) -> bool {
    matches!(&stmt.kind,
        StatementKind::Insert(_)
            | StatementKind::Update(_)
            | StatementKind::Delete(_)
            | StatementKind::CreateTable(_)
            | StatementKind::DropTable(_)
            | StatementKind::CreateIndex(_)
            | StatementKind::DropIndex(_)
    )
}
```
Cache Usage Example
```rust
// Enable caching
router.init_cache();

// First call executes and caches (JSON serialization)
let result1 = router.execute_parsed("SELECT * FROM users")?;

// Second call returns cached result (JSON deserialization)
let result2 = router.execute_parsed("SELECT * FROM users")?;

// Write operations invalidate entire cache
router.execute_parsed("INSERT INTO users VALUES (2, 'Bob')")?;
// Cache is now empty
```
Cache Gotchas
- Full cache invalidation: Any write operation clears the entire cache. No table-level tracking.
- Case sensitivity: Cache keys are lowercased, so `SELECT` and `select` hit the same entry.
- Whitespace normalization: Queries are trimmed but not fully normalized.
- No TTL: Cached entries persist until invalidated by writes or an explicit `CACHE CLEAR`.
Best Practices
Service Initialization Order
```rust
// Initialize in dependency order
let mut router = QueryRouter::with_shared_store(store);

// Optional services (no dependencies)
router.init_vault(key)?;
router.init_cache();

// Blob first (required for checkpoint)
router.init_blob()?;
router.start_blob()?;

// Checkpoint depends on blob
router.init_checkpoint()?;
router.set_confirmation_handler(handler)?;

// Chain is independent
router.init_chain("node_1")?;

// Cluster is independent but typically last
router.init_cluster("node_1", addr, &peers)?;
```
Identity Management
```rust
// Always set identity before vault operations
router.set_identity("user:alice");

// Check authentication status
if !router.is_authenticated() {
    return Err("Authentication required");
}

// Identity persists across queries
router.execute_parsed("VAULT GET 'secret'")?; // Uses alice's identity
```
Error Handling
```rust
match router.execute_parsed(query) {
    Ok(result) => handle_result(result),
    Err(RouterError::ParseError(msg)) => println!("Invalid query: {}", msg),
    Err(RouterError::AuthenticationRequired) => println!("Please run SET IDENTITY first"),
    Err(RouterError::RelationalError(msg)) if msg.contains("not found") => {
        println!("Table not found");
    },
    Err(e) => println!("Error: {}", e),
}
```
Async vs Sync
```rust
// Use sync for simple scripts
let result = router.execute_parsed("SELECT * FROM users")?;

// Use async for concurrent operations
async fn parallel_queries(router: &QueryRouter) -> Result<()> {
    let (users, orders) = tokio::join!(
        router.execute_parsed_async("SELECT * FROM users"),
        router.execute_parsed_async("SELECT * FROM orders"),
    );
    // Both queries execute concurrently
    Ok(())
}

// Note: async execution doesn't support distributed routing yet
```
Building Vector Index
```rust
// Build index after loading embeddings
for (key, embedding) in embeddings {
    router.vector().set_entity_embedding(&key, embedding)?;
}

// Build HNSW index for fast similarity search
router.build_vector_index()?;

// Now SIMILAR queries use O(log n) search
let results = router.execute_parsed("SIMILAR 'query' LIMIT 10")?;
```
Related Modules
| Module | Relationship |
|---|---|
| Tensor Store | Underlying storage layer |
| Relational Engine | Table operations |
| Graph Engine | Node/edge operations |
| Vector Engine | Embedding operations |
| Tensor Unified | Cross-engine queries |
| Neumann Parser | Query parsing |
| Tensor Vault | Secret storage |
| Tensor Cache | LLM response caching |
| Tensor Blob | Artifact storage |
| Tensor Checkpoint | Snapshots |
| Tensor Chain | Blockchain |
| Neumann Shell | CLI interface |
Neumann Shell Architecture
The Neumann Shell (neumann_shell) provides an interactive CLI interface for
the Neumann database. It is a thin layer that delegates query execution to the
Query Router while providing readline-based input handling, command history,
output formatting, and crash recovery via write-ahead logging.
The shell follows four design principles: human-first interface (readable prompts, formatted output, command history), thin layer (minimal logic, delegates to Query Router), graceful handling (Ctrl+C does not exit, errors displayed cleanly), and zero configuration (works out of the box with sensible defaults).
Key Types
| Type | Description |
|---|---|
| `Shell` | Main shell struct holding router, config, and WAL state |
| `ShellConfig` | Configuration for history file, history size, and prompt |
| `CommandResult` | Result enum: Output, Exit, Help, Empty, Error |
| `LoopAction` | Action after command: Continue or Exit |
| `ShellError` | Error type for initialization failures |
| `Wal` | Internal write-ahead log for crash recovery |
| `RouterExecutor` | Wrapper implementing QueryExecutor trait for cluster operations |
| `ShellConfirmationHandler` | Interactive confirmation handler for destructive operations |
Shell Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| `history_file` | `Option<PathBuf>` | `~/.neumann_history` | Path for persistent history |
| `history_size` | `usize` | 1000 | Maximum history entries |
| `prompt` | `String` | `"> "` | Input prompt string |
The default history file location is determined by reading the HOME
environment variable:
```rust
fn dirs_home() -> Option<PathBuf> {
    std::env::var_os("HOME").map(PathBuf::from)
}
```
Command Result Types
| Variant | Description | REPL Behavior |
|---|---|---|
| `Output(String)` | Query executed successfully with output | Print to stdout, continue loop |
| `Exit` | Shell should exit | Print "Goodbye!", break loop |
| `Help(String)` | Help text to display | Print to stdout, continue loop |
| `Empty` | Empty input (no-op) | Continue loop silently |
| `Error(String)` | Error occurred | Print to stderr, continue loop |
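The table maps one-to-one onto a small match. A sketch with mock types (not the crate's actual shell internals):

```rust
// Mock mirrors of the shell's result and loop-action types, for illustration.
#[derive(Debug, PartialEq)]
enum LoopAction { Continue, Exit }

enum CommandResult {
    Output(String),
    Exit,
    Help(String),
    Empty,
    Error(String),
}

fn process_result(result: &CommandResult) -> LoopAction {
    match result {
        // Output and Help both print to stdout and keep the loop running.
        CommandResult::Output(text) | CommandResult::Help(text) => {
            println!("{text}");
            LoopAction::Continue
        },
        // Errors go to stderr but never kill the REPL.
        CommandResult::Error(err) => {
            eprintln!("Error: {err}");
            LoopAction::Continue
        },
        CommandResult::Empty => LoopAction::Continue,
        CommandResult::Exit => {
            println!("Goodbye!");
            LoopAction::Exit
        },
    }
}

fn main() {
    assert_eq!(process_result(&CommandResult::Error("oops".into())), LoopAction::Continue);
    assert_eq!(process_result(&CommandResult::Empty), LoopAction::Continue);
    assert_eq!(process_result(&CommandResult::Exit), LoopAction::Exit);
}
```

The key property is that only `Exit` terminates the loop; every other variant, including errors, returns control to the prompt.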
REPL Loop Implementation
The shell implements a Read-Eval-Print Loop (REPL) using the rustyline crate
for readline functionality. Here is the complete control flow:
```mermaid
flowchart TD
    A[Start run] --> B[Create Editor]
    B --> C[Load history file]
    C --> D[Set max history size]
    D --> E[Set confirmation handler if checkpoint available]
    E --> F[Print version banner]
    F --> G[readline with prompt]
    G --> H{Input result?}
    H -->|Ok line| I{Line empty?}
    I -->|No| J[Add to history]
    I -->|Yes| G
    J --> K[execute command]
    K --> L[process_result]
    L --> M{LoopAction?}
    M -->|Continue| G
    M -->|Exit| N[Save history]
    H -->|Ctrl+C| O[Print ^C]
    O --> G
    H -->|Ctrl+D EOF| P[Print Goodbye!]
    P --> N
    H -->|Error| Q[Print error]
    Q --> N
    N --> R[End]
```
Initialization Sequence
```rust
pub fn run(&mut self) -> Result<(), ShellError> {
    // 1. Create rustyline editor
    let editor: Editor<(), DefaultHistory> =
        DefaultEditor::new().map_err(|e| ShellError::Init(e.to_string()))?;
    let editor = Arc::new(Mutex::new(editor));

    // 2. Load existing history
    {
        let mut ed = editor.lock();
        if let Some(ref path) = self.config.history_file {
            let _ = ed.load_history(path);
        }
        ed.history_mut()
            .set_max_len(self.config.history_size)
            .map_err(|e| ShellError::Init(e.to_string()))?;
    }

    // 3. Set up confirmation handler for destructive operations
    {
        let router = self.router.read();
        if router.has_checkpoint() {
            let handler = Arc::new(ShellConfirmationHandler::new(Arc::clone(&editor)));
            drop(router);
            let router = self.router.write();
            if let Err(e) = router.set_confirmation_handler(handler) {
                eprintln!("Warning: Failed to set confirmation handler: {e}");
            }
        }
    }

    println!("Neumann Database Shell v{}", Self::version());
    println!("Type 'help' for available commands.\n");

    // 4. Main REPL loop
    loop {
        let readline_result = {
            let mut ed = editor.lock();
            ed.readline(&self.config.prompt)
        };

        match readline_result {
            Ok(line) => {
                if !line.trim().is_empty() {
                    let mut ed = editor.lock();
                    let _ = ed.add_history_entry(line.trim());
                }
                if Self::process_result(&self.execute(&line)) == LoopAction::Exit {
                    break;
                }
            },
            Err(ReadlineError::Interrupted) => println!("^C"),
            Err(ReadlineError::Eof) => {
                println!("Goodbye!");
                break;
            },
            Err(err) => {
                eprintln!("Error: {err}");
                break;
            },
        }
    }

    // 5. Save history on exit
    if let Some(ref path) = self.config.history_file {
        let mut ed = editor.lock();
        let _ = ed.save_history(path);
    }

    Ok(())
}
```
Command Execution Flow
flowchart TD
A[execute input] --> B{Trim empty?}
B -->|Yes| C[Return Empty]
B -->|No| D[Convert to lowercase]
D --> E{Built-in command?}
E -->|exit/quit/\q| F[Return Exit]
E -->|help/\h/\?| G[Return Help]
E -->|tables/\dt| H[list_tables]
E -->|clear/\c| I[Return ANSI clear]
E -->|wal status| J[handle_wal_status]
E -->|wal truncate| K[handle_wal_truncate]
E -->|No match| L{Prefix match?}
L -->|save compressed| M[handle_save_compressed]
L -->|save| N[handle_save]
L -->|load| O[handle_load]
L -->|vault init| P[handle_vault_init]
L -->|vault identity| Q[handle_vault_identity]
L -->|cache init| R[handle_cache_init]
L -->|cluster connect| S[handle_cluster_connect]
L -->|cluster disconnect| T[handle_cluster_disconnect]
L -->|None| U[router.execute_parsed]
U --> V{Result?}
V -->|Ok| W{is_write_command?}
W -->|Yes| X{WAL active?}
X -->|Yes| Y[wal.append]
Y --> Z[Return Output]
X -->|No| Z
W -->|No| Z
V -->|Err| AA[Return Error]
Usage Examples
Shell Creation
```rust
use neumann_shell::{Shell, ShellConfig};

// Default configuration
let shell = Shell::new();

// Custom configuration
let config = ShellConfig {
    history_file: Some("/custom/path/.neumann_history".into()),
    history_size: 500,
    prompt: "neumann> ".to_string(),
};
let shell = Shell::with_config(config);
```
Running the REPL
```rust
shell.run()?;
```
Programmatic Execution
```rust
use neumann_shell::CommandResult;

match shell.execute("SELECT * FROM users") {
    CommandResult::Output(text) => println!("{}", text),
    CommandResult::Error(err) => eprintln!("Error: {}", err),
    CommandResult::Exit => println!("Goodbye!"),
    CommandResult::Help(text) => println!("{}", text),
    CommandResult::Empty => {},
}
```
Direct Router Access
The shell provides thread-safe access to the underlying Query Router:
```rust
// Read-only access
let router_guard = shell.router();
let tables = router_guard.list_tables();

// Mutable access
let mut router_guard = shell.router_mut();
router_guard.init_vault(&key)?;

// Get Arc clone for shared ownership
let router_arc = shell.router_arc();
```
Built-in Commands
| Command | Aliases | Description |
|---|---|---|
| help | \h, \? | Show help message |
| exit | quit, \q | Exit the shell |
| tables | \dt | List all tables |
| clear | \c | Clear the screen (ANSI escape: \x1B[2J\x1B[H) |
| save 'path' | — | Save database snapshot to file |
| save compressed 'path' | — | Save compressed snapshot (int8 quantization) |
| load 'path' | — | Load database snapshot from file (auto-detects format) |
| wal status | — | Show write-ahead log status |
| wal truncate | — | Clear the write-ahead log |
| vault init | — | Initialize vault from NEUMANN_VAULT_KEY environment variable |
| vault identity 'name' | — | Set current identity for vault access control |
| cache init | — | Initialize semantic cache with default configuration |
| cluster connect | — | Connect to cluster with specified node addresses |
| cluster disconnect | — | Disconnect from cluster |
Command Parsing Details
All built-in commands are case-insensitive. The shell first converts input to lowercase before matching:
```rust
let lower = trimmed.to_lowercase();
match lower.as_str() {
    "exit" | "quit" | "\\q" => return CommandResult::Exit,
    "help" | "\\h" | "\\?" => return CommandResult::Help(Self::help_text()),
    "tables" | "\\dt" => return self.list_tables(),
    "clear" | "\\c" => return CommandResult::Output("\x1B[2J\x1B[H".to_string()),
    "wal status" => return self.handle_wal_status(),
    "wal truncate" => return self.handle_wal_truncate(),
    _ => {},
}
```
Path Extraction
The extract_path function handles both quoted and unquoted paths:
```rust
fn extract_path(input: &str, command: &str) -> Option<String> {
    let rest = input[command.len()..].trim();
    if rest.is_empty() {
        return None;
    }
    // Handle quoted path (single or double quotes)
    if (rest.starts_with('\'') && rest.ends_with('\''))
        || (rest.starts_with('"') && rest.ends_with('"'))
    {
        if rest.len() > 2 {
            return Some(rest[1..rest.len() - 1].to_string());
        }
        return None;
    }
    // Handle unquoted path
    Some(rest.to_string())
}
```
Examples:
- `save 'foo.bin'` -> `Some("foo.bin")`
- `LOAD "bar.bin"` -> `Some("bar.bin")`
- `save /path/to/file.bin` -> `Some("/path/to/file.bin")`
- `save ''` -> `None`
- `save` -> `None`
Query Support
The shell supports all query types from the Query Router:
Relational (SQL)
CREATE TABLE users (id INT, name TEXT, email TEXT)
INSERT INTO users VALUES (1, 'Alice', 'alice@example.com')
SELECT * FROM users WHERE id = 1
UPDATE users SET name = 'Bob' WHERE id = 1
DELETE FROM users WHERE id = 1
DROP TABLE users
Graph
NODE CREATE person {name: 'Alice', age: 30}
NODE LIST [label]
NODE GET id
EDGE CREATE node1 -> node2 : label [{props}]
EDGE LIST [type]
EDGE GET id
NEIGHBORS node_id OUTGOING|INCOMING|BOTH [: label]
PATH node1 -> node2 [LIMIT n]
Vector
EMBED STORE 'key' [vector values]
EMBED GET 'key'
EMBED DELETE 'key'
SIMILAR 'key' [COSINE|EUCLIDEAN|DOT_PRODUCT] LIMIT n
Unified (Cross-Engine)
FIND NODE [label] [WHERE condition] [LIMIT n]
FIND EDGE [type] [WHERE condition] [LIMIT n]
Blob Storage
BLOB PUT 'path' [CHUNK size] [TAGS 'a','b'] [FOR 'entity']
BLOB GET 'id' TO 'path'
BLOB DELETE 'id'
BLOB INFO 'id'
BLOB LINK 'id' TO 'entity'
BLOB UNLINK 'id' FROM 'entity'
BLOBS
BLOBS FOR 'entity'
BLOBS BY TAG 'tag'
Vault (Secrets)
VAULT INIT
VAULT IDENTITY 'node:name'
VAULT SET 'key' 'value'
VAULT GET 'key'
VAULT DELETE 'key'
VAULT LIST 'pattern'
VAULT ROTATE 'key' 'new'
VAULT GRANT 'entity' ON 'key'
VAULT REVOKE 'entity' ON 'key'
Cache (LLM Responses)
CACHE INIT
CACHE STATS
CACHE CLEAR
CACHE EVICT [n]
CACHE GET 'key'
CACHE PUT 'key' 'value'
Checkpoints (Rollback)
CHECKPOINT
CHECKPOINT 'name'
CHECKPOINTS
CHECKPOINTS LIMIT n
ROLLBACK TO 'name-or-id'
Write-Ahead Log (WAL)
The shell includes a write-ahead log for crash recovery. When enabled, all write commands are logged to a file that can be replayed after loading a snapshot.
WAL Data Structure
```rust
struct Wal {
    file: File,    // Open file handle for appending
    path: PathBuf, // Path to WAL file (derived from snapshot: data.bin -> data.log)
}

impl Wal {
    fn open_append(path: &Path) -> std::io::Result<Self>;
    fn append(&mut self, cmd: &str) -> std::io::Result<()>; // Writes line + flush
    fn truncate(&mut self) -> std::io::Result<()>;          // Recreates empty file
    fn path(&self) -> &Path;
    fn size(&self) -> std::io::Result<u64>;
}
```
WAL File Format
The WAL is a simple text file with one command per line. Each command is written verbatim followed by a newline and an immediate flush:
INSERT INTO users VALUES (1, 'Alice')
NODE CREATE person {name: 'Bob'}
EMBED STORE 'doc1' [0.1, 0.2, 0.3]
Format details:
- Line-delimited plain text
- UTF-8 encoded
- Each line is the exact command string
- Flushed immediately after each write for durability
- Empty lines are skipped during replay
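This line-per-command discipline can be exercised with nothing but the standard library. The sketch below is illustrative only (the shell's actual `Wal` struct is shown above): `wal_append` writes one command per line and flushes immediately, and `wal_read` returns the non-empty lines in order, the way replay does.

```rust
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader, Write};
use std::path::Path;

// Append one command per line, flushing immediately for durability.
fn wal_append(path: &Path, cmd: &str) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{cmd}")?;
    file.flush()
}

// Return the non-empty lines in order; empty lines are skipped, as in replay.
fn wal_read(path: &Path) -> std::io::Result<Vec<String>> {
    let reader = BufReader::new(File::open(path)?);
    let mut cmds = Vec::new();
    for line in reader.lines() {
        let line = line?;
        if !line.trim().is_empty() {
            cmds.push(line.trim().to_string());
        }
    }
    Ok(cmds)
}
```

Because each `writeln!` goes straight to an unbuffered `File`, every command reaches the OS before the call returns, which is what makes the log useful after a crash.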
WAL Lifecycle
stateDiagram-v2
[*] --> Inactive: Shell created
Inactive --> Active: LOAD 'snapshot.bin'
Active --> Active: Write command logged
Active --> Active: Read command (no log)
Active --> Empty: SAVE 'snapshot.bin'
Empty --> Active: Write command
Active --> Empty: WAL TRUNCATE
Active --> [*]: Shell exits
Write Command Detection
The is_write_command function determines which commands should be logged to
the WAL:
```rust
fn is_write_command(cmd: &str) -> bool {
    let upper = cmd.to_uppercase();
    let first_word = upper.split_whitespace().next().unwrap_or("");
    match first_word {
        "INSERT" | "UPDATE" | "DELETE" | "CREATE" | "DROP" => true,
        "NODE" => !upper.contains("NODE GET"),
        "EDGE" => !upper.contains("EDGE GET"),
        "EMBED" => upper.contains("EMBED STORE") || upper.contains("EMBED DELETE"),
        "VAULT" => {
            upper.contains("VAULT SET")
                || upper.contains("VAULT DELETE")
                || upper.contains("VAULT ROTATE")
                || upper.contains("VAULT GRANT")
                || upper.contains("VAULT REVOKE")
        },
        "CACHE" => upper.contains("CACHE CLEAR"),
        "BLOB" => {
            upper.contains("BLOB PUT")
                || upper.contains("BLOB DELETE")
                || upper.contains("BLOB LINK")
                || upper.contains("BLOB UNLINK")
                || upper.contains("BLOB TAG")
                || upper.contains("BLOB UNTAG")
                || upper.contains("BLOB GC")
                || upper.contains("BLOB REPAIR")
                || upper.contains("BLOB META SET")
        },
        _ => false,
    }
}
```
Write commands logged to WAL:
| Category | Commands |
|---|---|
| Relational | INSERT, UPDATE, DELETE, CREATE, DROP |
| Graph | NODE CREATE, NODE DELETE, EDGE CREATE, EDGE DELETE |
| Vector | EMBED STORE, EMBED DELETE |
| Vault | VAULT SET, VAULT DELETE, VAULT ROTATE, VAULT GRANT, VAULT REVOKE |
| Cache | CACHE CLEAR |
| Blob | BLOB PUT, BLOB DELETE, BLOB LINK, BLOB UNLINK, BLOB TAG, BLOB UNTAG, BLOB GC, BLOB REPAIR, BLOB META SET |
WAL Replay Algorithm
```rust
fn replay_wal(&self, wal_path: &Path) -> Result<usize, String> {
    let file = File::open(wal_path).map_err(|e| format!("Failed to open WAL: {e}"))?;
    let reader = BufReader::new(file);
    let mut count = 0;
    for (line_num, line) in reader.lines().enumerate() {
        let cmd = line.map_err(|e| format!("Failed to read WAL line {}: {e}", line_num + 1))?;
        let cmd = cmd.trim();
        if cmd.is_empty() {
            continue; // Skip empty lines
        }
        let result = self.router.read().execute_parsed(cmd);
        if let Err(e) = result {
            return Err(format!("WAL replay failed at line {}: {e}", line_num + 1));
        }
        count += 1;
    }
    Ok(count)
}
```
Example Session
> LOAD 'data.bin'
Loaded snapshot from: data.bin
> INSERT INTO users VALUES (1, 'Alice')
1 row affected
> -- If the shell crashes here, the INSERT is saved in data.log
> -- On next load, the WAL is automatically replayed:
> LOAD 'data.bin'
Loaded snapshot from: data.bin
Replayed 1 commands from WAL
> WAL STATUS
WAL enabled
Path: data.log
Size: 42 bytes
> SAVE 'data.bin'
Saved snapshot to: data.bin
> WAL STATUS
WAL enabled
Path: data.log
Size: 0 bytes
WAL Behavior Summary:
- The WAL is activated after `LOAD` (stored as `<snapshot>.log`)
- All write commands (INSERT, UPDATE, DELETE, NODE CREATE, etc.) are logged
- On a subsequent `LOAD`, the snapshot is loaded first, then the WAL is replayed
- `SAVE` truncates the WAL (the snapshot now contains all data)
- `WAL TRUNCATE` manually clears the log without saving
Persistence Commands
Save and Load
> CREATE TABLE users (id INT, name TEXT)
OK
> INSERT INTO users VALUES (1, 'Alice')
1 row affected
> SAVE 'backup.bin'
Saved snapshot to: backup.bin
> SAVE COMPRESSED 'backup_compressed.bin'
Saved compressed snapshot to: backup_compressed.bin
> LOAD 'backup.bin'
Loaded snapshot from: backup.bin
Auto-Detection of Embedding Dimension
For compressed snapshots, the shell auto-detects the embedding dimension by sampling stored vectors:
```rust
fn detect_embedding_dimension(store: &TensorStore) -> usize {
    // Sample vectors to find dimension
    let keys = store.scan("");
    for key in keys.iter().take(100) {
        if let Ok(tensor) = store.get(key) {
            for field in tensor.keys() {
                match tensor.get(field) {
                    Some(TensorValue::Vector(v)) => return v.len(),
                    Some(TensorValue::Sparse(s)) => return s.dimension(),
                    _ => {},
                }
            }
        }
    }
    // Default to standard BERT dimension if no vectors found
    tensor_compress::CompressionDefaults::STANDARD // 768
}
```
Compression Options:
- `SAVE`: uncompressed bincode format
- `SAVE COMPRESSED`: int8 quantization (4x smaller), delta encoding, and RLE
- `LOAD`: auto-detects the format (works with both compressed and uncompressed)
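For intuition on where the ~4x size reduction comes from: int8 quantization maps each f32 (4 bytes) to a single signed byte using a per-vector scale. The sketch below shows generic absmax quantization; it is not `tensor_compress`'s actual encoding, which additionally applies delta encoding and RLE.

```rust
// Map f32 values into i8 using a per-vector scale (absmax quantization).
fn quantize_int8(v: &[f32]) -> (Vec<i8>, f32) {
    let absmax = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if absmax == 0.0 { 1.0 } else { absmax / 127.0 };
    (v.iter().map(|x| (x / scale).round() as i8).collect(), scale)
}

// Recover approximate f32 values; error is bounded by half a quantization step.
fn dequantize_int8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&x| f32::from(x) * scale).collect()
}
```

Storing one `i8` plus a shared `f32` scale per vector is what makes "4x smaller with minimal precision loss" plausible for similarity search, where small per-component errors barely move cosine scores.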
Output Formatting
The shell converts QueryResult variants into human-readable strings through
the format_result function:
```rust
fn format_result(result: &QueryResult) -> String {
    match result {
        QueryResult::Empty => "OK".to_string(),
        QueryResult::Value(s) => s.clone(),
        QueryResult::Count(n) => format_count(*n),
        QueryResult::Ids(ids) => format_ids(ids),
        QueryResult::Rows(rows) => format_rows(rows),
        QueryResult::Nodes(nodes) => format_nodes(nodes),
        QueryResult::Edges(edges) => format_edges(edges),
        QueryResult::Path(path) => format_path(path),
        QueryResult::Similar(results) => format_similar(results),
        QueryResult::Unified(unified) => unified.description.clone(),
        QueryResult::TableList(tables) => format_table_list(tables),
        QueryResult::Blob(data) => format_blob(data),
        QueryResult::ArtifactInfo(info) => format_artifact_info(info),
        QueryResult::ArtifactList(ids) => format_artifact_list(ids),
        QueryResult::BlobStats(stats) => format_blob_stats(stats),
        QueryResult::CheckpointList(checkpoints) => format_checkpoint_list(checkpoints),
        QueryResult::Chain(chain) => format_chain_result(chain),
    }
}
```
Table Formatting Algorithm (ASCII Tables)
The format_rows function implements dynamic column width calculation:
```rust
fn format_rows(rows: &[Row]) -> String {
    if rows.is_empty() {
        return "(0 rows)".to_string();
    }

    // Get column names from first row
    let columns: Vec<&String> = rows[0].values.iter().map(|(k, _)| k).collect();
    if columns.is_empty() {
        return "(0 rows)".to_string();
    }

    // Convert rows to string values
    let string_rows: Vec<Vec<String>> = rows
        .iter()
        .map(|row| {
            columns
                .iter()
                .map(|col| row.get(col).map(|v| format!("{v:?}")).unwrap_or_default())
                .collect()
        })
        .collect();

    // Calculate column widths (max of header and all cell widths)
    let mut widths: Vec<usize> = columns.iter().map(|c| c.len()).collect();
    for row in &string_rows {
        for (i, cell) in row.iter().enumerate() {
            if i < widths.len() {
                widths[i] = widths[i].max(cell.len());
            }
        }
    }

    // Build output with header, separator, and data rows
    let mut output = String::new();

    // Header
    let header: Vec<String> = columns
        .iter()
        .zip(&widths)
        .map(|(col, &w)| format!("{col:w$}"))
        .collect();
    output.push_str(&header.join(" | "));
    output.push('\n');

    // Separator
    let sep: Vec<String> = widths.iter().map(|&w| "-".repeat(w)).collect();
    output.push_str(&sep.join("-+-"));
    output.push('\n');

    // Data rows
    for row in &string_rows {
        let formatted: Vec<String> = row
            .iter()
            .zip(&widths)
            .map(|(cell, &w)| format!("{cell:w$}"))
            .collect();
        output.push_str(&formatted.join(" | "));
        output.push('\n');
    }

    let _ = write!(output, "({} rows)", rows.len());
    output
}
```
Output example:
name | age | email
------+-----+------------------
Alice | 30 | alice@example.com
Bob | 25 | bob@example.com
(2 rows)
Node Formatting
```rust
fn format_nodes(nodes: &[NodeResult]) -> String {
    if nodes.is_empty() {
        "(0 nodes)".to_string()
    } else {
        let lines: Vec<String> = nodes
            .iter()
            .map(|n| {
                let props: Vec<String> = n
                    .properties
                    .iter()
                    .map(|(k, v)| format!("{k}: {v}"))
                    .collect();
                if props.is_empty() {
                    format!("  [{}] {} {{}}", n.id, n.label)
                } else {
                    format!("  [{}] {} {{{}}}", n.id, n.label, props.join(", "))
                }
            })
            .collect();
        format!("Nodes:\n{}\n({} nodes)", lines.join("\n"), nodes.len())
    }
}
```
Output example:
Nodes:
[1] person {name: Alice, age: 30}
[2] person {name: Bob, age: 25}
(2 nodes)
Edge Formatting
Edges:
[1] 1 -> 2 : knows
(1 edges)
Path Formatting
Path: 1 -> 3 -> 5 -> 7
Similar Embeddings Formatting
Similar:
1. doc1 (similarity: 0.9800)
2. doc2 (similarity: 0.9500)
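The shell's `format_similar` follows the same shape as `format_nodes`: a numbered list with scores printed to four decimal places. A hedged sketch over hypothetical `(key, score)` pairs (the real input is the router's similarity result type, not a tuple slice):

```rust
// Illustrative stand-in for the shell's format_similar; the actual input
// type is query_router's similarity result, not (String, f32) pairs.
fn format_similar(results: &[(String, f32)]) -> String {
    if results.is_empty() {
        return "(0 results)".to_string();
    }
    let lines: Vec<String> = results
        .iter()
        .enumerate()
        .map(|(i, (key, score))| format!("  {}. {} (similarity: {score:.4})", i + 1, key))
        .collect();
    format!("Similar:\n{}", lines.join("\n"))
}
```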
Blob Formatting
Binary data handling with size threshold:
```rust
fn format_blob(data: &[u8]) -> String {
    let size = data.len();
    if size <= 256 {
        // Try to display as UTF-8 if valid
        if let Ok(s) = std::str::from_utf8(data) {
            if s.chars().all(|c| !c.is_control() || c == '\n' || c == '\t') {
                return s.to_string();
            }
        }
    }
    // Show summary for binary/large data
    format!("<binary data: {size} bytes>")
}
```
Timestamp Formatting
Relative time formatting for better readability:
```rust
fn format_timestamp(unix_secs: u64) -> String {
    let now = std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    if unix_secs == 0 {
        return "unknown".to_string();
    }
    let diff = now.saturating_sub(unix_secs);
    if diff < 60 {
        format!("{diff}s ago")
    } else if diff < 3600 {
        let mins = diff / 60;
        format!("{mins}m ago")
    } else if diff < 86400 {
        let hours = diff / 3600;
        format!("{hours}h ago")
    } else {
        let days = diff / 86400;
        format!("{days}d ago")
    }
}
```
Destructive Operation Confirmation
The shell integrates with the checkpoint system to provide interactive confirmation for destructive operations:
```rust
struct ShellConfirmationHandler {
    editor: Arc<Mutex<Editor<(), DefaultHistory>>>,
}

impl ConfirmationHandler for ShellConfirmationHandler {
    fn confirm(&self, op: &DestructiveOp, preview: &OperationPreview) -> bool {
        let prompt = format_confirmation_prompt(op, preview);

        // Print the warning with sample data
        println!("\n{prompt}");

        // Ask for confirmation using readline
        let mut editor = self.editor.lock();
        editor
            .readline("Type 'yes' to proceed: ")
            .is_ok_and(|input| input.trim().eq_ignore_ascii_case("yes"))
    }
}
```
Supported destructive operations:
| Operation | Warning Message |
|---|---|
| Delete | WARNING: About to delete N row(s) from table 'name' |
| DropTable | WARNING: About to drop table 'name' with N row(s) |
| DropIndex | WARNING: About to drop index on 'column' in table 'name' |
| NodeDelete | WARNING: About to delete node N and M connected edge(s) |
| EmbedDelete | WARNING: About to delete embedding 'key' |
| VaultDelete | WARNING: About to delete vault secret 'key' |
| BlobDelete | WARNING: About to delete blob 'id' (size) |
| CacheClear | WARNING: About to clear cache with N entries |
Keyboard Shortcuts
Provided by rustyline:
| Shortcut | Action |
|---|---|
| Up/Down | Navigate history |
| Ctrl+C | Cancel current input (prints ^C, continues loop) |
| Ctrl+D | Exit shell (EOF) |
| Ctrl+L | Clear screen |
| Ctrl+A | Move to start of line |
| Ctrl+E | Move to end of line |
| Ctrl+W | Delete word backward |
| Ctrl+U | Delete to start of line |
Error Handling
| Error Type | Example | Output Stream |
|---|---|---|
| Parse error | Error: unexpected token 'FORM' at position 12 | stderr |
| Table not found | Error: table 'users' not found | stderr |
| Invalid query | Error: unsupported operation | stderr |
| WAL write failure | Command succeeded but WAL write failed: ... | Returned as Error |
| WAL replay failure | WAL replay failed at line N: ... | Returned as Error |
Errors are printed to stderr and do not exit the shell. The process_result
function routes output appropriately:
```rust
pub fn process_result(result: &CommandResult) -> LoopAction {
    match result {
        CommandResult::Output(text) | CommandResult::Help(text) => {
            println!("{text}");
            LoopAction::Continue
        },
        CommandResult::Error(text) => {
            eprintln!("{text}");
            LoopAction::Continue
        },
        CommandResult::Exit => {
            println!("Goodbye!");
            LoopAction::Exit
        },
        CommandResult::Empty => LoopAction::Continue,
    }
}
```
Cluster Connectivity
Connect Command Syntax
CLUSTER CONNECT 'node_id@bind_addr' ['peer_id@peer_addr', ...]
Example:
> CLUSTER CONNECT 'node1@127.0.0.1:8001' 'node2@127.0.0.1:8002'
Cluster initialized: node1 @ 127.0.0.1:8001 with 1 peer(s)
Address Parsing
```rust
fn parse_node_address(s: &str) -> Result<(String, SocketAddr), String> {
    let parts: Vec<&str> = s.splitn(2, '@').collect();
    if parts.len() != 2 {
        return Err("Expected format 'node_id@host:port'".to_string());
    }
    let node_id = parts[0].to_string();
    let addr: SocketAddr = parts[1]
        .parse()
        .map_err(|e| format!("Invalid address '{}': {}", parts[1], e))?;
    Ok((node_id, addr))
}
```
Cluster Query Execution
The shell wraps the router for distributed query execution:
```rust
struct RouterExecutor(Arc<RwLock<QueryRouter>>);

impl QueryExecutor for RouterExecutor {
    fn execute(&self, query: &str) -> Result<Vec<u8>, String> {
        let router = self.0.read();
        router.execute_for_cluster(query)
    }
}
```
Performance Characteristics
| Operation | Time |
|---|---|
| Empty input | 2.3 ns |
| Help command | 43 ns |
| SELECT (100 rows) | 17.8 us |
| Format 1000 rows | 267 us |
The shell adds negligible overhead to query execution.
Edge Cases and Gotchas
- Empty quoted paths: `save ''` returns an error, not an empty path.
- WAL not active by default: the WAL only becomes active after `LOAD`. New shells have no WAL.
- Case sensitivity: built-in commands are case-insensitive, but query strings preserve case for data.
- History persistence: history is only saved when the shell exits normally (not on a crash).
- ANSI codes: the `clear` command outputs ANSI escape sequences (`\x1B[2J\x1B[H`), which may not work on all terminals.
- Confirmation handler: only active if the checkpoint module is available when the shell starts.
- WAL replay stops on first error: if any command fails during replay, the entire replay stops.
- Missing columns: when formatting rows with inconsistent columns, missing values show as empty strings.
- Binary blob display: blobs over 256 bytes or containing control characters show as `<binary data: N bytes>`.
- Timestamp overflow: very old timestamps (before 1970) or 0 display as "unknown".
User Experience Tips
- Use compressed snapshots for large datasets: `SAVE COMPRESSED` reduces file size by ~4x with minimal precision loss.
- Check WAL status before critical operations: run `WAL STATUS` to verify recovery capability.
- Use tab completion: rustyline provides filename completion in some contexts.
- Ctrl+C is safe: it only cancels the current line, not the entire session.
- History survives sessions: previous commands are available across shell restarts.
- For scripts, use the programmatic API: `shell.execute()` returns structured results for automation.
- Cluster connect before distributed operations: ensure `CLUSTER CONNECT` succeeds before running distributed transactions.
Dependencies
| Crate | Purpose |
|---|---|
| query_router | Query execution |
| relational_engine | Row type for formatting |
| tensor_store | Snapshot persistence (save/load) |
| tensor_compress | Compressed snapshot support |
| tensor_checkpoint | Checkpoint confirmation handling |
| tensor_chain | Cluster query executor trait |
| rustyline | Readline functionality (history, shortcuts, Ctrl+C) |
| parking_lot | Mutex and RwLock for thread-safe router access |
| base64 | Vault key decoding |
Related Modules
- query_router: The Query Router executes all queries. The shell delegates all query parsing and execution to this module.
- tensor_store: Provides the underlying storage layer and snapshot functionality.
- tensor_compress: Handles compressed snapshot format with int8 quantization.
- tensor_checkpoint: Provides checkpoint/rollback functionality with confirmation prompts.
- tensor_chain: Provides cluster connectivity and distributed transaction support.
Neumann Server Architecture
The Neumann Server (neumann_server) provides a gRPC server that exposes the
Neumann database over the network. It serves as the network gateway for remote
clients, wrapping the Query Router with authentication, TLS encryption, and
streaming support for large result sets and blob storage.
The server follows four design principles: zero-configuration startup (works out of the box with sensible defaults), security-first (API key authentication with constant-time comparison, TLS support), streaming-native (all large operations use gRPC streaming), and health monitoring (automatic failure tracking with configurable thresholds).
Architecture Overview
flowchart TD
subgraph Clients
CLI[neumann_client]
gRPC[gRPC Clients]
Web[gRPC-Web Browsers]
end
subgraph NeumannServer
QS[QueryService]
BS[BlobService]
HS[HealthService]
RS[ReflectionService]
Auth[Auth Middleware]
TLS[TLS Layer]
end
subgraph Backend
QR[QueryRouter]
Blob[BlobStore]
end
CLI --> TLS
gRPC --> TLS
Web --> TLS
TLS --> Auth
Auth --> QS
Auth --> BS
Auth --> HS
QS --> QR
BS --> Blob
RS --> |Service Discovery| gRPC
Key Types
| Type | Description |
|---|---|
| NeumannServer | Main server struct with router, blob store, and configuration |
| ServerConfig | Configuration for bind address, TLS, auth, and limits |
| TlsConfig | TLS certificate paths and client certificate settings |
| AuthConfig | API key list, header name, and anonymous access control |
| ApiKey | Individual API key with identity and optional description |
| QueryServiceImpl | gRPC service for query execution with streaming |
| BlobServiceImpl | gRPC service for artifact storage with streaming |
| HealthServiceImpl | gRPC service for health checks |
| HealthState | Shared health state across services |
| ServerError | Error type for server operations |
Server Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| bind_addr | SocketAddr | 127.0.0.1:9200 | Server bind address |
| tls | Option<TlsConfig> | None | TLS configuration |
| auth | Option<AuthConfig> | None | Authentication configuration |
| max_message_size | usize | 64 MB | Maximum gRPC message size |
| max_upload_size | usize | 512 MB | Maximum blob upload size |
| enable_grpc_web | bool | true | Enable gRPC-web for browsers |
| enable_reflection | bool | true | Enable service reflection |
| blob_chunk_size | usize | 64 KB | Chunk size for blob streaming |
| stream_channel_capacity | usize | 32 | Bounded channel capacity for backpressure |
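The `stream_channel_capacity` setting is what turns a slow consumer into backpressure on the producer: with a bounded channel of capacity 32, the side generating result chunks blocks once 32 are in flight. A minimal std-only sketch of the idea (the server itself uses async channels, not `std::sync::mpsc`):

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Bounded channel: the producer blocks once the buffer is full, so a slow
// consumer naturally throttles result streaming instead of exhausting memory.
fn stream_with_backpressure(items: Vec<String>) -> Vec<String> {
    let (tx, rx) = sync_channel::<String>(32);
    let producer = thread::spawn(move || {
        for item in items {
            // `send` blocks while 32 items are already queued (backpressure).
            if tx.send(item).is_err() {
                break; // Consumer hung up.
            }
        }
    });
    let received: Vec<String> = rx.into_iter().collect();
    producer.join().unwrap();
    received
}
```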
Configuration Builder
```rust
use neumann_server::{ServerConfig, TlsConfig, AuthConfig, ApiKey};
use std::path::PathBuf;

let config = ServerConfig::new()
    .with_bind_addr("0.0.0.0:9443".parse()?)
    .with_tls(TlsConfig::new(
        PathBuf::from("server.crt"),
        PathBuf::from("server.key"),
    ))
    .with_auth(
        AuthConfig::new()
            .with_api_key(ApiKey::new(
                "sk-prod-key-12345678".to_string(),
                "service:backend".to_string(),
            ))
            .with_anonymous(false),
    )
    .with_max_message_size(128 * 1024 * 1024)
    .with_grpc_web(true)
    .with_reflection(true);
```
TLS Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| cert_path | PathBuf | Required | Path to certificate file (PEM) |
| key_path | PathBuf | Required | Path to private key file (PEM) |
| ca_cert_path | Option<PathBuf> | None | CA certificate for client auth |
| require_client_cert | bool | false | Require client certificates |
TLS Setup Example
```rust
use neumann_server::TlsConfig;
use std::path::PathBuf;

// Basic TLS
let tls = TlsConfig::new(
    PathBuf::from("/etc/neumann/server.crt"),
    PathBuf::from("/etc/neumann/server.key"),
);

// Mutual TLS (mTLS)
let tls = TlsConfig::new(
    PathBuf::from("/etc/neumann/server.crt"),
    PathBuf::from("/etc/neumann/server.key"),
)
.with_ca_cert(PathBuf::from("/etc/neumann/ca.crt"))
.with_required_client_cert(true);
```
Authentication
AuthConfig Options
| Field | Type | Default | Description |
|---|---|---|---|
| api_keys | Vec<ApiKey> | Empty | List of valid API keys |
| api_key_header | String | x-api-key | Header name for API key |
| allow_anonymous | bool | false | Allow unauthenticated access |
API Key Validation
The server uses constant-time comparison to prevent timing attacks. All keys are checked regardless of match status to avoid leaking information about valid prefixes:
```rust
// Internal validation logic
fn validate_key(&self, key: &str) -> Option<&str> {
    let key_bytes = key.as_bytes();
    let mut found_identity: Option<&str> = None;
    for api_key in &self.api_keys {
        let stored_bytes = api_key.key.as_bytes();
        let max_len = stored_bytes.len().max(key_bytes.len());
        let mut matches: u8 = 1;
        for i in 0..max_len {
            let stored_byte = stored_bytes.get(i).copied().unwrap_or(0);
            let key_byte = key_bytes.get(i).copied().unwrap_or(0);
            matches &= u8::from(stored_byte == key_byte);
        }
        let lengths_match = u8::from(stored_bytes.len() == key_bytes.len());
        matches &= lengths_match;
        if matches == 1 {
            found_identity = Some(api_key.identity.as_str());
        }
    }
    found_identity
}
```
Authentication Flow
flowchart TD
A[Request arrives] --> B{Auth configured?}
B -->|No| C[Allow with no identity]
B -->|Yes| D{API key header present?}
D -->|No| E{Anonymous allowed?}
E -->|Yes| C
E -->|No| F[Return UNAUTHENTICATED]
D -->|Yes| G{Key valid?}
G -->|Yes| H[Allow with identity from key]
G -->|No| F
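The decision tree above fits in one small function. This is an illustrative sketch, not the server's actual middleware; `validate` stands in for the constant-time key check shown earlier, and the type names are hypothetical:

```rust
// Outcome of the auth check for one request (illustrative types).
enum AuthDecision {
    Allow(Option<String>), // identity from the matched key, if any
    Unauthenticated,
}

// Mirrors the flowchart: no auth config => allow; missing header => allow
// only if anonymous access is enabled; otherwise the key must validate.
fn authorize(
    auth_configured: bool,
    allow_anonymous: bool,
    header_key: Option<&str>,
    validate: impl Fn(&str) -> Option<String>,
) -> AuthDecision {
    if !auth_configured {
        return AuthDecision::Allow(None);
    }
    match header_key {
        None if allow_anonymous => AuthDecision::Allow(None),
        None => AuthDecision::Unauthenticated,
        Some(key) => match validate(key) {
            Some(identity) => AuthDecision::Allow(Some(identity)),
            None => AuthDecision::Unauthenticated,
        },
    }
}
```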
gRPC Services
QueryService
The QueryService provides query execution with three RPC methods:
| Method | Type | Description |
|---|---|---|
| Execute | Unary | Execute single query, return full result |
| ExecuteStream | Server streaming | Execute query, stream results chunk by chunk |
| ExecuteBatch | Unary | Execute multiple queries, return all results |
Execute RPC
rpc Execute(QueryRequest) returns (QueryResponse);
message QueryRequest {
string query = 1;
optional string identity = 2;
}
message QueryResponse {
oneof result {
EmptyResult empty = 1;
CountResult count = 2;
RowsResult rows = 3;
NodesResult nodes = 4;
EdgesResult edges = 5;
PathResult path = 6;
SimilarResult similar = 7;
TableListResult table_list = 8;
BlobResult blob = 9;
IdsResult ids = 10;
}
optional ErrorInfo error = 15;
}
ExecuteStream RPC
For large result sets (rows, nodes, edges, similar items, blobs), the streaming RPC sends results one item at a time:
rpc ExecuteStream(QueryRequest) returns (stream QueryResponseChunk);
message QueryResponseChunk {
oneof chunk {
RowChunk row = 1;
NodeChunk node = 2;
EdgeChunk edge = 3;
SimilarChunk similar_item = 4;
bytes blob_data = 5;
ErrorInfo error = 15;
}
bool is_final = 16;
}
ExecuteBatch RPC
rpc ExecuteBatch(BatchQueryRequest) returns (BatchQueryResponse);
message BatchQueryRequest {
repeated QueryRequest queries = 1;
}
message BatchQueryResponse {
repeated QueryResponse results = 1;
}
Security Note: In batch execution, the authenticated request identity is always used. Per-query identity fields are ignored to prevent privilege escalation attacks.
BlobService
The BlobService provides artifact storage with streaming upload/download:
| Method | Type | Description |
|---|---|---|
Upload | Client streaming | Upload artifact with metadata |
Download | Server streaming | Download artifact in chunks |
Delete | Unary | Delete artifact |
GetMetadata | Unary | Get artifact metadata |
Upload Protocol
sequenceDiagram
participant C as Client
participant S as BlobService
C->>S: UploadMetadata (filename, content_type, tags)
C->>S: Chunk 1
C->>S: Chunk 2
C->>S: ...
C->>S: Chunk N (end stream)
S->>C: UploadResponse (artifact_id, size, checksum)
The first message must be metadata, followed by data chunks:
rpc Upload(stream BlobUploadRequest) returns (BlobUploadResponse);
message BlobUploadRequest {
oneof request {
UploadMetadata metadata = 1;
bytes chunk = 2;
}
}
message UploadMetadata {
string filename = 1;
optional string content_type = 2;
repeated string tags = 3;
}
message BlobUploadResponse {
string artifact_id = 1;
uint64 size = 2;
string checksum = 3;
}
Download Protocol
rpc Download(BlobDownloadRequest) returns (stream BlobDownloadChunk);
message BlobDownloadRequest {
string artifact_id = 1;
}
message BlobDownloadChunk {
bytes data = 1;
bool is_final = 2;
}
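The server-side chunking implied by `BlobDownloadChunk` is straightforward: slice the artifact into `blob_chunk_size` pieces (64 KB by default) and mark only the last one final. A hedged sketch; `chunk_blob` is illustrative, not the server's actual helper:

```rust
// Split a blob into fixed-size chunks; only the last carries is_final = true.
fn chunk_blob(data: &[u8], chunk_size: usize) -> Vec<(Vec<u8>, bool)> {
    if data.is_empty() {
        return vec![(Vec::new(), true)]; // Empty blob: a single final chunk.
    }
    let n = data.chunks(chunk_size).count();
    data.chunks(chunk_size)
        .enumerate()
        .map(|(i, c)| (c.to_vec(), i == n - 1))
        .collect()
}
```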
HealthService
The HealthService follows the gRPC health checking protocol:
rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
message HealthCheckRequest {
optional string service = 1;
}
message HealthCheckResponse {
ServingStatus status = 1;
}
enum ServingStatus {
UNSPECIFIED = 0;
SERVING = 1;
NOT_SERVING = 2;
}
Health Check Targets
| Service Name | Checks |
|---|---|
| Empty or "" | Overall server health (all services) |
| neumann.v1.QueryService | Query service health |
| neumann.v1.BlobService | Blob service health |
| Unknown service | Returns UNSPECIFIED |
Automatic Health Tracking
The QueryService tracks consecutive failures and marks itself unhealthy after reaching the threshold (default: 5 failures):
```rust
const FAILURE_THRESHOLD: u32 = 5;

fn record_failure(&self) {
    let failures = self.consecutive_failures.fetch_add(1, Ordering::SeqCst) + 1;
    if failures >= FAILURE_THRESHOLD {
        if let Some(ref health) = self.health_state {
            health.set_query_service_healthy(false);
        }
    }
}

fn record_success(&self) {
    self.consecutive_failures.store(0, Ordering::SeqCst);
    if let Some(ref health) = self.health_state {
        health.set_query_service_healthy(true);
    }
}
```
Server Lifecycle
Startup Sequence
flowchart TD
A[Create NeumannServer] --> B[Validate configuration]
B --> C{TLS configured?}
C -->|Yes| D[Load certificates]
C -->|No| E[Plain TCP]
D --> F[Build TLS config]
F --> G[Create services]
E --> G
G --> H{gRPC-web enabled?}
H -->|Yes| I[Add gRPC-web layer]
H -->|No| J[Standard gRPC]
I --> K{Reflection enabled?}
J --> K
K -->|Yes| L[Add reflection service]
K -->|No| M[Start serving]
L --> M
M --> N[Accept connections]
Basic Server Setup
use neumann_server::{NeumannServer, ServerConfig};
use query_router::QueryRouter;
use std::sync::Arc;
use parking_lot::RwLock;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create router
    let router = Arc::new(RwLock::new(QueryRouter::new()));

    // Create server with default config
    let server = NeumannServer::new(router, ServerConfig::default());

    // Start serving (blocks until shutdown)
    server.serve().await?;
    Ok(())
}
Server with Shared Storage
For applications that need both query and blob services sharing the same storage:
use neumann_server::{NeumannServer, ServerConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ServerConfig::default();

    // Creates QueryRouter and BlobStore sharing the same TensorStore
    let server = NeumannServer::with_shared_storage(config).await?;
    server.serve().await?;
    Ok(())
}
Graceful Shutdown
use tokio::signal;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let server = NeumannServer::with_shared_storage(ServerConfig::default()).await?;

    // Shutdown on Ctrl+C
    server
        .serve_with_shutdown(async {
            let _ = signal::ctrl_c().await;
        })
        .await?;
    Ok(())
}
Error Handling
Server Errors
| Error | Cause | gRPC Status |
|---|---|---|
| Config | Invalid configuration | INVALID_ARGUMENT |
| Transport | Network/TLS failure | UNAVAILABLE |
| Query | Query execution failed | INVALID_ARGUMENT |
| Auth | Authentication failed | UNAUTHENTICATED |
| Blob | Blob operation failed | INTERNAL |
| Internal | Unexpected server error | INTERNAL |
| InvalidArgument | Bad request data | INVALID_ARGUMENT |
| NotFound | Resource not found | NOT_FOUND |
| PermissionDenied | Access denied | PERMISSION_DENIED |
| Io | I/O error | INTERNAL |
Error Conversion
impl From<ServerError> for Status {
    fn from(err: ServerError) -> Self {
        match &err {
            ServerError::Config(msg) => Status::invalid_argument(msg),
            ServerError::Transport(e) => Status::unavailable(e.to_string()),
            ServerError::Query(msg) => Status::invalid_argument(msg),
            ServerError::Auth(msg) => Status::unauthenticated(msg),
            ServerError::Blob(msg) => Status::internal(msg),
            ServerError::Internal(msg) => Status::internal(msg),
            ServerError::InvalidArgument(msg) => Status::invalid_argument(msg),
            ServerError::NotFound(msg) => Status::not_found(msg),
            ServerError::PermissionDenied(msg) => Status::permission_denied(msg),
            ServerError::Io(e) => Status::internal(e.to_string()),
        }
    }
}
Backpressure and Flow Control
Streaming Backpressure
The server uses bounded channels for streaming responses to prevent memory exhaustion:
// Default: 32 items buffered
let (tx, rx) = mpsc::channel(self.stream_channel_capacity);

tokio::spawn(async move {
    for item in results {
        // This will block if the channel is full, providing backpressure
        if tx.send(Ok(item)).await.is_err() {
            // Receiver dropped, stop sending
            return;
        }
    }
});
Upload Size Limits
The BlobService enforces upload size limits:
if data.len().saturating_add(chunk.len()) > max_size {
    return Err(Status::resource_exhausted(format!(
        "upload exceeds maximum size of {max_size} bytes"
    )));
}
Production Deployment
Recommended Configuration
let config = ServerConfig::new()
    .with_bind_addr("0.0.0.0:9443".parse()?)
    .with_tls(TlsConfig::new(
        PathBuf::from("/etc/neumann/tls/server.crt"),
        PathBuf::from("/etc/neumann/tls/server.key"),
    ))
    .with_auth(
        AuthConfig::new()
            .with_api_key(ApiKey::new(
                std::env::var("NEUMANN_API_KEY")?,
                "service:default".to_string(),
            ))
            .with_anonymous(false),
    )
    .with_max_message_size(64 * 1024 * 1024)
    .with_max_upload_size(1024 * 1024 * 1024) // 1 GB
    .with_stream_channel_capacity(64)
    .with_grpc_web(true)
    .with_reflection(false); // Disable in production
Health Check Integration
Use health checks with load balancers:
# grpcurl health check
grpcurl -plaintext localhost:9200 neumann.v1.Health/Check
# With service name
grpcurl -plaintext -d '{"service":"neumann.v1.QueryService"}' \
localhost:9200 neumann.v1.Health/Check
Logging
The server uses the tracing crate for structured logging:
use tracing_subscriber::FmtSubscriber;

let subscriber = FmtSubscriber::builder()
    .with_max_level(tracing::Level::INFO)
    .finish();
tracing::subscriber::set_global_default(subscriber)?;

// Server logs connection info and errors:
// INFO: Starting Neumann gRPC server with TLS on 0.0.0.0:9443
// ERROR: Query execution error: table 'users' not found
Dependencies
| Crate | Purpose |
|---|---|
| query_router | Query execution backend |
| tensor_blob | Blob storage backend |
| tensor_store | Shared storage for both query and blob |
| tonic | gRPC server framework |
| tonic-web | gRPC-web layer for browser support |
| tonic-reflection | Service reflection for debugging |
| tokio | Async runtime |
| parking_lot | Thread-safe router access |
| tracing | Structured logging |
| thiserror | Error type derivation |
Related Modules
| Module | Relationship |
|---|---|
| neumann_client | Client SDK for connecting to this server |
| query_router | Query execution backend |
| tensor_blob | Blob storage backend |
| neumann_shell | Interactive CLI (alternative interface) |
Neumann Client Architecture
The Neumann Client (neumann_client) provides a Rust SDK for interacting with
the Neumann database. It supports two modes: embedded mode for in-process
database access via the Query Router, and remote mode for network access via
gRPC to a Neumann Server.
The client follows four design principles: dual-mode flexibility (same API for embedded and remote), security-first (API keys are zeroized on drop), async-native (built on tokio for remote operations), and zero-copy where possible (streaming results for large datasets).
Architecture Overview
flowchart TD
subgraph Application
App[User Application]
end
subgraph NeumannClient
Client[NeumannClient]
Builder[ClientBuilder]
Config[ClientConfig]
end
subgraph EmbeddedMode
Router[QueryRouter]
Store[TensorStore]
end
subgraph RemoteMode
gRPC[gRPC Channel]
TLS[TLS Layer]
end
subgraph NeumannServer
Server[NeumannServer]
end
App --> Builder
Builder --> Client
Client -->|embedded| Router
Router --> Store
Client -->|remote| gRPC
gRPC --> TLS
TLS --> Server
Key Types
| Type | Description |
|---|---|
| NeumannClient | Main client struct supporting both embedded and remote modes |
| ClientBuilder | Fluent builder for remote client connections |
| ClientConfig | Configuration for remote connections (address, API key, TLS) |
| ClientMode | Enum: Embedded or Remote |
| ClientError | Error type for client operations |
| RemoteQueryResult | Wrapper for proto query response with typed accessors |
| QueryResult | Re-export of query_router result type (embedded mode) |
Client Modes
| Mode | Feature Flag | Use Case |
|---|---|---|
| Embedded | embedded | In-process database, unit testing, CLI tools |
| Remote | remote (default) | Production gRPC connections to server |
| Full | full | Both modes available |
Feature Flags
[dependencies]
# Remote only (default)
neumann_client = "0.1"
# Embedded only
neumann_client = { version = "0.1", default-features = false, features = ["embedded"] }
# Both modes
neumann_client = { version = "0.1", features = ["full"] }
Client Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| address | String | localhost:9200 | Server address (host:port) |
| api_key | Option<String> | None | API key for authentication |
| tls | bool | false | Enable TLS encryption |
| timeout_ms | u64 | 30000 | Request timeout in milliseconds |
Security: API Key Zeroization
API keys are automatically zeroed from memory when the configuration is dropped to prevent credential leakage:
impl Drop for ClientConfig {
    fn drop(&mut self) {
        if let Some(ref mut key) = self.api_key {
            key.zeroize(); // Overwrites memory with zeros
        }
    }
}
Remote Mode
Connection Builder
The ClientBuilder provides a fluent API for configuring remote connections:
use neumann_client::NeumannClient;

// Minimal connection
let client = NeumannClient::connect("localhost:9200")
    .build()
    .await?;

// Full configuration
let client = NeumannClient::connect("db.example.com:9443")
    .api_key("sk-production-key")
    .with_tls()
    .timeout_ms(60_000)
    .build()
    .await?;
Connection Flow
sequenceDiagram
participant App as Application
participant Builder as ClientBuilder
participant Client as NeumannClient
participant Channel as gRPC Channel
participant Server as NeumannServer
App->>Builder: connect("address")
App->>Builder: api_key("key")
App->>Builder: with_tls()
App->>Builder: build().await
Builder->>Channel: Create endpoint
Channel->>Server: TCP/TLS handshake
Server-->>Channel: Connection established
Channel-->>Builder: Channel ready
Builder-->>Client: NeumannClient created
Client-->>App: Ready for queries
Query Execution
// Single query
let result = client.execute("SELECT * FROM users").await?;

// With identity (for vault access)
let result = client
    .execute_with_identity("VAULT GET 'secret'", Some("service:backend"))
    .await?;

// Batch queries
let results = client
    .execute_batch(&[
        "CREATE TABLE orders (id:int, total:float)",
        "INSERT orders id=1, total=99.99",
        "SELECT orders",
    ])
    .await?;
RemoteQueryResult Accessors
The RemoteQueryResult wrapper provides typed access to query results:
let result = client.execute("SELECT * FROM users").await?;

// Check for errors
if result.has_error() {
    eprintln!("Error: {}", result.error_message().unwrap());
    return Err(...);
}

// Check result type
if result.is_empty() {
    println!("No results");
}

// Access typed data
if let Some(count) = result.count() {
    println!("Count: {}", count);
}
if let Some(rows) = result.rows() {
    for row in rows {
        println!("Row ID: {}", row.id);
    }
}
if let Some(nodes) = result.nodes() {
    for node in nodes {
        println!("Node: {} ({})", node.id, node.label);
    }
}
if let Some(edges) = result.edges() {
    for edge in edges {
        println!("Edge: {} -> {}", edge.from, edge.to);
    }
}
if let Some(similar) = result.similar() {
    for item in similar {
        println!("{}: {:.4}", item.key, item.score);
    }
}

// Access the raw proto response
let proto = result.into_inner();
Blocking Connection
For synchronous contexts, use the blocking builder:
let client = NeumannClient::connect("localhost:9200")
    .api_key("test-key")
    .build_blocking()?; // Creates a temporary tokio runtime
Embedded Mode
Creating an Embedded Client
use neumann_client::NeumannClient;

// New embedded database
let client = NeumannClient::embedded()?;

// With a custom router (for shared state)
use query_router::QueryRouter;
use std::sync::Arc;
use parking_lot::RwLock;

let router = Arc::new(RwLock::new(QueryRouter::new()));
let client = NeumannClient::with_router(router);
Synchronous Query Execution
Embedded mode provides synchronous execution for simpler code flow:
use neumann_client::QueryResult;

// Create table
let result = client.execute_sync("CREATE TABLE users (name:string, age:int)")?;
assert!(matches!(result, QueryResult::Empty));

// Insert data
let result = client.execute_sync("INSERT users name=\"Alice\", age=30")?;

// Query data
let result = client.execute_sync("SELECT users")?;
match result {
    QueryResult::Rows(rows) => {
        for row in rows {
            println!("{:?}", row);
        }
    }
    _ => {}
}
With Identity
// Set identity for vault access control
let result = client.execute_sync_with_identity(
    "VAULT GET 'api_secret'",
    Some("service:backend"),
)?;
Error Handling
Error Types
| Error | Code | Retryable | Description |
|---|---|---|---|
| Connection | 6 | Yes | Failed to connect to server |
| Query | 9 | No | Query execution failed |
| Authentication | 5 | No | Invalid API key |
| PermissionDenied | 3 | No | Access denied |
| NotFound | 2 | No | Resource not found |
| InvalidArgument | 1 | No | Bad request data |
| Parse | 8 | No | Query parse error |
| Internal | 7 | No | Server internal error |
| Timeout | 6 | Yes | Request timed out |
| Unavailable | 6 | Yes | Server unavailable |
Error Methods
let err = ClientError::Connection("connection refused".to_string());

// Get the error code
let code = err.code(); // 6

// Check whether the error is retryable
if err.is_retryable() {
    // Retry with exponential backoff
}

// Display the error
eprintln!("Error: {}", err); // "connection error: connection refused"
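In a real client the retry decision would be gated on `err.is_retryable()`. The backoff loop itself can be sketched as a self-contained helper (delay values and attempt cap are illustrative, not part of the SDK):

```rust
use std::thread::sleep;
use std::time::Duration;

// Retry a fallible operation with exponential backoff: double the delay
// after each failure, give up after max_attempts.
fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
) -> Result<T, E> {
    let mut delay = Duration::from_millis(100); // illustrative base delay
    let mut attempt = 1;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2; // 100 ms, 200 ms, 400 ms, ...
                attempt += 1;
            }
        }
    }
}

fn main() {
    // Simulate an operation that fails twice, then succeeds.
    let mut calls = 0;
    let result: Result<u32, &str> = retry_with_backoff(
        || {
            calls += 1;
            if calls < 3 { Err("unavailable") } else { Ok(42) }
        },
        5,
    );
    println!("{:?} after {} calls", result, calls); // prints "Ok(42) after 3 calls"
}
```

An async variant would use `tokio::time::sleep` instead, and production code would typically add jitter to avoid synchronized retries.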
Error Handling Pattern
use neumann_client::ClientError;

match client.execute("SELECT * FROM users").await {
    Ok(result) => {
        if result.has_error() {
            // Query-level error (e.g., table not found)
            eprintln!("Query error: {}", result.error_message().unwrap());
        } else {
            // Process results
        }
    }
    Err(ClientError::Connection(msg)) => {
        // Network error - maybe retry
        eprintln!("Connection failed: {}", msg);
    }
    Err(ClientError::Authentication(msg)) => {
        // Bad credentials - check the API key
        eprintln!("Auth failed: {}", msg);
    }
    Err(ClientError::Timeout(msg)) => {
        // Request too slow - maybe retry with a longer timeout
        eprintln!("Timeout: {}", msg);
    }
    Err(e) => {
        eprintln!("Unexpected error: {}", e);
    }
}
Conversion from gRPC Status
Remote errors are automatically converted from tonic Status:
impl From<tonic::Status> for ClientError {
    fn from(status: tonic::Status) -> Self {
        match status.code() {
            Code::InvalidArgument => Self::InvalidArgument(status.message().to_string()),
            Code::NotFound => Self::NotFound(status.message().to_string()),
            Code::PermissionDenied => Self::PermissionDenied(status.message().to_string()),
            Code::Unauthenticated => Self::Authentication(status.message().to_string()),
            Code::Unavailable => Self::Unavailable(status.message().to_string()),
            Code::DeadlineExceeded => Self::Timeout(status.message().to_string()),
            _ => Self::Internal(status.message().to_string()),
        }
    }
}
Connection Management
Connection State
let client = NeumannClient::connect("localhost:9200")
    .build()
    .await?;

// Check the mode
match client.mode() {
    ClientMode::Embedded => println!("In-process"),
    ClientMode::Remote => println!("Connected to server"),
}

// Check the connection status
if client.is_connected() {
    // Ready for queries
}
Closing Connections
let mut client = NeumannClient::connect("localhost:9200")
    .build()
    .await?;

// Explicit close
client.close();

// Or automatic on drop
drop(client); // Connection closed, API key zeroized
Usage Examples
Complete Remote Example
use neumann_client::NeumannClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to the server
    let client = NeumannClient::connect("localhost:9200")
        .api_key(std::env::var("NEUMANN_API_KEY")?)
        .with_tls()
        .timeout_ms(30_000)
        .build()
        .await?;

    // Create schema
    client.execute("CREATE TABLE products (name:string, price:float)").await?;

    // Insert data
    client.execute("INSERT products name=\"Widget\", price=9.99").await?;
    client.execute("INSERT products name=\"Gadget\", price=19.99").await?;

    // Query data
    let result = client.execute("SELECT products WHERE price > 10").await?;
    if let Some(rows) = result.rows() {
        for row in rows {
            println!("Product: {:?}", row);
        }
    }
    Ok(())
}
Complete Embedded Example
use neumann_client::{NeumannClient, QueryResult};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create an embedded client
    let client = NeumannClient::embedded()?;

    // Create schema
    client.execute_sync("CREATE TABLE events (name:string, timestamp:int)")?;

    // Insert data
    client.execute_sync("INSERT events name=\"login\", timestamp=1700000000")?;

    // Query data
    match client.execute_sync("SELECT events")? {
        QueryResult::Rows(rows) => {
            println!("Found {} events", rows.len());
            for row in rows {
                println!("  {:?}", row);
            }
        }
        _ => println!("Unexpected result type"),
    }
    Ok(())
}
Testing with Embedded Mode
#[cfg(test)]
mod tests {
    use neumann_client::{NeumannClient, QueryResult};

    #[test]
    fn test_user_creation() {
        let client = NeumannClient::embedded().unwrap();

        // Setup
        client
            .execute_sync("CREATE TABLE users (email:string, active:bool)")
            .unwrap();

        // Test
        client
            .execute_sync("INSERT users email=\"test@example.com\", active=true")
            .unwrap();

        // Verify
        let result = client.execute_sync("SELECT users").unwrap();
        match result {
            QueryResult::Rows(rows) => {
                assert_eq!(rows.len(), 1);
            }
            _ => panic!("Expected rows"),
        }
    }
}
Shared Router Between Clients
use neumann_client::NeumannClient;
use query_router::QueryRouter;
use std::sync::Arc;
use parking_lot::RwLock;

// Create a shared router
let router = Arc::new(RwLock::new(QueryRouter::new()));

// Create multiple clients sharing the same state
let client1 = NeumannClient::with_router(Arc::clone(&router));
let client2 = NeumannClient::with_router(Arc::clone(&router));

// Changes from client1 are visible to client2
client1.execute_sync("CREATE TABLE shared (x:int)")?;
let result = client2.execute_sync("SELECT shared")?; // Works!
Best Practices
Connection Reuse
Create one client and reuse it for multiple queries:
// Good: reuse one client
let client = NeumannClient::connect("localhost:9200").build().await?;
for query in queries {
    client.execute(&query).await?;
}

// Bad: a new connection per query
for query in queries {
    let client = NeumannClient::connect("localhost:9200").build().await?;
    client.execute(&query).await?; // Connection overhead for each query
}
Timeout Configuration
Set appropriate timeouts based on query complexity:
// Quick queries
let client = NeumannClient::connect("localhost:9200")
    .timeout_ms(5_000) // 5 seconds
    .build()
    .await?;

// Complex analytics
let client = NeumannClient::connect("localhost:9200")
    .timeout_ms(300_000) // 5 minutes
    .build()
    .await?;
API Key Security
Never hardcode API keys:
// Good: read the key from an environment variable
let api_key = std::env::var("NEUMANN_API_KEY")?;
let client = NeumannClient::connect("localhost:9200")
    .api_key(api_key)
    .build()
    .await?;

// Bad: hardcoded key
let client = NeumannClient::connect("localhost:9200")
    .api_key("sk-secret-key-12345") // Will be embedded in the binary!
    .build()
    .await?;
Dependencies
| Crate | Purpose | Feature |
|---|---|---|
| query_router | Embedded mode query execution | embedded |
| tonic | gRPC client | remote |
| tokio | Async runtime | remote |
| parking_lot | Thread-safe router access | embedded |
| zeroize | Secure memory clearing | Always |
| thiserror | Error type derivation | Always |
| tracing | Structured logging | Always |
Related Modules
| Module | Relationship |
|---|---|
| neumann_server | Server counterpart for remote mode |
| query_router | Query execution backend for embedded mode |
| neumann_shell | Alternative interactive interface |
TypeScript SDK Architecture
The TypeScript SDK (@neumann/client) provides a TypeScript/JavaScript client
for the Neumann database with support for both Node.js (gRPC) and browser
(gRPC-Web) environments.
The SDK follows four design principles: environment agnostic (same API for Node.js and browsers via dynamic imports), type-safe (full TypeScript support with discriminated unions for results), streaming-first (async iterators for large result sets), and zero dependencies in core types (proto converters are separate from type definitions).
Architecture Overview
flowchart TD
subgraph Application
App[TypeScript Application]
end
subgraph SDK["@neumann/client"]
Client[NeumannClient]
Types[Type Definitions]
Errors[Error Classes]
Converters[Proto Converters]
end
subgraph NodeJS[Node.js Environment]
gRPC["@grpc/grpc-js"]
end
subgraph Browser[Browser Environment]
gRPCWeb[grpc-web]
end
App --> Client
Client --> Types
Client --> Errors
Client --> Converters
Client -->|connect| gRPC
Client -->|connectWeb| gRPCWeb
gRPC --> Server[NeumannServer]
gRPCWeb --> Server
Installation
# npm
npm install @neumann/client
# yarn
yarn add @neumann/client
# pnpm
pnpm add @neumann/client
For Node.js, also install the gRPC package:
npm install @grpc/grpc-js
For browsers, install gRPC-Web:
npm install grpc-web
Key Types
| Type | Description |
|---|---|
| NeumannClient | Main client class for database operations |
| ConnectOptions | Options for server connection (API key, TLS, metadata) |
| QueryOptions | Options for query execution (identity) |
| ClientMode | Client mode: 'remote' or 'embedded' |
| QueryResult | Discriminated union of all result types |
| Value | Typed scalar value with type tag |
| Row | Relational row with column values |
| Node | Graph node with label and properties |
| Edge | Graph edge with type, source, target, properties |
| Path | Graph path as list of segments |
| SimilarItem | Vector similarity search result |
| ArtifactInfo | Blob artifact metadata |
| NeumannError | Base error class with error code |
Connection Options
| Field | Type | Default | Description |
|---|---|---|---|
| apiKey | string? | undefined | API key for authentication |
| tls | boolean? | false | Enable TLS encryption |
| metadata | Record<string, string>? | undefined | Custom metadata headers |
Query Options
| Field | Type | Default | Description |
|---|---|---|---|
| identity | string? | undefined | Identity for vault access control |
Connection Methods
Node.js Connection
import { NeumannClient } from '@neumann/client';
// Basic connection
const client = await NeumannClient.connect('localhost:9200');
// With authentication and TLS
const client = await NeumannClient.connect('db.example.com:9443', {
apiKey: process.env.NEUMANN_API_KEY,
tls: true,
metadata: { 'x-request-id': 'abc123' },
});
Browser Connection (gRPC-Web)
import { NeumannClient } from '@neumann/client';
// Connect via gRPC-Web
const client = await NeumannClient.connectWeb('https://api.example.com', {
apiKey: 'your-api-key',
});
Query Execution
Single Query
const result = await client.execute('SELECT users');
// With identity for vault access
const result = await client.execute('VAULT GET "secret"', {
identity: 'service:backend',
});
Streaming Query
For large result sets, use streaming to receive results incrementally:
for await (const chunk of client.executeStream('SELECT large_table')) {
if (chunk.type === 'rows') {
for (const row of chunk.rows) {
console.log(rowToObject(row));
}
}
}
Batch Query
Execute multiple queries in a single request:
const results = await client.executeBatch([
'CREATE TABLE orders (id:int, total:float)',
'INSERT orders id=1, total=99.99',
'SELECT orders',
]);
for (const result of results) {
console.log(result.type);
}
Query Result Types
The QueryResult type is a discriminated union. Use the type field to
determine which result type you have:
| Type | Fields | Description |
|---|---|---|
| 'empty' | - | No result (DDL operations) |
| 'value' | value: string | Single value result |
| 'count' | count: number | Row count |
| 'rows' | rows: Row[] | Relational query rows |
| 'nodes' | nodes: Node[] | Graph nodes |
| 'edges' | edges: Edge[] | Graph edges |
| 'paths' | paths: Path[] | Graph paths |
| 'similar' | items: SimilarItem[] | Vector similarity results |
| 'ids' | ids: string[] | List of IDs |
| 'tableList' | names: string[] | Table names |
| 'blob' | data: Uint8Array | Binary blob data |
| 'blobInfo' | info: ArtifactInfo | Blob metadata |
| 'error' | code: number, message: string | Error response |
Type Guards
Use the provided type guards for type-safe result handling:
import {
isRowsResult,
isNodesResult,
isErrorResult,
rowToObject,
} from '@neumann/client';
const result = await client.execute('SELECT users');
if (isErrorResult(result)) {
console.error(`Error ${result.code}: ${result.message}`);
} else if (isRowsResult(result)) {
for (const row of result.rows) {
console.log(rowToObject(row));
}
}
Result Pattern Matching
const result = await client.execute(query);
switch (result.type) {
case 'empty':
console.log('OK');
break;
case 'count':
console.log(`${result.count} rows affected`);
break;
case 'rows':
console.table(result.rows.map(rowToObject));
break;
case 'nodes':
result.nodes.forEach((n) => console.log(`[${n.id}] ${n.label}`));
break;
case 'similar':
result.items.forEach((s) => console.log(`${s.key}: ${s.score.toFixed(4)}`));
break;
case 'error':
throw new Error(result.message);
}
Value Types
Values use a tagged union pattern for type safety:
import {
Value,
nullValue,
intValue,
floatValue,
stringValue,
boolValue,
bytesValue,
valueToNative,
valueFromNative,
} from '@neumann/client';
// Create typed values
const v1: Value = nullValue();
const v2: Value = intValue(42);
const v3: Value = floatValue(3.14);
const v4: Value = stringValue('hello');
const v5: Value = boolValue(true);
const v6: Value = bytesValue(new Uint8Array([1, 2, 3]));
// Convert to native JavaScript types
const native = valueToNative(v2); // 42
// Create from native values (auto-detects type)
const auto = valueFromNative(42); // { type: 'int', data: 42 }
Conversion Utilities
Row Conversion
import { rowToObject } from '@neumann/client';
const result = await client.execute('SELECT users');
if (result.type === 'rows') {
const objects = result.rows.map(rowToObject);
// [{ name: 'Alice', age: 30 }, { name: 'Bob', age: 25 }]
}
Node Conversion
import { nodeToObject } from '@neumann/client';
const result = await client.execute('NODE LIST');
if (result.type === 'nodes') {
const objects = result.nodes.map(nodeToObject);
// [{ id: '1', label: 'person', properties: { name: 'Alice' } }]
}
Edge Conversion
import { edgeToObject } from '@neumann/client';
const result = await client.execute('EDGE LIST');
if (result.type === 'edges') {
const objects = result.edges.map(edgeToObject);
// [{ id: '1', type: 'knows', source: '1', target: '2', properties: {} }]
}
Error Handling
Error Codes
| Code | Name | Description |
|---|---|---|
| 0 | UNKNOWN | Unknown error |
| 1 | INVALID_ARGUMENT | Bad request data |
| 2 | NOT_FOUND | Resource not found |
| 3 | PERMISSION_DENIED | Access denied |
| 4 | ALREADY_EXISTS | Resource already exists |
| 5 | UNAUTHENTICATED | Authentication failed |
| 6 | UNAVAILABLE | Server unavailable |
| 7 | INTERNAL | Internal server error |
| 8 | PARSE_ERROR | Query parse error |
| 9 | QUERY_ERROR | Query execution error |
Error Classes
import {
NeumannError,
ConnectionError,
AuthenticationError,
PermissionDeniedError,
NotFoundError,
InvalidArgumentError,
ParseError,
QueryError,
InternalError,
errorFromCode,
} from '@neumann/client';
try {
await client.execute('SELECT nonexistent');
} catch (e) {
if (e instanceof ConnectionError) {
console.error('Connection failed:', e.message);
} else if (e instanceof AuthenticationError) {
console.error('Auth failed - check API key');
} else if (e instanceof ParseError) {
console.error('Query syntax error:', e.message);
} else if (e instanceof NeumannError) {
console.error(`[${e.code}] ${e.message}`);
}
}
Error Factory
Create errors from numeric codes:
import { errorFromCode, ErrorCode } from '@neumann/client';
const error = errorFromCode(ErrorCode.NOT_FOUND, 'Table not found');
// Returns NotFoundError instance
Client Lifecycle
// Create client
const client = await NeumannClient.connect('localhost:9200');
// Check connection status
console.log(client.isConnected); // true
console.log(client.clientMode); // 'remote'
// Execute queries
const result = await client.execute('SELECT users');
// Close connection when done
client.close();
console.log(client.isConnected); // false
Usage Examples
Complete CRUD Example
import { NeumannClient, isRowsResult, rowToObject } from '@neumann/client';
async function main() {
const client = await NeumannClient.connect('localhost:9200', {
apiKey: process.env.NEUMANN_API_KEY,
});
try {
// Create table
await client.execute('CREATE TABLE products (name:string, price:float)');
// Insert data
await client.execute('INSERT products name="Widget", price=9.99');
await client.execute('INSERT products name="Gadget", price=19.99');
// Query data
const result = await client.execute('SELECT products WHERE price > 10');
if (isRowsResult(result)) {
const products = result.rows.map(rowToObject);
console.log('Products over $10:', products);
}
// Update data
await client.execute('UPDATE products SET price=24.99 WHERE name="Gadget"');
// Delete data
await client.execute('DELETE products WHERE price < 15');
// Drop table
await client.execute('DROP TABLE products');
} finally {
client.close();
}
}
Graph Operations
const client = await NeumannClient.connect('localhost:9200');
// Create nodes
await client.execute('NODE CREATE person {name: "Alice", age: 30}');
await client.execute('NODE CREATE person {name: "Bob", age: 25}');
// Create edge
await client.execute('EDGE CREATE 1 -> 2 : knows {since: 2020}');
// Query nodes
const nodes = await client.execute('NODE LIST person');
if (nodes.type === 'nodes') {
nodes.nodes.forEach((n) => {
console.log(`[${n.id}] ${n.label}:`, nodeToObject(n).properties);
});
}
// Find path
const path = await client.execute('PATH 1 -> 2');
if (path.type === 'paths' && path.paths.length > 0) {
const nodeIds = path.paths[0].segments.map((s) => s.node.id);
console.log('Path:', nodeIds.join(' -> '));
}
Vector Similarity Search
const client = await NeumannClient.connect('localhost:9200');
// Store embeddings
await client.execute('EMBED STORE "doc1" [0.1, 0.2, 0.3, 0.4]');
await client.execute('EMBED STORE "doc2" [0.15, 0.25, 0.35, 0.45]');
await client.execute('EMBED STORE "doc3" [0.9, 0.8, 0.7, 0.6]');
// Find similar
const result = await client.execute('SIMILAR "doc1" COSINE LIMIT 2');
if (result.type === 'similar') {
result.items.forEach((item) => {
console.log(`${item.key}: ${item.score.toFixed(4)}`);
});
}
Browser Usage with React
import { useState, useEffect } from 'react';
import { NeumannClient, QueryResult } from '@neumann/client';
function useNeumannQuery(query: string) {
const [result, setResult] = useState<QueryResult | null>(null);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<Error | null>(null);
useEffect(() => {
let cancelled = false;
async function fetchData() {
try {
const client = await NeumannClient.connectWeb('/api/neumann');
const data = await client.execute(query);
if (!cancelled) {
setResult(data);
setLoading(false);
}
client.close();
} catch (e) {
if (!cancelled) {
setError(e as Error);
setLoading(false);
}
}
}
fetchData();
return () => {
cancelled = true;
};
}, [query]);
return { result, loading, error };
}
Proto Conversion
The SDK includes utilities for converting protobuf messages to typed objects:
| Function | Description |
|---|---|
| convertProtoValue | Convert proto Value to typed Value |
| convertProtoRow | Convert proto Row to Row |
| convertProtoNode | Convert proto Node to Node |
| convertProtoEdge | Convert proto Edge to Edge |
| convertProtoPath | Convert proto Path to Path |
| convertProtoSimilarItem | Convert proto SimilarItem to SimilarItem |
| convertProtoArtifactInfo | Convert proto ArtifactInfo to ArtifactInfo |
These are used internally but exported for custom integrations.
Dependencies
| Package | Purpose | Environment |
|---|---|---|
| @grpc/grpc-js | gRPC client | Node.js |
| grpc-web | gRPC-Web client | Browser |
The SDK uses dynamic imports to load the appropriate gRPC library based on the connection method used.
Related Modules
| Module | Relationship |
|---|---|
| neumann_server | Server that this SDK connects to |
| neumann_client | Rust SDK with same capabilities |
| neumann-py | Python SDK with same API design |
Python SDK Architecture
The Python SDK (neumann-db) provides a Python client for the Neumann database
with support for both embedded mode (via PyO3 bindings) and remote mode (via
gRPC). It includes async support and integrations for pandas and numpy.
The SDK follows four design principles: Pythonic API (context managers, type hints, dataclasses), dual-mode (same API for embedded and remote), async-first (native asyncio support), and ecosystem integration (pandas DataFrame and numpy array support).
Architecture Overview
flowchart TD
subgraph Application
App[Python Application]
end
subgraph SDK[neumann-db]
Client[NeumannClient]
AsyncClient[AsyncNeumannClient]
Tx[Transaction]
Types[Data Types]
Errors[Error Classes]
end
subgraph Integrations
Pandas[pandas Integration]
Numpy[numpy Integration]
end
subgraph EmbeddedMode[Embedded Mode]
PyO3[_native PyO3 Module]
Router[QueryRouter]
end
subgraph RemoteMode[Remote Mode]
gRPC[grpcio]
Proto[Proto Stubs]
end
App --> Client
App --> AsyncClient
Client --> Tx
Client --> Types
Client --> Errors
Client --> Pandas
Client --> Numpy
Client -->|embedded| PyO3
PyO3 --> Router
Client -->|remote| gRPC
AsyncClient --> gRPC
gRPC --> Proto
Proto --> Server[NeumannServer]
Installation
# Basic installation (remote mode only)
pip install neumann-db
# With native module for embedded mode
pip install neumann-db[native]
# With pandas integration
pip install neumann-db[pandas]
# With numpy integration
pip install neumann-db[numpy]
# Full installation
pip install neumann-db[full]
Key Types
| Type | Description |
|---|---|
| NeumannClient | Synchronous client supporting both modes |
| AsyncNeumannClient | Async client for remote mode |
| Transaction | Transaction context manager |
| QueryResult | Query result with typed accessors |
| QueryResultType | Enum of result types |
| Value | Typed scalar value |
| ScalarType | Enum of scalar types |
| Row | Relational row with typed column accessors |
| Node | Graph node with properties |
| Edge | Graph edge with properties |
| Path | Graph path as list of segments |
| PathSegment | Path segment (node + optional edge) |
| SimilarItem | Vector similarity result |
| ArtifactInfo | Blob artifact metadata |
| NeumannError | Base exception class |
Client Modes
| Mode | Class Method | Requirements | Use Case |
|---|---|---|---|
| Embedded | NeumannClient.embedded() | neumann-db[native] | Testing, CLI tools |
| Remote | NeumannClient.connect() | grpcio | Production |
| Async Remote | AsyncNeumannClient.connect() | grpcio | Async applications |
Synchronous Client
Embedded Mode
from neumann import NeumannClient
# In-memory database
client = NeumannClient.embedded()
# Persistent storage
client = NeumannClient.embedded(path="/path/to/data")
# Use as context manager
with NeumannClient.embedded() as client:
    client.execute("CREATE TABLE users (name:string)")
Remote Mode
from neumann import NeumannClient
# Basic connection
client = NeumannClient.connect("localhost:9200")
# With authentication and TLS
client = NeumannClient.connect(
    "db.example.com:9443",
    api_key="your-api-key",
    tls=True,
)
# Context manager
with NeumannClient.connect("localhost:9200") as client:
    result = client.execute("SELECT users")
Query Execution
# Single query
result = client.execute("SELECT users")
# With identity for vault access
result = client.execute(
    "VAULT GET 'secret'",
    identity="service:backend",
)
# Streaming query
for chunk in client.execute_stream("SELECT large_table"):
    for row in chunk.rows:
        print(row.to_dict())
# Batch execution
results = client.execute_batch([
    "CREATE TABLE orders (id:int, total:float)",
    "INSERT orders id=1, total=99.99",
    "SELECT orders",
])
Async Client
The async client supports remote mode only (PyO3 has threading limitations):
from neumann.aio import AsyncNeumannClient
# Connect
client = await AsyncNeumannClient.connect(
    "localhost:9200",
    api_key="your-api-key",
)
# Execute query
result = await client.execute("SELECT users")
# Streaming
async for chunk in client.execute_stream("SELECT large_table"):
    for row in chunk.rows:
        print(row.to_dict())
# Batch
results = await client.execute_batch(queries)
# Close
await client.close()
Async Context Manager
async with await AsyncNeumannClient.connect("localhost:9200") as client:
    result = await client.execute("SELECT users")
    for row in result.rows:
        print(row.to_dict())
Run Embedded in Async Context
The embedded client is synchronous, so wrap it with asyncio's run_in_executor to call it from async code without blocking the event loop:
import asyncio
from neumann import NeumannClient

async def query_embedded():
    client = NeumannClient.embedded()
    loop = asyncio.get_running_loop()
    # Run the blocking embedded call in a thread pool
    return await loop.run_in_executor(None, client.execute, "SELECT users")
Transaction Support
Transactions provide automatic commit/rollback with context managers:
from neumann import NeumannClient, Transaction
client = NeumannClient.connect("localhost:9200")
# Using Transaction directly
tx = Transaction(client)
tx.begin()
try:
    tx.execute("INSERT users name='Alice'")
    tx.execute("INSERT users name='Bob'")
    tx.commit()
except Exception:
    tx.rollback()
    raise
# Using context manager (preferred)
with Transaction(client) as tx:
    tx.execute("INSERT users name='Alice'")
    tx.execute("INSERT users name='Bob'")
    # Auto-commits on success, auto-rollbacks on exception
Transaction Properties
| Property | Type | Description |
|---|---|---|
| is_active | bool | True if transaction is active |
Transaction Methods
| Method | Description |
|---|---|
| begin() | Start the transaction |
| commit() | Commit the transaction |
| rollback() | Rollback the transaction |
| execute(query) | Execute query within transaction |
Query Result Types
The QueryResult class provides typed access to query results:
| Property | Return Type | Description |
|---|---|---|
| type | QueryResultType | Result type enum |
| is_empty | bool | True if empty result |
| is_error | bool | True if error result |
| value | str or None | Single value result |
| count | int or None | Row count |
| rows | list[Row] | Relational rows |
| nodes | list[Node] | Graph nodes |
| edges | list[Edge] | Graph edges |
| paths | list[Path] | Graph paths |
| similar_items | list[SimilarItem] | Similarity results |
| ids | list[str] | ID list |
| table_names | list[str] | Table names |
| blob_data | bytes or None | Binary data |
| blob_info | ArtifactInfo or None | Blob metadata |
| error_message | str or None | Error message |
Result Type Enum
from neumann import QueryResultType
result = client.execute(query)
match result.type:
    case QueryResultType.EMPTY:
        print("OK")
    case QueryResultType.COUNT:
        print(f"{result.count} rows affected")
    case QueryResultType.ROWS:
        for row in result.rows:
            print(row.to_dict())
    case QueryResultType.NODES:
        for node in result.nodes:
            print(f"[{node.id}] {node.label}")
    case QueryResultType.SIMILAR:
        for item in result.similar_items:
            print(f"{item.key}: {item.score:.4f}")
    case QueryResultType.ERROR:
        raise Exception(result.error_message)
Data Types
Value
Immutable typed scalar value:
from neumann import Value, ScalarType
# Create values
v1 = Value.null()
v2 = Value.int_(42)
v3 = Value.float_(3.14)
v4 = Value.string("hello")
v5 = Value.bool_(True)
v6 = Value.bytes_(b"data")
# Access type and data
print(v2.type) # ScalarType.INT
print(v2.data) # 42
# Convert to Python native type
native = v2.as_python() # 42
Row
Relational row with typed accessors:
from neumann import Row
row = result.rows[0]
# Get raw Value
val = row.get("name")
# Get typed values
name: str | None = row.get_string("name")
age: int | None = row.get_int("age")
score: float | None = row.get_float("score")
active: bool | None = row.get_bool("active")
# Convert to dict
data = row.to_dict() # {"name": "Alice", "age": 30}
Node
Graph node with properties:
from neumann import Node
node = result.nodes[0]
print(node.id) # "1"
print(node.label) # "person"
# Get property
name = node.get_property("name")
# Convert to dict
data = node.to_dict()
# {"id": "1", "label": "person", "properties": {"name": "Alice"}}
Edge
Graph edge with properties:
from neumann import Edge
edge = result.edges[0]
print(edge.id) # "1"
print(edge.edge_type) # "knows"
print(edge.source) # "1"
print(edge.target) # "2"
# Get property
since = edge.get_property("since")
# Convert to dict
data = edge.to_dict()
# {"id": "1", "type": "knows", "source": "1", "target": "2", "properties": {}}
Path
Graph path as segments:
from neumann import Path
path = result.paths[0]
# Get all nodes in path
nodes = path.nodes # [Node, Node, ...]
# Get all edges in path
edges = path.edges # [Edge, Edge, ...]
# Path length
length = len(path)
# Iterate segments
for segment in path.segments:
    print(f"Node: {segment.node.id}")
    if segment.edge:
        print(f" -> via edge {segment.edge.id}")
SimilarItem
Vector similarity result:
from neumann import SimilarItem
for item in result.similar_items:
    print(f"Key: {item.key}")
    print(f"Score: {item.score:.4f}")
    if item.metadata:
        print(f"Metadata: {item.metadata}")
ArtifactInfo
Blob artifact metadata:
from neumann import ArtifactInfo
info = result.blob_info
print(f"ID: {info.artifact_id}")
print(f"Filename: {info.filename}")
print(f"Size: {info.size} bytes")
print(f"Checksum: {info.checksum}")
print(f"Content-Type: {info.content_type}")
print(f"Created: {info.created_at}")
print(f"Tags: {info.tags}")
Error Handling
Error Codes
| Code | Name | Description |
|---|---|---|
| 0 | UNKNOWN | Unknown error |
| 1 | INVALID_ARGUMENT | Bad request data |
| 2 | NOT_FOUND | Resource not found |
| 3 | PERMISSION_DENIED | Access denied |
| 4 | ALREADY_EXISTS | Resource exists |
| 5 | UNAUTHENTICATED | Auth failed |
| 6 | UNAVAILABLE | Server unavailable |
| 7 | INTERNAL | Internal error |
| 8 | PARSE_ERROR | Query parse error |
| 9 | QUERY_ERROR | Query execution error |
Error Classes
from neumann import (
    NeumannError,
    ConnectionError,
    AuthenticationError,
    PermissionError,
    NotFoundError,
    InvalidArgumentError,
    ParseError,
    QueryError,
    InternalError,
    ErrorCode,
)
try:
    result = client.execute("SELECT nonexistent")
except ConnectionError as e:
    print(f"Connection failed: {e.message}")
except AuthenticationError:
    print("Check your API key")
except ParseError as e:
    print(f"Query syntax error: {e.message}")
except NeumannError as e:
    print(f"[{e.code.name}] {e.message}")
Error Factory
from neumann.errors import error_from_code, ErrorCode
# Create error from code
error = error_from_code(ErrorCode.NOT_FOUND, "Table 'users' not found")
# Returns NotFoundError instance
Pandas Integration
Convert query results to pandas DataFrames:
from neumann.integrations.pandas import (
    result_to_dataframe,
    rows_to_dataframe,
    dataframe_to_inserts,
)
# Result to DataFrame
result = client.execute("SELECT users")
df = result_to_dataframe(result)
# Rows to DataFrame
df = rows_to_dataframe(result.rows)
# DataFrame to INSERT statements
inserts = dataframe_to_inserts(
    df,
    table="users",
    column_mapping={"user_name": "name"},  # Optional column rename
)
# Execute inserts
for query in inserts:
    client.execute(query)
NumPy Integration
Work with vectors using numpy arrays:
from neumann.integrations.numpy import (
    vector_to_insert,
    vectors_to_inserts,
    parse_embedding,
    cosine_similarity,
    euclidean_distance,
    normalize_vectors,
)
import numpy as np
# Single vector to INSERT
query = vector_to_insert("doc1", np.array([0.1, 0.2, 0.3]))
client.execute(query)
# Multiple vectors
vectors = {
    "doc1": np.array([0.1, 0.2, 0.3]),
    "doc2": np.array([0.4, 0.5, 0.6]),
}
queries = vectors_to_inserts(vectors, normalize=True)
for q in queries:
    client.execute(q)
# Parse embedding from result
embedding = parse_embedding("[0.1, 0.2, 0.3]")
# Distance calculations
sim = cosine_similarity(vec1, vec2)
dist = euclidean_distance(vec1, vec2)
# Batch normalization
normalized = normalize_vectors(np.array([vec1, vec2, vec3]))
Usage Examples
Complete CRUD Example
from neumann import NeumannClient
with NeumannClient.connect("localhost:9200") as client:
    # Create table
    client.execute("CREATE TABLE products (name:string, price:float)")
    # Insert data
    client.execute('INSERT products name="Widget", price=9.99')
    client.execute('INSERT products name="Gadget", price=19.99')
    # Query data
    result = client.execute("SELECT products WHERE price > 10")
    for row in result.rows:
        print(row.to_dict())
    # Update
    client.execute('UPDATE products SET price=24.99 WHERE name="Gadget"')
    # Delete
    client.execute("DELETE products WHERE price < 15")
    # Drop table
    client.execute("DROP TABLE products")
Graph Operations
from neumann import NeumannClient
with NeumannClient.connect("localhost:9200") as client:
    # Create nodes
    client.execute('NODE CREATE person {name: "Alice", age: 30}')
    client.execute('NODE CREATE person {name: "Bob", age: 25}')
    # Create edge
    client.execute("EDGE CREATE 1 -> 2 : knows {since: 2020}")
    # List nodes
    result = client.execute("NODE LIST person")
    for node in result.nodes:
        print(f"[{node.id}] {node.label}: {node.to_dict()['properties']}")
    # Find neighbors
    result = client.execute("NEIGHBORS 1 OUTGOING")
    # Find path
    result = client.execute("PATH 1 -> 2")
    if result.paths:
        path = result.paths[0]
        print(" -> ".join(n.id for n in path.nodes))
Vector Search with NumPy
from neumann import NeumannClient
from neumann.integrations.numpy import vector_to_insert, normalize_vectors
import numpy as np
with NeumannClient.connect("localhost:9200") as client:
    # Generate and store embeddings
    embeddings = np.random.randn(100, 768).astype(np.float32)
    embeddings = normalize_vectors(embeddings)
    for i, emb in enumerate(embeddings):
        query = vector_to_insert(f"doc{i}", emb)
        client.execute(query)
    # Query vector
    query_vec = np.random.randn(768).astype(np.float32)
    query_str = vector_to_insert("query", query_vec)
    client.execute(query_str)
    # Find similar
    result = client.execute('SIMILAR "query" COSINE LIMIT 10')
    for item in result.similar_items:
        print(f"{item.key}: {item.score:.4f}")
Async Web Application
from fastapi import FastAPI
from neumann.aio import AsyncNeumannClient
app = FastAPI()
client: AsyncNeumannClient | None = None
@app.on_event("startup")
async def startup():
    global client
    client = await AsyncNeumannClient.connect(
        "localhost:9200",
        api_key="your-api-key",
    )

@app.on_event("shutdown")
async def shutdown():
    if client:
        await client.close()

@app.get("/users")
async def get_users():
    result = await client.execute("SELECT users")
    return [row.to_dict() for row in result.rows]

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    result = await client.execute(f"SELECT users WHERE id = {user_id}")
    if result.rows:
        return result.rows[0].to_dict()
    return {"error": "Not found"}
Dependencies
| Package | Purpose | Extra |
|---|---|---|
| grpcio | gRPC client | Default |
| protobuf | Protocol buffers | Default |
| neumann-native | PyO3 bindings | [native] |
| pandas | DataFrame support | [pandas] |
| numpy | Array support | [numpy] |
Related Modules
| Module | Relationship |
|---|---|
| neumann_server | Server that this SDK connects to |
| neumann_client | Rust SDK with same capabilities |
| @neumann/client | TypeScript SDK with same API design |
TCP Transport
The TCP transport layer provides reliable, secure node-to-node communication for the tensor_chain distributed system. It implements connection pooling, TLS security, rate limiting, compression, and automatic reconnection.
Overview
The TcpTransport implements the Transport trait, providing:
- Connection pooling for efficient peer communication
- TLS encryption with mutual authentication support
- Rate limiting using token bucket algorithm
- LZ4 compression for bandwidth efficiency
- Automatic reconnection with exponential backoff
flowchart TD
A[Application] --> B[TcpTransport]
B --> C[ConnectionManager]
C --> D1[ConnectionPool Node A]
C --> D2[ConnectionPool Node B]
C --> D3[ConnectionPool Node C]
D1 --> E1[TLS Stream]
D2 --> E2[TLS Stream]
D3 --> E3[TLS Stream]
Connection Architecture
Connection Manager
The ConnectionManager maintains connection pools for each peer. Each pool
can hold multiple connections for load distribution.
let config = TcpTransportConfig::new("node1", "0.0.0.0:9100".parse()?);
let transport = TcpTransport::new(config);
transport.start().await?;
Connection Lifecycle
stateDiagram-v2
[*] --> Connecting
Connecting --> Handshaking: TCP Connected
Handshaking --> Active: Handshake Success
Handshaking --> Failed: Handshake Failed
Active --> Reading: Message Available
Active --> Writing: Send Request
Reading --> Active: Message Processed
Writing --> Active: Message Sent
Active --> Reconnecting: Connection Lost
Reconnecting --> Connecting: Backoff Complete
Reconnecting --> Failed: Max Retries
Failed --> [*]
Configuration
| Parameter | Default | Description |
|---|---|---|
| pool_size | 2 | Connections per peer |
| connect_timeout_ms | 5000 | Connection timeout in milliseconds |
| io_timeout_ms | 30000 | Read/write timeout in milliseconds |
| max_message_size | 16 MB | Maximum message size in bytes |
| keepalive | true | Enable TCP keepalive |
| keepalive_interval_secs | 30 | Keepalive probe interval |
| max_pending_messages | 1000 | Outbound queue size per peer |
| recv_buffer_size | 1000 | Incoming message channel size |
TLS Security
The transport supports four security modes to accommodate different deployment scenarios.
Security Modes
| Mode | TLS | mTLS | NodeId Verify | Use Case |
|---|---|---|---|---|
| Strict | Yes | Yes | Yes | Production deployments |
| Permissive | Yes | No | No | Gradual TLS rollout |
| Development | No | No | No | Local testing only |
| Legacy | No | No | No | Migration from older versions |
NodeId Verification
NodeId verification ensures the peer’s identity matches their TLS certificate:
| Mode | Description |
|---|---|
| None | Trust NodeId from handshake (testing only) |
| CommonName | NodeId must match certificate CN |
| SubjectAltName | NodeId must match a SAN DNS entry |
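The check itself reduces to comparing the claimed NodeId against fields of the peer's certificate. A minimal Python sketch of the three modes; the `cert` dict and function name are illustrative stand-ins for the Rust implementation, which works on a parsed X.509 certificate:

```python
def verify_node_id(claimed: str, cert: dict, mode: str) -> bool:
    """cert is a pre-parsed stand-in: {'cn': ..., 'sans': [...]}."""
    if mode == "None":
        return True  # trust the handshake value (testing only)
    if mode == "CommonName":
        return claimed == cert.get("cn")
    if mode == "SubjectAltName":
        return claimed in cert.get("sans", [])
    raise ValueError(f"unknown verification mode: {mode}")
```

In Strict deployments a mismatch here rejects the connection even though the TLS handshake itself succeeded.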
TLS Configuration
[tls]
cert_path = "/etc/neumann/node.crt"
key_path = "/etc/neumann/node.key"
ca_cert_path = "/etc/neumann/ca.crt"
require_client_auth = true
node_id_verification = "CommonName"
mTLS Handshake
sequenceDiagram
participant C as Client Node
participant S as Server Node
C->>S: TCP Connect
C->>S: TLS ClientHello
S->>C: TLS ServerHello + Certificate
S->>C: CertificateRequest
C->>S: Client Certificate
C->>S: CertificateVerify
C->>S: Finished
S->>C: Finished
Note over C,S: TLS Established
C->>S: Handshake(node_id, capabilities)
S->>C: Handshake(node_id, capabilities)
Note over C,S: Connection Ready
Rate Limiting
Per-peer rate limiting uses the token bucket algorithm to prevent any single peer from overwhelming the system.
Token Bucket Algorithm
flowchart LR
A[Refill Timer] -->|tokens/sec| B[Token Bucket]
B -->|check| C{Tokens > 0?}
C -->|Yes| D[Allow Message]
C -->|No| E[Reject Message]
D --> F[Consume Token]
Configuration Presets
| Preset | Bucket Size | Refill Rate | Description |
|---|---|---|---|
| Default | 100 | 50/sec | Balanced throughput |
| Aggressive | 50 | 25/sec | Lower burst, tighter limit |
| Permissive | 200 | 100/sec | Higher throughput allowed |
| Disabled | — | — | No rate limiting |
Configuration Example
[rate_limit]
enabled = true
bucket_size = 100
refill_rate = 50.0
Compression
Frame-level LZ4 compression reduces bandwidth usage for larger messages. Compression is negotiated during the handshake.
Frame Format
+---------+--------+---------+
| Length  | Flags  | Payload |
| 4 bytes | 1 byte | N bytes |
+---------+--------+---------+

Flags byte:
  bit 0: 1 = LZ4 compressed, 0 = uncompressed
  bits 1-7: reserved (must be 0)
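The frame codec can be sketched in a few lines of Python. This example assumes the 4-byte length is big-endian and counts only the payload (the byte order is not specified above, so the real codec may differ), and it omits the actual LZ4 step, tracking only the flag bit:

```python
import struct

FLAG_LZ4 = 0x01  # bit 0 of the flags byte

def encode_frame(payload: bytes, compressed: bool = False) -> bytes:
    # Assumption: length is big-endian and covers the payload only.
    flags = FLAG_LZ4 if compressed else 0
    return struct.pack(">IB", len(payload), flags) + payload

def decode_frame(buf: bytes) -> tuple[bytes, bool]:
    length, flags = struct.unpack_from(">IB", buf, 0)
    if flags & ~FLAG_LZ4:
        raise ValueError("reserved flag bits must be 0")
    return buf[5:5 + length], bool(flags & FLAG_LZ4)
```

Rejecting nonzero reserved bits on decode keeps the format forward-compatible: a newer sender that sets them will fail loudly rather than be silently misread.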
Configuration
| Parameter | Default | Description |
|---|---|---|
| enabled | true | Enable compression |
| method | Lz4 | Compression algorithm |
| min_size | 256 | Minimum payload size to compress |
Messages smaller than min_size are sent uncompressed to avoid overhead.
[compression]
enabled = true
method = "Lz4"
min_size = 256
Reconnection
Automatic reconnection uses exponential backoff with jitter to recover from transient failures without overwhelming the network.
Backoff Calculation
backoff = min(initial * multiplier^attempt, max_backoff)
jitter = backoff * random(-jitter_factor, +jitter_factor)
final_delay = backoff + jitter
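The formula maps directly to a few lines of Python; with the defaults, this reproduces the delay bands in the backoff example table (the function name is illustrative):

```python
import random

def backoff_delay(attempt, initial_ms=100.0, multiplier=2.0,
                  max_ms=30_000.0, jitter=0.1, rng=None):
    """Exponential backoff with symmetric jitter, per the formula above."""
    base = min(initial_ms * multiplier ** attempt, max_ms)
    rng = rng or random.Random()
    return base + base * rng.uniform(-jitter, jitter)
```

The jitter spreads reconnect attempts out in time, so peers that lost the same node do not all retry in lockstep.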
Configuration
| Parameter | Default | Description |
|---|---|---|
| enabled | true | Enable auto-reconnection |
| initial_backoff_ms | 100 | Initial backoff delay |
| max_backoff_ms | 30000 | Maximum backoff delay |
| multiplier | 2.0 | Exponential multiplier |
| max_attempts | None | Max retries (None = infinite) |
| jitter | 0.1 | Jitter factor (0.0 to 1.0) |
Backoff Example
| Attempt | Base Delay | With 10% Jitter |
|---|---|---|
| 0 | 100ms | 90-110ms |
| 1 | 200ms | 180-220ms |
| 2 | 400ms | 360-440ms |
| 3 | 800ms | 720-880ms |
| … | … | … |
| 8+ | 30000ms | 27000-33000ms |
Metrics
The transport exposes statistics through TransportStats:
| Metric | Description |
|---|---|
| messages_sent | Total messages sent |
| messages_received | Total messages received |
| bytes_sent | Total bytes sent |
| bytes_received | Total bytes received |
| peer_count | Number of connected peers |
| connection_count | Total active connections |
let stats = transport.stats();
println!("Messages sent: {}", stats.messages_sent);
println!("Connected peers: {}", stats.peer_count);
Error Handling
| Error | Cause | Recovery |
|---|---|---|
| Timeout | Operation exceeded timeout | Retry with backoff |
| PeerNotFound | No pool for peer | Establish connection first |
| HandshakeFailed | Protocol mismatch or bad cert | Check configuration |
| TlsRequired | TLS needed but not configured | Configure TLS |
| MtlsRequired | mTLS needed but not enabled | Enable client auth |
| RateLimited | Token bucket exhausted | Wait for refill |
| Compression | Decompression failed | Check for data corruption |
Usage Example
use tensor_chain::tcp::{
    TcpTransport, TcpTransportConfig, TlsConfig, SecurityMode,
    RateLimitConfig, CompressionConfig,
};

// Create secure production configuration
let tls = TlsConfig::new_secure(
    "/etc/neumann/node.crt",
    "/etc/neumann/node.key",
    "/etc/neumann/ca.crt",
);

let config = TcpTransportConfig::new("node1", "0.0.0.0:9100".parse()?)
    .with_tls(tls)
    .with_security_mode(SecurityMode::Strict)
    .with_rate_limit(RateLimitConfig::default())
    .with_compression(CompressionConfig::default())
    .with_pool_size(4);

// Validate security before starting
config.validate_security()?;

// Start transport
let transport = TcpTransport::new(config);
transport.start().await?;

// Connect to peer
transport.connect(&PeerConfig {
    node_id: "node2".to_string(),
    address: "10.0.1.2:9100".to_string(),
}).await?;

// Send message
transport.send(&"node2".to_string(), Message::Ping { term: 1 }).await?;

// Receive messages
let (from, msg) = transport.recv().await?;
Source Reference
- tensor_chain/src/tcp/config.rs - Configuration types
- tensor_chain/src/tcp/transport.rs - Transport implementation
- tensor_chain/src/tcp/tls.rs - TLS wrapper
- tensor_chain/src/tcp/rate_limit.rs - Token bucket rate limiter
- tensor_chain/src/tcp/compression.rs - LZ4 compression
- tensor_chain/src/tcp/framing.rs - Wire protocol codec
- tensor_chain/src/tcp/connection.rs - Connection pool
Snapshot Streaming
The snapshot streaming system provides memory-efficient serialization and transfer of Raft log snapshots. It enables handling of large snapshots containing millions of log entries without exhausting heap memory.
Overview
Key features:
- Incremental writing: Entries serialized one at a time via SnapshotWriter
- Lazy reading: Entries deserialized on-demand via the SnapshotReader iterator
- Memory bounded: Automatic disk spill via SnapshotBuffer
- Backwards compatible: Falls back to the legacy format for old snapshots
flowchart TD
A[Raft State Machine] -->|write_entry| B[SnapshotWriter]
B -->|finish| C[SnapshotBuffer]
C -->|memory/file| D{Size Check}
D -->|< threshold| E[Memory Mode]
D -->|> threshold| F[File Mode + mmap]
E --> G[Chunk Transfer]
F --> G
G -->|network| H[SnapshotReader]
H -->|iterator| I[Follower Node]
Wire Format
The streaming format uses length-prefixed entries for efficient parsing.
Header Structure
+---------+---------+-------------+
|  Magic  | Version | Entry Count |
| 4 bytes | 4 bytes |   8 bytes   |
+---------+---------+-------------+
| "SNAP"  |    1    |   u64 LE    |
+---------+---------+-------------+
Total: 16 bytes
Entry Structure
+---------+--------------------+
| Length  | Bincode-serialized |
| 4 bytes |      LogEntry      |
+---------+--------------------+
| u32 LE  |      variable      |
+---------+--------------------+
Complete Snapshot Layout
+--------+--------+--------+--------+--------+--------+
| SNAP | Ver(1) | Count | Len1 | Entry1 | Len2 | ...
| 4B | 4B | 8B | 4B | N bytes| 4B | ...
+--------+--------+--------+--------+--------+--------+
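The layout above is simple enough to sketch end to end. A hedged Python example that writes and lazily reads this wire format, with opaque byte strings standing in for bincode-serialized LogEntry values (function names are illustrative, not the Rust API):

```python
import struct

MAGIC = b"SNAP"
VERSION = 1

def write_snapshot(entries):
    """16-byte header, then a (u32 LE length, bytes) pair per entry."""
    out = bytearray(MAGIC)
    out += struct.pack("<IQ", VERSION, len(entries))
    for entry in entries:
        out += struct.pack("<I", len(entry)) + entry
    return bytes(out)

def read_snapshot(buf):
    """Yield entries lazily, mirroring the SnapshotReader iterator."""
    if buf[:4] != MAGIC:
        raise ValueError("bad magic")
    version, count = struct.unpack_from("<IQ", buf, 4)
    if version != VERSION:
        raise ValueError("unsupported version")
    offset = 16
    for _ in range(count):
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        yield buf[offset:offset + length]
        offset += length
```

Because each entry is length-prefixed, the reader never needs to buffer more than one entry at a time, which is what makes the format memory-bounded.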
Architecture
Leader-to-Follower Flow
sequenceDiagram
participant L as Leader
participant W as SnapshotWriter
participant B as SnapshotBuffer
participant N as Network
participant R as SnapshotReader
participant F as Follower
L->>W: write_entry(entry)
W->>B: serialize + write
Note over B: Memory or File mode
L->>W: finish()
W->>B: finalize()
B->>N: chunk transfer
N->>R: received chunks
R->>F: iterator.next()
F->>F: apply(entry)
SnapshotBuffer State Transitions
stateDiagram-v2
[*] --> Memory: new()
Memory --> Memory: write() [size < threshold]
Memory --> File: write() [size >= threshold]
File --> File: write() [grow if needed]
Memory --> Finalized: finalize()
File --> Finalized: finalize() + fsync
Finalized --> [*]: drop (cleanup)
SnapshotBuffer
The SnapshotBuffer provides adaptive memory/disk storage with bounded
memory usage.
Configuration
| Parameter | Default | Description |
|---|---|---|
| max_memory_bytes | 256 MB | Threshold before disk spill |
| temp_dir | System | Directory for temp files |
| initial_file_capacity | 64 MB | Initial file size when spilling |
Configuration Example
use tensor_chain::snapshot_buffer::SnapshotBufferConfig;

let config = SnapshotBufferConfig::default()
    .with_max_memory(512 * 1024 * 1024) // 512 MB
    .with_temp_dir("/var/lib/neumann/snapshots");
Performance Characteristics
| Operation | Memory Mode | File Mode |
|---|---|---|
| write() | O(1) amortized | O(1) + possible mmap resize |
| as_slice() | O(1) | O(1) zero-copy via mmap |
| read_chunk() | O(n) copy | O(n) copy |
| finalize() | O(1) | O(1) + fsync |
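The spill behavior itself is straightforward. A toy Python sketch that keeps writes in memory until the threshold is crossed and then moves everything to a temp file, standing in for the mmap-backed file mode (names are illustrative):

```python
import tempfile

class SpillBuffer:
    """Memory mode until max_memory_bytes is exceeded, then file mode."""

    def __init__(self, max_memory_bytes=256 * 1024 * 1024):
        self.threshold = max_memory_bytes
        self.mem = bytearray()
        self.file = None
        self.size = 0

    @property
    def mode(self):
        return "file" if self.file else "memory"

    def write(self, data: bytes):
        self.size += len(data)
        if self.file is None and self.size > self.threshold:
            # Spill: move accumulated bytes to disk, then keep appending there
            self.file = tempfile.TemporaryFile()
            self.file.write(self.mem)
            self.mem = bytearray()
        if self.file:
            self.file.write(data)
        else:
            self.mem += data

    def finalize(self) -> bytes:
        if self.file:
            self.file.flush()
            self.file.seek(0)
            return self.file.read()
        return bytes(self.mem)
```

The real buffer avoids the final copy by mmapping the spill file; the sketch reads it back only to keep the example self-contained.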
SnapshotWriter
The SnapshotWriter serializes log entries incrementally using the length-prefixed format.
Usage
use tensor_chain::snapshot_streaming::SnapshotWriter;
use tensor_chain::snapshot_buffer::SnapshotBufferConfig;

let config = SnapshotBufferConfig::default();
let mut writer = SnapshotWriter::new(config)?;

// Write entries incrementally
for entry in log_entries {
    writer.write_entry(&entry)?;
}

// Check progress
println!("Entries: {}", writer.entry_count());
println!("Bytes: {}", writer.bytes_written());
println!("Last index: {}", writer.last_index());

// Finalize and get buffer
let buffer = writer.finish()?;
Progress Tracking
| Method | Description |
|---|---|
| entry_count() | Number of entries written |
| bytes_written() | Total bytes including header |
| last_index() | Index of last entry written |
| last_term() | Term of last entry written |
SnapshotReader
The SnapshotReader deserializes entries on-demand using an iterator
interface.
Usage
use tensor_chain::snapshot_streaming::SnapshotReader;

// Create reader (validates header)
let reader = SnapshotReader::new(&buffer)?;
println!("Entry count: {}", reader.entry_count());

// Read via iterator
for result in reader {
    let entry = result?;
    state_machine.apply(entry);
}
Iterator Protocol
sequenceDiagram
participant A as Application
participant R as SnapshotReader
participant B as Buffer
loop For each entry
A->>R: next()
R->>B: read 4 bytes (length)
R->>B: read N bytes (entry)
R->>A: Some(Ok(LogEntry))
end
A->>R: next()
R->>A: None (end)
Progress Tracking
| Method | Description |
|---|---|
| entry_count() | Total entries in snapshot |
| entries_read() | Entries read so far |
| remaining() | Entries not yet read |
Chunk Transfer
For network transfer, the buffer supports chunked reading with resume capability.
Resume Protocol
sequenceDiagram
participant L as Leader
participant F as Follower
L->>F: Chunk 0 (offset=0, len=64KB)
F->>F: Store chunk
Note over F: Network interruption
F->>L: Resume (offset=64KB)
L->>F: Chunk 1 (offset=64KB, len=64KB)
F->>F: Append chunk
L->>F: Chunk 2 (offset=128KB, len=32KB)
F->>F: Complete snapshot
Bandwidth Configuration
| Chunk Size | Use Case |
|---|---|
| 16 KB | High-latency networks |
| 64 KB | Default, balanced |
| 256 KB | Low-latency, high-bandwidth |
| 1 MB | Local/datacenter transfers |
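The resume protocol falls out of simple offset arithmetic: the leader can start serving from any byte offset the follower reports. A Python sketch (the function name is illustrative):

```python
def serve_chunks(snapshot: bytes, start_offset: int = 0, chunk_size: int = 64 * 1024):
    """Yield (offset, chunk) pairs from start_offset, so a follower can
    resume a transfer from the last byte it durably stored."""
    offset = start_offset
    while offset < len(snapshot):
        chunk = snapshot[offset:offset + chunk_size]
        yield offset, chunk
        offset += len(chunk)
```

A follower that stored 64 KB before an interruption simply asks the leader to restart with `start_offset=65536`; no already-received bytes are re-sent.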
Error Handling
| Error Type | Cause | Recovery |
|---|---|---|
Io | File/mmap operation failed | Check disk space/perms |
Buffer | Out of bounds read | Verify offset/length |
Serialization | Bincode encode/decode failed | Check data integrity |
InvalidFormat | Wrong magic, version, or size | Verify snapshot source |
UnexpectedEof | Truncated data or count error | Re-transfer snapshot |
Security Limits
| Limit | Value | Purpose |
|---|---|---|
| Max entry size | 100 MB | Prevent memory exhaustion |
| Max header version | 1 | Reject unknown formats |
Legacy Compatibility
The system automatically handles legacy (non-streaming) snapshots.
Format Detection
use tensor_chain::snapshot_streaming::deserialize_entries;

// Automatically detects format
let entries = deserialize_entries(snapshot_bytes)?;

// Works with:
// - Streaming format (magic = "SNAP")
// - Legacy bincode Vec<LogEntry>
Usage Example
Complete Leader Workflow
use tensor_chain::snapshot_streaming::SnapshotWriter;
use tensor_chain::snapshot_buffer::SnapshotBufferConfig;

// Create optimized config for large snapshots
let config = SnapshotBufferConfig::default()
    .with_max_memory(256 * 1024 * 1024);

// Serialize incrementally
let mut writer = SnapshotWriter::new(config)?;
for entry in state_machine.log_entries() {
    writer.write_entry(&entry)?;
}
let buffer = writer.finish()?;

// Serve chunks to followers
let total_len = buffer.total_len();
let chunk_size = 64 * 1024;
let mut offset = 0;
while offset < total_len {
    let len = (total_len - offset).min(chunk_size as u64) as usize;
    let chunk = buffer.as_slice(offset, len)?;
    send_chunk_to_follower(offset, chunk)?;
    offset += len as u64;
}
Complete Follower Workflow
use tensor_chain::snapshot_streaming::SnapshotReader;
use tensor_chain::snapshot_buffer::SnapshotBuffer;

// Receive and assemble chunks
let mut buffer = SnapshotBuffer::with_defaults()?;
while let Some(chunk) = receive_chunk() {
    buffer.write(&chunk)?;
}
buffer.finalize()?;

// Verify integrity
let expected_hash = received_hash;
let actual_hash = buffer.hash();
assert_eq!(expected_hash, actual_hash);

// Apply entries
let reader = SnapshotReader::new(&buffer)?;
for result in reader {
    let entry = result?;
    state_machine.apply(entry)?;
}
Source Reference
- tensor_chain/src/snapshot_streaming.rs - Streaming protocol
- tensor_chain/src/snapshot_buffer.rs - Adaptive buffer implementation
Transaction Workspace
The transaction workspace system provides ACID transaction semantics for tensor_chain operations. It enables isolated execution with snapshot-based reads and atomic commits via delta tracking.
Overview
Key features:
- Snapshot isolation: Reads see consistent state from transaction start
- Delta tracking: Changes tracked as semantic embeddings for conflict detection
- Atomic commit: All-or-nothing via block append
- Cross-shard coordination: Two-phase commit (2PC) for distributed transactions
flowchart TD
A[Client] -->|begin| B[TransactionWorkspace]
B -->|snapshot| C[Checkpoint]
B -->|add_operation| D[Operations]
B -->|compute_delta| E[EmbeddingState]
E --> F{Conflict Check}
F -->|orthogonal| G[Commit]
F -->|conflicting| H[Rollback]
H -->|restore| C
Workspace Lifecycle
State Machine
stateDiagram-v2
[*] --> Active: begin()
Active --> Active: add_operation()
Active --> Committing: mark_committing()
Committing --> Committed: mark_committed()
Committing --> Failed: error
Active --> RolledBack: rollback()
Failed --> [*]
Committed --> [*]
RolledBack --> [*]
State Descriptions
| State | Description |
|---|---|
| Active | Operations can be added |
| Committing | Commit in progress, no more operations |
| Committed | Successfully committed to the chain |
| RolledBack | Rolled back, state restored from checkpoint |
| Failed | Error during commit, requires manual resolution |
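The transition rules above can be captured in a few lines. A toy Python sketch that rejects any move not present in the state diagram (names are illustrative; the real workspace is implemented in Rust):

```python
# Allowed transitions from the state diagram; terminal states have none.
VALID = {
    "Active": {"Active", "Committing", "RolledBack"},
    "Committing": {"Committed", "Failed"},
}

class WorkspaceState:
    def __init__(self):
        self.state = "Active"

    def transition(self, to: str):
        if to not in VALID.get(self.state, set()):
            raise RuntimeError(f"invalid transition: {self.state} -> {to}")
        self.state = to
```

Encoding the diagram as data makes illegal sequences, such as rolling back an already-committed transaction, fail immediately rather than corrupt state.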
Transaction Operations
Basic Usage
use tensor_chain::{TensorStore, TransactionWorkspace};
use tensor_chain::block::Transaction;

let store = TensorStore::new();

// Begin transaction
let workspace = TransactionWorkspace::begin(&store)?;

// Add operations
workspace.add_operation(Transaction::Put {
    key: "user:1".to_string(),
    data: vec![1, 2, 3],
})?;
workspace.add_operation(Transaction::Put {
    key: "user:2".to_string(),
    data: vec![4, 5, 6],
})?;

// Check affected keys
let keys = workspace.affected_keys();
assert!(keys.contains("user:1"));

// Commit or rollback
workspace.mark_committing()?;
workspace.mark_committed();
Operation Types
| Operation | Description | Affected Key |
|---|---|---|
| `Put` | Insert or update a key | The key itself |
| `Delete` | Remove a key | The key itself |
| `Update` | Modify an existing key | The key itself |
Delta Tracking
The workspace tracks changes as semantic embeddings using the EmbeddingState
machine. This enables conflict detection based on vector similarity.
Before/After Embedding Flow
flowchart LR
A[begin] -->|capture| B[before embedding]
B --> C[operations]
C -->|compute| D[after embedding]
D --> E[delta = after - before]
E --> F[DeltaVector]
Computing Deltas
// Set the before-state embedding at transaction start
workspace.set_before_embedding(before_embedding);

// ... execute operations ...

// Compute delta at commit time
workspace.compute_delta(after_embedding);

// Get delta for conflict detection
let delta_vector = workspace.to_delta_vector();
Delta Vector Structure
| Field | Type | Description |
|---|---|---|
| `embedding` | `Vec<f32>` | Semantic change vector |
| `affected_keys` | `HashSet<String>` | Keys modified by the transaction |
| `tx_id` | `u64` | Transaction identifier |
Isolation Levels
The workspace provides snapshot isolation by default. All reads within a
transaction see the state captured at begin().
| Level | Dirty Reads | Non-Repeatable | Phantom Reads |
|---|---|---|---|
| Snapshot (default) | No | No | No |
Snapshot Mechanism
- `begin()` captures the store state as a binary checkpoint
- All reads within the transaction see this snapshot
- `rollback()` restores the snapshot if needed
- The checkpoint is discarded after commit/rollback
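To make the mechanism concrete, here is a toy snapshot/rollback sketch over a plain dict store. This is an illustrative Python analogue (the `ToyWorkspace` name and its methods are hypothetical, not the Rust API):

```python
import copy

class ToyWorkspace:
    def __init__(self, store: dict):
        self.store = store
        # begin(): capture the store state as a checkpoint.
        self.checkpoint = copy.deepcopy(store)

    def read(self, key):
        # Reads see the state captured at begin(), not later writes.
        return self.checkpoint.get(key)

    def rollback(self):
        # Restore the store from the checkpoint.
        self.store.clear()
        self.store.update(self.checkpoint)

store = {"user:1": b"\x01"}
ws = ToyWorkspace(store)
store["user:1"] = b"\x02"            # concurrent write after begin()
assert ws.read("user:1") == b"\x01"  # snapshot read is unaffected
ws.rollback()
assert store["user:1"] == b"\x01"    # store restored from checkpoint
```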
Lock Management
For distributed transactions, the LockManager provides key-level locking
with deadlock prevention.
Lock Ordering
To prevent deadlocks, always acquire locks in this order:
1. `pending` – transaction state map
2. `lock_manager.locks` – key-level locks
3. `lock_manager.tx_locks` – per-transaction lock sets
4. `pending_aborts` – abort queue
Lock Configuration
| Parameter | Default | Description |
|---|---|---|
| `default_timeout` | 30s | Lock expiration time |
| `timeout_ms` | 5000 | Transaction timeout |
Lock Acquisition
// Try to acquire locks for multiple keys
match lock_manager.try_lock(tx_id, &keys) {
    Ok(lock_handle) => {
        // Locks acquired successfully
    }
    Err(conflicting_tx) => {
        // Another transaction holds a lock
    }
}
Cross-Shard Coordination
Distributed transactions use two-phase commit (2PC) for cross-shard coordination.
2PC Protocol
sequenceDiagram
participant C as Coordinator
participant S1 as Shard 1
participant S2 as Shard 2
Note over C: Phase 1: Prepare
C->>S1: TxPrepare(ops, delta)
C->>S2: TxPrepare(ops, delta)
S1->>S1: acquire locks
S2->>S2: acquire locks
S1->>S1: check conflicts
S2->>S2: check conflicts
S1->>C: Vote(Yes, delta)
S2->>C: Vote(Yes, delta)
Note over C: Phase 2: Commit
C->>S1: TxCommit
C->>S2: TxCommit
S1->>C: Ack
S2->>C: Ack
Transaction Phases
| Phase | Description |
|---|---|
| Preparing | Acquiring locks, computing deltas |
| Prepared | All participants voted YES |
| Committing | Finalizing the commit |
| Committed | Successfully committed |
| Aborting | Rolling back due to NO vote or timeout |
| Aborted | Successfully aborted |
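The protocol reduces to one rule: commit only if every participant votes YES in phase 1; any NO (or Conflict) vote aborts the whole transaction. A minimal coordinator sketch in Python (the `Shard` class and method names are illustrative, not the distributed_tx API):

```python
def two_phase_commit(participants):
    """participants: objects with prepare() -> bool, commit(), abort()."""
    # Phase 1: prepare — every shard acquires locks, checks conflicts, votes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: commit on all shards.
        for p in participants:
            p.commit()
        return "committed"
    # Any NO vote aborts everywhere.
    for p in participants:
        p.abort()
    return "aborted"

class Shard:
    def __init__(self, vote):
        self.vote, self.state = vote, "prepared"
    def prepare(self):
        return self.vote
    def commit(self):
        self.state = "committed"
    def abort(self):
        self.state = "aborted"

assert two_phase_commit([Shard(True), Shard(True)]) == "committed"
assert two_phase_commit([Shard(True), Shard(False)]) == "aborted"
```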
Prepare Vote Types
| Vote | Description | Action |
|---|---|---|
| Yes | Ready to commit, locks acquired | Proceed to Phase 2 |
| No | Cannot commit (validation failed) | Abort |
| Conflict | Detected semantic conflict | Abort |
Conflict Detection
The workspace uses delta embeddings to detect conflicts based on vector similarity.
Orthogonality Check
Two transactions are considered orthogonal (non-conflicting) if their delta vectors have low cosine similarity:
similarity = cos(delta_A, delta_B)
orthogonal = abs(similarity) < threshold
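The check above can be sketched in a few lines of pure Python (illustrative; the `orthogonal` helper is hypothetical, and the 0.1 default mirrors `orthogonal_threshold`):

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def orthogonal(delta_a, delta_b, threshold=0.1):
    # Two transactions may commit in parallel if their deltas barely overlap.
    return abs(cosine(delta_a, delta_b)) < threshold

assert orthogonal([1.0, 0.0], [0.0, 1.0])      # disjoint changes: orthogonal
assert not orthogonal([1.0, 0.0], [1.0, 0.1])  # overlapping changes: conflict
```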
Configuration
| Parameter | Default | Description |
|---|---|---|
| `orthogonal_threshold` | 0.1 | Maximum similarity for orthogonal transactions |
| `merge_window_ms` | 60000 | Time window for merge candidates |
Merge Candidates
The TransactionManager can find transactions eligible for parallel commit:
// Find orthogonal transactions that can be merged
let candidates = manager.find_merge_candidates(
    &workspace,
    0.1,     // orthogonal threshold
    60_000,  // merge window (60s)
);

// Candidates are sorted by similarity (most orthogonal first)
for candidate in candidates {
    println!("Tx {} similarity: {}", candidate.workspace.id(), candidate.similarity);
}
Error Handling
| Error | Cause | Recovery |
|---|---|---|
| `TransactionFailed` | Operation on a non-active workspace | Check state first |
| `WorkspaceError` | Snapshot/restore failed | Check store health |
| `LockConflict` | Another transaction holds the lock | Retry with backoff |
| `Timeout` | Transaction exceeded its timeout | Increase the timeout |
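For `LockConflict`, the usual recovery is a bounded retry with jittered exponential backoff. An illustrative sketch in Python (the `LockConflict` class here is a stand-in exception, not the real error type):

```python
import random
import time

class LockConflict(Exception):
    pass

def with_retry(op, max_attempts=3, base_ms=50):
    """Run op(), retrying on LockConflict with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return op()
        except LockConflict:
            if attempt == max_attempts - 1:
                raise
            # 50ms, 100ms, 200ms, ... plus jitter to avoid lock-step retries.
            time.sleep((base_ms * 2 ** attempt + random.uniform(0, base_ms)) / 1000)

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise LockConflict()
    return "ok"

assert with_retry(flaky) == "ok"
assert len(attempts) == 3  # two conflicts, then success
```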
Usage Example
Complete Transaction Flow
use tensor_chain::{TensorStore, TransactionManager};
use tensor_chain::block::Transaction;

// Create manager
let store = TensorStore::new();
let manager = TransactionManager::new();

// Begin transaction
let workspace = manager.begin(&store)?;

// Add operations
workspace.add_operation(Transaction::Put {
    key: "account:1".to_string(),
    data: serialize(&Account { balance: 100 }),
})?;
workspace.add_operation(Transaction::Put {
    key: "account:2".to_string(),
    data: serialize(&Account { balance: 200 }),
})?;

// Set embeddings for conflict detection
workspace.set_before_embedding(vec![0.0; 128]);
workspace.compute_delta(compute_state_embedding(&store));

// Check for conflicts with other active transactions
let candidates = manager.find_merge_candidates(&workspace, 0.1, 60_000);

if candidates.is_empty() {
    // No orthogonal transactions, commit alone
    workspace.mark_committing()?;
    workspace.mark_committed();
} else {
    // Can merge with orthogonal transactions
    // ... merge logic ...
}

// Remove from manager
manager.remove(workspace.id());
Source Reference
- `tensor_chain/src/transaction.rs` – TransactionWorkspace, TransactionManager
- `tensor_chain/src/distributed_tx.rs` – 2PC coordinator, LockManager
- `tensor_chain/src/embedding.rs` – EmbeddingState machine
- `tensor_chain/src/consensus.rs` – DeltaVector, conflict detection
Python SDK Quickstart
The Neumann Python SDK provides both synchronous and asynchronous clients for querying a Neumann server, with optional embedded mode via PyO3 bindings.
Installation
pip install neumann
Connect
Remote (gRPC)
from neumann import NeumannClient
client = NeumannClient.connect("localhost:50051", api_key="your-api-key")
Embedded (no server needed)
client = NeumannClient.embedded(path="/tmp/neumann-data")
Async
from neumann.aio import AsyncNeumannClient
async with await AsyncNeumannClient.connect("localhost:50051") as client:
result = await client.query("SELECT * FROM users")
Execute Queries
# Single query
result = client.query("SELECT * FROM users WHERE age > 25")
# Batch queries
results = client.execute_batch([
"INSERT INTO users VALUES (1, 'Alice', 30)",
"INSERT INTO users VALUES (2, 'Bob', 25)",
])
# Streaming results
for chunk in client.execute_stream("SELECT * FROM large_table"):
process(chunk)
Handle Results
Results are typed by QueryResultType. Check the type before accessing data:
from neumann import QueryResultType
result = client.query("SELECT * FROM users")
if result.type == QueryResultType.ROWS:
for row in result.rows:
print(row.get_string("name"), row.get_int("age"))
# Or convert to dict:
print(row.to_dict())
elif result.type == QueryResultType.COUNT:
print(f"Count: {result.count}")
elif result.type == QueryResultType.NODES:
for node in result.nodes:
print(node.id, node.label, node.properties)
elif result.type == QueryResultType.SIMILAR:
for item in result.similar_items:
print(f"{item.key}: {item.score:.4f}")
Result types
| Type | Field | Description |
|---|---|---|
| `ROWS` | `result.rows` | Relational query results |
| `NODES` | `result.nodes` | Graph nodes |
| `EDGES` | `result.edges` | Graph edges |
| `PATHS` | `result.paths` | Graph paths |
| `SIMILAR` | `result.similar_items` | Vector similarity results |
| `COUNT` | `result.count` | Integer count |
| `VALUE` | `result.value` | Single scalar value |
| `TABLE_LIST` | `result.tables` | Available tables |
| `EMPTY` | – | No result |
Vector Operations
For dedicated vector operations, use VectorClient:
from neumann import VectorClient, VectorPoint
vectors = VectorClient.connect("localhost:50051", api_key="your-key")
# Create a collection
vectors.create_collection("documents", dimension=384, distance="cosine")
# Upsert points
vectors.upsert_points("documents", [
VectorPoint(id="doc1", vector=[0.1, 0.2, ...], payload={"title": "Hello"}),
VectorPoint(id="doc2", vector=[0.3, 0.4, ...], payload={"title": "World"}),
])
# Query similar points
results = vectors.query_points(
"documents",
query_vector=[0.15, 0.25, ...],
limit=10,
score_threshold=0.8,
with_payload=True,
)
for point in results:
print(f"{point.id}: {point.score:.4f} - {point.payload}")
# Manage collections
names = vectors.list_collections()
info = vectors.get_collection("documents")
count = vectors.count_points("documents")
vectors.close()
Pandas Integration
Convert query results to DataFrames:
from neumann.integrations.pandas import result_to_dataframe, dataframe_to_inserts
# Query to DataFrame
result = client.query("SELECT * FROM users")
df = result_to_dataframe(result)
print(df.head())
# DataFrame to INSERT queries
queries = dataframe_to_inserts(df, "users_backup")
client.execute_batch(queries)
NumPy Integration
Work with vectors as NumPy arrays:
import numpy as np
from neumann.integrations.numpy import (
vector_to_insert,
vectors_to_inserts,
cosine_similarity,
normalize_vectors,
)
# Single vector insert
query = vector_to_insert("doc1", np.array([0.1, 0.2, 0.3]), normalize=True)
client.query(query)
# Batch insert
vectors_dict = {"doc1": np.array([0.1, 0.2]), "doc2": np.array([0.3, 0.4])}
queries = vectors_to_inserts(vectors_dict)
client.execute_batch(queries)
# Compute similarity locally
sim = cosine_similarity(vec1, vec2)
Error Handling
from neumann import (
NeumannError,
ConnectionError,
AuthenticationError,
NotFoundError,
QueryError,
ParseError,
)
try:
result = client.query("SELECT * FROM nonexistent")
except NotFoundError as e:
print(f"Not found: {e.message}")
except ParseError as e:
print(f"Syntax error: {e.message}")
except ConnectionError as e:
print(f"Connection failed: {e}")
except NeumannError as e:
print(f"Error [{e.code.name}]: {e.message}")
Configuration
Fine-tune timeouts, retries, and keepalive:
from neumann import ClientConfig, TimeoutConfig, RetryConfig
config = ClientConfig(
timeout=TimeoutConfig(
default_timeout_s=30.0,
connect_timeout_s=10.0,
),
retry=RetryConfig(
max_attempts=3,
initial_backoff_ms=100,
max_backoff_ms=10000,
backoff_multiplier=2.0,
),
)
client = NeumannClient.connect("localhost:50051", config=config)
# Preset configurations
config = ClientConfig.fast_fail() # 5s timeout, 1 attempt
config = ClientConfig.no_retry() # Default timeout, 1 attempt
config = ClientConfig.high_latency() # 120s timeout, 5 attempts
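The retry settings above define an exponential backoff schedule. This local helper (illustrative, not part of the SDK) shows the delays a client would wait between attempts:

```python
def backoff_schedule(max_attempts=3, initial_ms=100, max_ms=10_000, multiplier=2.0):
    """Delay before each retry (max_attempts - 1 delays), capped at max_ms."""
    return [min(initial_ms * multiplier ** i, max_ms) for i in range(max_attempts - 1)]

# With the RetryConfig shown above: wait 100ms, then 200ms, across 3 attempts.
assert backoff_schedule() == [100, 200]

# With more attempts, delays keep doubling until the max_backoff_ms cap.
assert backoff_schedule(max_attempts=10)[-1] == 10_000
```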
Next Steps
- Query Language Reference – All commands
- TypeScript SDK – TypeScript alternative
- Architecture – SDK internals
TypeScript SDK Quickstart
The Neumann TypeScript SDK provides a fully typed client for Node.js (gRPC) and browser (gRPC-Web) environments.
Installation
npm install @neumann/client
# or
yarn add @neumann/client
Connect
Node.js (gRPC)
import { NeumannClient } from '@neumann/client';
const client = await NeumannClient.connect('localhost:9200', {
apiKey: 'your-api-key',
tls: false,
});
Browser (gRPC-Web)
const client = await NeumannClient.connectWeb('http://localhost:9200');
Execute Queries
// Single query
const result = await client.query('SELECT * FROM users WHERE age > 25');
// Batch queries
const results = await client.executeBatch([
"INSERT INTO users VALUES (1, 'Alice', 30)",
"INSERT INTO users VALUES (2, 'Bob', 25)",
]);
// Streaming results
for await (const chunk of client.executeStream('SELECT * FROM large_table')) {
process(chunk);
}
// Paginated results
const page = await client.executePaginated('SELECT * FROM users', {
pageSize: 100,
countTotal: true,
});
console.log(`Total: ${page.totalCount}, Has more: ${page.hasMore}`);
Handle Results
Results use discriminated unions with type guard functions:
import {
isRowsResult,
isNodesResult,
isSimilarResult,
isCountResult,
rowToObject,
} from '@neumann/client';
const result = await client.query('SELECT * FROM users');
if (isRowsResult(result)) {
for (const row of result.rows) {
const obj = rowToObject(row);
console.log(obj.name, obj.age);
}
}
if (isNodesResult(result)) {
for (const node of result.nodes) {
console.log(node.id, node.label, node.properties);
}
}
if (isSimilarResult(result)) {
for (const item of result.items) {
console.log(`${item.key}: ${item.score.toFixed(4)}`);
}
}
if (isCountResult(result)) {
console.log(`Count: ${result.count}`);
}
Result type guards
| Guard | Result Field | Description |
|---|---|---|
| `isRowsResult` | `result.rows` | Relational query results |
| `isNodesResult` | `result.nodes` | Graph nodes |
| `isEdgesResult` | `result.edges` | Graph edges |
| `isSimilarResult` | `result.items` | Vector similarity results |
| `isCountResult` | `result.count` | Integer count |
| `isValueResult` | `result.value` | Single scalar value |
| `isErrorResult` | `result.error` | Error message |
Transactions
// Automatic commit/rollback
const result = await client.withTransaction(async (tx) => {
await tx.execute("INSERT INTO users VALUES (1, 'Alice', 30)");
await tx.execute("INSERT INTO users VALUES (2, 'Bob', 25)");
return 'inserted';
});
// Transaction is committed on success, rolled back on error
// Manual control
const tx = client.beginTransaction();
await tx.begin();
await tx.execute("INSERT INTO users VALUES (3, 'Carol', 28)");
await tx.commit(); // or tx.rollback()
Vector Operations
For dedicated vector operations, use VectorClient:
import { VectorClient } from '@neumann/client';
const vectors = await VectorClient.connect('localhost:9200');
// Create a collection
await vectors.createCollection('documents', 384, 'cosine');
// Upsert points
await vectors.upsertPoints('documents', [
{ id: 'doc1', vector: [0.1, 0.2, ...], payload: { title: 'Hello' } },
{ id: 'doc2', vector: [0.3, 0.4, ...], payload: { title: 'World' } },
]);
// Query similar points
const results = await vectors.queryPoints('documents', [0.15, 0.25, ...], {
limit: 10,
scoreThreshold: 0.8,
withPayload: true,
});
for (const point of results) {
console.log(`${point.id}: ${point.score.toFixed(4)} - ${JSON.stringify(point.payload)}`);
}
// Scroll through all points
for await (const point of vectors.scrollAllPoints('documents')) {
console.log(point.id);
}
// Manage collections
const names = await vectors.listCollections();
const info = await vectors.getCollection('documents');
const count = await vectors.countPoints('documents');
vectors.close();
Blob Operations
Upload and download binary artifacts:
import { BlobClient } from '@neumann/client';
// Upload from buffer (assumes `blob` is a connected BlobClient)
const result = await blob.uploadBlob('document.pdf', Buffer.from(data), {
contentType: 'application/pdf',
tags: ['quarterly', 'report'],
linkedTo: ['entity-id'],
});
console.log(`Uploaded: ${result.artifactId}`);
// Download as buffer
const data = await blob.downloadBlobFull(result.artifactId);
// Stream download
for await (const chunk of blob.downloadBlob(result.artifactId)) {
process(chunk);
}
// Metadata
const metadata = await blob.getBlobMetadata(result.artifactId);
console.log(`Size: ${metadata.size}, Type: ${metadata.contentType}`);
Error Handling
import {
NeumannError,
ConnectionError,
AuthenticationError,
NotFoundError,
ParseError,
ErrorCode,
} from '@neumann/client';
try {
const result = await client.query('SELECT * FROM nonexistent');
} catch (err) {
if (err instanceof NotFoundError) {
console.log(`Not found: ${err.message}`);
} else if (err instanceof ParseError) {
console.log(`Syntax error: ${err.message}`);
} else if (err instanceof ConnectionError) {
console.log(`Connection failed: ${err.message}`);
} else if (err instanceof NeumannError) {
console.log(`Error [${ErrorCode[err.code]}]: ${err.message}`);
}
}
Configuration
import {
ClientConfig,
mergeClientConfig,
noRetryConfig,
fastFailConfig,
highLatencyConfig,
} from '@neumann/client';
const config: ClientConfig = {
timeout: {
defaultTimeoutS: 30,
connectTimeoutS: 10,
},
retry: {
maxAttempts: 3,
initialBackoffMs: 100,
maxBackoffMs: 10000,
backoffMultiplier: 2.0,
},
};
const client = await NeumannClient.connect('localhost:9200', { config });
// Preset configurations
const fast = fastFailConfig(); // 5s timeout, 1 attempt
const noRetry = noRetryConfig(); // Default timeout, 1 attempt
const highLat = highLatencyConfig(); // 120s timeout, 5 attempts
Pagination
Iterate through large result sets:
// Single page
const page = await client.executePaginated('SELECT * FROM users', {
pageSize: 100,
countTotal: true,
cursorTtlSecs: 300,
});
// Iterate all pages
for await (const result of client.executeAllPages('SELECT * FROM users')) {
if (isRowsResult(result)) {
for (const row of result.rows) {
process(rowToObject(row));
}
}
}
// Clean up cursor
if (page.nextCursor) {
await client.closeCursor(page.nextCursor);
}
Next Steps
- Query Language Reference – All commands
- Python SDK – Python alternative
- Architecture – SDK internals
Query Language Reference
Neumann uses a SQL-inspired query language extended with graph, vector, blob, vault, cache, and chain commands. All commands are case-insensitive.
Relational Commands
SELECT
SELECT [DISTINCT] columns
FROM table [alias]
[JOIN table ON condition | USING (columns)]
[WHERE condition]
[GROUP BY columns]
[HAVING condition]
[ORDER BY columns [ASC|DESC] [NULLS FIRST|LAST]]
[LIMIT n]
[OFFSET n]
Columns can be *, expressions, or expr AS alias. Supports subqueries in FROM
and WHERE clauses.
Join types: INNER, LEFT, RIGHT, FULL, CROSS, NATURAL.
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.total > 100
ORDER BY o.total DESC
LIMIT 10
INSERT
INSERT INTO table [(columns)] VALUES (values), ...
INSERT INTO table [(columns)] SELECT ...
INSERT INTO users (id, name, age) VALUES (1, 'Alice', 30)
INSERT INTO users VALUES (2, 'Bob', 25), (3, 'Carol', 28)
UPDATE
UPDATE table SET column = value, ... [WHERE condition]
UPDATE users SET age = 31 WHERE name = 'Alice'
DELETE
DELETE FROM table [WHERE condition]
DELETE FROM users WHERE age < 18
CREATE TABLE
CREATE TABLE [IF NOT EXISTS] name (
column type [constraints],
...
[table_constraints]
)
Column types: INT, INTEGER, BIGINT, SMALLINT, FLOAT, DOUBLE, REAL,
DECIMAL(p,s), NUMERIC(p,s), VARCHAR(n), CHAR(n), TEXT, BOOLEAN, DATE,
TIME, TIMESTAMP, BLOB.
Column constraints: NOT NULL, NULL, UNIQUE, PRIMARY KEY,
DEFAULT expr, CHECK(expr), REFERENCES table(column) [ON DELETE|UPDATE action].
Table constraints: PRIMARY KEY (columns), UNIQUE (columns),
FOREIGN KEY (columns) REFERENCES table(column), CHECK(expr).
Referential actions: CASCADE, RESTRICT, SET NULL, SET DEFAULT, NO ACTION.
CREATE TABLE orders (
id INT PRIMARY KEY,
user_id INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
total FLOAT DEFAULT 0.0,
created TIMESTAMP,
UNIQUE (user_id, created)
)
DROP TABLE
DROP TABLE [IF EXISTS] name [CASCADE]
DROP TABLE IF EXISTS orders CASCADE
CREATE INDEX
CREATE [UNIQUE] INDEX [IF NOT EXISTS] name ON table (columns)
CREATE INDEX idx_users_name ON users (name)
CREATE UNIQUE INDEX idx_email ON users (email)
DROP INDEX
DROP INDEX [IF EXISTS] name
DROP INDEX ON table(column)
DROP INDEX idx_users_name
DROP INDEX ON users(email)
SHOW TABLES
SHOW TABLES
Lists all relational tables.
DESCRIBE
DESCRIBE TABLE name
DESCRIBE NODE label
DESCRIBE EDGE type
Shows the schema of a table, node label, or edge type.
DESCRIBE TABLE users
DESCRIBE NODE person
DESCRIBE EDGE reports_to
Graph Commands
NODE CREATE
NODE CREATE label { key: value, ... }
Creates a node with the given label and properties.
NODE CREATE person { name: 'Alice', role: 'Engineer', team: 'Platform' }
NODE GET
NODE GET id
Retrieves a node by its ID.
NODE GET 'abc-123'
NODE DELETE
NODE DELETE id
Deletes a node by its ID.
NODE DELETE 'abc-123'
NODE LIST
NODE LIST [label] [LIMIT n] [OFFSET m]
Lists nodes, optionally filtered by label.
NODE LIST person LIMIT 10
NODE LIST
EDGE CREATE
EDGE CREATE from_id -> to_id : edge_type [{ key: value, ... }]
Creates a directed edge between two nodes.
EDGE CREATE 'alice-id' -> 'bob-id' : reports_to { since: '2024-01' }
EDGE GET
EDGE GET id
Retrieves an edge by its ID.
EDGE DELETE
EDGE DELETE id
Deletes an edge by its ID.
EDGE LIST
EDGE LIST [type] [LIMIT n] [OFFSET m]
Lists edges, optionally filtered by type.
EDGE LIST reports_to LIMIT 20
NEIGHBORS
NEIGHBORS id [OUTGOING|INCOMING|BOTH] [: edge_type]
[BY SIMILARITY [vector] LIMIT n]
Finds neighbors of a node. The optional BY SIMILARITY clause enables cross-engine
queries that combine graph traversal with vector similarity.
NEIGHBORS 'alice-id' OUTGOING : reports_to
NEIGHBORS 'node-1' BOTH BY SIMILARITY [0.1, 0.2, 0.3] LIMIT 5
PATH
PATH [SHORTEST|ALL|WEIGHTED|ALL_WEIGHTED|VARIABLE] from_id TO to_id
[MAX_DEPTH n] [MIN_DEPTH n] [WEIGHT property]
Finds paths between two nodes.
PATH SHORTEST 'alice-id' TO 'ceo-id'
PATH WEIGHTED 'a' TO 'b' WEIGHT cost MAX_DEPTH 5
PATH ALL 'start' TO 'end' MIN_DEPTH 2 MAX_DEPTH 4
Graph Algorithms
PAGERANK
PAGERANK [DAMPING d] [TOLERANCE t] [MAX_ITERATIONS n]
[DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]
Computes PageRank scores for all nodes.
PAGERANK DAMPING 0.85 MAX_ITERATIONS 100
PAGERANK EDGE_TYPE collaborates
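For reference, PageRank with damping factor d iterates PR(v) = (1-d)/N + d·Σ PR(u)/outdeg(u) over incoming neighbors u until the scores stabilize. A minimal power-iteration sketch (illustrative Python, not the engine's implementation):

```python
def pagerank(edges, damping=0.85, tol=1e-6, max_iter=100):
    """edges: list of (src, dst) pairs. Returns {node: score}; scores sum to 1."""
    nodes = {n for e in edges for n in e}
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        nxt = {v: (1 - damping) / n for v in nodes}
        for u in nodes:
            if out[u]:
                share = damping * pr[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for v in nodes:
                    nxt[v] += damping * pr[u] / n
        if max(abs(nxt[v] - pr[v]) for v in nodes) < tol:
            return nxt
        pr = nxt
    return pr

scores = pagerank([("a", "b"), ("c", "b"), ("b", "a")])
assert scores["b"] > scores["a"] > scores["c"]  # b receives the most rank
```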
BETWEENNESS
BETWEENNESS [SAMPLING_RATIO r]
[DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]
Computes betweenness centrality for all nodes.
BETWEENNESS SAMPLING_RATIO 0.5
CLOSENESS
CLOSENESS [DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]
Computes closeness centrality for all nodes.
EIGENVECTOR
EIGENVECTOR [MAX_ITERATIONS n] [TOLERANCE t]
[DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]
Computes eigenvector centrality for all nodes.
LOUVAIN
LOUVAIN [RESOLUTION r] [MAX_PASSES n]
[DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]
Detects communities using the Louvain algorithm.
LOUVAIN RESOLUTION 1.0 MAX_PASSES 10
LABEL_PROPAGATION
LABEL_PROPAGATION [MAX_ITERATIONS n]
[DIRECTION OUTGOING|INCOMING|BOTH] [EDGE_TYPE type]
Detects communities using label propagation.
Graph Constraints
GRAPH CONSTRAINT CREATE
GRAPH CONSTRAINT CREATE name ON NODE|EDGE [(label)] property UNIQUE|EXISTS|TYPE 'type'
Creates a property constraint on nodes or edges.
GRAPH CONSTRAINT CREATE unique_email ON NODE (person) email UNIQUE
GRAPH CONSTRAINT CREATE requires_name ON NODE name EXISTS
GRAPH CONSTRAINT DROP
GRAPH CONSTRAINT DROP name
GRAPH CONSTRAINT LIST
GRAPH CONSTRAINT LIST
Lists all graph constraints.
GRAPH CONSTRAINT GET
GRAPH CONSTRAINT GET name
Graph Indexes
GRAPH INDEX CREATE
GRAPH INDEX CREATE NODE PROPERTY property
GRAPH INDEX CREATE EDGE PROPERTY property
GRAPH INDEX CREATE LABEL
GRAPH INDEX CREATE EDGE_TYPE
Creates a graph property or label index.
GRAPH INDEX DROP
GRAPH INDEX DROP NODE property
GRAPH INDEX DROP EDGE property
GRAPH INDEX SHOW
GRAPH INDEX SHOW NODE
GRAPH INDEX SHOW EDGE
Graph Aggregation
COUNT NODES / COUNT EDGES
GRAPH AGGREGATE COUNT NODES [label]
GRAPH AGGREGATE COUNT EDGES [type]
GRAPH AGGREGATE COUNT NODES person
GRAPH AGGREGATE COUNT EDGES reports_to
AGGREGATE property
GRAPH AGGREGATE SUM|AVG|MIN|MAX|COUNT NODE property [label] [WHERE condition]
GRAPH AGGREGATE SUM|AVG|MIN|MAX|COUNT EDGE property [type] [WHERE condition]
GRAPH AGGREGATE AVG NODE age person
GRAPH AGGREGATE SUM EDGE weight collaborates WHERE weight > 0.5
Graph Pattern Matching
PATTERN MATCH
GRAPH PATTERN MATCH (pattern) [LIMIT n]
GRAPH PATTERN COUNT (pattern)
GRAPH PATTERN EXISTS (pattern)
Matches structural patterns in the graph.
GRAPH PATTERN MATCH (a:person)-[:reports_to]->(b:person) LIMIT 10
GRAPH PATTERN EXISTS (a:person)-[:mentors]->(b:person)
Graph Batch Operations
GRAPH BATCH CREATE NODES
GRAPH BATCH CREATE NODES [(label { props }), ...]
GRAPH BATCH CREATE EDGES
GRAPH BATCH CREATE EDGES [(from -> to : type { props }), ...]
GRAPH BATCH DELETE NODES
GRAPH BATCH DELETE NODES [id1, id2, ...]
GRAPH BATCH DELETE EDGES
GRAPH BATCH DELETE EDGES [id1, id2, ...]
GRAPH BATCH UPDATE NODES
GRAPH BATCH UPDATE NODES [(id { props }), ...]
Vector Commands
EMBED STORE
EMBED STORE key [vector] [IN collection]
Stores a vector embedding with an associated key.
EMBED STORE 'doc1' [0.1, 0.2, 0.3, 0.4]
EMBED STORE 'doc2' [0.5, 0.6, 0.7, 0.8] IN my_collection
EMBED GET
EMBED GET key [IN collection]
Retrieves a stored embedding.
EMBED GET 'doc1'
EMBED DELETE
EMBED DELETE key [IN collection]
Deletes a stored embedding.
EMBED BUILD INDEX
EMBED BUILD INDEX [IN collection]
Builds or rebuilds the HNSW index for similarity search.
EMBED BATCH
EMBED BATCH [('key1', [v1, v2, ...]), ('key2', [v1, v2, ...])] [IN collection]
Stores multiple embeddings in a single operation.
EMBED BATCH [('doc1', [0.1, 0.2]), ('doc2', [0.3, 0.4])]
SIMILAR
SIMILAR key|[vector] [LIMIT n] [METRIC COSINE|EUCLIDEAN|DOT_PRODUCT]
[CONNECTED TO node_id] [IN collection] [WHERE condition]
Finds similar embeddings by key or vector. The optional CONNECTED TO clause
combines vector similarity with graph connectivity for cross-engine queries.
SIMILAR 'doc1' LIMIT 5
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC COSINE
SIMILAR [0.1, 0.2, 0.3] LIMIT 5 CONNECTED TO 'alice-id'
SIMILAR 'doc1' LIMIT 10 IN my_collection WHERE score > 0.8
SHOW EMBEDDINGS
SHOW EMBEDDINGS [LIMIT n]
Lists stored embeddings.
SHOW VECTOR INDEX
SHOW VECTOR INDEX
Shows information about the HNSW index.
COUNT EMBEDDINGS
COUNT EMBEDDINGS
Returns the number of stored embeddings.
Unified Entity Commands
ENTITY CREATE
ENTITY CREATE key { properties } [EMBEDDING [vector]]
Creates a unified entity with optional embedding. A unified entity spans all engines: it is stored as relational data, as a graph node, and optionally as a vector embedding.
ENTITY CREATE 'alice' { name: 'Alice', role: 'Engineer' } EMBEDDING [0.1, 0.2, 0.3]
ENTITY GET
ENTITY GET key
Retrieves a unified entity with all its data across engines.
ENTITY UPDATE
ENTITY UPDATE key { properties } [EMBEDDING [vector]]
Updates an existing unified entity.
ENTITY UPDATE 'alice' { role: 'Senior Engineer' } EMBEDDING [0.15, 0.25, 0.35]
ENTITY DELETE
ENTITY DELETE key
Deletes a unified entity from all engines.
ENTITY CONNECT
ENTITY CONNECT from_key -> to_key : edge_type
Creates a relationship between two unified entities.
ENTITY CONNECT 'alice' -> 'bob' : reports_to
ENTITY BATCH
ENTITY BATCH CREATE [{ key: 'k1', props... }, { key: 'k2', props... }]
Creates multiple unified entities in a single operation.
FIND
FIND NODE [label] [WHERE condition] [LIMIT n]
FIND EDGE [type] [WHERE condition] [LIMIT n]
FIND ROWS FROM table [WHERE condition] [LIMIT n]
FIND PATH from_label -[edge_type]-> to_label [WHERE condition] [LIMIT n]
Cross-engine search that queries across relational, graph, and vector engines.
FIND NODE person WHERE name = 'Alice'
FIND EDGE reports_to LIMIT 10
FIND ROWS FROM users WHERE age > 25
FIND PATH person -[reports_to]-> person LIMIT 5
Vault Commands
VAULT SET
VAULT SET key value
Stores an encrypted secret.
VAULT SET 'api_key' 'sk-abc123'
VAULT GET
VAULT GET key
Retrieves a decrypted secret (requires appropriate access).
VAULT GET 'api_key'
VAULT DELETE
VAULT DELETE key
Deletes a secret.
VAULT LIST
VAULT LIST [pattern]
Lists secrets, optionally filtered by pattern.
VAULT LIST
VAULT LIST 'api_*'
VAULT ROTATE
VAULT ROTATE key new_value
Rotates a secret to a new value while maintaining the same key.
VAULT ROTATE 'api_key' 'sk-new456'
VAULT GRANT
VAULT GRANT entity ON key
Grants an entity access to a secret.
VAULT GRANT 'alice' ON 'api_key'
VAULT REVOKE
VAULT REVOKE entity ON key
Revokes an entity’s access to a secret.
VAULT REVOKE 'bob' ON 'api_key'
Cache Commands
CACHE INIT
CACHE INIT
Initializes the LLM response cache.
CACHE STATS
CACHE STATS
Shows cache hit/miss statistics.
CACHE CLEAR
CACHE CLEAR
Clears all cache entries.
CACHE EVICT
CACHE EVICT [n]
Evicts the least recently used entries. If n is provided, evicts that many.
CACHE GET
CACHE GET key
Retrieves a cached response by exact key.
CACHE GET 'what is machine learning?'
CACHE PUT
CACHE PUT key value
Stores a response in the cache.
CACHE PUT 'what is ML?' 'Machine learning is...'
CACHE SEMANTIC GET
CACHE SEMANTIC GET query [THRESHOLD n]
Performs a semantic similarity lookup in the cache. Returns the closest matching cached response if it exceeds the similarity threshold.
CACHE SEMANTIC GET 'explain machine learning' THRESHOLD 0.85
CACHE SEMANTIC PUT
CACHE SEMANTIC PUT query response EMBEDDING [vector]
Stores a response with its embedding for semantic matching.
CACHE SEMANTIC PUT 'what is ML?' 'Machine learning is...' EMBEDDING [0.1, 0.2, 0.3]
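Semantic matching compares the query's embedding against stored entry embeddings and returns the closest response only if it clears the threshold. A minimal sketch of that lookup (illustrative Python; `ToySemanticCache` is hypothetical, and embeddings are supplied by the caller as in CACHE SEMANTIC PUT):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class ToySemanticCache:
    def __init__(self):
        self.entries = []  # (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    def get(self, embedding, threshold=0.85):
        # Return the closest response, but only above the threshold.
        best = max(self.entries, key=lambda e: cosine(e[0], embedding), default=None)
        if best and cosine(best[0], embedding) >= threshold:
            return best[1]
        return None  # cache miss

cache = ToySemanticCache()
cache.put([1.0, 0.0, 0.1], "Machine learning is...")
assert cache.get([0.9, 0.05, 0.1]) == "Machine learning is..."  # near-duplicate query
assert cache.get([0.0, 1.0, 0.0]) is None                       # unrelated query: miss
```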
Blob Storage Commands
BLOB INIT
BLOB INIT
Initializes the blob storage engine.
BLOB PUT
BLOB PUT filename [DATA value | FROM path]
[TYPE content_type] [BY creator] [LINK entity, ...] [TAG tag, ...]
Uploads a blob with optional metadata.
BLOB PUT 'report.pdf' FROM '/tmp/report.pdf' TYPE 'application/pdf' TAG 'quarterly'
BLOB PUT 'config.json' DATA '{"key": "value"}' BY 'admin'
BLOB GET
BLOB GET artifact_id [TO path]
Downloads a blob. If TO is specified, writes to the given file path.
BLOB GET 'art-123'
BLOB GET 'art-123' TO '/tmp/download.pdf'
BLOB DELETE
BLOB DELETE artifact_id
Deletes a blob.
BLOB INFO
BLOB INFO artifact_id
Shows metadata for a blob (size, checksum, creation date, tags, links).
BLOB LINK
BLOB LINK artifact_id TO entity
Links a blob to an entity.
BLOB LINK 'art-123' TO 'alice'
BLOB UNLINK
BLOB UNLINK artifact_id FROM entity
Removes a link between a blob and an entity.
BLOB LINKS
BLOB LINKS artifact_id
Lists all entities linked to a blob.
BLOB TAG
BLOB TAG artifact_id tag
Adds a tag to a blob.
BLOB TAG 'art-123' 'important'
BLOB UNTAG
BLOB UNTAG artifact_id tag
Removes a tag from a blob.
BLOB VERIFY
BLOB VERIFY artifact_id
Verifies the integrity of a blob by checking its checksum.
BLOB GC
BLOB GC [FULL]
Runs garbage collection on blob storage. FULL performs a thorough sweep.
BLOB REPAIR
BLOB REPAIR
Repairs blob storage by fixing inconsistencies.
BLOB STATS
BLOB STATS
Shows blob storage statistics (total count, size, etc.).
BLOB META SET
BLOB META SET artifact_id key value
Sets a custom metadata key-value pair on a blob.
BLOB META SET 'art-123' 'department' 'engineering'
BLOB META GET
BLOB META GET artifact_id key
Gets a custom metadata value from a blob.
BLOBS
BLOBS [pattern]
Lists all blobs, optionally filtered by filename pattern.
BLOBS FOR
BLOBS FOR entity
Lists blobs linked to a specific entity.
BLOBS FOR 'alice'
BLOBS BY TAG
BLOBS BY TAG tag
Lists blobs with a specific tag.
BLOBS BY TAG 'quarterly'
BLOBS WHERE TYPE
BLOBS WHERE TYPE = content_type
Lists blobs with a specific content type.
BLOBS WHERE TYPE = 'application/pdf'
BLOBS SIMILAR TO
BLOBS SIMILAR TO artifact_id [LIMIT n]
Finds blobs similar to a given blob.
Checkpoint Commands
CHECKPOINT
CHECKPOINT [name]
Creates a named checkpoint (snapshot) of the current state.
CHECKPOINT 'before-migration'
CHECKPOINT
CHECKPOINTS
CHECKPOINTS [LIMIT n]
Lists all available checkpoints.
ROLLBACK TO
ROLLBACK TO checkpoint_id
Restores the database to a previous checkpoint.
ROLLBACK TO 'before-migration'
Chain Commands
The chain subsystem provides a tensor-native blockchain with Raft consensus.
BEGIN CHAIN TRANSACTION
BEGIN CHAIN TRANSACTION
Starts a new chain transaction. All subsequent mutations are buffered until commit.
COMMIT CHAIN
COMMIT CHAIN
Commits the current chain transaction, appending a new block.
ROLLBACK CHAIN TO
ROLLBACK CHAIN TO height
Rolls back the chain to a specific block height.
CHAIN HEIGHT
CHAIN HEIGHT
Returns the current chain height (number of blocks).
CHAIN TIP
CHAIN TIP
Returns the most recent block.
CHAIN BLOCK
CHAIN BLOCK height
Retrieves a block at the given height.
CHAIN BLOCK 42
CHAIN VERIFY
CHAIN VERIFY
Verifies the integrity of the entire chain.
CHAIN HISTORY
CHAIN HISTORY key
Gets the history of changes for a specific key across all blocks.
CHAIN HISTORY 'users/alice'
CHAIN SIMILAR
CHAIN SIMILAR [embedding] [LIMIT n]
Searches the chain by embedding similarity.
CHAIN SIMILAR [0.1, 0.2, 0.3] LIMIT 5
CHAIN DRIFT
CHAIN DRIFT FROM height TO height
Computes drift metrics between two chain heights.
CHAIN DRIFT FROM 10 TO 50
SHOW CODEBOOK GLOBAL
SHOW CODEBOOK GLOBAL
Shows the global codebook used for tensor compression.
SHOW CODEBOOK LOCAL
SHOW CODEBOOK LOCAL domain
Shows the local codebook for a specific domain.
SHOW CODEBOOK LOCAL 'embeddings'
ANALYZE CODEBOOK TRANSITIONS
ANALYZE CODEBOOK TRANSITIONS
Analyzes transitions between codebook states.
Cluster Commands
CLUSTER CONNECT
CLUSTER CONNECT address
Connects to a cluster node.
CLUSTER CONNECT 'node2@192.168.1.10:7000'
CLUSTER DISCONNECT
CLUSTER DISCONNECT
Disconnects from the cluster.
CLUSTER STATUS
CLUSTER STATUS
Shows the current cluster status (membership, leader, term).
CLUSTER NODES
CLUSTER NODES
Lists all cluster nodes and their states.
CLUSTER LEADER
CLUSTER LEADER
Shows the current cluster leader.
Cypher Commands (Experimental)
Neumann includes experimental support for Cypher-style graph queries.
MATCH
[OPTIONAL] MATCH pattern [WHERE condition]
RETURN items [ORDER BY items] [SKIP n] [LIMIT n]
Pattern matching query with Cypher syntax.
MATCH (p:Person)-[:REPORTS_TO]->(m:Person)
RETURN p.name, m.name
MATCH (a:Person)-[:KNOWS*1..3]->(b:Person)
WHERE a.name = 'Alice'
RETURN b.name, COUNT(*) AS depth
ORDER BY depth
LIMIT 10
Relationship patterns: -[r:TYPE]-> (outgoing), <-[r:TYPE]- (incoming),
-[r:TYPE]- (undirected). Variable-length: -[*1..5]->.
CYPHER CREATE
CREATE (pattern)
Creates nodes and relationships.
CREATE (p:Person { name: 'Dave', role: 'Designer' })
CREATE (a)-[:KNOWS]->(b)
CYPHER DELETE
[DETACH] DELETE variables
Deletes nodes or relationships. DETACH DELETE first deletes all of a node's relationships, then the node itself.
MERGE
MERGE (pattern) [ON CREATE SET ...] [ON MATCH SET ...]
Upsert: matches an existing pattern or creates it.
MERGE (p:Person { name: 'Alice' })
ON CREATE SET p.created = '2024-01-01'
ON MATCH SET p.updated = '2024-06-01'
Shell Commands
These commands are available in the interactive shell but are not part of the query language.
| Command | Description |
|---|---|
help | Show available commands |
exit / quit | Exit the shell |
clear | Clear the screen |
tables | Alias for SHOW TABLES |
save 'path' | Save data to binary file |
load 'path' | Load data from binary file |
Persistence
Start the shell with WAL (write-ahead log) for durability:
neumann --wal-dir ./data
Data Types
Neumann has two layers of types: scalar values for individual fields and tensor values for composite storage.
Scalar Types
ScalarValue represents a single value in the system.
| Type | Description | Examples |
|---|---|---|
Null | Absence of a value | NULL |
Bool | Boolean | TRUE, FALSE |
Int | 64-bit signed integer | 42, -1, 0 |
Float | 64-bit floating point | 3.14, -0.5, 1e10 |
String | UTF-8 text | 'hello', 'Alice' |
Bytes | Raw binary data | (used internally for blob content) |
Literals
- Strings: single-quoted: 'hello world'
- Integers: unquoted numbers: 42, -7
- Floats: numbers with a decimal point or exponent: 3.14, 1e-5
- Booleans: TRUE or FALSE (case-insensitive)
- Null: NULL
- Arrays: square brackets: [1, 2, 3] or [0.1, 0.2, 0.3]
Tensor Types
TensorValue wraps scalar values with vector and pointer types for the unified
data model.
| Type | Description | Use Case |
|---|---|---|
Scalar(ScalarValue) | Single scalar value | Table columns, node properties |
Vector(Vec<f32>) | Dense float vector | Embeddings for similarity search |
Sparse(SparseVector) | Sparse vector | Memory-efficient high-dimensional embeddings |
Pointer(String) | Reference to another entity | Graph edges, foreign keys |
Pointers(Vec<String>) | Multiple references | Multi-valued relationships |
Column Types in CREATE TABLE
When creating relational tables, use SQL-style type names. These map to internal scalar types.
| SQL Type | Internal Type | Notes |
|---|---|---|
INT, INTEGER | Int | 64-bit signed integer |
BIGINT | Int | Same as INT (64-bit) |
SMALLINT | Int | Same as INT (64-bit) |
FLOAT | Float | 64-bit floating point |
DOUBLE | Float | Same as FLOAT |
REAL | Float | Same as FLOAT |
DECIMAL(p,s) | Float | Precision and scale are advisory |
NUMERIC(p,s) | Float | Same as DECIMAL |
VARCHAR(n) | String | Max length is advisory |
CHAR(n) | String | Fixed-width (padded) |
TEXT | String | Unlimited length |
BOOLEAN | Bool | TRUE or FALSE |
DATE | String | Stored as ISO-8601 string |
TIME | String | Stored as ISO-8601 string |
TIMESTAMP | String | Stored as ISO-8601 string |
BLOB | Bytes | Raw binary data |
| custom name | String | Any unrecognized type stores as String |
Type Coercion
Neumann performs implicit type coercion in comparisons:
- Int and Float in arithmetic: Int is promoted to Float
- String comparisons: lexicographic ordering
- Null propagation: any operation with NULL yields NULL
- Boolean context: only Bool values are truthy/falsy (no implicit conversion from Int)
Vector Representation
Dense vectors are stored as Vec<f32> and used for similarity search via HNSW
indexes. All vectors in a collection must have the same dimensionality.
EMBED STORE 'doc1' [0.1, 0.2, 0.3, 0.4]
Sparse vectors use a compact representation storing only non-zero indices and values, making them efficient for high-dimensional data (e.g., 30,000+ dimensions for bag-of-words models).
Identifiers
Identifiers (table names, column names, labels) follow these rules:
- Start with a letter or underscore
- Contain letters, digits, and underscores
- Case-insensitive for keywords, case-preserving for identifiers
- No quoting required for simple names
- Use single quotes for string values:
'value'
Functions Reference
Aggregate Functions
These functions operate on groups of rows in SELECT queries with GROUP BY.
| Function | Description | Example |
|---|---|---|
COUNT(*) | Count all rows | SELECT COUNT(*) FROM users |
COUNT(column) | Count non-null values | SELECT COUNT(name) FROM users |
SUM(column) | Sum numeric values | SELECT SUM(total) FROM orders |
AVG(column) | Average numeric values | SELECT AVG(age) FROM users |
MIN(column) | Minimum value | SELECT MIN(created) FROM orders |
MAX(column) | Maximum value | SELECT MAX(total) FROM orders |
SELECT team, COUNT(*) AS headcount, AVG(age) AS avg_age
FROM employees
GROUP BY team
HAVING COUNT(*) > 5
ORDER BY headcount DESC
Graph Algorithm Functions
These are invoked as top-level commands, not as SQL functions. See the Query Language Reference for full syntax.
| Algorithm | Description | Returns |
|---|---|---|
PAGERANK | Link analysis ranking | Node scores (0.0-1.0) |
BETWEENNESS | Bridge node importance | Node scores |
CLOSENESS | Average distance to all nodes | Node scores |
EIGENVECTOR | Influence-based ranking | Node scores |
LOUVAIN | Community detection | Community assignments |
LABEL_PROPAGATION | Community detection | Community assignments |
Graph Aggregate Functions
Used with the GRAPH AGGREGATE command on node/edge properties.
| Function | Description |
|---|---|
COUNT | Count nodes/edges matching criteria |
SUM | Sum of property values |
AVG | Average of property values |
MIN | Minimum property value |
MAX | Maximum property value |
GRAPH AGGREGATE COUNT NODES person
GRAPH AGGREGATE AVG NODE age person WHERE age > 20
GRAPH AGGREGATE SUM EDGE weight collaborates
Distance Metrics
Used with SIMILAR and EMBED commands for vector similarity search.
| Metric | Keyword | Range | Best For |
|---|---|---|---|
| Cosine similarity | COSINE | -1.0 to 1.0 | Text embeddings, normalized vectors |
| Euclidean distance | EUCLIDEAN | 0.0 to infinity | Spatial data, image features |
| Dot product | DOT_PRODUCT | -infinity to infinity | Pre-normalized vectors, recommendation |
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC COSINE
SIMILAR 'doc1' LIMIT 5 METRIC EUCLIDEAN
The default metric is COSINE when not specified.
Expression Operators
Arithmetic
| Operator | Description |
|---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
% | Modulo |
Comparison
| Operator | Description |
|---|---|
= | Equal |
!= or <> | Not equal |
< | Less than |
<= | Less than or equal |
> | Greater than |
>= | Greater than or equal |
Logical
| Operator | Description |
|---|---|
AND | Logical AND |
OR | Logical OR |
NOT | Logical NOT |
Special Predicates
| Predicate | Description | Example |
|---|---|---|
IS NULL | Test for null | WHERE name IS NULL |
IS NOT NULL | Test for non-null | WHERE name IS NOT NULL |
IN (list) | Set membership | WHERE id IN (1, 2, 3) |
NOT IN (list) | Set non-membership | WHERE id NOT IN (1, 2) |
BETWEEN a AND b | Range check | WHERE age BETWEEN 18 AND 65 |
LIKE pattern | Pattern matching | WHERE name LIKE 'A%' |
NOT LIKE pattern | Negative pattern | WHERE name NOT LIKE '%test%' |
EXISTS (subquery) | Subquery existence | WHERE EXISTS (SELECT ...) |
CASE Expression
CASE
WHEN condition THEN result
[WHEN condition THEN result ...]
[ELSE default]
END
SELECT name,
CASE
WHEN age < 18 THEN 'minor'
WHEN age < 65 THEN 'adult'
ELSE 'senior'
END AS category
FROM users
CAST
CAST(expression AS type)
SELECT CAST(age AS FLOAT) / 10 AS decade FROM users
Tensor Data Model
Neumann uses a unified tensor-based data model that represents all data types as mathematical tensors.
Core Types
TensorValue
The fundamental value type:
| Variant | Description | Example |
|---|---|---|
Scalar(ScalarValue) | Single value | 42, "hello", true |
Vector(Vec<f32>) | Dense embedding | [0.1, 0.2, 0.3] |
Pointer(String) | Reference to entity | "user_123" |
Pointers(Vec<String>) | Multiple references | ["a", "b", "c"] |
ScalarValue
Primitive values:
| Variant | Rust Type | Example |
|---|---|---|
Int(i64) | 64-bit integer | 42 |
Float(f64) | 64-bit float | 3.14 |
String(String) | UTF-8 string | "hello" |
Bool(bool) | Boolean | true |
Bytes(Vec<u8>) | Binary data | [0x01, 0x02] |
Null | Null value | NULL |
TensorData
A map of field names to TensorValues:
// Conceptually: HashMap<String, TensorValue>
let user = TensorData::new()
    .with("id", TensorValue::Scalar(ScalarValue::Int(1)))
    .with("name", TensorValue::Scalar(ScalarValue::String("Alice".into())))
    .with("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3]));
Sparse Vectors
For high-dimensional sparse data:
// Only stores non-zero values
let sparse = SparseVector::new(1000) // 1000 dimensions
    .with_value(42, 0.5)
    .with_value(100, 0.3)
    .with_value(500, 0.8);
Operations
| Operation | Description |
|---|---|
cosine_similarity | Cosine distance between vectors |
euclidean_distance | L2 distance |
dot_product | Inner product |
weighted_average | Blend multiple vectors |
project_orthogonal | Remove component |
Type Mapping
Relational Engine
| SQL Type | TensorValue |
|---|---|
INT | Scalar(Int) |
FLOAT | Scalar(Float) |
STRING | Scalar(String) |
BOOL | Scalar(Bool) |
VECTOR(n) | Vector |
Graph Engine
| Graph Element | TensorValue |
|---|---|
| Node ID | Scalar(String) |
| Edge target | Pointer |
| Properties | TensorData |
Vector Engine
| Vector Type | TensorValue |
|---|---|
| Dense | Vector |
| Sparse | SparseVector (internal) |
Storage Layout
Data is stored in TensorStore as key-value pairs:
Key: "users/1"
Value: TensorData {
"id": Scalar(Int(1)),
"name": Scalar(String("Alice")),
"embedding": Vector([0.1, 0.2, ...])
}
Sparse Vectors
Sparse vectors are a memory-efficient representation for high-dimensional data where most values are zero.
When to Use Sparse Vectors
| Use Case | Dense | Sparse |
|---|---|---|
| Low dimensions (<100) | Preferred | Overhead |
| High dimensions (>1000) | Memory intensive | Preferred |
| Most values non-zero | Preferred | Overhead |
| <10% values non-zero | Wasteful | Preferred |
SparseVector Type
pub struct SparseVector {
    dimension: usize,
    indices: Vec<usize>,
    values: Vec<f32>,
}
Memory Comparison
For a 10,000-dimensional vector with 100 non-zero values:
| Representation | Memory |
|---|---|
Dense Vec<f32> | 40,000 bytes |
| Sparse | ~800 bytes |
| Savings | 98% |
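The table's numbers follow directly from the element sizes. A back-of-envelope sketch, assuming a layout of one u32 index plus one f32 value per non-zero entry (Neumann's actual in-memory layout may differ):

```rust
// Dense storage: one f32 per dimension.
fn dense_bytes(dims: usize) -> usize {
    dims * std::mem::size_of::<f32>()
}

// Sparse storage: one (u32 index, f32 value) pair per non-zero entry.
fn sparse_bytes(nnz: usize) -> usize {
    nnz * (std::mem::size_of::<u32>() + std::mem::size_of::<f32>())
}
```

For 10,000 dimensions with 100 non-zero values this gives 40,000 bytes dense versus 800 bytes sparse, the 98% savings shown above.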
Operations
Creation
// From dense
let sparse = SparseVector::from_dense(&[0.0, 0.5, 0.0, 0.3, 0.0]);

// Incremental
let mut sparse = SparseVector::new(1000);
sparse.set(42, 0.5);
sparse.set(100, 0.3);
Arithmetic
// Subtraction (for deltas)
let delta = new_state.sub(&old_state);

// Weighted average
let blended = SparseVector::weighted_average(&[
    (&vec_a, 0.7),
    (&vec_b, 0.3),
]);

// Orthogonal projection
let residual = vec.project_orthogonal(&basis);
Similarity Metrics
| Metric | Formula | Range |
|---|---|---|
| Cosine | a.b / (‖a‖ * ‖b‖) | -1 to 1 |
| Euclidean | sqrt(sum((a-b)^2)) | 0 to inf |
| Jaccard | ‖A ∩ B‖ / ‖A ∪ B‖ | 0 to 1 |
| Angular | acos(cosine) / pi | 0 to 1 |
let sim = vec_a.cosine_similarity(&vec_b);
let dist = vec_a.euclidean_distance(&vec_b);
let jacc = vec_a.jaccard_index(&vec_b);
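The formulas in the metrics table can be sketched over plain (index, value) maps, independent of Neumann's SparseVector type (the function names here are illustrative):

```rust
use std::collections::{HashMap, HashSet};

// Dot product only needs the intersection of non-zero indices.
fn dot(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    a.iter()
        .map(|(i, v)| v * b.get(i).copied().unwrap_or(0.0))
        .sum()
}

// Cosine: a.b / (||a|| * ||b||); zero vectors get similarity 0.
fn cosine(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    let (na, nb) = (dot(a, a).sqrt(), dot(b, b).sqrt());
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot(a, b) / (na * nb) }
}

// Euclidean: sqrt(sum((a-b)^2)) over the union of indices.
fn euclidean(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    let keys: HashSet<&usize> = a.keys().chain(b.keys()).collect();
    keys.into_iter()
        .map(|i| {
            let d = a.get(i).copied().unwrap_or(0.0)
                - b.get(i).copied().unwrap_or(0.0);
            d * d
        })
        .sum::<f32>()
        .sqrt()
}

// Jaccard over index sets: |A intersect B| / |A union B|.
fn jaccard(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    let inter = a.keys().filter(|i| b.contains_key(*i)).count() as f32;
    let union = (a.len() + b.len()) as f32 - inter;
    if union == 0.0 { 0.0 } else { inter / union }
}
```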
HNSW Index
Hierarchical Navigable Small World for approximate nearest neighbor search:
let mut index = HNSWIndex::new(HNSWConfig::default());

// Insert
index.insert("doc_1", sparse_vec_1);
index.insert("doc_2", sparse_vec_2);

// Search
let results = index.search(&query_vec, 10); // top 10
Configuration
| Parameter | Default | Description |
|---|---|---|
m | 16 | Max connections per layer |
ef_construction | 200 | Build-time search width |
ef_search | 50 | Query-time search width |
Delta Encoding
For tracking state changes:
// Compute delta between states
let delta = DeltaVector::from_diff(&old_embedding, &new_embedding);

// Apply delta
let new_state = old_state.add(&delta.to_sparse());

// Check if orthogonal (non-conflicting)
if delta_a.is_orthogonal(&delta_b) {
    // Can merge automatically
}
Compression
Sparse vectors compress well:
| Method | Ratio | Speed |
|---|---|---|
| Varint indices | 2-4x | Fast |
| Quantization (int8) | 4x | Fast |
| Binary quantization | 32x | Very fast |
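The int8 row can be illustrated with a simple symmetric quantizer: scale each f32 into [-127, 127] and store it as an i8, shrinking each value from 4 bytes to 1. This is a sketch of the technique, not Neumann's codec:

```rust
// Symmetric int8 quantization: one shared scale factor per vector.
fn quantize_i8(values: &[f32]) -> (Vec<i8>, f32) {
    let max = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if max == 0.0 { 1.0 } else { max / 127.0 };
    let q = values.iter().map(|v| (v / scale).round() as i8).collect();
    (q, scale)
}

// Lossy inverse: recovers each value up to half a quantization step.
fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| f32::from(v) * scale).collect()
}
```

Binary quantization pushes the same idea further, keeping only the sign bit of each component.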
Semantic Operations
Semantic operations in Neumann leverage vector embeddings to perform meaning-aware computations.
Core Concepts
Embeddings
Embeddings map data to vector space where similar items are close:
"cat" -> [0.2, 0.8, 0.1, ...]
"dog" -> [0.3, 0.7, 0.2, ...] (close to cat)
"car" -> [0.9, 0.1, 0.5, ...] (far from cat)
Similarity Search
Find items similar to a query:
SELECT * FROM documents
WHERE SIMILAR(embedding, query_vec, 0.8)
LIMIT 10;
Operations
Conflict Detection
In tensor_chain, semantic operations detect conflicts:
// Two changes conflict if their deltas overlap
let conflict = delta_a.cosine_similarity(&delta_b) > threshold;

// Orthogonal changes can be merged
if delta_a.is_orthogonal(&delta_b) {
    let merged = delta_a.add(&delta_b);
}
Auto-Merge
Non-conflicting changes merge automatically:
flowchart LR
A[State S0] --> B[Change A: fields 1-10]
A --> C[Change B: fields 11-20]
B --> D[Merged: fields 1-20]
C --> D
Semantic Conflict Resolution
When changes overlap:
| Scenario | Detection | Resolution |
|---|---|---|
| Orthogonal | similarity < 0.1 | Auto-merge |
| Partial overlap | 0.1 <= similarity < 0.5 | Manual review |
| Direct conflict | similarity >= 0.5 | Reject newer |
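The resolution rules in the table map to a small classifier over the similarity score. The enum and function names here are illustrative, not part of the tensor_chain API:

```rust
#[derive(Debug, PartialEq)]
enum Resolution {
    AutoMerge,
    ManualReview,
    RejectNewer,
}

// Thresholds taken from the table above.
fn classify(similarity: f32) -> Resolution {
    if similarity < 0.1 {
        Resolution::AutoMerge // orthogonal changes
    } else if similarity < 0.5 {
        Resolution::ManualReview // partial overlap
    } else {
        Resolution::RejectNewer // direct conflict
    }
}
```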
Codebook Quantization
For efficient similarity comparisons:
Global Codebook
Static centroids for consensus validation:
let codebook = GlobalCodebook::new(1024, 128); // 1024 centroids, 128 dims
let quantized = codebook.quantize(&embedding);
Local Codebook
Adaptive centroids per domain:
let mut codebook = LocalCodebook::new(256, 128);
codebook.update(&new_embeddings, 0.1); // EMA update
Distance Metrics
| Metric | Use Case | Properties |
|---|---|---|
| Cosine | Text similarity | Scale-invariant |
| Euclidean | Spatial data | Absolute distance |
| Angular | Normalized comparison | 0 to 1 range |
| Geodesic | Manifold data | Curvature-aware |
Cache Semantic Search
tensor_cache uses semantic similarity:
// Exact match first
if let Some(hit) = cache.get_exact(&prompt_hash) {
    return hit;
}

// Then semantic search
if let Some(hit) = cache.search_similar(&prompt_embedding, 0.95) {
    return hit;
}
Embedding State Machine
tensor_chain tracks embedding lifecycle:
stateDiagram-v2
[*] --> Initial: new transaction
Initial --> Computed: compute_embedding()
Computed --> Validated: validate()
Validated --> Committed: commit()
pub enum EmbeddingState {
    Initial,                // No embedding yet
    Computed(SparseVector), // Computed, not validated
    Validated(SparseVector), // Passed validation
}
Distributed Transactions
tensor_chain implements distributed transactions using Two-Phase Commit (2PC) with semantic conflict detection.
Transaction Lifecycle
stateDiagram-v2
[*] --> Pending: begin()
Pending --> Preparing: prepare()
Preparing --> Prepared: all votes received
Prepared --> Committing: commit decision
Prepared --> Aborting: abort decision
Committing --> Committed: all acks
Aborting --> Aborted: all acks
Committed --> [*]
Aborted --> [*]
Two-Phase Commit
Phase 1: Prepare
1. Coordinator sends Prepare to all participants
2. Each participant:
   - Acquires locks
   - Validates constraints
   - Writes to WAL
   - Votes Yes or No
Phase 2: Commit/Abort
1. If all vote Yes: Coordinator sends Commit
2. If any vote No: Coordinator sends Abort
3. Participants apply or roll back
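The commit rule reduces to a single check over the collected votes: commit only if every participant voted Yes. This sketch uses illustrative types rather than Neumann's actual message structs:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Vote {
    Yes,
    No,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Commit,
    Abort,
}

// Commit requires a unanimous Yes; an empty vote set aborts.
fn decide(votes: &[Vote]) -> Decision {
    if !votes.is_empty() && votes.iter().all(|v| *v == Vote::Yes) {
        Decision::Commit
    } else {
        Decision::Abort
    }
}
```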
Message Types
| Message | Direction | Purpose |
|---|---|---|
TxPrepareMsg | Coordinator -> Participant | Start prepare phase |
TxVote | Participant -> Coordinator | Vote yes/no |
TxCommitMsg | Coordinator -> Participant | Commit decision |
TxAbortMsg | Coordinator -> Participant | Abort decision |
TxAck | Participant -> Coordinator | Acknowledge commit/abort |
Lock Management
Lock Types
| Lock | Compatibility | Use |
|---|---|---|
| Shared (S) | S-S compatible | Read operations |
| Exclusive (X) | Incompatible with all | Write operations |
Lock Acquisition
// Acquire lock with timeout
let lock = lock_manager.acquire(
    tx_id,
    key,
    LockMode::Exclusive,
    Duration::from_secs(5),
)?;
Deadlock Detection
Wait-for graph analysis:
// Check for cycles before waiting
if wait_graph.would_create_cycle(my_tx, blocking_tx) {
    // Abort to prevent deadlock
    return Err(DeadlockDetected);
}

// Register wait
wait_graph.add_wait(my_tx, blocking_tx);
Victim Selection
| Policy | Behavior |
|---|---|
| Youngest | Abort most recent transaction |
| Oldest | Abort longest-running |
| LowestPriority | Abort lowest priority |
| MostLocks | Abort holding most locks |
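The wait-for-graph check can be illustrated with a toy graph: adding an edge my_tx -> blocking_tx closes a cycle exactly when blocking_tx can already reach my_tx. A standalone sketch, not the lock manager's implementation:

```rust
use std::collections::{HashMap, HashSet};

struct WaitGraph {
    edges: HashMap<u64, HashSet<u64>>, // waiter -> lock holders it waits on
}

impl WaitGraph {
    fn new() -> Self {
        Self { edges: HashMap::new() }
    }

    fn add_wait(&mut self, waiter: u64, holder: u64) {
        self.edges.entry(waiter).or_default().insert(holder);
    }

    // Depth-first reachability over the wait-for edges.
    fn reachable(&self, from: u64, to: u64) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(n) = stack.pop() {
            if n == to {
                return true;
            }
            if seen.insert(n) {
                if let Some(next) = self.edges.get(&n) {
                    stack.extend(next.iter().copied());
                }
            }
        }
        false
    }

    // Adding waiter -> holder creates a cycle iff holder already reaches waiter.
    fn would_create_cycle(&self, waiter: u64, holder: u64) -> bool {
        self.reachable(holder, waiter)
    }
}
```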
Semantic Conflict Detection
Beyond lock-based conflicts, tensor_chain detects semantic conflicts:
// Compute embedding deltas
let delta_a = tx_a.compute_delta();
let delta_b = tx_b.compute_delta();

// Check for semantic overlap
if delta_a.cosine_similarity(&delta_b) > CONFLICT_THRESHOLD {
    // Semantic conflict - needs manual resolution
    return PrepareVote::Conflict { ... };
}
Recovery
Coordinator Failure
- New coordinator queries participants for tx state
- If any committed: complete commit
- If all prepared: re-run commit decision
- Otherwise: abort
Participant Failure
- Participant replays WAL on restart
- For prepared transactions: query coordinator
- Apply commit or abort based on coordinator state
Configuration
pub struct DistributedTxConfig {
    /// Prepare phase timeout
    pub prepare_timeout_ms: u64,
    /// Commit phase timeout
    pub commit_timeout_ms: u64,
    /// Maximum concurrent transactions
    pub max_concurrent_tx: usize,
    /// Lock wait timeout
    pub lock_timeout_ms: u64,
}
Formal Verification
The 2PC protocol is formally specified in TwoPhaseCommit.tla and
exhaustively model-checked with TLC across 2.3M distinct states.
The model verifies Atomicity (all-or-nothing), NoOrphanedLocks,
ConsistentDecision, VoteIrrevocability, and DecisionStability.
See Formal Verification for full results.
Best Practices
- Keep transactions short: Long transactions increase conflict probability
- Order lock acquisition: Acquire locks in consistent order to prevent deadlocks
- Use appropriate isolation: Not all operations need serializable isolation
- Monitor deadlock rate: High rates indicate contention issues
Consensus Protocols
tensor_chain uses Raft consensus with SWIM gossip for membership management.
Raft Consensus
Overview
Raft provides:
- Leader election
- Log replication
- Safety (never returns incorrect results)
- Availability (operational if majority alive)
Node States
stateDiagram-v2
[*] --> Follower
Follower --> Candidate: election timeout
Candidate --> Leader: wins election
Candidate --> Follower: discovers leader
Leader --> Follower: discovers higher term
Candidate --> Candidate: split vote
Terms
Time divided into terms with at most one leader:
Term 1: [Leader A] -----> [Follower timeout]
Term 2: [Election] -> [Leader B] -----> ...
Log Replication
sequenceDiagram
participant C as Client
participant L as Leader
participant F1 as Follower 1
participant F2 as Follower 2
C->>L: Write request
L->>L: Append to log
par Replicate
L->>F1: AppendEntries
L->>F2: AppendEntries
end
F1->>L: Success
F2->>L: Success
L->>L: Commit (majority)
L->>C: Success
Configuration
| Parameter | Default | Description |
|---|---|---|
election_timeout_min | 150ms | Min election timeout |
election_timeout_max | 300ms | Max election timeout |
heartbeat_interval | 50ms | Leader heartbeat frequency |
max_entries_per_append | 100 | Batch size for replication |
SWIM Gossip
Overview
Scalable Weakly-consistent Infection-style Membership:
- O(log N) failure detection
- Distributed membership view
- No single point of failure
Protocol
sequenceDiagram
participant A as Node A
participant B as Node B (target)
participant C as Node C
A->>B: Ping
Note over B: No response
A->>C: PingReq(B)
C->>B: Ping
alt B responds
B->>C: Ack
C->>A: Ack (indirect)
else B down
C->>A: Nack
A->>A: Mark B suspect
end
Node States
| State | Description | Transition |
|---|---|---|
| Healthy | Responding normally | — |
| Suspect | Failed direct ping | After timeout |
| Failed | Confirmed down | After indirect ping failure |
LWW-CRDT Membership
Last-Writer-Wins with incarnation numbers:
// State comparison
fn supersedes(&self, other: &Self) -> bool {
    (self.incarnation, self.timestamp) > (other.incarnation, other.timestamp)
}

// Merge takes the winner per node
fn merge(&mut self, other: &Self) {
    for (node_id, state) in &other.states {
        if state.supersedes(&self.states[node_id]) {
            self.states.insert(node_id.clone(), state.clone());
        }
    }
}
Configuration
| Parameter | Default | Description |
|---|---|---|
ping_interval | 1s | Direct ping frequency |
ping_timeout | 500ms | Time to wait for response |
suspect_timeout | 3s | Time before marking failed |
indirect_ping_count | 3 | Number of indirect pings |
Hybrid Logical Clocks
Combine physical time with logical counters:
pub struct HybridTimestamp {
    wall_ms: u64, // Physical time (milliseconds)
    logical: u16, // Logical counter
}
Properties
- Monotonic: Always increases
- Bounded drift: Stays close to wall clock
- Causality: If A happens-before B, then ts(A) < ts(B)
Usage
let hlc = HybridLogicalClock::new(node_id);

// Local event
let ts = hlc.now();

// Receive message with timestamp
let ts = hlc.receive(message_ts);
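A minimal HLC sketch showing how the two update rules preserve monotonicity and causality. The wall clock is passed in explicitly so the logic stays deterministic; this is an illustration of the algorithm, not the HybridLogicalClock API above:

```rust
// Ordering is lexicographic: wall time first, logical counter second.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Hlc {
    wall_ms: u64,
    logical: u16,
}

impl Hlc {
    fn new() -> Self {
        Self { wall_ms: 0, logical: 0 }
    }

    // Local event: advance to the wall clock, or bump the counter
    // if the wall clock has not moved past the last timestamp.
    fn tick(&mut self, wall_ms: u64) -> Hlc {
        if wall_ms > self.wall_ms {
            self.wall_ms = wall_ms;
            self.logical = 0;
        } else {
            self.logical += 1;
        }
        *self
    }

    // Receive: take the max of local, message, and wall time,
    // bumping the counter to stay strictly after both inputs.
    fn receive(&mut self, msg: Hlc, wall_ms: u64) -> Hlc {
        let max_wall = wall_ms.max(self.wall_ms).max(msg.wall_ms);
        self.logical = if max_wall == self.wall_ms && max_wall == msg.wall_ms {
            self.logical.max(msg.logical) + 1
        } else if max_wall == self.wall_ms {
            self.logical + 1
        } else if max_wall == msg.wall_ms {
            msg.logical + 1
        } else {
            0
        };
        self.wall_ms = max_wall;
        *self
    }
}
```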
Formal Verification
Both protocols are formally specified in TLA+ and exhaustively model-checked with TLC:
- Raft.tla verifies ElectionSafety, LogMatching, StateMachineSafety, LeaderCompleteness, VoteIntegrity, and TermMonotonicity across 18.3M distinct states.
- Membership.tla verifies NoFalsePositivesSafety, MonotonicEpochs, and MonotonicIncarnations across 54K distinct states.
Model checking found and led to fixes for protocol bugs including out-of-order message handling in Raft log replication and an invalid fairness formula in the gossip spec. See Formal Verification for full results.
Integration
Raft and SWIM work together:
- SWIM detects node failures quickly
- Raft handles leader election and log consistency
- HLC provides ordering across the cluster
flowchart TB
subgraph Membership Layer
SWIM[SWIM Gossip]
end
subgraph Consensus Layer
Raft[Raft Consensus]
end
subgraph Time Layer
HLC[Hybrid Logical Clock]
end
SWIM -->|failure notifications| Raft
HLC -->|timestamps| SWIM
HLC -->|timestamps| Raft
Embedding State Machine
The EmbeddingState provides type-safe state transitions for transaction
embedding lifecycle. It eliminates Option ceremony and ensures correct API
usage at compile time.
Overview
Transaction embeddings track the semantic change from before-state to after-state. The state machine ensures:
- Before embedding is always available
- Delta is only accessible after computation
- Dimension mismatches are caught early
- Double-computation is prevented
State Diagram
stateDiagram-v2
[*] --> Initial: new(before)
Initial --> Computed: compute(after)
Computed --> Computed: access only
Initial --> Initial: access only
| State | Description | Available Data |
|---|---|---|
| Initial | Transaction started, before captured | before |
| Computed | Delta computed, ready for conflict check | before, after, delta |
API Reference
Construction Methods
| Method | Description | Result State |
|---|---|---|
new(before) | Create from sparse vector | Initial |
from_dense(&[f32]) | Create from dense slice | Initial |
empty(dim) | Create zero vector of given dimension | Initial |
default() | Create empty (dimension 0) | Initial |
State Query Methods
| Method | Initial | Computed |
|---|---|---|
before() | &SparseVector | &SparseVector |
after() | None | Some(&SparseVector) |
delta() | None | Some(&SparseVector) |
is_computed() | false | true |
dimension() | dimension | dimension |
Transition Methods
| Method | From | To | Error Conditions |
|---|---|---|---|
compute(after) | Initial | Computed | AlreadyComputed, DimensionMismatch |
compute_from_dense(&[f32]) | Initial | Computed | AlreadyComputed, DimensionMismatch |
compute_with_threshold(after, threshold) | Initial | Computed | AlreadyComputed, DimensionMismatch |
Threshold Configuration
The compute_with_threshold method creates sparse deltas by ignoring small
changes. This reduces memory usage for high-dimensional embeddings.
Threshold Effects
| Threshold | Effect | Use Case |
|---|---|---|
| 0.0 | All differences captured | Exact tracking |
| 0.001 | Ignore floating-point noise | General use |
| 0.01 | Ignore minor changes | Dimensionality reduction |
| 0.1 | Only major changes | Coarse conflict detection |
Example
let state = EmbeddingState::from_dense(&before);

// Only capture differences > 0.01
let computed = state.compute_with_threshold(&after, 0.01)?;

// Sparse delta - fewer non-zero entries
let delta = computed.delta().unwrap();
println!("Non-zero entries: {}", delta.nnz());
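The thresholding itself is straightforward to sketch outside the EmbeddingState API: keep only the components whose change exceeds the threshold, as (index, delta) pairs:

```rust
// Thresholded sparse delta between two dense embeddings of equal length.
fn sparse_delta(before: &[f32], after: &[f32], threshold: f32) -> Vec<(usize, f32)> {
    before
        .iter()
        .zip(after)
        .enumerate()
        .filter_map(|(i, (b, a))| {
            let d = a - b;
            // Drop sub-threshold changes (e.g. floating-point noise).
            (d.abs() > threshold).then_some((i, d))
        })
        .collect()
}
```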
Error Handling
Error Types
| Error | Cause | Prevention |
|---|---|---|
NotComputed | Accessing delta before compute | Check is_computed() |
AlreadyComputed | Calling compute twice | Check is_computed() |
DimensionMismatch | Before and after have different dims | Validate dimensions |
Error Display
// NotComputed
"delta not yet computed"

// AlreadyComputed
"delta already computed"

// DimensionMismatch
"dimension mismatch: before=128, after=64"
Example Usage
Basic Workflow
use tensor_chain::embedding::EmbeddingState;
use tensor_store::SparseVector;

// 1. Capture before-state at transaction start
let before = SparseVector::from_dense(&[1.0, 0.0, 0.0, 0.0]);
let state = EmbeddingState::new(before);

// 2. State is Initial - delta not available
assert!(!state.is_computed());
assert!(state.delta().is_none());

// 3. Compute delta at commit time
let after = SparseVector::from_dense(&[1.0, 0.5, 0.0, 0.0]);
let computed = state.compute(after)?;

// 4. State is Computed - delta available
assert!(computed.is_computed());
let delta = computed.delta().unwrap();

// Delta is [0.0, 0.5, 0.0, 0.0]
assert_eq!(delta.nnz(), 1); // Only one non-zero
Using delta_or_zero
For code that needs a dense vector regardless of state:
// Safe to call in any state
let dense_delta = state.delta_or_zero();
// Returns zeros if Initial
// Returns the actual delta if Computed
Delta Magnitude
// Check if the transaction made significant changes
let magnitude = state.delta_magnitude();
if magnitude < 0.001 {
    println!("No meaningful changes");
} else {
    println!("Change magnitude: {}", magnitude);
}
Integration with Consensus
The embedding state integrates with the consensus layer for conflict detection.
Delta to DeltaVector
use tensor_chain::consensus::DeltaVector;

let state = EmbeddingState::from_dense(&before);
let computed = state.compute_from_dense(&after)?;

// Create a DeltaVector for conflict detection
let delta_vec = DeltaVector::new(
    computed.delta_or_zero(),
    affected_keys,
    tx_id,
);

// Check orthogonality with another transaction
let similarity = delta_vec.cosine_similarity(&other_delta);
if similarity.abs() < 0.1 {
    println!("Transactions are orthogonal - can merge");
}
Conflict Classification
| Similarity | Classification | Action |
|---|---|---|
| < 0.1 | Orthogonal | Can merge |
| 0.1 - 0.5 | Low conflict | Merge possible |
| 0.5 - 0.9 | Conflicting | Needs resolution |
| > 0.9 | Parallel | Must serialize |
Serialization
The state machine supports bitcode serialization for persistence:
// Serialize
let bytes = bitcode::encode(&state);

// Deserialize
let restored: EmbeddingState = bitcode::decode(&bytes)?;

// State is preserved
assert_eq!(state.is_computed(), restored.is_computed());
Source Reference
- tensor_chain/src/embedding.rs - EmbeddingState implementation
- tensor_store/src/lib.rs - SparseVector type
Codebook Manager
The codebook system provides vector quantization for mapping continuous tensor states to a finite vocabulary of valid states. It enables state validation and efficient consensus through hierarchical quantization.
Overview
The system consists of two levels:
- GlobalCodebook: Static codebook shared across all nodes for consensus
- LocalCodebook: Adaptive codebook per domain that captures residuals
flowchart TD
A[Input Vector] --> B[Global Codebook]
B --> C{Residual > threshold?}
C -->|No| D[Global code only]
C -->|Yes| E[Local Codebook]
E --> F[Global + Local codes]
D --> G[Final Quantization]
F --> G
Global Codebook
The global codebook provides consensus-safe quantization using static centroids shared by all nodes.
Initialization Methods
| Method | Description |
|---|---|
new(dimension) | Empty codebook |
from_centroids(Vec<Vec<f32>>) | From pre-computed centroids |
from_centroids_with_labels | Centroids with semantic labels |
from_kmeans(vectors, k, iters) | Initialize via k-means clustering |
Quantization
use tensor_chain::codebook::GlobalCodebook;

// Initialize from training data
let codebook = GlobalCodebook::from_kmeans(&training_vectors, 256, 100);

// Quantize a vector
if let Some((entry_id, similarity)) = codebook.quantize(&vector) {
    println!("Nearest entry: {}, similarity: {}", entry_id, similarity);
}

// Compute residual for hierarchical quantization
if let Some((id, residual)) = codebook.compute_residual(&vector) {
    // residual = vector - centroid[id]
}
Local Codebook
Local codebooks adapt to domain-specific patterns using exponential moving average (EMA) updates. They capture residuals that the global codebook misses.
EMA Update Formula
When an observation matches an existing entry:
centroid_new = alpha * observation + (1 - alpha) * centroid_old
where alpha controls the learning rate (default: 0.1).
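The update is a direct transcription of the formula for one centroid (a sketch of the math, not the LocalCodebook implementation):

```rust
// In-place EMA update: centroid = alpha * observation + (1 - alpha) * centroid.
fn ema_update(centroid: &mut [f32], observation: &[f32], alpha: f32) {
    for (c, o) in centroid.iter_mut().zip(observation) {
        *c = alpha * o + (1.0 - alpha) * *c;
    }
}
```

Small alpha values make the centroid drift slowly toward new observations; large values make it track recent data aggressively.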
Configuration
| Parameter | Default | Description |
|---|---|---|
| `max_entries` | 256 | Maximum entries in the codebook |
| `ema_alpha` | 0.1 | EMA learning rate |
| `min_usage_for_prune` | 2 | Minimum accesses before pruning |
| `pruning_strategy` | Hybrid | How to select entries for removal |
Pruning Strategies
| Strategy | Description | Score Formula |
|---|---|---|
| LRU | Least Recently Used | last_access |
| LFU | Least Frequently Used | access_count |
| Hybrid | Weighted combination | w1*recency + w2*frequency |
```rust
use tensor_chain::codebook::{LocalCodebook, PruningStrategy};

let mut local = LocalCodebook::new("transactions", 128, 256, 0.1);

// Use LRU pruning
local.set_pruning_strategy(PruningStrategy::LRU);

// Or hybrid with custom weights
local.set_pruning_strategy(PruningStrategy::Hybrid {
    recency_weight: 0.7,
    frequency_weight: 0.3,
});
```
CodebookManager
The CodebookManager coordinates hierarchical quantization across global
and local codebooks.
Configuration
```rust
use tensor_chain::codebook::{CodebookManager, CodebookConfig, GlobalCodebook};

let config = CodebookConfig {
    local_capacity: 256,       // Max entries per local codebook
    ema_alpha: 0.1,            // EMA learning rate
    similarity_threshold: 0.9, // Match threshold for local updates
    residual_threshold: 0.05,  // Min residual for local quantization
    validity_threshold: 0.8,   // State validity threshold
};

let global = GlobalCodebook::from_kmeans(&training_data, 512, 100);
let manager = CodebookManager::new(global, config);
```
Hierarchical Quantization
```mermaid
sequenceDiagram
    participant V as Input Vector
    participant G as Global Codebook
    participant L as Local Codebook
    participant R as Result
    V->>G: quantize()
    G->>G: Find nearest centroid
    G->>V: (entry_id, similarity)
    V->>V: residual = vector - centroid
    alt residual > threshold
        V->>L: quantize_and_update(residual)
        L->>L: EMA update or insert
        L->>R: codes = [global_id, local_id]
    else residual <= threshold
        V->>R: codes = [global_id]
    end
```
Usage
```rust
// Quantize a transaction embedding
let result = manager.quantize("transactions", &embedding)?;

println!("Global entry: {}", result.global_entry_id);
println!("Global similarity: {}", result.global_similarity);

if let Some(local_id) = result.local_entry_id {
    println!("Local entry: {}", local_id);
    println!("Local similarity: {}", result.local_similarity.unwrap());
}

// Final codes for storage/transmission
println!("Codes: {:?}", result.codes);
```
State Validation
The codebook system validates states against known-good patterns.
Validation Methods
```rust
// Check if a state is valid (matches any codebook entry)
let is_valid = manager.is_valid_state("transactions", &state);

// Check if a transition is valid
let is_valid_transition = manager.is_valid_transition(
    "transactions",
    &from_state,
    &to_state,
    0.5, // max allowed distance
);
```
Validation Flow
| Check | Threshold | Outcome |
|---|---|---|
| Global match | validity_threshold | Valid if similarity >= threshold |
| Local match | validity_threshold | Valid if similarity >= threshold |
| Transition distance | max_distance | Valid if Euclidean distance <= max |
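The checks in the table reduce to two simple predicates. The sketch below is illustrative; the function names and signatures are assumptions, not the tensor_chain API:

```rust
// Illustrative validity predicates mirroring the table above.

// A state is valid if its best codebook match clears the validity threshold.
fn is_valid_match(best_similarity: f32, validity_threshold: f32) -> bool {
    best_similarity >= validity_threshold
}

// A transition is valid if the Euclidean distance between the two
// states stays within the allowed maximum.
fn is_valid_transition(from: &[f32], to: &[f32], max_distance: f32) -> bool {
    let dist = from
        .iter()
        .zip(to)
        .map(|(a, b)| (a - b) * (a - b))
        .sum::<f32>()
        .sqrt();
    dist <= max_distance
}

fn main() {
    assert!(is_valid_match(0.85, 0.8));  // above validity_threshold
    assert!(!is_valid_match(0.75, 0.8)); // below threshold: rejected
    assert!(is_valid_transition(&[0.0, 0.0], &[0.3, 0.4], 0.5)); // distance = 0.5
}
```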
Worked Example
Training and Runtime Quantization
```rust
use tensor_chain::codebook::{CodebookManager, CodebookConfig, GlobalCodebook};

// Phase 1: Training - build global codebook
let training_embeddings: Vec<Vec<f32>> = collect_training_data();
let global = GlobalCodebook::from_kmeans(&training_embeddings, 512, 100);

// Phase 2: Runtime - create manager
let config = CodebookConfig::default();
let manager = CodebookManager::new(global, config);

// Phase 3: Quantize incoming transactions
for tx in transactions {
    let embedding = compute_embedding(&tx);

    // Hierarchical quantization
    let quant = manager.quantize("transactions", &embedding)?;

    // Validate the state
    if !manager.is_valid_state("transactions", &embedding) {
        warn!("Unusual transaction state: {:?}", tx);
    }

    // Store codes for consensus
    tx.set_quantization_codes(quant.codes);
}

// Local codebook learns domain-specific patterns over time
manager.with_local("transactions", |local| {
    let stats = local.stats();
    println!("Local entries: {}", stats.entry_count);
    println!("Total updates: {}", stats.total_updates);
});
```
k-Means Initialization
The global codebook uses k-means++ initialization, which spreads the initial centroids apart to improve clustering quality and convergence.
```mermaid
flowchart TD
    A[Training Vectors] --> B[Random First Centroid]
    B --> C[Compute Distances]
    C --> D[Weighted Probability Selection]
    D --> E{k centroids?}
    E -->|No| C
    E -->|Yes| F[Lloyd's Iteration]
    F --> G{Converged?}
    G -->|No| F
    G -->|Yes| H[Final Centroids]
```
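The seeding loop in the flowchart can be sketched as follows. This is an illustrative implementation of k-means++ seeding, not the tensor_store code; the deterministic `rand` closure stands in for a real RNG:

```rust
// Illustrative k-means++ seeding: each new centroid is chosen with
// probability proportional to its squared distance from the nearest
// already-chosen centroid.
fn squared_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

fn kmeans_pp_seed(
    vectors: &[Vec<f32>],
    k: usize,
    mut rand: impl FnMut() -> f32, // uniform in [0, 1)
) -> Vec<Vec<f32>> {
    let mut centroids = vec![vectors[0].clone()]; // first centroid (random in practice)
    while centroids.len() < k {
        // D(x)^2: squared distance of each point to its nearest centroid
        let d2: Vec<f32> = vectors
            .iter()
            .map(|v| {
                centroids
                    .iter()
                    .map(|c| squared_dist(v, c))
                    .fold(f32::MAX, f32::min)
            })
            .collect();
        // Weighted selection: walk the cumulative distribution
        let total: f32 = d2.iter().sum();
        let mut target = rand() * total;
        let mut chosen = 0;
        for (i, &d) in d2.iter().enumerate() {
            if target <= d {
                chosen = i;
                break;
            }
            target -= d;
        }
        centroids.push(vectors[chosen].clone());
    }
    centroids
}

fn main() {
    let vs = vec![vec![0.0, 0.0], vec![0.0, 0.1], vec![10.0, 10.0]];
    // The far-away point dominates the weighted selection
    let cs = kmeans_pp_seed(&vs, 2, || 0.9);
    assert_eq!(cs.len(), 2);
}
```

Lloyd's iteration then refines these seeds by alternating assignment and centroid-recomputation steps until the tolerance is met.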
Configuration
```rust
use tensor_store::{InitMethod, KMeans, KMeansConfig};

let config = KMeansConfig {
    max_iterations: 100,
    tolerance: 1e-4,
    init_method: InitMethod::KMeansPlusPlus,
};

let kmeans = KMeans::new(config);
let centroids = kmeans.fit(&vectors, 512);
```
Statistics and Monitoring
Local Codebook Stats
| Metric | Description |
|---|---|
| `entry_count` | Current number of entries |
| `total_updates` | EMA updates performed |
| `total_lookups` | Quantization queries |
| `total_prunes` | Entries removed due to capacity |
| `total_insertions` | New entries added |
```rust
manager.with_local("transactions", |local| {
    let stats = local.stats();
    let hit_rate = 1.0 - (stats.total_insertions as f64 / stats.total_lookups as f64);
    println!("Cache hit rate: {:.2}%", hit_rate * 100.0);
});
```
Source Reference
- `tensor_chain/src/codebook.rs` - Codebook implementations
- `tensor_store/src/lib.rs` - KMeans clustering
Formal Verification
Neumann’s distributed protocols are formally specified in TLA+ and
exhaustively model-checked with the TLC model checker. The
specifications live in specs/tla/ and cover the three critical
protocol layers in tensor_chain.
What Is Model Checked
TLC explores every reachable state of a bounded model, checking safety invariants and temporal properties at each state. Unlike testing (which samples executions), model checking is exhaustive: if TLC reports no errors, the properties hold for every possible interleaving within the model bounds.
Raft Consensus (Raft.tla)
Models leader election, log replication, and commit advancement for
the Tensor-Raft protocol implemented in tensor_chain/src/raft.rs.
Three configurations exercise different aspects of the protocol.
Properties verified (14):
| Property | Type | What It Means |
|---|---|---|
| `ElectionSafety` | Invariant | At most one leader per term |
| `LogMatching` | Invariant | Same index + term implies same entry |
| `StateMachineSafety` | Invariant | No divergent committed entries |
| `LeaderCompleteness` | Invariant | Committed entries survive leader changes |
| `VoteIntegrity` | Invariant | Each node votes at most once per term |
| `PreVoteSafety` | Invariant | Pre-vote does not disrupt existing leaders |
| `ReplicationInv` | Invariant | Every committed entry exists on a quorum |
| `TermMonotonicity` | Temporal | Terms never decrease |
| `CommittedLogAppendOnlyProp` | Temporal | Committed entries never retracted |
| `MonotonicCommitIndexProp` | Temporal | `commitIndex` never decreases |
| `MonotonicMatchIndexProp` | Temporal | `matchIndex` monotonic per leader term |
| `NeverCommitEntryPrevTermsProp` | Temporal | Only current-term entries committed |
| `StateTransitionsProp` | Temporal | Valid state machine transitions |
| `PermittedLogChangesProp` | Temporal | Log changes only via valid paths |
Result (Raft.cfg, 3 nodes): 6,641,341 states generated, 1,338,669 distinct states, depth 42, 2 min 24s. Zero errors.
Two-Phase Commit (TwoPhaseCommit.tla)
Models the 2PC protocol for cross-shard distributed transactions
implemented in tensor_chain/src/distributed_tx.rs. Includes a
fault model with message loss and participant crash/recovery.
Properties verified (6):
| Property | Type | What It Means |
|---|---|---|
| `Atomicity` | Invariant | All participants commit or all abort |
| `NoOrphanedLocks` | Invariant | Completed transactions release locks |
| `ConsistentDecision` | Invariant | Coordinator decision matches outcomes |
| `VoteIrrevocability` | Temporal | Prepared votes cannot be retracted without coordinator |
| `DecisionStability` | Temporal | Coordinator decision never changes |
Fault model: `DropMessage` (network loss) and
`ParticipantRestart` (crash with WAL-backed lock recovery).
Result: 1,869,429,350 states generated, 190,170,601 distinct states, depth 29, 2 hr 55 min. Zero errors. Every reachable state under message loss and crash/recovery satisfies all properties.
SWIM Gossip Membership (Membership.tla)
Models the SWIM-based gossip protocol for cluster membership and
failure detection implemented in tensor_chain/src/gossip.rs and
tensor_chain/src/membership.rs.
Properties verified (3):
| Property | Type | What It Means |
|---|---|---|
| `NoFalsePositivesSafety` | Invariant | No node marked Failed above its own incarnation |
| `MonotonicEpochs` | Temporal | Lamport timestamps never decrease |
| `MonotonicIncarnations` | Temporal | Incarnation numbers never decrease |
Result (2-node): 136,097 states generated, 54,148 distinct states, depth 17. Zero errors. Result (3-node): 16,513 states generated, 5,992 distinct states, depth 13. Zero errors.
Bugs Found by Model Checking
TLC discovered real protocol bugs that would be extremely difficult to find through testing alone:
- **matchIndex response reporting (Raft)**: Follower reported `matchIndex = Len(log)` instead of `prevLogIndex + Len(entries)`. A heartbeat response would falsely claim the full log matched the leader's, enabling incorrect commits. Caught by `ReplicationInv`.
- **Out-of-order matchIndex regression (Raft)**: Leader unconditionally set `matchIndex` from responses. A stale heartbeat response arriving after a replication response would regress the value. Fixed by taking the max. Caught by `MonotonicMatchIndexProp`.
- **inPreVote not reset on step-down (Raft)**: When stepping down to a higher term, the `inPreVote` flag was not cleared. A node could remain in pre-vote state as a Follower. Caught by `PreVoteSafety`.
- **Self-message processing (Raft)**: A leader could process its own `AppendEntries` heartbeat, truncating its own log.
- **Heartbeat log wipe (Raft)**: Empty heartbeat messages with `prevLogIndex = 0` computed an empty new log, destroying committed entries.
- **Out-of-order AppendEntries (Raft)**: Stale messages could overwrite entries from newer messages. Fixed with the conflict-resolution rules from Raft Section 5.3.
- **Gossip fairness formula (Membership)**: Quantification over `messages` (a state variable) inside `WF_vars` is semantically invalid in TLA+.
How to Run
```bash
cd specs/tla

# Fast CI check (~3 minutes total)
make ci

# All configs including extensions
make all

# Individual specs
java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
    -deadlock -workers auto -config Raft.cfg Raft.tla
```
The `-deadlock` flag suppresses false deadlock reports on terminal
states in bounded models. The `-workers auto` flag enables
multi-threaded checking.
Relationship to Testing
| Technique | Coverage | Finds |
|---|---|---|
| Unit tests | Specific scenarios | Implementation bugs |
| Integration tests | Cross-crate workflows | Wiring bugs |
| Fuzz testing | Random inputs | Crash/panic bugs |
| Model checking | All interleavings | Protocol design bugs |
Model checking complements testing. It verifies the protocol design is correct (no possible interleaving violates safety), while tests verify the Rust implementation matches the design. Together they provide high confidence that the distributed protocols behave correctly.
Further Reading
- specs/tla/README.md for full specification documentation and source code mapping
- Consensus Protocols for Raft and SWIM protocol details
- Distributed Transactions for 2PC protocol details
Worked Examples
This tutorial demonstrates tensor_chain’s conflict detection, deadlock resolution, and orthogonal transaction merging through detailed scenarios.
Prerequisites
- Understanding of transaction workspaces
- Familiarity with delta embeddings
- Basic knowledge of distributed transactions
Scenario 1: Semantic Conflict Detection
Two transactions modify overlapping data. The system detects the conflict using delta embedding similarity.
Setup
```rust
use tensor_chain::{TensorStore, TransactionManager};
use tensor_chain::block::Transaction;

let store = TensorStore::new();
let manager = TransactionManager::new();

// Initialize account data
store.put("account:1", serialize(&Account { balance: 1000 }))?;
store.put("account:2", serialize(&Account { balance: 2000 }))?;
```
Transaction Execution
```rust
// Transaction A: Transfer from account:1 to account:2
let tx_a = manager.begin(&store)?;
tx_a.add_operation(Transaction::Put {
    key: "account:1".to_string(),
    data: serialize(&Account { balance: 900 }), // -100
})?;
tx_a.add_operation(Transaction::Put {
    key: "account:2".to_string(),
    data: serialize(&Account { balance: 2100 }), // +100
})?;

// Transaction B: Transfer from account:1 to account:3 (conflicts on account:1)
let tx_b = manager.begin(&store)?;
tx_b.add_operation(Transaction::Put {
    key: "account:1".to_string(),
    data: serialize(&Account { balance: 800 }), // -200
})?;
tx_b.add_operation(Transaction::Put {
    key: "account:3".to_string(),
    data: serialize(&Account { balance: 200 }), // +200
})?;
```
Conflict Detection Flow
```mermaid
sequenceDiagram
    participant A as Transaction A
    participant B as Transaction B
    participant CM as ConsensusManager
    A->>A: compute_delta()
    Note over A: delta_a = [0.8, 0.2, 0.0, 0.0]
    B->>B: compute_delta()
    Note over B: delta_b = [0.9, 0.0, 0.1, 0.0]
    A->>CM: prepare(delta_a)
    B->>CM: prepare(delta_b)
    CM->>CM: cosine_similarity(delta_a, delta_b)
    Note over CM: similarity = 0.72 (HIGH)
    CM->>A: Vote::Yes
    CM->>B: Vote::Conflict(similarity=0.72, tx=A)
    A->>A: commit()
    B->>B: abort() + retry
```
Classification Table
| Similarity Range | Classification | Action |
|---|---|---|
| 0.0 - 0.1 | Orthogonal | Parallel commit OK |
| 0.1 - 0.5 | Low overlap | Merge possible |
| 0.5 - 0.9 | Conflicting | Serialize execution |
| 0.9 - 1.0 | Near-identical | Abort one |
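The banding in the table maps directly onto a small classifier. This sketch is illustrative; the enum and function names are made up for the example and are not the tensor_chain API:

```rust
// Illustrative mapping from delta cosine similarity to conflict handling.
#[derive(Debug, PartialEq)]
enum ConflictClass {
    Orthogonal,    // parallel commit OK
    LowOverlap,    // merge possible
    Conflicting,   // serialize execution
    NearIdentical, // abort one
}

fn classify(similarity: f32) -> ConflictClass {
    match similarity {
        s if s < 0.1 => ConflictClass::Orthogonal,
        s if s < 0.5 => ConflictClass::LowOverlap,
        s if s < 0.9 => ConflictClass::Conflicting,
        _ => ConflictClass::NearIdentical,
    }
}

fn main() {
    // The 0.72 similarity from the sequence diagram lands in the conflict band
    assert_eq!(classify(0.72), ConflictClass::Conflicting);
    assert_eq!(classify(0.05), ConflictClass::Orthogonal);
}
```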
Application Retry Logic
```rust
// Retry with exponential backoff
let mut attempt = 0;
let max_attempts = 5;

loop {
    let workspace = manager.begin(&store)?;

    // Re-read current state
    let account = store.get("account:1")?;

    // Apply changes
    workspace.add_operation(Transaction::Put {
        key: "account:1".to_string(),
        data: serialize(&Account {
            balance: account.balance - 200,
        }),
    })?;

    // Try to commit
    match commit_with_conflict_check(&workspace, &manager) {
        Ok(()) => break,
        Err(ConflictError { similarity, .. }) => {
            attempt += 1;
            if attempt >= max_attempts {
                return Err("max retries exceeded");
            }
            // Exponential backoff with jitter
            let backoff = (100 * 2u64.pow(attempt)) + rand::random::<u64>() % 50;
            std::thread::sleep(Duration::from_millis(backoff));
        }
    }
}
```
Scenario 2: Deadlock Detection and Resolution
Two transactions wait on each other’s locks, creating a cycle in the wait-for graph.
Setup
```text
Transaction T1: needs locks on [key_A, key_B]
Transaction T2: needs locks on [key_B, key_A]

Timeline:
  T1 acquires key_A
  T2 acquires key_B
  T1 waits for key_B (held by T2)
  T2 waits for key_A (held by T1)
  -> DEADLOCK
```
Wait-For Graph
```mermaid
flowchart LR
    T1((T1)) -->|waits for key_B| T2((T2))
    T2 -->|waits for key_A| T1
```
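Cycle detection over a wait-for graph is a small depth-first search. The sketch below only mirrors the idea; the actual detector lives in tensor_chain, and these types and names are assumptions:

```rust
use std::collections::HashMap;

// Illustrative wait-for graph with DFS cycle detection.
struct WaitForGraph {
    edges: HashMap<u64, Vec<u64>>, // tx -> transactions it waits on
}

impl WaitForGraph {
    fn new() -> Self {
        Self { edges: HashMap::new() }
    }

    fn add_wait(&mut self, waiter: u64, holder: u64) {
        self.edges.entry(waiter).or_default().push(holder);
    }

    // Returns one cycle (as a list of tx ids) if any exists.
    fn detect_cycle(&self) -> Option<Vec<u64>> {
        for &start in self.edges.keys() {
            let mut path = vec![start];
            if self.dfs(start, start, &mut path) {
                return Some(path);
            }
        }
        None
    }

    fn dfs(&self, current: u64, target: u64, path: &mut Vec<u64>) -> bool {
        for &next in self.edges.get(&current).into_iter().flatten() {
            if next == target {
                return true; // closed the loop back to the start
            }
            if path.contains(&next) {
                continue; // already on this path
            }
            path.push(next);
            if self.dfs(next, target, path) {
                return true;
            }
            path.pop();
        }
        false
    }
}

fn main() {
    let mut g = WaitForGraph::new();
    g.add_wait(1, 2); // T1 waits on T2
    g.add_wait(2, 1); // T2 waits on T1 -> deadlock
    let cycle = g.detect_cycle().expect("cycle expected");
    assert_eq!(cycle.len(), 2);
}
```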
Detection Flow
```mermaid
sequenceDiagram
    participant T1 as Transaction 1
    participant T2 as Transaction 2
    participant LM as LockManager
    participant WG as WaitForGraph
    participant DD as DeadlockDetector
    T1->>LM: try_lock(key_A)
    LM->>T1: Ok(handle_1)
    T2->>LM: try_lock(key_B)
    LM->>T2: Ok(handle_2)
    T1->>LM: try_lock(key_B)
    LM->>WG: add_wait(T1, T2)
    LM->>T1: Err(blocked by T2)
    T2->>LM: try_lock(key_A)
    LM->>WG: add_wait(T2, T1)
    LM->>T2: Err(blocked by T1)
    DD->>WG: detect_cycle()
    WG->>DD: Some([T1, T2])
    DD->>DD: select_victim(T2)
    DD->>T2: abort()
    DD->>LM: release(T2)
    DD->>WG: remove(T2)
    T1->>LM: try_lock(key_B)
    LM->>T1: Ok(handle_3)
```
Victim Selection
| Criterion | Weight | Description |
|---|---|---|
| Lock count | 0.3 | Fewer locks = preferred victim |
| Transaction age | 0.3 | Younger = preferred victim |
| Priority | 0.4 | Lower priority = preferred victim |
```rust
// Note: `lock_manager` is assumed to be in scope here.
fn select_victim(cycle: &[u64], priorities: &HashMap<u64, u32>) -> u64 {
    cycle
        .iter()
        .min_by_key(|&&tx_id| {
            let priority = priorities.get(&tx_id).copied().unwrap_or(0);
            let lock_count = lock_manager.lock_count_for_transaction(tx_id);
            // Lowest (priority, lock_count) pair is the preferred victim
            (priority, lock_count)
        })
        .copied()
        .unwrap()
}
```
Configuration
```toml
[deadlock]
detection_interval_ms = 100
max_cycle_length = 10
victim_selection = "youngest" # or "lowest_priority", "fewest_locks"
```
Scenario 3: Orthogonal Transaction Merging
Two transactions modify non-overlapping data with orthogonal delta embeddings. They can be committed in parallel.
Setup
```rust
// Transaction A: Update user preferences
let tx_a = manager.begin(&store)?;
tx_a.add_operation(Transaction::Put {
    key: "user:1:prefs".to_string(),
    data: serialize(&Preferences { theme: "dark" }),
})?;
tx_a.set_before_embedding(vec![0.0; 128]);
tx_a.compute_delta(vec![1.0, 0.0, 0.0, 0.0]); // Direction: X

// Transaction B: Update product inventory
let tx_b = manager.begin(&store)?;
tx_b.add_operation(Transaction::Put {
    key: "product:42:stock".to_string(),
    data: serialize(&Stock { quantity: 100 }),
})?;
tx_b.set_before_embedding(vec![0.0; 128]);
tx_b.compute_delta(vec![0.0, 1.0, 0.0, 0.0]); // Direction: Y
```
Parallel Commit
```mermaid
sequenceDiagram
    participant A as Transaction A
    participant B as Transaction B
    participant CM as ConsensusManager
    participant C as Chain
    par Prepare Phase
        A->>CM: prepare(delta_a)
        B->>CM: prepare(delta_b)
    end
    CM->>CM: similarity = 0.0 (ORTHOGONAL)
    par Commit Phase
        CM->>A: Vote::Yes
        CM->>B: Vote::Yes
        A->>C: append(block_a)
        B->>C: append(block_b)
    end
    Note over C: Both blocks committed
```
Orthogonality Analysis
| Transaction A | Transaction B | Overlap | Similarity | Can Merge? |
|---|---|---|---|---|
| user:1:prefs | product:42:stock | None | 0.00 | Yes |
| user:1:balance | user:2:balance | None | 0.15 | Yes |
| user:1:balance | user:1:prefs | user:1 | 0.30 | Maybe |
| account:1 | account:1 | Full | 0.95 | No |
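The similarity column is cosine similarity between the two delta embeddings. A minimal sketch (illustrative, not the tensor_chain implementation):

```rust
// Illustrative cosine similarity between two delta embeddings.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0 // degenerate delta: treat as orthogonal
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    // The two deltas from the setup above point along different axes
    let delta_a = [1.0, 0.0, 0.0, 0.0];
    let delta_b = [0.0, 1.0, 0.0, 0.0];
    assert_eq!(cosine_similarity(&delta_a, &delta_b), 0.0); // orthogonal
}
```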
Merge Implementation
```rust
// Find merge candidates
let candidates = manager.find_merge_candidates(
    &tx_a,
    0.1,    // orthogonal threshold
    60_000, // merge window (60s)
);

if !candidates.is_empty() {
    // Create merged block with multiple transactions
    let mut merged_ops = tx_a.operations();
    for candidate in &candidates {
        merged_ops.extend(candidate.workspace.operations());
    }

    // Compute merged delta
    let mut merged_delta = tx_a.to_delta_vector();
    for candidate in &candidates {
        merged_delta = merged_delta.add(&candidate.delta);
    }

    // Single block contains both transactions
    let block = Block::new(header, merged_ops);
    chain.append(block)?;

    // Mark all as committed
    tx_a.mark_committed();
    for candidate in candidates {
        candidate.workspace.mark_committed();
    }
}
```
Summary
| Scenario | Detection Method | Resolution |
|---|---|---|
| Conflict | Delta similarity > 0.5 | Serialize, retry loser |
| Deadlock | Wait-for graph cycle | Abort victim, retry |
| Orthogonal | Delta similarity < 0.1 | Parallel commit/merge |
Further Reading
Deployment
Single Node
For development and testing:
```bash
neumann --data-dir ./data
```
Cluster Deployment
Prerequisites
- 3, 5, or 7 nodes (odd number for quorum)
- Network connectivity between nodes
- Synchronized clocks (NTP)
Configuration
Each node needs a config file:
```toml
# /etc/neumann/config.toml
[node]
id = "node1"
data_dir = "/var/lib/neumann"
bind_address = "0.0.0.0:7878"

[cluster]
peers = [
    "node2:7878",
    "node3:7878",
]

[raft]
election_timeout_min_ms = 150
election_timeout_max_ms = 300
heartbeat_interval_ms = 50

[gossip]
bind_address = "0.0.0.0:7879"
ping_interval_ms = 1000
```
Starting the Cluster
```bash
# Start first node (will become leader)
neumann --config /etc/neumann/config.toml --bootstrap

# Start remaining nodes
neumann --config /etc/neumann/config.toml
```
Verify Cluster Health
```bash
# Check cluster status
curl http://node1:9090/health

# View membership
neumann-admin cluster-status
```
Docker Compose
```yaml
version: '3.8'
services:
  node1:
    image: neumann/neumann:latest
    environment:
      - NEUMANN_NODE_ID=node1
      - NEUMANN_PEERS=node2:7878,node3:7878
    ports:
      - "7878:7878"
      - "9090:9090"
    volumes:
      - node1-data:/var/lib/neumann
  node2:
    image: neumann/neumann:latest
    environment:
      - NEUMANN_NODE_ID=node2
      - NEUMANN_PEERS=node1:7878,node3:7878
    volumes:
      - node2-data:/var/lib/neumann
  node3:
    image: neumann/neumann:latest
    environment:
      - NEUMANN_NODE_ID=node3
      - NEUMANN_PEERS=node1:7878,node2:7878
    volumes:
      - node3-data:/var/lib/neumann
volumes:
  node1-data:
  node2-data:
  node3-data:
```
Kubernetes
See the Helm chart in deploy/helm/neumann/.
```bash
helm install neumann ./deploy/helm/neumann \
    --set replicas=3 \
    --set persistence.size=100Gi
```
Production Checklist
- Odd number of nodes (3, 5, or 7)
- Nodes in separate availability zones
- NTP configured and synchronized
- Firewall rules for ports 7878, 7879, 9090
- Monitoring and alerting configured
- Backup strategy in place
- Resource limits set appropriately
Configuration
Configuration Sources
Configuration is loaded in order (later overrides earlier):
- Default values
- Config file (`/etc/neumann/config.toml`)
- Environment variables (`NEUMANN_*`)
- Command-line flags
Config File Format
```toml
[node]
id = "node1"
data_dir = "/var/lib/neumann"
bind_address = "0.0.0.0:7878"

[cluster]
peers = ["node2:7878", "node3:7878"]

[raft]
election_timeout_min_ms = 150
election_timeout_max_ms = 300
heartbeat_interval_ms = 50
max_entries_per_append = 100
snapshot_interval = 10000

[gossip]
bind_address = "0.0.0.0:7879"
ping_interval_ms = 1000
ping_timeout_ms = 500
suspect_timeout_ms = 3000
indirect_ping_count = 3

[transaction]
prepare_timeout_ms = 5000
commit_timeout_ms = 5000
lock_timeout_ms = 5000
max_concurrent_tx = 1000

[deadlock]
enabled = true
detection_interval_ms = 100
victim_policy = "youngest"
auto_abort_victim = true

[storage]
max_memory_mb = 1024
wal_sync_mode = "fsync"
compression = "lz4"

[metrics]
enabled = true
bind_address = "0.0.0.0:9090"
```
Environment Variables
| Variable | Config Path | Example |
|---|---|---|
| `NEUMANN_NODE_ID` | `node.id` | `node1` |
| `NEUMANN_DATA_DIR` | `node.data_dir` | `/var/lib/neumann` |
| `NEUMANN_PEERS` | `cluster.peers` | `node2:7878,node3:7878` |
| `NEUMANN_LOG_LEVEL` | — | `info` |
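As a sanity check on the `NEUMANN_PEERS` format, the value is a simple comma-separated list. The helper below is hypothetical (not part of Neumann) and only illustrates how the string maps onto the `cluster.peers` list:

```rust
// Illustrative: split a NEUMANN_PEERS-style value into peer addresses.
fn parse_peers(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

fn main() {
    let peers = parse_peers("node2:7878, node3:7878");
    assert_eq!(peers, vec!["node2:7878", "node3:7878"]);
}
```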
Command-Line Flags
```bash
neumann \
    --config /etc/neumann/config.toml \
    --node-id node1 \
    --data-dir /var/lib/neumann \
    --bind 0.0.0.0:7878 \
    --bootstrap \
    --log-level debug
```
Key Parameters
Raft Tuning
| Parameter | Default | Tuning |
|---|---|---|
| `election_timeout_min_ms` | 150 | Increase for high-latency networks |
| `election_timeout_max_ms` | 300 | Should be 2x min |
| `heartbeat_interval_ms` | 50 | Lower for faster failure detection |
| `snapshot_interval` | 10000 | Higher for less I/O, slower recovery |
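The min/max pair exists because Raft randomizes each node's election timeout within that range, so candidates rarely tie. A minimal sketch of the relationship (illustrative only; `rand01` stands in for a real RNG):

```rust
// Illustrative: pick an election timeout uniformly in [min_ms, max_ms).
fn election_timeout_ms(min_ms: u64, max_ms: u64, rand01: f64) -> u64 {
    min_ms + ((max_ms - min_ms) as f64 * rand01) as u64
}

fn main() {
    // With the defaults above, timeouts fall between 150 and 300 ms
    assert_eq!(election_timeout_ms(150, 300, 0.0), 150);
    assert_eq!(election_timeout_ms(150, 300, 0.5), 225);
    // The 50 ms heartbeat stays well below the minimum, so a healthy
    // leader always refreshes followers before they start an election.
}
```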
Transaction Tuning
| Parameter | Default | Tuning |
|---|---|---|
| `prepare_timeout_ms` | 5000 | Increase for slow networks |
| `lock_timeout_ms` | 5000 | Lower to fail fast on contention |
| `max_concurrent_tx` | 1000 | Based on memory and CPU |
Storage Tuning
| Parameter | Default | Tuning |
|---|---|---|
| `max_memory_mb` | 1024 | Based on available RAM |
| `wal_sync_mode` | `fsync` | `none` for speed (data loss risk) |
| `compression` | `lz4` | `none` for speed, `zstd` for ratio |
Monitoring
Metrics Endpoint
Prometheus metrics are exposed at `http://node:9090/metrics`.
Key Metrics
Raft Consensus
| Metric | Type | Description |
|---|---|---|
| `tensor_chain_raft_state` | Gauge | Current state (follower=0, candidate=1, leader=2) |
| `tensor_chain_term` | Gauge | Current Raft term |
| `tensor_chain_commit_index` | Gauge | Highest committed log index |
| `tensor_chain_applied_index` | Gauge | Highest applied log index |
| `tensor_chain_elections_total` | Counter | Total elections started |
| `tensor_chain_append_entries_total` | Counter | Total AppendEntries RPCs |
Transactions
| Metric | Type | Description |
|---|---|---|
| `tensor_chain_tx_active` | Gauge | Currently active transactions |
| `tensor_chain_tx_commits_total` | Counter | Total committed transactions |
| `tensor_chain_tx_aborts_total` | Counter | Total aborted transactions |
| `tensor_chain_tx_latency_seconds` | Histogram | Transaction latency |
Deadlock Detection
| Metric | Type | Description |
|---|---|---|
| `tensor_chain_deadlocks_total` | Counter | Total deadlocks detected |
| `tensor_chain_deadlock_victims_total` | Counter | Transactions aborted as victims |
| `tensor_chain_wait_graph_size` | Gauge | Current wait-for graph size |
Gossip
| Metric | Type | Description |
|---|---|---|
| `tensor_chain_gossip_members` | Gauge | Known cluster members |
| `tensor_chain_gossip_healthy` | Gauge | Healthy members |
| `tensor_chain_gossip_suspect` | Gauge | Suspect members |
| `tensor_chain_gossip_failed` | Gauge | Failed members |
Storage
| Metric | Type | Description |
|---|---|---|
| `tensor_chain_entries_total` | Gauge | Total stored entries |
| `tensor_chain_memory_bytes` | Gauge | Memory usage |
| `tensor_chain_disk_bytes` | Gauge | Disk usage |
| `tensor_chain_wal_size_bytes` | Gauge | WAL file size |
Prometheus Configuration
```yaml
scrape_configs:
  - job_name: 'neumann'
    static_configs:
      - targets:
          - 'node1:9090'
          - 'node2:9090'
          - 'node3:9090'
```
Grafana Dashboard
Import the dashboard from deploy/grafana/neumann-dashboard.json.
Panels include:
- Cluster overview (leader, term, members)
- Transaction throughput and latency
- Replication lag
- Memory and disk usage
- Deadlock rate
Alerting Rules
See docs/book/src/operations/runbooks/ for alert definitions.
```yaml
groups:
  - name: neumann
    rules:
      - alert: NoLeader
        expr: sum(tensor_chain_raft_state{state="leader"}) == 0
        for: 30s
        labels:
          severity: critical
      - alert: HighReplicationLag
        expr: tensor_chain_commit_index - tensor_chain_applied_index > 1000
        for: 1m
        labels:
          severity: warning
      - alert: HighDeadlockRate
        expr: rate(tensor_chain_deadlocks_total[5m]) > 1
        for: 5m
        labels:
          severity: warning
```
Health Endpoint
```bash
curl http://node:9090/health
```
Response:
```json
{
  "status": "healthy",
  "raft_state": "leader",
  "term": 42,
  "commit_index": 12345,
  "members": 3,
  "healthy_members": 3
}
```
Logging
Configure log level:
```bash
RUST_LOG=tensor_chain=debug neumann
```
Log levels: `error`, `warn`, `info`, `debug`, `trace`
Troubleshooting
Common Issues
Node Won’t Start
Symptom: Node exits immediately or fails to bind
Check:
```bash
# Port already in use
lsof -i :7878
lsof -i :9090

# Permissions
ls -la /var/lib/neumann

# Config syntax
neumann --config /etc/neumann/config.toml --validate
```
Solutions:
- Kill conflicting process
- Fix directory permissions: `chown -R neumann:neumann /var/lib/neumann`
- Fix config syntax errors
Can’t Connect to Cluster
Symptom: Client connections timeout
Check:
```bash
# Network connectivity
nc -zv node1 7878

# Firewall rules
iptables -L -n | grep 7878

# Node health
curl http://node1:9090/health
```
Solutions:
- Open firewall ports 7878, 7879, 9090
- Check DNS resolution
- Verify node is running
Slow Performance
Symptom: High latency, low throughput
Check:
```bash
# Metrics
curl http://node1:9090/metrics | grep -E "(latency|throughput)"

# Disk I/O
iostat -x 1

# Memory
free -h

# CPU
top -p $(pgrep neumann)
```
Solutions:
- Increase memory allocation
- Use faster storage (NVMe)
- Tune Raft parameters
- Add more nodes for read scaling
Data Inconsistency
Symptom: Different nodes return different data
Check:
```bash
# Compare commit indices
for node in node1 node2 node3; do
    curl -s http://$node:9090/metrics | grep commit_index
done

# Check for partitions
neumann-admin cluster-status
```
Solutions:
- Wait for replication to catch up
- Check network connectivity
- Follow split-brain runbook if partitioned
High Memory Usage
Symptom: OOM kills, swap usage
Check:
```bash
# Memory breakdown
curl http://node1:9090/metrics | grep memory

# Process memory
ps aux | grep neumann
```
Solutions:
- Increase `max_memory_mb` config
- Trigger snapshot to reduce log size
- Add more nodes to distribute load
WAL Growing Too Large
Symptom: Disk filling up
Check:
```bash
# WAL size
du -sh /var/lib/neumann/wal/

# Snapshot status
ls -la /var/lib/neumann/snapshots/
```
Solutions:
- Trigger manual snapshot: `curl -X POST http://node:9090/admin/snapshot`
- Reduce `snapshot_interval`
- Add more disk space
Debug Logging
Enable detailed logging:
```bash
RUST_LOG=tensor_chain=debug,tower=warn neumann
```
For specific modules:
```bash
RUST_LOG=tensor_chain::raft=trace,tensor_chain::gossip=debug neumann
```
Getting Help
- Check the runbooks for specific scenarios
- Search GitHub issues
- Open a new issue with:
- Neumann version
- Configuration (redact secrets)
- Relevant logs
- Steps to reproduce
Example Configurations
This page provides complete configuration examples for different deployment scenarios.
Development (Single Node)
Minimal configuration for local development and testing.
```toml
[node]
id = "dev-node"
data_dir = "./data"

[cluster]
# Single node cluster - no seeds needed
seeds = []
port = 9100

# Disable TLS for development
[tls]
enabled = false

# Minimal rate limiting
[rate_limit]
enabled = false

# No compression for easier debugging
[compression]
enabled = false

# Shorter timeouts for faster feedback
[transactions]
timeout_ms = 1000
lock_timeout_ms = 500

# Verbose logging
[logging]
level = "debug"
format = "pretty"
```
Production (3-Node Cluster)
Standard production configuration with TLS, rate limiting, and tuned timeouts.
```toml
# === Node Configuration ===
[node]
id = "node1"
data_dir = "/var/lib/neumann/data"
# Bind to all interfaces
bind_address = "0.0.0.0"

# === Cluster Configuration ===
[cluster]
seeds = ["node1.example.com:9100", "node2.example.com:9100", "node3.example.com:9100"]
port = 9100
# Cluster name for identification
name = "production"

# === TLS Configuration ===
[tls]
enabled = true
cert_path = "/etc/neumann/node1.crt"
key_path = "/etc/neumann/node1.key"
ca_cert_path = "/etc/neumann/ca.crt"
# Require mutual TLS
require_client_auth = true
# Verify node identity matches certificate
node_id_verification = "CommonName"

# === TCP Transport ===
[tcp]
# Connections per peer
pool_size = 4
# Connection timeout
connect_timeout_ms = 5000
# Read/write timeout
io_timeout_ms = 30000
# Enable keepalive
keepalive = true
keepalive_interval_secs = 30
# Maximum message size (16 MB)
max_message_size = 16777216
# Outbound queue size
max_pending_messages = 1000

# === Rate Limiting ===
[rate_limit]
enabled = true
# Burst capacity
bucket_size = 100
# Tokens per second
refill_rate = 50.0

# === Compression ===
[compression]
enabled = true
method = "Lz4"
# Only compress messages > 256 bytes
min_size = 256

# === Transactions ===
[transactions]
# Transaction timeout
timeout_ms = 5000
# Lock timeout
lock_timeout_ms = 30000
# Default embedding dimension
embedding_dimension = 128

# === Conflict Detection ===
[consensus]
# Similarity threshold for conflict
conflict_threshold = 0.5
# Threshold for orthogonal merge
orthogonal_threshold = 0.1
# Merge window
merge_window_ms = 60000

# === Deadlock Detection ===
[deadlock]
enabled = true
detection_interval_ms = 100
max_cycle_length = 10

# === Snapshots ===
[snapshots]
# Memory threshold before disk spill
max_memory_bytes = 268435456 # 256 MB
# Snapshot interval
interval_secs = 3600
# Retention count
retain_count = 3

# === Metrics ===
[metrics]
enabled = true
# Prometheus endpoint
endpoint = "0.0.0.0:9090"
# Include detailed histograms
detailed = true

# === Logging ===
[logging]
level = "info"
format = "json"
# Log to file
file = "/var/log/neumann/neumann.log"
# Rotate logs
max_size_mb = 100
max_files = 10
```
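The `rate_limit` section above follows a token-bucket model: `bucket_size` is the burst capacity and `refill_rate` is tokens added per second. A minimal sketch of that behavior (illustrative only, not Neumann's implementation):

```rust
// Illustrative token bucket: capacity 100 tokens, refilled at 50 tokens/sec.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_rate: f64, // tokens per second
}

impl TokenBucket {
    fn new(capacity: f64, refill_rate: f64) -> Self {
        Self { capacity, tokens: capacity, refill_rate }
    }

    // elapsed_secs: wall-clock time since the previous call
    fn try_acquire(&mut self, elapsed_secs: f64) -> bool {
        self.tokens = (self.tokens + elapsed_secs * self.refill_rate).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(100.0, 50.0);
    // An instantaneous burst of 100 messages drains the bucket...
    for _ in 0..100 {
        assert!(bucket.try_acquire(0.0));
    }
    assert!(!bucket.try_acquire(0.0)); // ...and the 101st is rejected
    // ~21 ms of refill buys at least one more token (50 tok/s * 0.021 s)
    assert!(bucket.try_acquire(0.021));
}
```

Raising `bucket_size` tolerates larger bursts; raising `refill_rate` raises the sustained throughput ceiling.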
High-Throughput (5-Node)
Optimized configuration for maximum write throughput.
[node]
id = "node1"
data_dir = "/var/lib/neumann/data"
[cluster]
seeds = [
"node1.example.com:9100",
"node2.example.com:9100",
"node3.example.com:9100",
"node4.example.com:9100",
"node5.example.com:9100",
]
port = 9100
name = "high-throughput"
# === TLS (same as production) ===
[tls]
enabled = true
cert_path = "/etc/neumann/node1.crt"
key_path = "/etc/neumann/node1.key"
ca_cert_path = "/etc/neumann/ca.crt"
require_client_auth = true
# === TCP - Optimized for throughput ===
[tcp]
# More connections for parallelism
pool_size = 8
# Shorter timeouts for faster failover
connect_timeout_ms = 2000
io_timeout_ms = 10000
keepalive = true
keepalive_interval_secs = 15
# Larger message size for batching
max_message_size = 67108864 # 64 MB
# Larger queues for buffering
max_pending_messages = 5000
recv_buffer_size = 5000
# === Rate Limiting - Permissive ===
[rate_limit]
enabled = true
bucket_size = 500
refill_rate = 250.0
# === Compression - Aggressive ===
[compression]
enabled = true
method = "Lz4"
# Compress even small messages
min_size = 64
# === Transactions - Fast ===
[transactions]
timeout_ms = 2000
lock_timeout_ms = 5000
embedding_dimension = 64 # Smaller for speed
# === Consensus - Optimized ===
[consensus]
# Higher thresholds so more concurrent writes merge
conflict_threshold = 0.7
orthogonal_threshold = 0.2
merge_window_ms = 30000
# === Deadlock - Frequent checks ===
[deadlock]
enabled = true
detection_interval_ms = 50
# === Raft - Tuned for throughput ===
[raft]
# Batch more entries
max_entries_per_append = 1000
# Shorter election timeout
election_timeout_ms = 500
# Faster heartbeats
heartbeat_interval_ms = 100
# === Snapshots - Less frequent ===
[snapshots]
max_memory_bytes = 536870912 # 512 MB
interval_secs = 7200
retain_count = 2
Geo-Distributed (Multi-Region)
Configuration for clusters spanning multiple geographic regions with higher latency tolerance.
[node]
id = "node1-us-east"
data_dir = "/var/lib/neumann/data"
region = "us-east-1"
[cluster]
seeds = [
"node1-us-east.example.com:9100",
"node2-us-west.example.com:9100",
"node3-eu-west.example.com:9100",
]
port = 9100
name = "geo-distributed"
# === TLS (same as production) ===
[tls]
enabled = true
cert_path = "/etc/neumann/node1-us-east.crt"
key_path = "/etc/neumann/node1-us-east.key"
ca_cert_path = "/etc/neumann/ca.crt"
require_client_auth = true
# === TCP - WAN optimized ===
[tcp]
pool_size = 4
# Longer timeouts for cross-region latency
connect_timeout_ms = 10000
io_timeout_ms = 60000
keepalive = true
# More frequent keepalives to detect failures
keepalive_interval_secs = 10
max_message_size = 16777216 # 16 MB
# === Rate Limiting - Standard ===
[rate_limit]
enabled = true
bucket_size = 100
refill_rate = 50.0
# === Compression - Always on for WAN ===
[compression]
enabled = true
method = "Lz4"
min_size = 128
# === Transactions - Longer timeouts ===
[transactions]
# Higher timeout for cross-region coordination
timeout_ms = 15000
lock_timeout_ms = 60000
embedding_dimension = 128
# === Consensus - Relaxed for latency ===
[consensus]
conflict_threshold = 0.5
orthogonal_threshold = 0.1
# Longer merge window for slow convergence
merge_window_ms = 120000
# === Deadlock - Less frequent for WAN ===
[deadlock]
enabled = true
detection_interval_ms = 500
# === Raft - WAN tuned ===
[raft]
max_entries_per_append = 100
# Longer election timeout for WAN latency
election_timeout_ms = 3000
heartbeat_interval_ms = 500
# Enable pre-vote to prevent disruption during partitions
pre_vote = true
# === Snapshots ===
[snapshots]
max_memory_bytes = 268435456
interval_secs = 3600
retain_count = 5
# === Reconnection - Aggressive ===
[reconnection]
enabled = true
initial_backoff_ms = 500
max_backoff_ms = 60000
multiplier = 2.0
jitter = 0.2
# === Region awareness ===
[region]
# Prefer local reads
local_read_preference = true
# Region priority for leader election
priority = 1
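The `[reconnection]` settings above imply an exponential backoff schedule. A minimal sketch of that schedule, assuming the delay doubles per attempt up to the cap (jitter omitted for clarity):

```shell
# Backoff schedule implied by [reconnection]:
# initial_backoff_ms = 500, max_backoff_ms = 60000, multiplier = 2.0
backoff_ms() {
  # attempt number (1-based) -> delay in ms, doubled each retry and capped
  local attempt=$1 delay=500
  for _ in $(seq 2 "$attempt"); do
    delay=$(( delay * 2 ))
    if [ "$delay" -gt 60000 ]; then delay=60000; fi
  done
  echo "$delay"
}

backoff_ms 1   # 500
backoff_ms 5   # 8000
backoff_ms 9   # 60000 (capped)
```

With `jitter = 0.2`, the real client would additionally randomize each delay by up to ±20% to avoid reconnect stampedes.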
Configuration Reference
Environment Variables
All configuration values can be overridden with environment variables:
NEUMANN_NODE_ID=node1
NEUMANN_CLUSTER_PORT=9100
NEUMANN_TLS_ENABLED=true
NEUMANN_LOGGING_LEVEL=debug
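The examples suggest a `NEUMANN_<SECTION>_<KEY>` naming pattern. A small sketch of that assumed mapping (the convention is inferred from the examples above, not documented behavior):

```shell
# Derive the env var name for a [section] key, assuming the
# NEUMANN_<SECTION>_<KEY> convention shown above.
env_name() {
  echo "NEUMANN_$(echo "${1}_${2}" | tr '.' '_' | tr '[:lower:]' '[:upper:]')"
}

env_name node id          # NEUMANN_NODE_ID
env_name logging level    # NEUMANN_LOGGING_LEVEL
```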
Configuration Precedence
- Environment variables (highest)
- Command-line arguments
- Configuration file
- Default values (lowest)
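Resolution is first-match-wins down this list. An illustrative sketch (not Neumann's actual loader; an empty string stands in for "not set at that level"):

```shell
# First non-empty source wins, per the precedence order above.
# Args, highest precedence first: env value, CLI value, file value, default.
resolve() {
  local v
  for v in "$@"; do
    if [ -n "$v" ]; then echo "$v"; return; fi
  done
}

resolve "" "9200" "9100" "9000"   # 9200: CLI wins when no env var is set
resolve "" "" "9100" "9000"       # 9100: config file value applies
```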
Runbooks
Operational runbooks for managing Neumann clusters, focusing on tensor_chain distributed operations.
Available Runbooks
| Runbook | Scenario | Severity |
|---|---|---|
| Leader Election | Cluster has no leader | Critical |
| Split-Brain Recovery | Network partition healed | Critical |
| Node Recovery | Node crash or disk failure | High |
| Backup and Restore | Data backup and disaster recovery | High |
| Capacity Planning | Resource sizing and scaling | Medium |
| Deadlock Resolution | Transaction deadlocks | Medium |
How to Use These Runbooks
- Identify the symptom from alerts or monitoring
- Find the matching runbook in the table above
- Follow the diagnostic steps to confirm root cause
- Execute the resolution steps in order
- Verify recovery using the provided checks
Alerting Rules
Each runbook includes Prometheus alerting rules. Deploy them to your monitoring stack:
# Copy alerting rules
cp docs/book/src/operations/alerting-rules.yml /etc/prometheus/rules/neumann.yml
# Reload Prometheus
curl -X POST http://prometheus:9090/-/reload
Emergency Contacts
For production incidents:
- Page the on-call engineer
- Start an incident channel
- Follow the relevant runbook
- Document actions taken
- Schedule post-incident review
Node Management
This runbook covers adding and removing nodes from a tensor_chain cluster.
Adding a Node
Prerequisites Checklist
- New node has network connectivity to existing cluster members
- TLS certificates are configured (if using TLS)
- Node has sufficient disk space for snapshot transfer
- Firewall rules allow traffic on cluster port (default: 9100)
- DNS/hostname resolution configured for the new node
Symptoms (Why Add a Node)
- Cluster capacity insufficient for workload
- Need additional replicas for fault tolerance
- Geographic distribution requirements
- Performance scaling requirements
Procedure
Step 1: Prepare the new node
# Install Neumann on the new node
cargo install neumann --version X.Y.Z
# Create configuration directory
mkdir -p /etc/neumann
mkdir -p /var/lib/neumann/data
# Copy TLS certificates (if using TLS)
scp admin@existing-node:/etc/neumann/ca.crt /etc/neumann/
# Generate node-specific certificates
./scripts/generate-node-cert.sh node4
Step 2: Configure the new node
Create /etc/neumann/config.toml:
[node]
id = "node4"
data_dir = "/var/lib/neumann/data"
[cluster]
# Existing cluster members for initial discovery
seeds = ["node1:9100", "node2:9100", "node3:9100"]
port = 9100
[tls]
cert_path = "/etc/neumann/node4.crt"
key_path = "/etc/neumann/node4.key"
ca_cert_path = "/etc/neumann/ca.crt"
Step 3: Join the cluster
# Start the node in join mode
neumann start --join
# Monitor the join process
neumann status --watch
Step 4: Verify cluster membership
# On any existing node
neumann cluster members
# Expected output:
# ID ADDRESS STATE ROLE
# node1 10.0.1.1:9100 healthy leader
# node2 10.0.1.2:9100 healthy follower
# node3 10.0.1.3:9100 healthy follower
# node4 10.0.1.4:9100 healthy follower <-- new node
Post-Addition Verification
# Verify snapshot transfer completed
neumann status node4 --verbose
# Check replication lag
neumann metrics node4 | grep replication_lag
# Verify the node participates in consensus
neumann raft status
Removing a Node
Prerequisites Checklist
- Cluster will maintain quorum after removal
- Node is not the current leader (trigger election first)
- Data has been replicated to other nodes
- No in-flight transactions involving this node
Symptoms (Why Remove a Node)
- Hardware failure requiring decommission
- Cluster right-sizing
- Node relocation to different region
- Maintenance requiring extended downtime
Pre-Removal Verification
# Check current cluster state
neumann cluster members
# Verify quorum will be maintained
# For N nodes, quorum = (N/2) + 1
# 5 nodes -> quorum = 3, can remove 2
# 3 nodes -> quorum = 2, can remove 1
Procedure
Step 1: Drain the node (graceful removal)
# Mark node as draining (stops accepting new requests)
neumann node drain node3
# Wait for in-flight transactions to complete
neumann node wait-drain node3 --timeout 300
Step 2: Transfer leadership if necessary
# Check if node is leader
neumann raft status
# If leader, trigger election
neumann raft transfer-leadership --to node1
Step 3: Remove from cluster
# Remove the node from cluster configuration
neumann cluster remove node3
# Verify removal
neumann cluster members
Step 4: Stop the node
# On the removed node
neumann stop
# Clean up data (optional)
rm -rf /var/lib/neumann/data/*
Post-Removal Verification
# Verify cluster health
neumann cluster health
# Check that remaining nodes have correct membership
neumann cluster members
# Verify no pending transactions for removed node
neumann transactions pending
Emergency Removal
Use emergency removal only when a node is unresponsive and cannot be drained gracefully.
Symptoms
- Node is unreachable (network partition, hardware failure)
- Node is unresponsive (hung process, resource exhaustion)
- Need to restore quorum quickly
Procedure
# Force remove unresponsive node
neumann cluster remove node3 --force
# The cluster will:
# 1. Remove node from membership
# 2. Abort any transactions involving the node
# 3. Re-elect leader if necessary
Resolution
After emergency removal:
- Investigate root cause of node failure
- Repair or replace hardware if needed
- Re-add node using the addition procedure above
Prevention
- Monitor node health with alerting
- Configure appropriate timeouts
- Maintain sufficient cluster size for fault tolerance
Quorum Considerations
| Cluster Size | Quorum | Fault Tolerance | Notes |
|---|---|---|---|
| 1 | 1 | 0 | Development only |
| 2 | 2 | 0 | Not recommended |
| 3 | 2 | 1 | Minimum for production |
| 5 | 3 | 2 | Recommended for HA |
| 7 | 4 | 3 | Maximum practical size |
Quorum Formula
quorum = (cluster_size / 2) + 1
fault_tolerance = cluster_size - quorum
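The formulas translate directly to shell arithmetic, since integer division gives the required floor:

```shell
# Quorum and fault tolerance from the formulas above.
quorum() { echo $(( $1 / 2 + 1 )); }
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 3 5 7; do
  echo "$n nodes: quorum=$(quorum "$n"), tolerates=$(fault_tolerance "$n") failures"
done
```

The loop reproduces the quorum table: 3 nodes tolerate 1 failure, 5 tolerate 2, 7 tolerate 3.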
Best Practices
- Always maintain odd number of nodes
- Never remove nodes if it would violate quorum
- Plan node additions/removals during low-traffic periods
- Test failover scenarios regularly
Cluster Upgrade
This runbook covers upgrading tensor_chain clusters with minimal downtime.
Upgrade Types
| Type | Downtime | Complexity | Use Case |
|---|---|---|---|
| Rolling | None | Low | Minor version upgrades |
| Blue-Green | Minimal | Medium | Major version upgrades |
| Canary | None | High | Risk-sensitive environments |
Rolling Upgrade
Upgrade nodes one at a time while maintaining cluster availability.
Prerequisites
- Cluster has 3+ nodes for quorum during upgrades
- New version is backwards compatible with current version
- Upgrade tested in staging environment
- Backup of cluster state completed
Symptoms (Why Upgrade)
- Security patches available
- New features required
- Bug fixes needed
- Performance improvements available
Upgrade Sequence
sequenceDiagram
participant F1 as Follower 1
participant F2 as Follower 2
participant L as Leader
participant A as Admin
Note over A: Start rolling upgrade
A->>F1: upgrade
F1->>F1: restart with new version
F1->>L: rejoin cluster
Note over F1: Follower 1 upgraded
A->>F2: upgrade
F2->>F2: restart with new version
F2->>L: rejoin cluster
Note over F2: Follower 2 upgraded
A->>L: transfer leadership
L->>F1: leadership transferred
A->>L: upgrade (now follower)
L->>F1: rejoin cluster
Note over L: All nodes upgraded
Procedure
Step 1: Pre-upgrade checks
# Verify cluster health
neumann cluster health
# Check current versions
neumann cluster versions
# Verify backup is current
neumann backup status
Step 2: Upgrade followers first
# For each follower node:
# 1. Drain the node
neumann node drain node2
# 2. Stop the service
ssh node2 "systemctl stop neumann"
# 3. Upgrade the binary
ssh node2 "cargo install neumann --version X.Y.Z"
# 4. Start the service
ssh node2 "systemctl start neumann"
# 5. Verify rejoin
neumann cluster members
# 6. Wait for replication catch-up
neumann metrics node2 | grep replication_lag
Step 3: Upgrade the leader
# Transfer leadership to an upgraded follower
neumann raft transfer-leadership --to node2
# Verify leadership transferred
neumann raft status
# Now upgrade the old leader (same steps as followers)
neumann node drain node1
ssh node1 "systemctl stop neumann"
ssh node1 "cargo install neumann --version X.Y.Z"
ssh node1 "systemctl start neumann"
Step 4: Post-upgrade verification
# Verify all nodes on new version
neumann cluster versions
# Expected output:
# ID VERSION
# node1 X.Y.Z
# node2 X.Y.Z
# node3 X.Y.Z
# Run health checks
neumann cluster health
# Verify functionality with test transactions
neumann test-transaction
Version Compatibility
Compatibility Matrix
| From Version | To Version | Compatible | Notes |
|---|---|---|---|
| 0.9.x | 0.10.x | Yes | Rolling upgrade supported |
| 0.10.x | 0.11.x | Yes | Rolling upgrade supported |
| 0.8.x | 0.10.x | No | Blue-green required |
| 0.x.x | 1.0.x | No | Blue-green required |
Version Skew Policy
- Maximum skew: 1 minor version during rolling upgrades
- Leader version: Must be >= follower versions
- Upgrade order: Always followers first, then leader
Rollback Procedure
If issues are discovered after upgrade:
Symptoms Requiring Rollback
- Transaction failures after upgrade
- Performance degradation
- Consensus failures
- Data corruption detected
Rollback Steps
# 1. Stop accepting new requests
neumann cluster pause
# 2. Identify problematic nodes
neumann cluster health --verbose
# 3. Rollback affected nodes
ssh node1 "cargo install neumann --version X.Y.Z-OLD"
ssh node1 "systemctl restart neumann"
# 4. Verify rollback
neumann cluster versions
# 5. Resume operations
neumann cluster resume
Rollback Limitations
- Cannot rollback if schema changes were applied
- Cannot rollback if new features were used
- Always test rollback in staging first
Canary Upgrade
For risk-sensitive environments, upgrade a single node first and monitor.
Procedure
# 1. Select canary node (typically a follower)
CANARY=node3
# 2. Upgrade canary
neumann node drain $CANARY
ssh $CANARY "cargo install neumann --version X.Y.Z"
ssh $CANARY "systemctl restart neumann"
# 3. Monitor canary for 24-48 hours
neumann metrics $CANARY --watch
# 4. Compare metrics with non-canary nodes
neumann metrics compare $CANARY node1
# 5. If healthy, proceed with rolling upgrade
# If unhealthy, rollback canary
Canary Success Criteria
| Metric | Threshold | Action if Exceeded |
|---|---|---|
| Error rate | < 0.1% | Rollback |
| Latency p99 | < 2x baseline | Investigate |
| Replication lag | < 100ms | Investigate |
| Memory usage | < 1.5x baseline | Investigate |
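Evaluating these criteria can be scripted. A sketch covering the first two rows of the table, assuming the error rate and p99 values have already been scraped from metrics (the helper name and inputs are illustrative):

```shell
# Evaluate a canary against the thresholds in the table above.
# Args: canary error rate, canary p99 (ms), baseline p99 (ms).
check_canary() {
  local err_rate=$1 p99=$2 baseline_p99=$3
  awk -v e="$err_rate" -v p="$p99" -v b="$baseline_p99" 'BEGIN {
    if (e >= 0.001) { print "ROLLBACK: error rate"; exit }     # >= 0.1%
    if (p >= 2 * b) { print "INVESTIGATE: p99 latency"; exit } # >= 2x baseline
    print "OK"
  }'
}

check_canary 0.0002 180 120   # OK: within both thresholds
check_canary 0.0050 180 120   # ROLLBACK: error rate
```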
Automated Upgrade Script
#!/bin/bash
# rolling-upgrade.sh - Automated rolling upgrade script
set -e
NEW_VERSION=$1
NODES=$(neumann cluster members --format json | jq -r '.[] | .id')
LEADER=$(neumann raft status --format json | jq -r '.leader')
echo "Upgrading cluster to version $NEW_VERSION"
# Upgrade followers first
for node in $NODES; do
if [ "$node" == "$LEADER" ]; then
continue
fi
echo "Upgrading follower: $node"
neumann node drain $node
ssh $node "cargo install neumann --version $NEW_VERSION"
ssh $node "systemctl restart neumann"
# Wait for rejoin
sleep 10
neumann cluster wait-healthy --timeout 120
done
# Transfer leadership and upgrade old leader
echo "Transferring leadership from $LEADER"
NEW_LEADER=$(echo $NODES | tr ' ' '\n' | grep -v $LEADER | head -1)
neumann raft transfer-leadership --to $NEW_LEADER
sleep 5
echo "Upgrading old leader: $LEADER"
neumann node drain $LEADER
ssh $LEADER "cargo install neumann --version $NEW_VERSION"
ssh $LEADER "systemctl restart neumann"
# Final verification
neumann cluster wait-healthy --timeout 120
neumann cluster versions
echo "Upgrade complete"
Leader Election Failures
Symptoms
- `NoLeader` alert firing
- Continuous election attempts in logs
- Client requests timing out with "no leader" errors
- `tensor_chain_elections_total` metric increasing rapidly
Diagnostic Commands
Check Raft State
# Query each node's state
for node in node1 node2 node3; do
curl -s http://$node:9090/metrics | grep tensor_chain_raft_state
done
Inspect Logs
# Look for election-related entries
grep -E "(election|vote|term)" /var/log/neumann/tensor_chain.log | tail -100
Verify Network Connectivity
# From each node, verify connectivity to peers
for peer in node1 node2 node3; do
nc -zv $peer 7878 2>&1 | grep -v "Connection refused" || echo "FAIL: $peer"
done
Root Causes
1. Network Partition
Diagnosis: Nodes can’t reach each other
Solution:
- Check firewall rules for port 7878 (Raft) and 7879 (gossip)
- Verify network routes between nodes
- Check for packet loss:
ping -c 100 peer_node
2. Clock Skew
Diagnosis: Election timeouts inconsistent across nodes
Solution:
- Ensure NTP is running: `timedatectl status`
- Max recommended skew: 500ms
- Sync clocks: `chronyc makestep`
3. Quorum Loss
Diagnosis: Fewer than (n/2)+1 nodes available
Solution:
- For 3-node cluster: need 2 nodes
- For 5-node cluster: need 3 nodes
- Bring failed nodes back online or add new nodes
4. Election Timeout Too Aggressive
Diagnosis: Frequent elections even with healthy network
Solution:
[raft]
election_timeout_min_ms = 300 # Increase from default 150
election_timeout_max_ms = 600 # Increase from default 300
Resolution Steps
- Identify partitioned nodes using gossip membership view
- Restore connectivity if network issue
- If quorum lost, follow disaster recovery procedure
- Monitor `tensor_chain_raft_state{state="leader"}` for leader emergence
Alerting Rule
- alert: NoLeader
  expr: sum(tensor_chain_raft_state{state="leader"}) == 0
  for: 30s
  labels:
    severity: critical
  annotations:
    summary: "No Raft leader elected in cluster"
    runbook_url: "https://docs.neumann.io/operations/runbooks/leader-election"
Prevention
- Deploy odd number of nodes (3, 5, 7)
- Use separate availability zones
- Monitor the `tensor_chain_elections_total` rate
- Set up network monitoring between nodes
Split-Brain Recovery
What is Split-Brain?
A network partition where multiple nodes believe they are the leader, potentially accepting conflicting writes.
Symptoms
- Multiple nodes reporting `raft_state="leader"` in metrics
- Clients seeing different data depending on which node they connect to
- `tensor_chain_partition_detected` metric > 0
- Gossip reporting different membership views
How tensor_chain Prevents Split-Brain
Raft consensus requires majority quorum:
- 3 nodes: 2 required (only 1 partition can have leader)
- 5 nodes: 3 required
Split-brain can only occur with symmetric partition where old leader is isolated but doesn’t realize it.
Automatic Recovery (Partition Merge Protocol)
When partitions heal, tensor_chain automatically reconciles:
Phase 1: Detection
- Gossip detects new reachable nodes
- Compare Raft terms and log lengths
Phase 2: Leader Resolution
- Higher term wins
- If same term, longer log wins
- Losing leader steps down
Phase 3: State Reconciliation
- Semantic conflict detection on divergent entries
- Orthogonal changes: vector-add merge
- Conflicting changes: reject newer (requires manual resolution)
Phase 4: Log Synchronization
- Follower truncates divergent suffix
- Leader replicates correct entries
Phase 5: Membership Merge
- Gossip merges LWW membership states
- Higher incarnation wins for each node
Phase 6: Checkpoint
- Create snapshot post-merge for fast recovery
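The Phase 2 rule can be sketched as a comparison on (term, log length) pairs. This is an illustration of the stated rule, not the actual tensor_chain code; a full tie is kept arbitrarily here:

```shell
# Phase 2 leader resolution: higher term wins; on a term tie, longer log wins.
# Args: term_a len_a term_b len_b -> prints A or B.
wins() {
  if [ "$1" -gt "$3" ]; then echo A
  elif [ "$3" -gt "$1" ]; then echo B
  elif [ "$2" -ge "$4" ]; then echo A   # term tie: compare log lengths
  else echo B
  fi
}

wins 7 120 6 300   # A: higher term beats the longer log
wins 7 120 7 300   # B: same term, so the longer log wins
```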
Manual Intervention (When Automatic Fails)
Scenario: Conflicting Writes
# 1. Identify conflicts
neumann-admin conflicts list --since "2h ago"
# 2. Export conflicting transactions
neumann-admin conflicts export --tx-id 12345 --output conflict.json
# 3. Choose resolution
neumann-admin conflicts resolve --tx-id 12345 --keep-version A
# 4. Or merge manually
neumann-admin conflicts resolve --tx-id 12345 --merge-custom merge.json
Scenario: Completely Diverged State
# 1. Stop all nodes
systemctl stop neumann
# 2. Identify authoritative node (longest log, highest term)
for node in node1 node2 node3; do
ssh $node "neumann-admin raft-info"
done
# 3. On non-authoritative nodes, clear state
rm -rf /var/lib/neumann/raft/*
# 4. Restart authoritative node first
systemctl start neumann
# 5. Restart other nodes (will sync from leader)
systemctl start neumann
Post-Recovery Verification
# Verify single leader
curl -s http://node1:9090/metrics | grep 'raft_state{state="leader"}'
# Verify all nodes in sync
neumann-admin cluster-status
# Check for unresolved conflicts
neumann-admin conflicts list
# Verify recent transactions
neumann-admin tx-log --last 100
Prevention
- Network design: Avoid symmetric partitions
- Monitoring: Alert on partition detection
- Testing: Regularly run chaos engineering tests
- Backups: Regular snapshots enable point-in-time recovery
Node Recovery
Recovery Scenarios
| Scenario | Recovery Method | Data Loss Risk |
|---|---|---|
| Process crash | WAL replay | None |
| Node reboot | WAL replay | None |
| Disk failure | Snapshot + log from leader | Possible (uncommitted) |
| Data corruption | Snapshot from leader | Possible (uncommitted) |
Automatic Recovery Flow
flowchart TD
A[Node Starts] --> B{WAL Exists?}
B -->|Yes| C[Replay WAL]
B -->|No| D[Request Snapshot]
C --> E{Caught Up?}
E -->|Yes| F[Join as Follower]
E -->|No| D
D --> G[Install Snapshot]
G --> H[Replay Logs After Snapshot]
H --> F
F --> I[Healthy]
Manual Recovery Steps
1. Crash Recovery (WAL Intact)
# Just restart - WAL replay is automatic
systemctl start neumann
# Monitor recovery
journalctl -u neumann -f | grep -E "(recovery|replay|caught_up)"
2. Recovery from Snapshot
# 1. Stop node
systemctl stop neumann
# 2. Clear corrupted state
rm -rf /var/lib/neumann/raft/wal/*
# 3. Keep or clear snapshots (keep if valid)
ls -la /var/lib/neumann/raft/snapshots/
# 4. Restart - will fetch snapshot from leader
systemctl start neumann
# 5. Monitor snapshot transfer
watch -n1 'curl -s localhost:9090/metrics | grep snapshot_transfer'
3. Full State Rebuild
# 1. Stop node
systemctl stop neumann
# 2. Clear all Raft state
rm -rf /var/lib/neumann/raft/*
# 3. Clear tensor store (will be rebuilt)
rm -rf /var/lib/neumann/store/*
# 4. Restart
systemctl start neumann
Monitoring Recovery Progress
# Check sync status
curl -s localhost:9090/metrics | grep -E "(commit_index|applied_index|leader_commit)"
# Calculate lag
LEADER_COMMIT=$(curl -s http://leader:9090/metrics | grep tensor_chain_commit_index | awk '{print $2}')
MY_APPLIED=$(curl -s localhost:9090/metrics | grep tensor_chain_applied_index | awk '{print $2}')
echo "Lag: $((LEADER_COMMIT - MY_APPLIED)) entries"
# Estimated time to catch up (entries/sec)
watch -n5 'curl -s localhost:9090/metrics | grep tensor_chain_applied_index'
Troubleshooting
Recovery Stuck
Symptom: Node not catching up, applied_index not increasing
Causes:
- Network issue to leader
- Leader overloaded
- Snapshot transfer failing
Solution:
# Check leader connectivity
curl -v http://leader:7878/health
# Check snapshot transfer errors
grep "snapshot" /var/log/neumann/tensor_chain.log | grep -i error
# Manually trigger snapshot
curl -X POST http://leader:9090/admin/snapshot
Repeated Crashes During Recovery
Symptom: Node crashes while replaying WAL
Causes:
- Corrupted WAL entry
- Out of memory during replay
- Incompatible schema
Solution:
# Skip corrupted entries (data loss!)
neumann-admin wal-repair --skip-corrupted
# Or full rebuild
rm -rf /var/lib/neumann/raft/*
systemctl start neumann
Backup and Restore
Backup Strategy
| Type | Frequency | Retention | RPO | RTO |
|---|---|---|---|---|
| Snapshots | Every 10k entries | 7 days | Minutes | Minutes |
| Full backup | Daily | 30 days | 24 hours | Hours |
| Off-site | Weekly | 1 year | 1 week | Hours |
Creating Backups
Snapshot Backup (Hot)
# Trigger snapshot on leader
curl -X POST http://leader:9090/admin/snapshot
# Wait for completion
watch 'curl -s http://leader:9090/metrics | grep snapshot'
# Copy snapshot files
rsync -av /var/lib/neumann/raft/snapshots/ backup:/backups/neumann/snapshots/
# Include metadata
neumann-admin cluster-info | ssh backup "cat > /backups/neumann/metadata.json"
Full Backup (Recommended: Cold)
# 1. Stop writes (or accept slightly inconsistent backup)
neumann-admin pause-writes
# 2. Create snapshot
curl -X POST http://leader:9090/admin/snapshot
sleep 10
# 3. Backup all state
tar -czf neumann-backup-$(date +%Y%m%d).tar.gz \
/var/lib/neumann/raft/snapshots/ \
/var/lib/neumann/store/ \
/etc/neumann/
# 4. Resume writes
neumann-admin resume-writes
# 5. Verify backup integrity
tar -tzf neumann-backup-*.tar.gz | head
Continuous WAL Archiving
# In config.toml
[wal]
archive_command = "aws s3 cp %p s3://backups/neumann/wal/%f"
archive_timeout = 60 # seconds
# Or to local storage
archive_command = "cp %p /mnt/backup/wal/%f"
Restore Procedures
Point-in-Time Recovery
# 1. Stop all nodes
ansible all -m systemd -a "name=neumann state=stopped"
# 2. Clear current state
ansible all -m shell -a "rm -rf /var/lib/neumann/raft/*"
# 3. Restore snapshot to one node
scp backup:/backups/neumann/snapshots/latest/* node1:/var/lib/neumann/raft/snapshots/
# 4. Replay WAL up to desired point
neumann-admin wal-replay \
--wal-dir backup:/backups/neumann/wal/ \
--until "2024-01-15T10:30:00Z"
# 5. Start first node
ssh node1 systemctl start neumann
# 6. Start remaining nodes (will sync from node1)
ansible "node2,node3" -m systemd -a "name=neumann state=started"
Full Cluster Restore
# 1. Extract backup
tar -xzf neumann-backup-20240115.tar.gz -C /tmp/restore/
# 2. Stop cluster
systemctl stop neumann
# 3. Restore files
rsync -av /tmp/restore/raft/ /var/lib/neumann/raft/
rsync -av /tmp/restore/store/ /var/lib/neumann/store/
# 4. Fix permissions
chown -R neumann:neumann /var/lib/neumann/
# 5. Start cluster
systemctl start neumann
Disaster Recovery (Complete Loss)
# 1. Provision new infrastructure
# 2. Install Neumann on all nodes
# 3. Restore from off-site backup
aws s3 cp s3://backups/neumann/latest.tar.gz /tmp/
tar -xzf /tmp/latest.tar.gz -C /var/lib/neumann/
# 4. Update config with new node addresses
vim /etc/neumann/config.toml
# 5. Initialize cluster
neumann-admin init-cluster --bootstrap
# 6. Verify
neumann-admin cluster-status
Verification
# Check data integrity
neumann-admin verify-checksums
# Compare entry counts
neumann-admin stats | grep total_entries
# Spot check recent data
neumann-admin query "SELECT COUNT(*) FROM ..."
Retention Policy
# Cron job for cleanup
0 2 * * * find /var/lib/neumann/raft/snapshots -mtime +7 -delete
# WAL archives in S3: expire after 30 days via a bucket lifecycle rule
# (aws s3 rm has no age-based filter), e.g.:
# aws s3api put-bucket-lifecycle-configuration --bucket backups \
#   --lifecycle-configuration file://expire-wal-30d.json
Capacity Planning
Resource Requirements
Memory
| Component | Formula | Example (1M entries) |
|---|---|---|
| Raft log (in-memory) | entries * avg_size * 2 | 1M * 1KB * 2 = 2 GB |
| Tensor store index | entries * 64 bytes | 1M * 64 = 64 MB |
| HNSW index | vectors * dim * 4 * ef | 1M * 128 * 4 * 16 = 8 GB |
| Codebook | centroids * dim * 4 | 1024 * 128 * 4 = 512 KB |
| Connection buffers | peers * buffer_size | 10 * 64KB = 640 KB |
Recommended minimum: 16 GB for production
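The in-memory formulas above can be summed for a given workload. A sketch (the helper name is illustrative; constants mirror the table):

```shell
# Estimate memory from the table's formulas.
# Args: entries, avg entry size (bytes), vector dim, HNSW ef.
estimate_mem_bytes() {
  local entries=$1 avg_size=$2 dim=$3 ef=$4
  local raft=$(( entries * avg_size * 2 ))     # Raft log (in-memory)
  local index=$(( entries * 64 ))              # tensor store index
  local hnsw=$(( entries * dim * 4 * ef ))     # HNSW index
  echo $(( raft + index + hnsw ))
}

# 1M entries, 1 KB average, dim=128, ef=16 -> ~10.3 GB, dominated by HNSW
estimate_mem_bytes 1000000 1024 128 16
```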
Disk
| Component | Formula | Example |
|---|---|---|
| WAL | entries * avg_size | 10M * 1KB = 10 GB |
| Snapshots | state_size * 2 | 5 GB * 2 = 10 GB |
| Mmap cold storage | cold_entries * avg_size | 100M * 1KB = 100 GB |
Recommended: 3x expected data size for growth
Network
| Traffic Type | Formula | Example (100 TPS) |
|---|---|---|
| Replication | TPS * entry_size * (replicas-1) | 100 * 1KB * 2 = 200 KB/s |
| Gossip | nodes * fanout * state_size / interval | 5 * 3 * 1KB / 1s = 15 KB/s |
| Client | TPS * (request + response) | 100 * 2KB = 200 KB/s |
Recommended: 1 Gbps minimum, 10 Gbps for high throughput
CPU
| Operation | Complexity | Cores Needed |
|---|---|---|
| Consensus | O(1) per entry | 1 core |
| Embedding computation | O(dim) | 1-2 cores |
| HNSW search | O(log N * ef) | 2-4 cores |
| Conflict detection | O(concurrent_txs^2) | 1 core |
Recommended: 8+ cores for production
Sizing Examples
Small (Dev/Test)
- 3 nodes
- 4 cores, 8 GB RAM, 100 GB SSD each
- Up to 1M entries, 10 TPS
Medium (Production)
- 5 nodes
- 8 cores, 32 GB RAM, 500 GB NVMe each
- Up to 100M entries, 1000 TPS
Large (High-Scale)
- 7+ nodes
- 16+ cores, 64+ GB RAM, 2 TB NVMe each
- 1B+ entries, 10k+ TPS
- Consider sharding
Scaling Strategies
Vertical Scaling
When to use:
- Single-node bottleneck (CPU, memory)
- Read latency requirements
Actions:
- Add RAM for larger in-memory log
- Add cores for parallel embedding computation
- Upgrade to NVMe for faster snapshots
Horizontal Scaling
When to use:
- Throughput limited by consensus
- Fault tolerance requirements
Actions:
- Add read replicas (don’t participate in consensus)
- Add consensus members (odd numbers only)
- Implement sharding by key range
Monitoring for Capacity
# Prometheus alerts
- alert: HighMemoryUsage
  expr: tensor_chain_memory_usage_bytes / tensor_chain_memory_limit_bytes > 0.85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Memory usage above 85%"

- alert: DiskSpaceLow
  expr: tensor_chain_disk_free_bytes < 10737418240 # 10 GB
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Less than 10 GB disk space remaining"

- alert: HighCPUUsage
  expr: rate(tensor_chain_cpu_seconds_total[5m]) > 0.9
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "CPU usage above 90%"
Growth Projections
# Calculate daily growth
neumann-admin stats --since "7d ago" --format json | jq '.entries_per_day'
# Project storage needs
DAILY_GROWTH=100000 # entries
ENTRY_SIZE=1024 # bytes
DAYS=365
GROWTH=$((DAILY_GROWTH * ENTRY_SIZE * DAYS / 1024 / 1024 / 1024))
echo "Projected annual growth: ${GROWTH} GB"
Deadlock Resolution
Overview
tensor_chain automatically detects and resolves deadlocks in distributed transactions using wait-for graph analysis. This runbook covers monitoring, tuning, and manual intervention.
Automatic Detection
Deadlocks are detected within detection_interval_ms (default: 100ms) and
resolved by aborting a victim transaction based on configured policy.
Monitoring
Metrics
# Deadlock rate
rate(tensor_chain_deadlocks_total[5m])
# Detection latency
histogram_quantile(0.99, tensor_chain_deadlock_detection_seconds_bucket)
# Victim aborts by policy
tensor_chain_deadlock_victims_total{policy="youngest"}
Logs
grep "deadlock" /var/log/neumann/tensor_chain.log
# Example output:
# [WARN] Deadlock detected: cycle=[tx_123, tx_456, tx_789], victim=tx_789
# [INFO] Aborted transaction tx_789 (youngest in cycle)
Tuning
Detection Interval
[deadlock]
detection_interval_ms = 100 # Lower = faster detection, higher CPU
Trade-off:
- Lower interval: Faster detection, but more CPU overhead
- Higher interval: Less overhead, but longer deadlock duration
Victim Selection Policy
[deadlock]
victim_policy = "youngest" # Options: youngest, oldest, lowest_priority, most_locks
| Policy | Use Case |
|---|---|
| youngest | Minimize wasted work (default) |
| oldest | Prevent starvation of long transactions |
| lowest_priority | Business-critical transactions survive |
| most_locks | Maximize system throughput |
Transaction Priorities
// Set priority when starting transaction
let tx = coordinator.begin_with_priority(Priority::High)?;
Manual Intervention
Force Abort Specific Transaction
neumann-admin tx abort --tx-id 12345 --reason "manual deadlock resolution"
Clear All Pending Transactions
# Emergency only - will lose in-flight work
neumann-admin tx clear-pending --confirm
Disable Auto-Resolution
[deadlock]
auto_abort_victim = false # Require manual intervention
Then manually resolve:
# List detected deadlocks
neumann-admin deadlock list
# Resolve specific deadlock
neumann-admin deadlock resolve --cycle-id abc123 --abort tx_789
Prevention
Lock Ordering
Acquire locks in consistent order across all transactions:
// Good: always lock in sorted key order
let mut keys = vec!["key_b", "key_a", "key_c"];
keys.sort();
for key in keys {
    tx.lock(key)?;
}
Timeout-Based Prevention
[transactions]
lock_timeout_ms = 5000 # Abort if can't acquire lock within 5s
Reduce Lock Scope
// Bad: lock entire table
tx.lock("users/*")?;

// Good: lock specific keys
tx.lock("users/123")?;
tx.lock("users/456")?;
Troubleshooting
High Deadlock Rate
Cause: Hot keys with many concurrent transactions
Solution:
- Identify hot keys: `neumann-admin lock-stats --top 10`
- Consider sharding hot keys
- Batch operations to reduce lock duration
Detection Latency Spikes
Cause: Large wait-for graph from many concurrent transactions
Solution:
- Lower `max_concurrent_transactions` to bound the wait-for graph size
- Consider optimistic concurrency for read-heavy workloads
False Positives
Cause: Network delays causing timeout-based false waits
Solution:
- Increase `lock_wait_threshold_ms`
- Verify network latency between nodes
- Check for GC pauses
Benchmarks
This section provides performance benchmarks for all Neumann crates, measured using Criterion.rs.
Running Benchmarks
```bash
# Run all benchmarks
cargo bench

# Run benchmarks for a specific crate
cargo bench --package tensor_store
cargo bench --package relational_engine
cargo bench --package graph_engine
cargo bench --package vector_engine
cargo bench --package neumann_parser
cargo bench --package query_router
cargo bench --package neumann_shell
cargo bench --package tensor_compress
cargo bench --package tensor_vault
cargo bench --package tensor_cache
cargo bench --package tensor_chain
```
Benchmark reports are generated in `target/criterion/` with HTML visualizations.
Performance Summary
In-Memory Operations
| Component | Key Metric | Performance |
|---|---|---|
| tensor_store | Concurrent writes | 7.5M/sec @ 1M entities |
| relational_engine | Indexed lookup | 2.9us (1,604x vs scan) |
| graph_engine | BFS traversal | 3us/node |
| vector_engine | HNSW search | 150us @ 10K vectors |
| tensor_compress | TT decompose | 10-20x compression |
| tensor_vault | AES-256-GCM | 24us get, 29us set |
| tensor_cache | Exact lookup | 208ns hit |
| tensor_chain | Conflict detection | 52M pairs/sec @ 99% sparse |
| neumann_parser | Query parsing | 1.9M queries/sec |
| query_router | Mixed workload | 455 queries/sec |
Durable Storage (WAL)
| Operation | Key Metric | Performance |
|---|---|---|
| WAL writes | Durable PUT (128d embeddings) | 1.4M ops/sec |
| WAL recovery | Replay 10K records | ~400us (25M records/sec) |
All engines (RelationalEngine, GraphEngine, VectorEngine) support optional durability via `open_durable()` with full crash consistency.
Hardware Notes
Benchmarks run on:
- Apple M-series (ARM64) or Intel x86_64
- Results may vary based on CPU cache sizes, memory bandwidth, and core count
For consistent benchmarking:
```bash
# Disable CPU frequency scaling (Linux)
sudo cpupower frequency-set --governor performance

# Run with minimal background activity
cargo bench -- --noplot  # Skip HTML report generation for faster runs
```
Benchmark Categories
Storage Layer
- tensor_store - DashMap concurrent storage, Bloom filters, snapshots
- tensor_compress - Tensor Train, delta encoding, RLE
Engines
- relational_engine - SQL operations, indexes, JOINs, aggregates
- graph_engine - Node/edge operations, traversals, path finding
- vector_engine - Embeddings, SIMD similarity, HNSW index
Extended Modules
- tensor_vault - Encrypted storage, access control
- tensor_cache - LLM response caching, semantic search
- tensor_blob - Blob storage operations
Distributed Systems
- tensor_chain - Consensus, 2PC, gossip, sparse vectors
Query Layer
- neumann_parser - Tokenization, parsing, expressions
- query_router - Cross-engine query routing
tensor_store Benchmarks
The tensor store uses DashMap (sharded concurrent HashMap) for thread-safe key-value storage.
Core Operations
| Operation | 100 items | 1,000 items | 10,000 items |
|---|---|---|---|
| put | 40us (2.5M/s) | 447us (2.2M/s) | 7ms (1.4M/s) |
| get | 33us (3.0M/s) | 320us (3.1M/s) | 3ms (3.3M/s) |
Scan Operations (10k total items, parallel)
| Operation | Time |
|---|---|
| scan 1k keys | 191us |
| scan_count 1k keys | 41us |
Concurrent Write Performance
| Threads | Disjoint Keys | High Contention (100 keys) |
|---|---|---|
| 2 | 795us | 974us |
| 4 | 1.59ms | 1.48ms |
| 8 | 4.6ms | 2.33ms |
Mixed Workload
| Configuration | Time |
|---|---|
| 4 readers + 2 writers | 579us |
Analysis
- Read vs Write: Reads are ~20% faster than writes due to DashMap’s read-optimized design
- Scaling: Near-linear scaling up to 10k items; slight degradation at scale due to hash table growth
- Concurrency: DashMap’s 16-shard design provides excellent concurrent performance
- Contention: Under high contention, performance actually improves at 8 threads vs 4 (lock sharding distributes load)
- Parallel scans: Uses rayon for >1000 keys (25-53% faster)
- scan_count vs scan: Count-only is ~5x faster (avoids string cloning)
Bloom Filter (optional)
| Operation | Time |
|---|---|
| add | 68 ns |
| might_contain (hit) | 46 ns |
| might_contain (miss) | 63 ns |
Sparse Lookups (1K keys in store)
| Query Type | Without Bloom | With Bloom |
|---|---|---|
| Negative lookup | 52 ns | 68 ns |
| Positive lookup | 45 ns | 60 ns |
| Sparse workload (90% miss) | 52 ns | 67 ns |
Note: Bloom filter adds ~15ns overhead for in-memory DashMap stores. It’s designed for scenarios where the backing store is slower (disk, network, remote database), where the early rejection of non-existent keys avoids expensive I/O.
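The trade-off above can be seen in a toy Bloom filter. This std-only sketch (not tensor_store's implementation) derives k probe positions from two base hashes; a clear bit on any probe proves the key was never added, which is what lets a slow backing store skip the I/O entirely:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Bloom filter: k hash probes into a fixed bit array.
/// False positives are possible; false negatives are not.
pub struct Bloom {
    bits: Vec<bool>,
    k: usize,
}

impl Bloom {
    pub fn new(m: usize, k: usize) -> Self {
        Bloom { bits: vec![false; m], k }
    }

    fn probes(&self, key: &str) -> impl Iterator<Item = usize> + '_ {
        // Kirsch-Mitzenmacher trick: k probes from two base hashes.
        let mut h1 = DefaultHasher::new();
        key.hash(&mut h1);
        let a = h1.finish();
        let mut h2 = DefaultHasher::new();
        (key, 0x9e37_79b9u64).hash(&mut h2);
        let b = h2.finish() | 1;
        let m = self.bits.len() as u64;
        (0..self.k as u64).map(move |i| (a.wrapping_add(i.wrapping_mul(b)) % m) as usize)
    }

    pub fn add(&mut self, key: &str) {
        let idxs: Vec<usize> = self.probes(key).collect();
        for i in idxs {
            self.bits[i] = true;
        }
    }

    pub fn might_contain(&self, key: &str) -> bool {
        // One unset bit is definitive proof of absence.
        self.probes(key).all(|i| self.bits[i])
    }
}
```

A negative `might_contain` answer is exact, so the caller can skip the expensive lookup; a positive answer still requires checking the backing store.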
Snapshot Persistence (bincode)
| Operation | 100 items | 1,000 items | 10,000 items |
|---|---|---|---|
| save | 100 us (1.0M/s) | 927 us (1.08M/s) | 12.6 ms (791K/s) |
| load | 74 us (1.35M/s) | 826 us (1.21M/s) | 10.7 ms (936K/s) |
| load_with_bloom | 81 us (1.23M/s) | 840 us (1.19M/s) | 11.0 ms (908K/s) |
Each item is a `TensorData` with 3 fields: id (`i64`), name (`String`), embedding (128-dim `Vec<f32>`).
Snapshot File Sizes
| Items | File Size | Per Item |
|---|---|---|
| 100 | ~60 KB | ~600 bytes |
| 1,000 | ~600 KB | ~600 bytes |
| 10,000 | ~6 MB | ~600 bytes |
Snapshot Analysis
- Throughput: ~1M items/second for both save and load
- Atomicity: Uses temp file + rename for crash-safe writes
- Bloom filter overhead: ~3-5% slower to rebuild filter during load
- Scaling: Near-linear with dataset size
- File size: ~600 bytes per item with 128-dim embeddings (dominated by vector data)
Write-Ahead Log (WAL)
WAL provides crash-consistent durability with minimal performance overhead. Benchmarks use same payload as in-memory tests (128-dim embeddings).
WAL Writes
| Records | Time | Throughput |
|---|---|---|
| 100 | 152 us | 657K ops/s |
| 1,000 | 753 us | 1.33M ops/s |
| 10,000 | 6.95 ms | 1.44M ops/s |
WAL Recovery
| Records | Time | Throughput |
|---|---|---|
| 100 | 382 us | 261K elem/s |
| 1,000 | 394 us | 2.5M elem/s |
| 10,000 | 391 us | 25.6M elem/s |
WAL Analysis
- Near constant recovery time: Recovery is dominated by file open overhead (~400us), not record count
- Sequential I/O: WAL replay reads sequentially, hitting 25M records/sec
- Durable vs in-memory: WAL writes at 1.4M ops/sec vs 2.0M ops/sec in-memory (72% of in-memory speed)
- Use case: Production deployments requiring crash consistency
All engines support WAL via open_durable():
```rust
// Durable graph engine
let engine = GraphEngine::open_durable("data/graph.wal", WalConfig::default())?;

// Recovery after crash
let engine = GraphEngine::recover("data/graph.wal", &WalConfig::default(), None)?;
```
Sparse Vectors
SparseVector provides memory-efficient storage for high-sparsity embeddings by storing only non-zero values.
Construction (768d)
| Sparsity | Time | Throughput |
|---|---|---|
| 50% | 1.2 us | 640K/s |
| 90% | 890 ns | 870K/s |
| 99% | 650 ns | 1.18M/s |
Dot Product (768d)
| Sparsity | Sparse-Sparse | Sparse-Dense | Dense-Dense | Sparse Speedup |
|---|---|---|---|---|
| 50% | 2.1 us | 1.8 us | 580 ns | 0.3x (slower) |
| 90% | 380 ns | 290 ns | 580 ns | 1.5-2x |
| 99% | 38 ns | 26 ns | 580 ns | 15-22x |
Memory Compression
| Dimension | Sparsity | Dense Size | Sparse Size | Ratio |
|---|---|---|---|---|
| 768 | 90% | 3,072 B | 1,024 B | 3x |
| 768 | 99% | 3,072 B | 96 B | 32x |
| 1536 | 99% | 6,144 B | 184 B | 33x |
Sparse Vector Analysis
- High sparsity sweet spot: At 99% sparsity, dot products are 15-22x faster than dense
- Memory scaling: The value-only compression ratio is 1 / (1 - sparsity), i.e. ~100x at 99% sparse; storing indices alongside values reduces the measured ratio to ~32x
- Construction overhead: Negligible (~1us per vector)
- Use case: Embeddings from sparse models, one-hot encodings, pruned representations
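The nnz-proportional cost is easy to see in a sketch. Assuming index-sorted (index, value) pairs — the usual sparse layout, not necessarily SparseVector's exact representation — a merge-style dot product touches only stored entries:

```rust
use std::cmp::Ordering;

/// Sparse vector as index-sorted (index, value) pairs; only non-zeros stored.
/// Merge-walk over both lists: O(nnz_a + nnz_b), independent of dimension.
pub fn sparse_dot(a: &[(usize, f32)], b: &[(usize, f32)]) -> f32 {
    let (mut i, mut j, mut acc) = (0, 0, 0.0f32);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                // Only matching indices contribute to the product.
                acc += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    acc
}
```

At 99% sparsity a 768d vector has ~8 stored entries, so the walk does ~16 comparisons instead of 768 multiply-adds — the source of the 15-22x speedup above.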
Delta Vectors
DeltaVector stores embeddings as differences from reference “archetype” vectors, ideal for clustered embeddings.
Construction (768d, 5% delta)
| Dimension | Time | Throughput |
|---|---|---|
| 128 | 1.9 us | 526K/s |
| 768 | 12.3 us | 81K/s |
| 1536 | 25.1 us | 40K/s |
Dot Product (768d, precomputed archetype dot)
| Method | Time | vs Dense |
|---|---|---|
| Delta precomputed | 89 ns | 6.5x faster |
| Delta full | 620 ns | ~same |
| Dense baseline | 580 ns | 1x |
Same-Archetype Dot Product (768d)
| Method | Time | Speedup |
|---|---|---|
| Delta-delta | 145 ns | 4x |
| Dense baseline | 580 ns | 1x |
Delta Memory (768d)
| Delta Fraction | Dense Size | Delta Size | Ratio |
|---|---|---|---|
| 1% diff | 3,072 B | 120 B | 25x |
| 5% diff | 3,072 B | 360 B | 8.5x |
| 10% diff | 3,072 B | 680 B | 4.5x |
Archetype Registry (8 archetypes, 768d)
| Operation | Time |
|---|---|
| find_best_archetype | 4.2 us |
| encode | 14 us |
| decode | 1.1 us |
Delta Vector Analysis
- Precomputed speedup: With archetype dot products cached, 6.5x faster than dense
- Cluster-friendly: Similar vectors share archetypes, deltas are sparse
- Use case: Semantic embeddings that cluster (documents, user profiles, products)
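The precomputed-archetype trick rests on dot-product linearity: if v = archetype + delta, then dot(v, q) = dot(archetype, q) + dot(delta, q), and the second term only touches the delta's non-zeros. A minimal sketch with hypothetical helper names (not the DeltaVector API):

```rust
/// Plain dense dot product, O(d).
fn dense_dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Delta-encoded dot product: with dot(archetype, q) already cached,
/// only the delta's non-zero entries need to be touched.
fn delta_dot(archetype_dot: f32, delta: &[(usize, f32)], q: &[f32]) -> f32 {
    archetype_dot + delta.iter().map(|&(i, d)| d * q[i]).sum::<f32>()
}
```

With a 5% delta on 768 dimensions, the per-query work drops from 768 multiply-adds to ~38, which is where the 6.5x "precomputed" speedup comes from.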
K-means Clustering
K-means discovers archetype vectors automatically from embedding collections.
K-means fit (128d, k=5)
| Vectors | Time | Throughput |
|---|---|---|
| 100 | 50 us | 2.0M elem/s |
| 500 | 241 us | 2.1M elem/s |
| 1000 | 482 us | 2.1M elem/s |
Varying k (1000 vectors, 128d)
| k | Time | Throughput |
|---|---|---|
| 2 | 183 us | 5.5M elem/s |
| 5 | 482 us | 2.1M elem/s |
| 10 | 984 us | 1.0M elem/s |
| 20 | 14.5 ms | 69K elem/s |
K-means Analysis
- K-means++ is faster: Better initial centroids mean fewer iterations to converge
- Linear with n: Doubling vectors roughly doubles time
- Quadratic with k at high k: Each iteration is O(n*k), and more clusters need more iterations
- Use case: Auto-discover archetypes for delta encoding, cluster analysis, centroid-based search
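The O(n*k) per-iteration cost comes from the assignment step. A single Lloyd iteration, sketched with std only (plain k-means, without the k-means++ seeding the benchmarks use):

```rust
/// Squared Euclidean distance, O(d).
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// One Lloyd iteration: assign each vector to its nearest centroid
/// (O(n * k * d)), then recompute centroids as cluster means.
fn lloyd_step(data: &[Vec<f32>], centroids: &mut Vec<Vec<f32>>) -> Vec<usize> {
    let d = data[0].len();
    let assign: Vec<usize> = data
        .iter()
        .map(|v| {
            (0..centroids.len())
                .min_by(|&a, &b| {
                    dist2(v, &centroids[a])
                        .partial_cmp(&dist2(v, &centroids[b]))
                        .unwrap()
                })
                .unwrap()
        })
        .collect();
    // Recompute each centroid as the mean of its assigned vectors.
    let mut sums = vec![vec![0.0f32; d]; centroids.len()];
    let mut counts = vec![0usize; centroids.len()];
    for (v, &c) in data.iter().zip(&assign) {
        counts[c] += 1;
        for (s, x) in sums[c].iter_mut().zip(v) {
            *s += x;
        }
    }
    for (c, (sum, &n)) in sums.iter().zip(&counts).enumerate() {
        if n > 0 {
            centroids[c] = sum.iter().map(|s| s / n as f32).collect();
        }
    }
    assign
}
```

Better initial centroids (k-means++) mean fewer of these iterations until assignments stop changing, which is why it beats random seeding in the table above.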
relational_engine Benchmarks
The relational engine provides SQL-like operations on top of tensor_store, with optional hash indexes for accelerated equality lookups and tensor-native condition evaluation.
Row Insertion
| Count | Time | Throughput |
|---|---|---|
| 100 | 462us | 216K rows/s |
| 1,000 | 3.1ms | 319K rows/s |
| 5,000 | 15.6ms | 320K rows/s |
Batch Insertion
| Count | Time | Throughput |
|---|---|---|
| 100 | 282us | 355K rows/s |
| 1,000 | 1.45ms | 688K rows/s |
| 5,000 | 7.26ms | 688K rows/s |
Select Full Scan
| Rows | Time | Throughput |
|---|---|---|
| 100 | 119us | 841K rows/s |
| 1,000 | 995us | 1.01M rows/s |
| 5,000 | 5.27ms | 949K rows/s |
Select with Index vs Without (5,000 rows)
| Query Type | With Index | Without Index | Speedup |
|---|---|---|---|
| Equality (2% match) | 105us | 4.23ms | 40x |
| By _id (single row) | 2.93us | 4.70ms | 1,604x |
Select Filtered - No Index (5,000 rows)
| Filter Type | Time |
|---|---|
| Range (20% match) | 4.16ms |
| Compound AND | 4.42ms |
Index Creation (parallel)
| Rows | Time |
|---|---|
| 100 | 554us |
| 1,000 | 2.75ms |
| 5,000 | 12.3ms |
Update/Delete (1,000 rows, 10% affected)
| Operation | Time |
|---|---|
| Update | 1.74ms |
| Delete | 2.14ms |
Join Performance (hash join)
| Tables | Result Rows | Time |
|---|---|---|
| 50 users x 500 posts | 500 | 1.78ms |
| 100 users x 1000 posts | 1,000 | 1.50ms |
| 100 users x 5000 posts | 5,000 | 32.2ms |
JOIN Types (10K x 10K rows)
| JOIN Type | Time | Throughput |
|---|---|---|
| INNER JOIN | 45ms | 2.2M rows/s |
| LEFT JOIN | 52ms | 1.9M rows/s |
| RIGHT JOIN | 51ms | 1.9M rows/s |
| FULL JOIN | 68ms | 1.5M rows/s |
| CROSS JOIN | 180ms | 555K rows/s |
| NATURAL JOIN | 48ms | 2.1M rows/s |
Aggregate Functions (1M rows, SIMD-accelerated)
| Function | Time | Notes |
|---|---|---|
| COUNT(*) | 2.1ms | O(1) via counter |
| SUM(col) | 8.5ms | SIMD i64x4 |
| AVG(col) | 8.7ms | SIMD i64x4 |
| MIN(col) | 12ms | Full scan |
| MAX(col) | 12ms | Full scan |
GROUP BY Performance (100K rows)
| Groups | Time | Notes |
|---|---|---|
| 10 | 15ms | Parallel aggregation |
| 100 | 18ms | Hash-based grouping |
| 1,000 | 25ms | Low per-group overhead |
| 10,000 | 45ms | High cardinality |
Row Count
| Rows | Time |
|---|---|
| 100 | 49us |
| 1,000 | 462us |
| 5,000 | 2.95ms |
Analysis
- Index acceleration: Hash indexes provide O(1) lookup for equality conditions
  - 40x speedup for equality queries matching 2% of rows
  - 1,604x speedup for single-row _id lookups
- Full scan cost: Without index, O(n) for all queries (parallelized for >1000 rows)
- Batch insert: 2x faster than individual inserts (688K/s vs 320K/s)
- Tensor-native evaluation: `evaluate_tensor()` evaluates conditions directly on TensorData, avoiding Row conversion for non-matching rows
- Parallel operations: update/delete/create_index use rayon for condition evaluation
- Index maintenance: Small overhead on insert/update/delete to maintain indexes
- Join complexity: O(n+m) hash join for INNER/LEFT/RIGHT/NATURAL; O(n*m) for CROSS
- Aggregate functions: SUM/AVG use SIMD i64x4 vectors for 4x throughput improvement
- GROUP BY: Hash-based grouping with parallel per-group aggregation
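The index speedups follow from replacing an O(n) scan with a hash lookup. A minimal sketch of the idea (not the engine's actual index structure):

```rust
use std::collections::HashMap;

/// Hash index over one column: equality lookups become O(1) average case
/// instead of an O(n) full scan over every row.
struct HashIndex {
    map: HashMap<String, Vec<usize>>, // column value -> matching row ids
}

impl HashIndex {
    fn build(column: &[&str]) -> Self {
        let mut map: HashMap<String, Vec<usize>> = HashMap::new();
        for (row_id, v) in column.iter().enumerate() {
            map.entry((*v).to_string()).or_default().push(row_id);
        }
        HashIndex { map }
    }

    /// O(1) average-case equality lookup.
    fn lookup(&self, value: &str) -> &[usize] {
        self.map.get(value).map(|v| v.as_slice()).unwrap_or(&[])
    }
}
```

The flip side, noted above, is index maintenance: every insert, update, and delete must also touch the map, which is the small per-write overhead in the tables.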
Competitor Comparison
| Operation | Neumann | SQLite | DuckDB | Notes |
|---|---|---|---|---|
| Point lookup (indexed) | 2.9us | ~3us | ~30us | B-tree optimized |
| Full scan (5K rows) | 5.3ms | ~15ms | ~2ms | DuckDB columnar wins |
| Aggregation (1M rows) | 8.5ms | ~200ms | ~12ms | SIMD-accelerated |
| Hash join (10Kx10K) | 45ms | ~500ms | ~35ms | Parallel execution |
| Insert (single row) | 3.1us | ~2us | ~5us | SQLite B-tree optimal |
| Batch insert (1K rows) | 1.5ms | ~8ms | ~3ms | Neumann batch-optimized |
Design Trade-offs
- vs SQLite: Neumann trades SQLite’s proven stability for tensor-native storage and SIMD acceleration. SQLite wins on point lookups; Neumann wins on analytics.
- vs DuckDB: Similar columnar design. DuckDB has more mature query optimizer; Neumann has tighter tensor integration and lower memory footprint.
- Unique to Neumann: Unified tensor storage enables cross-engine queries (relational + graph + vector) without data movement.
graph_engine Benchmarks
The graph engine stores nodes and edges as tensors, using adjacency lists for neighbor lookups.
Node Creation
| Count | Time | Per Node |
|---|---|---|
| 100 | 107us | 1.07us |
| 1,000 | 1.67ms | 1.67us |
| 5,000 | 9.4ms | 1.88us |
Edge Creation (1,000 edges)
| Type | Time | Per Edge |
|---|---|---|
| Directed | 2.4ms | 2.4us |
| Undirected | 3.6ms | 3.6us |
Neighbor Lookup (star graph)
| Fan-out | Time | Per Neighbor |
|---|---|---|
| 10 | 16us | 1.6us |
| 50 | 79us | 1.6us |
| 100 | 178us | 1.8us |
BFS Traversal (binary tree)
| Depth | Nodes | Time | Per Node |
|---|---|---|---|
| 5 | 31 | 110us | 3.5us |
| 7 | 127 | 442us | 3.5us |
| 9 | 511 | 1.5ms | 2.9us |
Shortest Path (BFS)
| Graph Type | Size | Time |
|---|---|---|
| Chain | 10 nodes | 8.2us |
| Chain | 50 nodes | 44us |
| Chain | 100 nodes | 96us |
| Grid | 5x5 | 55us |
| Grid | 10x10 | 265us |
Analysis
- Undirected edges: ~50% slower than directed (stores reverse edge internally)
- Traversal: Consistent ~3us per node visited, good BFS implementation
- Path finding: Near-linear with path length in chains; grid explores more nodes
- Parallel delete_node: Uses rayon for high-degree nodes (>100 edges)
- Memory overhead: Each node/edge is a full TensorData (~5-10 allocations)
Storage Model
graph_engine stores each node and edge as a separate tensor:
node:{id} -> TensorData { label, properties... }
edge:{id} -> TensorData { from, to, label, directed, properties... }
adj:{node_id}:out -> TensorData { edge_ids: [...] }
adj:{node_id}:in -> TensorData { edge_ids: [...] }
Trade-offs
- Pro: Flexible property storage, consistent with tensor model
- Con: More key lookups than traditional adjacency list
- Pro: Each component independently updatable
Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
| create_node | O(1) | Hash insert |
| create_edge | O(1) | Hash insert + adjacency update |
| get_neighbors | O(degree) | Adjacency list lookup |
| bfs | O(V + E) | Standard BFS |
| shortest_path | O(V + E) | BFS-based |
| delete_node | O(degree) | Removes all edges |
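The O(V + E) traversal cost can be illustrated with a plain adjacency-list BFS — a std-only sketch, independent of the tensor-backed storage model above:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// BFS over an adjacency list: each node is dequeued once and each edge
/// examined once, giving O(V + E) total work.
fn bfs(adj: &HashMap<&str, Vec<&str>>, start: &str) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::from([start]);
    let mut queue = VecDeque::from([start]);
    let mut order = Vec::new();
    while let Some(node) = queue.pop_front() {
        order.push(node.to_string());
        for &next in adj.get(node).into_iter().flatten() {
            // insert() returns false for already-visited nodes.
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    order
}
```

In graph_engine the "examine an edge" step is an adjacency-key lookup into the tensor store, which is where the ~3us-per-node constant in the BFS table comes from.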
vector_engine Benchmarks
The vector engine stores embeddings and performs k-nearest neighbor search using cosine similarity.
Store Embedding
| Dimension | Time | Throughput |
|---|---|---|
| 128 | 366 ns | 2.7M/s |
| 768 | 892 ns | 1.1M/s |
| 1536 | 969 ns | 1.0M/s |
Get Embedding
| Dimension | Time |
|---|---|
| 768 | 287 ns |
Delete Embedding
| Operation | Time |
|---|---|
| delete | 806 ns |
Similarity Search (top 10, SIMD + adaptive parallel)
| Dataset | Time | Per Vector | Mode |
|---|---|---|---|
| 1,000 x 128d | 242 us | 242 ns | Sequential |
| 1,000 x 768d | 367 us | 367 ns | Sequential |
| 10,000 x 128d | 1.93 ms | 193 ns | Parallel |
Cosine Similarity Computation (SIMD-accelerated)
| Dimension | Time |
|---|---|
| 128 | 26 ns |
| 768 | 165 ns |
| 1536 | 369 ns |
Analysis
- SIMD acceleration: 8-wide f32 SIMD (via the `wide` crate) provides 3-9x speedup for cosine similarity
- Adaptive parallelism: Uses rayon for parallel search when >5000 vectors (1.6x speedup at 10K)
- Linear scaling with dimension: Cosine similarity is O(d) where d is vector dimension
- Linear scaling with dataset size: Brute-force search is O(n*d) for n vectors
- Memory bound: For 768d vectors, ~3 KB per embedding (768 * 4 bytes)
- Search throughput: ~4M vector comparisons/second at 128d (with SIMD)
- Store/Get performance: Sub-microsecond for typical embedding sizes
Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
| store_embedding | O(d) | Vector copy + hash insert |
| get_embedding | O(d) | Hash lookup + vector clone |
| delete_embedding | O(1) | Hash removal |
| search_similar | O(n*d) | Brute-force scan |
| compute_similarity | O(d) | Dot product + 2 magnitude calculations |
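The O(d) kernel at the bottom of every search is a single fused pass: one loop accumulates the dot product and both squared magnitudes. A scalar sketch of what the SIMD version computes:

```rust
/// Cosine similarity in O(d): one pass accumulating the dot product and
/// both squared magnitudes (the scalar form of the 8-wide SIMD kernel).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    dot / (na.sqrt() * nb.sqrt())
}
```

Brute-force search just runs this kernel against all n stored vectors and keeps a top-k heap, hence the O(n*d) row in the complexity table.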
HNSW Index (Approximate Nearest Neighbor)
HNSW provides O(log n) search complexity instead of O(n) brute force.
| Configuration | Search Time (5K, 128d) |
|---|---|
| high_speed | ~50 us |
| default | ~100 us |
| high_recall | ~200 us |
HNSW vs Brute Force (10K vectors, 128d)
| Method | Search Time | Speedup |
|---|---|---|
| Brute force | ~2 ms | 1x |
| HNSW default | ~150 us | ~13x |
Recommended Approach by Corpus Size
| Corpus Size | Approach | Rationale |
|---|---|---|
| < 10K | Brute force | Fast enough, pure tensor |
| 10K - 100K | HNSW | Pragmatic, 5-13x faster |
| > 100K | HNSW | Necessary for latency |
Scaling Projections (HNSW for >10K vectors)
| Vectors | Dimension | Search Time (est.) |
|---|---|---|
| 10K | 768 | ~200 us |
| 100K | 768 | ~500 us |
| 1M | 768 | ~1 ms |
For production workloads at extreme scale (>1M vectors), consider:
- Sharded HNSW across multiple nodes
- Dimensionality reduction (PCA)
- Quantization (int8, binary)
Storage Model
vector_engine stores each embedding as a tensor:
emb:{key} -> TensorData { vector: [...] }
Trade-offs
- Pro: Simple storage model, consistent with tensor abstraction
- Pro: Sub-microsecond store/get operations
- Pro: HNSW index for O(log n) approximate nearest neighbor search
- Con: Brute-force O(n*d) for exact search (use HNSW for approximate)
tensor_compress Benchmarks
The tensor_compress crate provides compression algorithms optimized for tensor data: Tensor Train decomposition, delta encoding, sparse vectors, and run-length encoding.
Tensor Train Decomposition (primary compression method)
| Operation | Time | Peak RAM |
|---|---|---|
| tt_decompose_256d | ~50 us | 41.8 KB |
| tt_decompose_1024d | ~80 us | 60.9 KB |
| tt_decompose_4096d | ~120 us | 137.5 KB |
| tt_reconstruct_4096d | ~1.2 ms | 67.9 KB |
| tt_dot_product_4096d | ~400 ns | 69.2 KB |
| tt_cosine_similarity_4096d | ~1 us | 69.2 KB |
Delta Encoding (10K sequential IDs)
| Operation | Time | Throughput | Peak RAM |
|---|---|---|---|
| compress_ids | 8.0 us | 1.25M IDs/s | ~210 KB |
| decompress_ids | 33 us | 303K IDs/s | ~100 KB |
Run-Length Encoding (100K values)
| Operation | Time | Throughput | Peak RAM |
|---|---|---|---|
| rle_encode | 29 us | 3.4M values/s | ~445 KB |
| rle_decode | 38 us | 2.6M values/s | ~833 KB |
Compression Ratios
| Data Type | Technique | Ratio | Lossless |
|---|---|---|---|
| 4096-dim embeddings | Tensor Train | 10-20x | No (<1% error) |
| 1024-dim embeddings | Tensor Train | 4-8x | No (<1% error) |
| Sparse vectors | Native sparse | 3-32x | Yes |
| Sequential IDs | Delta + varint | 4-8x | Yes |
| Repeated values | RLE | 2-100x | Yes |
Analysis
- TT decomposition: Achieves 10-20x compression for high-dimensional embeddings (4096+)
- TT operations in compressed space: Dot product and cosine similarity computed directly in TT format without full reconstruction
- Delta encoding: Asymmetric - compression is 4x faster than decompression
- Sparse format: Efficient for vectors with >50% zeros, stores only non-zero positions/values
- RLE: Best for highly repeated data (status columns, category IDs)
- Memory efficiency: All operations use < 1 MB for typical data sizes
- Integration: Use `SAVE COMPRESSED` in the shell or the `save_snapshot_compressed()` API
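Delta + varint works because sorted IDs have small gaps, and LEB128 packs a small gap into a single byte. A std-only sketch of the round trip (illustrative, not tensor_compress's wire format):

```rust
/// Delta + varint (LEB128) encoding for sorted IDs: consecutive IDs become
/// gaps of 1, and each gap < 128 fits in a single byte.
fn encode_ids(ids: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u64;
    for &id in ids {
        let mut gap = id - prev; // assumes sorted, ascending input
        prev = id;
        loop {
            let byte = (gap & 0x7f) as u8;
            gap >>= 7;
            if gap == 0 {
                out.push(byte); // high bit clear: last byte of this gap
                break;
            }
            out.push(byte | 0x80); // high bit set: more bytes follow
        }
    }
    out
}

/// Inverse: accumulate varint gaps back into absolute IDs.
fn decode_ids(bytes: &[u8]) -> Vec<u64> {
    let (mut out, mut prev) = (Vec::new(), 0u64);
    let (mut gap, mut shift) = (0u64, 0);
    for &b in bytes {
        gap |= ((b & 0x7f) as u64) << shift;
        if b & 0x80 == 0 {
            prev += gap;
            out.push(prev);
            gap = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    out
}
```

A run of sequential u64 IDs encodes at 1 byte each instead of 8, which is the 4-8x ratio in the table above.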
Usage Recommendations
| Data Characteristics | Recommended Compression |
|---|---|
| High-dimensional embeddings (1024+) | Tensor Train |
| Sparse embeddings (>50% zeros) | Native sparse format |
| Sequential IDs (node IDs, row IDs) | Delta + varint |
| Categorical columns with repeats | RLE |
| Mixed data snapshots | Composite (auto-detect) |
tensor_vault Benchmarks
The tensor_vault crate provides AES-256-GCM encrypted secret storage with graph-based access control, permission levels, TTL grants, rate limiting, namespace isolation, audit logging, and secret versioning.
Key Derivation (Argon2id)
| Operation | Time | Peak RAM |
|---|---|---|
| argon2id_derivation | 80 ms | ~64 MB |
Note: Argon2id is intentionally slow to resist brute-force attacks. The 64MB memory cost is configurable via `VaultConfig`.
Encryption/Decryption (AES-256-GCM)
| Operation | Time | Peak RAM |
|---|---|---|
| set_1kb | 29 us | ~3 KB |
| get_1kb | 24 us | ~3 KB |
| set_10kb | 93 us | ~25 KB |
| get_10kb | 91 us | ~25 KB |
Note: `set` includes versioning overhead (storing previous version pointers); `get` includes audit logging.
Access Control (Graph Path Verification)
| Operation | Time | Peak RAM |
|---|---|---|
| check_shallow (1 hop) | 6 us | ~2 KB |
| check_deep (10 hops) | 17 us | ~3 KB |
| grant | 18 us | ~1 KB |
| revoke | 1.07 ms | ~1 KB |
Secret Listing
| Operation | Time | Peak RAM |
|---|---|---|
| list_100_secrets | 291 us | ~4 KB |
| list_1000_secrets | 2.7 ms | ~40 KB |
Note: List includes access control checks and key name decryption for pattern matching.
Analysis
- Key derivation: Argon2id dominates vault initialization (~80ms). This is by design for security.
- Access check improved: Path verification is now ~6us for shallow, ~17us for deep (85% faster than before).
- Versioning overhead: `set` is ~2x slower due to version tracking (stores pointer array).
- Audit overhead: Every operation logs to the audit store (adds ~5-10us per operation).
- Revoke performance: ~1ms due to edge deletion, TTL tracker cleanup, and audit logging.
- List scaling: ~2.7us per secret at 1000 (includes decryption for pattern matching).
Feature Performance Overhead
| Feature | Overhead |
|---|---|
| Permission check | ~1 us (edge type comparison) |
| Rate limit check | ~100 ns (DashMap lookup) |
| TTL check | ~50 ns (heap peek) |
| Audit log write | ~5 us (tensor store put) |
| Version tracking | ~10 us (pointer array update) |
Security vs Performance Trade-offs
| Configuration | Key Derivation | Security |
|---|---|---|
| Default (64MB, 3 iter) | ~80 ms | High |
| Fast (16MB, 1 iter) | ~25 ms | Medium |
| Paranoid (256MB, 10 iter) | ~800 ms | Very High |
Recommendations
- Development: Use the `Fast` configuration for quicker iteration
- Production: Use `Default` or `Paranoid` based on threat model
- High-throughput: Cache access decisions where possible
- Audit compliance: Accept ~5us overhead for complete audit trail
tensor_cache Benchmarks
The tensor_cache crate provides LLM response caching with exact, semantic (HNSW), and embedding caches.
Exact Cache (Hash-based O(1))
| Operation | Time |
|---|---|
| lookup_hit | 208 ns |
| lookup_miss | 102 ns |
Semantic Cache (HNSW-based O(log n))
| Operation | Time |
|---|---|
| lookup_hit | 21 us |
Put (Exact + Semantic + HNSW insert)
| Entries | Time |
|---|---|
| 100 | 49 us |
| 1,000 | 47 us |
| 10,000 | 53 us |
Embedding Cache
| Operation | Time |
|---|---|
| lookup_hit | 230 ns |
| lookup_miss | 110 ns |
Eviction (batch processing)
| Entries in Cache | Time |
|---|---|
| 1,000 | 3.3 us |
| 5,000 | 4.0 us |
| 10,000 | 8.4 us |
Distance Metrics (raw computation, 128d)
| Metric | Time | Notes |
|---|---|---|
| Jaccard | 73 ns | Fastest, best for sparse |
| Euclidean | 105 ns | Good for spatial data |
| Cosine | 186 ns | Default, best for dense |
| Angular | 193 ns | Alternative to cosine |
Semantic Lookup by Metric (1000 entries)
| Metric | Time |
|---|---|
| Jaccard | 28.6 us |
| Euclidean | 27.8 us |
| Cosine | 28.4 us |
Sparse vs Dense (80% sparsity)
| Configuration | Time | Improvement |
|---|---|---|
| Dense lookup | 28.8 us | baseline |
| Sparse lookup | 24.1 us | 16% faster |
Auto-Metric Selection
| Operation | Time |
|---|---|
| Sparsity check | 0.66 ns |
| Auto-select dense | 13.4 us |
| Auto-select sparse | 16.5 us |
Redis Comparison
| System | In-Process | Over TCP |
|---|---|---|
| Redis | ~60 ns | ~143 us |
| tensor_cache (exact) | 208 ns | ~143 us* |
| tensor_cache (semantic) | 21 us | N/A |
*Estimated: network latency dominates (99.9% of time).
Key Insight: For embedded use (no network), Redis is 3.5x faster for exact lookups. Over TCP (typical deployment), both are network-bound at ~143us. Our differentiator is semantic search (21us) which Redis cannot provide.
Analysis
- Exact cache: Hash-based O(1) lookup provides sub-microsecond hit/miss detection
- Semantic cache: HNSW index provides O(log n) similarity search (~21us for hit)
- Embedding cache: Fast O(1) lookup for precomputed embeddings
- Put performance: Consistent ~50us regardless of cache size (HNSW insert is O(log n))
- Eviction: Efficient batch eviction with LRU/LFU/Cost/Hybrid strategies
- Distance metrics: Auto-selection based on sparsity (>=70% sparse uses Jaccard)
- Token counting: tiktoken cl100k_base encoding for accurate GPT-4 token counts
- Cost tracking: Estimates cost savings based on model pricing tables
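The layered lookup can be sketched in a few lines: an exact hash tier first, then a similarity fallback. This toy version scans linearly where the real cache uses HNSW, and the struct and field names are illustrative:

```rust
use std::collections::HashMap;

/// Two-tier cache lookup: exact hash hit first (O(1)), then a similarity
/// fallback over stored embeddings (linear scan here; the real semantic
/// tier uses an HNSW index for O(log n)).
struct Cache {
    exact: HashMap<String, String>,        // prompt -> response
    semantic: Vec<(Vec<f32>, String)>,     // (embedding, response)
}

impl Cache {
    fn lookup(&self, prompt: &str, embedding: &[f32], threshold: f32) -> Option<&str> {
        if let Some(hit) = self.exact.get(prompt) {
            return Some(hit); // identical prompt: cheapest path
        }
        // Best semantic match at or above the similarity threshold.
        self.semantic
            .iter()
            .map(|(e, resp)| (cosine(e, embedding), resp))
            .filter(|(sim, _)| *sim >= threshold)
            .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap())
            .map(|(_, resp)| resp.as_str())
    }
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}
```

The threshold is the knob that trades hit rate against the risk of returning a cached answer for a subtly different prompt.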
Cache Layers
| Layer | Complexity | Use Case |
|---|---|---|
| Exact | O(1) | Identical prompts |
| Semantic | O(log n) | Similar prompts |
| Embedding | O(1) | Precomputed embeddings |
Eviction Strategies
| Strategy | Description |
|---|---|
| LRU | Evict least recently accessed |
| LFU | Evict least frequently accessed |
| CostBased | Evict lowest cost efficiency |
| Hybrid | Weighted combination (recommended) |
Metric Selection Guide
| Embedding Type | Recommended Metric |
|---|---|
| OpenAI/Cohere (dense) | Cosine (default) |
| Sparse (>=70% zeros) | Jaccard (auto-selected) |
| Spatial/geographic | Euclidean |
| Custom binary | Jaccard |
tensor_blob Benchmarks
The tensor_blob crate provides S3-style chunked blob storage with content-addressable chunks, garbage collection, and integrity verification.
Overview
tensor_blob focuses on correctness and durability over raw throughput. Performance characteristics depend heavily on:
- Chunk size configuration
- Storage backend (memory vs disk)
- Network conditions for streaming operations
Expected Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| Put (upload) | O(size / chunk_size) | Linear with data size |
| Get (download) | O(size / chunk_size) | Linear with data size |
| Delete | O(chunk_count) | Removes metadata + orphan detection |
| GC | O(total_chunks) | Full chunk scan |
| Verify | O(size) | Re-hash entire blob |
| Repair | O(corrupted_chunks) | Only processes damaged chunks |
Chunk Deduplication
Identical content shares chunks via SHA-256 content addressing:
- Duplicate blobs: Store once, reference count tracked
- Partial overlap: Shared chunks deduplicated at chunk boundaries
- Storage savings: Depends on data redundancy
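Content addressing makes deduplication fall out of the key scheme: hashing the chunk bytes yields the storage key, so a duplicate put finds its chunk already present. A sketch, with std's DefaultHasher standing in for SHA-256 to stay dependency-free:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Content-addressed chunk store: identical chunks hash to the same key,
/// so duplicates are stored once and reference-counted. (tensor_blob keys
/// chunks by SHA-256; DefaultHasher keeps this sketch self-contained.)
#[derive(Default)]
struct ChunkStore {
    chunks: HashMap<u64, (Vec<u8>, usize)>, // key -> (bytes, refcount)
}

impl ChunkStore {
    fn put(&mut self, chunk: &[u8]) -> u64 {
        let mut h = DefaultHasher::new();
        chunk.hash(&mut h);
        let key = h.finish();
        let entry = self.chunks.entry(key).or_insert_with(|| (chunk.to_vec(), 0));
        entry.1 += 1; // duplicate put bumps refcount instead of storing again
        key
    }

    fn unique_chunks(&self) -> usize {
        self.chunks.len()
    }
}
```

GC then reduces to deleting entries whose refcount has dropped to zero, which is the orphan-detection pass described below.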
Garbage Collection
| Operation | Behavior |
|---|---|
| `gc()` | Returns `GcStats { deleted, freed_bytes }` |
| Orphan detection | Marks unreferenced chunks |
| Active upload protection | GC skips in-progress uploads |
Streaming Operations
| API | Use Case |
|---|---|
| `BlobWriter` | Streaming upload, bounded memory |
| `BlobReader::next_chunk()` | Streaming download, chunk-by-chunk |
| `get_full()` | Small blobs (<10MB), loads to memory |
Configuration Impact
| Setting | Impact |
|---|---|
| Larger chunk_size | Fewer chunks, less overhead, less dedup |
| Smaller chunk_size | More chunks, more overhead, better dedup |
| Recommended | 1-4 MB chunks for most workloads |
Integration Notes
- Blob store persists to TensorStore
- Metadata includes checksum, size, creation time
- Links enable blob-to-graph entity relationships
- Tags support blob categorization and search
Benchmarking Blob Operations
```bash
# Run blob-specific benchmarks (if available)
cargo bench --package tensor_blob

# For custom benchmarking, use the streaming API:
# - Measure upload throughput with BlobWriter
# - Measure download throughput with BlobReader
# - Test GC performance with various orphan ratios
```
tensor_chain Benchmarks
The tensor_chain crate provides a tensor-native blockchain with semantic consensus, Raft replication, 2PC distributed transactions, and sparse delta encoding.
Block Creation
| Configuration | Time | Per Transaction |
|---|---|---|
| empty_block | 171 ns | — |
| block_10_txns | 13.4 us | 1.34 us |
| block_100_txns | 111 us | 1.11 us |
Transaction Commit
| Operation | Time | Throughput |
|---|---|---|
| single_put | 432 us | 2.3K/s |
| multi_put_10 | 480 us | 20.8K ops/s |
Batch Transactions
| Count | Time | Throughput |
|---|---|---|
| 10 | 822 us | 12.2K/s |
| 100 | 21.5 ms | 4.7K/s |
| 1000 | 1.6 s | 607/s |
Consensus Validation
| Operation | Time | Notes |
|---|---|---|
| conflict_detection_pair | 279 ns | Hybrid cosine + Jaccard |
| cosine_similarity | 187 ns | Sparse vector |
| merge_pair | 448 ns | Orthogonal merge |
| merge_all_10 | 632 ns | Batch merge |
| find_merge_order_10 | 9 us | Optimal ordering |
Codebook Operations
| Operation | Time | Notes |
|---|---|---|
| global_quantize_128d | 854 ns | State validation |
| global_compute_residual | 925 ns | Delta compression |
| global_is_valid_state | 1.28 us | State machine check |
| local_quantize_128d | 145 ns | EMA-adaptive |
| local_quantize_and_update | 177 ns | With EMA update |
| manager_quantize_128d | 1.2 us | Full pipeline |
Delta Vector Operations
| Operation | Time | Improvement |
|---|---|---|
| cosine_similarity_128d | 196 ns | 35% faster |
| add_128d | 975 ns | 44% faster |
| scale_128d | 163 ns | 35% faster |
| weighted_average_128d | 982 ns | 26% faster |
| overlaps_with | 8.4 ns | 35% faster |
| cosine_similarity_768d | 1.96 us | 10% faster |
| add_768d | 2.6 us | 27% faster |
Chain Query Operations
| Operation | Time | Improvement |
|---|---|---|
| get_block_by_height | 1.19 us | 38% faster |
| get_tip | 1.06 us | 45% faster |
| get_genesis | 852 ns | 53% faster |
| height | 0.87 ns | 50% faster |
| tip_hash | 11.4 ns | 32% faster |
| history_key | 163 us | 15% faster |
| verify_chain_100_blocks | 276 us | — |
Chain Iteration
| Operation | Time | Improvement |
|---|---|---|
| iterate_50_blocks | 88 us | 10% faster |
| get_blocks_range_0_25 | 35 us | 27% faster |
K-means Codebook Training
| Configuration | Time |
|---|---|
| 100 vectors, 8 clusters | 123 us |
| 1000 vectors, 16 clusters | 8.4 ms |
Sparse Vector Performance
Conflict Detection by Sparsity Level (50 deltas, 128d)
| Sparsity | Time | Throughput | vs Dense |
|---|---|---|---|
| 10% (dense) | 389 us | 3.1M pairs/s | 1x |
| 50% | 261 us | 4.6M pairs/s | 1.5x |
| 90% | 57 us | 21.5M pairs/s | 6.8x |
| 99% | 23 us | 52.3M pairs/s | 16.9x |
Individual Sparse Operations (vs previous dense implementation)
| Operation | Sparse Time | Improvement |
|---|---|---|
| cosine_similarity | 16.5 ns | 76% faster |
| angular_distance | 28.5 ns | 64% faster |
| jaccard_index | 10.4 ns | 58% faster |
| euclidean_distance | 13.6 ns | 71% faster |
| overlapping_keys | 89 ns | 45% faster |
| add | 688 ns | 19% faster |
| weighted_average | 674 ns | 12% faster |
| project_orthogonal | 624 ns | 42% faster |
| detect_conflict_full | 53 ns | 33% faster |
High Dimension Sparse Performance
| Dimension | Cosine Time | Batch Detect (20 deltas) | Improvement |
|---|---|---|---|
| 128d | 10.3 ns | 8.9 us | 57% faster |
| 256d | 19 ns | 9.5 us | 55% faster |
| 512d | 41 ns | 17.2 us | 49-75% faster |
| 768d | 62.5 ns | 24 us | 55-77% faster |
Real Transaction Delta Sparsity Analysis
Measurement of actual delta sparsity for different transaction patterns (128d embeddings):
| Pattern | Avg NNZ | Sparsity | Estimated Speedup |
|---|---|---|---|
| Single Key Update | 4.0 | 96.9% | ~10x |
| Multi-Field Update | 11.3 | 91.2% | ~3x |
| New Record Insert | 29.5 | 77.0% | ~1x |
| Counter Increment | 1.0 | 99.2% | ~10x |
| Bulk Migration | 59.5 | 53.5% | ~1x |
| Graph Edge | 7.0 | 94.5% | ~3x |
Realistic Workload Mix (70% single-key, 20% multi-field, 10% other):
- Average NNZ: 7.1 / 128 dimensions
- Average Sparsity: 94.5%
- Expected speedup: 3-10x for typical workloads
Analysis
- Sparse advantage: Real transaction deltas are 90-99% sparse, providing 3-10x speedup
- Hybrid conflict detection: Cosine + Jaccard catches both angular and structural conflicts
- Memory savings: Sparse DeltaVector uses 8-32x less memory than dense for typical deltas
- Network bandwidth: Sparse serialization reduces replication bandwidth by 8-10x
- High dimension scaling: Benefits increase with dimension (768d: 4-5x faster than dense)
- Common operations optimized: Single-key updates (most common) are 96.9% sparse
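The sparse advantage comes from iterating only the stored non-zero dimensions, so similarity cost scales with NNZ rather than with the full dimension. A minimal sketch of the idea (hypothetical map-based layout; the actual `DeltaVector` representation differs):

```rust
use std::collections::HashMap;

// Hypothetical sparse delta: only non-zero dimensions are stored.
fn sparse_dot(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    // Iterate the smaller map; cost is O(min(nnz_a, nnz_b)), not O(dim).
    let (small, large) = if a.len() <= b.len() { (a, b) } else { (b, a) };
    small
        .iter()
        .filter_map(|(i, va)| large.get(i).map(|vb| va * vb))
        .sum()
}

fn sparse_cosine(a: &HashMap<usize, f32>, b: &HashMap<usize, f32>) -> f32 {
    let dot = sparse_dot(a, b);
    let na: f32 = a.values().map(|v| v * v).sum::<f32>().sqrt();
    let nb: f32 = b.values().map(|v| v * v).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    // A single-key update touches ~4 of 128 dimensions (96.9% sparse),
    // so only the overlapping dimensions cost anything.
    let a: HashMap<usize, f32> = [(3, 1.0), (17, 0.5)].into_iter().collect();
    let b: HashMap<usize, f32> = [(3, 2.0), (99, 0.25)].into_iter().collect();
    println!("dot = {}", sparse_dot(&a, &b)); // only dimension 3 overlaps
    println!("cos = {:.3}", sparse_cosine(&a, &b));
}
```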
Distributed Systems Benchmarks
Raft Consensus Operations
| Operation | Time | Throughput |
|---|---|---|
| raft_node_create | 545 ns | 1.8M/sec |
| raft_become_leader | 195 ns | 5.1M/sec |
| raft_heartbeat_stats_snapshot | 4.2 ns | 238M/sec |
| raft_log_length | 3.7 ns | 270M/sec |
| raft_stats_snapshot | 416 ps | 2.4B/sec |
2PC Distributed Transaction Operations
| Operation | Time | Throughput |
|---|---|---|
| lock_manager_acquire | 256 ns | 3.9M/sec |
| lock_manager_release | 139 ns | 7.2M/sec |
| lock_manager_is_locked | 31 ns | 32M/sec |
| coordinator_create | 46 ns | 21.7M/sec |
| coordinator_stats | 418 ps | 2.4B/sec |
| participant_create | 11 ns | 91M/sec |
Gossip Protocol Operations
| Operation | Time | Throughput |
|---|---|---|
| lww_state_create | 4.2 ns | 238M/sec |
| lww_state_merge | 169 ns | 5.9M/sec |
| gossip_node_state_create | 16 ns | 62M/sec |
| gossip_message_serialize | 36 ns | 28M/sec |
| gossip_message_deserialize | 81 ns | 12M/sec |
Snapshot Operations
| Operation | Time | Throughput |
|---|---|---|
| snapshot_metadata_create | 131 ns | 7.6M/sec |
| snapshot_metadata_serialize | 76 ns | 13M/sec |
| snapshot_metadata_deserialize | 246 ns | 4.1M/sec |
| raft_membership_config_create | 102 ns | 9.8M/sec |
| raft_with_store_create | 948 ns | 1.1M/sec |
Membership Operations
| Operation | Time | Throughput |
|---|---|---|
| membership_manager_create | 526 ns | 1.9M/sec |
| membership_view | 152 ns | 6.6M/sec |
| membership_partition_status | 19 ns | 52M/sec |
| membership_node_status | 46 ns | 21.7M/sec |
| membership_stats_snapshot | 2.9 ns | 344M/sec |
| membership_peer_ids | 71 ns | 14M/sec |
Deadlock Detection
| Operation | Time | Throughput |
|---|---|---|
| wait_graph_add_edge | 372 ns | 2.7M/sec |
| wait_graph_detect_no_cycle | 374 ns | 2.7M/sec |
| wait_graph_detect_with_cycle | 302 ns | 3.3M/sec |
| deadlock_detector_detect | 392 ns | 2.6M/sec |
Distributed Systems Analysis
- Lock operations are fast: Lock acquisition at 256ns and lock checks at 31ns support high-throughput 2PC
- Gossip is lightweight: State creation <5ns, merges ~169ns - suitable for high-frequency protocol rounds
- Stats access is near-free: Sub-nanosecond stats snapshots (416ps) mean monitoring adds no overhead
- Deadlock detection is efficient: Cycle detection in ~300-400ns allows frequent checks without blocking
- Node/manager creation is slower (500-950ns) - expected for initialization with data structures
- Snapshot deserialization at 246ns is acceptable for fast recovery
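Deadlock detection on a wait-for graph is a depth-first search for a back edge, which is why a full detection pass stays in the ~300-400ns range on small graphs. A standalone sketch of the idea (not the actual `wait_graph` implementation):

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical wait-for graph: txn id -> txn ids it is waiting on.
// A cycle in this graph means a deadlock.
fn has_cycle(edges: &HashMap<u64, Vec<u64>>) -> bool {
    fn dfs(
        n: u64,
        edges: &HashMap<u64, Vec<u64>>,
        visiting: &mut HashSet<u64>,
        done: &mut HashSet<u64>,
    ) -> bool {
        if done.contains(&n) {
            return false;
        }
        if !visiting.insert(n) {
            return true; // back edge: n is already on the DFS stack
        }
        for &m in edges.get(&n).map(|v| v.as_slice()).unwrap_or(&[]) {
            if dfs(m, edges, visiting, done) {
                return true;
            }
        }
        visiting.remove(&n);
        done.insert(n);
        false
    }
    let mut visiting = HashSet::new();
    let mut done = HashSet::new();
    edges.keys().any(|&n| dfs(n, edges, &mut visiting, &mut done))
}

fn main() {
    let mut g = HashMap::new();
    g.insert(1u64, vec![2u64]); // txn 1 waits on txn 2
    g.insert(2, vec![3]);
    assert!(!has_cycle(&g)); // 1 -> 2 -> 3 is a chain, not a cycle
    g.insert(3, vec![1]); // txn 3 waits on txn 1: deadlock
    assert!(has_cycle(&g));
}
```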
neumann_parser Benchmarks
The parser is a hand-written recursive descent parser with Pratt expression parsing for operator precedence.
Tokenization
| Query Type | Time | Throughput |
|---|---|---|
| simple_select | 182 ns | 99 MiB/s |
| select_where | 640 ns | 88 MiB/s |
| complex_select | 986 ns | 95 MiB/s |
| insert | 493 ns | 120 MiB/s |
| update | 545 ns | 91 MiB/s |
| node | 625 ns | 98 MiB/s |
| edge | 585 ns | 94 MiB/s |
| path | 486 ns | 75 MiB/s |
| embed | 407 ns | 138 MiB/s |
| similar | 185 ns | 118 MiB/s |
Parsing (tokenize + parse)
| Query Type | Time | Throughput |
|---|---|---|
| simple_select | 235 ns | 77 MiB/s |
| select_where | 1.19 us | 47 MiB/s |
| complex_select | 1.89 us | 50 MiB/s |
| insert | 688 ns | 86 MiB/s |
| update | 806 ns | 61 MiB/s |
| delete | 464 ns | 62 MiB/s |
| create_table | 856 ns | 80 MiB/s |
| node | 837 ns | 81 MiB/s |
| edge | 750 ns | 74 MiB/s |
| neighbors | 520 ns | 55 MiB/s |
| path | 380 ns | 58 MiB/s |
| embed_store | 650 ns | 86 MiB/s |
| similar | 290 ns | 76 MiB/s |
Expression Complexity
| Expression Type | Time |
|---|---|
| simple (a = 1) | 350 ns |
| binary_and | 580 ns |
| binary_or | 570 ns |
| nested_and_or | 950 ns |
| deep_nesting | 1.5 us |
| arithmetic | 720 ns |
| comparison_chain | 1.3 us |
Batch Parsing Throughput
| Batch Size | Time | Queries/s |
|---|---|---|
| 10 | 5.2 us | 1.9M/s |
| 100 | 52 us | 1.9M/s |
| 1,000 | 520 us | 1.9M/s |
Large Query Parsing
| Query Type | Time |
|---|---|
| INSERT 100 rows | 45 us |
| EMBED 768-dim vector | 38 us |
| WHERE 20 conditions | 8.5 us |
Analysis
- Zero dependencies: Hand-written lexer and parser with no external crates
- Consistent throughput: ~75-120 MiB/s across query types
- Expression complexity: Linear scaling with expression depth
- Batch performance: Consistent 1.9M queries/second regardless of batch size
- Large vectors: 768-dim embedding parsing in ~38us (~20M dimensions/second)
Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
| Tokenization | O(n) | Linear scan of input |
| Parsing | O(n) | Single pass, no backtracking |
| Expression parsing | O(n * d) | n = tokens, d = nesting depth |
| Error recovery | O(1) | Immediate error on invalid syntax |
Parser Design
- Lexer: Character-by-character tokenization with lookahead
- Parser: Recursive descent with Pratt parsing for expressions
- AST: Zero-copy where possible, spans track source locations
- Errors: Rich error messages with span highlighting
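As a rough illustration of Pratt parsing, here is a toy expression parser over single-digit literals with `+` and `*`, using binding powers for precedence. This is a sketch of the technique only; Neumann's grammar, tokenizer, and AST are far richer:

```rust
// Toy Pratt parser: binding power 1 for '+', 2 for '*'.
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Bin(Box<Expr>, char, Box<Expr>),
}

fn parse_expr(tokens: &[char], pos: &mut usize, min_bp: u8) -> Expr {
    // Prefix position: expect a single-digit literal.
    let mut lhs = Expr::Num(tokens[*pos].to_digit(10).unwrap() as i64);
    *pos += 1;
    // Infix loop: keep consuming operators that bind at least as tightly.
    while *pos < tokens.len() {
        let op = tokens[*pos];
        let bp = match op { '+' => 1, '*' => 2, _ => break };
        if bp < min_bp {
            break;
        }
        *pos += 1;
        let rhs = parse_expr(tokens, pos, bp + 1); // bp + 1 => left-associative
        lhs = Expr::Bin(Box::new(lhs), op, Box::new(rhs));
    }
    lhs
}

fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Bin(l, '+', r) => eval(l) + eval(r),
        Expr::Bin(l, '*', r) => eval(l) * eval(r),
        _ => unreachable!(),
    }
}

fn main() {
    let tokens: Vec<char> = "1+2*3".chars().collect();
    let e = parse_expr(&tokens, &mut 0, 0);
    assert_eq!(eval(&e), 7); // '*' binds tighter than '+'
}
```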
query_router Benchmarks
The query router integrates all engines and routes queries based on parsed AST type.
Relational Operations
| Operation | Time |
|---|---|
| SELECT * (100 rows) | 17 us |
| SELECT WHERE | 17 us |
| INSERT | 290 us |
| UPDATE | 6.5 ms |
Graph Operations
| Operation | Time |
|---|---|
| NODE CREATE | 2.3 us |
| EDGE CREATE | 3.5 us |
| NEIGHBORS | 1.8 us |
| PATH (1 -> 10) | 85 us |
| FIND NODE | 1.2 us |
Vector Operations
| Operation | Time |
|---|---|
| EMBED STORE (128d) | 28 us |
| EMBED GET | 1.5 us |
| SIMILAR LIMIT 5 (100 vectors) | 10 ms |
| SIMILAR LIMIT 10 (100 vectors) | 10 ms |
Mixed Workload
| Configuration | Time | Queries/s |
|---|---|---|
| 5 mixed queries (SELECT, NEIGHBORS, SIMILAR, INSERT, NODE) | 11 ms | 455/s |
Insert Throughput
| Batch Size | Time | Rows/s |
|---|---|---|
| 100 | 29 ms | 3.4K/s |
| 500 | 145 ms | 3.4K/s |
| 1,000 | 290 ms | 3.4K/s |
Analysis
- Parse overhead: Parser adds ~200ns-2us per query (negligible vs execution)
- Routing overhead: AST-based routing is O(1) pattern matching
- Relational: SELECT is fast (17us); UPDATE scans all rows (6.5ms for 100 rows)
- Graph: Node/edge creation ~2-3us; path finding scales with path length
- Vector: Similarity search dominates mixed workloads (~10ms for 100 vectors)
- Bottleneck identification: SIMILAR queries are the slowest operation; use HNSW index for large vector stores
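Routing itself is a single match over the parsed AST, which is why dispatch overhead is negligible next to engine execution. A sketch with hypothetical statement variants (the real AST types differ):

```rust
// Hypothetical AST variants; routing is one O(1) match, so dispatch
// cost never shows up next to the microsecond-scale engine work above.
enum Statement {
    Select(String),
    NodeCreate(String),
    EmbedStore(String),
}

fn route(stmt: &Statement) -> &'static str {
    match stmt {
        Statement::Select(_) => "relational_engine",
        Statement::NodeCreate(_) => "graph_engine",
        Statement::EmbedStore(_) => "vector_engine",
    }
}

fn main() {
    assert_eq!(route(&Statement::Select("documents".into())), "relational_engine");
    assert_eq!(route(&Statement::EmbedStore("doc-1".into())), "vector_engine");
}
```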
Query Routing Flow
Query String
│
▼
┌─────────┐
│ Parser │ ~500ns
└────┬────┘
│
▼
┌─────────┐
│ AST │
└────┬────┘
│
▼
┌─────────────┐
│ Router │ O(1) dispatch
└──────┬──────┘
│
├──► RelationalEngine
├──► GraphEngine
├──► VectorEngine
├──► Vault
├──► Cache
└──► BlobStore
Performance Recommendations
| Query Type | Optimization |
|---|---|
| High SELECT volume | Create hash indexes on filter columns |
| Large vector search | Build HNSW index |
| Graph traversals | Use NEIGHBORS with LIMIT |
| Batch inserts | Use batch_insert() API |
| Mixed workloads | Profile to identify bottlenecks |
Stress Tests
Comprehensive stress testing infrastructure for Neumann targeting 1M entity scale with extensive coverage of concurrency, data volume, and sustained load.
Quick Start
# Run all stress tests (45+ min total)
cargo test --release -p stress_tests -- --ignored --nocapture
# Run specific test suite
cargo test --release -p stress_tests --test hnsw_stress -- --ignored --nocapture
# Run with custom duration (30s instead of default)
STRESS_DURATION=30 cargo test --release -p stress_tests -- --ignored --nocapture
Test Suites
| Suite | Tests | Description |
|---|---|---|
| HNSW Stress | 4 | 1M vector indexing, concurrent builds |
| TieredStore Stress | 3 | Hot/cold migration under load |
| Mixed Workload | 2 | All engines concurrent, realistic patterns |
| TensorStore Stress | 4 | 1M entities, high contention |
| BloomFilter Stress | 3 | 1M keys, bit-level concurrency |
| QueryRouter Stress | 3 | Concurrent queries, sustained writes |
| Duration Stress | 3 | Long-running stability, memory leaks |
Key Performance Findings
TensorStore (DashMap)
- 7.5M writes/sec at 1M entities
- Sub-microsecond median latency
- Handles 16:1 contention ratio with 2.5M ops/sec
HNSW Index
- 3,372 vectors/sec insert rate at 1M scale
- 0.11ms search latency (p50)
- 99.8% recall@10 under concurrent load
BloomFilter
- 0.88% FP rate at 1M keys (target 1%)
- 15M+ ops/sec bit-level operations
- Thread-safe with AtomicU64
Mixed Workloads
- All engines can operate concurrently
- Graph operations (5us p50) and vector ops (< 1us p50) are fastest
- Relational engine adds ~12ms p50 overhead due to schema operations
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `STRESS_DURATION` | 30 (quick) / 600 (full) | Test duration in seconds |
| `STRESS_THREADS` | 16 | Thread count for tests |
Config Presets
```rust
// Quick mode (CI): 100K entities, 4 threads, 30s
let config = quick_config();

// Full mode (local): 1M entities, 16 threads, 10min
let config = full_config();

// Endurance mode: 500K entities, 8 threads, 1 hour
let config = endurance_config();
```
Latency Metrics
All tests report percentile latencies using HdrHistogram:
| Metric | Description |
|---|---|
| p50 | Median latency |
| p99 | 99th percentile |
| p999 | 99.9th percentile |
| max | Maximum observed |
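As a reference for how these metrics are read, here is a naive percentile computation over sorted latency samples (the stress suite itself uses HdrHistogram, which records into buckets instead of keeping every sample):

```rust
// Naive percentile over sorted recorded latencies (nanoseconds).
// p50/p99/p999 pick the value at the given rank in the sorted samples.
fn percentile(sorted: &[u64], p: f64) -> u64 {
    let idx = ((sorted.len() as f64 - 1.0) * p).round() as usize;
    sorted[idx]
}

fn main() {
    let lat: Vec<u64> = (0..1000).collect(); // 1000 samples, already sorted
    assert_eq!(percentile(&lat, 0.50), 500);  // p50: median
    assert_eq!(percentile(&lat, 0.99), 989);  // p99
    assert_eq!(percentile(&lat, 0.999), 998); // p999
    assert_eq!(*lat.last().unwrap(), 999);    // max observed
}
```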
Running in CI
For CI pipelines, use quick_config with limited duration:
```yaml
- name: Run stress tests
  run: |
    STRESS_DURATION=30 cargo test --release -p stress_tests -- --ignored --nocapture
  timeout-minutes: 15
```
HNSW Stress Tests
Stress tests for the Hierarchical Navigable Small World (HNSW) index, targeting 1M vector scale.
Test Suite
| Test | Scale | Description |
|---|---|---|
| `stress_hnsw_1m_vectors` | 1M 128d vectors | Build 1M vector index |
| `stress_hnsw_100k_concurrent_build` | 100K vectors, 16 threads | Concurrent index construction |
| `stress_hnsw_search_during_insert` | 50K vectors, 4+4 threads | Concurrent search during insert |
| `stress_hnsw_recall_under_load` | 10K vectors | Verify recall@10 under load |
Results
| Test | Key Metric | Result |
|---|---|---|
| 1M vectors | Insert throughput | 3,372 vectors/sec |
| 1M vectors | Search latency (p50) | 0.11ms |
| 100K concurrent | Insert throughput | 1,155 vectors/sec |
| Recall@10 | Average recall | 99.8% (min 90%) |
Running
# Run all HNSW stress tests
cargo test --release -p stress_tests --test hnsw_stress -- --ignored --nocapture
# Run specific test
cargo test --release -p stress_tests stress_hnsw_1m_vectors -- --ignored --nocapture
1M Vector Index Build
Tests building an HNSW index with 1 million 128-dimensional vectors.
What it validates:
- Memory efficiency at scale
- Index build time scalability
- Search accuracy after large insertions
Expected behavior:
- Linear memory growth with vector count
- Sub-linear search time (O(log n))
- Recall@10 > 95%
Concurrent Index Build
Tests building an HNSW index with 16 concurrent writer threads.
What it validates:
- Thread-safety of HNSW insert operations
- Performance under contention
- Correctness with concurrent modifications
Expected behavior:
- All inserted vectors are findable
- No panics or data races
- Throughput scales with thread count (with diminishing returns)
Search During Insert
Tests searching the index while new vectors are being inserted concurrently.
What it validates:
- Read/write concurrency safety
- Search accuracy with ongoing modifications
- Latency stability under load
Expected behavior:
- Searches return valid results
- No stale or corrupted results
- Latency remains bounded
Recall Under Load
Tests search recall accuracy under sustained concurrent load.
What it validates:
- HNSW recall guarantees under stress
- Accuracy with high query volume
- Configuration impact on recall
Expected behavior:
- Average recall@10 > 95%
- Minimum recall@10 > 90%
- `high_recall` config achieves higher recall than the default config
Performance Tuning
HNSW Configuration Impact
| Config | Insert Speed | Search Speed | Recall | Memory |
|---|---|---|---|---|
| high_speed | Fastest | Fastest | Lower | Lower |
| default | Medium | Medium | Good | Medium |
| high_recall | Slowest | Slowest | Highest | Higher |
Scaling Recommendations
| Scale | Recommendation |
|---|---|
| < 100K | Use default config |
| 100K - 1M | Consider high_speed if latency-critical |
| > 1M | Shard across multiple indexes |
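The "> 1M: shard across multiple indexes" recommendation amounts to routing each vector to a shard by key hash, querying every shard, and merging the per-shard top-k. A sketch with a brute-force per-shard result list standing in for a real HNSW index (all names here are hypothetical):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Route a vector key to one of N shards by hash.
fn shard_for(key: &str, shards: usize) -> usize {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() as usize) % shards
}

// Merge per-shard (key, similarity) candidates into a global top-k.
fn top_k_merged(shards: &[Vec<(String, f32)>], k: usize) -> Vec<(String, f32)> {
    let mut all: Vec<(String, f32)> = shards.iter().flatten().cloned().collect();
    all.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap()); // highest similarity first
    all.truncate(k);
    all
}

fn main() {
    let n = 4;
    let mut shards: Vec<Vec<(String, f32)>> = vec![Vec::new(); n];
    // Pretend these are per-shard similarity results for one query.
    for (key, score) in [("a", 0.9), ("b", 0.3), ("c", 0.8), ("d", 0.5)] {
        shards[shard_for(key, n)].push((key.to_string(), score));
    }
    let top = top_k_merged(&shards, 2);
    assert_eq!(top[0].0, "a");
    assert_eq!(top[1].0, "c");
}
```

Each shard stays small enough to search quickly, at the cost of querying all shards per request.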
TieredStore Stress Tests
Stress tests for the two-tier hot/cold storage system with automatic data migration.
Test Suite
| Test | Scale | Description |
|---|---|---|
| `stress_tiered_hot_only_scale` | 1M entities | Hot-only tier at scale |
| `stress_tiered_migration_under_load` | 100K entities | Hot/cold migration with concurrent load |
| `stress_tiered_hot_read_latency` | 100K entities | Random access read latency |
Results
| Test | Key Metric | Result |
|---|---|---|
| Hot-only 1M | Throughput | 689K entities/sec |
| Migration | Concurrent access | Works correctly |
| Read latency | p50 | < 3us |
| Read latency | p99 | < 500us |
Running
# Run all TieredStore stress tests
cargo test --release -p stress_tests --test tiered_store_stress -- --ignored --nocapture
# Run specific test
cargo test --release -p stress_tests stress_tiered_hot_only_scale -- --ignored --nocapture
Hot-Only Scale Test
Tests TieredStore performance with only hot tier active (no cold storage).
What it validates:
- In-memory performance at scale
- DashMap + instrumentation overhead
- Memory usage patterns
Expected behavior:
- Throughput > 500K entities/sec
- Linear memory growth
- Consistent latency distribution
Migration Under Load
Tests hot-to-cold data migration while concurrent reads/writes continue.
What it validates:
- Migration correctness during active use
- No data loss during tier transitions
- Read consistency during migration
Expected behavior:
- All data accessible before and after migration
- Reads don’t block on migration
- Writes to migrated keys work correctly
Hot Read Latency
Tests random access read latency for hot tier data.
What it validates:
- Read latency distribution
- Hot path optimization
- Cache efficiency
Expected behavior:
- p50 latency < 3us
- p99 latency < 500us
- No extreme outliers (p999 < 10ms)
Architecture
TieredStore
│
├── Hot Tier (DashMap)
│ ├── Fast in-memory access
│ ├── Access instrumentation
│ └── Automatic hot shard tracking
│
└── Cold Tier (mmap)
├── Disk-backed storage
├── Memory-efficient for large datasets
└── Transparent promotion on access
Migration Strategies
| Strategy | Trigger | Use Case |
|---|---|---|
| Time-based | Entries older than threshold | Aging data |
| Access-based | Cold shards (low access) | Infrequent data |
| Memory-based | Hot tier size limit | Memory pressure |
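The three strategies compose naturally into a single demotion predicate. A sketch with hypothetical fields and thresholds (the actual TieredStore policy may combine these differently):

```rust
use std::time::{Duration, Instant};

// Hypothetical per-entry stats tracked by the hot tier.
struct EntryStats {
    last_write: Instant,
    access_count: u64,
}

// Demote an entry to the cold tier if ANY strategy triggers:
// time-based (age), access-based (cold shard), or memory-based (pressure).
fn should_demote(
    e: &EntryStats,
    now: Instant,
    max_age: Duration,
    min_accesses: u64,
    hot_len: usize,
    hot_capacity: usize,
) -> bool {
    let time_based = now.duration_since(e.last_write) > max_age;
    let access_based = e.access_count < min_accesses;
    let memory_based = hot_len > hot_capacity;
    time_based || access_based || memory_based
}

fn main() {
    let now = Instant::now();
    let cold = EntryStats { last_write: now, access_count: 0 };
    // Low access count alone is enough to demote under this policy.
    assert!(should_demote(&cold, now, Duration::from_secs(3600), 5, 10, 100));
    let hot = EntryStats { last_write: now, access_count: 50 };
    assert!(!should_demote(&hot, now, Duration::from_secs(3600), 5, 10, 100));
}
```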
Configuration
```rust
let config = TieredConfig {
    cold_dir: PathBuf::from("/var/lib/neumann/cold"),
    cold_capacity: 1_000_000, // Max cold entries
    sample_rate: 0.01,        // 1% access sampling
};
```
Mixed Workload Stress Tests
Stress tests that exercise all Neumann engines simultaneously with realistic workload patterns.
Test Suite
| Test | Scale | Description |
|---|---|---|
| `stress_all_engines_concurrent` | 25K ops/thread, 12 threads | All engines under concurrent load |
| `stress_realistic_workload` | 30s duration | Mixed OLTP + search + traversal |
Results
| Test | Key Metric | Result |
|---|---|---|
| All engines | Combined throughput | 841 ops/sec |
| All engines | Relational p50 | 12ms |
| All engines | Graph p50 | 5us |
| All engines | Vector p50 | < 1us |
| Realistic workload | Mixed throughput | 232 ops/sec |
| Realistic workload | Read rate | 91 reads/sec |
| Realistic workload | Write rate | 68 writes/sec |
| Realistic workload | Search rate | 72 searches/sec |
Running
# Run all mixed workload stress tests
cargo test --release -p stress_tests --test mixed_workload_stress -- --ignored --nocapture
# Run specific test
cargo test --release -p stress_tests stress_all_engines_concurrent -- --ignored --nocapture
All Engines Concurrent
Tests all engines (relational, graph, vector) under simultaneous heavy load from 12 threads.
What it validates:
- Cross-engine concurrency safety
- Shared TensorStore contention handling
- No deadlocks or livelocks
- Correct results under maximum stress
Workload distribution per thread:
- Relational: INSERT, SELECT, UPDATE
- Graph: NODE, EDGE, NEIGHBORS
- Vector: EMBED, SIMILAR
Expected behavior:
- No panics or assertion failures
- All operations complete (no hangs)
- Data consistency verified post-test
Realistic Workload
Simulates a production-like mixed workload over 30 seconds.
What it validates:
- Sustained throughput over time
- Memory stability (no leaks)
- Latency consistency
Workload pattern:
- 40% Reads (SELECT, GET, NEIGHBORS)
- 30% Writes (INSERT, UPDATE, NODE)
- 30% Searches (SIMILAR, PATH)
Expected behavior:
- Throughput variance < 20%
- Memory usage stable
- No degradation over time
Engine Latency Breakdown
| Engine | Operation | Typical p50 | Notes |
|---|---|---|---|
| Relational | SELECT | 1-10ms | Schema lookup overhead |
| Relational | INSERT | 3-15ms | Index maintenance |
| Graph | NEIGHBORS | 5-50us | Adjacency list lookup |
| Graph | PATH | 100us-5ms | Scales with path length |
| Vector | EMBED | 1-5us | Hash insert |
| Vector | SIMILAR | 1-100ms | Scales with corpus size |
Bottleneck Identification
When mixed workload throughput is lower than expected:
- Vector search dominates: Use HNSW index for SIMILAR queries
- Relational scans: Add hash/B-tree indexes on filter columns
- Graph traversals: Add LIMIT to NEIGHBORS/PATH queries
- Contention: Check hot shards with instrumentation
Scaling Considerations
| Bottleneck | Solution |
|---|---|
| CPU-bound | Add more cores, enable rayon parallelism |
| Memory-bound | Enable tiered storage, use sparse vectors |
| I/O-bound | Use NVMe storage, increase buffer sizes |
| Network-bound | Batch operations, use local cache |
Integration Tests
The integration test suite validates cross-engine functionality, data flow, and
system behavior. All tests use a shared TensorStore to verify that relational,
graph, and vector engines work correctly together.
Test Count: 267+ tests across 22 files
Running Tests
# Run all integration tests
cargo test --package integration_tests
# Run specific test file
cargo test --package integration_tests --test persistence
# Run single test
cargo test --package integration_tests test_snapshot_preserves_all_data
# Run with output
cargo test --package integration_tests -- --nocapture
Test Categories
| Category | Tests | Description |
|---|---|---|
| Persistence | 9 | Snapshot/restore across all engines |
| Concurrency | 10 | Multi-threaded and async operations |
| Cross-Engine | 10 | Data flow between engines |
| Error Handling | 10 | Proper error messages |
| Delete Operations | 7 | Cleanup and consistency |
| Cache Invalidation | 7 | Cache behavior on writes |
| FIND Command | 7 | Unified query syntax |
| Blob Lifecycle | 7 | GC, repair, streaming |
| Cache Advanced | 6 | TTL, semantic, eviction |
| Vault Advanced | 8 | Grants, audit, namespacing |
| Edge Cases | 10 | Boundary conditions |
| Tensor Compress | 10 | Quantization, delta, RLE encoding |
| Join Operations | 10 | Hash-based relational JOINs |
| HNSW Index | 13 | Approximate nearest neighbor search |
| Vault Versioning | 17 | Secret history and rollback |
| Index Operations | 18 | Hash and B-tree indexes |
| Columnar Storage | 20 | Columnar scans, batch insert, projection |
| Entity Graph API | 18 | String-keyed entity edge operations |
| Sparse Vectors | 22 | Sparse vector creation and similarity |
| Store Instrumentation | 15 | Access pattern tracking |
| Tiered Storage | 16 | Hot/cold data migration |
| Distance Metrics | 17 | COSINE, EUCLIDEAN, DOT_PRODUCT similarity |
Test Helpers
Available in integration_tests/src/lib.rs:
| Helper Function | Purpose |
|---|---|
| `create_shared_router()` | Creates QueryRouter with shared TensorStore |
| `create_router_with_vault(master_key)` | Router with vault initialized |
| `create_router_with_cache()` | Router with cache initialized |
| `create_router_with_blob()` | Router with blob store initialized |
| `create_router_with_all_features(master_key)` | Router with vault, cache, and blob |
| `sample_embeddings(count, dim)` | Generates deterministic test embeddings using sin() |
| `get_store_from_router(router)` | Extracts TensorStore from router |
| `create_shared_engines()` | Creates (store, relational, graph, vector) tuple |
| `create_shared_engines_arc()` | Same as above but wrapped in Arc for concurrency |
Key Test Suites
Persistence Tests
Tests snapshot/restore functionality across all engines.
| Test | What It Tests |
|---|---|
| `test_snapshot_preserves_all_data` | All engine data survives snapshot/restore |
| `test_snapshot_during_writes` | Concurrent writes don't corrupt snapshot |
| `test_restore_to_fresh_store` | Snapshot loads into new TensorStore |
| `test_compressed_snapshot_roundtrip` | Compression works for vector data |
| `test_snapshot_includes_vault_secrets` | Vault secrets persist in snapshot |
Lessons Learned
- Cache is intentionally ephemeral (internal DashMaps)
- Vault secrets ARE persisted (encrypted in TensorStore)
- Bloom filter must be re-initialized with same parameters on restore
Concurrency Tests
Tests multi-threaded and async access patterns.
| Test | What It Tests |
|---|---|
| `test_concurrent_writes_all_engines` | 6 threads write to relational/graph/vector simultaneously |
| `test_shared_store_contention` | 4 threads write same key 1000 times each |
| `test_reader_writer_isolation` | Reads during heavy writes |
| `test_blob_parallel_uploads` | 10 concurrent blob uploads with barrier sync |
Lessons Learned
- DashMap provides excellent concurrent write performance
- Node IDs are NOT guaranteed sequential - must capture actual IDs
- Blob operations require `tokio::sync::Mutex` for shared access
- HNSW search is thread-safe during concurrent writes
Cross-Engine Tests
Tests data flow and operations across multiple engines.
| Test | What It Tests |
|---|---|
| `test_unified_entity_across_engines` | Single entity with data in all 3 engines |
| `test_graph_nodes_with_embeddings` | Graph nodes linked to vector embeddings |
| `test_insert_embed_search_cycle` | INSERT -> EMBED -> SIMILAR workflow |
| `test_query_router_cross_engine_operations` | Router executes across all engines |
Lessons Learned
- `execute()` uses `col:type` syntax; `execute_parsed()` uses SQL syntax
- `NEIGHBORS` command returns `QueryResult::Ids`, not `QueryResult::Nodes`
- Node IDs must be captured and reused, not assumed to be 0, 1, 2…
Sparse Vector Tests (22 tests)
Tests sparse vector creation, storage, and similarity operations.
Key APIs
- `TensorValue::from_embedding(dense, value_threshold, sparsity_threshold)`
- `TensorValue::from_embedding_auto(dense)` - Auto thresholds (0.01 value, 0.7 sparsity)
- `TensorValue::dot(other)` - Dot product (sparse-sparse, sparse-dense, dense-dense)
- `TensorValue::cosine_similarity(other)` - Cosine similarity
- `TensorValue::to_dense()` - Convert back to dense
- `TensorValue::dimension()` - Get vector dimension
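The threshold semantics described above can be sketched as follows (an illustration of the documented behavior, not the actual `TensorValue` implementation):

```rust
// Sketch of the documented threshold logic: values below `value_threshold`
// are dropped, and the sparse form is chosen only when the zero fraction
// meets `sparsity_threshold` (defaults: 0.01 value, 0.7 sparsity).
enum Vector {
    Dense(Vec<f32>),
    Sparse(Vec<(usize, f32)>),
}

fn from_embedding(dense: &[f32], value_threshold: f32, sparsity_threshold: f32) -> Vector {
    let nnz: Vec<(usize, f32)> = dense
        .iter()
        .enumerate()
        .filter(|(_, v)| v.abs() >= value_threshold)
        .map(|(i, v)| (i, *v))
        .collect();
    let sparsity = 1.0 - nnz.len() as f32 / dense.len() as f32;
    if sparsity >= sparsity_threshold {
        Vector::Sparse(nnz)
    } else {
        Vector::Dense(dense.to_vec())
    }
}

fn main() {
    let mut v = vec![0.0f32; 128];
    v[3] = 1.0;
    v[17] = 0.5;
    // 126/128 zeros -> ~98% sparse -> stored sparse at the 0.7 default.
    assert!(matches!(from_embedding(&v, 0.01, 0.7), Vector::Sparse(p) if p.len() == 2));
    // A fully populated vector stays dense.
    assert!(matches!(from_embedding(&vec![0.5f32; 128], 0.01, 0.7), Vector::Dense(_)));
}
```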
Distance Metrics Tests (17 tests)
Tests SIMILAR queries with different distance metrics.
Key Syntax
-- Metric goes AFTER LIMIT clause
SIMILAR 'key' LIMIT 10 EUCLIDEAN
SIMILAR [0.1, 0.2] LIMIT 5 DOT_PRODUCT
Known Issues
- Metric keyword must come AFTER the LIMIT clause (not `METRIC EUCLIDEAN`)
- COSINE/DOT_PRODUCT return empty for zero-magnitude queries
- EUCLIDEAN correctly handles zero vectors
Coverage Summary
| Category | Files | Tests | Key Validations |
|---|---|---|---|
| Storage | 4 | 50+ | Persistence, tiering, instrumentation |
| Engines | 5 | 60+ | Relational, graph, vector operations |
| Security | 2 | 25+ | Vault, access control, versioning |
| Caching | 2 | 13 | Exact, semantic, invalidation |
| Advanced | 6 | 80+ | Compression, joins, indexes, sparse |
| Total | 17 | 267+ | |
Code Style
This guide covers the coding standards for Neumann. All contributions must follow these guidelines.
Rust Idioms
- Prefer iterators over loops
- Use `?` for error propagation
- Keep functions small and focused
- Prefer composition over inheritance patterns
Formatting
All code must pass cargo fmt:
cargo fmt --check
Lints
All code must pass clippy with warnings as errors:
cargo clippy -- -D warnings
Comments Policy
Doc comments (///) are for rustdoc generation. Use them sparingly.
DO Document
- Types (structs, enums) - explain purpose and invariants
- Non-obvious behavior - when a method does something unexpected
- Complex algorithms - when the “why” isn’t clear from code
DO NOT Document
- Methods with self-explanatory names (`get`, `set`, `new`, `len`, `is_empty`)
- Trivial implementations
- Anything where the doc would just repeat the function name
Examples
```rust
// BAD - restates the obvious
/// Get a field value
pub fn get(&self, key: &str) -> Option<&TensorValue>

// GOOD - no comment needed, name is clear
pub fn get(&self, key: &str) -> Option<&TensorValue>

// GOOD - explains non-obvious behavior
/// Returns cloned data to ensure thread safety.
/// For zero-copy access, use get_ref().
pub fn get(&self, key: &str) -> Result<TensorData>
```
Inline comments (//) should explain “why”, never “what”.
Naming
- Types: `PascalCase`
- Functions and variables: `snake_case`
- Constants: `SCREAMING_SNAKE_CASE`
- Modules: `snake_case`
Error Handling
- Use `Result` for fallible operations
- Define error types with `thiserror`
- Provide context with error messages
```rust
#[derive(Debug, thiserror::Error)]
pub enum MyError {
    #[error("failed to parse config: {0}")]
    ConfigParse(String),

    #[error("connection failed: {source}")]
    Connection {
        #[from]
        source: std::io::Error,
    },
}
```
Concurrency
- Use `DashMap` for concurrent hash maps
- Avoid `Mutex` where possible (use `parking_lot` if needed)
- Document thread-safety in type docs
Testing
- Unit tests in the same file as code (`#[cfg(test)]` module)
- Test the public API, not implementation details
- Use descriptive names: `test_<function>_<scenario>_<expected>`
Commits
- Write clear, imperative commit messages
- No emoji in commits
- Reference issue numbers when applicable
- Keep commits atomic - one logical change per commit
Testing
Test Philosophy
- Test the public API, not implementation details
- Include edge cases: empty inputs, boundaries, error conditions
- Performance tests for operations that must scale (10k+ entities)
- Concurrent tests for thread-safe code
Running Tests
# All tests
cargo test
# Specific crate
cargo test -p tensor_chain
# Specific test
cargo test test_raft_election
# With output
cargo test -- --nocapture
# Run ignored tests (slow/integration)
cargo test -- --ignored
Test Organization
Unit tests live in the same file:
```rust
pub fn process(data: &str) -> Result<Output> {
    // implementation
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_process_valid_input() {
        let result = process("valid").unwrap();
        assert_eq!(result.status, "ok");
    }

    #[test]
    fn test_process_empty_input() {
        let result = process("");
        assert!(result.is_err());
    }
}
```
Test Naming
Use the pattern: `test_<function>_<scenario>_<expected>`
```rust
#[test]
fn test_insert_duplicate_key_returns_error() { }

#[test]
fn test_search_empty_index_returns_empty() { }

#[test]
fn test_commit_after_abort_fails() { }
```
Concurrent Tests
For thread-safe code:
```rust
#[test]
fn test_store_concurrent_writes() {
    let store = Arc::new(TensorStore::new());
    let handles: Vec<_> = (0..10)
        .map(|i| {
            let store = Arc::clone(&store);
            std::thread::spawn(move || {
                for j in 0..1000 {
                    // `data` is any TensorData value prepared by the test
                    store.put(format!("key_{i}_{j}"), data.clone());
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(store.len(), 10_000);
}
```
Performance Tests
Mark slow tests with #[ignore]:
```rust
#[test]
#[ignore]
fn test_hnsw_search_10k_vectors() {
    let mut index = HNSWIndex::new(config);
    for i in 0..10_000 {
        index.insert(format!("vec_{i}"), random_vector(128));
    }

    let start = Instant::now();
    for _ in 0..100 {
        index.search(&query, 10);
    }
    let elapsed = start.elapsed();
    assert!(elapsed < Duration::from_secs(1));
}
```
Run with: cargo test -- --ignored
Integration Tests
Located in integration_tests/:
cargo test -p integration_tests
These test cross-crate behavior and full workflows.
Coverage
Check coverage with cargo-llvm-cov:
cargo install cargo-llvm-cov
cargo llvm-cov --workspace --html
open target/llvm-cov/html/index.html
Target coverage thresholds:
- shell: 88%
- parser: 91%
- blob: 91%
- router: 92%
- chain: 95%
Model Checking (TLA+)
Distributed protocol changes must be verified against the TLA+
specifications in specs/tla/:
cd specs/tla
# Run TLC on all three specs
java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
-deadlock -workers auto -config Raft.cfg Raft.tla
java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
-deadlock -workers auto \
-config TwoPhaseCommit.cfg TwoPhaseCommit.tla
java -XX:+UseParallelGC -Xmx4g -jar tla2tools.jar \
-deadlock -workers auto \
-config Membership.cfg Membership.tla
When modifying Raft, 2PC, or gossip protocols:
- Update the corresponding `.tla` spec
- Run TLC and verify zero errors
- Save output to `specs/tla/tlc-results/`
See Formal Verification for background on what model checking covers.
Mocking
Use trait objects for dependency injection:
```rust
pub trait Transport: Send + Sync {
    fn send(&self, msg: Message) -> Result<()>;
}

// In tests
struct MockTransport {
    sent: Mutex<Vec<Message>>,
}

impl Transport for MockTransport {
    fn send(&self, msg: Message) -> Result<()> {
        self.sent.lock().unwrap().push(msg);
        Ok(())
    }
}
```
Documentation
Documentation Structure
Neumann documentation consists of:
- mdBook (`docs/book/`) - Conceptual docs, tutorials, operations
- rustdoc - API reference generated from source
- README.md per crate - Quick overview
Writing mdBook Pages
File Location
docs/book/src/
├── SUMMARY.md # Table of contents
├── introduction.md # Landing page
├── getting-started/ # Tutorials
├── architecture/ # Module deep dives
├── concepts/ # Cross-cutting concepts
├── operations/ # Deployment, monitoring
└── contributing/ # Contribution guides
Page Structure
# Page Title
Brief introduction (1-2 paragraphs).
## Section 1
Content with examples.
### Subsection
More detail.
## Section 2
Use tables for structured data:
| Column 1 | Column 2 |
|----------|----------|
| Value 1 | Value 2 |
Use mermaid for diagrams:
\`\`\`mermaid
flowchart LR
A --> B --> C
\`\`\`
Admonitions
Use mdbook-admonish syntax:
```admonish note
This is a note.
## Writing Rustdoc
### Module Documentation
```rust
//! # Module Name
//!
//! Brief description (one line).
//!
//! ## Overview
//!
//! Longer explanation of purpose and design decisions.
//!
//! ## Example
//!
//! ```rust
//! // Example code
//! ```
```
Type Documentation
```rust
/// Brief description of the type.
///
/// Longer explanation if needed.
///
/// # Example
///
/// ```rust
/// let value = MyType::new();
/// ```
pub struct MyType { /* ... */ }
```
When to Document
Document:
- All public types
- Non-obvious behavior
- Complex algorithms
Don’t document:
- Self-explanatory methods (`get`, `set`, `new`)
- Trivial implementations
Building Documentation
mdBook
cd docs/book
mdbook build
mdbook serve # Local preview at localhost:3000
rustdoc
cargo doc --workspace --no-deps --open
Full Build
# mdBook
cd docs/book && mdbook build
# rustdoc
cargo doc --workspace --no-deps
# Combine
cp -r target/doc docs/book-output/api/
Link Checking
cd docs/book
mdbook-linkcheck --standalone
Adding Mermaid Diagrams
Supported diagram types:
- `flowchart` - Flow diagrams
- `sequenceDiagram` - Sequence diagrams
- `stateDiagram-v2` - State machines
- `classDiagram` - Class diagrams
- `gantt` - Gantt charts
Example:
````markdown
```mermaid
sequenceDiagram
    participant C as Client
    participant S as Server
    C->>S: Request
    S->>C: Response
```
````
Fuzzing
Neumann uses cargo-fuzz (libFuzzer-based) for coverage-guided fuzzing.
Setup
# Install cargo-fuzz (requires nightly)
cargo install cargo-fuzz
# List available targets
cd fuzz && cargo +nightly fuzz list
Running Fuzz Targets
# Run a specific target for 60 seconds
cargo +nightly fuzz run parser_parse -- -max_total_time=60
# Run without sanitizer (2x faster for safe Rust)
cargo +nightly fuzz run parser_parse --sanitizer none
# Reproduce a crash
cargo +nightly fuzz run parser_parse artifacts/parser_parse/crash-xxx
Available Targets
| Target | Module | What it tests |
|---|---|---|
| `parser_parse` | neumann_parser | Statement parsing |
| `parser_parse_all` | neumann_parser | Multi-statement parsing |
| `parser_parse_expr` | neumann_parser | Expression parsing |
| `parser_tokenize` | neumann_parser | Lexer/tokenization |
| `compress_ids` | tensor_compress | Varint ID compression |
| `compress_rle` | tensor_compress | RLE encode/decode |
| `compress_snapshot` | tensor_compress | Snapshot serialization |
| `vault_cipher` | tensor_vault | AES-256-GCM roundtrip |
| `checkpoint_state` | tensor_checkpoint | Checkpoint bincode |
| `storage_sparse_vector` | tensor_store | Sparse vector roundtrip |
| `slab_entity_index` | tensor_store | EntityIndex operations |
| `consistent_hash` | tensor_store | Consistent hash partitioner |
| `tcp_framing` | tensor_chain | TCP wire protocol codec |
| `membership` | tensor_chain | Cluster config serialization |
Adding a New Fuzz Target
- Create a target file in `fuzz/fuzz_targets/<name>.rs`:
```rust
#![no_main]

use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Your fuzzing code here
    if let Ok(input) = std::str::from_utf8(data) {
        let _ = my_crate::parse(input);
    }
});
```
- Add an entry to `fuzz/Cargo.toml`:
[[bin]]
name = "my_target"
path = "fuzz_targets/my_target.rs"
test = false
doc = false
bench = false
- Add seed corpus files to `fuzz/corpus/<name>/`:
mkdir -p fuzz/corpus/my_target
echo "valid input 1" > fuzz/corpus/my_target/seed1
echo "valid input 2" > fuzz/corpus/my_target/seed2
- Update the CI matrix in `.github/workflows/fuzz.yml`
Structured Fuzzing
For complex input types, use the `arbitrary` crate:

```rust
use arbitrary::Arbitrary;
use libfuzzer_sys::fuzz_target;

#[derive(Arbitrary, Debug)]
struct MyInput {
    field1: u32,
    field2: String,
}

fuzz_target!(|input: MyInput| {
    let _ = my_crate::process(&input);
});
```
Investigating Crashes
# View crash input
xxd artifacts/my_target/crash-xxx
# Minimize crash
cargo +nightly fuzz tmin my_target artifacts/my_target/crash-xxx
# Debug
cargo +nightly fuzz run my_target artifacts/my_target/crash-xxx -- -verbosity=2
CI Integration
Fuzz tests run in CI for 60 seconds per target. See `.github/workflows/fuzz.yml`.
Best Practices
- Add corpus seeds: Real-world inputs help fuzzer find paths
- Use structured fuzzing: For complex inputs
- Run locally: Before pushing changes to fuzzed code
- Minimize crashes: Smaller inputs are easier to debug
- Keep targets focused: One functionality per target
API Reference
This document provides detailed public API documentation for all Neumann crates. For auto-generated rustdoc, see Building Locally.
Table of Contents
- tensor_store - Core storage layer
- relational_engine - SQL-like tables
- graph_engine - Graph operations
- vector_engine - Embeddings and similarity
- tensor_chain - Distributed consensus
- neumann_parser - Query parsing
- query_router - Query execution
- tensor_cache - LLM response caching
- tensor_vault - Encrypted storage
- tensor_blob - Blob storage
- tensor_checkpoint - Snapshots
tensor_store
Core key-value storage with HNSW indexing, sparse vectors, and tiered storage.
Core Types
| Type | Description |
|---|---|
| `TensorStore` | Thread-safe key-value store with slab routing |
| `TensorData` | HashMap-based entity with typed fields |
| `TensorValue` | Field value: Scalar, Vector, Sparse, Pointer(s) |
| `ScalarValue` | Null, Bool, Int, Float, String, Bytes |
TensorStore
```rust
use tensor_store::{TensorStore, TensorData, TensorValue, ScalarValue};

let store = TensorStore::new();

// Store entity
let mut data = TensorData::new();
data.set("name", TensorValue::Scalar(ScalarValue::String("Alice".into())));
data.set("embedding", TensorValue::Vector(vec![0.1, 0.2, 0.3]));
store.put("user:1", data)?;

// Retrieve
let entity = store.get("user:1")?;
assert!(entity.has("name"));

// Check existence
store.exists("user:1"); // -> bool

// Delete
store.delete("user:1")?;

// Scan by prefix
let count = store.scan_count("user:");
```
TensorData
```rust
let mut data = TensorData::new();

// Set fields
data.set("field", TensorValue::Scalar(ScalarValue::Int(42)));

// Get fields
let value = data.get("field"); // -> Option<&TensorValue>

// Check field existence
data.has("field"); // -> bool

// Field names
let fields: Vec<&str> = data.keys().collect();
```
HNSW Index
Hierarchical Navigable Small World graph for approximate nearest neighbor search.
```rust
use tensor_store::{HNSWIndex, HNSWConfig, DistanceMetric};

// Create with config
let config = HNSWConfig {
    m: 16, // Connections per node
    ef_construction: 200,
    ef_search: 50,
    max_elements: 10000,
    distance_metric: DistanceMetric::Cosine,
    ..Default::default()
};
let index = HNSWIndex::new(128, config); // 128 dimensions

// Insert vector
index.insert("doc:1", &embedding)?;

// Search
let results = index.search(&query_vector, 10)?;
for (key, distance) in results {
    println!("{}: {}", key, distance);
}
```
Sparse Vectors
Memory-efficient sparse embeddings with 15+ distance metrics.
```rust
use tensor_store::SparseVector;

// Create from dense (auto-detects sparsity)
let sparse = SparseVector::from_dense(&[0.0, 0.5, 0.0, 0.3, 0.0]);

// Create from indices and values
let sparse = SparseVector::new(vec![1, 3], vec![0.5, 0.3], 5)?;

// Operations
let dense = sparse.to_dense();
let dot = sparse.dot(&other_sparse);
let cosine = sparse.cosine_similarity(&other_sparse);
```
Tiered Storage
Automatic hot/cold storage with mmap backing.
```rust
use tensor_store::{TieredStore, TieredConfig};
use std::path::Path;

let config = TieredConfig {
    hot_capacity: 10000,
    cold_path: Path::new("/data/cold").to_path_buf(),
    migration_threshold: 0.8,
    ..Default::default()
};
let store = TieredStore::new(config)?;

// Automatic migration based on access patterns
store.put("key", data)?;
let value = store.get("key")?;
```
Cache Ring
Fixed-size eviction cache with multiple strategies.
```rust
use tensor_store::{CacheRing, EvictionStrategy};

let cache = CacheRing::new(1000, EvictionStrategy::LRU);
cache.put("key", value);
let hit = cache.get("key"); // -> Option<V>

// Statistics
let stats = cache.stats();
println!("Hit rate: {:.2}%", stats.hit_rate() * 100.0);
```
Consistent Hash Partitioner
Partition routing with virtual nodes.
```rust
use tensor_store::{ConsistentHashPartitioner, ConsistentHashConfig};

let config = ConsistentHashConfig {
    virtual_nodes: 150,
    replication_factor: 3,
};
let partitioner = ConsistentHashPartitioner::new(config);
partitioner.add_node("node1");
partitioner.add_node("node2");

let partition = partitioner.partition("user:123");
let replicas = partitioner.replicas("user:123");
```
relational_engine
SQL-like table operations with SIMD-accelerated filtering.
Core Types
| Type | Description |
|---|---|
| `RelationalEngine` | Main engine with TensorStore backend |
| `Schema` | Table schema with column definitions |
| `Column` | Column name, type, nullability |
| `ColumnType` | Int, Float, String, Bool, Bytes, Json |
| `Value` | Typed query value |
| `Condition` | Composable filter predicate |
| `Row` | Row with ID and values |
Table Operations
```rust
use relational_engine::{RelationalEngine, Schema, Column, ColumnType};

let engine = RelationalEngine::new();

// Create table
let schema = Schema::new(vec![
    Column::new("name", ColumnType::String),
    Column::new("age", ColumnType::Int),
    Column::new("email", ColumnType::String).nullable(),
]);
engine.create_table("users", schema)?;

// Check existence
engine.table_exists("users")?; // -> bool

// List tables
let tables = engine.list_tables(); // -> Vec<String>

// Get schema
let schema = engine.get_schema("users")?;

// Row count
engine.row_count("users")?; // -> usize

// Drop table
engine.drop_table("users")?;
```
CRUD Operations
```rust
use relational_engine::{Condition, Value};
use std::collections::HashMap;

// INSERT
let mut values = HashMap::new();
values.insert("name".to_string(), Value::String("Alice".into()));
values.insert("age".to_string(), Value::Int(30));
let row_id = engine.insert("users", values)?;

// BATCH INSERT (59x faster)
let rows: Vec<HashMap<String, Value>> = vec![/* ... */];
let row_ids = engine.batch_insert("users", rows)?;

// SELECT with condition
let rows = engine.select("users", Condition::Ge("age".into(), Value::Int(18)))?;

// UPDATE
let mut updates = HashMap::new();
updates.insert("age".to_string(), Value::Int(31));
let count = engine.update(
    "users",
    Condition::Eq("name".into(), Value::String("Alice".into())),
    updates,
)?;

// DELETE
let count = engine.delete_rows("users", Condition::Lt("age".into(), Value::Int(18)))?;
```
Conditions
```rust
use relational_engine::{Condition, Value};

// Simple conditions
Condition::True;                             // Match all
Condition::Eq("col".into(), Value::Int(1));  // col = 1
Condition::Ne("col".into(), Value::Int(1));  // col != 1
Condition::Lt("col".into(), Value::Int(10)); // col < 10
Condition::Le("col".into(), Value::Int(10)); // col <= 10
Condition::Gt("col".into(), Value::Int(0));  // col > 0
Condition::Ge("col".into(), Value::Int(0));  // col >= 0

// Compound conditions
let cond = Condition::Ge("age".into(), Value::Int(18))
    .and(Condition::Lt("age".into(), Value::Int(65)));

let cond = Condition::Eq("status".into(), Value::String("active".into()))
    .or(Condition::Gt("priority".into(), Value::Int(5)));
```
Indexes
```rust
// Hash index (O(1) equality)
engine.create_index("users", "email")?;
engine.has_index("users", "email"); // -> bool
engine.drop_index("users", "email")?;

// B-tree index (O(log n) range)
engine.create_btree_index("users", "age")?;
engine.has_btree_index("users", "age"); // -> bool
engine.drop_btree_index("users", "age")?;

// List indexed columns
engine.get_indexed_columns("users");       // -> Vec<String>
engine.get_btree_indexed_columns("users"); // -> Vec<String>
```
Joins
```rust
// INNER JOIN
let joined = engine.join("users", "posts", "_id", "user_id")?;
// -> Vec<(Row, Row)>

// LEFT JOIN
let joined = engine.left_join("users", "posts", "_id", "user_id")?;
// -> Vec<(Row, Option<Row>)>

// RIGHT JOIN
let joined = engine.right_join("users", "posts", "_id", "user_id")?;
// -> Vec<(Option<Row>, Row)>

// FULL JOIN
let joined = engine.full_join("users", "posts", "_id", "user_id")?;
// -> Vec<(Option<Row>, Option<Row>)>

// CROSS JOIN
let joined = engine.cross_join("users", "posts")?;
// -> Vec<(Row, Row)>

// NATURAL JOIN
let joined = engine.natural_join("users", "profiles")?;
// -> Vec<(Row, Row)>
```
Aggregates
```rust
// COUNT
let count = engine.count("users", Condition::True)?;
let count = engine.count_column("users", "email", Condition::True)?;

// SUM
let total = engine.sum("orders", "amount", Condition::True)?;

// AVG
let avg = engine.avg("orders", "amount", Condition::True)?; // Option<f64>

// MIN/MAX
let min = engine.min("products", "price", Condition::True)?; // Option<Value>
let max = engine.max("products", "price", Condition::True)?;
```
Transactions
```rust
use relational_engine::{TransactionManager, TxPhase};

let tx_manager = TransactionManager::new();

// Begin transaction
let tx_id = tx_manager.begin();

// Check state
tx_manager.is_active(tx_id); // -> bool
tx_manager.get(tx_id);       // -> Option<TxPhase>

// Acquire row locks
tx_manager.lock_manager().try_lock(tx_id, &[
    ("users".to_string(), 1),
    ("users".to_string(), 2),
])?;

// Commit or rollback
tx_manager.set_phase(tx_id, TxPhase::Committed);
tx_manager.release_locks(tx_id);
tx_manager.remove(tx_id);
```
graph_engine
Directed graph operations with BFS traversal and shortest path.
Core Types
| Type | Description |
|---|---|
| `GraphEngine` | Main engine with TensorStore backend |
| `Node` | Graph node with label and properties |
| `Edge` | Directed edge with type and properties |
| `Direction` | Outgoing, Incoming, Both |
| `PropertyValue` | Null, Int, Float, String, Bool |
| `Path` | Sequence of nodes and edges |
Node Operations
```rust
use graph_engine::{GraphEngine, PropertyValue};
use std::collections::HashMap;

let engine = GraphEngine::new();

// Create node
let mut props = HashMap::new();
props.insert("name".to_string(), PropertyValue::String("Alice".into()));
let node_id = engine.create_node("person", props)?;

// Get node
let node = engine.get_node(node_id)?;
println!("{}: {:?}", node.label, node.properties);

// Update node
let mut updates = HashMap::new();
updates.insert("age".to_string(), PropertyValue::Int(30));
engine.update_node(node_id, updates)?;

// Delete node
engine.delete_node(node_id)?;

// Find nodes by label
let people = engine.find_nodes_by_label("person")?;
```
Edge Operations
```rust
use graph_engine::Direction;

// Create edge
let edge_id = engine.create_edge(from_id, to_id, "follows", HashMap::new())?;

// Get edge
let edge = engine.get_edge(edge_id)?;

// Get neighbors
let neighbors = engine.neighbors(node_id, Direction::Outgoing)?;
let neighbors = engine.neighbors(node_id, Direction::Incoming)?;
let neighbors = engine.neighbors(node_id, Direction::Both)?;

// Get edges
let edges = engine.edges(node_id, Direction::Outgoing)?;

// Delete edge
engine.delete_edge(edge_id)?;
```
Traversal
```rust
// BFS traversal
let visited = engine.bfs(start_id, |node| {
    // Return true to continue traversal
    true
})?;

// Shortest path (Dijkstra)
let path = engine.shortest_path(from_id, to_id)?;
if let Some(path) = path {
    for node_id in path.nodes {
        println!("-> {}", node_id);
    }
}
```
Property Indexes
```rust
use graph_engine::{IndexTarget, RangeOp};

// Create index on node property
engine.create_property_index(IndexTarget::Node, "age")?;

// Create index on edge property
engine.create_property_index(IndexTarget::Edge, "weight")?;

// Range query using index
let results = engine.find_by_range(
    IndexTarget::Node,
    "age",
    &PropertyValue::Int(18),
    RangeOp::Ge,
)?;
```
vector_engine
Embedding storage and similarity search.
Core Types
| Type | Description |
|---|---|
| `VectorEngine` | Main engine for embedding operations |
| `SearchResult` | Key and similarity score |
| `DistanceMetric` | Cosine, Euclidean, DotProduct |
| `FilterCondition` | Metadata filter (Eq, Ne, Lt, Gt, And, Or, In, etc.) |
| `FilterValue` | Filter value type (Int, Float, String, Bool, Null) |
| `FilterStrategy` | Filter strategy (Auto, PreFilter, PostFilter) |
| `FilteredSearchConfig` | Configuration for filtered search |
| `VectorCollectionConfig` | Configuration for collections |
| `MetadataValue` | Simplified metadata value type |
Basic Operations
```rust
use vector_engine::{VectorEngine, DistanceMetric};

let engine = VectorEngine::new();

// Store embedding (auto-detects sparse)
engine.store_embedding("doc:1", vec![0.1, 0.2, 0.3])?;

// Get embedding
let vector = engine.get_embedding("doc:1")?;

// Check existence
engine.exists("doc:1"); // -> bool

// Delete
engine.delete_embedding("doc:1")?;

// Count embeddings
engine.count(); // -> usize
```
Similarity Search
```rust
// Search similar embeddings
let query = vec![0.1, 0.2, 0.3];
let results = engine.search_similar(&query, 10)?;
for result in results {
    println!("{}: {:.4}", result.key, result.score);
}

// Search with metric
let results = engine.search_similar_with_metric(
    &query,
    10,
    DistanceMetric::Euclidean,
)?;
```
Filtered Search
```rust
use vector_engine::{FilterCondition, FilterValue, FilteredSearchConfig};

// Build filter
let filter = FilterCondition::Eq("category".into(), FilterValue::String("science".into()))
    .and(FilterCondition::Gt("year".into(), FilterValue::Int(2020)));

// Search with filter
let results = engine.search_similar_filtered(&query, 10, &filter, None)?;

// With explicit strategy
let config = FilteredSearchConfig::pre_filter();
let results = engine.search_similar_filtered(&query, 10, &filter, Some(config))?;
```
Metadata Storage
```rust
use tensor_store::TensorValue;
use std::collections::HashMap;

// Store embedding with metadata
let mut metadata = HashMap::new();
metadata.insert("category".into(), TensorValue::from("science"));
metadata.insert("year".into(), TensorValue::from(2024i64));
engine.store_embedding_with_metadata("doc:1", vec![0.1, 0.2], metadata)?;

// Get metadata
let meta = engine.get_metadata("doc:1")?;

// Update metadata
let mut updates = HashMap::new();
updates.insert("year".into(), TensorValue::from(2025i64));
engine.update_metadata("doc:1", updates)?;
```
Collections
```rust
use vector_engine::VectorCollectionConfig;

// Create collection
let config = VectorCollectionConfig::default()
    .with_dimension(768)
    .with_metric(DistanceMetric::Cosine);
engine.create_collection("documents", config)?;

// Store in collection
engine.store_in_collection("documents", "doc:1", vec![0.1; 768])?;

// Search in collection
let results = engine.search_in_collection("documents", &query, 10)?;

// List/delete collections
let collections = engine.list_collections();
engine.delete_collection("documents")?;
```
tensor_chain
Distributed consensus with Raft and 2PC transactions.
Core Types
| Type | Description |
|---|---|
| `Chain` | Block chain with graph-based linking |
| `Block` | Block with header and transactions |
| `Transaction` | Put, Delete, Update operations |
| `RaftNode` | Raft consensus state machine |
| `DistributedTxCoordinator` | 2PC transaction coordinator |
Chain Operations
```rust
use tensor_chain::{Chain, Transaction, Block};
use graph_engine::GraphEngine;
use std::sync::Arc;

let graph = Arc::new(GraphEngine::new());
let chain = Chain::new(graph, "node1".to_string());
chain.initialize()?;

// Create block
let builder = chain.new_block()
    .add_transaction(Transaction::Put {
        key: "user:1".into(),
        data: vec![1, 2, 3],
    })
    .add_transaction(Transaction::Delete {
        key: "user:0".into(),
    });
let block = builder.build();
chain.append(block)?;

// Query chain
let height = chain.height();
let block = chain.get_block(1)?;
```
Raft Consensus
```rust
use tensor_chain::{RaftNode, RaftConfig, RaftState};

let config = RaftConfig {
    election_timeout_min: 150,
    election_timeout_max: 300,
    heartbeat_interval: 50,
    ..Default::default()
};
let raft = RaftNode::new("node1".into(), config);

// State queries
raft.is_leader();    // -> bool
raft.current_term(); // -> u64
raft.state();        // -> RaftState

// Statistics
let stats = raft.stats();
```
Distributed Transactions
```rust
use tensor_chain::{DistributedTxCoordinator, DistributedTxConfig};

let config = DistributedTxConfig {
    prepare_timeout_ms: 5000,
    commit_timeout_ms: 5000,
    max_retries: 3,
    ..Default::default()
};
let coordinator = DistributedTxCoordinator::new(config);

// Begin distributed transaction
let tx_id = coordinator.begin()?;

// Prepare phase
coordinator.prepare(tx_id, keys, participants).await?;

// Commit phase
coordinator.commit(tx_id).await?;

// Or abort
coordinator.abort(tx_id).await?;
```
Membership Management
```rust
use tensor_chain::{MembershipManager, ClusterConfig, HealthConfig, LocalNodeConfig};

let config = ClusterConfig {
    local: LocalNodeConfig {
        id: "node1".into(),
        addr: "127.0.0.1:9000".parse()?,
    },
    peers: vec![],
    health: HealthConfig::default(),
};
let membership = MembershipManager::new(config);

// Add/remove nodes
membership.add_node("node2", "127.0.0.1:9001".parse()?)?;
membership.remove_node("node2")?;

// Health status
let health = membership.node_health("node2");
let status = membership.partition_status();
```
neumann_parser
Hand-written recursive descent parser for the Neumann query language.
Core Types
| Type | Description |
|---|---|
| `Statement` | Parsed statement with span |
| `StatementKind` | Select, Insert, Update, Delete, Node, Edge, etc. |
| `Expr` | Expression AST node |
| `Token` | Lexer token with span |
| `ParseError` | Error with source location |
Parsing
```rust
use neumann_parser::{parse, parse_all, parse_expr, tokenize};

// Parse single statement
let stmt = parse("SELECT * FROM users WHERE id = 1")?;

// Parse multiple statements
let stmts = parse_all("SELECT 1; SELECT 2")?;

// Parse expression only
let expr = parse_expr("1 + 2 * 3")?;

// Tokenize
let tokens = tokenize("SELECT id, name FROM users");
```
Error Handling
```rust
let result = parse("SELCT * FROM users");
if let Err(err) = result {
    // Format with source context
    let formatted = err.format_with_source("SELCT * FROM users");
    eprintln!("{}", formatted);

    // Access error details
    println!("Line: {}", err.line());
    println!("Column: {}", err.column());
}
```
Span Utilities
```rust
use neumann_parser::{line_number, line_col, get_line, BytePos};

let source = "SELECT\nFROM\nWHERE";

// Get line number (1-indexed)
let line = line_number(source, BytePos(7)); // -> 2

// Get line and column
let (line, col) = line_col(source, BytePos(7)); // -> (2, 1)

// Get line text
let text = get_line(source, BytePos(7)); // -> "FROM"
```
query_router
Unified query routing across all engines.
Core Types
| Type | Description |
|---|---|
| `QueryRouter` | Main router handling all query types |
| `QueryResult` | Result variants for different query types |
| `RouterError` | Error types from all engines |
Query Execution
```rust
use query_router::{QueryRouter, QueryResult};

let router = QueryRouter::new();

// Execute query
let result = router.execute("SELECT * FROM users")?;

match result {
    QueryResult::Rows(rows) => { /* relational result */ }
    QueryResult::Nodes(nodes) => { /* graph result */ }
    QueryResult::Similar(results) => { /* vector result */ }
    QueryResult::Success(msg) => { /* command result */ }
    _ => {}
}
```
Distributed Queries
```rust
use query_router::{QueryPlanner, MergeStrategy, ResultMerger};

let planner = QueryPlanner::new(partitioner);

// Plan distributed query
let plan = planner.plan("SELECT * FROM users WHERE region = 'us'")?;

// Execute on shards
let shard_results = execute_on_shards(&plan).await?;

// Merge results
let merger = ResultMerger::new(MergeStrategy::Union);
let final_result = merger.merge(shard_results)?;
```
tensor_cache
LLM response cache with exact and semantic matching.
Core Types
| Type | Description |
|---|---|
| `Cache` | Multi-layer LLM response cache |
| `CacheConfig` | Configuration for cache behavior |
| `CacheHit` | Successful lookup result |
| `CacheLayer` | Exact, Semantic, Embedding |
| `EvictionStrategy` | LRU, LFU, CostBased, Hybrid |
Operations
```rust
use tensor_cache::{Cache, CacheConfig, EvictionStrategy};

let mut config = CacheConfig::default();
config.embedding_dim = 384;
config.eviction_strategy = EvictionStrategy::Hybrid;
let cache = Cache::with_config(config)?;

// Store response
let embedding = vec![0.1, 0.2, /* ... */];
cache.put(
    "What is 2+2?",
    &embedding,
    "The answer is 4.",
    "gpt-4",
    None, // version
)?;

// Lookup (tries exact, then semantic)
if let Some(hit) = cache.get("What is 2+2?", Some(&embedding)) {
    println!("Response: {}", hit.response);
    println!("Layer: {:?}", hit.layer);
    println!("Cost saved: ${:.4}", hit.cost_saved);
}

// Statistics
let stats = cache.stats();
println!("Hit rate: {:.2}%", stats.hit_rate() * 100.0);
```
tensor_vault
Encrypted secret storage with graph-based access control.
Core Types
| Type | Description |
|---|---|
| `Vault` | Main vault API |
| `VaultConfig` | Configuration for security settings |
| `Permission` | Read, Write, Admin |
| `MasterKey` | Derived encryption key |
Operations
```rust
use tensor_vault::{Vault, VaultConfig, Permission};

let config = VaultConfig::default();
let vault = Vault::new(config)?;

// Store secret
vault.set("requester", "db/password", b"secret123", Permission::Admin)?;

// Get secret
let secret = vault.get("requester", "db/password")?;

// Grant access
vault.grant("admin", "user", "db/password", Permission::Read)?;

// Revoke access
vault.revoke("admin", "user", "db/password")?;

// List secrets
let secrets = vault.list("requester", "db/")?;

// Delete secret
vault.delete("admin", "db/password")?;
```
tensor_blob
S3-style object storage with content-addressable chunks.
Core Types
| Type | Description |
|---|---|
| `BlobStore` | Main blob storage API |
| `BlobConfig` | Configuration for chunk size, GC |
| `PutOptions` | Options for storing artifacts |
| `ArtifactMetadata` | Metadata for stored artifacts |
| `BlobWriter` | Streaming upload |
| `BlobReader` | Streaming download |
Operations
```rust
use tensor_blob::{BlobStore, BlobConfig, PutOptions};

let config = BlobConfig::default();
let store = BlobStore::new(tensor_store, config).await?;

// Store artifact
let artifact_id = store.put(
    "report.pdf",
    &file_bytes,
    PutOptions::new()
        .with_created_by("user:alice")
        .with_tag("quarterly"),
).await?;

// Get artifact
let data = store.get(&artifact_id).await?;

// Streaming upload
let mut writer = store.writer("large-file.bin", PutOptions::new()).await?;
writer.write(&chunk1).await?;
writer.write(&chunk2).await?;
let artifact_id = writer.finish().await?;

// Streaming download
let mut reader = store.reader(&artifact_id).await?;
let chunk = reader.read(1024).await?;

// Delete
store.delete(&artifact_id).await?;

// Metadata
let metadata = store.metadata(&artifact_id).await?;
```
tensor_checkpoint
Snapshot and rollback system.
Core Types
| Type | Description |
|---|---|
| `CheckpointManager` | Main checkpoint API |
| `CheckpointConfig` | Configuration for checkpoints |
| `DestructiveOp` | Delete, Update operations |
| `OperationPreview` | Preview of affected data |
| `ConfirmationHandler` | Custom confirmation logic |
Operations
```rust
use tensor_checkpoint::{CheckpointManager, CheckpointConfig, AutoConfirm};
use std::sync::Arc;

let config = CheckpointConfig::new()
    .with_max_checkpoints(10)
    .with_auto_checkpoint(true);
let manager = CheckpointManager::new(blob_store, config).await;
manager.set_confirmation_handler(Arc::new(AutoConfirm));

// Create checkpoint
let checkpoint_id = manager.create(Some("before-migration"), &store).await?;

// List checkpoints
let checkpoints = manager.list().await?;

// Restore from checkpoint
manager.restore(&checkpoint_id, &mut store).await?;

// Delete checkpoint
manager.delete(&checkpoint_id).await?;
```
Common Patterns
Error Handling
All crates use the Result type with crate-specific error enums:
```rust
use relational_engine::{RelationalEngine, RelationalError};

let result = engine.create_table("users", schema);
match result {
    Ok(()) => println!("Table created"),
    Err(RelationalError::TableAlreadyExists) => println!("Already exists"),
    Err(e) => eprintln!("Error: {}", e),
}
```
Thread Safety
All engines use parking_lot and DashMap for concurrent access:
```rust
use relational_engine::{RelationalEngine, Value};
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

let engine = Arc::new(RelationalEngine::new());

let handles: Vec<_> = (0..4).map(|i| {
    let engine = Arc::clone(&engine);
    thread::spawn(move || {
        let mut values = HashMap::new();
        values.insert("name".to_string(), Value::String(format!("user{}", i)));
        engine.insert("users", values).unwrap();
    })
}).collect();

for handle in handles {
    handle.join().unwrap();
}
```
Async Operations
tensor_blob, tensor_cache, and tensor_checkpoint use async APIs:
```rust
use tokio::runtime::Runtime;

let rt = Runtime::new()?;
rt.block_on(async {
    let store = BlobStore::new(tensor_store, config).await?;
    store.put("file.txt", &data, options).await?;
    Ok(())
})?;
```
Building Locally
Generate documentation from source:
# Basic documentation
cargo doc --workspace --no-deps --open
# With all features and private items
cargo doc --workspace --no-deps --all-features --document-private-items
# With scraped examples (nightly)
RUSTDOCFLAGS="--cfg docsrs" cargo +nightly doc \
-Zunstable-options \
-Zrustdoc-scrape-examples \
--all-features
Crate Documentation
After generating docs locally with `cargo doc`, you can browse documentation for:
- `tensor_store` - Core storage layer
- `relational_engine` - SQL-like tables
- `graph_engine` - Graph operations
- `vector_engine` - Vector similarity search
- `tensor_chain` - Distributed consensus
- `neumann_parser` - Query language parser
- `query_router` - Unified query execution
- `tensor_cache` - Multi-layer caching
- `tensor_vault` - Encrypted secrets
- `tensor_blob` - Blob storage
- `tensor_checkpoint` - Snapshot/restore