TrinoX architecture

Standalone Rust engine with a Trino-compatible surface

TrinoX is a standalone Rust query engine that exposes a Trino-compatible HTTP protocol and includes an in-tree distributed execution path. Clients, wire protocol, and coordinator semantics stay familiar; the performance-critical path is native columnar execution over open table formats.

Summary

Standalone Rust SQL engine + Trino-compatible HTTP protocol + native vectorized execution + local Parquet/Iceberg catalog + optional distributed coordinator/worker runtime.

High-level shape

[Diagram: hand-drawn-style overview inspired by Excalidraw. Colors follow the query path: ingress → coordinator → compile → plan → execution → output.]

Crate split

trino-parser

SQL parsing

Wrapper around sqlparser-rs for the Trino dialect surface.

trino-catalog

Table discovery

Connectors for Parquet, Iceberg, Delta, and related metadata chains.

trino-planner

Planning

Analyzer, optimizer, physical planning, and distributed stage planning.

trino-server

Runtime hub

Executor, CLI, HTTP server, distributed scheduler, worker, exchange, auth, cache, metrics, spill, and fault-tolerant execution (FTE) plumbing.

trino-node

Node entrypoint

Small node-local binary for worker/coordinator roles.

Layout

`trinox/crates/`

All production crates live under a single workspace tree; the demo site you are using mounts the HTTP surface from the integrated service.

Execution flow

SQL enters through CLI, REPL, or the Trino HTTP protocol (/v1/statement).
Parser and analyzer resolve names, types, subqueries, and expressions.
Planner builds logical and physical plans: joins, aggregates, scans, sorts, limits, and, when needed, distributed stage fragments.
Executor runs native columnar pipelines over table data.
Catalog layer discovers data from local Parquet directories and Iceberg metadata chains.
Output returns as CLI text or Trino wire-compatible JSON.
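
To make the HTTP leg of this flow concrete, here is a minimal sketch of how a Trino-protocol client collects results: it POSTs SQL to /v1/statement, then follows the `nextUri` field of each JSON response until the server stops returning one. The `Page` type, the `fetch` callback (a stand-in for an HTTP GET plus JSON decoding), and the example URIs are all hypothetical and simplified; nothing here is taken from TrinoX's own code.

```rust
/// Hypothetical, simplified view of one response page from
/// POST /v1/statement or from a follow-up GET on `nextUri`.
#[derive(Debug, Clone)]
struct Page {
    /// Simplified: every value is rendered as text.
    rows: Vec<Vec<String>>,
    /// Absent once the query has finished.
    next_uri: Option<String>,
}

/// Follow `nextUri` links until none is returned, collecting rows.
/// `fetch` stands in for "HTTP GET this URI and decode the JSON page".
fn drain_pages<F>(first: Page, mut fetch: F) -> Vec<Vec<String>>
where
    F: FnMut(&str) -> Page,
{
    let mut rows = first.rows;
    let mut next = first.next_uri;
    while let Some(uri) = next {
        let page = fetch(&uri);
        rows.extend(page.rows);
        next = page.next_uri;
    }
    rows
}

/// In-memory stand-in for a server that answers with two follow-up pages.
/// The URIs are illustrative, not the real Trino URI layout.
fn demo_rows() -> Vec<Vec<String>> {
    let first = Page {
        rows: vec![],
        next_uri: Some("/v1/statement/q1/1".to_string()),
    };
    drain_pages(first, |uri| match uri {
        "/v1/statement/q1/1" => Page {
            rows: vec![vec!["1".to_string()]],
            next_uri: Some("/v1/statement/q1/2".to_string()),
        },
        _ => Page {
            rows: vec![vec!["2".to_string()]],
            next_uri: None,
        },
    })
}
```

Calling `demo_rows()` walks both follow-up pages and returns the rows in arrival order. A real client would also send the user header, surface the `error` field, and back off between polls.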

Native operator layer

The performance-critical path is native Rust/SIMD execution. Internally, execution is columnar and batch-oriented.

Scan & filter

Scan + predicate pushdown
SIMD numeric filters
Varchar equality / LIKE / IN-list filters
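
As an illustration of the batch-oriented filter style listed above, the following sketch evaluates a numeric predicate over one column and produces a selection vector (the indices of matching rows). It is plain scalar Rust written without a data-dependent branch, which is the loop shape SIMD kernels are commonly built from; it is not explicit SIMD and is not TrinoX's actual kernel. The names `filter_gt` and `gather` are hypothetical.

```rust
/// Evaluate `value > threshold` over one column batch and return the
/// indices of the matching rows (a selection vector).
fn filter_gt(values: &[i64], threshold: i64) -> Vec<u32> {
    let mut selected = vec![0u32; values.len()];
    let mut count = 0usize;
    for (i, &v) in values.iter().enumerate() {
        // Always write the index, then advance the cursor only on a match.
        // This avoids a data-dependent branch in the loop body.
        selected[count] = i as u32;
        count += (v > threshold) as usize;
    }
    selected.truncate(count);
    selected
}

/// Apply a selection vector to another column of the same batch.
fn gather(column: &[i64], selection: &[u32]) -> Vec<i64> {
    selection.iter().map(|&i| column[i as usize]).collect()
}
```

For example, `filter_gt(&[5, 12, 7, 30], 6)` selects rows 1, 2, and 3, and `gather` then pulls those rows from any other column in the batch. Keeping the filter result as indices lets later operators skip non-matching rows without copying every column.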

Compute & aggregate

Projection and expression evaluation
Hash aggregation (Swiss-table-style)
Sort / top-N
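
Hash aggregation can be sketched as follows, assuming a single string grouping key and SUM/COUNT aggregates. The sketch uses the standard library `HashMap`, which is itself a Swiss-table design (it is backed by the hashbrown crate); whether TrinoX uses that table or its own is not stated here. The function names are hypothetical.

```rust
use std::collections::HashMap;

/// Accumulate `SELECT key, SUM(value), COUNT(*) ... GROUP BY key` for one
/// batch into `state`, so the function can be called once per batch.
fn aggregate_batch(
    state: &mut HashMap<String, (i64, u64)>,
    keys: &[&str],
    values: &[i64],
) {
    assert_eq!(keys.len(), values.len(), "columns in a batch share a length");
    for (k, &v) in keys.iter().zip(values) {
        // Allocating a String per row keeps the sketch short; a real engine
        // would hash the key bytes in place instead.
        let entry = state.entry((*k).to_string()).or_insert((0, 0));
        entry.0 += v; // running SUM
        entry.1 += 1; // running COUNT
    }
}

/// Emit the groups in key order so the output is deterministic.
fn sorted_groups(state: &HashMap<String, (i64, u64)>) -> Vec<(String, i64, u64)> {
    let mut out: Vec<_> = state
        .iter()
        .map(|(k, &(sum, count))| (k.clone(), sum, count))
        .collect();
    out.sort();
    out
}
```

Feeding the batches `(["a", "b", "a"], [1, 2, 3])` and `(["b"], [10])` through `aggregate_batch` leaves group `a` with sum 4 and count 2, and group `b` with sum 12 and count 2.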

Join & spill

Hash join build/probe
Semi/anti/outer join variants
Spill-aware execution paths
Exchange serialization for distributed mode
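
The build/probe split and the semi/anti variants listed above can be sketched under simplifying assumptions: a single non-null `i64` join key, everything in memory, and no spill. All function names are hypothetical, and SQL NULL handling is deliberately left out.

```rust
use std::collections::HashMap;

/// Build phase: index the build side by join key.
/// Each key maps to the build-side row numbers that carry it.
fn build(build_keys: &[i64]) -> HashMap<i64, Vec<usize>> {
    let mut table: HashMap<i64, Vec<usize>> = HashMap::new();
    for (row, &key) in build_keys.iter().enumerate() {
        table.entry(key).or_default().push(row);
    }
    table
}

/// Probe phase for an inner join: emit (probe_row, build_row) pairs.
fn probe_inner(
    table: &HashMap<i64, Vec<usize>>,
    probe_keys: &[i64],
) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    for (p, key) in probe_keys.iter().enumerate() {
        if let Some(rows) = table.get(key) {
            out.extend(rows.iter().map(|&b| (p, b)));
        }
    }
    out
}

/// Semi join keeps probe rows that have a match; anti join keeps the rest.
fn probe_semi_anti(
    table: &HashMap<i64, Vec<usize>>,
    probe_keys: &[i64],
    anti: bool,
) -> Vec<usize> {
    probe_keys
        .iter()
        .enumerate()
        .filter(|(_, key)| table.contains_key(*key) != anti)
        .map(|(p, _)| p)
        .collect()
}
```

With build keys `[1, 2, 2]` and probe keys `[2, 3, 1]`, the inner join pairs probe row 0 with build rows 1 and 2 and probe row 2 with build row 0. The semi join keeps probe rows 0 and 2, and the anti join keeps probe row 1. The same hash table serves all three variants; only the probe side differs.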

Server and distributed mode

trino-server supports both single-node and clustered operation:

Single node

HTTP server compatible with Trino clients
Coordinator-owned query lifecycle
Local pipeline execution end-to-end

Distributed

Worker registration and heartbeats
Stage DAG scheduling and split placement
Cross-process exchange (repartition / broadcast / local)
Partial and final aggregation shapes
Broadcast/repartition join planning
Distributed top-N / final merge stages
Cancellation, cleanup, retry policy, exchange spool / FTE
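
The partial and final aggregation shapes can be illustrated with AVG, assuming each worker pre-aggregates its own splits and a final stage merges the results. The key point is that the partial state is a mergeable (sum, count) pair, not a finished average. The types and function names below are hypothetical, and the exchange between stages is reduced to passing a map by value.

```rust
use std::collections::HashMap;

/// Partial state for AVG: (sum, count). Two partial states can be merged
/// by adding componentwise; two finished averages cannot.
type AvgState = (i64, u64);

/// Partial stage: runs on each worker over its local rows.
fn partial_avg(keys: &[u32], values: &[i64]) -> HashMap<u32, AvgState> {
    let mut state = HashMap::new();
    for (&k, &v) in keys.iter().zip(values) {
        let e = state.entry(k).or_insert((0i64, 0u64));
        e.0 += v;
        e.1 += 1;
    }
    state
}

/// Final stage, step 1: merge one worker's partial state into the total.
fn merge_partial(into: &mut HashMap<u32, AvgState>, partial: HashMap<u32, AvgState>) {
    for (k, (sum, count)) in partial {
        let e = into.entry(k).or_insert((0, 0));
        e.0 += sum;
        e.1 += count;
    }
}

/// Final stage, step 2: turn merged states into averages, ordered by key.
fn finish_avg(state: &HashMap<u32, AvgState>) -> Vec<(u32, f64)> {
    let mut out: Vec<(u32, f64)> = state
        .iter()
        .map(|(&k, &(sum, count))| (k, sum as f64 / count as f64))
        .collect();
    out.sort_by_key(|&(k, _)| k);
    out
}
```

If one worker sees values 10 and 20 for key 1 and another sees 30, merging the states gives (60, 3) and an average of 20.0. Averaging the two per-worker averages (15 and 30) would give 22.5, which is why the partial stage ships sums and counts.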

Agent factory (how TrinoX is built)

DataFlowsAI ships systems through a continuous SPEC → IMPLEMENT → VERIFY → FUZZ → SHIP pipeline. The pipeline page simulates one run; production development follows the same artifact boundaries.

Continuous

AI agent factory

Upstream docs, RFCs, and tests become typed specs; planner, coder, and reviewer agents implement them; property, differential, and formal verification gate releases; SQLancer-style fuzzing and a DST soak run before ship.

Customer surface

No client change

trino-cli, JDBC, Python trino, dbt, and BI tools keep using the same wire protocol; the only change is pointing the coordinator URL at TrinoX.