# NeuronBox documentation
Declarative local ML runs from one neuron.yaml: hashed virtualenvs, a global model store, optional OCI isolation, and a Unix-socket daemon for live sessions and stats.
## Introduction
NeuronBox is a local stack: the neuron CLI, neurond, newline-delimited JSON over a Unix socket, a terminal dashboard, and a shared model store under ~/.neuronbox/store. It is not a hosted cloud.
You describe the job once: model source (Hugging Face–style id, local tree, or file), Python stack, GPU expectations, and the script to run. neuron run resolves the environment, wires NEURONBOX_* variables, registers the process with the daemon, and executes your entrypoint.
- neuron with no subcommand → welcome / getting-started screen
- neuron help → full command list
## End-to-end flow

Typical journey from zero to a monitored run:

1. Build or install the neuron and neurond binaries (see Build & install).
2. Create a project with neuron init and edit neuron.yaml.
3. Fetch weights with neuron pull org/model when using Hub-style ids, or point the manifest at local paths.
4. Run with neuron run from the directory that contains the manifest (or -f path).
5. Observe with neuron dashboard or neuron stats while neurond is reachable on the default socket.
For long-lived workers that react to model changes without a full cold start, use neuron serve and neuron swap (see dedicated section).
## Build & install

From a clone of the repository, build the CLI and daemon:

```
$ cd NeuronBox
$ cargo build -p neuronbox-cli -p neuronbox-runtime --bin neurond
```

You need target/debug/neuron and target/debug/neurond available together, or set NEUROND_PATH to the daemon binary. Add target/debug to PATH for convenience.

```
$ ./target/debug/neuron
$ ./target/debug/neuron help
```

Install via Cargo from the cli crate (installs neuron; place neurond separately or use NEUROND_PATH):

```
$ cargo install --path cli
```

### Prerequisites

- Rust toolchain (workspace / rust-toolchain)
- Python 3 on PATH (align version with runtime.python in the manifest when possible)
- uv optional but recommended for faster installs
- GPU tooling optional: NVIDIA, AMD, or Apple Silicon; use neuron host inspect to check what is detected

Richer NVIDIA reporting when linked with NVML:

```
$ cargo build -p neuronbox-cli --features nvml
$ cargo build -p neuronbox-runtime --features nvml
```
## neuron.yaml manifest

The manifest is the single source of truth: model location, runtime (Python version, packages, optional CUDA index), GPU hints, entrypoint, optional env: for child processes, and runtime.mode (host vs oci).

JSON Schema lives in the repo at specs/neuron.yaml.schema.json. Example shape:

```yaml
model:
  source: hub
  name: org/model-id
runtime:
  python: "3.11"
  packages:
    - transformers
    - torch
gpu:
  min_vram: 12
entrypoint: scripts/run.py
```

For local weights, set model.source: local and model.name to a path; no pull step is required.
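To make the local-weights case concrete, here is a hedged sketch of such a manifest; the path is a hypothetical example, not a convention NeuronBox requires:

```yaml
model:
  source: local
  name: /data/models/my-model      # hypothetical path to a local weight tree
runtime:
  python: "3.11"
  packages:
    - transformers
    - torch
entrypoint: scripts/run.py
```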
## Models & neuron pull

neuron pull fetches ML artifacts into the global store: Hugging Face–style org/model, configured aliases, or a local path. It does not pull Docker images. Use Docker or neuron oci prepare for OCI rootfs workflows.

```
$ neuron pull mistralai/Mistral-7B-v0.1
```

Set HF_TOKEN in the environment for private Hub repositories. Resolved trees are exposed to your script via NEURONBOX_MODEL_DIR (and related variables); a single-file model uses NEURONBOX_MODEL_PATH when applicable.
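Inside your entrypoint, the resolved location can be read back from those variables. A minimal sketch — the fallback order here is an illustrative choice, not something NeuronBox prescribes:

```python
import os
from pathlib import Path

def resolve_model_location() -> Path:
    """Prefer a single-file model path, fall back to the model directory."""
    single_file = os.environ.get("NEURONBOX_MODEL_PATH")
    if single_file:
        return Path(single_file)
    model_dir = os.environ.get("NEURONBOX_MODEL_DIR")
    if model_dir:
        return Path(model_dir)
    raise RuntimeError("launch this script via `neuron run` so NEURONBOX_* is set")

# e.g. hand the result to a loader:
# model = AutoModel.from_pretrained(resolve_model_location())
```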
Shortcut: neuron run org/model with only a Hub-like argument performs a pull and prints where the model lives; you still need a proper neuron.yaml and entrypoint to execute project code.
## neuron run

From the directory containing neuron.yaml:

```
$ neuron run
```

Or point at another manifest:

```
$ neuron run -f path/to/neuron.yaml
```

Flags include --gpu (sets CUDA_VISIBLE_DEVICES), --vram (session record hint), and --oci to force the Docker OCI path when aligned with runtime.mode: oci.
neuron run resolves the model (pull if needed for Hub ids), ensures the hashed virtualenv exists, sets NEURONBOX_* variables, spawns the entrypoint, registers the child with neurond, and unregisters on exit. It tries to start the daemon if the socket is down; if stats / dashboard cannot connect, run neuron daemon in another terminal.
## How a run works

- Virtualenv: the path under store/envs/ is a hash of Python version, CUDA/ROCm extras, and package list. Same manifest shape ⇒ same environment. Optional requirements.lock and neuron lock for pinned installs.
- Installer: prefers uv pip install when uv is on PATH; otherwise pip.
- Soft VRAM check: if gpu.min_vram is set and the host reports GPU memory, neuron run can warn when estimates look tight (non-blocking).
- Child environment: inherited PYTHONPATH is stripped unless you set it under env: in the manifest (avoids IDE-injected paths breaking venv numpy/torch).
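The hashed-environment idea can be sketched as follows. The exact fields and digest NeuronBox uses are not documented here, so treat this only as an illustration of why identical manifest shapes resolve to one shared virtualenv:

```python
import hashlib
from pathlib import Path

def env_path(store: Path, python: str, extras: tuple, packages: list) -> Path:
    """Derive a stable directory name from the environment-defining inputs.

    Sorting makes the digest order-insensitive, so two manifests listing
    the same dependencies in a different order share one environment.
    """
    key = "\n".join([python, *sorted(extras), *sorted(packages)])
    digest = hashlib.sha256(key.encode()).hexdigest()[:16]
    return store / "envs" / digest

store = Path("~/.neuronbox/store").expanduser()
a = env_path(store, "3.11", ("cu121",), ["torch", "transformers"])
b = env_path(store, "3.11", ("cu121",), ["transformers", "torch"])
# a == b: same manifest shape, same environment
```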
## Daemon & sessions
neurond keeps an in-memory registry of sessions (name, PID, estimated VRAM, optional tokens_per_sec). neuron run sends register_session after spawn and unregister_session on exit.
To refresh throughput for the dashboard, send another register_session line with the same PID and an updated tokens_per_sec. See specs/daemon-sessions.md in the repository. neuron run exports NEURONBOX_SESSION_NAME and NEURONBOX_SESSION_VRAM_MB so your script can match the initial registration.
Default socket: ~/.neuronbox/neuron.sock, overridable with NEURONBOX_SOCKET.
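The wire format is newline-delimited JSON over that socket, so a throughput refresh is one line per update. In this sketch the payload key names are assumptions for illustration — specs/daemon-sessions.md in the repository is the authoritative schema:

```python
import json
import os
import socket

SOCKET = os.environ.get("NEURONBOX_SOCKET",
                        os.path.expanduser("~/.neuronbox/neuron.sock"))

def session_line(name: str, pid: int, tokens_per_sec: float) -> bytes:
    """Build one NDJSON register_session message (key names assumed here)."""
    msg = {
        "type": "register_session",
        "name": name,
        "pid": pid,
        "tokens_per_sec": tokens_per_sec,
    }
    return (json.dumps(msg) + "\n").encode()

def refresh_throughput(tps: float) -> None:
    """Re-register the current process with an updated tokens_per_sec."""
    name = os.environ.get("NEURONBOX_SESSION_NAME", "session")
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(SOCKET)
        s.sendall(session_line(name, os.getpid(), tps))
```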
## Dashboard & stats

neuron dashboard is a full-screen TUI: real stats from the daemon, host/GPU probe, ~10 Hz UI refresh; throughput history is drawn client-side (not stored in the daemon).

```
$ neuron dashboard
```

neuron dashboard --demo (Unix) shows synthetic sessions, animated tok/s, and a mock swap model; quit with q or Esc. For cosmetic gauges on real hardware without fake sessions, see NEURONBOX_DEMO_SYNTHETIC_METRICS in docs/CLI_UX.md.

```
$ neuron stats
```

Plain-text snapshot of sessions and GPU summary.
## serve & swap

neuron serve runs a long-lived worker with the same virtualenv resolution as neuron run, suitable for loops that watch swap_signal.json.

```
$ neuron serve
$ neuron serve -f path/to/neuron.yaml
```

neuron swap MODEL updates daemon-side logical state and writes ~/.neuronbox/swap_signal.json (versioned schema in specs/swap-signal.schema.json).
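A serve loop can consume the swap file by polling its modification time. This sketch assumes only that the file is JSON; the real, versioned field layout is defined in specs/swap-signal.schema.json:

```python
import json
import time
from pathlib import Path

SWAP_FILE = Path("~/.neuronbox/swap_signal.json").expanduser()

def read_swap_signal(path=SWAP_FILE):
    """Return the parsed signal dict, or None if no swap has been requested."""
    if not path.exists():
        return None
    return json.loads(path.read_text())

def watch(poll_s: float = 1.0):
    """Yield the parsed signal each time the file's mtime advances."""
    last = 0.0
    while True:
        if SWAP_FILE.exists():
            mtime = SWAP_FILE.stat().st_mtime
            if mtime > last:
                last = mtime
                yield read_swap_signal()
        time.sleep(poll_s)
```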
```
$ neuron swap org/model
```

## CLI reference
| Command | Role |
|---|---|
| neuron | Welcome screen |
| neuron help | Full help |
| neuron init | Create neuron.yaml in cwd |
| neuron pull <id> | Fetch model into store |
| neuron run | Run entrypoint from manifest |
| neuron run -f FILE | Alternate manifest path |
| neuron run --gpu 0 | CUDA_VISIBLE_DEVICES for child |
| neuron run --oci | Force OCI / Docker path |
| neuron serve | Long-lived worker + swap signal |
| neuron swap MODEL | Daemon active model + swap file |
| neuron stats | Text snapshot |
| neuron dashboard | Full-screen TUI |
| neuron host inspect | JSON HostSnapshot |
| neuron gpu list | Detected GPUs |
| neuron model list | Store index |
| neuron lock | Write requirements.lock into hashed env |
| neuron daemon | Run neurond in foreground |
| neuron oci prepare | Runc bundle (Docker export) |
| neuron oci runc | Run runc against bundle |
neuron pull is not for Docker image tags. Legacy neuron ps / stop / rm style flows are removed; use docker directly for generic containers. NeuronBox keeps Docker under neuron oci … and neuron run --oci.
## Environment variables
| Variable | Purpose |
|---|---|
| NEURONBOX_SOCKET | Unix socket for neurond |
| NEUROND_PATH | Path to neurond if not beside neuron |
| HF_TOKEN | Authenticated Hub downloads |
| NEURONBOX_DEMO_SYNTHETIC_METRICS | Extra synthetic styling in dashboard |
| NEURONBOX_DISABLE_VRAM_WATCH | Disables daemon VRAM watch (e.g. demo) |
Per-project secrets and flags can be set in neuron.yaml → env: for run / serve children.
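For shape only, an env: block might look like this; the keys below are invented examples, and real secrets are better injected at runtime than committed to the manifest:

```yaml
env:
  HF_TOKEN: "set-me-at-runtime"   # example key; value here is a placeholder
  PYTHONPATH: "src"               # deliberately re-enable a stripped variable
```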
## OCI & Docker
NeuronBox is not a Docker replacement. For hard isolation, set runtime.mode: oci in the manifest and use neuron run --oci (Linux + NVIDIA for typical GPU containers). Docker is used on that path for mounts and the NVIDIA toolkit instead of hand-written docker run glue.
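The manifest side of that switch is small; a hedged sketch (only model.source, runtime.mode, and entrypoint appear in this page's examples — other fields follow the schema in specs/neuron.yaml.schema.json):

```yaml
model:
  source: hub
  name: org/model-id
runtime:
  python: "3.11"
  mode: oci          # host execution is the default; oci routes through Docker
entrypoint: scripts/run.py
```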
See docs/OCI_AND_DOCKER.md in the repository for bundle preparation and runc usage.
Canonical source: project README and specs/. This page is an annex to the marketing site. For the latest detail, clone the NeuronBox repository.