Loom is in pilot intake · these docs cover the full data-science lifecycle CLI

Configuration & providers

Everything in Loom is configured by flags + environment variables (with an optional .env or --config YAML). Loom is built as four swappable ports — search, MLOps, model-builder, and model — and each is selected the same way: a flag, an env var, and a sensible default. Secrets are read only from the environment, never passed on the command line.

How configuration works

A single LoomConfig is the source of truth for provider selection, model routing, and budget. It is resolved with a clear precedence, where later layers override earlier ones:

1. built-in defaults
2. --config YAML
3. environment variables
4. command-line flags

So a flag always wins over an env var, which wins over YAML, which wins over the built-in default. Every verb also self-documents its flags:

loom <verb> --help        # lists every flag for that verb
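Returning to the precedence rule, it can be pictured as a layered lookup. The following is a minimal sketch, not Loom's implementation: the resolve() helper, the DEFAULTS dict, and the short key names are hypothetical, and only the ordering (flag > env var > YAML > default) is taken from the text.

```python
from collections import ChainMap

# Hypothetical layer contents for illustration; only the ordering is documented.
DEFAULTS = {"search": "aide", "mlops": "metaflow", "steps": 10}


def resolve(flags: dict, env: dict, yaml_cfg: dict) -> dict:
    """Merge layers so that flag > env var > YAML > built-in default."""
    # A flag that was not passed (None) must not mask lower layers.
    set_flags = {k: v for k, v in flags.items() if v is not None}
    # ChainMap searches its maps left to right, so the first hit wins.
    return dict(ChainMap(set_flags, env, yaml_cfg, DEFAULTS))


config = resolve(
    flags={"steps": 25, "mlops": None},   # --steps 25; --mlops not passed
    env={"mlops": "local", "steps": 5},   # e.g. LOOM_MLOPS_PROVIDER=local
    yaml_cfg={"search": "aide", "mlops": "metaflow"},
)
print(config["steps"], config["mlops"], config["search"])
```

Here the flag supplies steps, the environment supplies mlops, and YAML supplies search, which matches the ordering stated above.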

Secrets are never hardcoded, never stored on the config object, and never put on the command line. Model providers read the matching API key (ANTHROPIC_API_KEY, OPENAI_API_KEY, NVIDIA_API_KEY, OPENROUTER_API_KEY, …) from the environment at the point of use, and Loom only ever passes through what you have already set.
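One way to picture "read at the point of use" is a config object that stores only the name of the environment variable. This is a sketch under assumptions: ProviderConfig, api_key(), MissingKeyError, and the header name are illustrative and are not taken from Loom's code.

```python
import os


class MissingKeyError(RuntimeError):
    """Raised with an actionable hint instead of a bare KeyError."""


def api_key(env_var: str) -> str:
    """Read a secret from the environment only when a call needs it."""
    value = os.environ.get(env_var)
    if not value:
        raise MissingKeyError(f"set {env_var} in your environment")
    return value


class ProviderConfig:
    """Hypothetical config: holds the NAME of the key variable, never the key."""

    def __init__(self, key_env_var: str) -> None:
        self.key_env_var = key_env_var

    def auth_header(self) -> dict:
        # The secret lives only in this return value, not on the object.
        return {"x-api-key": api_key(self.key_env_var)}
```

Because the object never holds the secret, printing or serializing the config cannot leak it.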

Environment variables & flags

The knobs you will reach for most. Each pairs an env var with its flag (where one exists), the default, and what it controls:

| Env var | Flag | Default | Meaning |
| --- | --- | --- | --- |
| LOOM_SEARCH_PROVIDER | --search | aide | search ("brain") provider: the ML-iteration engine |
| LOOM_MLOPS_PROVIDER | --mlops | metaflow | execution ("muscle") provider; local is a Metaflow-free dev path |
| LOOM_MODEL_BUILDER_PROVIDER | | nemo | training-engine provider: nemo (AutoModel) or local today; deepspeed / fsdp / megatron planned |
| LOOM_GPU_TARGET | | (none) | GPU target for train --launch; unset ⇒ a clean refusal, never a launch |
| LOOM_CODE_PROVIDER | --code-provider | anthropic-api | model ("LLM backend") provider for the code role (writes solutions) |
| LOOM_FEEDBACK_PROVIDER | --feedback-provider | anthropic-api | model provider for the feedback/judge role (scores solutions) |
| | --model-provider | | shorthand that sets both roles at once |
| LOOM_BUDGET_STEPS | --steps | 10 | number of search steps (LLM calls + real executions) |
| LOOM_CODE_MODEL | | claude-sonnet-4-5 | model name the code role writes solutions with |
| LOOM_FEEDBACK_MODEL | | claude-sonnet-4-5 | model name the judge reviews/scores with |
| METAFLOW_PROFILE | | (Metaflow default) | your Metaflow endpoint; point Loom at your own perimeter (BYO datastore) |
| LOOM_PYTHON | | (set by installer) | path to the engine interpreter (<repo>/.venv/bin/python); must stay exported |

LOOM_PYTHON must stay exported (point it at <repo>/.venv/bin/python) or the lifecycle verbs can't find the engine. It is set up for you by the installer; if you move the repo or open a fresh shell without your profile, re-export it. See Getting started.

Endpoints & the model knobs

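Two providers point at OpenAI-compatible endpoints, configured by env var rather than a flag:

- nim: endpoint via OPENAI_BASE_URL (default integrate.api.nvidia.com)
- openai-compat: endpoint via LOOM_MODEL_BASE_URL (LiteLLM / vLLM / Ollama)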

The budget is more than steps: it carries steps, draft count, debug probability, and max debug depth — but --steps / LOOM_BUDGET_STEPS is the one you set day to day. Start small and raise it once a run works.
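To make the shape of the budget concrete, here is a minimal sketch of a four-field budget record. The class name, field names, and every default except steps=10 are assumptions for illustration; the text only confirms which four quantities the budget carries and that LOOM_BUDGET_STEPS sets the step count.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Budget:
    steps: int = 10            # documented default for --steps / LOOM_BUDGET_STEPS
    drafts: int = 3            # placeholder value, not Loom's real default
    debug_prob: float = 0.5    # placeholder value, not Loom's real default
    max_debug_depth: int = 2   # placeholder value, not Loom's real default

    def __post_init__(self) -> None:
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        if not 0.0 <= self.debug_prob <= 1.0:
            raise ValueError("debug_prob must be a probability in [0, 1]")


def budget_from_env(env: Mapping[str, str] = os.environ) -> Budget:
    """Override only the step count, the knob set day to day."""
    raw = env.get("LOOM_BUDGET_STEPS")
    return Budget(steps=int(raw)) if raw else Budget()
```

Starting small then amounts to exporting a low LOOM_BUDGET_STEPS and raising it once a run works.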

Swappable ports

Loom is ports & adapters: every verb runs through an interface, never a concrete backend, so any layer is drop-in replaceable. There are four seams, each with a default adapter and a one-line way to swap it.

| Port | Role | Selected by | Default · adapters |
| --- | --- | --- | --- |
| Search ("brain") | the ML-iteration / optimize engine: propose, run, score candidates | --search · LOOM_SEARCH_PROVIDER | aide (tree-search) |
| MLOps ("muscle") | data objects · flows · runs · @card · deploy/ops | --mlops · LOOM_MLOPS_PROVIDER | metaflow · local |
| Model-builder ("training engine") | train: pretrain · finetune · embed · serve | LOOM_MODEL_BUILDER_PROVIDER | nemo (AutoModel) · local |
| Model ("LLM backend") | which model the brain talks to, and how it's authenticated | --model-provider / --code-provider / --feedback-provider | anthropic-api · seven more |
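The ports-and-adapters idea can be sketched with a structural interface. This is an illustration only: SearchPort, its propose() method, and the EchoSearch adapter are hypothetical names, not Loom's real interfaces.

```python
from typing import Protocol


class SearchPort(Protocol):
    """Hypothetical port: what a verb is allowed to know about a search backend."""

    def propose(self, goal: str) -> str: ...


class EchoSearch:
    """Toy adapter: satisfies SearchPort structurally, with no inheritance."""

    def propose(self, goal: str) -> str:
        return f"candidate for: {goal}"


def run_verb(search: SearchPort, goal: str) -> str:
    # The verb depends only on the interface, so any adapter drops in.
    return search.propose(goal)


print(run_verb(EchoSearch(), "beat baseline"))
```

Swapping a backend then means passing a different object that has the same methods; the verb itself does not change.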

Search — the brain (AIDE)

The search provider runs the ML-iteration loop behind loom run / /loom-optimize: it proposes candidate solutions, runs each one for real through the MLOps port, scores the result against your metric, and returns the best. The default adapter is AIDE tree-search (pinned by commit for reproducibility). Swap it with --search.

MLOps — the muscle (Metaflow)

The execution provider runs the work and owns the datastore. There are two built-in adapters, and the choice changes what's reachable:

| | --mlops local | --mlops metaflow (default) |
| --- | --- | --- |
| Setup | none (just an LLM key) | a Metaflow profile + datastore |
| Covers | the loom run search only | the whole lifecycle (eda, features, train, validate, deploy, …) |
| Each candidate | runs in-process | runs as a versioned Metaflow run |
| Input | a local --data dir | a Metaflow data object (--dataset <pathspec>) |

The local path is the on-ramp — no Metaflow, no GPU, no cloud. The lifecycle verbs need --mlops metaflow because they run flows, not in-process candidates. Where artifacts physically live (local or S3/minio) is an implementation detail Metaflow owns, configured through METAFLOW_PROFILE / METAFLOW_* — see the data model.

Model-builder — the training engine (NeMo)

The model-builder is the heavy training/serving backend behind loom train. You state intent in data-science vocabulary — --objective {next-event|masked-field|contrastive}, --budget {probe|small|full}, --backbone, --metric — and the adapter compiles that to backend config; no .nemo/Megatron nouns ever surface. The default is NeMo AutoModel; the seam is swappable, with DeepSpeed, PyTorch FSDP, Megatron, and HF Accelerate as planned peers (they compare with each other as distributed-training engines — not with Metaflow, which orchestrates the GPU step that runs whichever one you pick).

Select the builder with LOOM_MODEL_BUILDER_PROVIDER. The torch-free CPU local stand-in actually builds a backbone end-to-end (a real PPMI + SVD embedding model) — no GPU, sub-2s, deterministic — which makes it ideal for dev and CI:

# No GPU target set: refuses cleanly, shows the cost plan, never launches.
loom train --dataset IngestDataset/123 --objective next-event --budget full
#   STATUS : REFUSED_NO_GPU_TARGET  (set LOOM_GPU_TARGET, or use the CPU stand-in)

# The CPU stand-in builds a backbone for real — dev + CI.
LOOM_MODEL_BUILDER_PROVIDER=local \
  loom train --dataset IngestDataset/123 --objective next-event --budget probe
#   STATUS : BUILT  (-> a backbone pathspec + @card)

train sits in the expensive, always-gated tier. With no LOOM_GPU_TARGET it refuses rather than launching; with a target set, the real heavy GPU launch is still off by default behind --launch, and the @card shows the cost PLAN (GPU-count · hours · $) first. Loom hides Megatron parallelism, but never the fact that a run costs $X and takes Y hours.
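The gating rules for train can be summarized as a small decision function. This is a sketch under assumptions: gate_train() and TrainDecision are hypothetical, the PLAN_ONLY and LAUNCHED status names are invented for illustration, and "a100" in the usage is a made-up target value. Only REFUSED_NO_GPU_TARGET and BUILT appear in the documented output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainDecision:
    status: str
    launches_gpu: bool


def gate_train(gpu_target: Optional[str], builder: str, launch: bool) -> TrainDecision:
    """Decide what `train` may do before any GPU money is spent."""
    if builder == "local":
        # The CPU stand-in builds for real and never needs a GPU target.
        return TrainDecision("BUILT", launches_gpu=False)
    if not gpu_target:
        # No LOOM_GPU_TARGET: refuse cleanly, never launch.
        return TrainDecision("REFUSED_NO_GPU_TARGET", launches_gpu=False)
    if not launch:
        # Target set but no --launch: show the cost plan only (hypothetical status).
        return TrainDecision("PLAN_ONLY", launches_gpu=False)
    return TrainDecision("LAUNCHED", launches_gpu=True)  # hypothetical status


print(gate_train(None, "nemo", launch=True).status)
print(gate_train("a100", "nemo", launch=False).status)
```

The point of the ordering is that a GPU launch needs two independent opt-ins: a configured target and an explicit --launch.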

Model — the LLM backend

The model provider decides which model the brain talks to and how it authenticates. You pick one per role, --code-provider (writes solutions) and --feedback-provider (the judge that scores), or set both at once with --model-provider. Each adapter configures AIDE's existing LLM backends via config/env; Loom never forks AIDE and never handles your secrets. The full table is below.

Model providers

The eight built-in model providers, with how each authenticates and whether it can serve the judge role. The judge always uses tool/function calling, so a feedback route must be tool-capable.

| Provider | Auth | Judge-capable |
| --- | --- | --- |
| anthropic-api | ANTHROPIC_API_KEY (honors ANTHROPIC_BASE_URL) | yes (native Claude tool use) |
| openai-api | OPENAI_API_KEY | yes (Responses-API tools) |
| openrouter | OPENROUTER_API_KEY OPENAI_API_KEY; model is a provider/model slug | per-slug (pick a tool-capable feedback slug) |
| nim | NVIDIA_API_KEY; endpoint via OPENAI_BASE_URL (default integrate.api.nvidia.com) | configurable (default yes) |
| openai-compat | OPENAI_API_KEY; endpoint via LOOM_MODEL_BASE_URL (LiteLLM / vLLM / Ollama) | configurable (default yes) |
| claude-subscription | none (your local claude CLI login) | yes (CLI text coerced to JSON) |
| codex-subscription | none (your local codex CLI login, ~/.codex/auth.json) | yes (codex exec --output-schema) |
| loom-proxy | LOOM_API_KEY (the gateway holds the vendor key server-side) | yes (Anthropic passthrough) |

The judge needs tools. AIDE's feedback step always uses tool/function calling. If your feedback route can't, loom run fails fast (exit 2) and tells you to pick a tool-capable feedback model — it never silently degrades.
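The fail-fast behaviour can be sketched as a pre-flight check that returns a process exit code. The function name and message wording below are assumptions; only the exit code 2 and the "never silently degrades" rule come from the text.

```python
import sys


def preflight_judge(provider: str, tool_capable: bool) -> int:
    """Return 0 to proceed, or 2 to fail fast before any search step runs."""
    if tool_capable:
        return 0
    print(
        f"feedback provider {provider!r} cannot do tool/function calling; "
        "pick a tool-capable feedback model",
        file=sys.stderr,
    )
    return 2


if __name__ == "__main__":
    # Example: a per-slug route whose chosen slug lacks tool support.
    sys.exit(preflight_judge("openrouter", tool_capable=False))
```

Checking before the run starts means a misconfigured judge costs nothing, instead of failing partway through a paid search.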

Picking and mixing providers

Set the same provider for both roles, or split them — a cheap local model can write code while a stronger model judges:

# Native OpenAI for both roles
export OPENAI_API_KEY="sk-..."
export LOOM_CODE_MODEL="gpt-4o"
export LOOM_FEEDBACK_MODEL="gpt-4o"
loom run --data ./task --goal "..." --metric "..." --model-provider openai-api

# NVIDIA NIM — a real NeMo touchpoint, zero GPU infra
export NVIDIA_API_KEY="nvapi-..."
loom run --data ./task --goal "..." --metric "..." --model-provider nim

# Mix roles: a local model writes code, Claude judges
export OPENAI_BASE_URL="http://localhost:11434/v1"   # Ollama, say
loom run --data ./task --goal "..." --metric "..." \
  --code-provider openai-compat --feedback-provider anthropic-api

OpenRouter, NIM & OpenAI-compatible routing

AIDE routes purely by model name, which creates two sharp edges the providers route around for you:

Subscription bridges

The two *-subscription providers are CLI bridges: Loom drives the claude / codex CLI you already installed and logged in, never seeing your credentials. Metering comes from a separate subscription pool — a heavy AIDE run is many calls and can exhaust it, so for sustained runs an API key is the better fit. loom run pre-flights both (a missing binary or login yields an actionable hint, not a traceback).
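A pre-flight of this kind could look roughly like the following. It is a sketch, not Loom's check: preflight() and its messages are hypothetical, and using the presence of an auth file as the login test is an assumption that would only apply where such a file is known (the table lists ~/.codex/auth.json for codex).

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def preflight(
    binary: str,
    auth_file: Optional[Path] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return an actionable hint if the CLI bridge cannot work, else None."""
    if which(binary) is None:
        return f"{binary} CLI not found on PATH: install it, or switch to an API-key provider"
    if auth_file is not None and not auth_file.expanduser().is_file():
        return f"{binary} is installed but not logged in: run its login flow first"
    return None


hint = preflight("codex", Path("~/.codex/auth.json"))
print(hint or "codex bridge looks ready")
```

The which parameter is injectable so the check can be exercised without the real CLIs installed.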

The Loom gateway (loom-proxy)

The loom-proxy provider routes through the Loom gateway — a thin Anthropic-passthrough that holds the real vendor key server-side, so callers present only a LOOM_API_KEY. Run it as a server and point clients at it:

# Server side: the gateway holds the REAL vendor key; callers present only a LOOM_API_KEY.
export ANTHROPIC_API_KEY="sk-ant-..."     # stays on the server
export LOOM_API_KEY="loom-..."            # the key callers must present (allowlist)
loom proxy serve --host 127.0.0.1 --port 8088

# Client side (another shell, only the LOOM key):
export LOOM_API_KEY="loom-..."
loom run --data ./task --goal "..." --metric "..." --model-provider loom-proxy

Two env vars configure the client side: LOOM_API_BASE (default http://127.0.0.1:8088) and LOOM_PROXY_LOG_PATH (default learnings/proxy_calls.jsonl). When a hosted gateway is detected (LOOM_API_BASE set to a non-loopback URL, or LOOM_PROXY_DEFAULT truthy) and you haven't explicitly chosen a provider, Loom defaults both roles to loom-proxy. Precedence: an explicit --code-provider, --feedback-provider, or --model-provider (or LOOM_CODE_PROVIDER / LOOM_FEEDBACK_PROVIDER) always wins; else a hosted gateway ⇒ loom-proxy; else anthropic-api. A loopback base is local dev, so it never flips the default.
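The default-provider rule reads naturally as a three-step fallback. The sketch below is one way to express it; default_provider(), is_loopback(), and the set of strings treated as truthy are assumptions, while the env var names and the ordering come from the paragraph above.

```python
import ipaddress
from typing import Mapping, Optional
from urllib.parse import urlparse

TRUTHY = {"1", "true", "yes", "on"}  # assumption: what counts as "truthy"


def is_loopback(url: str) -> bool:
    """True for localhost or a loopback IP such as 127.0.0.1 or ::1."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name other than localhost


def default_provider(explicit: Optional[str], env: Mapping[str, str]) -> str:
    if explicit:                      # flag or LOOM_*_PROVIDER always wins
        return explicit
    base = env.get("LOOM_API_BASE")
    hosted = bool(base) and not is_loopback(base)
    if hosted or env.get("LOOM_PROXY_DEFAULT", "").lower() in TRUTHY:
        return "loom-proxy"           # a hosted gateway flips the default
    return "anthropic-api"


print(default_provider(None, {"LOOM_API_BASE": "http://127.0.0.1:8088"}))
```

With the loopback base shown, the function falls through to anthropic-api, matching the statement that local dev never flips the default.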

Your bulk data never reaches the LLM in any mode — datasets live as Metaflow data objects in your perimeter; the model only ever sees small derived context (schema, preview, metrics). With a BYO key, prompts go straight to your vendor and the gateway sees nothing; loom-proxy sits in that same path and additionally logs the call to the moat corpus, tagged by tenant/owner. It is opt-in today and becomes the default once the gateway is hosted.

Keyless by default

Most of Loom needs no model key at all. The read-only and lifecycle verbs work without one:

Only the LLM verbs — run / pipeline, which drive the AIDE search brain — and the natural-language planning need a model key. A missing key yields an actionable line (set ANTHROPIC_API_KEY … or pick a --model-provider), never a crash. The default model is Claude, so the quickest path is an ANTHROPIC_API_KEY; inside the agent you can also run /login.

Adding a provider

Because every verb speaks an interface and the controller resolves providers by name from a registry, adding a backend is one adapter class plus a one-line registration — never a core change. Each new provider is constructed uniformly as Provider(config), reads endpoints/profiles from the config, reads secrets from the environment at the point of use, and lazy-imports any heavy dependency so the core import stays light. Once registered, you select it exactly like a built-in:

loom run --search my-brain              # LOOM_SEARCH_PROVIDER=my-brain
loom run --mlops my-runtime            # LOOM_MLOPS_PROVIDER=my-runtime
loom run --model-provider my-llm        # LOOM_CODE_PROVIDER / LOOM_FEEDBACK_PROVIDER
LOOM_MODEL_BUILDER_PROVIDER=my-trainer loom train ...
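The "one adapter class plus a one-line registration" pattern might look like the sketch below. The register() decorator, get_provider(), and the MyBrain class are hypothetical; the real registry and base classes are described in docs/architecture.md. The json module stands in for a heavy dependency to show the lazy import.

```python
import importlib
from typing import Callable, Dict

_REGISTRY: Dict[str, Callable[[dict], object]] = {}


def register(name: str):
    """Hypothetical one-line registration: maps a provider name to its class."""
    def decorator(factory):
        _REGISTRY[name] = factory
        return factory
    return decorator


def get_provider(name: str, config: dict):
    """Resolve by name and construct uniformly as Provider(config)."""
    try:
        factory = _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"unknown provider {name!r}; known: {known}") from None
    return factory(config)


@register("my-brain")
class MyBrain:
    def __init__(self, config: dict) -> None:
        self.config = config

    def search(self, goal: str) -> str:
        # Imported at first use so importing the registry stays cheap.
        json = importlib.import_module("json")  # stand-in for a heavy library
        return json.dumps({"goal": goal, "steps": self.config.get("steps", 10)})


print(get_provider("my-brain", {"steps": 3}).search("beat baseline"))
```

Resolving by name is what lets a flag value such as my-brain select the new adapter without touching the code that calls it.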

A new model-builder must additionally pass the golden conformance suite (the tokenize → pretrain → embed → finetune → evaluate round-trip, valid artifact pathspecs, manifest honesty). The full provider model and step-by-step is in docs/architecture.md in the repository.