Aegis: Closed-Loop Intelligence Engine

Ground behavior, improve it, and defend every ship decision with evidence.

Mode: Eval-first

Release: Gate-aware

Reports: Shareable

Access: Accounts enabled


Shell policy

The workspace chrome does not inject sample benchmark rows, synthetic scores, or decorative regression traces. Live evidence belongs in the closed loop, research runs, review queue, and release train after a real workspace is populated.

Closed Loop

Import traces, run the strict loop, and open the dossier.

Research Runs

Measure benchmark deltas and investigate candidate behavior.

Review Queue

Attach ownership, severity, and operator judgment.

Release Train

Persist gate state beside the same artifact lineage.

Launch-grade proof should be grounded in persisted artifacts, not shell placeholders.

surface      purpose                        required   owner
dataset      fixed benchmark contract       yes        research
comparison   baseline vs candidate delta    yes        operator
review       annotated release judgment     yes        human
promotion    gate outcome + lineage         yes        release
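The four required surfaces above can be read as a checklist that a release gate might enforce. The following is a minimal sketch under that reading, not Aegis's actual implementation: the `Surface` type, the `SURFACES` list, and the `missing_surfaces` and `launch_ready` helpers are hypothetical names introduced for illustration, and the only data taken from the page is the table itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surface:
    """One row of the evidence table (hypothetical type, for illustration)."""
    name: str
    purpose: str
    required: bool
    owner: str


# Rows copied from the table above; the structure around them is an assumption.
SURFACES = [
    Surface("dataset", "fixed benchmark contract", True, "research"),
    Surface("comparison", "baseline vs candidate delta", True, "operator"),
    Surface("review", "annotated release judgment", True, "human"),
    Surface("promotion", "gate outcome + lineage", True, "release"),
]


def missing_surfaces(persisted: set[str]) -> list[str]:
    """Return required surfaces with no persisted artifact, in table order."""
    return [s.name for s in SURFACES if s.required and s.name not in persisted]


def launch_ready(persisted: set[str]) -> bool:
    """A workspace counts as launch-ready only when nothing required is missing."""
    return not missing_surfaces(persisted)
```

For example, a workspace that has persisted only `dataset` and `comparison` artifacts would report `review` and `promotion` as missing, so a gate built this way would hold the release until both exist.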

Training Lab

Use this after an eval has already identified a real failure mode worth fixing. VERL plus execution_mode=real is the launch path; deterministic or simulated runs are useful for local lab work, but they are not launch-proof evidence.
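The launch-path rule above can be expressed as a small predicate. This is a sketch under stated assumptions: the `TrainingRun` record and its `backend` field are hypothetical, and treating "deterministic" and "simulated" as values of `execution_mode` is an assumption; only the `execution_mode=real` setting and the VERL requirement come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    """Hypothetical record of a training experiment."""
    backend: str          # assumed field name, e.g. "verl"
    execution_mode: str   # "real" per the text; other values are assumed


def is_launch_evidence(run: TrainingRun) -> bool:
    """Only VERL runs with execution_mode=real count as launch-proof evidence."""
    return run.backend.lower() == "verl" and run.execution_mode == "real"


# Lab-only runs are still useful locally, but the predicate rejects them.
lab_run = TrainingRun(backend="verl", execution_mode="simulated")
real_run = TrainingRun(backend="verl", execution_mode="real")
```

Keeping the check as a single predicate means a release gate can filter runs without needing to know how each mode is executed.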

Start a training experiment

No training jobs yet.

Begin with an eval run, then use the form above to create a training experiment if the evidence supports it.

LoRA Adapter Management
