Auspexi

Always‑on Evaluators: Cheap, Continuous Risk Scoring for Reliable AI

Summary: Compact small‑language‑model evaluators run per‑turn to score metrics like toxicity, PII, prompt injection, bias, and jailbreaking. Scores feed a pre‑generation Risk Guard that decides whether to generate, fetch more context, reroute, or abstain (fail‑closed). Result: fewer tokens and calls, lower latency and energy, and auditable behavior.

Why evaluators?

Generative systems fail in the gaps: thin context, adversarial prompts, or unclear objectives. Instead of hoping, we measure. A set of compact evaluators score each turn and enforce policy before we generate. This is inexpensive enough to run continuously on CPU/NPU.

What we score

How it integrates

Controls

Impact

Read next: Evidence‑Efficient AI (73%) · A Billion Queries · Dashboard