Choose the Right Model: A Practical Helper
Auspexi
TL;DR: The best model is the one that fits your task, constraints, and evidence needs. Our helper asks a few questions and recommends a starter + routing + context + risk policy—so you ship something reliable, fast.
What the helper asks
- Task: generate text, retrieve/search, plan/act, segment images, multimodal Q&A
- Modalities: text only, text+image, image only
- Constraints: on‑device vs cloud, latency p95, energy/thermal SLOs, privacy posture
- Scale: users/requests/sec (router vs single expert)
- Evidence: audit depth needed; acceptance gates to pass
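To make these inputs concrete, here is a minimal sketch of the answer shape the helper might collect. Every name, enum, and type below is an illustrative assumption, not the product's actual schema.

```python
# A minimal sketch of the helper's inputs; all names are illustrative
# assumptions, not the product's actual schema.
from dataclasses import dataclass
from enum import Enum

class Task(Enum):
    GENERATE_TEXT = "generate_text"
    GENERATE_IMAGE = "generate_image"  # assumed, to cover the LCM starter
    RETRIEVE = "retrieve"              # retrieve/search
    PLAN_ACT = "plan_act"
    SEGMENT = "segment"
    MULTIMODAL_QA = "multimodal_qa"

class Modality(Enum):
    TEXT = "text"
    TEXT_IMAGE = "text_image"
    IMAGE = "image"

@dataclass
class HelperAnswers:
    task: Task
    modality: Modality
    on_device: bool           # on-device vs cloud
    p95_latency_ms: int       # latency SLO
    private_data: bool        # privacy posture
    requests_per_sec: float   # scale: router vs single expert
    audit_depth: str          # evidence needs, e.g. "basic" or "signed"
```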
Recommendation logic (high level)
| Starter | What it does | When to use it |
|---|---|---|
| LLM | Text generation with Context Engine + Risk Guard | Chat, copy, code; cloud or hybrid |
| SLM | Small model on device with fallback SLOs | Private, low-latency, field use |
| LAM | Plan/act with typed tools; memory loop | Agents, workflows, RPA |
| MoE | Route to specialized experts | Heterogeneous tasks under scale |
| VLM | Image+text understanding | Search, robotics, inspection |
| MLM | Embeddings, retrieval, and ranking | Search/classification; RAG foundation |
| LCM | Fast image generation | Efficient, device-friendly image gen |
| SAM | Pixel-level segmentation | Medical/industrial masks, AR |
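As a rough illustration of how these profiles could be selected, the sketch below maps questionnaire answers (using the `HelperAnswers` shape sketched earlier) to a starter. The ordering and thresholds are assumptions for the sketch, not the helper's actual logic.

```python
def recommend_starter(a: HelperAnswers) -> str:
    """Map questionnaire answers to a starter, loosely mirroring the
    table above. Ordering and thresholds are illustrative assumptions."""
    if a.task is Task.SEGMENT:
        return "SAM"                       # pixel-level masks
    if a.task is Task.GENERATE_IMAGE:
        return "LCM"                       # fast, device-friendly image gen
    if a.task is Task.MULTIMODAL_QA or a.modality is not Modality.TEXT:
        return "VLM"                       # image+text understanding
    if a.task is Task.RETRIEVE:
        return "MLM"                       # embeddings, retrieval, ranking
    if a.task is Task.PLAN_ACT:
        return "LAM"                       # typed tools + memory loop
    # Text generation: privacy, latency, and scale split SLM / MoE / LLM.
    if a.on_device or a.private_data:
        return "SLM"                       # private, low-latency, field use
    if a.requests_per_sec > 100:           # heterogeneous load at scale
        return "MoE"
    return "LLM"                           # chat/copy/code; cloud or hybrid
```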
Routing, context, and risk—baked in
- Routing: on-device by default where feasible (SLM/VLM paths); hybrid fallback with SLOs otherwise
- Context: hybrid retrieval (BM25 + dense + reranker), signals (margin/support/recency/trust), token-budget packing (sketched after this list)
- Risk: pre-generation Risk Guard uses those signals to fetch, clarify, or abstain before generating (sketched after this list)
- Evidence: export signed ZIPs with `context_provenance.json`, a crypto profile, and acceptance gates
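To picture the token-budget packing step, here is a minimal greedy sketch over reranked passages. The `(text, score)` tuple shape and the word-count token estimate are simplifying assumptions.

```python
def pack_context(passages: list[tuple[str, float]],
                 budget_tokens: int) -> list[str]:
    """Greedily keep the highest-scoring reranked passages that fit the
    token budget. A sketch: real packing would use a proper tokenizer
    and also weigh recency/trust and deduplicate."""
    picked: list[str] = []
    used = 0
    for text, _score in sorted(passages, key=lambda p: p[1], reverse=True):
        cost = len(text.split())  # crude token estimate for the sketch
        if used + cost <= budget_tokens:
            picked.append(text)
            used += cost
    return picked
```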
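And one way to picture the pre-generation Risk Guard: threshold the same signals to decide whether to fetch more context, ask a clarifying question, or abstain before any tokens are generated. The signal ranges and cutoffs below are assumptions for the sketch.

```python
from dataclasses import dataclass

@dataclass
class ContextSignals:
    margin: float   # score gap between top retrieved passages, 0..1
    support: float  # how well retrieved context covers the query, 0..1
    recency: float  # freshness of the best evidence, 0..1
    trust: float    # trust score of the sources, 0..1

def risk_guard(s: ContextSignals) -> str:
    """Decide before generating; cutoffs are illustrative assumptions."""
    if s.support < 0.3 or s.trust < 0.2:
        return "abstain"   # evidence too weak to answer safely
    if s.margin < 0.15:
        return "clarify"   # retrieval is ambiguous; ask the user
    if s.recency < 0.4:
        return "fetch"     # refresh stale evidence first
    return "generate"
```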
How to use it now
- Open Build a Model and pick a starter.
- Answer the short prompt (“What are you building?”) by picking your task, modality, and constraints.
- Download the scaffold ZIP and run the acceptance checks before integrating data (a minimal check is sketched below).
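For the last step, a quick local sanity check might look like the sketch below. The file name follows the evidence bullet above, but the gate structure inside it is a hypothetical assumption.

```python
import json
import zipfile

def check_scaffold(zip_path: str) -> bool:
    """Minimal acceptance check on a downloaded scaffold ZIP.
    The gates format ({"gates": [{"name": ..., "passed": bool}]})
    is hypothetical."""
    with zipfile.ZipFile(zip_path) as z:
        if "context_provenance.json" not in z.namelist():
            return False
        provenance = json.loads(z.read("context_provenance.json"))
        gates = provenance.get("gates", [])
        return bool(gates) and all(g.get("passed") for g in gates)
```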
Why this matters
2025 isn’t about the biggest model—it’s about the right model at the right time, with the right guardrails. A small on‑device model with great context often outperforms a large cloud model with noisy input, at a fraction of the cost and carbon.
Next steps
- Add an interactive helper page that outputs a starter + routing + SLO profile.
- Publish quickstart notebooks per starter for acceptance & evidence.
Get started: /build • Context Engineering • Whitepaper