Model Identity
Model version
—
Deployment accuracy
—
Recall (harmful)
—
Deployment date
—
Inference device
—
Uptime this session
—
Live Feedback Metrics
Total feedback collected
—
predictions reviewed by humans
Correction rate
—
times model was wrong
False positives
—
model said harmful → clean
False negatives
—
model said clean → harmful
Requests this session
—
resets on server restart
Avg response time
—
end-to-end ms
Accuracy: Deployment vs Live (Playground)
Deployment accuracy is the benchmark score (honest, frozen, measured at training time).
Live accuracy is estimated from playground corrections and is directional only until at least 100 corrections have been collected.
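As a rough illustration of how the live figures could be derived from reviewed predictions, here is a minimal Python sketch. The dashboard's actual formula is not stated here, so this assumes live accuracy is simply the share of reviewed predictions that needed no correction. The names `Feedback`, `summarize`, and `MIN_CORRECTIONS` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass

# Threshold taken from the note above: live accuracy is only
# directional until at least this many corrections exist.
MIN_CORRECTIONS = 100


@dataclass
class Feedback:
    """One human-reviewed prediction (hypothetical record shape)."""
    predicted_harmful: bool  # what the model said
    actual_harmful: bool     # what the human reviewer said


def summarize(feedback: list[Feedback]) -> dict:
    """Compute correction counts and an assumed live-accuracy estimate."""
    total = len(feedback)
    # False positive: model said harmful, reviewer said clean.
    false_pos = sum(f.predicted_harmful and not f.actual_harmful for f in feedback)
    # False negative: model said clean, reviewer said harmful.
    false_neg = sum(not f.predicted_harmful and f.actual_harmful for f in feedback)
    corrections = false_pos + false_neg
    return {
        "total_feedback": total,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "correction_rate": corrections / total if total else None,
        # Assumption: live accuracy = uncorrected share of reviewed predictions.
        "live_accuracy": (total - corrections) / total if total else None,
        "directional_only": corrections < MIN_CORRECTIONS,
    }


if __name__ == "__main__":
    sample = (
        [Feedback(True, False)]            # 1 false positive
        + [Feedback(False, True)] * 2      # 2 false negatives
        + [Feedback(True, True)] * 7       # 7 correct predictions
    )
    print(summarize(sample))
```

Because playground submissions are chosen by testers and are not sampled from production traffic, an estimate like this is best compared against deployment accuracy as a trend and not as an absolute number.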
Where the model is wrong (by category)
No corrections yet — start testing in the playground.
Correction type split
No corrections yet.
Live Feed — All Submissions (last 20)
Recent Corrections Only (last 10)
| Time | Text preview | Error type | Category | Risk score | Comment |
| Loading… |