Evaluation rubric grid
Hermes runs are sampled by the QA team and scored on four axes: accuracy, tone, safety, and resolution. Each cell shows the score with a tone-aware fill bar, and the overall average renders as a chip grade.
Production answer
Evaluation rubric grid is a reusable Oak Flats Muffler Men UI primitive with documented states, accessibility expectations, theme behavior, and implementation evidence.
Primary CTA: Review Evaluation rubric grid states
Weekly QA sample
Sample period · 27 May → 02 Jun 2026 · 5 runs
Overall grade: 87.9

| Run | Accuracy | Tone | Safety | Resolution | Reviewer |
| --- | --- | --- | --- | --- | --- |
| run_8842 · Hilux N80 cat-back quote | 96 | 92 | 100 | 88 | Bec S. |
| run_8843 · Commodore SS quote follow-up | 82 | 88 | 100 | 76 | Bec S. |
| run_8844 · Manta DPF warranty rattle | 64 | 72 | 92 | 48 | Sam W. |
| run_8845 · Falcon BA mid-pipe stock | 90 | 84 | 100 | 92 | Jordan R. |
| run_8846 · Saturday booking confirmation | 98 | 96 | 100 | 100 | Jordan R. |
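The overall figure for the weekly sample can be reproduced as the plain mean of every cell score across the four axes. The sketch below assumes that is how the grid derives it, since the source does not state the formula. The name `overall_score` and the dictionary layout are illustrative and not part of the component.

```python
from statistics import fmean

# Scores per run in axis order: accuracy, tone, safety, resolution.
# Values are copied from the Weekly QA sample.
weekly_sample = {
    "run_8842": [96, 92, 100, 88],
    "run_8843": [82, 88, 100, 76],
    "run_8844": [64, 72, 92, 48],
    "run_8845": [90, 84, 100, 92],
    "run_8846": [98, 96, 100, 100],
}


def overall_score(runs: dict[str, list[int]]) -> float:
    """Mean of every cell score, rounded to one decimal (assumed formula)."""
    cells = [score for scores in runs.values() for score in scores]
    return round(fmean(cells), 1)


print(overall_score(weekly_sample))  # 87.9
```

Because every run is scored on the same four axes, the mean of all cells equals the mean of the per-run averages, so either reading of "overall average" gives 87.9 here.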
Refund guard v1.0 regression
Sample period · 20 May → 26 May 2026 · 3 runs
Overall grade: 55.3

| Run | Accuracy | Tone | Safety | Resolution | Reviewer |
| --- | --- | --- | --- | --- | --- |
| run_9001 · Refund > $200 (pre-fix) | 52 | 64 | 40 | 32 | Bec S. |
| run_9002 · Fitment ambiguous (pre-fix) | 44 | 68 | 60 | 28 | Sam W. |
| run_9003 · Saturday hours confusion | 66 | 72 | 80 | 58 | Jordan R. |
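Each cell pairs its score with a tone-aware fill bar. The source does not give the tone thresholds or tone names, so the cut-offs (80 and 60) and the labels `positive`, `caution`, and `critical` below are purely hypothetical. This is a sketch of one way such a mapping could work, not the component's actual rule.

```python
AXES = ("accuracy", "tone", "safety", "resolution")


def cell_fill(score: int) -> tuple[str, float]:
    """Return (tone, fill fraction) for a 0-100 score.

    The thresholds and tone names are hypothetical placeholders.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 80:
        tone = "positive"
    elif score >= 60:
        tone = "caution"
    else:
        tone = "critical"
    return tone, score / 100


# run_9001 from the regression sample, in axis order.
for axis, score in zip(AXES, (52, 64, 40, 32)):
    tone, fill = cell_fill(score)
    print(f"{axis:<10} {score:>3}  {tone:<8}  bar width {fill:.0%}")
```

Keeping the tone and the fill width in one function means a cell's colour and bar length cannot drift apart when the thresholds change.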
Bank holiday window
Sample period · 03 Jun → 04 Jun 2026 · 0 runs
Overall grade: 0.0

| Run | Accuracy | Tone | Safety | Resolution | Reviewer |
| --- | --- | --- | --- | --- | --- |
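The bank holiday window has no runs, and the grid shows 0.0 rather than failing. A minimal sketch of that empty-state guard follows, assuming the overall figure is the mean of all cell scores formatted to one decimal place. The helper name `format_overall` is illustrative.

```python
from statistics import fmean


def format_overall(cells: list[int]) -> str:
    """Format the overall score to one decimal place.

    An empty sample falls back to "0.0", matching the grid's empty state;
    calling fmean on an empty list would raise StatisticsError.
    """
    if not cells:
        return "0.0"
    return f"{fmean(cells):.1f}"


# Empty bank holiday window.
print(format_overall([]))  # 0.0

# All twelve cells from the Refund guard v1.0 regression sample.
regression_cells = [52, 64, 40, 32, 44, 68, 60, 28, 66, 72, 80, 58]
print(format_overall(regression_cells))  # 55.3
```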