Primitive 11 / Filters

Safety filter strip

The Hermes moderation chain: pre-input PII and profanity filters, followed by post-output grounding, out-of-scope and tone moderation. Each filter shows 24-hour hits, inspection rate and a per-filter pass-through chip.
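
As a rough illustration of the chain described above, the five filters can be modelled as an ordered list tagged by stage. This is a minimal sketch, not the Hermes implementation: the names `Stage`, `SafetyFilter`, `HERMES_CHAIN` and `is_well_ordered` are hypothetical, and only the filter names and stage labels come from the strip itself.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # Stage labels as they appear on the strip.
    PRE_INPUT = "Pre · input"
    POST_OUTPUT = "Post · output"


@dataclass(frozen=True)
class SafetyFilter:
    name: str
    stage: Stage


# Filter names and stages are taken from the strip; the structure is assumed.
HERMES_CHAIN = [
    SafetyFilter("PII redactor", Stage.PRE_INPUT),
    SafetyFilter("Profanity & abuse", Stage.PRE_INPUT),
    SafetyFilter("Grounding check", Stage.POST_OUTPUT),
    SafetyFilter("Out-of-scope advice", Stage.POST_OUTPUT),
    SafetyFilter("Tone moderation", Stage.POST_OUTPUT),
]


def is_well_ordered(chain):
    """Return True when no pre-input filter appears after a post-output filter."""
    seen_post = False
    for f in chain:
        if f.stage is Stage.POST_OUTPUT:
            seen_post = True
        elif seen_post:
            return False
    return True
```

A check like `is_well_ordered(HERMES_CHAIN)` could guard against a configuration that places an input filter after an output filter, which would not match the order shown on the strip.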

Production answer

Safety filter strip is a reusable Oak Flats Muffler Men UI primitive with documented states, accessibility expectations, theme behavior, and implementation evidence.

State A · steady-state · all clear

Safety filter chain

Pre-input → moderation → grounding → post-output
  1. Pre · input

    PII redactor

    Strips ABNs · phone numbers · address fragments.
    28 Hits · 24h
  2. Pre · input

    Profanity & abuse

    Mufflermen civility threshold.
    4 Hits · 24h
  3. Post · output

    Grounding check

    Blocks responses that drift from retrieved sources.
    12 Hits · 24h
  4. Post · output

    Out-of-scope advice

    Catches medical · legal · engineering sign-off attempts.
    2 Hits · 24h
  5. Post · output

    Tone moderation

    Softens any sharp replies before sending.
    6 Hits · 24h
Inspected 1,846
Blocked 52
Escalated 18
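
In State A the five per-filter counts add up to the Blocked total (28 + 4 + 12 + 2 + 6 = 52). The same does not hold in State C (9 hits against 6 blocked), so the equality should not be treated as an invariant. The sketch below totals the hits and derives a per-filter pass-through figure. The source does not say how the pass-through chip is computed; the formula here (share of inspected messages the filter did not flag) is an assumption, and `FilterHits`, `total_hits` and `pass_through` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterHits:
    name: str
    hits_24h: int


# Values copied from State A of the strip.
STATE_A = [
    FilterHits("PII redactor", 28),
    FilterHits("Profanity & abuse", 4),
    FilterHits("Grounding check", 12),
    FilterHits("Out-of-scope advice", 2),
    FilterHits("Tone moderation", 6),
]
INSPECTED = 1846


def total_hits(filters):
    """Sum the 24-hour hit counts across all filters."""
    return sum(f.hits_24h for f in filters)


def pass_through(hits, inspected):
    """Assumed definition: fraction of inspected messages the filter let through."""
    if inspected <= 0:
        # Nothing inspected means nothing was flagged.
        return 1.0
    return 1.0 - hits / inspected


print(total_hits(STATE_A))  # 52
for f in STATE_A:
    print(f"{f.name}: {pass_through(f.hits_24h, INSPECTED):.1%}")
```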
State B · alert · spike across pre + post

Safety filter chain

Pre-input → moderation → grounding → post-output
  1. Pre · input

    PII redactor

    Spike in ABNs leaked from supplier feed.
    612 Hits · 24h
  2. Pre · input

    Profanity & abuse

    Heatwave + DM raid · sweep harder.
    84 Hits · 24h
  3. Post · output

    Out-of-scope advice

    Customers asking for engineer sign-off · refuse + escalate.
    48 Hits · 24h
Inspected 1,846
Blocked 744
Escalated 56
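
One way to decide when the strip should switch to the alert state is to compare each filter's 24-hour hits against a steady-state baseline. The sketch below uses the State A counts as that baseline and a multiplier of 3. Both choices are assumptions for illustration: the source does not document how State B is triggered, and `SPIKE_FACTOR` and `spiking_filters` are hypothetical names.

```python
# Steady-state counts from State A, used here as an assumed baseline.
BASELINE_HITS = {
    "PII redactor": 28,
    "Profanity & abuse": 4,
    "Out-of-scope advice": 2,
}

# Counts shown in State B.
ALERT_HITS = {
    "PII redactor": 612,
    "Profanity & abuse": 84,
    "Out-of-scope advice": 48,
}

SPIKE_FACTOR = 3.0  # hypothetical threshold, not taken from the source


def spiking_filters(current, baseline, factor=SPIKE_FACTOR):
    """Return the filter names whose hits reach `factor` times their baseline."""
    spikes = []
    for name, hits in current.items():
        base = baseline.get(name, 0)
        # Treat a zero or missing baseline as 1 so a quiet filter needs
        # at least `factor` hits before it counts as spiking.
        if hits >= factor * max(base, 1):
            spikes.append(name)
    return spikes


print(spiking_filters(ALERT_HITS, BASELINE_HITS))
```

With the State B counts, all three filters exceed the assumed threshold, which matches the "spike across pre + post" label.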
State C · sunday quiet · single-digit hits

Safety filter chain

Pre-input → moderation → grounding → post-output
  1. Pre · input

    PII redactor

    Strips ABNs · phone numbers · address fragments.
    5 Hits · 24h
  2. Pre · input

    Profanity & abuse

    Mufflermen civility threshold.
    1 Hits · 24h
  3. Post · output

    Grounding check

    Blocks responses that drift from retrieved sources.
    2 Hits · 24h
  4. Post · output

    Out-of-scope advice

    Catches medical · legal · engineering sign-off attempts.
    0 Hits · 24h
  5. Post · output

    Tone moderation

    Softens any sharp replies before sending.
    1 Hits · 24h
Inspected 184
Blocked 6
Escalated 1