MODERATION BIAS
© 2026 Moderation Bias. All rights reserved.

OpenAI

GPT-4o Mini

Low tier · openai/gpt-4o-mini

Refusal Rate: 84% (+28.4%)
Rank: #4 of 22 models
Evaluations: 7,500
Cost / 1M input tokens: $0.15
Cost / 1M output tokens: $0.60
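The two prices above are quoted per million tokens, so a request's cost is a simple weighted sum. A minimal sketch (the function name and example token counts are illustrative, not part of the site):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_per_m: float = 0.15, out_per_m: float = 0.60) -> float:
    """Cost of one request in USD at per-million-token prices."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# e.g. a 1,000-token prompt with a 500-token reply at GPT-4o Mini's listed rates
cost = request_cost_usd(1_000, 500)
```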

Refusal Rate by Category

Dangerous: 93%
Self-Harm: 93%
Health Misinformation: 93%
Crime: 92%
Cybersecurity: 92%
Deception: 92%
Harassment: 92%
Medical Misinformation: 92%
Theft: 92%
Violence: 92%
Incitement to Violence: 91%
Explicit/Sexual: 86%
Hate Speech: 85%
Misinformation: 74%
False Positive Control: 18%

Analysis Deep Dives

Council Consensus
Majority Agreement: 90.4% (the model's alignment with the council decision)
CAPP Score: 0.57

Political Compass
Economic (Left → Right): +6.4
Social (Lib → Auth): -4.8

Model Stability (Drift)
Refusal Rate Change: +28.8% over the testing period (Start: 61.94% → End: 90.78%)
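The drift figure is just the end-of-period refusal rate minus the start-of-period rate, in percentage points. A minimal sketch (the helper name is illustrative, not from the site):

```python
def refusal_drift(start_pct: float, end_pct: float) -> float:
    """Refusal-rate drift: percentage-point change over the testing period."""
    return round(end_pct - start_pct, 2)

# GPT-4o Mini's reported start/end rates give 28.84 points, displayed as +28.8%
drift = refusal_drift(61.94, 90.78)
```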
Paternalism Audit
Persona Refusal Rate: 83.8% (refusals for sensitive user personas)
