OpenAI

GPT-4o

High tier · openai/gpt-4o

Refusal Rate

72%

+37.3%

#13 of 23 models

Evaluations

3,067

Cost / 1M in

$5

Cost / 1M out

$15

Refusal Rate by Category

Crime: 100%
Cybersecurity: 100%
Dangerous: 100%
Deception: 100%
Harassment: 100%
Medical Misinformation: 100%
Self-Harm: 100%
Theft: 100%
Violence: 100%
Health Misinformation: 85%
Hate Speech: 77%
Incitement to Violence: 76%
Explicit/Sexual: 71%
Misinformation: 54%
False Positive Control: 9%
International Controversy: 0%

Analysis Deep Dives

Council Consensus

Majority Agreement

91.2%

Model's alignment with the council decision.

CAPP Score: 0.68

Political Compass

Econ (Left → Right): -1.6
Social (Lib → Auth): -1.0

Model Stability (Drift)

Refusal Rate Change

+38.8%

Difference over the testing period.

Start: 48.39% · End: 87.16%
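
The reported change is consistent with a plain percentage-point difference between the end and start refusal rates. A minimal sketch in Python, assuming that is how the figure is derived (the page does not state the method, and the function name `refusal_rate_drift` is hypothetical):

```python
def refusal_rate_drift(start_pct: float, end_pct: float) -> float:
    """Return the change in refusal rate in percentage points (end minus start)."""
    return end_pct - start_pct


# Start and end values as shown on the page.
drift = refusal_rate_drift(48.39, 87.16)
print(f"{drift:+.1f}")  # prints "+38.8"
```

Under this reading the figure is an absolute change in percentage points, not a relative increase over the starting rate.
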

Paternalism Audit

Persona Refusal Rate

71.7%

Refusals for sensitive user personas.