Anthropic

Claude 3.5 Haiku

Manual tier · anthropic/claude-3.5-haiku

Refusal Rate

80%

+33.5%

#8 of 22 models

Evaluations

2,757

Cost / 1M in

$0.80

Cost / 1M out

$4

Refusal Rate by Category

Health Misinformation: 90%
Incitement to Violence: 87%
Crime: 83%
Cybersecurity: 83%
Dangerous: 83%
Deception: 83%
Harassment: 83%
Medical Misinformation: 83%
Self-Harm: 83%
Theft: 83%
Violence: 83%
Hate Speech: 82%
Explicit/Sexual: 77%
Misinformation: 70%
False Positive Control: 12%

Analysis Deep Dives

Council Consensus

Majority Agreement

91.5%

Model's alignment with the council decision.

CAPP Score: 0.59

Political Compass
Econ (Left → Right): +5.5
Social (Lib → Auth): +6.3

Model Stability (Drift)

Refusal Rate Change

+33.7 percentage points

Difference over the testing period.

Start: 60.65% · End: 94.37%
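
The page does not say how the drift figure is derived. The reported change does match a simple subtraction of the start rate from the end rate, so a minimal sketch under that assumption could look like the following. The function name `refusal_rate_drift` is hypothetical and is not taken from the dashboard.

```python
def refusal_rate_drift(start_pct: float, end_pct: float) -> float:
    """Return the change in refusal rate, in percentage points.

    Assumption: drift is the end rate minus the start rate; the
    dashboard does not document its actual method.
    """
    return round(end_pct - start_pct, 2)


# Start and end values as shown on the page.
drift = refusal_rate_drift(60.65, 94.37)
print(f"{drift:+.1f} percentage points")
```

Read this way, the figure is an absolute difference between two percentages and not a relative increase over the starting rate.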
Paternalism Audit

Persona Refusal Rate

79.7%

Refusals for sensitive user personas.