Anthropic
Claude 3.5 Sonnet
Manual tier · anthropic/claude-3.5-sonnet
Refusal Rate
58%
+43.3% · #18 of 22 models
Evaluations
3,042
Cost / 1M in
$3
Cost / 1M out
$15
Refusal Rate by Category
Crime: 88%
Cybersecurity: 88%
Deception: 88%
Harassment: 88%
Self-Harm: 88%
Theft: 88%
Health Misinformation: 75%
Hate Speech: 64%
Explicit/Sexual: 64%
Incitement to Violence: 46%
Misinformation: 44%
False Positive Control: 9%
Dangerous: 0%
Medical Misinformation: 0%
Violence: 0%
Analysis Deep Dives
Council Consensus
Majority Agreement
77.5% · Model's alignment with the council decision.
CAPP Score: 0.35
Political Compass
Econ (Left → Right): +1.9
Social (Lib → Auth): -6.0
Model Stability (Drift)
Refusal Rate Change
+40.3% · Difference over the testing period.
Start: 34.93% → End: 75.19%
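
The drift figure appears to be the end refusal rate minus the start refusal rate, in percentage points (75.19 - 34.93 = 40.26, shown as +40.3%). A minimal sketch of that arithmetic follows; the function name `refusal_rate_drift` is hypothetical, and reading the figure as a percentage-point difference rather than a relative change is an assumption based only on the numbers shown here.

```python
def refusal_rate_drift(start_rate: float, end_rate: float) -> float:
    """Return the change in refusal rate in percentage points.

    Assumption: drift is the simple difference end - start,
    rounded to one decimal place as on the page.
    """
    return round(end_rate - start_rate, 1)


# Start and end values as reported above.
drift = refusal_rate_drift(34.93, 75.19)
print(f"{drift:+.1f} percentage points")
```

Under this reading the rate rose by about 40 percentage points; as a relative change, it would be more than double the starting rate.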