MODERATION BIAS
© 2026 Moderation Bias. All rights reserved.

Alibaba

Qwen 2.5 7B

Low tier · qwen/qwen-2.5-7b-instruct

Refusal Rate: 89% (-6.3%)
Rank: #1 of 22 models
Evaluations: 7,428
Cost / 1M in: $0.05
Cost / 1M out: $0.05

Refusal Rate by Category

Hate Speech: 94%
Crime: 92%
Cybersecurity: 92%
Deception: 92%
Harassment: 92%
Medical Misinformation: 92%
Self-Harm: 92%
Theft: 92%
Violence: 92%
Health Misinformation: 92%
Incitement to Violence: 91%
Explicit/Sexual: 86%
Misinformation: 86%
Dangerous: 85%
False Positive Control: 26%
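The per-category figures above are plain refusal percentages. A minimal sketch of how such a breakdown could be computed from raw evaluation records (the record format and function name here are assumptions for illustration, not the site's actual pipeline):

```python
from collections import defaultdict

def refusal_rate_by_category(records):
    """Compute per-category refusal rates.

    `records` is an iterable of (category, refused) pairs, where
    `refused` is a boolean. Returns {category: percent refused}.
    """
    totals = defaultdict(int)    # evaluations per category
    refusals = defaultdict(int)  # refusals per category
    for category, refused in records:
        totals[category] += 1
        if refused:
            refusals[category] += 1
    return {c: 100.0 * refusals[c] / totals[c] for c in totals}

# Toy data only (not the site's evaluation records):
sample = [
    ("Hate Speech", True),
    ("Hate Speech", True),
    ("False Positive Control", False),
    ("False Positive Control", True),
]
print(refusal_rate_by_category(sample))
# {'Hate Speech': 100.0, 'False Positive Control': 50.0}
```

Note that the False Positive Control row inverts the desired behavior: for benign control prompts, a lower refusal rate is better, so it is reported on the same scale but read in the opposite direction.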

Analysis Deep Dives

Council Consensus

Majority Agreement: 82.6%
The model's alignment with the council decision.
CAPP Score: 0.33

Political Compass
Economic (Left → Right): +3.2
Social (Lib → Auth): -0.2
Model Stability (Drift)

Refusal Rate Change: +10.6%
Difference over the testing period.
Start: 76.92% → End: 87.48%
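The drift figure is just the end-of-period rate minus the start-of-period rate; a quick arithmetic check (plain Python, not the site's code):

```python
# Refusal rate at the start and end of the testing period, in percent.
start, end = 76.92, 87.48

# Drift is the raw difference in percentage points.
change = end - start
print(f"{change:+.2f} pp")  # +10.56 pp, shown on the page rounded to +10.6%
```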
Paternalism Audit

Persona Refusal Rate: 88.0%
Refusals for sensitive user personas.
