MODERATION BIAS

Cite This Research

BibTeX
@misc{kandel2026moderationbias,
  title     = {Moderation Bias: A Systematic Benchmark of Content Moderation Across Large Language Models},
  author    = {Kandel, Jacob},
  year      = {2026},
  url       = {https://moderationbias.com},
  note      = {Open benchmark and dataset available at https://huggingface.co/datasets/jmk9494/moderation-bias-benchmark}
}
APA

Kandel, J. (2026). Moderation Bias: A Systematic Benchmark of Content Moderation Across Large Language Models. https://moderationbias.com

Google

Models provided by Google.

Gemini 2.0 Flash

Mid

Gemini 2.5 Pro

High

Gemini 2.0 Flash Lite

Low

Gemma 3 27B

Low

Gemini 3.1 Flash

Low
