Behavioral Research
Gauntlet Leaderboard
Community-aggregated behavioral scores across hardware, quantization, and providers. Filter by your setup.
Degradation Curves
How scores change across quantization levels for a given model family and size
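As a rough illustration of what such a curve represents, the sketch below expresses each quantization level's score as a percent change from the highest-precision level. The function name, the level labels, and the scores are all made up for illustration; the leaderboard's actual scoring and aggregation method is not described here.

```python
def degradation_curve(scores, baseline):
    """Percent change in score at each quantization level, relative to the baseline level.

    `scores` maps a quantization label to a behavioral score (hypothetical scale).
    """
    base = scores[baseline]
    if base == 0:
        raise ValueError("baseline score must be non-zero")
    return {
        level: round((score - base) / base * 100, 1)
        for level, score in scores.items()
    }


# Made-up scores for one model family and size, for illustration only.
example_scores = {"fp16": 80.0, "q8_0": 78.0, "q4_K_M": 72.0}
curve = degradation_curve(example_scores, baseline="fp16")
print(curve)
```

A negative value means the quantized variant scored below the baseline; plotting these values in order of decreasing precision gives the curve.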
Performance Prediction
Predict how a model will perform on your hardware based on community data
Enter a model name from the rankings above, pick a hardware tier, and see how it's predicted to perform.
Every test from every user adds to this dataset. To contribute, test models on your own hardware:
pip install gauntlet-cli && gauntlet run --model ollama/gemma4:e2b