Alright so we have more benchmarks including hallucinations and flash doesn't do... | Hacker News


scrollop 14 hours ago | on: Gemini 3 Flash: Frontier intelligence built for sp...

Alright, so we have more benchmarks, including hallucinations, and Flash doesn't do well on that one, though generally it beats everything: Gemini 3 Pro, GPT 5.1 Thinking, and GPT 5.2 Thinking xhigh (but then, Sonnet, Grok, Opus, Gemini, and 5.1 all beat 5.2 xhigh). Crazy. https://artificialanalysis.ai/evaluations/omniscience

tallclair 11 hours ago

On your Omniscience Index vs. Cost graph, I think your Gemini 3 Pro and Flash models might be swapped.
