
This outperforms Gemini 3 Pro Image (Nano Banana Pro) on Text-to-Image Arena and Image Edit Arena. I'm surprised they didn't mention this leaderboard in the blog post.

I like this benchmark because it's based on user votes, so overfitting is not as easy (after all, if users prefer your result, you've won).

https://lmarena.ai/leaderboard/text-to-image

https://lmarena.ai/leaderboard/image-edit





The scores are really, really close, which might be why.
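To put "really close" in perspective, here's a minimal sketch, assuming the arena scores sit on an Elo-like scale (base 10, 400-point scale). That scale is my assumption, not something stated in this thread, and the function name and the scores below are hypothetical, not real leaderboard values.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes won by model A over model B.

    Assumes an Elo-style logistic curve (base 10, 400-point scale);
    the actual leaderboard may fit its scores differently.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


if __name__ == "__main__":
    # Hypothetical scores, chosen only to illustrate a small gap.
    print(f"{expected_win_rate(1010, 1000):.3f}")
```

Under that assumption, a 10-point gap works out to roughly a 51% expected win rate, i.e. close to a coin flip in any single vote.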

The arena concept doesn’t work for image models due to watermarks.

There are no watermarks in the arena.

There are no visible watermarks, but model makers can use steganographic codes to identify outputs from their own models.

Text-to-Image Models Leave Identifiable Signatures: Implications for Leaderboard Security

https://arxiv.org/pdf/2510.06525


This is true; however, LMArena does employ some methods to mitigate attempts to manipulate the leaderboard. See https://openreview.net/forum?id=zf9zwCRKyP

They also control for style: https://news.lmarena.ai/sentiment-control/



