
AI can do that too? If you have a web app it can use playwright to test functionality and take screenshots to see if it looks right.




Yeah, but it doesn't work nearly as well. The AI frequently misinterprets what it sees. And it isn't as good at actually using the website (or app, or piece of hardware, etc.) as a human would be.

I've been using Claude to implement an ISO specification, and I have to keep telling it that we don't care whether the repl is correct, only that the test suite ensures the implementation correctly follows the spec. But when we're tracking down why a test is failing, it'll go to town using the repl to narrow down which code path is causing the issue. The only reason there even is a repl at this point is so it can do its 'spray and pray' debugging outside the code: Claude constantly tried to use one to debug issues, so I gave in and had it write a pretty basic one.

Horses for courses, I suppose. Back in the day, when I wanted to play with some C(++) library, I'd quite often write a Python C-API extension so I could do the same thing using Python's repl.


But then the AI would theoretically have to write the playwright code. How does it verify it's getting the right page to begin with?

The recent models are pretty great at this. They read the source code for e.g. a Python web application and use that to derive what the URLs should be. Then they fire up a localhost development server and write Playwright scripts to interact with those pages at the predicted URLs.
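A rough sketch of the "derive the URLs from the source" step, assuming a Flask-style app that declares routes with `@app.route(...)` decorators and a dev server on `http://localhost:8000`. The helper name `predict_urls`, the regex, and the sample source are all made up for illustration; a model reading real code would handle far more cases than this.

```python
import re
from urllib.parse import urljoin

# Matches Flask-style decorators such as @app.route("/items"). This is an
# assumption about the framework; other frameworks declare routes differently.
ROUTE_RE = re.compile(r"""@\w+\.route\(\s*["']([^"']+)["']""")


def predict_urls(source: str, base: str = "http://localhost:8000") -> list[str]:
    """Return the URLs a dev server at `base` would be expected to serve."""
    paths = ROUTE_RE.findall(source)
    # Skip parameterised routes like /items/<int:item_id>; they need example values.
    return [urljoin(base, path) for path in paths if "<" not in path]


# Hypothetical application source, standing in for the real code being read.
EXAMPLE_SOURCE = '''
@app.route("/")
def index(): ...

@app.route("/about")
def about(): ...

@app.route("/items/<int:item_id>")
def item(item_id): ...
'''

if __name__ == "__main__":
    for url in predict_urls(EXAMPLE_SOURCE):
        print(url)
```

The predicted URLs are what a Playwright script would then be pointed at once the development server is running.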

The vision models (Claude Opus 4.5, Gemini 3 Pro, GPT-5.2) can even take screenshots via Playwright and then "look at them" with their vision capabilities.

It's a lot of fun to watch. You can tell them to run Playwright with headless mode turned off, at which point a Chrome window will pop up on your computer and you can watch them interact with the site through it.
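For anyone who hasn't seen it, the kind of script they write looks roughly like this. It's a minimal sketch using Playwright's Python sync API, assuming Playwright and its browsers are installed and something is serving on localhost; `screenshot_name` and `capture` are hypothetical helper names, not part of Playwright. Note that `p.chromium` launches the bundled Chromium rather than branded Chrome.

```python
from urllib.parse import urlparse


def screenshot_name(url: str) -> str:
    """Turn a URL into a flat PNG filename, e.g. /items/new -> items_new.png."""
    path = urlparse(url).path.strip("/")
    return (path.replace("/", "_") or "index") + ".png"


def capture(url: str, headless: bool = False) -> str:
    """Open `url` in Chromium and save a full-page screenshot next to the script."""
    # Imported here so screenshot_name() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    out = screenshot_name(url)
    with sync_playwright() as p:
        # headless=False makes the browser window visible while the script runs.
        browser = p.chromium.launch(headless=headless)
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=out, full_page=True)
        browser.close()
    return out
```

Calling `capture("http://localhost:8000/about")` would write `about.png`, which is the sort of file a vision model then reads back to check the page.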



