I've tried getting Claude to set up testing frameworks, but what ends up happening is it either writes canned tests, forgets about tests entirely, or outright lies about having written them. It's definitely helpful, but it feels far from something robust you can rely on. If you're reviewing everything the AI does then it will probably work, though.
LLMs are very good at looking at a change set and finding untested paths. As a standard part of my workflow, I always pass the LLM's work through a "reviewer", which is a fresh LLM session with instructions to review the uncommitted changes. I include instructions for reviewing test coverage.
I've also found that LLMs typically only partially implement a given task/story/spec/whatever. The reviewer stage also catches mismatches between the spec and the implementation.
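Roughly, the reviewer pass looks something like the sketch below. The `run_llm` helper and the prompt wording are placeholders for whatever API or CLI you actually use, not any particular SDK:

```python
import subprocess

def run_llm(prompt: str) -> str:
    """Placeholder for whatever LLM call you use (API, CLI session, etc.)."""
    raise NotImplementedError

def review_uncommitted_changes(spec_path: str) -> str:
    # Gather the uncommitted work the developer session produced.
    diff = subprocess.run(
        ["git", "diff", "HEAD"], capture_output=True, text=True, check=True
    ).stdout
    spec = open(spec_path).read()

    # Fresh session, review-only instructions: test coverage and spec conformance.
    prompt = (
        "You are a code reviewer. Review the following uncommitted diff.\n"
        "1. List any changed code paths that have no corresponding test.\n"
        "2. List any requirements in the spec that the diff does not implement.\n"
        "Reply with exactly LGTM if there are no findings.\n\n"
        f"SPEC:\n{spec}\n\nDIFF:\n{diff}"
    )
    return run_llm(prompt)
```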
I have an orchestrator bounce the flow back and forth between developing and reviewing until the review comes back clean, and only then do I bother to review its work. It saves so much time and frustration.
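The orchestration itself doesn't need to be fancy; a loop like this is the general shape (the function names are hypothetical, and the "clean" check here is just a string match on the reviewer's reply):

```python
MAX_ROUNDS = 5  # give up and escalate to a human eventually

def develop(task: str, feedback: str | None) -> None:
    """Placeholder: a developer LLM session that edits the working tree."""
    raise NotImplementedError

def review(task: str) -> str:
    """Placeholder: the reviewer pass from the sketch above."""
    raise NotImplementedError

def orchestrate(task: str) -> bool:
    feedback = None
    for _ in range(MAX_ROUNDS):
        develop(task, feedback)   # developer addresses the task or the latest feedback
        feedback = review(task)   # fresh reviewer looks at the uncommitted diff
        if "LGTM" in feedback:    # reviewer was told to reply LGTM when clean
            return True           # only now does a human review the work
    return False                  # still not clean; flag for manual attention
```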
Something I find helps a lot is having a project template that includes at least one passing test. That way the agent can run the test suite at the start with the correct test harness and then add new tests as it goes.
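For a Python project, for example, the template might ship nothing more than a smoke test like this, so the harness works on day one (the file path and module name are just examples):

```python
# tests/test_smoke.py -- shipped with the project template so the suite
# already passes before the agent has written any real code.
def test_smoke():
    # Swap this for an import of the real package once it exists,
    # e.g. `import myproject`.
    assert True
```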