I'm an infrastructure architect who started using AI assistants to write code 3 months ago.
After building several systems with Claude, I noticed a pattern: the code always had security issues I could spot from my ops background, but I couldn't fix them myself since I can't actually write code.
Why I built this: I needed a way to verify AI-generated code was production-safe.
Existing tools either required cloud uploads (privacy concern) or produced output too large for AI context windows.
TheAuditor solves both problems: it runs completely offline, and it chunks findings into 65KB segments that fit within Claude/GPT-4 context limits.
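A minimal sketch of the chunking idea, assuming a simple line-based split (the names and the exact strategy here are my illustration, not TheAuditor's actual implementation):

```python
CHUNK_LIMIT = 65_000  # stay under ~65KB per segment

def chunk_report(lines: list[str], limit: int = CHUNK_LIMIT):
    """Yield report segments whose UTF-8 size stays under `limit` bytes."""
    chunk, size = [], 0
    for line in lines:
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
        if size + line_bytes > limit and chunk:
            yield "\n".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += line_bytes
    if chunk:
        yield "\n".join(chunk)
```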
What I discovered: on real projects, TheAuditor consistently finds 50-200+ vulnerabilities in AI-generated code.
The patterns are remarkably consistent:
- SQL queries built with f-strings instead of parameterization (see the sketch after this list)
- Hardcoded secrets (`JWT_SECRET = "secret"` appears in nearly every project)
- Missing authentication on critical endpoints
- Rate limiting using in-memory storage that resets on restart
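For anyone who hasn't run into the first pattern, a minimal generic illustration (sqlite3 here, but the same shape appears with any driver; this is not TheAuditor's code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# Vulnerable: input like "' OR '1'='1" rewrites the query itself.
def get_user_unsafe(username: str):
    return conn.execute(
        f"SELECT * FROM users WHERE name = '{username}'"
    ).fetchone()

# Safe: the driver binds the value separately from the SQL text.
def get_user_safe(username: str):
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchone()
```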
Technical approach: TheAuditor runs 14 analysis phases in parallel, including taint analysis (tracking data from user input to dangerous sinks), pattern matching against 100+ security rules, and orchestration of industry-standard tools (ESLint, Ruff, MyPy, Bandit).
Everything outputs to structured JSON optimized for LLM consumption.
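To make the pattern-matching phase concrete, here is a toy version of the kind of AST check such a pass performs for the f-string pattern from the list above (hypothetical names; the real rules are far richer):

```python
import ast

class FStringSqlCheck(ast.NodeVisitor):
    """Flag .execute() calls whose first argument is an f-string."""

    def __init__(self) -> None:
        self.findings: list[int] = []

    def visit_Call(self, node: ast.Call) -> None:
        if (isinstance(node.func, ast.Attribute)
                and node.func.attr == "execute"
                and node.args
                and isinstance(node.args[0], ast.JoinedStr)):  # f-string node
            self.findings.append(node.lineno)
        self.generic_visit(node)

checker = FStringSqlCheck()
checker.visit(ast.parse('cur.execute(f"SELECT * FROM users WHERE id = {uid}")'))
print(checker.findings)  # -> [1]
```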
Interesting obstacle: when scanning files with vulnerabilities, antivirus software often quarantines the reports because they contain "malicious" SQL injection patterns, even though the tool is merely documenting them. I had to implement pattern defanging to reduce these false positives.
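A sketch of what defanging can look like (the token substitutions are illustrative, not the actual implementation):

```python
def defang(payload: str) -> str:
    """Break up the literal tokens AV signatures match on,
    while keeping the payload readable for an LLM."""
    for raw, safe in {"'": "[QUOTE]", ";": "[SEMI]", "--": "[DASHDASH]"}.items():
        payload = payload.replace(raw, safe)
    return payload

print(defang("' OR 1=1; --"))  # [QUOTE] OR 1=1[SEMI] [DASHDASH]
```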
Current usage: run `aud full` in any Python/JS/TS project. It generates a complete security audit in `.pf/readthis/`.
The AI can then read these reports and fix its own vulnerabilities. I've seen projects go from 185 critical issues to zero in 3-4 iterations.
The tool is particularly useful if you're using AI assistants for production code but worry about security.
It provides the "ground truth" that AI needs to self-correct.
Would appreciate feedback on:
- Additional vulnerability patterns common in AI-generated code
- Better ways to handle the antivirus false-positive issue
- Integration ideas for different AI coding workflows
Thanks for taking a look! /TheAuditorTool
> I've built the tool that makes AI assistants production-ready. This isn't competing with SonarQube/SemGrep. This is creating an entirely new category: AI Development Verification Tools.
Wow, that's a lot of talk for a tool that does regex searches and some AST matching, and supports only Python and JS (none of this is mentioned in the main project README, as far as I can tell?).
The actual implementation details are buried in an (LLM-written?) document: https://github.com/TheAuditorTool/Auditor/blob/main/ARCHITEC...
My favourite part is the "Pipeline System" section, which promises a "14-phase analysis pipeline" but never actually enumerates the 14 phases.
It reads a bit like the author is hiding what the tool actually does, which is a shame: there might be some genuinely neat ideas in there, but they're really hard to make out.