OpenHack makes AI bug hunting auditable instead of magical
May 25, 2026
Hadrian has released OpenHack as an MIT-licensed tool for source-guided vulnerability research. The key point: AI agents do not roam freely; they work through files, checkpoints and human approvals.
What this is about
Hadrian made OpenHack public on May 25, 2026: an MIT-licensed open-source workspace system for source-guided vulnerability research. It is not another chatbot that reads code and then announces a verdict. Instead, OpenHack records every step of a security review as files: recon results, routing units, scenarios, scenario results, finding candidates, triage decisions, final findings and logs.
That matters because AI in security often fails on trust. When a model claims it has found a critical flaw, teams need to know: where did that claim come from? Was it checked? Who approved the next step? OpenHack tries to make that chain visible.
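To make the idea of a visible chain concrete, here is a minimal sketch in Python. It is not OpenHack's actual file layout or API: the JSON fields (id, kind, parent, note), the file naming and the helper functions are all hypothetical, chosen only to show how file-backed records with parent links let a reviewer walk from a finding back to the recon item it started from.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def write_record(run_dir: Path, record_id: str, kind: str,
                 parent: str | None, note: str) -> None:
    """Store one review artifact as a JSON file named after its id.

    The schema here is an illustration, not OpenHack's real format.
    """
    payload = {"id": record_id, "kind": kind, "parent": parent, "note": note}
    (run_dir / f"{record_id}.json").write_text(json.dumps(payload, indent=2))


def trace(run_dir: Path, record_id: str) -> list[str]:
    """Follow parent links back to the origin; return kinds, oldest first."""
    chain: list[str] = []
    current: str | None = record_id
    while current is not None:
        record = json.loads((run_dir / f"{current}.json").read_text())
        chain.append(record["kind"])
        current = record["parent"]
    return list(reversed(chain))


def demo() -> list[str]:
    """Write a seven-step chain into a temp directory and trace the finding."""
    steps = [
        ("r1", "recon item", None, "upload handler located"),
        ("u1", "routing unit", "r1", "assigned to file-handling review"),
        ("s1", "scenario", "u1", "path traversal via file name"),
        ("sr1", "scenario result", "s1", "traversal sequence not normalized"),
        ("c1", "finding candidate", "sr1", "evidence attached"),
        ("t1", "triage decision", "c1", "accepted"),
        ("f1", "finding", "t1", "path traversal in upload"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        for record_id, kind, parent, note in steps:
            write_record(run_dir, record_id, kind, parent, note)
        return trace(run_dir, "f1")


if __name__ == "__main__":
    print(" -> ".join(demo()))
```

The point of the design is that the answer to "where did that claim come from?" lives on disk rather than in a model's context window, so it survives the end of the session.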
What OpenHack actually does
OpenHack runs inside a coding harness such as Claude Code, Codex, Cursor or a custom runner. The harness provides model access, a terminal, repository access and the execution environment. OpenHack provides the durable workflow: it creates a run structure, collects review surfaces, generates scenarios, has expert agents test those scenarios and then routes the results through separate triage.
The core is a narrow state chain: recon item → routing unit → scenario → scenario result → finding candidate → triage decision → finding. A human approves the phase transitions. That turns a loose agent loop into an auditable process. Optional Semgrep rules can enrich the recon phase; according to the project description, those hits are treated as hints, not automatically as proven vulnerabilities.
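The state chain with human approval can be sketched as a small gated state machine. This is an illustration under assumptions, not OpenHack's implementation: the Stage enum, the advance function and the ApprovalRequired exception are hypothetical names, and the sketch assumes that every single step needs a named approver, which may be stricter than the real phase checkpoints.

```python
from __future__ import annotations

from enum import IntEnum


class Stage(IntEnum):
    """The seven links of the state chain, in order."""
    RECON_ITEM = 0
    ROUTING_UNIT = 1
    SCENARIO = 2
    SCENARIO_RESULT = 3
    FINDING_CANDIDATE = 4
    TRIAGE_DECISION = 5
    FINDING = 6


class ApprovalRequired(Exception):
    """Raised when a transition is attempted without a named approver."""


def advance(current: Stage, approved_by: str | None,
            log: list[tuple[str, str, str]]) -> Stage:
    """Move exactly one step forward, but only with a human approver.

    Every successful transition is appended to the log so the run
    can be audited later. Skipping stages is not possible.
    """
    if current is Stage.FINDING:
        raise ValueError("chain is already complete")
    if not approved_by:
        raise ApprovalRequired(f"no approver for step after {current.name}")
    next_stage = Stage(current + 1)
    log.append((current.name, next_stage.name, approved_by))
    return next_stage


if __name__ == "__main__":
    audit_log: list[tuple[str, str, str]] = []
    stage = Stage.RECON_ITEM
    while stage is not Stage.FINDING:
        stage = advance(stage, "security-lead", audit_log)
    for entry in audit_log:
        print(entry)
```

Because the function can only move one step at a time and refuses to move without an approver, an agent cannot jump from a recon item straight to a finding, which is the property that separates an auditable process from a loose agent loop.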
Why it matters
Security teams are caught between two extremes. On one side, coding agents can inspect large codebases faster than individual analysts. On the other side, they create false positives, miss context or hallucinate exploit paths. OpenHack targets what security teams actually need day to day: reproducible proof rather than a flashy demo.
The published workflow names twelve expert families aligned with categories from the OWASP Top 10, MITRE terminology and CWE classes. That does not prove OpenHack will automatically produce better findings. But it is an important signal: agent work is mapped into known security categories instead of being sold as a proprietary black box. For open-source projects and smaller teams, the MIT license also matters because it allows experimentation without a large budget.
In plain language
Think of an apartment inspection. A weak inspector walks through the rooms and says, “There are problems here.” A good inspector takes photos, notes the room, describes the damage, resolves open questions and separates suspicion from evidence. OpenHack tries to bring AI bug hunting closer to the second version: every hint gets a place, a reason and a decision.
A practical example
A team runs an internal web app with 140,000 lines of code, 38 API routes and three upload features. An OpenHack run could first collect routes, authentication boundaries and parser entry points. The router might then produce 18 scenarios: access without the right role, unsafe file extensions, missing size limits or injection paths.
An expert agent checks one upload scenario and finds a possible path traversal issue. That is not immediately published as a vulnerability. First, it becomes a finding candidate with evidence. Then a separate triage step decides whether the issue is accepted, downgraded, marked as a duplicate or rejected. Later, a security lead can reconstruct why 18 scenarios produced perhaps only two real findings.
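The triage step in this example can be sketched as follows. The four outcomes come from the text above; everything else is an assumption made for illustration: the Candidate fields, the sample data and the rule that both accepted and downgraded candidates count as final findings are hypothetical and not taken from OpenHack itself.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Triage(Enum):
    ACCEPTED = "accepted"
    DOWNGRADED = "downgraded"
    DUPLICATE = "duplicate"
    REJECTED = "rejected"


@dataclass(frozen=True)
class Candidate:
    scenario_id: str   # which scenario produced this candidate
    summary: str
    evidence: str      # what the expert agent actually observed
    decision: Triage   # set by the separate triage step
    reason: str        # why triage decided this way


def final_findings(candidates: list[Candidate]) -> list[Candidate]:
    """Assumption: accepted and downgraded candidates both become findings."""
    kept = (Triage.ACCEPTED, Triage.DOWNGRADED)
    return [c for c in candidates if c.decision in kept]


def tally(candidates: list[Candidate]) -> Counter:
    """Count candidates per triage outcome for a run summary."""
    return Counter(c.decision.value for c in candidates)


def example_candidates() -> list[Candidate]:
    """Made-up sample data for the upload example; not real results."""
    return [
        Candidate("S07", "path traversal in upload", "file name kept '../'",
                  Triage.ACCEPTED, "reproduced against source"),
        Candidate("S11", "missing size limit", "no cap in handler",
                  Triage.DOWNGRADED, "proxy enforces a limit"),
        Candidate("S12", "path traversal in upload", "same sink as S07",
                  Triage.DUPLICATE, "same root cause as S07"),
        Candidate("S03", "role bypass on admin route", "guard looked absent",
                  Triage.REJECTED, "guard applied in middleware"),
    ]


if __name__ == "__main__":
    candidates = example_candidates()
    print(dict(tally(candidates)))
    for finding in final_findings(candidates):
        print(finding.scenario_id, finding.summary, "-", finding.reason)
```

Keeping the reason next to each decision is what lets a security lead later explain the rejected and duplicate candidates, not only the two that survived.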
Scope and limits
- OpenHack does not replace experienced security reviewers. It structures the work, but a human still has to judge scope, risk and evidence.
- Quality depends heavily on the harness, model, repository access and prompts. A bad run remains a bad run even if the files look tidy.
- The approach fits source-code review better than pure black-box pentesting. Runtime behavior, product logic and production data can still be missing.
The sober takeaway is this: OpenHack does not make AI security work automatically true. It makes it easier to audit. For security-critical agents, that can be more valuable than another loud model claim.
In plain English
OpenHack is an open system that binds AI agents in code security reviews to a traceable sequence of steps. Instead of only claiming a result, it stores hints, scenarios, checks and triage decisions as files.
Key Takeaways
- OpenHack was released on May 25, 2026 as an MIT-licensed open-source project.
- The system structures AI-assisted vulnerability research through files, checkpoints and human approvals.
- The workflow separates recon, scenarios, finding candidates and independent triage.
- Its expert families are aligned with known security categories such as OWASP, MITRE and CWE.
- The value is less about magic and more about traceability and auditability.
FAQ
Is OpenHack a replacement for pentesters?
No. It can structure and speed up reviews, but scope, risk and evidence still require human judgment.
What tools does OpenHack need?
It runs inside a coding harness such as Claude Code, Codex, Cursor or a custom runner. The harness provides model access, a terminal and repository access.
Why do files matter here?
Files make the run traceable. A team can later see which hints, scenarios and triage decisions led to a finding.
Is a Semgrep hit automatically a vulnerability?
No. According to the project description, Semgrep hits are treated as hints. A vulnerability has to be supported through scenario work and triage.