
How We Got a 90% Fix Rate on Open Source Security Reports

Most automated security reports get ignored. We got 90.24% of ours accepted by doing what most tools skip: actually reading the code, understanding the architecture, and submitting fixes specific enough to merge without back-and-forth.
February 2026 · 10 min read
Jost
vulnerability research, semantic code analysis, application security, open source security, automated remediation

Most automated security reports get ignored. I don't blame the maintainers - if I got one more Dependabot PR that broke my build, I'd stop reading them too.

So when we started submitting vulnerability findings to open-source projects, we assumed the rejection rate would be brutal. These are busy people maintaining critical infrastructure on nights and weekends. They don't have time for another "your code might be insecure" email from a tool they've never heard of.

But that's not what happened.

Out of 41 findings reviewed by maintainers so far, 37 were accepted. That's 90.24%. Not acknowledged. Merged. Deployed to production.

I've been thinking about why, and I think it comes down to something pretty simple: we did the work that most automated tools skip.


The bar is on the floor

Here's the thing about automated security findings - the bar for quality is shockingly low. Most tools generate a list of potential issues, sorted by severity, with generic descriptions and maybe a link to a CWE page. The maintainer gets an email that says something like "possible SQL injection on line 342" and they're supposed to figure out the rest.

When we ran Semgrep against NocoDB, it produced 222 findings. We went through every single one. 208 were false positives. That's 94%. If you're a maintainer and 94 out of every 100 alerts are wrong, of course you stop reading them.

But worse than the noise: Semgrep missed the actual critical vulnerability. There was a SQL injection in the Oracle client - OracleClient.ts, 17 different locations where user-controlled parameters were concatenated directly into SQL queries. The kind of bug that leads to complete database compromise.

Semgrep didn't flag it because the vulnerability required understanding how data moved between files, through a query builder abstraction. Pattern matching saw the query builder and assumed things were fine.
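
For readers who haven't stared at this class of bug before, here's a minimal sketch in Python of the same pattern - not NocoDB's actual TypeScript, just an illustration of why a query builder doesn't save you once a parameter is concatenated into raw SQL instead of bound:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_unsafe(name: str):
    # Vulnerable pattern: user input concatenated straight into the SQL string.
    # name = "x' OR '1'='1" returns every row; stacked queries go further.
    return conn.execute("SELECT * FROM users WHERE name = '" + name + "'").fetchall()

def find_user_safe(name: str):
    # Fixed pattern: the driver binds the value, so it is never parsed as SQL.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

print(find_user_unsafe("x' OR '1'='1"))  # leaks the whole table
print(find_user_safe("x' OR '1'='1"))    # returns nothing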

That's the baseline we were working against. Maintainers are trained to expect noise and miss the real stuff. If you want them to pay attention, you have to be different.


We actually read the code

I know that sounds obvious. It shouldn't need to be said. But the reason most automated findings get rejected is that the tool clearly didn't understand the codebase.

Phase - 9 authorization bypasses

Take what we found in Phase. Phase is a secrets management tool - the irony of finding auth bypasses in a secrets manager is not lost on me - and we identified nine separate authorization vulnerabilities.

The one that sticks with me is the double-negative bug:

if not user_id is not None:
    # Check for permission here

Whoever wrote that line meant "if user_id exists, check permissions." Totally reasonable intent. But Python parses the condition as not (user_id is not None), which is just user_id is None - True only when user_id is missing. For every real user, the permission check never executes.

You can't find that with pattern matching. There's no regex for "developer meant X but Python does Y." You need to understand both what the code says and what it's supposed to do. We explained exactly that in our report - what the developer intended, how Python actually evaluates it, and why the permission check gets skipped.
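
If you want to see it for yourself, a four-line experiment makes the behavior concrete (the user_id values here are just placeholders):

for user_id in ("alice", None):
    buggy = not user_id is not None    # parsed as: not (user_id is not None)
    intended = user_id is not None     # what the author meant
    print(user_id, buggy, intended)
# alice -> buggy=False (permission check skipped), intended=True (check runs)
# None  -> buggy=True,  intended=False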

Phase fixed all nine issues across PRs #722 through #731. Took them about a week.

Weaviate - AWS credential injection

We found that their S3 backup module accepted AWS credentials directly from HTTP request headers - X-AWS-ACCESS-KEY, X-AWS-SECRET-KEY, X-AWS-SESSION-TOKEN - without validating that the credentials belonged to the current account. This is in modules/backup-s3/client.go, lines 82-94.

The attack path: inject your own AWS credentials via headers, redirect backups to your own S3 bucket, steal data.

We didn't just say "credential injection bad." We walked through the specific attack scenario and recommended they remove header-based credential acceptance entirely in favor of environment variables and IAM roles - which is how their architecture should have been handling it anyway.
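
Weaviate's module is written in Go, but the shape of the before-and-after is language-independent. A hypothetical Python/boto3 sketch (the request object and function names are stand-ins, not Weaviate's code):

import boto3

def backup_client_unsafe(request):
    # Anti-pattern: whoever controls the request headers decides which AWS
    # account receives the backup.
    return boto3.client(
        "s3",
        aws_access_key_id=request.headers.get("X-AWS-ACCESS-KEY"),
        aws_secret_access_key=request.headers.get("X-AWS-SECRET-KEY"),
        aws_session_token=request.headers.get("X-AWS-SESSION-TOKEN"),
    )

def backup_client_safe():
    # Recommended pattern: let the SDK resolve credentials from the environment
    # or the instance's IAM role, so a request can't redirect backups elsewhere.
    return boto3.client("s3")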

Weaviate confirmed both items (this one plus an SSRF in their Anthropic module) within a day of our HackerOne report and created GitHub issue #10146 to track the fix.

When you show maintainers that you actually understand their code, their architecture, and the real-world attack implications, they respond fast. They're not ignoring security - they're ignoring noise.


Showing the work matters more than you'd think

There's a trust gap in security reporting. The maintainer has no idea if your finding is real until they verify it themselves. If your report is vague, that verification takes hours they don't have. If your report is specific enough, they can confirm it in minutes.

Every finding we submit includes the exact file and line numbers, the CWE classification, a clear explanation of the vulnerability and how to exploit it, and - when we submit fixes - a PR they can review directly.

vLLM - 3 days to merge

We found torch.load() calls without the weights_only=True flag in tensorizer.py (lines 763-765) and adapters.py (line 93). Without that flag, torch.load() falls back to full pickle deserialization, so a maliciously crafted checkpoint file can execute arbitrary Python code the moment it's loaded.

It's a well-known PyTorch issue, but it's the kind of thing that slips through because the code works fine with legitimate checkpoints. You only notice the problem when someone feeds you a malicious one.
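
The remediation is a one-line change wherever a checkpoint gets loaded. A minimal sketch (the file name and toy model are placeholders, not vLLM's code):

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
torch.save(model.state_dict(), "checkpoint.pt")

# Unsafe: torch.load falls back to full pickle deserialization, so a crafted
# checkpoint can execute code on load.
# state_dict = torch.load("checkpoint.pt")

# Safe: weights_only=True restricts deserialization to tensors and plain
# containers and rejects arbitrary pickled objects.
state_dict = torch.load("checkpoint.pt", weights_only=True)
model.load_state_dict(state_dict)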

We reported through GitHub Security on January 7th. Three days later, the fix was merged in PR #32045. That's a fast turnaround for any open-source project, and I think it's because the report was specific enough that the maintainers could verify and fix it without going back and forth.

Qdrant - same-day fix

We found unsafe memory access in their CSR loader - lib/sparse/src/index/loaders.rs, lines 79-100. The get_unchecked() call was accessing memory-mapped data without validating that offsets actually pointed within buffer bounds. The indptr validation checked monotonicity but never checked against buffer sizes, so you could craft a malicious CSR file to read adjacent memory or crash the process.
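
Qdrant's loader is Rust, but the rule the fix enforces is language-independent: every offset read from an untrusted file has to be checked against the buffer it will index before it is used. A simplified Python sketch of that idea (the layout here is illustrative, not Qdrant's actual CSR format):

import struct

def read_indptr(indptr_bytes: bytes, count: int, values_len: int) -> list[int]:
    if count * 8 > len(indptr_bytes):
        raise ValueError("indptr section runs past the end of the file")
    indptr = list(struct.unpack_from(f"<{count}Q", indptr_bytes, 0))
    prev = 0
    for offset in indptr:
        if offset < prev:            # monotonicity - the check that already existed
            raise ValueError("indptr is not monotonically increasing")
        if offset > values_len:      # bounds - the check that was missing
            raise ValueError("indptr offset points outside the values buffer")
        prev = offset
    return indptr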

Reported on January 7th. Merged in PR #7884 the same day.

Same-day turnaround. That doesn't happen when the maintainer has to spend hours figuring out if your report is real.


The honest numbers

Here's where everything stands as of January 2026:

Status                 Vulnerabilities   Projects
Reviewed & Accepted    37                16
Reviewed & Rejected    4                 16
Awaiting Response      135               20
In Progress            49                7
Total                  225               45


90.24% acceptance rate on the reviewed findings.

And yeah, we get pushback too - sometimes for reasons we didn't expect.

We submitted a fix to Casdoor for a TLS certificate verification issue in their SUBMAIL email provider (InsecureSkipVerify: true hardcoded for all connections, making them vulnerable to man-in-the-middle attacks). The PR had the vulnerability details, the fix, the tests. A maintainer closed it with one line:

"closed, PR author should be human"

Fair enough. That's their call. But the vulnerability is still there - every SUBMAIL user on that platform has TLS verification disabled whether they know it or not.
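
For what it's worth, the fix is small. A Python analog of the same mistake and the same remedy (Casdoor's code is Go, so this is purely illustrative):

import ssl
import urllib.request

# Analog of InsecureSkipVerify: true - accept any certificate, which is exactly
# what an on-path attacker impersonating the mail provider needs.
insecure = ssl.create_default_context()
insecure.check_hostname = False
insecure.verify_mode = ssl.CERT_NONE

# The fix: keep default verification, so the connection only succeeds against a
# certificate chain the system actually trusts.
secure = ssl.create_default_context()

urllib.request.urlopen("https://www.example.com", context=secure)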

Other rejections were less dramatic. Some findings turned out to be intentional design decisions or had mitigating factors we didn't fully account for. Honestly, if we had a 100% acceptance rate I'd be worried - that would mean we were only submitting the obvious stuff and not pushing the boundaries of what we can detect.

Eight (and growing) projects with full public disclosures:

  • Weaviate - 2 vulnerabilities, GitHub issue #10146

  • vLLM - 1 vulnerability, PR #32045

  • Qdrant - 1 vulnerability, PR #7884

  • Langfuse - 4 vulnerabilities, PRs #11311 and #11395

  • Agenta - 8 vulnerabilities, release v0.77.1

  • Cloudreve - 8 vulnerabilities, release 4.11.0

  • Phase - 9 vulnerabilities, PRs #722-731

  • NocoDB - 5 vulnerabilities, PRs #12748-12752


What the responses looked like

I could talk about methodology all day, but the maintainer responses tell the story better.

Langfuse shipped a fix for the SSRF vulnerability in their PostHog integration within 24 hours. Merged in PR #11311. They added us to their Hall of Fame, which was a nice surprise.

Phase remediated all nine authorization bypasses within a week. Nine findings, ten PRs (#722-731), one week. That's a team that took the report seriously.

Qdrant merged their fix the same day. NocoDB developed fixes on private internal branches, then submitted the verified fixes through public PRs #12748 through #12752.

Agenta fixed all eight findings - including the critical RestrictedPython sandbox escapes where __import__ was explicitly added to safe_builtins, effectively disabling the entire sandbox - in release v0.77.1.
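
To see why that one entry matters, here's a minimal sketch assuming RestrictedPython's standard compile_restricted / safe_builtins API - the guest snippet stands in for attacker-controlled code, and none of this is Agenta's actual setup:

from RestrictedPython import compile_restricted, safe_builtins

# Stand-in for attacker-controlled code running inside the "sandbox".
guest_code = "from os import getcwd\nresult = getcwd()"

byte_code = compile_restricted(guest_code, filename="<guest>", mode="exec")

# Intended configuration: no __import__ in the builtins, so the import fails.
# exec(byte_code, {"__builtins__": dict(safe_builtins)})  # ImportError

# The reported misconfiguration: __import__ added back to the safe builtins.
leaky_builtins = dict(safe_builtins)
leaky_builtins["__import__"] = __import__

scope = {"__builtins__": leaky_builtins}
exec(byte_code, scope)
print(scope["result"])  # guest code now reaches os - the sandbox is effectively gone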

When maintainers move fast, it means two things: the finding was clearly real, and the report was clear enough to act on without a bunch of back-and-forth.


The takeaway

A 90% acceptance rate isn't really the point. The point is what had to be true for that number to happen: the findings had to be real, the reports had to be clear, and the analysis had to go deeper than pattern matching.


Every assessment is public with full technical details at kolega.dev/security-wins. Code locations, CWEs, PR numbers, disclosure timelines. If you think we got something wrong, tell us. We've been wrong before and we'll say so when it happens.


Find and fix your technical debt.