Enterprise Software Leader Proves 96% of SCA Findings Are False Positives
One security engineer, 2,000 CVEs a year, an hour of analysis per ticket. Konvu proved 96% of the backlog was noise and found exploitable CVEs the old severity rule would never have checked.
Industry: Enterprise Software
Company size: 100,000+ employees
Integrated tools: Black Duck, GitHub
Use case: SCA triage at enterprise scale
“The speed of decision making, that's where you shine.”
Senior Product Security Engineer, Global Enterprise Software Leader
Key Results
- 96% false positives, proven: dismissed with written evidence, not a score
- 4% exploitable: including low and medium CVEs the old process would never have checked
- 10 min triage target per CVE: down from about an hour of manual analysis
Fifty-one weeks of pressure
The product security team at one of the world's largest enterprise software companies works on a monthly release cycle. Every cycle ends the same way: a queue of open CVEs, a release date that will not move, and development teams waiting on a verdict.
For the senior product security engineer who owns a large part of that queue, the math was unforgiving. Black Duck scans across the product portfolio produced a constant stream of findings. He analyzed about 2,000 of them a year himself. A high-severity CVE took roughly an hour to work through, from opening the ticket to making a defensible decision.
“From the moment I start working on the ticket until I'm able to make a decision, it's about an hour when it's a high vulnerability.”
Some mornings disappeared entirely into ticket processing, four hours before any other work started. He called the past twelve months a brutal year for CVE volume. And the pressure never let up.
“There is one week a year when we don't have pressure. That's Christmas week. For fifty-one weeks a year, we're dealing with high CVEs.”
Panic above 7.0, forgotten below
With no way to investigate everything, the team relied on the same shortcut most enterprises use: the CVSS score. Findings above 7.0 got analyzed. Findings below 7.0 were deprioritized automatically. Nobody believed the line was accurate. It was just the only line available.
“When the CVSS score is above 7, everybody panics. Below 7, people forget about it because it's not on their plate.”
The scores did not even agree with each other. EPSS and CVSS regularly pointed in different directions on the same CVE. And neither answered the question the team actually needed answered: can an attacker exploit this in our product, as it is actually built and deployed? That is an exploitability question, and no score answers it.
The back and forth that leaves scars
The hours were only half the cost. Every verdict had to be defended to development teams under release pressure, usually with nothing stronger than a severity score as evidence. The debates were long, repetitive, and personal.
“They cursed my mom on every continent of the planet. We go into this back and forth about whether we're vulnerable or not.”
A security ticket landing weeks before a release felt, in his words, like a hammer. It caused friction, and it left scars. The team did not need another scanner or another score. They needed evidence strong enough to end the argument.
Three tests
The team did not evaluate Konvu on a slide deck. They ran it through three increasingly hard tests.
- Test 1: A working session. Konvu walked through how its agents investigate a finding: whether an attacker can actually exploit it in the product, with the full analysis laid out in a graph.
- Test 2: A POC on their own code. The team pointed Konvu at one of their own projects and let the agents work through its backlog of Black Duck findings.
- Test 3: A live CVE from his queue. He ran a real Apache Tomcat ticket through Konvu and compared the analysis, step by step, against his own manual process.
The first reaction came in the working session, looking at how Konvu presents its analysis. “I have to say, that's impressive,” he told the team.
“Of all the things you showed me, what I liked the most was the triage evidence. It saves a lot of back and forth.”
The POC ran on one of the company's own projects. Konvu's agents worked through its Black Duck findings and returned a verdict for each one, backed by a written analysis. Only 75 findings were exploitable. The rest were false positives, including an Angular CVE that a score-based process would have escalated: the vulnerable code was present, but the HTTP client calls it depended on were not attacker-controllable. Present in the code, not exploitable in the product. That distinction is exactly what a severity score cannot see.
The last test was his own workflow. He took a live Apache Tomcat CVE from his queue and ran Konvu against it, comparing the step-by-step analysis and dependency graph to what he would produce by hand, and to what a general-purpose AI assistant produced for the same ticket. Konvu's analysis held up. He had recently spent a full Sunday manually working through CVEs. This was the workflow he saw Konvu replacing.
What the evidence showed
Across the findings Konvu assessed, the severity mix looked like any enterprise backlog: 8% critical, 41% high, 36% medium. Read by score alone, roughly half the queue (the 49% rated critical or high) demanded panic.
The verdicts told a different story. 96% of findings were false positives, each dismissed with a written explanation of why no attack path exists in the product. 4% were exploitable and worth engineering time.
The detail that mattered most sat at the bottom of the severity range. Several of the exploitable findings were low and medium severity CVEs. Under the 7.0 rule, nobody would ever have opened those tickets. The score-based process burned hours on noise above the line and hid real risk below it.
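The gap between the score rule and an exploitability verdict can be sketched in a few lines. This is a minimal illustration, not Konvu's implementation: the `Finding` type, the `exploitable` field, and the sample records are all hypothetical, the identifiers are placeholders rather than real CVEs, and treating a score of exactly 7.0 as below the line is an assumption, since the text only says "above" and "below".

```python
from dataclasses import dataclass

CVSS_THRESHOLD = 7.0  # the cutoff described in the text


@dataclass(frozen=True)
class Finding:
    cve_id: str        # placeholder identifier, not a real CVE
    cvss: float        # severity score reported by the scanner
    exploitable: bool  # hypothetical verdict from an exploitability investigation


def score_rule_analyzes(finding: Finding) -> bool:
    """Score-based shortcut: only findings above the threshold get analyzed."""
    # Assumption: a score of exactly 7.0 falls below the line.
    return finding.cvss > CVSS_THRESHOLD


def compare(findings):
    """Split out the two ways the score rule disagrees with the verdicts."""
    # Analyzed under the rule, but the verdict says not exploitable.
    noise = [f for f in findings if score_rule_analyzes(f) and not f.exploitable]
    # Skipped under the rule, but the verdict says exploitable.
    missed = [f for f in findings if not score_rule_analyzes(f) and f.exploitable]
    return noise, missed


# Illustrative sample data, invented for this sketch.
SAMPLE = [
    Finding("EXAMPLE-0001", 9.8, False),
    Finding("EXAMPLE-0002", 7.5, False),
    Finding("EXAMPLE-0003", 5.3, True),
    Finding("EXAMPLE-0004", 3.1, False),
]

noise, missed = compare(SAMPLE)
print("analyzed but not exploitable:", [f.cve_id for f in noise])
print("exploitable but never opened:", [f.cve_id for f in missed])
```

In this invented sample, the two high-scoring findings consume analysis time without being exploitable, while the one exploitable finding sits below the threshold and would never be opened.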
“Engineers love the evidence. They see that, and it ends many conversations.”
By the end of the assessment, 58% of the backlog was dismissed with the evidence attached to each ticket, 23% was fixed, and 19% remained open with an owner. A queue that had only ever grown started to shrink.
The target they set for the rollout was direct: cut triage from an hour per CVE to under ten minutes, with every verdict carrying evidence a developer can check instead of a score they can argue with.
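To put that target in annual terms, here is a rough back-of-the-envelope calculation using the figures quoted above: about 2,000 CVEs a year, about an hour each today, and a ten-minute target. It assumes every CVE takes the full hour, which the source only states for high-severity tickets, so the "before" figure is best read as an upper bound. The function and constant names are illustrative.

```python
CVES_PER_YEAR = 2000   # "about 2,000 of them a year", from the text
MINUTES_BEFORE = 60    # roughly an hour per high-severity CVE, from the text
MINUTES_TARGET = 10    # rollout target, from the text


def annual_hours(cves: int, minutes_each: float) -> float:
    """Total analyst hours per year at a given triage time per CVE."""
    return cves * minutes_each / 60


# Assumption: every CVE costs the same time, regardless of severity.
before = annual_hours(CVES_PER_YEAR, MINUTES_BEFORE)
target = annual_hours(CVES_PER_YEAR, MINUTES_TARGET)

print(f"before: {before:.0f} hours/year")
print(f"target: {target:.0f} hours/year")
print(f"difference: {before - target:.0f} hours/year")
```

Under those assumptions the workload drops from about 2,000 analyst hours a year to about 333, a difference of roughly 1,667 hours.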
After the POC, the team rolled Konvu out across more of the portfolio. The rollout was the easy part: connect a repository, and the agents work through its backlog the same way they worked through the first one. Coverage no longer costs analyst hours.
Why it worked
Konvu did not replace Black Duck, and it did not add another dashboard. The scanner keeps finding. Konvu's agents do the investigation step that used to cost an hour per ticket: they determine whether each finding is exploitable in the product as it is actually built, then write the verdict and the evidence back into the tools the team already uses.
Exploitability is a different question from severity, and a different question from reachability. A reachable function is not necessarily exploitable. A CVE below 7.0 is not necessarily safe. Evidence, not scores, is what ends the back and forth.