Data sources

Explore this section to discover the trusted sources of our data.

Everything in this modern world begins with data! And so does Debricked.

Debricked's algorithms constantly scan various sources for information about vulnerabilities, licenses and health data. These include (but are not limited to) the NVD database, NPM, C# Announcement, FriendsOfPHP's security advisories, Go Vulnerability Database, PyPA Python Advisory Database, GitHub Issues, GitHub Security Advisory, mailing lists, and more. We check our sources every 15 minutes, which keeps our data fast and accurate.

Data refinement

Once the data is collected, we "clean it up", since it is often quite messy. Our sources are a mix of structured and unstructured data, so the raw data contains many errors by default.

CVE-parsing

The largest source of vulnerability data is the NVD database. The problem with this source is that the CPEs (the product identifiers connected to vulnerabilities) are often mislabelled, and it is common to see a time lag of up to four weeks before CPEs are assigned to CVEs. Here, we use our state-of-the-art natural language processing to re-classify the vulnerabilities, which increases the number of correctly classified vulnerabilities and reduces that time lag to 0 days. This is one of many data-refinement activities that we carry out 24/7 for our customers.
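To make the term concrete, here is a minimal sketch of how a CPE 2.3 formatted string breaks down into its named fields, which is where the vendor and product labels mentioned above live. It only illustrates the data format; it does not show Debricked's natural language processing. The function name `parse_cpe23` is hypothetical, and the splitting is simplified (it treats a colon preceded by a backslash as escaped and ignores rarer escaping cases).

```python
import re

# The eleven attributes of a CPE 2.3 formatted string, in order.
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe23(cpe):
    """Split a CPE 2.3 formatted string into a dict of named fields.

    Hypothetical helper for illustration only.
    """
    # Split on colons that are not escaped with a backslash.
    parts = re.split(r"(?<!\\):", cpe)
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError("not a CPE 2.3 formatted string: " + cpe)
    return dict(zip(CPE_FIELDS, parts[2:]))


fields = parse_cpe23("cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*")
print(fields["vendor"], fields["product"], fields["version"])
```

If the vendor or product field is wrong or missing in the source data, a vulnerability cannot be tied to the right package, which is why re-classification matters.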

Fully automated

What makes Debricked special is that we do not use any form of manual analysis of vulnerabilities. That’s a risky bet that took almost 5 years of research and development to pull off! But as a result, as soon as a vulnerability is discovered in a data source, we index it, refine it and try to find a fix. All of this can happen within 15 to 30 minutes. Moreover, we constantly monitor for changes in the data regarding that particular vulnerability. In contrast, it takes an average of 30 days for the NVD database to be complemented with more details, and sometimes this is never done if the vulnerability has low priority. The same is true for finding fixes and other details. But with Debricked, you can rest assured that our systems are working around the clock and are not introducing any noticeable lag between the vulnerability sources and your developers. Debricked is here to assist you in building the bricks of security!

Scanning the code for dependencies & matching

In the next step, Debricked scans your projects for dependency files. This can be done in several ways: through CI/CD integrations (recommended), manual uploads, or our API.

What do we look for?

We scan for dependencies declared in files such as the well-known "package-lock.json" and "composer.json". Each dependency file is then transformed into our own internal format and sent to our matching & rule engines. Indirect (transitive) dependencies are also resolved and traversed in this process.
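As a rough illustration of this step, the sketch below flattens a "package-lock.json" file into a simple list of name, version and direct/indirect flags. It assumes the npm lockfile layout that uses a top-level "packages" key (lockfileVersion 2 and 3). The function name `flatten_lockfile` and the tuple format are hypothetical and are not Debricked's internal format.

```python
import json

# A tiny, made-up lockfile used only for this example.
EXAMPLE_LOCK = """
{
  "name": "demo",
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo", "dependencies": {"express": "^4.18.2"}},
    "node_modules/express": {"version": "4.18.2"},
    "node_modules/body-parser": {"version": "1.20.1"}
  }
}
"""


def flatten_lockfile(lock_text):
    """Return a sorted list of (name, version, is_direct) tuples."""
    lock = json.loads(lock_text)
    packages = lock.get("packages", {})
    # The "" key describes the project itself and lists its direct dependencies.
    root = packages.get("", {})
    direct = set(root.get("dependencies", {})) | set(root.get("devDependencies", {}))

    result = []
    for path, meta in packages.items():
        if not path:
            continue
        # Nested installs look like "node_modules/a/node_modules/b".
        name = path.split("node_modules/")[-1]
        is_direct = name in direct and path == "node_modules/" + name
        result.append((name, meta.get("version", ""), is_direct))
    return sorted(result)


for name, version, is_direct in flatten_lockfile(EXAMPLE_LOCK):
    print(name, version, "direct" if is_direct else "indirect")
```

Here "express" is reported as a direct dependency, while "body-parser" appears only because another package pulled it in.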

Matching & rule engines

These are two pieces of software that:

  • match the vendor and name of each of your dependencies against our internal database

  • determine the likelihood of this match being correct

Open source projects often have similar names, share parts of names, or even have the same name but different vendors. Because of this, simple regular expressions and allow/deny lists are not enough. We use machine learning to estimate the likelihood that a match is a true positive. The accuracy of these algorithms varies depending on the language and package manager being used.
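The idea of scoring a match rather than accepting or rejecting it outright can be sketched as follows. This example deliberately swaps in plain string similarity from Python's standard library in place of machine learning, so it is a stand-in for the concept and not the method Debricked uses. The package list, the equal weighting of vendor and name, and the function names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for an internal package database: (vendor, name) pairs.
KNOWN_PACKAGES = [
    ("symfony", "http-foundation"),
    ("symfony", "http-kernel"),
    ("guzzlehttp", "guzzle"),
]


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_matches(vendor, name, known=KNOWN_PACKAGES):
    """Return (score, vendor, name) candidates, best match first."""
    scored = []
    for known_vendor, known_name in known:
        # Assumed weighting: vendor and name contribute equally to the score.
        score = 0.5 * similarity(vendor, known_vendor) + 0.5 * similarity(name, known_name)
        scored.append((score, known_vendor, known_name))
    return sorted(scored, reverse=True)


for score, vendor, name in rank_matches("symfony", "http-foundation"):
    print(round(score, 2), vendor + "/" + name)
```

Note how "symfony/http-kernel" still receives a fairly high score because it shares a vendor and part of its name with the query. That is exactly the kind of near-miss that makes a likelihood more useful than a yes/no answer.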

A solution to your problems

In most cases, the solution to a vulnerability in an open source dependency is simply to update the dependency to a later version that is not vulnerable. Often the update is easy to make, but if the gap between the versions is large, the update could introduce breaking changes to your code. We help you figure out which version to update to by finding the smallest possible update that still fixes the vulnerability, which keeps the risk of breaking changes as low as possible.
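A minimal sketch of the "smallest possible update" idea, assuming purely numeric dotted versions and a known set of vulnerable versions. The version numbers are made up, the function names are hypothetical, and real version schemes (pre-releases, build metadata) would need a proper version parser.

```python
def parse(version):
    """Turn "1.2.10" into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def smallest_safe_update(current, available, vulnerable):
    """Return the lowest version above `current` that is not vulnerable, or None."""
    candidates = [
        version
        for version in available
        if parse(version) > parse(current) and version not in vulnerable
    ]
    return min(candidates, key=parse) if candidates else None


# Made-up release history for a hypothetical package.
available = ["1.2.0", "1.2.1", "1.3.0", "1.10.0", "2.0.0"]
vulnerable = {"1.2.0", "1.2.1"}

print(smallest_safe_update("1.2.0", available, vulnerable))
```

In this example the suggestion is "1.3.0" rather than "2.0.0": both fix the issue, but the smaller jump is less likely to break existing code.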
