Artificial Intelligence for Open Source Risk Management

Artificial Intelligence for Good Governance of Open Source

Artificial Intelligence (AI) is revolutionizing the way we live, work, and think. In recent years, computing machines have become intelligent enough to recognize real-world objects, recognize speech, learn programs, paint like an artist, or even dream like humans.

The security and reliability of software systems, which are enormously important to our modern economy, are also benefiting from advances in AI research. Although open source is no more or less secure than other software, the availability of its source code makes it an easier target for the detection and exploitation of security vulnerabilities. Figure 1 below shows the number of vulnerabilities reported in the National Vulnerability Database (NVD) by year. Note that many more vulnerabilities never make it to NVD, a topic that I’ll address in my FLIGHT 2017 presentation with Nathan Zhang, a data scientist on my team.


Figure 1: Distribution of vulnerabilities reported in NVD by year

The recent exploitation of a vulnerability (CVE-2017-5638) in Apache Struts reminds us of the severe consequences that enterprises (as well as individuals) face when they don't secure and manage the open source in their applications. As open source solutions expand into different industries and markets, the timely discovery and mitigation of publicly known vulnerabilities have become increasingly important. Unfortunately, the security experts who often discover these vulnerabilities (with the intention of mitigating the risks) find them extremely difficult to analyze. For instance, to determine threat levels and exploitability factors, security experts are often required to determine: (1) access and authentication complexity, (2) the confidentiality, integrity, and availability impacts of a vulnerability, and (3) numerical scores that quantify the items in (1) and (2). NVD is one of several good sources for vulnerability assessment methodologies.
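To make the scoring step in (3) concrete, here is a minimal sketch of how the metrics in (1) and (2) can be combined into a single number. It assumes the CVSS version 2 base equation and its published metric weights, which is an assumption on my part about the scoring scheme rather than something stated above; the function name `cvss2_base_score` and the example vectors are purely illustrative.

```python
# Assumption: scoring follows the CVSS v2 base equation.
# Weights below are the metric values from the CVSS v2 specification.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base_score(vector: str) -> float:
    """Compute a CVSS v2 base score from a vector such as 'AV:N/AC:L/Au:N/C:P/I:P/A:P'."""
    # Split "AV:N/AC:L/..." into {"AV": "N", "AC": "L", ...}.
    metrics = dict(part.split(":") for part in vector.split("/"))

    # Item (2): confidentiality, integrity, and availability impacts.
    impact = 10.41 * (
        1
        - (1 - IMPACT[metrics["C"]])
        * (1 - IMPACT[metrics["I"]])
        * (1 - IMPACT[metrics["A"]])
    )

    # Item (1): access vector, access complexity, and authentication.
    exploitability = (
        20
        * ACCESS_VECTOR[metrics["AV"]]
        * ACCESS_COMPLEXITY[metrics["AC"]]
        * AUTHENTICATION[metrics["Au"]]
    )

    # Item (3): a single score on a 0-10 scale, rounded to one decimal.
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)


if __name__ == "__main__":
    # Illustrative vectors only, not tied to any specific vulnerability.
    print(cvss2_base_score("AV:N/AC:L/Au:N/C:P/I:P/A:P"))
    print(cvss2_base_score("AV:N/AC:L/Au:N/C:C/I:C/A:C"))
```

The arithmetic is the easy part. The expert effort goes into choosing the right value for each metric, and that is the step we would like machines to assist with.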

Artificial Intelligence in Vulnerability Analysis?

Overall, vulnerability analysis is a time-consuming task that must nevertheless be done quickly, without skipping the essential steps needed to mitigate the risks effectively. The situation is getting worse as the number of discovered vulnerabilities grows (recall Figure 1). On a given day, our security experts at Black Duck could end up analyzing tens of vulnerabilities to make the consumers of affected open source solutions more secure. In this context, we are using AI solutions to help our security experts conduct vulnerability analysis at a large scale, quickly and accurately. If computing machines (powered by AI solutions) could do this analysis independently and automatically, it would save an enormous amount of time and cost. While that is a worthy goal, we first need to understand where the challenges lie.


Training Computing Machines

An important part of AI-driven security solutions is training computing machines with real-world datasets. At Black Duck Research, we are fortunate to have the world’s largest database of open source software, supplemented by important pieces of metadata such as publicly known vulnerabilities, licenses, vendor information, and so on. Our data scientists and security experts are using these data to build the next generation of open source security solutions. To train a machine, you essentially need to provide your algorithms with a relevant and sufficient amount of data, so that they can continue to learn from the evolving data as new open source solutions become available and new vulnerabilities are discovered.

The Ever Evolving Data in Open Source

These constantly evolving data pose several challenges that must be overcome before we can realize effective AI-driven security solutions. Many of these challenges stem from the fact that open source projects entail large volumes of structured and unstructured data that are difficult to find, manage, and analyze. We are applying various Data Mining, Machine Learning, and Natural Language Processing solutions to some of the most challenging problems in open source security. The following are some examples of our AI-driven solutions.

  1. Automatically map publicly known vulnerabilities to open source projects (which may be known by different names within various open source and security communities).
  2. Automatically conduct a preliminary analysis of vulnerabilities to determine their severity and importance so that vulnerability analysis can be prioritized. Our AI-driven solution evaluates these risks in the context of applications and their business impact.
  3. Automatically find relationships between the various open source projects that are detected within your code. Our AI-driven solution gives you a better understanding of your code dependencies so that you can mitigate security and compliance risks at the file and directory level.
  4. Automatically analyze hundreds of legal documents (licenses, terms of service, privacy statements, and laws such as HIPAA, DMCA, and others) to determine the compliance risks.
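As a rough illustration of the first item, the sketch below maps a free-text project mention from a vulnerability report to a known project name using token overlap (Jaccard similarity). This is a deliberately simple stand-in and not a description of our production approach; the function names, the noise-word list, the threshold, and the sample catalog are all hypothetical.

```python
import re

# Hypothetical noise words that rarely help distinguish one project from another.
NOISE_WORDS = {"the", "project", "framework", "library"}


def normalize(name: str) -> frozenset:
    """Lowercase a name and keep its alphabetic tokens, dropping noise words and digits."""
    tokens = re.findall(r"[a-z]+", name.lower())
    return frozenset(t for t in tokens if t not in NOISE_WORDS)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Share of tokens two names have in common (0.0 when either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(mention: str, catalog: list, threshold: float = 0.5):
    """Return the catalog name most similar to the mention, or None if nothing is close enough."""
    mention_tokens = normalize(mention)
    scored = [(jaccard(mention_tokens, normalize(name)), name) for name in catalog]
    score, name = max(scored, default=(0.0, None))
    return name if score >= threshold else None


if __name__ == "__main__":
    # Hypothetical catalog of canonical project names.
    catalog = ["apache-struts", "apache-tomcat", "openssl"]
    print(best_match("Apache Struts 2 Framework", catalog))
    print(best_match("nginx", catalog))
```

Real-world naming is messier than this (forks, renamed projects, vendor prefixes), which is why this mapping is treated as a learning problem rather than a fixed rule.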

To be clear, AI cannot yet fully automate the process of open source security or open source risk management. Nonetheless, we've seen success in experimenting with and implementing various AI-driven solutions that are stepping stones toward a fully automated open source risk management solution.

Do you want to know more about our AI-based approaches or get involved in our research projects? Contact us for more details.
