survey
Open access

A Systematic Literature Review on Automated Software Vulnerability Detection Using Machine Learning

Published: 11 November 2024

Abstract

In recent years, numerous Machine Learning (ML) models, including Deep Learning (DL) and classic ML models, have been developed to detect software vulnerabilities. However, there is a notable lack of comprehensive and systematic surveys that summarize, classify, and analyze the applications of these ML models in software vulnerability detection. This absence may lead to critical research areas being overlooked or under-represented, resulting in a skewed understanding of the current state of the art in software vulnerability detection. To close this gap, we propose a comprehensive and systematic literature review that characterizes the different properties of ML-based software vulnerability detection systems using six major Research Questions (RQs).
Using a custom web scraper, we extracted a set of studies from four widely used online digital libraries: ACM Digital Library, IEEE Xplore, ScienceDirect, and Google Scholar. We then manually analyzed the extracted studies to filter out work unrelated to software vulnerability detection, created taxonomies, and addressed the RQs. Our analysis indicates a significant upward trend in applying ML techniques for software vulnerability detection over the past few years. Prominent conference venues include the International Conference on Software Engineering (ICSE), the International Symposium on Software Reliability Engineering (ISSRE), the Mining Software Repositories (MSR) conference, and the ACM International Conference on the Foundations of Software Engineering (FSE), whereas Information and Software Technology (IST), Computers & Security (C&S), and Journal of Systems and Software (JSS) are the leading journal venues.
Our results reveal that 39.1% of the subject studies use hybrid sources, whereas 37.6% use benchmark data for software vulnerability detection. Code-based data are the most commonly used data type among subject studies, with source code being the predominant subtype. Graph-based and token-based input representations are the most popular, accounting for 57.2% and 24.6% of the subject studies, respectively. Among the input embedding techniques, graph embedding and token vector embedding are the most frequently used, accounting for 32.6% and 29.7% of the subject studies, respectively. Additionally, 88.4% of the subject studies use DL models, with recurrent neural networks and graph neural networks being the most popular subcategories, whereas only 7.2% use classic ML models. Among the vulnerability types covered by the subject studies, CWE-119, CWE-20, and CWE-190 are the most frequent. In terms of tools, Keras with a TensorFlow backend and PyTorch are the most frequently used model-building libraries, each used in 42 studies. In addition, Joern is the most popular tool for code representation, used in 24 studies.
Finally, we summarize the challenges and future directions in the context of software vulnerability detection, providing valuable insights for researchers and practitioners in the field.

1 Introduction

Automatic vulnerability identification is essential for ensuring software security [99]. Successes in the field of Machine Learning (ML) have inspired a lot of interest in using these models to find software vulnerabilities in general/traditional software systems [145]. ML models excel at detecting subtle patterns and correlations in large datasets [6]. They can automatically extract important features from raw data, such as source code, and detect hidden patterns that could reveal software defects. This capacity is critical in vulnerability detection, as vulnerabilities frequently entail subtle code characteristics and dependencies. In addition, ML models can handle a wide range of data types and formats, including source code [26], textual information [56], and numerical features such as commit characteristics [114]. They can use these data representations to effectively discover vulnerabilities. This versatility enables researchers to use a variety of data sources and include numerous features for comprehensive vulnerability detection.
Although many studies have used ML models to detect software vulnerabilities, there has not been a comprehensive and systematic review to consolidate the various approaches and characteristics of these techniques. Conducting such a systematic survey would be beneficial for practitioners and researchers in gaining a better understanding of the current state-of-the-art tools for vulnerability detection and could serve as an inspiration for future studies. This study conducts a comprehensive and detailed survey to review, analyze, describe, and classify software vulnerability detection studies from different perspectives. We analyzed 138 studies published in many software engineering flagship journals and conferences from January 2011 to June 2024. In this study, we investigated the following Research Questions (RQs):
RQ1: What is the trend of studies?
RQ1.1: What is the trend of studies over time?
RQ1.2: What is the distribution of publication venues?
RQ2: What are the characteristics of software vulnerability detection datasets?
RQ2.1: What is the source of datasets?
RQ2.2: What are the most commonly used data types?
RQ2.3: What are the most commonly used input representations?
RQ2.4: What are the most commonly used embedding approaches?
RQ3: What is the distribution of ML and Deep Learning (DL) models used for software vulnerability detection?
RQ4: What are the most frequent types of vulnerabilities covered in the subject studies?
RQ5: What are the most frequently used tools for software vulnerability detection?
RQ6: What are possible challenges and open directions in software vulnerability detection?
This article makes the following contributions:
We thoroughly analyze 138 studies that used ML models to detect security vulnerabilities with respect to publication trends, distribution of publication venues, and types of contributions.
We conduct a comprehensive analysis of the datasets, data processing, data representation, model architectures, tools, and types of covered vulnerabilities in the subject studies.
We provide a classification of ML models used in vulnerability detection based on their architectures.
We discuss distinct technical challenges of using ML techniques in vulnerability detection and outline key future directions.
We share our results and analysis data as a replication package to allow other researchers to easily follow and extend this work.
We believe that this work is valuable for researchers and practitioners in software engineering and cybersecurity, especially those focused on software vulnerability detection and mitigation. It also benefits policymakers, software providers, and stakeholders interested in improving software security and reducing cyberattack risks by informing their software development, procurement, and risk management decisions.
The rest of the article is organized as follows. Section 2 provides background information and reviews related work. Section 3 outlines the research methodology proposed in this article. Section 4 addresses the RQs and presents the corresponding results. Section 5 discusses potential threats to the validity of this study. Finally, Section 6 presents the conclusion and suggests future directions.

2 Background and Related Work

In this section, we begin by defining vulnerability and outlining the key steps in detecting software vulnerabilities. We then review related surveys, emphasizing how they differ from our own.

2.1 Background

Software vulnerability management is crucial for ensuring software security and integrity [119]. With the increasing reliance on software for critical operations like financial transactions [39], vulnerabilities pose serious risks, including unauthorized access and service disruption. Effective management is essential for protecting user privacy, maintaining system availability, and ensuring trustworthiness. There are multiple steps in software vulnerability management, including vulnerability detection, vulnerability analysis, and vulnerability remediation. In the following subsections, we elaborate on each step in detail.

2.1.1 Vulnerability Detection.

Vulnerability detection is critical in the overall process of managing software vulnerabilities [11]. It consists of detecting security weaknesses in software systems that attackers may exploit. Several traditional techniques are commonly used. In manual code auditing, human experts examine the source code thoroughly to detect coding flaws, unsafe procedures, and possible vulnerabilities. Static analysis [35] uses automated tools to analyze the source code or compiled binaries without executing the software under test. Dynamic analysis [67, 102] evaluates the behavior of software while it is running: the software is run in a controlled environment or through automated tests while its execution and interactions with system resources are monitored. However, dynamic analysis can incur significant system overhead [167]. One approach in this category is fuzz testing [42]. In fuzz testing, the input space of the program under test is identified, and inputs are then mutated randomly or according to a set of pre-defined rules to generate malformed inputs as well as boundary input values (i.e., edge cases). These malformed inputs are expected to reach parts of the program that do not validate their input properly, exposing serious security vulnerabilities such as denial of service or remote code execution.
Hybrid code analysis [25] combines the benefits of static and dynamic analysis to increase the effectiveness of software vulnerability detection. The key strength of static analysis is its ability to quickly scan the entire codebase and identify flaws before the code executes; yet it often generates many false positives and has limited context on runtime behavior [52]. Dynamic analysis, in contrast, runs the code and monitors its behavior in real time. It excels at finding runtime issues such as memory leaks, but it is resource intensive, as the entire program under test must be run to explore different code paths.
The hybrid model leverages the strengths of both approaches to ensure comprehensive coverage. Despite its benefits, implementing hybrid code analysis involves technical complexities, such as integrating and synchronizing static and dynamic tools. It also demands significant computational resources and time, potentially slowing down development.
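The mutation-based fuzz testing described earlier in this subsection can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `parse_record` is a hypothetical toy target with a deliberately missing length check, and `mutate` and `fuzz` are hypothetical helper names, not part of any fuzzing tool cited in the text. Real fuzzers add coverage feedback, corpus management, and crash triage.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy target: the first byte declares the payload length.

    Deliberate flaw for illustration: the declared length is never checked
    against the actual payload size.
    """
    if not data:
        raise ValueError("empty input")  # documented, expected rejection
    declared = data[0]
    payload = data[1:]
    # Raises IndexError when the declared length exceeds the payload size.
    return bytes(payload[i] for i in range(declared))


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one random mutation: bit flip, boundary byte, or truncation."""
    data = bytearray(seed)
    choice = rng.randrange(3)
    pos = rng.randrange(len(data))
    if choice == 0:
        data[pos] ^= 1 << rng.randrange(8)                 # flip a single bit
    elif choice == 1:
        data[pos] = rng.choice([0x00, 0x7F, 0x80, 0xFF])   # boundary values
    else:
        data = data[:pos]                                  # truncate the input
    return bytes(data)


def fuzz(seed: bytes, iterations: int = 500, rng_seed: int = 0) -> list[bytes]:
    """Return the mutated inputs that triggered an unexpected exception."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            parse_record(candidate)
        except ValueError:
            pass                       # expected rejection, not a finding
        except Exception:
            crashes.append(candidate)  # unexpected failure: potential bug
    return crashes


crashes = fuzz(b"\x03abc")
print(f"{len(crashes)} crashing inputs found")
```

The fixed random seed keeps runs reproducible, which makes it easier to replay a crashing input when investigating the underlying flaw.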

2.1.2 Vulnerability Analysis.

After the detection of vulnerabilities, the subsequent step in software vulnerability management is vulnerability analysis and assessment [130]. This step involves a further examination of identified vulnerabilities to assess their severity, impact, and potential exploitability. First, with regard to severity, accurately assessing software vulnerabilities is vital for several reasons. One reason is that it allows organizations to prioritize their response based on the severity of the vulnerabilities. Severity refers to the potential impact a vulnerability could have if exploited [15]. By accurately assessing the severity, organizations can focus their attention on high-severity vulnerabilities that pose significant threats to the security and functionality of the software system. Second, with regard to impact, accurately assessing vulnerabilities helps determine the potential impact they may have on the organization [43]. The term impact refers to the manifestations of exploiting a vulnerability, such as denial of service [53] or data breaches. By understanding the potential impact, organizations can make informed decisions regarding the urgency and priority of remediation efforts. Third, with regard to exploitability, accurate vulnerability assessment aids in understanding their potential exploitability [14]. This entails determining the possibility that an attacker will be successful in exploiting the vulnerability to infiltrate the software system.

2.1.3 Vulnerability Remediation.

The process of resolving detected software vulnerabilities by different techniques such as patching, code modification, and repairing is referred to as software vulnerability remediation [59]. The fundamental goal of remediation is to eliminate or mitigate vulnerabilities to improve the security and dependability of the software system. One common approach to vulnerability remediation is applying patches provided by software vendors or open source communities [156]. Patches are updates or fixes that address specific vulnerabilities or weaknesses identified in a software system.

2.1.4 ML for Software Vulnerability Detection.

By applying data analysis and pattern recognition to the search for software security vulnerabilities, ML approaches have revolutionized software vulnerability detection [145]. These techniques improve the accuracy and efficiency of vulnerability detection, potentially allowing automated detection, faster analysis, and the identification of previously undisclosed vulnerabilities. One common application of ML in vulnerability detection is the classification of code snippets [27], software binaries, or code changes extracted from open source repositories such as GitHub or from vulnerability databases such as Common Vulnerabilities and Exposures (CVE). ML models can be trained on labeled datasets, where each sample represents a known vulnerability or non-vulnerability. These models then learn to generalize from the provided examples and classify new instances based on the patterns they have learned. This method allows for automatic vulnerability discovery without the need for manual examination, considerably lowering the time and effort necessary for analysis.
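As a rough illustration of training on labeled samples and classifying new instances, the sketch below fits a multinomial naive Bayes classifier over code tokens. It is a toy under stated assumptions: the six training snippets and their labels are invented for this example, `TokenNaiveBayes` is a hypothetical class name, and naive Bayes stands in for the far more capable models surveyed in this article.

```python
import math
import re
from collections import Counter


def tokenize(code: str) -> list[str]:
    """Split code into identifiers and single non-space symbols."""
    return re.findall(r"[A-Za-z_]\w*|\S", code)


class TokenNaiveBayes:
    """Multinomial naive Bayes over code tokens with add-one smoothing."""

    def fit(self, samples: list[str], labels: list[int]) -> "TokenNaiveBayes":
        self.class_counts = Counter(labels)
        self.token_counts = {label: Counter() for label in self.class_counts}
        for code, label in zip(samples, labels):
            self.token_counts[label].update(tokenize(code))
        self.vocab = set().union(*self.token_counts.values())
        return self

    def predict(self, code: str) -> int:
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(count / total)          # log prior
            for tok in tokenize(code):
                score += math.log((counts[tok] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy dataset: label 1 = vulnerable pattern, 0 = non-vulnerable.
samples = [
    "strcpy(buf, input);",
    "gets(line);",
    "sprintf(out, fmt, name);",
    "strncpy(buf, input, sizeof(buf) - 1);",
    "fgets(line, sizeof(line), stdin);",
    "snprintf(out, sizeof(out), fmt, name);",
]
labels = [1, 1, 1, 0, 0, 0]

model = TokenNaiveBayes().fit(samples, labels)
print(model.predict("strcpy(dest, src);"))
print(model.predict("strncpy(dest, src, sizeof(dest) - 1);"))
```

With so few samples the classifier mostly keys on surface tokens such as `sizeof`, which hints at why the surveyed studies rely on much larger datasets and richer representations.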
ML models for detecting software vulnerabilities have promising advantages over traditional methodologies, which we discuss in turn. Automation is a significant advantage. ML models can automatically scan and analyze large codebases or system configurations, detecting potential vulnerabilities without requiring human intervention for each case [12]. This automation speeds up the detection process, allowing security teams to focus on verifying and mitigating vulnerabilities rather than on manual analysis. With regard to efficiency and scalability, ML approaches offer faster analysis. Traditional vulnerability detection techniques rely on manual inspection or the application of pre-defined rules [128]. In contrast, ML approaches can evaluate enormous volumes of data in parallel and generate predictions quickly, dramatically shortening the time necessary to find vulnerabilities. With regard to detection effectiveness, ML models can uncover previously unknown vulnerabilities, commonly known as zero-day vulnerabilities [5]. By learning patterns and generalizing from labeled data, these models may uncover signs of vulnerabilities even when they have not been specifically trained on them. This capability improves overall security by helping to identify and address unknown weaknesses in software before they are exploited by attackers [1].
Figure 1 shows the overall pipeline of software vulnerability detection using ML models, which involves several key stages.
Fig. 1. Overall pipeline of software vulnerability detection.
The first stage is data collection, where data is gathered from various sources such as benchmark datasets including but not limited to the National Vulnerability Database (NVD) and the National Institute of Standards and Technology (NIST) Software Assurance Reference Dataset (SARD), code repositories (GitHub), and specific open source projects (LibTIFF, FFMPEG).
The data preprocessing stage involves tokenization, parsing (using tools like Joern), normalization, and feature extraction to convert raw code into analyzable formats.
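A minimal sketch of the tokenization and normalization steps follows, assuming C-like input. The `KEEP` set, the `VAR<n>`/`FUN<n>` placeholder scheme, and the function names are assumptions chosen for illustration; the regular expressions do not handle string literals or preprocessor directives, and real pipelines typically rely on a parser such as Joern rather than regexes.

```python
import re

# Tokens left untouched: a few C keywords plus some library calls (illustrative).
KEEP = {"int", "char", "void", "if", "else", "for", "while", "return", "sizeof",
        "strcpy", "memcpy", "malloc", "free"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def strip_comments(code: str) -> str:
    """Remove /* ... */ and // comments (ignores comment markers in strings)."""
    code = re.sub(r"/\*.*?\*/", " ", code, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", " ", code)


def normalize(code: str) -> list[str]:
    """Tokenize code and map user-defined names to VAR<n>/FUN<n> placeholders."""
    tokens = TOKEN_RE.findall(strip_comments(code))
    names: dict[tuple[str, str], str] = {}
    counters = {"VAR": 0, "FUN": 0}
    out = []
    for i, tok in enumerate(tokens):
        if not IDENT_RE.fullmatch(tok) or tok in KEEP:
            out.append(tok)
            continue
        # An identifier directly followed by "(" is treated as a function name.
        is_call = i + 1 < len(tokens) and tokens[i + 1] == "("
        prefix = "FUN" if is_call else "VAR"
        key = (prefix, tok)
        if key not in names:
            counters[prefix] += 1
            names[key] = f"{prefix}{counters[prefix]}"
        out.append(names[key])
    return out


print(normalize("char buf[8]; /* copy */ strcpy(buf, user_input); // unsafe"))
```

Renaming user-defined identifiers while keeping library calls is one common design choice: it shrinks the vocabulary without discarding the API names that often signal risky operations.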
The data representation stage is where the preprocessed data is converted into appropriate representations, including graph-based representations such as control flow or dataflow graphs, token representations, or numerical attributes.
In the feature extraction stage, once the data is represented in an appropriate form, these representations are converted into suitable features using different embedding techniques such as graph embedding or token vector embedding.
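To make token vector embedding concrete, the sketch below maps token sequences to integer ids, pads them to a fixed length, and looks up one vector per token. It is a simplified stand-in: the vectors are randomly initialized here, whereas in practice they would be learned (for example with word2vec or inside the model), and all function names are hypothetical.

```python
import random

PAD, UNK = "<pad>", "<unk>"


def build_vocab(corpus: list[list[str]]) -> dict[str, int]:
    """Assign an integer id to every token seen in the corpus."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in corpus:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens: list[str], vocab: dict[str, int], max_len: int) -> list[int]:
    """Convert tokens to ids, truncating or padding to exactly max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


def init_table(vocab_size: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Random embedding table; row 0 (padding) is the zero vector."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]
    table[0] = [0.0] * dim
    return table


def embed(ids: list[int], table: list[list[float]]) -> list[float]:
    """Mean-pool the vectors of non-padding tokens into one feature vector."""
    rows = [table[i] for i in ids if i != 0]
    if not rows:
        return [0.0] * len(table[0])
    return [sum(col) / len(rows) for col in zip(*rows)]


corpus = [
    ["strcpy", "(", "VAR1", ",", "VAR2", ")", ";"],
    ["free", "(", "VAR1", ")", ";"],
]
vocab = build_vocab(corpus)
table = init_table(len(vocab), dim=4)
ids = encode(["free", "(", "VAR9", ")"], vocab, max_len=6)
print(ids, embed(ids, table))
```

Sequence models would normally consume the per-token vectors directly; mean pooling is used here only to obtain a single fixed-size feature vector with little code.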
In the model inference stage, appropriate DL models (e.g., Recurrent Neural Networks (RNNs), Graph Neural Networks (GNNs), Transformers, Autoencoders, and Deep Belief Networks (DBNs)), as well as traditional ML models (e.g., Support Vector Machines (SVMs), Decision Trees, and Random Forests), are chosen based on the characteristics of the data. The training process includes splitting the data into training and test sets, feature engineering, hyperparameter tuning, and applying suitable training algorithms.
In addition, model evaluation is often conducted using cross validation, performance metrics (e.g., accuracy, precision, and recall), confusion matrices, and ablation studies. This step helps ensure that the models are accurate and reliable for detecting software vulnerabilities.
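The evaluation step can be sketched with a confusion matrix, the metrics derived from it, and a simple k-fold split. The labels and predictions below are made-up values for illustration, and the function names are hypothetical; libraries such as scikit-learn provide production-grade equivalents.

```python
from typing import Iterator


def confusion(y_true: list[int], y_pred: list[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = vulnerable."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn


def metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy, precision, recall, and F1 from the confusion matrix."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def k_fold_indices(n: int, k: int) -> Iterator[tuple[list[int], list[int]]]:
    """Yield (train, test) index lists; each index is in exactly one test fold."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# Made-up labels and predictions for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(confusion(y_true, y_pred))
print(metrics(y_true, y_pred))
print(list(k_fold_indices(6, 3)))
```

Precision and recall are reported alongside accuracy because vulnerable samples are usually a minority, so a model that predicts "non-vulnerable" for everything can still score a high accuracy.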

2.2 Related Work

Several survey papers on software vulnerabilities exist in the literature. In this section, we analyze these papers along the aspects shown in Table 1.
No. | Study | Data Source | Representation | Embedding | Models | Vulnerability Types | Tools
1 | Le et al. [72] | ×××
2 | Ghaffarian & Shahriari [40] | ××
3 | Lin et al. [86] | ××
4 | Zeng et al. [173] | ××
5 | Semasaba et al. [124] | ××
6 | Sun et al. [133] | ××××
7 | Kritikos et al. [69] | ×××
8 | Khan & Parkinson [66] | ××××××
9 | Nong et al. [112] | ××××××
10 | Chakraborty et al. [12] | ××××××
11 | Liu et al. [90] | ××××××
12 | Our survey |
Table 1. Comparison of Contributions between Our Survey and the Existing Related Surveys/Reviews
The table's columns represent the aspects covered by each survey. Data Source indicates whether the survey reviewed vulnerability detection data sources. Representation indicates whether the survey considered source code representation in its analysis. Embedding indicates whether the survey considered feature embedding. Models indicates whether the survey reviewed the ML models used. Vulnerability Types indicates whether the survey considered vulnerability types based on Common Weakness Enumeration (CWE) numbers. Tools indicates whether the survey covered tools used for software vulnerability detection, such as those employed for model building or dataset processing.
The works of Ghaffarian and Shahriari [40] and Kritikos et al. [69] are the closest surveys to ours with respect to data-driven detection of security vulnerabilities. These surveys analyzed ML-based software vulnerability detection from various aspects, as shown in Table 1. However, our work differs in that it additionally surveys vulnerability detection from two aspects: the types of vulnerabilities covered and the tools used for software vulnerability detection. Understanding different types of vulnerabilities gives researchers insights into various attack patterns, enabling them to design detection techniques that can identify both known and unknown attack patterns. Understanding tools for software vulnerability detection reveals technological trends and helps researchers in this field leverage tools for reproducibility. It also highlights the strengths and weaknesses of existing tools, guiding new developments. Popular tools offer community support, documentation, and shared knowledge, accelerating innovation and the practical application of research.
Le et al. [72] reviewed prior research on data-driven software vulnerability assessment and prioritization that leverages ML and data mining methods. The major difference from our work is that we review software vulnerability detection techniques, that is, techniques for identifying potential vulnerabilities in software systems, whereas they survey assessment and prioritization techniques.
Lin et al. [86] examined the literature on using DL and neural network based techniques to detect software vulnerabilities. The major difference compared to our work is that we also analyze publication trends for software vulnerability detection studies in journals and conferences, which provides a comprehensive understanding of the publishing patterns in this area of research. Trend analysis can shed light on the distribution of research output across publication venues and on the shifting preferences of researchers and authors.
Zeng et al. [173] discussed the growing focus on exploitable software vulnerabilities and the development of detection methods, especially those using ML techniques. They reviewed 22 recent studies employing DL for vulnerability detection and identified four significant game-changers in the field, which they compared based on data sources, feature representation, DL models, and detection tools. Our survey differs in two key ways. First, we analyze publication trends in software vulnerability detection in journals and conferences, providing a comprehensive understanding of research trends. Second, we cover additional aspects beyond data sources, feature representation, and ML models, including vulnerability types and detection tools.
Kritikos et al. [69] and Sun et al. [133] focused on cybersecurity and aimed to improve cyber resilience. Sun et al. [133] discussed the paradigm shift in understanding and protecting against cyber threats from reactive detection to proactive prediction, with an emphasis on new research on cybersecurity incident prediction systems that use many types of data sources. Kritikos et al. [69] discussed the challenges of migrating applications to the cloud and ensuring their security, with a focus on vulnerability management during the application lifecycle and the use of open source tools and databases to better secure applications. While both works aim to improve the security of applications, they differ in their focus and the techniques used. These works mainly focus on providing guidance and tools to support vulnerability management during the application lifecycle, whereas our survey focuses on software vulnerability detection using ML techniques, which aims to automate the identification of vulnerabilities in source code or repository data (e.g., commit characteristics).
Khan and Parkinson [66] focused on vulnerability assessment, which is the process of finding and fixing vulnerabilities in a computer system before they can be exploited by hackers. Their review highlights the necessity for more studies into automated vulnerability mitigation strategies that can effectively secure software systems. In contrast, vulnerability identification with ML approaches on source code entails analyzing a software's source code to spot security flaws. Instead of evaluating the safety of the entire system, this method concentrates on finding vulnerabilities in the code itself.
Nong et al. [112] explored the open science aspects of studies on software vulnerability detection and argued that there is a dearth of research on problems of open science in software engineering, particularly about software vulnerability detection. The authors conducted an exhaustive literature study and identified 55 relevant studies that propose DL-based vulnerability detection approaches. They investigated open science aspects including availability, executability, reproducibility, and replicability. The study revealed that 25.5% of the examined approaches provide open source tools.
Chakraborty et al. [12] investigated the performance of cutting-edge DL-based vulnerability prediction approaches in real-world vulnerability prediction scenarios. They found that the performance of state-of-the-art DL-based techniques drops by more than 50% in real-world scenarios. The significant difference is that our survey focuses on the usage of ML models for software vulnerability detection and characterizes the different stages of the vulnerability detection pipeline, whereas they focus on issues related to the use of state-of-the-art DL models for software vulnerability detection.
Liu et al. [90] discussed the increasing popularity of DL techniques in software engineering research due to their ability to address software engineering challenges without extensive manual feature engineering. The major difference compared to our study is that we focus on the usage of ML techniques in software vulnerability detection pipelines, whereas they emphasize replicability and reproducibility of the results reported in software engineering research studies.

3 Methodology

3.1 Sources of Information

In this article, we conduct a systematic survey following other works [65, 116] to collect and examine studies published from January 2011 to June 2024 that focus on software vulnerability detection using ML techniques. The overall workflow of our systematic approach is depicted in Figure 2. We target a set of popular and widely used digital libraries as the source of our data, including ACM Digital Library, ScienceDirect, IEEE Xplore, and Google Scholar. We developed a web crawler based on the Selenium and Beautiful Soup libraries. We chose a web crawler because it offers a reliable, scalable, and effective method for collecting relevant information from the web, which is very useful for academic research, specifically for systematic literature reviews.
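As a rough illustration of the record-extraction step, the sketch below pulls paper titles out of a search-results page. It is not the authors' crawler: it uses Python's standard-library `html.parser` in place of Selenium and Beautiful Soup so that it runs without third-party packages, and the `<h3 class="title">` markup and the names `TitleExtractor` and `extract_titles` are assumptions made for the example.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text of every <h3 class="title"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # Start buffering only for the heading elements we treat as titles.
        if tag == "h3" and dict(attrs).get("class") == "title":
            self._in_title = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_title:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Closing the heading finishes one title; nested tags are ignored.
        if tag == "h3" and self._in_title:
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


def extract_titles(html):
    """Return the titles found in one page of (already downloaded) HTML."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    return parser.titles
```

In a real crawler, a browser-automation layer would fetch each results page and handle pagination before a parsing step like this one runs.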
Fig. 2. Overall workflow of our systematic survey.
The period from January 2011 to June 2024 is an appropriate time interval for extracting software vulnerability detection studies for several reasons. One reason is the increase in the volume and diversity of software vulnerabilities. During the past decade, there has been a significant increase in the number and diversity of software vulnerabilities that have been discovered and reported. As of 2021, there were 150,000 CVE records in NVD. This increase has created a need for more sophisticated and effective methods for vulnerability detection, which has led to the development of new data-driven techniques. A second reason is advancements in ML and data analytics. The past decade has seen significant advancements in ML, including the development of DL algorithms [44, 55], natural language processing techniques [89], and other data-driven approaches that are highly effective in detecting software vulnerabilities.

3.2 Search Terms

Following existing surveys [72, 86, 124, 173], we devised the following search terms:
“vulnerability detection” OR “Deep Transfer Learning Vulnerability Detection” OR “Transfer Learning Software Vulnerability Detection” OR “Transfer Learning Software Bug Detection” OR “Software Vulnerability Detection” OR “Vulnerability Detection Using Deep Learning” OR “Source Code Security Bug Prediction” OR “Source Code Vulnerability Detection” OR “Source Code Bug Detection” OR “Vulnerability Detection on Source Code Using Deep Learning”
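A disjunctive query of this shape can be assembled programmatically before it is submitted to each digital library. The following is a minimal sketch: the helper name `build_query` is hypothetical, only the first three phrases are listed for brevity, and straight quotes are used because the exact quoting syntax each library accepts is an assumption.

```python
# Search phrases copied from the list above (first three shown for brevity).
SEARCH_TERMS = [
    "vulnerability detection",
    "Deep Transfer Learning Vulnerability Detection",
    "Transfer Learning Software Vulnerability Detection",
]


def build_query(terms):
    """Quote each phrase and join the phrases with OR (hypothetical helper)."""
    return " OR ".join(f'"{term}"' for term in terms)


if __name__ == "__main__":
    print(build_query(SEARCH_TERMS))
```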
Using these keywords and our web scraper, we collected more than 15K initial records from the subject digital libraries shown in Figure 2. We then manually analyzed and filtered the initial records in three stages, verifying each record by its title, abstract, and content in turn. These three stages are explained in detail in the following subsections.

3.3 Study Selection and Quality Assessment

The process of selecting studies to be included in our survey involves the following stages: (1) initially choosing studies based on their title, (2) selecting studies after reviewing their abstracts, and (3) making further selections after reading the full articles. Note that the initial search results contain entries that are not related to software vulnerability detection. This might be caused by accidental keyword matching. We manually checked each paper and removed these irrelevant papers to ensure the quality of our survey dataset. We also observe that there exist duplicate papers among search results since the same study could be indexed by multiple databases. We then discarded duplicate studies manually.
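The survey removes duplicates manually. As an aid to that kind of check, a script could pre-flag records whose titles match after normalization. The sketch below is one hypothetical way to do so; the record layout (a dict with `title` and `library` keys) and the function names are assumptions, not part of the authors' tooling.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Exact matching on normalized titles misses near-duplicates such as retitled journal extensions, which is one reason a manual pass remains useful.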
The inclusion criteria are as follows: (1) the studies should have been peer reviewed (i.e., we do not include arXiv papers), (2) the studies should have experimental results, (3) the studies should propose a novel ML technique, (4) the studies should improve existing data-driven vulnerability detection techniques, and (5) the input to ML models should be either source code, text, commit, byte-code, or a combination of them. In addition, we have the following exclusion criteria to filter out irrelevant papers: (1) studies focusing on other engineering domains (electrical engineering, mechanical engineering, aerospace engineering, etc.), (2) studies addressing static analysis, dynamic analysis, hybrid analysis, and mutation testing, (3) review or survey studies, (4) studies focusing on vulnerability detection of web and Android applications, (5) studies belonging to one of the following categories: books, chapters, tutorials, or technical reports, and (6) studies focusing on malware detection on mobile devices, intrusion detection, and bug detection using static code attributes (e.g., cyclomatic complexity).

3.3.1 Title Filtering Stage.

In this stage, we filter studies based on their titles. Since titles do not convey much information about the subject study, we only focused on relevance to the initial keywords. In this stage, we answer the following question: Do the titles contain specific keywords or phrases that are central to software vulnerability detection? For example, in the study titled “Toward Hardware-Based IP Vulnerability Detection and Post-Deployment Patching in Systems-on-Chip,” although the title includes our devised keyword vulnerability detection, the context indicates that the focus is on hardware and systems-on-chip rather than software engineering.
After the manual analysis on approximately 15K records, we collected 398 unique studies for further evaluation.
Abstract Filtering Stage. Given the list of studies filtered from the previous stage, we thoroughly analyzed the abstract of the studies. We decomposed the abstract of each paper into four major sections, including Context, Objective, Approach, and Results/Findings, as abstracts of research papers often follow such structure.
This stage of filtering yielded 202 unique papers for further verification.
Content Filtering Stage. In this stage, we analyze the content of each study in detail to perform the filtering process. Since the full text of each study contains more detail, we devise a set of criteria questions and rely on the answers to assess the quality of the papers. If the answers to all of these questions are positive, the study is relevant; otherwise, we remove the paper from further examination. The questions are as follows: (1) Is there a clearly stated research goal related to software vulnerability detection in the introduction of the paper? (2) Does the proposed vulnerability detection approach use ML or DL techniques? (3) Is there a defined and repeatable technique? (4) Is there any explicit contribution to software vulnerability detection? (5) Is there a clear methodology for validating the technique? (6) Are the subject projects selected for validation suitable for the research goals? (7) Are the employed datasets relevant to software vulnerability detection? (8) Is the type of input data to DL and ML models relevant to software vulnerability detection (valid data types include source code, binary code, text, and commit metrics)? (9) Are there control techniques or baselines to demonstrate the effectiveness of the software vulnerability detection technique? (10) Are the evaluation metrics relevant (e.g., do they evaluate the effectiveness of the proposed technique) to the research objectives? (11) Do the results presented in the study align with the research objectives, and are they presented in a clear and relevant manner?
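The decision rule for this stage can be written as a simple conjunction over the eleven answers. The sketch below assumes each answer has already been recorded by a human reviewer as a boolean; the names `NUM_QUESTIONS` and `is_relevant` are illustrative only.

```python
NUM_QUESTIONS = 11  # number of criteria questions in the content filtering stage


def is_relevant(answers):
    """Return True only if every criteria question was answered positively."""
    if len(answers) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} answers, got {len(answers)}")
    return all(answers)
```

A single negative answer is enough to exclude a paper, which makes this the strictest of the three filtering stages.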
The filtering process in this stage resulted in 138 subject studies to address the RQs. We used these 138 studies to create taxonomies which are explained in detail in the next section.
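To make the selection funnel concrete, the sketch below computes the share of records retained at each stage. The starting figure of 15,000 is an approximation (the text reports "more than 15K" initial records), so the first rate is only indicative; the function name `retention_rates` is hypothetical.

```python
# (stage name, number of records remaining after that stage)
STAGES = [
    ("initial records", 15000),  # approximation of "more than 15K"
    ("title filtering", 398),
    ("abstract filtering", 202),
    ("content filtering", 138),
]


def retention_rates(stages):
    """Percentage of records kept by each stage relative to the previous one."""
    rates = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        rates.append((name, round(100 * after / before, 1)))
    return rates


if __name__ == "__main__":
    for name, rate in retention_rates(STAGES):
        print(f"{name}: {rate}% retained")
```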

3.4 Taxonomy Development and Classification Methodology

In this section, we present the methodology used to develop our taxonomy and classify the selected papers based on our RQs. The process is done in an incremental approach following existing studies [53]. The foundation of our taxonomy is anchored in a systematic analysis of the literature, guided by the specific RQs designed to explore various dimensions of software vulnerability detection. Each RQ serves as a focal point for our classification, ensuring a structured and coherent approach.
Extraction of Relevant Information. We meticulously examined each selected paper to extract relevant text segments related to the RQs. For RQ2, which pertains to the sources of datasets, we examine the experimental setup section of each study. This is the section in which authors most commonly discuss the source of their datasets. This allows us to understand the types of datasets the authors used to evaluate their proposed software vulnerability detection techniques. For RQ3, we analyzed the section detailing the proposed approach for software vulnerability detection. This involved identifying descriptions of the employed ML and DL models. One of the main sources of information that clearly explains the proposed approach is the overall architecture, which depicts the entire process of the proposed technique. For RQ4, we examine the vulnerability types covered in the subject study. These types are often expressed using the CWE system, which makes them easy to locate in the paper. We search for any keywords that start with CWE in the paper. If we find any CWE IDs mentioned, we record them. Otherwise, we note that the paper does not specify which vulnerability types the study aims to detect. Please note that some papers do not mention the CWE ID. For instance, for Integer Overflow (CWE-190), they use only the vulnerability name instead of the CWE ID. Therefore, we search for both CWE IDs and other related vulnerability keywords. For RQ5, we thoroughly analyzed the experimental sections, particularly the implementation sections of the subject studies, to extract information about the tools used for building the ML models. Our empirical evaluation revealed that the authors usually use the keywords implementation or built to describe the tools they used. For RQ6, we examine the introduction section of the subject study, as authors often explicitly mention the specific problem they address in software vulnerability detection.
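The RQ4 lookup described above (search for CWE IDs, then fall back to vulnerability names) can be sketched as follows. This is an illustration rather than the authors' procedure: the keyword map is hypothetical and contains only the Integer Overflow example mentioned in the text, and `extract_cwe_ids` is an invented name.

```python
import re

# Matches identifiers such as CWE-190, case-insensitively.
CWE_ID_PATTERN = re.compile(r"\bCWE-(\d+)\b", re.IGNORECASE)

# Hypothetical name-to-ID map; only this one pair is taken from the text.
KEYWORD_TO_CWE = {"integer overflow": "CWE-190"}


def extract_cwe_ids(text):
    """Return the sorted CWE IDs mentioned by ID or by a known vulnerability name."""
    found = {f"CWE-{number}" for number in CWE_ID_PATTERN.findall(text)}
    lowered = text.lower()
    for keyword, cwe_id in KEYWORD_TO_CWE.items():
        if keyword in lowered:
            found.add(cwe_id)
    return sorted(found)
```

An empty result corresponds to the case where a paper does not specify which vulnerability types it targets.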
Create Preliminary Taxonomies. Initially, we establish a preliminary taxonomy that groups the studies based on defined RQs, which provides a basic framework for organizing the studies in a meaningful and systematic manner. For example, for the first study, we create preliminary taxonomies for RQ1 through RQ6. After thoroughly addressing all RQs for a given study, we move on to the next study.
Iterative Refinement. Once the initial taxonomy is created, we proceed to expand and refine it as we delve deeper into the analysis of each RQ across all subject studies. We expand the taxonomy by assigning new papers to the preliminary taxonomy. If a new paper cannot fit into any of the existing categories within the taxonomy, a new category is created that reflects the unique characteristics of that paper. To ensure the accuracy of the taxonomy, the second and third authors (who are not involved in the taxonomy creation process) randomly select 20 papers from the workflow and check the created taxonomies for any discrepancies. They mark any disagreements they identify. Subsequently, all authors engage in discussions to address and resolve these disagreements. Initially, the disagreement rate was 30%, but after a second round of review and cross checking of the papers, we were able to eliminate all disagreements.
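A disagreement rate like the one reported for the 20 sampled papers can be computed as the fraction of papers on which two labelings differ. The sketch below assumes the rate is counted per paper and that each paper has a single category label; both are assumptions, since the text does not state how the 30% figure was derived.

```python
def disagreement_rate(author_labels, reviewer_labels):
    """Fraction of commonly labeled papers whose category labels differ."""
    papers = author_labels.keys() & reviewer_labels.keys()
    if not papers:
        raise ValueError("no papers were labeled by both parties")
    disagreements = sum(
        1 for paper in papers if author_labels[paper] != reviewer_labels[paper]
    )
    return disagreements / len(papers)
```

Under these assumptions, a 30% rate on a sample of 20 papers would correspond to 6 papers with differing labels.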
Resolving Disagreements. During the extraction process, if we encountered conflicting information or interpretations, we collaboratively discussed these discrepancies to reach a consensus. This collaborative effort ensured that our classification remained consistent and accurate. By following this rigorous methodology, we ensured that our taxonomy is grounded in detailed and systematic analyses of the literature. This approach provides a clear and coherent framework for classifying the selected papers and addressing each RQ comprehensively.

4 Results

In this section, we present our analyses and findings to address the RQs.

4.1 RQ1: What Is the Trend of Studies?

To understand the trend of publications, we examined the publication dates and the venues in which they were presented.

4.1.1 RQ1.1: What Is the Trend of Studies over Time?

Figure 3 demonstrates the publication trend of software vulnerability detection studies published over 13 years (i.e., between January 2011 and June 2024). It is observable that the number of publications has gradually increased over the years.
Fig. 3. Publication trend of vulnerability detection studies.
We also analyze the cumulative number of publications shown in Figure 3. It is noticeable that the curve fitting the distribution shows a significant increase in slope between 2020 and 2024, suggesting that the usage of ML techniques for software vulnerability detection has become a prevalent trend since 2020.

4.1.2 RQ1.2: What Is the Distribution of Publication Venues?

In total, we reviewed 138 studies from various publication venues, including 61 studies from conferences and symposiums and 77 studies from journals. Table 2 shows the distribution of studies for each publication venue. A total of 44.2% of the studies were published in conferences and symposiums, whereas 55.8% were published as journal articles. ICSE, ISSRE, MSR, and FSE are the most popular conference venues, with the highest number of studies. Among the journal venues, IST, C&S, and JSS have the highest number of studies, namely 13, 12, and 12 studies, respectively.
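The venue shares follow directly from the counts of 61 conference studies and 77 journal studies out of 138. A short check, rounding to one decimal place (the helper name `share` is illustrative):

```python
TOTAL_STUDIES = 138
CONFERENCE_STUDIES = 61
JOURNAL_STUDIES = 77


def share(count, total):
    """Percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    print("conferences:", share(CONFERENCE_STUDIES, TOTAL_STUDIES))
    print("journals:", share(JOURNAL_STUDIES, TOTAL_STUDIES))
```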
Table 2. Distribution of Publications Based on Conference and Journal Venues
Conference Venue | # Studies | References | Journal Venue | # Studies | References
ICSE | 9 | [11, 113, 129, 135, 146, 147, 150, 155, 170] | IST | 13 | [9, 10, 17, 30, 31, 108, 126, 127, 139, 149, 158, 175, 181]
ISSRE | 6 | [153, 169, 172, 179, 180, 185] | C&S | 12 | [36, 45, 47, 63, 68, 77, 131, 132, 138, 148, 152, 164]
MSR | 5 | [19, 38, 54, 56, 105] | JSS | 12 | [7, 8, 13, 16, 32, 91, 98, 106, 114, 136, 143, 171]
FSE | 5 | [78, 81, 103, 111, 184] | TDSC | 6 | [83, 84, 87, 95, 188, 189]
IJCAI | 4 | [23, 34, 96, 187] | TSE | 5 | [26, 82, 122, 151, 174]
ASE | 3 | [73, 110, 177] | TIFS | 4 | [58, 142, 154, 157]
NDSS | 2 | [85, 125] | ISA | 4 | [134, 159, 176, 182]
NeurIPS | 2 | [3, 183] | TOSEM | 3 | [20, 109, 190]
TrustCom | 2 | [93, 165] | TKDE | 2 | [76, 97]
OOPSLA | 2 | [79, 118] | IS | 2 | [41, 64]
CCS | 2 | [115, 163] | ESA | 2 | [92, 140]
ICLR | 2 | [28, 71] | CN | 1 | [178]
QRS | 2 | [74, 141] | TFS | 1 | [94]
USENIX | 1 | [162] | SQJ | 1 | [33]
MASCOTS | 1 | [37] | PL | 1 | [80]
KDDM | 1 | [107] | P&S | 1 | [75]
ISSTA | 1 | [22] | Nature | 1 | [60]
IJCNN | 1 | [48] | KBS | 1 | [186]
ICTAI | 1 | [117] | FGCS | 1 | [49]
ICECCS | 1 | [21] | EAAI | 1 | [144]
ICBD | 1 | [168] | CEE | 1 | [120]
GLOBCOM | 1 | [166] | BRA | 1 | [4]
DSAA | 1 | [104] | ASC | 1 | [57]
CDSN | 1 | [137] | | |
CARS | 1 | [70] | | |
SANER | 1 | [29] | | |
ENTCC | 1 | [62] | | |
MCSoC | 1 | [46] | | |
Overall | 61 | | Overall | 77 |

4.2 RQ2: What Are the Characteristics of Software Vulnerability Detection Datasets?

In this section, we examine the data used in vulnerability detection studies and analyze it along three dimensions: data source, data type, and data representation.

4.2.1 RQ2.1: What Is the Source of Datasets?

One of the main challenges in ML-based software vulnerability detection is the insufficient amount of data available for model training [19, 88]. Consequently, there is a gap in research on how to obtain sufficient datasets to facilitate the training of ML models for software vulnerability detection. To this end, we analyze the sources of datasets in the subject studies. Our analysis reveals that these datasets can be broadly classified into four categories: Benchmark, Hybrid, Open Source Software, and Repository. Among the subject studies, 39.1% use Hybrid data sources for software vulnerability detection. They combine various sources of data, such as benchmarks, repositories, and open source projects, to provide a comprehensive and multi-faceted resource for software vulnerability detection [36, 141]. These datasets combine the benefits of each data source to provide richer and more diversified information, which is critical for building and verifying robust vulnerability detection systems. Benchmark datasets, used by 37.6% of the subject studies, play a crucial role in the field by providing standardized, high-quality data that researchers can use to evaluate and compare the effectiveness of their detection techniques [127, 159]. Using benchmark datasets facilitates the construction of ML models for software vulnerability detection. However, they may not include zero-day vulnerabilities, which have a significant impact. Among the subject studies, 13.7% collect datasets from online repositories, which we classify as the Repository category. These datasets are gathered from publicly available projects hosted on repository websites such as GitHub or Stack Overflow [28, 118, 184]. These repositories hold a plethora of data, including source code, commit history, issue trackers, and documentation.
Repositories keep detailed records of any changes made to a codebase, such as commit messages, diffs, and timestamps [101]. This comprehensive history enables researchers to trace the lifecycle of vulnerabilities from introduction to resolution (please refer to the work of Iannone et al. [61]). The fourth source is open source software, accounting for 9.4% of the subject studies, which provides a rich and diverse source of data for software vulnerability detection [126, 163]. These projects are publicly accessible and typically have a large community of contributors who continuously update and maintain the code. Example open source projects include, but are not limited to, FFmpeg, QEMU, OpenSSH, and LibTIFF. The open nature of these projects means that they are often inspected carefully by numerous developers, which can lead to the discovery and documentation of various vulnerabilities.
Table 3 shows the detailed distribution of benchmark data used in the subject studies. As can be observed, SARD and NVD are the most widely used sources of data in the Benchmark category. SARD is a comprehensive set of test cases created exclusively for testing software systems. It was developed by NIST as part of its efforts to improve the quality and safety of software systems. SARD offers a wide range of synthetic and real-world test scenarios intended to reflect many sorts of software vulnerabilities. Another major source of benchmark data is NVD, which is a comprehensive repository of publicly disclosed software vulnerabilities. NVD entries are based on the CVE system, which provides standardized identifiers and descriptions for each vulnerability. CVEs are assigned by CVE Numbering Authorities and are a cornerstone of NVD. Each entry in NVD includes detailed information about the vulnerability, such as its description, severity (using the Common Vulnerability Scoring System), impacted software versions, references to related advisories, and mitigation advice. Smartbugs Wild is the third most commonly used dataset (accounting for 12 studies) and targets software vulnerability detection in smart contracts. Smartbugs Wild contains more than 47K smart contracts mined from the Ethereum main network, covering a wide variety of real-world smart contracts and providing a useful dataset for testing and assessing vulnerability detection techniques. Please note that the key factor confirming the validity of a benchmark dataset is its continuous updating. As the nature of vulnerabilities evolves and more zero-day vulnerabilities emerge, these datasets need to be updated to reflect the latest software vulnerability patterns. This is why researchers do not rely solely on benchmark data for building ML models.
Table 3. Detailed Distribution of Benchmark Sources
No. | Source | # Studies | References
1 | SARD | 33 | [9, 11, 20, 21, 31, 34, 36, 37, 47, 49, 63, 68, 83, 84, 85, 87, 95, 137, 138, 139, 140, 141, 142, 143, 148, 152, 155, 157, 158, 165, 179, 180, 189]
2 | NVD | 32 | [10, 11, 19, 30, 31, 32, 36, 37, 49, 54, 63, 68, 70, 73, 83, 84, 85, 94, 95, 113, 132, 137, 142, 143, 155, 157, 158, 159, 174, 179, 180, 189]
3 | Smartbugs Wild | 12 | [7, 13, 57, 91, 103, 104, 105, 134, 153, 169, 177, 185]
4 | Big-Vul | 8 | [32, 38, 81, 98, 108, 110, 129, 188]
5 | Reveal | 6 | [78, 98, 140, 150, 151, 174]
6 | Juliet Test Suite | 5 | [23, 29, 75, 148, 164]
7 | ESC | 5 | [76, 96, 97, 169, 187]
8 | D2A | 5 | [22, 29, 127, 140, 174]
9 | SolidiFi-benchmark | 5 | [7, 103, 104, 105, 134]
10 | Fan et al. | 4 | [22, 78, 150, 151]
11 | Vuldeepecker | 4 | [16, 17, 140, 190]
12 | VSC | 4 | [76, 96, 97, 187]
13 | NDSS | 3 | [71, 75, 107]
14 | PROMISE | 3 | [74, 146, 172]
15 | FUNDED | 2 | [60, 174]
16 | F-Droid | 2 | [26, 122]
17 | Android/iOS | 2 | [26, 122]
18 | SySeVr | 2 | [17, 190]
20 | Others | 25 | [7, 32, 33, 41, 46, 48, 57, 60, 64, 70, 73, 81, 82, 95, 129, 134, 135, 136, 140, 142, 164, 166, 174, 181, 182]
Unique Total | | 99 |
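Because one study can draw on several benchmarks, the per-source counts in Table 3 add up to more than the "Unique Total". Presumably that figure is the number of distinct studies across all rows, which amounts to a set union over the reference lists. The sketch below illustrates the idea with three rows copied from the table; the interpretation of "Unique Total" is an assumption.

```python
# Reference numbers for three rows of Table 3.
BENCHMARK_REFS = {
    "Vuldeepecker": [16, 17, 140, 190],
    "SySeVr": [17, 190],
    "FUNDED": [60, 174],
}


def unique_total(refs_by_source):
    """Number of distinct studies cited across all sources."""
    distinct = set()
    for refs in refs_by_source.values():
        distinct.update(refs)
    return len(distinct)


def summed_total(refs_by_source):
    """Sum of per-source counts, which double-counts shared studies."""
    return sum(len(refs) for refs in refs_by_source.values())
```

For these three rows the per-source counts sum to 8, but only 6 distinct studies are involved, since [17] and [190] appear under both Vuldeepecker and SySeVr.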
Table 4 shows the detailed distribution of the Repository sources of data. As shown, GitHub is the most popular source of data for software vulnerability detection, accounting for 27 subject studies. One benefit of utilizing GitHub as a data source is that it provides access to real-world code written by developers, which can be used to train and test vulnerability detection models. The second most commonly used source of repository data is the CVE system, which is a widely recognized and utilized framework for identifying, cataloging, and referencing publicly disclosed vulnerabilities. Each vulnerability in the CVE system is given a unique identifier known as a CVE ID (e.g., CVE-2023-33976). This standardized identifier facilitates easy reference and communication across various platforms and tools. CVE entries provide detailed descriptions of vulnerabilities, outlining the nature of the issue, the affected software, and the potential impacts. The third most commonly used source of repository data is Etherscan, a popular blockchain explorer for the Ethereum blockchain. Etherscan provides users with extensive information about Ethereum transactions, addresses, tokens, and smart contracts. It offers detailed insights into deployed smart contracts, including the contract’s source code (if verified), transactions, and execution history. Users can access the complete history of transactions involving a smart contract, with details about function calls, input parameters, and transaction results.
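CVE IDs follow a fixed syntax: the prefix `CVE`, a four-digit year, and a sequence number of four or more digits, separated by hyphens. When collecting data keyed by CVE ID, a small validator such as the following sketch can catch malformed identifiers early; the function name `parse_cve_id` is illustrative.

```python
import re

# CVE-<4-digit year>-<sequence number with at least 4 digits>
CVE_ID_PATTERN = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(cve_id):
    """Return (year, sequence number) for a well-formed CVE ID, else None."""
    match = CVE_ID_PATTERN.fullmatch(cve_id)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For the example ID given above, `parse_cve_id("CVE-2023-33976")` yields the year 2023 and the sequence number 33976.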
Table 4. Detailed Distribution of Repositories Used for Collecting Data
No. | Source | # Studies | References
1 | GitHub | 27 | [10, 11, 19, 20, 28, 48, 54, 73, 79, 80, 93, 94, 106, 111, 113, 114, 115, 118, 120, 132, 142, 147, 149, 159, 175, 176, 184]
2 | CVE | 20 | [9, 11, 19, 38, 47, 58, 60, 75, 87, 94, 113, 115, 131, 132, 141, 147, 152, 154, 174, 176]
3 | Etherscan | 13 | [4, 7, 8, 58, 62, 82, 120, 125, 135, 168, 171, 176, 178]
4 | Bugzilla | 4 | [19, 114, 166, 184]
5 | Jira | 3 | [19, 80, 184]
6 | PyPI | 1 | [3]
Unique Total | | 51 |

4.2.2 RQ2.2: What Are the Most Commonly Used Data Types?

When it comes to detecting software vulnerabilities, datasets can have varying data types. Existing software vulnerability detection models, for example, can find vulnerabilities in source code or commits. It is crucial to carefully examine these data types, as they require different preprocessing techniques and must be represented differently when using ML models. Additionally, distinct data types necessitate different architectural approaches for ML models. This section provides an overview of the various data types and their distributions. We classified the data types of the employed datasets into four broad categories: Code, Text, Numerical, and Hybrid.
The majority of the subject studies (92.7%) primarily focus on analyzing source code for software vulnerability detection, underscoring the importance of code-level analysis in identifying vulnerabilities. Repository-level data, such as textual reports and logs, account for 1.4%, whereas commit characteristics (numerical data) account for 2.8%. Additionally, 2.8% of the studies adopt a hybrid approach, combining both code-level analysis and repository-level data.
Table 5 details the data type categories used in the subject studies. The code-based category covers 128 subject studies, and its dominant data type is source code [34, 179], used in 108 studies. Binary code is the second most common data type in this category [58, 117], accounting for 18 subject studies.
| Category | Data Type | # Studies | Total | References |
|---|---|---|---|---|
| Code based | Source code | 108 | 128 | [3, 7, 8, 9, 10, 11, 13, 16, 20, 21, 22, 23, 26, 28, 29, 31, 32, 34, 36, 37, 38, 41, 47, 48, 49, 54, 60, 63, 64, 68, 70, 73, 74, 77, 78, 79, 80, 81, 82, 83, 84, 85, 87, 91, 92, 93, 94, 95, 96, 97, 98, 103, 104, 106, 108, 109, 110, 113, 118, 120, 122, 126, 127, 129, 131, 132, 134, 135, 136, 137, 138, 140, 141, 142, 143, 144, 146, 147, 149, 150, 151, 152, 153, 154, 157, 158, 159, 162, 163, 165, 169, 170, 171, 172, 174, 175, 176, 177, 179, 180, 181, 183, 185, 186, 187, 188, 189, 190] |
| | Binary code | 18 | | [4, 45, 46, 57, 58, 62, 71, 75, 105, 107, 117, 125, 139, 148, 164, 168, 178, 182] |
| | Image | 2 | | [76, 155] |
| Hybrid | | 4 | 4 | [17, 19, 30, 56] |
| Commit Metrics | | 4 | 4 | [111, 114, 115, 166] |
| Text | | 2 | 2 | [33, 184] |
| Unique Total | | | 138 | |

Table 5. Detailed Data Types Used in the Subject Studies

4.2.3 RQ2.3: What Are the Most Commonly Used Input Representations?

As noted in earlier sections, studies on software vulnerability detection rely on diverse data sources and data types. This variability calls for different representation strategies, architectural approaches, and design assumptions for ML models.
We classified the input representation of the employed datasets into five broad categories: Graph, Token, Tree, Commit Metrics, and Hybrid. Graph is the most popular input representation, accounting for 57.2% of the subject studies. Token comes second, with 24.6% of the subject studies. Tree representation is the third most common approach, accounting for 11.5%. The Commit Metrics and Hybrid categories account for the smallest shares, 2.8% and 2.1% of the subject studies, respectively. In the following paragraphs, we elaborate on each category.
Graph/Tree-Based Representation [63, 126]. This type supports the detection of complex patterns and relationships between code elements. Representing source code as a graph or tree captures not only its syntax and structure but also its semantics, control flow, and dataflow. Common graph/tree-based representations include the Abstract Syntax Tree (AST) [100, 161] and the Code Property Graph (CPG) [34, 41, 183].
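As a rough illustration of a tree-based representation, the sketch below parses a small snippet and lists the parent-to-child edges of its AST. It uses Python's standard `ast` module purely as a stand-in: many of the surveyed studies target other languages such as C/C++ and use their own parsers, and the helper name `ast_edges` and the sample snippet are assumptions made for this example.

```python
import ast


def ast_edges(source):
    """Return (parent_type, child_type) pairs for every edge in the AST."""
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


# Hypothetical snippet: an unchecked index into a buffer.
snippet = "def f(buf, i):\n    return buf[i]\n"
edges = ast_edges(snippet)
for parent, child in edges:
    print(parent, "->", child)
```

An edge list like this is the kind of structure that can be handed to a tree or graph model; a CPG would add control-flow and dataflow edges on top of these syntactic ones.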
Token-Based Representation [45, 140]. This type treats the source code as a sequence of string tokens and then transforms it into token vectors. Tokenization breaks a string of text or source code into smaller units, or tokens; for source code, these may include keywords, operators, variables, and other elements of the programming language syntax. The tokens are then turned into numerical vectors that ML algorithms can process.
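The two steps described above, splitting code into tokens and mapping tokens to numbers, can be sketched as follows. This is a minimal illustration under stated assumptions: the regular expression is a simplified tokenizer written for this example (not a full lexer for any language), and the names `tokenize`, `build_vocab`, and `encode` are hypothetical.

```python
import re

# Simplified pattern: identifiers, integers, a few two-character operators,
# then any single remaining non-space symbol.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|[^\s\w]")


def tokenize(code):
    """Split a code string into a flat list of string tokens."""
    return TOKEN_PATTERN.findall(code)


def build_vocab(token_lists):
    """Assign each distinct token an integer id; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Map tokens to ids, falling back to the <unk> id for unseen tokens."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]


tokens = tokenize("if (i <= n) buf[i] = 0;")
vocab = build_vocab([tokens])
print(tokens)
print(encode(tokens, vocab))
```

The resulting integer sequence is what an embedding layer or a sequence model would typically consume.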
Commit Metrics [114, 115]. This type represents code commits by metrics extracted from them. Commit-level features, such as the number of code changes, the number of modified lines, and the programming language used, can be used to train ML models. These models can then learn patterns and connections between commit attributes and the presence of vulnerabilities, allowing vulnerabilities in new commits to be detected automatically.
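To make the idea of commit-level features concrete, here is a minimal sketch that derives a few numeric metrics from a unified diff. The chosen features (files changed, hunks, lines added, lines deleted) and the function name `commit_metrics` are illustrative assumptions rather than the feature set of any surveyed study, and the line classification is deliberately naive.

```python
def commit_metrics(diff_text):
    """Count simple size metrics in a git-style unified diff."""
    files = hunks = added = deleted = 0
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            files += 1
        elif line.startswith("@@"):
            hunks += 1
        elif line.startswith("+") and not line.startswith("+++"):
            added += 1      # added content line (the "+++" file header is skipped)
        elif line.startswith("-") and not line.startswith("---"):
            deleted += 1    # deleted content line (the "---" file header is skipped)
    return {
        "files_changed": files,
        "hunks": hunks,
        "lines_added": added,
        "lines_deleted": deleted,
    }


# Hypothetical one-file commit that replaces an unbounded copy.
sample_diff = "\n".join([
    "diff --git a/util.c b/util.c",
    "--- a/util.c",
    "+++ b/util.c",
    "@@ -10,3 +10,4 @@",
    "-    strcpy(dst, src);",
    "+    strncpy(dst, src, n - 1);",
    "+    dst[n - 1] = 0;",
])
print(commit_metrics(sample_diff))
```

A feature dictionary like this, computed per commit, can serve as one row of a tabular training set for a classic ML classifier.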
Hybrid Representation [19, 30]. This type combines several representations to discover software security vulnerabilities. Combining diverse representations of the input data can yield a richer, more comprehensive representation of source code, which can help vulnerability detection techniques perform better in tasks such as prediction and detection.
Table 6 breaks down the representation techniques by the artifact they are applied to. Graph/Tree-based representation is the most prevalent technique, employed by 96 studies in total. These studies represent the input to ML models in four forms: Source code as a graph, Source code as a tree, Binary code as a graph, and Binary code as a tree. Notably, Source code as a graph is the predominant representation technique, used by 71 studies. Furthermore, 33 subject studies employed Token-based representation. Among them, 23 studies represented source code as a token sequence, 9 studies modeled binary code as tokens, and 2 studies represented text as token sequences.
| Category | Artifact | # Studies | Total | References |
|---|---|---|---|---|
| Graph/Tree | Source code as a graph | 71 | 96 | [3, 7, 8, 9, 10, 11, 13, 20, 21, 29, 31, 34, 36, 38, 41, 47, 49, 60, 63, 64, 68, 70, 77, 78, 79, 81, 82, 84, 85, 91, 92, 93, 95, 96, 97, 103, 104, 106, 110, 120, 126, 127, 129, 132, 134, 136, 138, 141, 143, 144, 147, 150, 151, 152, 153, 154, 157, 158, 165, 170, 172, 174, 175, 179, 180, 181, 183, 185, 187, 188, 189] |
| | Source code as a tree | 15 | | [22, 28, 32, 74, 80, 83, 87, 94, 98, 142, 146, 159, 176, 177, 186] |
| | Binary code as a graph | 8 | | [46, 58, 75, 105, 117, 139, 148, 178] |
| | Binary code as a tree | 1 | | [4] |
| Token | Source code as a token | 23 | 33 | [16, 23, 26, 37, 48, 54, 56, 73, 108, 109, 113, 118, 122, 131, 135, 137, 140, 149, 162, 163, 169, 171, 190] |
| | Binary code as a token | 9 | | [45, 57, 62, 71, 107, 125, 164, 168, 182] |
| | Text as a token | 2 | | [33, 184] |
| Commit Metrics | | 4 | 4 | [111, 114, 115, 166] |
| Hybrid | | 3 | 3 | [17, 19, 30] |
| Image | | 2 | 2 | [76, 155] |
| Unique Total | | | 138 | |

Table 6. Distribution of Input Representations in the Subject Studies
Figure 4 shows the distribution of input representations in software vulnerability detection studies over time. Graph-based representation has a substantially larger presence than the other input representation techniques. One reason is that graphs provide a natural and intuitive way to represent the structural relationships within source code: modeling the code as a graph effectively captures the relationships between functions, classes, methods, and variables. Token-based representation has also gained popularity, peaking in 2023. It provides a fine-grained representation of the code and simplifies analysis by reducing the code to a sequence of tokens, which makes ML models easier to apply.
Fig. 4. Distribution of data type representations in software vulnerability detection studies over time.

4.2.4 RQ2.4: What Are the Most Commonly Used Embedding Approaches?

In this section, we look at embedding methods that convert the representations explored in the previous section into inputs that ML models can process. These representations are in a human-readable format and cannot be consumed directly by ML models. As a result, researchers have applied various embedding approaches to translate them into a numerical format. We discuss these embedding techniques in the following paragraphs.
Graph Embedding (32.6%) [97, 117]. This is the most commonly used embedding technique among the subject studies. It is mostly used with graph neural networks because it captures the structural relationships between different code components.
Token Vector Embedding (29.7%) [79, 190]. This is the second most popular technique among the subject studies. The input is converted into a sequence of tokens, and each token is mapped to a numeric value. These values are then fed into ML models for training.
Hybrid (16.6%) [19, 41]. We find that 16.6% of the subject studies use multiple embedding techniques to convert inputs for ML models. Different embedding techniques capture different aspects of the data, so combining them lets researchers exploit the complementary information each one provides. For example, some embedding techniques may focus on syntax, whereas others may capture semantic or contextual information.
Transformer Embedding (7.2%) [48, 153]. Transformer embedding is used in 7.2% of the subject studies. Despite its lower prevalence, the use of Transformers is notable because of their powerful capabilities in natural language processing, which can be adapted to analyze code.
Others (13.7%) [126, 146, 163]. The remaining 13.7% of the studies use embedding techniques that appear rarely and do not belong to any of the groups above.

4.3 RQ3: What Is the Distribution of ML and DL Models Used for Software Vulnerability Detection?

In this section, we provide detailed information about the ML models used for software vulnerability detection. We first analyze the usage distribution of models across the subject studies. We then investigate how the usage of specific DL models has changed over time. We do not analyze the distribution of classic ML models in the same depth, since they are far less prevalent than DL models; instead, we provide a comprehensive list of the classic ML models commonly used in the subject studies.
The majority of studies (88.4%) use DL models for software vulnerability detection [82, 127, 159], whereas only 7.2% of the studies use classic ML models [19, 107, 184]. A further 1.4% of the studies combine DL and ML models. The remaining 2.8% are classified as Others.
Figure 5 illustrates the usage trend of DL models in detecting software vulnerabilities from 2016 to 2024. According to the trend, DL models were first used for vulnerability detection in 2016, and the use of RNNs has trended upward since then. The figure also shows a rising trend in the use of GNNs from 2021 to 2024. A likely reason is that GNNs can capture more meaningful, semantically richer representations of the input source code than RNNs.
Fig. 5. Trend of DL models over time.
Table 7 shows the distribution of DL models used in the subject studies. Recurrent Models are the most commonly used DL models for software vulnerability detection. In this category, BiLSTM is the most frequently used model, appearing in 20 studies. GRU and LSTM are also popular, with 14 and 13 studies, respectively. Graph Models are the second most widely used class of DL models. Within this class, GCN is the most prevalent model, appearing in 22 studies. GNN, GGNN, and GAT are also commonly used, accounting for 13, 9, and 8 subject studies, respectively. The presence of these models highlights the importance of capturing graph structures and relationships between code elements in vulnerability detection. Convolutional Models are used in 19 studies. While not as prevalent as recurrent or graph models, CNNs are still considered effective at capturing local patterns and features in vulnerability detection tasks.
| Category | Model Name | # Studies | Total | References |
|---|---|---|---|---|
| Recurrent Models | BiLSTM | 20 | 65 | [47, 57, 63, 64, 83, 85, 87, 113, 120, 131, 136, 139, 148, 159, 168, 176, 182, 185, 189, 190] |
| | GRU | 14 | | [47, 54, 63, 73, 76, 77, 78, 79, 125, 139, 142, 144, 147, 181] |
| | LSTM | 13 | | [26, 28, 63, 82, 94, 95, 125, 139, 149, 152, 158, 181, 186] |
| | BGRU | 10 | | [32, 63, 68, 75, 83, 84, 138, 139, 164, 190] |
| | TreeLSTM | 3 | | [8, 78, 147] |
| | RNN | 3 | | [37, 139, 154] |
| | BRNN | 2 | | [109, 139] |
| Graph Models | GCN | 22 | 63 | [7, 8, 20, 21, 31, 41, 46, 75, 78, 81, 91, 97, 104, 110, 127, 136, 147, 150, 172, 179, 181, 187] |
| | GNN | 13 | | [3, 8, 10, 11, 20, 28, 105, 106, 136, 143, 151, 152, 183] |
| | GGNN | 9 | | [29, 31, 36, 77, 92, 129, 142, 154, 188] |
| | GAT | 8 | | [20, 38, 41, 46, 110, 174, 175, 178] |
| | RGCN | 4 | | [13, 31, 158, 180] |
| | HGNN | 1 | | [103] |
| | RGAT | 1 | | [30] |
| | DGCNN | 1 | | [117] |
| | HGCN | 1 | | [68] |
| | GCL | 1 | | [144] |
| | BGNN | 1 | | [10] |
| | GGRU | 1 | | [157] |
| Convolutional Models | CNN | 11 | 19 | [17, 37, 48, 56, 73, 74, 79, 137, 138, 155, 190] |
| | TextCNN | 6 | | [9, 64, 132, 141, 164, 176] |
| | TextRCNN | 1 | | [30] |
| | QCNN | 1 | | [60] |
| General Models | FCN | 2 | 13 | [81, 170] |
| | TCN | 2 | | [16, 17] |
| | Auto Encoders | 1 | | [71] |
| | Memory Neural Network | 1 | | [23] |
| | GAN | 1 | | [109] |
| | Feed Forward | 1 | | [118] |
| | Representation Learning | 1 | | [108] |
| | DRSN | 1 | | [16] |
| | DCN | 1 | | [76] |
| | Others | 1 | | [4] |
| | DBN | 1 | | [146] |
| Transformers | BERT | 2 | 9 | [93, 134] |
| | GraphCodeBERT | 1 | | [153] |
| | CodeBERT | 1 | | [111] |
| | HGT | 1 | | [165] |
| | GPT-4 | 1 | | [98] |
| | GPT-3.5_turbo | 1 | | [135] |
| | Code-T5 | 1 | | [169] |
| | Transformer Encoder | 1 | | [171] |
| Attention Models | | 8 | 8 | [22, 34, 49, 62, 80, 96, 140, 177] |
| Unique Total | | | 124 | |

Table 7. Distribution of DL Models in the Subject Studies
Table 8 shows the distribution of classic ML models used in the subject studies. Random Forest is the most frequently used ML model, appearing in seven studies. Naive Bayes, SVM, and K-NN are also popular choices, with 5, 4, and 4 occurrences, respectively. Random Forest is an ensemble learning method that builds multiple Decision Trees and merges their outputs to make a final prediction. This ensemble approach helps improve the robustness and accuracy of detection, making it effective for detecting software vulnerabilities. Naive Bayes is popular because it is computationally efficient and easy to implement. It requires less training data than more complex algorithms and is fast in both the training and prediction phases [2, 50, 51].
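The output-merging step of such an ensemble can be illustrated with a simple majority vote. The sketch below covers only that voting step: the three "trees" are hand-written rules invented for this example, whereas a real Random Forest learns its trees from bootstrapped samples and random feature subsets. The feature names are hypothetical.

```python
from collections import Counter


def majority_vote(classifiers, sample):
    """Return the label predicted by the largest number of classifiers."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for learned decision trees (one rule each).
stumps = [
    lambda s: "vulnerable" if s["calls_strcpy"] else "safe",
    lambda s: "vulnerable" if s["lines_added"] > 50 else "safe",
    lambda s: "vulnerable" if not s["has_bounds_check"] else "safe",
]

sample = {"calls_strcpy": True, "lines_added": 12, "has_bounds_check": False}
print(majority_vote(stumps, sample))
```

Because the final label depends on several independent votes, a single misleading rule (here, the small commit size) does not decide the outcome on its own, which is the intuition behind the robustness claim above.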
| Category | Model Name | # Studies | Total | References |
|---|---|---|---|---|
| Classic ML Models | Random Forest | 7 | 38 | [19, 70, 94, 114, 122, 166, 184] |
| | Naive Bayes | 5 | | [19, 70, 114, 122, 184] |
| | SVM | 4 | | [19, 115, 122, 184] |
| | K-NN | 4 | | [19, 95, 122, 184] |
| | Logistic Regression | 3 | | [70, 114, 184] |
| | AdaBoost | 3 | | [19, 33, 184] |
| | Decision Tree | 2 | | [70, 122] |
| | Gradient Boosting | 2 | | [19, 184] |
| | PCA | 1 | | [162] |
| | Kernel Machine | 1 | | [107] |
| | ADTree | 1 | | [114] |
| | TAN | 1 | | [70] |
| | Gradient Boosting Classifier | 1 | | [33] |
| | SGDClassifier | 1 | | [33] |
| | AdaBoostClassifier | 1 | | [33] |
| | TrAdaBoost | 1 | | [33] |
| Distance/Similarity Measures | | 3 | 3 | [45, 58, 163] |
| Language Models | N-gram | 1 | 1 | [126] |
| Unique Total | | | 14 | |

Table 8. Distribution of Classic ML and Other Models in the Subject Studies
Table 8 also shows one study that uses n-gram models for software vulnerability detection. N-gram models capture local context using word sequence probabilities. An n-gram model predicts the likelihood of a word based on the preceding n-1 words, which describes the local structure of the language [18, 123]. N-gram models are effective at identifying patterns within sequences of tokens (e.g., words, characters, or code elements). In the context of code, an n-gram model can be trained on large codebases to learn the typical sequences of code elements.
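For the simplest case, n = 2, the probability of a token given its predecessor can be estimated by counting adjacent pairs: P(cur | prev) = count(prev, cur) / count(prev, any). The sketch below implements this maximum-likelihood estimate without smoothing; the toy token sequences and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur):
    """Estimate P(cur | prev); returns 0.0 when prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


# Hypothetical tokenized code fragments.
corpus = [
    ["free", "(", "p", ")", ";"],
    ["free", "(", "q", ")", ";"],
    ["use", "(", "p", ")", ";"],
]
model = train_bigram(corpus)
print(bigram_prob(model, "free", "("))
print(bigram_prob(model, "(", "p"))
```

Token sequences that receive unusually low probability under a model trained on a large codebase are the kind of deviation such an approach can flag for closer inspection.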

4.3.1 Comparison of ML Models with Manual Code Analysis.

For software vulnerability detection, ML models offer clear advantages in efficiency and scalability over conventional manual code analysis. By automating the analysis of massive amounts of code, ML-based detection meets a key need of the current software development environment, where complex systems and frequent changes require quick and comprehensive security evaluations. Automation speeds up the detection process while lowering the risk of human error that comes with manual inspection. ML models also make preemptive threat detection and ongoing monitoring easier. Even with these benefits, human code analysis remains essential in some crucial situations. Special circumstances such as zero-day vulnerabilities [5], in which exploits are found and used before software developers have a chance to mitigate them, are best handled by human analysts.

4.3.2 Transfer Learning for Software Vulnerability Detection.

Transfer learning is crucial for software vulnerability detection for two reasons. First, high-quality labeled datasets for software vulnerability detection are often scarce and expensive to produce because labeling requires expert knowledge [19, 87, 95]. Second, software vulnerability detection often requires understanding domain-specific languages and contexts, which can vary widely between applications and systems [33, 95].
Among the studies we reviewed, six used transfer learning for software vulnerability detection. Liu et al. [95] minimized distribution disparities between domains by improving cross-domain representations with a metric transfer learning framework. With this method, the model can still generalize well even when the projects or vulnerability types in the test and training data differ. Du et al. [33] presented a system for detecting software vulnerabilities that uses the transfer learning algorithm TrAdaBoost. By using labeled bug reports from one project to predict issue categories in another project where labeled data is insufficient, their method identifies bug types across several projects. Sendner et al. [125] customized transfer learning for smart contract vulnerability detection. Their method, ESCORT, uses a common feature extractor to learn the semantics of the bytecode, with separate branches dedicated to different kinds of vulnerabilities. ESCORT's transfer learning capability increases system flexibility by making it easier to add new vulnerability types with less data. Zhou et al. [182] presented an adversarial multi-task learning framework that integrates common and task-specific components to maximize feature extraction, using adversarial transfer learning to reduce noise and interference between private and general features. Li et al. [77] explored the identification of cross-domain vulnerabilities with VulGDA, a system that combines graph embedding and deep domain adaptation. VulGDA transforms source code samples into graph representations to capture syntactic and semantic links, and improves feature extraction by generating domain-invariant features. Zhang et al. [174] proposed CPVD, a cross-domain vulnerability detection method that uses labeled data from one source to accurately predict vulnerability labels. CPVD encodes code as property graphs and uses a graph attention network and a convolution pooling network for feature extraction.

4.4 RQ4: What Is the Most Frequent Type of Vulnerability Covered in the Subject Studies?

Software vulnerability detection datasets cover different vulnerability types. For example, the NVD and SARD benchmarks together cover 96 types of vulnerabilities. This RQ summarizes the most popular vulnerability types covered by the subject studies and their frequency. Table 9 shows the statistics for the vulnerability types. The column CWE-Type indicates the type of CWE (footnote 15). The CWE website offers several views for vulnerability categorization, including categorization by software development, by hardware design, and by research concepts. The categorization in Table 9 follows the research concepts view, as it closely matches the vulnerability types reported in the subject studies.
| Category | CWE-Type | Severity Score | # Studies | Total | References |
|---|---|---|---|---|---|
| Resource | CWE-119 | | 29 | 121 | [9, 11, 16, 20, 29, 30, 34, 36, 37, 38, 47, 49, 71, 75, 85, 87, 95, 98, 107, 110, 121, 131, 132, 140, 141, 142, 151, 159, 164] |
| | CWE-476 | | 13 | | [11, 29, 30, 36, 47, 98, 110, 131, 132, 142, 151, 152, 159] |
| | CWE-399 | | 13 | | [9, 16, 34, 37, 49, 75, 85, 98, 110, 131, 132, 141, 159] |
| | CWE-400 | | 10 | | [9, 20, 30, 47, 132, 141, 142, 143, 159, 165] |
| | CWE-22 | | 10 | | [9, 20, 30, 38, 41, 140, 141, 142, 151, 159] |
| | CWE-787 | | 9 | | [9, 11, 20, 38, 98, 132, 141, 151, 165] |
| | CWE-125 | | 9 | | [9, 11, 20, 98, 110, 132, 141, 151, 152] |
| | CWE-416 | | 9 | | [11, 29, 30, 98, 110, 131, 132, 151, 159] |
| | CWE-122 | | 7 | | [9, 11, 23, 121, 138, 141, 152] |
| | CWE-121 | | 6 | | [11, 121, 138, 141, 152, 164] |
| | CWE-362 | | 6 | | [98, 110, 131, 140, 142, 151] |
| Validation | CWE-20 | | 13 | 37 | [9, 20, 30, 38, 98, 110, 131, 132, 141, 142, 151, 159, 165] |
| | CWE-78 | | 9 | | [9, 20, 41, 75, 83, 141, 142, 151, 165] |
| | CWE-841 | | 8 | | [4, 8, 62, 125, 134, 168, 169, 171] |
| | CWE-200 | | 7 | | [30, 38, 98, 131, 132, 140, 142] |
| Numeric | CWE-190 | | 23 | 36 | [4, 8, 9, 20, 29, 38, 58, 62, 98, 110, 120, 125, 131, 132, 140, 141, 142, 143, 151, 152, 165, 168, 182] |
| | CWE-189 | | 7 | | [30, 87, 98, 110, 131, 132, 190] |
| | CWE-191 | | 6 | | [4, 125, 140, 142, 168, 182] |
| Unique Total | | | | 48 | |

Table 9. Top Vulnerability Types Covered in the Subject Studies
Table 9 indicates that the vulnerability category receiving the most attention is Resource vulnerabilities, with a total of 121 in the table. This category primarily involves managing a system's resources, which are created, used, and disposed of according to a pre-defined set of instructions. CWE-119 [95, 107, 121] is the vulnerability type most frequently addressed by the subject studies. This vulnerability occurs when software reads from or writes to a memory location outside the intended boundary of a buffer. The second most frequent vulnerability type is Null Pointer Dereference (CWE-476), accounting for 13 subject studies. This vulnerability occurs when a program dereferences a pointer that it expects to be valid but that is NULL (no valid memory address).
Validation-related vulnerabilities is the second major family of vulnerability types, covered by 37 subject studies. In this type, the attackers exploit input and output data when they are malformed or not validated properly. As can be seen, CWE-20 [20, 142] is the most frequent type of vulnerability, accounting for 13 subject studies. CWE-20 refers to a situation where input validation is not done properly in software systems, making them vulnerable to attacks by malicious individuals who can exploit input data. This occurs when the input data is not verified to be safe or in line with the pre-defined specifications. CWE-78 is the second major vulnerability type, covered by 9 subject studies [11, 20, 38]. This category of security vulnerability pertains to OS command injection, in which an external attacker can construct an OS command by using input data from components that have not been adequately verified.
Numeric vulnerabilities are the third most frequently covered family in the subject studies, accounting for 36 studies in total. Within this class, Integer Overflow (CWE-190) is the most frequently covered vulnerability type [20, 58, 142]. Integer overflow occurs when an arithmetic operation attempts to create a numeric value outside the range that can be represented with a given number of bits. For example, an 8-bit unsigned integer can represent values from 0 to 255, whereas a 32-bit signed integer typically ranges from –2,147,483,648 to 2,147,483,647. When an arithmetic operation produces a value beyond these limits, an overflow occurs.
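The ranges above can be checked with a short sketch. Python integers have arbitrary precision, so the helpers below (hypothetical names) emulate fixed-width two's-complement wraparound by masking. Note that in C, signed overflow is undefined behavior, so wrapping is what commonly happens on hardware rather than something the language guarantees.

```python
def wrap_unsigned(value: int, bits: int) -> int:
    """Reduce value to the range of an unsigned integer with `bits` bits."""
    return value & ((1 << bits) - 1)


def wrap_signed(value: int, bits: int) -> int:
    """Reduce value to the two's-complement range of a signed integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def add_overflows_signed(a: int, b: int, bits: int) -> bool:
    """Report whether a + b leaves the signed range for `bits` bits."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return not (low <= a + b <= high)


print(wrap_unsigned(255 + 1, 8))        # 0
print(wrap_signed(2_147_483_647 + 1, 32))  # -2147483648
```

A detection model for CWE-190 is effectively learning to spot arithmetic whose operands are not guarded by a check like `add_overflows_signed`.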

4.5 RQ5: What Are the Most Frequently Used Tools for Software Vulnerability Detection?

In this section, we summarize the most commonly used tools for software vulnerability detection. Table 10 shows the distribution of the tools, which we group into three categories: Model Building Tools, Code Analysis/Compilation, and Data Tools.
Table 10. Most Commonly Used Tools for Software Vulnerability Detection

| Category | Tool Name | # Studies | Total | References |
| --- | --- | --- | --- | --- |
| Model Building Tools | Keras/TensorFlow | 42 | 116 | [10, 11, 16, 17, 23, 26, 29, 32, 37, 41, 60, 62, 63, 64, 68, 71, 74, 76, 83, 84, 85, 87, 95, 96, 97, 106, 107, 109, 114, 118, 125, 131, 139, 142, 144, 148, 149, 154, 164, 186, 189, 190] |
| | PyTorch | 42 | | [3, 8, 9, 13, 20, 21, 22, 28, 30, 31, 38, 48, 49, 54, 57, 64, 68, 77, 82, 91, 120, 127, 129, 132, 138, 140, 141, 143, 143, 151, 152, 155, 170, 174, 175, 178, 179, 180, 181, 183, 185, 188] |
| | Scikit-learn | 11 | | [19, 33, 41, 54, 60, 62, 70, 87, 142, 174, 175] |
| | GenSim | 9 | | [41, 54, 64, 87, 95, 140, 154, 174, 183] |
| | DGL | 6 | | [7, 8, 129, 174, 175, 180] |
| | Theano | 2 | | [26, 85] |
| | sent2vec | 2 | | [155, 188] |
| | Transformers | 2 | | [38, 48] |
| Code Analysis/Compilation | Joern | 24 | 35 | [9, 11, 22, 29, 30, 68, 71, 110, 129, 132, 137, 141, 142, 143, 147, 150, 152, 155, 170, 174, 175, 183, 188, 189] |
| | Soot | 3 | | [79, 80, 142] |
| | Clang | 2 | | [22, 48] |
| | tree-sitter | 2 | | [152, 153] |
| | CodeSensor | 2 | | [94, 95] |
| | ANTLR | 2 | | [135, 142] |
| Data Tools | NetworkX | 5 | 9 | [7, 9, 41, 141, 155] |
| | NLTK | 4 | | [54, 56, 73, 114] |
| Unique Total | | | 96 | |
As can be seen in the table, Keras with a TensorFlow backend and PyTorch are the most commonly used libraries for building ML-based software vulnerability detection techniques, each accounting for 42 studies. Scikit-learn is the third most popular library for model building, accounting for 11 studies in total. It provides a user-friendly and consistent API, making it easy to implement and experiment with various ML algorithms, and it includes a diverse set of classification algorithms such as Logistic Regression, SVM, Decision Trees, Random Forests, KNN, and Naive Bayes. GenSim is the fourth most commonly used tool for building software vulnerability detection models, accounting for 9 studies. Its ability to handle large datasets efficiently, combined with its topic modeling and word embedding functionality, makes it well suited to the natural language processing and text mining steps of model building. DGL is the fifth most commonly used model building tool, accounting for 6 studies. DGL is specifically designed for constructing and training GNNs, making it a go-to library for researchers and practitioners working on graph-related problems. It abstracts the complexity of implementing GNNs, providing easy-to-use APIs for building and applying various GNN models.
In the Code Analysis/Compilation category, the most commonly used tool is Joern, accounting for 24 studies in total. Joern was first proposed by Yamaguchi et al. [160]; it converts source code into graph representations, specifically the AST, CFG, and PDG. The second most commonly used tool for code processing is Soot, used in 3 studies, which provides various intermediate representations of Java bytecode.
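To give a feel for what "converting source code into a graph" means, the sketch below extracts parent-child AST edges using Python's standard `ast` module. This is a stand-in under stated assumptions: it is not Joern's API, it parses Python rather than the languages the subject studies target, and it covers only the AST, not the CFG or PDG. The function name `ast_edges` is hypothetical.

```python
import ast
from typing import List, Tuple


def ast_edges(source: str) -> List[Tuple[str, str]]:
    """Return (parent node type, child node type) pairs for a code snippet."""
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        # iter_child_nodes yields every direct child that is an AST node.
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


for edge in ast_edges("x = a + b"):
    print(edge)
```

An edge list of this kind is the raw material that graph-based models consume once node types are mapped to numerical features.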
In the Data Tools category, NetworkX is the most commonly used tool, accounting for five studies in total. NetworkX uses native Python data structures (such as dictionaries and lists) to represent graphs, which allows seamless integration with other Python libraries and makes it easy to manipulate and explore graph data. NLTK, used in four studies, provides robust tools for breaking down source code and text into tokens, a step that is essential for analyzing software vulnerability data.
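The idea of representing a graph with native Python data structures can be illustrated without any library. The sketch below stores a small, made-up control flow graph as a dictionary of successor lists and computes reachability with a breadth-first search. It does not use NetworkX's API; the node names and the `reachable` helper are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical control flow graph: each node maps to its successors.
cfg: Dict[str, List[str]] = {
    "entry": ["check_len"],
    "check_len": ["copy", "exit"],
    "copy": ["exit"],
    "exit": [],
    "dead_code": ["exit"],
}


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Return all nodes reachable from start via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph[node]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen
```

Because the graph is an ordinary dictionary, it can be filtered, serialized, or passed to other Python code without conversion, which is the integration benefit described above.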

4.6 RQ6: What Are Possible Challenges and Open Directions in Software Vulnerability Detection?

4.6.1 Challenges.

Challenge 1: Heterogeneous Data Sources. The biggest challenge in learning-based vulnerability detection is that current models inadequately capture the comprehensive semantics of complex vulnerabilities [26, 27, 126]. Existing ML models often fail to capture the complex patterns of software vulnerabilities because they treat source code like natural language. Unlike natural language, source code carries structural and logical information whose recovery requires AST, dataflow, and control flow analysis. To address this, the detection pipeline should use rich representations such as control flow and dataflow graphs, together with suitable embeddings that convert these representations into the numerical format consumed by graph-based neural networks.
Challenge 2: Detection Granularity. The effectiveness of DL models in identifying vulnerabilities depends on input granularity. Current models mostly use coarse-grained inputs such as methods and files. To achieve finer granularity, program slicing can select the statements that are crucial for detection, but it must be done effectively to reduce noise. Existing tools focus on library/API calls and operations, but these alone are insufficient. A promising approach is to use code changes from GitHub, focusing on added and deleted lines, which often have the highest impact on vulnerability detection.
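Extracting added and deleted lines from a code change can be sketched with Python's standard `difflib` module. This is a simplified illustration, assuming the before and after versions of a file are available as strings; the function name `changed_lines` and the C snippet in the usage note are hypothetical.

```python
import difflib
from typing import List, Tuple


def changed_lines(before: str, after: str) -> Tuple[List[str], List[str]]:
    """Return (added, deleted) lines between two versions of a file."""
    old, new = before.splitlines(), after.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added: List[str] = []
    deleted: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted.extend(old[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new[j1:j2])
    return added, deleted
```

For a change that replaces `strcpy(buf, src);` with a bounded copy, the deleted list isolates the vulnerable statement and the added list isolates the fix, which is a much finer-grained input than the enclosing function or file.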
Challenge 3: Lack of Training Data. A significant weakness of DL models, particularly in software vulnerability detection, is their insatiable need for data [24, 111]. In domains like image classification, abundant labeled data and pre-trained models enable effective DL training. In software vulnerability detection, however, data scarcity is a major issue because ground truth is difficult to label. Platforms like Stack Overflow, GitHub, and issue-tracking systems provide extensive records, but labeling them is often manual and challenging. Automatic labeling is a potential solution but tends to generate many false positives. Some researchers use unsupervised classification, but this method also has limited precision.
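Why automatic labeling tends to produce false positives can be seen in a toy sketch. The keyword list, the commit messages, and their ground-truth labels below are all invented for illustration and are not drawn from any subject study.

```python
import re
from typing import List, Tuple

# Hypothetical keyword heuristic for flagging security-related commits.
SECURITY_KEYWORDS = re.compile(
    r"\b(overflow|injection|cve|vulnerab\w*|exploit)\b", re.IGNORECASE
)


def auto_label(message: str) -> bool:
    """Label a commit as security-related if any keyword appears."""
    return bool(SECURITY_KEYWORDS.search(message))


# Made-up commit messages paired with made-up ground-truth labels.
commits: List[Tuple[str, bool]] = [
    ("Fix buffer overflow in packet parser", True),
    ("Fix overflow menu layout on small screens", False),
    ("Add dependency injection for logger", False),
    ("Update README", False),
]


def precision(samples: List[Tuple[str, bool]]) -> float:
    """Fraction of automatically flagged commits that are truly positive."""
    flagged = [truth for message, truth in samples if auto_label(message)]
    return sum(flagged) / len(flagged) if flagged else 0.0
```

On this toy set the heuristic flags three commits, of which only one is a real security fix, so two thirds of the automatically generated positive labels are noise.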

4.6.2 Open Directions.

Multi-Modal Learning. Vulnerability detection based on source code snippets alone is not sufficient to obtain accurate and effective models. Additional artifacts need to be fed into ML models to increase detection performance. For example, feeding in code comments can improve classification performance. Some subject studies that use commits [19] argue that source code alone is not enough and that commit characteristics are required as metadata for software vulnerability detection.
Just-in-Time Vulnerability Detection. One possible direction for software vulnerability detection is the just-in-time approach, which focuses on detecting vulnerabilities as they occur or are introduced, hence offering real-time protection [56, 114]. This allows faster reaction to and mitigation of vulnerabilities before they are exploited.
Leveraging Foundation Models (LLMs) for Vulnerability Detection. Recently, LLMs have been used in a wide variety of software engineering tasks, including automatic program repair [59], test case generation, and root cause analysis of incidents in cloud environments. However, the application of LLMs to software vulnerability detection has not yet been explored as comprehensively as it should be. In our survey, we identified some subject studies that utilize LLMs for software vulnerability detection [32, 98, 135, 169]. However, their number is still negligible compared to the widespread usage of typical DL models.

5 Threats to Validity

In this section, we discuss threats to the validity of each RQ addressed in this study.
RQ1: Trend of Studies. The selection of studies might be biased if certain types of studies are more likely to be indexed or retrieved by our web crawler. To address selection bias, we defined diverse key terms to extract the most relevant research papers related to software vulnerability detection. The target papers should use ML-based software vulnerability detection techniques. To increase the accuracy of data selection, we refined the initial search results in three steps to ensure that the most relevant studies were selected for taxonomy creation and refinement. These steps were performed by multiple authors in parallel. The choice of digital libraries could impact construct validity if they do not equally represent all relevant studies. To mitigate this threat, we selected the most widely used digital libraries: ACM Digital Library, ScienceDirect, IEEE Xplore, and Google Scholar. These libraries are representative of the software vulnerability detection field because they contain a sufficient number of records that match our key terms for data extraction. One of the major threats to the external validity of the first RQ is that the trends we observed from January 2011 to June 2024 may not apply to future research beyond this period. As technologies evolve rapidly, new techniques and tools for software vulnerability detection may emerge. However, we believe that our findings accurately represent the state of the art in software vulnerability detection at the time of this study.
RQ2: Characteristics of Software Vulnerability Detection Datasets. Datasets might focus on specific types of software or languages, which threatens the generalizability of our findings. To overcome this limitation, we focused on software vulnerability detection in three major language domains: Java, C/C++, and smart contracts. Java is prevalent in enterprise and web applications, C/C++ is fundamental in system and performance-critical programming, and smart contracts are crucial in blockchain technology. This diverse selection reduces selection bias, provides a holistic view of vulnerabilities, and ensures that the findings are more broadly applicable and relevant to real-world software development contexts. Although our findings are based on datasets from studies published between January 2011 and June 2024, the identified characteristics are expected to apply to future datasets due to ongoing advancements in software vulnerability detection techniques. We provide detailed criteria and procedures for selecting and analyzing datasets, enabling other researchers to replicate and validate our findings, thus enhancing the generalizability and reliability of our conclusions.
RQ3: Distribution of ML and DL Models in Software Vulnerability Detection. There are multiple threats to this RQ. First, ML models evolve quickly, and models that are effective today might soon become obsolete or be replaced by more advanced ones. To mitigate this threat, we expanded our study selection to cover the last 2 years (2023 and 2024) so that it includes the most recent ML technology for software vulnerability detection. This resulted in identifying three promising studies that use foundation models for software vulnerability detection.
RQ4: Frequent Software Vulnerabilities. To ensure construct validity in this RQ, it is crucial to provide clear and precise definitions of each type of vulnerability. We first identified reputable sources such as OWASP and MITRE’s CWE. OWASP provides a widely recognized list of common security vulnerabilities, particularly in web applications. CWE offers a comprehensive list of software weaknesses, providing detailed descriptions and classifications. We then reviewed the subject studies and identified the types of vulnerabilities that are mentioned frequently. Often these vulnerabilities can be identified by CWE IDs that are explicitly mentioned in the research papers.
RQ5: Tools for Software Vulnerability Detection. The threat to this question is that the selection of tools for study may be biased by factors such as popularity (as with TensorFlow and PyTorch). This can skew the findings toward better-known tools, neglecting equally effective but less publicized options. To mitigate this threat, we classified the tools into three broad categories. For each category, we extracted both the most popular and the least popular tools, yielding a balanced mix that avoids over-representation of any particular subset.
RQ6: Challenges and Open Directions. To ensure the construct validity of this RQ, we thoroughly analyzed two key sections of each study. First, we examined the context section of the abstract to gain a general understanding of the problem being addressed. Next, we analyzed the introduction section to extract relevant text that further elaborates on the problem. By combining this information, we generalized the problem and created a concise taxonomy for classification.

6 Conclusion

In this study, we conducted a systematic survey to investigate various characteristics of ML-based software vulnerability detection studies using six RQs. We extracted initial studies from four widely-used online digital libraries—ACM Digital Library, IEEE Xplore, ScienceDirect, and Google Scholar—using a custom web scraper. After manually filtering out irrelevant studies unrelated to software vulnerability detection, we created taxonomies and addressed the RQs.
Our findings indicate a notable increase in the use of ML techniques to detect software vulnerabilities in recent years. We found that prominent conference venues include ICSE, ISSRE, MSR, and FSE, whereas the leading journal venues are IST, C&S, and JSS. Additionally, we found that 39.1% of the subject studies use hybrid data sources, whereas 37.6% use benchmark data for software vulnerability detection. Among the data types analyzed, code-based data is the most prevalent, with source code being the most common sub-type. Graph-based and token-based input representations are the most popular techniques, utilized in 57.2% and 24.6% of the studies, respectively. For input embedding, graph embeddings and token vector embeddings are the most frequently employed methods, appearing in 32.6% and 29.7% of the studies, respectively. Furthermore, 88.4% of the examined studies use DL models, with RNNs and GNNs being the most popular, whereas only 7.2% use traditional ML models. The most frequently addressed vulnerability types are CWE-119, CWE-20, and CWE-190. In terms of tools for software vulnerability detection, Keras and PyTorch are the most widely used, and Joern is the leading tool for code analysis and representation. Finally, we summarized the challenges and future directions in the context of software vulnerability detection, providing valuable insight for researchers and practitioners in the field. This comprehensive survey aims to bridge the existing gap and provide a clearer understanding of the current landscape and future opportunities in the detection of software vulnerabilities using ML techniques.

Footnotes

2
Please note that it is possible to detect memory leak vulnerabilities using static analysis techniques; however, application of dynamic analysis is more effective compared to static analysis.
10
Please note that if we could not find the dataset name and source in the experimental setup section, we looked for other sections.

References

[1]
Faranak Abri, Sima Siami-Namini, Mahdi Adl Khanghah, Fahimeh Mirza Soltani, and Akbar Siami Namin. 2019. Can machine/deep learning classifiers detect zero-day malware with high accuracy? In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data’19). IEEE, 3252–3259.
[2]
Sasan H. Alizadeh, Alireza Hediehloo, and Nima Shiri Harzevili. 2021. Multi independent latent component extension of naive Bayes classifier. Knowledge-Based Systems 213 (2021), 106646.
[3]
Miltiadis Allamanis, Henry Jackson-Flux, and Marc Brockschmidt. 2021. Self-supervised bug detection and repair. In Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS’21). 27865–27876.
[4]
Nami Ashizawa, Naoto Yanai, Jason Paul Cruz, and Shingo Okamura. 2021. Eth2Vec: Learning contract-wide code representations for vulnerability detection on Ethereum smart contracts. In Proceedings of the 3rd ACM International Symposium on Blockchain and Secure Critical Infrastructure (BSCI’21). 47–59.
[5]
Leyla Bilge and Tudor Dumitraş. 2012. Before we knew it: An empirical study of zero-day attacks in the real world. In Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS’12). 833–844.
[6]
Christopher M. Bishop and Nasser M. Nasrabadi. 2006. Pattern Recognition and Machine Learning. New York: Springer, 4, 4 (2006).
[7]
Jie Cai, Bin Li, Jiale Zhang, Xiaobing Sun, and Bing Chen. 2023. Combine sliced joint graph with graph neural networks for smart contract vulnerability detection. Journal of Systems and Software 195 (2023), 111550.
[8]
Jie Cai, Bin Li, Tao Zhang, Jiale Zhang, and Xiaobing Sun. 2024. Fine-grained smart contract vulnerability detection by heterogeneous code feature learning and automated dataset construction. Journal of Systems and Software 209 (2024), 111919.
[9]
Wenjing Cai, Junlin Chen, Jiaping Yu, and Lipeng Gao. 2023. A software vulnerability detection method based on deep learning with complex network analysis and subgraph partition. Information and Software Technology 164 (2023), 107328.
[10]
Sicong Cao, Xiaobing Sun, Lili Bo, Ying Wei, and Bin Li. 2021. BGNN4VD: Constructing bidirectional graph neural-network for vulnerability detection. Information and Software Technology 136 (2021), 106576.
[11]
Sicong Cao, Xiaobing Sun, Lili Bo, Rongxin Wu, Bin Li, and Chuanqi Tao. 2022. MVD: Memory-related vulnerability detection based on flow-sensitive graph neural networks. In Proceedings of the 44th International Conference on Software Engineering (ICSE’22). 1456–1468.
[12]
Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, and Baishakhi Ray. 2022. Deep learning based vulnerability detection: Are we there yet? IEEE Transactions on Software Engineering 48 (2022), 3280–3296.
[13]
Da Chen, Lin Feng, Yuqi Fan, Siyuan Shang, and Zhenchun Wei. 2023. Smart contract vulnerability detection based on semantic graph and residual graph convolutional networks with edge attention. Journal of Systems and Software 202 (2023), 111705.
[14]
Haipeng Chen, Jing Liu, Rui Liu, Noseong Park, and V. S. Subrahmanian. 2019. VEST: A system for vulnerability exploit scoring & timing. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19). 6503–6505.
[15]
Jinfu Chen, Patrick Kwaku Kudjo, Solomon Mensah, Selasie Aformaley Brown, and George Akorfu. 2020. An automatic software vulnerability classification framework using term frequency-inverse gravity moment and feature selection. Journal of Systems and Software 167 (2020), 110616.
[16]
Jinfu Chen, Wei Lin, Saihua Cai, Yemin Yin, Haibo Chen, and Dave Towey. 2023. BiTCN_DRSN: An effective software vulnerability detection model based on an improved temporal convolutional network. Journal of Systems and Software 204 (2023), 111772.
[17]
Jinfu Chen, Weijia Wang, Bo Liu, Saihua Cai, Dave Towey, and Shengran Wang. 2024. Hybrid semantics-based vulnerability detection incorporating a temporal convolutional network and self-attention mechanism. Information and Software Technology 171 (2024), 107453.
[18]
Stanley F. Chen and Joshua Goodman. 1999. An empirical study of smoothing techniques for language modeling. Computer Speech & Language 13, 4 (1999), 359–394.
[19]
Yang Chen, Andrew E. Santosa, Ang Ming Yi, Abhishek Sharma, Asankhaya Sharma, and David Lo. 2020. A machine learning approach for vulnerability curation. In Proceedings of the 17th International Conference on Mining Software Repositories (MSR’20). 32–42.
[20]
Xiao Cheng, Haoyu Wang, Jiayi Hua, Guoai Xu, and Yulei Sui. 2021. DeepWukong: Statically detecting software vulnerabilities using deep graph neural network. ACM Transactions on Software Engineering and Methodology 30, 3 (2021), 1–33.
[21]
Xiao Cheng, Haoyu Wang, Jiayi Hua, Miao Zhang, Guoai Xu, Li Yi, and Yulei Sui. 2019. Static detection of control-flow-related vulnerabilities using graph embedding. In Proceedings of the 2019 24th International Conference on Engineering and Complex Computer Systems (ICECCS’19). IEEE, 41–50.
[22]
Xiao Cheng, Guanqin Zhang, Haoyu Wang, and Yulei Sui. 2022. Path-sensitive code embedding via contrastive learning for software vulnerability detection. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’22). 519–531.
[23]
Min-Je Choi, Sehun Jeong, Hakjoo Oh, and Jaegul Choo. 2017. End-to-end prediction of buffer overruns from raw source code via neural memory networks. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI’17). 1546–1553.
[24]
Roland Croft, M. Ali Babar, and M. Mehdi Kholoosi. 2023. Data quality for software vulnerability datasets. In Proceedings of the 45th International Conference on Software Engineering (ICSE’23). IEEE, 121–133.
[25]
Christoph Csallner, Yannis Smaragdakis, and Tao Xie. 2008. DSD-Crasher: A hybrid analysis tool for bug finding. ACM Transactions on Software Engineering and Methodology 17, 2 (2008), Article 8, 37 pages.
[26]
Hoa Khanh Dam, Truyen Tran, Trang Pham, Shien Wee Ng, John Grundy, and Aditya Ghose. 2018. Automatic feature learning for predicting vulnerable software components. IEEE Transactions on Software Engineering 47, 1 (2018), 67–85.
[27]
Hoa Khanh Dam, Truyen Tran, Trang Pham, Shien Wee Ng, John Grundy, and Aditya Ghose. 2018. Automatic feature learning for predicting vulnerable software components. IEEE Transactions on Software Engineering 47, 1 (2018), 67–85.
[28]
Elizabeth Dinella, Hanjun Dai, Ziyang Li, Mayur Naik, Le Song, and Ke Wang. 2020. Hoppity: Learning graph transformations to detect and fix bugs in programs. In Proceedings of the 2020 International Conference on Learning Representations (ICLR’20).
[29]
Yangruibo Ding, Sahil Suneja, Yunhui Zheng, Jim Laredo, Alessandro Morari, Gail Kaiser, and Baishakhi Ray. 2022. VELVET: A noVel Ensemble Learning approach to automatically locate VulnErable sTatements. In Proceedings of the 2022 IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER’22). 959–970.
[30]
Yukun Dong, Yeer Tang, Xiaotong Cheng, and Yufei Yang. 2023. DeKeDVer: A deep learning-based multi-type software vulnerability classification framework using vulnerability description and source code. Information and Software Technology 163 (2023), 107290.
[31]
Yukun Dong, Yeer Tang, Xiaotong Cheng, Yufei Yang, and Shuqi Wang. 2023. SedSVD: Statement-level software vulnerability detection based on relational graph convolutional network with subgraph embedding. Information and Software Technology 158 (2023), 107168.
[32]
Xiaozhi Du, Shiming Zhang, Yanrong Zhou, and Hongyuan Du. 2024. A vulnerability severity prediction method based on bimodal data and multi-task learning. Journal of Systems and Software 213 (2024), 112039.
[33]
Xiaoting Du, Zenghui Zhou, Beibei Yin, and Guanping Xiao. 2020. Cross-project bug type prediction based on transfer learning. Software Quality Journal 28, 1 (2020), 39–57.
[34]
Xu Duan, Jingzheng Wu, Shouling Ji, Zhiqing Rui, Tianyue Luo, Mutian Yang, and Yanjun Wu. 2019. VulSniper: Focus your attention to shoot fine-grained vulnerabilities. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19). 4665–4671.
[35]
Facebook. 2013. Infer. Retrieved October 12, 2024 from https://fbinfer.com/
[36]
Yuanhai Fan, Chuanhao Wan, Cai Fu, Lansheng Han, and Hao Xu. 2023. VDoTR: Vulnerability detection based on tensor representation of comprehensive code graphs. Computers & Security 130 (2023), 103247.
[37]
Katarzyna Filus, Miltiadis Siavvas, Joanna Domańska, and Erol Gelenbe. 2020. The random neural network as a bonding model for software vulnerability prediction. In Modelling, Analysis, and Simulation of Computer and Telecommunication Systems. Lecture Notes in Computer Science, Vol. 12527. Springer, 102–116.
[38]
Michael Fu and Chakkrit Tantithamthavorn. 2022. LineVul: A transformer-based line-level vulnerability prediction. In Proceedings of the 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR’22).
[39]
Cuifeng Gao, Wenzhang Yang, Jiaming Ye, Yinxing Xue, and Jun Sun. 2024. sGuard+: Machine learning guided rule-based automated vulnerability repair on smart contracts. ACM Transactions on Software Engineering and Methodology 33, 5 (2024), 1–55.
[40]
Seyed Mohammad Ghaffarian and Hamid Reza Shahriari. 2017. Software vulnerability analysis and discovery using machine-learning and data-mining techniques: A survey. ACM Computing Surveys 50, 4 (2017), 1–36.
[41]
Seyed Mohammad Ghaffarian and Hamid Reza Shahriari. 2021. Neural software vulnerability analysis using rich intermediate graph representations of programs. Information Sciences 553 (2021), 189–207.
[42]
Patrice Godefroid. 2007. Random testing for security: Blackbox vs. whitebox fuzzing. In Proceedings of the 2nd International Conference on Random Testing, Co-Located with the 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE’07). 1.
[43]
Xi Gong, Zhenchang Xing, Xiaohong Li, Zhiyong Feng, and Zhuobing Han. 2019. Joint prediction of multiple vulnerability characteristics through multi-task learning. In Proceedings of the 2019 24th International Conference on Engineering and Complex Computer Systems (ICECCS’19). IEEE, 31–40.
[44]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. Generative adversarial networks. Communications of the ACM 63, 11 (2020), 139–144.
[45]
Yeming Gu, Hui Shu, and Fei Kang. 2023. BinAIV: Semantic-enhanced vulnerability detection for Linux x86 binaries. Computers & Security 135 (2023), 103508.
[46]
Longtao Guo, Huakun Huang, Sihun Xue, Peiliang Wang, and Lingjun Zhao. 2023. Reentrancy vulnerability detection based on graph convolutional networks and expert patterns. In Proceedings of the 2023 IEEE 16th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip (MCSoC’23). 312–316.
[47]
Wenbo Guo, Yong Fang, Cheng Huang, Haoran Ou, Chun Lin, and Yongyan Guo. 2022. HyVulDect: A hybrid semantic vulnerability mining system based on graph neural network. Computers & Security 121 (2022), 102823.
[48]
Hazim Hanif and Sergio Maffeis. 2022. VulBERTa: Simplified source code pre-training for vulnerability detection. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN’22). IEEE, 1–8.
[49]
M. Hariharan, C. Sathish Kumar, Anshul Tanwar, Krishna Sundaresan, Prasanna Ganesan, Sriram Ravi, and R. Karthik. 2022. Proximal instance aggregator networks for explainable security vulnerability detection. Future Generation Computer Systems 134 (2022), 303–318.
[50]
Nima Shiri Harzevili and Sasan H. Alizadeh. 2018. Mixture of latent multinomial naive Bayes classifier. Applied Soft Computing 69 (2018), 516–527.
[51]
Nima Shiri Harzevili and Sasan H. Alizadeh. 2021. Analysis and modeling conditional mutual dependency of metrics in software defect prediction using latent variables. Neurocomputing 460 (2021), 309–330.
[52]
Nima Shiri Harzevili, Jiho Shin, Junjie Wang, Song Wang, and Nachiappan Nagappan. 2023. Automatic static vulnerability detection for machine learning libraries: Are we there yet? In Proceedings of the 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE’23). IEEE, 795–806.
[53]
Nima Shiri Harzevili, Jiho Shin, Junjie Wang, Song Wang, and Nachiappan Nagappan. 2023. Characterizing and understanding software security vulnerabilities in machine learning libraries. In Proceedings of the 20th International Conference on Mining Software Repositories (MSR’23). IEEE, 27–38.
[54]
David Hin, Andrey Kan, Huaming Chen, and M. Ali Babar. 2022. LineVD: Statement-level vulnerability detection using graph neural networks. In Proceedings of the 19th International Conference on Mining Software Repositories (MSR’22). 596–607.
[55]
Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. 2006. A fast learning algorithm for deep belief nets. Neural Computation 18, 7 (2006), 1527–1554.
[56]
Thong Hoang, Hoa Khanh Dam, Yasutaka Kamei, David Lo, and Naoyasu Ubayashi. 2019. DeepJIT: An end-to-end deep learning framework for just-in-time defect prediction. In Proceedings of the 16th International Conference on Mining Software Repositories (MSR’19). IEEE, 34–45.
[57]
Huakun Huang, Longtao Guo, Lingjun Zhao, Haoda Wang, Chenkai Xu, and Shan Jiang. 2024. Effective combining source code and opcode for accurate vulnerability detection of smart contracts in edge AI systems. Applied Soft Computing 158 (2024), 111556.
[58]
Jianjun Huang, Songming Han, Wei You, Wenchang Shi, Bin Liang, Jingzheng Wu, and Yanjun Wu. 2021. Hunting vulnerable smart contracts via graph embedding based bytecode matching. IEEE Transactions on Information Forensics and Security 16 (2021), 2144–2156.
[59]
Kai Huang, Xiangxin Meng, Jian Zhang, Yang Liu, Wenjie Wang, Shuhao Li, and Yuqing Zhang. 2023. An empirical study on fine-tuning large language models of code for automated program repair. In Proceedings of the 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE’23). IEEE, 1162–1174.
[60]
Shumaila Hussain, Muhammad Nadeem, Junaid Baber, Mohammed Hamdi, Adel Rajab, Mana Saleh Al Reshan, and Asadullah Shaikh. 2024. Vulnerability detection in Java source code using a quantum convolutional neural network with self-attentive pooling, deep sequence, and graph-based hybrid feature extraction. Scientific Reports 14, 1 (2024), 7406.
[61]
Emanuele Iannone, Roberta Guadagni, Filomena Ferrucci, Andrea De Lucia, and Fabio Palomba. 2022. The secret life of software vulnerabilities: A large-scale empirical study. IEEE Transactions on Software Engineering 49, 1 (2022), 44–63.
[62]
Vikas Kumar Jain and Meenakshi Tripathi. 2023. Multi-objective approach for detecting vulnerabilities in Ethereum smart contracts. In Proceedings of the 2023 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC’23). IEEE, 1–6.
[63]
Sanghoon Jeon and Huy Kang Kim. 2021. AutoVAS: An automated vulnerability analysis system with a deep learning approach. Computers & Security 106 (2021), 102308.
[64]
Wanqing Jie, Qi Chen, Jiaqi Wang, Arthur Sandor Voundi Koe, Jin Li, Pengfei Huang, Yaqi Wu, and Yin Wang. 2023. A novel extended multimodal AI framework towards vulnerability detection in smart contracts. Information Sciences 636 (2023), 118907.
[65]
Barbara Kitchenham and Stuart Charters. 2007. Guidelines for Performing Systematic Literature Reviews in Software Engineering. EBSE Technical Report, Version 2.3. EBSE.
[66]
Saad Khan and Simon Parkinson. 2018. Review into state of the art of vulnerability assessment using artificial intelligence. In Guide to Vulnerability Analysis for Computer Networks and Systems. Springer, 3–32.
[67]
Taegyu Kim, Chung Hwan Kim, Junghwan Rhee, Fan Fei, Zhan Tu, Gregory Walkup, Xiangyu Zhang, Xinyan Deng, and Dongyan Xu. 2019. RVFuzzer: Finding input validation bugs in robotic vehicles through control-guided testing. In Proceedings of the 28th USENIX Conference on Security Symposium (SEC’19). 425–442.
[68]
Lingdi Kong, Senlin Luo, Limin Pan, Zhouting Wu, and Xinshuai Li. 2024. A multi-type vulnerability detection framework with parallel perspective fusion and hierarchical feature enhancement. Computers & Security 140 (2024), 103787.
[69]
Kyriakos Kritikos, Kostas Magoutis, Manos Papoutsakis, and Sotiris Ioannidis. 2019. A survey on vulnerability assessment tools and databases for cloud-based web applications. Array 3 (2019), 100011. 
[70]
Jorrit Kronjee, Arjen Hommersom, and Harald Vranken. 2018. Discovering software vulnerabilities using data-flow analysis and machine learning. In Proceedings of the 13th International Conference on Availability, Reliability, and Security (ARES’18). 1–10.
[71]
Tue Le, Tuan Nguyen, Trung Le, Dinh Phung, Paul Montague, Olivier De Vel, and Lizhen Qu. 2018. Maximal divergence sequential autoencoder for binary software vulnerability detection. In Proceedings of the 2018 International Conference on Learning Representations (ICLR’18).
[72]
Triet H. M. Le, Huaming Chen, and M. Ali Babar. 2022. A survey on data-driven software vulnerability assessment and prioritization. ACM Computing Surveys 55, 5 (2022), 1–39.
[73]
Triet Huynh Minh Le, David Hin, Roland Croft, and M. Ali Babar. 2021. DeepCVA: Automated commit-level vulnerability assessment with deep multi-task learning. In Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE’21). 717–729.
[74]
Jian Li, Pinjia He, Jieming Zhu, and Michael R. Lyu. 2017. Software defect prediction via convolutional neural network. In Proceedings of the 2017 IEEE International Conference on Software Quality, Reliability, and Security (QRS’17). IEEE, 318–328.
[75]
Litao Li, Steven H. H. Ding, Yuan Tian, Benjamin C. M. Fung, Philippe Charland, Weihan Ou, Leo Song, and Congwei Chen. 2023. VulANalyzeR: Explainable binary vulnerability detection with multi-task learning and attentional graph convolution. ACM Transactions on Privacy and Security 26, 3 (2023), 1–25.
[76]
Lina Li, Yang Liu, Guodong Sun, and Nianfeng Li. 2024. Smart contract vulnerability detection based on automated feature extraction and feature interaction. IEEE Transactions on Knowledge and Data Engineering 36, 9 (2024), 4916–4929.
[77]
Xin Li, Yang Xin, Hongliang Zhu, Yixian Yang, and Yuling Chen. 2023. Cross-domain vulnerability detection using graph embedding and domain adaptation. Computers & Security 125 (2023), 103017.
[78]
Yi Li, Shaohua Wang, and Tien N. Nguyen. 2021. Vulnerability detection with fine-grained interpretations. In Proceedings of the 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE’21). 292–303.
[79]
Yi Li, Shaohua Wang, Tien N. Nguyen, and Son Van Nguyen. 2019. Improving bug detection via context-based code representation learning and attention-based neural networks. Proceedings of the ACM on Programming Languages 3, OOPSLA (Oct. 2019), Article 162, 30 pages. DOI:
[80]
Yi Li, Shaohua Wang, Tien N. Nguyen, and Son Van Nguyen. 2019. Improving bug detection via context-based code representation learning and attention-based neural networks. Proceedings of the ACM on Programming Languages 3, OOPSLA (Oct. 2019), Article 162, 30 pages.
[81]
Yi Li, Aashish Yadavally, Jiaxing Zhang, Shaohua Wang, and Tien N. Nguyen. 2023. Commit-level, neural vulnerability detection and assessment. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE’23). 1024–1036.
[82]
Zhaoxuan Li, Siqi Lu, Rui Zhang, Ziming Zhao, Rujin Liang, Rui Xue, Wenhao Li, Fan Zhang, and Sheng Gao. 2023. VulHunter: Hunting vulnerable smart contracts at EVM bytecode-level via multiple instance learning. IEEE Transactions on Software Engineering 49, 11 (2023), 4886–4916.
[83]
Zhen Li, Deqing Zou, Shouhuai Xu, Zhaoxuan Chen, Yawei Zhu, and Hai Jin. 2022. VulDeeLocator: A deep learning-based fine-grained vulnerability detector. IEEE Transactions on Dependable and Secure Computing 19 (2022), 2821–2837.
[84]
Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, and Zhaoxuan Chen. 2022. SySeVR: A framework for using deep learning to detect software vulnerabilities. IEEE Transactions on Dependable and Secure Computing 19 (2022), 2244–2258.
[85]
Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, and Yuyi Zhong. 2018. VulDeePecker: A deep learning-based system for vulnerability detection. In Proceedings of the 2018 Network and Distributed Systems Security Symposium (NDSS’18). 1–15.
[86]
Guanjun Lin, Sheng Wen, Qing-Long Han, Jun Zhang, and Yang Xiang. 2020. Software vulnerability detection using deep neural networks: A survey. Proceedings of the IEEE 108, 10 (2020), 1825–1848.
[87]
Guanjun Lin, Jun Zhang, Wei Luo, Lei Pan, Olivier De Vel, Paul Montague, and Yang Xiang. 2019. Software vulnerability discovery via learning multi-domain knowledge bases. IEEE Transactions on Dependable and Secure Computing 18, 5 (2019), 2469–2485.
[88]
Guanjun Lin, Jun Zhang, Wei Luo, Lei Pan, Yang Xiang, Olivier De Vel, and Paul Montague. 2018. Cross-project transfer representation learning for vulnerable function discovery. IEEE Transactions on Industrial Informatics 14, 7 (2018), 3289–3297. 
[89]
Jinfeng Lin, Yalin Liu, Qingkai Zeng, Meng Jiang, and Jane Cleland-Huang. 2021. Traceability transformed: Generating more accurate links with pre-trained BERT models. In Proceedings of the 43rd International Conference on Software Engineering (ICSE’21). IEEE, 324–335.
[90]
Chao Liu, Cuiyun Gao, Xin Xia, David Lo, John Grundy, and Xiaohu Yang. 2021. On the reproducibility and replicability of deep learning in software engineering. ACM Transactions on Software Engineering and Methodology 31, 1 (2021), 1–46.
[91]
Haiyang Liu, Yuqi Fan, Lin Feng, and Zhenchun Wei. 2023. Vulnerable smart contract function locating based on multi-relational nested graph convolutional network. Journal of Systems and Software 204 (2023), 111775.
[92]
Huijiang Liu, Shuirou Jiang, Xuexin Qi, Yang Qu, Hui Li, Tingting Li, Cheng Guo, and Shikai Guo. 2024. Detect software vulnerabilities with weight biases via graph neural networks. Expert Systems with Applications 238 (2024), 121764.
[93]
Jingqiang Liu, Xiaoxi Zhu, Chaoge Liu, Xiang Cui, and Qixu Liu. 2022. CPGBERT: An effective model for defect detection by learning program semantics via code property graph. In Proceedings of the 2022 IEEE International Conference on Trust, Security, and Privacy in Computing and Communications (TrustCom’22). IEEE, 274–282.
[94]
Shigang Liu, Guanjun Lin, Qing-Long Han, Sheng Wen, Jun Zhang, and Yang Xiang. 2019. DeepBalance: Deep-learning and fuzzy oversampling for vulnerability detection. IEEE Transactions on Fuzzy Systems 28, 7 (2019), 1329–1343.
[95]
Shigang Liu, Guanjun Lin, Lizhen Qu, Jun Zhang, Olivier De Vel, Paul Montague, and Yang Xiang. 2020. CD-VulD: Cross-domain vulnerability discovery based on deep domain adaptation. IEEE Transactions on Dependable and Secure Computing 19, 1 (2020), 438–451.
[96]
Zhenguang Liu, Peng Qian, Xiang Wang, Lei Zhu, Qinming He, and Shouling Ji. 2021. Smart contract vulnerability detection: From pure neural network to interpretable graph feature and expert pattern fusion. In Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI’21).
[97]
Zhenguang Liu, Peng Qian, Xiaoyang Wang, Yuan Zhuang, Lin Qiu, and Xun Wang. 2023. Combining graph neural networks with expert knowledge for smart contract vulnerability detection. IEEE Transactions on Knowledge and Data Engineering 35, 2 (2023), 1296–1310.
[98]
Guilong Lu, Xiaolin Ju, Xiang Chen, Wenlong Pei, and Zhilong Cai. 2024. GRACE: Empowering LLM-based software vulnerability detection with graph structure and in-context learning. Journal of Systems and Software 212 (2024), 112031.
[99]
Abu Sayed Mahfuz. 2016. Software Quality Assurance: Integrating Testing, Security, and Audit. CRC Press.
[100]
Yi Mao, Yun Li, Jiatai Sun, and Yixin Chen. 2020. Explainable software vulnerability detection based on attention-based bidirectional recurrent neural networks. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data’20). IEEE, 4651–4656.
[101]
Andrew Meneely, Harshavardhan Srinivasan, Ayemi Musa, Alberto Rodriguez Tejeda, Matthew Mokary, and Brian Spates. 2013. When a patch goes bad: Exploring the properties of vulnerability-contributing commits. In Proceedings of the 2013 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM’13). IEEE, 65–74.
[102]
Nicholas Nethercote and Julian Seward. 2007. Valgrind: A framework for heavyweight dynamic binary instrumentation. ACM SIGPLAN Notices 42, 6 (2007), 89–100.
[103]
Hoang H. Nguyen, Nhat-Minh Nguyen, Hong-Phuc Doan, Zahra Ahmadi, Thanh-Nam Doan, and Lingxiao Jiang. 2022. MANDO-GURU: Vulnerability detection for smart contract source code by heterogeneous graph embeddings. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE’22). 1736–1740.
[104]
Hoang H. Nguyen, Nhat-Minh Nguyen, Chunyao Xie, Zahra Ahmadi, Daniel Kudendo, Thanh-Nam Doan, and Ling-xiao Jiang. 2022. MANDO: Multi-level heterogeneous graph embeddings for fine-grained detection of smart contract vulnerabilities. In Proceedings of the 2022 IEEE 9th International Conference on Data Science and Advanced Analytics (DSAA’22). IEEE, 1–10.
[105]
Hoang H. Nguyen, Nhat-Minh Nguyen, Chunyao Xie, Zahra Ahmadi, Daniel Kudendo, Thanh-Nam Doan, and Ling-xiao Jiang. 2023. MANDO-HGT: Heterogeneous graph transformers for smart contract vulnerability detection. In Proceedings of the 20th International Conference on Mining Software Repositories (MSR’23). IEEE, 334–346.
[106]
Son Nguyen, Thu-Trang Nguyen, Thanh Trong Vu, Thanh-Dat Do, Kien-Tuan Ngo, and Hieu Dinh Vo. 2024. Code-centric learning-based just-in-time vulnerability detection. Journal of Systems and Software 214 (2024), 112014.
[107]
Tuan Nguyen, Trung Le, Khanh Nguyen, Olivier de Vel, Paul Montague, John Grundy, and Dinh Phung. 2020. Deep cost-sensitive kernel machine for binary software vulnerability detection. In Advances in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, Vol. 12085. Springer, 164–177.
[108]
Thu-Trang Nguyen and Hieu Dinh Vo. 2024. Context-based statement-level vulnerability localization. Information and Software Technology 169 (2024), 107406.
[109]
Van Nguyen, Trung Le, Chakkrit Tantithamthavorn, John Grundy, and Dinh Phung. 2024. Deep domain adaptation with max-margin principle for cross-project imbalanced software vulnerability detection. ACM Transactions on Software Engineering and Methodology 33, 6 (2024), Article 162, 34 pages.
[110]
Chao Ni, Xinrong Guo, Yan Zhu, Xiaodan Xu, and Xiaohu Yang. 2023. Function-level vulnerability detection through fusing multi-modal knowledge. In Proceedings of the 2023 IEEE/ACM International Conference on Automated Software Engineering (ASE’23). IEEE, 1911–1918.
[111]
Chao Ni, Wei Wang, Kaiwen Yang, Xin Xia, Kui Liu, and David Lo. 2022. The best of both worlds: Integrating semantic features with expert features for defect prediction and localization. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE’22). 672–683.
[112]
Yu Nong, Rainy Sharma, Abdelwahab Hamou-Lhadj, Xiapu Luo, and Haipeng Cai. 2022. Open science in software engineering: A study on deep learning-based vulnerability detection. IEEE Transactions on Software Engineering 49, 4 (2022), 1983–2005.
[113]
Shengyi Pan, Lingfeng Bao, Xin Xia, David Lo, and Shanping Li. 2023. Fine-grained commit-level vulnerability type prediction by CWE tree structure. In Proceedings of the 45th International Conference on Software Engineering (ICSE’23). IEEE, 957–969.
[114]
Luca Pascarella, Fabio Palomba, and Alberto Bacchelli. 2019. Fine-grained just-in-time defect prediction. Journal of Systems and Software 150 (2019), 22–36.
[115]
Henning Perl, Sergej Dechand, Matthew Smith, Daniel Arp, Fabian Yamaguchi, Konrad Rieck, Sascha Fahl, and Yasemin Acar. 2015. VCCFinder: Finding potential vulnerabilities in open-source projects to assist code audits. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. 426–437.
[116]
Kai Petersen, Sairam Vakkalanka, and Ludwik Kuzniarz. 2015. Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology 64 (2015), 1–18.
[117]
Anh Viet Phan, Minh Le Nguyen, and Lam Thu Bui. 2017. Convolutional neural networks over control flow graphs for software defect prediction. In Proceedings of the 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI’17). IEEE, 45–52.
[118]
Michael Pradel and Koushik Sen. 2018. DeepBugs: A learning approach to name-based bug detection. Proceedings of the ACM on Programming Languages 2, OOPSLA (2018), Article 147, 25 pages.
[119]
Ali Raza and Waseem Ahmed. 2022. Threat and vulnerability management life cycle in operating systems: A systematic review. Journal of Multidisciplinary Engineering Science and Technology 9, 1 (2022), 15010–15013.
[120]
Xiaojun Ren, Yongtang Wu, Jiaqing Li, Dongmin Hao, and Muhammad Alam. 2023. Smart contract vulnerability detection based on a semantic code structure and a self-designed neural network. Computers and Electrical Engineering 109 (2023), 108766.
[121]
Rebecca Russell, Louis Kim, Lei Hamilton, Tomo Lazovich, Jacob Harer, Onur Ozdemir, Paul Ellingwood, and Marc McConley. 2018. Automated vulnerability detection in source code using deep representation learning. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA’18). IEEE, 757–762.
[122]
Riccardo Scandariato, James Walden, Aram Hovsepyan, and Wouter Joosen. 2014. Predicting vulnerable software components via text mining. IEEE Transactions on Software Engineering 40, 10 (2014), 993–1006.
[123]
Hinrich Schütze, Christopher D. Manning, and Prabhakar Raghavan. 2008. Introduction to Information Retrieval. Vol. 39. Cambridge University Press, Cambridge.
[124]
Abubakar Omari Abdallah Semasaba, Wei Zheng, Xiaoxue Wu, and Samuel Akwasi Agyemang. 2020. Literature survey of deep learning-based vulnerability analysis on source code. IET Software 14, 6 (2020), 654–664.
[125]
Christoph Sendner, Huili Chen, Hossein Fereidooni, Lukas Petzi, Jan König, Jasper Stang, Alexandra Dmitrienko, Ahmad-Reza Sadeghi, and Farinaz Koushanfar. 2023. Smarter contracts: Detecting vulnerabilities in smart contracts with deep transfer learning. In Proceedings of the 2023 Network and Distributed System Security Symposium (NDSS’23).
[126]
Thomas Shippey, David Bowes, and Tracy Hall. 2019. Automatically identifying code features for software defect prediction: Using AST n-grams. Information and Software Technology 106 (2019), 142–160.
[127]
Zihua Song, Junfeng Wang, Kaiyuan Yang, and Jigang Wang. 2023. HGIVul: Detecting inter-procedural vulnerabilities based on hypergraph convolution. Information and Software Technology 160 (2023), 107219.
[128]
Miroslaw Staron, Mirosław Ochodek, Wilhelm Meding, and Ola Söder. 2020. Using machine learning to identify code fragments for manual review. In Proceedings of the 2020 46th Euromicro Conference on Software Engineering and Advanced Applications (SEAA’20). IEEE, 513–516.
[129]
Benjamin Steenhoek, Hongyang Gao, and Wei Le. 2024. Dataflow analysis-inspired deep learning for efficient vulnerability detection. In Proceedings of the 46th International Conference on Software Engineering (ICSE’24). 1–13.
[130]
Octavian Suciu, Connor Nelson, Zhuoer Lyu, Tiffany Bao, and Tudor Dumitraş. 2022. Expected exploitability: Predicting the development of functional vulnerability exploits. In Proceedings of the 31st USENIX Security Symposium (Security’22). 377–394.
[131]
Hao Sun, Lei Cui, Lun Li, Zhenquan Ding, Zhiyu Hao, Jiancong Cui, and Peng Liu. 2021. VDSimilar: Vulnerability detection based on code similarity of vulnerabilities and patches. Computers & Security 110 (2021), 102417.
[132]
Hao Sun, Lei Cui, Lun Li, Zhenquan Ding, Siyuan Li, Zhiyu Hao, and Hongsong Zhu. 2024. VDTriplet: Vulnerability detection with graph semantics using triplet model. Computers & Security 139 (2024), 103732.
[133]
Nan Sun, Jun Zhang, Paul Rimba, Shang Gao, Leo Yu Zhang, and Yang Xiang. 2018. Data-driven cybersecurity incident prediction: A survey. IEEE Communications Surveys & Tutorials 21, 2 (2018), 1744–1772.
[134]
Xiaobing Sun, Liangqiong Tu, Jiale Zhang, Jie Cai, Bin Li, and Yu Wang. 2023. ASSBert: Active and semi-supervised BERT for smart contract vulnerability detection. Journal of Information Security and Applications 73 (2023), 103423.
[135]
Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Haijun Wang, Zhengzi Xu, Xiaofei Xie, and Yang Liu. 2024. GPTScan: Detecting logic vulnerabilities in smart contracts by combining GPT with program analysis. In Proceedings of the 46th International Conference on Software Engineering (ICSE’24). 1–13.
[136]
Wei Tang, Mingwei Tang, Minchao Ban, Ziguo Zhao, and Mingjun Feng. 2023. CSGVD: A deep learning approach combining sequence and graph embedding for source code vulnerability detection. Journal of Systems and Software 199 (2023), 111623.
[137]
Zhiquan Tang, Qiao Hu, Yupeng Hu, Wenxin Kuang, and Jiongyi Chen. 2022. SeVulDet: A semantics-enhanced learnable vulnerability detector. In Proceedings of the 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’22). IEEE, 150–162.
[138]
Wenxin Tao, Xiaohong Su, Jiayuan Wan, Hongwei Wei, and Weining Zheng. 2023. Vulnerability detection through cross-modal feature enhancement and fusion. Computers & Security 132 (2023), 103341.
[139]
Junfeng Tian, Wenjing Xing, and Zhen Li. 2020. BVDetector: A program slice-based binary code vulnerability intelligent detection system. Information and Software Technology 123 (2020), 106289.
[140]
Zhenzhou Tian, Binhui Tian, Jiajun Lv, Yanping Chen, and Lingwei Chen. 2024. Enhancing vulnerability detection via AST decomposition and neural sub-tree encoding. Expert Systems with Applications 238 (2024), 121865.
[141]
Tong Wan, Lu Lu, Hao Xu, and Quanyi Zou. 2023. Software vulnerability detection via doc2vec with path representations. In Proceedings of the 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security Companion (QRS-C’23). IEEE, 131–139.
[142]
Huanting Wang, Guixin Ye, Zhanyong Tang, Shin Hwei Tan, Songfang Huang, Dingyi Fang, Yansong Feng, Lizhong Bian, and Zheng Wang. 2020. Combining graph-based learning with automated data collection for code vulnerability detection. IEEE Transactions on Information Forensics and Security 16 (2020), 1943–1958.
[143]
Mingke Wang, Chuanqi Tao, and Hongjing Guo. 2023. LCVD: Loop-oriented code vulnerability detection via graph neural network. Journal of Systems and Software 202 (2023), 111706.
[144]
Qian Wang, Zhengdao Li, Hetong Liang, Xiaowei Pan, Hui Li, Tingting Li, Xiaochen Li, Chenchen Li, and Shikai Guo. 2024. Graph confident learning for software vulnerability detection. Engineering Applications of Artificial Intelligence 133 (2024), 108296.
[145]
Song Wang, Taiyue Liu, Jaechang Nam, and Lin Tan. 2018. Deep semantic feature learning for software defect prediction. IEEE Transactions on Software Engineering 46, 12 (2018), 1267–1293.
[146]
Song Wang, Taiyue Liu, and Lin Tan. 2016. Automatically learning semantic features for defect prediction. In Proceedings of the 38th International Conference on Software Engineering (ICSE’16). IEEE, 297–308.
[147]
Wenbo Wang, Tien N. Nguyen, Shaohua Wang, Yi Li, Jiyuan Zhang, and Aashish Yadavally. 2023. DeepVD: Toward class-separation features for neural network vulnerability detection. In Proceedings of the 45th International Conference on Software Engineering (ICSE’23). IEEE, 2249–2261.
[148]
Yan Wang, Peng Jia, Xi Peng, Cheng Huang, and Jiayong Liu. 2023. BinVulDet: Detecting vulnerability in binary program via decompiled pseudo code and BiLSTM-attention. Computers & Security 125 (2023), 103023.
[149]
Laura Wartschinski, Yannic Noller, Thomas Vogel, Timo Kehrer, and Lars Grunske. 2022. VUDENC: Vulnerability detection with deep learning on a natural codebase for Python. Information and Software Technology 144 (2022), 106809.
[150]
Xin-Cheng Wen, Yupan Chen, Cuiyun Gao, Hongyu Zhang, Jie M. Zhang, and Qing Liao. 2023. Vulnerability detection with graph simplification and enhanced graph representation learning. In Proceedings of the 45th International Conference on Software Engineering (ICSE’23). IEEE, 2275–2286. 
[151]
Xin-Cheng Wen, Cuiyun Gao, Jiaxin Ye, Yichen Li, Zhihong Tian, Yan Jia, and Xuan Wang. 2024. Meta-path based attentional graph learning model for vulnerability detection. IEEE Transactions on Software Engineering 50 (2024), 360–375.
[152]
Bolun Wu, Futai Zou, Ping Yi, Yue Wu, and Liang Zhang. 2023. SlicedLocator: Code vulnerability locator based on sliced dependence graph. Computers & Security 134 (2023), 103469.
[153]
Hongjun Wu, Zhuo Zhang, Shangwen Wang, Yan Lei, Bo Lin, Yihao Qin, Haoyu Zhang, and Xiaoguang Mao. 2021. Peculiar: Smart contract vulnerability detection based on crucial data flow graph and pre-training techniques. In Proceedings of the 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE’21). 378–389. 
[154]
Tongshuai Wu, Liwei Chen, Gewangzi Du, Dan Meng, and Gang Shi. 2024. UltraVCS: Ultra-fine-grained variable-based code slicing for automated vulnerability detection. IEEE Transactions on Information Forensics and Security 19 (2024), 3986–4000. 
[155] 
Yueming Wu, Deqing Zou, Shihan Dou, Wei Yang, Duo Xu, and Hai Jin. 2022. VulCNN: An image-inspired scalable vulnerability detection system. In Proceedings of the 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE’22). 
[156] 
Chunqiu Steven Xia, Yuxiang Wei, and Lingming Zhang. 2023. Automated program repair in the era of large pre-trained language models. In Proceedings of the 45th International Conference on Software Engineering (ICSE’23). 
[157] 
Peng Xiao, Qibin Xiao, Xusheng Zhang, Yumei Wu, and Fengyu Yang. 2024. Vulnerability detection based on enhanced graph representation learning. IEEE Transactions on Information Forensics and Security 19 (2024), 5120–5135. 
[158] 
Wei Xiao, Zhengzhang Hou, Tao Wang, Chengxian Zhou, and Chao Pan. 2024. MSGVUL: Multi-semantic integration vulnerability detection based on relational graph convolutional neural networks. Information and Software Technology 170 (2024), 107442. 
[159] 
Rongze Xu, Zhanyong Tang, Guixin Ye, Huanting Wang, Xin Ke, Dingyi Fang, and Zheng Wang. 2022. Detecting code vulnerabilities by learning from large-scale open source repositories. Journal of Information Security and Applications 69 (2022), 103293. 
[160] 
Fabian Yamaguchi, Nico Golde, Daniel Arp, and Konrad Rieck. 2014. Modeling and discovering vulnerabilities with code property graphs. In Proceedings of the 2014 IEEE Symposium on Security and Privacy. 590–604.
[161]
Fabian Yamaguchi, Markus Lottmann, and Konrad Rieck. 2012. Generalized vulnerability extrapolation using abstract syntax trees. In Proceedings of the 28th Annual Computer Security Applications Conference. 359–368.
[162]
Fabian Yamaguchi, Felix Lindner, and Konrad Rieck. 2011. Vulnerability extrapolation: Assisted discovery of vulnerabilities using machine learning. In Proceedings of the 5th USENIX Workshop on Offensive Technologies (WOOT’11).
[163]
Fabian Yamaguchi, Christian Wressnegger, Hugo Gascon, and Konrad Rieck. 2013. Chucky: Exposing missing checks in source code for vulnerability discovery. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security (CCS’13). 499–510.
[164]
Han Yan, Senlin Luo, Limin Pan, and Yifei Zhang. 2021. HAN-BSVD: A hierarchical attention network for binary software vulnerability detection. Computers & Security 108 (2021), 102286.
[165]
Hongyu Yang, Haiyun Yang, Liang Zhang, and Xiang Cheng. 2022. Source code vulnerability detection using vulnerability dependency representation graph. In Proceedings of the 2022 IEEE International Conference on Trust, Security, and Privacy in Computing and Communications (TrustCom’22). IEEE, 457–464.
[166]
Limin Yang, Xiangxue Li, and Yu Yu. 2017. VulDigger: A just-in-time and cost-aware tool for digging vulnerability-contributing changes. In Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM’17). IEEE, 1–7.
[167]
Suan Hsi Yong and Susan Horwitz. 2005. Using static analysis to reduce dynamic analysis overhead. Formal Methods in System Design 27 (2005), 313–334.
[168]
Xiaozhou You, Hui Li, Han Wang, and Faisal Mehmood. 2023. SmartDT: An effective vulnerability detection system of smart contracts based on deep learning. In Proceedings of the 2023 IEEE International Conference on Big Data (Big Data’23). IEEE, 2369–2376.
[169]
Lei Yu, Junyi Lu, Xianglong Liu, Li Yang, Fengjun Zhang, and Jiajia Ma. 2023. PSCVFinder: A prompt-tuning based framework for smart contract vulnerability detection. In Proceedings of the 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE’23). IEEE, 556–567.
[170]
Bin Yuan, Yifan Lu, Yilin Fang, Yueming Wu, Deqing Zou, Zhen Li, Zhi Li, and Hai Jin. 2023. Enhancing deep learning-based vulnerability detection by building behavior graph model. In Proceedings of the 45th International Conference on Software Engineering (ICSE’23). IEEE, 2262–2274.
[171]
Dawei Yuan, Xiaohui Wang, Yao Li, and Tao Zhang. 2023. Optimizing smart contract vulnerability detection via multi-modality code and entropy embedding. Journal of Systems and Software 202 (2023), 111699.
[172]
Cheng Zeng, Chun Ying Zhou, Sheng Kai Lv, Peng He, and Jie Huang. 2021. GCN2defect: Graph convolutional networks for SMOTETomek-based software defect prediction. In Proceedings of the 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE’21). IEEE, 69–79.
[173]
Peng Zeng, Guanjun Lin, Lei Pan, Yonghang Tai, and Jun Zhang. 2020. Software vulnerability analysis and discovery using deep learning techniques: A survey. IEEE Access 8 (2020), 197158–197172.
[174]
Chunyong Zhang, Bin Liu, Yang Xin, and Liangwei Yao. 2023. CPVD: Cross project vulnerability detection based on graph attention network and domain adaptation. IEEE Transactions on Software Engineering 49, 8 (2023), 4152–4168.
[175]
Chunyong Zhang, Tianxiang Yu, Bin Liu, and Yang Xin. 2024. Vulnerability detection based on federated learning. Information and Software Technology 167 (2024), 107371.
[176]
Hengyan Zhang, Weizhe Zhang, Yuming Feng, and Yang Liu. 2023. SVScanner: Detecting smart contract vulnerabilities via deep semantic extraction. Journal of Information Security and Applications 75 (2023), 103484.
[177]
Zhuo Zhang, Yan Lei, Meng Yan, Yue Yu, Jiachi Chen, Shangwen Wang, and Xiaoguang Mao. 2022. Reentrancy vulnerability detection and localization: A deep learning based two-phase approach. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE’22). 1–13.
[178]
Zixian Zhen, Xiangfu Zhao, Jinkai Zhang, Yichen Wang, and Haiyue Chen. 2024. DA-GNN: A smart contract vulnerability detection method based on dual attention graph neural network. Computer Networks 242 (2024), 110238.
[179]
Weining Zheng, Yuan Jiang, and Xiaohong Su. 2021. VulSPG: Vulnerability detection based on slice property graph representation learning. In Proceedings of the 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE’21). IEEE, 457–467.
[180]
Weining Zheng, Yuan Jiang, and Xiaohong Su. 2021. VulSPG: Vulnerability detection based on slice property graph representation learning. In Proceedings of the 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE’21). IEEE, 457–467.
[181]
Zhangqi Zheng, Yongshan Liu, Bing Zhang, Xinqian Liu, Hongyan He, and Xiang Gong. 2023. A multitype software buffer overflow vulnerability prediction method based on a software graph structure and a self-attentive graph neural network. Information and Software Technology 160 (2023), 107246.
[182]
Kuo Zhou, Jing Huang, Honggui Han, Bei Gong, Ao Xiong, Wei Wang, and Qihui Wu. 2023. Smart contracts vulnerability detection model based on adversarial multi-task learning. Journal of Information Security and Applications 77 (2023), 103555.
[183]
Yaqin Zhou, Shangqing Liu, Jingkai Siow, Xiaoning Du, and Yang Liu. 2019. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS’19). 1–11.
[184]
Yaqin Zhou and Asankhaya Sharma. 2017. Automated identification of security issues from commit messages and bug reports. In Proceedings of the 2017 11th ACM Joint Meeting on Foundations of Software Engineering (FSE’17). 914–919.
[185]
Huijuan Zhu, Kaixuan Yang, Liangmin Wang, Zhicheng Xu, and Victor S. Sheng. 2023. GraBit: A sequential model-based framework for smart contract vulnerability detection. In Proceedings of the 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE’23). IEEE, 568–577.
[186]
Weiyuan Zhuang, Hao Wang, and Xiaofang Zhang. 2022. Just-in-time defect prediction based on AST change embedding. Knowledge-Based Systems 248 (2022), 108852.
[187]
Yuan Zhuang, Zhenguang Liu, Peng Qian, Qi Liu, Xiang Wang, and Qinming He. 2020. Smart contract vulnerability detection using graph neural network. In Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI’20). 3283–3290.
[188]
Deqing Zou, Yutao Hu, Wenke Li, Yueming Wu, Haojun Zhao, and Hai Jin. 2022. mVulPreter: A multi-granularity vulnerability detection system with interpretations. IEEE Transactions on Dependable and Secure Computing. Early Access, August 22, 2022.
[189]
Deqing Zou, Sujuan Wang, Shouhuai Xu, Zhen Li, and Hai Jin. 2019. muVulDeePecker: A deep learning-based system for multiclass vulnerability detection. IEEE Transactions on Dependable and Secure Computing 18, 5 (2019), 2224–2236.
[190]
Deqing Zou, Yawei Zhu, Shouhuai Xu, Zhen Li, Hai Jin, and Hengkai Ye. 2021. Interpreting deep learning-based vulnerability detector predictions based on heuristic searching. ACM Transactions on Software Engineering and Methodology 30, 2 (2021), 1–31.
