NIST: Can AI Fix Buggy Code? Exploring the Use of Large Language Models in Automated Program Repair (2025), 8 pages

Can AI Fix Buggy Code? Exploring the Use of
Large Language Models in Automated
Program Repair
Lan Zhang, Northern Arizona University, Flagstaff, AZ, 86005, USA
Anoop Singhal, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
Qingtian Zou, University of Texas Southwestern Medical Center, Dallas, TX, USA
Xiaoyan Sun, Worcester Polytechnic Institute, Worcester, MA, 01609, USA
Peng Liu, Pennsylvania State University, State College, PA, 16803, USA
Abstract: LLMs are increasingly used to help
programmers fix buggy code, owing to their remarkable
capabilities. This article reviews the current
human-LLM collaboration approach to bug fixing and
points out research directions toward the development
of autonomous program repair AI agents.
INTRODUCTION
The field of software engineering has witnessed a
paradigm shift with the advent of large language
models (LLMs). These sophisticated AI systems have
demonstrated remarkable versatility across various
software development tasks, including code generation,
bug detection, and code review [1, 2, 3]. The
potential of LLMs to revolutionize software development
practices has sparked broad interest within both
academic and industry circles, prompting a surge of
research into their capabilities and limitations.
A recent breakthrough in this domain came with the
introduction of Devin, an LLM-powered AI system capable
of autonomously completing 13.8% of real-world
coding tasks [4]. These tasks encompass a range of
complex operations, from diagnosing and fixing bugs to
conducting comprehensive code reviews. However, the
relatively modest success rate of 13.8% in real-world
scenarios raises a critical question that forms the core
of our investigation: Are we truly prepared to leverage
LLMs for repairing buggy complex programs? This
question is not merely academic but has far-reaching
implications for the future of software development and
maintenance practices.
To address this fundamental question, our study focuses
on two modes of LLM-supported program repair:
Human-LLM Collaboration: This approach examines
the synergistic relationship between human software
engineers and LLMs in the bug repair process [5]. It
encompasses both interactive, dialogue-based
methodologies and more integrated solutions such as
real-time code completion and suggestion systems.
Autonomous AI Agent Repair: This mode investigates
the potential for LLMs to independently identify
and rectify bugs without direct human intervention,
representing a more ambitious vision of automated
program repair.
By examining the efficacy of LLMs across diverse
programming contexts (e.g., C/C++, Java, and Python), we
aim to provide a nuanced understanding of their current
capabilities and limitations in addressing complex
software bugs. Our findings reveal a mixed
picture of LLM-supported program repair. For the
Human-LLM Collaboration mode, we observed that
results could be significantly improved when humans
provide additional contextual knowledge. This includes
information about variable contexts, relevant data
structures, related functions, and even the underlying
logic of the code. This synergy between human expertise
and LLM capabilities shows promise for enhancing
bug repair processes in complex software systems.
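As an illustration of the kinds of context listed above, the following is a minimal sketch of how a developer might package that knowledge alongside a buggy snippet before handing it to an LLM. The names RepairContext and build_repair_prompt, the section titles, and the prompt wording are hypothetical and are not taken from the article; the sketch only assembles text and does not call any model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepairContext:
    """Human-supplied knowledge about a buggy snippet (hypothetical structure)."""
    variable_notes: List[str] = field(default_factory=list)
    data_structures: List[str] = field(default_factory=list)
    related_functions: List[str] = field(default_factory=list)
    intended_logic: str = ""


def _bullets(title: str, items: List[str]) -> List[str]:
    """Render a titled bullet list, or nothing if there are no items."""
    if not items:
        return []
    return ["", f"{title}:"] + [f"- {item}" for item in items]


def build_repair_prompt(buggy_code: str, language: str, context: RepairContext) -> str:
    """Combine the buggy code with whatever context the human provided."""
    lines = [
        f"Fix the bug in the following {language} code.",
        "",
        "Buggy code:",
        buggy_code.strip(),
    ]
    lines += _bullets("Variable context", context.variable_notes)
    lines += _bullets("Relevant data structures", context.data_structures)
    lines += _bullets("Related functions", context.related_functions)
    if context.intended_logic:
        lines += ["", "Intended logic:", context.intended_logic.strip()]
    return "\n".join(lines)


if __name__ == "__main__":
    snippet = "def mean(xs):\n    return sum(xs) / (len(xs) - 1)"
    ctx = RepairContext(
        variable_notes=["xs is a non-empty list of floats"],
        intended_logic="Return the arithmetic mean of xs.",
    )
    print(build_repair_prompt(snippet, "Python", ctx))
```

Sections with no human input are simply omitted, so the same function covers both the bare prompt and the context-enriched one, which makes it easy to compare repair quality with and without the extra knowledge.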
In contrast, the Autonomous AI Agent Repair mode
presents a more challenging frontier. Our research
indicates that we are still far from achieving reliable
automatic code repair using LLMs alone. The complexity
of real-world software systems, coupled with
the nuanced understanding required for effective bug
repair, continues to pose significant challenges for fully
autonomous LLM-based solutions.
Human-LLM Collaboration
GitHub Copilot’s ROBIN system represents a significant
advancement in human-LLM collaboration for debugging [6].
It uses multiple AI agents to analyze code
context, exception information, and user queries,
guiding developers through systematic debugging steps.
ROBIN leverages LLMs as reasoning engines to pro-
Month Published by the IEEE Computer Society Publication Name
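Resource description:

The article "Can AI Fix Buggy Code? Exploring the Use of Large Language Models in Automated Program Repair" examines the application of large language models (LLMs) to program repair. The study focuses on two modes of LLM-supported program repair: human-LLM collaboration and autonomous AI agent repair. Across multiple programming languages and approaches, it finds that in the human-LLM collaboration mode, repair results improve significantly when humans supply additional contextual knowledge, whereas the autonomous AI agent repair mode remains challenging and reliable automatic code repair is still out of reach. Future research directions include improving program comprehension, strengthening verification and testing, advancing multi-level software reasoning, and enhancing explainability and transparency. In short, although LLMs are powerful tools for software engineering, they cannot currently replace human expertise, and hybrid approaches may be the more promising path.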
