
Can AI Fix Buggy Code? Exploring the Use of
Large Language Models in Automated
Program Repair
Lan Zhang, Northern Arizona University, Flagstaff, AZ, 86005, USA
Anoop Singhal, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA
Qingtian Zou, University of Texas Southwestern Medical Center, Dallas, TX, USA
Xiaoyan Sun, Worcester Polytechnic Institute, Worcester, MA, 01609, USA
Peng Liu, Pennsylvania State University, State College, PA, 16803, USA
Abstract: LLMs are increasingly being used to help programmers fix buggy code, thanks to their remarkable capabilities. This article reviews the current human-LLM collaboration approach to bug fixing and points out research directions toward autonomous program repair AI agents.
INTRODUCTION
The field of software engineering has witnessed a paradigm shift with the advent of large language models (LLMs). These sophisticated AI systems have demonstrated remarkable versatility across various software development tasks, including code generation, bug detection, and code review [1, 2, 3]. The potential of LLMs to revolutionize software development practices has sparked broad interest in both academic and industry circles, prompting a surge of research into their capabilities and limitations.
A recent breakthrough in this domain came with the introduction of Devin, an LLM-powered AI system capable of autonomously completing 13.8% of real-world coding tasks [4]. These tasks encompass a range of complex operations, from diagnosing and fixing bugs to conducting comprehensive code reviews. However, this relatively modest success rate raises a critical question that forms the core of our investigation: are we truly prepared to leverage LLMs for repairing buggy, complex programs? This question is not merely academic; it has far-reaching implications for the future of software development and maintenance practices.
To address this fundamental question, our study focuses on two modes of LLM-supported program repair:
Human-LLM Collaboration: This approach examines the synergistic relationship between human software
engineers and LLMs in the bug repair process [5]. It encompasses both interactive, dialogue-based methodologies and more integrated solutions such as real-time code completion and suggestion systems (a minimal sketch of such a dialogue follows this list).
Autonomous AI Agent Repair: This mode investigates the potential for LLMs to independently identify and rectify bugs without direct human intervention, representing a more ambitious vision of automated program repair.
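To make the collaboration mode concrete, the listing below sketches how a dialogue-based repair session might be structured. It is a minimal illustration under stated assumptions, not a description of any specific product: chat() is a hypothetical placeholder for whatever chat-completion API is in use, and the message format merely mirrors common chat-style LLM interfaces.

    # Minimal sketch of a dialogue-based repair session (hypothetical
    # helper names; chat() must be wired to an actual LLM provider).

    def chat(messages: list[dict]) -> str:
        """Placeholder for an LLM chat-completion call."""
        raise NotImplementedError("connect to your LLM provider here")

    def repair_session(buggy_code: str, error_report: str) -> list[dict]:
        # Seed the dialogue with the failing code and the observed error.
        messages = [
            {"role": "system",
             "content": "You are a program repair assistant."},
            {"role": "user",
             "content": f"This code fails:\n{buggy_code}\n"
                        f"Error:\n{error_report}\nPropose a fix."},
        ]
        messages.append({"role": "assistant", "content": chat(messages)})
        # The human reviews each proposal and replies with corrections or
        # extra context; the loop ends when the human accepts the patch.
        while (feedback := input("Feedback (empty to accept): ").strip()):
            messages.append({"role": "user", "content": feedback})
            messages.append({"role": "assistant", "content": chat(messages)})
        return messages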
By examining the efficacy of LLMs across diverse programming contexts, e.g., C/C++, Java, and Python, we aim to provide a nuanced understanding of their current capabilities and limitations in addressing complex software bugs. Our findings reveal a mixed landscape of LLM-supported program repair. For the Human-LLM Collaboration mode, we observed that results improve significantly when humans provide additional contextual knowledge, including information about variable contexts, relevant data structures, related functions, and even the underlying logic of the code. This synergy between human expertise and LLM capabilities shows promise for enhancing bug repair in complex software systems.
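As one illustration of this finding, the listing below folds such human-supplied context into a repair prompt. The function name and section labels are invented for illustration; the prompts used in the study are not reproduced here.

    def build_repair_prompt(buggy_snippet: str,
                            variable_context: str = "",
                            data_structures: str = "",
                            related_functions: str = "",
                            intended_logic: str = "") -> str:
        """Assemble a repair prompt; the optional fields carry the kinds
        of human-supplied context that improved repair results."""
        sections = [
            ("Buggy code", buggy_snippet),
            ("Variable context", variable_context),       # types, ranges
            ("Relevant data structures", data_structures),  # structs/classes
            ("Related functions", related_functions),     # callers, callees
            ("Intended logic", intended_logic),           # what it should do
        ]
        parts = [f"### {title}\n{body}" for title, body in sections if body]
        parts.append("### Task\nExplain the bug and return a corrected "
                     "version of the code.")
        return "\n\n".join(parts)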
In contrast, the Autonomous AI Agent Repair mode presents a more challenging frontier. Our research indicates that we are still far from achieving reliable automatic code repair using LLMs alone. The complexity of real-world software systems, coupled with the nuanced understanding required for effective bug repair, continues to pose significant challenges for fully autonomous LLM-based solutions.
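For contrast, a fully autonomous pipeline typically reduces to a generate-and-validate loop like the one sketched below: the model proposes a patch, the test suite is the only oracle, and no human is available to supply the missing context. This is a sketch under stated assumptions, not a specific system's implementation: propose_patch() is a placeholder for the model call, and the test harness is assumed to accept the candidate file as its last argument.

    import pathlib
    import subprocess
    import tempfile

    def propose_patch(code: str, failure_log: str) -> str:
        """Placeholder for an LLM call that returns a patched file."""
        raise NotImplementedError("connect to your LLM provider here")

    def run_tests(code: str, test_cmd: list[str]) -> tuple[bool, str]:
        # Write the candidate and run the project's test command on it
        # (assumes the harness takes the candidate path as its last arg).
        path = pathlib.Path(tempfile.mkdtemp()) / "candidate.py"
        path.write_text(code)
        result = subprocess.run(test_cmd + [str(path)],
                                capture_output=True, text=True)
        return result.returncode == 0, result.stdout + result.stderr

    def autonomous_repair(code: str, test_cmd: list[str],
                          max_attempts: int = 5) -> str | None:
        for _ in range(max_attempts):
            ok, log = run_tests(code, test_cmd)
            if ok:
                return code              # tests green: accept the patch
            code = propose_patch(code, log)  # otherwise ask for a new one
        ok, _ = run_tests(code, test_cmd)
        return code if ok else None      # no human in the loop to fall back on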
Human-LLM Collaboration
GitHub Copilot’s ROBIN system represents a significant advancement in human-LLM collaboration for debugging [6]. It uses multiple AI agents to analyze code context, exception information, and user queries, guiding developers through systematic debugging steps. ROBIN leverages LLMs as reasoning engines to pro-