The researchers tested their system on a set of programming errors, culled from real open-source applications, that had been compiled to evaluate automatic bug-repair systems. While earlier systems were able to repair one or two of the bugs, the MIT machine-learning system repaired between 15 and 18, depending on whether it settled on the first solution it found or was allowed to run longer.
“One of the most intriguing aspects of this research is that we’ve found that there are indeed universal properties of correct code that you can learn from one set of applications and apply to another set of applications,” explained Martin Rinard, professor of electrical engineering and computer science.
These results could not only lead to automatic bug-repair tools but also be applied across other engineering domains, according to Rinard.
“If you can recognize correct code, that has enormous implications across all software engineering. This is just the first application of what we hope will be a brand-new, fabulous technique.”
The research was presented in a paper by graduate student Fan Long at the latest Symposium on Principles of Programming Languages. It describes how Long wrote a computer script to automatically extract both the uncorrected code and the patches for 777 errors in eight common open-source applications stored in the online repository GitHub.
To set up their machine-learning system, Long and Rinard first had to select a “feature set” that the system would analyze. The researchers concentrated on values stored in memory: either variables, which can be modified during a program’s execution, or constants, which can’t. They identified 30 prime characteristics of a given value: being involved in an operation (such as addition, multiplication, or comparison); being local or global; being a variable or a constant; and so on.
They then wrote a computer program to evaluate all the possible relationships between these characteristics in successive lines of code, finding over 3,500 such relationships in their feature set. Their machine-learning algorithm then tried to determine what combination of features most consistently predicted the success of a patch.
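The idea of pairing characteristics across successive lines and learning which pairs predict a good patch can be illustrated with a small sketch. This is not the researchers' code: the four characteristic names, the toy training examples, and the choice of plain logistic regression are all assumptions made for illustration, standing in for the 30 characteristics and the learning algorithm the article mentions but does not spell out.

```python
import math
from itertools import product

# Hypothetical characteristic names; the article says there were 30
# but names only a few kinds (operation involved, local/global, variable/constant).
CHARACTERISTICS = ["in_addition", "in_comparison", "is_local", "is_constant"]


def pair_features(prev_line, patched_line):
    """One binary feature per (characteristic in previous line,
    characteristic in patched line) pair."""
    return [
        1.0 if (a in prev_line and b in patched_line) else 0.0
        for a, b in product(CHARACTERISTICS, repeat=2)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, prev_line, patched_line):
    """Estimated probability that the patched line is a correct fix."""
    x = pair_features(prev_line, patched_line)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train(examples, epochs=200, lr=0.5):
    """Fit logistic-regression weights with stochastic gradient ascent."""
    weights = [0.0] * (len(CHARACTERISTICS) ** 2)
    for _ in range(epochs):
        for prev_line, patched_line, label in examples:
            x = pair_features(prev_line, patched_line)
            error = label - score(weights, prev_line, patched_line)
            for i, xi in enumerate(x):
                weights[i] += lr * error * xi
    return weights


# Invented toy data: (characteristics of the preceding line,
# characteristics of the patched line, 1 if the patch was correct else 0).
examples = [
    ({"is_local", "in_comparison"}, {"is_local", "in_comparison"}, 1),
    ({"is_local", "in_comparison"}, {"is_constant", "in_addition"}, 0),
    ({"is_constant"}, {"is_constant", "in_comparison"}, 1),
    ({"is_constant"}, {"is_local", "in_addition"}, 0),
]
weights = train(examples)

# Rank two hypothetical candidate patches for the same preceding line.
context = {"is_local", "in_comparison"}
candidates = [
    ("B", {"is_constant", "in_addition"}),
    ("A", {"is_local", "in_comparison"}),
]
ranked = sorted(candidates, key=lambda c: score(weights, context, c[1]), reverse=True)
print([name for name, _ in ranked])
```

With four characteristics the sketch produces 16 pairwise features. Applying the same pairing to a richer set of characteristics is how a feature set can grow into the thousands, although the article does not say exactly how its roughly 3,500 relationships were enumerated.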