Machine learning program fixes 10 times as many open-source code errors as its predecessors

Amazing stuff.

MIT researchers have developed a machine-learning system that can comb through repairs to open-source computer programs and learn their general properties, in order to produce new repairs for a different set of programs.

The researchers tested their system on a set of programming errors, culled from real open-source applications, that had been compiled to evaluate automatic bug-repair systems. Where those earlier systems were able to repair one or two of the bugs, the MIT system repaired between 15 and 18, depending on whether it settled on the first solution it found or was allowed to run longer…

This is an article from a year ago about the same subject.


Even a partially self-repairing program would be handy, especially during development.


If you’ve read this, can you answer: does a human or the machine decide what is a bug, what is a repair, or what is the best repair? I.e., to what degree is this autonomous? To what degree does “fixing” mean meeting a goal, and how is the goal specified? Etc!

I’m guessing humans are still involved around the edges, but curious as to where the edges are :smile:


What I get from it is that these bugs show up during QA.

The researchers tested their system on a set of programming errors, culled from real open-source applications, that had been compiled to evaluate automatic bug-repair systems.

They’d had a system before that just started changing code in the hope of fixing the error, essentially brute-forcing the problem. In this case, they use machine learning to actually learn from existing programs what good code looks like.
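The older “brute force” approach can be sketched as a generate-and-validate loop. This is a hedged, minimal illustration, not the researchers’ actual implementation: every name and the toy bug below are invented.

```python
import random

def generate_and_validate(source, mutate, tests, attempts=1000):
    """Return the first mutated variant that passes every test, or None."""
    for _ in range(attempts):
        candidate = mutate(source)
        if all(test(candidate) for test in tests):
            return candidate
    return None

# Toy bug: size() should just return the length, but adds one.
buggy = "def size(xs): return len(xs) + 1"

def mutate(src):
    # Blind search: randomly swap the offending term for a neighbour.
    return src.replace("+ 1", random.choice(["+ 0", "- 1", "+ 1", "+ 2"]))

def passes(src):
    env = {}
    exec(src, env)                      # compile the candidate
    return env["size"]([1, 2, 3]) == 3  # expected behaviour

print(generate_and_validate(buggy, mutate, [passes]))
```

The MIT work keeps the validate step but, instead of trying candidates in arbitrary order, uses a learned model to decide which patches to try first.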

Long and Rinard wrote a computer program that evaluated all the possible relationships between these characteristics in successive lines of code. More than 3,500 such relationships constitute their feature set. Their machine-learning algorithm then tried to determine what combination of features most consistently predicted the success of a patch.

This is the great thing about systems like Git: you can see every change ever made, and let a machine learn from that.
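That ranking step can be sketched very roughly: represent each candidate patch as a feature vector and score it with weights learned from a corpus of successful human patches. The quoted article says the real feature set has more than 3,500 relationships; the three features and the weights below are invented stand-ins.

```python
# Hypothetical features per patch: (changes a condition, adds a null
# check, touches a loop). Real feature sets are far larger.
candidates = {
    "patch_a": (1, 0, 0),
    "patch_b": (1, 1, 0),
    "patch_c": (0, 0, 1),
}

# Weights as if learned offline from a corpus of successful patches.
weights = (0.4, 0.9, -0.2)

def score(features):
    # Linear score: higher means "more like a successful human patch".
    return sum(w * f for w, f in zip(weights, features))

# Try the highest-scoring candidate patches first.
ranked = sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
print(ranked)  # ['patch_b', 'patch_a', 'patch_c']
```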


Interesting, but good test-driven development practices should catch most obvious bugs and improve software design.

In my experience, TDD with good unit tests catches most implementation bugs, with further testing uncovering more subtle problems (such as different interpretations of requirements).
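For anyone unfamiliar with the loop, a minimal TDD sketch in Python (names invented): the test is written first, fails, and then the smallest implementation that passes it is added.

```python
# Step 1: write the test before any implementation exists.
def test_word_count():
    assert word_count("") == 0
    assert word_count("one two  three") == 3

# Step 2: write just enough implementation to make the test pass.
def word_count(text: str) -> int:
    return len(text.split())

test_word_count()
print("tests pass")
```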

Maybe this sort of software could be used as a crutch to circumvent unit testing, but I suspect it would be less effective. No automated procedure is going to cover weaknesses in requirements or the interpretation of them either.

I know techniques like BDD hope to bridge the gap between requirements and implementation too. However, good programmers are great at interpreting and implementing requirements, and good product people are good at defining and explaining requirements. A breakdown between these elements of a team is not going to be fixed by any quantity of code analysis.


Machine learning is coming quite fast and hard. Experts point to an “explosion” around 2012, when AI started to become better at recognizing images than humans are. Jeremy Howard, an AI guy, started a medical company and trained systems to diagnose MRI scans better than humans can; the same goes for detecting cancers in images. And remember, this field dates back to the 1950s, and it has been growing exponentially for decades. IBM’s Watson is already helping doctors treat cancer.

I wouldn’t be surprised if these systems become smarter and smarter and help you code a project as well. I agree that we still need humans to create the big logic. But wouldn’t it be nice if the devs from MaidSafe did a QA pass, found 12 bugs that prevented the system from working, started a system like this at the end of a long day, and came back the next morning to see 9 of them patched? They show with this research that things like that are possible.


There is a script in Python (2to3) that converts Python 2.7 code to 3.x.

Is that not machine learning? lol.
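It isn’t: a converter like 2to3 applies fixed, hand-written rewrite rules over a parse tree, with no learning involved. A toy, regex-based stand-in for one such rule (the real tool is parse-tree based; this one-rule version is invented purely for illustration):

```python
import re

# Toy sketch of one Python-2-to-3 fix: bare `print` statements.
# Matches "print x" but not "print(x)" (no learning, just a rule).
PRINT_STMT = re.compile(r"^(\s*)print\s+(?!\()(.+)$")

def fix_print(source: str) -> str:
    lines = []
    for line in source.splitlines():
        m = PRINT_STMT.match(line)
        lines.append(f"{m.group(1)}print({m.group(2)})" if m else line)
    return "\n".join(lines)

print(fix_print("print 'hello'"))  # print('hello')
```

The ML systems discussed above differ in that they *infer* what a good change looks like from data, rather than executing rules someone wrote by hand.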

Thanks @polpolrene

In that case I’m wondering if this could become an extension of compiler syntax checking: after compiling, run the code through QA/automated tests, present the failure, and deliver a fix that passes the tests.