In this talk I will discuss a new generation of software tools based on probabilistic models learned from large codebases, a.k.a. “Big Code”. By leveraging the massive effort already spent by thousands of programmers, these tools make useful predictions about new, unseen programs, thus helping to solve important and difficult software tasks. As examples, I will present our systems for statistical code completion, deobfuscation, and defect prediction. Two of these systems (jsnice.org and apk-deguard.com) are freely available and already have thousands of users. I will also present some of the core machine learning and program analysis techniques behind these tools.
Veselin Raychev obtained his PhD from ETH Zürich in 2016 on the topic of “Learning from Large Codebases”. Before that, he worked as a software engineer at Google on the public transportation routing algorithm of Google Maps, as well as on several other projects. He is currently co-founder and CTO of DeepCode GmbH, a company developing “Big Code” programming tools.