Research group

Machine Learning Methods in Software Engineering

Embeddings of Code Changes

Project supervisor: Timofey Bryksin
Status: Active

The goal of this project is to build explicit vector representations of code changes. We aim to obtain representations that effectively encode information about a code change and thus make it possible to define semantic transformations over it. The approach treats program code as a sequence of tokens. The model can be trained in an unsupervised manner, which allows large-scale pre-training of the network. The approach is evaluated on three tasks: commit message generation, stable patch prediction, and applying changes to program code.
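
To illustrate what treating a code change as a sequence of tokens might look like in practice, here is a minimal sketch of one possible input representation. It is not the project's model or preprocessing: the regex tokenizer, the `align_change` helper, and the `<pad>` symbol are all hypothetical names chosen for this example, and the alignment relies on Python's standard `difflib`.

```python
import re
from difflib import SequenceMatcher

# Assumption: a crude language-agnostic tokenizer (identifiers, integers,
# and single punctuation characters). A real pipeline would likely use a
# proper lexer for the target language.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(code):
    """Split source code into a flat list of tokens."""
    return TOKEN_RE.findall(code)


def align_change(before, after, pad="<pad>"):
    """Align the token sequences of two code versions.

    Returns a list of (edit_tag, old_token, new_token) triples, where
    edit_tag is one of 'equal', 'replace', 'delete', 'insert'. The shorter
    side of each edited span is filled with the padding symbol.
    """
    old, new = tokenize(before), tokenize(after)
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    aligned = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_span, new_span = old[i1:i2], new[j1:j2]
        width = max(len(old_span), len(new_span))
        old_span += [pad] * (width - len(old_span))
        new_span += [pad] * (width - len(new_span))
        aligned.extend((tag, o, n) for o, n in zip(old_span, new_span))
    return aligned


if __name__ == "__main__":
    for triple in align_change("x = a + b", "x = a - b"):
        print(triple)
```

Each triple could then be mapped to vocabulary indices and fed to a sequence encoder that produces the vector for the whole change. That encoding step is the subject of the project itself and is not shown here.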