Research group

Machine Learning Methods in Software Engineering

Authorship Attribution of Source Code

Project supervisors: Timofey Bryksin, Vladimir Kovalenko
Status: Active

We developed a language-agnostic approach to authorship attribution of source code that improved on the state-of-the-art accuracy of some language-specific approaches on existing datasets. We also demonstrated that existing techniques for evaluating authorship attribution methods are misaligned with potential practical applications. Finally, we proposed a more realistic approach to building evaluation datasets.

The paper is under review at IEEE Transactions on Software Engineering (TSE).