
Automatic team-driven social code style standard selection

IP.com Disclosure Number: IPCOM000246509D
Publication Date: 2016-Jun-14
Document File: 3 page(s) / 96K

Publishing Venue

The IP.com Prior Art Database

Abstract

This solution aims to provide an algorithm for selecting, among an established group of standard code styles, the one that most closely resembles the styles appearing in input code samples. The strength of our approach lies mainly in adopting the Euclidean distance as the metric for comparing grammars: each programming style can be parameterized according to recurring patterns in its visual appearance. It can therefore be represented by a point in a P-dimensional space, where P is the number of parameters taken into account for the considered programming language.
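As an illustration of the distance-based selection described above, the following is a minimal sketch, not the disclosed implementation. It assumes the sample code has already been reduced to a single point in the P-dimensional space (this excerpt does not specify how), and the style names, parameter meanings and values are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two points with the same number of parameters."""
    if len(a) != len(b):
        raise ValueError("style vectors must have the same number of parameters")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_standard(sample_point, standards):
    """Return the name of the standard style nearest to sample_point."""
    return min(
        standards,
        key=lambda name: euclidean_distance(sample_point, standards[name]),
    )


# Hypothetical parameters, P = 3:
# (indent width, spaces before "(", spaces around "=")
standards = {
    "style_a": (4.0, 0.0, 1.0),
    "style_b": (2.0, 1.0, 1.0),
}

# Hypothetical point summarizing the input code samples.
team_point = (3.6, 0.1, 1.0)

print(closest_standard(team_point, standards))
```

With K standard styles, the selection costs K distance computations of P terms each.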

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.


Background

Whenever a team of developers embarks on a new project, one of the first decisions to be made is which code style to adopt, in order to make the source as readable and tidy as possible. This is an essential requirement for long-term projects that will be maintained for several years. If this constraint is overlooked, each developer feels free to "pollute" the project with his or her own code style, soon leading to heavily inconsistent formatting and unnecessary confusion while working on the source. This, in turn, leads to a much higher probability of introducing errors into the program.

    Furthermore, variation in code style has negative implications for merges and code diff comparisons.

    The solution described tackles the problem outlined above by automatically selecting the standard code style which is the "closest" to the ones preferred by the programmers involved. The algorithm we propose has been designed to minimize the overall dissatisfaction with this fundamental choice through an approach which leverages machine learning techniques.

Novelty

    Tools currently on the market focus solely on enforcing a code style (e.g. checkstyle) and overlook how that style is defined in the first place. The solution described makes this definition automatic, taking as input the source code of a single selected individual (e.g. an industry expert) or of multiple contributors (e.g. a set of experts, an organization, a development team).



Description

Algorithm

Input

    - N source files written in the project's programming language (let us call it PL). Preferably, N/M of these files should belong to each developer assigned to the project (and should be representative of that developer's code style for PL), where M is the number of developers involved.

    - A set of K formally defined standard code style specifications for PL.

Steps

    - Each source file undergoes a processing step similar to the first phase of a compiler front end: it is lexically analysed so as to record statistics on each "code style parameter" (CSP). These parameters represent the recurring patterns of usage/count of whitespace characters appearing in each grammar rule of PL. Let P denote the number of CSPs for the considered programming language.

- For each source file, the val...
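The lexical-analysis step above, which records statistics on each CSP, could be sketched roughly as follows. This is a simplified illustration under stated assumptions: regular expressions stand in for a real lexer, PL is taken to be a C-like language, and the two CSP names and patterns are hypothetical rather than taken from the disclosure.

```python
import re

# Hypothetical CSPs for a C-like language. Each pattern captures the run
# of spaces whose length is the measured whitespace count.
CSP_PATTERNS = {
    "spaces_after_if": re.compile(r"\bif( *)\("),
    "spaces_before_brace": re.compile(r"\)( *)\{"),
}


def measure_csps(source):
    """Return the mean whitespace count per CSP for one source file.

    A CSP whose pattern never occurs in the file is reported as None.
    """
    stats = {}
    for name, pattern in CSP_PATTERNS.items():
        counts = [len(match.group(1)) for match in pattern.finditer(source)]
        stats[name] = sum(counts) / len(counts) if counts else None
    return stats


# Two "if" statements written in two different styles.
sample = "if (x) {\n    y();\n}\nif(z){\n    w();\n}\n"
print(measure_csps(sample))
```

A regex scan like this would miscount occurrences inside string literals and comments, which is one reason the text calls for a compiler-style lexical analysis instead.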