Code formatting independent code repository

IP.com Disclosure Number: IPCOM000175947D
Original Publication Date: 2008-Oct-28
Included in the Prior Art Database: 2008-Oct-28
Document File: 2 page(s) / 22K

Publishing Venue

IBM

Abstract

This publication aims to free developers from the burden of project-specific coding styles. A developer can simply use his or her preferred coding style in every project. To that end, the publication describes a facility that stores code in a canonicalized format on a local disk or in a remote code repository, together with a toolset that transparently translates code between the internal canonicalized format and the format preferred by the user.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

Source code is a textual representation of a program that a compiler consumes and translates into machine language.

The expressions that may be used in the source code are defined by the programming language. Based on a formal description of that language, the compiler checks whether the source code is a valid program.

Besides the characters used to form identifiers, statements and expressions, all of the common programming languages also define a set of so-called whitespace characters (blanks, line feeds, etc.). A whitespace character may be used to separate distinct expressions defined by the programming language. Adding more than one whitespace character changes the representation of the program but not its semantics. Therefore, one and the same machine code program might have been generated by a compiler from a wide range of different representations of the same source code.
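The point about whitespace can be illustrated with a minimal sketch. It assumes a deliberately simplified tokenizer for a C-like language. The names TOKEN_RE and tokens, and both sample snippets, are hypothetical and do not come from the publication. Two differently formatted versions of the same function reduce to the same token sequence, which is all that a compiler front end would see.

```python
import re

# Hypothetical, deliberately simplified tokenizer for a C-like language.
# It ignores comments, preprocessor directives and multi-character operators.
TOKEN_RE = re.compile(r'''
    "(?:\\.|[^"\\\n])*"    # string literal (whitespace inside is significant)
  | [A-Za-z_]\w*           # identifier or keyword
  | \d+                    # integer literal
  | [^\s\w]                # any other single non-whitespace character
''', re.VERBOSE)


def tokens(source: str) -> list[str]:
    """Return the tokens of `source`, discarding whitespace between them."""
    return TOKEN_RE.findall(source)


# The same function, laid out in two different (illustrative) styles.
tab_indented = "int main(void)\n{\n\tint x = 1;\n\treturn x;\n}\n"
space_indented = "int\nmain (void)\n{\n  int x = 1;\n  return x;\n}\n"

# The texts differ, but the token sequences are identical.
print(tab_indented != space_indented)
print(tokens(tab_indented) == tokens(space_indented))
```

Whitespace inside string literals is kept as part of the token, because there it belongs to the program's data and not to its formatting.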

In other words, code formatting is a matter of the developer's taste and does not affect the resulting code.

Larger code projects typically provide a document that describes in detail how source code in a given language has to be formatted by the developer. This ensures that code coming from the different developers involved looks the same. The advantage is that all developers familiar with that coding style can easily read and review source code written by others.

Problems arise for developers who are involved in several projects that use different coding styles, which often happens in the open source community. The coding styles in use might differ greatly (e.g. Linux Kernel vs. GNU). Although the coding style does not affect the resulting code, a lot of effort is spent during reviews to make sure that new code conforms to the required coding style.

The publication enhances existing code management software so that it internally stores code in a format different from the one presented to the developer. The internal format will be called the "canonicalized format" and the other the "developer format".
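The translation between the two formats might be sketched as follows. This is only an illustration under stated assumptions, not the format defined by the publication. The canonicalized format is assumed here to be one token per line, the developer format is reduced to a choice of indentation string, and the input is assumed to be well-formed C-like code. The function names to_canonical and to_developer are hypothetical.

```python
import re

# Hypothetical simplified tokenizer for a C-like language (an assumption).
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\\n])*"|[A-Za-z_]\w*|\d+|[^\s\w]')


def to_canonical(developer_text: str) -> str:
    """Check-in direction: store one token per line, dropping all formatting."""
    return "\n".join(TOKEN_RE.findall(developer_text))


def to_developer(canonical_text: str, indent: str) -> str:
    """Check-out direction: lay out the tokens with the preferred indentation."""
    toks = canonical_text.split("\n") if canonical_text else []
    lines, current, depth = [], [], 0
    for tok in toks:
        if tok == "}":
            depth -= 1  # a closing brace sits one level further out
        current.append(tok)
        if tok in ("{", "}", ";"):
            # End the output line after a brace or a statement terminator.
            lines.append(indent * depth + " ".join(current))
            current = []
        if tok == "{":
            depth += 1  # following lines belong to the new block
    if current:
        lines.append(indent * depth + " ".join(current))
    return "\n".join(lines) + "\n"


canonical = to_canonical("int main(void) { return 0; }")

# Two developers check out the same stored code with different preferences.
with_tabs = to_developer(canonical, "\t")
with_spaces = to_developer(canonical, "  ")

# Both views differ as text but check in to the same canonical form.
print(with_tabs != with_spaces)
print(to_canonical(with_tabs) == to_canonical(with_spaces) == canonical)
```

In this sketch the repository only ever sees the output of to_canonical, so a check-in that changes nothing but formatting produces no difference in the stored code.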

1.1. The canonicalized format


This format will be used for the code stored within the code repository. This format might either be defined to store the code in a format which is...