
Structured Merge Method for Program Source Files
Disclosure Number: IPCOM000125924D
Original Publication Date: 2005-Jun-22
Included in the Prior Art Database: 2005-Jun-22
Document File: 3 page(s) / 129K

Although much has been researched and written regarding the merging of multiple generations of source code files, the state of the art in practice falls far short of the ideal. Merge-related errors are still a common occurrence. They reduce productivity both directly, because it takes time to find and fix merge errors, and indirectly, because software engineers delay their merges until they have no choice, which compounds the problem by increasing the number of line-to-line conflicts. The root cause of the problem is the use of text differencing engines to compare and merge structured data. This article discusses a method by which a structure-aware merge engine (e.g. the EMF compare support in Rational Software Architect 6.0) could handle Java source files.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 40% of the total text.

Structured Merge Method for Program Source Files

Source Code Management (SCM) systems like ClearCase and CVS allow developers to work on features in an isolated private stream (sandbox) and then synchronize (and merge if necessary) changes into another branch, usually the main branch for the current development stream. Parallel development of artifacts will require frequent merging by definition.

Some SCM systems support automated synchronization of one's sandbox with any other branch of the code library. For efficiency, these bulk operations try to merge files silently, that is, without user intervention. With a silent merge, non-conflicting changes from both contributing branches are simply applied.
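
The decision a silent merge makes for each region of a file can be sketched as a three-way rule. This is a minimal illustration and not the algorithm of any particular SCM system: the class name SilentMerge and the method mergeChunk are hypothetical, and the sketch assumes the base, left, and right versions have already been aligned into corresponding chunks, which is the hard part a real text merge tool has to solve first.

```java
import java.util.Optional;

// Hypothetical sketch: three-way decision for one already-aligned chunk.
final class SilentMerge {

    /**
     * Returns the merged text for one chunk, or an empty Optional when
     * both branches changed the chunk in different ways (a conflict
     * that cannot be resolved without user intervention).
     */
    static Optional<String> mergeChunk(String base, String left, String right) {
        if (left.equals(right)) {
            return Optional.of(left);   // unchanged, or same change on both sides
        }
        if (left.equals(base)) {
            return Optional.of(right);  // only the right branch changed it
        }
        if (right.equals(base)) {
            return Optional.of(left);   // only the left branch changed it
        }
        return Optional.empty();        // both changed it differently: conflict
    }
}
```

Under these assumptions, mergeChunk("int x = 1;", "int x = 1;", "int x = 2;") yields the right-hand change, while two different edits to the same chunk yield an empty result. The rule itself is simple; the errors described in this article tend to come from how text tools decide what counts as a chunk.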

Much can go wrong with silent merges. In fact, even with user intervention, merging is a frequent source of error in the development process. A classic failed automated merge result might look like:

void aMethod () {
    a line;
} else {
    another line;
}
}

The root of this problem is that the source code merge tools in common use rely on text-based difference engines, whereas source code is by nature a structured format.

Text merge tools commonly highlight the end of one method and the beginning of the next as a single change when a new method has been inserted. While this is technically correct in the strictest sense of text differences, it never matches how the developer doing the merge understands the change. If a method was inserted, then that method should be highlighted as an insertion. This sort of confusion often leads to difficulty in resolving proximate conflicts, which in turn often leads to corrupted files (i.e. source code that no longer compiles).
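
To illustrate the difference, the following sketch compares two versions of a class at the level of methods rather than lines. It is a simplified assumption, not the EMF-based method this article proposes: the class MethodDiff and its helpers are hypothetical, each version is assumed to have been parsed already into an ordered map from method name to method body, and renames and reordering are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: report changes per method instead of per line.
final class MethodDiff {

    /** Builds an ordered name-to-body map from alternating name/body arguments. */
    static Map<String, String> methods(String... nameBodyPairs) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameBodyPairs.length; i += 2) {
            result.put(nameBodyPairs[i], nameBodyPairs[i + 1]);
        }
        return result;
    }

    /** Lists inserted, modified, and deleted methods between two versions. */
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        List<String> changes = new ArrayList<>();
        for (Map.Entry<String, String> entry : after.entrySet()) {
            String oldBody = before.get(entry.getKey());
            if (oldBody == null) {
                changes.add("inserted " + entry.getKey());
            } else if (!oldBody.equals(entry.getValue())) {
                changes.add("modified " + entry.getKey());
            }
        }
        for (String name : before.keySet()) {
            if (!after.containsKey(name)) {
                changes.add("deleted " + name);
            }
        }
        return changes;
    }
}
```

When a method named second is added between first and third, this comparison reports a single insertion of second. A line-based tool may instead pair the closing brace of first with the opening lines of the new method, which is the confusing result described above.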

A structured merge format is required, one that can be used by all manufacturers and is language neutral. One embodiment of such a common language / format is the Eclipse Modeling Framework (EMF). EMF provides the ability to create arbitrarily complex meta-models that can be used to create instance documents to represent source code in any language.

EMF currently has meta-models for the highest level structural representation of Java and C++. These meta-models are designed around the requirements of the UML, and therefore are concerned mainly with interfaces, classes, operations and attributes. For the purposes of merging source code using an EMF instance document, the defined meta-model must be capable of representing a complete source file without losing content. Literally every byte in the file must be represented in a structured way and the transformation to and from the instance document must be perfectly reliable.
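
The requirement that no content be lost can be illustrated with a deliberately tiny model. This sketch does not use EMF and is far coarser than a real meta-model: the class LosslessModel, its Element type, and the two element kinds are hypothetical. The only point it makes is that if whitespace is kept as its own element rather than discarded, writing the elements back out reproduces the input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a model that keeps every character of the source.
final class LosslessModel {

    /** One piece of the source file, tagged with a coarse kind. */
    static final class Element {
        final String kind;   // "WHITESPACE" or "TEXT"
        final String text;   // exact characters taken from the source

        Element(String kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Splits the source into alternating whitespace and non-whitespace runs. */
    static List<Element> parse(String source) {
        List<Element> elements = new ArrayList<>();
        int start = 0;
        while (start < source.length()) {
            boolean whitespace = Character.isWhitespace(source.charAt(start));
            int end = start;
            while (end < source.length()
                    && Character.isWhitespace(source.charAt(end)) == whitespace) {
                end++;
            }
            elements.add(new Element(whitespace ? "WHITESPACE" : "TEXT",
                    source.substring(start, end)));
            start = end;
        }
        return elements;
    }

    /** Writes the elements back out in order, reproducing the parsed text. */
    static String emit(List<Element> elements) {
        StringBuilder out = new StringBuilder();
        for (Element element : elements) {
            out.append(element.text);
        }
        return out.toString();
    }
}
```

For any input string, emit(parse(source)) returns the same characters as source. A meta-model suited to merging would replace the two coarse kinds with real language constructs (classes, methods, statements, comments) while keeping the same property.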


It is not necessarily a requirement that the source code be recreated byte-for-byte from the instance document. Rather, all of the important orderings must be preserved. In other words, this transformation:

source file in -> instance document -> source file out

... must produce this logical relationship:

source file in == source file out

... such that a compil...