Dynamic repair of malformed XML
Original Publication Date: 2010-Jan-29
Included in the Prior Art Database: 2010-Jan-29
We propose a system that automatically repairs malformed XML documents. An intelligent heuristic infers a schema from the well-formed data in the document, and that schema is then used to generate one or more candidate fixes for the malformed section(s) of the XML file. The repaired file can then be processed further, so that some value is still gained from it. Where several different fixes each produce a valid document, all of the possible valid documents could be returned. The inferred schema is arrived at by analysing the tree structure of the tags in the well-formed parts of the document. Such an algorithm could log the frequency and type of sub-tags (and attributes) under each tag, and use these counts to judge the probability that a given tag exists in the bad block.
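The frequency-logging step could be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: it assumes the well-formed parts of the document have already been isolated as separately parseable fragments, and the names infer_schema, probability and likely_children are hypothetical. Locating the bad block and generating the actual fixes are not shown.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def infer_schema(fragments):
    """Count, for every tag, how often each sub-tag and attribute occurs under it.

    `fragments` is assumed to be an iterable of well-formed XML strings.
    """
    parents = Counter()                # tag -> number of elements seen with that tag
    children = defaultdict(Counter)    # tag -> sub-tag -> elements containing that sub-tag
    attributes = defaultdict(Counter)  # tag -> attribute -> elements carrying that attribute
    for fragment in fragments:
        root = ET.fromstring(fragment)
        for element in root.iter():
            parents[element.tag] += 1
            # Use a set so a repeated sub-tag is counted once per parent element.
            for child_tag in {child.tag for child in element}:
                children[element.tag][child_tag] += 1
            for name in element.attrib:
                attributes[element.tag][name] += 1
    return parents, children, attributes


def probability(parents, table, tag, name):
    """Estimate how likely `name` (a sub-tag or attribute) is to appear under `tag`."""
    if parents[tag] == 0:
        return 0.0
    return table[tag][name] / parents[tag]


def likely_children(parents, children, tag):
    """Rank the sub-tags observed under `tag`, most probable first."""
    ranked = [(child, probability(parents, children, tag, child))
              for child in children[tag]]
    return sorted(ranked, key=lambda pair: -pair[1])


# Hypothetical well-formed fragments taken from around a bad block.
good = [
    '<book id="1"><title>A</title><author>X</author></book>',
    '<book id="2"><title>B</title></book>',
]
parents, children, attributes = infer_schema(good)
print(likely_children(parents, children, "book"))
```

With these two fragments, every book element contains a title and half of them contain an author, so a repair step could prefer inserting or closing a title tag in a damaged book element before considering an author tag. The same tables give an estimate for attributes, for example probability(parents, attributes, "book", "id").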