Regular Expressions for Searching Multidimensional Data

IP.com Disclosure Number: IPCOM000020804D
Original Publication Date: 2003-Dec-15
Included in the Prior Art Database: 2003-Dec-15
Document File: 1 page(s) / 7K

Publishing Venue

IBM

Abstract

Disclosed is an extension of traditional regular expressions for dealing with multidimensional arrays of character data. This extension allows a regular expression author to describe patterns in the data in their original form, and not in the resultant projection of the data into linear space.

This is the abbreviated version, containing approximately 51% of the total text.


Regular expressions (regex) are a powerful tool for searching and replacing textual data. A number of regex implementations provide a straightforward syntax for scripting pattern recognition. The advantage of these high-level pattern recognition languages is that a user can define a complex search (and replacement) task without having to write custom logic. For instance, the Perl command s/company/IBM/g substitutes the string 'IBM' wherever the string 'company' is found within a string.
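
As a minimal sketch of the same substitution in another regex implementation, the snippet below uses Python's standard re module. The sample sentence and the variable names are illustrative assumptions, not part of the disclosure.

```python
import re

# Illustrative input; the disclosure does not give a sample sentence.
text = "The company ships the company's products."

# re.sub replaces every non-overlapping match of the pattern.
result = re.sub(r"company", "IBM", text)
print(result)
```

No custom scanning loop is needed: the pattern and the replacement together describe the whole task.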

Regular expressions that exist today are equipped to search only linear arrays of character data, such as {'h','e','l','l','o',' ','w','o','r','l','d'}. With such data, a regex user can easily define a command that tells whether or not the data contains the string 'world', or begins with an 'h'.
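
The two linear queries mentioned above might look like the following in Python. This is only a sketch; the variable names are assumptions chosen for illustration.

```python
import re

# The linear array of character data from the text.
data = ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
text = "".join(data)

# Does the data contain the string 'world' anywhere?
contains_world = re.search(r"world", text) is not None

# Does the data begin with an 'h'? (^ anchors at the start)
begins_with_h = re.search(r"^h", text) is not None

print(contains_world, begins_with_h)
```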

Multidimensional data is data that can be defined as a set of linear arrays. For example, we can add a dimension to the previous example indicating whether or not each character is editable:
{{'h','e','l','l','o',' ','w','o','r','l','d'},
{'0','0','0','0','0','0','1','1','1','1','1'}}
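
One way to hold this two-dimensional data in a program is as two parallel arrays of equal length, where position i of the second array describes position i of the first. The sketch below assumes that '1' means editable and '0' means not editable, as the example suggests; the helper name editable_text is hypothetical.

```python
# Two parallel dimensions: the characters and their editable flags.
chars = ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
flags = ['0', '0', '0', '0', '0', '0', '1', '1', '1', '1', '1']


def editable_text(chars, flags):
    """Return the characters whose flag is '1' (hypothetical helper)."""
    if len(chars) != len(flags):
        raise ValueError("both dimensions must have the same length")
    return "".join(c for c, f in zip(chars, flags) if f == '1')


print(editable_text(chars, flags))
```

Keeping the dimensions separate preserves the original shape of the data, which is the form in which the disclosure wants patterns to be written.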

Indeed, one can join the two arrays into a single array that can be searched linearly, such as
{'h','e','l','l','o',' ','w','o','r','l','d','0','0','0','0','0','0','1','1','1','1','1'}
or
{'h','0','e','0','l','0','l','0','o','0',' ','0','w','1','o','1','r','1','l','1','d','1'}
but in doing so we have complicated the nature of the data, such that a traditional regular expression representing our search criteria would be difficult to grasp. There is currently no way to define a high-level expression that determines whether or not the data contains the editable string 'wor...
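
To illustrate why the projected form is hard to work with, the sketch below searches the interleaved array with a traditional regex. It is not the extension the disclosure proposes; it only shows the workaround that extension is meant to avoid. The function names are hypothetical, and the search words 'world' and 'hello' are simply taken from the earlier linear example.

```python
import re


def interleave(chars, flags):
    """Project two dimensions into one string: char, flag, char, flag, ..."""
    return "".join(c + f for c, f in zip(chars, flags))


def contains_editable(projected, word):
    """True if `word` occurs with every character flagged '1'."""
    # Each target character must be followed by the flag '1'.
    body = "".join(re.escape(c) + "1" for c in word)
    # Skip whole (char, flag) pairs so a match can only start on a
    # character position, never on a flag position.
    pattern = "(?:..)*?" + body
    return re.match(pattern, projected, re.DOTALL) is not None


chars = ['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
flags = ['0', '0', '0', '0', '0', '0', '1', '1', '1', '1', '1']
projected = interleave(chars, flags)

print(contains_editable(projected, "world"))  # True
print(contains_editable(projected, "hello"))  # False
```

The generated pattern for 'world' is (?:..)*?w1o1r1l1d1, which already obscures the intent of the search. It would have to be rebuilt if the projection changed, for example to the concatenated form.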