
Flat file indexing based on a pattern

IP.com Disclosure Number: IPCOM000237109D
Publication Date: 2014-Jun-03
Document File: 3 page(s) / 42K

Publishing Venue

The IP.com Prior Art Database

Abstract

Proposed is a novel approach that could be used in text editors to enable viewing and editing of large text files.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 50% of the total text.


Disclosed is a file indexing method that would make it possible to view and edit large text files in a text editor. Currently it is very hard (virtually impossible) to work with large text files (file size greater than 300MB) in a simple text editor (e.g. MS Notepad) because of insufficient memory and other implementation-related limitations. While it is possible to view such files with programs like vi, vim or more, the advanced functions of these programs (e.g. syntax highlighting, macros, quick find, etc.) will not be available. This problem is often encountered when analyzing huge log files, and it makes the information produced by a program very difficult to examine.

The proposed solution to the problem is to logically divide a large flat file based on a pattern. The pattern (e.g. a regular expression) would be provided by the user and would depend on the data that is saved in the file to be opened.

Depending on the implementation, the pattern could be represented as a regular expression. As an example let us consider the following log file:

2013-05-31-15.32.38.484256+120 I1125992A560 LEVEL: Error
PID : 22675698 TID : 22499 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DBNAME
APPHDL : 0-21868 APPID: 10.183.130.37.45760.130531133209
AUTHID : DB2INST1 HOSTNAME: xyz1234.xyz.aa-bbbbb.com
EDUID : 22499 EDUNAME: db2agent (DBNAME) 0
FUNCTION: DB2 UDB, relation data serv, sqlrr_fetch_error, probe:300
RETCODE : ZRC=0x8012006D=-2146303891=SQLR_CA_BUILT
"SQLCA has already been built"

2013-05-31-15.33.08.650787+120 I1126553A1294 LEVEL: Error
PID : 22675698 TID : 34786 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DBNAME
APPHDL : 0-21873 APPID: 10.183.130.37.45765.130531133309
AUTHID : DB2INST1 HOSTNAME: xyz1234.xyz.aa-bbbbb.com
EDUID : 34786 EDUNAME: db2agent (DBNAME) 0
FUNCTION: DB2 UDB, trace services, sqlt_logerr_data (secondary logging function), probe:50
MESSAGE : Superceding prev. error, probe 165, SQLCA:
DATA #1 : Hexdump, 136 bytes
0x0A0000003EBF5D70 : 5351 4C43 4120 2020 0000 0088 0000 0064 SQLCA   .......d
0x0A0000003EBF5D80 : 0000 2020 2020 2020 2020 2020 2020 2020 ..
0x0A0000003EBF5D90 : 2020 2020 2020 2020 2020 2020 2020 2020
0x0A0000003EBF5DA0 : 2020 2020 2020 2020 2020 2020 2020 2020
0x0A0000003EBF5DB0 : 2020 2020 2020 2020 2020 2020 2020 2020
0x0A0000003EBF5DC0 : 2020 2020 2020 2020 5351 4C52 4930 3146 SQLRI01F
0x0A0000003EBF5DD0 : 8004 0001 0000 0001 0000 0000 0000 0000 ................
0x0A0000003EBF5DE0 : 0000 0000 0000 0000 2020 2020 2020 2020 ........
0x0A0000003EBF5DF0 : 2020 2030 3230 3030 02000

In this sample log file, each message starts with a timestamp and its end is marked by a double end-of-line. The structure of the file can be depicted in the following diagram, where 'S' represents the start of a message, 'E' the end of a message and '*' the contents of a message.

------------------------------------------------------------------
*********S***ES****ES**...
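The logical division described above can be illustrated with a short Python sketch. It is only one possible reading of the idea, not the disclosed implementation: it assumes the index is a list of (byte offset, length) pairs, one per message, built in a single pass over the file so that an editor could later load just the message it needs. The names START, build_index and read_segment are hypothetical, and the regular expression is an assumed pattern for the timestamp format seen in the sample log.

```python
import io
import re
from typing import BinaryIO, List, Tuple

# Assumed user-supplied pattern: a line beginning with a timestamp such as
# 2013-05-31-15.32.38.484256+120 (the 'S' positions in the diagram).
START = re.compile(rb"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d+")


def build_index(stream: BinaryIO, start_pattern: re.Pattern = START) -> List[Tuple[int, int]]:
    """Scan the stream line by line and return one (offset, length) pair
    per message. Only one line is held in memory at a time."""
    index: List[Tuple[int, int]] = []
    seg_start = None  # byte offset where the current message began
    offset = 0        # byte offset of the line being examined
    for line in stream:
        if start_pattern.match(line):
            if seg_start is not None:
                # A new start closes the previous message.
                index.append((seg_start, offset - seg_start))
            seg_start = offset
        offset += len(line)
    if seg_start is not None:
        # Close the last message at end of file.
        index.append((seg_start, offset - seg_start))
    return index


def read_segment(stream: BinaryIO, entry: Tuple[int, int]) -> bytes:
    """Load a single message using its index entry."""
    offset, length = entry
    stream.seek(offset)
    return stream.read(length)


# Small in-memory stand-in for a large log file.
sample = (
    b"leading text before the first message\n"
    b"2013-05-31-15.32.38.484256+120 I1125992A560 LEVEL: Error\n"
    b"PID : 22675698 TID : 22499 PROC : db2sysc 0\n"
    b"\n"
    b"2013-05-31-15.33.08.650787+120 I1126553A1294 LEVEL: Error\n"
    b"PID : 22675698 TID : 34786 PROC : db2sysc 0\n"
)
log = io.BytesIO(sample)
index = build_index(log)
second_message = read_segment(log, index[1])
```

In this sketch, text that precedes the first match (the leading '*' run in the diagram) is not indexed, and each segment keeps its trailing blank line. A real editor would likely need to decide how to present such unindexed regions and how to update offsets after an edit.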