Parser-State Based Method for Non-reserved Keyword Resolution
Original Publication Date: 2004-Jan-21
Included in the Prior Art Database: 2004-Jan-21
Most conventional programming languages define sets of reserved words that cannot be used as identifiers, such as names of variables or other programming language entities. Some languages, such as SQL or Ada, introduce a lexical construct called a nonreserved keyword. Nonreserved words are also keywords and have special meaning in the syntax where they appear as terminals, but they can also be used as identifiers. A typical development environment includes a builder used to validate and compile programs written in the programming language and a specialized editor with functionality such as syntax highlighting and content assist. One of the most popular and widely used tools for compiler and parser construction is YACC (Yet Another Compiler-Compiler) and its various extensions. As a result of a YACC grammar compilation, an ASCII file is generated that contains a table of parser states with corresponding grammar rules. This invention uses this file to automatically generate a lookup table containing states and the nonreserved words valid for those states.
Most conventional programming languages (with exceptions like PL/I) define sets of reserved words that cannot be used as identifiers, such as names of variables or other programming language entities. For example, in BASIC and COBOL, the word IF is reserved because it has a special meaning.
Some languages, such as SQL or Ada, introduce a lexical construct called a nonreserved keyword. Nonreserved words are also keywords and have special meaning in the syntax where they appear as terminals, but they can also be used as identifiers. Nonreserved words make it possible to add new syntax without compromising the readability of the language and without introducing incompatibilities. The grammar rules that involve nonreserved words must not make the grammar ambiguous and must require only a reasonable amount of lookahead. A typical development environment includes a builder used to validate and compile programs written in the programming language and a specialized editor with functionality such as syntax highlighting and content assist.
Content assist (also known as code assist) is a usability feature for text editors. The editor offers the user a list of text choices that complete a phrase or statement, and the user selects one for automatic insertion into the text. Content assist also supports contextual completion proposals, which give the user information related to the current position in the document. Content assist proposals must be correct and complete. To accomplish this, the editing tool needs to be aware of the current lexer/parser state and the set of valid lexical tokens that can take the parser to the next correct state.
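The per-state token sets mentioned above can, as the abstract describes, be derived from the state listing that a YACC-style generator writes out. The following is only a minimal sketch of that idea in Python. The listing layout it parses (a "State N" header followed by "TOKEN  shift, and go to state M" lines) is an assumption loosely modelled on Bison's report file, and the token names and the NONRESERVED set are hypothetical, not taken from any real grammar.

```python
import re
from collections import defaultdict

# Hypothetical non-reserved words for a toy grammar (an assumption).
NONRESERVED = {"NAME", "TYPE", "VALUE"}

# Assumed, simplified listing layout: a "State N" header, then indented
# lines of the form "TOKEN  shift, and go to state M".
STATE_RE = re.compile(r"^[Ss]tate (\d+)\s*$")
SHIFT_RE = re.compile(r"^\s+(\w+)\s+shift, and go to state \d+\s*$")


def build_lookup(listing, nonreserved):
    """Map each parser state to the non-reserved words it can shift."""
    table = defaultdict(set)
    current = None
    for line in listing.splitlines():
        header = STATE_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        shift = SHIFT_RE.match(line)
        if shift and current is not None and shift.group(1) in nonreserved:
            table[current].add(shift.group(1))
    return dict(table)


# Made-up listing fragment used only to exercise the function.
SAMPLE = """\
State 0

    SET   shift, and go to state 1

State 1

    NAME        shift, and go to state 2
    IDENTIFIER  shift, and go to state 3
"""

lookup = build_lookup(SAMPLE, NONRESERVED)
print(lookup)
```

In this sketch, states that shift no non-reserved word simply do not appear in the table, so a lookup for such a state should fall back to an empty set. A real generator's listing contains further sections (rules, reductions, conflicts) that this sketch ignores.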
The fact that grammar rules involving nonreserved words rely on the parser's lookahead makes interactive content support difficult. Because nonreserved words can legally be used as identifiers, it is hard for content assist to tell whether a word is acting as a keyword or as an identifier. If the word appears in a given context as a keyword, content assist must show it to the user; but if the word appears as an identifier, content assist must not show it unless the word is actually defined as an identifier visible in the current scope.
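The rule in the preceding paragraph can be sketched as a small filter. This is an illustration under stated assumptions, not the disclosed implementation: the function name propose, the identifier_states parameter (states in which the grammar accepts an identifier), and the case-insensitive prefix match are all hypothetical choices made for the example.

```python
def propose(state, prefix, keywords_by_state, identifier_states,
            visible_identifiers):
    """Return sorted completion proposals for a parser state and typed prefix.

    A word is offered as a keyword only if it is valid in `state`; it is
    offered as an identifier only if the state accepts identifiers and the
    name is actually visible in the current scope.
    """
    wanted = prefix.upper()  # assumption: keywords match case-insensitively
    result = set()

    # Keyword role: take only the words listed for this state.
    for word in keywords_by_state.get(state, set()):
        if word.startswith(wanted):
            result.add(word)

    # Identifier role: take only names defined in the visible scope.
    if state in identifier_states:
        for name in visible_identifiers:
            if name.upper().startswith(wanted):
                result.add(name)

    return sorted(result)


# Hypothetical data: state 1 accepts two non-reserved keywords and an
# identifier; state 2 accepts an identifier only.
keywords_by_state = {1: {"NAME", "NAMESPACE"}}
identifier_states = {1, 2}
visible = ["name", "total"]

print(propose(1, "na", keywords_by_state, identifier_states, visible))
print(propose(2, "na", keywords_by_state, identifier_states, visible))
```

With this split, a nonreserved word such as NAME is never proposed in state 2 merely because it is a keyword elsewhere; it appears there only if the user has defined an identifier with that spelling.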
The Java programming language has a relatively small set of keywords. Integrated development environments for Java, e.g., VisualAge for Java and the WebSphere Studio Application Developer (WSAD) Java editor, provide no content assist support for keywords. The WSAD SQL editor does not provide context-aware assistance: its content assist helps a user start a statement, but does not limit the completion proposals to the keywords and identifiers that are valid for completing that statement.
The Extended SQL (ESQL) language for message transformation and database access, used in WebSphere MQ Integrator, is based on the SQL:1999 standard and therefore contains a relatively large set of keywords, a considerable portion of which is non...