Browse Prior Art Database

Improved Compiler Parsing Method for LL Languages

IP.com Disclosure Number: IPCOM000115246D
Original Publication Date: 1995-Apr-01
Included in the Prior Art Database: 2005-Mar-30
Document File: 3 page(s) / 100K

Publishing Venue

IBM

Related People

Feriozi, DT: AUTHOR

Abstract

Described is an improved compiler parsing method for LL languages, as used by computer compiler developers. The method describes the use of LL(1) parsing, but handles any LL(2) constructs in the language as exceptions. It combines the efficiency of the LL(1) parsing method with the expanded power of strong LL(2) grammars.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Improved Compiler Parsing Method for LL Languages

      Described is an improved compiler parsing method for LL
languages, as used by computer compiler developers.  The method
describes the use of LL(1) parsing, but handles any LL(2) constructs
in the language as exceptions.  It combines the efficiency of the
LL(1) parsing method with the expanded power of strong LL(2)
grammars.

      In prior art, only two viable choices were available for
parsing LL(2) languages, as used in the C programming language.  One
choice was hand coding the parser, however this was both tedious and
difficult to modify.  The other choice was to use a bottom-up parser
generator, such as the UNIX tool YACC.  Top-down parsing methods were
generally considered to be preferable to bottom-up methods in all
respects except one.  That is that the bottom-up methods could handle
a larger class of grammars than the equivalent top-down method using
the same amount of lookahead.  In practice, if a language could not
be expressed as an LL(1) grammar, automated top-down parsing was not
an option.

      The concept described herein provides a practical and efficient
top-down method of parsing LL(2) languages for compiler development.
The LL(1) method of parsing is a special case of LL(k) parsing that
restricts the amount of look-ahead to a single token.  This
restriction enables efficient parsing of an LL(1) language compared
to the algorithms that can be used to parse LL(k) languages for
values of k greater than one.  Because of the wide gap in efficiency
between LL(1) and LL(k), only LL(1) parsers are used in practice.
  The LL(1) method of parsing consists of the following elements:
    o  A parse table of non-terminals x terminals
    o  A table of productions
    o  A simple parsing engine
  The LL(1) parsing algorithm is summarized as follows:
  1.  Pop a grammar symbol from the stack
  2.  If the symbol is a terminal, match it an get the next input
token.
  3.  If the symbol is a non-terminal, lookup the next production
       number in the parse table and push that production onto the
stack.
  4.  Go to (1)

      In the LL(1)+ parsing method, specially marked conflict entries
are used in the parse table to indirect into a conflict resolution
table.  This provides the proper production to be used, given a
knowledge of the second lookahead token as well as the current state
of the parse procedure.  A negative parse table entry signifies a
conflict.  The absolute value of this entry serves as the index into
the conflict resolution table.  The additions to the standard LL(1)
elements a...