Efficient Method for Searching Multi-Layered Dictionaries by Using Word Frequency

IP.com Disclosure Number: IPCOM000102368D
Original Publication Date: 1990-Nov-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 3 page(s) / 97K

Publishing Venue

IBM

Related People

Nomiyama, H: AUTHOR

Abstract

Disclosed is a mechanism for searching entries in multi-layered dictionaries efficiently by using word frequency.

This is the abbreviated version, containing approximately 53% of the total text.

       Disclosed is a mechanism for searching entries in
multi-layered dictionaries efficiently by using word frequency.

      Most natural language processing systems are constructed in
several layers, such as a morphological analysis phase, a syntactic
analysis phase, and a semantic analysis phase [1,2,3].

      Since these phases need different kinds of information,
different dictionaries are provided for each phase (Fig. 1).
Dictionaries are consulted in order of the layer numbers.

      Our mechanism minimizes the cost of searching such
multi-layered dictionaries.

      Configuration of Dictionaries

      The configuration of dictionaries that our system assumes is
shown in Fig. 2.

      We assume that N is the total number of layers. The first-layer
dictionary has keys and pointers to the words in the second-layer
dictionary.

      The physical construction of the first-layer dictionary depends
on the algorithm used to search it, but the overall mechanism is
independent of the algorithm.

      The Ith-layer dictionaries, where I is greater than 1, have no
keys; they hold lexical contents and pointers to the next-layer
dictionary (the lowest-layer dictionary has no pointers).

      The contents of these dictionaries are ordered according to
word frequencies in each phase.
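
      The configuration described above can be sketched in Python as
follows. This is only an illustration under stated assumptions: the
names Record, layer1, layer2, layer3, and lookup are hypothetical, a
pointer is modeled as a list index, the first layer is modeled as a
hash table (the disclosure leaves its search algorithm open), and
records are assumed to be stored with the most frequent first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    """One entry in a layer I > 1: contents plus a pointer, no key."""
    contents: str               # lexical contents used by this phase
    next_index: Optional[int]   # index into the next layer; None in the lowest layer


# Hypothetical three-layer example (N = 3).
# Layer 1: keys and pointers into layer 2.
layer1: Dict[str, int] = {"run": 0, "walk": 1}

# Layers 2..N: no keys, only contents and a pointer to the next layer.
layer2: List[Record] = [Record("verb:run", 0), Record("verb:walk", 0)]
layer3: List[Record] = [Record("motion", None)]  # lowest layer: no pointer


def lookup(key: str, first_layer: Dict[str, int],
           lower_layers: List[List[Record]]) -> List[str]:
    """Follow pointers downward from layer 1 and collect the contents
    found in layers 2..N, in layer order."""
    index = first_layer.get(key)
    found: List[str] = []
    for layer in lower_layers:
        if index is None:       # unknown key, or no further pointer
            break
        record = layer[index]
        found.append(record.contents)
        index = record.next_index
    return found


result = lookup("run", layer1, [layer2, layer3])
```

      In this sketch both words point at the same third-layer record,
so only the first layer is searched by key and every later layer is
reached by following a pointer.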

      Dictionary Compiler

      Dictionaries consulted by application programs are created by a
dictionary compiler from information kept in the lexical database.

      First, the Nth-layer dictionary is created. Keys, contents, and
frequencies are extracted from the lexical database, and are ordered
according to their frequencies.

      If several records have the same lexical contents, only one of
them is kept, and its frequency is set to the sum of the frequencies
of those records.
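
      The merging and ordering step for the Nth-layer dictionary might
look like the sketch below. It is not the disclosed compiler: the
function name build_last_layer and the sample rows are hypothetical,
the lexical database is modeled as a list of (key, contents,
frequency) triples, and descending frequency order is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_last_layer(
    records: Iterable[Tuple[str, str, int]]
) -> List[Tuple[str, int]]:
    """Merge records that share lexical contents, summing their
    frequencies, then order the merged records by frequency."""
    totals: Dict[str, int] = defaultdict(int)
    for _key, contents, frequency in records:
        totals[contents] += frequency   # one record kept per distinct contents
    # Assumption: most frequent first; ties broken by contents so the
    # result is deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rows extracted from the lexical database.
rows = [
    ("run", "motion", 5),
    ("walk", "motion", 3),
    ("think", "cognition", 6),
]
ordered = build_last_layer(rows)
```

      Here the two "motion" records collapse into one with frequency 8,
which then sorts ahead of "cognition" with frequency 6.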

      The physical loc...