Abstraction View of Japanese Text

IP.com Disclosure Number: IPCOM000122885D
Original Publication Date: 1998-Jan-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 2 page(s) / 73K

Publishing Venue

IBM

Related People

Baba, M: AUTHOR [+2]

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

      Disclosed is a system that provides an Abstraction View of
Japanese text, in which unimportant phrases are replaced by
abbreviation marks, together with a function that expands an
abbreviation mark back into the original text.  This method has the
following merits:
  o  it enables skimming the content of a text quickly, because
      unimportant phrases or sentences are replaced with
      abbreviation marks, such as '...'.
  o  it enables browsing an abstracted text that fits a specified
      capacity (size), so the method is a kind of text summarization
      method.
  o  it enables browsing the outline of a text, because punctuation
      is preserved.
  o  it enables expanding abbreviation marks back into the original
      text, if necessary.

Flow of the invention:  The disclosed method consists of two
phases, a text-folding phase and a text-unfolding phase.
  folding text phase
  step-1:  get a list of words from the text using morphological
            analysis, in order to choose keywords
  step-2:  count the frequency of each word that appears in the
            text, excluding function words (fuzoku-go) (a keyword
            consists of content words (jiritsu-go) and does not
            include function words)
  step-3:  replace words that match the following rules with the
            abbreviation mark (in this article, '...' denotes the
            abbreviation mark)
        rule-1:  replace a HIRAGANA word (and function word) whose
                  length is more than two characters with the
                  abbreviation mark '...'.
        rule-2:  replace a KANJI word whose length is one character
                  with the mark '...' (except a prefix or suffix
                  word).
        rule-3: ...
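
The folding steps that are visible above (step-2, rule-1 and rule-2) and
the expansion of abbreviation marks can be illustrated with a minimal
Python sketch.  It is not the disclosed implementation.  It assumes the
morphological analysis of step-1 has already been done by some external
tool, so the input is a list of hypothetical Token objects whose pos tags
("content", "function", "prefix", "suffix", "punct") are invented here for
illustration.  It also assumes rule-1 applies to any word written entirely
in hiragana, function words included.  The truncated rule-3 and later
rules are not covered, so the word frequencies of step-2 are computed but
not used for folding.

```python
from collections import Counter
from dataclasses import dataclass

MARK = "..."  # the abbreviation mark used in the disclosure


@dataclass
class Token:
    surface: str  # the word as it appears in the text
    pos: str      # hypothetical tag: content, function, prefix, suffix, punct


def is_hiragana(word: str) -> bool:
    """True if every character is in the Unicode Hiragana block."""
    return bool(word) and all("\u3041" <= ch <= "\u309f" for ch in word)


def is_kanji(word: str) -> bool:
    """True if every character is a CJK unified ideograph (basic block)."""
    return bool(word) and all("\u4e00" <= ch <= "\u9fff" for ch in word)


def count_content_words(tokens):
    """step-2: frequency of content words; function words are excluded."""
    return Counter(t.surface for t in tokens if t.pos == "content")


def should_fold(token: Token) -> bool:
    """Apply rule-1 and rule-2 as read in this sketch."""
    if token.pos == "punct":
        return False  # punctuation is preserved so the outline stays visible
    # rule-1 (assumed reading): hiragana word longer than two characters
    if is_hiragana(token.surface) and len(token.surface) > 2:
        return True
    # rule-2: one-character kanji word, unless it is a prefix or suffix
    if (is_kanji(token.surface) and len(token.surface) == 1
            and token.pos not in ("prefix", "suffix")):
        return True
    return False


def fold(tokens):
    """Return (shown, original) pairs so every mark can be expanded later."""
    return [(MARK if should_fold(t) else t.surface, t.surface) for t in tokens]


def render(folded) -> str:
    """The abstraction view: folded words appear as the abbreviation mark."""
    return "".join(shown for shown, _ in folded)


def unfold(folded) -> str:
    """The unfolding phase: restore the original text from the pairs."""
    return "".join(original for _, original in folded)


tokens = [
    Token("東京", "content"), Token("の", "function"),
    Token("天気", "content"), Token("は", "function"),
    Token("すばらしい", "content"), Token("。", "punct"),
]
folded = fold(tokens)
print(render(folded))   # 東京の天気は...。
print(unfold(folded))   # 東京の天気はすばらしい。
```

Keeping the original word next to each displayed mark is one simple way
to make the expansion lossless; a real viewer might instead keep character
offsets into the source text.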