Japanese Text Summarizing Method without using Dictionary

IP.com Disclosure Number: IPCOM000122819D
Original Publication Date: 1998-Jan-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 4 page(s) / 118K

Publishing Venue

IBM

Related People

Mabuchi, T: AUTHOR [+2]

Abstract

Disclosed is a method for summarizing Japanese text without a dictionary. In general, extracting words from Japanese text requires dictionary-based morphological analysis, because there is no separator between words. However, word boundaries frequently coincide with the locations where one character type changes to another. By exploiting this characteristic of Japanese text, most nouns, which are essential as keywords in a text summarizing system, can be obtained without a word dictionary. Here, a text summarizing system is one that selects sentences from a text according to specified conditions, such as a summary ratio or a number of sentences.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 59% of the total text.

Flow of the invention: The disclosed method consists of two main
parts.  The first part (step-1) performs morphological analysis,
and the second part (step-2) selects sentences.
  Step 1: morphological analysis using character type
           without a dictionary

This process consists of the following three steps:
  step-1.1
  make a rough segmentation between words at each location
   where the character type changes.
  Every character in Japanese text belongs to one of the types
   shown in Table 1.
  In this article, a character type is written between
   '<' and '>'.

   Most character types have both a DBCS (full-size) and an SBCS
   (half-size) form.
  step-1.2
  adjust the boundary of the word by connectin...