Analysis and Coding of an Unformatted Semantic Set

IP.com Disclosure Number: IPCOM000086480D
Original Publication Date: 1976-Sep-01
Included in the Prior Art Database: 2005-Mar-03
Document File: 2 page(s) / 50K

Publishing Venue

IBM

Related People

Rocchi, P: AUTHOR

Abstract

This is a coding system for processing a yearly volume of 1.5 to 2 million labor accident reports written in unformatted, everyday Italian. The solution is based on the analysis of a sample of 30,000 accident reports.

This is the abbreviated version, containing approximately 53% of the total text.

The system provides computerized coding of any set of words by searching an indexed sequential thesaurus file. The input items describe machines, equipment, animals, buildings and any other object related to a labor accident. The thesaurus is loaded from cards; the Italian-language thesaurus presently holds 18,000 entries.

Each record contains three main fields: the record key or NP code field, the uncoded item or WW field, and the NS code field.

NP: Contains two subfields of seven numeric positions each - NP1 for the first word and NP2 for the second word. The first letter of each word is coded with two digits, following the ascending Italian alphabetical collating sequence (00 through 20). The remaining five digits represent a sequence number. NP2 is filled with zeros if there is no second word.

WW: Contains a first-word (FWW) and a second-word (SWW) item written in unabbreviated form, with or without a joining preposition. For instance, 'vertical lathe' is a two-card item without a preposition.

NS: A string of bits that identify the classes to which the WW item belongs (bit on) or does not belong (bit off). The first six bits identify generic items, type of material, quality, machinery, f...
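
The NP and NS layouts described above can be illustrated with a minimal Python sketch. It is not the original implementation. The function names, the sequence numbers, the Italian rendering 'tornio verticale' of 'vertical lathe', the rule that sequence numbers start at 1, the rejection of letters outside the 21-letter Italian alphabet, and the left-to-right bit order are all assumptions made for illustration. Only the four class names that survive in the text are listed.

```python
# The 21 letters of the Italian alphabet in collating order -> codes 00..20.
ITALIAN_ALPHABET = "abcdefghilmnopqrstuvz"


def np_subfield(word: str, seq: int) -> str:
    """Return a seven-digit NP subfield: 2 digits for the first letter,
    5 digits for a sequence number. An empty word yields all zeros."""
    if not word:
        return "0" * 7
    index = ITALIAN_ALPHABET.find(word[0].lower())
    if index < 0:
        # Assumption: the source does not say how j, k, w, x, y are coded.
        raise ValueError(f"first letter of {word!r} is not in the Italian alphabet")
    # Assumption: sequence numbers start at 1, so a real word never
    # collides with the all-zero filler used for an empty NP2.
    if not 1 <= seq <= 99999:
        raise ValueError("sequence number must fit in five digits (1..99999)")
    return f"{index:02d}{seq:05d}"


def np_key(first_word: str, first_seq: int,
           second_word: str = "", second_seq: int = 0) -> str:
    """Concatenate NP1 and NP2 into the 14-digit record key."""
    return np_subfield(first_word, first_seq) + np_subfield(second_word, second_seq)


# Only the class names that survive in the text; the rest is truncated.
NS_CLASSES = ("generic item", "type of material", "quality", "machinery")


def ns_classes(ns_bits: str) -> list[str]:
    """List the named classes whose bit is on ('1').
    Assumption: the first bit is the leftmost character of the string."""
    return [name for name, bit in zip(NS_CLASSES, ns_bits) if bit == "1"]


if __name__ == "__main__":
    # Hypothetical sequence numbers 42 and 7.
    print(np_key("tornio", 42, "verticale", 7))  # 17000421900007
    print(np_key("zappa", 1))                    # 20000010000000
    print(ns_classes("000100"))                  # ['machinery']
```

Keeping the key as a fixed-width digit string means plain string comparison orders records by first letter and then by sequence number, which suits a keyed sequential file.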