Automatic Acquisition of Translation Rules for Adpositions

IP.com Disclosure Number: IPCOM000015507D
Original Publication Date: 2002-Jan-29
Included in the Prior Art Database: 2003-Jun-20
Document File: 3 page(s) / 130K

Publishing Venue

IBM

Abstract

Disclosed is a system that selects an appropriate adposition (a postposition or preposition) between a verb and a noun in Language B, as the translation of a verb phrase in Language A that consists of a noun, a specific adposition $P_x$, and a verb. The system is especially effective when $P_x$ is the Japanese postposition "de" or the Korean postposition "lo/eulo", and it reduces the human labor cost of creating dictionaries for machine translation.

As shown in Fig. 1, the system uses corpora of both Language A (1-1) and Language B (1-2), and a bilingual lexicon between the two languages (1-3). The two corpora can be independent of each other. The system generates a set of preferred adpositions for specific VN-pairs (a VN-pair is a pair of a verb and a noun) (1-7) and a set of preferred adpositions for specific nouns (1-9).

[1-4] constructs the Restricted VPN-tuple Set (1-5), a set of VPN-tuples in Language B that are possible translations of a verb phrase in Language A that includes $P_x$. A VPN-tuple (v, p, n) consists of a verb v, an adposition p, and a noun n, where p modifies v and n modifies p. The Restricted VPN-tuple Set is constructed as follows:
- Extract VPN-tuples whose adposition is $P_x$ from the corpora of Language A. Let this set of VPN-tuples be $A_{vpn}$.
- Translate v and n in each VPN-tuple in $A_{vpn}$ into VN-pairs in Language B using the bilingual lexicon. Let this set of VN-pairs be $B_{vn}$.
- Extract VPN-tuples from the corpora of Language B. Let this set of VPN-tuples be $B_{vpn}$.
- The Restricted VPN-tuple Set is defined as $\{(v, p, n) \mid (v, p, n) \in B_{vpn},\ (v, n) \in B_{vn}\}$.

[1-6] determines the preferred adposition between a verb and a noun from the frequencies of the VPN-tuples in the Restricted VPN-tuple Set. The adposition for a VN-pair is determined only when the most frequent adposition for that VN-pair is sufficiently more frequent than the other adpositions.

Fig. 2 shows the iterative process in [1-8], which acquires preferred adpositions for nouns. Note that [2-1] and [2-4] correspond to [1-5] and [1-9], respectively. Using the current Restricted VPN-tuple Set (2-1), [2-2] determines the preferred adposition for a noun (2-3), ignoring the co-occurring verb. [2-5] detects VP-compounds. A VP-compound (v, p) is a pair of a verb v and an adposition p for which two or more nouns n exist such that:
- an adposition is determined for the VN-pair (v, n) in [2-4]
- an adposition is determined for the noun n in [2-3]
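The construction of the Restricted VPN-tuple Set and the frequency-based selection described above can be sketched roughly as follows. This is only an illustration under several assumptions: the corpora are taken to be already parsed into (verb, adposition, noun) tuples, the lexicon is a plain dictionary from a Language A word to its candidate translations, and the "sufficiently more frequent" test is modeled as a simple ratio threshold, which the disclosure does not specify. The function names, the `ratio` parameter, and the toy data are all hypothetical. The iterative process of Fig. 2 and VP-compound detection are not covered.

```python
from collections import Counter, defaultdict


def restricted_vpn_tuples(corpus_a, corpus_b, lexicon, px):
    """Count the Language B tuples that belong to the Restricted VPN-tuple Set."""
    # A_vpn: Language A tuples whose adposition is Px.
    a_vpn = [(v, p, n) for (v, p, n) in corpus_a if p == px]
    # B_vn: every (verb, noun) pair obtainable by translating v and n.
    b_vn = {
        (tv, tn)
        for (v, _, n) in a_vpn
        for tv in lexicon.get(v, ())
        for tn in lexicon.get(n, ())
    }
    # Keep only Language B tuples whose (v, n) is in B_vn, with frequencies.
    return Counter(t for t in corpus_b if (t[0], t[2]) in b_vn)


def dominant(counts, ratio):
    """Return the top adposition if it beats the runner-up by `ratio`, else None.

    The ratio test is an assumed stand-in for "sufficiently more frequent".
    """
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 1 or ranked[0][1] >= ratio * ranked[1][1]:
        return ranked[0][0]
    return None


def preferred_adpositions(restricted, key, ratio=2.0):
    """Group tuple counts by key(v, n) and keep groups with a dominant adposition."""
    grouped = defaultdict(Counter)
    for (v, p, n), count in restricted.items():
        grouped[key(v, n)][p] += count
    result = {}
    for k, counts in grouped.items():
        best = dominant(counts, ratio)
        if best is not None:
            result[k] = best
    return result


# Hypothetical toy data: romanized Japanese tuples and English tuples.
corpus_a = [("kiru", "de", "naifu"), ("iku", "de", "basu"), ("iku", "ni", "gakkou")]
lexicon = {"kiru": ["cut"], "naifu": ["knife"], "iku": ["go"],
           "basu": ["bus"], "gakkou": ["school"]}
corpus_b = (
    [("cut", "with", "knife")] * 3 + [("cut", "by", "knife")]
    + [("go", "by", "bus")] * 2 + [("go", "on", "bus")] * 2
    + [("go", "to", "school")] * 5
)

restricted = restricted_vpn_tuples(corpus_a, corpus_b, lexicon, px="de")
# Preferred adposition per VN-pair (verb and noun both considered).
vn_rules = preferred_adpositions(restricted, key=lambda v, n: (v, n))
# Preferred adposition per noun, ignoring the co-occurring verb.
noun_rules = preferred_adpositions(restricted, key=lambda v, n: n)
```

With this toy data, ("go", "to", "school") is excluded from the restricted set because its Language A counterpart uses "ni" rather than "de", and no rule is produced for ("go", "bus") because "by" and "on" are tied. Using one grouping function for both the VN-pair and noun cases is a convenience of the sketch, not something stated in the disclosure.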

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Disclosed is a system that selects an appropriate adposition (a postposition or preposition) between a verb and a noun in Language B, as the translation of a verb phrase in Language A that consists of a noun, a specific adposition $P_x$, and a verb. The system is especially effective when $P_x$ is the Japanese postposition "de" or the Korean postposition "lo/eulo", and it reduces the human labor cost of creating dictionaries for machine translation.

As shown in Fig. 1, the system uses corpora of both Language A (1-1) and Language B (1-2), and a bilingual lexicon between the two languages (1-3). The two corpora can be independent of each other. The system generates a set of preferred adpositions for specific VN-pairs (a VN-pair is a pair of a verb and a noun) (1-7) and a set of preferred adpositions for specific nouns (1-9).

[1-4] constructs the Restricted VPN-tuple Set (1-5), a set of VPN-tuples in Language B that are possible translations of a verb phrase in Language A that includes $P_x$. A VPN-tuple (v, p, n) consists of a verb v, an adposition p, and a noun n, where p modifies v and n modifies p. The Restricted VPN-tuple Set is constructed as follows:
- Extract VPN-tuples whose adposition is $P_x$ from the corpora of Language A. Let this set of VPN-tuples be $A_{vpn}$.
- Translate v and n in each VPN-tuple in $A_{vpn}$ into VN-pairs in Language B using the bilingual lexicon. Let this set of VN-pairs be $B_{vn}$.
- Extract VPN-tuples from corpora of Language B. Let the s...