
3.4.4.2. Anaphora and Ellipsis

One of the hardest problems in computational linguistics and NLP concerns finding algorithmic procedures for resolving anaphora. Anaphora is the linguistic phenomenon whereby pronouns (like "he," "himself," or "they"), possessive determiners (like "her" or "their"), and full noun phrases (like "the patient," see below) can be used to denote implicitly entities already mentioned in the discourse; by "discourse" we mean here a coherent segment of text, either written or spoken. Examples are, respectively, "John went to the party and he got drunk," "This girl is very nice, and her brother is charming," and "Jack was admitted to the hospital early this morning. The patient complained of chest pain." Resolving an anaphor then involves the following two steps:

  1. Realising that the pronoun, the possessive determiner, or the noun phrase (NP) represents an implicit reference (normally, an abbreviated reference) to some other entity. This first step is not trivial, especially when the anaphor is an NP; moreover, not all personal pronouns are necessarily anaphoric.
  2. Disambiguating the reference by substituting for it the entity (the "antecedent") that represents its real identity.

A detailed analysis of all the techniques that have been proposed for solving the anaphora problem is, of course, beyond the scope of this chapter (see Carter, 1987). We can say here, however, that these methods can be roughly divided into two categories: syntactic and semantic methods.

The first are based only on the results of the morphosyntactic analysis. In some cases, especially when the implicit reference is created by making use of pronouns (pronominal anaphora), the syntactic procedures for resolution can be very simple. In such cases, a list is kept of all the entities mentioned in the discourse. When a pronoun is encountered, the list is examined starting from the most recent entries, and the pronoun is associated with the most recently mentioned entity that satisfies the grammatical and semantic constraints. For example, in the previous sentence, "John went to the party and he got drunk," we can rely on the fact that "he" must refer to a previous, animate NP; we look for such an NP and we find "John," so we can now disambiguate the reference by substituting "John" for "he." Of course, this very simple method of resolution cannot be generalized, even in the case of pronominal anaphora, given that, inter alia, it is very difficult to conjecture with any precision, as we have done in the previous example, the semantic type and the syntactic structure of the antecedent. We can mention, in this context, the following example, where the antecedent ("to buy a new dress") is not a simple NP but a structured verbal phrase: "I have to buy a new dress. I prefer to do it before the holidays."
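The list-based procedure just described can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the names `Entity`, `PRONOUN_CONSTRAINTS`, and `resolve_pronoun` are hypothetical, and the animacy, gender, and number features are assigned by hand rather than produced by a real morphosyntactic analysis.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """An entity mentioned in the discourse, with hand-assigned features."""
    text: str
    animate: bool
    gender: str  # "m", "f", or "n"
    number: str  # "sg" or "pl"


# Hypothetical constraint table: what each pronoun requires of its antecedent.
PRONOUN_CONSTRAINTS = {
    "he": {"animate": True, "gender": "m", "number": "sg"},
    "she": {"animate": True, "gender": "f", "number": "sg"},
    "it": {"animate": False, "gender": "n", "number": "sg"},
    "they": {"number": "pl"},
}


def resolve_pronoun(pronoun, history):
    """Return the most recently mentioned entity that satisfies the
    pronoun's constraints, or None if no entity in the list qualifies."""
    constraints = PRONOUN_CONSTRAINTS[pronoun.lower()]
    for entity in reversed(history):  # most recent entries first
        if all(getattr(entity, key) == value
               for key, value in constraints.items()):
            return entity
    return None


# "John went to the party and he got drunk"
history = [
    Entity("John", animate=True, gender="m", number="sg"),
    Entity("the party", animate=False, gender="n", number="sg"),
]
antecedent = resolve_pronoun("he", history)
print(antecedent.text)  # John
```

The sketch also makes the limitation visible: every candidate must be a pre-typed NP, so an antecedent such as the verbal phrase "to buy a new dress" would never appear in the list.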

Some reliable syntactic procedures exist in very limited domains; for pronominal anaphora, they are linked with the concepts of "dominating node" and "command." In a syntactic tree, a node A dominates a node B if B is included in the syntactic sub-tree stemming from A. A node A then commands a node B if (1) neither A nor B dominates the other, and (2) the S(entence) or NP node that most immediately dominates A (i.e., the lowest S or NP node dominating A) also dominates B. For pronominal anaphora, a first rule then says that a pronoun cannot both precede and command its antecedent. Let us consider, as an example, the two sentences "John left the room after he shot the girl" and "He left the room after John shot the girl." In both, the node corresponding to the subject of the main clause (... left the room) commands the node corresponding to the subject of the subordinate clause (... shot the girl). In the second sentence, according to the previous rule, "he," which precedes "John," therefore cannot have this noun as its antecedent. Please note that a rule like this is mainly a "negative" one: it allows us to exclude some NPs as possible antecedents of a pronoun, but it does not allow a precise identification of the antecedent.

A more precise rule exists for anaphoric situations created by the use of reflexive and reciprocal pronouns like "each other," "herself," or "himself." It says that (1) the antecedent of a reflexive pronoun must be dominated by the same lowest S or NP node that dominates the pronoun; (2) a nonreflexive pronoun may not have an antecedent dominated by this node. Looking now at the following example from David Carter, "I took my dog to the vet on Friday. He bit him in the hand," we can say, using simple morphosyntactic criteria like that of the "list" expounded before, that "he" is either "dog" or "vet"; applying semantic information for "hand," i.e., "only people have hands," we can add that "him" is the "vet." The second part of the previous rule tells us that the predictions "he = vet" and "him = vet" are inconsistent, since both pronouns are dominated by the same S node and are not reflexive. The prediction "he = vet" is then rejected, because the pronoun "he" has an alternative, namely "he = dog."
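The two rules above can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the `Node` class, the function names, and the deliberately simplified trees (each verb phrase is collapsed into a single leaf, and the "SBAR" and "P" labels are our own choice) are not taken from any parser. The first part implements "dominates," "commands," and the negative rule for the two "John"/"he" sentences; the second part enumerates the readings of the dog/vet example and discards those in which two nonreflexive pronouns of the same clause share an antecedent.

```python
from itertools import product


class Node:
    """A syntactic tree node; `label` is a category such as "S" or "NP"."""

    def __init__(self, label, children=(), word=None):
        self.label, self.word = label, word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if b lies inside the sub-tree stemming from a."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def commands(a, b):
    """True if neither node dominates the other and the lowest S or NP
    node dominating a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and node.label not in ("S", "NP"):
        node = node.parent
    return node is not None and dominates(node, b)


def leaves(node):
    """Leaves of the sub-tree in left-to-right (surface) order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def may_corefer(root, pronoun, candidate):
    """Negative rule: a pronoun cannot both precede and command its
    antecedent. Both arguments are assumed to be leaves of the tree."""
    order = leaves(root)
    precedes = order.index(pronoun) < order.index(candidate)
    return not (precedes and commands(pronoun, candidate))


def build(main_subject, sub_subject):
    """Simplified tree for '<main> left the room after <sub> shot the girl'."""
    subj_main = Node("NP", word=main_subject)
    subj_sub = Node("NP", word=sub_subject)
    sub_clause = Node("S", [subj_sub, Node("VP", word="shot the girl")])
    root = Node("S", [subj_main, Node("VP", word="left the room"),
                      Node("SBAR", [Node("P", word="after"), sub_clause])])
    return root, subj_main, subj_sub


root, john, he = build("John", "he")
print(may_corefer(root, he, john))  # True: "he" follows "John"
root, he, john = build("He", "John")
print(may_corefer(root, he, john))  # False: "He" precedes and commands "John"


def consistent_readings(candidates, clause_mates):
    """Keep the readings in which no two nonreflexive pronouns dominated
    by the same lowest S node are given the same antecedent."""
    pronouns = list(candidates)
    readings = []
    for choice in product(*(candidates[p] for p in pronouns)):
        reading = dict(zip(pronouns, choice))
        if all(reading[a] != reading[b] for a, b in clause_mates):
            readings.append(reading)
    return readings


# "I took my dog to the vet on Friday. He bit him in the hand."
candidates = {"he": ["dog", "vet"], "him": ["vet"]}  # "him" narrowed by "hand"
print(consistent_readings(candidates, [("he", "him")]))
# [{'he': 'dog', 'him': 'vet'}]
```

Note that `may_corefer` only excludes candidates, mirroring the "negative" character of the rule; choosing among the remaining candidates still requires other criteria, such as the recency list or semantic information.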


Copyright (c) 1996-1999 EarthWeb, Inc. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of EarthWeb is prohibited.