Syntactic methods for anaphora resolution also exist for anaphoric noun phrases; as in the previous example, they are often based on searching for conflicts between the information in the anaphoric noun phrase and the information in the candidate antecedents.

Turning now to the semantic methods, we will limit ourselves to a brief mention of the resolution methods based on "focusing mechanisms," which were popularized by Candace Sidner. This mechanism is based on the fact that, in a discourse, speakers always center their attention on a particular discourse element: this element is the "discourse focus." The speaker uses several linguistic devices, which the hearer tries to interpret, to indicate the element on which he is focusing his attention; as the discourse progresses, the speaker may maintain the same discourse focus or may shift to another entity. A change in discourse focus, or the lack of one, is signaled by the linguistic choices the speaker makes, particularly by his use of various anaphoric expressions: the hearer must then resolve these expressions to be able to follow the changes in discourse focus.

Very roughly (see Carter's book for more details), a resolution process based on a focusing mechanism consists of the following phases:

  1. Finding the discourse focus. The focus recognition algorithm begins by selecting, in the first sentence of a text, an initial focus (the "expected focus"), which may or may not be confirmed by the subsequent sentences of the text. The validation procedure is based on syntactic or semantic rules. In the most general case, the decision requires a "case grammar" analysis of the input sentences: the most reliable default focus is then the filler of the "object" case of the verb, whereas, e.g., the "agent" position has the weakest capacity to determine the focus.
  2. Setting up six "registers" that represent the state of the focus at a given point in the text. The registers are: (1) the "discourse focus" (DF), which is set initially to the value of the expected focus; (2) the "actor focus" (AF), a structure parallel to the discourse focus but concerning an animate entity, i.e., an entity characterized by the semantic feature [+animate]; (3) the "potential discourse focus" (PDF); (4) the "potential actor focus" (PAF); (5) the "discourse focus stack" (DFS); (6) the "actor focus stack" (AFS). Each of these registers contains a list of zero or more entities.
  3. Anaphora resolution, based on the semantic representations (case grammar representations) of the input sentences, and taking into account the state of the focus. The expected focus algorithm, applied to the first sentence of the text and making use of syntactic and semantic criteria, allows us to set the DF register. In the following sentences, an "interpreter," made up of a set of algorithms, tries to identify and resolve the anaphoric ambiguities. Each algorithm applies to a given type of anaphoric situation; depending on this type and on the content of the focus registers, the interpreter predicts in which registers one or more antecedents for a given anaphor can be found. These suggestions are evaluated by an inference mechanism based on the detection of contradictions, e.g., the violation of semantic constraints on the arguments of the verb.
  4. Updating the focus registers. A "focusing algorithm" refreshes the focus registers using the (validated) anaphoric interpretation chosen by the interpreter. This algorithm is also in charge of confirming or rejecting the predicted focus; in case of rejection, it suggests a new focus.
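
The four phases above can be pictured with a small Python sketch. It is only an illustration under strong simplifying assumptions, not Sidner's or Carter's actual algorithms: the names `Entity`, `FocusRegisters`, `expected_focus`, `resolve`, and `update_discourse_focus` are hypothetical, the order in which the registers are searched is an assumption made for the example, and the "inference mechanism" is reduced to a caller-supplied predicate that rejects contradictory candidates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Entity:
    name: str
    animate: bool = False  # the [+animate] semantic feature


@dataclass
class FocusRegisters:
    df: List[Entity] = field(default_factory=list)   # discourse focus
    af: List[Entity] = field(default_factory=list)   # actor focus
    pdf: List[Entity] = field(default_factory=list)  # potential discourse focus
    paf: List[Entity] = field(default_factory=list)  # potential actor focus
    dfs: List[Entity] = field(default_factory=list)  # discourse focus stack
    afs: List[Entity] = field(default_factory=list)  # actor focus stack


def expected_focus(case_frame: Dict[str, Entity]) -> Entity:
    """Phase 1: pick the expected focus from a case-grammar frame.

    Assumed ranking: "object" is the most reliable default, "agent" the
    weakest, and every other case sits in between.
    """
    def rank(case: str) -> int:
        return {"object": 0, "agent": 2}.get(case, 1)
    return case_frame[min(case_frame, key=rank)]


def candidates(regs: FocusRegisters, animate: bool) -> List[Entity]:
    """Phase 3a: propose antecedents register by register (assumed order)."""
    if animate:
        order = [regs.af, regs.paf, regs.df, regs.pdf, regs.afs[::-1]]
    else:
        order = [regs.df, regs.pdf, regs.dfs[::-1]]
    result: List[Entity] = []
    for register in order:
        for entity in register:
            if entity not in result:
                result.append(entity)
    return result


def resolve(regs: FocusRegisters, animate: bool,
            is_consistent: Callable[[Entity], bool]) -> Optional[Entity]:
    """Phase 3b: keep the first proposal that raises no contradiction."""
    for entity in candidates(regs, animate):
        if is_consistent(entity):
            return entity
    return None


def update_discourse_focus(regs: FocusRegisters, chosen: Entity) -> None:
    """Phase 4: confirm the current focus or shift to the chosen entity."""
    if chosen in regs.df:
        return  # the predicted focus is confirmed
    regs.dfs.extend(regs.df)  # remember the rejected focus on the stack
    regs.df = [chosen]
    regs.pdf = [e for e in regs.pdf if e != chosen]


# Invented example: "Mary put the vase on the table. It was made of oak."
mary, vase, table = Entity("Mary", True), Entity("vase"), Entity("table")
frame = {"agent": mary, "object": vase, "location": table}
regs = FocusRegisters(df=[expected_focus(frame)], af=[mary], pdf=[table])
# Assumed world knowledge: only the table can be made of oak.
antecedent = resolve(regs, animate=False,
                     is_consistent=lambda e: e.name == "table")
if antecedent is not None:
    update_discourse_focus(regs, antecedent)
```

In this invented example the expected focus (the vase) is proposed first but rejected by the consistency check, so the pronoun is resolved to the table, which becomes the new discourse focus while the vase is pushed onto the discourse focus stack.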

Another frequent source of ambiguity, typical of NL dialog systems and sharing some similarities with anaphoric phenomena, is the presence of elliptical sentences, i.e., sentences that appear ill-formed because they are incomplete. This phenomenon is illustrated by this simple dialog: "Who is the director of the computer science department? Professor Smith. Of the nuclear physics department?"; "of the nuclear physics department" is an elliptical sentence. The resolution methods for dealing with ellipsis typically try to recover the missing parts, "Who is the director" in our example, from the previous, complete sentences. This can be accomplished by making use, e.g., of "context registers" that store the most significant recent items found in the complete sentences; the fragments are then paired with these items according to their semantic category.
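
A minimal Python sketch of the context-register idea follows. It assumes that each stored item and each fragment has already been labeled with a semantic category by an earlier analysis step; the function name `resolve_ellipsis` and the category labels `"request"` and `"department"` are hypothetical and chosen only for this example.

```python
from typing import Dict, Tuple


def resolve_ellipsis(context: Dict[str, str],
                     fragment: Tuple[str, str]) -> Tuple[str, Dict[str, str]]:
    """Complete an elliptical fragment from the context registers.

    `context` maps a semantic category to the most recent item of that
    category, in the order the items appeared in the complete sentence.
    `fragment` is a (category, text) pair for the elliptical input.
    Returns the reconstructed sentence and the updated registers.
    """
    category, text = fragment
    if category not in context:
        raise ValueError(f"no context register for category {category!r}")
    updated = dict(context)   # copy; insertion order is preserved
    updated[category] = text  # the fragment replaces the item it pairs with
    return " ".join(updated.values()) + "?", updated


# Registers filled from "Who is the director of the computer science department?"
context = {
    "request": "Who is the director",
    "department": "of the computer science department",
}
sentence, context = resolve_ellipsis(
    context, ("department", "of the nuclear physics department"))
print(sentence)
```

Returning the updated registers lets a further fragment in the same dialog be resolved against the most recent completed question rather than the original one.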

3.4.4.3. Other Types of Ambiguities

Although "PP-attachment," "anaphora," and "ellipsis" are probably the most pervasive classes of linguistic problems that can affect parsing, they (unfortunately) do not exhaust the catalog of possible sources of ambiguity; see Androutsopoulos et al. (1995). The "quantifier scope" problem concerns the difficulty, in sentences containing several quantifiers, of determining which quantifier (e.g., "a," "each," "all," "there exists") should receive the wider scope. A classical example of this ambiguity is given by the following two sentences, the second of which is simply the passive form of the first: "Everyone in the room speaks two languages" and "Two languages are spoken by everyone in the room." In interpreting them, people normally attribute wider scope to the quantifier "all in the room" in the first sentence (therefore, the two languages may be different for different people), while in the second sentence they attribute wider scope to the quantifier "there exist two languages" (therefore, the languages are the same for everybody). This example also shows that, if two interpretations are equally plausible, the preferred one is the one in which the quantifier ordering corresponds to the surface ordering of the noun phrases. Other ambiguity phenomena concern, e.g., the possibility that the word "and" denotes disjunction rather than conjunction ("How many people live in Boston and New York?"), the nominal compound problem (e.g., "computer science" has a totally different functional role in "a computer science department" and "a computer science device"), etc.
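
The two scope readings of the example can be written out in first-order logic. This is only a sketch: the predicate names InRoom and Speaks are introduced here for illustration, and "two languages" is read as "at least two distinct languages," which is an assumption.

```latex
% Reading 1: the universal quantifier has wider scope
% (the two languages may differ from person to person)
\[
\forall x \, \bigl( \mathit{InRoom}(x) \rightarrow
  \exists l_1 \, \exists l_2 \, ( l_1 \neq l_2
  \wedge \mathit{Speaks}(x, l_1) \wedge \mathit{Speaks}(x, l_2) ) \bigr)
\]

% Reading 2: the existential quantifier has wider scope
% (the same two languages for everybody)
\[
\exists l_1 \, \exists l_2 \, \bigl( l_1 \neq l_2 \wedge
  \forall x \, ( \mathit{InRoom}(x) \rightarrow
  \mathit{Speaks}(x, l_1) \wedge \mathit{Speaks}(x, l_2) ) \bigr)
\]
```

Under this formalization the second reading entails the first but not vice versa, which is why the choice of scope changes what the sentence asserts.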


Copyright (c) 1996-1999 EarthWeb, Inc.. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Please read our privacy policy for details.