

2.2. THE ALPAC REPORT AND THE FIRST AI PROGRAMS

Semantics, however, was lagging behind. In 1960, Bar-Hillel published a survey of MT in which he expressed the opinion that the syntactic aspects of current MT systems could be greatly improved through the incorporation of theoretical research, but that the semantic problems could never be completely solved because of the huge quantity of implicit "world knowledge" needed to translate correctly from one language into another. His famous example concerns the two sentences "The pen is in the box" and "The box is in the pen," and the impossibility, for a machine lacking some sort of universal encyclopedia, of determining that the word "pen" in the second sentence means "an enclosure where small children can play." High-quality MT was therefore simply impossible. Not all researchers agreed with Bar-Hillel's conclusions; they were nevertheless officially endorsed by the publication, in 1966, of the so-called ALPAC (Automatic Language Processing Advisory Committee) report. This document stated that MT was slower, less accurate, and much more expensive than human translation; that no machine translation of general scientific texts had been demonstrated; and that none was foreseeable in the near future. A severe reduction of public investment in MT was recommended, and immediately put into effect.

The ALPAC report could have sounded the death knell of any hope of seeing NLP techniques go beyond the simple production of indexes and concordances. Fortunately, Artificial Intelligence (AI) was, by that time, already established as an autonomous discipline -- the official start of AI corresponds to the Dartmouth Summer School of 1956 -- and AI researchers were already working on NLP programs that tried to go beyond syntactic analysis, keyword searching, or statistical techniques, and to deal instead with the "meaning" hidden behind words and sentences.

The early 1960s saw the burgeoning of a series of AI programs that can be considered, in a sense, the first examples of NL interfaces to databases and KBSs. We can mention here SAD-SAM (Syntactic Appraiser and Diagrammer-Semantic Analyzing Machine), BASEBALL, SIR (Semantic Information Retrieval), STUDENT, and above all ELIZA, perhaps the most famous of these early AI programs, written by Joseph Weizenbaum in 1966 and programmed in the SLIP language (Symmetric List Processor) that Weizenbaum himself had developed. All these programs dealt only with very restricted domains -- e.g., kinship relationships in SAD-SAM, or one year of American League games in BASEBALL. Their dialog capabilities were very limited (input sentences were restricted to simple declarative and interrogative forms) and, above all, their syntactic and semantic capabilities were particularly meagre. No real syntactic analysis of the queries was in fact performed; they were simply scanned for keywords or patterns related to the chosen domain (which therefore represented the only semantic knowledge of the system), and these were then used in a pattern-matching mode to retrieve information from some sort of fact database. Heuristic rules were often used to improve the chances of recognizing useful keywords and patterns: for example, two simple heuristics used in STUDENT are "one half always means .5" and "distance equals speed times time."
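As a rough illustration of this keyword-and-pattern style of question answering, the following Python sketch scans a question for domain keywords, applies a STUDENT-like substitution heuristic, and looks the result up in a tiny fact base. Everything in it (the facts, team names, and rewrite rules) is invented for illustration and does not reproduce the original BASEBALL or STUDENT code.

    # Hypothetical sketch of the keyword/pattern approach of BASEBALL- and
    # STUDENT-like programs: the question is never parsed syntactically,
    # only scanned for domain keywords after a few heuristic rewrites.
    import re

    FACTS = {("yankees", "july 7"): "Yankees 5, Red Sox 3"}   # toy fact database

    HEURISTICS = [("one half", "0.5"), ("twice", "2 times")]  # STUDENT-style rewrites

    def answer(question):
        q = question.lower()
        for old, new in HEURISTICS:                 # apply heuristic substitutions
            q = q.replace(old, new)
        team = next((t for t in ("yankees", "red sox") if t in q), None)  # keyword scan
        date = re.search(r"july \d+", q)            # pattern scan
        if team and date:
            return FACTS.get((team, date.group()), "no record found")
        return "question not understood"

    print(answer("What was the score when the Yankees played on July 7?"))
    # -> Yankees 5, Red Sox 3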

The amazing results obtained by ELIZA in simulating the interaction between a (nondirective) psychiatrist and his patient were largely due to refinements of this sort of heuristics, which allowed the program to produce dialogs that appeared surprisingly realistic. For example, assuming that the retrieved keyword is "me," this will trigger a set of associated patterns to look for in the patient's input: one of these corresponds, e.g., to "0 you 0 me" (0 in the pattern can match any string of words). Patterns like this are linked in turn with a set of "transformation rules," such as "what makes you think I 3 you," where 3 indicates the third element matched, i.e., everything in the input between "you" and "me." A user's utterance like "you bother me" will then produce an answer from ELIZA like "WHAT MAKES YOU THINK I BOTHER YOU." Weizenbaum himself argued later (1976) that, to the extent that ELIZA's results were particularly impressive, they were also particularly misleading. Another, later dialog system (early 1970s) in the same vein was PARRY, by Kenneth Colby, where the basic technique again consisted in matching a sequence of keywords extracted from the user's input against a long list of predefined patterns (about 1700).
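The decomposition/reassembly mechanism just described can be sketched in a few lines of Python (a hypothetical illustration, not Weizenbaum's SLIP code): the decomposition pattern "0 you 0 me" is expressed as a regular expression, and the transformation rule re-inserts the third matched element.

    # ELIZA-style decomposition and reassembly (hypothetical sketch).
    import re

    # "0 you 0 me": 0 matches any (possibly empty) string of words.
    DECOMPOSITION = re.compile(r"(.*)\byou\b(.*)\bme\b", re.IGNORECASE)
    REASSEMBLY = "WHAT MAKES YOU THINK I {3} YOU"   # {3} = third matched element

    def respond(utterance):
        m = DECOMPOSITION.search(utterance)
        if m:
            third = m.group(2).strip().upper()      # words between "you" and "me"
            return REASSEMBLY.replace("{3}", third)
        return "PLEASE GO ON"                       # fallback when no pattern fires

    print(respond("you bother me"))   # -> WHAT MAKES YOU THINK I BOTHER YOU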

We must wait until the early 1970s to see the arrival of two NL question-answering systems, William Woods' LUNAR and Terry Winograd's SHRDLU, that attempted to deal in a comprehensive way with both syntax and semantics, and to do some real "reasoning," at least to a limited extent. Both LUNAR and SHRDLU are representative of the "procedural" approach to knowledge representation: knowledge about the world is represented as procedures (programs) within the system, and reasoning about this knowledge corresponds to executing the related programs. The dispute about the relative merits of "declarative" representations (semantic nets, logic formulae, frames, etc.) versus procedural ones was an important historical issue in Artificial Intelligence, superseded today. In the declarative approach, general knowledge representation structures are manipulated by a general deductive process.

LUNAR was an NL interface to a database provided by NASA that concerned chemical analysis data on the samples of rock brought back from the moon by the Apollo 11 mission. LUNAR handled NL queries in the style of: "What is the average concentration of aluminium in high alkali rocks?" in three steps:

  1. Syntactic analysis of the query. This was carried out making use of (a) an ATN (Augmented Transition Network) grammar, and (b) heuristic information to prune the results, in order to produce the most likely "derivation tree" corresponding to the query. A derivation tree is a hierarchical structure indicating the syntactic functions (e.g., surface subject) of the words of the analyzed expression, and the relationships (e.g., adjective/noun) between them. ATN is a formalism for describing and applying NL grammars, developed by William Woods himself, which was considered a sort of standard for syntactic analysis until at least the mid-1980s -- see subsection 3.4.3 for some additional information on ATNs.
  2. Semantic interpretation of the results of (1) in order to produce a representation of the "meaning" of the query; in practice, the query was translated into a formal language in the predicate calculus style. The language included essentially "propositions," formed by "predicates" having "object designators" as arguments, and "commands," which started actions. An example is: "(TEST (CONTAIN S10046 OLIV))," where TEST is a command requesting the evaluation of a truth test, CONTAIN is a predicate, S10046 is the designator of a particular lunar sample, and OLIV is the designator of the mineral olivine.
  3. Execution on the database of the program represented by the formal query to produce an answer ("procedural semantics").
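These three steps, and in particular the "procedural semantics" of step 3, can be illustrated with a small Python sketch: the formal query produced in step 2 is represented as a nested expression, and answering it simply means executing that expression against the sample database. The database contents, designator names, and evaluator below are hypothetical illustrations, not Woods' actual implementation.

    # Hypothetical sketch of LUNAR-style procedural semantics.
    SAMPLES = {"S10046": {"OLIV": True, "AL2O3": 12.3}}   # toy lunar-sample database

    def evaluate(expr):
        op, *args = expr
        if op == "CONTAIN":                   # predicate over a sample designator
            sample, mineral = args
            return bool(SAMPLES.get(sample, {}).get(mineral))
        if op == "TEST":                      # command: evaluate a truth test
            return "YES" if evaluate(args[0]) else "NO"
        raise ValueError("unknown operator: " + op)

    # "Does sample S10046 contain olivine?" -> (TEST (CONTAIN S10046 OLIV))
    print(evaluate(("TEST", ("CONTAIN", "S10046", "OLIV"))))   # -> YES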

