Chapter 8
1. Introduction
2. Definitions
3. Where Do We Start?
4. Certainty Factors
5. The MYCIN Approach
6. Dempster-Shafer Approach
7. Bayesian Belief Networks
   7.1. What Is a Belief Network?
   7.2. Why Use a Belief Network?
   7.3. Structure of a Belief Network
   7.4. Knowledge Engineering
   7.5. Process of Using a Belief Network
   7.6. Multiply Connected Belief Networks
   7.7. Example
8. Future Research
References
Artificial Intelligence (AI) has struggled to find ways to effectively use probabilistic reasoning to aid in solving problems where knowledge is incomplete, error-prone, or approximate. It has invented logics to deal with the problem symbolically. It has invented concepts to skirt the issue of conditional independence, prior probabilities, and the difficulties of conditional probabilities and causal inferences. A summary of the development of these ideas could be stated as, "We would use Bayesian models if only we could satisfy all the assumptions and were omniscient." We will focus on the dominant themes that have occupied most of the literature on uncertainty and expert systems. Those include the Bayesian approach, the certainty factor approach, the Dempster-Shafer approach, and the more advanced Bayesian belief networks approach. Fuzzy reasoning will not be discussed because it addresses the problem of vagueness rather than uncertainty. As Russell and Norvig (1995) point out, it is not a method for uncertain reasoning and is problematic in that it is inconsistent with the first-order predicate calculus.
When we consider behavior and knowledge quantitatively, we are thinking in both numeric and empirical systems. If we make the statement P(x) = 0.4, we are making a statement in the numeric system. In that system, this is a probability statement based, for example, on frequency of occurrence (this is not the only way to frame this). If, on the other hand, we view the statement P(x) = 0.4 as stating a degree of belief that the event x will occur, we are making a statement in the empirical system. When we go on to manipulate this belief using the laws of mathematics, we are making strong assumptions about the isomorphism between the axioms in the numeric system and the corresponding axioms in the empirical system.
We are interested in certainty or degree of belief in something. The assumption is that probabilities as degrees of belief or certainty change as a function of what we know -- as a function of evidence that supports or refutes the belief.
We start with a simple problem. Suppose that you get a new bread maker and your first loaf of bread does not rise. Your goal is to diagnose the problem and determine how to fix it. The hypotheses you entertain are:
We can assume, based on our past experience making bread and whatever other knowledge or expertise we have concerning bread making, that each hypothesis has a prior probability of being the proper diagnosis for dough not rising. We will assume that each hypothesis is equally probable and assign each the certainty 0.25.
Consider also that we have the following rules that affect our belief in certain hypotheses (this set is not complete); we will for the sake of brevity omit the rules for H2 through H4:
Using R1 and H1, we can restate the rule as a conditional probability as follows:

P(H|E) = P(E|H) P(H) / P(E)

where H is H1 and E is the left-hand side of R1.
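The single-evidence update can be sketched in a few lines of Python. The prior of 0.25 comes from the example above; the likelihood P(E|H1) and the marginal P(E) are hypothetical values chosen only to illustrate the arithmetic, not values given in the text.

```python
def bayes_update(prior, likelihood, evidence_prob):
    """Return the posterior P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence_prob

p_h1 = 0.25          # prior for H1 from the bread-maker example
p_e_given_h1 = 0.8   # assumed: old yeast is very likely if H1 is true
p_e = 0.4            # assumed: overall probability of observing old yeast

posterior = bayes_update(p_h1, p_e_given_h1, p_e)
print(posterior)     # 0.5 -- belief in H1 rises from the prior of 0.25
```

Any likelihood greater than the marginal probability of the evidence will raise the posterior above the prior, which is the behavior the example relies on.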
If, from R1, we know that the yeast is old, i.e., Old(Y), then our certainty in H1 changes to:

P(H1|Old(Y)) = P(Old(Y)|H1) P(H1) / P(Old(Y))
We know that P(H1) = 0.25, and applying the values for P(Old(Y)|H1) and P(Old(Y)) yields a posterior greater than 0.25.
That is, our belief in H1 has increased. We could go on to compute the probability of H1 given multiple pieces of evidence E1, E2, ..., En; to do so, we must assume that E1, E2, ..., En are conditionally independent given the hypothesis. The basic form, which is discussed in Gonzalez and Dankel (1993), is:

P(H|E1, E2, ..., En) = P(E1|H) P(E2|H) ... P(En|H) P(H) / P(E1, E2, ..., En)
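The multiple-evidence form can be sketched as follows. Rather than computing the joint marginal P(E1, ..., En) directly, the sketch normalizes the unnormalized posteriors over all hypotheses, which gives the same result when the hypothesis set is mutually exclusive and exhaustive. All likelihood numbers are hypothetical, chosen only to show the mechanics.

```python
from functools import reduce

def posterior(priors, likelihoods):
    """Combine independent evidence under a naive Bayes assumption.

    priors:      {h: P(h)} over a mutually exclusive, exhaustive set
    likelihoods: {h: [P(E1|h), ..., P(En|h)]}
    Returns {h: P(h|E1, ..., En)} via normalization over hypotheses.
    """
    unnorm = {h: reduce(lambda a, b: a * b, likelihoods[h], priors[h])
              for h in priors}
    z = sum(unnorm.values())  # plays the role of P(E1, ..., En)
    return {h: v / z for h, v in unnorm.items()}

priors = {"H1": 0.25, "H2": 0.25, "H3": 0.25, "H4": 0.25}
likelihoods = {                # assumed values for two observations
    "H1": [0.8, 0.7],
    "H2": [0.3, 0.2],
    "H3": [0.1, 0.4],
    "H4": [0.2, 0.1],
}
print(posterior(priors, likelihoods))   # H1 dominates the posterior
```

Note that the independence assumption is doing all the work here: each P(Ei|h) is multiplied in as if the observations never interact, which is exactly the assumption criticized below.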
Stefik (1995) as well as Russell and Norvig (1995) provide excellent discussions of Bayesian probability models.
What we just did was to use a simple Bayesian approach. The problems with this approach are implicit in what we assumed. First, we assumed that the set of hypotheses was both mutually exclusive and exhaustive; a Bayesian model must have this property, and with expert systems it often cannot be established. Second, we had to define prior probabilities, which is difficult and often impossible. Third, when we combine evidence to compute the conditional probability of H given multiple pieces of evidence, we must assume that the pieces of evidence are conditionally independent given the hypothesis.