
Rule Induction. Another use of GAs, somewhat related to rule-based systems, is to extract, or symbolically learn, rules or patterns from data. Symbolic learning is a supervised learning paradigm that derives rules from data. Unfortunately, experience has shown that extracting even a small number of rules, without applying any other heuristic strategies, results in lengthy evolution times. Deriving a small number of rules (fewer than six) from even a small data set (several hundred records) can typically take hours (Al-Attar, 1994).

Symbolic learning extracts rules from data in the form of a classification tree. The nodes of the tree are attributes drawn from the data set, and the leaves of the tree are the possible outcomes. The basic concept behind the GA approach is to evolve these trees, with each gene representing a node in a classification tree. This approach was applied to loan data obtained from a bank (Al-Attar, 1994, p. 37); 430 data records were available. Both the traditional algorithm, known as ID3, and the GA hybrid were run on the same data. ID3-type algorithms build classification trees from data using entropy as a heuristic measure and use heuristic tree pruning to prevent the tree from growing too large. Table 3 summarizes the results.

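The GA hybrid approach for extracting rules from data shows some improvement over the traditional approach: classification accuracy rose from 73% to 78%, although the GA hybrid extracted more rules (11 versus 5) and used more attributes (6 versus 3).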
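The entropy heuristic that ID3-type algorithms use to choose attributes can be illustrated with a minimal sketch. The records below are hypothetical toy data, not the bank data from the study, and the function names (`entropy`, `information_gain`, `best_attribute`) are chosen here for illustration only. The sketch covers attribute selection, not tree construction or pruning.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(records, attribute, target):
    """Entropy reduction obtained by splitting the records on one attribute."""
    base = entropy([r[target] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder

def best_attribute(records, attributes, target):
    """Pick the attribute with the highest information gain (the ID3 choice)."""
    return max(attributes, key=lambda a: information_gain(records, a, target))

# Hypothetical toy loan records (an assumption for illustration).
loans = [
    {"employed": "yes", "owns_home": "yes", "approved": "yes"},
    {"employed": "yes", "owns_home": "no",  "approved": "yes"},
    {"employed": "no",  "owns_home": "yes", "approved": "no"},
    {"employed": "no",  "owns_home": "no",  "approved": "no"},
]

root = best_attribute(loans, ["employed", "owns_home"], "approved")
```

In this toy data set, `employed` separates the outcomes completely while `owns_home` tells nothing about them, so an ID3-type algorithm would place `employed` at the root of the tree.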

It was mentioned at the beginning of this section that there are basically two ways to combine GAs and expert systems into one system. The first approach uses the expert system to generate initial solutions, taking advantage of any domain knowledge. Remember that GAs are weak methods; no knowledge of the problem is required. You can increase the chances of finding a satisfactory solution by using what domain knowledge you have to generate the initial population of genotypes for the GA. Typically, a starting point is created by random population generation. A much better starting point is a population where some domain knowledge was used to create at least some feasible solutions. Unless you check, generating a random population of solutions does not guarantee that any of the starting points are good or even valid.

It is also important to ensure that the fitness function used by the GA agrees with the derived initial population. In other words, you do not want to take the time to create an initial population that the fitness function evaluates as poor. For example, if I were trying to schedule a list of classes for students, and I were concerned about student satisfaction, then I could obtain knowledge about their class preferences (times preferred, etc.). I could then encode this knowledge into an expert system and use the results as starting points for the GA to optimize. This can only be taken so far, since too many constraints could be generated when collecting this knowledge (e.g., nobody wants to take a morning class).

Note that you are not constraining the solution space with the derived initial population. There is nothing to stop the GA from exploring the entire search space, especially if the fitness function evaluates other solutions as much better. With the derived initial population, you are hopefully providing a good starting point.
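As a rough illustration of seeding a GA population with domain knowledge, the sketch below mixes knowledge-derived schedules with purely random ones for the class-scheduling example. Everything in it is an assumption made for illustration: the slot and class names, the `PREFERENCES` table, and the function `rule_based_schedule`, which is only a stand-in for a real expert system.

```python
import random

SLOTS = ["morning", "afternoon", "evening"]       # hypothetical time slots
CLASSES = ["math", "physics", "history", "art"]   # hypothetical classes

# Hypothetical survey knowledge: share of students preferring each slot.
PREFERENCES = {
    "math":    {"morning": 0.1, "afternoon": 0.6, "evening": 0.3},
    "physics": {"morning": 0.2, "afternoon": 0.5, "evening": 0.3},
    "history": {"morning": 0.1, "afternoon": 0.3, "evening": 0.6},
    "art":     {"morning": 0.3, "afternoon": 0.2, "evening": 0.5},
}

def fitness(schedule):
    """Student satisfaction: sum of preference shares for the chosen slots."""
    return sum(PREFERENCES[c][s] for c, s in zip(CLASSES, schedule))

def rule_based_schedule(rng):
    """Stand-in for the expert system: for each class, pick one of its two
    most preferred slots, so seeded individuals stay varied."""
    schedule = []
    for c in CLASSES:
        ranked = sorted(SLOTS, key=lambda s: PREFERENCES[c][s], reverse=True)
        schedule.append(rng.choice(ranked[:2]))
    return schedule

def random_schedule(rng):
    """Conventional random genotype: any slot for any class."""
    return [rng.choice(SLOTS) for _ in CLASSES]

def initial_population(size, seeded_fraction, rng):
    """Seeded individuals first, then random ones to keep the search open."""
    n_seeded = int(size * seeded_fraction)
    seeded = [rule_based_schedule(rng) for _ in range(n_seeded)]
    randoms = [random_schedule(rng) for _ in range(size - n_seeded)]
    return seeded + randoms

population = initial_population(10, 0.5, random.Random(0))
```

The seeded individuals and the fitness function draw on the same preference knowledge, so the derived starting points are ones the fitness function rates well. The random half of the population remains unconstrained, which keeps the rest of the search space reachable.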


TABLE 3
ID3 and GA Comparison

Approach     Rules extracted   Attributes extracted   Classification accuracy
ID3          5                 3                      73%
GA hybrid    11                6                      78%

Another approach to combining the two technologies is to use GAs to refine the parameters of an expert system, such as certainty factors for rules or membership functions for fuzzy expert systems. This approach was used in one of the previously mentioned applications (Ghisleni et al., 1996). The greatest concern here is time. A basic approach is to set up the GA to generate a population of solutions, with each solution containing a set of parameters. These parameters are then tested by inserting them into the expert system, running the expert system, and checking the expert system's output. In this approach, the expert system acts as the fitness function, since to find the best solution you must evaluate each one using the expert system. This, of course, can take a great deal of time and raises other considerations, such as how many test cases to use with the expert system.

A similar approach can be used for fuzzy expert systems. With fuzzy systems, though, it is the membership functions that are being refined and not certainty factors. The fuzzy expert system is still used as the fitness function. After evaluating the population, you continue with the evolutionary procedure: selection, search, and eventual termination.
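The loop described above, in which the expert system serves as the fitness function, might look like the following minimal sketch. It is not the method of Ghisleni et al. The three-rule toy system, the MYCIN-style combination formula, the test cases, and the GA settings (tournament selection, uniform crossover, Gaussian mutation, elitism) are all assumptions chosen for illustration.

```python
import random

# Hypothetical test cases: which of three rules fire, and the confidence
# an expert would assign to the conclusion.
TEST_CASES = [
    ((1, 0, 0), 0.60),
    ((0, 1, 0), 0.30),
    ((0, 0, 1), 0.80),
    ((1, 1, 0), 0.72),
    ((1, 0, 1), 0.92),
]

def run_expert_system(cfs, fired):
    """Toy expert system: combine the certainty factors of the fired rules
    with the MYCIN-style formula cf = cf1 + cf2 * (1 - cf1)."""
    belief = 0.0
    for cf, on in zip(cfs, fired):
        if on:
            belief = belief + cf * (1.0 - belief)
    return belief

def fitness(cfs):
    """Negative squared error over all test cases. Each evaluation runs
    the expert system once per test case, which is where the time goes."""
    return -sum((run_expert_system(cfs, f) - t) ** 2 for f, t in TEST_CASES)

def evolve(pop_size=30, generations=60, mutation_sd=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best individual forward unchanged.
        next_gen = [max(population, key=fitness)[:]]
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(population, 3), key=fitness)
            b = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover, then Gaussian mutation clipped to [0, 1].
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation_sd)))
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

best_cfs = evolve()
```

With these settings, every generation costs roughly `pop_size` times the number of test cases in expert system runs, plus the extra evaluations made during selection. That is negligible for this toy system but can dominate the run time when each expert system run is expensive.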


Copyright (c) 1996-1999 EarthWeb, Inc.. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Please read our privacy policy for details.