Genetic Programming System

Last modified: "March 25, 2007 by matt"

In this assignment you will implement a simple genetic programming system and use it to solve a symbolic regression problem involving a single variable. The project consists of several steps.


1. Implement an expression tree class, ExpressionTree. The leaves of the trees are either the variable, X, or a constant (a double between -10 and +10). The interior nodes are arithmetic operators {+, -, *, %, power(z, e), abs(z), sin(z)}, where '%' is the "protected division" operator. The eval(double x) method should return the value of the tree, given that the variable has the given variable. Create a constructor taking a single variable, int maxHeight, indicating the maximum height of the tree, that generates a random expression tree. You'll want a toString method so that you can print your expression trees for debugging and output purposes.

2. After testing ExpressionTree to make sure the evaluation is working properly, add the method crossover(ExpressionTree mate) that returns an array of two new ExpressionTree, formed by the single-point crossover method. Then add a method mutate(ExpressionTree indiv) that returns a new ExpressionTree, formed by single point mutation. (You can use the constructor from the previous paragraph to partially implement this.) After all this is tested, move on to the next stage.

3. Create the class Individual, which consists of an ExpressionTree and whatever bookeeping you think necessary (probably including a fitness measure(s).) Create the class Population, which consists of a list of Individuals and whatever bookeeping you think necessary. A DataSet class, which consists of a set of ordered pairs, representing observed x and f(x) values. A Fitness class, which supports the method double eval(Individual i, DataSet s) that returns the fitness of the given individual over the given DataSet.

4. The ExecuteGP class has an effectSingleRun(int populationSize, int maxGenerations, double mutationRate, double crossOverRate) method that effects a single run of the GP system, and returns Individual representing the best individual seen during the run. The Individual should contain its fitness value, as well. The mutationRate is that portion of each population that is created from the previous via mutation. The difference between 1.0 and the sum of mutationRate and crossOverRate represents that portion of each population that is created via direct copying from the previous.


To Hand In:

Submit a hardcopy of your code, and submit an electronic copy as well. Your hardcopy should be accompanied by a brief lab report. It should include: