
Introduction to Stats
EpochX provides the Stats system to gather data about how a run is progressing and to give
convenient access to that data, including the ability to output it. The center of the Stats
system in EpochX is the Stats class. Like the Life class, this class implements the singleton
pattern, so there is only ever one Stats instance, which is obtainable with a call to the
static method Stats.get().
Raw data generated by the evolutionary algorithm, such as the population at the end of a
generation, is inserted into the Stats instance. Other components can then get a copy of the
data with a call to one of the class's getStat methods. These methods take an instance of Stat,
which is used to look up whether any data has been stored against that same Stat instance. For
this to work, the request for the data must use the same Stat instance that was used to put the
data in, so these instances are made available as public fields throughout the API.
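The lookup-by-instance behaviour described above can be illustrated with a small stand-alone sketch. It does not use the EpochX classes: SketchStat, SketchStats and IdentityLookupDemo are hypothetical stand-ins, and the choice of an identity-based map is an assumption made to show why a shared public field matters.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityLookupDemo {
    // Shared as a constant so producers and consumers use the same instance.
    public static final SketchStat GEN_NUMBER = new SketchStat("generation number");

    public static void main(String[] args) {
        SketchStats.get().addData(GEN_NUMBER, 7);

        // Same instance: the stored value is found.
        System.out.println(SketchStats.get().getStat(GEN_NUMBER)); // prints 7

        // Equal-looking but different instance: nothing is found.
        System.out.println(SketchStats.get().getStat(new SketchStat("generation number"))); // prints null
    }
}

/** Hypothetical key type standing in for Stat; it is not the EpochX interface. */
final class SketchStat {
    private final String label;

    SketchStat(String label) {
        this.label = label;
    }

    @Override
    public String toString() {
        return label;
    }
}

/** Hypothetical singleton store that looks values up by key identity (an assumption). */
final class SketchStats {
    private static final SketchStats INSTANCE = new SketchStats();

    // IdentityHashMap compares keys with ==, so only the very same instance matches.
    private final Map<SketchStat, Object> data = new IdentityHashMap<>();

    private SketchStats() {
    }

    static SketchStats get() {
        return INSTANCE;
    }

    void addData(SketchStat stat, Object value) {
        data.put(stat, value);
    }

    Object getStat(SketchStat stat) {
        return data.get(stat);
    }
}
```

The second lookup misses because a fresh key object was created, which is the situation that sharing the Stat instances as public fields avoids.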
Dynamic generation
What if we want to provide access to the average fitness of a generation, or the average depth
of the programs in a breeding pool? This kind of data is costly to calculate and store, so
generating all these extra statistics without knowing whether they are even needed would carry
a big performance hit. Fortunately, we have a solution. The Stat interface defines the method
getStatValue(). If the value for a requested Stat is not already held in memory, this method is
called, giving the Stat instance itself the opportunity to generate a value. It is quite common
for dependency chains to develop, where generating one Stat requests another that must itself
be generated, which in turn requires another, and so on. As long as there is no cycle in the
dependencies, this works perfectly, and every intermediate Stat value is stored for future
requests, so nothing is calculated more than once.
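As a rough illustration of this generate-on-demand behaviour, here is a minimal self-contained sketch. LazyStat, LazyStore and the three constants are hypothetical names invented for the example; only the method name getStatValue is taken from the text, and the real EpochX implementation may differ.

```java
import java.util.HashMap;
import java.util.Map;

class LazyStatDemo {
    static final LazyStore STORE = new LazyStore();

    // Raw data: never generated, only inserted with addData.
    static final LazyStat FITNESSES = () -> null;

    // Derived from FITNESSES.
    static final LazyStat FITNESS_AVE = () -> {
        double[] fitnesses = (double[]) STORE.getStat(FITNESSES);
        double sum = 0;
        for (double f : fitnesses) {
            sum += f;
        }
        return sum / fitnesses.length;
    };

    // Derived from both stats above, forming a short dependency chain.
    static final LazyStat NO_ABOVE_AVE = () -> {
        double ave = (Double) STORE.getStat(FITNESS_AVE);
        int count = 0;
        for (double f : (double[]) STORE.getStat(FITNESSES)) {
            if (f > ave) {
                count++;
            }
        }
        return count;
    };

    public static void main(String[] args) {
        STORE.addData(FITNESSES, new double[] {1.0, 2.0, 3.0, 6.0});
        System.out.println(STORE.getStat(NO_ABOVE_AVE)); // prints 1
        System.out.println(STORE.getStat(NO_ABOVE_AVE)); // prints 1 (served from the store)
        System.out.println(STORE.generatedCount());      // prints 2
    }
}

/** Hypothetical stand-in for the Stat interface. */
interface LazyStat {
    Object getStatValue();
}

/** Hypothetical store that generates a missing value once and then keeps it. */
final class LazyStore {
    private final Map<LazyStat, Object> values = new HashMap<>();
    private int generated = 0;

    void addData(LazyStat stat, Object value) {
        values.put(stat, value);
    }

    Object getStat(LazyStat stat) {
        Object value = values.get(stat);
        if (value == null) {
            // Plain get/put rather than computeIfAbsent, because generating a
            // value may call getStat again and modify the map.
            value = stat.getStatValue();
            if (value != null) {
                generated++;
                values.put(stat, value);
            }
        }
        return value;
    }

    int generatedCount() {
        return generated;
    }
}
```

Only two values are ever generated here (the average and the count), even though the count is requested twice.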
Expiry
The other method that all Stat implementations must implement is getExpiryEvent(), which
returns a value from the Stats.ExpiryEvent enum. Possible values include:
ExpiryEvent.RUN
ExpiryEvent.GENERATION
ExpiryEvent.INITIALISATION
ExpiryEvent.ELITISM
ExpiryEvent.POOL_SELECTION
ExpiryEvent.CROSSOVER
ExpiryEvent.MUTATION
ExpiryEvent.REPRODUCTION
The ExpiryEvent that a Stat returns designates the event with which any values for that Stat
are associated. In practice, the effect is that the stats system only stores the value for that
Stat until the start of the next event of that type. For example, all GENERATION Stat values
will be cleared at the start of the following generation. This means there is a specific window
during which the values are accessible, after which it is assumed that any interested parties
have extracted and processed the stats they need, storing them elsewhere if necessary. The
simplest and most common way of doing this is to output the values, as described in the
following section.
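The expiry window can be pictured with the following stand-alone sketch. Expiry, ExpiringStat and ExpiringStore are hypothetical types written for this example (the enum reuses three of the names listed above), and the sketch assumes that the start of an event clears only the values registered against that same event type.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

class ExpiryDemo {
    static final ExpiringStat RUN_NUMBER = new ExpiringStat(Expiry.RUN);
    static final ExpiringStat GEN_NUMBER = new ExpiringStat(Expiry.GENERATION);

    public static void main(String[] args) {
        ExpiringStore store = new ExpiringStore();
        store.addData(RUN_NUMBER, 1);
        store.addData(GEN_NUMBER, 5);

        // A new generation starts: GENERATION values are dropped, RUN values stay.
        store.onEventStart(Expiry.GENERATION);

        System.out.println(store.getStat(GEN_NUMBER)); // prints null
        System.out.println(store.getStat(RUN_NUMBER)); // prints 1
    }
}

/** Hypothetical subset of the expiry events. */
enum Expiry { RUN, GENERATION, CROSSOVER }

/** Hypothetical stat that only knows which event its values expire on. */
final class ExpiringStat {
    private final Expiry expiry;

    ExpiringStat(Expiry expiry) {
        this.expiry = expiry;
    }

    Expiry getExpiryEvent() {
        return expiry;
    }
}

/** Hypothetical store that groups values by the event they expire on. */
final class ExpiringStore {
    private final Map<Expiry, Map<ExpiringStat, Object>> byEvent = new EnumMap<>(Expiry.class);

    void addData(ExpiringStat stat, Object value) {
        byEvent.computeIfAbsent(stat.getExpiryEvent(), e -> new HashMap<>()).put(stat, value);
    }

    Object getStat(ExpiringStat stat) {
        Map<ExpiringStat, Object> values = byEvent.get(stat.getExpiryEvent());
        return values == null ? null : values.get(stat);
    }

    // Called at the start of an event: everything tied to that event type is cleared.
    void onEventStart(Expiry event) {
        byEvent.remove(event);
    }
}
```

Grouping the values by event makes clearing a whole window a single map removal.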
Generating output
The most common use of the Stats system is to generate output about your runs. A number of
methods are provided on the Stats class to help with this, with the following signatures:
print(Stat ... fields)
print(String separator, Stat ... fields)
printToStream(OutputStream out, Stat ... fields)
printToStream(OutputStream out, String separator, Stat ... fields)
The two print methods are conveniences that print to the System.out OutputStream. Calling any
of these methods retrieves the specified stat values and prints them to the designated output
stream, with each value separated by the String given as the separator parameter.
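To make the relationship between the print and printToStream variants concrete, here is a small sketch of the same shape. PrintSketch is a hypothetical class, the stats are held in an ordinary Map with String keys standing in for Stat instances, and ending each line with a newline is an assumption rather than documented behaviour.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

class PrintDemo {
    public static void main(String[] args) {
        Map<String, Object> stats = new HashMap<>();
        stats.put("gen", 3);
        stats.put("best", 0.5);

        // Convenience form: writes to System.out.
        PrintSketch.print(stats, ", ", "gen", "best"); // prints "3, 0.5"

        // Any OutputStream works, for example an in-memory buffer.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintSketch.printToStream(buffer, stats, "\t", "gen", "best");
        System.out.print(buffer);
    }
}

/** Hypothetical helper; String keys stand in for Stat instances. */
final class PrintSketch {
    private PrintSketch() {
    }

    // Delegates to printToStream with System.out as the target.
    static void print(Map<String, Object> stats, String separator, String... fields) {
        printToStream(System.out, stats, separator, fields);
    }

    static void printToStream(OutputStream out, Map<String, Object> stats,
                              String separator, String... fields) {
        StringJoiner line = new StringJoiner(separator);
        for (String field : fields) {
            // A field with no stored value is written as "null".
            line.add(String.valueOf(stats.get(field)));
        }
        PrintStream printer = new PrintStream(out, true);
        printer.print(line + "\n"); // assumption: one line of output per call
        printer.flush();
    }
}
```

Having the console variant delegate to the stream variant keeps the formatting logic in one place.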
Given that the life of statistics data is short, where should these print calls be made? The
right time is almost always upon an 'end' event, such as onCrossoverEnd or onGenerationEnd. At
this point, all the data for that operation will be in the Stats manager, and it is guaranteed
that none of it will have been cleared yet. There is, however, no guarantee that stats
associated with other events that are still in progress (such as the run or the generation)
will be available yet. In general, though, the framework tries to put the raw data into the
Stats instance as soon as it is available, which means many stats will be usable long before
the end event. For example, it is completely safe to request the generation number
(StatField.GEN_NUMBER) during an onCrossoverEnd event.
Enough words. Here is a useful idiom you might like to use:
Life.get().addGenerationListener(new GenerationAdapter() {
    public void onGenerationEnd() {
        Stats.get().print(...);
    }
});
This will print statistics to the console at the end of each generation. The same idea works for printing statistics at the end of each run, crossover, elitism and so on.
Creating a new Stat
If you are implementing a new operator or other extension, you may find that you have new data
that you would like to make available through the stats system. The easiest way to do this is
to create a public field that refers to an anonymous class extending AbstractStat. AbstractStat
implements the Stat interface and provides defaults. In the simplest case your Stat might look
like this:
public static final Stat MY_NEW_STAT = new AbstractStat(ExpiryEvent.GENERATION) {};
All you then need to do is add the raw data to the stats manager using the addData method on
the Stats instance, making sure to use your MY_NEW_STAT as the Stat. You can then retrieve or
print this data in the same way as any other stat, by referencing it with this Stat instance.
The data will be cleared upon the next occurrence of the expiry event, in this case at the
start of the next generation.
Where it makes sense to do so, you should consider overriding AbstractStat's getStatValue
method so the statistic is only calculated if required.
Here is a simple example:
/**
 * Returns an Integer, which is the number of programs with an above average fitness
 * in the generation.
 */
public static final Stat NO_ABOVE_AVE = new AbstractStat(ExpiryEvent.GENERATION) {
    @Override
    public Object getStatValue() {
        double ave = (Double) Stats.get().getStat(GEN_FITNESS_AVE);
        double[] fitnesses = (double[]) Stats.get().getStat(GEN_FITNESSES);

        int noAboveAve = 0;
        for (double fit : fitnesses) {
            if (fit > ave) {
                noAboveAve++;
            }
        }
        return noAboveAve;
    }
};
It obtains the average fitness and an array of all fitnesses, then counts how many of the program fitnesses are greater than the average, returning the result as the value. Notice how it depends upon two other stats.