Greed is bad! (for GAs)

eudamonia · Postby **eudamonia** » Mon Jun 19, 2006 8:21 pm

I know that there are a number of other GA folks out there so here's my beef.

How do I know when my GA is being Greedy (i.e. when it gets rewarded for getting stuck at local optima)?

Here's an example. I utilize the Grail GGO most of the time for my GA needs and it works really well I feel for about 10 parameters. But I like to push things a bit. Specifically, if I've got a set of entries that I like, I put together a host of typical exit methods and optimize them with the GGO in a switchable fashion (with stress testing off for the switches). Its sort of like using the CASB but only for exits.

This weekend I ran a set of tests for an entry system with about 5 parameters combined with the exit system which had about 35. So a total of 40 parameters. I ran the GGO optimization overnight and got the best NP and MaxDD of say about $20K/10k for a 3 year period. In frustration I manually optimized the system myself with just a couple of exit options I know work and came out with superior NP and MaxDD of about $40K/5k. Nothing to write home about. Then I setup a new run with the original 5 entry parameters about 7 exit parameters I felt were especially relevant to this market and timeframe (including switches). This ended up working great giving me a system with about $60k/3k (and this is for long positions only). So what's the problem?

How do I know my GA is doing its job (i.e. hunting efficiently for global optima)? I know it already succesfully avoids curve fitting (that's what GAs are supposed to do). And I know when I've curve fit anyway (poor OOS results). How do I know when my GA is getting stuck at a local optima? Do I need to run more iterations? Change the population or mutation rates? Are there any mathmatical guidelines?

If anyone has any materials or thoughts on how to quantitatively define and resolve these issues, please let me know. Thanks.

Edward

michal.kreslik · Postby **michal.kreslik** » Mon Jun 19, 2006 9:30 pm

Ed,

I've got a simple tool - a table in excel - and I put all the parameter ranges that are to be optimized into the sheet prior to running the search.

I simply calculate the total number of combinations by multiplying the total count of possible values within the individual parameter ranges. (I've also suggested this calculation to be done automatically to Wouter and he replied it's a good idea and simple to implement, but unfortunately it was not implemented after all

Then if the total number of combinations starts resembling the US budget deficit, prepare for the troubles.

The GA strategic parameters that may attenuate the impact of such a licentious set of input parameters/ranges are the mutation pobability and the population size.

The core problem is in pretending the GA is the Holy Grail to which one will dump all the garbage found along the way and will get a diamond ring.

This doesn't work like that. True, GA help us to arrive at a solution WAY faster, but if you present it with a too vast (and let's admit uselessly vast) search space, it will either

need much more time accordingly, or
scamp the task

There are no free lunches. GA is like a Lamborghini compared to the old rusty pickup of exhaustive optimization. It can go very fast. But there are limits as well.

[align=center][/align]

Attached: simple excel calculation of the total combinations count

eudamonia · Postby **eudamonia** » Mon Jun 19, 2006 10:07 pm

Michal,

I like the idea of examining how many combinations I'm searching. And my combinations definately look like the U.S. deficit. Thanks for the spreadsheet.

I guess my question is how do I use mutation probability and population size in correlation to the number of combinations I want to look at? I.e.

if combinations > 1,000,000 then set pop size to 50, mutation to 2
> 10,000,000 then set pop size to 60, mutation to 3
>100,000,000 then set pop size to 70, mutation to 4 etc,

Wouter suggests not changing the population or mutations for any combination of parameters less than 100. However, that would have to assume pretty narrow ranges for those 100 parameters I would guess.

BTW I agree the Grail is no Grail and garbage in is garbage out. I'm merely attempting to replicate the function of the CASB for exits only. Unfortunately the CASB does not support the type of entry logic I'm using or I would simply use that (with a few modifications). So in essense, I accept that I will need to do a "development and prototype" phase where my GA run will not be totally optimal. I just need it to perform better than abysmal!

michal.kreslik · Postby **michal.kreslik** » Tue Jun 20, 2006 10:38 am

First we must come to realize what are these strategic parameters, as they are called in GA, intended to do.

The population size is exactly what it appears to be - the number of individuals, or candidate solutions in our context, that exist in every generation of the entire population.

This population size might be defined as "the more, the better" since the bigger the population size, the more candidate solutions to choose from for breeding to spawn the next, hopefully better fit generation.

Obviously, we are putting restriction on population size not because we want less individuals for system purposes, but because it's faster with less individuals in every generation (the same as with the number of generations total).

The trouble is that if the search space is too vast, the limited number of individuals in each generation won't be enough to catch the entire possible variability of the search space.

Allegory

You may picture this as if you wanted to deploy soldiers to the battle field. Let's say you wanted to deploy them into perfectly square battlefield so that they will see each other and may communicate with each other by gestures:

Exhaustive optimization would place these soldiers uselessly close to each other, like in the above picture. Also, there will be no vacant places in the grid in exhaustive optimization. This way, the total number of soldiers (candidate solutions) will cover only a very small battlefield (very small search space, like two parameters each in a range of 1 to 5).

The genetic optimization, on the other hand, deploys the same soldiers like this:

You see, we have covered much larger battelfield (search space) with the same number of soldiers (individual tests or candidate solutions).

The soldiers can still see each other and communicate with each other by gestures.

The population size x number of generations in this respect is the total number of soldiers available.

The mutation probability is the probability that the soldier will change his place in the grid in favor of another vacant place in the grid. The soldiers must be allowed to do this in case the random space between the two soldiers is too large.

Now how can you handle a very large battelfield, like the one 1000x larger than the one presented in the illustrations above?

You can only do it

by increasing the total number of soldiers, thus increasing the number of generations and the population size (although the population size is more important)
or by increasing the mutation probability so that you are hoping for some soldiers to leave their original place and come to a new one (but be careful with mutation prob, there are number of side effects). Thus, you hope to cover the battelfield better. If you set the mutation probability to 0, then no soldier will ever change his origianlly designated place so no one will ever fight the enemy in the space between the two soldiers.

This is a very simplistit example - there are lots of other aspects not covered by this allegory, but I think it's a vivid one.

The core idea here is that even with GA, you have a limited number of soldiers (individual tests), so very large battlefields can't be covered at all.

GA is not omnipotent, just much better than the exhaustive methods.

Michal

eudamonia · Postby **eudamonia** » Tue Jun 20, 2006 3:26 pm

Michal,

That's very helpful, thank you. What types of side effects can you encounter when the mutation rate is too high (and how much is too high)?

Edward

eudamonia · Postby **eudamonia** » Tue Jun 20, 2006 5:11 pm

Michal,

Have you seen this journal article from IlliGAL (Illinois GA lab)?

ftp://ftp-illigal.ge.uiuc.edu/pub/paper ... 005017.pdf

The article discusses population sizing adjustments on the fly and also discusses various ways to determine appropriate population size from a set of parameters based on a certain "building-block" range. This is great stuff!

Edward

michal.kreslik · Postby **michal.kreslik** » Tue Jun 20, 2006 7:48 pm

Thanks, Ed, but the above link is not working for me. Could you please update? Thanks

eudamonia · Postby **eudamonia** » Tue Jun 20, 2006 8:27 pm

Michal,

Its a .pdf and should be working, but if not here is the page its listed on.
http://www-illigal.ge.uiuc.edu/techreps.php3. Look for the following document:

Yu, T.-L., Sastry, K., Goldberg, D.E (2005)
Online Population Size Adjusting Using Noise and Substructural Measurements

Obviously the IlliGAL website has lots of other relevant publications to the GA arena as well.

Htarlov · Postby **Htarlov** » Mon Dec 15, 2008 10:16 pm

I had pretty nice effects when I've used GA with variable frequency and size of mutations. So it started with high level of mutations than it lowered it in time, then it went to some local optimum and then make mutation level higher to go out of the hole and move to other optimum. It was also non-elite algorithm (it remembered best solutions on side, but not kept them - there was probability of "killing" actual best solution).
Mutation rates was high-lower-low-higher-high-lower-low... etc.

But there are systems where GA won't work - because of too "chaotic" or "noisy" goal function. I think it's better to be aware of them - they are probably brittle systems, not stable in time nor in parameters.

Also there is "tabu search" algorithm that is pretty good at going out of local optimum. But it works only for phase spaces based on something discrete (for example permutations). Not good and nearly unimplementable for continous or quasi-contious values of parameters, so not very useful for trading system optimisation.

nickgurtsu · Postby **nickgurtsu** » Mon Jul 13, 2009 3:06 am

eudamonia wrote:I know that there are a number of other GA folks out there so here's my beef.

How do I know when my GA is being Greedy (i.e. when it gets rewarded for getting stuck at local optima)?

Here's an example. I utilize the Grail GGO most of the time for my GA needs and it works really well I feel for about 10 parameters. But I like to push things a bit. Specifically, if I've got a set of entries that I like, I put together a host of typical exit methods and optimize them with the GGO in a switchable fashion (with stress testing off for the switches). Its sort of like using the CASB but only for exits.

This weekend I ran a set of tests for an entry system with about 5 parameters combined with the exit system which had about 35. So a total of 40 parameters. I ran the GGO optimization overnight and got the best NP and MaxDD of say about $20K/10k for a 3 year period. In frustration I manually optimized the system myself with just a couple of exit options I know work and came out with superior NP and MaxDD of about $40K/5k. Nothing to write home about. Then I setup a new run with the original 5 entry parameters about 7 exit parameters I felt were especially relevant to this market and timeframe (including switches). This ended up working great giving me a system with about $60k/3k (and this is for long positions only). So what's the problem?

How do I know my GA is doing its job (i.e. hunting efficiently for global optima)? I know it already succesfully avoids curve fitting (that's what GAs are supposed to do). And I know when I've curve fit anyway (poor OOS results). How do I know when my GA is getting stuck at a local optima? Do I need to run more iterations? Change the population or mutation rates? Are there any mathmatical guidelines?

If anyone has any materials or thoughts on how to quantitatively define and resolve these issues, please let me know. Thanks.

Edward

Good OOS results ( the equity curve is similar to the one in sample) don't guarantee future performance

I tested a dozen systems that gave me great equity curves in and out of sample
I always hold 3 - 6 months of data just for manual testing which is not included in the GA's out of sample
when I tested so called robust systems on my hidden data only half of them performed well and the rest fell appart

My conclusion is no matter how well optimized the system is and how good is the robustness index the matter whether the system is going to work or not in the future is a toss of a coin

ohh and the fitness function should not include ANY property of the out of sample data AT ALL !!!!!!!!!!!!!!
it's easy to write a function that will fit the system so the equity curve is the same in and out of sample
those systems die right away

the out of sample equity curve needs to be left to a "chance"
when the system is optimized and the top 10 - 15 ranked parameters give you good equity curve out of sample the chance that the system is going to work in the future is good

again this needs to be tested on some hidden data that was no included in the out of sample

these are my experiences

Kreslik.com - traders community since 2006