## Genetic Algorithm Questions

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

Paragon99
Posts: 16
Joined: Thu Sep 20, 2012 4:10 am UTC

### Genetic Algorithm Questions

I found a site with an article on genetic algorithms, so I thought I would try it out... I have two classes that seem to work as the article describes, but there are a couple of kinks still to be worked out.
Part of the issue may be that the nature of the problem is not particularly well suited to genetic algorithms.

The idea is that you create a population of objects, each with a chromosome of 9 genes. Each gene represents one of 0,1,2,3,4,5,6,7,8,9,+,-,* or /. Then, without worrying about order of operations and discarding nonsensical genes, the object applies the operators to the numbers to get a value... so
8*2+*13-3 => 8*2+1-3=14

Then, given a target number, the fitness is determined by 1/Abs(target - value)... if target == value, the solution is found and fitness is not calculated.
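For concreteness, here is a minimal Python sketch of the setup described; the exact rule for skipping nonsensical genes (alternate between expecting a digit and expecting an operator, dropping everything else) and the infinite fitness for an exact hit are assumptions about the implementation, and the helper names are made up:

```python
def decode(chromosome):
    """Scan the gene string left to right, keeping only genes that fit a
    number-operator-number-... pattern; everything else is skipped."""
    expect_number = True
    tokens = []
    for g in chromosome:
        if expect_number and g.isdigit():
            tokens.append(int(g))
            expect_number = False
        elif not expect_number and g in "+-*/":
            tokens.append(g)
            expect_number = True
    if tokens and isinstance(tokens[-1], str):
        tokens.pop()  # drop a trailing operator with no operand
    return tokens

def evaluate(tokens):
    """Apply the operators strictly left to right, ignoring precedence."""
    if not tokens:
        return 0
    value = tokens[0]
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        if op == "+":
            value += operand
        elif op == "-":
            value -= operand
        elif op == "*":
            value *= operand
        elif op == "/" and operand != 0:  # skip division by zero
            value /= operand
    return value

def fitness(chromosome, target):
    value = evaluate(decode(chromosome))
    if value == target:
        return float("inf")  # exact solution found
    return 1 / abs(target - value)
```

With these definitions, `decode("8*2+*13-3")` drops the stray `*` and `3` and evaluates to 14, and `fitness("9*9", 80)` comes out to exactly 1.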

The problem is that it seems that either a solution is found almost immediately (<10 generations) -or- the solution is not found for >10000 generations.

I noticed that eventually the entire population converges to the same nearly-correct but not quite accurate chromosome, and further generations are just replications of the previous generation with minor mutations that end up not propagating to the next one...
For example, assume the target number is 80. At some point a chromosome such as 9*9 is generated. The fitness function gives this chromosome a fitness of 1/Abs(80 - 81) = 1, which is pretty good, so it is selected more and more often with each iteration as more of the population uses it as one or both parent chromosomes. But from this chromosome a very specific mutation must occur, namely 9*9-1, so thousands of generations pass without that specific mutation...

One solution I found was that after a set number of generations I would double the population with new chromosomes... this works to an extent, but two issues crop up with this approach: 1) everything still ends up converging to the original chromosome (as the new ones generally have lower fitness), and 2) it takes twice as long to run each generation...

Is this just due to the nature of the specific problem/fitness function?

I suppose in reality you would use a genetic algorithm in a situation where there were outside variables and/or some level of randomness, such that the same chromosome would be unlikely to continually receive the same fitness level...

Xanthir
My HERO!!!
Posts: 5281
Joined: Tue Feb 20, 2007 12:49 am UTC
Contact:

### Re: Genetic Algorithm Questions

First, your algorithm is actually doing quite well if it finds 9*9 when the target is 80, as that's very nearly the correct answer! GAs are bad at finding absolute optima, but good at finding "good" solutions. A fitness of 1 (under your model) is objectively great, and you shouldn't blame the algorithm for really liking that solution.

Second, your problem space is very spiky. Small changes in the genome can cause *drastic* shifts in fitness. This is bad for GAs - they're hill-climbers, and depend on there actually being "hills" to climb when optimizing. If finding the right solution isn't a matter of landing near and then inching upwards, but rather hitting an exact target surrounded by a large low-fitness area, it won't do well; you'll either find the answer more or less by accident, or not at all.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

Jplus
Posts: 1709
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

### Re: Genetic Algorithm Questions

In addition to what Xanthir said, I think your evaluation function is too spiky over the space of possible genomes. You could just use 100 - abs(target - value) instead.
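A quick sketch of the difference between the two scorings (the function names are illustrative, not from any library):

```python
def spiky_fitness(value, target):
    """The original 1/abs scoring: huge jumps near the target."""
    return 1 / abs(target - value) if value != target else float("inf")

def gentle_fitness(value, target):
    """Linear alternative: each step closer is worth the same amount."""
    return 100 - abs(target - value)

# Moving one step closer (distance 2 -> 1) doubles the spiky score
# (0.5 -> 1.0) but only nudges the linear one (98 -> 99), so selection
# pressure stays even across the whole search space.
```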

In addition there are a few more variations on the genetic algorithm that you could try:
• Use sexual reproduction: mate fit individuals randomly and have their genomes cross over at random positions to determine the sequence of each child. In addition to the random mutations, of course.
• Alternate between different evaluation functions. For example, you could also select for genomes that neatly alternate between operators and digits, or for genomes that contain more different digits, to create genomes that are more sensible overall.
• Instead of making the number of offspring per individual depend on its fitness, you can also just divide the population into a relatively good part and a relatively bad part. The individuals in the bad part do not get any offspring. The individuals in the good part all get the same number of offspring. In my experience this works better.
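A rough Python sketch combining the first and third suggestions (single-point crossover plus a good-half/bad-half split); the 5% mutation rate, the 50% survivor fraction, and the function names are illustrative assumptions:

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: the child takes a prefix of one parent
    and the suffix of the other, cut at a random position."""
    point = random.randint(1, min(len(parent_a), len(parent_b)) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rate=0.05, genes="0123456789+-*/"):
    """Replace each gene with a random one with probability `rate`."""
    return "".join(random.choice(genes) if random.random() < rate else g
                   for g in chromosome)

def next_generation(population, fitness_fn, keep_fraction=0.5):
    """Rank the population, drop the bad half, and refill by mating
    random pairs from the surviving good half. Every survivor gets the
    same chance at offspring, rather than fitness-proportional counts."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    survivors = ranked[:max(2, int(len(ranked) * keep_fraction))]
    children = [mutate(crossover(*random.sample(survivors, 2)))
                for _ in range(len(population) - len(survivors))]
    return survivors + children
```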

Finally, population size is important. The larger your population, the less your evolution will suffer from random effects of the initial composition of your population.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

freakish777
Posts: 354
Joined: Wed Jul 13, 2011 2:14 pm UTC

### Re: Genetic Algorithm Questions

Sort of a side note. Assume you have a problem space with lots of hills, which is how GAs are typically presented to first-timers so people can visualize them: given x and y coordinates, you're finding a solution that maximizes z, which also doubles as your fitness function.

Here are some things I've found helpful for not getting stuck on the wrong hill (not particularly good for problem spaces with a single hill, or for reaching the top of a hill once your population has found the right one):

1. Run multiple independent populations (so PopA explores Hill Alpha and PopB explores Hill Beta, hopefully). Don't have them reproduce with each other. Once the populations converge, the population whose best member is less fit is culled, and a new population of random members takes its place. Benefit: more exploration. Downside: a lot of extra work; PopB will explore Hill Alpha a lot anyway, and most of the time the new populations that replace the "losing" one will also lose. This one is really useful for problem spaces with only a couple of hills, and not much good with lots of hills.

2. Cull the bottom 20-50% of your population entirely and replace it with random members. Benefit: more exploration. Drawback 1: your GA will take longer to decide on an answer. One partial solution is to have the amount of culling decrease with the number of generations run (it will still take longer to decide, but you should avoid scenarios in which convergence doesn't happen at all). You can't avoid adding an O(n log n) sort to each iteration, other than by having culling happen only every 5 or 10 iterations instead of every one.

The reason to tune the culling down is that once your main population starts converging on a solution, the likelihood of a random member doing better is slim to none (but a random member is still likely to be better at finding the solution than your worst member). I highly recommend this over your solution of doubling your population size (at any point).

Drawback 2: you can potentially have population members near the base of the right hill, only to be led away to the wrong hill by a member that was culled and replaced by a random member on top of the wrong hill; this doesn't happen very often in my experience. The best way to think about this one: would you rather the bottom 20% of your population just move up the hill a little bit (after being bred with better members of the population), or be kicked off the hill and forced to try to find a different, better hill? Keep in mind that the overall fitness of the population is not as important as the fitness of a couple of members. The way your problem space is set up, there is definitely not just one hill. Target number = 80 has solutions of (or hills centered on) 8*10, 10*8, 9*9-1, 20+60, etc, etc, etc.
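A sketch of the cull-and-replace idea with the decaying rate and the every-N-generations cadence; the geometric decay schedule and the helper names are illustrative assumptions:

```python
import random

GENES = "0123456789+-*/"

def random_member(length=9):
    return "".join(random.choice(GENES) for _ in range(length))

def cull_and_replace(population, fitness_fn, generation,
                     start_rate=0.5, decay=0.99, every=5):
    """Every `every` generations, replace the weakest slice of the
    population with fresh random members. The culled fraction shrinks
    geometrically with the generation count, so exploration dominates
    early on but the search can still converge late."""
    if generation % every != 0:
        return population
    rate = start_rate * decay ** generation
    ranked = sorted(population, key=fitness_fn, reverse=True)
    n_cull = int(len(ranked) * rate)
    if n_cull == 0:
        return ranked
    return ranked[:-n_cull] + [random_member(len(ranked[0]))
                               for _ in range(n_cull)]
```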

More on topic (climbing the right hill better once you've found it):

1. Keep the top X members of your population (always less than 5%, preferably only a couple anyway). Don't breed them; just have them stay for another generation. Benefit: you always have a measuring stick of the best solution found so far. Drawback 1: you're basically taking them out of the equation as far as finding the solution for that iteration; they have to wait until another member surpasses them on the fitness function to be active in the search. Drawback 2: if you don't combine this with some other methodology, these members can make your population converge prematurely.
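A minimal sketch of that elitism scheme; `breed_fn` is a placeholder for whatever breeding step you already have, and the names are made up:

```python
def step_with_elitism(population, fitness_fn, breed_fn, n_elite=2):
    """Carry the n_elite fittest members into the next generation
    untouched; breed_fn(population, n) produces the remaining n new
    members from the old population however it likes."""
    elite = sorted(population, key=fitness_fn, reverse=True)[:n_elite]
    offspring = breed_fn(population, len(population) - n_elite)
    return elite + offspring
```

Because the elite are copied verbatim, the best fitness seen so far can never go backwards between generations, which is exactly the "measuring stick" property described above.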

2. In conjunction with the "keep the top X members" methodology, I've found that cranking up the mutation rate is beneficial the longer a single member retains the "Most Fit" position. So if you've got a member like 9*9 holding the "carry over from previous generation" position for several generations in a row, your population is becoming stagnant and has either found the right solution or needs some mutation injected into it. This isn't the best example, but imagine a hill like yours, only narrower, in a larger problem space:

If your "Most Fit" point is caught on one of the points ringing the real hill, you need some mutation to try to get your population to "jump" over to the other side of the valley. Once you've improved on your previous "Most Fit" position, you can dial the mutation back a bit.
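One possible sketch of tying the mutation rate to stagnation of the best member; the base rate, boost factor, and cap are arbitrary illustrative values:

```python
def adaptive_mutation_rate(best_history, base=0.02, boost=1.5, cap=0.5):
    """best_history is the best fitness seen at each generation. Count
    how many trailing generations show no improvement, then scale the
    mutation rate up geometrically with that count, capped so mutation
    never completely swamps selection."""
    stagnant = 0
    for i in range(len(best_history) - 1, 0, -1):
        if best_history[i] <= best_history[i - 1]:
            stagnant += 1
        else:
            break
    return min(cap, base * boost ** stagnant)
```

As soon as a new best appears, the trailing-stagnation count resets to zero and the rate drops back to its base value, which matches the "dial the mutation back" step above.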

Similarly, ensuring that your mutation function encompasses as many ways to mutate members of your population as possible is pretty important. For your particular problem (string-based, up to x characters), I see no issue with adding a random number of random characters to the end of a given population member's string. Obviously that should not be the only mutation your mutation function is capable of, but you need to think comprehensively about mutation.
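A sketch of a mutation function drawing on several operators, including the append-random-characters one just mentioned; the particular set of operators and the names are assumptions:

```python
import random

GENES = "0123456789+-*/"

def point_mutation(chrom):
    """Replace one gene at a random position."""
    i = random.randrange(len(chrom))
    return chrom[:i] + random.choice(GENES) + chrom[i + 1:]

def append_mutation(chrom, max_extra=3):
    """Grow the chromosome by a random number of random genes."""
    return chrom + "".join(random.choice(GENES)
                           for _ in range(random.randint(1, max_extra)))

def swap_mutation(chrom):
    """Exchange the genes at two random positions."""
    i, j = random.sample(range(len(chrom)), 2)
    lst = list(chrom)
    lst[i], lst[j] = lst[j], lst[i]
    return "".join(lst)

def mutate(chrom):
    """Pick one operator at random, so the population explores point
    changes, length growth, and reorderings rather than one kind of step."""
    return random.choice([point_mutation, append_mutation, swap_mutation])(chrom)
```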

phillip1882
Posts: 95
Joined: Fri Jun 14, 2013 9:11 pm UTC
Location: geogia
Contact:

### Re: Genetic Algorithm Questions

Writing a good fitness function is probably the hardest part of GAs.
1/abs(target - result) is very spiky, as others have pointed out: going from a distance of 2 to a distance of 1 jumps from 1/2 (50%) to 1 (100%).
A gentler function helps; alternatively, you can dynamically change the fitness once a certain overall fitness has been reached.
I also might recommend removing multiplication and division and treating multiple digits as concatenation,
then trying a fitness run, only adding multiplication and division later to see if parts can be expressed more simply.
For example, if you seek 80 and get 76+4, you might try a fitness run with only multiplication, to see if you can get either 2*2 = 4 or 19*4 = 76.
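A rough sketch of that first pass, with adjacent digits concatenating into multi-digit numbers and only + and - active as operators; the decoding details and names are assumptions:

```python
def decode_concat(chromosome):
    """Adjacent digits run together into one number; * and / genes are
    simply skipped in this pass, so only + and - act as operators."""
    tokens, current = [], ""
    for g in chromosome:
        if g.isdigit():
            current += g
        elif g in "+-" and current:
            tokens.append(int(current))
            tokens.append(g)
            current = ""
    if current:
        tokens.append(int(current))
    elif tokens:
        tokens.pop()  # drop a trailing operator with no operand
    return tokens

def evaluate_add_sub(tokens):
    """Left-to-right evaluation over + and - only."""
    if not tokens:
        return 0
    value = tokens[0]
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        value = value + operand if op == "+" else value - operand
    return value
```

Under this decoding, a chromosome like 76+4 hits the target 80 directly, which makes the addition-only first pass a much less spiky search.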

amos.ngwoke
Posts: 3
Joined: Thu May 08, 2014 11:14 am UTC

### Re: Genetic Algorithm Questions

For my B.Eng project, I'm working on optimizing the thermal efficiency of a boiler using a genetic algorithm in MATLAB. I've been struggling with writing the fitness function and constraint equations, and also with uploading my initial generation, which is a set of data from my case-study plant. I really need help.

tarapena
Posts: 1
Joined: Wed Jun 29, 2016 5:54 pm UTC

### Re: Genetic Algorithm Questions

Hey guys,

I know this is slightly off topic, but I was wondering if anyone had advice on optimizing GAs? Are there certain fitness/mutation/mating functions that produce better results, or do the parameters need to follow a certain structure? I know this question is kind of ambiguous, but I'm testing different variations of these functions together and getting interesting results. By optimize, I mean take fewer generations.

Soupspoon
You have done something you shouldn't. Or are about to.
Posts: 3200
Joined: Thu Jan 28, 2016 7:00 pm UTC
Location: 53-1

### Re: Genetic Algorithm Questions

tarapena wrote:Hey guys,

I know this is slightly off topic, but I was wondering if anyone had advice on optimizing GAs? Are there certain fitness/mutation/mating functions that produce better results, or do the parameters need to follow a certain structure? I know this question is kind of ambiguous, but I'm testing different variations of these functions together and getting interesting results. By optimize, I mean take fewer generations.

Depends a lot on the nature of the GA, I would say, and probably on whatever idea you have of the end result you're looking/searching/hoping for.

The premise behind GAs (as far as I'm concerned; do correct me if I'm wrong) is that one conducts a number of "drunkard's walks" towards the 'solution' (or towards a number of 'fitness islands' in the whole potential search space), rather than trying every one of the astronomically large number of possible permutations. Thus if you know something of the desired target solution(s), you can perhaps ensure the stumbles head more in that direction before even culling for unfitness, and certainly don't linger in old territory.

If 'breeding' fitter versions by cross-linking parental elements, then avoiding 'incest' in the partnering algorithm might be useful (i.e. ABCDEF mixed with ABCGEF might only produce ABC[D|G]EF, monoploidically, to no new 'advantage', although if you're employing a polyploid method you could still get a hidden miracle out of 'recessive genes' reappearing in the child). In both that scheme and a purely mutative (asexual branching) one, it might be useful to index each (recent?/still breeding?) copy so as not to 'revisit' an exact duplicate of a previous example unnecessarily (unless you're looking for competitive fitness, not absolute, and suspect something Rock-Paper-Scissors-like might exist).

Essentially, how much foreknowledge/expectation are you willing to put into the process to prevent or cut short the assessment through 'guided evolution'? It's less pure than the fire-and-forget methods that can reveal unexpected and unanticipated results, but only you know how hands-off you want the process to be.