The "Hello World" of genetic algorithms (generation5.org)
104 points by DanielHimmelein on April 2, 2012 | 34 comments



Genetic algorithms, heuristic hill-climbing, simulated annealing, etc, etc, etc. These are all basically equivalent ways of tackling an optimization function that can only be point-wise queried efficiently, and they are all subject to the same problems (most notably local optima, the curse of dimensionality, etc.).

I've seen these used rather sporadically in academia, where people generally prefer to reformulate a difficult optimization problem as a convex (or similar) relaxation that is provably solvable, and then throw solvers at it.

However, I get the impression that when you just need to get some reasonable solution quick and dirty in practice (especially with discrete domains), one of these is the method of choice. I'm not aware of any particularly good arguments for GAs specifically, though.
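To make the "quick and dirty" point concrete, here's a minimal sketch (Python; the function names, bitstring encoding and parameter values are mine, not from the article) of a stochastic hill climber with random restarts. The objective is only ever queried point-wise, and the restarts are the crude defense against local optima:

    import random

    def hill_climb(fitness, n_bits, restarts=20, steps=1000):
        """Stochastic hill climber with random restarts on a bitstring.
        `fitness` is only queried point-wise; restarts are a crude guard
        against getting trapped in local optima."""
        best, best_fit = None, float("-inf")
        for _ in range(restarts):
            x = [random.randint(0, 1) for _ in range(n_bits)]
            fx = fitness(x)
            for _ in range(steps):
                i = random.randrange(n_bits)
                y = x[:]
                y[i] ^= 1                    # flip one random bit
                fy = fitness(y)
                if fy >= fx:                 # accept non-worsening moves
                    x, fx = y, fy
            if fx > best_fit:
                best, best_fit = x, fx
        return best, best_fit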


The power of EAs (Evolutionary Algorithms) is that they are able to escape local optima through random mutation and recombination.

EAs are suitable for finding solutions to complex problems, as long as you state the problem correctly and have a decent fitness function. Besides this, parameter tuning is vital for creating a decent EA. This goes so far that there has been research into tuning the parameters of an EA using another EA (creating an EA Inception ;)).
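Here's a minimal sketch of a single EA generation (Python; the string representation, tournament size, mutation rate and printable-ASCII mutation range are my own assumptions, not anything from the article). The recombination and random mutation at the end are where the "escape" from local optima comes from:

    import random

    def next_generation(pop, fitness, p_mut=0.01, tournament=3):
        """One generation of a toy EA over equal-length strings:
        tournament selection, one-point crossover, per-character mutation."""
        def select():
            return max(random.sample(pop, tournament), key=fitness)

        children = []
        while len(children) < len(pop):
            a, b = select(), select()
            cut = random.randrange(1, len(a))          # one-point crossover
            child = a[:cut] + b[cut:]
            child = "".join(
                chr(random.randint(32, 126)) if random.random() < p_mut else c
                for c in child)                        # random mutation
            children.append(child)
        return children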


> I've seen these used rather sporadically in academia

It really depends on the area. In some areas, like graphics and control theory, convex optimization is the norm. In others, like parts of AI, randomized optimization is a lot more common. Traditionally GAs were more strongly planted in the IEEE-flavored parts of AI, sometimes called "computational intelligence"; for example, the IEEE Transactions on Evolutionary Computation has one of the higher impact factors of AI-related journals. I agree there's no great reason to treat GAs as categorically different from other randomized optimization, though; I suspect they get more press because of the biology metaphor.


I really love the elegance of algorithms like these; a bunch of similar algorithms are discussed in Clever Algorithms[1].

It also discusses genetic algorithms[2], providing a really simple implementation that suffers from the same problem as the article in question: the outcome you're looking for is already known, so the fitness function is rather trivial, but nonetheless it's a good place to get started.

[1] http://www.cleveralgorithms.com/

[2] http://www.cleveralgorithms.com/nature-inspired/evolution/ge...


Maybe I'm way off the mark here, but calling a function that iterates towards a known target string a genetic algorithm doesn't really feel right. If you know the exact output already, you aren't really testing 'fitness' in any meaningful sense - you're testing closeness. Why not just use A* to path-find your way to it?

Genetic algorithms are surely meant for iterating towards a goal you can't define, other than fitness towards a certain set of criteria - if you know what the target looks like, what's the point in iterating towards it, rather than just going at it?


I think what I'm trying to say is: if your fitness function contains the solution, then your problem is entirely contrived, and not a good illustration of genetic algorithms - you're just using a very round-about way of calculating something you already knew, inefficiently.
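For the record, this is roughly what the contrived fitness looks like (a rough Python sketch of the usual hello-world distance measure, not the article's exact C++): the target string sits right inside the function, so "fitness" really is just closeness to an answer you already have.

    TARGET = "Hello World!"   # the "solution" is baked right into the fitness

    def fitness(candidate):
        # Sum of per-character distances to the known target; 0 means solved.
        return sum(abs(ord(c) - ord(t)) for c, t in zip(candidate, TARGET))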


I love these. I think the original concept is based on the Weasel Program[1] formulated by Richard Dawkins to demonstrate evolutionary change and natural selection. Here's my attempt[2] in Clojure from a few weeks ago.

[1] http://en.wikipedia.org/wiki/Weasel_program

[2] https://gist.github.com/2072019
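For context, Dawkins' weasel is just "mutate many copies, keep the one closest to the target". A minimal Python rendering (not the linked Clojure gist; the litter size and mutation rate are arbitrary choices):

    import random
    import string

    TARGET = "METHINKS IT IS LIKE A WEASEL"
    ALPHABET = string.ascii_uppercase + " "

    def weasel(copies=100, p_mut=0.05):
        parent = "".join(random.choice(ALPHABET) for _ in TARGET)
        gen = 0
        while parent != TARGET:
            gen += 1
            # mutate `copies` offspring, keep the one matching the most characters
            litter = [
                "".join(random.choice(ALPHABET) if random.random() < p_mut else c
                        for c in parent)
                for _ in range(copies)]
            parent = max(litter,
                         key=lambda s: sum(a == b for a, b in zip(s, TARGET)))
        return gen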


Very basic but also very instructional. The key to these sorts of algorithms is to write a very efficient fitness function. The best GAs use a GA to find the best fitness function (sort of like metaprogramming or "meta-reflection").

This PPT has a great intro on GAs: http://cs.uga.edu/~khaled/ECcourse/chapter03.ppt


It's so much more complex and amazing than that, especially in problem spaces that have numerous local optima. My personal favorite of the weirder genetic algorithm exploration paradigms is one where you start looking at the behavior of the genome and, instead of searching out the best behavior, you search out ~different~ behavior. It's called Novelty Search, and it's awesome!
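The core of it is just swapping the objective: instead of scoring how close an individual is to a goal, you score how far its behavior is from behaviors you've already seen. A rough Python sketch, assuming behaviors are numeric vectors (the choice of k and of Euclidean distance are mine):

    def novelty(behavior, archive, k=15):
        """Novelty score in the Lehman & Stanley sense: mean distance to the
        k nearest behaviors seen so far, rather than distance to a goal."""
        if not archive:
            return float("inf")
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(behavior, other)) ** 0.5
            for other in archive)
        return sum(dists[:k]) / min(k, len(dists))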


I'm glad you think novelty search is awesome; I'm doing my dissertation on it! Out of curiosity, how did you hear about it?


I know Kenneth Stanley and loooove NEAT :)


I apologize for the cheese factor of the "goal" beforehand, but here's a Python program I wrote recently that does the same thing:

https://gist.github.com/1649116

I sat down with my fiancée on Valentine's Day and asked her if she could figure out what the letters on the screen were trying to spell. I think she really loved it!


I did a zombie port of this to JavaScript so I could take a peek at what's going on without having to remember my C++. Take a look:

http://jsfiddle.net/zENKM/


Quite a good read and a good introduction to the topic. I somehow ended up writing a Python version. Here it is: https://github.com/rheide/Hello-genetic-algorithm/blob/maste...



Wow, it has been years since I saw generation5. It used to have the best AI learning materials and a good forum about eight years ago. The site seems to be down for me, however.


I did the same thing in JavaScript a while ago with a more detailed writeup: http://www.puremango.co.uk/2010/12/genetic-algorithm-for-hel...

Here's the HN discussion at the time: http://news.ycombinator.com/item?id=2002673


I wrote something similar over Christmas in both C and D:

http://jacobconradmartin.com/experiments/evolution/

The graph of mutation rate versus convergence time might be of interest.


A nice TEDx talk from Guszti Eiben about the possible applications of Evolutionary Algorithms. http://www.youtube.com/watch?v=WJX_wAKhg8A


There's a more elaborate version on GitHub in multiple languages: http://github.com/jsvazic/GAHelloWorld


An excellent, simple extension to this is to throw simple arithmetic operators into the mix (digits 0-9, +, -, / and * perhaps) and to try to breed an expression which, when evaluated, yields a target number, like 500. You can get startlingly close answers and interesting equations this way.
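A rough sketch of what that might look like (Python; the character set, genome length, and eval-based scoring are all my own assumptions, and eval is only acceptable for a toy like this):

    import random

    CHARS = "0123456789+-*/"
    TARGET = 500

    def random_expr(length=9):
        # Genome is just a string over the digit/operator alphabet.
        return "".join(random.choice(CHARS) for _ in range(length))

    def fitness(expr):
        # Closer to TARGET is better; syntactically invalid genomes score worst.
        if "**" in expr:                  # avoid accidentally huge exponents
            return float("-inf")
        try:
            return -abs(eval(expr) - TARGET)
        except (SyntaxError, ZeroDivisionError):
            return float("-inf")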


My impression of genetic algorithms is that they work in the context of biology (you can actually see it happen in real time with viruses & bacteria), not computer science. I have yet to see an instance where a real problem in computer science was best solved using genetic algorithms.


The Genetic Algorithm ("genetic algorithms" is actually a misnomer) has little to do with biology except for the original inspiration. It is simply one of a number of sample-based methods for doing global stochastic optimization.

There's an old saying about the Genetic Algorithm: that it's the "third best way to do anything". Stochastic optimization methods (or "metaheuristics") in general are knowledge-poor methods, essentially last-ditch techniques for when you don't have any other known way to solve your problem and you don't want to jump off the cliff into random or brute-force search. They rely on a central heuristic: that similar candidate solutions will likely have similar performance (the "smoothness" criterion).

Here's the thing. There are a huge and growing number of crunchy problems in this category. If you're trying to find a good tic-tac-toe solution, you can almost certainly do better than a stochastic optimization method (just use state-space search, say). But what if you're trying to find the set of behaviors for a two-thousand-agent multiagent simulation model which produces statistics most closely resembling known historical data? Or what if you're trying to find the best parameters for optimizing an aircraft engine whose space is filled with local optima? It's ugly problems like these, for which there is no principled solution method, where the Genetic Algorithm and its ilk reign supreme. You might say that you never see problems in computer science which are "best solved using genetic algorithms". My answer is: your daily problems are too simple to need them.

Book plug: you might enjoy my free online text on the subject, called Essentials of Metaheuristics. You can also get it in paperback. http://cs.gmu.edu/~sean/book/metaheuristics/


I've read through a large portion of the PDF version of your book quite a few times. It is very well written and helped me understand a number of different techniques very well.

I definitely recommend Sean's book as a starting point if you are interested in this field.

(I don't know Sean, and don't get anything for this recommendation other than the happy glow of helping a deserving author get more recognition.)


Thank you very much.


It's basically a search technique for problems where the search space is too large, so you visit "promising" portions of the search space. This happens a lot in computer science, and it happens even more in real life ;)

The Postgres query optimizer is an example I know off the top of my head (it uses GAs for the join order, at least for queries with enough joins): http://www.postgresql.org/docs/9.1/static/geqo.html


The thing about GAs is that they're probably one of the easier-to-understand tools in machine learning; even someone with almost no math background can successfully implement a GA. However, knowing when to use them optimally does require understanding of the rest of ML. Essentially, if your optimization problem is convex, or "convex enough" (as is the case when minimizing the cost function for linear/logistic regression, SVMs, neural networks and more), GAs aren't the best solution. However, when you have a really ugly optimization problem on your hands, they are a really good last resort.

There are plenty of applications; however, you'll never see them everywhere, because they'll never beat optimization techniques like gradient descent, hill climbers, backpropagation, etc. in cases where those techniques work.

So while GAs are very easy to understand, finding an ideal use case for them takes a bit of knowledge. Looking to solve problems with GAs is harder than having a hard problem and realizing GAs may be helpful.



I did not say that they did not work. I said that they were not the best solution. Other algorithms will usually do the trick at a fraction of the cost :-)


If you firmly believe so, could you please post what you think would be better solutions to some of those problems listed on the wiki? Namely electronic circuit design and plant floor layouts.


Evolutionary algorithms are very capable metaheuristics with striking exploratory abilities. Exploratory in the sense that, given a large search space, a well-performing evolutionary algorithm can navigate this search space a lot faster than traditional metaheuristics. However, the problem of getting stuck in local optima is still common.

They definitely are a staple algorithm type in the field of optimization, especially combinatorial. Look it up.


A good write-up. I played around with Pyevolve[1], which is also written in Python. GAs are non-optimal for most problems, but it's a fun starting point.

[1]: http://pyevolve.sourceforge.net/


Here's a fairly direct translation of the original program into Go: http://play.golang.org/p/Xg9f0lM40n


There's a bug in the program. In the mate() function, when it recalculates the string, it uses `esize-spos` when it meant to use `tsize-spos`.
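To illustrate the off-by-a-variable (this is a hypothetical Python rendering of a one-point mate step, not the article's actual C++): the tail copied from the second parent has to span the remaining tsize - spos characters of the target-length string, so substituting some other length (the reported esize) there copies the wrong number of characters.

    # Hypothetical sketch of the intended mate()/crossover step.
    def mate(parent_a, parent_b, spos, tsize):
        # head of parent_a up to spos, plus a tail of length tsize - spos
        return parent_a[:spos] + parent_b[spos:spos + (tsize - spos)]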



