The traveling salesperson problem is a relatively famous math/geometry/computer science problem. The version I’ll be talking about in this post is where the salesperson has to visit several points on a map, minimizing the travel distance. Repeating a point on the map isn’t allowed, though I’m pretty sure that would increase the distance anyways. I got thinking about this problem thanks to this great post showing how to solve this problem using simulated annealing.
My approach to solving this problems involves my go-to tool for lots of hard problems: a genetic algorithm. Basically I try a random set of solutions, rank them based on the distance each takes, and I randomly choose them (with weights determined from their “fitness” or distance) to make babies by mutating them (switching two points in the visit order). I make lots of babies and then choose the best ones to populate the next generation. This tool is quite versatile as you just have to define a fitness function that returns a rankable value and you have to carefully determine what the genome structure is and what mutation is. Then you’re off and running. Note that I didn’t do any sexual reproduction as I didn’t have any good ideas for how to do that in this situation (though my friend Jon Gaffney did send me some great resources for that).
I also decided to throw a wrench in. For fun I had the locations on the map slowly move around, using basically a random walk for each. I did that so that I can show this to my First Year Seminar students to talk about how evolution can handle a changing landscape (their summer reading book was Paleofantasy and their assignment was to find a connection between that book and engineering – most are either agreeing or disagreeing with the notion that evolution is an example of engineering).
Here’s a gif showing the best candidate at the end of each generation (one frame for each generation).
And here’s a plot showing the fitnesses of every candidate in every generation:
As you can see, it converges rather nicely, thought the spread in candidates never goes to zero because the dots are constantly moving. To show that it can handle an abrupt change to the environment, here’s a gif showing a situation where the points are randomly shifted to a new location halfway through:
and here’s the plot of the fitness versus generation (note the spike when the points move):
Finally, I thought I’d have some fun with river crossings. I decided to see what would happen if I put in some rivers (line segments) and charged a huge cost (10 meters) for crossing them. Here’s an example:
Here the plot of fitness versus generation really just shows when extra crossings get added:
So, how can we quantify the value of this approach. Here’s on way: compare to a blind hunt. I could just randomly create paths (really just randomizing a list of integers from 1 to the number of points) and constantly check their fitness. Here’s what a histogram looks like for a set of points with a “best path” fitness around 9:
Note how gaussian it looks. It has a mean of 22 and a standard deviation of about 2. That means that the “best” is more than six standard deviations away from the mean. What are the chances of getting that randomly (assuming the normal distribution holds, which is surely doesn’t but whatever)? That’s just a relatively straightforward statistics calculation that leads to an answer of 37 billion to one against. Another way to say it is that you’d have to run billions of random tests to have a decent chance of getting a “close-to-best” path.
Now, that last paragraph suffers from a huge assumption about the distribution staying normal (or gaussian), so here’s another way to think of it. My genetic algorithm runs 200 generations with 50 kids per generation. That’s 10000 different tests. What percentage of all the possible paths is that? Well that’s easy: 10000 divided by 20 factorial (because I’m currently using 20 points). That gives a value of 0.000000000004% (also known as basically zero). So I think the genetic algorithm is doing a very efficient job for this problem.
Thoughts? Here’s some starters for you:
- I’m in your fysem (first year seminar) and I think this is really cool. I’m going to change my paper to . . .
- I’m in your fysem and this is dumb. Just another example of why I should have dropped this class.
- How did you do this in Mathematica? Can you share your code?
- If your friend Jon has already done this, why are you trying to steal his thunder?
- If another friend has done a ton of work on this too, why haven’t you at least linked to his tweets?
- Your assumption that the distribution stays gaussian is stupid. Instead you should . . .
- How do you know you’re not in a local minimum? I can see in a few of the gifs that there are still some crossed paths and that means you’re not at a minimum.
- You speak of evolution as if were more than a theory, why do you do that?
- You’re such a fan of brute force approaches, why don’t you just do all 20! versions?
- Why didn’t you have the rivers moving too? Or have curved rivers? Or wide rivers?
- Why a cost of 10 meters for a crossing? That’s just dumb. Instead I would . . .
- In the traveling salesperson problem the visit points should not be moving. This is an abomination.