## Genetic algorithm for randomizing oral exams

I’ve written before about using genetic algorithms to solve problems, but I wanted to show how flexible they can be by writing here about how they helped me this week. My problem was that I wanted to assign standards to students for our oral exam week coming up after spring break (next week). I want each student to do three different standards throughout the week. However, I want to make sure of each of the following:

• No student should do the same standard twice
• On any given day, there should be a minimum of repeat standards
• No two students should have the same set of standards (even in a different order)

At first I thought I could just randomize the student names and then count out the standards in order. That takes care of the first two bullet points, but it turns out that violations of the third one kept happening. So, on to using a genetic algorithm to do this.

Here’s what I did:

1. Put together a population of possibilities
    1. Each one of these randomly assigns three standards to each student
    2. Note that the three bullet points above are not at all considered (in fact, usually all three are violated with each of these random starters)
2. Rank each candidate according to the three bullet points above
    1. For example, if a student has a repeat, that’s a penalty. If they have all the same standard all three times, that’s a double penalty
3. Using a weighted selection scheme, choose a parent and then mutate it to make a new candidate for the next generation
    1. The weights are the penalties (lower are selected more often, though)
    2. A mutation consists of randomly changing one or more of the student/standard assignments
    3. The mutation rate goes down throughout the run (this I borrowed from Simulated Annealing)
4. Repeat steps 2 and 3 until you have a solution you like

After about 100 iterations it settles into something that meets all the requirements.
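
To make the ranking and mutation steps concrete, here is a minimal C++ sketch. It is not the Mathematica code behind this post: the names `Schedule`, `Weights`, `cost`, and `mutate`, the equal default weights, and the numbering of standards from 1 to `numStandards` are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <map>
#include <random>
#include <set>
#include <vector>

// One candidate: schedule[s][d] is the standard student s presents on day d.
using Schedule = std::vector<std::array<int, 3>>;

// Hypothetical weights for the three bullet points (the post gives no values).
struct Weights {
    int studentRepeat = 1;
    int dayRepeat = 1;
    int sameSet = 1;
};

// Lower is better.
int cost(const Schedule& schedule, const Weights& w = Weights()) {
    int total = 0;
    // Bullet 1: repeats within one student's week
    // (one repeat counts 1, the same standard three times counts 2).
    for (const auto& week : schedule) {
        std::set<int> distinct(week.begin(), week.end());
        total += w.studentRepeat * (3 - static_cast<int>(distinct.size()));
    }
    // Bullet 2: repeats among the students on a single day.
    for (int day = 0; day < 3; ++day) {
        std::set<int> distinct;
        for (const auto& week : schedule) distinct.insert(week[day]);
        total += w.dayRepeat * static_cast<int>(schedule.size() - distinct.size());
    }
    // Bullet 3: pairs of students holding the same set, ignoring order.
    std::map<std::array<int, 3>, int> seen;
    for (auto week : schedule) {
        std::sort(week.begin(), week.end());
        total += w.sameSet * seen[week]++;  // each earlier twin adds one pair
    }
    return total;
}

// Re-roll each slot with probability `rate`. Unlike the description above,
// this simple version may occasionally change nothing at all.
Schedule mutate(Schedule s, double rate, int numStandards, std::mt19937& rng) {
    std::bernoulli_distribution flip(rate);
    std::uniform_int_distribution<int> pick(1, numStandards);
    for (auto& week : s)
        for (auto& slot : week)
            if (flip(rng)) slot = pick(rng);
    return s;
}
```

In a full run, parents would be drawn with a weight that favors low `cost` values, and the `rate` handed to `mutate` would shrink from one generation to the next. Note that the day term cannot reach zero when there are more students than standards, so the target is the lowest achievable cost rather than zero.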

The flexibility I was mentioning above comes in when I realize that I want to make some other subtle changes. I realized that I want to add these conditions:

• Minimize the number of standards that students repeat throughout the semester (we have 9 oral exams altogether)
• Maximize the spread of the standards for each student
    • I simply do Max(list) – Min(list) to get that
    • In an earlier “solution” I noticed that one student was assigned numbers 1, 2, and 3 and I didn’t want them to spend the week working on just one section of our material

Instituting those changes was simple: just change the cost function to include them and then re-run the simulation.
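
As an illustration of how small such a change can be, here is a C++ sketch of the spread term. The names `spread` and `spreadPenalty` and the parameter `numStandards` are hypothetical, and turning "maximize the spread" into a penalty by subtracting from the largest possible spread is just one reasonable choice, not necessarily what the original Mathematica code does.

```cpp
#include <algorithm>
#include <array>

// Spread of one student's week: largest standard number minus smallest,
// i.e. Max(list) - Min(list).
int spread(const std::array<int, 3>& week) {
    auto mm = std::minmax_element(week.begin(), week.end());
    return *mm.second - *mm.first;
}

// Hypothetical penalty form: the widest possible spread costs nothing,
// and a narrow week such as {1, 2, 3} costs the most.
int spreadPenalty(const std::array<int, 3>& week, int numStandards) {
    return (numStandards - 1) - spread(week);
}
```

Summing `spreadPenalty` over all students and adding it, with its own weight, to the existing cost is all the extra condition requires.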

At the end of the day it all works quite well. I get a distribution of the standards into my students’ hands (so they can prepare over spring break) and I know that all my bullet points are met.

So what do you think? Here are some Rhett-Allain(TM) starters for you:

• I’m in this class and I’m really glad you’ve worked this out. I was super nervous that a fellow student might have the same standards as me and this fixes that.
• I’m in this class and I’d much rather just pick my own standards, focussing on the ones I’ve done poorly on.
• Why do you put “solution” in quotes?
• How much do you pay Rhett when you use these starters?
• Why do you have the mutation rate change? Why not have it evolve?
• If you change the weights of each category, do you get drastically different results? (yes, though not if I let it run a really long time)
• Can you post the code you use for this?
• I’ve got a similar problem, how would you code it?
• Wait, this is the Friday before your break? Why are you posting to your blog? Get a life!

## Stepper motor with Arduino motor shield

We have a bunch of Arduino-brand motor shields and I wanted to jot down what it took to drive a stepper motor with them. The documentation really only tells you how to hook up DC motors, though it does say you can just use the outputs to drive a stepper. I thought that meant I should use pins 3, 11, 12, and 13 for the four coils of my stepper, but that didn’t quite work out. I was confused about how to energize each of the outputs so I finally just sent all 16 combinations of those pins to see which combos are the ones I needed. Here’s what I found. For the list below, A+, A-, B+, and B- refer to the outputs on the board and the pin order is 3, 11, 12, 13. 1 means high, 0 means low. The integer in parentheses is the decimal equivalent of the binary.

• B-: 0100 (4)
• B+: 0101 (5)
• A-: 1000 (8)
• A+: 1010 (10)

If you consider the four coils to be the cardinal directions, I put north on B-, south on B+, east on A-, and west on A+. To get half steps:

• NE: 1100 (12)
• SE: 1101 (13)
• SW: 1111 (15)
• NW: 1110 (14)

So to do the standard 8 steps when doing half steps, the decimal order is

4, 12, 8, 13, 5, 15, 10, 14

Here’s some quick code for turning decimals into firing the correct pins:

// Set pins 3, 11, 12, and 13 from the four low bits of n (highest bit first).
void fixbool(int n) {
  digitalWrite(3, (n & B1000) ? HIGH : LOW);
  digitalWrite(11, (n & B100) ? HIGH : LOW);
  digitalWrite(12, (n & B10) ? HIGH : LOW);
  digitalWrite(13, (n & B1) ? HIGH : LOW);
}
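
Here is a small sketch of how the half-step order above could feed a stepping loop. It is written as plain C++ so the logic can be checked off the board; `kHalfSteps`, `pinLevels`, and `nextStep` are hypothetical names, not part of any Arduino library.

```cpp
#include <array>
#include <cstdint>

// Half-step order from the list above: N, NE, E, SE, S, SW, W, NW.
constexpr std::array<std::uint8_t, 8> kHalfSteps = {{4, 12, 8, 13, 5, 15, 10, 14}};

// Levels for pins 3, 11, 12, 13 (in that order) for a given decimal code.
std::array<bool, 4> pinLevels(std::uint8_t code) {
    return std::array<bool, 4>{{(code & 0x8) != 0, (code & 0x4) != 0,
                                (code & 0x2) != 0, (code & 0x1) != 0}};
}

// Index of the next half step; direction is +1 for one way round, -1 for the other.
int nextStep(int index, int direction) {
    return (index + direction + 8) % 8;
}
```

On the board, each pass through `loop()` would presumably hand `kHalfSteps[index]` to `fixbool`, wait a few milliseconds, and then advance `index` with `nextStep`. The delay that works will depend on the motor and supply.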

## Keep from just plowing through

I spent a good portion of the day trying to figure out what I wanted to do in my Theoretical Mechanics class tomorrow. We’ve recently begun the chapter on central potentials and I wasn’t sure how far to try to get. I started writing down the big ideas that I thought were important:

• The effective potential idea gets us all the way down to a single variable (from 6 with the x’s, y’s, and z’s of the two particles involved in a central potential interaction)
• Kepler’s second law (which really applies for any central potential). This is the one about the orbit sweeping out an equal area for every equal time period.
• Kepler->Newton (showing how a 1/r^2 force leads to ellipses): more on this below
• Finding relationships for the eccentricity and ellipse axes in terms of the energy and angular momentum of the orbit.

I knew I couldn’t do all that in one class period, but I thought I could probably get through the first two and a half or so.

But that’s when I wondered what the “I can . . .” statement of the day would be and it hit me that I shouldn’t just be plowing through material. Instead I want the “I can . . .” statement to be driving the work, motivating the students to engage and to figure out what other resources they’re going to need to figure out how to prove that “they can.” I realized that repackaging the first two points made it a nice consistent idea that we could really play with in class. The r- and $\theta$-equations each bring something to the table when understanding orbits and I think if we do them well, we’ll be really set up for the rest on the next day.

It’s funny that I fell off my “standards-decide-the-day” wagon and didn’t even notice. I spent so much of my day today trying to find cool connections between Kepler and Newton that I lost track of the utility of really focussing each class day on an articulatable idea.

One other idea came out of my brainstorming today. The proof that inverse-square-law orbits are ellipses is a beast. The integrals involved are nasty, often utilizing all kinds of “I would have never thought of that” tricks. So I really tried thinking today of other ways to get that point across. I thought about having the students code up some numerical simulations to see that all the orbits seem to be ellipses, but I knew that it would fall short of the goal of proving that all possibilities are ellipses. I talked with a friend in the math(ematics) department about this and we explored a number of interesting pedagogical/curricular issues:

• Do students need to do a derivation like this?
    • yes: it’s famous and really interesting, tying together Brahe, Kepler, and Newton
    • no: ugh, you’ll never assess it, so why bother
• My friend uses the phrase “it’s good for your soul” to motivate students to try to struggle with a tough idea, even if there’s a good chance it won’t be formally assessed.
• I make it clear what will be assessed and try hard to not have anything else in the class. This has led to cutting material (like catenaries) even if it’s “interesting” if I’m not planning on assessing it. My friend suggests that I’m limiting myself and my students’ learning by doing that.

So what do you think? Here are some starters for you:

• I’m in this class and I want to see if Newton was a genius or if I could have figured this out.
• I’m in this class and I’ll take your word for it that Newton was able to do that nasty integral by hand.
• I like how you try to make each class a stand-alone idea, but I don’t think “articulatable” is a word (and neither does Chrome)
• I don’t understand why you put so much focus on the arbitrary class time breaks. The material is huge and doesn’t have to fit cleanly into those breaks. I say just plow through.
• I say “it’s good for your soul” a lot too. Here’s an example . . .
• I don’t like the “it’s good for your soul” motivation because . . .
• If the students’ numeric results are always ellipses, good enough!
• Do you know if the “equal area in equal time” only works in the reduced mass system?
• I’m the friend from the math(ematics) department and you’ve misquoted me like heck. What I really said was . . .

## Double pendulum roller coaster FIXED

My last post was wrong. I’m to blame. But in thinking about it and talking about it with lots of helpful friends I ended up learning a ton. Here’s the upshot: There were kinks in the roller coaster loop that led to integration mistakes on the part of Mathematica. Thanks to a great suggestion from my friend Craig I smoothed those out:

The blue track is the one with kinks in it. The orange one is the one used for the simulation in this post.

And now the simulation animation looks like this (there’s some extra annotation that I’ll talk about below):

The green dot is the center of mass of the system. The orange arrow is the normal force. The purple arrow is the direction that the center of mass is traveling.

Note first that the system never gets above the dotted green line, which was my (mistaken) idea from the last post. This post will try to talk about what I learned about whether the normal force does any work (which was my mistaken explanation in the last post).

The track exerts a force on the red ball to keep it on the track. Gravity and the connection to the first black ball are yanking on that ball and the track does whatever it has to in order to ensure that the red ball stays on the track. My argument from the last post boils down to this: The normal force is an external force to the system of three balls. That system has a center of mass that I can pretend the external force acts on. If the center of mass is moving perpendicularly to the normal force (as would happen with just the red bead), there would be no work. But if the center of mass is moving at times slightly parallel to the normal force, then there would be some work. It turns out there’s really nothing wrong with that description. However, assuming that this work changes the total kinetic energy of the system is wrong. What it does (as again my friend Craig suggested) is change the translational kinetic energy of the system (basically the kinetic energy the system would have if all the mass were at the center of mass). The total kinetic energy of the system, however, is the sum of the translational and rotational kinetic energies. What I intend to discuss here is that the effect on the rotational kinetic energy due to the normal force is exactly the opposite of the effect on the translational kinetic energy.

First a quick plot. This shows in blue the time derivative of the translational kinetic energy of the system (subtracting out the effects of gravity) and in orange the work per unit time that the normal force does on the system:

The comparison of the rates of change of the translational kinetic energy (without gravity effects) and the work that the normal force does on the system.

Actually, you don’t see the orange because, to the accuracy of the thickness of the lines, the orange is completely underneath the blue.

Let’s try to understand what’s going on. Consider first the time rate of change of the kinetic energy of the system:

$\frac{d}{dt}\left(\frac{1}{2}\sum_i m_i v_i^2\right)=\frac{d}{dt}\left(\frac{1}{2}\sum_i m_i \vec{v}_i\cdot \vec{v}_i\right)$

The derivative can come right into the sum, and the vector product rule gives us:

$\frac{d\text{KE}}{dt}=\sum_i m_i \vec{a}_i \cdot \vec{v}_i=\sum_i \vec{F}_i\cdot \vec{v}_i$

where I’ve used Newton’s second law in the last step. Now the normal force is only “attached” to the red bead, but that’s the one bead that’s guaranteed to be moving perpendicular to the normal force. So the contribution to the time change of kinetic energy due to the normal force is indeed zero! Hence my last post is wrong.

But what about this business with the translational kinetic energy? We tell our students all the time that they can think of all forces as acting on the center of mass. In other words, the change of momentum of the center of mass is due to the collection of all external forces. Those forces will do work if they act, at least partially, in the direction that the center of mass is traveling. In the animation above the purple arrow shows the direction that the center of mass is traveling. You can see that it doesn’t always point along the track. That means that it’s not always perpendicular to the normal force. Hence work is done on the center of mass. But that just affects the translation of that center of mass, not any rotation about it. To see the whole story, let’s redo the last calculation using a coordinate system centered on the center of mass. For those variables, I’ll use primes. First I’ll start with an expression for the kinetic energy:

$\text{KE}=\frac{1}{2}\sum_i m_i (\vec{v}_\text{cm}+\vec{v}_i')\cdot(\vec{v}_\text{cm}+\vec{v}_i')$

Now when you do the FOIL of that dot product, two of the terms go to zero (that’s the beauty of using the center of mass, by the way) and you’re left with:

$\text{KE}=\frac{1}{2} M V^2+\frac{1}{2}\sum_i m_i \vec{v}_i'\cdot \vec{v}_i'$

where M is the total mass of the system and V is the velocity of the center of mass. Now, let’s consider doing a time derivative of that. For the first term you’ll get exactly what I was talking about above. In other words you’ll get the dot product of the total external forces and the velocity of the center of mass.

$\frac{d}{dt}\text{KE}=\sum_i \vec{F}_i\cdot \vec{v}_\text{cm}+\sum_i m_i \vec{a}_i\cdot \vec{v}_i'$
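
For anyone who wants the intermediate steps behind the last two equations, here is a sketch that uses only the definitions above, with $\vec{v}_i=\vec{v}_\text{cm}+\vec{v}_i'$. The cross terms in the dot product vanish because the total momentum measured from the center of mass is zero:

$\sum_i m_i \vec{v}_\text{cm}\cdot \vec{v}_i'=\vec{v}_\text{cm}\cdot\sum_i m_i \vec{v}_i'=0$

The first term of the derivative comes from Newton’s second law applied to the whole system (the internal forces cancel in pairs, so only the external forces survive in the sum):

$\frac{d}{dt}\left(\frac{1}{2}MV^2\right)=M\vec{a}_\text{cm}\cdot\vec{v}_\text{cm}=\sum_i \vec{F}_i\cdot \vec{v}_\text{cm}$

The same zero-momentum fact lets the full acceleration stand in for the primed one in the second term:

$\sum_i m_i \vec{a}_i'\cdot \vec{v}_i'=\sum_i m_i (\vec{a}_i-\vec{a}_\text{cm})\cdot \vec{v}_i'=\sum_i m_i \vec{a}_i\cdot \vec{v}_i'-\vec{a}_\text{cm}\cdot\sum_i m_i \vec{v}_i'=\sum_i m_i \vec{a}_i\cdot \vec{v}_i'$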

Now here’s a trick. Let’s re-express the velocity back into the normal frame (and use Newton’s second law again) for the second term above:

$\sum_i m_i \vec{a}_i\cdot \vec{v}_i'=\sum_i \vec{F}_i\cdot\left(\vec{v}_i-\vec{v}_\text{cm}\right)$

Here’s where the magic happens. The normal force is only applied to the particle on the track. But its velocity is perpendicular to the normal force by definition. So the first term in the parentheses yields a zero. What we’re left with is:

$-\sum_i \vec{F}_i\cdot \vec{v}_\text{cm}$

which is exactly the opposite of the change to the translational energy. In other words, you can either say that, yes, the normal force does some work, but it changes the translational kinetic energy by exactly an amount that is the opposite of how it changes the rotational kinetic energy, or you can just say that the normal force does no work. You decide.

Thoughts? Here are some starters for you:

• Thanks for this, I was totally at a loss for figuring out the mistakes in the last post.
• I’m glad you figured this out for yourself, just know that the rest of us knew this all along and have been laughing at you for your last post for a few days now.
• Wait, it doesn’t work!? I’ve already started building it in my backyard!
• How did you figure out the normal force? Did you determine the accelerations of all the particles and subtract all known forces, starting with the last black dot and moving up to the red dot? Or did you use Lagrange multipliers to figure out the normal force more directly, and, if so, how did you figure out the constraint equations for the track? (yes, yes, and it’s a long but interesting story involving me jumping out of bed this morning and trying something that worked!)
• So how would you say it? Does the track do work on the system?
• How is it that you were willing to believe that the track could help you violate energy conservation? What, are you some sort of “momentum is king” kind of guy or something?

## Double pendulum roller coaster

I’ve been doing a lot of modeling of beads on wires lately, but today I discovered something that really surprised me. The surprise came when I found a bead/wire system that seemed to violate conservation of energy. Now, it turns out that I was just thinking about it wrong, but still, it’s interesting.

Here’s the animation that got me thinking:

Double pendulum on a roller coaster

Take a look at how high the whole system gets compared with the original height. It sure looks like it ends up a little higher. Where does that energy come from?

Why I was wrong: I think I had it in my head that the track which provides the constraint force never does any work. That’s certainly true for the case of a single bead on a wire as the normal force is always perpendicular to the direction of travel, hence no work. But in this case that’s not true! Instead, the normal force is at times not perfectly perpendicular to the direction of travel of the center of mass of the system. You can see that in this annotated version of the animation where I track the center of mass in green and show the initial height as a dotted line:

Annotated version with the center of mass in green. The green dotted line is at the original height of the center of mass.

Cool, huh?

Your thoughts? Here are some starters for you:

• Thanks, this is very cool. Can you find the same thing with a single pendulum on a track?
• This is old news. Designers of hang-under coasters take this into account all the time.
• Can you post the Mathematica code? (yes: here you go)
• Can you figure out the normal force of the track using Lagrange multipliers? (note: I can’t figure out a way to parametrize the loop as a constraint that looks like g(x,y)=0)
• Why do you always post animated gifs? Don’t you know people hate those?
• I think your code is wrong. There’s no way the center of mass can get up higher than it starts.

## Lagrange multipliers revisited

I spent the last few days trying to decide whether to teach Lagrange multipliers in my Theoretical Mechanics course. Ultimately I decided to go ahead and do it and I wanted to get down my thoughts on why and what we ended up doing in class today.

I made a lot of progress on how to teach Lagrange multipliers the last time I taught this class, and I have to say that it was great being able to read my old thoughts when prepping this class. In that post I break down how to derive the needed result, so I won’t repeat that here (though I did find a slightly better approach for one of the steps that I’ll talk about below).

So why was I waffling? Every time I come around to this topic, I begin to realize that if my students are predominantly going to numerically integrate the ultimate equations of motion (or equation of motions as I usually pluralize the acronym – eoms) they can get the same information that Lagrange multipliers provide (typically the constraint forces) by simply plotting the total acceleration of the relevant particles minus the known forces. In fact that’s one of the beauties of the Lagrangian approach: you can ignore the constraint forces when figuring out the equations of motion. At the end of the day, you have access to the full accelerations which come about due to both the known external forces AND the constraint forces. What Lagrange multipliers do is help you explicitly calculate those constraint forces, but sometimes I don’t really see the value of that.

So what tipped the scales towards teaching them this year? Well, I realized that actually doing what I suggest above is kind of a hassle. Consider the prototypical problem of when a sled will lose contact with the ground as it goes down a hill. If you have the equation of the hill, you can then reduce the problem to a one-dimensional one, typically the horizontal variable. Then you can do the Lagrangian approach where you force the sled to stay on the curve. When you’re done you can look at the total acceleration and subtract gravity from that vector. Then you have the normal force vector and you can investigate when that switches from a vector pointing out of the ground to a vector pointing into the ground. When that happens the sled will lose contact with the ground. Now, that all sounds great, but actually determining which way the vector points involves dotting it with a vector that’s normal to the curve. To get that you have to do some derivatives on the function that defines the curve in the first place. In other words, there’s a little bit of a hassle.

Contrast that with learning about Lagrange multipliers and applying them. Now, there’s no question that learning about them (the whole separation of variables thing in my other post) is not a cake walk. However, there’s some very cool calculus of variations in there and it helps me reinforce the original derivation of the Euler equation (which I always choose to teach). So teaching it isn’t really the big deal. What is the big deal is whether what you get after implementing it makes your life easier. And guess what? It does! For the sled problem, if you have access to the Lagrange multiplier as a function of time (which you will after implementing the approach) you just have to plot it and look for when it changes sign. That doesn’t require the multivariable calculus that’s necessary to define the normal direction that you need for the other way.

Here are the screencasts that I made for my students today. We were tackling whether a sled would ever lose contact with the ground on a parabolic hill. I did it once without Lagrange multipliers and once with them. It’s my contention that the second way is a clearer way of finding when the sled leaves the ground (too long, didn’t watch: it never does for a parabolic hill). But I recognize it’s not a slam dunk case.
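
For a written version, here is a sketch of how the multiplier calculation can go. The setup is an assumption rather than a transcript of the screencasts: the hill is $y=f(x)$, the constraint is $g(x,y)=y-f(x)=0$, the sled has mass $m$, and $g_\text{grav}$ is the gravitational acceleration (written that way to keep it distinct from the constraint $g$). The Lagrangian with the multiplier $\lambda$ attached is

$L=\frac{1}{2}m\left(\dot{x}^2+\dot{y}^2\right)-m g_\text{grav}\,y+\lambda\left(y-f(x)\right)$

and the two Euler equations are

$m\ddot{x}=-\lambda f'(x),\qquad m\ddot{y}=-m g_\text{grav}+\lambda$

With this sign convention the constraint force is $\lambda\left(-f'(x),\,1\right)$, which points out of the ground when $\lambda>0$. Differentiating the constraint twice gives $\ddot{y}=f'(x)\ddot{x}+f''(x)\dot{x}^2$, and eliminating the accelerations leaves

$\lambda=\frac{m\left(g_\text{grav}+f''(x)\dot{x}^2\right)}{1+f'(x)^2}$

For a parabolic hill $f(x)=-ax^2$ with the sled starting from rest at the top, energy conservation gives $\dot{x}^2=\frac{2a g_\text{grav}x^2}{1+4a^2x^2}$, so

$\lambda=\frac{m g_\text{grav}}{\left(1+4a^2x^2\right)^2}>0$

The multiplier never changes sign in that case, which is consistent with the sled never leaving the hill. A sled launched from the top with $\dot{x}^2>g_\text{grav}/(2a)$ would be a different story, since $\lambda$ would start out negative.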

I was talking with a math professor buddy of mine today and he suggested that it might be a slam dunk case if you have a constraint that is hard to solve for one variable in terms of the other. I get what he was saying, but I don’t immediately have an idea of a sled/hill constraint where that would be the case.

Quick note about a change to the derivation: There’s a point in my other post where I say that the constraint should not be a function of the perturbation variable, $\alpha$. I then take a derivative and find a relationship between the two perturbation functions, $\eta_1$ and $\eta_2$. I realized this time around that there’s a different way to approach that. Basically we know that the constraint, g, has to be obeyed whether you’re on the best paths (x(t) and y(t)) or on nearby paths ($x_\text{best}+\alpha \eta_1$ and $y_\text{best}+\alpha \eta_2$):

$g(x_b, y_b)=g(x_b+\alpha \eta_1, y_b+\alpha \eta_2)$

Expanding the right-hand side to first order in $\alpha$ gives

$g(x_b+\alpha \eta_1, y_b+\alpha \eta_2) \approx g(x_b, y_b)+\frac{\partial g}{\partial x}(\alpha \eta_1)+\frac{\partial g}{\partial y}(\alpha \eta_2)$

or

$\frac{\partial g}{\partial x}(\alpha \eta_1)+\frac{\partial g}{\partial y}(\alpha \eta_2)=0$

which leads directly to the same result I have in my other post.

Update on how to do this in Mathematica: In the screencasts above you might notice that I deviate from what I suggest in my other post. The reason is that Mathematica has a new method for NDSolve that saves the day. Now you can simply tell NDSolve about the constraint and go, without having to do the differentiation that was my workaround. The key is to use Method->{"IndexReduction"->Automatic} which Mathematica kindly suggests if you try to do NDSolve without it. It’s great: you only have to give initial conditions for one of the variables. Mathematica will figure out the initial conditions for the other variable(s) by using the constraint. Awesome!

Your thoughts? Here are some starters for you:

• I like this, I’ve been looking for better motivation and I’m going to use this to . . .
• I’m in this class and I found today very useful. Here’s why . . .
• I’m in this class and I thought today was a waste of time. Here’s why . . .
• Of course you don’t leave the ground on a parabolic hill, everyone knows that.
• Why didn’t you type up all the stuff you said in the screencasts? I hate watching videos.
• Thanks for the screencasts, do your students find them useful?
• Mathematica had that new method two years ago, I just decided not to tell you about it
• Here’s a suggestion for a sled/hill problem with a constraint that’s hard to solve for one of the variables . . .

## Setting up oral exams

I’m teaching Theoretical Mechanics this term and next week we have the first set of oral exams. Each student will take 9 oral exams, but each will only be five minutes long. With only 13 students in the course, each set only takes just over an hour. We’ll devote 9 days of class to this exercise, and I think they’ll be worth every minute.

I was thinking about the oral exams today as I was grading some of the screencast submissions from my students on the “I can derive the Euler/Lagrange equation” standard. A few of them had pretty good derivations, but there were tiny issues that I was disappointed to see that they didn’t nail. My first inclination was to give a “3 improvable” meaning that they could just try to figure out the tiny thing I care about to turn that into a 4 (the highest score on my Frank Noschese rubric). But then I remembered the oral exams and realized that I could give them 4s now but lay into whoever gets that standard for the oral exams next week. Every standard will be done roughly four times next week, so the whole class will be able to hear my tiny issues discussed.

I want to be clear here. I’m not saying that I’m giving them a good score now only to blast them next week. Instead I’m trying to really reflect on my rubric. If I’d brag about it, or it seems they could teach it well, that means they get a four. That’s not the same as saying they can do every single tiny detail that I have learned to watch for after teaching this course six times.

I also thought about the role oral exams play in some of the resource screencasts I make for the students. Today we were talking about applying the Euler/Lagrange approach to multi-dimensional problems and I offered up that I could fill in the details of the derivation if they weren’t super comfortable with just saying “ah, ok, I could believe that you’d get the same equation for every variable.” At the end of the day our “I can” statement was “I can do an interesting multi-dimensional Lagrangian problem” so when I sat down to do the screencast, I realized that it really was only needed if the standard had ended up being “I can derive . . .” But that’s when it hit me that the students who watch and study that screencast are going to be much better off if they get that standard in the oral exam. They’ll have an “interesting” problem worked out, but I might just ask why they can use that differential equation in the first place. Is it fair? You bet! They need to know this stuff. They need to know where it comes from and why it matters, not just how to turn the crank.

• I like this approach. I think for my oral exams next time I’ll . . .
• I’m in this class and I think this is great. I think it especially makes sense for the standard on . . .
• I’m in this class and I think this is dumb. It’s a double standard! If you want us to know the tiny details, you should lay them all out for us so that we can just read them back for you.