Today I spent a good part of my day solving a couple of problems with a couple of colleagues. It was a pretty typical day in the Dean’s office (I’ve been the Associate Dean for undergraduate programs in the College of Liberal Arts for a couple years now) in that there were logistical problems to be solved that I was able to help with. What was interesting was that for both problems I leveraged my experience in scripting algorithms to accomplish tough tasks. I thought it might be fun to document a little of what I did to help with arguments about whether learning scripting is a useful thing for folks to do. Certainly a lot of the programming that I do is for particular physics projects, but it’s been interesting how often my skills have come in handy in the Dean’s office.
Assigning New Student Mentors to FYSEMS
One of the hats I wear is that I’m the director of the First Year SEMinar program (FYSEM). One of the cool things we do with that program is to build a student success triangle for new students consisting of a FYSEM instructor, a Campus Colleague (a staff member at the institution), and a New Student Mentor (NSM). We find that triangle works really well, and right now we’re right in the midst of assigning NSMs to the FYSEMs scheduled for this fall.
About half the FYSEM instructors identify NSMs that they want to work with. The remaining NSMs give their top three preferences, and I use an algorithm I wrote to pair them with the remaining FYSEMs, trying to make them all collectively as happy as I can.
Basically I use the approach I lay out in these two posts. I randomly assign the NSMs to the available FYSEMs and look to see how happy everyone is. I generate a whole population of such assignments and determine which ones are the happiest. From those I choose a few and mutate them by randomly making a few switches. This produces the next generation, and I repeat the process.
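Here's a minimal sketch of that loop in Python. It's an illustration rather than my actual code: the scoring (3, 2, or 1 points for a first, second, or third choice), the population size, the number of generations, and all the function names are assumptions made up for the example.

```python
import random

def happiness(assignment, prefs):
    """Score an assignment: 3 points for a first choice, 2 for second, 1 for third."""
    total = 0
    for nsm, fysem in enumerate(assignment):
        if fysem in prefs[nsm]:
            total += 3 - prefs[nsm].index(fysem)
    return total

def mutate(assignment, rng, swaps=2):
    """Copy an assignment and randomly switch a few pairs of NSMs."""
    child = list(assignment)
    for _ in range(swaps):
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child

def evolve(prefs, n_fysems, pop_size=60, keep=10, generations=200, seed=0):
    """Return the happiest assignment found; assignment[nsm] is a FYSEM index."""
    rng = random.Random(seed)
    # Start with a population of completely random assignments.
    population = [rng.sample(range(n_fysems), n_fysems) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the happiest few, then refill the population with their mutants.
        population.sort(key=lambda a: happiness(a, prefs), reverse=True)
        parents = population[:keep]
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size - keep)]
        population = parents + children
    return max(population, key=lambda a: happiness(a, prefs))

# Five NSMs, five FYSEMs; each inner list is one NSM's top three, best first.
prefs = [[0, 1, 2], [0, 2, 3], [1, 0, 4], [3, 4, 0], [4, 3, 1]]
best = evolve(prefs, n_fysems=5)
print(best, happiness(best, prefs))
```

Because every mutation is a swap, each candidate stays a valid one-to-one pairing, so the fitness function never has to penalize double-booked FYSEMs.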
What’s interesting is that I’ve done this for three cycles now and each time I have to make small tweaks to the cost/fitness calculation. This year we had one extra NSM and I had to determine which FYSEM could get two to maximize the happiness of everyone. I didn’t want to re-build the whole system (which currently assumes the number of NSMs and FYSEMs is the same) but realized instead that I could just duplicate one of the FYSEMs and then run the algorithm. Of course that forces that FYSEM to be the one that gets the spare, but the whole thing runs fast enough that I just repeated that with every possible FYSEM to be the extra one and at the end looked for the happiest situation. It worked great!
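The duplication trick can be sketched like this in Python. To keep the example short and self-contained, a brute-force search over every ordering stands in for the evolutionary solver (fine for a toy case, hopeless for a real one), and the FYSEM names, preferences, and 3/2/1 scoring are all made up.

```python
from itertools import permutations

def happiness(assignment, prefs):
    """3 points for a first choice, 2 for second, 1 for third (assumed scoring)."""
    total = 0
    for nsm, fysem in enumerate(assignment):
        if fysem in prefs[nsm]:
            total += 3 - prefs[nsm].index(fysem)
    return total

def best_assignment(prefs, slots):
    """Brute-force stand-in for the solver: try every way to fill the slots."""
    return max(permutations(slots), key=lambda a: happiness(a, prefs))

def place_extra_nsm(prefs, fysems):
    """Duplicate each FYSEM in turn and keep the happiest overall result."""
    results = []
    for doubled in fysems:
        slots = fysems + [doubled]  # now there are as many slots as NSMs
        assignment = best_assignment(prefs, slots)
        results.append((happiness(assignment, prefs), doubled, assignment))
    return max(results, key=lambda r: r[0])

# Four NSMs but only three FYSEMs, so one FYSEM has to take two mentors.
fysems = ["A", "B", "C"]
prefs = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"], ["C", "A", "B"]]
score, doubled, assignment = place_extra_nsm(prefs, fysems)
print(score, doubled, assignment)
```

The solver itself never learns that a FYSEM can hold two NSMs; it just sees one more slot, which is why the existing code could be reused unchanged.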
Fixing hyperlinks in a Google Doc
We’ve got an important visit coming to campus this weekend by a team of observers. They’ve been given a bunch of linked documents to get them prepared for their visit, but we hit a technical snag. It seems the documents we sent occasionally have broken links. They’re not really broken; they just seem to lose the folder structure that’s built in. Regardless, we wanted to make sure that when they were here they definitely had the access they needed. I was talking with a colleague in the department and we wondered about using a structured Google Drive folder system as a backup. I thought it might work, but my colleague pointed out that all of the links came in with the wrong structure when we converted it all to Google Docs.
I said I could probably help, but I wanted to make sure that there was a clear path to doing it. He said that all the links end with the proper file name, and that those files were all in a different Drive folder. I said I could probably write a script to get it done, but I wasn’t sure how long it would take. I predicted two hours of learning to fix the first link and then two minutes to fix the other 246 links. He pointed out that he figured it would be 2-4 hours of his labor to do it, so it didn’t seem to be the obvious solution. However, I had the time and he wasn’t sure he could do it today, so off I went.
Long story short, I think it took me only a total of an hour to get the script working, and then it really was only two minutes to fix them all. Pretty cool!
First I just googled how to find links in a Google Doc and found this super helpful Stack Overflow post. It was frustrating to see how hard finding the links was, but I really loved two things about it: 1) It just hunkers down and deals with the fact that every character that’s part of a hyperlink has a connected url. That’s really a pain, but the code clearly just brute forces its way through until it gets to a character that doesn’t have a connected url. It only collects the url once, then spends the rest of its time hunting for the end. 2) It uses a very cool recursive approach, scraping any links it finds and, if it stumbles on a child of text, it just sends that child through the very same function.
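To show the recursive idea in miniature, here's a Python sketch. It is not the Stack Overflow code and doesn't touch Google's API: the `Node` class is a hypothetical stand-in for a document element, and the file paths are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """Hypothetical stand-in for a document element (paragraph, list item, text)."""
    text: str = ""
    url: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

def collect_links(node, found=None):
    """Gather (text, url) pairs from this node and, recursively, its children."""
    if found is None:
        found = []
    if node.url is not None:
        found.append((node.text, node.url))
    for child in node.children:
        collect_links(child, found)  # the very same function handles every level
    return found

# A tiny fake document with links buried at different depths.
doc = Node(children=[
    Node(text="Intro with no link"),
    Node(children=[Node(text="Syllabus", url="old/a/syllabus.pdf")]),
    Node(children=[Node(children=[Node(text="Budget", url="old/b/budget.xlsx")])]),
])
links = collect_links(doc)
print(links)
```

The function never needs to know how deep the document goes, which is what makes the recursive version so compact.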
So I made a loop that went through each found link, found the correct url, and then updated the text in the original Google Doc. What was super cool about the Stack Overflow code was that the elements I was dealing with (searching, finding children, doing replacements) were live in the sense that if you made a change, it actually changed them in the original doc. Very cool.
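The fixing loop boils down to something like the Python sketch below. It's a model of the idea rather than the script itself: the links are plain dictionaries instead of live Doc elements, and the example URLs, the `new_urls` lookup table, and the function names are all invented for illustration.

```python
import posixpath
from urllib.parse import unquote, urlparse

def file_name(url):
    """Return the last path segment of a URL, e.g. 'budget.xlsx'."""
    return posixpath.basename(unquote(urlparse(url).path))

def fix_links(links, new_urls):
    """Point each link at the new location of the file it names.

    `links` is a list of dicts with a 'url' key; they are changed in place,
    loosely mimicking how edits to the live elements change the doc itself.
    Returns the names that were fixed and the names with no match.
    """
    fixed, missing = [], []
    for link in links:
        name = file_name(link["url"])
        if name in new_urls:
            link["url"] = new_urls[name]
            fixed.append(name)
        else:
            missing.append(name)
    return fixed, missing

links = [
    {"text": "Budget", "url": "https://example.com/old/path/budget.xlsx"},
    {"text": "Syllabus", "url": "https://example.com/old/Syllabus%20A.pdf"},
    {"text": "Mystery", "url": "https://example.com/old/unknown.docx"},
]
new_urls = {
    "budget.xlsx": "https://example.com/new/budget.xlsx",
    "Syllabus A.pdf": "https://example.com/new/Syllabus%20A.pdf",
}
fixed, missing = fix_links(links, new_urls)
print(fixed, missing)
```

Returning the unmatched names gives a quick list of anything that still needs fixing by hand.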
When I first ran it I was super happy and I called another colleague to double check that all the new links were right. However, during that phone call I noted that a bunch hadn’t been fixed. All the ones that were part of bullet points were untouched! So I spent a half hour trying to understand what was different about them. Unfortunately I could have saved all that time if I’d just read the comments under the Stack Overflow code. It turns out that code assumes that all hyperlinks have unlinked characters at the end. The bullet point ones didn’t have that! A simple adjustment to one of the if/then conditions fixed it and I was done!
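Here's the gist of that bug in a small Python model. This is not the Stack Overflow code: the text is represented as a hypothetical list holding one url (or `None`) per character, and the scan looks for runs of linked characters. Without the final check, a link that runs right up to the end of the text never gets recorded.

```python
def find_link_runs(char_urls):
    """Return (start, end_exclusive, url) for each run of linked characters.

    `char_urls` holds one entry per character: the url attached to that
    character, or None if the character is not part of a link.
    """
    runs = []
    start, current = None, None
    for i, url in enumerate(char_urls):
        if url != current:
            if current is not None:
                runs.append((start, i, current))  # the previous link just ended
            start, current = i, url
    # The adjustment: a link that runs to the very end never meets an
    # unlinked character, so it has to be closed out here.
    if current is not None:
        runs.append((start, len(char_urls), current))
    return runs

# "See doc" where only "doc" is linked, as in a bullet point ending on a link.
bullet = [None, None, None, None, "u1", "u1", "u1"]
print(find_link_runs(bullet))
```

Note that the url is only picked up once per run; every other linked character is just compared against it while hunting for the end.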
Other Dean’s office tasks where scripting has come in handy:
- Student evaluation comparisons
- Finding student paths through our curriculum
- A neural network trying to understand what makes our students successful
- Sending individualized Dean’s List notifications
- Sending emails to students on our Early Alerts list and connecting them to proper resources
- Collecting applications for general education requirements and assigning them to random members of the Undergraduate Curriculum Committee
- Helping folks transition off of committees by changing the owner on thousands of documents
- Making graphs of the enrollment trends for courses
Here are some starters for you:
- I really like how you paired up the NSMs. What I liked the most was . . .
- I don’t understand why you don’t just randomly assign the NSMs. Who cares about their preferences?
- Can you tell me more about the triangle, especially the Campus Colleagues?
- In your 7-year-old post you talk about Nobel Prize work on the pairing problem and yet it seems you still haven’t read and applied their work. Jerk.
- Why don’t you just print out all the evidence docs and put them in an awesome binder for the visitors?
- Are you saying there should be some technical requirements to be an Associate Dean?
- Why don’t you just write Ass Dean?
- I have a pairing problem. Can you write an algorithm for me? It would have to be in Python.
- Google wants everyone to just use search to find everything. Why didn’t you just strip the hyperlinks and tell people to just search for any evidence they need?
- Let me get this straight: Half this post is just riding on the coat tails of someone who wrote a Stack Overflow answer. How can you call yourself a programmer?