I'm looking at optimizing class selection for the school I go to. We have a rank-by-preference system, because it's a relatively small school and classes need to be able to stay balanced. Here's my attempt at formalizing it:

There are c classes, and s students.

Each student has a tuple of n class selections, ranked in order of preference. Each student will only get into one class.

I'm trying to find / create and implement an algorithm to optimize the classes, so that each student gets into their highest choice possible. The problem is that there's a lot of material to sift through. So far I've done some basic research into mathematical programming, operations research, and combinatorics- they're all huge, complex fields, and I'm not sure where to look to find the information I need.

Any pointers? I don't want solutions, just a direction to go in. I'll post my eventual answer when I find it.

Edit: formatting

## Finding an optimization algorithm

**Moderators:** phlip, Moderators General, Prelates

- Nautilus
**Posts:**105**Joined:**Thu Feb 25, 2010 1:19 am UTC**Location:**That's a strange place to put a piano

### Finding an optimization algorithm

My flagella bring all the boys to the yard

- Yakk
- Poster with most posts but no title.
**Posts:**10879**Joined:**Sat Jan 27, 2007 7:27 pm UTC**Location:**E pur si muove

### Re: Finding an optimization algorithm

Your description does not describe a unique optimal solution in general, or even in most situations.

"Each student gets into their highest choice possible" does not generate a consistent solution, because the solutions are antagonistic to each other.

So you need to choose what the tradeoff is, so you can pick an actual function that can be optimized.

You'll want the function you choose to weigh a solution to have some reasonable properties probably, like making the same decision when two students swap choices.

As a toy example to illustrate issues, we'll allow 1 student per class, and denote students as ordered lists of classes by preference.

This order:

(A,B,C)

(B,C,A)

(C,A,B)

literally has no optimal solution (all solutions are equivalent, assuming students are treated symmetrically).

Other toy situations cause two people to be in conflict. The question becomes, is two people getting a 2nd place better or worse than one getting a 1st and another a 3rd? There are lots of reasonable ways to decide to weigh the "value" of a given allocation of students to classes.

In many cases, it isn't practical to find the most optimal solution (as you end up having to solve an NP-hard problem). So you build heuristics. You'll still probably want the weighing function so you can decide how well you did.

"Each student gets into their highest choice possible" does not generate a consistent solution, because the solutions are antagonistic to each other.

So you need to choose what the tradeoff is, so you can pick an actual function that can be optimized.

You'll want the function you choose to weigh a solution to have some reasonable properties probably, like making the same decision when two students swap choices.

As a toy example to illustrate issues, we'll allow 1 student per class, and denote students as ordered lists of classes by preference.

This order:

(A,B,C)

(B,C,A)

(C,A,B)

literally has no optimal solution (all solutions are equivalent, assuming students are treated symmetrically).

Other toy situations cause two people to be in conflict. The question becomes, is two people getting a 2nd place better or worse than one getting a 1st and another a 3rd? There are lots of reasonable ways to decide to weigh the "value" of a given allocation of students to classes.

In many cases, it isn't practical to find the most optimal solution (as you end up having to solve an NP-hard problem). So you build heuristics. You'll still probably want the weighing function so you can decide how well you did.

One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

- Nautilus
**Posts:**105**Joined:**Thu Feb 25, 2010 1:19 am UTC**Location:**That's a strange place to put a piano

### Re: Finding an optimization algorithm

s1:(A,B,C)

s2:(B,C,A)

s3:(C,A,B)

Does have an optimal solution:

A: s1

B: s2

C: s3

given that all students get into their first choice.

But

s1:(A,B,C)

s2:(A,B,C)

s3:(A,B,C)

doesn't, you're right. I can fix this with a weight function. What I'm wondering is: once I have that weight function, what is an algorithm I can use for the optimization? I can use a genetic (or maybe agent based) heuristic to find a solution, I know those pretty well. The thing is, I've had a somewhat sketchy training in CS and I don't know what the "traditional" algorithms for optimization are. It's okay if they're not optimal in this situation, it's more of a proof-of-concept project than anything else.

Sorry if I'm missing the point.

s2:(B,C,A)

s3:(C,A,B)

Does have an optimal solution:

A: s1

B: s2

C: s3

given that all students get into their first choice.

But

s1:(A,B,C)

s2:(A,B,C)

s3:(A,B,C)

doesn't, you're right. I can fix this with a weight function. What I'm wondering is: once I have that weight function, what is an algorithm I can use for the optimization? I can use a genetic (or maybe agent based) heuristic to find a solution, I know those pretty well. The thing is, I've had a somewhat sketchy training in CS and I don't know what the "traditional" algorithms for optimization are. It's okay if they're not optimal in this situation, it's more of a proof-of-concept project than anything else.

Sorry if I'm missing the point.

My flagella bring all the boys to the yard

- Yakk
- Poster with most posts but no title.
**Posts:**10879**Joined:**Sat Jan 27, 2007 7:27 pm UTC**Location:**E pur si muove

### Re: Finding an optimization algorithm

Ya, sorry, I got a brain fart with the (A,B,C) thing.

Your choice of weight function basically decides that the optimal choices are. Even a reasonable restriction (it is monotonicly increasing for each student if they get a higher choice) only restricts you to Pareto optimial solutions.

Basically, your choice of weight function picks which solution is optimal.

The easiest optimization is the greedy algorithm -- pick a student at random, give them their first choice, repeat until done. A variation is adding some jitter to it -- so you sometimes see if dislodging a prior student from their optimal choice is useful. Hill climbing with mutation algorithms, or genetic algorithms, give yet other ways to examine the search space.

It is easy to introduce perverse incentives to such a scheme. Ie, where failing to have additional choices screws you, or picking 3 really popular choices as your first 3 gives you your 4th pretty much guaranteed (as opposed to picking your real choice first, then the 3 really popular as 2nd, 3rd and 4th, which might be more likely to get you your 5th choice...) On the other hand, if they don't get to examine it, and you change it fast enough, people can't learn the perverse incentives.

Your choice of weight function basically decides that the optimal choices are. Even a reasonable restriction (it is monotonicly increasing for each student if they get a higher choice) only restricts you to Pareto optimial solutions.

Basically, your choice of weight function picks which solution is optimal.

The easiest optimization is the greedy algorithm -- pick a student at random, give them their first choice, repeat until done. A variation is adding some jitter to it -- so you sometimes see if dislodging a prior student from their optimal choice is useful. Hill climbing with mutation algorithms, or genetic algorithms, give yet other ways to examine the search space.

It is easy to introduce perverse incentives to such a scheme. Ie, where failing to have additional choices screws you, or picking 3 really popular choices as your first 3 gives you your 4th pretty much guaranteed (as opposed to picking your real choice first, then the 3 really popular as 2nd, 3rd and 4th, which might be more likely to get you your 5th choice...) On the other hand, if they don't get to examine it, and you change it fast enough, people can't learn the perverse incentives.

One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

### Re: Finding an optimization algorithm

Let me know if I'm missing something here, but can't you just solve this using Maximum Weight Matching? We have a bipartite graph, on the left, we have one node for each student, and on the right we have the classes. If class j has k_j spots (don't think you mentioned capacity constraints on each class, but obviously there has to be some, or else just put each student in their favorite class) then we have k_j identical copies of class j.

Then, we have a weighted edge (i, j) between student i and each copy of class j where the weight is whatever the weighting function for i's preference for attending class j.

You can find a perfect matching of maximum weight using Max-Flow (which is fast) and this will have the highest weight over all assigments of students to classes, and will assign one class to each student, and no more than k_j students to any class j.

You could even have different weighting functions for each student, so say, letting each student split up one total unit of weight depending on how much they want to attend each class.

Then, we have a weighted edge (i, j) between student i and each copy of class j where the weight is whatever the weighting function for i's preference for attending class j.

You can find a perfect matching of maximum weight using Max-Flow (which is fast) and this will have the highest weight over all assigments of students to classes, and will assign one class to each student, and no more than k_j students to any class j.

You could even have different weighting functions for each student, so say, letting each student split up one total unit of weight depending on how much they want to attend each class.

- Yakk
- Poster with most posts but no title.
**Posts:**10879**Joined:**Sat Jan 27, 2007 7:27 pm UTC**Location:**E pur si muove

### Re: Finding an optimization algorithm

Neat! Yes, I think that works.

The perfect matching on the bipartite graph does restrict us to valid solutions. And the maximum weight condition produces the optimal solution from said set.

It isn't all that cheap however -- O( V^2 E ) cost ( or V^2 log V + EV with extra effort ). V = {students, places in a class * number of classes}, which can be large. E = students * places in a class * number of choices each student gets, which can be larger -- but this is far, far faster than NP-hard!

The symmetry of the problem makes me guess that we could do way better than this...

Neat.

The perfect matching on the bipartite graph does restrict us to valid solutions. And the maximum weight condition produces the optimal solution from said set.

It isn't all that cheap however -- O( V^2 E ) cost ( or V^2 log V + EV with extra effort ). V = {students, places in a class * number of classes}, which can be large. E = students * places in a class * number of choices each student gets, which can be larger -- but this is far, far faster than NP-hard!

The symmetry of the problem makes me guess that we could do way better than this...

Neat.

Last edited by Yakk on Tue Dec 20, 2011 8:38 pm UTC, edited 1 time in total.

One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

### Re: Finding an optimization algorithm

If you're really worried about speed, Matching is a very well studied problem which is the basis for a number of other applications, so it's likely to have very fast heuristics which speed up the algorithm for reasonable inputs. It's not the kind of thing you prove theorems about, but I'd be willing to bet that there are existing matching finders which are blazingly fast.

- Nautilus
**Posts:**105**Joined:**Thu Feb 25, 2010 1:19 am UTC**Location:**That's a strange place to put a piano

### Re: Finding an optimization algorithm

That's really helpful. After some wikipediing, I've discovered that the problem of assigning students to classes is an instance of the assignment problem. Go figure.

It can be solved with max-flow or the "augmenting path" algorithm in O(V^2 E), or with the hungarian algorithm in O(S^3), where S is the number of students. I haven't checked which is more efficient yet.

It can be solved with max-flow or the "augmenting path" algorithm in O(V^2 E), or with the hungarian algorithm in O(S^3), where S is the number of students. I haven't checked which is more efficient yet.

My flagella bring all the boys to the yard

- Yakk
- Poster with most posts but no title.
**Posts:**10879**Joined:**Sat Jan 27, 2007 7:27 pm UTC**Location:**E pur si muove

### Re: Finding an optimization algorithm

While S^3 < V^2E, as both V and E are at least as big as S, this is only by a constant factor (even if it is something like "number of students in a class"). At ~30 students in a class, that is a factor of 2700 -- which is a reasonably large constant.

In addition, hungarian is aimed directly at the problem, while augmenting path algorithm is a general algorithm which your problem can be modified to be a case of. The augmenting path algorithm might have more research/optimization aimed at it, but I'd start out with the algorithm that is directly aimed at your problem. It will probably be easier to implement, as you don't have to transform the problem, and quite possibly will be much faster. If it ends up being too slow for your problem size, then go poke at the other one.

In addition, hungarian is aimed directly at the problem, while augmenting path algorithm is a general algorithm which your problem can be modified to be a case of. The augmenting path algorithm might have more research/optimization aimed at it, but I'd start out with the algorithm that is directly aimed at your problem. It will probably be easier to implement, as you don't have to transform the problem, and quite possibly will be much faster. If it ends up being too slow for your problem size, then go poke at the other one.

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

### Who is online

Users browsing this forum: No registered users and 0 guests