Still no word on whether the UVic Whites have made the 2010 ACM ICPC world programming finals in Harbin, China, but we’re going to forge ahead with practices as if we have made it. Based on everything I could research, we have a very good chance. In the last two years, there’s been 100 teams at the finals, and both times there’s been three from the Pacific Northwest region. UBC^ was in the exact same position as us last year and they went, so I am confident we’ll make it.
I gave everyone the week off last week, however the guys finished off all the problems from last week’s regional on their own. Dan also re-coded his solution to problem D using a line sweep algorithm rather than the more direct but less efficient N^3 solution that he submitted during the competition.
As for world finals preparation, there’s some key areas that the guys need to work on: dynamic programming, search problems, hard geometry problems, and coding speed. I’ve started compiling a list of these kinds of problems. We’ll also spend some time going over previous world finals. I’ve also started compiling a list of algorithms and techniques that guys should learn.
For example:
- Weighted matching, assignment problem, hungarian method
- Linear programming, Pick’s theorem
- Advanced geometry techniques (line sweeping, convex hull, 3D convex hull)
- String matching (KMP)
- Number theory: primitive roots, extended euclid’s algorithm, prime numbers
I plan to continue having two practices a week, Wednesday evenings and Saturday or Sunday. Wednesday’s will be for skill development while the weekend practice will be for practice competitions, hopefully with some other Canadian teams. I will be away over Christmas to early January, so the guys will need to practice on their own then. Also, at least one of the team members will be in another city starting next term for a co-op so full team practices in January will probably be out of the question.
I plan to spend the next two Wednesday’s working on the team’s most obvious weak area: dynamic programming(DP)/memoization. They tend to shy away from these problems and have convinced themselves that they aren’t very good at them. I think they just need more practice.
Many DP problems have very distinct patterns that give the solver clues about what to memoize or DP on. For example, recently in one of the team’s practice competitions there was a problem where you had to figure out the minimum number of days that you could assign several different allergens for testing (see problem statement). An allergen can be assigned at 8am every morning and symptoms for the allergen can be detected by 8pm at night. However, if there is more than one allergen in the body, you can’t uniquely identify the allergen causing the issue. Different allergens take different numbers of days to work their way out of a body, maximum of 8 days. You have to figure out how to assign the allergens (at most 20) such that all of them are detected, but such that the total number of days is minimized.
Let’s take a simple example. Consider allergens that take 1, 2, 3 and 7 days. One solution could be to assign the allergen that takes 7 days first, then the one that takes 3, then 2, and finally 1. So, the first allergen is assigned and detected the first day. We can’t assign the second allergen until the 6th day, and it will be detected the 8th day. Now we can assign the allergen taking 2 days, wait for it to run its course, and finally assign the last allergen. The total number of days is 11.
Alternatively, we could assign the allergen taking 2 days first, then the 7, 3, and 1, which only takes 10 days.
Just looking at the problem, it’s tempting to perhaps try greedy, but as you can see from the simple example I just went over, it’s not going to work. The other hint that greedy is probably the wrong approach is that there’s only 20 allergens. If greedy worked, there could be 100 or perhaps 1000 allergens.
The fact that it’s a minimization problem and that there are only 20 allergens are big hints that DP is the way to go. Furthermore, 20 is a curiously small number, which gives us a hint about a potential parameter to DP on. For DP, we want to compute optimal sub-problems and use those to figure out the solution to the whole problem. The sub-problem here is the minimum number of days for the assignment of X allergens. We can represent all allergen assignments in binary, where a number like 100101 means we’ve assigned the first, third, and last allergen so far.
When we apply a new allergen to an existing solution, we need to know how many new days we are adding to the existing solution. The only information we need in order to solve this problem is to know how many days in the existing solution the last applied allergen is active for. Together with our allergen assignments and the number of days the last applied allergen is active for, we can represent any potential solution. The solution would look something like the following (initialization code is not shown):
// allergens, D[0] is the number of days allergen 0 is active for
int D[21];
// DP array, all assignments by days last allergen is active
int M[1<<20+1][8];
// number of allergens in the current case
int k;
// try all possible allergen assignments
for(int i = 1; i < (1 << k); i++){
// try assigning allergen j
for(int j = 0; j < k; j++) {
// only try j if it's not in the solution
if(i&(1<<j)){
// try all days active options
for(int n = 0; n < 8; n++) {
// days active for j
int days_active = max(D[j] - n - 1, 0);
// solution without j
int t = i ^ (1 << j);
// keep track of best answer
M[i][days_active] = min(M[i][days_active],
M[t][n] + days_active);
}
}
}
}
All the best solutions involving all allergens will be stored at M[1<<20-1][i] for i=0 up to i=8. You simply need to take the minimum one as your answer.
There are many problems that can be solved similarly. If you do enough of these DP problems, you start to see the tricks faster. I find sometimes drawing out the call tree or drawing out dependencies between sub-problems as a graph can be an effective means for seeing the overlap between sub-problems.