Algorithms Weekly by Petr Mitrichev: February 2017

Sunday, February 26, 2017

A 7x week

This week was extremely packed with competitions - I think this might be the new record for the number of scoreboards in one summary. Consequently, there were a couple nice problems, so if you haven't solved all of this week's competitions (:P), read on!

First off, Codeforces held its Round 399 on Monday (problems, results, top 5 on the left, analysis). y0105w49 and W4yneb0t had a fiery challenge competition and have managed to overtake the only contestant who solved all problems, V--o_o--V. Congratulations to all three!

TopCoder SRM 709 on Tuesday offered the deceptively little 800 points for the hard problem (problems, results, top 5 on the left, my screencast). Just 3 competitors have managed to grab (some part of) those points, and thus -XraY-'s dominating victory is even more impressive!

Codeforces Round 400 on Thursday continued the non-stop week (problems, results, top 5 on the left, analysis). The challenges were not in abundance this time, and thus the coding speed and accuracy came to the forefront. ainta was a tiny bit faster this time, and made fewer submissions that didn't pass pretests - congratulations on the win!

On Saturday, Mujin Programming Challenge 2017 on AtCoder offered monetary prizes for the top performers (problems, results, top 5 on the left, analysis, my screencast). Tourist has claimed the first prize by being way ahead of everybody on the hardest problem - well done!

I think problem B in this round provided a nice way to practice one's skill for dissecting a problem step by step until it becomes trivial. You were given a square n times n matrix with each cell either black or white. In one step, you could take an arbitrary row of the matrix, and replace an arbitrary column of the matrix with the values from the chosen row. Your goal is to make all cells black. What is the minimum number of steps required to achieve that?

Right after the AtCoder round ended, Codechef held its February Lunchtime 2017 competition (problems, results, top 5 on the left, my screencast). The problems were a bit easier than in other competitions mentioned in this summary, but still provided a nice challenge!

Codeforces Round 402 started at the same time with the Open Cup round (problems, results, top 6 on the left). As a result, many active ACM ICPC participants have skipped the Codeforces Round, but the competition remained fierce. Congratulations to bmerry on finding a unique set of problems to solve and winning thanks to that strategy!

Finally, the aforementioned Open Cup 2016-17 Grand Prix of Wroclaw also took place today (results, top 5 on the left). Problem F was right in the middle by difficulty level, and required a nice observation to come up with a O(n²) solution.

Consider a permutation of size n. For each number, we can compute its radius: the largest number r such that r numbers to the left of our number, and r numbers to the right of our number, are all smaller than that number. For example, for the permutation "1 3 2 6 4 5" the radii are "0 1 0 2 0 0". Given the sequence of radii, how many different permutations correspond to it?

Thanks for reading, and check back next week! (which promises to be much calmer)

Sunday, February 19, 2017

A casino week

Codeforces Round 397 was the highlight of this week (problems, results, top 5 on the left, analysis). Merkurev has created a comfortable margin for himself thanks to fast solutions of the first five problems and two successful challenges - congratulations on the victory!

In my previous summary, I have mentioned two interesting problems. The first was a TopCoder problem about distinguishing optimal strategy and random play: you are playing a betting game in a casino. You start with a dollars, and your goal is to reach b dollars. You can play at most k rounds. In each round, you bet some integer (possibly zero) amount of dollars not exceeding the amount you have now, then with probability 50% you lose that amount, and with probability 50% you gain that amount. As soon as you reach your goal (have at least b dollars), you leave the casino.

There are of course many strategies to place your bets. In this problem we're concerned with two of them: one is an optimal strategy, where you place each bet to maximize the probability that you will reach your goal by the end of the k-th round or earlier. Note that there might still be several choices for each bet in an optimal strategy that all lead to the same probability of achieving the goal. One of the optimal strategies can be somewhat easily googled. The other strategy we're concerned with is the fully random one: for each bet, we choose the number of dollars between 0 and our amount of money uniformly at random.

Now, you've chosen to use the fully random strategy, and played it to completion (end of k-th round, or getting at least b dollars). An outside observer (who knows the values of a, b and k), however, was assuming that you're actually following an optimal strategy - and did not find evidence otherwise observing your fully random play (in other words, each time you randomly chose a bet that leads to the maximum probability of achieving the goal). What is the probability of such situation?

a and b are up to 100000, and k is at most 5.

A quadratic dynamic programming solution is straightforward: given the amount of money we have and the number of rounds remaining, we will compute what's the probability of winning, and what's the probability that a random play will look optimal. In order to compute those probabilities, we will iterate over all possible bets and see what state to we get into if we win and if we lose.

The key insight required to speed it up is: the probability of winning is always a rational number with denominator 2^k(because our play consists of k events each with 50-50 chances). Because of this, it has at most 2^k+1 possible values. On the other hand, it's not hard to see that this probability will not decrease if we increase the amount of money we have. That means that for a fixed k, the probabilities of winning for each a form a set of at most 2^k+1 segments, each with the same value. And that, in turn, allows to speed up the dynamic programming: instead of considering possible bets one by one, we will consider them in groups: a group is determined by which segments on level k-1 we land on if we win, and if we lose. There will be at most 2*(2^k-1+1) such groups, and thus the total running time will be O(b*k*2^k) which is fast enough.

The second problem from the previous summary was a purely mathematical puzzle: you are given an undirected graph with at most 200000 vertices and at most 200000 edges. Consider its spanning subsets of edges: such subsets of its edges that connect all its vertices. Is the number of such subsets odd or even?

The solution here is completely magical. It turns out that the number of spanning subsets of edges is odd if and only if the graph is bipartite. A couple of different proofs can be found in this Codeforces thread, but I think they don't fully answer the natural question: what is the intuition behind this fact? How to come up with it without knowing it in advance? I would be grateful if the readers could shed some light here.

Thanks for reading, and check back next week!

A spanning week

TopCoder SRM 708 took place in the middle of last week (problems, results, top 5 on the left, my screencast, analysis). rng_58 returned to winning ways in his second SRM after the November TCO victory - congratulations!

The victory is mostly thanks to the high score on the hard problem (moreover, as you can see from the top 5 this problem decided all five places, not just the winner). It went like this: you are playing a betting game in a casino. You start with a dollars, and your goal is to reach b dollars. You can play at most k rounds. In each round, you bet some integer (possibly zero) amount of dollars not exceeding the amount you have now, then with probability 50% you lose that amount, and with probability 50% you gain that amount. As soon as you reach your goal (have at least b dollars), you leave the casino.

There are of course many strategies to place your bets. In this problem we're concerned with two of them: one is an optimal strategy, where you place each bet to maximize the probability that you will reach your goal by the end of the k-th round or earlier. Note that there might still be several choices for each bet in an optimal strategy that all lead to the same probability of achieving the goal. One of the optimal strategies can be somewhat easily googled. The other strategy we're concerned with is the fully random one: for each bet, we choose the number of dollars between 0 and our amount of money uniformly at random.

Now, you've chosen to use the fully random strategy, and played it to completion (end of k-th round, or getting at least b dollars). An outside observer (who knows the values of a, b and k), however, was assuming that you're actually following an optimal strategy - and did not find evidence otherwise observing your fully random play (in other words, each time you randomly chose a bet that leads to the maximum probability of achieving the goal). What is the probability of such situation?

a and b are up to 100000, and k is at most 5. Can you solve it as fast as Makoto?

Open Cup 2016-17 Grand Prix of China on the weekend provided another set of tough problems (results, top 5 on the left). The reigning ICPC world champions from SPb SU continued their winning streak - well done!

Problem B in this round was a cute purely mathematical puzzle. You are given an undirected graph with at most 200000 vertices and at most 200000 edges. Consider its spanning subsets of edges: such subsets of its edges that connect all its vertices. Is the number of such subsets odd or even?

In my last summary, I have mentioned two nice problems. The first one came from AtCoder: two players are playing a game on a tree where each node has some amount of stones. There's also a chip in one of the vertices of the tree. They make moves in turn, and each move consists of removing one stone from the vertex where the chip is, and then moving the chip to an adjacent vertex. When there's no more stones in the vertex where the chip is, the player that needs to make the next move is unable to do so, and thus loses the game. How to determine who will win with optimal play?

The key idea needed to solve this problem is: if the vertex a the chip is currently on has less or the same amount of stones as its adjacent vertex b, then the second player can force the game to never go beyond b in the tree. If the first player moves the chip to b, the second player can return it to a, and after repeating this a few times a will run out of stones earlier than b, so if the first player doesn't want to lose, he will have to pick a vertex other than b to move to.

Because of this, we can see that the outcome of the game does not change if we only ever allow to move from a vertex to a vertex with strictly fewer stones. The latter game is acyclic, and thus it's very easy to determine the winner in just a few lines of code.

Another problem I mentioned was from Gennady's contest: n<=400 people have participated in a programming contest, and will receive x<=10⁹ roubles of prize money in total. You do not know what is the prize for each place, you only know that each prize is an integer amount of roubles, that they add up to x, and that the prize amounts are in non-increasing order as we go through the ranking list. Given a subset of places in the ranking list (for example the 1st place, the 5th place and the 8th place), what is the smallest possible total prize for those places?

The main idea here is to realize that the prizes essentially form a Young diagram, and as it is often the case, reading the diagram column-by-column after writing it row-by-row proves super useful. Without abstract mathematical terms, one could just say that we can view the distribution of the prize money as the following process: we repeatedly pick a number m_i<=n, and give 1 dollar to the top m_i finishers. All m_i must add up to x.

Now, giving 1 dollar to the top m_i finishers gives some concrete amount f(m_i) dollars to the finishers in our chosen subset of places. Now we can see that we have a classical knapsack problem instance where the weight of each item is relatively small (m_i<=400), each item has a cost f(m_i), and we want to pick a multiset of items with the given total weight x and the smallest possible cost. This problem is solved by noticing that we must use the item with the smallest cost-to-weight ratio for most of the total weight - more precisely, we can keep using that item until the remaining total weight is below 400²=160000. And for such small total weight we can then solve the knapsack problem using the standard dynamic programming.

The proof of the fact that we only need to run the dynamic programming up to 400² is based on the following lemma: in a set of n numbers, there's always a non-empty subset with sum divisible by n.

Thanks for reading, and check back soon for this week's summary!

Saturday, February 11, 2017

A Gomel week

Last week started with the early morning TopCoder SRM 707 on Tuesday (problems, results, top 5 on the left). Nobody has managed to solve all three problems correctly - but not just because the hard problem was on the difficult side: the medium problem had an unusually low 19% success rate, with a deceptively simple statement but quite a few corner cases.

The winner Kriii also couldn't get the medium right, but has managed to get some compensation by successfully challenging two other medium solutions. Congratulations on the win!

Codeforces Round 395 took place on Thursday (problems, results, top 5 on the left, analysis). moejy0viiiiiv has solved everything with 24 minutes to spare, while nobody else was able to get all five. That's a very convincing victory!

AtCoder Grand Contest 010 on Saturday has gathered the top three rated contestants among others, and it was exactly those three contestants who occupied the top of the scoreboard (problems, results, top 5 on the left, analysis). Congratulations to Um_nik on being 8 minutes faster than everybody else!

The last problem turned out a bit easier than the organizers have expected. Two players are playing a game on a tree where each node has some amount of stones. There's also a chip in one of the vertices of the tree. They make moves in turn, and each move consists of removing one stone from the vertex where the chip is, and then moving the chip to an adjacent vertex. When there's no more stones in the vertex where the chip is, the player that needs to make the next move is unable to do so, and thus loses the game. How to determine who will win with optimal play?

And finally, OpenCup 2016-17 Grand Prix of Gomel resumed the Open Cup season on Sunday, featuring Gennady as the problemsetter (results, top 5 on the left). The reigning ACM ICPC world champions SPb SU Base were significantly faster than other teams, and thus found time to solve the hardest problem C in the last hour - great job!

Problem D has disguised a relatively standard idea behind an unusual but very natural background. n<=400 people have participated in a programming contest, and will receive x<=10⁹ roubles of prize money in total. You do not know what is the prize for each place, you only know that each prize is an integer amount of roubles, that they add up to x, and that the prize amounts are in non-increasing order as we go through the ranking list. Given a subset of places in the ranking list (for example the 1st place, the 5th place and the 8th place), what is the smallest possible total prize for those places?

In my previous summary, I have mentioned a Hacker Cup problem: you are given a sequence of n positive integers h_i, and a number k. You're allowed to do the following k times: take an integer, and decrease it by 1 (but it must remain positive). After you do this, we compute the maximum value of h_i+h_j+abs(i-j). How small can this maximum value be?

The first idea is to start with a binary search. Given a value m, can we achieve h_i+h_j+abs(i-j)<=m for all i not equal to j? Since abs(i-j)=max(i-j,j-i), this is equivalent to h_i+h_j+i-j<=m and h_i+h_j+j-i<=m. The second inequality is obtained from the first by swapping i and j, so they're equivalent and we can keep just one: h_i+h_j+i-j<=m. We can now rewrite it as (h_i+i)+(h_j-j)<=m.

Now, if we didn't have the constraint that i is not equal to j, we could define a=max(h_i+i), b=(h_j-j), and simply check if we can make a+b<=m. If we fix the value of a, then we know the (maximum value of) b, and thus know how much each height needs to be decreased, and thus can check if k decreases are enough. Of course we can't check all possible values of a this way, but we can notice that if we go through values of a in increasing order, the contribution of each h_i to the total required decrease will be piecewise-linear with at most 5 pieces, and thus we can add all those piecewise-linear curves together in O(nlogn) time.

Now, what to do with the additional constraint that i is not equal to j? It turns out that it doesn't matter in most cases. More specifically, for this constraint to matter both max(h_i+i) and max(h_j-j) must be achieved in the same point, and only in that point. In that case we have one h_i that is significantly bigger than all others, and thus decreasing this h_i will result in the answer decreasing as well. Because of this, we can always "transfer" one more decrease from any other number to that h_i without making the answer worse, unless all k decreases have been applied to it. And that means that in addition to the piecewise-linear sum mentioned above, it suffices to check just one more case: when all k decreases are applied to the maximum h_i.

Thanks for reading, and check back next week!

Friday, February 3, 2017

A k-ary week

Last week, the 200 participants of Facebook Hacker Cup Round 3 were fighting for the 25 spots in the Seattle onsite (problems, results, top 5 on the left, analysis). This time double- and triple-checking was not the best approach, as one needed to solve the first three problems quite quickly to get through. Congratulations to Egor on the victory!

The hardest problem required both a careful implementation and several key insights. You are given a sequence of n positive integers h_i, and a number k. You're allowed to do the following k times: take an integer, and decrease it by 1 (but it must remain positive). After you do this, we compute the maximum value of h_i+h_j+abs(i-j). How small can this maximum value be?

In my previous summary, I have mentioned two interesting problems. The first one, coming from TopCoder, asked to find any sequence of 500 positive integers up to 10¹⁸, such that any two adjacent numbers in this sequence are coprime, and any two non-adjacent numbers in this sequence are not coprime.

As usual with such "constructive" problems, a lot of different approaches are possible. I think the most beautiful approach is the randomized one, as described in the analysis. Let's take the first n primes, and each of the 500 numbers will be a product of some k of them. To build the sequence, we will pick each number at random from all possible products of some k of those primes that do not include the k primes used to build the previous number. This will guarantee that adjacent numbers will be coprime. After generating each number, we will also verify if it's coprime with any other number except the one directly preceding it. If yes, we'll regenerate it again at random, and so on until this number fits, then move on to the next number.

Why does this work, and how to pick n and k? I can only offer some advice partly based on intuition here. Obviously k must be chosen in such a way that the product of the largest k of the first n primes does not exceed 10¹⁸. Also, k must be close to n/2, but not too close :) Basically, the higher is k, the more likely it is for two random numbers with k bits out of n to share a common bit. However, when k approaches n/2 a different mechanic comes into play: the bits (primes used) in the sequence start to alternate, and thus the numbers at an odd distance are more likely to be coprime. Practically speaking, n=23 and k=10 seems to work with a comfortable margin. Still, it's not entirely clear to me how to expect this solution to work before trying it out - I think it requires the right kind of statistical intuition.

The other problem I mentioned was from AtCoder: you start with n zeros and m ones on a blackboard, and you're also given a number k such that k-1 divides n+m-1. You repeatedly pick any k numbers on the blackboard, erase them, and instead write their arithmetic mean (which is not necessarily an integer). Since k-1 divides n+m-1, you will eventually end up with just one number. How many different values can this last number have? n and m are up to 2000.

The right approach to this problem is to build a rigorous mental model of what's going on. We can view the computation as a rooted tree: the root of the tree corresponds to the final value, each non-leaf node has exactly k children corresponding to the values used to obtain the value of this node, and there are n+m leaf nodes corresponding to the original numbers, each containing either 0 or 1, with exactly m 1s. There will be d=(n+m-1)/(k-1) inner nodes in the tree.

We can then notice that the final value is uniquely determined by the depths of each 1 leaf node. More precisely, a 1 at distance p from root contributes 1/k^p to the final value. Now it's clear that the final value must be representable as x/k^q, where q<=d. For a given value, let's pick the minimum such q. How do we determine if it's representable given x and q?

Representing x/k^q as a sum of m terms of form 1/k^p is equivalent to representing x as a sum of m terms of form k^q-p. In other words, we need to represent x in k-ary system with sum of digits equal to m, but where each digit can be arbitrarily big. It's not hard to see that the minimum sum of digits is achieved if we use the standard representation of x in k-ary system, and all other representations are obtained by replacing one of the 1s with k smaller 1s. We can make at most d-q such replacements (since q out of d tree nodes are already used for the initial representation).

To summarize, for each q<=d we just need to count the amount of numbers x with q digits in k-ary system with sum of digits equal to m-(k-1)*y, where 0<=y<=d-q, and with nonzero last digit. This is now a standard dynamic programming task, and note that we didn't need to have any insights or magical ideas to arrive at it - just gradually expanding our understanding of the problem domain.

Thanks for reading, and check back next week!

A Limak week

TopCoder SRM 706 kicked off the activity-packed weekend of the Jan 16 - Jan 22 week (problems, results, top 5 on the left, analysis, my screencast). Our old friend, bear Limak, promised the reliably high quality of Errichto's problems, and the contest did not disappoint. I had other reasons to be disappointed, though, not being able to maintain the first place through the challenge phase. Kudos to ainta for the decisive +50!

The hardest problem had a deceptively easy statement: you need to find any sequence of 500 positive integers up to 10¹⁸, such that any two adjacent numbers in this sequence are coprime, and any two non-adjacent numbers in this sequence are not coprime. Can you see the way?

Right after the end of the SRM, Facebook Hacker Cup 2017 Round 2 narrowed the list of contenders for the cup to just 200 algorithmists (problems, results, top 5 on the left, analysis, my screencast). Given the extremely harsh judging system and the fact that the 1st and the 200th place were similarly useful, one had to find the right balance between double- and triple-checking the solutions, and moving on to the next problem. Congratulations to all 200 algorithmists who got this right, and to Marek for being the fastest!

On Sunday, AtCoder Grand Contest 009 has gathered roughly the same crowd again (problems, results, top 5 on the left, analysis). w4yneb0t has continued his string of excellent results, winning this round and cementing his second place in the overall AtCoder rating list - well done!

The last problem was a great example of the type where after developing a good understanding of the mechanics described in the problem statement, coming up with the actual solution is very straightforward. You start with n zeros and m ones on a blackboard, and you're also given a number k such that k-1 divides n+m-1. You repeatedly pick any k numbers on the blackboard, erase them, and instead write their arithmetic mean (which is not necessarily an integer). Since k-1 divides n+m-1, you will eventually end up with just one number. How many different values can this last number have? n and m are up to 2000.

And finally, Codeforces 8VC Venture Cup 2017 Final Round gave the contestants a shot at monetary prizes (problems, results, top 5 on the left, parallel round results, analysis). V--o_o--V, previously known as overtroll, was super fast through the first four problems, and has managed to solve the extremely tedious problem E, to win the first prize with a comfortable margin. Congratulations!

Thanks for reading, and check back soon for the last week's summary!