Tuesday, November 11, 2008

Long time no see

Hiya!

Today, I've got a small tip that has proved useful several times for me in ACM-style competitions. Suppose you have a program that does some calculations with integers. You've coded it using a 32-bit data type (say, 'int' in C++/Java), and then you figure out that the values you get don't fit into that type. No problem! You can easily switch to a 64-bit data type (say, 'long long' in C++ or 'long' in Java), and your program just works with numbers up to 9*10^18! But... It turns out that such numbers are not big enough as well. What to do now? Go for arbitrarily-long integers? But coding those in C++ takes reasonable time, especially if there're negative numbers involved; in Java, you'd have to rewrite all your code to ugly BigInteger syntax.

There's an intermediate solution that can allow you about 30 decimal digits almost without any hassle (it's only applicable when you use only addition, subtraction and multiplication, but not division): do the calculations with doubles AND longs. The basic idea is that imprecise doubles will give you the first, say, 13 digits correctly, and long will give you the last 18 digits, because it will contain the precise answer modulo 2^64.

More precisely: suppose we have the answer computed in a double variable 'x' and in a long variable 'y'. How do we get the full answer in a String? Here's a Java snippet that should mostly work (please don't use it at TopCoder :)):

 1: public static String build(double x, long y) {
2: String sx = String.format("%.0f", x);
3: if (sx.length() <= 18)
4: return "" + y;
5: sx = sx.substring(0, sx.length() - 18);
6: long sl = 0;
7: for (int i = 0; i < sx.length(); ++i)
8: sl = sl * 10 + sx.charAt(i) - '0';
9: for (int i = 0; i < 18; ++i)
10: sl *= 10;
11: String sy = String.format("%018d", y - sl);
12: return sx + sy;
13: }


Basically, lines 1-5 format all digits but the last 18, lines 6-10 convert the number formed by already formatted digits plus 18 zeroes to a long, and then subtracting that number from y gives us exactly the last 18 digits.

Apart from standard possible double precision issues, this approach has one more flaw - when the digits on the border of long and double are ..99999... or ...00000..., it may fail even if the precision error is tiny. For example, the above code won't correctly format 10^30. You can work around that by using BigInteger (which I intentionally didn't because it should be usable in C++ as well), or by manually adjusting for that case, but that brings too much hassle. Usually, for the numbers that you output in ACM-style problems, the above code is enough, and the joy of ACM is that you can submit, and do the more accurate coding only if you get WA :)

5 comments:

  1. And in case you use C++ for real (not competitions) then you'd probably want to (1) add #include <gmpxx.h>, (2) change int to mpz_class, and (3) add -lgmpxx -lgmp to the command line when you link.

    ReplyDelete
  2. Cool, thanks for the advice!

    Although I doubt I will ever need arbitrary-precision integers outside competitions :)

    ReplyDelete
  3. informative! keep writing.. we're listening :)

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. Petr, just a curiosity: in how many ACM contests did you participate?

    ReplyDelete