Data structures for Project Euler problems

Not all PE problems are based on mathematics alone, to solve some PE problems you need to have a good knowledge of data structures. Some data structures that help solve some PE problems are:

  1. HashSet or TreeSet – to remove duplicates
  2. PriorityQueue – to access elements in some well-defined order
  3. Union-Find – to keep track of the number of components while adding edges to a graph.
  4. Persistent Collections – There are some PE problems where immutable/persistent collections are needed to implement solutions efficiently, so take a look at Scala immutable/mutable collections. There are also some libraries available for Java: PCollections.
Advertisements

Product of the form 6k + 1

Given 6k + 1 = x_1 \times x_2 \times \ldots \times x_n, what can we say about x_1, x_2, \ldots x_n. Note that (6a + 1) \times (6b + 1) = (6c + 1) and (6a + 5) \times (6b + 5) = (6c + 1). So, either all x_i are of the form (6a + 1) or some x_i are of the form (6a + 1) and an even number of them are of the form (6a + 5). This kind of analysis is needed to solve a few PE problems.

Square-free positive integers

A positive integer is square-free, if it is not divisible by any perfect square greater than one. That is, if we look at the prime factorization of n, n = p_{1}^{m_1} \times p_{2}^{m_2} \times \ldots \times p_{k}^{m_k}. For n to be square-free, we must have m_i = 1 for 1 \le i \le k. All positive integers n can be represented as n = x \times y, where x is a perfect square and y is square-free. The probability that a randomly picked number is square-free is \frac{6}{\pi^{2}}.

There are a few problems in Project Euler that deal with square-free positive integers. Just a few days ago I solved a PE problem in Kotlin, it took 325 seconds to solve the problem. While I was thinking about solving the problem quicker, I realized that if a certain number was square-free, I should be able to prune some computations. Since we know that about 60.8 percent of all numbers are square-free, I went ahead and added a check to my code. This reduced the running time to 260 seconds. A reasonable improvement.

Here is the actual snippet that makes the square-free check:

  val (sqr_pfs, sqr_free_pfs) = split_square_square_free_pfs(m_pfs)
  var of = Long.MAX_VALUE
  var c_min = 1L
  var skip = false
  for ((p, _) in sqr_free_pfs) {
    val p_long = p.toLong()
    if (p_long > of) {
      skip = true
      break
    }
    of = of / p_long
    c_min = c_min * p_long
    if (c_min >= n_long) {
      skip = true
      break
    }
  }
  if (skip == false) {
    val trplts_for_a = triplets_for_a(n, a, sqr_pfs)
    result = result + trplts_for_a
  }

How to compute ceiling(a/b) for positive integers a, b?

One way to compute \left \lceil \frac{a}{b} \right \rceil is to convert a, b to double and then use the ceil function on the ratio \frac{a}{b} and then convert the result back to an integer. This way may leave us with some uneasiness about using floating point operations for something that could be computed using integer operations. So here is a simple way to do it with just integer operations.

uint64_t x_next(uint64_t n, uint64_t x) {
  auto n_div_x = n / x;
  auto n_mod_x = n % x;
  auto ceil_n_over_x = 0ULL;
  if (n_mod_x == 0ULL) {
    ceil_n_over_x = n_div_x;
  } else {
    ceil_n_over_x = n_div_x + 1ULL;
  }
  return ceil_n_over_x;
}

An alternate way to compute the same thing using only integer operations is:

  uint64_t i_ceil(uint64_t n, uint64_t x) {
    if (x == 0) {
      throw std::invalid_argument("Check failed: x > 0.");
    } else {
      return (n + x - 1ULL) / x;
    }
  }

How to list all the primes in the interval [1, n] for large n?

In my earlier post I mentioned the sieve as a very efficient way to identify the primes in the interval [1, n]. Now, what if n \sim 10^{9}. In most programming languages using a boolean array would take up to 1GB of RAM. Since, most languages represent a boolean as a byte even in an array. an exception is the C++ STL std::vector<bool> which represents a bool as a bit and allows us to sieve up to 10^{9} using only 125MB of RAM. If your language of choice does not have a bit vector like C++’s std::vector<bool>, then you can implement your own BitVector.

class BitVector(n : Int) {
  val bits : IntArray
  val size : Int

  companion object {
    const val SZ = 32
  }

  init {
    val n_div_sz = n / SZ
    val n_rem_sz = n % SZ
    var size = n_div_sz
    if (n_rem_sz == 0) {
      // Do nothing ...
    } else {
      size = size + 1
    }
    this.bits = IntArray(size)
    this.size = n
  }

  operator fun get(idx : Int) : Boolean {
    val idx_div_sz = idx / SZ
    val idx_rem_sz = idx % SZ
    val one = 1
    val mask = one shl idx_rem_sz
    val bit_group = bits[idx_div_sz] 
    val bit = bit_group and mask
    if (bit == 0) {
      return false
    } else {
      return true
    }
  }

  operator fun set(idx : Int, bit : Boolean) {
    val idx_div_sz = idx / SZ
    val idx_rem_sz = idx % SZ
    val one = 1
    val mask = one shl idx_rem_sz
    val bit_group = bits[idx_div_sz] 
    if (bit) {
      val new_bit_group = bit_group or mask
      bits[idx_div_sz] = new_bit_group
    } else {
      val new_bit_group = bit_group and mask.inv()
      bits[idx_div_sz] = new_bit_group
    }
    return
  }
}

Efficient way to list all the primes in the interval [1, n]

In one of the earlier posts, I mentioned deterministic miller-rabin test as a way of efficiently checking whether a number is prime or not. So, if we just wanted to list all the primes in the interval [1, n], then we could simply check whether each x \in [1, n] is a prime. This works for small n. For large n on the order of 10^{8}, we can do much better with a sieve.

fun sieve(n : Int) : BooleanArray {
    val sv = BooleanArray(n + 1, { _ -> true })
    var i = 2
    sv[0] = false
    sv[1] = false
    while ((i * i) <= n) {
        if (sv[i]) {
            var j = i + i
            while (j <= n) {
                sv[j] = false
                j = j + i
            }
        }
        i = i + 1
    }
    return sv
}