algorithm arrays

Find the element repeated more than n/2 times

There is an array (of size N) with an element repeated more than N/2 number of time and the rest of the element in the array can also be repeated but only one element is repeated more than N/2 times. Find the number.

I could think of few approaches:

  • Naive, keep the count of each number in a hash map.
  • Simplest, sort the array and the number at n/2+1 th index is the
    required number.
  • Keep count of only consecutive duplicate values found. Check
    separately for the pattern where the values are stored alternatively.

Unable to think of a better solution, there has to be.

There is a beautiful algorithm for solving this that works in two passes (total time O(N)) using only constant external space (O(1)). I have an implementation of this algorithm, along with comments including a correctness proof, available here

The intuition behind the algorithm is actually quite beautiful. Suppose that you were to have a roomful of people each holding one element of the array. Whenever two people find each other where neither is holding the same array element as the other, the two of them sit down. Eventually, at the very end, if anyone is left standing, there’s a chance that they’re in the majority, and you can just check that element. As long as one element occurs with frequency at least N/2, you can guarantee that this approach will always find the majority element.

To actually implement the algorithm, you make a linear scan over the array and keep track of your current guess as to what the majority element is, along with the number of times that you’ve seen it so far. Initially, this guess is undefined and the number of repeats is zero. As you walk across the array, if the current element matches your guess, you increment the counter. If the current element doesn’t match your guess, you decrement the counter. If the counter ever hits zero, then you reset it to the next element you encounter. You can think about this implementation as a concrete realization of the above “standing around in a room” algorithm. Whenever two people meet with different elements, they cancel out (dropping the counter). Whenever two people have the same element, then they don’t interact with each other.

For a full correctness proof, citation to the original paper (by Boyer and Moore of the more famous Boyer-Moore string matching algorithm), and an implementation in C++, check out the above link.