Finding the median of an unsorted array

Question

To find the median of an unsorted array, we can make a min-heap in O(n log n) time for n elements, and then extract n/2 elements one by one to get the median. But this approach would take O(n log n) time.

Can we do the same by some method in O(n) time? If we can, please suggest a method.
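For reference, the heap-based baseline described in the question might look like the sketch below. The function name `median_by_heap` is hypothetical, and averaging the two middle values for even n is an assumed convention. Note that `heapq.heapify` itself runs in O(n); it is the n/2 extractions that cost O(n log n).

```python
import heapq

def median_by_heap(values):
    """Median via a min-heap: heapify, then pop the smallest n // 2 + 1 items."""
    if not values:
        raise ValueError("median of an empty sequence is undefined")
    heap = list(values)      # copy so the caller's list is left untouched
    heapq.heapify(heap)      # O(n) bottom-up heap construction
    n = len(heap)
    # Each pop is O(log n), so this loop is the O(n log n) part.
    popped = [heapq.heappop(heap) for _ in range(n // 2 + 1)]
    if n % 2 == 1:
        return popped[-1]                    # single middle element
    return (popped[-2] + popped[-1]) / 2     # assumed: average the two middle elements
```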

User · Answer

Quickselect works in O(n) on average; it uses the same partition step as Quicksort.
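A minimal in-place sketch of that idea, assuming a Lomuto-style partition with a randomly chosen pivot (the names `partition` and `quickselect` are illustrative, not from any library). Expected time is O(n), but the worst case is O(n^2), for example when many elements are equal.

```python
import random

def partition(a, lo, hi):
    """Lomuto partition of a[lo..hi] around a randomly chosen pivot.

    Returns the pivot's final index; Quicksort uses the same step.
    """
    p = random.randint(lo, hi)
    a[p], a[hi] = a[hi], a[p]          # move the pivot to the end
    pivot = a[hi]
    store = lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]  # put the pivot in its final place
    return store

def quickselect(a, k):
    """Return the k-th smallest element (0-based) of a; reorders a in place."""
    if not 0 <= k < len(a):
        raise IndexError("k out of range")
    lo, hi = 0, len(a) - 1
    while True:
        p = partition(a, lo, hi)
        if p == k:
            return a[p]
        if p < k:
            lo = p + 1                 # answer lies right of the pivot
        else:
            hi = p - 1                 # answer lies left of the pivot
```

Unlike Quicksort, only one side of the partition is explored after each step, which is where the linear expected cost comes from.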

User · Answer

It can be done using the Quickselect algorithm in O(n); see kth order statistics (randomized algorithms).

User · Answer

Consider the more general problem of finding the kth smallest element in an unsorted array.

Divide the array into n/5 groups of 5 elements each. Let a1, a2, a3, ..., a(n/5) be the medians of the groups, and let x be the median of a1, a2, ..., a(n/5).

If k < n/2, we can remove the largest, 2nd largest and 3rd largest elements of every group whose median is greater than x. We can then call the function again on the remaining 7n/10 elements to find the kth smallest value.

Otherwise, if k > n/2, we can remove the smallest, 2nd smallest and 3rd smallest elements of every group whose median is smaller than x. We can then call the function again on the remaining 7n/10 elements to find the (k - 3n/10)th smallest value.

Time complexity analysis: let T(n) be the time needed to find the kth smallest element in an array of size n. Then

T(n) = T(n/5) + T(7n/10) + O(n)

Solving this gives T(n) = O(n), because n/5 + 7n/10 = 9n/10 < n.
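A rough Python sketch of this group-of-five scheme follows. It uses the common textbook formulation, which is an assumption on my part rather than a literal transcription of the steps above: it partitions the whole array around x and compares k with the rank of x, instead of discarding three elements per group. The names `select` and `median` are illustrative, and k is 0-based and counts from the smallest element.

```python
def select(items, k):
    """Return the k-th smallest element (0-based) of items in worst-case O(n)."""
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    if len(items) <= 5:
        return sorted(items)[k]
    # Median of each group of five (the last group may be shorter).
    groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
    medians = [g[len(g) // 2] for g in groups]
    # x: median of the group medians, found recursively (the T(n/5) term).
    x = select(medians, len(medians) // 2)
    smaller = [v for v in items if v < x]
    equal_count = sum(1 for v in items if v == x)
    larger = [v for v in items if v > x]
    if k < len(smaller):
        return select(smaller, k)          # the T(7n/10) term
    if k < len(smaller) + equal_count:
        return x
    return select(larger, k - len(smaller) - equal_count)

def median(items):
    """Lower median of items (assumed convention for even lengths)."""
    return select(items, (len(items) - 1) // 2)
```

This version builds new lists for clarity; an in-place variant would avoid the extra memory.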

User · Answer

You can use the Median of Medians algorithm to find the median of an unsorted array in linear time.
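To see why the running time is linear, here is a short sketch of the usual substitution argument. It assumes groups of five and the recurrence $T(n) \le T(n/5) + T(7n/10) + an$ for some constant $a$, ignoring floors and ceilings.

```latex
\begin{aligned}
\text{Assume } T(m) &\le c\,m \text{ for all } m < n. \\
T(n) &\le \frac{cn}{5} + \frac{7cn}{10} + an
      = \frac{9}{10}\,cn + an \\
     &\le cn \quad \text{whenever } c \ge 10a.
\end{aligned}
```

So $T(n) = O(n)$. The argument works because the two recursive calls together cover only $1/5 + 7/10 = 9/10$ of the input.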

User · Answer

As Wikipedia says, Median-of-Medians is theoretically O(N), but it is not used in practice because the overhead of finding "good" pivots makes it too slow.
http://en.wikipedia.org/wiki/Selection_algorithm

Here is Java source for a Quickselect algorithm to find the k'th element in an array:

    /**
     * Returns position of k'th smallest element of sub-list.
     *
     * @param list list to search, whose sub-list may be shuffled before
     *            returning
     * @param lo first element of sub-list in list
     * @param hi just after last element of sub-list in list
     * @param k zero-based rank within the sub-list
     * @return position of k'th smallest element of (possibly shuffled) sub-list.
     */
    static int select(double[] list, int lo, int hi, int k) {
        int n = hi - lo;
        if (n < 2)
            return lo;

        double pivot = list[lo + (k * 7919) % n]; // Pick a pseudo-random pivot

        // Triage list to [<pivot][=pivot][>pivot]
        int nLess = 0, nSame = 0, nMore = 0;
        int lo3 = lo;
        int hi3 = hi;
        while (lo3 < hi3) {
            double e = list[lo3];
            int cmp = compare(e, pivot);
            if (cmp < 0) {
                nLess++;
                lo3++;
            } else if (cmp > 0) {
                swap(list, lo3, --hi3);
                if (nSame > 0)
                    swap(list, hi3, hi3 + nSame);
                nMore++;
            } else {
                nSame++;
                swap(list, lo3, --hi3);
            }
        }
        assert (nSame > 0);
        assert (nLess + nSame + nMore == n);
        assert (list[lo + nLess] == pivot);
        assert (list[hi - nMore - 1] == pivot);
        if (k >= n - nMore)
            return select(list, hi - nMore, hi, k - nLess - nSame);
        else if (k < nLess)
            return select(list, lo, lo + nLess, k);
        return lo + k;
    }

I have not included the source of the compare and swap methods, so it's easy to change the code to work with Object[] instead of double[].

In practice, you can expect the above code to be O(N).

User · Answer

The answer is: "No, one can't find the median of an arbitrary, unsorted dataset in linear time." The best one can do as a general rule (as far as I know) is Median of Medians (to get a decent start), followed by Quickselect.

Ref: https://en.wikipedia.org/wiki/Median_of_medians

User · Answer

The quickselect algorithm can find the k-th smallest element of an array in linear (O(n)) expected running time. Here is an implementation in Python:

    import random

    def partition(L, v):
        smaller = []
        equal = []
        bigger = []
        for val in L:
            if val < v:
                smaller.append(val)
            elif val > v:
                bigger.append(val)
            else:
                equal.append(val)  # keep every copy of v so duplicates are not lost
        return (smaller, equal, bigger)

    def top_k(L, k):
        """Return the k smallest elements of L, in no particular order."""
        v = L[random.randrange(len(L))]
        (left, middle, right) = partition(L, v)
        if len(left) == k:
            return left
        if len(left) > k:
            return top_k(left, k)
        if len(left) + len(middle) >= k:
            return left + middle[:k - len(left)]
        return left + middle + top_k(right, k - len(left) - len(middle))

    def median(L):
        n = len(L)
        l = top_k(L, n // 2 + 1)  # floor division keeps k an integer in Python 3
        return max(l)

User · Answer

I have already upvoted @dasblinkenlight's answer, since the Median of Medians algorithm does in fact solve this problem in O(n) time. I only want to add that this problem can also be solved in O(n) time using heaps. A heap can be built in O(n) time using the bottom-up approach. See the following article for a detailed explanation: Heap sort.

Supposing that your array has N elements, you have to build two heaps: a MaxHeap that contains the first N/2 elements (or (N/2)+1 if N is odd) and a MinHeap that contains the remaining elements. If N is odd, your median is the maximum element of the MaxHeap (O(1), by getting the max). If N is even, your median is (MaxHeap.max() + MinHeap.min()) / 2, which also takes O(1). Thus, the real cost of the whole operation is building the heaps, which is O(n).

BTW, this MaxHeap/MinHeap algorithm also works when you don't know the number of array elements beforehand (for example, if you have to solve the same problem for a stream of integers). You can see more details about how to solve this problem in the following article: Median Of integer streams.
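For the streaming case, a minimal sketch of the two-heap idea using Python's `heapq` might look like this. The class name `RunningMedian` and its methods are hypothetical, and the max-heap is simulated by storing negated values, which assumes numeric input. In this sketch each `add` costs O(log n), so feeding n elements one at a time costs O(n log n) overall, while reading the median is O(1).

```python
import heapq

class RunningMedian:
    """Median of a stream using a max-heap (lower half) and a min-heap (upper half)."""

    def __init__(self):
        self._low = []    # max-heap of the lower half, stored as negated values
        self._high = []   # min-heap of the upper half

    def add(self, value):
        # Push into the lower half, then move its maximum to the upper half,
        # so every element of _low is <= every element of _high.
        heapq.heappush(self._low, -value)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance: the lower half keeps the extra element when the count is odd.
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("no values added yet")
        if len(self._low) > len(self._high):
            return -self._low[0]                      # odd count: top of the max-heap
        return (-self._low[0] + self._high[0]) / 2    # even count: average of the two tops
```

A typical use is to call `add` for each incoming value and query `median` whenever the current median is needed.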
