[algorithm] Counting inversions in an array

I'm designing an algorithm to do the following: Given array A[1... n], for every i < j, find all inversion pairs such that A[i] > A[j]. I'm using merge sort and copying array A to array B and then comparing the two arrays, but I'm having a difficult time seeing how I can use this to find the number of inversions. Any hints or help would be greatly appreciated.



One possible solution in C++ satisfying the O(N*log(N)) time complexity requirement would be as follows.

#include <algorithm>
#include <vector>

using namespace std;

vector<int> merge(vector<int>left, vector<int>right, int &counter)
{

    vector<int> result;

    vector<int>::iterator it_l=left.begin();
    vector<int>::iterator it_r=right.begin();

    int index_left=0;

    while(it_l!=left.end() || it_r!=right.end())
    {

        // the following is true if we are finished with the left vector 
        // OR if the value in the right vector is the smaller one.

        if(it_l==left.end() || (it_r!=right.end() && *it_r<*it_l) )
        {
            result.push_back(*it_r);
            it_r++;

            // increase inversion counter
            counter+=left.size()-index_left;
        }
        else
        {
            result.push_back(*it_l);
            it_l++;
            index_left++;

        }
    }

    return result;
}

vector<int> merge_sort_and_count(vector<int> A, int &counter)
{

    int N=A.size();
    if(N<=1) return A;   // nothing to sort (also guards against an empty input)

    vector<int> left(A.begin(),A.begin()+N/2);
    vector<int> right(A.begin()+N/2,A.end());

    left=merge_sort_and_count(left,counter);
    right=merge_sort_and_count(right,counter);


    return merge(left, right, counter);

}

It differs from a regular merge sort only by the counter.


The easy O(n^2) answer is to use nested for-loops and increment a counter for every inversion

int counter = 0;

for(int i = 0; i < n - 1; i++)
{
    for(int j = i+1; j < n; j++)
    {
        if( A[i] > A[j] )
        {
            counter++;
        }
    }
}

return counter;

Now I suppose you want a more efficient solution, I'll think about it.


The number of inversions can be found by analyzing the merge process in merge sort:

When an element is copied from the second array into the merged result, it keeps its place relative to the other elements. When an element is copied from the first array, it is inverted with every element still remaining in the second array, so it contributes one inversion per element left there. A small modification of merge sort therefore solves the problem in O(n ln n).
For example, just uncomment the two # lines in the merge sort Python code below to get the count.

def merge(l1,l2):
    l = []
    # global count
    while l1 and l2:
        if l1[-1] <= l2[-1]:
            l.append(l2.pop())
        else:
            l.append(l1.pop())
            # count += len(l2)
    l.reverse()
    return l1 + l2 + l

def sort(l): 
    t = len(l) // 2
    return merge(sort(l[:t]), sort(l[t:])) if t > 0 else l

count=0
print(sort([5,1,2,4,9,3]), count)
# [1, 2, 3, 4, 5, 9] 6

EDIT 1

The same task can be achieved with a stable version of quicksort, which is known to be slightly faster:

def part(l):
    pivot=l[-1]
    small,big = [],[]
    count = big_count = 0
    for x in l:
        if x <= pivot:
            small.append(x)
            count += big_count
        else:
            big.append(x)
            big_count += 1
    return count,small,big

def quick_count(l):
    if len(l)<2 : return 0
    count,small,big = part(l)
    small.pop()
    return count + quick_count(small) + quick_count(big)

With the pivot chosen as the last element, inversions are counted correctly, and the execution time is about 40% better than the merge version above.

EDIT 2

For performance in Python, a NumPy & Numba version:

First the NumPy part, which uses argsort, O(n ln n):

import numpy as np
import numba

def count_inversions(a):
    n = a.size
    counts = np.arange(n) & -np.arange(n)  # The BIT
    ags = a.argsort(kind='mergesort')    
    return  BIT(ags,counts,n)

And the Numba part for the efficient BIT approach:

@numba.njit
def BIT(ags,counts,n):
    res = 0        
    for x in ags :
        i = x
        while i:
            res += counts[i]
            i -= i & -i
        i = x+1
        while i < n:
            counts[i] -= 1
            i += i & -i
    return  res  

I had a question similar to this for homework, actually. I was restricted to a solution with O(nlogn) efficiency.

I used the idea you proposed of using mergesort, since it is already of the correct efficiency. I just inserted some code into the merging function that was basically: whenever a number from the right-hand array is added to the output array, add the number of elements remaining in the left-hand array to the running total of inversions.

This makes a lot of sense to me now that I've thought about it enough. You're counting, for each number, how many greater numbers come before it.
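
A minimal Python sketch of that merge step (my own illustrative code, not the exact homework solution):

def merge_count(left, right):
    """Merge two sorted lists; count inversions contributed by this merge."""
    merged, inversions = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of every element still waiting in `left`
            inversions += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions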

hth.


Here is one possible solution with a variation of a binary tree. It adds a field called rightSubTreeSize to each tree node. Keep inserting numbers into the binary tree in the order they appear in the array. Whenever a number goes to the left of a node, the inversion count for that element increases by (1 + that node's rightSubTreeSize), since all of those elements are greater than the current element and appeared earlier in the array. If the element goes to the right of a node, just increase that node's rightSubTreeSize. The code follows.

struct Node { 
    int data;
    Node *left, *right;
    int rightSubTreeSize;

    // initialize all fields; left/right start out empty
    Node(int d) : data(d), left(nullptr), right(nullptr), rightSubTreeSize(0) {}
};

Node* root = nullptr;
int totCnt = 0;
for(i = 0; i < n; ++i) { 
    Node* p = new Node(a[i]);
    if(root == nullptr) { 
        root = p;
        continue;
    } 

    Node* q = root;
    int curCnt = 0;
    while(q) { 
        if(p->data <= q->data) { 
            curCnt += 1 + q->rightSubTreeSize;
            if(q->left) { 
                q = q->left;
            } else { 
                q->left = p;
                break;
            }
        } else { 
            q->rightSubTreeSize++;
            if(q->right) { 
                q = q->right;
            } else { 
                q->right = p;
                break;
            }
        }
    }

    totCnt += curCnt;
  }
  return totCnt;
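
For comparison, a small self-contained Python sketch of the same idea (my own illustrative code; the tree is unbalanced, so the worst case is still O(n^2), and distinct elements are assumed as in the description above):

def count_inversions_bst(a):
    # Each node is a list: [value, left_child, right_child, right_subtree_size]
    root = None
    total = 0
    for x in a:
        if root is None:
            root = [x, None, None, 0]
            continue
        node = root
        while True:
            if x <= node[0]:
                # node and everything in its right subtree came earlier and are greater
                total += 1 + node[3]
                if node[1] is None:
                    node[1] = [x, None, None, 0]
                    break
                node = node[1]
            else:
                node[3] += 1
                if node[2] is None:
                    node[2] = [x, None, None, 0]
                    break
                node = node[2]
    return total

print(count_inversions_bst([6, 9, 1, 14, 8, 12, 3, 2]))  # 16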

The number of inversions in an array is half the total distance elements must be moved in order to sort the array. Therefore, it can be computed by sorting the array, maintaining the resulting permutation p[i], and then computing the sum of abs(p[i]-i)/2. This takes O(n log n) time, which is optimal.

An alternative method is given at http://mathworld.wolfram.com/PermutationInversion.html. This method is equivalent to the sum of max(0, p[i]-i), which is equal to the sum of abs(p[i]-i)/2 since the total distance elements move left is equal to the total distance elements move to the right.

EDIT: This method is wrong (see comments), and there is unfortunately no way to fix it while preserving the character of the method.


Most answers are based on MergeSort, but it isn't the only way to solve this in O(nlogn).

I'll discuss a few approaches.

  1. Use a Balanced Binary Search Tree

    • Augment your tree to store frequencies for duplicate elements.
    • The idea is to keep counting greater nodes when the tree is traversed from root to a leaf for insertion.

Something like this.

Node *insert(Node* root, int data, int& count){
    if(!root) return new Node(data);
    if(root->data == data){
        root->freq++;
        count += getSize(root->right);
    }
    else if(root->data > data){
        count += getSize(root->right) + root->freq;
        root->left = insert(root->left, data, count);
    }
    else root->right = insert(root->right, data, count);
    return balance(root);
}

int getCount(int *a, int n){
    int c = 0;
    Node *root = NULL;
    for(auto i=0; i<n; i++) root = insert(root, a[i], c);
    return c;
}
  2. Use a Binary Indexed Tree
    • Create a summation BIT.
    • Loop from the end and start finding the count of greater elements.
int getInversions(int[] a) {
    int n = a.length, inversions = 0;
    int[] bit = new int[n+1];
    compress(a);
    BIT b = new BIT();
    for (int i=n-1; i>=0; i--) {
         inversions += b.getSum(bit, a[i] - 1);
         b.update(bit, n, a[i], 1);
     }
     return inversions;
}
  3. Use a Segment Tree
    • Create a summation segment Tree.
    • Loop from the end of the array and query between [0, a[i]-1] and update a[i] with 1
int getInversions(int *a, int n) {
    int N = n + 1, c = 0;
    compress(a, n);
    int tree[N<<1] = {0};
    for (int i=n-1; i>=0; i--) {
        c+= query(tree, N, 0, a[i] - 1);
        update(tree, N, a[i], 1);
    }
    return c;
}

Also, when using a BIT or Segment Tree, a good idea is to do coordinate compression:

void compress(int *a, int n) {
    int temp[n];
    for (int i=0; i<n; i++) temp[i] = a[i];
    sort(temp, temp+n);
    for (int i=0; i<n; i++) a[i] = lower_bound(temp, temp+n, a[i]) - temp + 1;
}
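
The snippets above assume BIT/segment-tree helper routines (getSum, update, query) that are not shown here. For reference, a self-contained Python sketch of the BIT variant with coordinate compression might look like this (my own names, not the helpers used above):

def count_inversions_bit(a):
    # Coordinate compression: map each value to a rank in 1..m
    rank = {v: i + 1 for i, v in enumerate(sorted(set(a)))}
    m = len(rank)
    tree = [0] * (m + 1)  # Fenwick tree (1-indexed) over ranks

    def update(i):        # add 1 at rank i
        while i <= m:
            tree[i] += 1
            i += i & -i

    def prefix_sum(i):    # sum of counts for ranks 1..i
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & -i
        return s

    inversions = 0
    # Scan from the right: count already-seen elements with a strictly smaller value
    for v in reversed(a):
        r = rank[v]
        inversions += prefix_sum(r - 1)
        update(r)
    return inversions

print(count_inversions_bit([6, 9, 1, 14, 8, 12, 3, 2]))  # 16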


Since this is an old question, I'll provide my answer in C.

#include <stdio.h>

int count = 0;
int inversions(int a[], int len);
void mergesort(int a[], int left, int right);
void merge(int a[], int left, int mid, int right);

int main() {
  int a[] = { 1, 5, 2, 4, 0 };
  printf("%d\n", inversions(a, 5));
}

int inversions(int a[], int len) {
  mergesort(a, 0, len - 1);
  return count;
}

void mergesort(int a[], int left, int right) {
  if (left < right) {
     int mid = (left + right) / 2;
     mergesort(a, left, mid);
     mergesort(a, mid + 1, right);
     merge(a, left, mid, right);
  }
}

void merge(int a[], int left, int mid, int right) {
  int i = left;
  int j = mid + 1;
  int k = 0;
  int b[right - left + 1];
  while (i <= mid && j <= right) {
     if (a[i] <= a[j]) {
       b[k++] = a[i++];
     } else {
       printf("right element: %d\n", a[j]);
       count += (mid - i + 1);
       printf("new count: %d\n", count);
       b[k++] = a[j++];
     }
  }
  while (i <= mid)
    b[k++] = a[i++];
  while (j <= right)
    b[k++] = a[j++];
  for (i = left, k = 0; i <= right; i++, k++) {
    a[i] = b[k];
  }
}

Check this out: http://www.cs.jhu.edu/~xfliu/600.363_F03/hw_solution/solution1.pdf

I hope that it will give you the right answer.

  • 2-3 Inversion part (d)
  • Its running time is O(nlogn)

Java implementation:

import java.lang.reflect.Array;
import java.util.Arrays;


public class main {

public static void main(String[] args) {
    int[] arr = {6, 9, 1, 14, 8, 12, 3, 2};
    System.out.println(findinversion(arr,0,arr.length-1));
}

public static int findinversion(int[] arr,int beg,int end) {
    if(beg >= end)
        return 0;

    int[] result = new int[end-beg+1];
    int index = 0;
    int mid = (beg+end)/2;
    int count = 0, leftinv,rightinv;
    //System.out.println("...."+beg+"   "+end+"  "+mid);
    leftinv = findinversion(arr, beg, mid);
    rightinv = findinversion(arr, mid+1, end);
    l1:
    for(int i = beg, j = mid+1; i<=mid || j<=end;/*index < result.length;*/ ) {
        if(i>mid) {
            for(;j<=end;j++)
                result[index++]=arr[j];
            break l1;
        }
        if(j>end) {
            for(;i<=mid;i++)
                result[index++]=arr[i];
            break l1;
        }
        if(arr[i] <= arr[j]) {
            result[index++]=arr[i];
            i++;    
        } else {
            System.out.println(arr[i]+"  "+arr[j]);
            count = count+ mid-i+1;
            result[index++]=arr[j];
            j++;    
        }
    }

    for(int i = 0, j=beg; i< end-beg+1; i++,j++)
        arr[j]= result[i];
    return (count+leftinv+rightinv);
    //System.out.println(Arrays.toString(arr));
}

}

The maximum number of inversions possible for a list of size n is given by:

maxPossibleInversions = (n * (n-1)) / 2

So for an array of size 6, the maximum possible number of inversions is 15.
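
A quick sanity check of that formula in Python (an illustrative snippet of my own, separate from the JavaScript example below): a fully reversed list of length n reaches the maximum.

def brute_force_inversions(a):
    return sum(1 for i in range(len(a))
                 for j in range(i + 1, len(a)) if a[i] > a[j])

n = 6
print(brute_force_inversions(list(range(n, 0, -1))))  # 15
print(n * (n - 1) // 2)                               # 15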

To achieve a complexity of O(n log n), we can piggyback the inversion count on merge sort.

Here are the generalized steps:

  • Split the array into two
  • Call the mergeSort routine. Whenever an element in the left subarray is greater than the element in the right subarray during merging, do inversionCount += leftSubArray.length (the number of elements still remaining in the left subarray)

That's it!

This is a simple example, I made using Javascript:

var arr = [6,5,4,3,2,1]; // Sample input array

var inversionCount = 0;

function mergeSort(arr) {
    if(arr.length == 1)
        return arr;

    if(arr.length > 1) {
        let breakpoint = Math.ceil((arr.length/2));
        // Left list starts with 0, breakpoint-1
        let leftList = arr.slice(0,breakpoint);
        // Right list starts with breakpoint, length-1
        let rightList = arr.slice(breakpoint,arr.length);

        // Make a recursive call
        leftList = mergeSort(leftList);
        rightList = mergeSort(rightList);

        var a = merge(leftList,rightList);
        return a;
    }
}

function merge(leftList,rightList) {
    let result = [];
    while(leftList.length && rightList.length) {
        /**
         * The shift() method removes the first element from an array
         * and returns that element. This method changes the length
         * of the array.
         */
        if(leftList[0] <= rightList[0]) {
            result.push(leftList.shift());
        }else{
            inversionCount += leftList.length;
            result.push(rightList.shift());
        }
    }

    while(leftList.length)
        result.push(leftList.shift());

    while(rightList.length)
        result.push(rightList.shift());

    console.log(result);
    return result;
}

mergeSort(arr);
console.log('Number of inversions: ' + inversionCount);

The best optimized way is to solve it through merge sort, where during the merge itself we can check how many inversions are needed by comparing the left and right arrays. Whenever an element in the left array is greater than an element in the right array, it is an inversion.

Merge sort approach:

Here is the code. It is exactly the same as merge sort except for the snippet in the mergeToParent method, where I count the inversions in the else branch of (left[leftunPicked] < right[rightunPicked]).

public class TestInversionThruMergeSort {
    
    static int count =0;

    public static void main(String[] args) {
        int[] arr = {6, 9, 1, 14, 8, 12, 3, 2};
        

        partition(arr);

        for (int i = 0; i < arr.length; i++) {

            System.out.println(arr[i]);
        }
        
        System.out.println("inversions are "+count);

    }

    public static void partition(int[] arr) {

        if (arr.length > 1) {

            int mid = (arr.length) / 2;
            int[] left = null;

            if (mid > 0) {
                left = new int[mid];

                for (int i = 0; i < mid; i++) {
                    left[i] = arr[i];
                }
            }

            int[] right = new int[arr.length - left.length];

            if ((arr.length - left.length) > 0) {
                int j = 0;
                for (int i = mid; i < arr.length; i++) {
                    right[j] = arr[i];
                    ++j;
                }
            }

            partition(left);
            partition(right);
            mergeToParent(left, right, arr);
        }

    }

    public static void mergeToParent(int[] left, int[] right, int[] parent) {

        int leftunPicked = 0;
        int rightunPicked = 0;
        int parentIndex = -1;

        while (rightunPicked < right.length && leftunPicked < left.length) {

            if (left[leftunPicked] < right[rightunPicked]) {
                parent[++parentIndex] = left[leftunPicked];
                ++leftunPicked;

            } else {
                count = count + left.length-leftunPicked;
                if ((rightunPicked < right.length)) {
                    parent[++parentIndex] = right[rightunPicked];
                    ++rightunPicked;
                }
            }

        }

        while (leftunPicked < left.length) {
            parent[++parentIndex] = left[leftunPicked];
            ++leftunPicked;
        }

        while (rightunPicked < right.length) {
            parent[++parentIndex] = right[rightunPicked];
            ++rightunPicked;
        }

    }

}

Another approach is to compare the input array with a sorted copy. This is an implementation of Diablo's answer. Note that this should not be the preferred approach, since removing n elements from an array or list makes it O(n^2).

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;


public class TestInversion {

    public static void main(String[] args) {
        
        Integer [] arr1 = {6, 9, 1, 14, 8, 12, 3, 2};
        
        List<Integer> arr = new ArrayList(Arrays.asList(arr1));
        List<Integer> sortArr = new ArrayList<Integer>();
        
        for(int i=0;i<arr.size();i++){
            sortArr.add(arr.get(i));
         
        }
        
            
        Collections.sort(sortArr);
        
        int inversion = 0;
        
        Iterator<Integer> iter = arr.iterator();
        
        while(iter.hasNext()){
            
            Integer el = (Integer)iter.next();
            int index = sortArr.indexOf(el);
            
            if(index+1 > 1){
                inversion = inversion + ((index+1)-1);
            }
            
            //iter.remove();
            sortArr.remove(el);
            
        }
        
        System.out.println("Inversions are "+inversion);
        
        
        

    }


}

I recently had to do this in R:

inversionNumber <- function(x){
    mergeSort <- function(x){
        if(length(x) == 1){
            inv <- 0
        } else {
            n <- length(x)
            n1 <- ceiling(n/2)
            n2 <- n-n1
            y1 <- mergeSort(x[1:n1])
            y2 <- mergeSort(x[n1+1:n2])
            inv <- y1$inversions + y2$inversions
            x1 <- y1$sortedVector
            x2 <- y2$sortedVector
            i1 <- 1
            i2 <- 1
            while(i1+i2 <= n1+n2+1){
                if(i2 > n2 || i1 <= n1 && x1[i1] <= x2[i2]){
                    x[i1+i2-1] <- x1[i1]
                    i1 <- i1 + 1
                } else {
                    inv <- inv + n1 + 1 - i1
                    x[i1+i2-1] <- x2[i2]
                    i2 <- i2 + 1
                }
            }
        }
        return (list(inversions=inv,sortedVector=x))
    }
    r <- mergeSort(x)
    return (r$inversions)
}

C++ O(n lg n) solution that also prints the pairs which make up the inversion count.

#include <vector>
#include <iostream>
#include <climits>
using namespace std;

int merge(vector<int>&nums , int low , int mid , int high){
    int size1 = mid - low +1;
    int size2= high - mid;
    vector<int>left;
    vector<int>right;
    for(int i = 0  ; i < size1 ; ++i){
        left.push_back(nums[low+i]);
    }
    for(int i = 0 ; i <size2 ; ++i){
        right.push_back(nums[mid+i+1]);
    }
    left.push_back(INT_MAX);
    right.push_back(INT_MAX);
    int i = 0 ;
    int j = 0;
    int start  = low;
    int inversion = 0 ;
    while(i < size1 && j < size2){
        if(left[i]<=right[j]){   // use <= so equal elements are not counted as inversions
            nums[start] = left[i];
            start++;
            i++;
        }else{
            for(int l = i ; l < size1; ++l){
                cout<<"("<<left[l]<<","<<right[j]<<")"<<endl;
            }
            inversion += size1 - i;
            nums[start] = right[j];
            start++;
            j++;
        }
    }
    if(i == size1){
        for(int c = j ; c< size2 ; ++c){
            nums[start] = right[c];
            start++;
        }
    }
    if(j == size2){
        for(int c =  i ; c< size1 ; ++c){
            nums[start] = left[c];
            start++;
        }
    }
    return inversion;
}
int inversion_count(vector<int>& nums , int low , int high){
    if(high>low){
        int mid = low + (high-low)/2;
        int left = inversion_count(nums,low,mid);
        int right = inversion_count(nums,mid+1,high);
        int inversion = merge(nums,low,mid,high) + left + right;
        return inversion;
    }
    return 0 ;
}

Here is C code for counting inversions:

#include <stdio.h>
#include <stdlib.h>

int  _mergeSort(int arr[], int temp[], int left, int right);
int merge(int arr[], int temp[], int left, int mid, int right);

/* This function sorts the input array and returns the
   number of inversions in the array */
int mergeSort(int arr[], int array_size)
{
    int *temp = (int *)malloc(sizeof(int)*array_size);
    int inv_count = _mergeSort(arr, temp, 0, array_size - 1);
    free(temp);  /* release the scratch buffer */
    return inv_count;
}

/* An auxiliary recursive function that sorts the input array and
  returns the number of inversions in the array. */
int _mergeSort(int arr[], int temp[], int left, int right)
{
  int mid, inv_count = 0;
  if (right > left)
  {
    /* Divide the array into two parts and call _mergeSortAndCountInv()
       for each of the parts */
    mid = (right + left)/2;

    /* Inversion count will be sum of inversions in left-part, right-part
      and number of inversions in merging */
    inv_count  = _mergeSort(arr, temp, left, mid);
    inv_count += _mergeSort(arr, temp, mid+1, right);

    /*Merge the two parts*/
    inv_count += merge(arr, temp, left, mid+1, right);
  }
  return inv_count;
}

/* This function merges two sorted arrays and returns the inversion count in
   the arrays.*/
int merge(int arr[], int temp[], int left, int mid, int right)
{
  int i, j, k;
  int inv_count = 0;

  i = left; /* i is index for left subarray */
  j = mid;  /* j is index for right subarray */
  k = left; /* k is index for resultant merged subarray */
  while ((i <= mid - 1) && (j <= right))
  {
    if (arr[i] <= arr[j])
    {
      temp[k++] = arr[i++];
    }
    else
    {
      temp[k++] = arr[j++];

     /*this is tricky -- see above explanation/diagram for merge()*/
      inv_count = inv_count + (mid - i);
    }
  }

  /* Copy the remaining elements of left subarray
   (if there are any) to temp*/
  while (i <= mid - 1)
    temp[k++] = arr[i++];

  /* Copy the remaining elements of right subarray
   (if there are any) to temp*/
  while (j <= right)
    temp[k++] = arr[j++];

  /*Copy back the merged elements to original array*/
  for (i=left; i <= right; i++)
    arr[i] = temp[i];

  return inv_count;
}

/* Driver program to test above functions */
int main(void)
{
  int arr[] = {1, 20, 6, 4, 5};
  printf(" Number of inversions are %d \n", mergeSort(arr, 5));
  getchar();
  return 0;
}

An explanation was given in detail here: http://www.geeksforgeeks.org/counting-inversions/


Well, I have a different solution, but I am afraid it would work only for distinct array elements.

//Code
#include <bits/stdc++.h>
using namespace std;

int main()
{
    int i,n;
    cin >> n;
    int arr[n],inv[n];
    for(i=0;i<n;i++){
        cin >> arr[i];
    }
    vector<int> v;
    v.push_back(arr[n-1]);
    inv[n-1]=0;
    for(i=n-2;i>=0;i--){
        auto it = lower_bound(v.begin(),v.end(),arr[i]); 
        //find the first element in vector v that is not less than arr[i]
        inv[i]=it-v.begin();
        //its distance from the start of v = number of smaller elements already seen
        v.insert(it,arr[i]);
        //inserting that element into vector v
    }
    for(i=0;i<n;i++){
        cout << inv[i] << " ";
    }
    cout << endl;
    return 0;
}

To explain my code: we keep adding elements from the end of the array. For any incoming element, we find the index of the first element in vector v that is not smaller than it, and assign that index to the inversion count for the incoming element's position. After that, we insert the element into vector v at its correct position, so that v remains in sorted order.

//INPUT     
4
2 1 4 3

//OUTPUT    
1 0 1 0

//To calculate total inversion count just add up all the elements in output array
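
The same idea translates directly to Python with the bisect module (a sketch of my own, not part of the answer above; bisect_left mirrors lower_bound, so the same distinct-elements caveat applies, and list.insert keeps this O(n^2) in the worst case just like vector::insert):

import bisect

def inversion_counts_per_index(arr):
    # inv[i] = number of j > i with arr[j] < arr[i]
    seen = []                 # sorted list of elements to the right of position i
    inv = [0] * len(arr)
    for i in range(len(arr) - 1, -1, -1):
        pos = bisect.bisect_left(seen, arr[i])  # count of smaller elements already seen
        inv[i] = pos
        seen.insert(pos, arr[i])                # keep `seen` sorted
    return inv

print(inversion_counts_per_index([2, 1, 4, 3]))       # [1, 0, 1, 0]
print(sum(inversion_counts_per_index([2, 1, 4, 3])))  # total inversions: 2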

Here's my O(n log n) solution in Ruby:

def solution(t)
    sorted, inversion_count = sort_inversion_count(t)
    return inversion_count
end

def sort_inversion_count(t)
    midpoint = t.length / 2
    left_half = t[0...midpoint]
    right_half = t[midpoint..t.length]

    if midpoint == 0
        return t, 0
    end

    sorted_left_half, left_half_inversion_count = sort_inversion_count(left_half)
    sorted_right_half, right_half_inversion_count = sort_inversion_count(right_half)

    sorted = []
    inversion_count = 0
    while sorted_left_half.length > 0 or sorted_right_half.length > 0
        if sorted_left_half.empty?
            sorted.push sorted_right_half.shift
        elsif sorted_right_half.empty?
            sorted.push sorted_left_half.shift
        else
            if sorted_left_half[0] > sorted_right_half[0]
                inversion_count += sorted_left_half.length
                sorted.push sorted_right_half.shift
            else
                sorted.push sorted_left_half.shift
            end
        end
    end

    return sorted, inversion_count + left_half_inversion_count + right_half_inversion_count
end

And some test cases:

require "minitest/autorun"

class TestCodility < Minitest::Test
    def test_given_example
        a = [-1, 6, 3, 4, 7, 4]
        assert_equal solution(a), 4
    end

    def test_empty
        a = []
        assert_equal solution(a), 0
    end

    def test_singleton
        a = [0]
        assert_equal solution(a), 0
    end

    def test_none
        a = [1,2,3,4,5,6,7]
        assert_equal solution(a), 0
    end

    def test_all
        a = [5,4,3,2,1]
        assert_equal solution(a), 10
    end

    def test_clones
        a = [4,4,4,4,4,4]
        assert_equal solution(a), 0
    end
end

Note that the answer by Geoffrey Irving is wrong.

The number of inversions in an array is half the total distance elements must be moved in order to sort the array. Therefore, it can be computed by sorting the array, maintaining the resulting permutation p[i], and then computing the sum of abs(p[i]-i)/2. This takes O(n log n) time, which is optimal.

An alternative method is given at http://mathworld.wolfram.com/PermutationInversion.html. This method is equivalent to the sum of max(0, p[i]-i), which is equal to the sum of abs(p[i]-i)/2 since the total distance elements move left is equal to the total distance elements move to the right.

Take the sequence { 3, 2, 1 } as an example. There are three inversions: (3, 2), (3, 1), (2, 1), so the inversion number is 3. However, according to the quoted method the answer would have been 2.
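
A quick check in Python (my own snippet) confirms the discrepancy: for (3, 2, 1) the distance-based formula gives 2, while the true inversion count is 3.

def true_inversions(a):
    return sum(1 for i in range(len(a))
                 for j in range(i + 1, len(a)) if a[i] > a[j])

a = [3, 2, 1]
pos = {v: k for k, v in enumerate(sorted(a))}          # final position of each value
p = [pos[v] for v in a]                                # the permutation p[i]

print(true_inversions(a))                              # 3
print(sum(abs(p[i] - i) for i in range(len(a))) // 2)  # 2 -- the formula undercounts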


C code easy to understand:

#include<stdio.h>
#include<stdlib.h>

//To print an array
void print(int arr[],int n)
{
    int i;
    for(i=0,printf("\n");i<n;i++)
        printf("%d ",arr[i]);
    printf("\n");
}

//Merge Sort
int merge(int arr[],int left[],int right[],int l,int r)
{
    int i=0,j=0,count=0;
    while(i<l || j<r)
    {
        if(i==l)
        {
            arr[i+j]=right[j];
            j++;
        }
        else if(j==r)
        {
            arr[i+j]=left[i];
            i++;
        }
        else if(left[i]<=right[j])
        {
            arr[i+j]=left[i];
            i++;
        }
        else
        {
            arr[i+j]=right[j];
            count+=l-i;
            j++;
        }
    }
    //printf("\ncount:%d\n",count);
    return count;
}

//Inversion Finding
int inversions(int arr[],int high)
{
    if(high<1)
        return 0;

    int mid=(high+1)/2;
    int left[mid];
    int right[high-mid+1];

    int i,j;
    for(i=0;i<mid;i++)
        left[i]=arr[i];


    for(i=high-mid,j=high;j>=mid;i--,j--)
        right[i]=arr[j];

    //print(arr,high+1);
    //print(left,mid);
    //print(right,high-mid+1);

    return inversions(left,mid-1) + inversions(right,high-mid) + merge(arr,left,right,mid,high-mid+1);

}
int main()
{
    int arr[]={6,9,1,14,8,12,3,2};
    int n=sizeof(arr)/sizeof(arr[0]);
    print(arr,n);
    printf("%d ",inversions(arr,n-1));
    return 0;
}

Implementation of counting inversions in an array with merge sort in Swift:

Note that the number of swaps is incremented by

nSwaps += mid + 1 - iL 

(which is the number of elements remaining in the left half: the end index of the left half, plus one, minus the current index into the left half)

... because that is the number of elements which the element from the right half had to skip over (the number of inversions it resolves) in order to end up sorted.

func merge(arr: inout [Int], arr2: inout [Int], low: Int, mid: Int, high: Int) -> Int {
    var nSwaps = 0;

    var i = low;
    var iL = low;
    var iR = mid + 1;

    while iL <= mid && iR <= high {
        if arr2[iL] <= arr2[iR] {
            arr[i] = arr2[iL]
            iL += 1
            i += 1
        } else {
            arr[i] = arr2[iR]
            nSwaps += mid + 1 - iL
            iR += 1
            i += 1
        }
    }

    while iL <= mid {
        arr[i] = arr2[iL]
        iL += 1
        i += 1
    }

    while iR <= high {
        arr[i] = arr2[iR]
        iR += 1
        i += 1
    }

    return nSwaps
}

func mergeSort(arr: inout [Int]) -> Int {
    var arr2 = arr
    let nSwaps = mergeSort(arr: &arr, arr2: &arr2, low: 0, high: arr.count-1)
    return nSwaps
}

func mergeSort(arr: inout [Int], arr2: inout [Int], low: Int, high: Int) -> Int {

    if low >= high {
        return 0
    }

    let mid = low + ((high - low) / 2)

    var nSwaps = 0;
    nSwaps += mergeSort(arr: &arr2, arr2: &arr, low: low, high: mid)
    nSwaps += mergeSort(arr: &arr2, arr2: &arr, low: mid+1, high: high)
    nSwaps += merge(arr: &arr, arr2: &arr2, low: low, mid: mid, high: high)

    return nSwaps
}

var arrayToSort: [Int] = [2, 1, 3, 1, 2]
let nSwaps = mergeSort(arr: &arrayToSort)

print(arrayToSort) // [1, 1, 2, 2, 3]
print(nSwaps) // 4

Here is a C++ solution:

/**
 * While merging the sorted halves: if the first array's n'th element is greater than
 * some element of the second array, then all elements following n are greater as well.
 */
#include<stdio.h>
#include<iostream>
using namespace std;
int countInversions(int array[],int size);
int merge(int arr1[],int size1,int arr2[],int size2,int[]);
int main()
{
    int array[] = {2, 4, 1, 3, 5};
    int size = sizeof(array) / sizeof(array[0]);
    int x = countInversions(array,size);
    printf("number of inversions = %d",x);
}

int countInversions(int array[],int size)
{
    if(size > 1 )
    {
    int mid = size / 2;
    int count1 = countInversions(array,mid);
    int count2 = countInversions(array+mid,size-mid);
    int temp[size];
    int count3 = merge(array,mid,array+mid,size-mid,temp);
    for(int x =0;x<size ;x++)
    {
        array[x] = temp[x];
    }
    return count1 + count2 + count3;
    }else{
        return 0;
    }
}

int merge(int arr1[],int size1,int arr2[],int size2,int temp[])
{
    int count  = 0;
    int a = 0;
    int b = 0;
    int c = 0;
    while(a < size1 && b < size2)
    {
        if(arr1[a] <= arr2[b])  // use <= so equal elements are not counted as inversions
        {
            temp[c] = arr1[a];
            c++;
            a++;
        }else{
            temp[c] = arr2[b];
            b++;
            c++;
            count = count + size1 -a;
        }
    }

    while(a < size1)
    {
        temp[c] = arr1[a];
        c++;a++;
    }

while(b < size2)
    {
        temp[c] = arr2[b];
        c++;b++;
    }

    return count;
}

In Java, the brute-force algorithm can run faster than the piggybacked merge sort algorithm; this is because of the runtime optimization done by Java's dynamic (JIT) compiler.

For brute force, loop-unrolling optimization will give much better results.


public static int mergeSort(int[] a, int p, int r)
{
    int countInversion = 0;
    if(p < r)
    {
        int q = (p + r)/2;
        countInversion = mergeSort(a, p, q);
        countInversion += mergeSort(a, q+1, r);
        countInversion += merge(a, p, q, r);
    }
    return countInversion;
}

public static int merge(int[] a, int p, int q, int r)
{
    //p=0, q=1, r=3
    int countingInversion = 0;
    int n1 = q-p+1;
    int n2 = r-q;
    int[] temp1 = new int[n1+1];
    int[] temp2 = new int[n2+1];
    for(int i=0; i<n1; i++) temp1[i] = a[p+i];
    for(int i=0; i<n2; i++) temp2[i] = a[q+1+i];

    temp1[n1] = Integer.MAX_VALUE;
    temp2[n2] = Integer.MAX_VALUE;
    int i = 0, j = 0;

    for(int k=p; k<=r; k++)
    {
        if(temp1[i] <= temp2[j])
        {
            a[k] = temp1[i];
            i++;
        }
        else
        {
            a[k] = temp2[j];
            j++;
            countingInversion=countingInversion+(n1-i); 
        }
    }
    return countingInversion;
}
public static void main(String[] args)
{
    int[] a = {1, 20, 6, 4, 5};
    int countInversion = mergeSort(a, 0, a.length-1);
    System.out.println(countInversion);
}

O(n log n) time, O(n) space solution in java.

A mergesort, with a tweak to keep track of the number of inversions performed during the merge step (for a well-explained mergesort, take a look at http://www.vogella.com/tutorials/JavaAlgorithmsMergesort/article.html ).

Since mergesort can be made in place, the space complexity may be improved to O(1).

When using this sort, the inversions happen only in the merge step and only when we have to put an element of the second part before elements from the first half, e.g.

  • 0 5 10 15

merged with

  • 1 6 22

we have 3 + 2 + 0 = 5 inversions:

  • 1 with {5, 10, 15}
  • 6 with {10, 15}
  • 22 with {}

After we have made the 5 inversions, our new merged list is 0, 1, 5, 6, 10, 15, 22

There is a demo task on Codility called ArrayInversionCount, where you can test your solution.

    public class FindInversions {

    public static int solution(int[] input) {
        if (input == null)
            return 0;
        int[] helper = new int[input.length];
        return mergeSort(0, input.length - 1, input, helper);
    }

    public static int mergeSort(int low, int high, int[] input, int[] helper) {
        int inversionCount = 0;
        if (low < high) {
            int medium = low + (high - low) / 2;
            inversionCount += mergeSort(low, medium, input, helper);
            inversionCount += mergeSort(medium + 1, high, input, helper);
            inversionCount += merge(low, medium, high, input, helper);
        }
        return inversionCount;
    }

    public static int merge(int low, int medium, int high, int[] input, int[] helper) {
        int inversionCount = 0;

        for (int i = low; i <= high; i++)
            helper[i] = input[i];

        int i = low;
        int j = medium + 1;
        int k = low;

        while (i <= medium && j <= high) {
            if (helper[i] <= helper[j]) {
                input[k] = helper[i];
                i++;
            } else {
                input[k] = helper[j];
                // the number of elements in the first half which the j element needs to jump over.
                // there is an inversion between each of those elements and j.
                inversionCount += (medium + 1 - i);
                j++;
            }
            k++;
        }

        // finish writing back in the input the elements from the first part
        while (i <= medium) {
            input[k] = helper[i];
            i++;
            k++;
        }
        return inversionCount;
    }

}

In Python

# O(n log n)

def count_inversion(lst):
    return merge_count_inversion(lst)[1]

def merge_count_inversion(lst):
    if len(lst) <= 1:
        return lst, 0
    middle = int( len(lst) / 2 )
    left, a = merge_count_inversion(lst[:middle])
    right, b = merge_count_inversion(lst[middle:])
    result, c = merge_count_split_inversion(left, right)
    return result, (a + b + c)

def merge_count_split_inversion(left, right):
    result = []
    count = 0
    i, j = 0, 0
    left_len = len(left)
    while i < left_len and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            count += left_len - i
            j += 1
    result += left[i:]
    result += right[j:]
    return result, count        


#test code
input_array_1 = []  #0
input_array_2 = [1] #0
input_array_3 = [1, 5]  #0
input_array_4 = [4, 1] #1
input_array_5 = [4, 1, 2, 3, 9] #3
input_array_6 = [4, 1, 3, 2, 9, 5]  #5
input_array_7 = [4, 1, 3, 2, 9, 1]  #8

print(count_inversion(input_array_1))
print(count_inversion(input_array_2))
print(count_inversion(input_array_3))
print(count_inversion(input_array_4))
print(count_inversion(input_array_5))
print(count_inversion(input_array_6))
print(count_inversion(input_array_7))

I've found a way to do it in O(n * log n) time with the following method.

  1. Merge sort array A and create a copy (array B)
  2. Take A[1] and find its position in sorted array B via a binary search. The number of inversions for this element will be one less than the index number of its position in B since every lower number that appears after the first element of A will be an inversion.

    2a. accumulate the number of inversions to counter variable num_inversions.

    2b. remove A[1] from array A and also from its corresponding position in array B

  3. rerun from step 2 until there are no more elements in A.

Here’s an example run of this algorithm. Original array A = (6, 9, 1, 14, 8, 12, 3, 2)

1: Merge sort and copy to array B

B = (1, 2, 3, 6, 8, 9, 12, 14)

2: Take A[1] and binary search to find it in array B

A[1] = 6

B = (1, 2, 3, 6, 8, 9, 12, 14)

6 is in the 4th position of array B, thus there are 3 inversions. We know this because 6 was in the first position in array A, thus any lower value element that subsequently appears in array A would have an index of j > i (since i in this case is 1).

2.b: Remove A[1] from array A and also from its corresponding position in array B (the arrays are shown before and after the removal).

A = (6, 9, 1, 14, 8, 12, 3, 2) = (9, 1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 6, 8, 9, 12, 14) = (1, 2, 3, 8, 9, 12, 14)

3: Rerun from step 2 on the new A and B arrays.

A[1] = 9

B = (1, 2, 3, 8, 9, 12, 14)

9 is now in the 5th position of array B, thus there are 4 inversions. We know this because 9 was in the first position in array A, thus any lower-valued element that subsequently appears would have an index of j > i (since i in this case is again 1). Remove A[1] from array A and also from its corresponding position in array B (the arrays are shown before and after the removal):

A = (9, 1, 14, 8, 12, 3, 2) = (1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 8, 9, 12, 14) = (1, 2, 3, 8, 12, 14)

Continuing in this vein will give us the total number of inversions for array A once the loop is complete.

Step 1 (merge sort) would take O(n * log n) to execute. Step 2 would execute n times and at each execution would perform a binary search that takes O(log n) to run for a total of O(n * log n). Total running time would thus be O(n * log n) + O(n * log n) = O(n * log n).
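
A direct Python transcription of these steps (an illustrative sketch, not the original poster's code; note that removing an element from a plain Python list is O(n), so this particular sketch only reaches O(n log n) if the removable sorted structure is something cleverer than a list):

import bisect

def count_inversions_by_search(a):
    b = sorted(a)                       # step 1: sorted copy of A
    total = 0
    for x in a:                         # steps 2-3: process A front to back
        pos = bisect.bisect_left(b, x)  # binary search for x in B
        total += pos                    # elements before x in B are smaller and appear later in A
        b.pop(pos)                      # remove x from B (O(n) with a plain list)
    return total

print(count_inversions_by_search([6, 9, 1, 14, 8, 12, 3, 2]))  # 16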

Thanks for your help. Writing out the sample arrays on a piece of paper really helped to visualize the problem.


Another Python solution:

def inv_cnt(a):
    n = len(a)
    if n==1:
        return a,0
    left = a[0:n//2] # should be smaller
    left,cnt1 = inv_cnt(left)
    right = a[n//2:] # should be larger
    right, cnt2 = inv_cnt(right)

    cnt = 0
    i_left = i_right = i_a = 0
    while i_a < n:
        if (i_right>=len(right)) or (i_left < len(left) and left[i_left] <= right[i_right]):
            a[i_a] = left[i_left]
            i_left += 1
        else:
            a[i_a] = right[i_right]
            i_right += 1
            if i_left < len(left):
                cnt += len(left) - i_left
        i_a += 1

    return (a, (cnt1 + cnt2 + cnt))

Use mergesort: in the merge step, whenever the number copied to the output comes from the right array, increment the counter by the number of elements remaining in the left array.


The primary purpose of this answer is to compare the speeds of the various Python versions found here, but I also have a few contributions of my own. (FWIW, I just discovered this question while performing a duplicate search).

The relative execution speeds of algorithms implemented in CPython may be different to what one would expect from a simple analysis of the algorithms, and from experience with other languages. That's because Python provides many powerful functions and methods implemented in C that can operate on lists and other collections at close to the speed one would get in a fully-compiled language, so those operations run much faster than equivalent algorithms implemented "manually" with Python code.

Code that takes advantage of these tools can often outperform theoretically superior algorithms that try to do everything with Python operations on individual items of the collection. Of course the actual quantity of data being processed has an impact on this too. But for moderate amounts of data, code that uses an O(n²) algorithm running at C speed can easily beat an O(n log n) algorithm that does the bulk of its work with individual Python operations.

Many of the posted answers to this inversion counting question use an algorithm based on mergesort. Theoretically, this is a good approach, unless the array size is very small. But Python's built-in TimSort (a hybrid stable sorting algorithm, derived from merge sort and insertion sort) runs at C speed, and a mergesort coded by hand in Python cannot hope to compete with it for speed.

One of the more intriguing solutions here, in the answer posted by Niklas B, uses the built-in sort to determine the ranking of array items, and a Binary Indexed Tree (aka Fenwick tree) to store the cumulative sums required to calculate the inversion count. In the process of trying to understand this data structure and Niklas's algorithm I wrote a few variations of my own (posted below). But I also discovered that for moderate list sizes it's actually faster to use Python's built-in sum function than the lovely Fenwick tree.

def count_inversions(a):
    total = 0
    counts = [0] * len(a)
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        total += sum(counts[:i])
        counts[i] += 1
    return total

Eventually, when the list size gets around 500, the O(n²) aspect of calling sum inside that for loop rears its ugly head, and the performance starts to plummet.

Mergesort isn't the only O(nlogn) sort, and several others may be utilized to perform inversion counting. prasadvk's answer uses a binary tree sort; however, his code appears to be in C++ or one of its derivatives. So I've added a Python version. I originally used a class to implement the tree nodes, but discovered that a dict is noticeably faster. I eventually used a list, which is even faster, although it does make the code a little less readable.

One bonus of treesort is that it's a lot easier to implement iteratively than mergesort is. Python doesn't optimize recursion and it has a recursion depth limit (although that can be increased if you really need it). And of course Python function calls are relatively slow, so when you're trying to optimize for speed it's good to avoid function calls, when practical.

Another O(nlogn) sort is the venerable radix sort. Its big advantage is that it doesn't compare keys to each other. Its disadvantage is that it works best on contiguous sequences of integers, ideally a permutation of integers in range(b**m) where b is usually 2. I added a few versions based on radix sort after attempting to read Counting Inversions, Offline Orthogonal Range Counting, and Related Problems which is linked in calculating the number of “inversions” in a permutation.

To use radix sort effectively to count inversions in a general sequence seq of length n we can create a permutation of range(n) that has the same number of inversions as seq. We can do that in (at worst) O(nlogn) time via TimSort. The trick is to permute the indices of seq by sorting seq. It's easier to explain this with a small example.

seq = [15, 14, 11, 12, 10, 13]
b = [t[::-1] for t in enumerate(seq)]
print(b)
b.sort()
print(b)

output

[(15, 0), (14, 1), (11, 2), (12, 3), (10, 4), (13, 5)]
[(10, 4), (11, 2), (12, 3), (13, 5), (14, 1), (15, 0)]

By sorting the (value, index) pairs of seq we have permuted the indices of seq with the same number of swaps that are required to put seq into its original order from its sorted order. We can create that permutation by sorting range(n) with a suitable key function:

print(sorted(range(len(seq)), key=lambda k: seq[k]))

output

[4, 2, 3, 5, 1, 0]

We can avoid that lambda by using seq's .__getitem__ method:

sorted(range(len(seq)), key=seq.__getitem__)

This is only slightly faster, but we're looking for all the speed enhancements we can get. ;)


The code below performs timeit tests on all of the existing Python algorithms on this page, plus a few of my own: a couple of brute-force O(n²) versions, a few variations on Niklas B's algorithm, and of course one based on mergesort (which I wrote without referring to the existing answers). It also has my list-based treesort code roughly derived from prasadvk's code, and various functions based on radix sort, some using a similar strategy to the mergesort approaches, and some using sum or a Fenwick tree.

This program measures the execution time of each function on a series of random lists of integers; it can also verify that each function gives the same results as the others, and that it doesn't modify the input list.

Each timeit call gives a vector containing 3 results, which I sort. The main value to look at here is the minimum one; the other values merely give an indication of how reliable that minimum value is, as discussed in the Note in the timeit module docs.

Unfortunately, the output from this program is too large to include in this answer, so I'm posting it in its own (community wiki) answer.

The output is from 3 runs on my ancient 32 bit single core 2GHz machine running Python 3.6.0 on an old Debian-derivative distro. YMMV. During the tests I shut down my Web browser and disconnected from my router to minimize the impact of other tasks on the CPU.

The first run tests all the functions with list sizes from 5 to 320, with loop sizes from 4096 to 64 (as the list size doubles, the loop size is halved). The random pool used to construct each list is half the size of the list itself, so we are likely to get lots of duplicates. Some of the inversion counting algorithms are more sensitive to duplicates than others.

The second run uses larger lists: 640 to 10240, and a fixed loop size of 8. To save time it eliminates several of the slowest functions from the tests. My brute-force O(n²) functions are just way too slow at these sizes, and as mentioned earlier, my code that uses sum, which does so well on small to moderate lists, just can't keep up on big lists.

The final run covers list sizes from 20480 to 655360, and a fixed loop size of 4, with the 8 fastest functions. For list sizes under 40,000 or so Tim Babych's code is the clear winner. Well done Tim! Niklas B's code is a good all-round performer too, although it gets beaten on the smaller lists. The bisection-based code of "python" also does rather well, although it appears to be a little slower with huge lists with lots of duplicates, probably due to that linear while loop it uses to step over dupes.

However, for the very large list sizes, the bisection-based algorithms can't compete with the true O(nlogn) algorithms.

#!/usr/bin/env python3

''' Test speeds of various ways of counting inversions in a list

    The inversion count is a measure of how sorted an array is.
    A pair of items in a are inverted if i < j but a[j] > a[i]

    See https://stackoverflow.com/questions/337664/counting-inversions-in-an-array

    This program contains code by the following authors:
    mkso
    Niklas B
    B. M.
    Tim Babych
    python
    Zhe Hu
    prasadvk
    noman pouigt
    PM 2Ring

    Timing and verification code by PM 2Ring
    Collated 2017.12.16
    Updated 2017.12.21
'''

from timeit import Timer
from random import seed, randrange
from bisect import bisect, insort_left

seed('A random seed string')

# Merge sort version by mkso
def count_inversion_mkso(lst):
    return merge_count_inversion(lst)[1]

def merge_count_inversion(lst):
    if len(lst) <= 1:
        return lst, 0
    middle = len(lst) // 2
    left, a = merge_count_inversion(lst[:middle])
    right, b = merge_count_inversion(lst[middle:])
    result, c = merge_count_split_inversion(left, right)
    return result, (a + b + c)

def merge_count_split_inversion(left, right):
    result = []
    count = 0
    i, j = 0, 0
    left_len = len(left)
    while i < left_len and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            count += left_len - i
            j += 1
    result += left[i:]
    result += right[j:]
    return result, count

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# Using a Binary Indexed Tree, aka a Fenwick tree, by Niklas B.
def count_inversions_NiklasB(a):
    res = 0
    counts = [0] * (len(a) + 1)
    rank = {v: i for i, v in enumerate(sorted(a), 1)}
    for x in reversed(a):
        i = rank[x] - 1
        while i:
            res += counts[i]
            i -= i & -i
        i = rank[x]
        while i <= len(a):
            counts[i] += 1
            i += i & -i
    return res

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# Merge sort version by B.M
# Modified by PM 2Ring to deal with the global counter
bm_count = 0

def merge_count_BM(seq):
    global bm_count
    bm_count = 0
    sort_bm(seq)
    return bm_count

def merge_bm(l1,l2):
    global bm_count
    l = []
    while l1 and l2:
        if l1[-1] <= l2[-1]:
            l.append(l2.pop())
        else:
            l.append(l1.pop())
            bm_count += len(l2)
    l.reverse()
    return l1 + l2 + l

def sort_bm(l):
    t = len(l) // 2
    return merge_bm(sort_bm(l[:t]), sort_bm(l[t:])) if t > 0 else l

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# Bisection based method by Tim Babych
def solution_TimBabych(A):
    sorted_left = []
    res = 0
    for i in range(1, len(A)):
        insort_left(sorted_left, A[i-1])
        # i is also the length of sorted_left
        res += (i - bisect(sorted_left, A[i]))
    return res

# Slightly faster, except for very small lists
def solutionE_TimBabych(A):
    res = 0
    sorted_left = []
    for i, u in enumerate(A):
        # i is also the length of sorted_left
        res += (i - bisect(sorted_left, u))
        insort_left(sorted_left, u)
    return res

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# Bisection based method by "python"
def solution_python(A):
    B = list(A)
    B.sort()
    inversion_count = 0
    for i in range(len(A)):
        j = binarySearch_python(B, A[i])
        while B[j] == B[j - 1]:
            if j < 1:
                break
            j -= 1
        inversion_count += j
        B.pop(j)
    return inversion_count

def binarySearch_python(alist, item):
    first = 0
    last = len(alist) - 1
    found = False
    while first <= last and not found:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return midpoint
        else:
            if item < alist[midpoint]:
                last = midpoint - 1
            else:
                first = midpoint + 1

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# Merge sort version by Zhe Hu
def inv_cnt_ZheHu(a):
    _, count = inv_cnt(a.copy())
    return count

def inv_cnt(a):
    n = len(a)
    if n==1:
        return a, 0
    left = a[0:n//2] # should be smaller
    left, cnt1 = inv_cnt(left)
    right = a[n//2:] # should be larger
    right, cnt2 = inv_cnt(right)

    cnt = 0
    i_left = i_right = i_a = 0
    while i_a < n:
        if (i_right>=len(right)) or (i_left < len(left)
            and left[i_left] <= right[i_right]):
            a[i_a] = left[i_left]
            i_left += 1
        else:
            a[i_a] = right[i_right]
            i_right += 1
            if i_left < len(left):
                cnt += len(left) - i_left
        i_a += 1
    return (a, cnt1 + cnt2 + cnt)

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# Merge sort version by noman pouigt
# From https://stackoverflow.com/q/47830098
def reversePairs_nomanpouigt(nums):
    def merge(left, right):
        if not left or not right:
            return (0, left + right)
        #if everything in left is less than right
        if left[len(left)-1] < right[0]:
            return (0, left + right)
        else:
            left_idx, right_idx, count = 0, 0, 0
            merged_output = []

            # check for condition before we merge it
            while left_idx < len(left) and right_idx < len(right):
                #if left[left_idx] > 2 * right[right_idx]:
                if left[left_idx] > right[right_idx]:
                    count += len(left) - left_idx
                    right_idx += 1
                else:
                    left_idx += 1

            #merging the sorted list
            left_idx, right_idx = 0, 0
            while left_idx < len(left) and right_idx < len(right):
                if left[left_idx] > right[right_idx]:
                    merged_output += [right[right_idx]]
                    right_idx += 1
                else:
                    merged_output += [left[left_idx]]
                    left_idx += 1
            if left_idx == len(left):
                merged_output += right[right_idx:]
            else:
                merged_output += left[left_idx:]
        return (count, merged_output)

    def partition(nums):
        count = 0
        if len(nums) == 1 or not nums:
            return (0, nums)
        pivot = len(nums)//2
        left_count, l = partition(nums[:pivot])
        right_count, r = partition(nums[pivot:])
        temp_count, temp_list = merge(l, r)
        return (temp_count + left_count + right_count, temp_list)
    return partition(nums)[0]

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# PM 2Ring
def merge_PM2R(seq):
    seq, count = merge_sort_count_PM2R(seq)
    return count

def merge_sort_count_PM2R(seq):
    mid = len(seq) // 2
    if mid == 0:
        return seq, 0
    left, left_total = merge_sort_count_PM2R(seq[:mid])
    right, right_total = merge_sort_count_PM2R(seq[mid:])
    total = left_total + right_total
    result = []
    i = j = 0
    left_len, right_len = len(left), len(right)
    while i < left_len and j < right_len:
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
            total += left_len - i
    result.extend(left[i:])
    result.extend(right[j:])
    return result, total

def rank_sum_PM2R(a):
    total = 0
    counts = [0] * len(a)
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        total += sum(counts[:i])
        counts[i] += 1
    return total

# Fenwick tree functions adapted from C code on Wikipedia
def fen_sum(tree, i):
    ''' Return the sum of the first i elements, 0 through i-1 '''
    total = 0
    while i:
        total += tree[i-1]
        i -= i & -i
    return total

def fen_add(tree, delta, i):
    ''' Add delta to element i and thus 
        to fen_sum(tree, j) for all j > i 
    '''
    size = len(tree)
    while i < size:
        tree[i] += delta
        i += (i+1) & -(i+1)

def fenwick_PM2R(a):
    total = 0
    counts = [0] * len(a)
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        total += fen_sum(counts, i)
        fen_add(counts, 1, i)
    return total

def fenwick_inline_PM2R(a):
    total = 0
    size = len(a)
    counts = [0] * size
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        j = i + 1
        while i:
            total += counts[i]
            i -= i & -i
        while j < size:
            counts[j] += 1
            j += j & -j
    return total

def bruteforce_loops_PM2R(a):
    total = 0
    for i in range(1, len(a)):
        u = a[i]
        for j in range(i):
            if a[j] > u:
                total += 1
    return total

def bruteforce_sum_PM2R(a):
    return sum(1 for i in range(1, len(a)) for j in range(i) if a[j] > a[i])

# Using binary tree counting, derived from C++ code (?) by prasadvk
# https://stackoverflow.com/a/16056139
def ltree_count_PM2R(a):
    total, root = 0, None
    for u in a:
        # Store data in a list-based tree structure
        # [data, count, left_child, right_child]
        p = [u, 0, None, None]
        if root is None:
            root = p
            continue
        q = root
        while True:
            if p[0] < q[0]:
                total += 1 + q[1]
                child = 2
            else:
                q[1] += 1
                child = 3
            if q[child]:
                q = q[child]
            else:
                q[child] = p
                break
    return total

# Counting based on radix sort, recursive version
def radix_partition_rec(a, L):
    if len(a) < 2:
        return 0
    if len(a) == 2:
        return a[1] < a[0]
    left, right = [], []
    count = 0
    for u in a:
        if u & L:
            right.append(u)
        else:
            count += len(right)
            left.append(u)
    L >>= 1
    if L:
        count += radix_partition_rec(left, L) + radix_partition_rec(right, L)
    return count

# The following functions determine swaps using a permutation of 
# range(len(a)) that has the same inversion count as `a`. We can create
# this permutation with `sorted(range(len(a)), key=lambda k: a[k])`
# but `sorted(range(len(a)), key=a.__getitem__)` is a little faster.

# Counting based on radix sort, iterative version
def radix_partition_iter(seq, L):
    count = 0
    parts = [seq]
    while L and parts:
        newparts = []
        for a in parts:
            if len(a) < 2:
                continue
            if len(a) == 2:
                count += a[1] < a[0]
                continue
            left, right = [], []
            for u in a:
                if u & L:
                    right.append(u)
                else:
                    count += len(right)
                    left.append(u)
            if left:
                newparts.append(left)
            if right:
                newparts.append(right)
        parts = newparts
        L >>= 1
    return count

def perm_radixR_PM2R(a):
    size = len(a)
    b = sorted(range(size), key=a.__getitem__)
    n = size.bit_length() - 1
    return radix_partition_rec(b, 1 << n)

def perm_radixI_PM2R(a):
    size = len(a)
    b = sorted(range(size), key=a.__getitem__)
    n = size.bit_length() - 1
    return radix_partition_iter(b, 1 << n)

# Plain sum of the counts of the permutation
def perm_sum_PM2R(a):
    total = 0
    size = len(a)
    counts = [0] * size
    for i in reversed(sorted(range(size), key=a.__getitem__)):
        total += sum(counts[:i])
        counts[i] = 1
    return total

# Fenwick sum of the counts of the permutation
def perm_fenwick_PM2R(a):
    total = 0
    size = len(a)
    counts = [0] * size
    for i in reversed(sorted(range(size), key=a.__getitem__)):
        j = i + 1
        while i:
            total += counts[i]
            i -= i & -i
        while j < size:
            counts[j] += 1
            j += j & -j
    return total

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# All the inversion-counting functions
funcs = (
    solution_TimBabych,
    solutionE_TimBabych,
    solution_python,
    count_inversion_mkso,
    count_inversions_NiklasB,
    merge_count_BM,
    inv_cnt_ZheHu,
    reversePairs_nomanpouigt,
    fenwick_PM2R,
    fenwick_inline_PM2R,
    merge_PM2R,
    rank_sum_PM2R,
    bruteforce_loops_PM2R,
    bruteforce_sum_PM2R,
    ltree_count_PM2R,
    perm_radixR_PM2R,
    perm_radixI_PM2R,
    perm_sum_PM2R,
    perm_fenwick_PM2R,
)

def time_test(seq, loops, verify=False):
    orig = seq
    timings = []
    for func in funcs:
        seq = orig.copy()
        value = func(seq) if verify else None
        t = Timer(lambda: func(seq))
        result = sorted(t.repeat(3, loops))
        timings.append((result, func.__name__, value))
        assert seq==orig, 'Sequence altered by {}!'.format(func.__name__)
    first = timings[0][-1]
    timings.sort()
    for result, name, value in timings:
        result = ', '.join([format(u, '.5f') for u in result])
        print('{:24} : {}'.format(name, result))

    if verify:
        # Check that all results are identical
        bad = ['%s: %d' % (name, value)
            for _, name, value in timings if value != first]
        if bad:
            print('ERROR. Value: {}, bad: {}'.format(first, ', '.join(bad)))
        else:
            print('Value: {}'.format(first))
    print()

#Run the tests
size, loops = 5, 1 << 12
verify = True
for _ in range(7):
    hi = size // 2
    print('Size = {}, hi = {}, {} loops'.format(size, hi, loops))
    seq = [randrange(hi) for _ in range(size)]
    time_test(seq, loops, verify)
    loops >>= 1
    size <<= 1

#size, loops = 640, 8
#verify = False
#for _ in range(5):
    #hi = size // 2
    #print('Size = {}, hi = {}, {} loops'.format(size, hi, loops))
    #seq = [randrange(hi) for _ in range(size)]
    #time_test(seq, loops, verify)
    #size <<= 1

#size, loops = 163840, 4
#verify = False
#for _ in range(3):
    #hi = size // 2
    #print('Size = {}, hi = {}, {} loops'.format(size, hi, loops))
    #seq = [randrange(hi) for _ in range(size)]
    #time_test(seq, loops, verify)
    #size <<= 1

Please see the separate (community wiki) answer below for the output.


I've found it in O(n * log n) time by the following method.

  1. Merge sort array A and create a copy (array B)
  2. Take A[1] and find its position in sorted array B via a binary search. The number of inversions for this element will be one less than the index number of its position in B since every lower number that appears after the first element of A will be an inversion.

    2a. accumulate the number of inversions to counter variable num_inversions.

    2b. remove A[1] from array A and also from its corresponding position in array B

  3. rerun from step 2 until there are no more elements in A.

Here’s an example run of this algorithm. Original array A = (6, 9, 1, 14, 8, 12, 3, 2)

1: Merge sort and copy to array B

B = (1, 2, 3, 6, 8, 9, 12, 14)

2: Take A[1] and binary search to find it in array B

A[1] = 6

B = (1, 2, 3, 6, 8, 9, 12, 14)

6 is in the 4th position of array B, thus there are 3 inversions. We know this because 6 was in the first position in array A, thus any lower value element that subsequently appears in array A would have an index of j > i (since i in this case is 1).

2.b: Remove A[1] from array A and also from its corresponding position in array B (each array is shown before and after the removal).

A = (6, 9, 1, 14, 8, 12, 3, 2) = (9, 1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 6, 8, 9, 12, 14) = (1, 2, 3, 8, 9, 12, 14)

3: Rerun from step 2 on the new A and B arrays.

A[1] = 9

B = (1, 2, 3, 8, 9, 12, 14)

9 is now in the 5th position of array B, thus there are 4 inversions. We know this because 9 was in the first position in array A, thus any lower value element that subsequently appears would have an index of j > i (since i in this case is again 1). Remove A[1] from array A and also from its corresponding position in array B (each array is shown before and after the removal).

A = (9, 1, 14, 8, 12, 3, 2) = (1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 8, 9, 12, 14) = (1, 2, 3, 8, 12, 14)

Continuing in this vein will give us the total number of inversions for array A once the loop is complete.

Step 1 (merge sort) would take O(n * log n) to execute. Step 2 would execute n times and at each execution would perform a binary search that takes O(log n) to run for a total of O(n * log n). Total running time would thus be O(n * log n) + O(n * log n) = O(n * log n).
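
For reference, a throwaway Python sketch of the steps above (my own; it leans on the bisect module for the binary search, and note that list.pop is itself linear, so a different deletion structure would be needed to actually keep every step at O(log n)):

from bisect import bisect_left

def count_by_search(a):
    b = sorted(a)                 # step 1: the sorted copy B
    inversions = 0
    for x in a:                   # steps 2-3: walk A from the front
        j = bisect_left(b, x)     # position of x among the elements not yet removed
        inversions += j           # j smaller elements still appear after x in A
        b.pop(j)                  # step 2b: remove x from B
    return inversions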

Thanks for your help. Writing out the sample arrays on a piece of paper really helped to visualize the problem.


I wonder why nobody mentioned binary-indexed trees yet. You can use one to maintain prefix sums on the values of your permutation elements. Then you can just proceed from right to left and count for every element the number of elements smaller than it to the right:

def count_inversions(a):
  res = 0
  counts = [0]*(len(a)+1)
  rank = { v : i+1 for i, v in enumerate(sorted(a)) }
  for x in reversed(a):
    i = rank[x] - 1
    while i:
      res += counts[i]
      i -= i & -i
    i = rank[x]
    while i <= len(a):
      counts[i] += 1
      i += i & -i
  return res

The complexity is O(n log n), and the constant factor is very low.
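
For example, a quick check of the function above on a tiny list:

print(count_inversions([3, 1, 2]))   # 2 -- the inverted pairs are (3, 1) and (3, 2)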


Note that the answer by Geoffrey Irving (quoted below) is wrong.

The number of inversions in an array is half the total distance elements must be moved in order to sort the array. Therefore, it can be computed by sorting the array, maintaining the resulting permutation p[i], and then computing the sum of abs(p[i]-i)/2. This takes O(n log n) time, which is optimal.

An alternative method is given at http://mathworld.wolfram.com/PermutationInversion.html. This method is equivalent to the sum of max(0, p[i]-i), which is equal to the sum of abs(p[i]-i)/2 since the total distance elements move left is equal to the total distance elements move to the right.

Take the sequence { 3, 2, 1 } as an example. There are three inversions: (3, 2), (3, 1), (2, 1), so the inversion number is 3. However, according to the quoted method the answer would have been 2.
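
To make the counter-example concrete, here is a quick throwaway check of the quoted formula (my own, using 0-based positions):

a = [3, 2, 1]
rank = {v: i for i, v in enumerate(sorted(a))}
p = [rank[v] for v in a]                               # p = [2, 1, 0]
print(sum(abs(p[i] - i) for i in range(len(a))) // 2)  # prints 2, not the correct count of 3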


So here is an O(n log n) solution in Java.

long merge(int[] arr, int[] left, int[] right) {
    int i = 0, j = 0, count = 0;
    while (i < left.length || j < right.length) {
        if (i == left.length) {
            arr[i+j] = right[j];
            j++;
        } else if (j == right.length) {
            arr[i+j] = left[i];
            i++;
        } else if (left[i] <= right[j]) {
            arr[i+j] = left[i];
            i++;                
        } else {
            arr[i+j] = right[j];
            count += left.length-i;
            j++;
        }
    }
    return count;
}

long invCount(int[] arr) {
    if (arr.length < 2)
        return 0;

    int m = (arr.length + 1) / 2;
    int left[] = Arrays.copyOfRange(arr, 0, m);
    int right[] = Arrays.copyOfRange(arr, m, arr.length);

    return invCount(left) + invCount(right) + merge(arr, left, right);
}

This is almost a normal merge sort; the whole magic is hidden in the merge function. Note that while sorting, the algorithm removes inversions, and while merging, it counts the number of inversions removed (sorted out, one might say).

The only moment when inversions are removed is when the algorithm takes an element from the right side of the array and merges it into the main array. The number of inversions removed by this operation is the number of elements remaining in the left array still to be merged. :)

Hope it's explanatory enough.


Here is an O(n log n) Perl implementation:

sub sort_and_count {
    my ($arr, $n) = @_;
    return ($arr, 0) unless $n > 1;

    my $mid = $n % 2 == 1 ? ($n-1)/2 : $n/2;
    my @left = @$arr[0..$mid-1];
    my @right = @$arr[$mid..$n-1];

    my ($sleft, $x) = sort_and_count( \@left, $mid );
    my ($sright, $y) = sort_and_count( \@right, $n-$mid);
    my ($merged, $z) = merge_and_countsplitinv( $sleft, $sright, $n );

    return ($merged, $x+$y+$z);
}

sub merge_and_countsplitinv {
    my ($left, $right, $n) = @_;

    my ($l_c, $r_c) = ($#$left+1, $#$right+1);
    my ($i, $j) = (0, 0);
    my @merged;
    my $inv = 0;

    for my $k (0..$n-1) {
        if ($i<$l_c && $j<$r_c) {
            if ( $left->[$i] <= $right->[$j]) {   # <= so equal elements are not counted as inversions
                push @merged, $left->[$i];
                $i+=1;
            } else {
                push @merged, $right->[$j];
                $j+=1;
                $inv += $l_c - $i;
            }
        } else {
            if ($i>=$l_c) {
                push @merged, @$right[ $j..$#$right ];
            } else {
                push @merged, @$left[ $i..$#$left ];
            }
            last;
        }
    }

    return (\@merged, $inv);
}

The number of inversions in an array is half the total distance elements must be moved in order to sort the array. Therefore, it can be computed by sorting the array, maintaining the resulting permutation p[i], and then computing the sum of abs(p[i]-i)/2. This takes O(n log n) time, which is optimal.

An alternative method is given at http://mathworld.wolfram.com/PermutationInversion.html. This method is equivalent to the sum of max(0, p[i]-i), which is equal to the sum of abs(p[i]-i)/2 since the total distance elements move left is equal to the total distance elements move to the right.

EDIT: This method is wrong (see comments), and there is unfortunately no way to fix it while preserving the character of the method.


I actually had a question similar to this for homework, and it was required to have O(n log n) efficiency.

I used the idea you proposed of using Mergesort, since it is already of the correct efficiency. I just inserted some code into the merging function that was basically: Whenever a number from the array on the right is being added to the output array, I add to the total number of inversions, the amount of numbers remaining in the left array.

This makes a lot of sense to me now that I've thought about it enough: you're counting, for each number, how many greater numbers come before it.

hth.


Here is a C++ solution

/**
 * The subarrays are kept sorted so that if the first array's n'th element is
 * greater than some element of the second array, then all elements after n
 * in the first array are greater than that element as well.
 */
#include<stdio.h>
#include<iostream>
using namespace std;
int countInversions(int array[],int size);
int merge(int arr1[],int size1,int arr2[],int size2,int[]);
int main()
{
    int array[] = {2, 4, 1, 3, 5};
    int size = sizeof(array) / sizeof(array[0]);
    int x = countInversions(array,size);
    printf("number of inversions = %d",x);
}

int countInversions(int array[],int size)
{
    if(size > 1)
    {
        int mid = size / 2;
        int count1 = countInversions(array, mid);
        int count2 = countInversions(array + mid, size - mid);
        int temp[size];
        int count3 = merge(array, mid, array + mid, size - mid, temp);
        for(int x = 0; x < size; x++)
        {
            array[x] = temp[x];
        }
        return count1 + count2 + count3;
    }
    else
    {
        return 0;
    }
}

int merge(int arr1[],int size1,int arr2[],int size2,int temp[])
{
    int count  = 0;
    int a = 0;
    int b = 0;
    int c = 0;
    while(a < size1 && b < size2)
    {
        if(arr1[a] <= arr2[b])   // <= so that equal elements are not counted as inversions
        {
            temp[c] = arr1[a];
            c++;
            a++;
        }else{
            temp[c] = arr2[b];
            b++;
            c++;
            count = count + size1 -a;
        }
    }

    while(a < size1)
    {
        temp[c] = arr1[a];
        c++;a++;
    }

    while(b < size2)
    {
        temp[c] = arr2[b];
        c++; b++;
    }

    return count;
}

This answer contains the results of the timeit tests produced by the code in my main answer. Please see that answer for details!

count_inversions speed test results

Size = 5, hi = 2, 4096 loops
ltree_count_PM2R         : 0.04871, 0.04872, 0.04876
bruteforce_loops_PM2R    : 0.05696, 0.05700, 0.05776
solution_TimBabych       : 0.05760, 0.05822, 0.05943
solutionE_TimBabych      : 0.06642, 0.06704, 0.06760
bruteforce_sum_PM2R      : 0.07523, 0.07545, 0.07563
perm_sum_PM2R            : 0.09873, 0.09875, 0.09935
rank_sum_PM2R            : 0.10449, 0.10463, 0.10468
solution_python          : 0.13034, 0.13061, 0.13221
fenwick_inline_PM2R      : 0.14323, 0.14610, 0.18802
perm_radixR_PM2R         : 0.15146, 0.15203, 0.15235
merge_count_BM           : 0.16179, 0.16267, 0.16467
perm_radixI_PM2R         : 0.16200, 0.16202, 0.16768
perm_fenwick_PM2R        : 0.16887, 0.16920, 0.17075
merge_PM2R               : 0.18262, 0.18271, 0.18418
count_inversions_NiklasB : 0.19183, 0.19279, 0.20388
count_inversion_mkso     : 0.20060, 0.20141, 0.20398
inv_cnt_ZheHu            : 0.20815, 0.20841, 0.20906
fenwick_PM2R             : 0.22109, 0.22137, 0.22379
reversePairs_nomanpouigt : 0.29620, 0.29689, 0.30293
Value: 5

Size = 10, hi = 5, 2048 loops
solution_TimBabych       : 0.05954, 0.05989, 0.05991
solutionE_TimBabych      : 0.05970, 0.05972, 0.05998
perm_sum_PM2R            : 0.07517, 0.07519, 0.07520
ltree_count_PM2R         : 0.07672, 0.07677, 0.07684
bruteforce_loops_PM2R    : 0.07719, 0.07724, 0.07817
rank_sum_PM2R            : 0.08587, 0.08823, 0.08864
bruteforce_sum_PM2R      : 0.09470, 0.09472, 0.09484
solution_python          : 0.13126, 0.13154, 0.13185
perm_radixR_PM2R         : 0.14239, 0.14320, 0.14474
perm_radixI_PM2R         : 0.14632, 0.14669, 0.14679
fenwick_inline_PM2R      : 0.16796, 0.16831, 0.17030
perm_fenwick_PM2R        : 0.18189, 0.18212, 0.18638
merge_count_BM           : 0.19816, 0.19870, 0.19948
count_inversions_NiklasB : 0.21807, 0.22031, 0.22215
merge_PM2R               : 0.22037, 0.22048, 0.26106
fenwick_PM2R             : 0.24290, 0.24314, 0.24744
count_inversion_mkso     : 0.24895, 0.24899, 0.25205
inv_cnt_ZheHu            : 0.26253, 0.26259, 0.26590
reversePairs_nomanpouigt : 0.35711, 0.35762, 0.35973
Value: 20

Size = 20, hi = 10, 1024 loops
solutionE_TimBabych      : 0.05687, 0.05696, 0.05720
solution_TimBabych       : 0.06126, 0.06151, 0.06168
perm_sum_PM2R            : 0.06875, 0.06906, 0.07054
rank_sum_PM2R            : 0.07988, 0.07995, 0.08002
ltree_count_PM2R         : 0.11232, 0.11239, 0.11257
bruteforce_loops_PM2R    : 0.12553, 0.12584, 0.12592
solution_python          : 0.13472, 0.13540, 0.13694
bruteforce_sum_PM2R      : 0.15820, 0.15849, 0.16021
perm_radixI_PM2R         : 0.17101, 0.17148, 0.17229
perm_radixR_PM2R         : 0.17891, 0.18087, 0.18366
perm_fenwick_PM2R        : 0.20554, 0.20708, 0.21412
fenwick_inline_PM2R      : 0.21161, 0.21163, 0.22047
merge_count_BM           : 0.24125, 0.24261, 0.24565
count_inversions_NiklasB : 0.25712, 0.25754, 0.25778
merge_PM2R               : 0.26477, 0.26566, 0.31297
fenwick_PM2R             : 0.28178, 0.28216, 0.29069
count_inversion_mkso     : 0.30286, 0.30290, 0.30652
inv_cnt_ZheHu            : 0.32024, 0.32041, 0.32447
reversePairs_nomanpouigt : 0.45812, 0.45822, 0.46172
Value: 98

Size = 40, hi = 20, 512 loops
solutionE_TimBabych      : 0.05784, 0.05787, 0.05958
solution_TimBabych       : 0.06452, 0.06475, 0.06479
perm_sum_PM2R            : 0.07254, 0.07261, 0.07263
rank_sum_PM2R            : 0.08537, 0.08540, 0.08572
ltree_count_PM2R         : 0.11744, 0.11749, 0.11792
solution_python          : 0.14262, 0.14285, 0.14465
perm_radixI_PM2R         : 0.18774, 0.18776, 0.18922
perm_radixR_PM2R         : 0.19425, 0.19435, 0.19609
bruteforce_loops_PM2R    : 0.21500, 0.21511, 0.21686
perm_fenwick_PM2R        : 0.23338, 0.23375, 0.23674
fenwick_inline_PM2R      : 0.24947, 0.24958, 0.25189
bruteforce_sum_PM2R      : 0.27627, 0.27646, 0.28041
merge_count_BM           : 0.28059, 0.28128, 0.28294
count_inversions_NiklasB : 0.28557, 0.28759, 0.29022
merge_PM2R               : 0.29886, 0.29928, 0.30317
fenwick_PM2R             : 0.30241, 0.30259, 0.35237
count_inversion_mkso     : 0.34252, 0.34356, 0.34441
inv_cnt_ZheHu            : 0.37468, 0.37569, 0.37847
reversePairs_nomanpouigt : 0.50725, 0.50770, 0.50943
Value: 369

Size = 80, hi = 40, 256 loops
solutionE_TimBabych      : 0.06339, 0.06373, 0.06513
solution_TimBabych       : 0.06984, 0.06994, 0.07009
perm_sum_PM2R            : 0.09171, 0.09172, 0.09186
rank_sum_PM2R            : 0.10468, 0.10474, 0.10500
ltree_count_PM2R         : 0.14416, 0.15187, 0.18541
solution_python          : 0.17415, 0.17423, 0.17451
perm_radixI_PM2R         : 0.20676, 0.20681, 0.20936
perm_radixR_PM2R         : 0.21671, 0.21695, 0.21736
perm_fenwick_PM2R        : 0.26197, 0.26252, 0.26264
fenwick_inline_PM2R      : 0.28111, 0.28249, 0.28382
count_inversions_NiklasB : 0.31746, 0.32448, 0.32451
merge_count_BM           : 0.31964, 0.33842, 0.35276
merge_PM2R               : 0.32890, 0.32941, 0.33322
fenwick_PM2R             : 0.34355, 0.34377, 0.34873
count_inversion_mkso     : 0.37689, 0.37698, 0.38079
inv_cnt_ZheHu            : 0.42923, 0.42941, 0.43249
bruteforce_loops_PM2R    : 0.43544, 0.43601, 0.43902
bruteforce_sum_PM2R      : 0.52106, 0.52160, 0.52531
reversePairs_nomanpouigt : 0.57805, 0.58156, 0.58252
Value: 1467

Size = 160, hi = 80, 128 loops
solutionE_TimBabych      : 0.06766, 0.06784, 0.06963
solution_TimBabych       : 0.07433, 0.07489, 0.07516
perm_sum_PM2R            : 0.13143, 0.13175, 0.13179
rank_sum_PM2R            : 0.14428, 0.14440, 0.14922
solution_python          : 0.20072, 0.20076, 0.20084
ltree_count_PM2R         : 0.20314, 0.20583, 0.24776
perm_radixI_PM2R         : 0.23061, 0.23078, 0.23525
perm_radixR_PM2R         : 0.23894, 0.23915, 0.24234
perm_fenwick_PM2R        : 0.30984, 0.31181, 0.31503
fenwick_inline_PM2R      : 0.31933, 0.32680, 0.32722
merge_count_BM           : 0.36003, 0.36387, 0.36409
count_inversions_NiklasB : 0.36796, 0.36814, 0.37106
merge_PM2R               : 0.36847, 0.36848, 0.37127
fenwick_PM2R             : 0.37833, 0.37847, 0.38095
count_inversion_mkso     : 0.42746, 0.42747, 0.43184
inv_cnt_ZheHu            : 0.48969, 0.48974, 0.49293
reversePairs_nomanpouigt : 0.67791, 0.68157, 0.72420
bruteforce_loops_PM2R    : 0.82816, 0.83175, 0.83282
bruteforce_sum_PM2R      : 1.03322, 1.03378, 1.03562
Value: 6194

Size = 320, hi = 160, 64 loops
solutionE_TimBabych      : 0.07467, 0.07470, 0.07483
solution_TimBabych       : 0.08036, 0.08066, 0.08077
perm_sum_PM2R            : 0.21142, 0.21201, 0.25766
solution_python          : 0.22410, 0.22644, 0.22897
rank_sum_PM2R            : 0.22820, 0.22851, 0.22877
ltree_count_PM2R         : 0.24424, 0.24595, 0.24645
perm_radixI_PM2R         : 0.25690, 0.25710, 0.26191
perm_radixR_PM2R         : 0.26501, 0.26504, 0.26729
perm_fenwick_PM2R        : 0.33483, 0.33507, 0.33845
fenwick_inline_PM2R      : 0.34413, 0.34484, 0.35153
merge_count_BM           : 0.39875, 0.39919, 0.40302
fenwick_PM2R             : 0.40434, 0.40439, 0.40845
merge_PM2R               : 0.40814, 0.41531, 0.51417
count_inversions_NiklasB : 0.41681, 0.42009, 0.42128
count_inversion_mkso     : 0.47132, 0.47192, 0.47385
inv_cnt_ZheHu            : 0.54468, 0.54750, 0.54893
reversePairs_nomanpouigt : 0.76164, 0.76389, 0.80357
bruteforce_loops_PM2R    : 1.59125, 1.60430, 1.64131
bruteforce_sum_PM2R      : 2.03734, 2.03834, 2.03975
Value: 24959

Run 2

Size = 640, hi = 320, 8 loops
solutionE_TimBabych      : 0.04135, 0.04374, 0.04575
ltree_count_PM2R         : 0.06738, 0.06758, 0.06874
perm_radixI_PM2R         : 0.06928, 0.06943, 0.07019
fenwick_inline_PM2R      : 0.07850, 0.07856, 0.08059
perm_fenwick_PM2R        : 0.08151, 0.08162, 0.08170
perm_sum_PM2R            : 0.09122, 0.09133, 0.09221
rank_sum_PM2R            : 0.09549, 0.09603, 0.11270
merge_count_BM           : 0.10733, 0.10807, 0.11032
count_inversions_NiklasB : 0.12460, 0.19865, 0.20205
solution_python          : 0.13514, 0.13585, 0.13814

Size = 1280, hi = 640, 8 loops
solutionE_TimBabych      : 0.04714, 0.04742, 0.04752
perm_radixI_PM2R         : 0.15325, 0.15388, 0.15525
solution_python          : 0.15709, 0.15715, 0.16076
fenwick_inline_PM2R      : 0.16048, 0.16160, 0.16403
ltree_count_PM2R         : 0.16213, 0.16238, 0.16428
perm_fenwick_PM2R        : 0.16408, 0.16416, 0.16449
count_inversions_NiklasB : 0.19755, 0.19833, 0.19897
merge_count_BM           : 0.23736, 0.23793, 0.23912
perm_sum_PM2R            : 0.32946, 0.32969, 0.33277
rank_sum_PM2R            : 0.34637, 0.34756, 0.34858

Size = 2560, hi = 1280, 8 loops
solutionE_TimBabych      : 0.10898, 0.11005, 0.11025
perm_radixI_PM2R         : 0.33345, 0.33352, 0.37656
ltree_count_PM2R         : 0.34670, 0.34786, 0.34833
perm_fenwick_PM2R        : 0.34816, 0.34879, 0.35214
fenwick_inline_PM2R      : 0.36196, 0.36455, 0.36741
solution_python          : 0.36498, 0.36637, 0.40887
count_inversions_NiklasB : 0.42274, 0.42745, 0.42995
merge_count_BM           : 0.50799, 0.50898, 0.50917
perm_sum_PM2R            : 1.27773, 1.27897, 1.27951
rank_sum_PM2R            : 1.29728, 1.30389, 1.30448

Size = 5120, hi = 2560, 8 loops
solutionE_TimBabych      : 0.26914, 0.26993, 0.27253
perm_radixI_PM2R         : 0.71416, 0.71634, 0.71753
perm_fenwick_PM2R        : 0.71976, 0.72078, 0.72078
fenwick_inline_PM2R      : 0.72776, 0.72804, 0.73143
ltree_count_PM2R         : 0.81972, 0.82043, 0.82290
solution_python          : 0.83714, 0.83756, 0.83962
count_inversions_NiklasB : 0.87282, 0.87395, 0.92087
merge_count_BM           : 1.09496, 1.09584, 1.10207
rank_sum_PM2R            : 5.02564, 5.06277, 5.06666
perm_sum_PM2R            : 5.09088, 5.12999, 5.13512

Size = 10240, hi = 5120, 8 loops
solutionE_TimBabych      : 0.71556, 0.71718, 0.72201
perm_radixI_PM2R         : 1.54785, 1.55096, 1.55515
perm_fenwick_PM2R        : 1.55103, 1.55353, 1.59298
fenwick_inline_PM2R      : 1.57118, 1.57240, 1.57271
ltree_count_PM2R         : 1.76240, 1.76247, 1.80944
count_inversions_NiklasB : 1.86543, 1.86851, 1.87208
solution_python          : 2.01490, 2.01519, 2.06423
merge_count_BM           : 2.35215, 2.35301, 2.40023
rank_sum_PM2R            : 20.07048, 20.08399, 20.13200
perm_sum_PM2R            : 20.10187, 20.12551, 20.12683

Run 3
Size = 20480, hi = 10240, 4 loops
solutionE_TimBabych      : 1.07636, 1.08243, 1.09569
perm_radixI_PM2R         : 1.59579, 1.60519, 1.61785
perm_fenwick_PM2R        : 1.66885, 1.68549, 1.71109
fenwick_inline_PM2R      : 1.72073, 1.72752, 1.77217
ltree_count_PM2R         : 1.96900, 1.97820, 2.02578
count_inversions_NiklasB : 2.03257, 2.05005, 2.18548
merge_count_BM           : 2.46768, 2.47377, 2.52133
solution_python          : 2.49833, 2.50179, 3.79819

Size = 40960, hi = 20480, 4 loops
solutionE_TimBabych      : 3.51733, 3.52008, 3.56996
perm_radixI_PM2R         : 3.51736, 3.52365, 3.56459
perm_fenwick_PM2R        : 3.76097, 3.80900, 3.87974
fenwick_inline_PM2R      : 3.95099, 3.96300, 3.99748
ltree_count_PM2R         : 4.49866, 4.54652, 5.39716
count_inversions_NiklasB : 4.61851, 4.64303, 4.73026
merge_count_BM           : 5.31945, 5.35378, 5.35951
solution_python          : 6.78756, 6.82911, 6.98217

Size = 81920, hi = 40960, 4 loops
perm_radixI_PM2R         : 7.68723, 7.71986, 7.72135
perm_fenwick_PM2R        : 8.52404, 8.53349, 8.53710
fenwick_inline_PM2R      : 8.97082, 8.97561, 8.98347
ltree_count_PM2R         : 10.01142, 10.01426, 10.03216
count_inversions_NiklasB : 10.60807, 10.62424, 10.70425
merge_count_BM           : 11.42149, 11.42342, 11.47003
solutionE_TimBabych      : 12.83390, 12.83485, 12.89747
solution_python          : 19.66092, 19.67067, 20.72204

Size = 163840, hi = 81920, 4 loops
perm_radixI_PM2R         : 17.14153, 17.16885, 17.22240
perm_fenwick_PM2R        : 19.25944, 19.27844, 20.27568
fenwick_inline_PM2R      : 19.78221, 19.80219, 19.80766
ltree_count_PM2R         : 22.42240, 22.43259, 22.48837
count_inversions_NiklasB : 22.97341, 23.01516, 23.98052
merge_count_BM           : 24.42683, 24.48559, 24.51488
solutionE_TimBabych      : 60.96006, 61.20145, 63.71835
solution_python          : 73.75132, 73.79854, 73.95874

Size = 327680, hi = 163840, 4 loops
perm_radixI_PM2R         : 36.56715, 36.60221, 37.05071
perm_fenwick_PM2R        : 42.21616, 42.21838, 42.26053
fenwick_inline_PM2R      : 43.04987, 43.09075, 43.13287
ltree_count_PM2R         : 49.87400, 50.08509, 50.69292
count_inversions_NiklasB : 50.74591, 50.75012, 50.75551
merge_count_BM           : 52.37284, 52.51491, 53.43003
solutionE_TimBabych      : 373.67198, 377.03341, 377.42360
solution_python          : 411.69178, 411.92691, 412.83856

Size = 655360, hi = 327680, 4 loops
perm_radixI_PM2R         : 78.51927, 78.66327, 79.46325
perm_fenwick_PM2R        : 90.64711, 90.80328, 91.76126
fenwick_inline_PM2R      : 93.32482, 93.39086, 94.28880
count_inversions_NiklasB : 107.74393, 107.80036, 108.71443
ltree_count_PM2R         : 109.11328, 109.23592, 110.18247
merge_count_BM           : 111.05633, 111.07840, 112.05861
solutionE_TimBabych      : 1830.46443, 1836.39960, 1849.53918
solution_python          : 1911.03692, 1912.04484, 1914.69786

In Java, the brute-force algorithm can run faster than the piggybacked merge-sort algorithm; this is because of the run-time optimization done by the JIT (dynamic) compiler.

For the brute-force version, loop-unrolling optimization will give much better results.


Another Python solution:

def inv_cnt(a):
    n = len(a)
    if n == 1:
        return a, 0
    left = a[0:n//2]   # should be smaller
    left, cnt1 = inv_cnt(left)
    right = a[n//2:]   # should be larger
    right, cnt2 = inv_cnt(right)

    cnt = 0
    i_left = i_right = i_a = 0
    while i_a < n:
        if (i_right >= len(right)) or (i_left < len(left) and left[i_left] <= right[i_right]):
            a[i_a] = left[i_left]
            i_left += 1
        else:
            a[i_a] = right[i_right]
            i_right += 1
            if i_left < len(left):
                cnt += len(left) - i_left
        i_a += 1

    return (a, (cnt1 + cnt2 + cnt))

            while left_idx < len(left) and right_idx < len(right):
                #if left[left_idx] > 2 * right[right_idx]:
                if left[left_idx] > right[right_idx]:
                    count += len(left) - left_idx
                    right_idx += 1
                else:
                    left_idx += 1

            #merging the sorted list
            left_idx, right_idx = 0, 0
            while left_idx < len(left) and right_idx < len(right):
                if left[left_idx] > right[right_idx]:
                    merged_output += [right[right_idx]]
                    right_idx += 1
                else:
                    merged_output += [left[left_idx]]
                    left_idx += 1
            if left_idx == len(left):
                merged_output += right[right_idx:]
            else:
                merged_output += left[left_idx:]
        return (count, merged_output)

    def partition(nums):
        count = 0
        if len(nums) == 1 or not nums:
            return (0, nums)
        pivot = len(nums)//2
        left_count, l = partition(nums[:pivot])
        right_count, r = partition(nums[pivot:])
        temp_count, temp_list = merge(l, r)
        return (temp_count + left_count + right_count, temp_list)
    return partition(nums)[0]

# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
# PM 2Ring
def merge_PM2R(seq):
    seq, count = merge_sort_count_PM2R(seq)
    return count

def merge_sort_count_PM2R(seq):
    mid = len(seq) // 2
    if mid == 0:
        return seq, 0
    left, left_total = merge_sort_count_PM2R(seq[:mid])
    right, right_total = merge_sort_count_PM2R(seq[mid:])
    total = left_total + right_total
    result = []
    i = j = 0
    left_len, right_len = len(left), len(right)
    while i < left_len and j < right_len:
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
            total += left_len - i
    result.extend(left[i:])
    result.extend(right[j:])
    return result, total

def rank_sum_PM2R(a):
    total = 0
    counts = [0] * len(a)
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        total += sum(counts[:i])
        counts[i] += 1
    return total

# Fenwick tree functions adapted from C code on Wikipedia
def fen_sum(tree, i):
    ''' Return the sum of the first i elements, 0 through i-1 '''
    total = 0
    while i:
        total += tree[i-1]
        i -= i & -i
    return total

def fen_add(tree, delta, i):
    ''' Add delta to element i and thus 
        to fen_sum(tree, j) for all j > i 
    '''
    size = len(tree)
    while i < size:
        tree[i] += delta
        i += (i+1) & -(i+1)

def fenwick_PM2R(a):
    total = 0
    counts = [0] * len(a)
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        total += fen_sum(counts, i)
        fen_add(counts, 1, i)
    return total

def fenwick_inline_PM2R(a):
    total = 0
    size = len(a)
    counts = [0] * size
    rank = {v: i for i, v in enumerate(sorted(a))}
    for u in reversed(a):
        i = rank[u]
        j = i + 1
        while i:
            total += counts[i]
            i -= i & -i
        while j < size:
            counts[j] += 1
            j += j & -j
    return total

def bruteforce_loops_PM2R(a):
    total = 0
    for i in range(1, len(a)):
        u = a[i]
        for j in range(i):
            if a[j] > u:
                total += 1
    return total

def bruteforce_sum_PM2R(a):
    return sum(1 for i in range(1, len(a)) for j in range(i) if a[j] > a[i])

# Using binary tree counting, derived from C++ code (?) by prasadvk
# https://stackoverflow.com/a/16056139
def ltree_count_PM2R(a):
    total, root = 0, None
    for u in a:
        # Store data in a list-based tree structure
        # [data, count, left_child, right_child]
        p = [u, 0, None, None]
        if root is None:
            root = p
            continue
        q = root
        while True:
            if p[0] < q[0]:
                total += 1 + q[1]
                child = 2
            else:
                q[1] += 1
                child = 3
            if q[child]:
                q = q[child]
            else:
                q[child] = p
                break
    return total

# Counting based on radix sort, recursive version
def radix_partition_rec(a, L):
    if len(a) < 2:
        return 0
    if len(a) == 2:
        return a[1] < a[0]
    left, right = [], []
    count = 0
    for u in a:
        if u & L:
            right.append(u)
        else:
            count += len(right)
            left.append(u)
    L >>= 1
    if L:
        count += radix_partition_rec(left, L) + radix_partition_rec(right, L)
    return count

# The following functions determine swaps using a permutation of 
# range(len(a)) that has the same inversion count as `a`. We can create
# this permutation with `sorted(range(len(a)), key=lambda k: a[k])`
# but `sorted(range(len(a)), key=a.__getitem__)` is a little faster.

# Counting based on radix sort, iterative version
def radix_partition_iter(seq, L):
    count = 0
    parts = [seq]
    while L and parts:
        newparts = []
        for a in parts:
            if len(a) < 2:
                continue
            if len(a) == 2:
                count += a[1] < a[0]
                continue
            left, right = [], []
            for u in a:
                if u & L:
                    right.append(u)
                else:
                    count += len(right)
                    left.append(u)
            if left:
                newparts.append(left)
            if right:
                newparts.append(right)
        parts = newparts
        L >>= 1
    return count

def perm_radixR_PM2R(a):
    size = len(a)
    b = sorted(range(size), key=a.__getitem__)
    n = size.bit_length() - 1
    return radix_partition_rec(b, 1 << n)

def perm_radixI_PM2R(a):
    size = len(a)
    b = sorted(range(size), key=a.__getitem__)
    n = size.bit_length() - 1
    return radix_partition_iter(b, 1 << n)

# Plain sum of the counts of the permutation
def perm_sum_PM2R(a):
    total = 0
    size = len(a)
    counts = [0] * size
    for i in reversed(sorted(range(size), key=a.__getitem__)):
        total += sum(counts[:i])
        counts[i] = 1
    return total

# Fenwick sum of the counts of the permutation
def perm_fenwick_PM2R(a):
    total = 0
    size = len(a)
    counts = [0] * size
    for i in reversed(sorted(range(size), key=a.__getitem__)):
        j = i + 1
        while i:
            total += counts[i]
            i -= i & -i
        while j < size:
            counts[j] += 1
            j += j & -j
    return total

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# All the inversion-counting functions
funcs = (
    solution_TimBabych,
    solutionE_TimBabych,
    solution_python,
    count_inversion_mkso,
    count_inversions_NiklasB,
    merge_count_BM,
    inv_cnt_ZheHu,
    reversePairs_nomanpouigt,
    fenwick_PM2R,
    fenwick_inline_PM2R,
    merge_PM2R,
    rank_sum_PM2R,
    bruteforce_loops_PM2R,
    bruteforce_sum_PM2R,
    ltree_count_PM2R,
    perm_radixR_PM2R,
    perm_radixI_PM2R,
    perm_sum_PM2R,
    perm_fenwick_PM2R,
)

def time_test(seq, loops, verify=False):
    orig = seq
    timings = []
    for func in funcs:
        seq = orig.copy()
        value = func(seq) if verify else None
        t = Timer(lambda: func(seq))
        result = sorted(t.repeat(3, loops))
        timings.append((result, func.__name__, value))
        assert seq==orig, 'Sequence altered by {}!'.format(func.__name__)
    first = timings[0][-1]
    timings.sort()
    for result, name, value in timings:
        result = ', '.join([format(u, '.5f') for u in result])
        print('{:24} : {}'.format(name, result))

    if verify:
        # Check that all results are identical
        bad = ['%s: %d' % (name, value)
            for _, name, value in timings if value != first]
        if bad:
            print('ERROR. Value: {}, bad: {}'.format(first, ', '.join(bad)))
        else:
            print('Value: {}'.format(first))
    print()

#Run the tests
size, loops = 5, 1 << 12
verify = True
for _ in range(7):
    hi = size // 2
    print('Size = {}, hi = {}, {} loops'.format(size, hi, loops))
    seq = [randrange(hi) for _ in range(size)]
    time_test(seq, loops, verify)
    loops >>= 1
    size <<= 1

#size, loops = 640, 8
#verify = False
#for _ in range(5):
    #hi = size // 2
    #print('Size = {}, hi = {}, {} loops'.format(size, hi, loops))
    #seq = [randrange(hi) for _ in range(size)]
    #time_test(seq, loops, verify)
    #size <<= 1

#size, loops = 163840, 4
#verify = False
#for _ in range(3):
    #hi = size // 2
    #print('Size = {}, hi = {}, {} loops'.format(size, hi, loops))
    #seq = [randrange(hi) for _ in range(size)]
    #time_test(seq, loops, verify)
    #size <<= 1

Please see the separate community wiki answer below for the output.


Another Python solution, a short one. It makes use of the built-in bisect module, which provides functions to insert an element into its place in a sorted array and to find the index of an element in a sorted array.

The idea is to keep the elements to the left of the n-th element in such an array, which allows us to easily find the number of them greater than the n-th element.

import bisect
def solution(A):
    sorted_left = []
    res = 0
    for i in xrange(1, len(A)):
        bisect.insort_left(sorted_left, A[i-1])
        # i is also the length of sorted_left
        res += (i - bisect.bisect(sorted_left, A[i]))
    return res

Here is an O(n*log(n)) Perl implementation:

sub sort_and_count {
    my ($arr, $n) = @_;
    return ($arr, 0) unless $n > 1;

    my $mid = $n % 2 == 1 ? ($n-1)/2 : $n/2;
    my @left = @$arr[0..$mid-1];
    my @right = @$arr[$mid..$n-1];

    my ($sleft, $x) = sort_and_count( \@left, $mid );
    my ($sright, $y) = sort_and_count( \@right, $n-$mid);
    my ($merged, $z) = merge_and_countsplitinv( $sleft, $sright, $n );

    return ($merged, $x+$y+$z);
}

sub merge_and_countsplitinv {
    my ($left, $right, $n) = @_;

    my ($l_c, $r_c) = ($#$left+1, $#$right+1);
    my ($i, $j) = (0, 0);
    my @merged;
    my $inv = 0;

    for my $k (0..$n-1) {
        if ($i<$l_c && $j<$r_c) {
            if ( $left->[$i] < $right->[$j]) {
                push @merged, $left->[$i];
                $i+=1;
            } else {
                push @merged, $right->[$j];
                $j+=1;
                $inv += $l_c - $i;
            }
        } else {
            if ($i>=$l_c) {
                push @merged, @$right[ $j..$#$right ];
            } else {
                push @merged, @$left[ $i..$#$left ];
            }
            last;
        }
    }

    return (\@merged, $inv);
}

Here is my take using Scala:

trait MergeSort {
  def mergeSort(ls: List[Int]): List[Int] = {
    def merge(ls1: List[Int], ls2: List[Int]): List[Int] =
      (ls1, ls2) match {
        case (_, Nil) => ls1
        case (Nil, _) => ls2
        case (lowsHead :: lowsTail, highsHead :: highsTail) =>
          if (lowsHead <= highsHead) lowsHead :: merge(lowsTail, ls2)
          else highsHead :: merge(ls1, highsTail)
      }

    ls match {
      case Nil => Nil
      case head :: Nil => ls
      case _ =>
        val (lows, highs) = ls.splitAt(ls.size / 2)
        merge(mergeSort(lows), mergeSort(highs))
    }
  }
}

object InversionCounterApp extends App with MergeSort {
  // At each step the sorted remainder holds the current element plus everything to its right,
  // so the element's index in it is the number of smaller elements to its right.
  @annotation.tailrec
  def calculate(list: List[Int], sortedRemainder: List[Int], counter: Int = 0): Int =
    list match {
      case Nil => counter
      case head :: tail =>
        val idx = sortedRemainder.indexOf(head)
        calculate(tail, sortedRemainder.patch(idx, Nil, 1), counter + idx)
    }

  val list: List[Int] = List(6, 9, 1, 14, 8, 12, 3, 2)
  println("inversion counter = " + calculate(list, mergeSort(list)))
  // prints: inversion counter = 16 
}

The number of inversions in an array is half the total distance elements must be moved in order to sort the array. Therefore, it can be computed by sorting the array, maintaining the resulting permutation p[i], and then computing the sum of abs(p[i]-i)/2. This takes O(n log n) time, which is optimal.

An alternative method is given at http://mathworld.wolfram.com/PermutationInversion.html. This method is equivalent to the sum of max(0, p[i]-i), which is equal to the sum of abs(p[i]-i)/2 since the total distance elements move left is equal to the total distance elements move to the right.

EDIT: This method is wrong (see comments), and there is unfortunately no way to fix it while preserving the character of the method. For example, [3, 2, 1] contains 3 inversions, but the total displacement is only 4, so half of it gives 2.


I wonder why nobody mentioned binary-indexed trees yet. You can use one to maintain prefix sums on the values of your permutation elements. Then you can just proceed from right to left and count for every element the number of elements smaller than it to the right:

def count_inversions(a):
  res = 0
  counts = [0]*(len(a)+1)
  rank = { v : i+1 for i, v in enumerate(sorted(a)) }
  for x in reversed(a):
    i = rank[x] - 1
    while i:
      res += counts[i]
      i -= i & -i
    i = rank[x]
    while i <= len(a):
      counts[i] += 1
      i += i & -i
  return res

The complexity is O(n log n), and the constant factor is very low.
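As a quick sanity check (my own addition, not part of the original answer), running count_inversions on the example array used elsewhere in this thread:

print(count_inversions([6, 9, 1, 14, 8, 12, 3, 2]))   # should print 16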


My answer in Python:

  1. Sort the array first and make a copy of it; in my program, B represents the sorted array.
  2. Iterate over the original (unsorted) array and find the index of each element in the sorted list using a binary search. Also note down the index of the element.
  3. Make sure the element doesn't have any duplicates; if it does, keep decreasing the index by 1 until it points at the first of the duplicates. The while condition in my program does exactly that.
  4. That index is the number of inversions for the element; keep adding it to the count, and remove the element from B once its inversions have been counted.

def binarySearch(alist, item):
    first = 0
    last = len(alist) - 1
    found = False

    while first <= last and not found:
        midpoint = (first + last)//2
        if alist[midpoint] == item:
            return midpoint
        else:
            if item < alist[midpoint]:
                last = midpoint - 1
            else:
                first = midpoint + 1

def solution(A):

    B = list(A)
    B.sort()
    inversion_count = 0
    for i in range(len(A)):
        j = binarySearch(B, A[i])
        while B[j] == B[j - 1]:
            if j < 1:
                break
            j -= 1

        inversion_count += j
        B.pop(j)

    if inversion_count > 1000000000:
        return -1
    else:
        return inversion_count

print solution([4, 10, 11, 1, 3, 9, 10])

Well, I have a different solution, but I am afraid it would work only for distinct array elements.

//Code
#include <bits/stdc++.h>
using namespace std;

int main()
{
    int i,n;
    cin >> n;
    int arr[n],inv[n];
    for(i=0;i<n;i++){
        cin >> arr[i];
    }
    vector<int> v;
    v.push_back(arr[n-1]);
    inv[n-1]=0;
    for(i=n-2;i>=0;i--){
        auto it = lower_bound(v.begin(),v.end(),arr[i]); 
        //calculating least element in vector v which is greater than arr[i]
        inv[i]=it-v.begin();
        //calculating distance from starting of vector
        v.insert(it,arr[i]);
        //inserting that element into vector v
    }
    for(i=0;i<n;i++){
        cout << inv[i] << " ";
    }
    cout << endl;
    return 0;
}

To explain my code: we keep adding elements from the end of the array. For any incoming array element, we find the index of the first element in vector v which is greater than the incoming element and assign that value as the inversion count for the incoming element's index. After that we insert the element into vector v at its correct position, so that v remains in sorted order.

//INPUT     
4
2 1 4 3

//OUTPUT    
1 0 1 0

//To calculate total inversion count just add up all the elements in output array
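For comparison, here is a rough Python sketch of the same idea (my own illustration, not the answerer's code; the function name is made up). It uses the bisect module to keep the elements to the right of the current index in sorted order:

from bisect import bisect_left, insort

def per_index_inversions(arr):
    # inv[i] = number of positions j > i with arr[j] < arr[i]
    seen = []                                # elements to the right of index i, kept sorted
    inv = [0] * len(arr)
    for i in range(len(arr) - 1, -1, -1):
        inv[i] = bisect_left(seen, arr[i])   # how many smaller elements lie to the right
        insort(seen, arr[i])
    return inv

print(per_index_inversions([2, 1, 4, 3]))    # [1, 0, 1, 0]; summing the list gives 2 inversions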

I've found it in O(n * log n) time by the following method.

  1. Merge sort array A and create a copy (array B)
  2. Take A[1] and find its position in sorted array B via a binary search. The number of inversions for this element will be one less than the index number of its position in B since every lower number that appears after the first element of A will be an inversion.

    2a. accumulate the number of inversions to counter variable num_inversions.

    2b. remove A[1] from array A and also from its corresponding position in array B

  3. rerun from step 2 until there are no more elements in A.

Here’s an example run of this algorithm. Original array A = (6, 9, 1, 14, 8, 12, 3, 2)

1: Merge sort and copy to array B

B = (1, 2, 3, 6, 8, 9, 12, 14)

2: Take A[1] and binary search to find it in array B

A[1] = 6

B = (1, 2, 3, 6, 8, 9, 12, 14)

6 is in the 4th position of array B, thus there are 3 inversions. We know this because 6 was in the first position in array A, thus any lower value element that subsequently appears in array A would have an index of j > i (since i in this case is 1).

2.b: Remove A[1] from array A and also from its corresponding position in array B (each array is shown before and after the removal).

A = (6, 9, 1, 14, 8, 12, 3, 2) = (9, 1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 6, 8, 9, 12, 14) = (1, 2, 3, 8, 9, 12, 14)

3: Rerun from step 2 on the new A and B arrays.

A[1] = 9

B = (1, 2, 3, 8, 9, 12, 14)

9 is now in the 5th position of array B, thus there are 4 inversions. We know this because 9 was in the first position in array A, thus any lower value element that subsequently appears would have an index of j > i (since i in this case is again 1). Remove A[1] from array A and also from its corresponding position in array B (each array is shown before and after the removal).

A = (9, 1, 14, 8, 12, 3, 2) = (1, 14, 8, 12, 3, 2)

B = (1, 2, 3, 8, 9, 12, 14) = (1, 2, 3, 8, 12, 14)

Continuing in this vein will give us the total number of inversions for array A once the loop is complete.

Step 1 (merge sort) would take O(n * log n) to execute. Step 2 would execute n times and at each execution would perform a binary search that takes O(log n) to run for a total of O(n * log n). Total running time would thus be O(n * log n) + O(n * log n) = O(n * log n).

Thanks for your help. Writing out the sample arrays on a piece of paper really helped to visualize the problem.
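A small Python sketch of these steps (my own illustration; the function name is made up). Note that in this list-based version the two pops are linear-time operations, so the sketch itself degrades to quadratic time even though each search is O(log n):

from bisect import bisect_left

def count_inversions_sorted_copy(a):
    a = list(a)
    b = sorted(a)                  # step 1: the sorted copy, array B
    num_inversions = 0
    while a:                       # step 3: repeat until A is empty
        x = a.pop(0)               # step 2: take the first remaining element of A
        j = bisect_left(b, x)      # its position in B; j elements of B are smaller than x
        num_inversions += j        # step 2a: all of them appear after x in A, so they are inversions
        b.pop(j)                   # step 2b: remove x from B as well
    return num_inversions

print(count_inversions_sorted_copy([6, 9, 1, 14, 8, 12, 3, 2]))   # 16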


Solution One. Works well when there is a large amount of numbers.

def countInversions(arr):
    n = len(arr)
    if n == 1:
        return 0
    n1 = n // 2
    n2 = n - n1
    arr1 = arr[:n1]
    arr2 = arr[n1:]
    # print(n1,'||',n1,'||',arr1,'||',arr2)
    ans = countInversions(arr1) + countInversions(arr2)
    # print(ans)
    i1 = 0
    i2 = 0
    for i in range(n):
        # print(i1,n1,i2,n2)
        if i1 < n1 and (i2 >= n2 or arr1[i1] <= arr2[i2]):
            arr[i] = arr1[i1]
            ans += i2
            i1 += 1
        elif i2 < n2:
            arr[i] = arr2[i2]
            i2 += 1
    return ans

Solution Two. A simple O(n²) solution.

def countInversions(arr):
    count = 0
    for i in range(len(arr)):
        for j in range(i, len(arr)):
            # print(arr[i:len(arr)])
            if arr[i] > arr[j]:
                print(arr[i], arr[j])
                count += 1
    print(count)

Here is one possible solution using a variation of a binary tree. It adds a field called rightSubTreeSize to each tree node. We keep inserting numbers into the binary tree in the order they appear in the array. If a number goes to the left-hand side of a node, the inversion count for that element increases by (1 + rightSubTreeSize), since all those elements are greater than the current element and appeared earlier in the array. If the element goes to the right-hand side of a node, we just increase that node's rightSubTreeSize. Following is the code.

struct Node { 
    int data;
    Node *left, *right;
    int rightSubTreeSize;

    Node(int data) : data(data), left(NULL), right(NULL), rightSubTreeSize(0) {}
};

int countInversions(int a[], int n)
{
  Node* root = NULL;
  int totCnt = 0;
  for(int i = 0; i < n; ++i) { 
    Node* p = new Node(a[i]);
    if(root == NULL) { 
        root = p;
        continue;
    } 

    Node* q = root;
    int curCnt = 0;
    while(q) { 
        if(p->data <= q->data) { 
            curCnt += 1 + q->rightSubTreeSize;
            if(q->left) { 
                q = q->left;
            } else { 
                q->left = p;
                break;
            }
        } else { 
            q->rightSubTreeSize++;
            if(q->right) { 
                q = q->right;
            } else { 
                q->right = p;
                break;
            }
        }
    }

    totCnt += curCnt;
  }
  return totCnt;
}
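A compact Python rendering of the same idea (my own sketch; it follows the ltree_count_PM2R function in the timing code earlier in this thread and uses a strict < comparison, so equal elements are not counted as inversions):

def count_inversions_bst(a):
    root = None
    total = 0
    for x in a:
        node = [x, 0, None, None]        # [data, rightSubTreeSize, left, right]
        if root is None:
            root = node
            continue
        q = root
        while True:
            if x < q[0]:
                # q and everything already in its right subtree came earlier and is larger than x
                total += 1 + q[1]
                child = 2
            else:
                q[1] += 1                # x will sit somewhere in q's right subtree
                child = 3
            if q[child] is None:
                q[child] = node
                break
            q = q[child]
    return total

print(count_inversions_bst([6, 9, 1, 14, 8, 12, 3, 2]))   # 16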

Here's my O(n log n) solution in Ruby:

def solution(t)
    sorted, inversion_count = sort_inversion_count(t)
    return inversion_count
end

def sort_inversion_count(t)
    midpoint = t.length / 2
    left_half = t[0...midpoint]
    right_half = t[midpoint..t.length]

    if midpoint == 0
        return t, 0
    end

    sorted_left_half, left_half_inversion_count = sort_inversion_count(left_half)
    sorted_right_half, right_half_inversion_count = sort_inversion_count(right_half)

    sorted = []
    inversion_count = 0
    while sorted_left_half.length > 0 or sorted_right_half.length > 0
        if sorted_left_half.empty?
            sorted.push sorted_right_half.shift
        elsif sorted_right_half.empty?
            sorted.push sorted_left_half.shift
        else
            if sorted_left_half[0] > sorted_right_half[0]
                inversion_count += sorted_left_half.length
                sorted.push sorted_right_half.shift
            else
                sorted.push sorted_left_half.shift
            end
        end
    end

    return sorted, inversion_count + left_half_inversion_count + right_half_inversion_count
end

And some test cases:

require "minitest/autorun"

class TestCodility < Minitest::Test
    def test_given_example
        a = [-1, 6, 3, 4, 7, 4]
        assert_equal solution(a), 4
    end

    def test_empty
        a = []
        assert_equal solution(a), 0
    end

    def test_singleton
        a = [0]
        assert_equal solution(a), 0
    end

    def test_none
        a = [1,2,3,4,5,6,7]
        assert_equal solution(a), 0
    end

    def test_all
        a = [5,4,3,2,1]
        assert_equal solution(a), 10
    end

    def test_clones
        a = [4,4,4,4,4,4]
        assert_equal solution(a), 0
    end
end

public static int mergeSort(int[] a, int p, int r)
{
    int countInversion = 0;
    if(p < r)
    {
        int q = (p + r)/2;
        countInversion = mergeSort(a, p, q);
        countInversion += mergeSort(a, q+1, r);
        countInversion += merge(a, p, q, r);
    }
    return countInversion;
}

public static int merge(int[] a, int p, int q, int r)
{
    //p=0, q=1, r=3
    int countingInversion = 0;
    int n1 = q-p+1;
    int n2 = r-q;
    int[] temp1 = new int[n1+1];
    int[] temp2 = new int[n2+1];
    for(int i=0; i<n1; i++) temp1[i] = a[p+i];
    for(int i=0; i<n2; i++) temp2[i] = a[q+1+i];

    temp1[n1] = Integer.MAX_VALUE;
    temp2[n2] = Integer.MAX_VALUE;
    int i = 0, j = 0;

    for(int k=p; k<=r; k++)
    {
        if(temp1[i] <= temp2[j])
        {
            a[k] = temp1[i];
            i++;
        }
        else
        {
            a[k] = temp2[j];
            j++;
            countingInversion=countingInversion+(n1-i); 
        }
    }
    return countingInversion;
}
public static void main(String[] args)
{
    int[] a = {1, 20, 6, 4, 5};
    int countInversion = mergeSort(a, 0, a.length-1);
    System.out.println(countInversion);
}

The maximum number of inversions possible for a list of size n is given by the expression:

maxPossibleInversions = (n * (n-1) ) / 2

So for an array of size 6 the maximum possible number of inversions equals 15: every pair is inverted when the list is in strictly decreasing order, and there are n * (n-1) / 2 pairs.

To achieve a complexity of n log n we can piggyback the inversion count on merge sort.

Here are the generalized steps:

  • Split the array into two
  • Call the mergeSort routine. Whenever an element from the right subarray is merged before an element from the left subarray, do inversionCount += leftSubArray.length, since every element still remaining in the left subarray is greater than it.

That's it!

This is a simple example I made using JavaScript:

var arr = [6,5,4,3,2,1]; // Sample input array

var inversionCount = 0;

function mergeSort(arr) {
    if(arr.length == 1)
        return arr;

    if(arr.length > 1) {
        let breakpoint = Math.ceil((arr.length/2));
        // Left list starts with 0, breakpoint-1
        let leftList = arr.slice(0,breakpoint);
        // Right list starts with breakpoint, length-1
        let rightList = arr.slice(breakpoint,arr.length);

        // Make a recursive call
        leftList = mergeSort(leftList);
        rightList = mergeSort(rightList);

        var a = merge(leftList,rightList);
        return a;
    }
}

function merge(leftList,rightList) {
    let result = [];
    while(leftList.length && rightList.length) {
        /**
         * The shift() method removes the first element from an array
         * and returns that element. This method changes the length
         * of the array.
         */
        if(leftList[0] <= rightList[0]) {
            result.push(leftList.shift());
        }else{
            inversionCount += leftList.length;
            result.push(rightList.shift());
        }
    }

    while(leftList.length)
        result.push(leftList.shift());

    while(rightList.length)
        result.push(rightList.shift());

    console.log(result);
    return result;
}

mergeSort(arr);
console.log('Number of inversions: ' + inversionCount);

This answer contains the results of the timeit tests produced by the code in my main answer. Please see that answer for details!

count_inversions speed test results

Size = 5, hi = 2, 4096 loops
ltree_count_PM2R         : 0.04871, 0.04872, 0.04876
bruteforce_loops_PM2R    : 0.05696, 0.05700, 0.05776
solution_TimBabych       : 0.05760, 0.05822, 0.05943
solutionE_TimBabych      : 0.06642, 0.06704, 0.06760
bruteforce_sum_PM2R      : 0.07523, 0.07545, 0.07563
perm_sum_PM2R            : 0.09873, 0.09875, 0.09935
rank_sum_PM2R            : 0.10449, 0.10463, 0.10468
solution_python          : 0.13034, 0.13061, 0.13221
fenwick_inline_PM2R      : 0.14323, 0.14610, 0.18802
perm_radixR_PM2R         : 0.15146, 0.15203, 0.15235
merge_count_BM           : 0.16179, 0.16267, 0.16467
perm_radixI_PM2R         : 0.16200, 0.16202, 0.16768
perm_fenwick_PM2R        : 0.16887, 0.16920, 0.17075
merge_PM2R               : 0.18262, 0.18271, 0.18418
count_inversions_NiklasB : 0.19183, 0.19279, 0.20388
count_inversion_mkso     : 0.20060, 0.20141, 0.20398
inv_cnt_ZheHu            : 0.20815, 0.20841, 0.20906
fenwick_PM2R             : 0.22109, 0.22137, 0.22379
reversePairs_nomanpouigt : 0.29620, 0.29689, 0.30293
Value: 5

Size = 10, hi = 5, 2048 loops
solution_TimBabych       : 0.05954, 0.05989, 0.05991
solutionE_TimBabych      : 0.05970, 0.05972, 0.05998
perm_sum_PM2R            : 0.07517, 0.07519, 0.07520
ltree_count_PM2R         : 0.07672, 0.07677, 0.07684
bruteforce_loops_PM2R    : 0.07719, 0.07724, 0.07817
rank_sum_PM2R            : 0.08587, 0.08823, 0.08864
bruteforce_sum_PM2R      : 0.09470, 0.09472, 0.09484
solution_python          : 0.13126, 0.13154, 0.13185
perm_radixR_PM2R         : 0.14239, 0.14320, 0.14474
perm_radixI_PM2R         : 0.14632, 0.14669, 0.14679
fenwick_inline_PM2R      : 0.16796, 0.16831, 0.17030
perm_fenwick_PM2R        : 0.18189, 0.18212, 0.18638
merge_count_BM           : 0.19816, 0.19870, 0.19948
count_inversions_NiklasB : 0.21807, 0.22031, 0.22215
merge_PM2R               : 0.22037, 0.22048, 0.26106
fenwick_PM2R             : 0.24290, 0.24314, 0.24744
count_inversion_mkso     : 0.24895, 0.24899, 0.25205
inv_cnt_ZheHu            : 0.26253, 0.26259, 0.26590
reversePairs_nomanpouigt : 0.35711, 0.35762, 0.35973
Value: 20

Size = 20, hi = 10, 1024 loops
solutionE_TimBabych      : 0.05687, 0.05696, 0.05720
solution_TimBabych       : 0.06126, 0.06151, 0.06168
perm_sum_PM2R            : 0.06875, 0.06906, 0.07054
rank_sum_PM2R            : 0.07988, 0.07995, 0.08002
ltree_count_PM2R         : 0.11232, 0.11239, 0.11257
bruteforce_loops_PM2R    : 0.12553, 0.12584, 0.12592
solution_python          : 0.13472, 0.13540, 0.13694
bruteforce_sum_PM2R      : 0.15820, 0.15849, 0.16021
perm_radixI_PM2R         : 0.17101, 0.17148, 0.17229
perm_radixR_PM2R         : 0.17891, 0.18087, 0.18366
perm_fenwick_PM2R        : 0.20554, 0.20708, 0.21412
fenwick_inline_PM2R      : 0.21161, 0.21163, 0.22047
merge_count_BM           : 0.24125, 0.24261, 0.24565
count_inversions_NiklasB : 0.25712, 0.25754, 0.25778
merge_PM2R               : 0.26477, 0.26566, 0.31297
fenwick_PM2R             : 0.28178, 0.28216, 0.29069
count_inversion_mkso     : 0.30286, 0.30290, 0.30652
inv_cnt_ZheHu            : 0.32024, 0.32041, 0.32447
reversePairs_nomanpouigt : 0.45812, 0.45822, 0.46172
Value: 98

Size = 40, hi = 20, 512 loops
solutionE_TimBabych      : 0.05784, 0.05787, 0.05958
solution_TimBabych       : 0.06452, 0.06475, 0.06479
perm_sum_PM2R            : 0.07254, 0.07261, 0.07263
rank_sum_PM2R            : 0.08537, 0.08540, 0.08572
ltree_count_PM2R         : 0.11744, 0.11749, 0.11792
solution_python          : 0.14262, 0.14285, 0.14465
perm_radixI_PM2R         : 0.18774, 0.18776, 0.18922
perm_radixR_PM2R         : 0.19425, 0.19435, 0.19609
bruteforce_loops_PM2R    : 0.21500, 0.21511, 0.21686
perm_fenwick_PM2R        : 0.23338, 0.23375, 0.23674
fenwick_inline_PM2R      : 0.24947, 0.24958, 0.25189
bruteforce_sum_PM2R      : 0.27627, 0.27646, 0.28041
merge_count_BM           : 0.28059, 0.28128, 0.28294
count_inversions_NiklasB : 0.28557, 0.28759, 0.29022
merge_PM2R               : 0.29886, 0.29928, 0.30317
fenwick_PM2R             : 0.30241, 0.30259, 0.35237
count_inversion_mkso     : 0.34252, 0.34356, 0.34441
inv_cnt_ZheHu            : 0.37468, 0.37569, 0.37847
reversePairs_nomanpouigt : 0.50725, 0.50770, 0.50943
Value: 369

Size = 80, hi = 40, 256 loops
solutionE_TimBabych      : 0.06339, 0.06373, 0.06513
solution_TimBabych       : 0.06984, 0.06994, 0.07009
perm_sum_PM2R            : 0.09171, 0.09172, 0.09186
rank_sum_PM2R            : 0.10468, 0.10474, 0.10500
ltree_count_PM2R         : 0.14416, 0.15187, 0.18541
solution_python          : 0.17415, 0.17423, 0.17451
perm_radixI_PM2R         : 0.20676, 0.20681, 0.20936
perm_radixR_PM2R         : 0.21671, 0.21695, 0.21736
perm_fenwick_PM2R        : 0.26197, 0.26252, 0.26264
fenwick_inline_PM2R      : 0.28111, 0.28249, 0.28382
count_inversions_NiklasB : 0.31746, 0.32448, 0.32451
merge_count_BM           : 0.31964, 0.33842, 0.35276
merge_PM2R               : 0.32890, 0.32941, 0.33322
fenwick_PM2R             : 0.34355, 0.34377, 0.34873
count_inversion_mkso     : 0.37689, 0.37698, 0.38079
inv_cnt_ZheHu            : 0.42923, 0.42941, 0.43249
bruteforce_loops_PM2R    : 0.43544, 0.43601, 0.43902
bruteforce_sum_PM2R      : 0.52106, 0.52160, 0.52531
reversePairs_nomanpouigt : 0.57805, 0.58156, 0.58252
Value: 1467

Size = 160, hi = 80, 128 loops
solutionE_TimBabych      : 0.06766, 0.06784, 0.06963
solution_TimBabych       : 0.07433, 0.07489, 0.07516
perm_sum_PM2R            : 0.13143, 0.13175, 0.13179
rank_sum_PM2R            : 0.14428, 0.14440, 0.14922
solution_python          : 0.20072, 0.20076, 0.20084
ltree_count_PM2R         : 0.20314, 0.20583, 0.24776
perm_radixI_PM2R         : 0.23061, 0.23078, 0.23525
perm_radixR_PM2R         : 0.23894, 0.23915, 0.24234
perm_fenwick_PM2R        : 0.30984, 0.31181, 0.31503
fenwick_inline_PM2R      : 0.31933, 0.32680, 0.32722
merge_count_BM           : 0.36003, 0.36387, 0.36409
count_inversions_NiklasB : 0.36796, 0.36814, 0.37106
merge_PM2R               : 0.36847, 0.36848, 0.37127
fenwick_PM2R             : 0.37833, 0.37847, 0.38095
count_inversion_mkso     : 0.42746, 0.42747, 0.43184
inv_cnt_ZheHu            : 0.48969, 0.48974, 0.49293
reversePairs_nomanpouigt : 0.67791, 0.68157, 0.72420
bruteforce_loops_PM2R    : 0.82816, 0.83175, 0.83282
bruteforce_sum_PM2R      : 1.03322, 1.03378, 1.03562
Value: 6194

Size = 320, hi = 160, 64 loops
solutionE_TimBabych      : 0.07467, 0.07470, 0.07483
solution_TimBabych       : 0.08036, 0.08066, 0.08077
perm_sum_PM2R            : 0.21142, 0.21201, 0.25766
solution_python          : 0.22410, 0.22644, 0.22897
rank_sum_PM2R            : 0.22820, 0.22851, 0.22877
ltree_count_PM2R         : 0.24424, 0.24595, 0.24645
perm_radixI_PM2R         : 0.25690, 0.25710, 0.26191
perm_radixR_PM2R         : 0.26501, 0.26504, 0.26729
perm_fenwick_PM2R        : 0.33483, 0.33507, 0.33845
fenwick_inline_PM2R      : 0.34413, 0.34484, 0.35153
merge_count_BM           : 0.39875, 0.39919, 0.40302
fenwick_PM2R             : 0.40434, 0.40439, 0.40845
merge_PM2R               : 0.40814, 0.41531, 0.51417
count_inversions_NiklasB : 0.41681, 0.42009, 0.42128
count_inversion_mkso     : 0.47132, 0.47192, 0.47385
inv_cnt_ZheHu            : 0.54468, 0.54750, 0.54893
reversePairs_nomanpouigt : 0.76164, 0.76389, 0.80357
bruteforce_loops_PM2R    : 1.59125, 1.60430, 1.64131
bruteforce_sum_PM2R      : 2.03734, 2.03834, 2.03975
Value: 24959

Run 2

Size = 640, hi = 320, 8 loops
solutionE_TimBabych      : 0.04135, 0.04374, 0.04575
ltree_count_PM2R         : 0.06738, 0.06758, 0.06874
perm_radixI_PM2R         : 0.06928, 0.06943, 0.07019
fenwick_inline_PM2R      : 0.07850, 0.07856, 0.08059
perm_fenwick_PM2R        : 0.08151, 0.08162, 0.08170
perm_sum_PM2R            : 0.09122, 0.09133, 0.09221
rank_sum_PM2R            : 0.09549, 0.09603, 0.11270
merge_count_BM           : 0.10733, 0.10807, 0.11032
count_inversions_NiklasB : 0.12460, 0.19865, 0.20205
solution_python          : 0.13514, 0.13585, 0.13814

Size = 1280, hi = 640, 8 loops
solutionE_TimBabych      : 0.04714, 0.04742, 0.04752
perm_radixI_PM2R         : 0.15325, 0.15388, 0.15525
solution_python          : 0.15709, 0.15715, 0.16076
fenwick_inline_PM2R      : 0.16048, 0.16160, 0.16403
ltree_count_PM2R         : 0.16213, 0.16238, 0.16428
perm_fenwick_PM2R        : 0.16408, 0.16416, 0.16449
count_inversions_NiklasB : 0.19755, 0.19833, 0.19897
merge_count_BM           : 0.23736, 0.23793, 0.23912
perm_sum_PM2R            : 0.32946, 0.32969, 0.33277
rank_sum_PM2R            : 0.34637, 0.34756, 0.34858

Size = 2560, hi = 1280, 8 loops
solutionE_TimBabych      : 0.10898, 0.11005, 0.11025
perm_radixI_PM2R         : 0.33345, 0.33352, 0.37656
ltree_count_PM2R         : 0.34670, 0.34786, 0.34833
perm_fenwick_PM2R        : 0.34816, 0.34879, 0.35214
fenwick_inline_PM2R      : 0.36196, 0.36455, 0.36741
solution_python          : 0.36498, 0.36637, 0.40887
count_inversions_NiklasB : 0.42274, 0.42745, 0.42995
merge_count_BM           : 0.50799, 0.50898, 0.50917
perm_sum_PM2R            : 1.27773, 1.27897, 1.27951
rank_sum_PM2R            : 1.29728, 1.30389, 1.30448

Size = 5120, hi = 2560, 8 loops
solutionE_TimBabych      : 0.26914, 0.26993, 0.27253
perm_radixI_PM2R         : 0.71416, 0.71634, 0.71753
perm_fenwick_PM2R        : 0.71976, 0.72078, 0.72078
fenwick_inline_PM2R      : 0.72776, 0.72804, 0.73143
ltree_count_PM2R         : 0.81972, 0.82043, 0.82290
solution_python          : 0.83714, 0.83756, 0.83962
count_inversions_NiklasB : 0.87282, 0.87395, 0.92087
merge_count_BM           : 1.09496, 1.09584, 1.10207
rank_sum_PM2R            : 5.02564, 5.06277, 5.06666
perm_sum_PM2R            : 5.09088, 5.12999, 5.13512

Size = 10240, hi = 5120, 8 loops
solutionE_TimBabych      : 0.71556, 0.71718, 0.72201
perm_radixI_PM2R         : 1.54785, 1.55096, 1.55515
perm_fenwick_PM2R        : 1.55103, 1.55353, 1.59298
fenwick_inline_PM2R      : 1.57118, 1.57240, 1.57271
ltree_count_PM2R         : 1.76240, 1.76247, 1.80944
count_inversions_NiklasB : 1.86543, 1.86851, 1.87208
solution_python          : 2.01490, 2.01519, 2.06423
merge_count_BM           : 2.35215, 2.35301, 2.40023
rank_sum_PM2R            : 20.07048, 20.08399, 20.13200
perm_sum_PM2R            : 20.10187, 20.12551, 20.12683

Run 3

Size = 20480, hi = 10240, 4 loops
solutionE_TimBabych      : 1.07636, 1.08243, 1.09569
perm_radixI_PM2R         : 1.59579, 1.60519, 1.61785
perm_fenwick_PM2R        : 1.66885, 1.68549, 1.71109
fenwick_inline_PM2R      : 1.72073, 1.72752, 1.77217
ltree_count_PM2R         : 1.96900, 1.97820, 2.02578
count_inversions_NiklasB : 2.03257, 2.05005, 2.18548
merge_count_BM           : 2.46768, 2.47377, 2.52133
solution_python          : 2.49833, 2.50179, 3.79819

Size = 40960, hi = 20480, 4 loops
solutionE_TimBabych      : 3.51733, 3.52008, 3.56996
perm_radixI_PM2R         : 3.51736, 3.52365, 3.56459
perm_fenwick_PM2R        : 3.76097, 3.80900, 3.87974
fenwick_inline_PM2R      : 3.95099, 3.96300, 3.99748
ltree_count_PM2R         : 4.49866, 4.54652, 5.39716
count_inversions_NiklasB : 4.61851, 4.64303, 4.73026
merge_count_BM           : 5.31945, 5.35378, 5.35951
solution_python          : 6.78756, 6.82911, 6.98217

Size = 81920, hi = 40960, 4 loops
perm_radixI_PM2R         : 7.68723, 7.71986, 7.72135
perm_fenwick_PM2R        : 8.52404, 8.53349, 8.53710
fenwick_inline_PM2R      : 8.97082, 8.97561, 8.98347
ltree_count_PM2R         : 10.01142, 10.01426, 10.03216
count_inversions_NiklasB : 10.60807, 10.62424, 10.70425
merge_count_BM           : 11.42149, 11.42342, 11.47003
solutionE_TimBabych      : 12.83390, 12.83485, 12.89747
solution_python          : 19.66092, 19.67067, 20.72204

Size = 163840, hi = 81920, 4 loops
perm_radixI_PM2R         : 17.14153, 17.16885, 17.22240
perm_fenwick_PM2R        : 19.25944, 19.27844, 20.27568
fenwick_inline_PM2R      : 19.78221, 19.80219, 19.80766
ltree_count_PM2R         : 22.42240, 22.43259, 22.48837
count_inversions_NiklasB : 22.97341, 23.01516, 23.98052
merge_count_BM           : 24.42683, 24.48559, 24.51488
solutionE_TimBabych      : 60.96006, 61.20145, 63.71835
solution_python          : 73.75132, 73.79854, 73.95874

Size = 327680, hi = 163840, 4 loops
perm_radixI_PM2R         : 36.56715, 36.60221, 37.05071
perm_fenwick_PM2R        : 42.21616, 42.21838, 42.26053
fenwick_inline_PM2R      : 43.04987, 43.09075, 43.13287
ltree_count_PM2R         : 49.87400, 50.08509, 50.69292
count_inversions_NiklasB : 50.74591, 50.75012, 50.75551
merge_count_BM           : 52.37284, 52.51491, 53.43003
solutionE_TimBabych      : 373.67198, 377.03341, 377.42360
solution_python          : 411.69178, 411.92691, 412.83856

Size = 655360, hi = 327680, 4 loops
perm_radixI_PM2R         : 78.51927, 78.66327, 79.46325
perm_fenwick_PM2R        : 90.64711, 90.80328, 91.76126
fenwick_inline_PM2R      : 93.32482, 93.39086, 94.28880
count_inversions_NiklasB : 107.74393, 107.80036, 108.71443
ltree_count_PM2R         : 109.11328, 109.23592, 110.18247
merge_count_BM           : 111.05633, 111.07840, 112.05861
solutionE_TimBabych      : 1830.46443, 1836.39960, 1849.53918
solution_python          : 1911.03692, 1912.04484, 1914.69786

I recently had to do this in R:

inversionNumber <- function(x){
    mergeSort <- function(x){
        if(length(x) == 1){
            inv <- 0
        } else {
            n <- length(x)
            n1 <- ceiling(n/2)
            n2 <- n-n1
            y1 <- mergeSort(x[1:n1])
            y2 <- mergeSort(x[n1+1:n2])
            inv <- y1$inversions + y2$inversions
            x1 <- y1$sortedVector
            x2 <- y2$sortedVector
            i1 <- 1
            i2 <- 1
            while(i1+i2 <= n1+n2+1){
                if(i2 > n2 || i1 <= n1 && x1[i1] <= x2[i2]){
                    x[i1+i2-1] <- x1[i1]
                    i1 <- i1 + 1
                } else {
                    inv <- inv + n1 + 1 - i1
                    x[i1+i2-1] <- x2[i2]
                    i2 <- i2 + 1
                }
            }
        }
        return (list(inversions=inv,sortedVector=x))
    }
    r <- mergeSort(x)
    return (r$inversions)
}

C code that is easy to understand:

#include<stdio.h>
#include<stdlib.h>

//To print an array
void print(int arr[],int n)
{
    int i;
    for(i=0,printf("\n");i<n;i++)
        printf("%d ",arr[i]);
    printf("\n");
}

//Merge Sort
int merge(int arr[],int left[],int right[],int l,int r)
{
    int i=0,j=0,count=0;
    while(i<l || j<r)
    {
        if(i==l)
        {
            arr[i+j]=right[j];
            j++;
        }
        else if(j==r)
        {
            arr[i+j]=left[i];
            i++;
        }
        else if(left[i]<=right[j])
        {
            arr[i+j]=left[i];
            i++;
        }
        else
        {
            arr[i+j]=right[j];
            count+=l-i;
            j++;
        }
    }
    //printf("\ncount:%d\n",count);
    return count;
}

//Inversion Finding
int inversions(int arr[],int high)
{
    if(high<1)
        return 0;

    int mid=(high+1)/2;
    int left[mid];
    int right[high-mid+1];

    int i,j;
    for(i=0;i<mid;i++)
        left[i]=arr[i];


    for(i=high-mid,j=high;j>=mid;i--,j--)
        right[i]=arr[j];

    //print(arr,high+1);
    //print(left,mid);
    //print(right,high-mid+1);

    return inversions(left,mid-1) + inversions(right,high-mid) + merge(arr,left,right,mid,high-mid+1);

}
int main()
{
    int arr[]={6,9,1,14,8,12,3,2};
    int n=sizeof(arr)/sizeof(arr[0]);
    print(arr,n);
    printf("%d ",inversions(arr,n-1));
    return 0;
}

Java implementation:

import java.lang.reflect.Array;
import java.util.Arrays;


public class main {

public static void main(String[] args) {
    int[] arr = {6, 9, 1, 14, 8, 12, 3, 2};
    System.out.println(findinversion(arr,0,arr.length-1));
}

public static int findinversion(int[] arr,int beg,int end) {
    if(beg >= end)
        return 0;

    int[] result = new int[end-beg+1];
    int index = 0;
    int mid = (beg+end)/2;
    int count = 0, leftinv,rightinv;
    //System.out.println("...."+beg+"   "+end+"  "+mid);
    leftinv = findinversion(arr, beg, mid);
    rightinv = findinversion(arr, mid+1, end);
    l1:
    for(int i = beg, j = mid+1; i<=mid || j<=end;/*index < result.length;*/ ) {
        if(i>mid) {
            for(;j<=end;j++)
                result[index++]=arr[j];
            break l1;
        }
        if(j>end) {
            for(;i<=mid;i++)
                result[index++]=arr[i];
            break l1;
        }
        if(arr[i] <= arr[j]) {
            result[index++]=arr[i];
            i++;    
        } else {
            System.out.println(arr[i]+"  "+arr[j]);
            count = count+ mid-i+1;
            result[index++]=arr[j];
            j++;    
        }
    }

    for(int i = 0, j=beg; i< end-beg+1; i++,j++)
        arr[j]= result[i];
    return (count+leftinv+rightinv);
    //System.out.println(Arrays.toString(arr));
}

}

In Python

# O(n log n)

def count_inversion(lst):
    return merge_count_inversion(lst)[1]

def merge_count_inversion(lst):
    if len(lst) <= 1:
        return lst, 0
    middle = int( len(lst) / 2 )
    left, a = merge_count_inversion(lst[:middle])
    right, b = merge_count_inversion(lst[middle:])
    result, c = merge_count_split_inversion(left, right)
    return result, (a + b + c)

def merge_count_split_inversion(left, right):
    result = []
    count = 0
    i, j = 0, 0
    left_len = len(left)
    while i < left_len and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            count += left_len - i
            j += 1
    result += left[i:]
    result += right[j:]
    return result, count        


#test code
input_array_1 = []  #0
input_array_2 = [1] #0
input_array_3 = [1, 5]  #0
input_array_4 = [4, 1] #1
input_array_5 = [4, 1, 2, 3, 9] #3
input_array_6 = [4, 1, 3, 2, 9, 5]  #5
input_array_7 = [4, 1, 3, 2, 9, 1]  #8

print count_inversion(input_array_1)
print count_inversion(input_array_2)
print count_inversion(input_array_3)
print count_inversion(input_array_4)
print count_inversion(input_array_5)
print count_inversion(input_array_6)
print count_inversion(input_array_7)

Here is C code for counting inversions:

#include <stdio.h>
#include <stdlib.h>

int  _mergeSort(int arr[], int temp[], int left, int right);
int merge(int arr[], int temp[], int left, int mid, int right);

/* This function sorts the input array and returns the
   number of inversions in the array */
int mergeSort(int arr[], int array_size)
{
    int *temp = (int *)malloc(sizeof(int)*array_size);
    return _mergeSort(arr, temp, 0, array_size - 1);
}

/* An auxiliary recursive function that sorts the input array and
  returns the number of inversions in the array. */
int _mergeSort(int arr[], int temp[], int left, int right)
{
  int mid, inv_count = 0;
  if (right > left)
  {
    /* Divide the array into two parts and call _mergeSortAndCountInv()
       for each of the parts */
    mid = (right + left)/2;

    /* Inversion count will be sum of inversions in left-part, right-part
      and number of inversions in merging */
    inv_count  = _mergeSort(arr, temp, left, mid);
    inv_count += _mergeSort(arr, temp, mid+1, right);

    /*Merge the two parts*/
    inv_count += merge(arr, temp, left, mid+1, right);
  }
  return inv_count;
}

/* This function merges two sorted arrays and returns the inversion count in
   the arrays. */
int merge(int arr[], int temp[], int left, int mid, int right)
{
  int i, j, k;
  int inv_count = 0;

  i = left; /* i is index for left subarray*/
  j = mid;  /* j is index for right subarray*/
  k = left; /* k is index for resultant merged subarray*/
  while ((i <= mid - 1) && (j <= right))
  {
    if (arr[i] <= arr[j])
    {
      temp[k++] = arr[i++];
    }
    else
    {
      temp[k++] = arr[j++];

     /*this is tricky -- see above explanation/diagram for merge()*/
      inv_count = inv_count + (mid - i);
    }
  }

  /* Copy the remaining elements of left subarray
   (if there are any) to temp*/
  while (i <= mid - 1)
    temp[k++] = arr[i++];

  /* Copy the remaining elements of right subarray
   (if there are any) to temp*/
  while (j <= right)
    temp[k++] = arr[j++];

  /*Copy back the merged elements to original array*/
  for (i=left; i <= right; i++)
    arr[i] = temp[i];

  return inv_count;
}

/* Driver program to test above functions */
int main(int argc, char** argv)
{
  int arr[] = {1, 20, 6, 4, 5};
  printf(" Number of inversions are %d \n", mergeSort(arr, 5));
  getchar();
  return 0;
}

An explanation was given in detail here: http://www.geeksforgeeks.org/counting-inversions/

