[python] How to calculate the sum of all columns of a 2D numpy array (efficiently)

Let's say I have the following 2D numpy array consisting of four rows and three columns:

>>> a = numpy.arange(12).reshape(4,3)
>>> print(a)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

What would be an efficient way to generate a 1D array that contains the sum of each column (like [18, 22, 26])? Can this be done without looping through all columns?

Use numpy.sum. For your case, it is

col_sums = a.sum(axis=0)  # avoids shadowing the built-in sum

Other alternatives for summing the columns are

numpy.einsum('ij->j', a)


numpy.dot(a.T, numpy.ones(a.shape[0]))
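As a quick sanity check, the three expressions above can be compared on the array from the question. This is only a minimal sketch; the variable names (by_sum, by_einsum, by_dot) are illustrative and not part of the original answer. Note that the dot variant returns floats, because numpy.ones creates a float64 array by default.

```python
import numpy

a = numpy.arange(12).reshape(4, 3)

# Three ways to get one sum per column.
by_sum = a.sum(axis=0)
by_einsum = numpy.einsum('ij->j', a)
by_dot = numpy.dot(a.T, numpy.ones(a.shape[0]))

print(by_sum)     # [18 22 26]
print(by_einsum)  # [18 22 26]
print(by_dot)     # [18. 22. 26.]
```

If an integer result is needed from the dot variant, passing dtype=a.dtype to numpy.ones should keep the result in the input's dtype.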

If the number of rows and columns is of the same order of magnitude, all of the possibilities are roughly equally fast.

If there are only a few columns, however, both the einsum and the dot solution significantly outperform numpy's sum (the benchmark plot uses a log scale).

Code to reproduce the plots:

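import numpy
import perfplot

# Note: these kernels reduce along axis=1 (one sum per row). For the column
# sums discussed above, use axis=0, 'ij->j' and a.T with ones(a.shape[0]).
def numpy_sum(a):
    return numpy.sum(a, axis=1)

def einsum(a):
    return numpy.einsum('ij->i', a)

def dot_ones(a):
    return numpy.dot(a, numpy.ones(a.shape[1]))

# Assumption: the enclosing call is perfplot.show(...); it is missing from the snippet.
perfplot.show(
    # setup=lambda n: numpy.random.rand(n, n),
    setup=lambda n: numpy.random.rand(n, 3),
    n_range=[2**k for k in range(15)],
    kernels=[numpy_sum, einsum, dot_ones],
)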
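For a rough comparison without perfplot, something like the following sketch with the standard-library timeit module should work. The array shape (10,000 rows, 3 columns), the repeat count, and the names candidates and timings are assumptions chosen for illustration; they are not taken from the original benchmark, and the numbers will vary by machine and NumPy version.

```python
import timeit
import numpy

# Assumed test shape: many rows, only three columns.
a = numpy.random.rand(10_000, 3)

# The three column-sum variants from the answer above.
candidates = {
    "sum": lambda: a.sum(axis=0),
    "einsum": lambda: numpy.einsum('ij->j', a),
    "dot": lambda: numpy.dot(a.T, numpy.ones(a.shape[0])),
}

reference = a.sum(axis=0)
timings = {}
for name, func in candidates.items():
    # Make sure every variant computes the same column sums.
    assert numpy.allclose(func(), reference)
    timings[name] = timeit.timeit(func, number=200)
    print(f"{name:>6}: {timings[name]:.4f} s for 200 calls")
```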


a.sum(axis=0) should solve the problem. Since a is a 2D np.array, you will get the sum of each column. axis=0 is the dimension that points downwards (along the rows) and axis=1 is the one that points to the right (along the columns).
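To make the axis convention concrete, here is a small sketch on the array from the question. The names col_sums and row_sums are illustrative only.

```python
import numpy

a = numpy.arange(12).reshape(4, 3)

col_sums = a.sum(axis=0)  # collapses the rows: one sum per column
row_sums = a.sum(axis=1)  # collapses the columns: one sum per row

print(col_sums, col_sums.shape)  # [18 22 26] (3,)
print(row_sums, row_sums.shape)  # [ 3 12 21 30] (4,)
```

The axis you pass is the one that disappears from the result's shape, which is why axis=0 on a (4, 3) array yields a result of shape (3,).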

The NumPy sum function takes an optional axis argument that specifies the axis along which the sum is performed:

>>> a = numpy.arange(12).reshape(4,3)
>>> a.sum(0)
array([18, 22, 26])

Or, equivalently:

>>> numpy.sum(a, 0)
array([18, 22, 26])

Use the axis argument:

>>> numpy.sum(a, axis=0)
array([18, 22, 26])