[python] Difference between numpy.array shape (R, 1) and (R,)

In numpy, some of the operations return in shape (R, 1) but some return (R,). This will make matrix multiplication more tedious since explicit reshape is required. For example, given a matrix M, if we want to do numpy.dot(M[:,0], numpy.ones((1, R))) where R is the number of rows (of course, the same issue also occurs column-wise). We will get matrices are not aligned error since M[:,0] is in shape (R,) but numpy.ones((1, R)) is in shape (1, R).

So my questions are:

  1. What's the difference between shape (R, 1) and (R,). I know literally it's list of numbers and list of lists where all list contains only a number. Just wondering why not design numpy so that it favors shape (R, 1) instead of (R,) for easier matrix multiplication.

  2. Are there better ways for the above example? Without explicitly reshape like this: numpy.dot(M[:,0].reshape(R, 1), numpy.ones((1, R)))

This question is related to python numpy matrix multidimensional-array

The answer is


The difference between (R,) and (1,R) is literally the number of indices that you need to use. ones((1,R)) is a 2-D array that happens to have only one row. ones(R) is a vector. Generally if it doesn't make sense for the variable to have more than one row/column, you should be using a vector, not a matrix with a singleton dimension.

For your specific case, there are a couple of options:

1) Just make the second argument a vector. The following works fine:

    np.dot(M[:,0], np.ones(R))

2) If you want matlab like matrix operations, use the class matrix instead of ndarray. All matricies are forced into being 2-D arrays, and operator * does matrix multiplication instead of element-wise (so you don't need dot). In my experience, this is more trouble that it is worth, but it may be nice if you are used to matlab.


There are a lot of good answers here already. But for me it was hard to find some example, where the shape or array can break all the program.

So here is the one:

import numpy as np
a = np.array([1,2,3,4])
b = np.array([10,20,30,40])


from sklearn.linear_model import LinearRegression
regr = LinearRegression()
regr.fit(a,b)

This will fail with error:

ValueError: Expected 2D array, got 1D array instead

but if we add reshape to a:

a = np.array([1,2,3,4]).reshape(-1,1)

this works correctly!


1) The reason not to prefer a shape of (R, 1) over (R,) is that it unnecessarily complicates things. Besides, why would it be preferable to have shape (R, 1) by default for a length-R vector instead of (1, R)? It's better to keep it simple and be explicit when you require additional dimensions.

2) For your example, you are computing an outer product so you can do this without a reshape call by using np.outer:

np.outer(M[:,0], numpy.ones((1, R)))

For its base array class, 2d arrays are no more special than 1d or 3d ones. There are some operations the preserve the dimensions, some that reduce them, other combine or even expand them.

M=np.arange(9).reshape(3,3)
M[:,0].shape # (3,) selects one column, returns a 1d array
M[0,:].shape # same, one row, 1d array
M[:,[0]].shape # (3,1), index with a list (or array), returns 2d
M[:,[0,1]].shape # (3,2)

In [20]: np.dot(M[:,0].reshape(3,1),np.ones((1,3)))

Out[20]: 
array([[ 0.,  0.,  0.],
       [ 3.,  3.,  3.],
       [ 6.,  6.,  6.]])

In [21]: np.dot(M[:,[0]],np.ones((1,3)))
Out[21]: 
array([[ 0.,  0.,  0.],
       [ 3.,  3.,  3.],
       [ 6.,  6.,  6.]])

Other expressions that give the same array

np.dot(M[:,0][:,np.newaxis],np.ones((1,3)))
np.dot(np.atleast_2d(M[:,0]).T,np.ones((1,3)))
np.einsum('i,j',M[:,0],np.ones((3)))
M1=M[:,0]; R=np.ones((3)); np.dot(M1[:,None], R[None,:])

MATLAB started out with just 2D arrays. Newer versions allow more dimensions, but retain the lower bound of 2. But you still have to pay attention to the difference between a row matrix and column one, one with shape (1,3) v (3,1). How often have you written [1,2,3].'? I was going to write row vector and column vector, but with that 2d constraint, there aren't any vectors in MATLAB - at least not in the mathematical sense of vector as being 1d.

Have you looked at np.atleast_2d (also _1d and _3d versions)?


The shape is a tuple. If there is only 1 dimension the shape will be one number and just blank after a comma. For 2+ dimensions, there will be a number after all the commas.

# 1 dimension with 2 elements, shape = (2,). 
# Note there's nothing after the comma.
z=np.array([  # start dimension
    10,       # not a dimension
    20        # not a dimension
])            # end dimension
print(z.shape)

(2,)

# 2 dimensions, each with 1 element, shape = (2,1)
w=np.array([  # start outer dimension 
    [10],     # element is in an inner dimension
    [20]      # element is in an inner dimension
])            # end outer dimension
print(w.shape)

(2,1)


The data structure of shape (n,) is called a rank 1 array. It doesn't behave consistently as a row vector or a column vector which makes some of its operations and effects non intuitive. If you take the transpose of this (n,) data structure, it'll look exactly same and the dot product will give you a number and not a matrix. The vectors of shape (n,1) or (1,n) row or column vectors are much more intuitive and consistent.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index

Examples related to matrix

How to get element-wise matrix multiplication (Hadamard product) in numpy? How can I plot a confusion matrix? Error: stray '\240' in program What does the error "arguments imply differing number of rows: x, y" mean? How to input matrix (2D list) in Python? Difference between numpy.array shape (R, 1) and (R,) Counting the number of non-NaN elements in a numpy ndarray in Python Inverse of a matrix using numpy How to create an empty matrix in R? numpy matrix vector multiplication

Examples related to multidimensional-array

what does numpy ndarray shape do? len() of a numpy array in python What is the purpose of meshgrid in Python / NumPy? Convert a numpy.ndarray to string(or bytes) and convert it back to numpy.ndarray Typescript - multidimensional array initialization How to get every first element in 2 dimensional list How does numpy.newaxis work and when to use it? How to count the occurrence of certain item in an ndarray? Iterate through 2 dimensional array Selecting specific rows and columns from NumPy array