[python] Difference between numpy.array shape (R, 1) and (R,)

1. The meaning of shapes in NumPy

You write, "I know literally it's list of numbers and list of lists where all list contains only a number" but that's a bit of an unhelpful way to think about it.

The best way to think about NumPy arrays is that they consist of two parts, a data buffer which is just a block of raw elements, and a view which describes how to interpret the data buffer.

For example, if we create an array of 12 integers:

>>> a = numpy.arange(12)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Then a consists of a data buffer, arranged something like this:

+-----------------------------------------------------------+
¦  0 ¦  1 ¦  2 ¦  3 ¦  4 ¦  5 ¦  6 ¦  7 ¦  8 ¦  9 ¦ 10 ¦ 11 ¦
+-----------------------------------------------------------+

and a view which describes how to interpret the data:

>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.dtype
dtype('int64')
>>> a.itemsize
8
>>> a.strides
(8,)
>>> a.shape
(12,)

Here the shape (12,) means the array is indexed by a single index which runs from 0 to 11. Conceptually, if we label this single index i, the array a looks like this:

i= 0    1    2    3    4    5    6    7    8    9   10   11
+-----------------------------------------------------------+
¦  0 ¦  1 ¦  2 ¦  3 ¦  4 ¦  5 ¦  6 ¦  7 ¦  8 ¦  9 ¦ 10 ¦ 11 ¦
+-----------------------------------------------------------+

If we reshape an array, this doesn't change the data buffer. Instead, it creates a new view that describes a different way to interpret the data. So after:

>>> b = a.reshape((3, 4))

the array b has the same data buffer as a, but now it is indexed by two indices which run from 0 to 2 and 0 to 3 respectively. If we label the two indices i and j, the array b looks like this:

i= 0    0    0    0    1    1    1    1    2    2    2    2
j= 0    1    2    3    0    1    2    3    0    1    2    3
+-----------------------------------------------------------+
¦  0 ¦  1 ¦  2 ¦  3 ¦  4 ¦  5 ¦  6 ¦  7 ¦  8 ¦  9 ¦ 10 ¦ 11 ¦
+-----------------------------------------------------------+

which means that:

>>> b[2,1]
9

You can see that the second index changes quickly and the first index changes slowly. If you prefer this to be the other way round, you can specify the order parameter:

>>> c = a.reshape((3, 4), order='F')

which results in an array indexed like this:

i= 0    1    2    0    1    2    0    1    2    0    1    2
j= 0    0    0    1    1    1    2    2    2    3    3    3
+-----------------------------------------------------------+
¦  0 ¦  1 ¦  2 ¦  3 ¦  4 ¦  5 ¦  6 ¦  7 ¦  8 ¦  9 ¦ 10 ¦ 11 ¦
+-----------------------------------------------------------+

which means that:

>>> c[2,1]
5

It should now be clear what it means for an array to have a shape with one or more dimensions of size 1. After:

>>> d = a.reshape((12, 1))

the array d is indexed by two indices, the first of which runs from 0 to 11, and the second index is always 0:

i= 0    1    2    3    4    5    6    7    8    9   10   11
j= 0    0    0    0    0    0    0    0    0    0    0    0
+-----------------------------------------------------------+
¦  0 ¦  1 ¦  2 ¦  3 ¦  4 ¦  5 ¦  6 ¦  7 ¦  8 ¦  9 ¦ 10 ¦ 11 ¦
+-----------------------------------------------------------+

and so:

>>> d[10,0]
10

A dimension of length 1 is "free" (in some sense), so there's nothing stopping you from going to town:

>>> e = a.reshape((1, 2, 1, 6, 1))

giving an array indexed like this:

i= 0    0    0    0    0    0    0    0    0    0    0    0
j= 0    0    0    0    0    0    1    1    1    1    1    1
k= 0    0    0    0    0    0    0    0    0    0    0    0
l= 0    1    2    3    4    5    0    1    2    3    4    5
m= 0    0    0    0    0    0    0    0    0    0    0    0
+-----------------------------------------------------------+
¦  0 ¦  1 ¦  2 ¦  3 ¦  4 ¦  5 ¦  6 ¦  7 ¦  8 ¦  9 ¦ 10 ¦ 11 ¦
+-----------------------------------------------------------+

and so:

>>> e[0,1,0,0,0]
6

See the NumPy internals documentation for more details about how arrays are implemented.

2. What to do?

Since numpy.reshape just creates a new view, you shouldn't be scared about using it whenever necessary. It's the right tool to use when you want to index an array in a different way.

However, in a long computation it's usually possible to arrange to construct arrays with the "right" shape in the first place, and so minimize the number of reshapes and transposes. But without seeing the actual context that led to the need for a reshape, it's hard to say what should be changed.

The example in your question is:

numpy.dot(M[:,0], numpy.ones((1, R)))

but this is not realistic. First, this expression:

M[:,0].sum()

computes the result more simply. Second, is there really something special about column 0? Perhaps what you actually need is:

M.sum(axis=0)

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index

Examples related to matrix

How to get element-wise matrix multiplication (Hadamard product) in numpy? How can I plot a confusion matrix? Error: stray '\240' in program What does the error "arguments imply differing number of rows: x, y" mean? How to input matrix (2D list) in Python? Difference between numpy.array shape (R, 1) and (R,) Counting the number of non-NaN elements in a numpy ndarray in Python Inverse of a matrix using numpy How to create an empty matrix in R? numpy matrix vector multiplication

Examples related to multidimensional-array

what does numpy ndarray shape do? len() of a numpy array in python What is the purpose of meshgrid in Python / NumPy? Convert a numpy.ndarray to string(or bytes) and convert it back to numpy.ndarray Typescript - multidimensional array initialization How to get every first element in 2 dimensional list How does numpy.newaxis work and when to use it? How to count the occurrence of certain item in an ndarray? Iterate through 2 dimensional array Selecting specific rows and columns from NumPy array