[python] How does the "view" method work in PyTorch?

I am confused about the method view() in the following code snippet.

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool  = nn.MaxPool2d(2,2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1   = nn.Linear(16*5*5, 120)
        self.fc2   = nn.Linear(120, 84)
        self.fc3   = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

My confusion is regarding the following line.

x = x.view(-1, 16*5*5)

What does tensor.view() function do? I have seen its usage in many places, but I can't understand how it interprets its parameters.

What happens if I give negative values as parameters to the view() function? For example, what happens if I call, tensor_variable.view(1, 1, -1)?

Can anyone explain the main principle of view() function with some examples?

This question is related to python memory pytorch torch tensor

The answer is


The view function is meant to reshape the tensor.

Say you have a tensor

import torch
a = torch.range(1, 16)

a is a tensor that has 16 elements from 1 to 16(included). If you want to reshape this tensor to make it a 4 x 4 tensor then you can use

a = a.view(4, 4)

Now a will be a 4 x 4 tensor. Note that after the reshape the total number of elements need to remain the same. Reshaping the tensor a to a 3 x 5 tensor would not be appropriate.

What is the meaning of parameter -1?

If there is any situation that you don't know how many rows you want but are sure of the number of columns, then you can specify this with a -1. (Note that you can extend this to tensors with more dimensions. Only one of the axis value can be -1). This is a way of telling the library: "give me a tensor that has these many columns and you compute the appropriate number of rows that is necessary to make this happen".

This can be seen in the neural network code that you have given above. After the line x = self.pool(F.relu(self.conv2(x))) in the forward function, you will have a 16 depth feature map. You have to flatten this to give it to the fully connected layer. So you tell pytorch to reshape the tensor you obtained to have specific number of columns and tell it to decide the number of rows by itself.

Drawing a similarity between numpy and pytorch, view is similar to numpy's reshape function.


Let's do some examples, from simpler to more difficult.

  1. The view method returns a tensor with the same data as the self tensor (which means that the returned tensor has the same number of elements), but with a different shape. For example:

    a = torch.arange(1, 17)  # a's shape is (16,)
    
    a.view(4, 4) # output below
      1   2   3   4
      5   6   7   8
      9  10  11  12
     13  14  15  16
    [torch.FloatTensor of size 4x4]
    
    a.view(2, 2, 4) # output below
    (0 ,.,.) = 
    1   2   3   4
    5   6   7   8
    
    (1 ,.,.) = 
     9  10  11  12
    13  14  15  16
    [torch.FloatTensor of size 2x2x4]
    
  2. Assuming that -1 is not one of the parameters, when you multiply them together, the result must be equal to the number of elements in the tensor. If you do: a.view(3, 3), it will raise a RuntimeError because shape (3 x 3) is invalid for input with 16 elements. In other words: 3 x 3 does not equal 16 but 9.

  3. You can use -1 as one of the parameters that you pass to the function, but only once. All that happens is that the method will do the math for you on how to fill that dimension. For example a.view(2, -1, 4) is equivalent to a.view(2, 2, 4). [16 / (2 x 4) = 2]

  4. Notice that the returned tensor shares the same data. If you make a change in the "view" you are changing the original tensor's data:

    b = a.view(4, 4)
    b[0, 2] = 2
    a[2] == 3.0
    False
    
  5. Now, for a more complex use case. The documentation says that each new view dimension must either be a subspace of an original dimension, or only span d, d + 1, ..., d + k that satisfy the following contiguity-like condition that for all i = 0, ..., k - 1, stride[i] = stride[i + 1] x size[i + 1]. Otherwise, contiguous() needs to be called before the tensor can be viewed. For example:

    a = torch.rand(5, 4, 3, 2) # size (5, 4, 3, 2)
    a_t = a.permute(0, 2, 3, 1) # size (5, 3, 2, 4)
    
    # The commented line below will raise a RuntimeError, because one dimension
    # spans across two contiguous subspaces
    # a_t.view(-1, 4)
    
    # instead do:
    a_t.contiguous().view(-1, 4)
    
    # To see why the first one does not work and the second does,
    # compare a.stride() and a_t.stride()
    a.stride() # (24, 6, 2, 1)
    a_t.stride() # (24, 2, 1, 6)
    

    Notice that for a_t, stride[0] != stride[1] x size[1] since 24 != 2 x 3


torch.Tensor.view()

Simply put, torch.Tensor.view() which is inspired by numpy.ndarray.reshape() or numpy.reshape(), creates a new view of the tensor, as long as the new shape is compatible with the shape of the original tensor.

Let's understand this in detail using a concrete example.

In [43]: t = torch.arange(18) 

In [44]: t 
Out[44]: 
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17])

With this tensor t of shape (18,), new views can only be created for the following shapes:

(1, 18) or equivalently (1, -1) or (-1, 18)
(2, 9) or equivalently (2, -1) or (-1, 9)
(3, 6) or equivalently (3, -1) or (-1, 6)
(6, 3) or equivalently (6, -1) or (-1, 3)
(9, 2) or equivalently (9, -1) or (-1, 2)
(18, 1) or equivalently (18, -1) or (-1, 1)

As we can already observe from the above shape tuples, the multiplication of the elements of the shape tuple (e.g. 2*9, 3*6 etc.) must always be equal to the total number of elements in the original tensor (18 in our example).

Another thing to observe is that we used a -1 in one of the places in each of the shape tuples. By using a -1, we are being lazy in doing the computation ourselves and rather delegate the task to PyTorch to do calculation of that value for the shape when it creates the new view. One important thing to note is that we can only use a single -1 in the shape tuple. The remaining values should be explicitly supplied by us. Else PyTorch will complain by throwing a RuntimeError:

RuntimeError: only one dimension can be inferred

So, with all of the above mentioned shapes, PyTorch will always return a new view of the original tensor t. This basically means that it just changes the stride information of the tensor for each of the new views that are requested.

Below are some examples illustrating how the strides of the tensors are changed with each new view.

# stride of our original tensor `t`
In [53]: t.stride() 
Out[53]: (1,)

Now, we will see the strides for the new views:

# shape (1, 18)
In [54]: t1 = t.view(1, -1)
# stride tensor `t1` with shape (1, 18)
In [55]: t1.stride() 
Out[55]: (18, 1)

# shape (2, 9)
In [56]: t2 = t.view(2, -1)
# stride of tensor `t2` with shape (2, 9)
In [57]: t2.stride()       
Out[57]: (9, 1)

# shape (3, 6)
In [59]: t3 = t.view(3, -1) 
# stride of tensor `t3` with shape (3, 6)
In [60]: t3.stride() 
Out[60]: (6, 1)

# shape (6, 3)
In [62]: t4 = t.view(6,-1)
# stride of tensor `t4` with shape (6, 3)
In [63]: t4.stride() 
Out[63]: (3, 1)

# shape (9, 2)
In [65]: t5 = t.view(9, -1) 
# stride of tensor `t5` with shape (9, 2)
In [66]: t5.stride()
Out[66]: (2, 1)

# shape (18, 1)
In [68]: t6 = t.view(18, -1)
# stride of tensor `t6` with shape (18, 1)
In [69]: t6.stride()
Out[69]: (1, 1)

So that's the magic of the view() function. It just changes the strides of the (original) tensor for each of the new views, as long as the shape of the new view is compatible with the original shape.

Another interesting thing one might observe from the strides tuples is that the value of the element in the 0th position is equal to the value of the element in the 1st position of the shape tuple.

In [74]: t3.shape 
Out[74]: torch.Size([3, 6])
                        |
In [75]: t3.stride()    |
Out[75]: (6, 1)         |
          |_____________|

This is because:

In [76]: t3 
Out[76]: 
tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11],
        [12, 13, 14, 15, 16, 17]])

the stride (6, 1) says that to go from one element to the next element along the 0th dimension, we have to jump or take 6 steps. (i.e. to go from 0 to 6, one has to take 6 steps.) But to go from one element to the next element in the 1st dimension, we just need only one step (for e.g. to go from 2 to 3).

Thus, the strides information is at the heart of how the elements are accessed from memory for performing the computation.


torch.reshape()

This function would return a view and is exactly the same as using torch.Tensor.view() as long as the new shape is compatible with the shape of the original tensor. Otherwise, it will return a copy.

However, the notes of torch.reshape() warns that:

contiguous inputs and inputs with compatible strides can be reshaped without copying, but one should not depend on the copying vs. viewing behavior.


I figured it out that x.view(-1, 16 * 5 * 5) is equivalent to x.flatten(1), where the parameter 1 indicates the flatten process starts from the 1st dimension(not flattening the 'sample' dimension) As you can see, the latter usage is semantically more clear and easier to use, so I prefer flatten().


Let's try to understand view by the following examples:

    a=torch.range(1,16)

print(a)

    tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13., 14.,
            15., 16.])

print(a.view(-1,2))

    tensor([[ 1.,  2.],
            [ 3.,  4.],
            [ 5.,  6.],
            [ 7.,  8.],
            [ 9., 10.],
            [11., 12.],
            [13., 14.],
            [15., 16.]])

print(a.view(2,-1,4))   #3d tensor

    tensor([[[ 1.,  2.,  3.,  4.],
             [ 5.,  6.,  7.,  8.]],

            [[ 9., 10., 11., 12.],
             [13., 14., 15., 16.]]])
print(a.view(2,-1,2))

    tensor([[[ 1.,  2.],
             [ 3.,  4.],
             [ 5.,  6.],
             [ 7.,  8.]],

            [[ 9., 10.],
             [11., 12.],
             [13., 14.],
             [15., 16.]]])

print(a.view(4,-1,2))

    tensor([[[ 1.,  2.],
             [ 3.,  4.]],

            [[ 5.,  6.],
             [ 7.,  8.]],

            [[ 9., 10.],
             [11., 12.]],

            [[13., 14.],
             [15., 16.]]])

-1 as an argument value is an easy way to compute the value of say x provided we know values of y, z or the other way round in case of 3d and for 2d again an easy way to compute the value of say x provided we know values of y or vice versa..


weights.reshape(a, b) will return a new tensor with the same data as weights with size (a, b) as in it copies the data to another part of memory.

weights.resize_(a, b) returns the same tensor with a different shape. However, if the new shape results in fewer elements than the original tensor, some elements will be removed from the tensor (but not from memory). If the new shape results in more elements than the original tensor, new elements will be uninitialized in memory.

weights.view(a, b) will return a new tensor with the same data as weights with size (a, b)


What is the meaning of parameter -1?

You can read -1 as dynamic number of parameters or "anything". Because of that there can be only one parameter -1 in view().

If you ask x.view(-1,1) this will output tensor shape [anything, 1] depending on the number of elements in x. For example:

import torch
x = torch.tensor([1, 2, 3, 4])
print(x,x.shape)
print("...")
print(x.view(-1,1), x.view(-1,1).shape)
print(x.view(1,-1), x.view(1,-1).shape)

Will output:

tensor([1, 2, 3, 4]) torch.Size([4])
...
tensor([[1],
        [2],
        [3],
        [4]]) torch.Size([4, 1])
tensor([[1, 2, 3, 4]]) torch.Size([1, 4])

I really liked @Jadiel de Armas examples.

I would like to add a small insight to how elements are ordered for .view(...)

  • For a Tensor with shape (a,b,c), the order of it's elements are determined by a numbering system: where the first digit has a numbers, second digit has b numbers and third digit has c numbers.
  • The mapping of the elements in the new Tensor returned by .view(...) preserves this order of the original Tensor.

A tensor in pytorch is a view of an underlying contiguous block of numbers in memory (known as a storage). pytorch can achieve fast operations by modifying the shape parameters of a view of a storage without changing the underlying memory allocations themselves. Hence multiple different tensors may reference the same underlying storage object.

enter image description here

view is a way of specifying a change of shape on an existing tensor.


Questions with python tag:

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation Upgrade to python 3.8 using conda Unable to allocate array with shape and data type How to fix error "ERROR: Command errored out with exit status 1: python." when trying to install django-heroku using pip How to prevent Google Colab from disconnecting? "UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure." when plotting figure with pyplot on Pycharm How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? "E: Unable to locate package python-pip" on Ubuntu 18.04 Tensorflow 2.0 - AttributeError: module 'tensorflow' has no attribute 'Session' Jupyter Notebook not saving: '_xsrf' argument missing from post How to Install pip for python 3.7 on Ubuntu 18? Python: 'ModuleNotFoundError' when trying to import module from imported package OpenCV TypeError: Expected cv::UMat for argument 'src' - What is this? Requests (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.") Error in PyCharm requesting website How to setup virtual environment for Python in VS Code? Pylint "unresolved import" error in Visual Studio Code Pandas Merging 101 Numpy, multiply array with scalar What is the meaning of "Failed building wheel for X" in pip install? Selenium: WebDriverException:Chrome failed to start: crashed as google-chrome is no longer running so ChromeDriver is assuming that Chrome has crashed Could not install packages due to an EnvironmentError: [Errno 13] OpenCV !_src.empty() in function 'cvtColor' error ConvergenceWarning: Liblinear failed to converge, increase the number of iterations How to downgrade python from 3.7 to 3.6 I can't install pyaudio on Windows? How to solve "error: Microsoft Visual C++ 14.0 is required."? Iterating over arrays in Python 3 How do I install opencv using pip? How do I install Python packages in Google's Colab? How do I use TensorFlow GPU? How to upgrade Python version to 3.7? How to resolve TypeError: can only concatenate str (not "int") to str How can I install a previous version of Python 3 in macOS using homebrew? Flask at first run: Do not use the development server in a production environment TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array What is the difference between Jupyter Notebook and JupyterLab? Pytesseract : "TesseractNotFound Error: tesseract is not installed or it's not in your path", how do I fix this? Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" How do I resolve a TesseractNotFoundError? Trying to merge 2 dataframes but get ValueError Authentication plugin 'caching_sha2_password' is not supported Python Pandas User Warning: Sorting because non-concatenation axis is not aligned

Questions with memory tag:

How does the "view" method work in PyTorch? How do I release memory used by a pandas dataframe? How to solve the memory error in Python Docker error : no space left on device Default Xmxsize in Java 8 (max heap size) How to set Apache Spark Executor memory What is the best way to add a value to an array in state How do I read a large csv file with pandas? How to clear variables in ipython? Error occurred during initialization of VM Could not reserve enough space for object heap Could not create the Java virtual machine What is the default stack size, can it grow, how does it work with garbage collection? How to monitor the memory usage of Node.js? How do I get total physical memory size using PowerShell without WMI? Fastest way to convert Image to Byte array Python readlines() usage and efficient practice for reading What is a "cache-friendly" code? Where in memory are my variables stored in C? Delete all objects in a list How to clear memory to prevent "out of memory error" in excel vba? Variable's memory size in Python Command-line Tool to find Java Heap Size and Memory Used (Linux)? Increase Tomcat memory settings Best way to increase heap size in catalina.bat file If a DOM Element is removed, are its listeners also removed from memory? Fatal error: Out of memory, but I do have plenty of memory (PHP) How to read/write arbitrary bits in C/C++ memory error in python Calculate size of Object in Java Understanding the Linux oom-killer's logs How can I get the nth character of a string? Difference between static memory allocation and dynamic memory allocation How to create an instance of System.IO.Stream stream What's the difference between a word and byte? Heap space out of memory Could not reserve enough space for object heap to start JVM How can I create a memory leak in Java? What is the difference between buffer and cache memory in Linux? How to solve munmap_chunk(): invalid pointer error in C++ Difference between "on-heap" and "off-heap" What is the correct way to free memory in C# ios app maximum memory budget Stack Memory vs Heap Memory How to list processes attached to a shared memory segment in linux? Memory errors and list limits? ini_set("memory_limit") in PHP 5.3.3 is not working at all How to see top processes sorted by actual memory usage? How to assign pointer address manually in C programming language? Upper memory limit? Allowed memory size of X bytes exhausted What is causing "Unable to allocate memory for pool" in PHP?

Questions with pytorch tag:

Pytorch tensor to numpy array How to initialize weights in PyTorch? How to check if pytorch is using the GPU? PyTorch: How to get the shape of a Tensor as a list of int Pytorch reshape tensor dimension Best way to save a trained model in PyTorch? Model summary in pytorch How does the "view" method work in PyTorch?

Questions with torch tag:

How does the "view" method work in PyTorch?

Questions with tensor tag:

PyTorch: How to get the shape of a Tensor as a list of int Keras input explanation: input_shape, units, batch_size, dim, etc Pytorch reshape tensor dimension Best way to save a trained model in PyTorch? How does the "view" method work in PyTorch? How to get the dimensions of a tensor (in TensorFlow) at graph construction time? How to print the value of a Tensor object in TensorFlow?