How to write a multidimensional array to a text file?

Question

In another question, other users offered some help if I could supply the array I was having trouble with. However, I even fail at a basic I/O task, such as writing an array to a file.

Can anyone explain what kind of loop I would need to write a 4x11x14 numpy array to file?

This array consists of four 11 x 14 arrays, so I should format it with a nice newline, to make the reading of the file easier on others.

Edit: So I've tried the numpy.savetxt function. Strangely, it gives the following error:

TypeError: float argument required, not numpy.ndarray

I assume that this is because the function doesn't work with multidimensional arrays. Any solutions? I would like them within one file.

User · Accepted Answer

If you want to write it to disk so that it will be easy to read back in as a numpy array, look into numpy.save. Pickling it will work fine, as well, but it's less efficient for large arrays (which yours isn't, so either is perfectly fine).
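As a rough illustration of the numpy.save route, here is a minimal sketch of a round trip through a .npy file. The test data, the temporary directory, and the file name data.npy are assumptions made for the example; only np.save and np.load come from NumPy itself.

```python
import os
import tempfile

import numpy as np

# Test data with the same shape as the array in the question
data = np.arange(4 * 11 * 14, dtype=float).reshape((4, 11, 14))

# Arbitrary location chosen for this sketch
path = os.path.join(tempfile.mkdtemp(), "data.npy")

# Binary .npy format: stores dtype and shape alongside the values
np.save(path, data)

# No reshape needed on the way back in
restored = np.load(path)
```

The file is not human readable, but the shape and dtype survive the round trip without any extra bookkeeping.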

If you want it to be human readable, look into numpy.savetxt.

Edit: So, it seems like savetxt isn't quite as great an option for arrays with >2 dimensions... But just to draw everything out to its full conclusion:

I just realized that numpy.savetxt chokes on ndarrays with more than 2 dimensions... This is probably by design, as there's no inherently defined way to indicate additional dimensions in a text file.

E.g. This (a 2D array) works fine

import numpy as np
x = np.arange(20).reshape((4,5))
np.savetxt('test.txt', x)

While the same thing would fail for a 3D array (with a rather uninformative error: TypeError: float argument required, not numpy.ndarray; recent NumPy versions instead raise ValueError: Expected 1D or 2D array, got 3D array instead):

import numpy as np
x = np.arange(200).reshape((4,5,10))
np.savetxt('test.txt', x)

One workaround is just to break the 3D (or greater) array into 2D slices. E.g.

x = np.arange(200).reshape((4,5,10))
with open('test.txt', 'w') as outfile:
    for slice_2d in x:
        np.savetxt(outfile, slice_2d)

However, our goal is to be clearly human readable, while still being easily read back in with numpy.loadtxt. Therefore, we can be a bit more verbose, and differentiate the slices using commented out lines. By default, numpy.loadtxt will ignore any lines that start with # (or whichever character is specified by the comments kwarg). (This looks more verbose than it actually is...)

import numpy as np

# Generate some test data
data = np.arange(200).reshape((4,5,10))

# Write the array to disk
with open('test.txt', 'w') as outfile:
    # I'm writing a header here just for the sake of readability
    # Any line starting with "#" will be ignored by numpy.loadtxt
    outfile.write('# Array shape: {0}\n'.format(data.shape))
    
    # Iterating through an n-dimensional array produces slices along
    # the first axis. This is equivalent to data[i,:,:] in this case
    for data_slice in data:

        # The formatting string indicates that I'm writing out
        # the values in left-justified columns 7 characters in width
        # with 2 decimal places.  
        np.savetxt(outfile, data_slice, fmt='%-7.2f')

        # Writing out a break to indicate different slices...
        outfile.write('# New slice\n')

This yields:

# Array shape: (4, 5, 10)
0.00    1.00    2.00    3.00    4.00    5.00    6.00    7.00    8.00    9.00   
10.00   11.00   12.00   13.00   14.00   15.00   16.00   17.00   18.00   19.00  
20.00   21.00   22.00   23.00   24.00   25.00   26.00   27.00   28.00   29.00  
30.00   31.00   32.00   33.00   34.00   35.00   36.00   37.00   38.00   39.00  
40.00   41.00   42.00   43.00   44.00   45.00   46.00   47.00   48.00   49.00  
# New slice
50.00   51.00   52.00   53.00   54.00   55.00   56.00   57.00   58.00   59.00  
60.00   61.00   62.00   63.00   64.00   65.00   66.00   67.00   68.00   69.00  
70.00   71.00   72.00   73.00   74.00   75.00   76.00   77.00   78.00   79.00  
80.00   81.00   82.00   83.00   84.00   85.00   86.00   87.00   88.00   89.00  
90.00   91.00   92.00   93.00   94.00   95.00   96.00   97.00   98.00   99.00  
# New slice
100.00  101.00  102.00  103.00  104.00  105.00  106.00  107.00  108.00  109.00 
110.00  111.00  112.00  113.00  114.00  115.00  116.00  117.00  118.00  119.00 
120.00  121.00  122.00  123.00  124.00  125.00  126.00  127.00  128.00  129.00 
130.00  131.00  132.00  133.00  134.00  135.00  136.00  137.00  138.00  139.00 
140.00  141.00  142.00  143.00  144.00  145.00  146.00  147.00  148.00  149.00 
# New slice
150.00  151.00  152.00  153.00  154.00  155.00  156.00  157.00  158.00  159.00 
160.00  161.00  162.00  163.00  164.00  165.00  166.00  167.00  168.00  169.00 
170.00  171.00  172.00  173.00  174.00  175.00  176.00  177.00  178.00  179.00 
180.00  181.00  182.00  183.00  184.00  185.00  186.00  187.00  188.00  189.00 
190.00  191.00  192.00  193.00  194.00  195.00  196.00  197.00  198.00  199.00 
# New slice

Reading it back in is very easy, as long as we know the shape of the original array. We can just do numpy.loadtxt('test.txt').reshape((4,5,10)). As an example (You can do this in one line, I'm just being verbose to clarify things):

# Read the array from disk
new_data = np.loadtxt('test.txt')

# Note that this returned a 2D array!
print(new_data.shape)

# However, going back to 3D is easy if we know the 
# original shape of the array
new_data = new_data.reshape((4,5,10))
    
# Just to check that they're the same...
assert np.all(new_data == data)
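Since the file already carries an "# Array shape: ..." header line, the shape does not strictly have to be known in advance. The following is only a sketch of that idea: the helper load_with_header_shape is a hypothetical name, and it assumes the first line of the file has exactly the format written by the code above.

```python
import ast
import os
import tempfile

import numpy as np

data = np.arange(200).reshape((4, 5, 10))
path = os.path.join(tempfile.mkdtemp(), "test.txt")  # arbitrary location

# Write the file in the same layout as the example above
with open(path, "w") as outfile:
    outfile.write("# Array shape: {0}\n".format(data.shape))
    for data_slice in data:
        np.savetxt(outfile, data_slice, fmt="%-7.2f")
        outfile.write("# New slice\n")


def load_with_header_shape(filename):
    """Hypothetical helper: recover the shape from the first comment line."""
    with open(filename) as infile:
        header = infile.readline()
    # Header looks like "# Array shape: (4, 5, 10)"; parse the tuple safely
    shape = ast.literal_eval(header.split(":", 1)[1].strip())
    # loadtxt skips the "#" lines and returns a 2D array
    return np.loadtxt(filename).reshape(shape)


new_data = load_with_header_shape(path)
```

ast.literal_eval is used rather than eval so that only a plain tuple literal is accepted from the file.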

User · Answer

You can simply traverse the array in three nested loops and write their values to your file. For reading, you simply use the same exact loop construction. You will get the values in exactly the right order to fill your arrays correctly again.
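One possible reading of the nested-loop approach is sketched below. The one-value-per-line layout, the file name, and the test data are assumptions for the example; the answer itself does not specify a format. Note that the reader has to know the shape beforehand.

```python
import os
import tempfile

import numpy as np

data = np.arange(4 * 11 * 14, dtype=float).reshape((4, 11, 14))
path = os.path.join(tempfile.mkdtemp(), "loops.txt")  # arbitrary location

# Write one value per line, in i, j, k order
with open(path, "w") as f:
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            for k in range(data.shape[2]):
                # repr() of a Python float round-trips exactly
                f.write("{0!r}\n".format(float(data[i, j, k])))

# Read back with the same loop construction (shape must be known)
restored = np.empty(data.shape)
with open(path) as f:
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            for k in range(data.shape[2]):
                restored[i, j, k] = float(f.readline())
```

This is slow for large arrays compared with the NumPy routines, but it makes the ordering of the values explicit.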

User · Answer

Use the JSON module for multidimensional arrays, e.g.

import json
with open(filename, 'w') as f:
    json.dump(myndarray.tolist(), f)
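For completeness, a sketch of the full JSON round trip, including reading the data back. The test array and file location are assumptions; the nested lists in the JSON file preserve the dimensions, so no reshape is needed.

```python
import json
import os
import tempfile

import numpy as np

myndarray = np.arange(200).reshape((4, 5, 10))
filename = os.path.join(tempfile.mkdtemp(), "data.json")  # arbitrary location

# tolist() converts the array into nested Python lists that json can serialize
with open(filename, "w") as f:
    json.dump(myndarray.tolist(), f)

# The nesting encodes the shape, so np.array() rebuilds a 3D array directly
with open(filename) as f:
    restored = np.array(json.load(f))
```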

User · Answer

I have a way to do it using a simple filename.write() operation. It works fine for me, but I'm dealing with arrays having ~1500 data elements.

I basically just have for loops to iterate through the file and write it to the output destination line-by-line in a csv style output.

import numpy as np

trial = np.genfromtxt("/extension/file.txt", dtype=str, delimiter=",")
num_of_columns = trial.shape[1]

with open("/extension/file.txt", "w") as f:
    for x in range(len(trial)):
        for y in range(num_of_columns):
            if y < num_of_columns - 1:
                # Every column except the last is followed by a comma
                f.write(trial[x][y] + ",")
            else:
                f.write(trial[x][y])
        f.write("\n")

The if and else branches are used to add commas between the data elements. For whatever reason, these get stripped out when reading the file in as an nd array. My goal was to output the file as a csv, so this method helps to handle that.

Hope this helps!

User · Answer

If you don't need a human-readable output, another option you could try is to save the array as a MATLAB .mat file, which is a structured array. I despise MATLAB, but the fact that I can both read and write a .mat in very few lines is convenient.

Unlike Joe Kington's answer, the benefit of this is that you don't need to know the original shape of the data in the .mat file, i.e. no need to reshape upon reading in. And, unlike using pickle, a .mat file can be read by MATLAB, and probably some other programs/languages as well.

Here is an example:

import numpy as np
import scipy.io

# Some test data
x = np.arange(200).reshape((4,5,10))

# Specify the filename of the .mat file
matfile = 'test_mat.mat'

# Write the array to the mat file. For this to work, the array must be the value
# corresponding to a key name of your choice in a dictionary
scipy.io.savemat(matfile, mdict={'out': x}, oned_as='row')

# For the above line, I specified the kwarg oned_as since python (2.7 with
# numpy 1.6.1) throws a FutureWarning. Here, this isn't really necessary
# since oned_as is a kwarg for dealing with 1-D arrays.

# Now load in the data from the .mat that was just saved
matdata = scipy.io.loadmat(matfile)

# And just to check if the data is the same:
assert np.all(x == matdata['out'])

If you forget the key that the array is named in the .mat file, you can always do:

print(matdata.keys())

And of course you can store many arrays using many more keys.

So yes, it won't be readable with your eyes, but it only takes 2 lines to write and read the data, which I think is a fair trade-off.

Take a look at the docs for scipy.io.savemat and scipy.io.loadmat and also this tutorial page: scipy.io File IO Tutorial

User · Answer

There exist special libraries to do just that. (Plus wrappers for python)

netCDF4: http://www.unidata.ucar.edu/software/netcdf/
netCDF4 Python interface: http://www.unidata.ucar.edu/software/netcdf/software.html#Python
HDF5: http://www.hdfgroup.org/HDF5/

Hope this helps.

User · Answer

ndarray.tofile() should also work, e.g. if your array is called a:

a.tofile('yourfile.txt', sep=" ", format="%s")

Not sure how to get newline formatting though.

Edit (credit Kevin J. Black's comment here): since version 1.5.0, np.savetxt() takes an optional parameter newline='\n' to allow multi-line output. Note that this parameter belongs to savetxt, which is what the linked page documents; ndarray.tofile() itself only accepts sep and format. https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.savetxt.html

User · Answer

Pickle is best for these cases. Suppose you have an ndarray named x_train. You can dump it into a file and load it back using the following commands:

import pickle

# Dump into file
with open("myfile.pkl", "wb") as f:
    pickle.dump(x_train, f)

# Extract from file
with open("myfile.pkl", "rb") as f:
    x_temp = pickle.load(f)

User · Answer

I am not certain if this meets your requirements, given I think you are interested in making the file readable by people, but if that's not a primary concern, just pickle it.

To save it:

import pickle

my_data = {'a': [1, 2.0, 3, 4+6j],
           'b': ('string', u'Unicode string'),
           'c': None}
output = open('data.pkl', 'wb')
pickle.dump(my_data, output)
output.close()

To read it back:

import pprint, pickle

pkl_file = open('data.pkl', 'rb')

data1 = pickle.load(pkl_file)
pprint.pprint(data1)

pkl_file.close()

User · Answer

Write to a file with Python's print():

import numpy as np
import sys

stdout_sys = sys.stdout
np.set_printoptions(precision=8)  # Sets number of digits of precision.
np.set_printoptions(suppress=True)  # Suppress scientific notations.
np.set_printoptions(threshold=sys.maxsize)  # Prints the whole arrays.
with open('myfile.txt', 'w') as f:
    sys.stdout = f
    print(nparr)
    sys.stdout = stdout_sys

Use set_printoptions() to customize how the objects are displayed.
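A variation on the same idea, offered only as a sketch: instead of swapping sys.stdout, print() can write to the file directly through its file argument, and the np.printoptions context manager (available in NumPy 1.15 and later) keeps the display settings from leaking into the rest of the program. The array nparr and the file location are assumptions for the example.

```python
import os
import sys
import tempfile

import numpy as np

nparr = np.arange(200, dtype=float).reshape((4, 5, 10))
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")  # arbitrary location

# Print options apply only inside this block and are restored afterwards
with np.printoptions(precision=8, suppress=True, threshold=sys.maxsize):
    with open(path, "w") as f:
        # file=f sends the output to the file without touching sys.stdout
        print(nparr, file=f)

with open(path) as f:
    text = f.read()
```

The result is the bracketed repr-style text that print() normally shows, which is easy to read but not directly loadable with numpy.loadtxt.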
