[python] Dump a NumPy array into a csv file

Is there a way to dump a NumPy array into a CSV file? I have a 2D NumPy array and need to dump it in human-readable format.

This question is related to python arrays csv numpy

The answer is


numpy.savetxt saves an array to a text file.

import numpy
a = numpy.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
numpy.savetxt("foo.csv", a, delimiter=",")

tofile is a convenient function to do this:

import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
a.tofile('foo.csv',sep=',',format='%10.5f')

The man page has some useful notes:

This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.

Note. This function does not produce multi-line csv files, it saves everything to one line.


In Python we use csv.writer() module to write data into csv files. This module is similar to the csv.reader() module.

import csv

person = [['SN', 'Person', 'DOB'],
['1', 'John', '18/1/1997'],
['2', 'Marie','19/2/1998'],
['3', 'Simon','20/3/1999'],
['4', 'Erik', '21/4/2000'],
['5', 'Ana', '22/5/2001']]

csv.register_dialect('myDialect',
delimiter = '|',
quoting=csv.QUOTE_NONE,
skipinitialspace=True)

with open('dob.csv', 'w') as f:
    writer = csv.writer(f, dialect='myDialect')
    for row in person:
       writer.writerow(row)

f.close()

A delimiter is a string used to separate fields. The default value is comma(,).


if you want to write in column:

    for x in np.nditer(a.T, order='C'): 
            file.write(str(x))
            file.write("\n")

Here 'a' is the name of numpy array and 'file' is the variable to write in a file.

If you want to write in row:

    writer= csv.writer(file, delimiter=',')
    for x in np.nditer(a.T, order='C'): 
            row.append(str(x))
    writer.writerow(row)

I believe you can also accomplish this quite simply as follows:

  1. Convert Numpy array into a Pandas dataframe
  2. Save as CSV

e.g. #1:

    # Libraries to import
    import pandas as pd
    import nump as np

    #N x N numpy array (dimensions dont matter)
    corr_mat    #your numpy array
    my_df = pd.DataFrame(corr_mat)  #converting it to a pandas dataframe

e.g. #2:

    #save as csv 
    my_df.to_csv('foo.csv', index=False)   # "foo" is the name you want to give
                                           # to csv file. Make sure to add ".csv"
                                           # after whatever name like in the code

You can use pandas. It does take some extra memory so it's not always possible, but it's very fast and easy to use.

import pandas as pd 
pd.DataFrame(np_array).to_csv("path/to/file.csv")

if you don't want a header or index, use to_csv("/path/to/file.csv", header=None, index=None)


If you want to save your numpy array (e.g. your_array = np.array([[1,2],[3,4]])) to one cell, you could convert it first with your_array.tolist().

Then save it the normal way to one cell, with delimiter=';' and the cell in the csv-file will look like this [[1, 2], [2, 4]]

Then you could restore your array like this: your_array = np.array(ast.literal_eval(cell_string))


Writing record arrays as CSV files with headers requires a bit more work.

This example reads from a CSV file ('example.csv') and writes its contents to another CSV file (out.csv).

import numpy as np

# Write an example CSV file with headers on first line
with open('example.csv', 'w') as fp:
    fp.write('''\
col1,col2,col3
1,100.1,string1
2,222.2,second string
''')

# Read it as a Numpy record array
ar = np.recfromcsv('example.csv')
print(repr(ar))
# rec.array([(1, 100.1, 'string1'), (2, 222.2, 'second string')], 
#           dtype=[('col1', '<i4'), ('col2', '<f8'), ('col3', 'S13')])

# Write as a CSV file with headers on first line
with open('out.csv', 'w') as fp:
    fp.write(','.join(ar.dtype.names) + '\n')
    np.savetxt(fp, ar, '%s', ',')

Note that the above example cannot handle values which are strings with commas. To always enclose non-numeric values within quotes, use the csv package:

import csv

with open('out2.csv', 'wb') as fp:
    writer = csv.writer(fp, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(ar.dtype.names)
    writer.writerows(ar.tolist())

You can also do it with pure python without using any modules.

# format as a block of csv text to do whatever you want
csv_rows = ["{},{}".format(i, j) for i, j in array]
csv_text = "\n".join(csv_rows)

# write it to a file
with open('file.csv', 'w') as f:
    f.write(csv_text)

As already discussed, the best way to dump the array into a CSV file is by using .savetxt(...)method. However, there are certain things we should know to do it properly.

For example, if you have a numpy array with dtype = np.int32 as

   narr = np.array([[1,2],
                 [3,4],
                 [5,6]], dtype=np.int32)

and want to save using savetxt as

np.savetxt('values.csv', narr, delimiter=",")

It will store the data in floating point exponential format as

1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
5.000000000000000000e+00,6.000000000000000000e+00

You will have to change the formatting by using a parameter called fmt as

np.savetxt('values.csv', narr, fmt="%d", delimiter=",")

to store data in its original format

Saving Data in Compressed gz format

Also, savetxt can be used for storing data in .gz compressed format which might be useful while transferring data over network.

We just need to change the extension of the file as .gz and numpy will take care of everything automatically

np.savetxt('values.gz', narr, fmt="%d", delimiter=",")

Hope it helps


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to arrays

PHP array value passes to next row Use NSInteger as array index How do I show a message in the foreach loop? Objects are not valid as a React child. If you meant to render a collection of children, use an array instead Iterating over arrays in Python 3 Best way to "push" into C# array Sort Array of object by object field in Angular 6 Checking for duplicate strings in JavaScript array what does numpy ndarray shape do? How to round a numpy array?

Examples related to csv

Pandas: ValueError: cannot convert float NaN to integer Export result set on Dbeaver to CSV Convert txt to csv python script How to import an Excel file into SQL Server? "CSV file does not exist" for a filename with embedded quotes Save Dataframe to csv directly to s3 Python Data-frame Object has no Attribute (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape How to write to a CSV line by line? How to check encoding of a CSV file

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index