How to save and load numpy.array() data properly?

Question

I wonder how to save and load numpy.array data properly. Currently I'm using the numpy.savetxt() method. For example, I have an array markers, which I save with:

    numpy.savetxt('markers.txt', markers)

In another script I try to open the previously saved file:

    markers = np.fromfile("markers.txt")

The values I get back are not the ones I saved. The saved data first looks like this:

    0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00

But when I save the just-loaded data with the same method, i.e. numpy.savetxt(), it looks like this:

    1.398043286095131769e-76 1.398043286095288860e-76 1.396426376485745879e-76 1.398043286055061908e-76 1.398043286095288860e-76 1.182950697433698368e-76 1.398043275797188953e-76 1.398043286095288860e-76 1.210894289234927752e-99 1.398040649781712473e-76

What am I doing wrong? PS: there are no other "backstage" operations that I perform. Just saving and loading, and that's what I get. Thank you in advance.

User · Answer

np.fromfile() has a sep= keyword argument:

    Separator between items if file is a text file. Empty ("")
    separator means the file should be treated as binary. Spaces (" ")
    in the separator match zero or more whitespace characters. A
    separator consisting only of spaces must match at least one
    whitespace.

The default value of sep="" means that np.fromfile() tries to read the file as binary rather than as space-separated text, so you get nonsense values back. If you use np.fromfile('markers.txt', sep=" ") you will get the result you are looking for.

However, as others have pointed out, np.loadtxt() is the preferred way to convert text files to numpy arrays, and unless the file needs to be human-readable it is usually better to use a binary format instead (e.g. np.load()/np.save()).
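The difference between the three ways of reading can be sketched as follows. This is only an illustration: the markers array here is a hypothetical stand-in for the one in the question, and a temporary directory is used so nothing is left on disk.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical stand-in for the question's `markers` array.
markers = np.zeros((2, 5))

with tempfile.TemporaryDirectory() as tmp:
    txt_path = Path(tmp) / "markers.txt"
    np.savetxt(txt_path, markers)

    # Default sep="": the text bytes are reinterpreted as raw float64 values.
    garbage = np.fromfile(txt_path)

    # sep=" " parses the file as text, but the result is flattened to 1-D.
    flat = np.fromfile(txt_path, sep=" ")

    # loadtxt parses the same file and keeps the row/column layout.
    table = np.loadtxt(txt_path)

print(flat.shape)   # (10,)
print(table.shape)  # (2, 5)
```

Note that np.fromfile() with sep=" " does not restore the original 2-D shape, which is one more reason to prefer np.loadtxt() for files written by np.savetxt().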

User · Answer

For a short answer, you should use np.save and np.load. Their advantage is that they are made by the developers of the numpy library and they already work (plus they are likely already optimized nicely), e.g.

    import numpy as np
    from pathlib import Path

    path = Path('~/data/tmp/').expanduser()
    path.mkdir(parents=True, exist_ok=True)

    lb, ub = -1, 1
    num_samples = 5
    x = np.random.uniform(low=lb, high=ub, size=(1, num_samples))
    y = x**2 + x + 2

    np.save(path / 'x', x)
    np.save(path / 'y', y)

    x_loaded = np.load(path / 'x.npy')
    y_load = np.load(path / 'y.npy')

    print(x is x_loaded)  # False
    print(x == x_loaded)  # [[ True  True  True  True  True]]

Expanded answer:

In the end it really depends on your needs, because you can also save the array in a human-readable format (see "Dump a NumPy array into a csv file") or even with other libraries if your files are extremely large (see "best way to preserve numpy arrays on disk" for an expanded discussion).

However (expanding, since you use the word "properly" in your question), I still think the numpy functions out of the box will most likely satisfy most users' needs. The most important reason is that they already work. Trying to use something else for any other reason might take you down an unexpectedly LONG rabbit hole to figure out why it doesn't work and to force it to work.

Take for example trying to save the array with pickle. I tried that just for fun, and it took me at least 30 minutes to realize that pickle wouldn't save my stuff unless I opened and wrote the file in binary mode with 'wb'. It took time to google, try things, understand the error message, etc. A small detail, but the fact that it already required me to open a file complicated things in unexpected ways. On top of that, it required me to re-read this (which, by the way, is sort of confusing): "Difference between modes a, a+, w, w+, and r+ in built-in open function".

So if there is an interface that meets your needs, use it unless you have a (very) good reason not to (e.g. compatibility with matlab, or for some reason you really want to read the file and printing in python really doesn't meet your needs, which might be questionable). Furthermore, if you need to optimize it, you'll most likely find out later down the line (rather than spending ages debugging useless stuff like opening a simple numpy file).

So use the interface numpy provides. It might not be perfect, but it's most likely fine, especially for a library that has been around as long as numpy.

I already spent the time saving and loading data with numpy in a bunch of ways, so have fun with it. Hope it helps!

    import numpy as np
    import pickle
    from pathlib import Path

    path = Path('~/data/tmp/').expanduser()
    path.mkdir(parents=True, exist_ok=True)

    lb, ub = -1, 1
    num_samples = 5
    x = np.random.uniform(low=lb, high=ub, size=(1, num_samples))
    y = x**2 + x + 2

    # using save (to npy), savez (to npz)
    np.save(path / 'x', x)
    np.save(path / 'y', y)
    np.savez(path / 'db', x=x, y=y)
    with open(path / 'db.pkl', 'wb') as db_file:
        pickle.dump(obj={'x': x, 'y': y}, file=db_file)

    # loading the npy, npz and pkl files
    x_loaded = np.load(path / 'x.npy')
    y_load = np.load(path / 'y.npy')
    db = np.load(path / 'db.npz')
    with open(path / 'db.pkl', 'rb') as db_file:
        db_pkl = pickle.load(db_file)

    print(x is x_loaded)
    print(x == x_loaded)
    print(x == db['x'])
    print(x == db_pkl['x'])
    print('done')

Some comments on what I learned:

- np.save: as expected, it writes a compact binary .npy file (it stores the raw data without compression; see also https://stackoverflow.com/a/55750128/1601580) and works out of the box without any file opening. Clean. Easy. Efficient. Use it.
- np.savez uses an uncompressed format (see the docs: "Save several arrays into a single file in uncompressed .npz format."). If you decide to use this (you were warned that you are moving away from the standard solution, so expect bugs!), you might discover that you need to pass the arrays as keyword arguments to name them, unless you want the default names. So don't use this if the first option already works (or if anything works, use that!).
- Pickle also allows arbitrary code execution. Some people might not want to use it for security reasons.
- Human-readable files are expensive to make, etc. Probably not worth it.
- There is something called hdf5 for large files. Cool! https://stackoverflow.com/a/9619713/1601580

Note this is not an exhaustive answer. For other resources, check these:

- For pickle (I guess the top answer is: don't use pickle, use np.save): "Save Numpy Array using Pickle"
- For large files (great answer! compares storage size, loading, saving and more!): https://stackoverflow.com/a/41425878/1601580
- For matlab (we have to accept matlab has some freakin' nice plots!): "Converting" Numpy arrays to Matlab and vice versa
- For saving in human-readable format: "Dump a NumPy array into a csv file"
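The point about argument names in .npz archives can be sketched like this. It is a minimal example rather than part of the original answer: it swaps in np.savez_compressed (numpy's compressed counterpart to np.savez, not mentioned above), and the file name db.npz and the temporary directory are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(1, 5))
y = x**2 + x + 2

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "db.npz"

    # Keyword names become the keys of the archive; positional arrays
    # would instead be stored under the default names arr_0, arr_1, ...
    np.savez_compressed(archive, x=x, y=y)

    # np.load on an .npz file returns a lazy NpzFile; the context manager
    # closes the underlying file once the arrays have been read.
    with np.load(archive) as db:
        keys = sorted(db.files)
        x_loaded = db["x"]
        y_loaded = db["y"]

print(keys)  # ['x', 'y']
print(np.array_equal(x, x_loaded), np.array_equal(y, y_loaded))  # True True
```

np.load also defaults to allow_pickle=False, so plain numeric archives like this one never go through pickle when they are read back.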

User · Answer

The most reliable way I have found to do this is to use np.savetxt with np.loadtxt, and not np.fromfile, which is better suited to binary files written with tofile. The np.tofile and np.fromfile methods write and read binary files, whereas np.savetxt writes a text file. So, for example:

    a = np.array([1, 2, 3, 4])
    np.savetxt('test1.txt', a, fmt='%d')
    b = np.loadtxt('test1.txt', dtype=int)
    a == b
    # array([ True,  True,  True,  True], dtype=bool)

Or:

    a.tofile('test2.dat')
    c = np.fromfile('test2.dat', dtype=int)
    c == a
    # array([ True,  True,  True,  True], dtype=bool)

I use the former method even if it is slower and (sometimes) creates bigger files: the binary format can be platform dependent (for example, the file format depends on the endianness of your system).

There is a platform-independent format for NumPy arrays, which can be saved and read with np.save and np.load:

    np.save('test3.npy', a)    # .npy extension is added if not given
    d = np.load('test3.npy')
    a == d
    # array([ True,  True,  True,  True], dtype=bool)
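To make the platform-dependence point concrete, here is a small sketch comparing a raw tofile/fromfile dump with np.save/np.load. The file names, the 2x3 int32 array and the explicit little-endian dtype "<i4" are choices made for this illustration, not something taken from the answer above.

```python
import tempfile
from pathlib import Path

import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = Path(tmp) / "a.dat"
    npy_path = Path(tmp) / "a.npy"

    # A raw dump stores only the bytes: no dtype, shape or byte order.
    # Pinning the dtype to little-endian int32 makes the file portable.
    a.astype("<i4").tofile(raw_path)

    # The reader has to supply the same dtype and restore the shape itself.
    raw = np.fromfile(raw_path, dtype="<i4").reshape(2, 3)

    # The .npy format records dtype, byte order and shape in its header.
    np.save(npy_path, a)
    restored = np.load(npy_path)

print(np.array_equal(raw, a))          # True
print(restored.shape, restored.dtype)  # (2, 3) int32
```

With the raw file, any mismatch between the writer's and reader's dtype silently produces wrong numbers, whereas the .npy file carries that information with it.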

User · Answer

    np.save('data.npy', num_arr)       # save
    new_num_arr = np.load('data.npy')  # load
