[python] Numpy converting array from float to strings

I have an array of floats that I have normalised to one (i.e. the largest number in the array is 1), and I wanted to use it as colour indices for a graph. In using matplotlib to use grayscale, this requires using strings between 0 and 1, so I wanted to convert the array of floats to an array of strings. I was attempting to do this by using "astype('str')", but this appears to create some values that are not the same (or even close) to the originals.

I notice this because matplotlib complains about finding the number 8 in the array, which is odd as it was normalised to one!

In short, I have an array phis, of float64, such that:

numpy.where(phis.astype('str').astype('float64') != phis)

is non empty. This is puzzling as (hopefully naively) it appears to be a bug in numpy, is there anything that I could have done wrong to cause this?

Edit: after investigation this appears to be due to the way the string function handles high precision floats. Using a vectorized toString function (as from robbles answer), this is also the case, however if the lambda function is:

lambda x: "%.2f" % x

Then the graphing works - curiouser and curiouser. (Obviously the arrays are no longer equal however!)

This question is related to python numpy matplotlib

The answer is


If the main problem is the loss of precision when converting from a float to a string, one possible way to go is to convert the floats to the decimalS: http://docs.python.org/library/decimal.html.

In python 2.7 and higher you can directly convert a float to a decimal object.


If you have an array of numbers and you want an array of strings, you can write:

strings = ["%.2f" % number for number in numbers]

If your numbers are floats, the array would be an array with the same numbers as strings with two decimals.

>>> a = [1,2,3,4,5]
>>> min_a, max_a = min(a), max(a)
>>> a_normalized = [float(x-min_a)/(max_a-min_a) for x in a]
>>> a_normalized
[0.0, 0.25, 0.5, 0.75, 1.0]
>>> a_strings = ["%.2f" % x for x in a_normalized]
>>> a_strings
['0.00', '0.25', '0.50', '0.75', '1.00']

Notice that it also works with numpy arrays:

>>> a = numpy.array([0.0, 0.25, 0.75, 1.0])
>>> print ["%.2f" % x for x in a]
['0.00', '0.25', '0.50', '0.75', '1.00']

A similar methodology can be used if you have a multi-dimensional array:

new_array = numpy.array(["%.2f" % x for x in old_array.reshape(old_array.size)])
new_array = new_array.reshape(old_array.shape)

Example:

>>> x = numpy.array([[0,0.1,0.2],[0.3,0.4,0.5],[0.6, 0.7, 0.8]])
>>> y = numpy.array(["%.2f" % w for w in x.reshape(x.size)])
>>> y = y.reshape(x.shape)
>>> print y
[['0.00' '0.10' '0.20']
 ['0.30' '0.40' '0.50']
 ['0.60' '0.70' '0.80']]

If you check the Matplotlib example for the function you are using, you will notice they use a similar methodology: build empty matrix and fill it with strings built with the interpolation method. The relevant part of the referenced code is:

colortuple = ('y', 'b')
colors = np.empty(X.shape, dtype=str)
for y in range(ylen):
    for x in range(xlen):
        colors[x, y] = colortuple[(x + y) % len(colortuple)]

surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, facecolors=colors,
        linewidth=0, antialiased=False)

This is probably slower than what you want, but you can do:

>>> tostring = vectorize(lambda x: str(x))
>>> numpy.where(tostring(phis).astype('float64') != phis)
(array([], dtype=int64),)

It looks like it rounds off the values when it converts to str from float64, but this way you can customize the conversion however you like.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index

Examples related to matplotlib

"UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure." when plotting figure with pyplot on Pycharm How to increase image size of pandas.DataFrame.plot in jupyter notebook? How to create a stacked bar chart for my DataFrame using seaborn? How to display multiple images in one figure correctly? Edit seaborn legend How to hide axes and gridlines in Matplotlib (python) How to set x axis values in matplotlib python? How to specify legend position in matplotlib in graph coordinates Python "TypeError: unhashable type: 'slice'" for encoding categorical data Seaborn Barplot - Displaying Values