I'm trying to implement the binary classification example using the IMDb dataset in Google Colab. I have implemented this model before, but when I tried to do it again after a few days, the load_data() function returned a ValueError: 'Object arrays cannot be loaded when allow_pickle=False'.
I have already tried solving this, referring to an existing answer for a similar problem: How to fix 'Object arrays cannot be loaded when allow_pickle=False' in the sketch_rnn algorithm. But it turns out that just adding an allow_pickle argument isn't sufficient.
My code:
from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
The error:
ValueError Traceback (most recent call last)
<ipython-input-1-2ab3902db485> in <module>()
1 from keras.datasets import imdb
----> 2 (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
2 frames
/usr/local/lib/python3.6/dist-packages/keras/datasets/imdb.py in load_data(path, num_words, skip_top, maxlen, seed, start_char, oov_char, index_from, **kwargs)
57 file_hash='599dadb1135973df5b59232a0e9a887c')
58 with np.load(path) as f:
---> 59 x_train, labels_train = f['x_train'], f['y_train']
60 x_test, labels_test = f['x_test'], f['y_test']
61
/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in __getitem__(self, key)
260 return format.read_array(bytes,
261 allow_pickle=self.allow_pickle,
--> 262 pickle_kwargs=self.pickle_kwargs)
263 else:
264 return self.zip.read(key)
/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
690 # The array contained Python objects. We need to unpickle the data.
691 if not allow_pickle:
--> 692 raise ValueError("Object arrays cannot be loaded when "
693 "allow_pickle=False")
694 if pickle_kwargs is None:
ValueError: Object arrays cannot be loaded when allow_pickle=False
Here's a trick to force imdb.load_data to allow pickle: in your notebook, replace this line:
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
with this:
import numpy as np
# save np.load
np_load_old = np.load
# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)
# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
# restore np.load for future normal usage
np.load = np_load_old
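If you want np.load restored even when load_data raises, the same trick can be wrapped in try/finally. This is just a sketch of the same idea, not part of the original recipe:
import numpy as np
from keras.datasets import imdb

np_load_old = np.load
np.load = lambda *a, **k: np_load_old(*a, allow_pickle=True, **k)
try:
    # load_data calls np.load internally, so the patched default applies here
    (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
finally:
    # restore the original np.load even if load_data failed
    np.load = np_load_old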
This issue is still open on the Keras GitHub tracker. I hope it gets solved as soon as possible. Until then, try downgrading your numpy version to 1.16.1 or 1.16.2; that seems to solve the problem.
!pip install numpy==1.16.1
import numpy as np
These versions of numpy still have allow_pickle defaulting to True in np.load.
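To confirm the downgrade took effect (you may need to restart the runtime first so the new numpy is actually imported), a quick sanity check is to print the version and the default of allow_pickle:
import inspect
import numpy as np

print(np.__version__)  # expect 1.16.1 here
# numpy <= 1.16.2 defaults allow_pickle to True; 1.16.3+ defaults it to False
print(inspect.signature(np.load).parameters['allow_pickle'].default)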
Following this issue on GitHub, the official solution is to edit the imdb.py file. This fix worked well for me without the need to downgrade numpy. Find the imdb.py file at tensorflow/python/keras/datasets/imdb.py
(full path for me was: C:\Anaconda\Lib\site-packages\tensorflow\python\keras\datasets\imdb.py
- other installs will be different) and change line 85 as per the diff:
- with np.load(path) as f:
+ with np.load(path, allow_pickle=True) as f:
The reason for the numpy change is security: it prevents the Python equivalent of an SQL injection hidden in a pickled file. The change above will ONLY affect the imdb data, so you retain the security everywhere else (by not downgrading numpy).
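If you are not sure which imdb.py your environment actually uses, one way to locate it (just a helper snippet I'm adding, not part of the official fix) is to print the module's __file__:
# print the location of the imdb module that `from keras.datasets import imdb`
# actually picks up, so you edit the right file; the tf.keras copy lives under
# tensorflow/python/keras/datasets/imdb.py as noted above
import keras.datasets.imdb as keras_imdb
print(keras_imdb.__file__)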
I just used allow_pickle = True as an argument to np.load() and it worked for me.
np.load(path, allow_pickle=True)
In my case it worked with:
np.load(path, allow_pickle=True)
I think the answer from cheez (https://stackoverflow.com/users/122933/cheez) is the easiest and most effective one. I'd elaborate on it a little so that it does not modify a numpy function for the whole session.
My suggestion is below. I'm using it to download the reuters dataset from keras, which shows the same kind of error:
import numpy as np

old = np.load
np.load = lambda *a,**k: old(*a,**k,allow_pickle=True)
from keras.datasets import reuters
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)
np.load = old
del(old)
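To keep the patch scoped even more tightly, the same idea can be wrapped in a context manager. This is my own sketch building on the snippet above; the numpy_allow_pickle name is just illustrative:
import numpy as np
from contextlib import contextmanager
from keras.datasets import reuters

@contextmanager
def numpy_allow_pickle():
    # temporarily make allow_pickle=True the default for np.load
    np_load_old = np.load
    np.load = lambda *a, **k: np_load_old(*a, allow_pickle=True, **k)
    try:
        yield
    finally:
        np.load = np_load_old

with numpy_allow_pickle():
    (train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)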
You can try changing the flag's value
np.load(training_image_names_array,allow_pickle=True)
None of the solutions listed above worked for me: I run Anaconda with Python 3.7.3. What worked for me was:
run "conda install numpy==1.16.1" from the Anaconda PowerShell
close and reopen the notebook
I ended up here, tried the approaches above, and could not figure it out. I was actually working on pre-existing code where
pickle.load(path)
was used, so I replaced it with
np.load(path, allow_pickle=True)
On a Jupyter notebook, using
np_load_old = np.load
# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)
worked fine, but the problem appears when you use this method in Spyder: you have to restart the kernel every time, or you will get an error like:
TypeError: <lambda>() got multiple values for keyword argument 'allow_pickle'
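One way to avoid that TypeError without restarting the kernel is to make the patch idempotent, so re-running the cell does not wrap the already-patched np.load again. This is a sketch, assuming the duplicate-keyword error comes from applying the patch twice:
import numpy as np

# make the patch idempotent: if np.load is already wrapped, re-running the cell
# would end up passing allow_pickle twice and raise the TypeError above
if not getattr(np.load, "_allow_pickle_patched", False):
    _np_load_original = np.load

    def _np_load_allow_pickle(*args, **kwargs):
        # only inject allow_pickle when the caller did not pass it explicitly
        kwargs.setdefault("allow_pickle", True)
        return _np_load_original(*args, **kwargs)

    _np_load_allow_pickle._allow_pickle_patched = True
    np.load = _np_load_allow_pickle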
I solved this issue using the solution here: find the path to imdb.py, then just add the flag to np.load(path, ...flag...):
def load_data(...):
    ...
- with np.load(path) as f:
+ with np.load(path, allow_pickle=True) as f:
It worked for me:
import numpy as np
from keras.datasets import reuters

np_load_old = np.load
np.load = lambda *a, **k: np_load_old(*a, allow_pickle=True, **k)
(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=None, test_split=0.2)
np.load = np_load_old
What I have found is that TensorFlow 2.0 (I am using 2.0.0-alpha0) is not compatible with the latest version of numpy, i.e. v1.17.0 (and possibly v1.16.5+). As soon as TF2 is imported, it throws a huge list of FutureWarnings that look something like this:
FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
This also resulted in the allow_pickle error when I tried to load the imdb dataset from keras.
I tried the following solution, which worked just fine, but I had to do it in every single project where I was importing TF2 or tf.keras.
np_load_old = np.load
np.load = lambda *a, **k: np_load_old(*a, allow_pickle=True, **k)
The easiest solution I found was to either install numpy 1.16.1 globally, or use compatible versions of tensorflow and numpy in a virtual environment.
My goal with this answer is to point out that it's not just a problem with imdb.load_data, but a larger problem caused by the incompatibility of TF2 and numpy versions, which may result in many other hidden bugs or issues.
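If you go the pinned-environment route, a small guard at the top of each project can fail fast when the environment drifts back to an incompatible numpy. This is only a sketch; the cut-off is just what failed in my setup, not an official compatibility matrix:
import numpy as np

# tf 2.0.0-alpha0 misbehaved for me with numpy >= 1.17 (and possibly 1.16.5+)
major, minor = (int(x) for x in np.__version__.split(".")[:2])
if (major, minor) >= (1, 17):
    raise RuntimeError(
        "numpy %s clashes with this TF2 alpha build; pin numpy==1.16.1" % np.__version__
    )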
Use this
from tensorflow.keras.datasets import imdb
instead of this
from keras.datasets import imdb
This error can come up when you have an older version of torch, like 1.6.0 with torchvision==0.7.0. You can check your torch version with this command:
import torch
print(torch.__version__)
This error is already resolved in newer versions of torch.
You can remove this error by making the following change to np.load():
np.load(somepath, allow_pickle=True)
Setting allow_pickle=True will solve it.
The error can also occur if you try to save a Python list of numpy arrays with np.save and load it with np.load. I am only mentioning this for the sake of googlers, so they can check that this is not their issue. Using allow_pickle=True also fixed it, if a list is indeed what you meant to save and load.
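A minimal sketch that reproduces this situation (the file name is made up), in case you want to confirm it is what's happening in your code:
import numpy as np

# a ragged list of arrays can only be stored as an object array,
# and object arrays are pickled inside the .npy file
ragged = [np.arange(3), np.arange(5)]
np.save("ragged.npy", np.asarray(ragged, dtype=object))

data = np.load("ragged.npy", allow_pickle=True)   # works
# np.load("ragged.npy") would raise the same ValueError on numpy >= 1.16.3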
Yes, installing a previous version of numpy solved the problem.
For those who use the PyCharm IDE:
in my IDE (PyCharm), go to File -> Settings -> Project Interpreter: I found my numpy to be 1.16.3, so I reverted to 1.16.1. Click +, type numpy in the search box, tick "Specify version": 1.16.1 and choose Install Package.
TensorFlow has a fix in the tf-nightly version.
!pip install tf-nightly
The current version is '2.0.0-dev20190511'.
Instead of
from keras.datasets import imdb
use
from tensorflow.keras.datasets import imdb
top_words = 10000
((x_train, y_train), (x_test, y_test)) = imdb.load_data(num_words=top_words, seed=21)
I don't usually post to these things, but this was super annoying. The confusion comes from the fact that some of the Keras imdb.py files have already been updated from:
with np.load(path) as f:
to the version with allow_pickle=True. Make sure to check your imdb.py file to see if this change was already implemented. If it has been, the following works fine:
from keras.datasets import imdb
(train_text, train_labels), (test_text, test_labels) = imdb.load_data(num_words=10000)
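If you don't want to open imdb.py by hand, a quick way to check (just a convenience snippet, not a fix by itself) is to print the source of load_data and look for allow_pickle:
import inspect
from keras.datasets import imdb

# if 'allow_pickle=True' appears in the np.load call, your imdb.py is already
# patched and the plain load_data call above should work
print(inspect.getsource(imdb.load_data))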
The easiest way is to edit imdb.py, passing allow_pickle=True to np.load at the line where imdb.py throws the error.
I was facing the same issue; here is the relevant line from the error:
File "/usr/lib/python3/dist-packages/numpy/lib/npyio.py", line 260, in __getitem__
So I solved the issue by updating the npyio.py file. In npyio.py, line 196 assigns a value to allow_pickle, so I updated that line to:
self.allow_pickle = True
The answer from @cheez sometimes doesn't work, recursively calling the function again and again. To solve this problem you should copy the function deeply. You can do this by using the function partial, so the final code is:
import numpy as np
from functools import partial
from keras.datasets import imdb
# save np.load
np_load_old = partial(np.load)
# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)
# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
# restore np.load for future normal usage
np.load = np_load_old