[python] How to unpack pkl file?

I have a pkl file from MNIST dataset, which consists of handwritten digit images.

I'd like to take a look at each of those digit images, so I need to unpack the pkl file, except I can't find out how.

Is there a way to unpack/unzip pkl file?

This question is related to python pickle deep-learning mnist

The answer is


Handy one-liner

pkl() (
  python -c 'import pickle,sys;d=pickle.load(open(sys.argv[1],"rb"));print(d)' "$1"
)
pkl my.pkl

Will print __str__ for the pickled object.

The generic problem of visualizing an object is of course undefined, so if __str__ is not enough, you will need a custom script.


The pickle (and gzip if the file is compressed) module need to be used

NOTE: These are already in the standard Python library. No need to install anything new


In case you want to work with the original MNIST files, here is how you can deserialize them.

If you haven't downloaded the files yet, do that first by running the following in the terminal:

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

Then save the following as deserialize.py and run it.

import numpy as np
import gzip

IMG_DIM = 28

def decode_image_file(fname):
    result = []
    n_bytes_per_img = IMG_DIM*IMG_DIM

    with gzip.open(fname, 'rb') as f:
        bytes_ = f.read()
        data = bytes_[16:]

        if len(data) % n_bytes_per_img != 0:
            raise Exception('Something wrong with the file')

        result = np.frombuffer(data, dtype=np.uint8).reshape(
            len(bytes_)//n_bytes_per_img, n_bytes_per_img)

    return result

def decode_label_file(fname):
    result = []

    with gzip.open(fname, 'rb') as f:
        bytes_ = f.read()
        data = bytes_[8:]

        result = np.frombuffer(data, dtype=np.uint8)

    return result

train_images = decode_image_file('train-images-idx3-ubyte.gz')
train_labels = decode_label_file('train-labels-idx1-ubyte.gz')

test_images = decode_image_file('t10k-images-idx3-ubyte.gz')
test_labels = decode_label_file('t10k-labels-idx1-ubyte.gz')

The script doesn't normalize the pixel values like in the pickled file. To do that, all you have to do is

train_images = train_images/255
test_images = test_images/255

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pickle

Not able to pip install pickle in python 3.6 Python - AttributeError: 'numpy.ndarray' object has no attribute 'append' installing cPickle with python 3.5 How to read pickle file? What causes the error "_pickle.UnpicklingError: invalid load key, ' '."? How to save a list to a file and read it as a list type? ValueError: unsupported pickle protocol: 3, python2 pickle can not load the file dumped by python 3 pickle? Dump a list in a pickle file and retrieve it back later How to unpack pkl file? Why do I get "Pickle - EOFError: Ran out of input" reading an empty file?

Examples related to deep-learning

How to initialize weights in PyTorch? What is the use of verbose in Keras while validating the model? How to import keras from tf.keras in Tensorflow? Keras input explanation: input_shape, units, batch_size, dim, etc Pytorch reshape tensor dimension What is the role of "Flatten" in Keras? Best way to save a trained model in PyTorch? Update TensorFlow Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? Keras, How to get the output of each layer?

Examples related to mnist

How to unpack pkl file?