[python] How to unpack pkl file?

In case you want to work with the original MNIST files, here is how you can deserialize them.

If you haven't downloaded the files yet, do that first by running the following in the terminal:

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

Then save the following as deserialize.py and run it.

import numpy as np
import gzip

IMG_DIM = 28

def decode_image_file(fname):
    result = []
    n_bytes_per_img = IMG_DIM*IMG_DIM

    with gzip.open(fname, 'rb') as f:
        bytes_ = f.read()
        data = bytes_[16:]

        if len(data) % n_bytes_per_img != 0:
            raise Exception('Something wrong with the file')

        result = np.frombuffer(data, dtype=np.uint8).reshape(
            len(bytes_)//n_bytes_per_img, n_bytes_per_img)

    return result

def decode_label_file(fname):
    result = []

    with gzip.open(fname, 'rb') as f:
        bytes_ = f.read()
        data = bytes_[8:]

        result = np.frombuffer(data, dtype=np.uint8)

    return result

train_images = decode_image_file('train-images-idx3-ubyte.gz')
train_labels = decode_label_file('train-labels-idx1-ubyte.gz')

test_images = decode_image_file('t10k-images-idx3-ubyte.gz')
test_labels = decode_label_file('t10k-labels-idx1-ubyte.gz')

The script doesn't normalize the pixel values like in the pickled file. To do that, all you have to do is

train_images = train_images/255
test_images = test_images/255

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to pickle

Not able to pip install pickle in python 3.6 Python - AttributeError: 'numpy.ndarray' object has no attribute 'append' installing cPickle with python 3.5 How to read pickle file? What causes the error "_pickle.UnpicklingError: invalid load key, ' '."? How to save a list to a file and read it as a list type? ValueError: unsupported pickle protocol: 3, python2 pickle can not load the file dumped by python 3 pickle? Dump a list in a pickle file and retrieve it back later How to unpack pkl file? Why do I get "Pickle - EOFError: Ran out of input" reading an empty file?

Examples related to deep-learning

How to initialize weights in PyTorch? What is the use of verbose in Keras while validating the model? How to import keras from tf.keras in Tensorflow? Keras input explanation: input_shape, units, batch_size, dim, etc Pytorch reshape tensor dimension What is the role of "Flatten" in Keras? Best way to save a trained model in PyTorch? Update TensorFlow Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? Keras, How to get the output of each layer?

Examples related to mnist

How to unpack pkl file?