mnist

Sebastian Raschka, 2015

Python Machine Learning - Supplementary Datasets

MNIST Dataset

Used in chapters 12 and 13

The MNIST dataset was constructed from two datasets of the US National Institute of Standards and Technology (NIST). The training set consists of handwritten digits from 250 different people: 50 percent high school students and 50 percent Census Bureau employees. The test set contains handwritten digits from different people, following the same split.

Features

Each feature vector (a row in the feature matrix) consists of 784 pixel intensities, unrolled from the original 28x28 pixel images.
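The unrolling can be illustrated with a small sketch. It assumes NumPy and row-major order (the IDX files store pixels row by row); the `image` array is a made-up stand-in for a real digit, not data from this dataset.

```python
import numpy as np

# Hypothetical 28x28 image with intensities in 0..255 (a stand-in for a real digit).
image = np.zeros((28, 28), dtype=np.uint8)
image[4:24, 13:15] = 255  # a crude vertical stroke

# Unroll the image row by row into a 784-dimensional feature vector.
feature_vector = image.reshape(-1)

# With row-major order, pixel (r, c) lands at index r * 28 + c of the vector.
r, c = 10, 13
same_pixel = feature_vector[r * 28 + c] == image[r, c]

# Reshaping with the same order recovers the original image, e.g. for plotting.
restored = feature_vector.reshape(28, 28)
```

Reshaping back to 28x28 is what one would typically do before displaying a row of the feature matrix as an image.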

Number of samples: A subset of 5000 images (the first 500 digits of each class)
Target variable (discrete): {500x 0, ..., 500x 9}
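A subset like this could be drawn from a larger feature matrix roughly as follows. This is only a sketch assuming NumPy arrays `X` (features) and `y` (labels); the helper name `first_n_per_class` is hypothetical, and the README does not say how the subset was actually produced.

```python
import numpy as np


def first_n_per_class(X, y, n=500):
    """Return the first n samples of each class, grouped by ascending class label."""
    # For every label, take the indices of its first n occurrences in y.
    idx = np.concatenate(
        [np.flatnonzero(y == label)[:n] for label in np.unique(y)]
    )
    return X[idx], y[idx]
```

Applied to the ten digit classes with `n=500`, this yields 5000 rows with the labels ordered as 500 zeros, then 500 ones, and so on, provided each class has at least 500 samples.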

References

Source: http://yann.lecun.com/exdb/mnist/
Y. LeCun and C. Cortes. MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist, 2010.

Loading MNIST

The description and code from chapter 12

In addition, I added a convenience function to one of my external machine learning packages.
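For orientation, here is a minimal sketch of reading the gzipped IDX files in this directory directly. It assumes NumPy; the function name `load_mnist_gz` is hypothetical and is neither the chapter 12 code nor the convenience function mentioned above. The magic numbers (2049 for labels, 2051 for images) follow the IDX format described on the source page.

```python
import gzip
import os
import struct

import numpy as np


def load_mnist_gz(path, kind="train"):
    """Load images and labels from the gzipped IDX files found in `path`.

    `kind` is the file name prefix: "train" or "t10k".
    """
    labels_path = os.path.join(path, f"{kind}-labels-idx1-ubyte.gz")
    images_path = os.path.join(path, f"{kind}-images-idx3-ubyte.gz")

    with gzip.open(labels_path, "rb") as f:
        # Header: magic number and item count as big-endian 32-bit integers.
        magic, n_labels = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError(f"unexpected label file magic number: {magic}")
        labels = np.frombuffer(f.read(), dtype=np.uint8)

    with gzip.open(images_path, "rb") as f:
        # Header: magic number, image count, rows, columns.
        magic, n_images, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 2051:
            raise ValueError(f"unexpected image file magic number: {magic}")
        # One row per image, with rows * cols pixel intensities each.
        images = np.frombuffer(f.read(), dtype=np.uint8)
        images = images.reshape(n_images, rows * cols)

    return images, labels
```

A call such as `X_train, y_train = load_mnist_gz("mnist", kind="train")` would then return the feature matrix and label vector, where `"mnist"` stands for wherever the files were downloaded. The arrays returned by `np.frombuffer` are read-only; call `.copy()` on them if they need to be modified in place.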

