The Street View House Numbers (SVHN) Dataset

Instead of downloading the SVHN dataset manually, you can load each split in Python with a single line of code using the open-source Deep Lake package.
# Load the SVHN training split
import deeplake
ds = deeplake.load('hub://activeloop/svhn-train')
# Load the SVHN extra split
import deeplake
ds = deeplake.load('hub://activeloop/svhn-extra')
# Load the SVHN testing split
import deeplake
ds = deeplake.load('hub://activeloop/svhn-test')
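The three paths above differ only in the split name, so a small helper can build them and reject typos before any network call. This is a minimal sketch: `svhn_path` and `load_svhn` are hypothetical helper names, not part of Deep Lake, and `deeplake.load` is used exactly as in the snippets above (its availability may depend on the installed Deep Lake version).

```python
SVHN_SPLITS = ("train", "test", "extra")


def svhn_path(split):
    """Return the Deep Lake path for one SVHN split, following the paths listed above."""
    if split not in SVHN_SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SVHN_SPLITS}")
    return f"hub://activeloop/svhn-{split}"


def load_svhn(split):
    """Load one split; deeplake is imported lazily so svhn_path works without it."""
    import deeplake  # assumption: the deeplake package is installed and the network is reachable
    return deeplake.load(svhn_path(split))


print(svhn_path("train"))
```

Calling `load_svhn("extra")` would then be equivalent to the second snippet above.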
SVHN Data Fields
- image: a tensor containing the 32×32 images.
- boxes: a tensor containing the character-level bounding boxes around the digits.
- labels: a tensor of integers from 0 to 9 identifying each digit.
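To make the field layout concrete, the sketch below builds one synthetic sample with NumPy instead of reading from Deep Lake. The shapes are assumptions rather than documented facts: a 32×32 RGB image stored as uint8, one box per digit in a hypothetical [x, y, width, height] pixel layout, and one integer label per digit. Check the tensor shapes and box convention of the loaded dataset before relying on them.

```python
import numpy as np


def make_synthetic_sample(rng):
    """Build one fake sample with the three fields described above (assumed shapes)."""
    image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    boxes = np.array([[8, 4, 16, 24]], dtype=np.float32)  # hypothetical [x, y, w, h]
    labels = np.array([7], dtype=np.int64)
    return {"image": image, "boxes": boxes, "labels": labels}


def crop_digit(image, box):
    """Crop the region covered by one [x, y, w, h] box, clipped to the image bounds."""
    x, y, w, h = (int(v) for v in box)
    height, width = image.shape[:2]
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, width), min(y + h, height)
    return image[y0:y1, x0:x1]


sample = make_synthetic_sample(np.random.default_rng(0))
crop = crop_digit(sample["image"], sample["boxes"][0])
print(crop.shape, int(sample["labels"][0]))
```

The crop is indexed as rows (y) first and columns (x) second, which is the usual source of off-by-axis mistakes when working with box coordinates.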
SVHN Data Splits
- The SVHN training split comprises 73257 digits.
- The SVHN testing split comprises 26032 digits.
- The SVHN extra split comprises 531131 digits; these are comparatively easier samples intended as additional training data.
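As a quick sanity check on the split sizes listed above, the following sketch adds them up and shows how much of the data each split represents. Only the three counts come from this page; the variable names are illustrative.

```python
# Split sizes as listed above.
split_sizes = {"train": 73257, "test": 26032, "extra": 531131}

total = sum(split_sizes.values())
train_plus_extra = split_sizes["train"] + split_sizes["extra"]

for name, size in split_sizes.items():
    print(f"{name:>5}: {size:>7} digits ({size / total:.1%} of all digits)")
print(f"train + extra: {train_plus_extra} digits available for training")
```

Combining the training and extra splits gives 604388 digits for training, out of 630420 in total.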
Train a model on the SVHN dataset with PyTorch in Python
Use Deep Lake's built-in PyTorch dataloader to connect the data to your compute in one line:
dataloader = ds.pytorch(num_workers=0, batch_size=4, shuffle=False)
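The dataloader is only the input side of training. The sketch below shows what one training epoch could look like, using a synthetic loader so that it runs without Deep Lake. Everything here is an assumption for illustration: `SmallDigitNet` is a hypothetical tiny CNN (not the model from the SVHN paper), and the batches are assumed to be float tensors of shape (N, 3, 32, 32) paired with integer labels from 0 to 9. The batch layout produced by `ds.pytorch(...)` (key names, channel order, dtype) is not shown in this document, so the unpacking in the loop would need to be adapted to it.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class SmallDigitNet(nn.Module):
    """Hypothetical tiny CNN for 32x32 RGB digit images."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 32 -> 16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16 -> 8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def train_one_epoch(model, loader, optimizer):
    """Run one pass over the loader and return the mean batch loss."""
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    running = 0.0
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
        running += loss.item()
    return running / len(loader)


# Synthetic stand-in for the Deep Lake dataloader (assumed batch layout).
torch.manual_seed(0)
fake_images = torch.rand(16, 3, 32, 32)
fake_labels = torch.randint(0, 10, (16,))
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=4, shuffle=False)

model = SmallDigitNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
mean_loss = train_one_epoch(model, loader, optimizer)
print(f"mean loss: {mean_loss:.3f}")
```

The batch size of 4 mirrors the `ds.pytorch(...)` call above; larger batches are usually preferable for real training runs.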
Train a model on the SVHN dataset with TensorFlow in Python
dataloader = ds.tensorflow()
- Homepage: http://ufldl.stanford.edu/housenumbers/
- Paper: Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. "Reading Digits in Natural Images with Unsupervised Feature Learning." NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
Licensing Information
Citation Information
title={SVHN: Reading Digits in Natural Images with Unsupervised Feature Learning},
author={Yuval Netzer and Tao Wang and Adam Coates and Alessandro Bissacco and Bo Wu and Andrew Y. Ng},
workshop={NIPS Workshop on Deep Learning and Unsupervised Feature Learning},
year={2011}
What is the SVHN dataset for Python?
The SVHN dataset is used for developing machine learning and object recognition algorithms with minimal requirements for data preprocessing and formatting. It contains real-world images and has an order of magnitude more labeled data than the MNIST dataset. SVHN was created from house numbers in Google Street View images.
What is the SVHN dataset used for?
How was the SVHN dataset generated?
The SVHN dataset was built from a large number of Google Street View images, using a combination of automated scripts and Amazon Mechanical Turk to localize and transcribe the individual digits. The sample draws on a large set of house numbers from urban areas in different countries.
How to download the SVHN dataset in Python?
With the open-source package Activeloop Deep Lake, you can load the SVHN dataset in Python with one line of code. See the instructions above for loading the SVHN training, extra, and testing subsets.
How can I use the SVHN dataset in PyTorch or TensorFlow?
The open-source package Activeloop Deep Lake allows you to stream the SVHN dataset while training a model in PyTorch or TensorFlow with one line of code. See the instructions above on training a model on the SVHN dataset with PyTorch or with TensorFlow in Python.