Spiking Heidelberg Digits and Spiking Speech Commands

In collaboration with the Electronic Vision(s) Group at the University of Heidelberg, we developed two new spiking datasets for the evaluation of spiking neural networks. The Spiking Heidelberg Digits (SHD) dataset and the Spiking Speech Commands (SSC) dataset are both audio-based classification datasets for which input spikes and output labels are provided. The datasets are released under the Creative Commons Attribution 4.0 International License.

Download

Datasets: https://zenkelab.org/datasets/
Mirrors: https://ieee-dataport.org/open-access/heidelberg-spiking-datasets and https://compneuro.net/datasets/

Conversion code: https://github.com/electronicvisions/lauscher

Leader board SHD

Publication	Accuracy %	Network	URL
Sun et al. 2025	96.26 ± 0.08	Parameter-free attention for delay SNNs	https://doi.org/10.1016/j.neunet.2025.107154
Schöne et al. 2024	95.9 ± 0.9	Event-based linear state space model	https://arxiv.org/abs/2404.18508
Baronig et al. 2024	95.8 ± 0.6	RSNN with adaptive LIF neurons and symplectic-Euler discretization	https://arxiv.org/abs/2408.07517
Hammouamri et al. 2023	95.1 ± 0.3	Fully connected SNN with learned delays	https://arxiv.org/abs/2306.17670
Bittar and Garner 2022	94.6	RSNN with adaptation	https://www.frontiersin.org/articles/10.3389/fnins.2022.865897
Nowotny et al. 2025	93.5 ± 0.7	RSNN with delay line input and augmentations	https://iopscience.iop.org/article/10.1088/2634-4386/ada852
Mészáros et al. 2025	93.2	RSNN with delay learning	https://arxiv.org/abs/2501.07331
Sun et al. 2023	92.45	Feed-forward SNN with adaptive axonal delays	https://ieeexplore.ieee.org/abstract/document/10094768
Yu et al. 2022	92.4	Feed-forward SNN with spatio-temporal filters and attention	https://www.frontiersin.org/articles/10.3389/fnins.2022.1079357
Yao et al. 2021	91.1	RSNN with temporal attention	https://arxiv.org/abs/2107.11711
D’Agostino et al. 2023	90.1 / 87.6	Feed-forward SNN with random dendritic delays (simulation/hardware, RRAM)	https://arxiv.org/abs/2312.08960
Yin et al. 2020	84.4	RSNN with adaptation	https://arxiv.org/abs/2005.11633
Rossbroich et al. 2022	83.5 ± 1.5	Recurrent convolutional SNN with fluctuation-driven init	https://iopscience.iop.org/article/10.1088/2634-4386/ac97bb
Cramer et al. 2020	83.2 ± 1.3	RSNN with data augmentation + noise injection	https://doi.org/10.1109/TNNLS.2020.3044364
Perez-Nieves et al. 2021	82.7 ± 0.8	RSNN with heterogeneous time constants	https://www.biorxiv.org/content/10.1101/2020.12.18.423468v2
Cramer et al. 2020	71.4 ± 1.9	RSNN	https://doi.org/10.1109/TNNLS.2020.3044364
Cramer et al. 2020	48.1 ± 1.6	feed-forward SNN (single hidden layer)	https://doi.org/10.1109/TNNLS.2020.3044364

Please let us know if your work should be on this list.

Please also note the leader boards at Papers with Code for SHD (https://paperswithcode.com/sota/audio-classification-on-shd) and SSC (https://paperswithcode.com/sota/audio-classification-on-ssc).

Publication

When using these data or the code for your work, please cite:

Cramer, B., Stradmann, Y., Schemmel, J., and Zenke, F. (2022).
The Heidelberg Spiking Data Sets for the Systematic Evaluation of Spiking Neural Networks.
IEEE Transactions on Neural Networks and Learning Systems 33, 2744–2757.
https://doi.org/10.1109/TNNLS.2020.3044364.

Specifications

We provide two distinct classification datasets for spiking neural networks.

Name	Classes	Samples (train/valid/test)	Parent dataset	URL
SHD	20	8156/-/2264	Heidelberg Digits (HD)	https://zenkelab.org/datasets/hd_audio.tar.gz
SSC	35	75466/9981/20382	Speech Commands v0.02	https://arxiv.org/abs/1804.03209

Both datasets are derived from audio datasets. Spikes in 700 input channels were generated using Lauscher, an artificial cochlea model. SHD consists of approximately 10,000 high-quality, aligned studio recordings of the spoken digits 0 to 9 in both German and English. The recordings come from 12 distinct speakers, two of whom appear only in the test set. SSC is based on the Speech Commands dataset released by Google, which consists of utterances from 35 word categories recorded from a larger number of speakers under less controlled conditions.
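
Because SHD ships without a validation split, one option is to hold out a few training speakers for validation. The following is a minimal sketch of that idea, not a procedure prescribed by the dataset authors. It assumes the per-sample speaker IDs (the extra/speaker array listed under Data format) have already been read into a sequence; the function name and the toy speaker IDs are hypothetical.

```python
def split_by_speaker(speakers, valid_speakers):
    """Return (train_idx, valid_idx) so that no speaker appears in both.

    speakers: per-sample speaker IDs, e.g. read from extra/speaker.
    valid_speakers: IDs to hold out for validation (your own choice).
    """
    held_out = set(int(s) for s in valid_speakers)
    train_idx, valid_idx = [], []
    for i, s in enumerate(speakers):
        # Route each sample index to exactly one of the two lists
        (valid_idx if int(s) in held_out else train_idx).append(i)
    return train_idx, valid_idx


# Toy example with made-up speaker IDs, not real SHD data
speakers = [0, 1, 2, 0, 3, 1, 2, 3]
train_idx, valid_idx = split_by_speaker(speakers, valid_speakers=[3])
print(train_idx)
print(valid_idx)
```

Splitting by speaker rather than at random keeps the validation score closer in spirit to the test set, which contains speakers never seen during training.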

Data format

For maximum compatibility, both datasets are provided in HDF5 format, which can be read from most major programming languages.

root
|-spikes
   |-times[]
   |-units[]
|-labels[]
|-extra
   |-speaker[]
   |-keys[]
   |-meta_info
      |-gender[]
      |-age[]
      |-body_height[]

Each datum consists of two parallel lists: the firing times and the IDs of the units (neurons) that fired at those times.
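
Many training pipelines first bin these two lists into a dense array of spike counts. The sketch below is a dependency-free illustration of that step. The function name, the bin count, and the window length are assumptions chosen for the example; only the default of 700 units comes from the specification above.

```python
def spikes_to_raster(times, units, n_bins, max_time, n_units=700):
    """Bin one datum into an n_bins x n_units grid of spike counts.

    times and units are the two parallel lists of one datum;
    max_time is the window length in the same unit as times.
    """
    raster = [[0] * n_units for _ in range(n_bins)]
    for t, u in zip(times, units):
        if t < 0 or t >= max_time:
            continue  # drop spikes outside the chosen window
        b = int(t / max_time * n_bins)
        b = min(b, n_bins - 1)  # guard against floating-point rounding
        raster[b][int(u)] += 1
    return raster


# Toy datum with made-up values (not taken from the dataset)
times = [0.01, 0.01, 0.52, 0.99]
units = [5, 699, 5, 42]
raster = spikes_to_raster(times, units, n_bins=10, max_time=1.0)
print(len(raster), len(raster[0]))
```

In practice the same binning would usually be done with NumPy or a deep learning framework; plain lists are used here only to keep the example self-contained.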

Example code

For a tutorial on how to train a spiking neural network on this dataset, check out:
https://github.com/fzenke/spytorch/blob/master/notebooks/SpyTorchTutorial4.ipynb

The following code illustrates how to download and access the dataset in Python. The example uses the get_file utility from TensorFlow's Keras API to download the files and the PyTables package (https://www.pytables.org) to load the HDF5 files.

import os
import urllib.request
import gzip, shutil
from tensorflow.keras.utils import get_file
cache_dir=os.path.expanduser("~/data")
cache_subdir="hdspikes"
print("Using cache dir: %s"%cache_dir)
# The remote directory with the data files
base_url = "https://zenkelab.org/datasets"
# Retrieve MD5 hashes from remote
response = urllib.request.urlopen("%s/md5sums.txt"%base_url)
data = response.read() 
lines = data.decode('utf-8').split("\n")
file_hashes = { line.split()[1]:line.split()[0] for line in lines if len(line.split())==2 }
def get_and_gunzip(origin, filename, md5hash=None):
    gz_file_path = get_file(filename, origin, md5_hash=md5hash, cache_dir=cache_dir, cache_subdir=cache_subdir)
    hdf5_file_path=gz_file_path[:-3]
    if not os.path.isfile(hdf5_file_path) or os.path.getctime(gz_file_path) > os.path.getctime(hdf5_file_path):
        print("Decompressing %s"%gz_file_path)
        with gzip.open(gz_file_path, 'r') as f_in, open(hdf5_file_path, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)
    return hdf5_file_path
# Download the Spiking Heidelberg Digits (SHD) dataset
files = [ "shd_train.h5.gz", 
          "shd_test.h5.gz",
        ]
for fn in files:
    origin = "%s/%s"%(base_url,fn)
    hdf5_file_path = get_and_gunzip(origin, fn, md5hash=file_hashes[fn])
    print(hdf5_file_path)
# Similarly, to download the SSC dataset
files = [ "ssc_train.h5.gz", 
          "ssc_valid.h5.gz",
          "ssc_test.h5.gz",
        ]
for fn in files:
    origin = "%s/%s"%(base_url,fn)
    hdf5_file_path = get_and_gunzip(origin,fn,md5hash=file_hashes[fn])
    print(hdf5_file_path)
# At this point we can visualize some of the data.
# Note: hdf5_file_path now refers to the last file processed above (ssc_test.h5)
import tables
import numpy as np
fileh = tables.open_file(hdf5_file_path, mode='r')
units = fileh.root.spikes.units
times = fileh.root.spikes.times
labels = fileh.root.labels
# This is how we access spikes and labels
index = 0
print("Times (s):", times[index])
print("Unit IDs:", units[index])
print("Label:", labels[index])
# A quick raster plot for one of the samples
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,4))
idx = np.random.randint(len(times),size=3)
for i,k in enumerate(idx):
    ax = plt.subplot(1,3,i+1)
    ax.scatter(times[k],700-units[k], color="k", alpha=0.33, s=2)
    ax.set_title("Label %i"%labels[k])
    ax.axis("off")
plt.show()
# Close the HDF5 file once we are done with it
fileh.close()
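
If you would rather not depend on TensorFlow just for get_file, the download-and-verify step can be approximated with the Python standard library. This is a sketch under assumptions: the helper names are hypothetical, and it assumes md5sums.txt contains one "<hash> <filename>" pair per line, which is what the parsing in the code above implies.

```python
import gzip
import hashlib
import os
import shutil
import urllib.request


def parse_md5sums(text):
    """Map filename -> MD5 hex digest from '<hash> <filename>' lines."""
    hashes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            hashes[parts[1]] = parts[0]
    return hashes


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it whole."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_and_gunzip(url, gz_path, expected_md5=None):
    """Download url to gz_path if missing, verify it, and decompress it."""
    if not os.path.isfile(gz_path):
        urllib.request.urlretrieve(url, gz_path)
    if expected_md5 is not None and md5sum(gz_path) != expected_md5:
        raise ValueError("MD5 mismatch for %s" % gz_path)
    out_path = gz_path[:-3]  # strip the ".gz" suffix
    if not os.path.isfile(out_path):
        with gzip.open(gz_path, "rb") as f_in, open(out_path, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)
    return out_path
```

With base_url as defined in the code above, the hash table could be built by passing the decoded contents of md5sums.txt to parse_md5sums, and each file fetched with a call such as fetch_and_gunzip(base_url + "/shd_train.h5.gz", "shd_train.h5.gz", hashes["shd_train.h5.gz"]).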

License

The above code is under the MIT License.
The datasets are released under the Creative Commons Attribution 4.0 International License.

Zenke Lab

Computational Neuroscience at the FMI