QSBench: Synthetic quantum circuit datasets for QML benchmarking

QSBench · April 6, 2026, 3:48pm

Hi everyone,

I’m sharing QSBench — a collection of synthetic quantum circuit datasets designed for machine learning benchmarking, especially for graph-based models and noise-aware learning.

What is QSBench?

QSBench is an ecosystem of datasets and tools for generating quantum circuits enriched with structural and physical metadata.

The goal is to move beyond:

purely random circuits
classical datasets embedded into quantum states

and instead provide structured, ML-ready quantum data.

Key Features

Structural Metadata (Graph-Ready)

Each circuit includes:

Adjacency matrices
Gate-level statistics
Entanglement metrics

This makes the datasets directly usable with Graph Neural Networks (GNNs).
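As a rough illustration of the graph-ready point, here is a minimal sketch that turns an adjacency matrix into the two-row edge list (sources, targets) that many GNN libraries accept. It assumes the matrix is a square nested list in which a non-zero entry marks an edge. The function name `adjacency_to_edge_index` is made up for this example and is not part of QSBench.

```python
def adjacency_to_edge_index(adj):
    """Convert a square adjacency matrix (nested lists) to a COO-style edge list.

    Returns [sources, targets], where edge k runs from sources[k] to targets[k].
    Assumption: any non-zero entry adj[i][j] means there is an edge i -> j.
    """
    n = len(adj)
    if any(len(row) != n for row in adj):
        raise ValueError("adjacency matrix must be square")

    sources, targets = [], []
    for i, row in enumerate(adj):
        for j, weight in enumerate(row):
            if weight != 0:
                sources.append(i)
                targets.append(j)
    return [sources, targets]


# Toy 3-node path graph 0 - 1 - 2, stored symmetrically.
toy_adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
edge_index = adjacency_to_edge_index(toy_adj)
print(edge_index)
```

With a real record, the same call could be applied to `sample["adjacency_matrix"]`, provided that field is stored as a dense square matrix.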

Noise-Aware Design

QSBench explicitly models different physical noise channels:

Depolarizing noise
Amplitude damping
Thermal relaxation (T1/T2)
Readout errors
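For readers less familiar with these channels, the sketch below applies textbook single-qubit depolarizing and amplitude damping channels to a 2x2 density matrix in plain Python. It is only an illustration of what the channels do, not the code used to generate QSBench. The depolarizing convention used here, rho -> (1 - p) rho + p I/2, is one of several in use, and the post does not say which parameterization the datasets follow.

```python
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def apply_kraus(rho, kraus_ops):
    """Apply a channel in Kraus form: rho -> sum_k K rho K^dagger."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus_ops:
        term = matmul(matmul(k, rho), dagger(k))
        for i in range(2):
            for j in range(2):
                out[i][j] += term[i][j]
    return out


def depolarizing(rho, p):
    """Mix rho with the maximally mixed state I/2 with probability p."""
    mixed = [[0.5, 0.0], [0.0, 0.5]]
    return [[(1 - p) * rho[i][j] + p * mixed[i][j] for j in range(2)]
            for i in range(2)]


def amplitude_damping(rho, gamma):
    """Decay |1> toward |0> with probability gamma (energy relaxation)."""
    k0 = [[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]]
    k1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]
    return apply_kraus(rho, [k0, k1])


excited = [[0.0, 0.0], [0.0, 1.0]]  # density matrix of |1><1|
ground = [[1.0, 0.0], [0.0, 0.0]]   # density matrix of |0><0|

print(amplitude_damping(excited, 0.25))  # population moves from |1> to |0>
print(depolarizing(ground, 0.1))         # state drifts toward I/2
```

The asymmetry is visible directly: amplitude damping only moves population from |1> to |0>, while depolarizing treats both basis states the same way.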

High-Performance Format

All datasets are stored in Apache Parquet, enabling:

Faster queries, since columnar storage lets readers load only the columns they need
Efficient large-scale processing
Better integration with ML pipelines

Available Datasets

QSBench-Core

Clean structural dataset (no noise)
Includes QASM, adjacency matrices, and entanglement metrics

QSBench-Depolarizing

Circuits with depolarizing noise
Designed for robustness and error mitigation research

QSBench-Amplitude

Focused on amplitude damping noise
Suitable for asymmetric noise modeling

QSBench-Transpilation

Raw vs transpiled circuits
Useful for studying compilation overhead and optimization
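One simple way to quantify compilation overhead is to compare gate counts before and after transpilation. The sketch below counts gates in flat OpenQASM 2 text and reports the ratio. It assumes the raw and transpiled circuits are available as QASM strings; the helper names and the toy circuits are invented for this example, and gate definitions or conditional statements are not handled.

```python
from collections import Counter

# Statements that are not gate applications in a flat OpenQASM 2 program.
NON_GATE = {"OPENQASM", "include", "qreg", "creg", "barrier", "measure", "reset"}


def count_gates(qasm):
    """Count gate applications per gate name in a flat OpenQASM 2 string."""
    counts = Counter()
    for line in qasm.splitlines():
        line = line.split("//", 1)[0]  # drop trailing comments
        for statement in line.split(";"):
            statement = statement.strip()
            if not statement:
                continue
            # The gate name is everything before the first '(' or space.
            name = statement.split("(", 1)[0].split()[0]
            if name not in NON_GATE:
                counts[name] += 1
    return counts


def gate_overhead(raw_qasm, transpiled_qasm):
    """Ratio of transpiled gate count to raw gate count."""
    raw_total = sum(count_gates(raw_qasm).values())
    transpiled_total = sum(count_gates(transpiled_qasm).values())
    if raw_total == 0:
        raise ValueError("raw circuit contains no gates")
    return transpiled_total / raw_total


# Toy example: a Bell-pair circuit and a hand-written decomposition of it.
raw = """OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
h q[0];
cx q[0],q[1];
"""
transpiled = """OPENQASM 2.0;
include "qelib1.inc";
qreg q[2];
rz(1.5707963) q[0];
sx q[0];
rz(1.5707963) q[0];
cx q[0],q[1];
"""
print(count_gates(raw))
print(gate_overhead(raw, transpiled))
```

The same idea extends to depth or two-qubit gate count, which are often better proxies for hardware cost than the total gate count.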

QSBench-Thermal

Thermal relaxation noise (T1/T2)
Designed for decoherence-aware modeling
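To make the T1/T2 parameters concrete, here is a small sketch of the standard relations behind thermal relaxation at zero temperature: energy decay with probability 1 - exp(-t/T1), and pure dephasing at rate 1/T_phi = 1/T2 - 1/(2 T1), which requires T2 <= 2 T1. The function name, the returned keys, and the example values are assumptions for illustration and do not describe how QSBench-Thermal was generated.

```python
import math


def thermal_relaxation_params(t1, t2, gate_time):
    """Derive decay factors for one gate from T1, T2 and the gate duration.

    All three arguments must use the same time unit.
    """
    if t1 <= 0 or t2 <= 0 or gate_time < 0:
        raise ValueError("T1 and T2 must be positive and gate_time non-negative")
    if t2 > 2 * t1:
        raise ValueError("T2 cannot exceed 2 * T1")

    # Probability that |1> relaxes to |0> during the gate.
    gamma = 1.0 - math.exp(-gate_time / t1)
    # Pure dephasing rate: 1/T_phi = 1/T2 - 1/(2 * T1).
    rate_phi = 1.0 / t2 - 1.0 / (2.0 * t1)
    # Extra shrinkage of off-diagonal terms from pure dephasing alone.
    dephasing_factor = math.exp(-gate_time * rate_phi)
    # Total shrinkage of off-diagonal terms: exp(-t / T2).
    coherence = math.exp(-gate_time / t2)
    return {
        "gamma": gamma,
        "dephasing_factor": dephasing_factor,
        "coherence": coherence,
    }


# Example values only: T1 = 50 us, T2 = 70 us, gate duration 0.1 us.
params = thermal_relaxation_params(50.0, 70.0, 0.1)
print(params)
```

As a consistency check, the off-diagonal shrinkage from amplitude damping, sqrt(1 - gamma), multiplied by `dephasing_factor` equals `coherence`.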

QSBench-Device

Hardware-inspired noise models
Includes realistic combinations of error sources

Example Usage

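# Requires the Hugging Face `datasets` package (pip install datasets).
from datasets import load_dataset

# Download the demo release of QSBench-Core from the Hugging Face Hub.
dataset = load_dataset("QSBench/QSBench-Core-v1.0.0-demo")

# Take the first circuit record from the train split.
sample = dataset["train"][0]

print(sample["gate_count"])             # value of the gate_count field
print(len(sample["adjacency_matrix"]))  # number of rows in the adjacency matrix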

Use Cases

Predicting circuit properties from structure
Training GNNs on quantum circuits
Noise classification and error mitigation
Transpilation cost estimation
Hardware-aware ML modeling

Roadmap

Targeted entanglement generation
Dynamic circuits (mid-circuit measurements)
Integration with physical Hamiltonians

Feedback

Would love feedback, especially on:

Missing features or metadata
Additional noise models
Real-world use cases

Thanks!
