
Supported Datasets

NeuroSTORM supports a wide range of publicly available fMRI datasets for both pre-training and downstream analysis. Below is an overview of 18+ datasets, with subject counts, spatial resolution, repetition time (TR), and scientific focus.

Dataset Overview Table

| Dataset Name (Abbreviation) | Subjects (Male / Female) | Spatial Resolution | TR (ms) | Homepage |
| --- | --- | --- | --- | --- |
| UK Biobank (UKB) | 35,458 (16,733 / 18,725) | 2.4 × 2.4 × 2.4 mm³ | 735 | Link |
| Adolescent Brain Cognitive Development (ABCD) | 9,448 (4,931 / 4,516) | 2.4 × 2.4 × 2.4 mm³ | 800 | Link |
| Human Connectome Project - Young Adult (HCP-YA) | 1,206 (550 / 656) | 2 × 2 × 2 mm³ | 720 | Link |
| Human Connectome Project - Aging (HCP-A) | 722 (319 / 403) | 2 × 2 × 2 mm³ | 800 | Link |
| Human Connectome Project - Development (HCP-D) | 632 (293 / 339) | 2 × 2 × 2 mm³ | 800 | Link |
| Human Connectome Project - Early Psychosis (HCP-EP) | 176 (109 / 67) | 2 × 2 × 2 mm³ | 800 | Link |
| ADHD-200 Sample (ADHD200) | 497 (321 / 176) | 3 × 3 × 4 mm³ | 2000 | Link |
| Autism Brain Imaging Data Exchange (ABIDE) | 1,112 (948 / 164) | 3 × 3 × 3 mm³ | 2000 | Link |
| UCLA Consortium for Neuropsychiatric Phenomics (UCLA) | 261 (152 / 109) | 3 × 3 × 4 mm³ | 2000 | Link |
| Center for Biomedical Research Excellence (COBRE) | 173 (130 / 43) | 3.75 × 3.75 × 4.55 mm³ | 2000 | Link |
| Motor Neuron Disease fMRI Dataset (MND) | 59 (44 / 15) | 2.395 × 2.395 × 2.4 mm³ | 2000 | Link |
| Transdiagnostic Connectome Project (TCP) | 245 (102 / 143) | 2 × 2 × 2 mm³ | 800 | Link |
| Healthy Brain Network (HBN) | ~3,900+ (ongoing) | Varies (2.4–2.5 mm) | ~800–1450 | Link |
| Philadelphia Neurodevelopmental Cohort (PNC) | 1,445 (701 / 744) | 3 × 3 × 3 mm³ | 3000 | Link |
| REST-meta-MDD | 2,428 | Varies (25 cohorts) | Varies | Link |
| DMT & Harmine Meditation (DMT-HAR-MED) | 40 (22 / 18) | ≈ 2 × 2 × 2 mm³ | Varies | Link |
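For scripted dataset selection, the table can be mirrored as plain records. The following is a minimal sketch, not part of NeuroSTORM itself: the `DatasetInfo` class, the `fast_tr_datasets` and `is_isotropic` helpers, and the 1000 ms threshold are all hypothetical names and values chosen for illustration. Only a few rows are transcribed, and entries listed as "Varies" are stored as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatasetInfo:
    """One row of the overview table (hypothetical container)."""
    abbreviation: str
    subjects: int
    voxel_mm: Optional[Tuple[float, float, float]]  # None where the table says "Varies"
    tr_ms: Optional[int]                            # None where the table says "Varies"


# A few rows transcribed from the table above.
DATASETS = [
    DatasetInfo("UKB", 35458, (2.4, 2.4, 2.4), 735),
    DatasetInfo("ABCD", 9448, (2.4, 2.4, 2.4), 800),
    DatasetInfo("HCP-YA", 1206, (2.0, 2.0, 2.0), 720),
    DatasetInfo("ADHD200", 497, (3.0, 3.0, 4.0), 2000),
    DatasetInfo("REST-meta-MDD", 2428, None, None),
]


def fast_tr_datasets(datasets: List[DatasetInfo], max_tr_ms: int = 1000) -> List[str]:
    """Return abbreviations of datasets whose TR is known and at most max_tr_ms."""
    return [d.abbreviation for d in datasets
            if d.tr_ms is not None and d.tr_ms <= max_tr_ms]


def is_isotropic(d: DatasetInfo) -> bool:
    """True when all three voxel dimensions are known and equal."""
    return d.voxel_mm is not None and len(set(d.voxel_mm)) == 1


print(fast_tr_datasets(DATASETS))
print([d.abbreviation for d in DATASETS if is_isotropic(d)])
```

Keeping "Varies" as `None` rather than a guessed number forces downstream code to handle multi-protocol datasets explicitly.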

Detailed Dataset Descriptions

UK Biobank (UKB): Large-scale population study with 35,458 subjects

The UK Biobank (UKB) is one of the world's largest population-based health resources, comprising extensive genetic, clinical, lifestyle, and imaging data from over 500,000 participants, more than 40,000 of whom have multimodal brain MRI, including both resting-state and task-based fMRI scans. Recruitment took place between 2006 and 2010 and targeted adults aged 40–69; a subset of participants undergoes repeat imaging, which enables longitudinal analyses. fMRI data are acquired on Siemens Skyra 3T scanners at a resolution of 2.4 × 2.4 × 2.4 mm³ with a fast TR of 735 ms. The dataset includes both rsfMRI (a 6-minute scan) and tfMRI (a faces/shapes emotion-matching task), plus comprehensive demographic, cognitive, and health phenotypes. Standardized preprocessing pipelines (including motion correction, ICA-FIX denoising, and registration to MNI152) are publicly available. UKB is widely used for large-scale brain-behavior association studies, disease risk modeling, and neuroimaging foundation model pre-training.
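The scan duration and TR quoted above determine roughly how many volumes a run contains and which temporal frequencies it can resolve. The sketch below works through that arithmetic; the function names are illustrative, and the volume count is an estimate from the nominal duration, so the released data may differ by a few frames.

```python
def n_volumes(scan_seconds: float, tr_ms: float) -> int:
    """Approximate number of fMRI volumes in a run of the given duration."""
    return round(scan_seconds * 1000.0 / tr_ms)


def sampling_rate_hz(tr_ms: float) -> float:
    """Temporal sampling rate implied by the repetition time."""
    return 1000.0 / tr_ms


def nyquist_hz(tr_ms: float) -> float:
    """Highest signal frequency that can be resolved without aliasing."""
    return sampling_rate_hz(tr_ms) / 2.0


# UKB resting-state run: 6 minutes at TR = 735 ms.
print(n_volumes(6 * 60, 735))        # about 490 volumes
print(round(nyquist_hz(735), 3))     # about 0.68 Hz
# A TR = 2000 ms dataset such as ABIDE, for comparison.
print(nyquist_hz(2000))              # 0.25 Hz
```

The gap between the two Nyquist limits is one reason fast-TR and slow-TR datasets are often handled differently when they are combined.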

Adolescent Brain Cognitive Development (ABCD): Longitudinal study of 9,448 children

The ABCD Study is the largest long-term study of brain development and child and adolescent health in the US. It follows over 11,800 children (aged 9–10 at baseline) through adolescence, with repeated multimodal MRI, cognitive, behavioral, genetic, and environmental data collection. Neuroimaging includes high-resolution rsfMRI (2.4 × 2.4 × 2.4 mm³, TR = 800 ms) and tfMRI (emotional, reward, and cognitive tasks), harmonized across 21 sites and 3 major scanner vendors. Imaging pipelines are derived from HCP preprocessing (motion correction, normalization, artifact removal), with ROI time series available (e.g., using the Schaefer atlas). ABCD supports studies of typical development, neuropsychiatric risk, and gene–brain–behavior relationships.
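ROI time series of the kind mentioned above are typically obtained by averaging the BOLD signal over all voxels that share an atlas label. The following NumPy sketch shows the idea on in-memory arrays; it assumes the 4D BOLD volume and the integer-labeled atlas are already loaded and aligned on the same grid, and the function name `roi_time_series` is hypothetical rather than part of any released pipeline.

```python
import numpy as np


def roi_time_series(bold: np.ndarray, atlas: np.ndarray):
    """Average a 4D BOLD array (X, Y, Z, T) within each atlas region.

    atlas is an integer array of shape (X, Y, Z); label 0 is treated as
    background. Returns (time_series, labels) where time_series has shape
    (T, n_rois) and labels lists the region ids in column order.
    """
    if bold.shape[:3] != atlas.shape:
        raise ValueError("BOLD volume and atlas must share the same spatial grid")
    labels = np.unique(atlas)
    labels = labels[labels != 0]                  # drop background
    voxels = bold.reshape(-1, bold.shape[-1])     # (n_voxels, T)
    flat_atlas = atlas.reshape(-1)
    columns = [voxels[flat_atlas == lab].mean(axis=0) for lab in labels]
    return np.stack(columns, axis=1), labels


# Tiny synthetic example: 2 x 2 x 1 grid, 3 time points, two regions.
bold = np.zeros((2, 2, 1, 3))
bold[0, 0, 0] = [1.0, 2.0, 3.0]
bold[0, 1, 0] = [3.0, 4.0, 5.0]
bold[1, 0, 0] = [10.0, 10.0, 10.0]
atlas = np.array([[[1], [1]], [[2], [0]]])
ts, labels = roi_time_series(bold, atlas)
print(labels, ts.shape)
```

For real data, a neuroimaging library would normally handle file loading and resampling of the atlas to the BOLD grid before this averaging step.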

Human Connectome Project (HCP): High-resolution multimodal imaging across lifespan

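The HCP is an NIH initiative to map human brain connectivity with unprecedented detail. Three major lifespan datasets are available, all acquired at 2 × 2 × 2 mm³: HCP-Young Adult (HCP-YA; 1,206 subjects, TR = 720 ms), HCP-Aging (HCP-A; 722 subjects, TR = 800 ms), and HCP-Development (HCP-D; 632 subjects, TR = 800 ms). The overview table also lists HCP-Early Psychosis (HCP-EP; 176 subjects) with the same resolution and a TR of 800 ms.

All datasets include rich behavioral, cognitive, and demographic data. They are widely used for benchmarking machine learning, connectomics, and lifespan brain research.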

ADHD-200 Sample: Multi-site data for ADHD biomarker discovery (973 participants)

The ADHD-200 Sample is a multi-center open dataset for ADHD biomarker discovery. It consists of 973 children and adolescents (ages 7–21) from eight international imaging sites, including both participants with ADHD (combined, inattentive, and hyperactive-impulsive subtypes) and typically developing controls. Imaging includes resting-state fMRI (3 × 3 × 4 mm³, TR = 2 s) and T1-weighted MRI. Phenotypic data cover diagnosis, ADHD subtype, IQ, age, sex, and clinical symptoms. Preprocessing pipelines (Athena, NIAK, and others) are public, supporting motion correction, normalization, and ROI extraction. ADHD200 is widely used for benchmarking machine learning models for disease classification.

Autism Brain Imaging Data Exchange (ABIDE): 1,112 subjects (ASD and controls) across 17 sites

ABIDE collates resting-state fMRI and anatomical MRI from 1,112 subjects (539 with Autism Spectrum Disorder, 573 controls), ages 7–64, across 17 international sites. Imaging protocols are heterogeneous (typical: 3 × 3 × 3 mm³, TR = 2 s). Extensive phenotypic and clinical data are included, covering ASD diagnosis, IQ, and behavioral scales. Preprocessing (available from multiple pipelines) includes normalization, head motion correction, nuisance regression, registration, and ROI-based time series extraction. ABIDE is a benchmark for autism connectomics and machine learning-based disorder classification.
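Nuisance regression, one of the preprocessing steps listed above, removes variance explained by confound signals such as head-motion estimates. A minimal ordinary-least-squares sketch is shown below; it assumes ROI time series and a confound matrix are already available as arrays, and it is an illustration of the general technique rather than the procedure of any specific ABIDE pipeline. The function name `regress_out` is hypothetical.

```python
import numpy as np


def regress_out(time_series: np.ndarray, confounds: np.ndarray) -> np.ndarray:
    """Remove confound-explained variance from each ROI time series.

    time_series: (T, n_rois) array; confounds: (T, n_confounds) array.
    An intercept column is added, so the residuals are also mean-centered.
    """
    if len(time_series) != len(confounds):
        raise ValueError("time_series and confounds need the same number of time points")
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, time_series, rcond=None)
    return time_series - design @ beta


# Synthetic check: a signal built only from confounds plus an offset
# should be removed almost entirely.
rng = np.random.default_rng(0)
confounds = rng.normal(size=(50, 6))          # e.g. six motion parameters
signal = confounds @ rng.normal(size=(6, 4)) + 5.0
residual = regress_out(signal, confounds)
print(np.abs(residual).max())
```

Which confounds to include (motion parameters, white-matter and CSF signals, global signal) is a pipeline-level choice that this sketch leaves open.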

UCLA Consortium for Neuropsychiatric Phenomics: Multi-diagnostic sample (272 subjects)

The UCLA dataset comprises multimodal MRI and neuropsychological data for 272 adults (aged 21–50), including healthy controls and patients with schizophrenia, bipolar disorder, and ADHD. Resting-state and task-based fMRI (3 × 3 × 4 mm³, TR = 2 s) are included, with rich cognitive, behavioral, and clinical phenotype data. Imaging was acquired on Siemens Trio 3T scanners. Preprocessing includes motion correction, normalization, and ROI time series extraction. This dataset enables studies of transdiagnostic neural signatures and supports disease classification benchmarks.

Center for Biomedical Research Excellence (COBRE): Schizophrenia and controls (173 subjects)

COBRE provides MRI data for 89 schizophrenia patients and 84 healthy controls (aged 18–65), recruited at a single US site. Imaging includes rsfMRI (3.75 × 3.75 × 4.55 mm³, TR = 2 s), T1-weighted MRI, and clinical and behavioral measures. Preprocessing follows standard steps: motion correction, normalization, and ROI time series extraction. This dataset is widely used for machine learning classification of schizophrenia and for connectome analysis.

Philadelphia Neurodevelopmental Cohort (PNC): Youth cohort with deep phenotyping (1,445 scanned)

The PNC is a large-scale, community-based study of neurodevelopment and psychiatric risk in over 9,500 youths (ages 8–21) from the greater Philadelphia area. A deeply phenotyped imaging subsample (n = 1,445) provides multimodal MRI, including resting-state and task-based fMRI, structural MRI, DTI, and perfusion scans (ASL). All imaging was performed on a single Siemens TIM Trio 3T scanner using harmonized protocols. All participants received comprehensive computerized neurocognitive testing (CNB) and a structured psychiatric assessment (GOASSESS), with rich demographic, medical, and clinical data. The imaging sample is balanced by age, sex, and race, providing a unique resource for developmental and transdiagnostic psychopathology research.

Healthy Brain Network (HBN): Transdiagnostic developmental dataset (~3,900+ ongoing)

The Healthy Brain Network (HBN) is an ongoing open-data initiative from the Child Mind Institute, designed to capture a broad spectrum of childhood psychopathologies and typical development. Enrollment focuses on families where there are concerns about a child's mental health or learning challenges. HBN spans multiple scanning sites in the New York City area (Staten Island mobile 1.5T and several 3T facilities), offering resting-state fMRI, anatomical MRI, diffusion MRI, EEG, voice/video recordings, and extensive phenotypic data. The resting-state protocol and "naturalistic" movie-watching runs facilitate pediatric compliance and reduce head motion. Data are released with thorough quality control metrics and partially harmonized preprocessing.

REST-meta-MDD: Major depressive disorder multisite meta-analysis (2,428 subjects)

The REST-meta-MDD Project is the first initiative of the Depression Imaging Research Consortium (DIRECT), involving 25 resting-state fMRI (R-fMRI) cohorts from 17 hospitals in China. It comprises 2,428 participants (1,300 patients with major depressive disorder (MDD) and 1,128 normal controls), making it among the largest MDD resting-state fMRI datasets collected to date. Clinical and demographic data (e.g., first-episode drug-naïve MDD, recurrent MDD, medication status, illness duration, Hamilton Depression Rating Scale scores) are included, allowing investigation of pathophysiological mechanisms and potential biomarkers for diagnosis or treatment response.

Transdiagnostic Connectome Project (TCP): Multi-psychiatric diagnosis study (245 subjects)

The TCP dataset consists of 245 adults (aged 18–65) with a diverse range of psychiatric conditions (including mood, anxiety, and psychotic disorders), along with healthy controls, recruited at Yale and McLean (US). Resting-state fMRI (2 × 2 × 2 mm³, TR = 800 ms) is harmonized across sites using Siemens Prisma scanners. All participants undergo the same comprehensive psychiatric diagnostic interviews (DSM-5), cognitive battery, and clinical assessments. Preprocessing mirrors HCP pipelines, providing analysis-ready ROI-based functional connectivity and supporting transdiagnostic biomarker research.
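ROI-based functional connectivity, as referred to above, is commonly computed as the Pearson correlation between every pair of ROI time series. The sketch below illustrates that common definition with NumPy; it is not a description of the TCP release itself, and the function names, the optional Fisher z-transform, and the clipping constant are illustrative assumptions.

```python
import numpy as np


def functional_connectivity(time_series: np.ndarray, fisher_z: bool = False) -> np.ndarray:
    """Pearson correlation matrix between ROI time series.

    time_series has shape (T, n_rois). With fisher_z=True the correlations
    are mapped through arctanh after zeroing the diagonal and clipping
    values slightly inside (-1, 1) to keep the result finite.
    """
    fc = np.corrcoef(time_series, rowvar=False)
    if fisher_z:
        fc = fc.copy()
        np.fill_diagonal(fc, 0.0)
        fc = np.arctanh(np.clip(fc, -0.999999, 0.999999))
    return fc


def upper_triangle(fc: np.ndarray) -> np.ndarray:
    """Flatten the unique off-diagonal edges into a feature vector."""
    rows, cols = np.triu_indices(fc.shape[0], k=1)
    return fc[rows, cols]


# Three synthetic ROIs: the second tracks the first, the third mirrors it.
x = np.arange(10.0)
ts = np.column_stack([x, 2.0 * x + 1.0, -x])
fc = functional_connectivity(ts)
print(np.round(fc, 3))
print(upper_triangle(fc).shape)
```

The upper-triangle vector is a frequent input format for connectivity-based classifiers, since the matrix is symmetric and its diagonal carries no information.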