NeuroSTORM: A Deep Learning Platform for fMRI Analysis

1. Develop Deep-learning Methods with NeuroSTORM

  1. How to preprocess your own data with NeuroSTORM
    • Pre-processing: Make sure your data has been run through a primary preprocessing pipeline (such as FSL, fMRIPrep, or the HCP pipeline) and is aligned to MNI152 space. For brain extraction you can also use the provided shell script brain_extraction.sh, which is based on FSL BET and requires FSL to be installed. It writes brain mask files in .nii.gz format to the output directory.
    • Prepare 4D input: Use preprocessing_volume.py to preprocess your data for model input. This tool supports bulk dataset processing, including background removal, resampling, Z-normalization, and saving frames as .pt files. If CPU is limited, preprocess in advance; if disk speed is the bottleneck, online preprocessing during training is an option.
    • Prepare 2D input: Use generate_roi_data_from_nii.py to convert 3D/4D data to 2D ROI-based data using available brain atlases. Multiple datasets and atlases are supported.
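The two preparation steps above can be sketched with NumPy. This is a minimal illustration, not the code in preprocessing_volume.py or generate_roi_data_from_nii.py: the function names, the global (whole-brain) normalization, and the convention that atlas label 0 is background are all assumptions made for the example.

```python
import numpy as np


def znormalize_4d(volume, mask=None, eps=1e-8):
    """Z-normalize a 4D array (X, Y, Z, T) over brain voxels; background becomes 0.

    Assumption: one global mean/std over all in-mask values.
    """
    volume = np.asarray(volume, dtype=np.float32)
    if mask is None:
        # Treat voxels that are zero at every time point as background.
        mask = np.any(volume != 0, axis=-1)
    mask = np.asarray(mask, dtype=bool)
    brain = volume[mask]  # shape: (n_brain_voxels, T)
    mean, std = brain.mean(), brain.std()
    out = np.zeros_like(volume)
    out[mask] = (brain - mean) / (std + eps)
    return out


def roi_time_series(volume, atlas):
    """Average the voxels of each atlas label into a 2D array (n_rois, T).

    `atlas` is an integer 3D label map on the same grid as `volume`;
    label 0 is assumed to mean background and is skipped.
    """
    volume = np.asarray(volume, dtype=np.float32)
    atlas = np.asarray(atlas)
    labels = np.unique(atlas)
    labels = labels[labels != 0]
    return np.stack([volume[atlas == lab].mean(axis=0) for lab in labels])
```

Calling `znormalize_4d` on a volume and then `roi_time_series` with a matching label map yields one row per ROI and one column per time point, which is the shape a 2D ROI-based model would consume.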
  2. How to run existing methods on supported datasets
    You can use our prepared scripts to quickly reproduce the experiments from the paper: scripts
  3. How to add new methods
  4. How to adapt NeuroSTORM to new datasets
  5. How to add new tasks
    • Define the dataset label format in the make_subject_dict function from data_module.py
    • Set the task type by specifying --downstream_task in the script.
    • Choose a classification or regression head. For custom tasks, add a new head definition in models/heads
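As a rough illustration of the first step, a label dictionary for a new task could be built as follows. The helper below is hypothetical and is not the make_subject_dict function from data_module.py, whose actual signature and label format may differ; the CSV layout and column names are assumptions made for the example.

```python
import csv
import io


def build_subject_labels(csv_text, id_col, label_col, task_type):
    """Map subject ID -> label from CSV text (hypothetical helper).

    classification: string labels become integer class indices (sorted order).
    regression: labels are parsed as floats.
    Rows with an empty label are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [r for r in reader if r[label_col].strip()]
    if task_type == "classification":
        classes = sorted({r[label_col].strip() for r in rows})
        class_to_idx = {name: i for i, name in enumerate(classes)}
        return {r[id_col]: class_to_idx[r[label_col].strip()] for r in rows}
    if task_type == "regression":
        return {r[id_col]: float(r[label_col]) for r in rows}
    raise ValueError(f"unknown task_type: {task_type!r}")
```

Keeping the label encoding in one place like this makes it easier to match the labels to the classification or regression head chosen in the last step.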
  6. Related Links

2. Supported Datasets

NeuroSTORM supports a wide range of publicly available fMRI datasets for both pre-training and downstream analysis. The table below summarizes key characteristics of each dataset: number of subjects, male/female counts, spatial resolution, TR, and official homepage.

| Dataset Name (Abbreviation) | Subjects (Male / Female) | Spatial Resolution | TR (ms) | Homepage |
|---|---|---|---|---|
| UK Biobank (UKB) | 40,842 (17,720 / 23,122) | 2.4 × 2.4 × 2.4 mm³ | 735 | Link |
| Adolescent Brain Cognitive Development (ABCD) | 9,448 (4,931 / 4,517) | 2.4 × 2.4 × 2.4 mm³ | 800 | Link |
| Human Connectome Project - Young Adult (HCP-YA) | 1,206 (550 / 656) | 2 × 2 × 2 mm³ | 720 | Link |
| Human Connectome Project - Aging (HCP-A) | 725 (319 / 406) | 2 × 2 × 2 mm³ | 800 | Link |
| Human Connectome Project - Development (HCP-D) | 652 (301 / 351) | 2 × 2 × 2 mm³ | 800 | Link |
| Human Connectome Project - Early Psychosis (HCP-EP) | 252 (94 / 158) | 2 × 2 × 2 mm³ | 800 | Link |
| ADHD-200 Sample (ADHD200) | 973 (600 / 373) | 3 × 3 × 4 mm³ | 2000 | Link |
| Autism Brain Imaging Data Exchange (ABIDE) | 1,112 (948 / 164) | 3 × 3 × 3 mm³ | 2000 | Link |
| UCLA Consortium for Neuropsychiatric Phenomics (UCLA) | 272 (222 / 50) | 3 × 3 × 4 mm³ | 2000 | Link |
| Center for Biomedical Research Excellence (COBRE) | 173 (130 / 43) | 3.75 × 3.75 × 4.55 mm³ | 2000 | Link |
| Motor Neuron Disease fMRI Dataset (MND) | 59 (44 / 15) | 2.395 × 2.395 × 2.4 mm³ | 2000 | -- |
| Transdiagnostic Connectome Project (TCP) | 245 (143 / 102) | 2 × 2 × 2 mm³ | 800 | Link |
| Healthy Brain Network (HBN) | ~3,900+ (ongoing enrollment) | Varies by site (commonly ~2.4–2.5 mm isotropic) | ~800–1450 | Link |
| Philadelphia Neurodevelopmental Cohort (PNC) | 1,445 (~701 / ~744) | 3 × 3 × 3 mm³ | 3000 | Link |
| REST-meta-MDD | 2,428 (1,300 MDD / 1,128 NC) | Varies across 25 cohorts | Varies | Link |
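The TR column determines the temporal sampling of each dataset. The short sketch below shows the arithmetic for converting a TR into a sampling rate, a scan duration, or a number of volumes; the function names are illustrative and not part of NeuroSTORM.

```python
def sampling_rate_hz(tr_ms):
    """Volumes acquired per second for a repetition time given in milliseconds."""
    return 1000.0 / tr_ms


def scan_duration_s(n_volumes, tr_ms):
    """Total scan time in seconds for a run of n_volumes."""
    return n_volumes * tr_ms / 1000.0


def n_volumes(duration_s, tr_ms):
    """Number of complete volumes that fit into a scan of duration_s seconds."""
    return int(duration_s * 1000 // tr_ms)
```

For example, a resting-state run of 124 volumes at TR = 3000 ms (the PNC protocol described below) lasts `scan_duration_s(124, 3000)` = 372 seconds, while a TR of 800 ms corresponds to 1.25 volumes per second.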

Dataset Descriptions

    UK Biobank (UKB): A large-scale prospective study from the UK containing health, genetic, and neuroimaging data of over 40,000 middle-aged participants. fMRI is acquired at 2.4mm isotropic resolution (TR=735ms).

    The UK Biobank (UKB) is one of the world's largest population-based health resource projects, comprising extensive genetic, clinical, lifestyle, and imaging data from over 500,000 participants, of which more than 40,000 have multimodal brain MRI—including both resting-state and task-based fMRI scans. Initiated between 2006 and 2010, UKB focuses on adults aged 40–69, with repeated imaging on a subset, enabling longitudinal analyses. fMRI data are acquired on Siemens Skyra 3T scanners, with a resolution of 2.4×2.4×2.4 mm³, and a fast TR of 735ms. The dataset includes both rsfMRI (6 min scan) and tfMRI (motor, emotion, social, gambling, and relational tasks), plus comprehensive demographic, cognitive, and health phenotypes. Standardized preprocessing pipelines (including motion correction, ICA-FIX denoising, and registration to MNI152) are publicly available. UKB is widely used for large-scale brain-behavior association studies, disease risk modeling, and neuroimaging foundation model pre-training.

    Dataset Name: UK Biobank (UKB)
    fMRI Types: rsfMRI and tfMRI
    tfMRI Tasks: Emotion, Gambling, Motor, Relational, Social
    Age Range: 40–69 years
    Gender Ratio: ~46% male, ~54% female
    Patient/Control: Population-based (includes healthy and various disease cases)
    Disease Types: Various (not a patient cohort, but disease info available)
    Used in Tasks: Pre-training, Task 1 (age/gender prediction)
    Sample Size: ~40,842 with fMRI

    Adolescent Brain Cognitive Development (ABCD): Longitudinal neuroimaging of ~9,500 children in the US, with multimodal data and fMRI at 2.4 mm isotropic resolution (TR=800ms).

    The ABCD Study is the largest long-term study of brain development and child/adolescent health in the US. It follows over 11,800 children (aged 9–10 at baseline) through adolescence, with repeated multimodal MRI, cognitive, behavioral, genetic, and environmental data collection. Neuroimaging includes high-resolution rsfMRI (2.4 mm isotropic, TR=800ms) and tfMRI (emotional, reward, cognitive tasks), harmonized across 21 sites and 3 major scanner vendors. Imaging pipelines are derived from HCP preprocessing (motion correction, normalization, artifact removal), with ROI time-series available (e.g., Schaefer atlas). ABCD supports studies of typical development, neuropsychiatric risk, and gene–brain–behavior relationships.

    Dataset Name: ABCD
    fMRI Types: rsfMRI and tfMRI
    tfMRI Tasks: Emotional n-back, Reward, Stop-signal, Monetary Incentive Delay
    Age Range: 9–13 years (at latest release)
    Gender Ratio: ~52% male, ~48% female
    Patient/Control: Community sample (includes healthy and at-risk youth)
    Disease Types: Not specific, but behavioral/clinical phenotypes available
    Used in Tasks: Pre-training, Task 1 (age/gender)
    Sample Size: ~9,448 with fMRI

    Human Connectome Project – Young Adult (HCP-YA), Aging (HCP-A), Development (HCP-D): Three high-resolution (2 mm isotropic) public datasets for mapping brain structure and function across the lifespan (young adults, older adults, and children/adolescents).

    The HCP is an NIH initiative to map human brain connectivity with unprecedented detail. Three major lifespan datasets are:

    • HCP-YA: 1,206 healthy young adults (22–37y), scanned at 3T and 7T. Imaging includes rsfMRI (2 mm isotropic, TR=720ms; 1 hour per subject), tfMRI (7 tasks: working memory, emotion, language, motor, gambling, relational, social), and dMRI. The data are extensively preprocessed: motion correction, ICA-FIX, MNI registration, surface/volumetric data.
    • HCP-A: 725 adults aged 36–100, scanned with similar MRI protocols. Focused on typical aging and age-related brain changes.
    • HCP-D: 652 children and adolescents (ages 5–21), scanned with harmonized imaging protocols to enable developmental connectomics.
    All datasets include rich behavioral, cognitive, and demographic data. They are widely used for benchmarking machine learning, connectomics, and lifespan brain research.

    Dataset Name: HCP-YA, HCP-A, HCP-D
    fMRI Types: rsfMRI and tfMRI
    tfMRI Tasks: Emotion, Gambling, Language, Motor, Relational, Social, Working Memory
    Age Range: HCP-YA: 22–37; HCP-A: 36–100; HCP-D: 5–21
    Gender Ratio: ~47% male, ~53% female (YA); similar balance in others
    Patient/Control: Healthy volunteers
    Disease Types: None (controls only)
    Used in Tasks: Pre-training, Task 1 (age/gender), Task 2 (phenotype), Task 5 (tfMRI state classification)
    Sample Size: HCP-YA: 1,206; HCP-A: 725; HCP-D: 652

    Human Connectome Project – Early Psychosis (HCP-EP): fMRI/clinical data for early psychosis research (252 subjects), 2mm, TR=800ms.

    The HCP-EP dataset focuses on individuals in the early phases (within 5 years) of psychotic disorders, including both affective and non-affective psychoses, and matched healthy controls. Participants (ages 16–35) are clinically characterized, with rsfMRI (2 mm isotropic, TR=800ms) and full neurocognitive/clinical assessments. Imaging is harmonized with HCP-Lifespan protocols (motion correction, ICA-FIX, MNI). The dataset supports studies of biomarkers and network changes in schizophrenia spectrum disorders and is a benchmark for disease diagnosis tasks.

    Dataset Name: HCP-EP
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: 16–35 years
    Gender Ratio: ~42% male, ~58% female
    Patient/Control: Patients and controls
    Disease Types: Early psychosis (schizophrenia, schizoaffective, bipolar with psychosis)
    Used in Tasks: Task 3 (disease diagnosis)
    Sample Size: 252 (57 affective psychosis, 127 non-affective psychosis, 68 controls)

    ADHD-200 Sample (ADHD200): Multi-site data of 973 children/adolescents, 3×3×4mm, TR=2000ms, focused on ADHD diagnosis.

    The ADHD-200 Sample is a multi-center open dataset for ADHD biomarker discovery. It consists of 973 children and adolescents (ages 7–21) from 8 US and 4 Chinese sites, including both ADHD (combined, inattentive, hyperactive-impulsive) and typically developing controls. Imaging includes resting-state fMRI (3×3×4 mm³, TR=2s) and T1-weighted MRI. Phenotypic data covers diagnosis, ADHD subtype, IQ, age, sex, and clinical symptoms. Preprocessing pipelines (Athena, NIAK, others) are public, supporting motion correction, normalization, and ROI extraction. ADHD200 is widely used for benchmarking machine learning models for disease classification.

    Dataset Name: ADHD-200
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: 7–21 years
    Gender Ratio: ~73% male, ~27% female
    Patient/Control: ADHD patients and controls
    Disease Types: Attention-Deficit/Hyperactivity Disorder (ADHD)
    Used in Tasks: Task 3 (disease diagnosis)
    Sample Size: 973 (362 ADHD, 611 controls)

    Autism Brain Imaging Data Exchange (ABIDE): Aggregated from 17 sites, 1,112 subjects (948 males, 164 females) for ASD studies, 3mm, TR=2000ms.

    ABIDE collates resting-state fMRI and anatomical MRI from 1,112 subjects (539 with Autism Spectrum Disorder, 573 controls), ages 7–64, across 17 international sites. Imaging protocols are heterogeneous (typical: 3 mm isotropic, TR=2s). Extensive phenotypic/clinical data are included, covering ASD diagnosis, IQ, and behavioral scales. Preprocessing (multiple pipelines) includes normalization, head motion correction, nuisance regression, registration, and ROI-based time series extraction. ABIDE is a benchmark for autism connectomics and machine learning-based disorder classification.

    Dataset Name: ABIDE
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: 7–64 years
    Gender Ratio: ~85% male, ~15% female
    Patient/Control: ASD patients and controls
    Disease Types: Autism Spectrum Disorder (ASD)
    Used in Tasks: Task 3 (disease diagnosis)
    Sample Size: 1,112 (539 ASD, 573 controls)

    UCLA Consortium for Neuropsychiatric Phenomics (UCLA): 272 subjects (multi-diagnostic), 3×3×4mm, TR=2000ms.

    The UCLA dataset comprises multimodal MRI and neuropsychological data for 272 adults (aged 21–50), including healthy controls and patients with schizophrenia, bipolar disorder, and ADHD. Resting-state and task-based fMRI (3×3×4 mm³, TR=2s) are included, with rich cognitive, behavioral, and clinical phenotype data. Imaging was acquired on Siemens Trio 3T scanners. Preprocessing includes motion correction, normalization, and ROI time series extraction. This dataset enables studies of transdiagnostic neural signatures and supports disease classification benchmarks.

    Dataset Name: UCLA Phenomics
    fMRI Types: rsfMRI and tfMRI
    tfMRI Tasks: Sternberg, Stroop, Stop-signal, Task-switching
    Age Range: 21–50 years
    Gender Ratio: ~46% male, ~54% female
    Patient/Control: Patients and controls
    Disease Types: Schizophrenia, Bipolar Disorder, ADHD
    Used in Tasks: Task 3 (disease diagnosis)
    Sample Size: 272 (130 healthy, 72 schizophrenia, 35 bipolar, 35 ADHD)

    Center for Biomedical Research Excellence (COBRE): 173 subjects (schizophrenia and controls), 3.75×3.75×4.55mm, TR=2000ms.

    COBRE provides MRI data for 89 schizophrenia patients and 84 healthy controls (aged 18–65), recruited at a single US site. Imaging includes rsfMRI (3.75×3.75×4.55 mm³, TR=2s), T1 MRI, and clinical/behavioral measures. Preprocessing follows standard steps: motion correction, normalization, ROI time series extraction. This dataset is widely used for machine learning classification of schizophrenia and connectome analysis.

    Dataset Name: COBRE
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: 18–65 years
    Gender Ratio: ~70% male, ~30% female
    Patient/Control: Schizophrenia patients and controls
    Disease Types: Schizophrenia
    Used in Tasks: Task 3 (disease diagnosis)
    Sample Size: 173 (89 patients, 84 controls)

    Motor Neuron Disease fMRI Dataset (MND): 59 participants (ALS and controls), 2.4mm, TR=2000ms, collected in Australia.

    The MND dataset features anatomical and resting-state fMRI (2.395×2.395×2.4mm³, TR=2s) from 59 subjects (36 with Amyotrophic Lateral Sclerosis—ALS, 23 controls), acquired at Herston Imaging Research Facility in Australia using Siemens Prisma 3T scanners. Detailed motor, cognitive, and clinical characterization is included. Imaging data are preprocessed (motion correction, normalization). This dataset is suitable for studying motor system degeneration and machine learning-based diagnosis.

    Dataset Name: MND (Motor Neuron Disease)
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: Mean ~57 years (range 30–80)
    Gender Ratio: 44 male, 15 female
    Patient/Control: ALS patients and controls
    Disease Types: Amyotrophic Lateral Sclerosis (ALS)
    Used in Tasks: Task 3 (disease diagnosis)
    Sample Size: 59 (36 ALS, 23 controls)

    Transdiagnostic Connectome Project (TCP): 245 subjects with multiple psychiatric diagnoses (2mm, TR=800ms), harmonized imaging.

    The TCP dataset consists of 245 adults (aged 18–65) with a diverse range of psychiatric conditions (including mood, anxiety, and psychotic disorders), along with healthy controls, recruited at Yale and McLean (US). Resting-state fMRI (2 mm isotropic, TR=800ms) is harmonized across sites using Siemens Prisma scanners. All participants undergo the same comprehensive psychiatric diagnostic interviews (DSM-5), cognitive battery, and clinical assessments. Preprocessing mirrors HCP pipelines (motion correction, ICA-FIX, MNI registration, global signal regression), providing analysis-ready ROI-based functional connectivity and supporting transdiagnostic biomarker research.

    Dataset Name: TCP (Transdiagnostic Connectome Project)
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: 18–65 years
    Gender Ratio: ~54% female, ~46% male
    Patient/Control: Mixed: patients (multiple psychiatric diagnoses) and controls
    Disease Types: Major depressive disorder, generalized anxiety, bipolar, psychotic disorders, etc.
    Used in Tasks: Task 2 (phenotype prediction), Task 3 (disease diagnosis)
    Sample Size: 245

    Healthy Brain Network (HBN): A large-scale, transdiagnostic developmental dataset aiming for 10,000 children/adolescents (ages 5–21), with multimodal MRI and extensive clinical/behavioral measures.

    The Healthy Brain Network (HBN) is an ongoing open-data initiative from the Child Mind Institute, designed to capture a broad spectrum of childhood psychopathologies and typical development. Enrollment focuses on families where there are concerns about a child’s mental health or learning challenges, leading to high representation of clinical populations (e.g., ADHD, mood/anxiety disorders), though any child meeting basic inclusion criteria may participate. HBN spans multiple scanning sites in the New York City area (Staten Island mobile 1.5T and several 3T facilities), offering resting-state fMRI, anatomical MRI, diffusion MRI, EEG, voice/video recordings, and extensive phenotypic data. The resting-state protocol and “naturalistic” movie-watching runs (e.g., “Despicable Me,” “The Present”) facilitate pediatric compliance and reduce head motion. Data are released with thorough quality control metrics (framewise displacement, QAP) and partially harmonized preprocessing.

    Dataset Name: Healthy Brain Network (HBN)
    fMRI Types: rsfMRI (plus naturalistic viewing fMRI)
    tfMRI Tasks: Movie-watching scans (e.g., “Despicable Me,” “The Present”)
    Age Range: 5–21 years
    Gender Ratio: Mixed (ongoing enrollment)
    Patient/Control: Transdiagnostic sample (clinical concerns + typically developing)
    Disease Types: Various childhood conditions (ADHD, mood, anxiety, etc.)
    Used in Tasks: Task 2 (phenotype prediction), Task 3 (disease diagnosis)
    Sample Size: ~3,900 to date (aiming for 10,000)

    Philadelphia Neurodevelopmental Cohort (PNC): A population-based youth cohort (age 8–21) with deep phenotyping, neurocognitive assessment, and multimodal MRI (1,445 scanned).

    The PNC is a large-scale, community-based study of neurodevelopment and psychiatric risk in over 9,500 youths (ages 8–21) from the greater Philadelphia area, with a deeply phenotyped imaging subsample (n=1,445) providing multimodal MRI, including resting-state and task-based fMRI, structural MRI, DTI, and perfusion scans (ASL). All imaging was performed on a single Siemens TIM Trio 3T scanner using harmonized protocols (rest fMRI: 3mm isotropic, TR=3s; 124 volumes; n-back/Emotion tasks: identical parameters).

    All participants received comprehensive computerized neurocognitive testing (CNB) and structured psychiatric assessment (GOASSESS, adapted K-SADS), with parent and self-report for children/adolescents, and rich demographic, medical, and clinical data. The imaging sample is balanced by age, sex, and race, providing a unique resource for developmental, cognitive, and transdiagnostic psychopathology research. Data are publicly available via dbGaP and the PNC data portal.

    Dataset Name: Philadelphia Neurodevelopmental Cohort (PNC)
    fMRI Types: rsfMRI, tfMRI (n-back, emotion ID)
    tfMRI Tasks: Fractal n-back (working memory), Emotion Identification
    Age Range: 8–21 years
    Gender Ratio: ~48.5% male, ~51.5% female
    Patient/Control: Population-based (includes healthy and various clinical groups)
    Disease Types: Transdiagnostic: ADHD, mood, anxiety, psychosis-risk, conduct, etc.
    Used in Tasks: Developmental/phenotype prediction, cognitive modeling, disease diagnosis
    Sample Size: 1,445 with MRI (of 9,428 total assessed)
    Homepage: Link

    REST-meta-MDD: Multisite resting-state fMRI project from the DIRECT consortium investigating major depressive disorder across China.

    The REST-meta-MDD Project is the first initiative of the Depression Imaging Research Consortium (DIRECT), involving 25 R-fMRI cohorts from 17 hospitals in China. It comprises 2,428 participants—1,300 patients with major depressive disorder (MDD) and 1,128 normal controls (NCs)—making it among the largest MDD resting-state fMRI datasets collected to date. Each site applied a standardized DPARSF-based preprocessing pipeline locally before sharing final resting-state metrics (e.g., region-wise functional connectivity) and necessary phenotypic data. The project aims to address concerns about low statistical power in smaller MDD studies and to reduce analytic flexibility across different centers by promoting uniform preprocessing. Clinical and demographic data (e.g., first-episode drug-naïve MDD, recurrent MDD, medication status, illness duration, Hamilton Depression Rating Scale scores) are included, allowing the investigation of pathophysiological mechanisms and potential biomarkers for diagnosis or treatment response. Data and code are openly shared to encourage replication, secondary analyses, and new discoveries in MDD research.

    Dataset Name: REST-meta-MDD
    fMRI Types: rsfMRI
    tfMRI Tasks: None
    Age Range: Primarily 18–65 years
    Gender Ratio: Varies by cohort; overall ~826 female, 474 male in MDD group
    Patient/Control: MDD patients and normal controls
    Disease Types: Major Depressive Disorder
    Used in Tasks: Potentially Task 3 (disease diagnosis), research on biomarkers
    Sample Size: 2,428 (1,300 MDD, 1,128 NC)

Figure (sunburst plot): Dataset and sex distribution overview.
Figure (Sankey diagram): Sample composition in NeuroSTORM datasets.

3. Supported Tasks

1. Age and Gender Prediction

Task Description:
This task assesses the ability of models to predict basic demographic variables—chronological age and biological sex—using only resting-state fMRI (rsfMRI) sequences as input. The input is a preprocessed 4D rsfMRI sequence for each subject. The output is either a continuous age value (for regression) or a categorical label (male/female, for classification). Age and sex are fundamental variables correlated with brain structure and function, and accurate prediction indicates that brain representations capture meaningful demographic information. This task is widely used as a baseline for evaluating the generalizability and biological relevance of neural representations in fMRI analysis.

How to use in NeuroSTORM
  • Sex classification (reported):
    --downstream_task_id 1 --downstream_task_type classification --task_name sex
  • Age prediction (regression):
    --downstream_task_id 1 --downstream_task_type regression --task_name age
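To make the flag combinations concrete, the sketch below parses the three options with Python's argparse and checks that the task type matches the combinations shown in this document. It is an illustration only: NeuroSTORM's own argument parser may define these options differently, and `build_parser`, `check_task`, and `EXAMPLE_TASK_TYPES` are names invented for the example.

```python
import argparse

# Task id -> task types used in this document's examples.
# Task 4 (fMRI retrieval) is omitted because its support is not yet released.
EXAMPLE_TASK_TYPES = {
    1: {"classification", "regression"},  # sex / age
    2: {"regression"},                    # phenotype score
    3: {"classification"},                # diagnosis
    5: {"classification"},                # tfMRI state
}


def build_parser():
    parser = argparse.ArgumentParser(description="Downstream task options (illustrative)")
    parser.add_argument("--downstream_task_id", type=int,
                        choices=sorted(EXAMPLE_TASK_TYPES), required=True)
    parser.add_argument("--downstream_task_type",
                        choices=["classification", "regression"], required=True)
    parser.add_argument("--task_name", required=True)
    return parser


def check_task(args):
    """Raise ValueError if the task type does not fit the task id."""
    allowed = EXAMPLE_TASK_TYPES[args.downstream_task_id]
    if args.downstream_task_type not in allowed:
        raise ValueError(
            f"task {args.downstream_task_id} expects one of {sorted(allowed)}, "
            f"got {args.downstream_task_type!r}"
        )
    return args
```

A quick check such as `check_task(build_parser().parse_args())` at the top of a launch script would catch mismatched flags before any data is loaded.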

2. Phenotype Prediction

Task Description:
This task involves predicting quantitative or categorical phenotypic scores (such as cognitive, behavioral, or clinical measurements) from fMRI data. The input is a preprocessed fMRI sequence (rsfMRI or tfMRI) and the output is a continuous score (for regression) corresponding to the target phenotype. Example outputs include MMSE scores, PANSS scores, DASS measures, and other clinical or cognitive test results. This task provides a direct evaluation of how well neural representations capture individual differences in brain function related to cognition, emotion, or disease traits, and is critical for developing clinically useful neuroimaging biomarkers.

How to use in NeuroSTORM
  • Phenotype regression:
    --downstream_task_id 2 --downstream_task_type regression --task_name your_score_name
    Replace your_score_name with the phenotype to predict (e.g., MMSE, PANSS_Positive).
Figure: Distribution of selected phenotype scores in the HCP-YA dataset.
Figure: Distribution of representative phenotype scores in the TCP dataset.

3. Disease Diagnosis

Task Description:
This task requires the model to assign each subject to a diagnostic category (e.g., healthy control, ADHD, schizophrenia, autism) based on their resting-state or task-based fMRI data. The input is a preprocessed fMRI sequence for each subject. The output is a categorical disease label. Disease diagnosis tasks are central for translational neuroimaging, as they test the capacity of models to extract pathological signatures from brain activity and are directly relevant to clinical decision support and biomarker discovery.

How to use in NeuroSTORM
  • Disease classification:
    --downstream_task_id 3 --downstream_task_type classification --task_name diagnosis

4. fMRI Retrieval

Task Description:
This task evaluates the ability of models to align fMRI activation patterns with external semantic information, such as images or textual descriptions. The input can be an fMRI sequence (e.g., recorded while viewing images) or a semantic embedding (e.g., image features). The output is the retrieval of matching semantic content given fMRI input, or vice versa. This task is foundational for brain decoding, neural representation alignment, and cross-modal retrieval, and is crucial for understanding how brain activity encodes semantic information.

Experimental Procedure:
The fMRI retrieval experiment proceeds as follows:

  1. Test Sample Selection: Randomly select 300 test samples from the NSD dataset. Each sample includes an fMRI sequence and its associated natural image.
  2. Compute Image Embeddings: For each image, compute its CLIP embedding to obtain a semantic feature representation.
  3. Candidate Pool Creation: For each test image, use its CLIP embedding to query the LAION-5B dataset, retrieving the top 16 most similar images as candidates.
  4. fMRI Embedding Extraction: Process each test fMRI sequence with the analysis model to obtain its embedding in the same feature space as CLIP.
  5. Similarity Computation and Retrieval:
    • Compute cosine similarity between each fMRI embedding and the 16 candidate CLIP image embeddings. Select the image with the highest similarity as the retrieval result (brain-to-image retrieval).
    • For image-to-brain retrieval, use the image's CLIP embedding to retrieve the matching fMRI embedding from the pool of fMRI embeddings.
  6. Evaluation: Calculate top-1, top-3, and top-5 retrieval accuracy. Repeat the retrieval process multiple times to ensure statistical robustness.
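Steps 5 and 6 of the procedure can be sketched in plain Python as follows. This is a simplified illustration, not the released NeuroSTORM implementation: it assumes each candidate pool contains the ground-truth image embedding at a known index, and the function names are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(fmri_emb, candidate_embs):
    """Return candidate indices sorted by decreasing similarity to the fMRI embedding."""
    sims = [cosine(fmri_emb, c) for c in candidate_embs]
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)


def top_k_accuracy(fmri_embs, candidate_pools, true_indices, k):
    """Fraction of test samples whose true candidate is ranked within the top k."""
    hits = 0
    for emb, pool, true_idx in zip(fmri_embs, candidate_pools, true_indices):
        if true_idx in rank_candidates(emb, pool)[:k]:
            hits += 1
    return hits / len(fmri_embs)
```

With 16 candidates per test sample, calling `top_k_accuracy` with k set to 1, 3, and 5 gives the three reported metrics. Image-to-brain retrieval follows the same pattern with the roles of the fMRI and image embeddings swapped.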

How to use in NeuroSTORM

Support for this task in NeuroSTORM will be released soon.

5. Task-based fMRI State Classification

Task Description:
This task involves classifying which cognitive state or task condition a subject is in, based on their task-based fMRI (tfMRI) sequence. The input is a preprocessed tfMRI sequence corresponding to a specific cognitive experiment (e.g., language, emotion, gambling task). The output is a label indicating the task condition or cognitive state. Accurate state classification demonstrates that the model captures functionally relevant brain activation patterns, and this task is a key benchmark for evaluating generalization and sensitivity to cognitive manipulations in fMRI analysis.

How to use in NeuroSTORM
  • State classification:
    --downstream_task_id 5 --downstream_task_type classification --task_name state_classification