Reorganizing imaging sourcedata

Nipoppy distinguishes between “pre-” and “post-reorg” status prior to BIDS conversion. Every neuroimaging unit organizes the original storage of their sourcedata differently and even within units, sourcedata is often messy in different ways, making it hard to use BIDSification tools directly. Often we find that there is a gap between state of the out-of-scanner vs. ready-for-BIDSification data. Nipoppy breaks this process down and provides a unified way to deal with sourcedata which simplifies BIDSification.

Sourcedata in Nipoppy

The Nipoppy directory tree for the imaging sourcedata looks like the following:

├── imaging
│   ├── curation_status.tsv
│   ├── downloads
│   ├── pre_reorg
│   ├── post_reorg
│   └── README.md

Example content for the imaging sourcedata directories:

downloads

├── downloads
│   ├── example.zip
│   └── README.md

or

├── downloads
│   ├── example.tar
│   └── README.md

or

├── downloads
│   ├── example.tar.gz
│   └── README.md

pre-reorg

├── pre_reorg
│   ├── 01
│   │   ├── 1
│   │   │   ├── protocol_1/
│   │   │   │   ├── 001.dcm
│   │   │   │   ├── …
│   │   │   │   └── 084.dcm
│   │   │   ├── protocol_2/
│   │   │   │   ├── 100.dcm
│   │   │   │   ├── …
│   │   │   │   └── 184.dcm
│   ├── 02
│   │   ├── 1
│   │   │   ├── protocol_1/
│   │   │   │   ├── 001.dcm
│   │   │   │   ├── …
│   │   │   │   └── 084.dcm
│   │   │   ├── protocol_2/
│   │   │   │   ├── 100.dcm
│   │   │   │   ├── …
│   │   │   │   └── 184.dcm
│   └── README.md

post-reorg

├── post_reorg
│   ├── sub-01
│   │   ├── ses-1
│   │   │   ├── 001.dcm
│   │   │   ├── …
│   │   │   ├── 084.dcm
│   │   │   ├── 100.dcm
│   │   │   ├── …
│   │   │   ├── 184.dcm
│   ├── sub-02
│   │   ├── ses-1
│   │   │   ├── 001.dcm
│   │   │   ├── …
│   │   │   ├── 084.dcm
│   │   │   ├── 100.dcm
│   │   │   ├── …
│   │   │   ├── 184.dcm
│   └── README.md

# flat list of dicom files and
sub- and ses- prefix added

nipoppy reorg

Requirements for running nipoppy reorg

  • uncompressed DICOM sourcedata in <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg. You can check this by running

$ nipoppy status --dataset <NIPOPPY_PROJECT_ROOT>

The output should list the number of participants that are present in both the manifest.tsv file and in <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg but are not in <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/post_reorg yet, according to the curation status file. These are the participants and session Nipoppy will loop over when running nipoppy reorg. If you feel like the information of nipoppy status is outdated, you can run nipoppy track-curation --regenerate to update.

  • file organization in <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg must follow a subject-first, session-second manner (see next section for help if this is not the case for your data)

  • the manifest.tsv file must list all participants and sessions

  • subject folder and session folder are not allowed to have the BIDS-specific prefixes: subject and session folders need to be named exactly as indicated in the manifest.tsv file

If all of these requirements are satisfied, you can run

$ nipoppy reorg --dataset <NIPOPPY_PROJECT_ROOT>

For each participant-session pair, Nipoppy

  • “copies” (the default is to create symlinks) files from the <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg directory to the <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/post_reorg directory into a flat list

  • adds a sub- prefix to all participant folders and a ses- prefix to all session folders

You can check the successful reorganization in the curation status file or simply by running

$ nipoppy status --dataset <NIPOPPY_PROJECT_ROOT>

Customizing the nipoppy reorg behavior

If the file organization in <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg does not follow a subject-first, session-second manner but vice versa, you can simply set "DICOM_DIR_PARTICIPANT_FIRST" to "false" in the global configuration file.

If the raw imaging data are not organized in any of these two structures, a custom tab-separated file can be created to map each unique participant-session pair to a directory path (relative to <NIPOPPY_PROJECT_ROOT>/sourcedata/imaging/pre_reorg). This path to this mapping file must be specified in the "DICOM_DIR_MAP_FILE" in the global configuration file. See the schema reference for more information.

Here is an example file for a dataset that already uses the ses- prefix for sessions:

participant_id

session_id

participant_dicom_dir

01

1

01/ses-1

01

2

01/ses-2

02

1

02/ses-1