Download Datasets

From Network for Advanced NMR
Revision as of 17:21, 27 May 2025

Download

Datasets may be downloaded by selecting the Download action from the context menu, which is accessed by right-clicking on a selected dataset (or set of datasets).

Users may choose between two download formats:

  • Organized for TopSpin
  • Organized by Experiment

The selected datasets are packaged into a zip file and downloaded through your browser to the local Downloads folder.

All files are packaged, including supplemental data and any contents from the post-acquisition directory.

An `experiments.csv` file is placed in the root of the zip archive. It lists the downloaded experiments in the following format:

Path | Display Name | Dataset Name | Facility | Spectrometer | Field | State | Pulse Program | # dims | # dims collected | direct nuclei | nuclei | Temp | Classification | Sample | Date | NAN User | PI | Workstation User | UUID
NMR800-NEO/polx/5 | HSQC | polx/5 | Mullen | NMR800-NEO | 800 | solution | hsqcetf3gpsi | 2 | 2 | 1H | 1H,15N | 298 | test | polx | 2025-01-24T00:06:36-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d88512
NMR600-NEO/ubiquitin/9 | 1D 1H | ubiquitin/9 | Mullen | NMR600-NEO | 600 | solution | zgpr | 1 | 1 | 1H | 1H | 298 | calibration | ubiquitin | 2025-02-24T00:06:36-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d88518
NMR600-NEO/ubiquitin/9 | 1D 1H | ubiquitin/9 | Mullen | NMR600-NEO | 600 | solution | zgpr | 1 | 1 | 1H | 1H | 298 | calibration | ubiquitin | 2025-02-24T00:06:47-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d88519
NMR600-NEO/ubiquitin/10 | 1D NOESY | ubiquitin/10 | Mullen | NMR600-NEO | 600 | solution | noesygppr1d | 1 | 1 | 1H | 1H | 298 | successful | ubiquitin | 2025-02-24T00:07:26-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d8854
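
As a minimal sketch of how the archive-level experiments.csv might be consumed in a script, the Python below parses the file contents and groups dataset paths by spectrometer. It assumes the file is comma-separated and starts with a header row using the column names shown above (for example Path and Spectrometer); neither detail is confirmed here, so check a real download before relying on it. The function names are hypothetical.

```python
import csv
import io


def load_experiments(csv_text):
    """Parse experiments.csv content into a list of dicts keyed by column name.

    Assumption: comma-delimited text with a header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader]


def group_by_spectrometer(rows):
    """Map each spectrometer name to the list of dataset paths acquired on it."""
    groups = {}
    for row in rows:
        # "Spectrometer" and "Path" are assumed header names taken from the table above.
        groups.setdefault(row["Spectrometer"], []).append(row["Path"])
    return groups
```

Values that themselves contain commas, such as the nuclei entry 1H,15N, are only parsed correctly if the file quotes them, which the csv module handles automatically.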

Dataset Metadata Files

Each individual dataset directory includes several additional files or directories:

  • provenance.prov – a W3C PROV file describing the complete provenance of the dataset
  • sample_metadata.xml – an XML file containing sample information, if applicable
  • experiment.csv – a single-entry CSV file describing the dataset (same format as the main experiments.csv)
  • identity.xml – an internal-use XML file that can generally be ignored by users
  • supplemental_data – a directory containing additional supplemental data files that the user uploaded for the dataset
  • supplemental_data.csv – a CSV file with the category, type, value, and description for all supplemental data entries
  • post_acquisition – a directory containing file contents retrieved from the post_acquisition directory on NMRbox

The location of these files and directories differs slightly depending on the organization format.

Organized for TopSpin

When Organized for TopSpin is selected, the download is structured to match the standard Bruker TopSpin format. Datasets are grouped under directories named for the spectrometer used for acquisition (e.g., NMR800-NEO, NMR600-NEO).

Each dataset resides in a subdirectory under the dataset name, with an additional level for the experiment number (EXPNO). If two datasets have the same dataset name and experiment number, the older one will have a timestamp suffix to avoid overwriting (e.g., 9_20250224000647).
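
The timestamp suffix described above can be separated from the experiment number in a script. The sketch below assumes the suffix is always joined with a single underscore and follows the YYYYMMDDHHMMSS pattern seen in the example 9_20250224000647; that pattern is inferred from the example, not from a specification, and parse_expno_dir is a hypothetical helper name.

```python
from datetime import datetime


def parse_expno_dir(name):
    """Split an EXPNO directory name into (expno, timestamp or None).

    Assumption: an optional suffix has the form _YYYYMMDDHHMMSS.
    """
    number, sep, suffix = name.partition("_")
    expno = int(number)
    if not sep:
        # Plain EXPNO directory such as "9": no timestamp suffix present.
        return expno, None
    return expno, datetime.strptime(suffix, "%Y%m%d%H%M%S")
```

For example, parse_expno_dir("9") returns the number alone with None, while parse_expno_dir("9_20250224000647") also returns the parsed timestamp.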

The provenance.prov, sample_metadata.xml, experiment.csv, and identity.xml files are placed inside the experiment number (EXPNO) directory alongside the standard TopSpin files.

This format is ideal when multiple experiments need to be opened in TopSpin, because the experiment_directory/EXPNO structure is strictly preserved, with multiple EXPNO directories residing under the same experiment_directory.

Example Layout

NMR800-NEO/
└── polx/
    └── 5/
        ├── ser
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        ├── lists/
        ├── pdata/
        ├── ...
        ├── provenance.prov
        ├── sample_metadata.xml
        ├── experiment.csv
        ├── identity.xml
        ├── supplemental_data/
        └── post_acquisition/

NMR600-NEO/
└── ubiquitin/
    ├── 9/
    │   ├── fid
    │   ├── acqus
    │   ├── pulseprogram
    │   ├── lists/
    │   ├── pdata/
    │   ├── ...
    │   ├── provenance.prov
    │   ├── sample_metadata.xml
    │   ├── experiment.csv
    │   ├── identity.xml
    │   ├── supplemental_data/
    │   └── post_acquisition/
    │
    ├── 9_20250224000647/
    │   ├── fid
    │   ├── acqus
    │   ├── pulseprogram
    │   ├── lists/
    │   ├── pdata/
    │   ├── ...
    │   ├── provenance.prov
    │   ├── sample_metadata.xml
    │   ├── experiment.csv
    │   ├── identity.xml
    │   ├── supplemental_data/
    │   └── post_acquisition/

    └── 10/
        ├── fid
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        ├── lists/
        ├── pdata/
        ├── ...
        ├── provenance.prov
        ├── sample_metadata.xml
        ├── experiment.csv
        ├── identity.xml
        ├── supplemental_data/
        └── post_acquisition/

experiments.csv
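
A layout like the one above can be walked programmatically once the zip file has been extracted. The following sketch assumes the three-level Spectrometer/dataset/EXPNO structure shown in the example and treats any directory at that depth containing an acqus file as a dataset; find_topspin_datasets and METADATA_NAMES are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Per-dataset metadata files listed under "Dataset Metadata Files".
METADATA_NAMES = (
    "provenance.prov",
    "sample_metadata.xml",
    "experiment.csv",
    "identity.xml",
)


def find_topspin_datasets(root):
    """Return (spectrometer, dataset, expno_dir, present_metadata) for each EXPNO directory.

    Assumption: datasets sit exactly three levels below root and contain an acqus file.
    """
    results = []
    for acqus in sorted(Path(root).glob("*/*/*/acqus")):
        expno_dir = acqus.parent
        spectrometer, dataset, expno = expno_dir.relative_to(root).parts
        # Record which of the metadata files actually exist next to the TopSpin files.
        present = [n for n in METADATA_NAMES if (expno_dir / n).is_file()]
        results.append((spectrometer, dataset, expno, present))
    return results
```

Calling find_topspin_datasets on the extracted archive root gives one tuple per EXPNO directory, including timestamp-suffixed ones, which makes it easy to spot datasets with missing metadata files.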

Organized by Experiment

When Organized by Experiment is selected, the download is structured so that each dataset resides in its own top-level directory named using the format:

YYYYMMDDTHHMMSS_Spectrometer_PulseProgram

Inside each of these timestamped directories:

  • The Bruker dataset is nested in a folder named with the Bruker dataset name (e.g., polx, ubiquitin)
  • That folder contains the experiment number as a subfolder (e.g., 5, 9, 10)
  • The experiment directory contains the standard TopSpin file layout
  • The dataset-specific provenance.prov, sample_metadata.xml, experiment.csv, and identity.xml files are placed in the top-level timestamped directory
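
The top-level directory name can be split back into its parts with a short helper. This is a sketch under two assumptions: the timestamp follows the YYYYMMDDTHHMMSS pattern given above, and the spectrometer name contains no underscore, so that any further underscores belong to the pulse program name. parse_experiment_dir is a hypothetical name.

```python
from datetime import datetime


def parse_experiment_dir(name):
    """Split 'YYYYMMDDTHHMMSS_Spectrometer_PulseProgram' into its three parts.

    Assumption: the spectrometer name has no underscore; the remainder after
    the second underscore is treated as the pulse program.
    """
    stamp, spectrometer, pulse_program = name.split("_", 2)
    acquired = datetime.strptime(stamp, "%Y%m%dT%H%M%S")
    return acquired, spectrometer, pulse_program
```

Limiting the split to two underscores keeps a pulse program name that contains underscores intact, at the cost of misparsing a spectrometer name that contains one.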

Example Layout

20250124T000636_NMR800-NEO_hsqcetf3gpsi/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
└── polx/
    └── 5/
        ├── ser
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        └── pdata/

20250224T000636_NMR600-NEO_zgpr/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
└── ubiquitin/
    └── 9/
        ├── fid
        ├── acqus
        └── pdata/

20250224T000647_NMR600-NEO_zgpr/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
└── ubiquitin/
    └── 9/
        ├── fid
        ├── acqus
        └── pdata/

20250224T000726_NMR600-NEO_noesygppr1d/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
└── ubiquitin/
    └── 10/
        ├── fid
        ├── acqus
        └── pdata/

experiments.csv