Download Datasets: Difference between revisions

Latest revision as of 17:06, 10 June 2025

Overview

Datasets may be downloaded by selecting the Download action from the context menu, which is accessed by right-clicking on a selected dataset (or set of datasets).

Users may choose between two download formats:

Organized for TopSpin
Organized by Experiment

The selected datasets are packaged into a zip file and downloaded through your browser to the local Downloads folder.

All files are included: supplemental data, any contents of the post-acquisition directory, and a series of metadata files added by NAN, which are described below.

An `experiments.csv` file is placed in the root of the zip archive. It lists the downloaded experiments in the following format:


Path	Display Name	Dataset Name	Facility	Spectrometer	Field	State	Pulse Program	# dims	# dims collected	direct nuclei	nuclei	Temp	Classification	Sample	Date	NAN User	PI	Workstation User	UUID
NMR800-NEO/polx/5	HSQC	polx/5	Mullen	NMR800-NEO	800	solution	hsqcetf3gpsi	2	2	1H	1H,15N	298	test	polx	2025-01-24T00:06:36-05:00	Bloch	Purcell	Rabi	fb5323d6-fcae-4328-a908-9f6ff1d88512
NMR600-NEO/ubiquitin/9	1D 1H	ubiquitin/9	Mullen	NMR600-NEO	600	solution	zgpr	1	1	1H	1H	298	calibration	ubiquitin	2025-02-24T00:06:36-05:00	Bloch	Purcell	Rabi	fb5323d6-fcae-4328-a908-9f6ff1d88518
NMR600-NEO/ubiquitin/9	1D 1H	ubiquitin/9	Mullen	NMR600-NEO	600	solution	zgpr	1	1	1H	1H	298	calibration	ubiquitin	2025-02-24T00:06:47-05:00	Bloch	Purcell	Rabi	fb5323d6-fcae-4328-a908-9f6ff1d88519
NMR600-NEO/ubiquitin/10	1D NOESY	ubiquitin/10	Mullen	NMR600-NEO	600	solution	noesygppr1d	1	1	1H	1H	298	successful	ubiquitin	2025-02-24T00:07:26-05:00	Bloch	Purcell	Rabi	fb5323d6-fcae-4328-a908-9f6ff1d8854

Note that the column names in this table have been shortened relative to the actual experiments.csv file so that the table is easier to view on the Wiki page, and that the data shown is not real.
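As a minimal sketch, the rows of `experiments.csv` can be read straight from the downloaded archive with the Python standard library. The function name `read_experiments` is hypothetical, and the sketch assumes a comma-delimited, UTF-8 file with a header row. It returns whatever column names the real file uses, since the names shown in the table above are abbreviated.

```python
import csv
import io
import zipfile


def read_experiments(zip_path, member="experiments.csv"):
    """Return the rows of experiments.csv from a downloaded archive as dicts.

    Assumption: the file is comma-delimited with a header row. The dict keys
    are the real column names, which this page shows in shortened form.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # utf-8-sig also tolerates a leading byte-order mark.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))
```

Each returned row is a dictionary keyed by column name, so a row's values can be looked up without relying on column order.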

Dataset Metadata Files added by NAN

Each individual dataset directory includes several additional files or directories:

provenance.prov – a W3C PROV file describing the complete provenance of the dataset
sample_metadata.xml – an XML file containing sample information, if applicable
experiment.csv – a single-entry CSV file describing the dataset (same format as the main experiments.csv)
identity.xml – an internal-use XML file that can generally be ignored by users
supplemental_data – a directory containing additional supplemental data files that the user uploaded for the dataset
supplemental_data.csv – a CSV file with the category, type, value, and description for all supplemental data entries
post_acquisition – a directory containing file contents retrieved from the post_acquisition directory on NMRbox

The location of these files / directories differs slightly depending on the organization format.
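The following sketch reports which of the NAN-added files and directories exist in a given directory of an unpacked download. The helper name `nan_metadata_present` is hypothetical. The file and directory names come from the list above, and which directory to pass depends on the organization format. A missing `sample_metadata.xml` is not necessarily an error, because that file is only included if applicable.

```python
from pathlib import Path

# Names taken from the list above; which directory holds them depends on the layout.
NAN_FILES = (
    "provenance.prov",
    "sample_metadata.xml",
    "experiment.csv",
    "identity.xml",
    "supplemental_data.csv",
)
NAN_DIRS = ("supplemental_data", "post_acquisition")


def nan_metadata_present(directory):
    """Map each NAN-added name to True or False depending on whether it exists."""
    base = Path(directory)
    found = {name: (base / name).is_file() for name in NAN_FILES}
    found.update({name: (base / name).is_dir() for name in NAN_DIRS})
    return found
```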

Organized for TopSpin File Layout

When Organized for TopSpin is selected, the download is structured to match the standard Bruker TopSpin format. Datasets are grouped under directories named for the spectrometer used for acquisition (e.g., NMR800-NEO, NMR600-NEO).

Each dataset resides in the experiment number (EXPNO) subdirectory under the experiment directory. If two datasets have the same experiment directory and experiment number, the older one will have a timestamp suffix to avoid overwriting (e.g., 9_20250224000647).
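Scripts that walk the download may need to separate the experiment number from the optional timestamp suffix. The sketch below assumes the suffix is an underscore followed by 14 digits in `YYYYMMDDHHMMSS` order. That pattern is inferred from the single example `9_20250224000647` and is not a documented specification. The function name `split_expno_dir` is hypothetical.

```python
import re
from datetime import datetime

# EXPNO, optionally followed by "_" and a 14-digit YYYYMMDDHHMMSS suffix (assumed format).
_EXPNO_DIR = re.compile(r"^(\d+)(?:_(\d{14}))?$")


def split_expno_dir(name):
    """Return (expno, timestamp or None) for names such as "9" or "9_20250224000647"."""
    match = _EXPNO_DIR.match(name)
    if match is None:
        raise ValueError(f"not an EXPNO directory name: {name!r}")
    expno, stamp = match.groups()
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S") if stamp else None
    return int(expno), when
```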

The provenance.prov, sample_metadata.xml, experiment.csv, supplemental_data.csv, and identity.xml files, as well as the supplemental_data and post_acquisition directories, are placed inside the experiment number (EXPNO) directory alongside the standard TopSpin files.

This format is ideal when multiple experiments need to be opened in TopSpin, because the experiment_directory/EXPNO structure is strictly preserved, with multiple EXPNO directories residing under the same experiment_directory.
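As an illustration, the datasets in an unpacked Organized for TopSpin download can be enumerated by looking for `acqus` files three levels below the root (spectrometer, experiment directory, EXPNO). This is a sketch under two assumptions: that the nesting is always exactly three levels deep, and that every dataset contains an `acqus` file, as all of the example layouts on this page do. The function name `find_topspin_datasets` is hypothetical.

```python
from pathlib import Path


def find_topspin_datasets(root):
    """Yield (spectrometer, experiment_directory, expno_dir_name, path) tuples.

    A directory three levels below root counts as a dataset if it holds an
    acqus file (an assumption based on the example layouts on this page).
    """
    for acqus in sorted(Path(root).glob("*/*/*/acqus")):
        expno_dir = acqus.parent
        experiment_dir = expno_dir.parent
        yield (
            experiment_dir.parent.name,
            experiment_dir.name,
            expno_dir.name,
            expno_dir,
        )
```

The third element is the directory name rather than a number, so a timestamp-suffixed directory such as `9_20250224000647` is reported as its own dataset.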

Example Layout

NMR800-NEO/
└── polx/
    └── 5/
        ├── ser
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        ├── lists/
        ├── pdata/
        ├── ...
        ├── provenance.prov
        ├── sample_metadata.xml
        ├── experiment.csv
        ├── identity.xml
        ├── supplemental_data.csv
        ├── supplemental_data/
        └── post_acquisition/

NMR600-NEO/
└── ubiquitin/
    ├── 9/
    │   ├── fid
    │   ├── acqus
    │   ├── pulseprogram
    │   ├── lists/
    │   ├── pdata/
    │   ├── ...
    │   ├── provenance.prov
    │   ├── sample_metadata.xml
    │   ├── experiment.csv
    │   ├── identity.xml
    │   ├── supplemental_data.csv
    │   ├── supplemental_data/
    │   └── post_acquisition/
    ├── 9_20250224000647/
    │   ├── fid
    │   ├── acqus
    │   ├── pulseprogram
    │   ├── lists/
    │   ├── pdata/
    │   ├── ...
    │   ├── provenance.prov
    │   ├── sample_metadata.xml
    │   ├── experiment.csv
    │   ├── identity.xml
    │   ├── supplemental_data.csv
    │   ├── supplemental_data/
    │   └── post_acquisition/
    └── 10/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── lists/
        ├── pdata/
        ├── ...
        ├── provenance.prov
        ├── sample_metadata.xml
        ├── experiment.csv
        ├── identity.xml
        ├── supplemental_data.csv
        ├── supplemental_data/
        └── post_acquisition/

experiments.csv

Organized by Experiment File Layout

When Organized by Experiment is selected, the download is structured so that each dataset resides in its own top-level directory named using the format:

YYYYMMDDTHHMMSS_Spectrometer_PulseProgram

Inside each of these timestamped directories:

The Bruker dataset is nested in a folder named with the Bruker dataset name (e.g., polx, ubiquitin)
That folder contains the experiment number as a subfolder (e.g., 5, 9, 10)
The experiment directory contains the standard TopSpin file layout
The dataset-specific provenance.prov, sample_metadata.xml, experiment.csv, supplemental_data.csv, and identity.xml files are placed in the top-level timestamped directory. The supplemental_data and post_acquisition directories are located within the experiment number directory.
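The naming scheme and nesting described above can be handled with a short sketch like the following. Both function names are hypothetical. The name parser splits on the first two underscores only, which assumes that spectrometer names contain no underscores. The directory lookup assumes exactly one dataset name / experiment number pair per timestamped directory, each containing an `acqus` file.

```python
from datetime import datetime
from pathlib import Path


def parse_experiment_dir(name):
    """Split "YYYYMMDDTHHMMSS_Spectrometer_PulseProgram" into its three parts.

    Only the first two underscores are treated as separators, so underscores
    in the pulse program name survive; an underscore in a spectrometer name
    would be misparsed.
    """
    stamp, spectrometer, pulse_program = name.split("_", 2)
    return datetime.strptime(stamp, "%Y%m%dT%H%M%S"), spectrometer, pulse_program


def expno_dir(experiment_dir):
    """Return the single <dataset name>/<EXPNO> directory that holds acqus."""
    matches = [p.parent for p in Path(experiment_dir).glob("*/*/acqus")]
    if len(matches) != 1:
        raise ValueError(
            f"expected one dataset in {experiment_dir}, found {len(matches)}"
        )
    return matches[0]
```

With these helpers, the metadata files are read from the timestamped directory itself, while `supplemental_data` and `post_acquisition` are looked up under the path returned by `expno_dir`.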

Example Layout

20250124T000636_NMR800-NEO_hsqcetf3gpsi/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── polx/
    └── 5/
        ├── ser
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/

20250224T000636_NMR600-NEO_zgpr/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── ubiquitin/
    └── 9/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/

20250224T000647_NMR600-NEO_zgpr/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── ubiquitin/
    └── 9/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/

20250224T000726_NMR600-NEO_noesygppr1d/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── ubiquitin/
    └── 10/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/

experiments.csv




@@ Line 1: / Line 1: @@
-=== Download ===
+{{Datasets}}
-Datasets may be downloaded by selecting the '''Download''' action from the context menu, which is accessed by right-clicking on a selected dataset (or set of datasets).
+== Overview ==
+Datasets may be downloaded by selecting the '''Download''' action from the context menu, which is accessed from the context menu by right-clicking on a selected dataset (or set of datasets).
 Users may choose between two download formats:
@@ Line 8: / Line 10: @@
 The selected datasets are packaged into a zip file and downloaded through your browser to the local Downloads folder.
-All files are included, including supplemental data and any contents from the post-acquisition directory.
+All files are included, including supplemental data, any contents from the post-acquisition directory, and a series of metadata files added by NAN described below.
 An `experiments.csv` file is placed in the root of the zip archive. It lists the downloaded experiments in the following format:
@@ Line 119: / Line 121: @@
 |fb5323d6-fcae-4328-a908-9f6ff1d8854
 |}
+''Note that the column names used in this table have been shortened versus the actual experiment.csv file to make viewing the table on the Wiki page easier and the data is not real.''
+== Dataset Metadata Files added by NAN ==
+Each individual dataset directory includes several additional files or directories
+* <code>provenance.prov</code> – a W3C PROV file describing the complete provenance of the dataset
+* <code>sample_metadata.xml</code> – an XML file containing sample information, if applicable
+* <code>experiment.csv</code> – a single-entry CSV file describing the dataset (same format as the main <code>experiments.csv</code>)
+* <code>identity.xml</code> – an internal-use XML file that can generally be ignored by users
+* <code>supplemental_data</code> – A directory containing additional supplemental data files that the user uploaded for the dataset
+* <code>supplemental_data.csv</code> – a CSV file with the category, type, value, and description for all supplemental data entries.
+* <code>post_acquisition</code> – a directory containing file contents retrieved from the post_acquisition directory on NMRbox
+The location of these files / directories differs slightly depending on the organization format.
-=== Organized for TopSpin ===
+== Organized for TopSpin File Layout ==
 When '''Organized for TopSpin''' is selected, the download is structured to match the standard Bruker TopSpin format. Datasets are grouped under directories named for the spectrometer used for acquisition (e.g., <code>NMR800-NEO</code>, <code>NMR600-NEO</code>).
-Each dataset resides in a subdirectory under the dataset name, with an additional level for the experiment number (EXPNO). If two datasets have the same dataset name and experiment number, the older one will have a timestamp suffix to avoid overwriting (e.g., <code>9_20250224000647</code>).
+Each dataset resides in the experiment number (EXPNO) subdirectory under the experiment directory. If two datasets have the same experiment directory and experiment number, the older one will have a timestamp suffix to avoid overwriting (e.g., <code>9_20250224000647</code>).
+The <code>provenance.prov</code>, <code>sample_metadata.xml</code>, <code>experiment.csv</code>, <code>supplemental_data.csv</code>, and <code>identity.xml</code> files as well as directories for <code>supplemental_data</code> and <code>post_acquisition</code>are placed inside the experiment number (EXPNO) directory alongside the standard TopSpin files.
-==== Example Layout ====
+This format is ideal when multiple experiments would need to be opened in TopSpin as the experiment_directory/EXPNO format is strictly preserved with multiple EXPNO directories residing under the experiment_directory.
-The directory structure for the four datasets in the example would look like this:
-<pre>
+=== Example Layout ===
-NMR800-NEO/
+<pre>NMR800-NEO/
 └── polx/
      └── 5/
-         ├── fid
+         ├── ser
          ├── acqus
          ├── acqu2s
          ├── pulseprogram
-         ├── procs
+         ├── lists/
-         └── pdata/
+        ├── pdata/
+        ├── ...
+        ├── provenance.prov
+        ├── sample_metadata.xml
+        ├── experiment.csv
+        ├── identity.xml
+        ├── supplemental_data.csv
+        ├── supplemental_data/
+         └── post_acquisition/
 NMR600-NEO/
@@ Line 144: / Line 168: @@
      │   ├── fid
      │   ├── acqus
-     │   ├── pulseprogram
+    |   ├── pulseprogram
-     │   └── pdata/
+    |   ├── lists/
+    |   ├── pdata/
+    |   ├── ...
+    │   ├── provenance.prov
+    │   ├── sample_metadata.xml
+     │   ├── experiment.csv
+     │   ├── identity.xml
+    |   ├── supplemental_data.csv
+    |   ├── supplemental_data/
+    |   └── post_acquisition/
      ├── 9_20250224000647/
      │   ├── fid
      │   ├── acqus
-     │   └── pdata/
+    |   ├── pulseprogram
+    |   ├── lists/
+    |   ├── pdata/
+    |   ├── ...
+    │   ├── provenance.prov
+     │   ├── sample_metadata.xml
+    │   ├── experiment.csv
+    │   ├── identity.xml
+    |   ├── supplemental_data.csv
+    |   ├── supplemental_data/
+    |   └── post_acquisition/
      └── 10/
          ├── fid
          ├── acqus
-         └── pdata/
+        ├── pulseprogram
+        ├── lists/
+        ├── pdata/
+        ├── ...
+        ├── provenance.prov
+        ├── sample_metadata.xml
+        ├── experiment.csv
+        ├── identity.xml
+        ├── supplemental_data.csv
+        ├── supplemental_data/
+         └── post_acquisition/
-experiments.csv
+experiments.csv</pre>
-</pre>
-=== Organized by Experiment ===
+== Organized by Experiment File Layout ==
 When '''Organized by Experiment''' is selected, the download is structured so that each dataset resides in its own top-level directory named using the format:
 <code>YYYYMMDDTHHMMSS_Spectrometer_PulseProgram</code>
-Within each of these timestamped directories is the full Bruker dataset layout:
+Inside each of these timestamped directories:
-* A folder named according to the Bruker dataset name (e.g., <code>polx</code> or <code>ubiquitin</code>)
+* The Bruker dataset is nested in a folder named with the Bruker dataset name (e.g., <code>polx</code>, <code>ubiquitin</code>)
-* Inside that folder is the experiment number (e.g., <code>5</code>, <code>9</code>, <code>10</code>)
+* That folder contains the experiment number as a subfolder (e.g., <code>5</code>, <code>9</code>, <code>10</code>)
-* The data follows the standard TopSpin structure (e.g., <code>fid</code>, <code>acqus</code>, <code>pdata/1/1r</code>)
+* The experiment directory contains the standard TopSpin file layout
+* The dataset-specific <code>provenance.prov</code>, <code>sample_metadata.xml</code>, <code>experiment.csv</code>, <code>supplemental_data.csv</code>, and <code>identity.xml</code> files are placed in the top-level timestamped directory. The supplemental_data and post_acquisition directories are located within the experiment number directory
-In addition to the dataset itself, each timestamped directory includes:
-* <code>provenance.prov</code> – a W3C PROV file documenting the complete provenance of the dataset
-* <code>sample_metadata.xml</code> – an XML file describing the sample, if applicable
-* <code>experiment.csv</code> – a CSV file identical in format to the global <code>experiments.csv</code> but containing only a single row for the corresponding dataset
-* <code>identity.xml</code> – an internal-use XML file (can typically be ignored by end users)
-==== Example Layout ====
-The directory structure for the four datasets in the example above would look like this:
-<pre>
+=== Example Layout ===
-20250124T000636_NMR800-NEO_hsqcetf3gpsi/
+<pre>20250124T000636_NMR800-NEO_hsqcetf3gpsi/
 ├── provenance.prov
 ├── sample_metadata.xml
 ├── experiment.csv
 ├── identity.xml
+├── supplemental_data.csv
 └── polx/
      └── 5/
@@ Line 191: / Line 237: @@
          ├── acqu2s
          ├── pulseprogram
-         └── pdata/
+        ├── pdata
+        ├── ...
+        ├── supplemental_data/
+         └── post_acquisition/
 20250224T000636_NMR600-NEO_zgpr/
@@ Line 198: / Line 247: @@
 ├── experiment.csv
 ├── identity.xml
+├── supplemental_data.csv
 └── ubiquitin/
      └── 9/
          ├── fid
          ├── acqus
-         └── pdata/
+        ├── pulseprogram
+        ├── pdata
+        ├── ...
+        ├── supplemental_data/
+         └── post_acquisition/
 20250224T000647_NMR600-NEO_zgpr/
@@ Line 209: / Line 263: @@
 ├── experiment.csv
 ├── identity.xml
+├── supplemental_data.csv
 └── ubiquitin/
      └── 9/
          ├── fid
          ├── acqus
-         └── pdata/
+        ├── pulseprogram
+        ├── pdata
+        ├── ...
+        ├── supplemental_data/
+         └── post_acquisition/
 20250224T000726_NMR600-NEO_noesygppr1d/
@@ Line 220: / Line 279: @@
 ├── experiment.csv
 ├── identity.xml
+├── supplemental_data.csv
 └── ubiquitin/
      └── 10/
          ├── fid
          ├── acqus
-         └── pdata/
+        ├── pulseprogram
+        ├── pdata
+        ├── ...
+        ├── supplemental_data/
+         └── post_acquisition/
-experiments.csv
+experiments.csv</pre>
-</pre>