Download Datasets
Latest revision as of 17:06, 10 June 2025
Overview
Datasets may be downloaded by selecting the Download action from the context menu, which is opened by right-clicking a selected dataset (or set of datasets).
Users may choose between two download formats:
- Organized for TopSpin
- Organized by Experiment
The selected datasets are packaged into a zip file and downloaded through your browser to the local Downloads folder.
The archive contains all files for each dataset, including supplemental data, any contents of the post-acquisition directory, and a set of metadata files added by NAN (described below).
An `experiments.csv` file is placed in the root of the zip archive. It lists the downloaded experiments in the following format:
Path | Display Name | Dataset Name | Facility | Spectrometer | Field | State | Pulse Program | # dims | # dims collected | direct nuclei | nuclei | Temp | Classification | Sample | Date | NAN User | PI | Workstation User | UUID |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
NMR800-NEO/polx/5 | HSQC | polx/5 | Mullen | NMR800-NEO | 800 | solution | hsqcetf3gpsi | 2 | 2 | 1H | 1H,15N | 298 | test | polx | 2025-01-24T00:06:36-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d88512 |
NMR600-NEO/ubiquitin/9 | 1D 1H | ubiquitin/9 | Mullen | NMR600-NEO | 600 | solution | zgpr | 1 | 1 | 1H | 1H | 298 | calibration | ubiquitin | 2025-02-24T00:06:36-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d88518 |
NMR600-NEO/ubiquitin/9 | 1D 1H | ubiquitin/9 | Mullen | NMR600-NEO | 600 | solution | zgpr | 1 | 1 | 1H | 1H | 298 | calibration | ubiquitin | 2025-02-24T00:06:47-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d88519 |
NMR600-NEO/ubiquitin/10 | 1D NOESY | ubiquitin/10 | Mullen | NMR600-NEO | 600 | solution | noesygppr1d | 1 | 1 | 1H | 1H | 298 | successful | ubiquitin | 2025-02-24T00:07:26-05:00 | Bloch | Purcell | Rabi | fb5323d6-fcae-4328-a908-9f6ff1d8854 |
Note: the column names in this table are shortened relative to the actual experiments.csv file to make the table easier to view on the wiki page, and the data shown is not real.
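Because the real column names differ from the shortened ones shown in the table, a script should read the header from the file itself rather than hard-code names. The following is a minimal sketch, assuming the file is UTF-8 encoded and comma-separated (neither is stated on this page); `read_experiments_csv` is a hypothetical helper name.

```python
import csv
import io
import zipfile


def read_experiments_csv(zip_path):
    """Return the rows of experiments.csv from a downloaded archive as dicts.

    zip_path may be a filesystem path or an open binary file object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # experiments.csv is placed in the root of the zip archive.
        with archive.open("experiments.csv") as raw:
            # Assumption: UTF-8 text; adjust the encoding if the file differs.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # DictReader takes the column names from the first row of the file.
            return list(csv.DictReader(text))
```

Calling `read_experiments_csv` on the downloaded zip file returns one dictionary per experiment, and the keys of the first row show the actual column names used in the file.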
Dataset Metadata Files added by NAN
Each individual dataset directory includes several additional files or directories:
- provenance.prov – a W3C PROV file describing the complete provenance of the dataset
- sample_metadata.xml – an XML file containing sample information, if applicable
- experiment.csv – a single-entry CSV file describing the dataset (same format as the main experiments.csv)
- identity.xml – an internal-use XML file that can generally be ignored by users
- supplemental_data – a directory containing additional supplemental data files that the user uploaded for the dataset
- supplemental_data.csv – a CSV file with the category, type, value, and description for all supplemental data entries
- post_acquisition – a directory containing file contents retrieved from the post_acquisition directory on NMRbox
The location of these files / directories differs slightly depending on the organization format.
Organized for TopSpin File Layout
When Organized for TopSpin is selected, the download is structured to match the standard Bruker TopSpin format. Datasets are grouped under directories named for the spectrometer used for acquisition (e.g., NMR800-NEO, NMR600-NEO).
Each dataset resides in the experiment number (EXPNO) subdirectory under the experiment directory. If two datasets have the same experiment directory and experiment number, the older one will have a timestamp suffix to avoid overwriting (e.g., 9_20250224000647).
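Scripts that process a download may need to separate the experiment number from an optional timestamp suffix. The sketch below assumes the suffix is always an underscore followed by 14 digits in YYYYMMDDHHMMSS order, which is inferred only from the single example `9_20250224000647`; `parse_expno_dir` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Assumption: "<EXPNO>" or "<EXPNO>_<YYYYMMDDHHMMSS>", inferred from the example.
_EXPNO_DIR = re.compile(r"(\d+)(?:_(\d{14}))?")


def parse_expno_dir(name):
    """Split an EXPNO directory name into (expno, timestamp or None)."""
    match = _EXPNO_DIR.fullmatch(name)
    if match is None:
        raise ValueError(f"not an EXPNO directory name: {name!r}")
    expno = int(match.group(1))
    stamp = match.group(2)
    # The suffix carries no UTC offset, so the result is a naive datetime.
    timestamp = datetime.strptime(stamp, "%Y%m%d%H%M%S") if stamp else None
    return expno, timestamp
```

For a plain directory name such as `10`, the function returns the experiment number with `None` in place of the timestamp.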
The provenance.prov, sample_metadata.xml, experiment.csv, supplemental_data.csv, and identity.xml files, as well as the supplemental_data and post_acquisition directories, are placed inside the experiment number (EXPNO) directory alongside the standard TopSpin files.
This format is ideal when multiple experiments need to be opened in TopSpin, because the experiment_directory/EXPNO structure is strictly preserved, with multiple EXPNO directories residing under the same experiment_directory.
Example Layout
NMR800-NEO/
└── polx/
    └── 5/
        ├── ser
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        ├── lists/
        ├── pdata/
        ├── ...
        ├── provenance.prov
        ├── sample_metadata.xml
        ├── experiment.csv
        ├── identity.xml
        ├── supplemental_data.csv
        ├── supplemental_data/
        └── post_acquisition/
NMR600-NEO/
└── ubiquitin/
    ├── 9/
    │   ├── fid
    │   ├── acqus
    │   ├── pulseprogram
    │   ├── lists/
    │   ├── pdata/
    │   ├── ...
    │   ├── provenance.prov
    │   ├── sample_metadata.xml
    │   ├── experiment.csv
    │   ├── identity.xml
    │   ├── supplemental_data.csv
    │   ├── supplemental_data/
    │   └── post_acquisition/
    ├── 9_20250224000647/
    │   ├── fid
    │   ├── acqus
    │   ├── pulseprogram
    │   ├── lists/
    │   ├── pdata/
    │   ├── ...
    │   ├── provenance.prov
    │   ├── sample_metadata.xml
    │   ├── experiment.csv
    │   ├── identity.xml
    │   ├── supplemental_data.csv
    │   ├── supplemental_data/
    │   └── post_acquisition/
    └── 10/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── lists/
        ├── pdata/
        ├── ...
        ├── provenance.prov
        ├── sample_metadata.xml
        ├── experiment.csv
        ├── identity.xml
        ├── supplemental_data.csv
        ├── supplemental_data/
        └── post_acquisition/
experiments.csv
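After unzipping a download organized for TopSpin, the datasets can be located by walking the spectrometer/experiment/EXPNO hierarchy. The following is a minimal sketch, assuming that every dataset directory contains an `acqus` file (as in the layout above); `find_datasets` and `NAN_ENTRIES` are hypothetical names.

```python
from pathlib import Path

# Files and directories added by NAN to each dataset.
NAN_ENTRIES = (
    "provenance.prov",
    "sample_metadata.xml",
    "experiment.csv",
    "identity.xml",
    "supplemental_data.csv",
    "supplemental_data",
    "post_acquisition",
)


def find_datasets(root):
    """List (spectrometer, experiment, expno_dir, present_nan_entries) tuples."""
    root = Path(root)
    results = []
    # Three levels below the root: spectrometer / experiment directory / EXPNO.
    for expno_dir in sorted(root.glob("*/*/*")):
        # Assumption: a dataset directory is one that holds an acqus file.
        if not (expno_dir / "acqus").is_file():
            continue
        spectrometer, experiment, expno = expno_dir.relative_to(root).parts
        present = [name for name in NAN_ENTRIES if (expno_dir / name).exists()]
        results.append((spectrometer, experiment, expno, present))
    return results
```

The EXPNO is returned as the directory name (a string), so a suffixed directory such as `9_20250224000647` is reported as-is rather than merged with `9`.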
Organized by Experiment File Layout
When Organized by Experiment is selected, the download is structured so that each dataset resides in its own top-level directory named using the format:
YYYYMMDDTHHMMSS_Spectrometer_PulseProgram
Inside each of these timestamped directories:
- The Bruker dataset is nested in a folder named with the Bruker dataset name (e.g., polx, ubiquitin)
- That folder contains the experiment number as a subfolder (e.g., 5, 9, 10)
- The experiment directory contains the standard TopSpin file layout
- The dataset-specific provenance.prov, sample_metadata.xml, experiment.csv, supplemental_data.csv, and identity.xml files are placed in the top-level timestamped directory. The supplemental_data and post_acquisition directories are located within the experiment number directory
Example Layout
20250124T000636_NMR800-NEO_hsqcetf3gpsi/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── polx/
    └── 5/
        ├── ser
        ├── acqus
        ├── acqu2s
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/
20250224T000636_NMR600-NEO_zgpr/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── ubiquitin/
    └── 9/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/
20250224T000647_NMR600-NEO_zgpr/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── ubiquitin/
    └── 9/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/
20250224T000726_NMR600-NEO_noesygppr1d/
├── provenance.prov
├── sample_metadata.xml
├── experiment.csv
├── identity.xml
├── supplemental_data.csv
└── ubiquitin/
    └── 10/
        ├── fid
        ├── acqus
        ├── pulseprogram
        ├── pdata/
        ├── ...
        ├── supplemental_data/
        └── post_acquisition/
experiments.csv
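The top-level directory names can be matched to rows of experiments.csv by rebuilding them from the date, spectrometer, and pulse program of each row. This sketch assumes the timestamp is the date value exactly as written, with the UTC offset dropped, which is consistent with the examples on this page but not otherwise confirmed; `by_experiment_dirname` is a hypothetical helper name.

```python
from datetime import datetime


def by_experiment_dirname(date_iso, spectrometer, pulse_program):
    """Build a YYYYMMDDTHHMMSS_Spectrometer_PulseProgram directory name."""
    # fromisoformat accepts offsets such as -05:00; the wall-clock fields
    # are kept as written and the offset is not applied.
    acquired = datetime.fromisoformat(date_iso)
    stamp = acquired.strftime("%Y%m%dT%H%M%S")
    return f"{stamp}_{spectrometer}_{pulse_program}"


name = by_experiment_dirname(
    "2025-01-24T00:06:36-05:00", "NMR800-NEO", "hsqcetf3gpsi"
)
print(name)  # 20250124T000636_NMR800-NEO_hsqcetf3gpsi
```

Building the name from the row is more robust than splitting the directory name on underscores, since a pulse program or spectrometer name could itself contain an underscore.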