Datasets: Difference between revisions

From Network for Advanced NMR
# '''Make Public''' - make the dataset public. This is a '''PERMANENT''' option and can't be undone.  
# '''Publish''' - publish dataset. The published datasets '''can't be edited'''. This is a '''PERMANENT''' option and can't be undone.   
= Data Browser: Datasets =
The Dataset Browser allows users to explore and manage datasets they are authorized to access. Access permissions are granted either because the dataset is ''public'' or through ''lab-based'', ''user-based'', or ''collaborative'' permissions authorized by a Principal Investigator (PI).
The Dataset Browser includes:
* A '''Navigation Bar''' on the left side for switching between dataset views and accessing hierarchical groupings.
* A '''Table View''' for displaying dataset metadata in a sortable and filterable format.
* '''Customization Tools''' in the upper-right corner to configure columns, saved views, and filters.
* An '''Upload Datasets''' button to submit datasets that were not harvested by NDTS.
== Navigation Bar and Hierarchical Organization ==
The Navigation Bar allows users to quickly access datasets across different categories. Unauthenticated users will only see public datasets and knowledgebase datasets.
; '''All Datasets'''
: Displays all datasets the user can access, including public and permission-granted datasets.
; '''All Public Datasets'''
: Displays only datasets marked as public.
; '''Knowledgebase Datasets'''
: Displays a curated subset of public datasets that are highly annotated and intended to aid users in experimental planning and analysis.
; '''My & Lab Data'''
: Displays datasets accessible via user- or lab-based permissions (excluding datasets that are visible only due to being public). 
: This section includes a hierarchical organization mirroring a file system:
:* '''My Collections''' – personal collections created by the user.
:* '''Projects''' – high-level lab groupings.
::* '''Studies''' – project subsets for specific investigations.
:::* '''Collections''' – fine-grained dataset groupings within a study.
== Table of Datasets ==
The Dataset Table displays all datasets for which the user has at least ''read access''. Each row represents one dataset, with metadata columns that can be customized per user.
At the bottom of the table is a '''pagination control'''. Users can move between pages and adjust the number of rows displayed per page: '''25''', '''50''', '''100''', or '''500''' datasets per page.
=== Display Name / Dataset Name ===
The '''Display Name''' is the first column and is always visible. It defaults to the non-editable '''Dataset Name''', which corresponds to the experiment directory name:
* VNMRJ: `expN`
* Bruker: `experiment/N`
Users can edit the '''Display Name''' to create a more meaningful label. When downloading, the dataset is saved using the original '''Dataset Name'''; the '''Display Name''' is saved in a `CSV` file within the downloaded folder.
----
=== How to Customize the Dataset Table View ===
# Click the '''Wrench''' icon in the top-right corner.
# Choose '''Displayed Columns''' to select which columns are visible.
# Use '''Save View''' to save your configuration.
#* Views can be edited or deleted.
# Use '''Saved Views''' to switch between existing configurations.
----
=== How to Download Experiments ===
# Select one or more experiments using the checkbox icon or by right-clicking.
# Right-click and choose '''Download'''.
# Select the download format:
#* '''Organized for Topspin''' – maintains Bruker format hierarchy.
#* '''Organized by Experiment''' – each experiment in its own folder.
----
=== How to Link Datasets to a Sample ===
There are two ways to link datasets to a sample:
# '''From the Dataset Editor'''
#* Double-click a dataset you have '''write access''' to.
#* Click '''Find & Link Sample''', select a sample, and click '''Save'''.
# '''From the Table View'''
#* Select datasets, right-click, and choose '''Link Sample'''.
#* Select a sample and click '''Save'''.
----
=== Quick Filters ===
Quick filters apply predefined views to narrow down datasets:
* '''Successful Datasets Only''' – shows datasets marked as successful.
* '''Hide Failed Datasets''' – hides datasets marked as failed.
* '''Non-Redundant Datasets''' – shows datasets marked as preferred.
* '''My Data''' – datasets owned by the logged-in user.
* '''Non-public Data''' – datasets not made public.
* '''KB Datasets''' – datasets published in the Knowledgebase.
To classify or mark datasets:
* Right-click the dataset and select '''Classification''' or '''Redundancy''' from the context menu.
----
=== Context Menu ===
Right-clicking a dataset opens the context menu. Available actions depend on the user’s permissions; unavailable actions appear grayed out.
Available options may include:
# '''Edit Dataset'''
#* Update metadata, classification, or redundancy status.
# '''Reassign'''
#* Assign to another lab user or reject misaligned data.
#* Data can be rejected within three months; rejected data is automatically assigned to the facility manager of the facility where the dataset was collected.
# '''Download'''
#* Download dataset(s).
# '''NMRbox Integration'''
#* Copy dataset to NMRbox home directory.
# '''Supplemental Data'''
#* Upload related data files.
# '''Redundancy'''
#* Mark as preferred/redundant.
# '''Link Sample'''
#* Associate with a sample (shows up in the Sample column).
# '''Classification'''
#* Label as:
#** Calibration experiment 
#** Failed – sample, instrument, or setup related 
#** Successful experiment 
#** Test experiment 
# '''Tags'''
#* Add searchable tags.
# '''Notes'''
#* Add or edit descriptive notes.
# '''Unlink from Collection'''
#* Remove from a dataset collection.
# '''Make Public'''
#* Permanently make dataset public (cannot be undone).
# '''Publish'''
#* Permanently publish dataset (cannot be undone or edited).
[[Category:Data Browser]]

Revision as of 16:13, 22 May 2025

Data Browser: Datasets

The Dataset Browser allows users to explore and manage datasets they are authorized to access. Access permissions are granted either due to the data being public or through lab-based, user-based, or collaborative access authorized by a Principal Investigator (PI).

The Data Browser is composed of different areas:

  • A Navigation Bar on the left-hand side allows users to quickly navigate between All Datasets, All Public Datasets, Knowledgebase Datasets, and My & Lab Data. Note that users who are not authenticated with an NMRhub account see only All Public Datasets and Knowledgebase Datasets. The My & Lab Data section has sub-headings for additional hierarchical organization, as described below.
  • A Table view for visualizing the datasets as rows with a series of metadata columns that are all sortable and filterable.
  • Customization tools in the upper right-hand corner for selecting saved views, applying quick filters, changing which metadata columns are displayed, creating and deleting views, and clearing filters on both displayed and hidden columns.
  • An Upload Datasets button that allows users to upload datasets collected on NAN-connected instruments that were not harvested by NDTS.

Navigation Bar

All Datasets
Displays every dataset the user can access, including both public datasets and restricted datasets the user has permission to view through lab permissions or collaboration.
All Public Datasets
Displays only datasets that have been marked as public, regardless of any user-specific permissions.
Knowledgebase Datasets
Displays a curated subset of public datasets designated as Knowledgebase (KB) datasets.
KB datasets are well-vetted, highly annotated datasets intended to aid users in running their own experiments and analyses.
My & Lab Data (Default View)
Displays datasets the user can view based on their PI-granted permissions.
Excludes datasets that are visible only because they are public.
If a dataset is public but the user would have had permission to see it regardless, it remains visible here.
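
The visibility rule above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not the Data Browser's actual implementation; the `DatasetAccess` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetAccess:
    """Hypothetical access flags for one dataset, as seen by one user."""
    is_public: bool          # the dataset has been made public
    has_pi_permission: bool  # lab-based, user-based, or collaborative permission


def in_all_datasets(d: DatasetAccess) -> bool:
    # All Datasets: anything the user can reach by either route.
    return d.is_public or d.has_pi_permission


def in_my_and_lab_data(d: DatasetAccess) -> bool:
    # My & Lab Data: a PI-granted permission is required. Public status
    # alone is not enough, but it does not hide a permitted dataset either.
    return d.has_pi_permission
```

Under these assumptions, a dataset that is public and also permitted appears in both views, while a dataset that is only public appears in All Datasets but not in My & Lab Data.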

Hierarchical Organization under My & Lab Data

The My & Lab Data section organizes user-accessible datasets hierarchically:

  • My Collections
    • Personal collections created and managed by the user.
  • Projects
    • High-level organizational units defined by the lab.
    • Studies
      • Subsets of Projects, grouping datasets for specific investigations.
      • Collections
        • Fine-grained dataset groupings within a Study.

This structure mirrors a traditional file system hierarchy: Projects → Studies → Collections

Table of Datasets

The Dataset Table displays all the datasets to which the user has at least read access. The datasets are shown as a table with user-adjustable columns representing metadata associated with the datasets.

Display Name/ Dataset Name

While the user may control which metadata columns are visible, and in which order, the Display Name column is fixed as the first column. The Display Name is initially set to the Dataset Name, which is not editable and is defined as the experiment directory name used when the data was collected. For VNMRJ it is the expN directory name, and for Bruker it is the experiment/N directory. The Display Name is user-editable, allowing the user to create a more meaningful name than the VNMRJ or Topspin software allows while keeping the original Dataset Name fixed. When a user downloads an experiment, it is saved using the Dataset Name (the original experiment name), and the corresponding Display Name is saved in a CSV file located in the downloaded experiment’s folder.


How to customize the Datasets Table view

To customize the view of the Datasets Table

  1. Click on the "Wrench" Icon in the top-right corner of the table
  2. Click on the "Displayed Columns" button and select which columns you want to be visible in the Datasets Table
  3. Click the "Save View" button
    • The saved views can be edited and deleted
  4. The user can create multiple views
    • To switch between different views click on the "Saved Views" button

How to download experiments

To download experiments

  1. Navigate to the Datasets section of the Data Browser
  2. Locate the experiments you want to download
  3. Click the "Checkbox" icon to select multiple experiments.
    • Alternatively, right-click on a single experiment to automatically select it
  4. Right-click and select the "Download" option from the context menu
  5. Choose data organization format:
    • Organized for Topspin
      • If the datasets are in Bruker format, you can download them structured for Topspin
      • The Topspin hierarchy of the data, including all folders/subfolders, will be preserved
    • Organized by experiment
      • Each experiment will be in a separate folder

How to link datasets to a sample

There are two ways of linking datasets to a sample.

  1. Through editing the dataset
    • Double-click the dataset that you have write access to
    • In the pop-up window that appears, click on "Find & Link Sample"
    • Choose the sample from the table of available samples
    • Click "Save" to confirm the link
  2. Through Datasets Table
    • Select one or more experiments you wish to link to a sample
    • Right-click and choose the "Link Sample" option from the context menu
    • Choose the sample from the table of available samples
    • Click "Save" to confirm the link

Quick Filters

Quick filters are predefined views that allow users to quickly access specific datasets. The user can apply one or more filters:

  • Successful Datasets Only – displays datasets classified as successful
    • To classify a dataset, right-click on it and select "Classification" from the context menu
  • Hide Failed Datasets – hides datasets that were classified as failed
  • Non-Redundant Datasets – displays datasets marked as preferred
    • To mark a dataset as preferred/redundant, right-click on it and select "Redundancy" from the context menu
  • My Data – displays datasets owned by the logged-in user
  • Non-public Data – displays datasets that have not been made public
  • KB Datasets – displays datasets that have been published in the Knowledgebase

Context menu

Right-clicking on a dataset opens the Context menu, which provides various actions that users can perform on the selected dataset. The available options depend on the user's permissions for that dataset; options that are not allowed appear grayed out. The options might include:

  1. Edit Dataset
    • Change basic information about the dataset
    • Mark dataset as preferred/redundant
  2. Reassign
    • Reassign dataset to another user within the lab
    • Reject misaligned data
      • User has three months to reject the data
      • The rejected data are automatically assigned to the facility manager of the facility in which the dataset was collected
  3. Download - download selected datasets
  4. NMRbox Integration - copy the dataset to NMRbox
  5. Supplemental Data - upload supplemental data
  6. Redundancy - mark a dataset as preferred/redundant
  7. Link Sample - link the dataset to the sample that has been created in the Samples section of the Data Browser
    • Name of the sample and a link to its information will appear in the Sample Column of Datasets Table
  8. Classification - classify dataset as
    • Calibration experiment
    • Failed - sample related
    • Failed - instrument related
    • Failed - setup related
    • Successful experiment
    • Test experiment
  9. Tags - assign tags to categorize the dataset
  10. Notes - add or update dataset notes
  11. Unlink from Collection - if the dataset is linked to a Dataset Collection, the user can unlink it
  12. Make Public - make the dataset public. This is a PERMANENT option and can't be undone.
  13. Publish - publish dataset. The published datasets can't be edited. This is a PERMANENT option and can't be undone.


Data Browser: Datasets

The Dataset Browser allows users to explore and manage datasets they are authorized to access. Access permissions are granted either because the dataset is public or through lab-based, user-based, or collaborative permissions authorized by a Principal Investigator (PI).

The Dataset Browser includes:

  • A Navigation Bar on the left side for switching between dataset views and accessing hierarchical groupings.
  • A Table View for displaying dataset metadata in a sortable and filterable format.
  • Customization Tools in the upper-right corner to configure columns, saved views, and filters.
  • An Upload Datasets button to submit datasets that were not harvested by NDTS.

Navigation Bar and Hierarchical Organization

The Navigation Bar allows users to quickly access datasets across different categories. Unauthenticated users will only see public datasets and knowledgebase datasets.

All Datasets
Displays all datasets the user can access, including public and permission-granted datasets.
All Public Datasets
Displays only datasets marked as public.
Knowledgebase Datasets
Displays a curated subset of public datasets that are highly annotated and intended to aid users in experimental planning and analysis.
My & Lab Data
Displays datasets accessible via user- or lab-based permissions (excluding datasets that are visible only due to being public).
This section includes a hierarchical organization mirroring a file system:
  • My Collections – personal collections created by the user.
  • Projects – high-level lab groupings.
    • Studies – project subsets for specific investigations.
      • Collections – fine-grained dataset groupings within a study.
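
As a rough illustration of how this hierarchy mirrors a file system, the nesting can be modeled with three small Python classes. The class names, fields, and the `collection_paths` helper are hypothetical and chosen only for this sketch; they do not describe the Data Browser's internal data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Collection:
    name: str
    datasets: List[str] = field(default_factory=list)  # dataset names


@dataclass
class Study:
    name: str
    collections: List[Collection] = field(default_factory=list)


@dataclass
class Project:
    name: str
    studies: List[Study] = field(default_factory=list)


def collection_paths(project: Project) -> List[str]:
    """Return one file-system-style path per collection in the project."""
    return [
        f"{project.name}/{study.name}/{collection.name}"
        for study in project.studies
        for collection in study.collections
    ]
```

For a hypothetical project `ProjA` with a study `S1` holding collections `C1` and `C2`, the helper yields `ProjA/S1/C1` and `ProjA/S1/C2`, which is the Project → Study → Collection path a user follows in the Navigation Bar.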

Table of Datasets

The Dataset Table displays all datasets for which the user has at least read access. Each row represents one dataset, with metadata columns that can be customized per user.

At the bottom of the table is a pagination control. Users can move between pages and adjust the number of rows displayed per page: 25, 50, 100, or 500 datasets per page.
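
To estimate how many pages a given page size produces, the arithmetic can be sketched as follows. The function names are hypothetical, and the sketch assumes a simple fixed-size paging scheme, which the page does not spell out.

```python
import math

# Page sizes offered by the pagination control.
ALLOWED_PAGE_SIZES = (25, 50, 100, 500)


def page_count(total_datasets: int, page_size: int) -> int:
    """Number of pages needed to show every dataset."""
    if page_size not in ALLOWED_PAGE_SIZES:
        raise ValueError(f"page_size must be one of {ALLOWED_PAGE_SIZES}")
    if total_datasets < 0:
        raise ValueError("total_datasets must not be negative")
    return math.ceil(total_datasets / page_size)


def page_rows(page: int, page_size: int, total_datasets: int) -> range:
    """Zero-based row indices shown on a 1-based page number."""
    start = (page - 1) * page_size
    stop = min(start + page_size, total_datasets)
    return range(start, stop)
```

For example, 1,234 datasets at 50 rows per page need 25 pages, and the last page holds the remaining 34 rows; at 500 rows per page the same table fits on 3 pages.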

Display Name / Dataset Name

The Display Name is the first column and is always visible. It defaults to the non-editable Dataset Name, which corresponds to the experiment directory name:

  • VNMRJ: `expN`
  • Bruker: `experiment/N`

Users can edit the Display Name to create a more meaningful label. When downloading, the dataset is saved using the original Dataset Name; the Display Name is saved in a `CSV` file within the downloaded folder.


How to Customize the Dataset Table View

  1. Click the Wrench icon in the top-right corner.
  2. Choose Displayed Columns to select which columns are visible.
  3. Use Save View to save your configuration.
    • Views can be edited or deleted.
  4. Use Saved Views to switch between existing configurations.

How to Download Experiments

  1. Select one or more experiments using the checkbox icon or by right-clicking.
  2. Right-click and choose Download.
  3. Select the download format:
    • Organized for Topspin – maintains Bruker format hierarchy.
    • Organized by Experiment – each experiment in its own folder.

How to Link Datasets to a Sample

There are two ways to link datasets to a sample:

  1. From the Dataset Editor
    • Double-click a dataset you have write access to.
    • Click Find & Link Sample, select a sample, and click Save.
  2. From the Table View
    • Select datasets, right-click, and choose Link Sample.
    • Select a sample and click Save.

Quick Filters

Quick filters apply predefined views to narrow down datasets:

  • Successful Datasets Only – shows datasets marked as successful.
  • Hide Failed Datasets – hides datasets marked as failed.
  • Non-Redundant Datasets – shows datasets marked as preferred.
  • My Data – datasets owned by the logged-in user.
  • Non-public Data – datasets not made public.
  • KB Datasets – datasets published in the Knowledgebase.

To classify or mark datasets:

  • Right-click the dataset and select Classification or Redundancy from the context menu.
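
One way to think about quick filters is as named predicates over dataset records. The sketch below is illustrative only: the `Dataset` fields and classification strings are hypothetical, and combining several active filters with a logical AND is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Dataset:
    name: str
    owner: str
    classification: str  # hypothetical values, e.g. "successful", "failed - setup related"
    preferred: bool      # redundancy status: True means marked as preferred
    public: bool
    knowledgebase: bool


def quick_filters(current_user: str) -> Dict[str, Callable[[Dataset], bool]]:
    """Map each quick-filter label to a predicate on a dataset."""
    return {
        "Successful Datasets Only": lambda d: d.classification == "successful",
        "Hide Failed Datasets": lambda d: not d.classification.startswith("failed"),
        "Non-Redundant Datasets": lambda d: d.preferred,
        "My Data": lambda d: d.owner == current_user,
        "Non-public Data": lambda d: not d.public,
        "KB Datasets": lambda d: d.knowledgebase,
    }


def apply_filters(datasets: Iterable[Dataset], names: List[str],
                  current_user: str) -> List[Dataset]:
    # Assumption: a dataset must satisfy every active filter (logical AND).
    filters = quick_filters(current_user)
    active = [filters[n] for n in names]
    return [d for d in datasets if all(f(d) for f in active)]
```

With this model, selecting "My Data" together with "Hide Failed Datasets" keeps only the logged-in user's datasets that are not classified as failed.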

Context Menu

Right-clicking a dataset opens the context menu. Available actions depend on the user’s permissions; unavailable actions appear grayed out.

Available options may include:

  1. Edit Dataset
    • Update metadata, classification, or redundancy status.
  2. Reassign
    • Assign to another lab user or reject misaligned data.
    • Data can be rejected within three months; rejected data is automatically assigned to the facility manager of the facility where the dataset was collected.
  3. Download
    • Download dataset(s).
  4. NMRbox Integration
    • Copy dataset to NMRbox home directory.
  5. Supplemental Data
    • Upload related data files.
  6. Redundancy
    • Mark as preferred/redundant.
  7. Link Sample
    • Associate with a sample (shows up in the Sample column).
  8. Classification
    • Label as:
      • Calibration experiment
      • Failed – sample, instrument, or setup related
      • Successful experiment
      • Test experiment
  9. Tags
    • Add searchable tags.
  10. Notes
    • Add or edit descriptive notes.
  11. Unlink from Collection
    • Remove from a dataset collection.
  12. Make Public
    • Permanently make dataset public (cannot be undone).
  13. Publish
    • Permanently publish dataset (cannot be undone or edited).
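
The three-month rejection window under Reassign can be checked with a small date calculation. This is a sketch under stated assumptions: the page does not say which date starts the window or whether the final day is included, so `window_start` and the inclusive comparison are hypothetical choices, as are the function names.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def can_reject(window_start: date, today: date, window_months: int = 3) -> bool:
    # Assumption: the window opens on window_start and its last day is included.
    return window_start <= today <= add_months(window_start, window_months)
```

Under these assumptions, a window starting on 15 January 2025 stays open through 15 April 2025 and is closed on 16 April 2025.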