Datasets: Difference between revisions
Mmaciejewski (talk | contribs) |
Apozhidaeva (talk | contribs) No edit summary |
(21 intermediate revisions by one other user not shown) | |||
Latest revision as of 18:48, 23 June 2025
Data Browser: Datasets
The Dataset Browser allows users to explore and manage datasets they are authorized to access. Access permissions are granted either because the dataset is public or through lab-based, user-based, or collaborative permissions authorized by a Principal Investigator (PI).
The Dataset Browser includes:
- A Navigation Pane on the left side for switching between dataset views and for the hierarchical organization of My & Lab Data, described below.
- A Dataset Table for displaying rows of datasets with columns representing metadata that may be sorted and filtered.
- Customization Tools in the upper-right corner to configure columns, saved views, and filters.
- Advanced Selection, Sorting, and Filtering of datasets
- Ability to perform Actions quickly and easily on one or more datasets from a context menu.
- An Upload Datasets button to submit datasets that were not harvested by NDTS.
Navigation Pane

The Navigation Pane allows users to quickly access datasets across different categories. Unauthenticated users will only see All Public Datasets and Knowledgebase Datasets.
All Datasets
- Displays all datasets the user can access, including public and permission-granted datasets.
All Public Datasets
- Displays only datasets marked as public.
Knowledgebase Datasets
- Displays a curated subset of public datasets that are highly annotated and intended to aid users in experimental planning and analysis.
My & Lab Data
- Displays datasets accessible via user- or lab-based permissions (excluding datasets that are visible only due to being public).
- This section includes a hierarchical organization mirroring a file system, including user-defined Collections and lab-controlled Projects --> Studies --> Collections:
- My Collections – personal collections created by the user.
- Projects – high-level groupings for data organization
  - Studies – reside inside Projects to allow datasets from a given study to be grouped
    - Collections – reside inside Studies to allow fine-grained dataset groupings
Dataset Table
The Dataset Table displays all datasets for which the user has at least read access.
Table Rows
Each row represents a dataset in the NAN archive.
Table Columns
Columns represent different metadata fields for the NAN dataset. A default set of columns is displayed, but users can toggle columns on and off by selecting the wrench icon in the upper-right corner of the Dataset Browser. Columns may be reordered by dragging them. The displayed columns, along with their order, are saved in the NAN database as a user preference and persist across sessions, browsers, and computers. See Dataset Columns for a complete list of columns, each with a short description and the types of filters that may be applied.
Redundant Status Column
- By default, the Data Browser displays only Preferred datasets to provide a clean and responsive interface.
- Users can identify datasets with redundancies via an icon on the preferred dataset. Clicking the icon opens the full redundant set for review (see Icon badges below).
- A Redundant column may be enabled in the browser view. When this column is active, the Data Browser shows both preferred and redundant datasets for complete visibility.
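The default view and the effect of enabling the Redundant column can be sketched as follows. This is only an illustration of the behavior described above: the `Dataset` record, its `redundancy` field, and the `visible_datasets` function are hypothetical names, not NAN's actual implementation, and the sketch assumes that a dataset with no redundancies counts as preferred.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    display_name: str
    redundancy: str  # hypothetical field: "preferred" or "redundant"


def visible_datasets(datasets: List[Dataset], redundant_column_enabled: bool) -> List[Dataset]:
    """Return the rows shown in the table under the behavior described above."""
    if redundant_column_enabled:
        # With the Redundant column active, preferred and redundant datasets are both shown.
        return list(datasets)
    # Default view: only preferred datasets are shown.
    return [d for d in datasets if d.redundancy == "preferred"]


rows = [Dataset("HSQC run 2", "preferred"), Dataset("HSQC run 1", "redundant")]
default_names = [d.display_name for d in visible_datasets(rows, False)]
all_names = [d.display_name for d in visible_datasets(rows, True)]
```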
Display Name / Dataset Name
When a dataset is harvested by the NAN Data Transport System it is stored in the NAN database with a unique UUID (hidden from the user) and is given a Dataset Name (non-editable) that matches the experimental directory from the NMR spectrometer.
- VNMRJ: "expN"
- Bruker: "experiment/N"
As the Dataset Name is generally not a useful description of the experiment, we also create a user-editable Display Name so that users can give the dataset a more descriptive and meaningful label. When downloading, the dataset is saved using the original Dataset Name and the Display Name is saved in a CSV file within the dataset folder. Note that the Display Name is fixed as the first column of the dataset table and its position cannot be changed. The Dataset Name column is not displayed by default, but can be toggled on if desired.
Icon badges
Icon badges in select columns represent additional information about the dataset and provide navigation links as described here:
- Display Name icon badges
  - A circle with a star indicates that the dataset is marked as the "preferred" dataset of a redundant set. The total number of datasets in the redundant set is shown in parentheses. Clicking the star shows all the datasets in the redundant set and provides a breadcrumb to navigate back to the default view.
  - A small clock icon indicates that the dataset has been published. Clicking the icon allows you to navigate to previous published versions or the original dataset.
- Sample icons
  - A link icon indicates that a sample has been linked to the dataset. Clicking the link icon opens the Sample Browser filtered on the linked sample.
Pagination
At the bottom of the table is a pagination control. Users can move between pages and adjust the number of datasets displayed per page: 25, 50, 100, or 500.
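As a quick illustration of the arithmetic behind the pagination control, the sketch below computes how many pages a result set spans and which rows fall on a given page. The function names are hypothetical and the code is not taken from the Data Browser itself; only the page sizes come from the description above.

```python
import math
from typing import Tuple

PAGE_SIZES = (25, 50, 100, 500)  # page sizes offered by the pagination control


def page_count(total_datasets: int, page_size: int) -> int:
    """Number of pages needed to show every dataset."""
    if page_size not in PAGE_SIZES:
        raise ValueError(f"page size must be one of {PAGE_SIZES}")
    return math.ceil(total_datasets / page_size)


def page_bounds(page: int, page_size: int, total_datasets: int) -> Tuple[int, int]:
    """1-based positions of the first and last dataset on a 1-based page."""
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total_datasets)
    return first, last


pages = page_count(1234, 50)
last_page_rows = page_bounds(pages, 50, 1234)
```

With 1,234 datasets and 50 per page, the final page is only partly filled, which is why `page_bounds` caps the last position at the total.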
Customization Tools

Wrench Icon
- Brings up a pull-down menu to toggle which columns are shown in the dataset table.
- Allows a View to be created, overwritten, or deleted. Note that the displayed columns are saved as a user preference and are not tied to a View. A View defines the filters and sorts applied to the columns and is independent of which columns are visible. For example, if a project's datasets were all collected between two dates, you can define a View that filters for datasets from the specific users involved in the project and collected between those dates, so that you can quickly see those datasets without reapplying the filters.
Saved Views
- Pull-down list of saved views (defined filters and sort)
Quick Filters
Quick filters apply predefined views to narrow down datasets. Current Quick Filters include:
- Successful Datasets Only – shows datasets marked as successful.
- Hide Failed Datasets – hides datasets marked as failed.
- My Data – datasets owned by the logged-in user.
- KB Datasets – datasets published in the Knowledgebase.
Note that the successful and failed dataset filters rely on datasets being properly classified.
Remove Filters Icon
- When no filters are applied to any columns, the icon appears faded and is not selectable.
- When the icon is not faded, it is selectable, and pressing it clears all applied filters and sorts.
- When the icon contains an exclamation point, filters or sorts are active on a non-visible column. Pressing the icon prompts whether to remove all filters and sorts or only those on the non-visible columns.
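The three icon states described above can be summarized in a small decision function. This is a hedged sketch: the function name, the state labels, and the representation of columns as sets of names are all assumptions made for illustration, and the sketch treats sorts the same way as filters.

```python
from typing import Set


def remove_filters_icon_state(filtered_or_sorted: Set[str], visible: Set[str]) -> str:
    """Return a label for the Remove Filters icon (labels are hypothetical).

    filtered_or_sorted: columns that currently have a filter or sort applied.
    visible: columns currently shown in the dataset table.
    """
    if not filtered_or_sorted:
        return "faded"  # nothing to clear, icon is not selectable
    if filtered_or_sorted - visible:
        return "exclamation"  # a hidden column still has a filter or sort
    return "active"  # pressing clears all filters and sorts


state_none = remove_filters_icon_state(set(), {"Display Name", "Date"})
state_hidden = remove_filters_icon_state({"Tags"}, {"Display Name", "Date"})
state_visible = remove_filters_icon_state({"Date"}, {"Display Name", "Date"})
```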
Selection Icon
- Shown as a circle with a line through it. The icon becomes visible when one or more datasets are selected, and pressing it clears all selections. This is handy when datasets are selected but not visible on the screen.
Selection, Sorting, and Filtering

Selection
Datasets can be selected by clicking on the Display Name. The Dataset Browser supports multi-selection, with a checkbox next to each Display Name indicating selection status.
To select multiple datasets:
- Hold the Ctrl key (or Cmd on Mac) and click on Display Names to toggle individual selections.
- Hold the Shift key to select a range from the last selected to the current dataset.
IMPORTANT NOTE: There is inconsistent behavior when using Shift and Ctrl keys with the checkboxes. It is strongly recommended to use the Display Name for selection. Treat checkboxes only as visual indicators.
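The selection rules above can be modeled roughly as follows. This is a sketch under stated assumptions, not the Data Browser's code: the `SelectionModel` class is hypothetical, and the behavior of a plain click (replacing the current selection) is an assumption, since the page only describes the Ctrl/Cmd and Shift cases.

```python
from typing import List, Optional, Set


class SelectionModel:
    """Hypothetical model of Display Name clicks in the dataset table."""

    def __init__(self, row_ids: List[str]) -> None:
        self.row_ids = row_ids  # rows in their current display order
        self.selected: Set[str] = set()
        self.last_clicked: Optional[str] = None

    def click(self, row_id: str, ctrl: bool = False, shift: bool = False) -> None:
        if shift and self.last_clicked is not None:
            # Shift: select the range from the last clicked row to this one.
            start = self.row_ids.index(self.last_clicked)
            end = self.row_ids.index(row_id)
            low, high = sorted((start, end))
            self.selected.update(self.row_ids[low:high + 1])
        elif ctrl:
            # Ctrl (Cmd on Mac): toggle this row only.
            self.selected ^= {row_id}
        else:
            # Assumption: a plain click selects only this row.
            self.selected = {row_id}
        self.last_clicked = row_id

    def clear(self) -> None:
        """Equivalent of pressing the Selection icon."""
        self.selected.clear()
        self.last_clicked = None


model = SelectionModel(["a", "b", "c", "d", "e"])
model.click("b")
model.click("d", shift=True)
after_shift = set(model.selected)
model.click("c", ctrl=True)
after_ctrl = set(model.selected)
model.clear()
```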
By default, datasets are sorted by date, with the most recent shown first. Sorting and filtering are available for all columns.
Sorting
Each column header includes a sort button (up/down arrows). Click once to sort in ascending order; click again to sort in descending order.
Filtering
Each column header includes a filter icon that opens a filtering dialog. The available filter types depend on the column's data type. The table below summarizes the available filters. See Dataset Columns for which filter types apply to each column.
Date | Boolean | Text | Number | Controlled List Classification / Controlled List Transfer Mode / Tags |
---|---|---|---|---|
equals | yes | equals | equals | includes |
before | no | does not equal | does not equal | does not include |
after | is unset | contains | greater than | |
is set | | does not contain | less than | |
 | | similar to | | |
 | | starts with | | |
 | | ends with | | |
 | | is unset | | |
For all filter types except Boolean, users can add multiple filter rules per column. If multiple rules are added, the user must specify whether to "Match All" (AND) or "Match Any" (OR). This setting is ignored if only one rule is applied.
Advanced filters can also span multiple columns. While building complex filters may take effort, users can save views for reuse. See Customization Tools for details.
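To make the "Match All" and "Match Any" settings concrete, the sketch below evaluates several rules on one column and then combines filters across columns. It is an illustration only: the `ColumnFilter` class, the column names, and the sample rows are hypothetical, only a subset of the operators is implemented, and combining filters from different columns with AND is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# A subset of the filter operators listed in the table above.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda cell, value: cell == value,
    "does not equal": lambda cell, value: cell != value,
    "contains": lambda cell, value: value in cell,
    "does not contain": lambda cell, value: value not in cell,
    "starts with": lambda cell, value: cell.startswith(value),
    "ends with": lambda cell, value: cell.endswith(value),
    "greater than": lambda cell, value: cell > value,
    "less than": lambda cell, value: cell < value,
}


@dataclass
class ColumnFilter:
    column: str
    rules: List[Tuple[str, Any]]  # (operator name, value) pairs
    match_all: bool = True  # True = "Match All" (AND), False = "Match Any" (OR)

    def matches(self, row: Dict[str, Any]) -> bool:
        results = [OPERATORS[op](row[self.column], value) for op, value in self.rules]
        return all(results) if self.match_all else any(results)


def apply_filters(rows: List[Dict[str, Any]], filters: List[ColumnFilter]) -> List[Dict[str, Any]]:
    # Assumption: filters on different columns are combined with AND.
    return [row for row in rows if all(f.matches(row) for f in filters)]


rows = [
    {"Display Name": "HSQC apo", "Scans": 16},
    {"Display Name": "HSQC bound", "Scans": 64},
    {"Display Name": "NOESY apo", "Scans": 32},
]
filters = [
    ColumnFilter("Display Name", [("starts with", "HSQC"), ("ends with", "apo")], match_all=False),
    ColumnFilter("Scans", [("greater than", 16), ("less than", 64)], match_all=True),
]
matching_names = [row["Display Name"] for row in apply_filters(rows, filters)]
```

With a single rule, `all` and `any` give the same result, which mirrors the note that the setting is ignored when only one rule is applied.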
Actions

The Actions menu is accessed by right-clicking on a dataset row in the Dataset Browser. For multiple selections, right-click on any of the selected rows to perform bulk actions. Available actions depend on user permissions; actions unavailable to the user appear grayed out.
The table below lists the available actions; some link to a page with additional details.
Action | Bulk Action Capable | Description |
---|---|---|
View / Edit Dataset | Possibly | Opens a modal window to view or edit the selected dataset. |
Reassign | Yes | Assigns or reassigns a dataset to a NAN user. Facility managers can reassign datasets to any user without time restrictions. Standard users can reassign within their lab group for up to three months after harvesting. |
Download | Yes | Downloads datasets in a variety of organizational layouts. |
NMRbox Integration | Yes | Copies a dataset from the NAN archive to the user’s NMRbox home folder in a predefined location. Also enables retrieval of post-acquisition files from NMRbox back into the NAN archive. |
Supplemental Data | No | Adds or views supplemental data associated with a dataset. |
Redundancy | Yes | Sets the dataset’s redundancy status as “preferred” or “redundant.” By default, the most recent experiment in a redundant set is marked as preferred. |
Make Public | Yes | Marks the dataset as publicly available. |
Link Sample | Yes | Links a dataset to a sample. |
Classification | Yes | Allows users to classify datasets from a controlled list. Allows NMR facility managers to target a dataset for removal from the NAN archive. |
Tags | Yes | Allows users to assign arbitrary tags to datasets. |
Notes | Yes | Allows users to add notes to datasets. |
Unlink from Collection | Yes | Removes a dataset from a collection. |
Publish | Yes | Publishes a dataset. |
Copy Dataset Link | No | Copies the URL of a dataset to the clipboard. |
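The Reassign rule in the table above combines a role check with a time limit. A minimal sketch of that logic follows; the function and parameter names are hypothetical, and approximating "three months" as 90 days is an assumption, since the page does not say how the window is computed.

```python
from datetime import date, timedelta

# Assumption: "three months after harvesting" approximated as 90 days.
REASSIGN_WINDOW = timedelta(days=90)


def can_reassign(is_facility_manager: bool, same_lab_group: bool,
                 harvested_on: date, today: date) -> bool:
    """Return True if the Reassign action would be allowed under the stated rule."""
    if is_facility_manager:
        # Facility managers can reassign to any user with no time restriction.
        return True
    # Standard users: only within their lab group, and only inside the window.
    return same_lab_group and (today - harvested_on) <= REASSIGN_WINDOW


manager_late = can_reassign(True, False, date(2024, 1, 1), date(2025, 1, 1))
user_in_window = can_reassign(False, True, date(2025, 1, 1), date(2025, 2, 1))
user_too_late = can_reassign(False, True, date(2025, 1, 1), date(2025, 6, 1))
user_other_lab = can_reassign(False, False, date(2025, 1, 1), date(2025, 1, 2))
```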
Upload Datasets
See the Arbitrary Dataset Upload page for details.