Redundant Datasets
Revision as of 17:12, 28 May 2025, by Mmaciejewski
Overview
The Redundancy action allows users to manage and curate datasets that were collected under the same conditions within a session. This helps maintain a streamlined view in the Data Browser while preserving access to repeated experiments collected during method development, calibration, or repeated runs.
Background
All datasets harvested by NDTS are associated with a session ID, which is assigned when a user:
- Logs in to an instrument,
- Starts the NMR software (VnmrJ or TopSpin), or
- Changes the NAN user in the NDTS GUI.
Each dataset is named after its acquisition directory (e.g., `expN` for VnmrJ or `data_directory/N` for TopSpin). Users commonly acquire multiple datasets with the same name in a session, often for calibration, 2D projections, or trial runs prior to the final experiment.
NDTS automatically harvests all datasets, including these repeated runs. To avoid clutter, datasets within a session that share the same name are grouped, with:
- The last acquired dataset marked as Preferred,
- All earlier versions in the session marked as Redundant.
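The grouping rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` record, its field names, and the `tag_defaults` function are hypothetical and do not reflect the actual NDTS data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dataset:
    # Hypothetical record; field names are illustrative, not the NDTS schema.
    session_id: str
    name: str
    acquired_at: datetime
    status: str = ""


def tag_defaults(datasets):
    """Apply the default rule: last acquired per (session, name) is Preferred."""
    groups = defaultdict(list)
    for ds in datasets:
        groups[(ds.session_id, ds.name)].append(ds)
    for members in groups.values():
        members.sort(key=lambda ds: ds.acquired_at)
        for ds in members[:-1]:
            ds.status = "Redundant"      # every earlier run in the group
        members[-1].status = "Preferred"  # the most recent run
    return datasets


runs = tag_defaults([
    Dataset("s1", "exp1", datetime(2025, 5, 28, 9, 0)),
    Dataset("s1", "exp1", datetime(2025, 5, 28, 9, 30)),
    Dataset("s1", "exp2", datetime(2025, 5, 28, 9, 45)),
    Dataset("s2", "exp1", datetime(2025, 5, 28, 11, 0)),
])
statuses = [ds.status for ds in runs]
```

In this sketch, only the first `exp1` run in session `s1` ends up Redundant; the `exp1` run in session `s2` is unaffected because grouping is per session.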
Using the Redundancy Action
From the dataset context menu, users can choose the Redundancy action, which offers the following options:
- View Redundant Set – Displays all datasets that share the same name and session ID, including both preferred and redundant entries.
- Mark as Redundant – Marks the selected dataset as redundant, removing it from the default browser view.
- Mark as Preferred – Marks the selected dataset as preferred, overriding the system default and demoting all other datasets in the set to redundant.
- Reset to Default – Reverts the redundancy settings to the default behavior, where the most recently collected dataset is preferred.
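The effect of these options on a single redundant set can be modeled as follows. This is a minimal sketch, not the NDTS implementation: the dictionary keys and function names are hypothetical, and the sketch assumes that Mark as Redundant changes only the selected dataset, which the description above does not spell out.

```python
def default_preferred(rset):
    """Index of the system default: the most recently acquired dataset."""
    return max(range(len(rset)), key=lambda i: rset[i]["acquired"])


def mark_preferred(rset, index):
    """Promote one dataset and demote every other member of the set."""
    for i, ds in enumerate(rset):
        ds["status"] = "Preferred" if i == index else "Redundant"


def mark_redundant(rset, index):
    """Assumption: only the selected dataset changes state."""
    rset[index]["status"] = "Redundant"


def reset_to_default(rset):
    """Restore the default, where the latest acquisition is preferred."""
    mark_preferred(rset, default_preferred(rset))


# One redundant set: same name and session ID, acquisition order 1..3.
rset = [{"acquired": n, "status": ""} for n in (1, 2, 3)]

reset_to_default(rset)
after_default = [ds["status"] for ds in rset]

mark_preferred(rset, 0)  # user overrides the system default
after_override = [ds["status"] for ds in rset]

reset_to_default(rset)
after_reset = [ds["status"] for ds in rset]
```

Because the default is derived from acquisition order rather than stored, resetting always recovers the same state regardless of earlier overrides.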
Browser Behavior
- By default, the Data Browser displays only Preferred datasets to provide a clean and responsive interface.
- Users can identify datasets with redundancies via an icon on the preferred dataset. Clicking the icon opens the full set for review.
- A Redundant column may be enabled in the browser view. When this column is active, the Data Browser shows both preferred and redundant datasets for complete visibility.
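The display rules above amount to a simple filter. The following sketch assumes a hypothetical in-memory list of dataset records; the function names and the `redundant_column` flag are illustrative and are not part of the Data Browser itself.

```python
def visible_rows(datasets, redundant_column=False):
    """Rows shown in the browser: Preferred only, unless the column is enabled."""
    if redundant_column:
        return list(datasets)
    return [ds for ds in datasets if ds["status"] == "Preferred"]


def has_redundancy_icon(ds, datasets):
    """True for a preferred dataset with at least one redundant sibling."""
    return ds["status"] == "Preferred" and any(
        other is not ds
        and other["status"] == "Redundant"
        and (other["session"], other["name"]) == (ds["session"], ds["name"])
        for other in datasets
    )


datasets = [
    {"session": "s1", "name": "exp1", "status": "Redundant"},
    {"session": "s1", "name": "exp1", "status": "Preferred"},
    {"session": "s1", "name": "exp2", "status": "Preferred"},
]

default_view = visible_rows(datasets)
full_view = visible_rows(datasets, redundant_column=True)
icons = [has_redundancy_icon(ds, datasets) for ds in default_view]
```

Here the default view lists two rows, and only the preferred `exp1` row carries the icon, since `exp2` has no redundant siblings.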
Real-Time Harvesting Implications
Because NDTS harvests datasets in real time:
- The first dataset acquired is initially tagged as preferred.
- Each new dataset with the same name that arrives within the same session becomes the preferred dataset, and the previously preferred one is re-tagged as redundant.
- This dynamic tagging continues until the acquisition session ends.
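The dynamic tagging described above can be illustrated with a small incremental tagger that processes datasets one at a time as they arrive. The class and method names are hypothetical, and the sketch is not how NDTS is actually implemented.

```python
class RealTimeTagger:
    """Tracks the currently preferred dataset per (session ID, name)."""

    def __init__(self):
        self._current = {}  # (session_id, name) -> preferred record

    def harvest(self, session_id, name):
        """Tag a newly arrived dataset and demote the previous one, if any."""
        key = (session_id, name)
        record = {"session": session_id, "name": name, "status": "Preferred"}
        previous = self._current.get(key)
        if previous is not None:
            previous["status"] = "Redundant"
        self._current[key] = record
        return record


tagger = RealTimeTagger()
first = tagger.harvest("s1", "exp1")
status_after_first = first["status"]   # first arrival starts as preferred

second = tagger.harvest("s1", "exp1")  # same name, same session
other = tagger.harvest("s2", "exp1")   # new session, separate group
```

A dataset harvested under a new session ID starts a new group, so earlier sessions are never re-tagged by later acquisitions.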
Users may override these settings using the Redundancy tools described above. The default state is always preserved and can be restored using the Reset to Default option.
The redundancy system balances efficient browsing with full data traceability, which is particularly useful when preparing curated datasets or selecting experiments for publication or machine learning training.