Dataset Classification

From Network for Advanced NMR


Classification Modal

From the Data Browser, users may classify one or more datasets by selecting the Classification action from the context menu. This launches a dialog that allows the user to choose a classification from a controlled list:

  • Calibration experiment
  • Failed experiment due to sample-related issues
  • Failed experiment due to instrument-related issues
  • Failed experiment due to setup issues
  • Successful experiment
  • Test experiment

Classifying datasets adds a layer of annotation that supports long-term data reuse and analysis. It takes a little extra effort from users, but the dialog is designed to keep that effort small: classifications can be applied in bulk, so an entire set of experiments can be labeled in a few seconds.
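To make the idea of a controlled list concrete, the following Python sketch models the six classifications as an enumeration and applies one label to several dataset records at once. It is only an illustration: the record layout (plain dictionaries with an "id" key) and the function name classify_bulk are assumptions made for this example, not part of NAN's software or any documented NAN interface.

```python
from enum import Enum


class Classification(Enum):
    """The six labels offered in the classification dialog."""
    CALIBRATION = "Calibration experiment"
    FAILED_SAMPLE = "Failed experiment due to sample-related issues"
    FAILED_INSTRUMENT = "Failed experiment due to instrument-related issues"
    FAILED_SETUP = "Failed experiment due to setup issues"
    SUCCESSFUL = "Successful experiment"
    TEST = "Test experiment"


def classify_bulk(datasets, label):
    """Apply one label to every selected record (hypothetical dict records)."""
    # Enum lookup by value rejects anything outside the controlled list
    # by raising ValueError.
    label = Classification(label)
    for ds in datasets:
        ds["classification"] = label.value
    return datasets


# Hypothetical selection of three datasets, labeled in one call.
selected = [{"id": 101}, {"id": 102}, {"id": 103}]
classify_bulk(selected, "Test experiment")
```

Because the label is validated against the enumeration, a misspelled or free-text classification is rejected rather than stored, which is the property that keeps later filtering reliable.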

Why Classification Matters

Dataset classification plays a central role in organizing and curating scientific data. In the Network for Advanced NMR (NAN), datasets are categorized using a combination of automated and user-defined metadata, including:

  • Tags
  • Pulse program
  • Data harvesting method
  • Dimensionality
  • Experimental parameters
  • Linked samples
  • Supplemental data
  • Redundancy status
  • Classification (user-defined)

This structured metadata enables powerful filtering capabilities in the Data Browser, allowing users to rapidly identify relevant experiments based on specific criteria.
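As a rough illustration of how structured metadata supports this kind of filtering, the sketch below selects records that match every requested criterion. The field names (pulse_program, dimensionality, classification), the sample records, and the function filter_datasets are all invented for this example; they do not describe how the Data Browser is implemented.

```python
def filter_datasets(datasets, **criteria):
    """Return the records whose metadata matches every key/value pair given."""
    return [
        ds for ds in datasets
        if all(ds.get(key) == value for key, value in criteria.items())
    ]


# Made-up records standing in for archive entries.
archive = [
    {"id": 1, "pulse_program": "hsqcetf3gpsi", "dimensionality": 2,
     "classification": "Successful experiment"},
    {"id": 2, "pulse_program": "zg30", "dimensionality": 1,
     "classification": "Test experiment"},
    {"id": 3, "pulse_program": "hsqcetf3gpsi", "dimensionality": 2,
     "classification": "Failed experiment due to setup issues"},
    {"id": 4, "pulse_program": "hsqcetf3gpsi", "dimensionality": 2,
     "classification": "Successful experiment"},
]

# Keep only successful two-dimensional experiments.
hits = filter_datasets(
    archive,
    dimensionality=2,
    classification="Successful experiment",
)
print([ds["id"] for ds in hits])  # prints [1, 4]
```

Without the classification field, the failed experiment with id 3 would be indistinguishable from the successful ones in this query, since it shares the same pulse program and dimensionality.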

As the NAN archive continues to grow, these filters will become essential tools for researchers. In particular, classification is a key element in generating curated data collections that support:

  • Machine learning model training
  • Quality control benchmarking
  • Comparative analysis
  • Downstream automated processing workflows

By consistently applying dataset classifications, users help build a more robust and reusable data ecosystem that supports both human and machine-driven discovery.

Best Practices

  • Use the classification feature regularly, especially for test and failed experiments, to keep your dataset library clean and navigable.
  • When several datasets share a classification, select them together and classify them in a single action instead of one at a time.
  • Combine classification with tags and supplemental data for rich, queryable metadata.