NAN Data Transport System
Latest revision as of 16:24, 17 July 2025
Overview
The Network for Advanced NMR Data Transport System (NDTS) enables automated harvesting of NMR acquisition data from spectrometer workstations and delivers it securely to the NAN Repository. Facility Managers are responsible for installing and managing the local components of the system, ensuring connectivity, and supporting user access to collected data.
See the NDTS Technical Implementation page for detailed information on how NDTS functions and some of the design decisions.
NDTS Components
The NDTS system consists of local and central components working together to collect, transfer, store, and index NMR datasets.
Component | Location | Role |
---|---|---|
Daemon | Spectrometer Workstation | |
NDTS GUI | Spectrometer Workstation | |
Gateway | Within NMR facility network | |
Receiver | UCHC Data Center | |
Parser | UCHC Data Center | |
PostgreSQL Database | UCHC Data Center | |
Primary Storage | UCHC Data Center | |
Disaster Recovery Storage | Geo-dispersed | |
Elasticsearch Database | UCHC Data Center | |
Data Flow Summary
Experimental Data (Spectrometer → NAN Archive)
Datasets that are automatically or manually harvested are processed and sent to the NAN Archive as follows:
- A user completes an acquisition on a spectrometer.
- The Daemon detects the completed experiment and sends it to the Gateway.
- The Gateway transmits the data to the Receiver at UCHC.
- The Receiver accepts the data and hands it off to the Parser.
- The Parser extracts metadata and stores it in the PostgreSQL and Elasticsearch databases.
- The experiment data is stored in primary storage and backed up to disaster recovery storage.
- The data becomes visible in the NAN Portal (e.g., Data Browser, vNOC) within seconds.
Failures at any stage result in data being spooled locally and retried automatically.
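The spool-and-retry behavior described above can be illustrated with a minimal Python sketch. This is not the NDTS implementation: the function names, the single-file dataset archive, and the `send` callback are all hypothetical and chosen only to show the idea that a failed transfer is kept locally and removed from the spool only after a later attempt succeeds.

```python
import shutil
from pathlib import Path
from typing import Callable


def forward_or_spool(dataset: Path, send: Callable[[Path], None], spool_dir: Path) -> bool:
    """Try to send a dataset; on failure, copy it into the spool for a later retry.

    `send` is a hypothetical transfer function that raises OSError
    (for example ConnectionError) when the next hop is unreachable.
    """
    try:
        send(dataset)
        return True
    except OSError:
        spool_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(dataset, spool_dir / dataset.name)
        return False


def retry_spool(send: Callable[[Path], None], spool_dir: Path) -> int:
    """Resend every spooled dataset and return how many were delivered."""
    if not spool_dir.is_dir():
        return 0
    delivered = 0
    for item in sorted(spool_dir.iterdir()):
        try:
            send(item)
        except OSError:
            continue  # still failing: leave it spooled for the next pass
        item.unlink()  # remove from the spool only after a successful send
        delivered += 1
    return delivered
```

In this sketch a scheduler (not shown) would call `retry_spool` periodically, so no manual intervention is needed once connectivity returns.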
User and Probe Metadata (NAN Archive → Spectrometer)
Metadata includes facility users, their projects, studies, samples, NDTS default settings, mappings of local workstation users to NAN users, and the currently installed probe. These data are packaged and staged for retrieval by the Gateway.
- A service on the Receiver packages and stages metadata for the spectrometers every 10 minutes.
- The Gateway compares the checksum of the staged metadata against its current version. If different, it retrieves the updated package; otherwise, retrieval is skipped. This check occurs every 10 minutes.
- The spectrometer performs the same check every 10 minutes and retrieves a new package only if changes are detected.
The Daemon uses the updated metadata to set NDTS defaults for NAN users and present information in the NDTS GUI.
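The checksum comparison in the steps above can be sketched as follows. The source does not state which checksum algorithm NDTS uses, so SHA-256 is an assumption here, and the function names and file layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_update(staged_checksum: str, local_package: Path) -> bool:
    """Decide whether the staged metadata package should be retrieved.

    Returns True when there is no local copy yet or when the local copy's
    checksum differs from the staged one; otherwise retrieval is skipped.
    """
    if not local_package.exists():
        return True
    return sha256_of(local_package) != staged_checksum
```

Comparing checksums first means the package itself is transferred only when something changed, which keeps the 10-minute polling cheap.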
Heartbeats (Spectrometer → NAN Archive)
At 10-minute intervals, the spectrometer sends heartbeat reports containing NDTS and instrument status.
- The Daemon packages the heartbeat information and sends it to the Gateway.
- The Gateway transmits the data to the Receiver at UCHC.
- The Receiver processes the heartbeat data and stores it in the Elasticsearch database.
- Heartbeat information is summarized and visualized in the virtual Network Operations Center (vNOC).
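As an illustration of the first step, a heartbeat report might be packaged as a small JSON document. The actual NDTS heartbeat format is not described on this page; every field name below is hypothetical, and only the 10-minute interval comes from the text above.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# 10 minutes, as stated in the text; the constant name is hypothetical.
HEARTBEAT_INTERVAL_SECONDS = 600


def build_heartbeat(
    spectrometer_id: str,
    ndts_status: str,
    instrument_status: dict,
    now: Optional[datetime] = None,
) -> str:
    """Serialize a heartbeat report as JSON (field names are assumptions)."""
    if now is None:
        now = datetime.now(timezone.utc)
    payload = {
        "spectrometer_id": spectrometer_id,
        "sent_at": now.isoformat(),
        "ndts_status": ndts_status,
        "instrument_status": instrument_status,
    }
    return json.dumps(payload, sort_keys=True)
```

A timestamp in each report lets a monitoring view flag a spectrometer whose most recent heartbeat is older than the expected interval.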
Facility Manager NDTS Responsibilities
Facility Managers are expected to:
- Purchase the Gateway computer and install a modern Linux distribution (preferably Ubuntu, Xubuntu, Mint, or another Debian-based OS)
- Keep the Gateway OS up-to-date with security patches
- Install and configure Gateway and Daemon software
- Update Gateway and Daemon software within a reasonable time frame after new releases become available
- Manage facility users through the Facility Dashboard
- Reassign “unselected” or misattributed data through the Dataset Browser
- Monitor the health of NDTS for their facility, including heartbeats, through the virtual Network Operations Center (vNOC)
Security
- Out-of-date operating systems on spectrometer workstations may lack modern encryption. To mitigate this risk, NDTS employs a dedicated Gateway computer between the workstations and the NDTS Receiver. The Gateway runs a current Linux distribution, and users are expected to apply security updates promptly.
- Because the Gateway resides on the same internal network as the workstations, dataset transfers from a workstation to the Gateway occur over an unencrypted channel; this local scope generally makes encryption unnecessary.
- All outbound communication originates from the Gateway; NAN datacenter services never initiate connections to facility Gateways. Transfers from the Gateway to the Receiver are fully encrypted, and mutual TLS certificates ensure the Gateway is connected to the correct Receiver. Checksums protect every transmission, and any failed transfer (either workstation-to-Gateway or Gateway-to-Receiver) is queued locally for automatic retry.
- Upon arrival at the Receiver, each dataset is replicated across two independent storage systems. After ingestion, the data is stored redundantly in two additional locations, each offering high durability.
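To make the mutual TLS point concrete, here is a minimal sketch of how a client such as the Gateway could build a TLS context with Python's standard `ssl` module. It is an illustration under stated assumptions, not the NDTS configuration: the function name, the parameters, and the TLS 1.2 minimum are hypothetical choices.

```python
import ssl
from typing import Optional


def build_gateway_tls_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context suitable for mutual TLS.

    ca_file:   CA bundle used to verify the server (system defaults if None).
    cert_file: client certificate presented to the server, if any.
    key_file:  private key for the client certificate.
    """
    # Server verification and hostname checking are enabled by default here.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Assumed policy for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file is not None:
        # Presenting a client certificate is what makes the handshake mutual.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

With a context like this, the client verifies the server's certificate, and the server can in turn require and verify the client certificate, so each side confirms the identity of the other before any data is sent.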