NDTS Experiment Harvesting
Overview
Identifying acquisition start and end events
The daemon detects when TopSpin or VNMRj is launched, who launched it, and whether it controls the spectrometer. How experiment start and end events are detected varies depending on the software version controlling the spectrometer:
- VNMRj - the daemon listens to InfoProc network messages to detect transitions in status (e.g. idle to acquiring)
- TopSpin 3.x - the daemon uses inotify (ReadDirectoryChanges on Windows) to monitor changes to the accounting file. Notes on how NDTS uses the TopSpin accounting file:
- accounting must be enabled on a per-user basis. See enabling TopSpin accounting for more details.
- only experiment end events are detected. The experiment start time is determined after the experiment is complete.
- TopSpin 4.x - the daemon monitors the shmem file in the user-specific TopSpin directory ({topspin_directory}/prog/curdir/{user}/shmem) for start and end events
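As a rough illustration of the TopSpin 4.x case, the sketch below builds the per-user shmem path from the template above and reports when that file changes. It is only a sketch under stated assumptions: the names `shmem_path` and `FileChangePoller` are hypothetical, polling the file's modification time and size stands in for whatever mechanism the daemon actually uses, and the contents of shmem are not interpreted because this page does not describe them.

```python
from pathlib import Path


def shmem_path(topspin_directory, user):
    """Build {topspin_directory}/prog/curdir/{user}/shmem."""
    return Path(topspin_directory) / "prog" / "curdir" / user / "shmem"


class FileChangePoller:
    """Report whether a file changed since the previous poll.

    Assumption: a change is any difference in (mtime, size), or the file
    appearing or disappearing. This is a polling stand-in, not the event
    driven inotify mechanism mentioned for TopSpin 3.x.
    """

    def __init__(self, path):
        self.path = Path(path)
        self._last = self._signature()

    def _signature(self):
        try:
            st = self.path.stat()
        except FileNotFoundError:
            return None  # file does not exist (yet)
        return (st.st_mtime_ns, st.st_size)

    def changed(self):
        current = self._signature()
        if current != self._last:
            self._last = current
            return True
        return False
```

A caller would construct one poller per user and call `changed()` on a timer; deciding whether a change is a start or an end event would require reading the file, which is outside the scope of this sketch.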
Sessions
Each dataset harvested is associated with a sessionID. New sessions are created when:
- a user logs in to the workstation
- a user starts the spectrometer software that controls the spectrometer (VNMRj or TopSpin)
- the NAN User is changed in the NDTS-GUI
Experimental datasets with the same experiment name (e.g. expN for VNMRj, and data_directory/N for TopSpin) and the same sessionID are all tagged as redundant, except for the most recent experiment, which is tagged preferred. The preferred / redundant status can later be altered by the user from the Dataset Browser.
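The tagging rule can be sketched as follows. This is an illustration only: the `Dataset` class, its field names, and the use of a completion timestamp to decide which experiment is "most recent" are assumptions, not the NDTS data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dataset:
    experiment_name: str   # e.g. "exp3" (VNMRj) or "data_directory/3" (TopSpin)
    session_id: str
    completed_at: datetime
    status: str = ""       # filled in by tag_preferred


def tag_preferred(datasets):
    """Tag the newest dataset per (experiment name, sessionID) as preferred.

    All other datasets sharing that key are tagged redundant.
    """
    latest = {}
    for ds in datasets:
        key = (ds.experiment_name, ds.session_id)
        if key not in latest or ds.completed_at > latest[key].completed_at:
            latest[key] = ds
    for ds in datasets:
        key = (ds.experiment_name, ds.session_id)
        ds.status = "preferred" if latest[key] is ds else "redundant"
    return datasets
```

Note that the same experiment name in a different session forms its own group, so a new session never demotes datasets from an earlier one.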
How NDTS determines if a dataset should be harvested
The NDTS daemon follows a logical workflow to determine if an experiment that completes should be harvested. Data will be harvested if:
- the user is configured for harvesting (a user whom the facility manager has set to never collect in the Manage Facility Users tool will not have their data harvested, regardless of other NDTS settings)
- the NDTS-GUI harvest toggle is "on"
- the dataset contains valid files: time domain data (fid / ser) and parameter files (procpar / acqus*)
- the experiment job number is below the configured threshold set by the facility manager and displayed in the NDTS-GUI (this allows experiments, such as pulse calibrations, to easily be skipped by users without the need to toggle the harvesting status)
- the acquisition time exceeds 1 second (at the beginning of a new dataset, VNMRj often records a dataset shorter than 1 second that contains garbage data; this requirement keeps those from being harvested)
- pulse program is not topshim, rga, wobb, or gs (this avoids harvesting datasets involved in tuning, shimming, receiver gain settings, and real-time parameter optimization)
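The checklist above can be expressed as a single decision function. The sketch below is a minimal illustration, not the daemon's code: the `Experiment` fields, the function name `should_harvest`, the reason strings, and the exact-match comparison of pulse program names are all assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

# Pulse programs named in the list above as never harvested.
SKIPPED_PULSE_PROGRAMS = {"topshim", "rga", "wobb", "gs"}


@dataclass
class Experiment:
    user_harvest_enabled: bool    # facility manager has not set "never collect"
    gui_toggle_on: bool           # NDTS-GUI harvest toggle
    has_time_domain_data: bool    # fid or ser present
    has_parameter_file: bool      # parameter file present
    job_number: int
    acquisition_seconds: float
    pulse_program: str


def should_harvest(exp: Experiment, job_threshold: int) -> Tuple[bool, str]:
    """Return (decision, reason); every condition must pass to harvest."""
    if not exp.user_harvest_enabled:
        return False, "user not configured for harvesting"
    if not exp.gui_toggle_on:
        return False, "harvest toggle is off"
    if not (exp.has_time_domain_data and exp.has_parameter_file):
        return False, "missing required files"
    if exp.job_number >= job_threshold:
        return False, "job number not below threshold"
    if exp.acquisition_seconds <= 1.0:
        return False, "acquisition time does not exceed 1 second"
    if exp.pulse_program in SKIPPED_PULSE_PROGRAMS:
        return False, "setup pulse program"
    return True, "harvest"
```

Returning a reason alongside the decision is a design choice made here so that a skipped experiment could be reported with an explanation; whether the daemon records reasons this way is not stated on this page.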
Note that the NDTS daemon must be running. The daemon will not detect datasets that were collected while it was disabled. The daemon is configured to run at all times, so this should be a rare event. If the daemon was disabled, the manual harvesting button in the NDTS-GUI may be used to harvest the missed dataset(s).
NDTS writes a detailed audit log that lists all experiments run on the spectrometer along with the status of each transfer (e.g. sent, spooled, skipped). See the audit log documentation for details.
What is harvested
When the NAN daemon detects the completion of an experiment and decides it should be harvested, it bundles the files together and sends them to the Gateway. The daemon has been developed to capture all the files needed to recapitulate the experiment, along with metadata that is determined automatically and metadata entered by the user through the NDTS-GUI. VNMRj and TopSpin versions before 4.x keep many necessary files outside the experimental directory; the daemon has been configured to locate those files and include them in the dataset. The following is harvested:
- Files:
- time domain data
- parameter files
- shape waveforms
- non-uniform sampling schedules
- pulse program files (including pre-compiled version)
- probe head files
- entire experiment directory
- Metadata:
- local workstation username
- facility
- instrument
- installed probe
- NAN user
- solution or solid state
- Selected Project, Study, and Sample
- user defined notes and metadata (Z0 drift rate, MAS spinning, MAS rate)
What if the Gateway is unreachable
Once the daemon is notified that an experiment has completed, it immediately attempts to send the dataset to the Gateway. If the transfer fails, the daemon immediately copies the dataset to the /opt/nan-dtdaemon/spool directory under a timestamp based on the experiment completion datetime.
With every heartbeat (every 10 minutes) the daemon checks for spooled datasets. If any exist, the daemon attempts to send them to the Gateway. If a transfer succeeds, the spooled dataset files are removed from the workstation. If the transfer fails again, the dataset remains in the spool. Spooled data remains indefinitely until it can be transferred to the Gateway. If multiple datasets have accumulated in the spool directory when the Gateway comes back online, the daemon throttles the pace of sending so as not to overwhelm either the workstation or the Gateway computer.
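The send, spool, and retry behavior can be sketched as below. This is a hedged illustration rather than the daemon's implementation: `send` is a hypothetical callable that raises `OSError` (or a subclass such as `ConnectionError`) when the Gateway cannot be reached, the timestamp format of the spool subdirectory is assumed, and a fixed pause between transfers stands in for the throttling, whose actual policy is not described here.

```python
import shutil
import time
from datetime import datetime
from pathlib import Path


def send_or_spool(dataset_dir, completed_at, spool_dir, send):
    """Try to send a dataset; on failure copy it into the spool.

    Returns True if the dataset was sent, False if it was spooled.
    """
    try:
        send(Path(dataset_dir))
        return True
    except OSError:
        # Assumed naming: one subdirectory per completion timestamp.
        target = Path(spool_dir) / completed_at.strftime("%Y%m%dT%H%M%S")
        shutil.copytree(dataset_dir, target)
        return False


def flush_spool(spool_dir, send, pause_seconds=0.0):
    """Retry every spooled dataset, oldest timestamp first.

    Intended to be called once per heartbeat. Datasets that send
    successfully are removed; failures stay in the spool for the next
    heartbeat. Returns the number of datasets sent.
    """
    sent = 0
    for entry in sorted(Path(spool_dir).iterdir()):
        if not entry.is_dir():
            continue
        try:
            send(entry)
        except OSError:
            continue  # leave it spooled
        shutil.rmtree(entry)
        sent += 1
        time.sleep(pause_seconds)  # crude throttle between transfers
    return sent
```

In this sketch a scheduler would call `flush_spool` every 10 minutes to match the heartbeat interval. Two experiments completing within the same second would collide on the spool directory name, which a real implementation would need to handle.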
Limitations
- If the NDTS daemon is not running, no datasets will be harvested.
- For Bruker paropt arrayed experiments, the experimental directory is often overwritten faster than NDTS can harvest it, so datasets collected with paropt are often corrupted. In addition, if the paropt data is saved as a 2D parameter set (but one which did not run), the daemon will not know of its existence and will not collect it. We recommend running paropt experiments in job numbers higher than the configured value for data harvesting so that these datasets are ignored.
- A dataset that is missing key information, such as a fid or ser file or parameter sets, will be ignored.