NDTS Daemon Operation

From Network for Advanced NMR
Revision as of 19:38, 2 June 2025 by Mmaciejewski (talk | contribs)

Running and Monitoring the Daemon

This page explains how to control the data-transport-daemon service, verify connectivity, and interpret the daemon’s log and audit files on every spectrometer workstation.

Service Control

# Start the daemon
sudo /sbin/service data-transport-daemon start

# Stop the daemon
sudo /sbin/service data-transport-daemon stop

# Restart (reloads configuration)
sudo /sbin/service data-transport-daemon restart

# Check status
sudo /sbin/service data-transport-daemon status
  • The daemon refuses to start if another instance is already running.
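
The status and start commands above can be combined so that a start is attempted only when no instance is reported as running. The sketch below assumes the init script follows the common LSB convention of exiting with status 0 from `status` when the service is running; this page does not confirm that behaviour for data-transport-daemon. `SERVICE_CMD` and `ensure_running` are illustrative names, not part of NDTS.

```shell
#!/bin/sh
# Command used to reach the init script; override it for testing or non-sudo use.
SERVICE_CMD=${SERVICE_CMD:-"sudo /sbin/service"}

# ensure_running [service-name]
# Starts the service only if `status` reports that it is not running.
# ASSUMPTION: `status` exits 0 when the daemon is running (LSB convention).
ensure_running() {
    svc=${1:-data-transport-daemon}
    # $SERVICE_CMD is left unquoted on purpose so "sudo /sbin/service" splits into words.
    if $SERVICE_CMD "$svc" status >/dev/null 2>&1; then
        echo "$svc is already running"
    else
        echo "$svc is not running; starting it"
        $SERVICE_CMD "$svc" start
    fi
}
```

Calling `ensure_running` with no argument targets data-transport-daemon. Checking first avoids relying on the daemon's own refusal to start a second instance.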

Heartbeat and Connectivity

  • The daemon sends a heartbeat to the Gateway every 10 minutes.
  • The Gateway forwards that heartbeat to the NDTS Receiver; entries are visible in vNOC.

Slack Notifications

When heartbeats stop, the Receiver posts to the facility’s Slack channel.

| Condition | Time-out | Receiver Action | Slack Message |
|---|---|---|---|
| First missed heartbeat | > 20 min | Mark workstation offline | offline |
| Still missing at next poll | + 8 min | Repeat offline (max 3) | offline |
| Heartbeat resumes | | Mark workstation online | online |
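
To make the first row of the table concrete, the sketch below classifies a workstation from the age of its last heartbeat using the 20-minute time-out. It only illustrates the rule; it is not the Receiver's code, and `heartbeat_state` and `OFFLINE_AFTER_SECONDS` are hypothetical names. The repeat logic (+ 8 min, max 3) is deliberately left out.

```shell
#!/bin/sh
# 20-minute time-out taken from the table above, expressed in seconds.
OFFLINE_AFTER_SECONDS=$((20 * 60))

# heartbeat_state LAST_EPOCH [NOW_EPOCH]
# Prints "offline" if the last heartbeat is more than 20 minutes old, else "online".
heartbeat_state() {
    last=$1
    now=${2:-$(date +%s)}   # default to the current time in epoch seconds
    age=$((now - last))
    if [ "$age" -gt "$OFFLINE_AFTER_SECONDS" ]; then
        echo offline
    else
        echo online
    fi
}
```

For example, `heartbeat_state 1700000000 1700001500` describes a heartbeat that is 25 minutes old and prints `offline`.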

Slack channels (one per facility):

  • ccrc-ndts-notifications
  • nmrfam-ndts-notifications
  • uchc-ndts-notifications

Version Tracking

  • On start-up the daemon writes its version to the log.
  • A file named /opt/nan-dtdaemon/running_workstation_version-X.Y.Z records the version and start time.
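
Because the version is encoded in the marker file's name, it can be read without opening the log. The following is a minimal sketch under that assumption; `running_version` is a hypothetical helper, and how the file records the start time is not described on this page, so the sketch does not try to read it.

```shell
#!/bin/sh
# running_version [DIR]
# Prints the X.Y.Z suffix of the version marker file, or returns 1 if none is found.
running_version() {
    dir=${1:-/opt/nan-dtdaemon}
    prefix=running_workstation_version-
    for f in "$dir"/"$prefix"*; do
        [ -e "$f" ] || continue        # the glob matched nothing
        name=${f##*/}                  # strip the directory part
        version=${name#"$prefix"}      # strip the fixed prefix, leaving X.Y.Z
        echo "$version"
        return 0                       # report only the first match
    done
    return 1
}
```

If several marker files are present, only the first one in glob order is reported, so the result is worth comparing with the version line in the log.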

Experiment Transfer Audit

Each processed experiment appends a line to

/opt/nan-dtdaemon/logs/ndtd_audit.txt

Fields: timestamp; workstation user; NMRhub user (or unselected); start/end; path; daemon version; action (one of sent, spooled, sent-spooled, skipped-trivial, skipped-disabled).
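
A quick tally of actions shows how many experiments were sent, spooled, or skipped. The sketch below assumes that the action is the last whitespace-delimited token on each audit line, which the field listing suggests but this page does not confirm; check a real line before relying on it. `count_audit_actions` is a hypothetical name.

```shell
#!/bin/sh
# count_audit_actions [FILE]
# Prints "<action> <count>" lines, sorted by action name.
# ASSUMPTION: the action is the final whitespace-delimited token on each line.
count_audit_actions() {
    file=${1:-/opt/nan-dtdaemon/logs/ndtd_audit.txt}
    awk 'NF { count[$NF]++ }                            # skip blank lines
         END { for (a in count) print a, count[a] }' "$file" | sort
}
```

A steadily rising spooled count compared with sent would be one sign that the Gateway is not being reached.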

Daemon Logging

NDTS writes two workstation logs:

  • nan-dtdaemon.log — runtime events, heartbeats, errors
  • ndtd_audit.txt — one-line summary per experiment

The rest of this page focuses on **nan-dtdaemon.log**.

Log Levels

Each line begins with a level tag. The level is controlled by the log_level parameter in ndtd_configuration.dat.

| Level | Verbosity | Typical Use |
|---|---|---|
| fatal | Highest-priority, least frequent | Events that make the daemon shut down and cannot be auto-recovered |
| error | Critical problems | Failures that stop normal operation while the daemon continues running |
| warning | Important but non-fatal issues | Conditions worth attention; daemon recovers automatically |
| info | Default | Unusual or noteworthy events; normal operations generate very little output |
| debug | Diagnostic detail | Ongoing list of major operations; log grows steadily |
| trace | Maximum detail | Every internal step; use only for short troubleshooting sessions |
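
When a log was written at debug or trace, it helps to filter it down to the more severe entries. The sketch below ranks the six levels in the order of the table and prints only lines at or above a chosen severity. It assumes each line starts with a five-field timestamp followed by an upper-case tag, as INFO does in the example log on this page; the upper-case spelling of the other tags is an assumption. `filter_min_level` is a hypothetical helper.

```shell
#!/bin/sh
# filter_min_level LEVEL FILE
# Prints log lines whose tag is at least as severe as LEVEL.
# ASSUMPTION: tags are the upper-case level names; only INFO is confirmed by the example log.
filter_min_level() {
    min=$1
    file=$2
    awk -v min="$min" '
        BEGIN {
            # Lower rank means more severe, following the order of the table.
            n = split("FATAL ERROR WARNING INFO DEBUG TRACE", names, " ")
            for (i = 1; i <= n; i++) rank[names[i]] = i
            limit = rank[toupper(min)]
            if (!limit) { print "unknown level: " min > "/dev/stderr"; exit 2 }
        }
        # Field 6 is the tag; five timestamp fields come before it.
        ($6 in rank) && rank[$6] <= limit
    ' "$file"
}
```

Lines with other tags, such as LOG_START, are dropped by this filter. A call such as `filter_min_level warning /path/to/nan-dtdaemon.log` keeps only fatal, error, and warning entries; the path is a placeholder.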

Log File Example

The fragment below is reproduced verbatim from the PDF (pp. 14-15):

Thu Sep 28 13:17:03 2023 LOG_START Started dtd logger.
Thu Sep 28 13:17:03 2023 INFO NDTD Workstation version is 1.0.15
Thu Sep 28 13:17:03 2023 INFO *** This is a Topspin Workstation ***
Thu Sep 28 13:17:03 2023 INFO Ndtd Control Processor listening.
Thu Sep 28 13:17:03 2023 INFO Entering polling loop...
Thu Sep 28 13:17:03 2023 INFO Workstation user has changed!
Thu Sep 28 13:17:03 2023 INFO workstation user is nmradmin
Thu Sep 28 13:17:03 2023 INFO User nmradmin is included in NAN data collection!
Thu Sep 28 13:17:03 2023 INFO Harvesting setting for user nmradmin is on
Thu Sep 28 13:17:03 2023 INFO Topspin program has been detected and is running.
Thu Sep 28 13:17:03 2023 INFO Setting directory to watch to /opt/topspin4.2.0/prog/curdir/nmradmin/shmem

Analysis of the Example

  1. LOG_START — first line of every run; timestamp shows when the daemon began.
  2. **“Workstation version is 1.0.15”** — confirms the daemon build that is running.
  3. **“*** This is a Topspin Workstation ***”** — daemon detected TopSpin rather than VNMRJ.
  4. **“Ndtd Control Processor listening.”** — background thread ready to accept UI commands.
  5. **“Entering polling loop…”** — daemon has fully initialised and is now watching for experiment triggers.
  6. **User-change block (lines 6-9)**
    1. detects a KDE/X login to TopSpin (Workstation user has changed!)
    2. logs Linux username (nmradmin)
    3. verifies NMRhub mapping (included in NAN data collection)
    4. notes harvesting on/off status for that user.
  7. **“Topspin program has been detected and is running.”** — daemon confirmed the acquisition software is active.
  8. **“Setting directory to watch …”** — final line shows the exact filesystem path monitored for experiment-complete triggers.

If the last two lines never appear, the daemon cannot locate TopSpin/VNMRJ and no data will be harvested.
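
The check described above can be sketched as a small log scan: find the most recent run, then report its version and whether a watch directory was ever set. The message strings come from the example fragment; `last_run_summary` is a hypothetical helper, and it assumes the watched path contains no spaces.

```shell
#!/bin/sh
# last_run_summary FILE
# Reports the version and watched directory for the most recent run in the log.
# Returns 1 if the latest run never logged a watch directory.
last_run_summary() {
    awk '
        $6 == "LOG_START"                { version = ""; watch = "" }  # new run: forget the previous one
        /Workstation version is /        { version = $NF }
        /Setting directory to watch to / { watch = $NF }
        END {
            print "version: " (version == "" ? "unknown" : version)
            if (watch == "") { print "watch directory: not set"; exit 1 }
            print "watch directory: " watch
        }
    ' "$1"
}
```

Run against the example fragment, this prints `version: 1.0.15` and `watch directory: /opt/topspin4.2.0/prog/curdir/nmradmin/shmem`. A non-zero exit status means the daemon has not located the acquisition software in its latest run.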

Troubleshooting Checklist

| Symptom | What to Check |
|---|---|
| No new data reaches NAN | service data-transport-daemon status; latest heartbeat in vNOC; Gateway log for incoming files |
| Repeated offline Slack alerts | Workstation powered off? Network drop? Firewall blocking port 60195? |
| Log grows rapidly | log_level left at trace; reset to info |
| Experiments remain spooled | Gateway unreachable → verify IP/port and gateway service status |
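
For the firewall and Gateway rows, a quick connection probe from the workstation can narrow things down. The sketch below assumes port 60195 is a TCP port on the Gateway and that `nc` (netcat) is installed; neither is stated on this page. `gateway_reachable` and `PROBE_CMD` are hypothetical names, and the host name in the usage note is a placeholder.

```shell
#!/bin/sh
# Probe command: `nc -z -w 5 HOST PORT` tries a TCP connect with a 5-second limit.
PROBE_CMD=${PROBE_CMD:-"nc -z -w 5"}

# gateway_reachable HOST [PORT]
# Prints whether a TCP connection to HOST:PORT succeeds; returns 1 if it does not.
# ASSUMPTION: 60195 is the Gateway's TCP port.
gateway_reachable() {
    host=$1
    port=${2:-60195}
    # $PROBE_CMD is left unquoted on purpose so its options split into words.
    if $PROBE_CMD "$host" "$port" >/dev/null 2>&1; then
        echo "$host:$port reachable"
    else
        echo "$host:$port NOT reachable"
        return 1
    fi
}
```

A call such as `gateway_reachable gateway.example.org` would then separate a network or firewall problem from a Gateway service that is reachable but not accepting files.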

Next Step

Return to NDTS Overview or continue to Accessing Collected Data.