NDTS Daemon Operation

From Network for Advanced NMR
Revision as of 19:36, 2 June 2025

Running and Monitoring the Daemon

This page explains how to control the data-transport-daemon service, verify connectivity, and interpret the daemon’s log and audit files on every spectrometer workstation.

Service Control

# Start the daemon
sudo /sbin/service data-transport-daemon start

# Stop the daemon
sudo /sbin/service data-transport-daemon stop

# Restart (reloads configuration)
sudo /sbin/service data-transport-daemon restart

# Check status
sudo /sbin/service data-transport-daemon status

  • The daemon refuses to start if another instance is already running.
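
Because a second instance is refused, it can help to check the status before starting. The following is a minimal sketch, not part of NDTS: it assumes the `status` action exits with 0 when the daemon is running (the usual init-script convention), and the names `SERVICE_CMD` and `start_if_stopped` are hypothetical.

```shell
# Assumption: "status" exits 0 when the daemon is running (init-script convention).
# SERVICE_CMD can be overridden, e.g. to dry-run the logic without root.
SERVICE_CMD=${SERVICE_CMD:-"sudo /sbin/service"}
SERVICE_NAME="data-transport-daemon"

start_if_stopped() {
    # Query the service first; only issue "start" when it is not running.
    if $SERVICE_CMD "$SERVICE_NAME" status >/dev/null 2>&1; then
        echo "already running"
    else
        $SERVICE_CMD "$SERVICE_NAME" start && echo "started"
    fi
}
```

Calling `start_if_stopped` prints `already running` or `started`, so it can be used from a maintenance script without risking a refused second start.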

Heartbeat and Connectivity

  • The daemon sends a heartbeat to the Gateway every 10 minutes.
  • The Gateway forwards that heartbeat to the NDTS Receiver; entries are visible in vNOC.

Slack Notifications

When heartbeats stop, the Receiver posts to the facility’s Slack channel.

Condition                  | Time-out | Receiver Action          | Slack Message
First missed heartbeat     | > 20 min | Mark workstation offline | offline
Still missing at next poll | + 8 min  | Repeat offline (max 3)   | offline
Heartbeat resumes          | -        | Mark workstation online  | online
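
The 20-minute rule can be illustrated with a small helper. This is only a sketch of the threshold described above, not the Receiver's actual logic; the function name `heartbeat_state` and the use of epoch seconds are assumptions made for the example.

```shell
# Threshold from the table above: offline once the last heartbeat is
# more than 20 minutes old.
OFFLINE_AFTER_SECONDS=$((20 * 60))

heartbeat_state() {
    # $1 = epoch seconds of the last heartbeat, $2 = current epoch seconds
    age=$(( $2 - $1 ))
    if [ "$age" -gt "$OFFLINE_AFTER_SECONDS" ]; then
        echo offline
    else
        echo online
    fi
}
```

For example, `heartbeat_state "$last_seen" "$(date +%s)"` reports the state for a timestamp stored in the (hypothetical) variable `last_seen`.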

Slack channels (one per facility):

  • ccrc-ndts-notifications
  • nmrfam-ndts-notifications
  • uchc-ndts-notifications

Version Tracking

  • On start-up the daemon writes its version to the log.
  • A file named /opt/nan-dtdaemon/running_workstation_version-X.Y.Z records the version and start time.
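
The version can be read back from the marker file's name. A minimal sketch, assuming the file sits directly in the directory shown above; the function name `running_version` is hypothetical.

```shell
running_version() {
    # $1 = directory holding the marker file (defaults to the path named above)
    dir=${1:-/opt/nan-dtdaemon}
    for f in "$dir"/running_workstation_version-*; do
        # Skip the unexpanded glob when no marker file exists.
        [ -e "$f" ] || continue
        # Strip the fixed prefix, leaving only X.Y.Z.
        basename "$f" | sed 's/^running_workstation_version-//'
    done
}
```

Running `running_version` with no argument prints the version found under /opt/nan-dtdaemon, or nothing if no marker file is present.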

Experiment Transfer Audit

Each processed experiment appends a line to

/opt/nan-dtdaemon/logs/ndtd_audit.txt

Fields: timestamp • workstation user • NMRhub user (or unselected) • start/end • path • daemon version • action (sent • spooled • sent-spooled • skipped-trivial • skipped-disabled)
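
A quick tally of actions shows whether experiments are being sent or are piling up as spooled. The sketch below assumes the action is the last whitespace-separated field on each audit line; the field separator is not documented here, so adjust the script if the file is delimited differently. The function name `audit_action_counts` is hypothetical.

```shell
audit_action_counts() {
    # $1 = path to the audit file
    # Assumption: the action (sent, spooled, ...) is the last field on each line.
    awk 'NF { count[$NF]++ } END { for (a in count) print a, count[a] }' "$1" | sort
}
```

Typical use would be `audit_action_counts /opt/nan-dtdaemon/logs/ndtd_audit.txt`, which prints one line per action followed by its count.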

Daemon Logging

NDTS writes two workstation logs:

  • nan-dtdaemon.log — runtime events, heartbeats, errors
  • ndtd_audit.txt — one-line summary per experiment

This page focuses on nan-dtdaemon.log.

Log Levels

Each line begins with a level tag. The level is controlled by the log_level parameter in ndtd_configuration.dat.

Level   | Verbosity                        | Typical Use
fatal   | Highest-priority, least frequent | Events that make the daemon shut down and cannot be auto-recovered
error   | Critical problems                | Failures that stop normal operation but daemon continues running
warning | Important but non-fatal issues   | Conditions worth attention; daemon recovers automatically
info    | Default                          | Unusual or noteworthy events; normal operations generate very little output
debug   | Diagnostic detail                | Ongoing list of major operations; log grows steadily
trace   | Maximum detail                   | Every internal step; use only for short troubleshooting sessions
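
When a log was written at debug or trace level, the level tag can be used to pull out only the more severe entries. This sketch assumes every line starts with a five-field timestamp followed by an upper-case tag, as in the INFO lines of the example on this page; the tags FATAL, ERROR, WARNING, DEBUG and TRACE are assumed by analogy, and `filter_log` is a hypothetical name.

```shell
filter_log() {
    # $1 = log file, $2 = least severe level to keep (e.g. WARNING)
    # Lines whose tag is not one of the six levels (such as LOG_START) are dropped.
    awk -v least="$2" '
        BEGIN {
            rank["FATAL"] = 0; rank["ERROR"] = 1; rank["WARNING"] = 2
            rank["INFO"]  = 3; rank["DEBUG"] = 4; rank["TRACE"]   = 5
        }
        ($6 in rank) && rank[$6] <= rank[least]
    ' "$1"
}
```

For example, `filter_log /opt/nan-dtdaemon/logs/nan-dtdaemon.log WARNING` would print only fatal, error and warning lines, assuming the log lives next to the audit file.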

Log File Example

The fragment below is reproduced verbatim from the PDF (pp. 14-15):

Thu Sep 28 13:17:03 2023 LOG_START Started dtd logger.
Thu Sep 28 13:17:03 2023 INFO NDTD Workstation version is 1.0.15
Thu Sep 28 13:17:03 2023 INFO *** This is a Topspin Workstation ***
Thu Sep 28 13:17:03 2023 INFO Ndtd Control Processor listening.
Thu Sep 28 13:17:03 2023 INFO Entering polling loop...
Thu Sep 28 13:17:03 2023 INFO Workstation user has changed!
Thu Sep 28 13:17:03 2023 INFO workstation user is nmradmin
Thu Sep 28 13:17:03 2023 INFO User nmradmin is included in NAN data collection!
Thu Sep 28 13:17:03 2023 INFO Harvesting setting for user nmradmin is on
Thu Sep 28 13:17:03 2023 INFO Topspin program has been detected and is running.
Thu Sep 28 13:17:03 2023 INFO Setting directory to watch to /opt/topspin4.2.0/prog/curdir/nmradmin/shmem

Analysis of the Example

  1. LOG_START: first line of every run; timestamp shows when the daemon began.
  2. "Workstation version is 1.0.15": confirms the daemon build that is running.
  3. "*** This is a Topspin Workstation ***": daemon detected TopSpin rather than VNMRJ.
  4. "Ndtd Control Processor listening.": background thread ready to accept UI commands.
  5. "Entering polling loop...": daemon has fully initialised and is now watching for experiment triggers.
  6. User-change block (log lines 6-9):
    1. detects a KDE/X login to TopSpin (Workstation user has changed!)
    2. logs Linux username (nmradmin)
    3. verifies NMRhub mapping (included in NAN data collection)
    4. notes harvesting on/off status for that user.
  7. "Topspin program has been detected and is running.": daemon confirmed the acquisition software is active.
  8. "Setting directory to watch …": final line shows the exact filesystem path monitored for experiment-complete triggers.

If the last two lines never appear, the daemon cannot locate TopSpin/VNMRJ and no data will be harvested.
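
That check can be scripted. The sketch below looks for the two closing messages after the most recent LOG_START line. It matches the TopSpin wording from the example; the assumption that a VNMRJ workstation logs the same phrases with a different program name is untested, and `startup_complete` is a hypothetical name.

```shell
startup_complete() {
    # $1 = log file; exit status 0 only if both closing messages
    # appear after the most recent LOG_START line.
    awk '
        $6 == "LOG_START" { detected = 0; watching = 0 }   # new run: reset flags
        /program has been detected and is running/ { detected = 1 }
        /Setting directory to watch to/ { watching = 1 }
        END { exit !(detected && watching) }
    ' "$1"
}
```

A caller might use it as `startup_complete "$logfile" || echo "acquisition software not detected"`, where `logfile` points at nan-dtdaemon.log.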

Troubleshooting Checklist

Symptom                       | What to Check
No new data reaches NAN       | service data-transport-daemon status • Latest heartbeat in vNOC • Gateway log for incoming files
Repeated offline Slack alerts | Workstation powered off? Network drop? Firewall blocking port 60195?
Log grows rapidly             | log_level left at trace; reset to info
Experiments remain spooled    | Gateway unreachable → verify IP/port and gateway service status

Next Step

Return to NDTS Overview or continue to Accessing Collected Data.