NDTS Daemon Operation
Revision as of 19:38, 2 June 2025
Running and Monitoring the Daemon
This page explains how to control the data-transport-daemon service, verify connectivity, and interpret the daemon’s log and audit files on every spectrometer workstation.
Service Control
# Start the daemon
sudo /sbin/service data-transport-daemon start
# Stop the daemon
sudo /sbin/service data-transport-daemon stop
# Restart (reloads configuration)
sudo /sbin/service data-transport-daemon restart
# Check status
sudo /sbin/service data-transport-daemon status
- The daemon refuses to start if another instance is already running.
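For scripted checks, the four commands above can be wrapped in a small helper. This is a minimal Python sketch and not part of NDTS: the names `service_command` and `run_service` are hypothetical, and it assumes the same `sudo /sbin/service` invocation shown above. How the exit code should be interpreted depends on the daemon's init script, which this page does not describe.

```python
import subprocess

SERVICE = "data-transport-daemon"
ACTIONS = ("start", "stop", "restart", "status")  # the four documented actions


def service_command(action):
    """Build the argument list for one of the documented service actions."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["sudo", "/sbin/service", SERVICE, action]


def run_service(action):
    """Run the command and return (exit code, combined stdout/stderr text)."""
    result = subprocess.run(
        service_command(action),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,  # leave exit-code interpretation to the caller
    )
    return result.returncode, result.stdout
```

Calling `run_service("status")` on a workstation would return the exit code and whatever text the init script prints.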
Heartbeat and Connectivity
- The daemon sends a heartbeat to the Gateway every 10 minutes.
- The Gateway forwards that heartbeat to the NDTS Receiver; entries are visible in vNOC.
Slack Notifications
When heartbeats stop, the Receiver posts to the facility’s Slack channel.
Condition | Time-out | Receiver Action | Slack Message |
---|---|---|---|
First missed heartbeat | > 20 min | Mark workstation offline | offline |
Still missing at next poll | + 8 min | Repeat offline (max 3) | offline |
Heartbeat resumes | – | Mark workstation online | online |
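The timing in the table can be sketched as a small function. This is an illustration only, not the Receiver's actual logic. It assumes that "max 3" caps the total number of offline messages (the table could also be read as three repeats), and that each repeat falls exactly 8 minutes after the previous threshold. The function name is hypothetical.

```python
FIRST_ALERT_AFTER_MIN = 20  # table: offline once the heartbeat is more than 20 min old
REPEAT_INTERVAL_MIN = 8     # table: re-checked 8 min later
MAX_OFFLINE_MESSAGES = 3    # assumption: "max 3" counts all offline messages


def offline_messages_posted(minutes_since_heartbeat):
    """Estimate how many 'offline' Slack messages have been posted so far."""
    if minutes_since_heartbeat <= FIRST_ALERT_AFTER_MIN:
        return 0
    overdue = minutes_since_heartbeat - FIRST_ALERT_AFTER_MIN
    repeats = int(overdue // REPEAT_INTERVAL_MIN)
    return min(1 + repeats, MAX_OFFLINE_MESSAGES)
```

With a 10-minute heartbeat interval, a single missed heartbeat does not trigger an alert under this reading; the workstation has to stay silent for more than 20 minutes.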
Slack channels (one per facility):
ccrc-ndts-notifications
nmrfam-ndts-notifications
uchc-ndts-notifications
Version Tracking
- On start-up the daemon writes its version to the log.
- A file named /opt/nan-dtdaemon/running_workstation_version-X.Y.Z records the version and start time.
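The version can be read back from the marker file's name. The sketch below assumes that X, Y and Z are plain decimal numbers, as in the 1.0.15 example elsewhere on this page; the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: X.Y.Z consists of three dot-separated decimal numbers.
VERSION_FILE = re.compile(r"^running_workstation_version-(\d+\.\d+\.\d+)$")


def version_from_filename(name):
    """Return 'X.Y.Z' if name is a version marker file, otherwise None."""
    match = VERSION_FILE.match(name)
    return match.group(1) if match else None


def running_versions(directory="/opt/nan-dtdaemon"):
    """List the versions of all marker files found in the directory."""
    root = Path(directory)
    if not root.is_dir():
        return []
    names = (entry.name for entry in root.iterdir())
    return sorted(v for v in map(version_from_filename, names) if v)
```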
Experiment Transfer Audit
Each processed experiment appends a line to
/opt/nan-dtdaemon/logs/ndtd_audit.txt
Fields: timestamp • workstation user • NMRhub user (or unselected) • start/end • path • daemon version • action (sent • spooled • sent-spooled • skipped-trivial • skipped-disabled)
Daemon Logging
NDTS writes two workstation logs:
- nan-dtdaemon.log — runtime events, heartbeats, errors
- ndtd_audit.txt — one-line summary per experiment
This page focuses on nan-dtdaemon.log.
Log Levels
Each line begins with a level tag. The level is controlled by the log_level parameter in ndtd_configuration.dat.
Level | Verbosity | Typical Use |
---|---|---|
fatal | Highest-priority, least frequent | Events that make the daemon shut down and cannot be auto-recovered |
error | Critical problems | Failures that stop normal operation but daemon continues running |
warning | Important but non-fatal issues | Conditions worth attention; daemon recovers automatically |
info | Default | Unusual or noteworthy events; normal operations generate very little output |
debug | Diagnostic detail | Ongoing list of major operations; log grows steadily |
trace | Maximum detail | Every internal step; use only for short troubleshooting sessions |
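The effect of log_level can be pictured as a threshold over the six levels. The sketch below assumes the threshold behaviour common to most logging libraries (a message is written when it is at least as severe as the configured level). This page does not confirm that the daemon filters this way, and the function name is hypothetical.

```python
# Levels from the table above, ordered from most severe to most verbose.
LEVELS = ["fatal", "error", "warning", "info", "debug", "trace"]


def is_written(message_level, configured_level="info"):
    """True if a message would be written under the configured log_level.

    Assumption: threshold semantics, so a message appears when it is at
    least as severe as the configured level.
    """
    message_rank = LEVELS.index(message_level.lower())
    configured_rank = LEVELS.index(configured_level.lower())
    return message_rank <= configured_rank
```

Under this reading, `is_written("debug", "info")` is false, which matches the table's note that the default level produces very little output.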
Log File Example
The fragment below is reproduced verbatim from the PDF (pp. 14-15):
Thu Sep 28 13:17:03 2023 LOG_START Started dtd logger.
Thu Sep 28 13:17:03 2023 INFO NDTD Workstation version is 1.0.15
Thu Sep 28 13:17:03 2023 INFO *** This is a Topspin Workstation ***
Thu Sep 28 13:17:03 2023 INFO Ndtd Control Processor listening.
Thu Sep 28 13:17:03 2023 INFO Entering polling loop...
Thu Sep 28 13:17:03 2023 INFO Workstation user has changed!
Thu Sep 28 13:17:03 2023 INFO workstation user is nmradmin
Thu Sep 28 13:17:03 2023 INFO User nmradmin is included in NAN data collection!
Thu Sep 28 13:17:03 2023 INFO Harvesting setting for user nmradmin is on
Thu Sep 28 13:17:03 2023 INFO Topspin program has been detected and is running.
Thu Sep 28 13:17:03 2023 INFO Setting directory to watch to /opt/topspin4.2.0/prog/curdir/nmradmin/shmem
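Each entry in the fragment follows the same shape: a five-token timestamp, a level tag, then the message. The sketch below splits a line on that basis. It is inferred from this one example rather than from a format specification, and the function name is hypothetical.

```python
from datetime import datetime

# Matches timestamps such as "Thu Sep 28 13:17:03 2023".
# Day and month names are parsed using the current locale (English assumed).
TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_log_line(line):
    """Split one log line into (timestamp, level, message)."""
    parts = line.split(None, 6)  # five timestamp tokens, level, rest of line
    if len(parts) < 6:
        raise ValueError(f"not a daemon log line: {line!r}")
    timestamp = datetime.strptime(" ".join(parts[:5]), TIMESTAMP_FORMAT)
    level = parts[5]
    message = parts[6].rstrip() if len(parts) > 6 else ""
    return timestamp, level, message
```

Applied to the second line of the fragment, this yields the timestamp 2023-09-28 13:17:03, the level `INFO`, and the message `NDTD Workstation version is 1.0.15`.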
Analysis of the Example
- LOG_START: first line of every run; timestamp shows when the daemon began.
- “Workstation version is 1.0.15”: confirms the daemon build that is running.
- “*** This is a Topspin Workstation ***”: daemon detected TopSpin rather than VNMRJ.
- “Ndtd Control Processor listening.”: background thread ready to accept UI commands.
- “Entering polling loop…”: daemon has fully initialised and is now watching for experiment triggers.
- User-change block (lines 6-9):
  - detects a KDE/X login to TopSpin (Workstation user has changed!)
  - logs Linux username (nmradmin)
  - verifies NMRhub mapping (included in NAN data collection)
  - notes harvesting on/off status for that user
- “Topspin program has been detected and is running.”: daemon confirmed the acquisition software is active.
- “Setting directory to watch …”: final line shows the exact filesystem path monitored for experiment-complete triggers.
If the last two lines never appear, the daemon cannot locate TopSpin/VNMRJ and no data will be harvested.
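A quick way to apply this check is to scan the log for the watch-directory line. The sketch below relies on the TopSpin wording from the example fragment; whether VNMRJ workstations log the same phrase is an assumption, and the function name is hypothetical.

```python
# Phrase taken from the example fragment above (TopSpin workstation).
WATCH_PREFIX = "Setting directory to watch to "


def watched_directory(log_lines):
    """Return the most recently logged watch directory, or None if absent."""
    path = None
    for line in log_lines:
        _, found, rest = line.partition(WATCH_PREFIX)
        if found:
            path = rest.strip()  # keep the latest occurrence
    return path
```

A result of `None` for the current run suggests the daemon has not located the acquisition software.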
Troubleshooting Checklist
Symptom | What to Check |
---|---|
No new data reaches NAN | • service data-transport-daemon status • Latest heartbeat in vNOC • Gateway log for incoming files |
Repeated offline Slack alerts | Workstation powered off? Network drop? Firewall blocking port 60195? |
Log grows rapidly | log_level left at trace – reset to info |
Experiments remain spooled | Gateway unreachable → verify IP/port and gateway service status |
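The firewall and gateway checks in the table can be approximated with a simple connection test. The sketch below assumes that port 60195 is a TCP port; this page does not state the protocol or which machine listens on it, so `host` must be supplied by the operator. The function name is hypothetical.

```python
import socket


def port_reachable(host, port=60195, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the NDTS port uses TCP. A False result can mean a firewall
    block, a stopped service, or an unreachable host.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result only narrows the problem down; the gateway service status and the configured IP/port still need to be verified separately.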
Next Step
Return to NDTS Overview or continue to Accessing Collected Data.