NDTS Daemon Operation
Revision by Mmaciejewski (no edit summary)
Revision as of 19:36, 2 June 2025
Running and Monitoring the Daemon
This page explains how to control the data-transport-daemon service, verify connectivity, and interpret the daemon’s log and audit files on every spectrometer workstation.
Service Control
    # Start the daemon
    sudo /sbin/service data-transport-daemon start
    # Stop the daemon
    sudo /sbin/service data-transport-daemon stop
    # Restart (reloads configuration)
    sudo /sbin/service data-transport-daemon restart
    # Check status
    sudo /sbin/service data-transport-daemon status
Note: The daemon refuses to start if another instance is already running.
Heartbeat and Connectivity
- The daemon sends a heartbeat to the Gateway every 10 minutes.
- The Gateway forwards that heartbeat to the NDTS Receiver; entries are visible in vNOC.
Slack Notifications
When heartbeats stop, the Receiver posts to the facility’s Slack channel.
Condition | Time-out | Receiver Action | Slack Message |
---|---|---|---|
First missed heartbeat | > 20 min | Mark workstation offline | offline |
Still missing at next poll | + 8 min | Repeat offline (max 3) | offline |
Heartbeat resumes | – | Mark workstation online | online |
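The first and last rows of the table can be read as a simple age check on the most recent heartbeat. The sketch below only illustrates that rule; it is not the Receiver's actual code, and the function name `heartbeat_state` and the variable `OFFLINE_AFTER_MIN` are hypothetical. The repeat-alert behaviour (+ 8 min, max 3) is not modelled.

```shell
#!/bin/sh
# Illustrative only: mirrors the "> 20 min" threshold from the table.
OFFLINE_AFTER_MIN=20

# heartbeat_state LAST_EPOCH NOW_EPOCH
# Prints "offline" when the last heartbeat is more than 20 minutes old,
# otherwise "online". Both arguments are Unix timestamps in seconds.
heartbeat_state() {
    last=$1
    now=$2
    if [ $(( now - last )) -gt $(( OFFLINE_AFTER_MIN * 60 )) ]; then
        echo offline
    else
        echo online
    fi
}
```

For example, `heartbeat_state "$last_seen" "$(date +%s)"` classifies a workstation whose last heartbeat time (as shown in vNOC) has been converted to epoch seconds in `last_seen`.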
Slack channels (one per facility):
- ccrc-ndts-notifications
- nmrfam-ndts-notifications
- uchc-ndts-notifications
Version Tracking
- On start-up the daemon writes its version to the log.
- A file named /opt/nan-dtdaemon/running_workstation_version-X.Y.Z records the version and start time.
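Because the version is part of the marker file's name, it can be read without opening the log. The following is a minimal sketch; the function name `running_version` is hypothetical, and it inspects only the file name, not the file's contents or timestamps.

```shell
#!/bin/sh
# running_version [DIR]
# Prints the X.Y.Z suffix of every running_workstation_version-* file in DIR
# (default: /opt/nan-dtdaemon). Prints nothing if no marker file exists.
running_version() {
    dir=${1:-/opt/nan-dtdaemon}
    for f in "$dir"/running_workstation_version-*; do
        [ -e "$f" ] || continue   # glob did not match: no marker file
        # Strip everything up to and including the fixed prefix.
        echo "${f##*running_workstation_version-}"
    done
}
```

Calling `running_version` with no argument checks the default directory named above.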
Experiment Transfer Audit
Each processed experiment appends a line to /opt/nan-dtdaemon/logs/ndtd_audit.txt.
Fields: timestamp • workstation user • NMRhub user (or unselected) • start/end • path • daemon version • action (sent • spooled • sent-spooled • skipped-trivial • skipped-disabled)
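A quick tally of actions shows at a glance whether experiments are being sent or are piling up as spooled. The sketch below assumes that the action is the last whitespace-separated field on each audit line. The field separator is not documented on this page, so check a real line before relying on it. The function name `audit_action_counts` is hypothetical.

```shell
#!/bin/sh
# audit_action_counts FILE
# ASSUMPTION: the action (sent, spooled, sent-spooled, ...) is the last
# whitespace-separated field of each line. Prints "action count" pairs.
audit_action_counts() {
    awk 'NF { count[$NF]++ } END { for (a in count) print a, count[a] }' "$1" | sort
}
```

Usage: `audit_action_counts /opt/nan-dtdaemon/logs/ndtd_audit.txt`.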
Daemon Logging
NDTS writes two workstation logs:
- nan-dtdaemon.log — runtime events, heartbeats, errors
- ndtd_audit.txt — one-line summary per experiment
This page focuses on nan-dtdaemon.log.
Log Levels
Each line begins with a level tag. The level is controlled by the log_level parameter in ndtd_configuration.dat.
Level | Verbosity | Typical Use |
---|---|---|
fatal | Highest-priority, least frequent | Events that make the daemon shut down and cannot be auto-recovered |
error | Critical problems | Failures that interrupt normal operation while the daemon keeps running |
warning | Important but non-fatal issues | Conditions worth attention; daemon recovers automatically |
info | Default | Unusual or noteworthy events; normal operations generate very little output |
debug | Diagnostic detail | Ongoing list of major operations; log grows steadily |
trace | Maximum detail | Every internal step; use only for short troubleshooting sessions |
Log File Example
The fragment below is reproduced verbatim from the PDF (pp. 14-15):
    Thu Sep 28 13:17:03 2023 LOG_START Started dtd logger.
    Thu Sep 28 13:17:03 2023 INFO NDTD Workstation version is 1.0.15
    Thu Sep 28 13:17:03 2023 INFO *** This is a Topspin Workstation ***
    Thu Sep 28 13:17:03 2023 INFO Ndtd Control Processor listening.
    Thu Sep 28 13:17:03 2023 INFO Entering polling loop...
    Thu Sep 28 13:17:03 2023 INFO Workstation user has changed!
    Thu Sep 28 13:17:03 2023 INFO workstation user is nmradmin
    Thu Sep 28 13:17:03 2023 INFO User nmradmin is included in NAN data collection!
    Thu Sep 28 13:17:03 2023 INFO Harvesting setting for user nmradmin is on
    Thu Sep 28 13:17:03 2023 INFO Topspin program has been detected and is running.
    Thu Sep 28 13:17:03 2023 INFO Setting directory to watch to /opt/topspin4.2.0/prog/curdir/nmradmin/shmem
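In the fragment above, five timestamp fields precede the tag, so the tag is the sixth whitespace-separated field. A per-tag count is a quick way to see whether a log is dominated by debug or trace output. This is a minimal sketch with a hypothetical function name, `log_level_counts`; the log path is passed as an argument because this page does not state where nan-dtdaemon.log is stored.

```shell
#!/bin/sh
# log_level_counts FILE
# Counts lines per tag, taking the tag from field 6
# ("Thu Sep 28 13:17:03 2023 INFO ..."). Lines with fewer fields are skipped.
log_level_counts() {
    awk 'NF >= 6 { count[$6]++ } END { for (l in count) print l, count[l] }' "$1" | sort
}
```

Piping the same file through `awk '$6 == "INFO"'` lists only the lines carrying that tag.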
Analysis of the Example
- LOG_START: first line of every run; its timestamp shows when the daemon began.
- “Workstation version is 1.0.15”: confirms the daemon build that is running.
- “*** This is a Topspin Workstation ***”: the daemon detected TopSpin rather than VNMRJ.
- “Ndtd Control Processor listening.”: the background thread is ready to accept UI commands.
- “Entering polling loop...”: the daemon has fully initialised and is now watching for experiment triggers.
- User-change block (lines 6-9):
  - detects a KDE/X login to TopSpin (Workstation user has changed!)
  - logs the Linux username (nmradmin)
  - verifies the NMRhub mapping (included in NAN data collection)
  - notes the harvesting on/off status for that user
- “Topspin program has been detected and is running.”: the daemon confirmed that the acquisition software is active.
- “Setting directory to watch ...”: the final line shows the exact filesystem path monitored for experiment-complete triggers.
If the last two lines never appear, the daemon cannot locate TopSpin/VNMRJ and no data will be harvested.
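That check can be scripted by searching the log for the two messages. The sketch below uses the TopSpin wording from the example; the corresponding VNMRJ message is not shown on this page, so it is not covered. The function name `check_startup` is hypothetical, and the search covers the whole file rather than only the most recent run.

```shell
#!/bin/sh
# check_startup LOGFILE
# Returns 0 if both final startup messages occur in LOGFILE; otherwise
# prints each missing message and returns 1.
check_startup() {
    log=$1
    missing=0
    for msg in "Topspin program has been detected and is running." \
               "Setting directory to watch to"; do
        if ! grep -qF "$msg" "$log"; then
            echo "missing: $msg"
            missing=1
        fi
    done
    return "$missing"
}
```

A non-zero exit status from `check_startup path/to/nan-dtdaemon.log` suggests the daemon has not located the acquisition software.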
Troubleshooting Checklist
Symptom | What to Check |
---|---|
No new data reaches NAN | • service data-transport-daemon status • Latest heartbeat in vNOC • Gateway log for incoming files |
Repeated offline Slack alerts | Workstation powered off? Network drop? Firewall blocking port 60195? |
Log grows rapidly | log_level left at trace; reset to info |
Experiments remain spooled | Gateway unreachable → verify IP/port and gateway service status |
Next Step
Return to NDTS Overview or continue to Accessing Collected Data.