Revision as of 19:23, 2 June 2025

Running and Monitoring the Daemon

This page explains how to control the data-transport-daemon service, verify connectivity, and interpret the daemon’s log and audit files on every spectrometer workstation.

Service Control

# Start the daemon
sudo /sbin/service data-transport-daemon start

# Stop the daemon
sudo /sbin/service data-transport-daemon stop

# Restart (reloads configuration)
sudo /sbin/service data-transport-daemon restart

# Check status
sudo /sbin/service data-transport-daemon status

The daemon refuses to start if another instance is already running.
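
Because a second instance will not start, a wrapper can check the status before issuing a start. The following is a minimal sketch, assuming the init script follows the common convention of returning exit status 0 from `status` when the service is running. The names `ensure_daemon_running` and `SERVICE_CMD` are hypothetical and not part of the daemon.

```shell
# Start the daemon only when "status" reports that it is not running.
# ASSUMPTION: "status" exits 0 when the daemon is running (common init-script
# convention; not confirmed for this service).
# SERVICE_CMD is a hypothetical override that allows a dry run with a stub.
ensure_daemon_running() {
    svc="${SERVICE_CMD:-/sbin/service}"
    if sudo "$svc" data-transport-daemon status >/dev/null 2>&1; then
        echo "data-transport-daemon is already running"
    else
        echo "data-transport-daemon is not running; starting it"
        sudo "$svc" data-transport-daemon start
    fi
}
```

Calling `ensure_daemon_running` with no override uses `/sbin/service`, as in the commands above.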

Heartbeat and Connectivity

The daemon sends a heartbeat to the Gateway every 10 minutes.
The Gateway forwards that heartbeat to the NDTS Receiver; entries are visible in vNOC.

Slack Notifications

When heartbeats stop, the Receiver posts to the facility’s Slack channel.

Condition | Time-out | Receiver Action | Slack Message
First missed heartbeat > 20 min | ≈ 20 min | Mark workstation offline | offline
Heartbeat still missing (next poll) | +8 min | Re-post offline (max 3) | offline
Heartbeat resumes | – | Mark workstation online | online
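
The first row of the table can be approximated locally when comparing a heartbeat timestamp from vNOC with the current time. This is only a sketch: the function name `heartbeat_state` is hypothetical, the 1200-second limit is taken from the approximate 20-minute time-out above, and the Receiver's actual logic may differ.

```shell
# Classify a workstation from the age of its last heartbeat.
# Arguments: last heartbeat time and current time, both in epoch seconds.
# ASSUMPTION: "offline" means the heartbeat is older than 20 minutes (1200 s).
heartbeat_state() {
    age=$(( $2 - $1 ))
    if [ "$age" -gt 1200 ]; then
        echo "offline"
    else
        echo "online"
    fi
}
```

For example, `heartbeat_state "$last_epoch" "$(date +%s)"` reports the state for a heartbeat recorded at `$last_epoch`.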

Slack channel names (one per facility):

ccrc-ndts-notifications
nmrfam-ndts-notifications
uchc-ndts-notifications

Version Tracking

On start-up, the daemon writes its version to the log.
A file named /opt/nan-dtdaemon/running_workstation_version-X.Y.Z contains the version and start timestamp.
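
Since the version is part of the file name, it can be read without opening the file. The helper below is a sketch; `running_daemon_version` is a hypothetical name, and the directory argument exists only so the function can be tried against a test directory.

```shell
# Print the version encoded in the running_workstation_version-X.Y.Z file name.
# The optional argument overrides the default directory /opt/nan-dtdaemon.
running_daemon_version() {
    dir="${1:-/opt/nan-dtdaemon}"
    for f in "$dir"/running_workstation_version-*; do
        [ -e "$f" ] || continue    # the glob matched nothing
        # Strip everything up to and including the fixed file-name prefix.
        echo "${f##*running_workstation_version-}"
    done
}
```

If more than one such file exists, the function prints one line per file; it prints nothing when no file is present.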

Experiment Transfer Audit

Every processed experiment appends one line to

/opt/nan-dtdaemon/logs/ndtd_audit.txt

Fields:

Timestamp
Workstation (Linux) user
Selected NMRhub user (or unselected)
Experiment start & end time
Path to experiment data
Daemon version
Action (sent • spooled • sent-spooled • skipped-trivial • skipped-disabled)
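
A per-action tally gives a quick overview of how experiments were handled. The sketch below rests on an assumption the page does not confirm: that fields are separated by whitespace and that the action is the last field on each line, as the field order above suggests. `audit_action_counts` is a hypothetical name.

```shell
# Count audit lines per action and print "action count", sorted by action.
# ASSUMPTION: whitespace-separated fields with the action as the last field;
# the delimiter is not documented, so adjust if the real file differs.
audit_action_counts() {
    awk 'NF { count[$NF]++ } END { for (a in count) print a, count[a] }' \
        "${1:-/opt/nan-dtdaemon/logs/ndtd_audit.txt}" | sort
}
```

Run without arguments, the function reads the audit file at the path given above; a different path can be passed as the first argument.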

Daemon Log File

Main log:
```
/opt/nan-dtdaemon/logs/nan-dtdaemon.log
```
Verbosity is set by log_level in ndtd_configuration.dat, from least to most verbose (fatal < error < warning < info < debug < trace).
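
The configured level can be checked before and after a debugging session. The sketch below assumes the setting appears on a line of the form `log_level <value>`, and it takes the file path as an argument because this page does not say where ndtd_configuration.dat is stored. `configured_log_level` is a hypothetical name.

```shell
# Print the configured log level from the file given as the first argument.
# ASSUMPTION: the setting is written as "log_level <value>" on one line.
# If the key appears more than once, the last occurrence is reported.
configured_log_level() {
    awk '$1 == "log_level" { level = $2 } END { if (level != "") print level }' "$1"
}
```

The function prints nothing when no such line is found.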

Example start-up excerpt (level INFO):

Thu Sep 28 13:17:03 2023 LOG_START Started dtd logger.
Thu Sep 28 13:17:03 2023 INFO NDTD Workstation version is 1.0.15
Thu Sep 28 13:17:03 2023 INFO *** This is a Topspin Workstation ***
Thu Sep 28 13:17:03 2023 INFO Ndtd Control Processor listening.
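
The version line in the excerpt above can be extracted to confirm which daemon version last started. This is a sketch that relies only on the `NDTD Workstation version is` wording shown in the excerpt; `last_logged_version` is a hypothetical name.

```shell
# Print the most recent version reported in the daemon log.
# Matches lines such as "... INFO NDTD Workstation version is 1.0.15"
# and keeps the last one found.
last_logged_version() {
    awk '/NDTD Workstation version is/ { v = $NF } END { if (v != "") print v }' \
        "${1:-/opt/nan-dtdaemon/logs/nan-dtdaemon.log}"
}
```

The result can be compared with the version in the running_workstation_version-X.Y.Z file name described under Version Tracking.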

Troubleshooting Checklist

Symptom | What to Check
No new data reaches NAN | • `service data-transport-daemon status` • Latest heartbeat timestamp in vNOC • Gateway log for incoming files
Repeated offline Slack alerts | Workstation powered off? Network drop? Firewall still allowing port 60195?
Log growing rapidly | `log_level trace` left enabled → set back to info
Experiments stay spooled | Gateway unreachable → verify IP/port and Gateway service status

Next Step

Return to NDTS Overview or proceed to Accessing Collected Data.

@@ Line 1: / Line 1: @@
 = Running and Monitoring the Daemon =
-This page explains how to control the **data-transport-daemon** service, verify connectivity, and interpret the logs produced on each spectrometer workstation.
+This page explains how to control the '''data-transport-daemon''' service, verify connectivity, and interpret the daemon’s log and audit files on every spectrometer workstation.
-== '''Starting, Stopping, and Checking Status''' ==
+== '''Service Control''' ==
 <pre>
 # Start the daemon
@@ Line 17: / Line 17: @@
 sudo /sbin/service data-transport-daemon status
 </pre>
-*The daemon will refuse to start if an instance is already running on the workstation.*
+*The daemon refuses to start if another instance is already running.*
 == '''Heartbeat and Connectivity''' ==
-* By default the daemon sends a **heartbeat** to the Gateway every **10 minutes**.
+* The daemon sends a heartbeat to the Gateway every '''10&nbsp;minutes'''.
-* The Gateway forwards that heartbeat to the NDTS Receiver, where it is logged in the NAN Repository and surfaced in vNOC.
+* The Gateway forwards that heartbeat to the NDTS Receiver; entries are visible in vNOC.
 === Slack Notifications ===
-When a heartbeat is missed, the Receiver posts alerts to the facility’s Slack channel.
+When heartbeats stop, the Receiver posts to the facility’s Slack channel.
 {| class="wikitable"
-! Condition !! Time-out !! Action !! Slack message
+! Condition !! Time-out !! Receiver Action !! Slack Message
 |-
-| Missed heartbeat &gt; 20 min || ≈ 20 min || Daemon marked '''offline''' || “*offline*” message (repeats once)
+| First missed heartbeat > 20 min | ≈ 20 min | Mark workstation '''offline''' | ''offline''
 |-
-| Heartbeat resumes || – || Daemon marked '''online''' || “*online*” message
+| Heartbeat still missing (next poll) | +8 min | Re-post '''offline''' (max 3) | ''offline''
+|-
+| Heartbeat resumes | – | Mark workstation '''online''' | ''online''
 |}
-Channels are named:
+Slack channel names (one per facility):
 * <code>ccrc-ndts-notifications</code>
@@ Line 40: / Line 42: @@
 * <code>uchc-ndts-notifications</code>
-== '''Version Information''' ==
+== '''Version Tracking''' ==
-* On daemon start-up, the version is written to the log file (see below).
+* On start-up, the daemon writes its version to the log.
-* A file named **<code>/opt/nan-dtdaemon/running_workstation_version-X.Y.Z</code>** is created, timestamped with the start time.
+* A file named
+  <pre>/opt/nan-dtdaemon/running_workstation_version-X.Y.Z</pre>
+  contains the version and start timestamp.
 == '''Experiment Transfer Audit''' ==
-Every processed experiment adds one line to
+Every processed experiment appends one line to
 <pre>/opt/nan-dtdaemon/logs/ndtd_audit.txt</pre>
 Fields:
-# Timestamp  # Workstation user  # NMRhub user (or ‘‘unselected’’)  # Start & End time
+# Timestamp
-# Path to data  # Daemon version  # Action
+# Workstation (Linux) user
-(sent | spooled | sent-spooled | skipped-trivial | skipped-disabled)
+# Selected NMRhub user (or ''unselected'')
+# Experiment start & end time
+# Path to experiment data
+# Daemon version
+# Action (sent • spooled • sent-spooled • skipped-trivial • skipped-disabled)
-== '''Daemon Logs''' ==
+== '''Daemon Log File''' ==
 * Main log: <pre>/opt/nan-dtdaemon/logs/nan-dtdaemon.log</pre>
-* **log_level** is set in <code>ndtd_configuration.dat</code>
+* Verbosity is set by '''log_level''' in <code>ndtd_configuration.dat</code>
-   (fatal &lt; error &lt; warning &lt; info &lt; debug &lt; trace).
+   (fatal < error < warning < '''info''' < debug < trace).
 Example start-up excerpt (level INFO):
@@ Line 69: / Line 77: @@
 == '''Troubleshooting Checklist''' ==
 {| class="wikitable"
-! Symptom !! Check
+! Symptom !! What to Check
 |-
-| No new data in NAN | • <code>service data-transport-daemon status</code>
+| No new data reaches NAN | • <code>service data-transport-daemon status</code><br/>• Latest heartbeat timestamp in vNOC<br/>• Gateway log for incoming files
-• Heartbeat timestamp in vNOC
-• Gateway log for incoming files
 |-
-| Slack “offline” alerts | Workstation powered off? Network drop? Firewall blocking port 60195?
+| Repeated ''offline'' Slack alerts | Workstation powered off? Network drop? Firewall still allowing port&nbsp;60195?
 |-
-| Log file grows rapidly | <code>log_level trace</code> left enabled → reset to '''info'''
+| Log growing rapidly | <code>log_level trace</code> left enabled → set back to '''info'''
 |-
-| Experiments marked ‘‘spooled’’ only | Gateway unreachable → verify IP/port and gateway service status
+| Experiments stay ''spooled'' | Gateway unreachable → verify IP/port and Gateway service status
 |}
+== '''Next Step''' ==
+*Return to [[NDTS Overview|NDTS Overview]] or proceed to [[NDTS_Data_Access|Accessing Collected Data]].*

NDTS Daemon Operation: Difference between revisions
