
13. Maintenance & Diagnostics

We expect the s/qmail daemons to be 'supervised'. The premier tool for this is DJB's 'daemontools', but others will work as well.
In essence this means: if a service terminates abnormally (abends), it is restarted, though without solving the cause.


13.1 Supervise

s/qmail comes with run scripts tailored for supervise. These scripts are rather simple and need to be adjusted to the specific environment.

The exception is the generic log run script. Of course, particular filters can be defined as well.

13.2 dmesg

Abending processes can be tracked by means of entries in the kernel's ring buffer, accessible through the dmesg (diagnostic messages) facility. It makes sense to inspect its content on a regular basis. Under Linux, the recorded time stamps are converted to a human-readable form via:

    dmesg -T
    [Mo Mar 8 21:50:17 2021] UDP: short packet: From 45.11.18.194:30120 366/8 to 85.25.149.179:34021
    [So Mar 14 19:32:09 2021] tai64nlocal[10709]: segfault at 14 ip 000000000040061f sp 00007fff42e59da0 error 4 in tai64nlocal[400000+2000]
    [So Mar 14 19:32:19 2021] tai64nlocal[10729]: segfault at 14 ip 000000000040061f sp 00007ffdc91aeea0 error 4 in tai64nlocal[400000+2000]
    [So Mar 14 19:32:26 2021] tai64nlocal[10734]: segfault at 14 ip 000000000040061f sp 00007fff9d041d00 error 4 in tai64nlocal[400000+2000]
    [Sa Mar 20 15:18:08 2021] UDP: bad checksum. From 119.230.34.47:41143 to 85.25.149.179:65535 ulen 583
    [So Mar 21 20:33:48 2021] sslserver[13568]: segfault at 0 ip 000000000040e308 sp 00007fffca07a238 error 4 in sslserver[400000+1a000]

Here, you see tai64nlocal abending, and sslserver as well; the latter happened while running some alpha tests.
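For a regular inspection, the segfault entries can be condensed into a count per program. The following is only a minimal sketch: the function name segfault_summary is hypothetical, and it assumes the Linux kernel message layout shown in the sample above.

```shell
#!/bin/sh
# Hypothetical helper (not part of s/qmail): summarise segfault entries
# from kernel log text read on stdin.
# Prints one line per program in the form "<count> <program>".
segfault_summary() {
    grep 'segfault at' |
        # Drop the leading "[time stamp] " and everything from "[pid]:" on,
        # which leaves only the program name.
        sed -e 's/^\[[^]]*\] *//' -e 's/\[[0-9]*\]:.*$//' |
        sort | uniq -c | awk '{ print $1, $2 }'
}

# Possible usage on a Linux host:
#   dmesg -T | segfault_summary
```

Any program showing up here repeatedly is a candidate for a rebuild or a closer look at its logs.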

13.3 System updates

Here, I have been running some binaries for years, even across several updates of my FreeBSD operating system, which I use for development. If the C libraries are compatible, this may work, or it may not.

Recently, I saw tai64nlocal abending while filtering some logs with it. Thus, here is some advice:

If this works without problems, you can be sure that the new binaries are built to run on your current system. If you don't do this, things could still work, but occasional abends may happen.

13.4 Daemon maintenance

In theory, all s/qmail daemon processes should run friction-free and without maintenance. However, all daemon processes listening for connections from the Internet require additional management.

Here are some common problems which need attention:

13.5 Queue maintenance

In order to maintain s/qmail's queue and long-living delivery artifacts, the following tools are provided:

The first two tools are used to inspect and list the content of s/qmail's queue.

To use these tools, particular permissions are required. Check the individual man pages for requirements.

13.5.1 Delivery Timeout Table

qmail-remote records the state of unsuccessful SMTP TCP sessions in a binary file including the observed timeout.

The content of that table can be listed by calling qmail-tcpto. You will probably get something similar to:

    /var/qmail/bin/qmail-tcpto
    2a01:4f8:201:626c::13 timed out 131445 seconds ago; # recent timeouts: 2
    2001:470:7a56::1 timed out 131445 seconds ago; # recent timeouts: 2
    2001:470:6d:603:250:56ff:fea2:af49 timed out 131444 seconds ago; # recent timeouts: 2
    2001:1410:200:eea::1 timed out 131445 seconds ago; # recent timeouts: 2
    2a03:4000:6:23ed::1 timed out 131445 seconds ago; # recent timeouts: 2
    2a02:168:420b:f::1:2a timed out 131445 seconds ago; # recent timeouts: 2
    2a00:1450:400c:c07::1a timed out 255994 seconds ago; # recent timeouts: 1
    212.51.144.43 timed out 131382 seconds ago; # recent timeouts: 2
    2001:41d0:701:1100::4a3 timed out 131381 seconds ago; # recent timeouts: 2
    2a00:1450:400c:c08::1b timed out 131441 seconds ago; # recent timeouts: 1

There is no real reason to worry, other than that the binary file recording those issues may become large and unresponsive unless you use qmail-tcpok to clear it. On a busy server you should do this regularly.
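Before clearing the table it can be useful to see which addresses timed out repeatedly. The sketch below assumes the line layout of the qmail-tcpto listing shown above; the function name tcpto_repeat_offenders and its threshold argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of s/qmail): read qmail-tcpto output on
# stdin and print every address whose "recent timeouts" counter is at
# least the given threshold (first argument, default 2).
tcpto_repeat_offenders() {
    awk -v threshold="${1:-2}" \
        '/timed out/ && $NF + 0 >= threshold + 0 { print $1 }'
}

# Possible usage, followed by clearing the table:
#   /var/qmail/bin/qmail-tcpto | tcpto_repeat_offenders 2
#   /var/qmail/bin/qmail-tcpok
```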

13.5.2 Current Queue Content

The current queue content can be inspected both statistically and per message with regard to the undelivered mails.

Use qmail-qstat to get a statistical overview of the queue. This information can be retrieved without restrictions.

Individual mails in the queue are displayed by means of qmail-qread, including source and destination together with the delivery channel, the subject, and, most important, the delivery number.

Unlike the original qmail, s/qmail performs much better, and thus the entries you see are mostly due to greylisting and will be delivered once the remote MTA has changed its receiving policy.

Also, s/qmail is able to pre-process messages much faster due to the BIGTODO and EXTODO improvements. If you still observe bottlenecks in here, check the following:

  1. If the delivery problems are due to bounces, consider setting up a dedicated bounce host. See qmail-remote and read chapter 09.
  2. If the delivery volume is too high for a particular MTA/domain receiving the mail, you should set up a separate s/qmail instance by means of QMQ to delegate delivery to that MTA/domain.

13.5.3 qmail-qmaint

qmail-qmaint is a new module and able to do the following:

  1. It checks the s/qmail queue for consistency. It reports problems but is otherwise quiet.
  2. It allows you to remove (pre-processed) but unwanted mails from the queue, given their id.
  3. Given the DKIM stage area, it will automatically correct permissions, and
  4. will remove DKIM remnant files, provided -D is given as argument.

Prior to removing a mail from the queue, you have to call qmail-qread to retrieve the mail's id (without the leading pound sign '#'), stop qmail-send, and then call qmail-qmaint. It will work even without stopping qmail-send, but the queue might then be left in an inconsistent state, with complaints in the logs. This may require further action, such as shutting down or restarting qmail-send and potentially calling qmail-qmaint again interactively.
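Stripping the pound sign from the ids by hand is error-prone when several mails are involved. The following sketch collects the bare ids from a qmail-qread listing. The function name qread_ids is hypothetical, and the only assumption about the listing is that each id appears as a separate token of the form #NNN; how the id is then passed to qmail-qmaint is described in its man page and not shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of s/qmail): read a qmail-qread listing
# on stdin and print each message id without the leading '#'.
# Caveat: any other whitespace-separated token of the form "#123"
# (for example inside a displayed subject) would be printed too.
qread_ids() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^#[0-9]+$/)
                print substr($i, 2)
    }'
}

# Possible usage:
#   /var/qmail/bin/qmail-qread | qread_ids
```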

13.6 Diagnostics

There are a few conditions which call for diagnostics with s/qmail:

  1. qmail-smtpd does not receive unencrypted or TLS encrypted mail.
  2. qmail-pop3d does not deliver unencrypted or TLS encrypted mail.
  3. qmail-remote does not deliver unencrypted or TLS encrypted mail.

In general, we may consider again three possible cases for these problems:

In any of those cases, we need to do additional diagnostics, exceeding the information given in the s/qmail logs.

13.6.1 DNS problems

s/qmail comes with the following DNS tools:

Given an MX (SMTP) host, dnsmxip is the favorite tool to show the MX host and its IP address for the domain part of an email address. If a proper reply is given, you know that mail for that domain can (potentially) reach its purported destination.

The MX can be qualified further via dnstlsa MX, showing the DANE records for that MX host, and dnstxt DOMAIN, returning the SPF and DKIM settings.
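The tools can be chained, feeding the MX names reported by dnsmxip into dnstlsa. The sketch below assumes that dnsmxip prints lines of the form "mx1.netsolmail.net: 10 [172.65.252.97]" (host, preference, address), as in the sample session in section 13.6.3; the function name mx_hosts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of s/qmail): read dnsmxip output on
# stdin and print the distinct MX host names, one per line.
# Assumed line layout: "<host>: <preference> [<ip>]".
mx_hosts() {
    sed -n 's/^\([^:][^:]*\): *[0-9][0-9]* .*$/\1/p' | sort -u
}

# Possible usage: show the DANE records of every MX of a domain.
#   dnsmxip bartelsmissey.com | mx_hosts | while read -r mx; do
#       dnstlsa "$mx"
#   done
```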

13.6.2 SMTP/POP3 connection problems

Both SMTP and POP3 (as well as IMAP4) are protocols using simple ASCII commands to show and trigger their behavior. On an unencrypted connection, use telnet or mconnect from ucspi-tcp6 to simply connect to the SMTP server on port 25, which could be your own or a remote MX.
You can (potentially) trigger the reception of an email here. Even more important, when you start the session with HELO or EHLO, the MX tells you its capabilities.

For POP3 use telnet host pop3 to test a connection. However, POP3 is not very eloquent here.
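When a captured SMTP session needs to be checked for a particular capability, the EHLO reply can be scanned mechanically. This is a minimal sketch with a hypothetical function name, advertises_starttls; it relies only on the standard reply syntax, where each capability is sent as "250-KEYWORD" or, on the last line, "250 KEYWORD".

```shell
#!/bin/sh
# Hypothetical helper: read an SMTP EHLO reply on stdin and return
# success (exit status 0) if the server advertises STARTTLS.
advertises_starttls() {
    # SMTP lines end in CR LF; strip the CR before matching.
    tr -d '\r' | grep -Ei '^250[- ]STARTTLS$' >/dev/null
}

# Possible usage with a transcript saved to a file named session.txt
# (file name chosen for illustration only):
#   advertises_starttls < session.txt && echo "STARTTLS offered"
```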

13.6.3 TLS connection problems

Diagnostics of a TLS encrypted connection is a bit more difficult. Given the current Internet mail services, StartTLS can be assumed.

We have two cases:

  1. Mail cannot be sent by means of qmail-remote.
  2. Mail cannot be received by qmail-smtpd.

Let's discuss the first case with a real-world sample (which happened in September 2021):

  1. "bartelsmissey.com" is the domain name of the recipient.

  2. What is the remote host (MX)?

      $ dnsmxip bartelsmissey.com
      mx1.netsolmail.net: 10 [172.65.252.97]
      $ telnet mx1.netsolmail.net 25
      Trying 172.65.252.97...
      Connected to mx1.netsolmail.net.
      Escape character is '^]'.
      220 inbound.net.registeredsite.com ESMTP SMTP Service (NO SPAM/UCE)
      ehlo du
      250-jax4mhib32.registeredsite.com Hello mail.fehcom.net [85.25.149.179], pleased to meet you
      250-ENHANCEDSTATUSCODES
      250-PIPELINING
      250-8BITMIME
      250-SIZE 69905064
      250-DSN
      250-ETRN
      250-STARTTLS
      250-DELIVERBY
      250 HELP
      quit
      221 2.0.0 jax4mhib32.registereds

     This looks ok.

  3. And now let's go for the StartTLS connection:

      $ openssl s_client -brief -starttls smtp -connect mx1.netsolmail.net:25
      Didn't find STARTTLS in server response, trying anyway...
      write:errno=32
      $ openssl s_client -showcerts -starttls smtp -connect mx1.netsolmail.net:25
      CONNECTED(00000005)
      Didn't find STARTTLS in server response, trying anyway...
      write:errno=32
      ---
      no peer certificate available
      ---
      No client certificate CA names sent
      ---
      SSL handshake has read 0 bytes and written 0 bytes
      Verification: OK
      ---
      New, (NONE), Cipher is (NONE)
      Secure Renegotiation IS NOT supported
      Compression: NONE
      Expansion: NONE
      No ALPN negotiated
      Early data was not sent
      Verify return code: 0 (ok)
      ---

     These results don't need a comment.

Thus openssl s_client is a very useful tool to diagnose conformance for a particular remote host offering TLS services.

Conversely, test the settings of your own TLS services by the very same means!
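For repeated checks, the telltale lines of a failed handshake can be detected automatically. The sketch below is an assumption-laden illustration rather than a complete test: the function name tls_handshake_failed is hypothetical, and it only looks for the two markers visible in the sample output above ("no peer certificate available" and "Cipher is (NONE)").

```shell
#!/bin/sh
# Hypothetical helper: read the output of "openssl s_client" (without
# -brief) on stdin and return success (exit status 0) if it contains
# the markers of a failed handshake seen in the sample session.
tls_handshake_failed() {
    grep -E 'no peer certificate available|Cipher is \(NONE\)' >/dev/null
}

# Possible usage against a host of your choice (replace HOST):
#   openssl s_client -showcerts -starttls smtp -connect HOST:25 \
#       </dev/null 2>&1 | tls_handshake_failed && echo "TLS handshake failed"
```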