Timeout Anything

  • python, linux

Sometimes blocking operations take a long time to complete and we need a way to break them and investigate what’s going on. Unfortunately, not all of them support timing out and returning control to the program. There are many ways to deal with the situation, like using a separate thread or an async library. If a blocking operation involves waiting for a file descriptor to become ready, we could wrap it in select() or poll(). These solutions are fine, but unnecessarily complex and might bite us when we want to add other features to our program.
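For instance, the select() variant might look like this (a minimal sketch; the echo subprocess is just a stand-in for any command with a piped stdout):

```python
import select
import subprocess

# Stand-in subprocess that writes one line to its stdout pipe.
proc = subprocess.Popen(["echo", "hello"], stdout=subprocess.PIPE)

# Wait up to 5 seconds for stdout to become readable before calling
# readline(), so the read itself can no longer block indefinitely.
readable, _, _ = select.select([proc.stdout], [], [], 5.0)
if readable:
    line = proc.stdout.readline()
else:
    line = None  # timed out: nothing arrived within 5 seconds

proc.wait()
```

Note that select() only tells us the descriptor is readable; a partial line (no trailing newline yet) would still make readline() block, which is one of the complications this approach drags in.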

Let’s write a simple filter: a program which runs a subprocess, reads its output, and converts it to something else. In Python we can write it like this:

import sys
import subprocess


def process_line(s: str) -> str:
    ...  # skip the logic for simplicity


cmd = ["tail", "-f", "myfile"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    print(process_line(line.decode()), flush=True)

This program has a problem: the call to proc.stdout.readline() might block indefinitely. readline() only returns when it reaches a newline character or when the subprocess sends EOF (end-of-file), typically by closing its standard output. But a process can finish abruptly, for example when killed with SIGKILL. In such a case it won’t have an opportunity to cleanly close its file descriptors, readline() will block forever, and our program will hang. It would be good to check from time to time whether there’s still a subprocess to read from.

We need a background timeout for the readline() call, but we have a synchronous, single-threaded application. How do we add one? The answer is: use standard OS signals, specifically SIGALRM.

With the signal.alarm() function we can arrange for SIGALRM to be delivered once a configured time (in seconds) passes. If we also install a signal handler which raises TimeoutError when SIGALRM occurs, we end up with a very simple and robust way to set up a timeout for any function.
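In its rawest form the mechanism looks like this (a minimal sketch; the blocking os.read() on a pipe nobody writes to stands in for any blocking call):

```python
import os
import signal


def _handler(signum, frame):
    # Runs in the main thread when SIGALRM is delivered.
    raise TimeoutError("Timeout expired")


signal.signal(signal.SIGALRM, _handler)
signal.alarm(1)  # ask the kernel to deliver SIGALRM in 1 second

r, w = os.pipe()  # nobody ever writes to w, so reading r blocks
try:
    os.read(r, 1)  # would block forever...
    timed_out = False
except TimeoutError:
    timed_out = True  # ...but the alarm interrupts it after 1 second
finally:
    signal.alarm(0)  # cancel any pending alarm
    signal.signal(signal.SIGALRM, signal.SIG_DFL)
    os.close(r)
    os.close(w)
```

The boilerplate around installing the handler and cancelling the alarm is exactly what we’ll hide in a context manager next.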

To simplify its use, let’s put it into a context manager. It’ll set up and clear the alarm automatically whenever we enter and leave the with block.

import signal
from contextlib import contextmanager


@contextmanager
def timeout(secs: int):
    def _handler(signum, frame):
        raise TimeoutError("Timeout expired")

    assert secs > 0, "timeout must be positive integer"

    curr = signal.alarm(0)
    if curr != 0:
        # restore previous alarm and fail setting the new one
        signal.alarm(curr)
        raise AssertionError("only one SIGALRM can be active at a time")

    signal.signal(signal.SIGALRM, _handler)
    signal.alarm(secs)

    try:
        yield
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, signal.SIG_DFL)

From now on we can check every 5 seconds whether the subprocess is still alive and act accordingly if it isn’t. Let’s see how it works:

while True:
    with timeout(5):
        try:
            line = proc.stdout.readline()
        except TimeoutError:
            if proc.poll() is not None:
                print("Subprocess died", file=sys.stderr)
                sys.exit(1)
            continue

        if not line:  # EOF
            break

    print(process_line(line.decode()), flush=True)

This way we may add timeouts to any function, not only I/O-related ones. A caveat of this approach is that only one alarm can be scheduled at a time. We must be aware of this and not use alarms for other purposes, or at least not while the timeout clock is ticking. Specifically, we mustn’t nest timeouts.
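For example, the alarm can even interrupt a pure-Python busy loop, because the interpreter checks for pending signals between bytecode instructions. A self-contained sketch (it repeats the timeout context manager from above so it runs on its own):

```python
import signal
from contextlib import contextmanager


@contextmanager
def timeout(secs: int):
    def _handler(signum, frame):
        raise TimeoutError("Timeout expired")

    signal.signal(signal.SIGALRM, _handler)
    signal.alarm(secs)
    try:
        yield
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, signal.SIG_DFL)


# No I/O involved: a plain busy loop, interrupted after 1 second.
interrupted = False
try:
    with timeout(1):
        while True:
            pass  # spin forever
except TimeoutError:
    interrupted = True
```

This wouldn’t work with, say, a thread calling readline() on our behalf: signal handlers in Python run only in the main thread, and a CPU-bound loop in a C extension that doesn’t release control back to the interpreter wouldn’t be interrupted either.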

The manual page for alarm() also notes that sleep() may be implemented using SIGALRM, so mixing calls to alarm() and sleep() might not be the best idea.