Monday, December 19, 2011

Signals and Threads with Python on FreeBSD

Over the last few months, I've been plagued by a fun bug in Python around handling of signals in multi-threaded programs on FreeBSD.

If you kill a multi-threaded program, FreeBSD will deliver the signal to any running thread, while Linux will only deliver the signal to the main thread. Python guarantees that as far as Python is concerned, only the main thread will handle the signal, but it makes no guarantees about anything else. Unfortunately, this leads to a few problems.

If you use the FreeBSD ports build of Python, for the most part you'll get correct thread and signal behavior. The upstream maintainers have installed a patch that basically blocks signals for all threads but the main thread. Unfortunately, this leads to one flaw. If ever you fork, from a thread (e.g. to spawn a subprocess), the signals are not unblocked, so your subprocess is unkillable.

If you use a stock version of Python on FreeBSD, you get the following, different, problems. This makes it difficult to write portable code on FreeBSD and Linux.

Working with signals and threads in Python on FreeBSD (stock Python)

If you're writing multi-threaded code on FreeBSD, and you want to handle signals, you need to ensure that you are prepared to handle interrupted system calls in every thread, not just the main thread. Usually this means wrapping them in a try/except block like this:

while True:
    try:
        data = my_sock.read()
        if not data:
            break
        buffer.append(data)
    except socket.error, e:
        if e.errno != errno.EINTR:
            raise

On Linux, you only need to worry about this in the main thread. On FreeBSD, you need to worry about it everywhere.

The other thing you need to avoid is blocking indefinitely in the main thread. It's a common pattern to spawn a thread to handle connections and have your main thread wait for a signal or some other indication it's time to quit. Unforunately, because of Python's assumptions about signals, this doesn't work on FreeBSD. Not even signal.pause() in the main thread will return when a signal is received. For example, the following code will never exit on FreeBSD.

import os
import signal
import threading
import time

def handler(signum, frame):
    print 'Signal %d handled' % (signum,)

def kill_me():
    time.sleep(1)
    print 'Suicide?'
    os.kill(os.getpid(), signal.SIGTERM)
    time.sleep(1)

if __name__ == '__main__':
    signal.signal(signal.SIGTERM, handler)
    t = threading.Thread(target=kill_me).start()
    signal.pause()
    print 'Got a signal, exit.'


The fix is to replace the blocking call (in the example above the signal.pause() in the main thread with a sleep-loop.

def my_signal_handler(signal, frame):
    global _run
    _run = False

_run = True
signal.signal(signal.SIGTERM, my_signal_handler)

# Spawn some threads ...

while _run:
    time.sleep(1)

# Join the threads ...

If your application has a need to handle signals faster, you'll want to have a shorter sleep. If you can tolerate a longer delay, pick a longer sleep time. This is obviously inefficient, but it's the best you can do with a stock Python.

No comments:

Post a Comment