python - multiprocessing.Queue deadlocks after the "reader" process dies

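I've been playing with the multiprocessing package and noticed that a queue can become deadlocked for reading in the following situation:

1. A "reader" process calls get with a timeout > 0:

self.queue.get(timeout=3)

2. The "reader" dies while get is blocked waiting for the timeout.

After that, the queue stays locked forever.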

Application demonstrating the problem

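I created two child processes: "Worker" (puts into the queue) and "Receiver" (gets from the queue). In addition, the parent process periodically checks whether its children are alive and starts a new child when needed.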

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import multiprocessing
import procname
import time

class Receiver(multiprocessing.Process):
    ''' Reads from queue with 3 secs timeout '''

    def __init__(self, queue):
        multiprocessing.Process.__init__(self)
        self.queue = queue

    def run(self):
        procname.setprocname('Receiver')
        while True:
            try:
                msg = self.queue.get(timeout=3)
                print '<<< `{}`, queue rlock: {}'.format(
                    msg, self.queue._rlock)
            except multiprocessing.queues.Empty:
                print '<<< EMPTY, Queue rlock: {}'.format(
                    self.queue._rlock)
                pass


class Worker(multiprocessing.Process):
    ''' Puts into queue with 1 sec sleep '''

    def __init__(self, queue):
        multiprocessing.Process.__init__(self)
        self.queue = queue

    def run(self):
        procname.setprocname('Worker')
        while True:
            time.sleep(1)
            print 'Worker: putting msg, Queue size: ~{}'.format(
                self.queue.qsize())
            self.queue.put('msg from Worker')


if __name__ == '__main__':
    queue = multiprocessing.Queue()

    worker = Worker(queue)
    worker.start()

    receiver = Receiver(queue)
    receiver.start()

    while True:
        time.sleep(1)
        if not worker.is_alive():
            print 'Restarting worker'
            worker = Worker(queue)
            worker.start()
        if not receiver.is_alive():
            print 'Restarting receiver'
            receiver = Receiver(queue)
            receiver.start()

Process tree as shown by ps

bash
 \_ python queuetest.py
     \_ Worker
     \_ Receiver

Console output

$ python queuetest.py
Worker: putting msg, Queue size: ~0
<<< `msg from Worker`, queue rlock: <Lock(owner=None)>
Worker: putting msg, Queue size: ~0
<<< `msg from Worker`, queue rlock: <Lock(owner=None)>
Restarting receiver                        <-- killed Receiver with SIGTERM
Worker: putting msg, Queue size: ~0
Worker: putting msg, Queue size: ~1
Worker: putting msg, Queue size: ~2
<<< EMPTY, Queue rlock: <Lock(owner=SomeOtherProcess)>
Worker: putting msg, Queue size: ~3
Worker: putting msg, Queue size: ~4
Worker: putting msg, Queue size: ~5
<<< EMPTY, Queue rlock: <Lock(owner=SomeOtherProcess)>
Worker: putting msg, Queue size: ~6
Worker: putting msg, Queue size: ~7

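Is there any way to get around this? Using get_nowait combined with sleep seems to be a workaround of sorts, but it does not read data "as it comes".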

System information

$ uname -sr
Linux 3.11.8-200.fc19.x86_64

$ python -V
Python 2.7.5

In [3]: multiprocessing.__version__
Out[3]: '0.70a1'

"It just works" solution

While writing this question, I came up with a rather silly modification to the Receiver class:

class Receiver(multiprocessing.Process):

    def __init__(self, queue):
        multiprocessing.Process.__init__(self)
        self.queue = queue

    def run(self):
        procname.setprocname('Receiver')
        while True:
            time.sleep(1)
            while True:
                try:
                    msg = self.queue.get_nowait()
                    print '<<< `{}`, queue rlock: {}'.format(
                        msg, self.queue._rlock)
                except multiprocessing.queues.Empty:
                    print '<<< EMPTY, Queue rlock: {}'.format(
                        self.queue._rlock)
                    break

But this doesn't look very good to me.

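Best answer: This probably happens because the not_empty.release() in Queue.get() never runs (the process has been killed). Have you tried catching the TERM signal in Receiver and releasing the Queue mutex before exiting?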
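As a rough illustration of the signal-handling idea, here is a minimal sketch. It is written for Python 3 on a Unix-like system rather than the Python 2.7 used in the question, and it forces the "fork" start method to mirror that Linux setup. The names `_exit_on_sigterm`, `receiver` and `read_after_killing_receiver` are made up for this example. Instead of touching the queue's private lock directly, the handler calls `sys.exit()`, which raises `SystemExit` inside the blocked `get()` so that its own cleanup can release the read lock. This only helps for signals that can be caught; SIGKILL cannot be handled.

```python
import multiprocessing
import queue as stdlib_queue  # only needed for the Empty exception class
import signal
import sys
import time

# Assumption: Unix-like system; "fork" mirrors the Linux setup in the question.
ctx = multiprocessing.get_context("fork")


def _exit_on_sigterm(signum, frame):
    # sys.exit() raises SystemExit in the main thread, so a pending
    # q.get() unwinds through its cleanup code and releases the read lock.
    sys.exit(0)


def receiver(q, ready, install_handler):
    if install_handler:
        signal.signal(signal.SIGTERM, _exit_on_sigterm)
    ready.set()  # tell the parent that setup is done
    while True:
        try:
            q.get(timeout=30)
        except stdlib_queue.Empty:
            pass


def read_after_killing_receiver(install_handler=True):
    """Kill a blocked receiver, then try to read from the same queue.

    Returns the message, or None if the read timed out.
    """
    q = ctx.Queue()
    ready = ctx.Event()
    proc = ctx.Process(target=receiver, args=(q, ready, install_handler))
    proc.start()
    ready.wait(5)
    time.sleep(0.5)   # give the child time to block inside get()
    proc.terminate()  # sends SIGTERM on Unix
    proc.join()

    q.put("still readable")
    try:
        return q.get(timeout=2)
    except stdlib_queue.Empty:
        return None


if __name__ == "__main__":
    print(read_after_killing_receiver(install_handler=True))
```

Calling `read_after_killing_receiver(install_handler=False)` should reproduce the situation from the question, where the read times out because the killed process never released the lock.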
