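This all started with a question I had while reading the code below: the worker loop is clearly while True, so why doesn't the program loop forever?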
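import os
import threading
import urllib2
import Queue


class D(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        # No exit condition: the worker blocks in get() once the queue is empty.
        while True:
            url = self.queue.get()
            self.download_file(url)
            self.queue.task_done()

    def download_file(self, url):
        h = urllib2.urlopen(url)
        name = os.path.basename(url) + '.html'
        with open(name, 'wb') as f:
            # Copy the response to disk in 1 KB chunks.
            while True:
                c = h.read(1024)
                if not c:
                    break
                f.write(c)


if __name__ == "__main__":
    urls = ['http://www.baidu.com', 'http://www.sina.com']
    queue = Queue.Queue()
    for i in range(5):
        t = D(queue)
        t.setDaemon(True)  # daemon threads do not keep the process alive
        t.start()
    for u in urls:
        queue.put(u)
    queue.join()  # returns once every queued URL has been marked task_done()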
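Until now I had simply assumed that setDaemon just marks a thread as a background thread, and I never dug into what that really means.
The key to the puzzle is exactly setDaemon. In the low-level thread module, once the main thread finishes, all other threads are ended too. That is to be expected: when the main thread ends, Python tears down the runtime, so the child threads are inevitably killed along with it.
The threading module's setDaemon exists to address this. With setDaemon(True), the behavior is the same as before: when the main thread ends, those child threads end with it. With setDaemon(False), which is also the default for threads created from the main thread, the main thread waits for that thread to finish before exiting, as if you had called the thread's join method.
So if you comment out the setDaemon call above or change it to False, the program never exits: after queue.join() returns, the workers are still blocked in queue.get() inside while True, and the interpreter waits for them forever.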
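The difference is easiest to see by timing how long a process takes to exit. The following is a minimal sketch, not part of the original example: the helper name run_child and the 2-second sleep are assumptions chosen for illustration. It launches a small child script twice, once with a daemon thread and once with a non-daemon thread, and measures the time until the child process exits.

```python
import subprocess
import sys
import textwrap
import time

# Child script: start one thread that sleeps for 2 seconds, then let the
# main thread fall off the end of the script.
CHILD = textwrap.dedent("""
    import threading
    import time

    t = threading.Thread(target=time.sleep, args=(2.0,))
    t.daemon = {daemon}
    t.start()
""")


def run_child(daemon):
    """Run the child script and return the seconds until its process exits."""
    start = time.time()
    subprocess.check_call([sys.executable, "-c", CHILD.format(daemon=daemon)])
    return time.time() - start


if __name__ == "__main__":
    # daemon=True: the process exits as soon as the main thread is done.
    print("daemon=True : %.2fs" % run_child(True))
    # daemon=False: the interpreter waits for the sleeping thread first.
    print("daemon=False: %.2fs" % run_child(False))
```

With daemon=False the child should take at least about two seconds to exit, while with daemon=True it should return after roughly the interpreter start-up time.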
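We actually don't recommend the approach above. It has a thread-pool flavor to it, but if you have read a few Python thread pool implementations, you will find that their while True loops contain an explicit exit check, because in the Python world explicit is more pythonic than implicit. Unfortunately, the code above comes from the book <<编写高质量代码:改善Python程序的91个建议>> (Writing High-Quality Code: 91 Suggestions for Improving Python Programs). I am not bashing the book, but I do think this example is questionable.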
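As an illustration of an explicit exit check, here is a minimal sketch of a worker that stops when it receives a sentinel value. It is one common pattern, not the book's code or any particular thread pool's implementation. The names Worker, _STOP and process_all are hypothetical, and the "work" is just upper-casing a string so the example runs without network access. The import fallback lets it run on both Python 2 and Python 3.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

_STOP = object()  # hypothetical sentinel meaning "no more work"


class Worker(threading.Thread):
    def __init__(self, tasks, results):
        threading.Thread.__init__(self)
        self.tasks = tasks
        self.results = results

    def run(self):
        while True:
            item = self.tasks.get()
            try:
                if item is _STOP:
                    break  # explicit exit: leave the loop, thread finishes
                self.results.put(item.upper())  # stand-in for real work
            finally:
                self.tasks.task_done()


def process_all(items, num_workers=3):
    """Process items with non-daemon workers and shut them down cleanly."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [Worker(tasks, results) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for item in items:
        tasks.put(item)
    for _ in workers:
        tasks.put(_STOP)  # one sentinel per worker
    for w in workers:
        w.join()          # returns because every worker sees a sentinel
    return sorted(results.get() for _ in range(len(items)))


if __name__ == "__main__":
    print(process_all(["baidu", "sina"]))
```

Because every worker leaves its loop on its own, the threads can stay non-daemon and the program still exits normally.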
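You may be wondering how setDaemon(False) ends up being equivalent to joining the thread. Bear with me and I'll walk through it.
To solve this problem, the threading module introduces the _MainThread object: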
# Special thread class to represent the main thread
# This is garbage collected through an exit handler

class _MainThread(Thread):

    def __init__(self):
        Thread.__init__(self, name="MainThread")
        self._Thread__started.set()
        self._set_ident()
        with _active_limbo_lock:
            _active[_get_ident()] = self

    def _set_daemon(self):
        return False

    def _exitfunc(self):
        self._Thread__stop()
        t = _pickSomeNonDaemonThread()
        if t:
            if __debug__:
                self._note("%s: waiting for other threads", self)
        while t:
            t.join()
            t = _pickSomeNonDaemonThread()
        if __debug__:
            self._note("%s: exiting", self)
        self._Thread__delete()

def _pickSomeNonDaemonThread():
    for t in enumerate():
        if not t.daemon and t.is_alive():
            return t
    return None

# Create the main thread object,
# and make it available for the interpreter
# (Py_Main) as threading._shutdown.

_shutdown = _MainThread()._exitfunc
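_MainThread itself doesn't do much. Its only contribution is that an instance is created when the threading module is imported, and its _exitfunc is assigned to _shutdown. _exitfunc repeatedly picks a thread that is non-daemon and still alive and calls its join method, until no such thread is left. So it is _MainThread quietly doing the work behind the scenes. The remaining question is: who calls _shutdown?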
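The pick-and-join loop can be tried outside the interpreter's shutdown path. The sketch below is a simplified stand-in, not the threading module's code: the function names are hypothetical, and instead of scanning every thread with enumerate() it takes an explicit list of candidate threads, so it never tries to join the calling thread.

```python
import threading
import time


def pick_some_non_daemon_thread(candidates):
    """Return one thread that is non-daemon and still alive, or None."""
    for t in candidates:
        if not t.daemon and t.is_alive():
            return t
    return None


def wait_for_non_daemon_threads(candidates):
    """Join non-daemon threads one at a time; return their names in order."""
    joined = []
    t = pick_some_non_daemon_thread(candidates)
    while t:
        t.join()
        joined.append(t.name)
        t = pick_some_non_daemon_thread(candidates)
    return joined


if __name__ == "__main__":
    stop = threading.Event()
    # A daemon thread that would block until the event is set.
    background = threading.Thread(target=stop.wait, name="background")
    background.daemon = True
    # A non-daemon thread that finishes on its own after a short sleep.
    worker = threading.Thread(target=time.sleep, args=(0.2,), name="worker")
    for t in (background, worker):
        t.start()
    # Only the non-daemon thread is joined; the daemon thread is ignored.
    print(wait_for_non_daemon_threads([background, worker]))
    stop.set()
```

The daemon thread is still running when the function returns, which mirrors why the daemon workers in the original example never hold up interpreter exit.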
The interpreter must call it before it destroys the runtime, so open pythonrun.c and you will find the following function:
/* Wait until threading._shutdown completes, provided
   the threading module was imported in the first place.
   The shutdown routine will wait until all non-daemon
   "threading" threads have completed. */
static void
wait_for_thread_shutdown(void)
{
#ifdef WITH_THREAD
    PyObject *result;
    PyThreadState *tstate = PyThreadState_GET();
    PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
                                                  "threading");
    if (threading == NULL) {
        /* threading not imported */
        PyErr_Clear();
        return;
    }
    result = PyObject_CallMethod(threading, "_shutdown", "");
    if (result == NULL)
        PyErr_WriteUnraisable(threading);
    else
        Py_DECREF(result);
    Py_DECREF(threading);
#endif
}
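So this is the one pulling the strings. I learned something new: even C code sometimes needs to call into Python code. There is no way around it, since the threading module is written in pure Python!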