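Scenario
When a program hits a long-running operation (such as waiting for user input or connecting to a remote server), a common approach is to hand the work to a child process. In Perl, this is done with fork().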
## fork a child process
die "fork failed: $!" unless defined( my $child_pid = fork() ); # fork reports errors in $!, not $@
if ($child_pid) { # If I have a child PID, then I must be the parent
    push @children_pids, $child_pid;
    print "children's PIDs: @children_pids\n";
} else { # I am the child
    my $wait_time = int(rand(10));
    sleep $wait_time;
    my $localtime = localtime;
    print "Child: Some child exited at $localtime\n";
    exit 0; # Exit the child
}
So how should child processes be reclaimed? If the parent never collects a child that has exited, that child lingers as a zombie process.
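The zombie state can be observed from the parent itself. The following is a minimal sketch, assuming Linux behavior where sending signal 0 to a zombie still succeeds because the PID remains in the process table; other platforms may report this differently. The variable names are illustrative only.

```perl
use strict;
use warnings;

defined(my $pid = fork()) or die "fork failed: $!";
exit 0 if $pid == 0;             # child exits immediately

sleep 1;                         # parent has not waited yet, so the child is a zombie

# Signal 0 delivers nothing; kill only checks whether the PID can be signalled.
my $before = kill(0, $pid);      # assumption: true on Linux, the zombie still holds its PID

my $reaped = waitpid($pid, 0);   # reap the child and release its process-table entry
my $after  = kill(0, $pid);      # normally false now, because the PID is gone

print "before reaping: $before, after reaping: $after\n";
```

Running `ps` during the one-second sleep should also list the child in a defunct state.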
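Reaping methods
There are two approaches, listed below, and both rely on waitpid(). wait() also works but is not recommended.
- Blocking wait
- Non-blocking wait
A quick note on waitpid(), which the code below uses: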
waitpid PID,FLAGS
PID: -1 (stands for any child process) | pid ( > 0 )
FLAGS: 0 (for a blocking wait) | WNOHANG (non-blocking; returns 0 if children are still running, -1 if there are no children)
e.g., waitpid($pid, 0) ## blocking wait
waitpid(-1, WNOHANG) ## non-blocking wait
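The return values described above can be checked with a single child. This is a minimal sketch, not part of the original script; the exit code 3 and the variable names are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";   # exports WNOHANG

defined(my $pid = fork()) or die "fork failed: $!";
if ($pid == 0) {
    # Child: pretend to work briefly, then exit with status 3.
    sleep 1;
    exit 3;
}

# Non-blocking: the child is still sleeping, so this normally returns 0.
my $early = waitpid($pid, WNOHANG);

# Blocking: returns the child's PID once it has exited; $? then holds its status.
my $reaped    = waitpid($pid, 0);
my $exit_code = $? >> 8;

# No children are left, so waitpid reports -1.
my $none = waitpid(-1, WNOHANG);

print "early=$early reaped_child=", ($reaped == $pid ? "yes" : "no"),
      " exit_code=$exit_code none=$none\n";
```

The exit code sits in the high byte of `$?`, which is why it is shifted right by 8 bits.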
Blocking wait
See the code below:
my @children_pids;
print "Parent: my pid $$\n";
for my $count (1..10){
    die "fork failed: $!" unless defined( my $child_pid = fork() ); # fork reports errors in $!, not $@
    if ($child_pid) { # If I have a child PID, then I must be the parent
        push @children_pids, $child_pid;
        print "children's PIDs: @children_pids\n";
    } else { # I am the child
        my $wait_time = int(rand(10));
        sleep $wait_time;
        my $localtime = localtime;
        print "Child: Some child exited at $localtime\n";
        exit 0; # Exit the child
    }
}
foreach my $child (@children_pids) {
    print "Parent: Waiting on $child\n";
    waitpid($child, 0); ## will not go to next step unless $child reaped
    my $localtime = localtime;
    print "Parent: Child $child was reaped - $localtime.\n";
}
Output:
[root@VTB93-PC1 ~]# perl /tmp/test_fork.pl
Parent: my pid 15117
children's PIDs: 15118
children's PIDs: 15118 15119
children's PIDs: 15118 15119 15120
children's PIDs: 15118 15119 15120 15121
children's PIDs: 15118 15119 15120 15121 15122
children's PIDs: 15118 15119 15120 15121 15122 15123
children's PIDs: 15118 15119 15120 15121 15122 15123 15124
children's PIDs: 15118 15119 15120 15121 15122 15123 15124 15125
children's PIDs: 15118 15119 15120 15121 15122 15123 15124 15125 15126
children's PIDs: 15118 15119 15120 15121 15122 15123 15124 15125 15126 15127
Parent: Waiting on 15118
Child: Some child exited at Thu Apr 21 13:28:48 2016
Parent: Child 15118 was reaped - Thu Apr 21 13:28:48 2016.
Parent: Waiting on 15119
Child: Some child exited at Thu Apr 21 13:28:48 2016
Child: Some child exited at Thu Apr 21 13:28:48 2016
Child: Some child exited at Thu Apr 21 13:28:50 2016
Child: Some child exited at Thu Apr 21 13:28:51 2016
Parent: Child 15119 was reaped - Thu Apr 21 13:28:51 2016.
Parent: Waiting on 15120
Child: Some child exited at Thu Apr 21 13:28:51 2016
Child: Some child exited at Thu Apr 21 13:28:52 2016
Child: Some child exited at Thu Apr 21 13:28:53 2016
Child: Some child exited at Thu Apr 21 13:28:54 2016
Parent: Child 15120 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15121
Parent: Child 15121 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15122
Parent: Child 15122 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15123
Child: Some child exited at Thu Apr 21 13:28:54 2016
Parent: Child 15123 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15124
Parent: Child 15124 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15125
Parent: Child 15125 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15126
Parent: Child 15126 was reaped - Thu Apr 21 13:28:54 2016.
Parent: Waiting on 15127
Parent: Child 15127 was reaped - Thu Apr 21 13:28:54 2016.
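The pattern is clear: children are reaped in the order they were forked. In practice, though, child run times vary, and a process forked later may well finish earlier. That motivates the second reaping mechanism, which uses $SIG{CHLD}.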
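The cost of waiting in fork order can be made visible by timing each blocking call. The sketch below assumes two children with hypothetical delays of 2 and 0 seconds; it uses the core Time::HiRes module for sub-second timestamps.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # floating-point replacement for time()

my @delays = (2, 0);        # first child is slow, second finishes at once
my @pids;
for my $delay (@delays) {
    defined(my $pid = fork()) or die "fork failed: $!";
    if ($pid == 0) { sleep $delay; exit 0; }   # child
    push @pids, $pid;
}

my @waited;                 # seconds the parent spent blocked on each child
for my $pid (@pids) {
    my $start = time;
    waitpid($pid, 0);       # blocks until this particular child has exited
    push @waited, time - $start;
}
printf "blocked for %.1f and %.1f seconds\n", @waited;
```

The second child finishes almost immediately, yet it stays unreaped for roughly two seconds because the parent is still blocked on the first one.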
Non-blocking wait
First, a note on $SIG{CHLD}. It accepts the following values:
$SIG{CHLD} = 'IGNORE'; ## Children reaped by system
$SIG{CHLD} = 'DEFAULT'; ## System defined
$SIG{CHLD} = \&REAPER; ## call REAPER when SIGCHLD is caught
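The effect of 'IGNORE' can be seen from waitpid's return value. This is a minimal sketch, assuming a platform such as Linux where ignoring SIGCHLD makes the system reap children automatically; on such platforms waitpid usually has nothing left to collect.

```perl
use strict;
use warnings;

$SIG{CHLD} = 'IGNORE';            # ask the OS to reap children automatically

defined(my $pid = fork()) or die "fork failed: $!";
exit 0 if $pid == 0;              # child exits immediately

sleep 1;                          # give the child time to exit
my $result = waitpid($pid, 0);    # usually -1 here: the child was already reaped
print "waitpid returned $result\n";
```

Because the child is discarded by the system, its exit status is not available to the parent in this mode.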
use POSIX ":sys_wait_h"; # provides WNOHANG
$SIG{CHLD}=\&REAPER;
sub REAPER {
    my $child;
    # Several children may exit before the handler runs, so loop until none are left
    while(( $child = waitpid(-1, WNOHANG)) > 0){
        my $localtime = localtime;
        print "Parent: Child $child was reaped - $localtime.\n";
    }
    $SIG{CHLD}=\&REAPER; # reinstall the handler (harmless; only old SysV-style systems need it)
}
my @children_pids;
print "Parent: my pid $$\n";
for my $count (1..10){
    die "fork failed: $!" unless defined( my $child_pid = fork() ); # fork reports errors in $!, not $@
    if ($child_pid) { # If I have a child PID, then I must be the parent
        push @children_pids, $child_pid;
        print "children's PIDs: @children_pids\n";
    } else { # I am the child
        my $wait_time = int(rand(10));
        sleep $wait_time;
        my $localtime = localtime;
        print "Child: Some child exited at $localtime\n";
        exit 0; # Exit the child
    }
}
## Keep parent alive to reap all children
while (1) {
    sleep; # sleeps until a signal arrives
}
Output:
Parent: my pid 15189
children's PIDs: 15194
children's PIDs: 15194 15195
children's PIDs: 15194 15195 15196
children's PIDs: 15194 15195 15196 15197
children's PIDs: 15194 15195 15196 15197 15198
children's PIDs: 15194 15195 15196 15197 15198 15199
children's PIDs: 15194 15195 15196 15197 15198 15199 15200
children's PIDs: 15194 15195 15196 15197 15198 15199 15200 15201
children's PIDs: 15194 15195 15196 15197 15198 15199 15200 15201 15202
children's PIDs: 15194 15195 15196 15197 15198 15199 15200 15201 15202 15203
Child: Some child exited at Thu Apr 21 13:42:25 2016
Parent: Child 15202 was reaped - Thu Apr 21 13:42:25 2016.
Child: Some child exited at Thu Apr 21 13:42:27 2016
Parent: Child 15197 was reaped - Thu Apr 21 13:42:27 2016.
Child: Some child exited at Thu Apr 21 13:42:29 2016
Parent: Child 15201 was reaped - Thu Apr 21 13:42:29 2016.
Child: Some child exited at Thu Apr 21 13:42:30 2016
Parent: Child 15194 was reaped - Thu Apr 21 13:42:30 2016.
Child: Some child exited at Thu Apr 21 13:42:31 2016
Parent: Child 15198 was reaped - Thu Apr 21 13:42:31 2016.
Child: Some child exited at Thu Apr 21 13:42:31 2016
Parent: Child 15200 was reaped - Thu Apr 21 13:42:31 2016.
Child: Some child exited at Thu Apr 21 13:42:31 2016
Parent: Child 15203 was reaped - Thu Apr 21 13:42:31 2016.
Child: Some child exited at Thu Apr 21 13:42:32 2016
Parent: Child 15199 was reaped - Thu Apr 21 13:42:32 2016.
Child: Some child exited at Thu Apr 21 13:42:33 2016
Parent: Child 15195 was reaped - Thu Apr 21 13:42:33 2016.
Child: Some child exited at Thu Apr 21 13:42:34 2016
Parent: Child 15196 was reaped - Thu Apr 21 13:42:34 2016.
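As the output shows, each child's exit sends SIGCHLD to the parent, so children are reaped in the order they exit. This approach suits a parent that tracks its children (for example, reading the exit status from $?, which was 0 for every child above) and acts on the result. If that is not needed, simply set $SIG{CHLD} = 'IGNORE' and leave reaping to the OS.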
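Tracking each child's exit status inside the handler might look like the sketch below. The %exit_code hash, the exit codes 0, 1, and 2, and the loop that stops once every child is reaped are assumptions added for illustration; they are not part of the script above.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %exit_code;   # child PID => exit code, filled in by the handler

sub REAPER {
    local ($!, $?);   # keep the interrupted code's $! and $? intact
    while ((my $child = waitpid(-1, WNOHANG)) > 0) {
        $exit_code{$child} = $? >> 8;
    }
}
$SIG{CHLD} = \&REAPER;

my @children_pids;
for my $code (0, 1, 2) {
    defined(my $pid = fork()) or die "fork failed: $!";
    if ($pid == 0) { sleep 1; exit $code; }   # child exits with its own code
    push @children_pids, $pid;
}

# Sleep until every child has been reaped; a SIGCHLD cuts the sleep short.
sleep 1 while scalar(keys %exit_code) < scalar(@children_pids);

print "Child $_ exited with code $exit_code{$_}\n" for @children_pids;
```

Unlike the endless `while (1)` loop, this parent returns to its own work as soon as the last child has been collected.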
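Summary
- A SIGCHLD REAPER combined with a non-blocking waitpid() fits real-world scenarios better.
- The parent is responsible for its children's life cycle, so that no zombie processes are left behind.
- PID 1 (init) is the ancestor of all processes.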