system_server的一次看门狗超时,从log上进行简单分析。
前面一部分可以看到异常的情况是system_server_watchdog。说明system_server在某个地方卡住了,没有及时喂狗。
Exception Class: SWT
Exception Type: system_server_watchdog
Current Executing Process:
system_server
Trigger time:[2018-03-06 06:20:03.370242]
Backtrace:
Process: system_server
Subject: Blocked in handler on foreground thread (android.fg)
我们直接看到main线程的trace,可以看到线程处于Blocked状态,
waiting to lock <0x0509dfb0> (a com.android.server.am.ActivityManagerService) held by thread 15
它在等一个锁0x0509dfb0,这个锁被thread 15持有。所以接下来我们要看下thread15的trace情况
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 obj=0x76156080 self=0x7c6be9aa00
| sysTid=1285 nice=-2 cgrp=default sched=0/0 handle=0x7c7087ea98
| state=S schedstat=( 150600829064 33759848616 223685 ) utm=12212 stm=2848 core=0 HZ=100
| stack=0x7fcb5a9000-0x7fcb5ab000 stackSize=8MB
| held mutexes=
at com.android.server.am.ActivityManagerService.broadcastIntent(ActivityManagerService.java:22164)
- waiting to lock <0x0509dfb0> (a com.android.server.am.ActivityManagerService) held by thread 15
at android.app.ActivityManagerNative.broadcastStickyIntent(ActivityManagerNative.java:114)
at android.app.ActivityManagerNative.broadcastStickyIntent(ActivityManagerNative.java:104)
at com.android.server.BatteryService$14.run(BatteryService.java:2417)
at android.os.Handler.handleCallback(Handler.java:836)
at android.os.Handler.dispatchMessage(Handler.java:103)
at android.os.Looper.loop(Looper.java:232)
at com.android.server.SystemServer.run(SystemServer.java:542)
at com.android.server.SystemServer.main(SystemServer.java:392)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1103)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:964)
可以看到 tid=15的线程在等另一个锁0x002546dc,他被thread 11持有。所以接下来看一下thread 11在做什么
"BroadcastQueue" prio=5 tid=15 Blocked
| group="main" sCount=1 dsCount=0 obj=0x12c4aa60 self=0x7c5ecfa800
| sysTid=1304 nice=-2 cgrp=default sched=0/0 handle=0x7c51c74450
| state=S schedstat=( 112819444333 21808133623 77767 ) utm=9495 stm=1786 core=2 HZ=100
| stack=0x7c51b72000-0x7c51b74000 stackSize=1037KB
| held mutexes=
at com.android.server.am.BroadcastQueue.broadcastTimeoutLocked(BroadcastQueue.java:2181)
- waiting to lock <0x002546dc> (a android.util.SparseArray) held by thread 11
at com.android.server.am.BroadcastQueue$BroadcastHandler.handleMessage(BroadcastQueue.java:248)
- locked <0x0509dfb0> (a com.android.server.am.OppoActivityManagerService)
at android.os.Handler.dispatchMessage(Handler.java:110)
at android.os.Looper.loop(Looper.java:232)
at android.os.HandlerThread.run(HandlerThread.java:61)
at com.android.server.ServiceThread.run(ServiceThread.java:46)
可以看到tid=11卡在了一个Native的方法getPss。所以问题就处在了这个方法里面。
"android.bg" prio=5 tid=11 Native
| group="main" sCount=1 dsCount=0 obj=0x12c5bb80 self=0x7c5ecf8000
| sysTid=1300 nice=10 cgrp=bg_non_interactive sched=0/0 handle=0x7c52088450
| state=R schedstat=( 640275049135 220669513539 194368 ) utm=20012 stm=44015 core=2 HZ=100
| stack=0x7c51f86000-0x7c51f88000 stackSize=1037KB
| held mutexes=
kernel: (couldn't read /proc/self/task/1300/stack)
native: #00 pc 000000000006d558 /system/lib64/libc.so (read+8)
native: #01 pc 00000000000750bc /system/lib64/libc.so (__sread+48)
native: #02 pc 0000000000078f74 /system/lib64/libc.so (__srefill+272)
native: #03 pc 0000000000059398 /system/lib64/libc.so (fgets+128)
native: #04 pc 00000000000f7a24 /system/lib64/libandroid_runtime.so (???)
native: #05 pc 00000000008ece3c /system/framework/arm64/boot-framework.oat (Java_android_os_Debug_getPss__I_3J_3J+184)
at android.os.Debug.getPss(Native method)
at com.android.server.am.ActivityManagerService$3.handleMessage(ActivityManagerService.java:2889)
- locked <0x002546dc> (a android.util.SparseArray)
at android.os.Handler.dispatchMessage(Handler.java:110)
at android.os.Looper.loop(Looper.java:232)
at android.os.HandlerThread.run(HandlerThread.java:61)