SystemServer的理解

SystemServer创建的可以分成两部分,一部分是在Zygote进程中fork并初始化SystemServer进程,另一部分是执行SystemServer类的mian来启动系统的服务。

1、SystemServer的创建过程

1.1、创建SystemServer创建
init.rc文件根据语句import /init.${ro.zygote}.rc属性来判断引入不同的文件,从而导入不同版本的Zygote进程,属性ro.zygote可能取值有zygote32,zyoget32_64,zygote64,zygote64_32,因此init.rc同级目录下一共有4个与zygote相关的文件:init.zygote32.rc,init.zygote64.rc,init.zygote32_64.rc,init.zygote64_32.rc,例如:
init.zygote64.rc

service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server
    class main
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart media
    onrestart restart netd

从init.zygote64.rc文件定义中Zygote进程的启动参数包括了“–start-system-server”,因此在ZygoteInit类中的main方法会调用startSystemServer方法来启SystemServer,startSystemServer方法内容:
ZygoteInit.java

 private static boolean startSystemServer(String abiList, String socketName)
            throws MethodAndArgsCaller, RuntimeException {
       ......
        /* Hardcoded command line to start the system server */
        String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1032,3001,3002,3003,3006,3007",
            "--capabilities=" + capabilities + "," + capabilities,
            "--runtime-init",
            "--nice-name=system_server",
            "com.android.server.SystemServer",
        };
        ZygoteConnection.Arguments parsedArgs = null;

        int pid;

        try {
             ......
            /* Request to fork the system server process */
            pid = Zygote.forkSystemServer(
                    parsedArgs.uid, parsedArgs.gid,
                    parsedArgs.gids,
                    parsedArgs.debugFlags,
                    null,
                    parsedArgs.permittedCapabilities,
                    parsedArgs.effectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }

        /* For child process */
        if (pid == 0) {
            if (hasSecondZygote(abiList)) {
                waitForSecondaryZygote(socketName);
            }

            handleSystemServerProcess(parsedArgs);
        }

        return true;
    }

从面看出,上面主要做了三件事,

  • 为SystemServer准备启动参数,其中SystemServer的进程ID和组ID都是1000,执行类是com.android.server.SystemServer
  • 调用Zygote的forkSystemServer()来fork出SystemServer子进程,接着调用native层的函数来完成实际的操作,forkSystemServer->nativeForkSystemServer
    Zygote.java
public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
        VM_HOOKS.preFork();
        int pid = nativeForkSystemServer(
                uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
        VM_HOOKS.postForkCommon();
        return pid;
    }

    native private static int nativeForkSystemServer(int uid, int gid, int[] gids, int debugFlags,
            int[][] rlimits, long permittedCapabilities, long effectiveCapabilities);

framework/base/core/jni/com_android_internal_os_Zygote.cpp

static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
        jlong effectiveCapabilities) {
  pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                      debug_flags, rlimits,
                                      permittedCapabilities, effectiveCapabilities,
                                      MOUNT_EXTERNAL_NONE, NULL, NULL, true, NULL,
                                      NULL, NULL);
  if (pid > 0) {
      // The zygote process checks whether the child process has died or not.
      ALOGI("System server process %d has been created", pid);
      gSystemServerPid = pid;
      // There is a slight window that the system server process has crashed
      // but it went unnoticed because we haven't published its pid yet. So
      // we recheck here just to make sure that all is well.
      int status;
      if (waitpid(pid, &status, WNOHANG) == pid) {
          ALOGE("System server process %d has died. Restarting Zygote!", pid);
          RuntimeAbort(env);//启动SystemServer失败,重启系统
      }
  }
  return pid;
}

native层调用ForkAndSpecializeCommon函数后,如果启动的SystemServer,会去判断是否成功启动,如果不成功,会让让Zygote自己退出重启,在ForkAndSpecializeCommon方法里,会通SetSigChldHandler()方法去设置处理SIGCHID信号的函数SigChidHandler():

static void SigChldHandler(int /*signal_number*/) {
  pid_t pid;
  int status;

  while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
     // Log process-death status that we care about.  In general it is
     // not safe to call LOG(...) from a signal handler because of
     // possible reentrancy.  However, we know a priori that the
     // current implementation of LOG() is safe to call from a SIGCHLD
     // handler in the zygote process.  If the LOG() implementation
     // changes its locking strategy or its use of syscalls within the
     // lazy-init critical section, its use here may become unsafe.
    if (WIFEXITED(status)) {
      if (WEXITSTATUS(status)) {
        ALOGI("Process %d exited cleanly (%d)", pid, WEXITSTATUS(status));
      }
    } else if (WIFSIGNALED(status)) {
      if (WTERMSIG(status) != SIGKILL) {
        ALOGI("Process %d exited due to signal (%d)", pid, WTERMSIG(status));
      }
      if (WCOREDUMP(status)) {
        ALOGI("Process %d dumped core.", pid);
      }
    }

    // If the just-crashed process is the system_server, bring down zygote
    // so that it is restarted by init and system server will be restarted
    // from there.
    if (pid == gSystemServerPid) {
//如果死亡的是SystemServer进程,zygote将退出
      ALOGE("Exit zygote because system server (%d) has terminated");
      kill(getpid(), SIGKILL);
    }
  }

  // Note that we shouldn't consider ECHILD an error because
  // the secondary zygote might have no children left to wait for.
  if (pid < 0 && errno != ECHILD) {
    ALOGW("Zygote SIGCHLD error in waitpid: %s", strerror(errno));
  }
}

waitpid()来防止子进程变”僵尸”,来判断是否是SystemServer进程,如果是,会令Zygote进程会自杀,将导致init进程杀死自己所有服务并重启Zygote

  • fork出SystemServer后,在fork出来的进程调用 handleSystemServerProcess(parsedArgs):
private static void handleSystemServerProcess(
            ZygoteConnection.Arguments parsedArgs)
            throws ZygoteInit.MethodAndArgsCaller {

        closeServerSocket();

        // set umask to 0077 so new files and directories will default to owner-only permissions.
        Os.umask(S_IRWXG | S_IRWXO);

        if (parsedArgs.niceName != null) {
            Process.setArgV0(parsedArgs.niceName);
        }

        final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
        if (systemServerClasspath != null) {
            performSystemServerDexOpt(systemServerClasspath);
        }

        if (parsedArgs.invokeWith != null) {
            String[] args = parsedArgs.remainingArgs;
            // If we have a non-null system server class path, we'll have to duplicate the
            // existing arguments and append the classpath to it. ART will handle the classpath
            // correctly when we exec a new process.
            if (systemServerClasspath != null) {
                String[] amendedArgs = new String[args.length + 2];
                amendedArgs[0] = "-cp";
                amendedArgs[1] = systemServerClasspath;
                System.arraycopy(parsedArgs.remainingArgs, 0, amendedArgs, 2, parsedArgs.remainingArgs.length);
            }

            WrapperInit.execApplication(parsedArgs.invokeWith,
                    parsedArgs.niceName, parsedArgs.targetSdkVersion,
                    null, args);
        } else {
            ClassLoader cl = null;
            if (systemServerClasspath != null) {
                cl = new PathClassLoader(systemServerClasspath, ClassLoader.getSystemClassLoader());
                Thread.currentThread().setContextClassLoader(cl);
            }

            /*
             * Pass the remaining arguments to SystemServer.
             */
            RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
        }

        /* should never reach here */
    }

这里先关闭从zygote继承来的socket,接着将SystemServer进程的umask设为0077(Os.umask(S_IRWXG | S_IRWXO):R读,W写,X执行,O其他用户,G组用户),所以System创建的文件属性就是0700,(umask设置 的权限就是chmod设置权限的补码),只有文件创建者SystemServer进程可以访问。接着 Process.setArgV0(parsedArgs.niceName),修改进程名。
参数invokeWith通常为null,走else分支,接着用ClassLoader加载SystemServer类,从而调用 RuntimeInit.zygoteInit(),最后抛出来异常来(清除调用帧,防止ActivityThread.main方法返回),才真正调用SystemServer.main

2、SystemServer的初始化

public static void main(String[] args) {
        new SystemServer().run();
    }
private void run() {
        // If a device's clock is before 1970 (before 0), a lot of
        // APIs crash dealing with negative numbers, notably
        // java.io.File#setLastModified, so instead we fake it and
        // hope that time from cell towers or NTP fixes it shortly.
        if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
            Slog.w(TAG, "System clock is before 1970; setting to 1970.");
            SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
        }

        // Here we go!
        Slog.i(TAG, "Entered the Android system server!");
        EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, SystemClock.uptimeMillis());

        // In case the runtime switched since last boot (such as when
        // the old runtime was removed in an OTA), set the system
        // property so that it is in sync. We can't do this in
        // libnativehelper's JniInvocation::Init code where we already
        // had to fallback to a different runtime because it is
        // running as root and we need to be the system user to set
        // the property. http://b/11463182
        SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());

        // Enable the sampling profiler.
        if (SamplingProfilerIntegration.isEnabled()) {
            SamplingProfilerIntegration.start();
            mProfilerSnapshotTimer = new Timer();
            mProfilerSnapshotTimer.schedule(new TimerTask() {
                @Override
                public void run() {
                    SamplingProfilerIntegration.writeSnapshot("system_server", null);
                }
            }, SNAPSHOT_INTERVAL, SNAPSHOT_INTERVAL);
        }

        // Mmmmmm... more memory!
        VMRuntime.getRuntime().clearGrowthLimit();

        // The system server has to run all of the time, so it needs to be
        // as efficient as possible with its memory usage.
        VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);

        // Some devices rely on runtime fingerprint generation, so make sure
        // we've defined it before booting further.
        Build.ensureFingerprintProperty();

        // Within the system server, it is an error to access Environment paths without
        // explicitly specifying a user.
        Environment.setUserRequired(true);

        // Ensure binder calls into the system always run at foreground priority.
        BinderInternal.disableBackgroundScheduling(true);

        // Prepare the main looper thread (this thread).
        android.os.Process.setThreadPriority(
                android.os.Process.THREAD_PRIORITY_FOREGROUND);
        android.os.Process.setCanSelfBackground(false);
        Looper.prepareMainLooper();

        // Initialize native services.
        System.loadLibrary("android_servers");
        nativeInit();

        // Check whether we failed to shut down last time we tried.
        // This call may not return.
        performPendingShutdown();

        // Initialize the system context.
        createSystemContext();

        // Create the system service manager.
        mSystemServiceManager = new SystemServiceManager(mSystemContext);
        LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);

        // Start services.
        try {
            startBootstrapServices();
            startCoreServices();
            startOtherServices();
        } catch (Throwable ex) {
            Slog.e("System", "******************************************");
            Slog.e("System", "************ Failure starting system services", ex);
            throw ex;
        }

        // For debug builds, log event loop stalls to dropbox for analysis.
        if (StrictMode.conditionallyEnableDebugLogging()) {
            Slog.i(TAG, "Enabled StrictMode for system server main thread.");
        }

        // Loop forever.
        Looper.loop();
        throw new RuntimeException("Main thread loop unexpectedly exited");
    }

总结上面做的事情 :

  • 1、调整时间,如果系统时间比1970还早,就调到到1970.
  • 2、SystemProperties.set(“persist.sys.dalvik.vm.lib.2”, VMRuntime.getRuntime().vmLibrary()) 设置该属性的值为虚拟机制的运行库路径
  • 3、 VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);调整虚拟机堆的内存,设定虚拟机堆利用率为0.8,当实际的使用率偏离设定的比率时,虚拟机在垃圾回收的时候将调整堆的大小 ,使实际胳使用率接近设定的百分比
  • 4、System.loadLibrary(“android_servers”) 装载库libandroid_servers.so
  • 5、nativeInit()启用native层的SensorService(提供各种传感器的服务)
  • 6、调用createSystemContext来获取context
  • 7、创建SystemServiceManager的对象,负责系统的Service启动
  • 8、startBootstrapServices(); startCoreServices(); startOtherServices();创建并运行所有java服务
  • 9、调用Looper.looper()进入消息的循环
    分析下,createSystemContext():
 private void createSystemContext() {
        ActivityThread activityThread = ActivityThread.systemMain();
        mSystemContext = activityThread.getSystemContext();
        mSystemContext.setTheme(android.R.style.Theme_DeviceDefault_Light_DarkActionBar);
    }
public static ActivityThread systemMain() {
        // The system process on low-memory devices do not get to use hardware
        // accelerated drawing, since this can add too much overhead to the
        // process.
        if (!ActivityManager.isHighEndGfx()) {
            HardwareRenderer.disable(true);
        } else {
            HardwareRenderer.enableForegroundTrimming();
        }
        ActivityThread thread = new ActivityThread();
        thread.attach(true);
        return thread;
    }

SystemServer本身也需要一个和APK应用类似的上下文环境,所以会创建一个ActivityThread,再调用attch方法,true就是代表系统级应用,

 private void attach(boolean system) {
        sCurrentActivityThread = this;
        mSystemThread = system;
        if (!system) {
            ......
        } else {
            // Don't set application object here -- if the system crashes,
            // we can't display an alert, we just want to die die die.
            android.ddm.DdmHandleAppName.setAppName("system_process",
                    UserHandle.myUserId());
            try {
                mInstrumentation = new Instrumentation();
              //创建context对象
                ContextImpl context = ContextImpl.createAppContext(
                        this, getSystemContext().mPackageInfo);
                mInitialApplication = context.mPackageInfo.makeApplication(true, null);
                mInitialApplication.onCreate();
            } catch (Exception e) {
                throw new RuntimeException(
                        "Unable to instantiate Application():" + e.toString(), e);
            }
         .....
        }
    }

attach方法为true时会创建一个类似应用的环境,这里创建ContextImp和Application,最后还调用了appliction的oncreate方法,完全模拟创建一个应用。但是,每个上下文环境都需要对应一个apk文件,看方法getSystemContext:

 public ContextImpl getSystemContext() {
        synchronized (this) {
            if (mSystemContext == null) {
                mSystemContext = ContextImpl.createSystemContext(this);
            }
            return mSystemContext;
        }
    }

->createSystemContext

static ContextImpl createSystemContext(ActivityThread mainThread) {
        LoadedApk packageInfo = new LoadedApk(mainThread);
        ContextImpl context = new ContextImpl(null, mainThread,
                packageInfo, null, null, false, null, null);
        context.mResources.updateConfiguration(context.mResourcesManager.getConfiguration(),
                context.mResourcesManager.getDisplayMetricsLocked(Display.DEFAULT_DISPLAY));
        return context;
    }

->LoadedApk()

LoadedApk(ActivityThread activityThread) {
        mActivityThread = activityThread;
        mApplicationInfo = new ApplicationInfo();
        mApplicationInfo.packageName = "android";
        mPackageName = "android";
        ......
    }

LoadedApkc对象保存一个apk文件的信息,这个构造方法中会使用的包名指定为”android“,而framework-res.apk的包名就是android。getSystemContext就是mSystemContext对象所对应的apk文件其实就是framework-res.apk。
那么,attach就好理解了,attach返回新创建的ContextImpl对象实际上是在复制mSystemContext对象,接下来创建的application对象实际代表framework-res.apk。
因此ActivityThread.systemMain实际上是相当于创建了framework-res.apk的上下文环境。

3、SystemServer的WatchDog

在SystemServer中的startOtherServices()方法中,

private void startOtherServices() {
...
            Slog.i(TAG, "Init Watchdog");
            final Watchdog watchdog = Watchdog.getInstance();
            watchdog.init(context, mActivityManagerService);
...
}

watchdog单例模式的获取对象形式,

private Watchdog() {
        super("watchdog");
        // Initialize handler checkers for each common thread we want to check.  Note
        // that we are not currently checking the background thread, since it can
        // potentially hold longer running operations with no guarantees about the timeliness
        // of operations there.

        // The shared foreground thread is the main checker.  It is where we
        // will also dispatch monitor checks and do other work.
        mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
                "foreground thread", DEFAULT_TIMEOUT);
        mHandlerCheckers.add(mMonitorChecker);
        // Add checker for main thread.  We only do a quick check since there
        // can be UI running on the thread.
        mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
                "main thread", DEFAULT_TIMEOUT));
        // Add checker for shared UI thread.
        mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
                "ui thread", DEFAULT_TIMEOUT));
        // And also check IO thread.
        mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
                "i/o thread", DEFAULT_TIMEOUT));
        // And the display thread.
        mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
                "display thread", DEFAULT_TIMEOUT));
    }

它的构造方法,主要是创建几个HandlerChecker对象:

  • 主线程
  • FgThread
  • uiThread
  • IoThread
  • DisplayThread

并把它们保存到数组列表mHandlerCheckers,每个HandlerChecker对象对应一个被监控的线程,
->watcjdpg.init();

public void init(Context context, ActivityManagerService activity) {
        mResolver = context.getContentResolver();
        mActivity = activity;

        context.registerReceiver(new RebootRequestReceiver(),
                new IntentFilter(Intent.ACTION_REBOOT),
                android.Manifest.permission.REBOOT, null);
    }

注册了ReboorRequestReceive,监听重启的intent:ACTION_REBOOT
3.1、
watchdog主要监控的是进程。
Watchdog提供两个方法addThread与addMonitor分别明智来增加需要监控的线程与服务,addThread实际创建一个和受监控对象关联的HandlerChecker对象

public void addThread(Handler thread) {
        addThread(thread, DEFAULT_TIMEOUT);
    }

    public void addThread(Handler thread, long timeoutMillis) {
        synchronized (this) {
            if (isAlive()) {
                throw new RuntimeException("Threads can't be added once the Watchdog is running");
            }
            final String name = thread.getLooper().getThread().getName();
            mHandlerCheckers.add(new HandlerChecker(thread, name, timeoutMillis));
        }
    }

对服务的监控也是通过 HandlerChecker 对象来完成的,一个HandlerChecker 对象可以检查所有服务,mHandlerCheckers会完成这个任务

public void addMonitor(Monitor monitor) {
        synchronized (this) {
            if (isAlive()) {
                throw new RuntimeException("Monitors can't be added once the Watchdog is running");
            }
            mMonitorChecker.addMonitor(monitor);
        }
    }

这个mMonitorChecker就是watchdog构造方法的mMonitorChecker,它的类里面有一个成员变量private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>(),通过jaddMonitor方法把服务add进这个mMonitors 列表里:

public final class HandlerChecker implements Runnable {
        private final Handler mHandler;
        private final String mName;
        private final long mWaitMax;
        private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>();
        private boolean mCompleted;
        private Monitor mCurrentMonitor;
        private long mStartTime;

        HandlerChecker(Handler handler, String name, long waitMaxMillis) {
            mHandler = handler;
            mName = name;
            mWaitMax = waitMaxMillis;
            mCompleted = true;
        }

        public void addMonitor(Monitor monitor) {
            mMonitors.add(monitor);
        }

例如,ActivityManagerService,PackageManagerService,PowerManagerService等服务和里面的线程加入到监控中。一个服务想通过WatchDog来监控,必须实现watchdog的接口Moniter:

public interface Monitor {
        void monitor();
    }

并且调用addMonitor方法把自己加入wathcdog的服务监控列表中。
3.2、监控原理
通过锁的mLock持有的时间是否颠簸来判断服务是否正常
watchdog.run方法

 public void run() {
        boolean waitedHalf = false;
        while (true) {
            final ArrayList<HandlerChecker> blockedCheckers;
            final String subject;
            final boolean allowRestart;
            int debuggerWasConnected = 0;
            synchronized (this) {
                long timeout = CHECK_INTERVAL;
                // Make sure we (re)spin the checkers that have become idle within
                // this wait-and-check interval
                for (int i=0; i<mHandlerCheckers.size(); i++) {
                    HandlerChecker hc = mHandlerCheckers.get(i);
                    hc.scheduleCheckLocked();
                }

                if (debuggerWasConnected > 0) {
                    debuggerWasConnected--;
                }

                // NOTE: We use uptimeMillis() here because we do not want to increment the time we
                // wait while asleep. If the device is asleep then the thing that we are waiting
                // to timeout on is asleep as well and won't have a chance to run, causing a false
                // positive on when to kill things.
                long start = SystemClock.uptimeMillis();
                while (timeout > 0) {
                    if (Debug.isDebuggerConnected()) {
                        debuggerWasConnected = 2;
                    }
                    try {
                        wait(timeout);
                    } catch (InterruptedException e) {
                        Log.wtf(TAG, e);
                    }
                    if (Debug.isDebuggerConnected()) {
                        debuggerWasConnected = 2;
                    }
                    timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
                }

                final int waitState = evaluateCheckerCompletionLocked();
                if (waitState == COMPLETED) {
                    // The monitors have returned; reset
                    waitedHalf = false;
                    continue;
                } else if (waitState == WAITING) {
                    // still waiting but within their configured intervals; back off and recheck
                    continue;
                } else if (waitState == WAITED_HALF) {
                    if (!waitedHalf) {
                        // We've waited half the deadlock-detection interval.  Pull a stack
                        // trace and wait another half.
                        ArrayList<Integer> pids = new ArrayList<Integer>();
                        pids.add(Process.myPid());
                        ActivityManagerService.dumpStackTraces(true, pids, null, null,
                                NATIVE_STACKS_OF_INTEREST);
                        waitedHalf = true;
                    }
                    continue;
                }
                 ......
                //杀死SystemServer
                Slog.w(TAG, "*** GOODBYE!");
                Process.killProcess(Process.myPid());
                System.exit(10);
            }

            waitedHalf = false;
        }
    }

做三件事:
1、

 public void scheduleCheckLocked() {
            if (mMonitors.size() == 0 && mHandler.getLooper().isIdling()) {
                // If the target looper is or just recently was idling, then
                // there is no reason to enqueue our checker on it since that
                // is as good as it not being deadlocked.  This avoid having
                // to do a context switch to check the thread.  Note that we
                // only do this if mCheckReboot is false and we have no
                // monitors, since those would need to be executed at this point.
                mCompleted = true;
                return;
            }

            if (!mCompleted) {
                // we already have a check in flight, so no need
                return;
            }

            mCompleted = false;
            mCurrentMonitor = null;
            mStartTime = SystemClock.uptimeMillis();
            mHandler.postAtFrontOfQueue(this);
        }

HandlerChecker对象又要监控服务,又要监控某个线程:
mMonitors.size() ==0,说明HandlerChecker没有监控的服务
mHandler.getLooper().isIdling()==true,说明被监控的线程空闲状态,线程良好,mCompleted设为true。否则把mCompleted设为false,然后记录消息开始发送的时候到mStartTime,再给被监控的线程发送一个信息(注意HandlerChecker是实现了Runnable接口)。再它的run方法:

public void run() {
            final int size = mMonitors.size();
            for (int i = 0 ; i < size ; i++) {
                synchronized (Watchdog.this) {
                    mCurrentMonitor = mMonitors.get(i);
                }
                mCurrentMonitor.monitor();
            }

            synchronized (Watchdog.this) {
                mCompleted = true;
                mCurrentMonitor = null;
            }
        }

如果mCurrentMonitor.monitor(),如果得不到服务中的锁的话,线程就挂起,下面的代码就执行不了,mCompleted 就不能设置为true了,如果线程没有挂起,mCompleted =true,就说明线程或者服务正常,否则有问题,就要通过等待的时间是否超过规定的时间来判断。
通常被监控的服务实现Monitor接口实现的方法

 public void monitor() {
        synchronized (this) { }
    }

2、 给受控制的纯种发送信息后,调用wait方法让WatchDog线程睡眠一段时间(30秒)

static final boolean DB = false;

// Set this to true to have the watchdog record kernel thread stacks when it fires
static final boolean RECORD_KERNEL_THREADS = true;

static final long DEFAULT_TIMEOUT = DB ? 10*1000 : 60*1000;
static final long CHECK_INTERVAL = DEFAULT_TIMEOUT / 2;

3、调用evaluateCheckerCompletionLocked()获取等待时间来检查线程或者服务是否有问题

// These are temporally ordered: larger values as lateness increases
static final int COMPLETED = 0;
static final int WAITING = 1;
static final int WAITED_HALF = 2;
static final int OVERDUE = 3;
private int evaluateCheckerCompletionLocked() {
        int state = COMPLETED;
        for (int i=0; i<mHandlerCheckers.size(); i++) {
            HandlerChecker hc = mHandlerCheckers.get(i);
            state = Math.max(state, hc.getCompletionStateLocked());
        }
        return state;
    }

->HandlerChecker.java->getCompletionStateLocked

public int getCompletionStateLocked() {
            if (mCompleted) {
                return COMPLETED;
            } else {
                long latency = SystemClock.uptimeMillis() - mStartTime;
                if (latency < mWaitMax/2) {
                    return WAITING;
                } else if (latency < mWaitMax) {
                    return WAITED_HALF;
                }
            }
            return OVERDUE;
        }

mStartTime 就是步骤(1)中给被监控的线程发信息scheduleCheckLocked()的那个时间 ,如果scheduleCheckLocked方法中mCompleted为tue(线程正常,不会挂起的情况),直接return COMPLETED;如果为false就计算它等待的时候,则:
COMPLETED:值为0,表示状态良好;
WAITING:值为1,表示正常等待消息处理的结果;
WAITED_HALF:值为2,表示正常等待并且等待的时间超过了规定时间的一半;
OVERDUE:值为3,表示等待的时间超过了规定时间

最后,只要不是OVERDUE状态都可以continue继续执行,否则就会杀死SystemServer。
参考<深入解析android 5.0系统>

    原文作者:Android
    原文地址: https://www.jianshu.com/p/8cd2e85f3de9
    本文转自网络文章,转载此文章仅为分享知识,如有侵权,请联系博主进行删除。
点赞