[JUC] AQS Exclusive Mode Explained

March 10, 2019 · Source: JUC

Preface

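AQS refers to the class java.util.concurrent.locks.AbstractQueuedSynchronizer. AQS does not rely on a special keyword such as synchronized; instead, it implements locks and synchronizers by maintaining a state variable and a first-in-first-out (FIFO) synchronization queue. In JDK 11, AbstractQueuedSynchronizer has the following implementing classes:
[Figure: implementing classes of AbstractQueuedSynchronizer in JDK 11]
As the figure shows, commonly used classes such as ReentrantLock and CountDownLatch use AQS as their internal implementation.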
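To make the "state variable plus queue" idea concrete, here is a minimal sketch of a non-reentrant mutex built directly on AQS. The class names SimpleMutexDemo and Sync and the helper countWithMutex are made up for this illustration and are not part of the JDK; the convention that state 0 means unlocked and 1 means locked is also just an assumption of this sketch.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical demo class: a non-reentrant mutex built on AQS.
final class SimpleMutexDemo {

    // Assumed convention for this sketch: state 0 = unlocked, state 1 = locked.
    static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int acquires) {
            // Only one thread can win the CAS from 0 to 1.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int releases) {
            if (getExclusiveOwnerThread() != Thread.currentThread()) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }
    }

    // Increments a shared counter from several threads while holding the mutex.
    static int countWithMutex(int threads, int incrementsPerThread) {
        Sync sync = new Sync();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    sync.acquire(1);          // queues and parks if tryAcquire fails
                    try {
                        counter[0]++;
                    } finally {
                        sync.release(1);      // wakes the next queued thread, if any
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithMutex(4, 10_000)); // prints 40000
    }
}
```

The subclass only decides whether the state transition succeeds; the queueing, parking, and waking described in the rest of this article are all inherited from AQS.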

Exclusive Mode and Shared Mode

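AQS provides two working modes: exclusive mode and shared mode. Most of its subclasses implement and use only one of the two APIs, either the exclusive one or the shared one. The best-known exception is ReentrantReadWriteLock: its internal Sync class implements both sets of methods, and its two inner classes, the write lock and the read lock, use the exclusive API and the shared API respectively.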

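In exclusive mode, once one thread has acquired the lock, no other thread can acquire it. In shared mode, after one thread has acquired the lock, other threads may still be able to acquire it.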
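The difference can be observed from the outside with two JDK classes that are built on AQS: ReentrantLock (exclusive) and Semaphore (shared). The sketch below is only an illustration; the class ModeDemo and its method names are invented for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class contrasting exclusive and shared acquisition.
final class ModeDemo {

    // While the caller holds the exclusive lock, can a second thread get it?
    static boolean secondThreadGetsExclusiveLock() {
        ReentrantLock lock = new ReentrantLock();
        boolean[] acquired = {false};
        lock.lock();
        try {
            Thread other = new Thread(() -> {
                if (lock.tryLock()) {
                    acquired[0] = true;
                    lock.unlock();
                }
            });
            other.start();
            join(other);
        } finally {
            lock.unlock();
        }
        return acquired[0];
    }

    // While the caller holds one of two permits, can a second thread get one?
    static boolean secondThreadGetsSharedPermit() {
        Semaphore semaphore = new Semaphore(2);
        boolean[] acquired = {false};
        semaphore.acquireUninterruptibly();
        try {
            Thread other = new Thread(() -> {
                if (semaphore.tryAcquire()) {
                    acquired[0] = true;
                    semaphore.release();
                }
            });
            other.start();
            join(other);
        } finally {
            semaphore.release();
        }
        return acquired[0];
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(secondThreadGetsExclusiveLock()); // prints false
        System.out.println(secondThreadGetsSharedPermit());  // prints true
    }
}
```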

In this article we analyze how exclusive mode is implemented, using the reentrant lock (ReentrantLock) as the example.

AQS Class Structure

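The inheritance hierarchy of AbstractQueuedSynchronizer is as follows:
[Figure: inheritance hierarchy of AbstractQueuedSynchronizer]
AbstractOwnableSynchronizer defines the following member, which records the thread that currently holds the lock in exclusive mode: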

    /**
     * The current owner of exclusive mode synchronization.
     */
    private transient Thread exclusiveOwnerThread;

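AbstractQueuedSynchronizer defines the following members, which hold the head and the tail of the synchronization wait queue and the synchronization state. Nodes in the sync queue are of type Node: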

    /**
     * Head of the wait queue, lazily initialized. Except for
     * initialization, it is modified only via method setHead. Note:
     * If head exists, its waitStatus is guaranteed not to be
     * CANCELLED.
     * Head node of the sync queue.
     */
    private transient volatile Node head;

    /**
     * Tail of the wait queue, lazily initialized. Modified only via
     * method enq to add new wait node.
     * Tail node of the sync queue.
     */
    private transient volatile Node tail;

    /**
     * The synchronization state.
     * For example, in ReentrantLock, state >= 1 means that a thread holds
     * the lock, possibly more than once (state > 1 means the lock has been
     * acquired repeatedly).
     */
    private volatile int state;

The Node class defines the following members:

        /**
         * Status field, taking on only the values:
         *   SIGNAL:     The successor of this node is (or will soon be)
         *               blocked (via park), so the current node must
         *               unpark its successor when it releases or
         *               cancels. To avoid races, acquire methods must
         *               first indicate they need a signal,
         *               then retry the atomic acquire, and then,
         *               on failure, block.
         *   CANCELLED:  This node is cancelled due to timeout or interrupt.
         *               Nodes never leave this state. In particular,
         *               a thread with cancelled node never again blocks.
         *   CONDITION:  This node is currently on a condition queue.
         *               It will not be used as a sync queue node
         *               until transferred, at which time the status
         *               will be set to 0. (Use of this value here has
         *               nothing to do with the other uses of the
         *               field, but simplifies mechanics.)
         *   PROPAGATE:  A releaseShared should be propagated to other
         *               nodes. This is set (for head node only) in
         *               doReleaseShared to ensure propagation
         *               continues, even if other operations have
         *               since intervened.
         *   0:          None of the above
         *
         * The values are arranged numerically to simplify use.
         * Non-negative values mean that a node doesn't need to
         * signal. So, most code doesn't need to check for particular
         * values, just for sign.
         *
         * The field is initialized to 0 for normal sync nodes, and
         * CONDITION for condition nodes. It is modified using CAS
         * (or when possible, unconditional volatile writes).
         *
         * Notes: the node status takes one of five values: SIGNAL,
         * CANCELLED, CONDITION, PROPAGATE, or 0.
         *   SIGNAL:     the thread in this node's successor is parked or is
         *               about to be parked; when this node releases the lock
         *               or is cancelled, it must wake its successor (unpark
         *               the successor's thread).
         *   CANCELLED:  this node's thread was cancelled because of a timeout
         *               or an interrupt.
         *   CONDITION:  this node is in a condition queue, waiting for a
         *               condition; the value only occurs when a Condition is
         *               used.
         *   PROPAGATE:  in shared mode, the wake-up must be propagated to
         *               successor nodes.
         */
        volatile int waitStatus;

        /**
         * Link to predecessor node that current node/thread relies on
         * for checking waitStatus. Assigned during enqueuing, and nulled
         * out (for sake of GC) only upon dequeuing. Also, upon
         * cancellation of a predecessor, we short-circuit while
         * finding a non-cancelled one, which will always exist
         * because the head node is never cancelled: A node becomes
         * head only as a result of successful acquire. A
         * cancelled thread never succeeds in acquiring, and a thread only
         * cancels itself, not any other node.
         * Points to this node's predecessor in the queue.
         */
        volatile Node prev;

        /**
         * Link to the successor node that the current node/thread
         * unparks upon release. Assigned during enqueuing, adjusted
         * when bypassing cancelled predecessors, and nulled out (for
         * sake of GC) when dequeued. The enq operation does not
         * assign next field of a predecessor until after attachment,
         * so seeing a null next field does not necessarily mean that
         * node is at end of queue. However, if a next field appears
         * to be null, we can scan prev's from the tail to
         * double-check. The next field of cancelled nodes is set to
         * point to the node itself instead of null, to make life
         * easier for isOnSyncQueue.
         * Points to this node's successor in the queue.
         */
        volatile Node next;

        /**
         * The thread that enqueued this node. Initialized on
         * construction and nulled out after use.
         * Holds the thread that is competing for the lock.
         */
        volatile Thread thread;

        /**
         * Link to next node waiting on condition, or the special
         * value SHARED. Because condition queues are accessed only
         * when holding in exclusive mode, we just need a simple
         * linked queue to hold nodes while they are waiting on
         * conditions. They are then transferred to the queue to
         * re-acquire. And because conditions can only be exclusive,
         * we save a field by using special value to indicate shared
         * mode.
         * In a condition queue, points to the next node (the condition queue
         * is a singly linked list, the sync queue is a doubly linked list).
         * In the sync queue, this value tells whether the node is in
         * exclusive mode (EXCLUSIVE) or shared mode (SHARED).
         */
        Node nextWaiter;

How ReentrantLock Uses AQS to Implement a Lock

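We use the following scenario with a fair reentrant lock (ReentrantLock) to analyze how AQS works. As shown in the figure below, suppose there is a reentrant lock named lock and four threads t1, t2, t3, and t4. A red dashed line marks a point on the timeline where the lock is acquired successfully and the synchronization state is modified.
[Figure: timeline of threads t1 to t4 acquiring and releasing lock]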

Timeline 1

At timeline 1, thread t1 acquires the lock and sets the AQS state to 1. The call chain is lock() -> acquire(int arg) -> tryAcquire(int acquires). tryAcquire(int acquires) is implemented as follows:

        /**
         * Fair version of tryAcquire. Don't grant access unless
         * recursive call or no waiters or is first.
         */
        @ReservedStackAccess
        protected final boolean tryAcquire(int acquires) {
            // acquires is 1
            final Thread current = Thread.currentThread();
            // state is 0 at this point
            int c = getState();
            if (c == 0) {
                // First check whether other threads are already waiting in the sync
                // queue (the queue does not exist yet, so nobody is waiting). If not,
                // set state to 1 with compareAndSetState, an atomic CAS operation
                // that JDK 11 implements through a VarHandle.
                if (!hasQueuedPredecessors() &&
                    compareAndSetState(0, acquires)) {
                    // Record the current thread as the lock holder by setting the
                    // exclusiveOwnerThread member of AbstractOwnableSynchronizer
                    setExclusiveOwnerThread(current);
                    // Return true: the lock has been acquired
                    return true;
                }
            }
            else if (current == getExclusiveOwnerThread()) {
                int nextc = c + acquires;
                if (nextc < 0)
                    throw new Error("Maximum lock count exceeded");
                setState(nextc);
                return true;
            }
            return false;
        }

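When t1 calls lock() for the first time, it acquires the lock successfully. The sync queue has not been created yet, but the AQS synchronization state is set to 1 to indicate that a thread now holds the lock, as shown below:
[Figure: AQS after timeline 1: state = 1, lock held by t1, sync queue not yet created]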

Timeline 2

At timeline 2, thread t2 tries to acquire lock, but t1 has not released it yet. The call chain is again lock() -> acquire(int arg) -> tryAcquire(int acquires):

        /**
         * Fair version of tryAcquire. Don't grant access unless
         * recursive call or no waiters or is first.
         */
        @ReservedStackAccess
        protected final boolean tryAcquire(int acquires) {
            // acquires is 1
            final Thread current = Thread.currentThread();
            // state is 1 at this point (t1 holds the lock), so the
            // c == 0 branch below is skipped
            int c = getState();
            if (c == 0) {
                if (!hasQueuedPredecessors() &&
                    compareAndSetState(0, acquires)) {
                    setExclusiveOwnerThread(current);
                    return true;
                }
            }
            // exclusiveOwnerThread is t1 and current is t2, so this branch is not entered
            else if (current == getExclusiveOwnerThread()) {
                int nextc = c + acquires;
                if (nextc < 0)
                    throw new Error("Maximum lock count exceeded");
                setState(nextc);
                return true;
            }
            // Return false: acquiring the lock failed
            return false;
        }

Control now returns to acquire(int arg):

    public final void acquire(int arg) {
        // tryAcquire failed and returned false, so acquireQueued is executed
        if (!tryAcquire(arg) &&
            acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
            selfInterrupt();
    }
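The selfInterrupt() call in acquire means that lock() does not abort when the waiting thread is interrupted; the interrupt is remembered and re-asserted once the lock has been acquired. The following sketch tries to show this from the outside using only public ReentrantLock methods. The class InterruptDuringLockDemo and its method name are invented for this illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: interrupting a thread that is waiting in lock().
final class InterruptDuringLockDemo {

    // Returns the waiter's interrupt status as seen right after lock() returned.
    static boolean interruptStatusAfterLock() {
        ReentrantLock lock = new ReentrantLock(true);
        boolean[] interruptedAfterLock = {false};
        Thread waiter = new Thread(() -> {
            lock.lock(); // keeps waiting even if interrupted
            try {
                interruptedAfterLock[0] = Thread.currentThread().isInterrupted();
            } finally {
                lock.unlock();
            }
        });
        lock.lock();
        try {
            waiter.start();
            // Wait until the waiter has been added to the sync queue.
            while (!lock.hasQueuedThread(waiter)) {
                Thread.yield();
            }
            waiter.interrupt();
        } finally {
            lock.unlock();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return interruptedAfterLock[0];
    }

    public static void main(String[] args) {
        System.out.println(interruptStatusAfterLock());
    }
}
```

A caller that wants the wait itself to be abandoned on interrupt would use lockInterruptibly() instead of lock().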

Let's look at addWaiter:

    private Node addWaiter(Node mode) {
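        // mode is Node.EXCLUSIVE, marking the node as an exclusive-mode sync queue node
        Node node = new Node(mode);
        // Loop until node has been added to the queue
        for (;;) {
            Node oldTail = tail;
            if (oldTail != null) {
                // Step 2: after initialization, append node to the sync queue
                // Set node's predecessor to the current tail
                node.setPrevRelaxed(oldTail);
                // Atomically make node the new tail
                if (compareAndSetTail(oldTail, node)) {
                    // Point the old tail's next at the new tail
                    oldTail.next = node;
                    return node;
                }
            } else {
                // Step 1: head and tail are both null, so initialize the sync queue
                initializeSyncQueue();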
            }
        }
    }

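addWaiter first performs step 1, calling initializeSyncQueue() to initialize the queue. After initialization the sync queue looks like this:
[Figure: sync queue after initialization: head and tail both point to the same empty node]
It then performs step 2, appending the newly created node to the sync queue.

The sync queue now exists. The new node's thread member is t2, and its nextWaiter is Node.EXCLUSIVE, indicating exclusive mode. The next member of the head node points to the new node, the new node's prev points to the head node, and the queue's tail points to the new node. After the node has been enqueued, the waitStatus of the head node is changed to Node.SIGNAL; this happens in shouldParkAfterFailedAcquire, shown further down, before t2 is parked.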
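The two steps can be reproduced in a stripped-down form. The sketch below is not the JDK code: it replaces the VarHandle operations with AtomicReference, stores a plain name instead of a Thread, and leaves out waitStatus entirely. The class EnqueueSketch and all of its members are invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of the AQS enqueue logic.
final class EnqueueSketch {

    static final class Node {
        final String name;   // stands in for the waiting thread; null for the dummy head
        volatile Node prev;
        volatile Node next;

        Node(String name) {
            this.name = name;
        }
    }

    private final AtomicReference<Node> head = new AtomicReference<>();
    private final AtomicReference<Node> tail = new AtomicReference<>();

    Node addWaiter(String name) {
        Node node = new Node(name);
        for (;;) {
            Node oldTail = tail.get();
            if (oldTail != null) {
                // Step 2: link prev first, then publish as tail, then link next.
                node.prev = oldTail;
                if (tail.compareAndSet(oldTail, node)) {
                    oldTail.next = node;
                    return node;
                }
            } else {
                // Step 1: lazily install an empty head node, then retry.
                Node dummy = new Node(null);
                if (head.compareAndSet(null, dummy)) {
                    tail.set(dummy);
                }
            }
        }
    }

    // Walks the queue from head to tail through the next links.
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head.get(); n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.name == null ? "head" : n.name);
        }
        return sb.toString();
    }

    static String demo() {
        EnqueueSketch queue = new EnqueueSketch();
        queue.addWaiter("t2");
        queue.addWaiter("t3");
        return queue.describe();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints head -> t2 -> t3
    }
}
```

Note the order inside step 2: prev is assigned before the CAS on tail, and next only afterwards. A concurrent reader can therefore briefly see a node whose next is still null even though a successor has already been enqueued, while the prev chain from tail is always complete.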

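Back in acquireQueued(final Node node, int arg): from the Node documentation we know that if a node's waitStatus is Node.SIGNAL, it wakes its successor when it releases the lock or is cancelled. acquireQueued repeatedly checks whether node's predecessor is head. If it is, it tries to acquire the lock; if that fails, it decides whether the thread should be parked (this check sets the predecessor's waitStatus to Node.SIGNAL). Once parked, the thread waits to be woken. When another thread wakes it, the loop continues: it tries to acquire the lock again and parks again on failure.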

    // node is the node created above; arg is 1
    final boolean acquireQueued(final Node node, int arg) {
        boolean interrupted = false;
        try {
            for (;;) {
                // Find the predecessor
                final Node p = node.predecessor();
                // If the predecessor is head, try to acquire the lock. At timeline 2,
                // t1 has not released the lock and state is 1, so t2 fails
                if (p == head && tryAcquire(arg)) {
                    // On success, node becomes the new head; setHead sets node's
                    // prev and thread members to null
                    setHead(node);
                    p.next = null; // help GC
                    return interrupted;
                }
                // Decide whether t2 should be parked; before parking, the
                // predecessor's waitStatus must be set to Node.SIGNAL
                if (shouldParkAfterFailedAcquire(p, node))
                    // Park t2. Because this is an infinite loop, when t2 is woken here
                    // it runs the loop again and retries tryAcquire, parking again on
                    // failure, until it finally acquires the lock and leaves the loop
                    interrupted |= parkAndCheckInterrupt();
            }
        } catch (Throwable t) {
            cancelAcquire(node);
            if (interrupted)
                selfInterrupt();
            throw t;
        }
    }
	
    // Decides whether the thread in node should be parked. That is only the case
    // when the waitStatus of node's predecessor pred is Node.SIGNAL
    private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
        int ws = pred.waitStatus;
        // If the predecessor's waitStatus is already Node.SIGNAL, node's thread can be parked
        if (ws == Node.SIGNAL)
            /*
             * This node has already set status asking a release
             * to signal it, so it can safely park.
             */
            return true;
        // waitStatus > 0 means the predecessor's thread has been cancelled. Walk backwards
        // from pred to the first node that is not cancelled and make it node's predecessor
        if (ws > 0) {
            /*
             * Predecessor was cancelled. Skip over predecessors and
             * indicate retry.
             */
            do {
                node.prev = pred = pred.prev;
            } while (pred.waitStatus > 0);
            // Link the new predecessor to node
            pred.next = node;
        } else {
            /*
             * waitStatus must be 0 or PROPAGATE. Indicate that we
             * need a signal, but don't park yet. Caller will need to
             * retry to make sure it cannot acquire before parking.
             */
            // waitStatus is 0 (the initial value of a new Node) or PROPAGATE (shared mode).
            // In both cases the predecessor's waitStatus is set to Node.SIGNAL so that the
            // predecessor wakes node's thread when it releases the lock
            pred.compareAndSetWaitStatus(ws, Node.SIGNAL);
        }
        return false;
    }

In our scenario, by the end of timeline 2 thread t2 has been parked.
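The state reached at the end of timeline 2 can be checked with the monitoring methods that ReentrantLock exposes. The sketch below is only an illustration under that assumption; the class ParkedWaiterDemo and its method are invented for this example, and the calling thread plays the role of t1.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: observing a thread parked in the sync queue.
final class ParkedWaiterDemo {

    // Returns "<thread state of t2>:<queue length>" while t1 still holds the lock.
    static String observeWaiter() {
        ReentrantLock lock = new ReentrantLock(true);
        Thread t2 = new Thread(() -> {
            lock.lock();
            lock.unlock();
        }, "t2");
        String observed;
        lock.lock(); // the calling thread plays the role of t1
        try {
            t2.start();
            // Wait until t2 is enqueued and actually parked.
            while (!lock.hasQueuedThread(t2) || t2.getState() != Thread.State.WAITING) {
                Thread.yield();
            }
            observed = t2.getState() + ":" + lock.getQueueLength();
        } finally {
            lock.unlock();
        }
        try {
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(observeWaiter());
    }
}
```

A thread blocked in LockSupport.park() reports the state WAITING, which is why the loop waits for that state before taking the snapshot.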

Timeline 3

At timeline 3, threads t1 and t3 both try to acquire the lock. According to ReentrantLock's acquisition logic:

        protected final boolean tryAcquire(int acquires) {
            final Thread current = Thread.currentThread();
            int c = getState();
            // t1 still holds the lock, so state is 1
            if (c == 0) {
                if (!hasQueuedPredecessors() &&
                    compareAndSetState(0, acquires)) {
                    setExclusiveOwnerThread(current);
                    return true;
                }
            }
            // At timeline 3 the lock is still held by t1 (exclusiveOwnerThread is t1), so when
            // t1 acquires again it enters the branch below, while t3 enters neither branch
            // and returns false: it fails to acquire the lock
            else if (current == getExclusiveOwnerThread()) {
                // t1 adds 1 to state, meaning it has acquired the lock once more (this is
                // where the "reentrant" in reentrant lock comes from: a thread that already
                // holds the lock can acquire it again)
                int nextc = c + acquires;
                if (nextc < 0)
                    throw new Error("Maximum lock count exceeded");
                // Store the new state
                setState(nextc);
                return true;
            }
            return false;
        }

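After timeline 3, t1 has acquired the lock a second time, while t3, like t2 at timeline 2, has been added to the sync queue because it could not acquire the lock. The sync queue now looks as follows (each time a new node is appended to the tail, its predecessor's waitStatus is set to Node.SIGNAL before the new node's thread is parked):
[Figure: sync queue after timeline 3: head -> t2 -> t3]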
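The growing state value can be watched through ReentrantLock.getHoldCount(), which reports how many times the current thread holds the lock. A small sketch, with the class name ReentrancyDemo invented for this illustration:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: the hold count mirrors the reentrant acquisitions.
final class ReentrancyDemo {

    // Returns the hold counts seen after the first lock(), the second lock(),
    // and after both unlock() calls.
    static int[] holdCounts() {
        ReentrantLock lock = new ReentrantLock(true);
        int[] seen = new int[3];
        lock.lock();
        seen[0] = lock.getHoldCount();
        lock.lock(); // same thread: takes the reentrant branch of tryAcquire
        seen[1] = lock.getHoldCount();
        lock.unlock();
        lock.unlock();
        seen[2] = lock.getHoldCount();
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = holdCounts();
        System.out.println(seen[0] + " " + seen[1] + " " + seen[2]); // prints 1 2 0
    }
}
```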

Timeline 4

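At timeline 4, t1 still has not released the lock (state is 2). Thread t4 tries to acquire the lock and also fails, so t4 is added to the sync queue. After timeline 4 the sync queue looks like this:
[Figure: sync queue after timeline 4: head -> t2 -> t3 -> t4]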

Timeline 5

At timeline 5, thread t1 releases the lock for the first time. The release call chain is unlock() -> release(int arg) -> tryRelease(int releases). tryRelease(int releases) looks like this:

        // Each call to ReentrantLock.unlock() passes releases = 1
        @ReservedStackAccess
        protected final boolean tryRelease(int releases) {
            // At timeline 5, state is 2 and releases is 1
            int c = getState() - releases;
            if (Thread.currentThread() != getExclusiveOwnerThread())
                throw new IllegalMonitorStateException();
            boolean free = false;
            // c is 1
            if (c == 0) {
                free = true;
                setExclusiveOwnerThread(null);
            }
            // Set state to 1
            setState(c);
            // state is not 0, so t1 has not fully released the lock yet
            return free;
        }

        public final boolean release(int arg) {
            // The lock is not fully released: tryRelease returns false, so t1
            // does not wake any thread waiting in the sync queue
            if (tryRelease(arg)) {
                Node h = head;
                if (h != null && h.waitStatus != 0)
                    unparkSuccessor(h);
                return true;
            }
            return false;
        }
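From another thread's point of view, the lock stays unavailable until every lock() has been matched by an unlock(). The following sketch probes this with tryLock() from a helper thread; the class PartialReleaseDemo and its methods are invented for this illustration, and the second probe anticipates the full release discussed at timeline 6.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: a partial release does not free the lock.
final class PartialReleaseDemo {

    // Asks a fresh thread whether it can take the lock right now.
    static boolean otherThreadCanLock(ReentrantLock lock) {
        boolean[] acquired = {false};
        Thread other = new Thread(() -> {
            if (lock.tryLock()) {
                acquired[0] = true;
                lock.unlock();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    // Returns {available after first unlock, available after second unlock}.
    static boolean[] run() {
        ReentrantLock lock = new ReentrantLock(true);
        lock.lock();
        lock.lock();   // state is now 2
        lock.unlock(); // state drops to 1: tryRelease returns false
        boolean afterFirstUnlock = otherThreadCanLock(lock);
        lock.unlock(); // state drops to 0: tryRelease returns true
        boolean afterSecondUnlock = otherThreadCanLock(lock);
        return new boolean[] {afterFirstUnlock, afterSecondUnlock};
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println(result[0] + " " + result[1]); // prints false true
    }
}
```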

Timeline 6

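At timeline 6, t1 releases the lock again, through the same call chain unlock() -> release(int arg) -> tryRelease(int releases). This time unparkSuccessor is called to wake a waiting thread in the sync queue, namely t2 (we only consider the fair version of ReentrantLock here).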

        // Each call to ReentrantLock.unlock() passes releases = 1
        @ReservedStackAccess
        protected final boolean tryRelease(int releases) {
            // At timeline 6, state is 1 and releases is 1
            int c = getState() - releases;
            if (Thread.currentThread() != getExclusiveOwnerThread())
                throw new IllegalMonitorStateException();
            boolean free = false;
            // c is 0
            if (c == 0) {
                free = true;
                // Clear the owning thread
                setExclusiveOwnerThread(null);
            }
            // Set state to 0
            setState(c);
            // state is 0: t1 has fully released the lock
            return free;
        }

        public final boolean release(int arg) {
            // The lock is released: tryRelease returns true, so t1 calls
            // unparkSuccessor to wake a thread waiting in the sync queue
            if (tryRelease(arg)) {
                Node h = head;
                if (h != null && h.waitStatus != 0)
                    unparkSuccessor(h);
                return true;
            }
            return false;
        }

unparkSuccessor is entered to wake thread t2:

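    // node is the head node here
    private void unparkSuccessor(Node node) {
        /*
         * If status is negative (i.e., possibly needing signal) try
         * to clear in anticipation of signalling. It is OK if this
         * fails or if status is changed by waiting thread.
         */
        int ws = node.waitStatus;
        if (ws < 0)
            // My guess is that node's waitStatus is reset to 0 here to avoid waking
            // node's successor more than once (compare the condition
            // h.waitStatus != 0 under which release calls unparkSuccessor)
            node.compareAndSetWaitStatus(ws, 0);

        /*
         * Thread to unpark is held in successor, which is normally
         * just the next node. But if cancelled or apparently null,
         * traverse backwards from tail to find the actual
         * non-cancelled successor.
         */
        // Find the first successor that has not been cancelled. If next is null or
        // cancelled, the search runs backwards from tail. As the documentation of
        // Node.next above explains, a node's prev is assigned before the node becomes
        // tail and its predecessor's next is assigned only afterwards, so next can be
        // null for a moment even though a successor exists, whereas the prev chain
        // is always complete
        Node s = node.next;
        if (s == null || s.waitStatus > 0) {
            s = null;
            for (Node p = tail; p != node && p != null; p = p.prev)
                if (p.waitStatus <= 0)
                    s = p;
        }
        if (s != null)
            // Wake the first successor that has not been cancelled; here that is t2
            LockSupport.unpark(s.thread);
    }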
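The parking and waking itself is done by LockSupport. The following minimal sketch shows the same handshake outside of AQS: one thread parks in a loop, another sets a flag and unparks it. The class ParkUnparkDemo, its method, and the two flags are invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo class: the park/unpark handshake used by AQS.
final class ParkUnparkDemo {

    // Returns true if the waiter resumed after being unparked.
    static boolean parkThenUnpark() {
        AtomicBoolean mayProceed = new AtomicBoolean(false);
        AtomicBoolean resumed = new AtomicBoolean(false);
        Thread waiter = new Thread(() -> {
            // park() may return spuriously, so the condition is rechecked in a
            // loop, just as acquireQueued retries tryAcquire after every wake-up.
            while (!mayProceed.get()) {
                LockSupport.park();
            }
            resumed.set(true);
        });
        waiter.start();
        // Wait until the waiter is actually parked.
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.yield();
        }
        mayProceed.set(true);
        LockSupport.unpark(waiter);
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return resumed.get();
    }

    public static void main(String[] args) {
        System.out.println(parkThenUnpark()); // prints true
    }
}
```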

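When t2 tried to acquire the lock, its call chain was acquire(int arg) -> acquireQueued(final Node node, int arg) -> parkAndCheckInterrupt(), and it was parked inside parkAndCheckInterrupt. Now that t1 has woken it at that point, it returns to acquireQueued and continues: it first performs step 1 below and then, in the next loop iteration, step 2: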

    final boolean acquireQueued(final Node node, int arg) {
        boolean interrupted = false;
        try {
            for (;;) {
                final Node p = node.predecessor();
                // Step 2: in the next iteration t2 tries to acquire the lock again. state
                // is now 0, so t2 succeeds and sets state to 1 (we are looking at the fair
                // lock, i.e. FIFO order, so t3 and t4, which joined the sync queue later,
                // do not compete with t2 for the lock)
                if (p == head && tryAcquire(arg)) {
                    // Make t2's node the new head node
                    setHead(node);
                    p.next = null; // help GC
                    return interrupted;
                }
                if (shouldParkAfterFailedAcquire(p, node))
                    // Step 1: t2 is woken here and continues with the next iteration
                    interrupted |= parkAndCheckInterrupt();
            }
        } catch (Throwable t) {
            cancelAcquire(node);
            if (interrupted)
                selfInterrupt();
            throw t;
        }
    }
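The whole scenario can be replayed with a fair ReentrantLock to check that the waiting threads get the lock in the order in which they joined the sync queue. This is only a sketch: the class FairOrderDemo and its method are invented for this illustration, the calling thread plays the role of t1, and each waiter is started only after the previous one is known to be queued so that the queue order is t2, t3, t4.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: FIFO hand-over of a fair ReentrantLock.
final class FairOrderDemo {

    // Returns the thread names in the order in which they acquired the lock.
    static List<String> acquisitionOrder() {
        ReentrantLock lock = new ReentrantLock(true); // fair lock
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        List<Thread> waiters = new ArrayList<>();
        lock.lock(); // the calling thread plays the role of t1
        try {
            for (String name : new String[] {"t2", "t3", "t4"}) {
                Thread waiter = new Thread(() -> {
                    lock.lock();
                    try {
                        order.add(Thread.currentThread().getName());
                    } finally {
                        lock.unlock();
                    }
                }, name);
                waiter.start();
                // Make sure this thread is in the sync queue before starting the next one.
                while (!lock.hasQueuedThread(waiter)) {
                    Thread.yield();
                }
                waiters.add(waiter);
            }
        } finally {
            lock.unlock(); // full release: wakes the first queued thread
        }
        for (Thread waiter : waiters) {
            try {
                waiter.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(acquisitionOrder()); // prints [t2, t3, t4]
    }
}
```

Each waiter's unlock() goes through the same release and unparkSuccessor path as t1's, which is how the lock is handed from t2 to t3 and then to t4.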