Assembly - Spinlock with XCHG

May 19, 2023

The example implementation of a spinlock using the x86 XCHG instruction given on Wikipedia is:

; Intel syntax

locked:                      ; The lock variable. 1 = locked, 0 = unlocked.
     dd      0

spin_lock:
     mov     eax, 1          ; Set the EAX register to 1.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.
                             ; This will always store 1 to the lock, leaving
                             ;  the previous value in the EAX register.

     test    eax, eax        ; Test EAX with itself. Among other things, this will
                             ;  set the processor's Zero Flag if EAX is 0.
                             ; If EAX is 0, then the lock was unlocked and
                             ;  we just locked it.
                             ; Otherwise, EAX is 1 and we didn't acquire the lock.

     jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag is
                             ;  not set; the lock was previously locked, and so
                             ; we need to spin until it becomes unlocked.

     ret                     ; The lock has been acquired, return to the calling
                             ;  function.

spin_unlock:
     mov     eax, 0          ; Set the EAX register to 0.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.

     ret                     ; The lock has been released.

From https://en.wikipedia.org/wiki/Spinlock#Example_implementation

I don't understand why the unlock needs to be atomic. What's wrong with this?

spin_unlock:
     mov     dword [locked], 0     ; operand size must be explicit for a store to memory

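Best answer: Unlocking needs to have release semantics to protect the critical section properly, but it does not need sequential consistency. Atomicity is not really the issue (see below).

So yes, on x86 a simple store is safe, and glibc's pthread_spin_unlock does exactly that: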

    movl    $1, (%rdi)      # plain store; note that this glibc lock uses 1 = unlocked
    xorl    %eax, %eax      # return 0 (success)
    retq
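In C11 terms, the point about release semantics could be sketched roughly as follows. This is only an illustration under assumptions: the names `lock_word`, `spin_lock`, `spin_trylock` and `spin_unlock` are hypothetical, and the lock uses the 0 = unlocked convention from the question rather than glibc's. On x86, compilers typically turn the release store into a plain `mov`, whereas a sequentially consistent store usually becomes `xchg` (or `mov` plus `mfence`).

```c
#include <stdatomic.h>

/* Hypothetical lock word for this sketch: 0 = unlocked, 1 = locked. */
atomic_int lock_word = 0;

/* Try once: atomically swap in 1 and inspect the previous value.
   Returns 1 if the lock was free and is now ours, 0 if it was already held. */
int spin_trylock(void)
{
    return atomic_exchange_explicit(&lock_word, 1, memory_order_acquire) == 0;
}

/* Acquire: keep swapping in 1 until the previous value was 0. */
void spin_lock(void)
{
    while (!spin_trylock()) {
        /* busy-wait */
    }
}

/* Release: a store with release ordering is sufficient.
   No atomic read-modify-write and no full barrier is required. */
void spin_unlock(void)
{
    atomic_store_explicit(&lock_word, 0, memory_order_release);
}
```

Replacing `memory_order_release` with `memory_order_seq_cst` in `spin_unlock` would still be correct, but it asks for the stronger ordering that the `xchg`-based unlock pays for.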

See also a simple but possibly usable x86 spinlock implementation I wrote in this answer, which uses a read-only spin loop with a pause instruction.
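The idea of a read-only spin loop with `pause` can be sketched in C11 like this. It is not the code from the linked answer, just a minimal illustration; `ttas_lock`, `ttas_acquire`, `ttas_release` and the `cpu_relax` macro are names invented for this sketch, and on non-x86 targets the pause hint is assumed to be a no-op.

```c
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>              /* _mm_pause() */
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)       /* assumption: no pause hint on other targets */
#endif

/* Hypothetical lock type: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int locked; } ttas_lock;

void ttas_acquire(ttas_lock *l)
{
    for (;;) {
        /* Attempt to take the lock with an atomic exchange (xchg on x86). */
        if (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0)
            return;

        /* Lock is held: wait with plain loads only, so the cache line is
           not repeatedly pulled into exclusive state by failed exchanges. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            cpu_relax();
    }
}

void ttas_release(ttas_lock *l)
{
    /* Release store: a plain mov on x86. */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The exchange is only retried after a load has seen the lock as free, which keeps waiters from hammering the line with locked instructions.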

Possibly this code was adapted from a bit-field version.

Unlocking with btr to clear one flag in a bit-field is not safe, because it is a non-atomic read-modify-write of the containing byte (or the containing naturally-aligned 4-byte dword or 2-byte word).
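To illustrate the bit-field case, here is a small C11 sketch in which the lock is one bit of a word that also holds other flags. The names `LOCK_BIT`, `bit_trylock` and `bit_unlock` are hypothetical. Because other threads may change the neighbouring bits at any time, clearing the lock bit has to be an atomic read-modify-write; a separate load, AND and store could overwrite a concurrent update to another flag.

```c
#include <stdatomic.h>

#define LOCK_BIT 0x1u   /* hypothetical layout: bit 0 is the lock flag */

/* Set the lock bit atomically.
   Returns 1 if the bit was clear before (lock acquired), 0 otherwise. */
int bit_trylock(atomic_uint *flags)
{
    unsigned old = atomic_fetch_or_explicit(flags, LOCK_BIT, memory_order_acquire);
    return (old & LOCK_BIT) == 0;
}

/* Clear only the lock bit with an atomic RMW, leaving the other bits intact.
   On x86 this compiles to a lock-prefixed instruction.
   Returns the previous value of the whole word. */
unsigned bit_unlock(atomic_uint *flags)
{
    return atomic_fetch_and_explicit(flags, ~LOCK_BIT, memory_order_release);
}
```

When the lock has a whole word to itself, as in the Wikipedia example, this extra cost is unnecessary and a plain release store is enough.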

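So maybe whoever wrote it did not realize that simple stores to aligned addresses are atomic on x86, as on most ISAs. What x86 has that weakly-ordered ISAs lack is that every store has release semantics. Using xchg to release the lock makes every unlock a full memory barrier, which goes beyond normal locking semantics. (On x86, taking the lock will be a full barrier anyway, because there is no way to do an atomic RMW or atomic compare-and-swap without xchg or another locked instruction, and those are all full barriers like mfence.)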

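The unlocking store does not technically need to be atomic, since we only ever store zero or 1, so only the low byte matters. For example, I think it would still work if the lock were unaligned and split across a cache-line boundary. Tearing can happen but does not matter: what really happens is that the low byte of the lock is modified atomically, by operations that always put zeros into the high 3 bytes.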

If you want to return the old value to catch double-unlocking bugs, a better implementation would load and store separately:

spin_unlock:
     ;; pre-condition: [locked] is non-zero

     mov     eax,  [locked]        ; old value, for debugging
     mov     dword [locked], 0     ; On x86, this is an atomic store with "release" semantics.

     ;test    eax,eax
     ;jz    double_unlocking_detected    ; or leave this to the caller
     ret
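The same debugging variant could be written in C11 roughly as below. `spin_unlock_checked` is a hypothetical name for this sketch. The load and the store are two separate operations rather than one atomic exchange; that is acceptable here on the assumption that only the current lock holder calls it, so nobody else stores 0 in between.

```c
#include <stdatomic.h>

/* Releases the lock and returns its previous value.
   A return value of 0 means the lock was already unlocked,
   i.e. the caller has hit a double-unlock bug. */
int spin_unlock_checked(atomic_int *lock)
{
    /* Plain load of the old value, used only for debugging. */
    int old = atomic_load_explicit(lock, memory_order_relaxed);

    /* Release store: on x86 this is an ordinary mov. */
    atomic_store_explicit(lock, 0, memory_order_release);

    return old;
}
```

A caller could wrap this in an assertion in debug builds and ignore the return value otherwise.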