Cache is the root of all evil

The practice of caching is about as effective at lowering latencies and load as it is at introducing nasty correctness problems. It is almost a law of nature that once you introduce a denormalization, it’s a matter of time before it diverges from the source of truth. The transient nature of caches makes problems very difficult to debug and clouds the matter in an extra layer of mystery. All this is to say that if you can live with the performance and load without caching, for the love of everything that’s good in the world, don’t add it. In some cases though, your clients can’t stomach the long latencies, nor can your system of record take the load, so you strike a deal with The Caching Devil (what’d you think that “d” in memcached stood for?).

At Box we’ve had our share of run-ins with the beast and to tame it we’ve relied on many strategies well-known in the industry as well as some tricks we’re happy to contribute to the community’s tool belt. Since caching is most commonly used to optimize latency and load in read-heavy environments, for the purposes of this post, we’ll avoid the write-through cache variations and focus on caches that are populated upon read.

Figure: At a high level, reads look the value up in cache before reading it from the system of record, if necessary. The cache is populated on cache misses. Writes are responsible for invalidating stale cache values.
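
To make that flow concrete, here is a minimal sketch of the plain look-aside pattern. The Cache interface and every name in it are hypothetical stand-ins used only for illustration, not the actual Box implementation, and no lease protection is applied yet:

import java.util.function.Function;

/**
 * A minimal sketch of the plain look-aside pattern: read from the cache first,
 * fall back to the source of truth on a miss, and invalidate on writes.
 */
class NaiveLookasideSketch {
    interface Cache {
        byte[] get(byte[] key);              // returns null on a cache miss
        void set(byte[] key, byte[] value);
        void delete(byte[] key);
    }

    private final Cache cache;

    NaiveLookasideSketch(Cache cache) {
        this.cache = cache;
    }

    /** Read path: consult the cache first, fall back to the source of truth on a miss. */
    byte[] read(byte[] key, Function<byte[], byte[]> loadFromSourceOfTruth) {
        byte[] cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        byte[] value = loadFromSourceOfTruth.apply(key);
        cache.set(key, value);  // populate on miss; this unguarded set is where the races below lurk
        return value;
    }

    /** Write path: mutate the system of record first, then invalidate the affected cache key. */
    void write(byte[] key, Runnable writeToSourceOfTruth) {
        writeToSourceOfTruth.run();  // the write is acknowledged only after this succeeds
        cache.delete(key);           // invalidate so subsequent readers reload fresh data
    }
}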

As the computer science adage aptly proclaims, cache invalidation is the hard part. Figuring out which cache keys are rendered stale by a given system of record mutation is often not trivial. Although this can be very tedious, it is at least relatively easy to reproduce and test. On the other hand, concurrency-related cache consistency problems are a lot more subtle. Readers experienced with distributed systems will notice a couple of such problems that can occur in the caching system described above:

  • In case of high-volume read traffic, a write (and thus a cache value invalidation) can lead to a thundering herd of readers storming the system of record to reload the value into cache.

  • A concurrent read and write can cause a stale value to be stored in cache indefinitely. Consider the following sequence of operations, for example:

Figure: This serialization of steps yields a persistently stale value in cache: the reader writes a value it read before the write alters the system of record and invalidates affected cache values.
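
For concreteness, one interleaving that produces this permanently stale entry looks roughly like the following (a simplified reading of the figure):

    1. Reader misses the cache for key K and reads value V1 from the system of record.
    2. Writer updates K's record in the system of record from V1 to V2.
    3. Writer invalidates K in the cache (a no-op, since the key is not populated yet).
    4. Reader sets cache[K] = V1, and the stale V1 now survives until the next write to K.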

The canonical solution to both of the above concurrency issues was introduced by the famous 2013 Facebook paper entitled “Scaling Memcache at Facebook”. The concept of “leases” is introduced as sort of a per-cache-key lock preventing thundering herds and stale sets. It relies on two common cache system operations:

  • atomic_add(key, value): set the provided value for key if and only if the key has not already been set. Otherwise, the operation fails. In Memcached this is implemented as add, and in Redis as SETNX.

  • atomic_check_and_set(key, expected_value, new_value): set the new_value for the provided key if and only if the key is currently associated with expected_value. In Memcached this is implemented as cas. Unfortunately (and surprisingly), Redis doesn’t have a command with such semantics, but this functionality gap can be closed trivially by a simple Lua script (a sketch of one follows this list).

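For illustration, here is a minimal sketch of what such a check-and-set script might look like, kept as a Java string constant; the class and constant names are hypothetical, and the script would be run through your Redis client's EVAL/EVALSHA support:

public final class RedisCheckAndSetScript {
    /**
     * A minimal Lua sketch approximating atomic_check_and_set on Redis: it sets
     * KEYS[1] to ARGV[2] if and only if the key's current value equals ARGV[1].
     * Redis executes Lua scripts atomically, so the get-compare-set below cannot
     * interleave with other commands.
     */
    public static final String CHECK_AND_SET_LUA =
        "if redis.call('GET', KEYS[1]) == ARGV[1] then\n" +
        "  return redis.call('SET', KEYS[1], ARGV[2])\n" +
        "else\n" +
        "  return nil\n" +
        "end";

    private RedisCheckAndSetScript() {}
}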

With these concepts in mind, our read operation implementation can be amended as follows:

Figure: Read implementation amended for thundering herd and stale set protection.

This approach allows your cache to effectively shield the system of record from thundering herds. In case of a cache miss, only one lucky request will be able to add the lease and interact with the source of truth, while the others will be relegated to polling the lease until the lucky request populates the key with the computed value.

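As a stepping stone toward the full implementation later in this post, here is a minimal sketch of this lease-protected read path. The Cache interface, the nonce-tagging scheme, and all names are hypothetical simplifications; in particular it assumes real cached values are encoded so they never start with the lease marker byte:

import java.security.SecureRandom;
import java.util.function.Function;

/**
 * A minimal sketch of a lease-protected look-aside read. Leases are modeled as
 * random nonces tagged with a marker byte, purely for illustration.
 */
class LeasedReadSketch {
    interface Cache {
        byte[] get(byte[] key);                                                // null on miss
        boolean atomicAdd(byte[] key, byte[] value);                           // add / SETNX
        boolean atomicCheckAndSet(byte[] key, byte[] expected, byte[] value);  // cas
    }

    private static final byte LEASE_MARKER = 0x01;
    private static final SecureRandom RANDOM = new SecureRandom();

    private final Cache cache;

    LeasedReadSketch(Cache cache) {
        this.cache = cache;
    }

    byte[] read(byte[] key, Function<byte[], byte[]> loadFromSourceOfTruth)
            throws InterruptedException {
        while (true) {
            byte[] cached = cache.get(key);
            if (cached == null) {
                byte[] leaseNonce = newLeaseNonce();
                if (cache.atomicAdd(key, leaseNonce)) {
                    // We hold the lease: only this request hits the source of truth.
                    byte[] value = loadFromSourceOfTruth.apply(key);
                    // Publish only if our lease is still in place. A concurrent write
                    // clears the lease, so this check-and-set fails harmlessly and the
                    // potentially stale value stays out of the cache.
                    cache.atomicCheckAndSet(key, leaseNonce, value);
                    return value;
                }
                // Another request won the race for the lease; loop and poll like the rest.
            } else if (isLease(cached)) {
                Thread.sleep(100);  // give the lease holder some time to fill in the value
            } else {
                return cached;      // a real cached value
            }
        }
    }

    private static byte[] newLeaseNonce() {
        byte[] nonce = new byte[17];
        RANDOM.nextBytes(nonce);
        nonce[0] = LEASE_MARKER;  // tag the nonce so readers can recognize it as a lease
        return nonce;
    }

    private static boolean isLease(byte[] value) {
        return value != null && value.length > 0 && value[0] == LEASE_MARKER;
    }
}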

This mechanism also protects us from the race condition described above. Cache poisoning occurs when the system of record is mutated and cache is invalidated between the time when a reader fetches the data from the source of truth and the time when they put it in the cache. This model will prevent the reader from poisoning the cache because their atomic check-and-set will fail in case a writer changed the record underneath them.

Although for the time being he is puzzled, the cache devil unfortunately does have more tricks up his sleeve. Consider a use case where your data is read frequently all the time, but parts of it also undergo periodic bursts of frequent writes:

Figure: Reader 1 experiences a ridiculous amount of latency as it waits in vain for Reader 3 and Reader 2 to populate the cache key of interest.

A pathological condition may arise where, during stretches of numerous writes, readers end up taking turns acquiring leases and querying the system of record, only to have their leases cleared by writes. This in effect serializes reads from the source of truth but doesn’t deduplicate them, which ultimately causes very high read latencies and timeouts as readers wait for their turn to fetch the value they need from the source of truth.

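For concreteness, a write burst can play out roughly like this (a simplified reading of the figure above):

    1. Reader 3 acquires the lease on key K and starts querying the system of record.
    2. A write to K invalidates the cache entry, clearing Reader 3's lease, so its later check-and-set fails.
    3. Reader 2 then finds K empty, acquires a fresh lease, and queries the system of record itself.
    4. Another write clears that lease as well, so Reader 2's check-and-set also fails.
    5. Reader 1 keeps polling leases that never get filled, piling up latency until it eventually wins a lease of its own or times out.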

We faced this problem within our distributed relational data service at Box and thought of a couple of solutions related to this. The approach we ended up going with drew upon the insight that any read that’s waiting on a lease can safely use the value retrieved by the reader that holds the lease, even if the lease ultimately ends up getting cleared by a writer and the final atomic_check_and_set fails. Indeed, if a reader encountered another reader’s lease, the reader must have arrived before the writer cleared the cache value and thus before the write was acknowledged, so both readers can return the value retrieved by the lease holder without sacrificing read-after-write consistency. To take advantage of this insight, in addition to performing the atomic_check_and_set to attempt to store the value computed from the source of truth into the cache, the reader who acquired the lease will also stash away the value in a different location in the cache that can be discovered by readers waiting on the lease.

A flowchart illustrating the algorithm becomes complex and hard to read, so below is a code snippet that does the job instead. The snippet is written in super-procedural Java aimed at clarity of the high-level approach, with no attention given to error handling, compile-time safety, maintainability, etc.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;


public class LookasideCache {
    private CacheCluster cacheCluster;
    private LeaseUtils leaseUtils;


    /**
     * Read a value utilizing a lookaside cache.
     *
     * @param key whose value is to be returned
     * @param sourceOfTruthComputation the function to use to fetch the value from the
     *                                 source of truth if that should become necessary.
     *                                 The computation is presumed to be expensive and
     *                                 impose significant load on the source of truth
     *                                 data store.
     * @return a byte[] representing the value associated with the looked up key.
     */
    public byte[] read(byte[] key, Function<byte[], byte[]> sourceOfTruthComputation) {
        // Start with a mutable, empty set: the helper adds any leases it encounters.
        return read(key, sourceOfTruthComputation, new HashSet<>());
    }


    /**
     * A private recursive helper for the above read method. Allows us to carry the set
     * of previously seen leases on the stack.
     */
    private byte[] read(
        byte[] key,
        Function<byte[], byte[]> sourceOfTruthComputation,
        Set<Lease> previouslySeenLeases
    ) {
        // Start by looking up the provided key as well as all previously seen leases.
        List<byte[]> cacheKeysToLookUp = new ArrayList<>();
        cacheKeysToLookUp.add(key);
        for (Lease previouslySeenLease : previouslySeenLeases) {
            cacheKeysToLookUp.add(previouslySeenLease.getNonce());
        }
        List<byte[]> valuesFromCacheServer = cacheCluster.get(cacheKeysToLookUp);
        byte[] valueForKey = valuesFromCacheServer.remove(0);
        // Check if the value is stashed behind one of the leases we've previously seen.
        for (byte[] valueForPreviouslySeenLease : valuesFromCacheServer) {
            if (valueForPreviouslySeenLease != null) {
                return valueForPreviouslySeenLease;
            }
        }


        if (valueForKey == null) {
            // The value is not in the cache. Let's try to grab a lease on this key.
            Lease newLease = leaseUtils.createNew();
            boolean leaseAddSucceeded = cacheCluster.atomicAdd(key, newLease.getNonce());
            if (leaseAddSucceeded) {
                // We managed to acquire a lease on this key. This means it's up to us
                // to go to the source of truth and populate the cache with the value
                // from there.
                byte[] valueFromSourceOfTruth = sourceOfTruthComputation.apply(key);
                // Attempt to replace our lease with the value we computed. The result
                // is intentionally ignored: even if a concurrent write cleared the
                // lease (making this check-and-set fail), readers waiting on the lease
                // can still safely use the value we stash below.
                cacheCluster.atomicCheckAndSet(
                    key,
                    newLease.getNonce(),
                    valueFromSourceOfTruth
                );
                // Let's use the lease nonce as the key and associate the value
                // we computed with it, so readers waiting on the lease can find it.
                cacheCluster.set(newLease.getNonce(), valueFromSourceOfTruth);
                return valueFromSourceOfTruth;
            } else {
                // Another request managed to acquire the lease before us. Let's retry.
                return read(key, sourceOfTruthComputation, previouslySeenLeases);
            }
        } else if (leaseUtils.isCacheValueLease(valueForKey)) {
            // Another cache request is holding the lease on this key. Let's give it
            // some time to fill it and try again.
            sleep(100);
            previouslySeenLeases.add(leaseUtils.fromCacheValue(valueForKey));
            return read(key, sourceOfTruthComputation, previouslySeenLeases);
        } else {
            // Got the value from the cache server, let's return it.
            return valueForKey;
        }
    }


    private void sleep(int millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            System.err.println("Sleep interrupted.");
            e.printStackTrace();
        }
    }
}


interface CacheCluster {
    /**
     * Gets the values for a list of keys from the cache cluster.
     * @param keys to get from the cache cluster.
     * @return a list of byte[] representing the values associated with the provided
     * keys. The returned list will be parallel to the provided list of keys, in other
     * words, the value associated with a certain key will be in the same position in
     * the returned list as the key is in the keys list. Nulls in the returned list will
     * represent cache misses. While this isn't a great way to design an API, it works
     * well for this high level illustration of the algorithm.
     */
    List<byte[]> get(List<byte[]> keys);


    /**
     * Associates the provided value with the provided key in the cache cluster.
     */
    void set(byte[] key, byte[] value);


    /**
     * Set the provided value for key if and only if the key has not already been set.
     * Otherwise, the operation fails.
     * @return true if the operation succeeds and the value has been set,
     *         false otherwise.
     */
    boolean atomicAdd(byte[] key, byte[] value);


    /**
     * Set the valueToSet for the provided key if and only if the key is currently
     * associated with expectedValue.
     * @return true if the operation succeeds and the value has been set,
     *         false otherwise.
     */
    boolean atomicCheckAndSet(byte[] key, byte[] expectedValue, byte[] valueToSet);
}


interface Lease {
    byte[] getNonce();
}


interface LeaseUtils {
    Lease createNew();
    Lease fromCacheValue(byte[] nonce);
    boolean isCacheValueLease(byte[] value);
}
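
For reference, a caller might wire the class above up roughly as follows. This fragment is a hypothetical sketch: it assumes a constructor that injects the two collaborators, and userDatabase.loadSerializedUser stands in for an expensive source-of-truth lookup:

// Hypothetical usage sketch; none of these collaborators are defined in this post.
LookasideCache lookasideCache = new LookasideCache(cacheCluster, leaseUtils);

byte[] cacheKey = "user:42".getBytes();
byte[] serializedUser = lookasideCache.read(
    cacheKey,
    key -> userDatabase.loadSerializedUser(key)  // only invoked on a cache miss
);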

The devil has been stumped by this approach for a while now, as we’ve been using variants of this algorithm for millions of requests per second for the distributed relational data tier at Box. Being some of the devil’s most loyal customers, we hope this overview of our dealings with the beast helps in your struggle for performance and consistency.

If you’re interested in joining us, check out our open opportunities.

Translated from: https://medium.com/box-tech-blog/cache-is-the-root-of-all-evil-e64ebd7cbd3b

    Original author: 一二三是五六十
    Original source: https://blog.csdn.net/weixin_26717681/article/details/108906773