[Repost] A Collection of Classic Papers on Big Data and Distributed Systems

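The papers below are all classics in big data and distributed systems, covering CAP, BASE, 2PC, consensus protocols, consistent hashing, logical clocks, leases, and more. If you know of other good papers, feel free to share them in the comments below.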

Contents

1 Distributed Theory

2 Google

3 Amazon

4 Facebook

5 Streaming

6 Microsoft

7 Apache Spark

8 Apache Hadoop

9 Apache Flink

10 Apache ZooKeeper

11 Apache Mesos

12 Apache Kafka

13 KV Database

14 Schedulers

Distributed Theory

Time, Clocks, and the Ordering of Events in a Distributed System

(Paxos) The Part-Time Parliament

Paxos Made Simple

Paxos Made Practical

Paxos Made Live – An Engineering Perspective

Revisiting the Paxos algorithm

Distributed Snapshots: Determining Global States of Distributed Systems

Reaching Agreement in the Presence of Faults

The Byzantine Generals Problem

(CAP) Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services

(2PC) Concurrency Control and Recovery in Database Systems

BASE: An Acid Alternative

An Overview of Clock Synchronization

Epidemic Algorithms for Replicated Database Maintenance

Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency

Weighted Voting for Replicated Data

A Quorum-Consensus Replication Method for Abstract Data Types

Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web

(Raft) In Search of an Understandable Consensus Algorithm

Google

The Google File System

MapReduce: Simplified Data Processing on Large Clusters

Bigtable: A Distributed Storage System for Structured Data

The Chubby lock service for loosely-coupled distributed systems

Dremel: Interactive Analysis of Web-Scale Datasets

Omega: flexible, scalable schedulers for large compute clusters

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

Large-scale cluster management at Google with Borg

Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications

Spanner: Google’s Globally-Distributed Database

F1: A Distributed SQL Database That Scales

Pregel: A System for Large-Scale Graph Processing

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

Amazon

Dynamo: Amazon’s Highly Available Key-value Store

Facebook

Cassandra – A Decentralized Structured Storage System

Hive – A Warehousing Solution Over a Map-Reduce Framework

Riffle: Optimized Shuffle Service for Large-Scale Data Analytics

Streaming

S4: Distributed Stream Computing Platform

Microsoft

Schema-Agnostic Indexing with Azure DocumentDB

Apache Spark

Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing

Spark: Cluster Computing with Working Sets

GraphX: Graph Processing in a Distributed Dataflow Framework

MLlib: Machine Learning in Apache Spark

Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling

Shark: SQL and Rich Analytics at Scale

Spark SQL: Relational Data Processing in Spark

Discretized Streams: Fault-Tolerant Streaming Computation at Scale

Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters

Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark

Apache Hadoop

The Hadoop Distributed File System

Apache Hadoop YARN: Yet Another Resource Negotiator

Apache Flink

Apache Flink™: Stream and Batch Processing in a Single Engine

Lightweight Asynchronous Snapshots for Distributed Dataflows

Apache ZooKeeper

ZooKeeper’s atomic broadcast protocol: Theory and practice

ZooKeeper: Wait-free coordination for Internet-scale systems

Apache Mesos

Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center

Apache Kafka

Kafka: a Distributed Messaging System for Log Processing

KV Database

Serving Large-scale Batch Computed Data with Project Voldemort

Schedulers

Column-Stores vs. Row-Stores: How Different Are They Really?

    Original author: 贺大伟 (He Dawei)
    Original post: https://www.jianshu.com/p/ac0ba88b2650
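    This post is reproduced from an article found online, solely to share knowledge. If it infringes on your rights, please contact the blog owner to have it removed.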