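The papers below are all classics in big data and distributed systems, covering CAP, BASE, 2PC, consistency protocols, consistent hashing, logical clocks, leases, and more. If you know of other good papers, feel free to share them in the comments below.
Contents
Distributed Systems Theory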
Time, Clocks, and the Ordering of Events in a Distributed System
(Paxos) The Part-Time Parliament
Paxos Made Live – An Engineering Perspective
Revisiting the Paxos algorithm
Distributed Snapshots: Determining Global States of Distributed Systems
Reaching Agreement in the Presence of Faults
(2PC) Concurrency Control and Recovery in Database Systems
An Overview of Clock Synchronization
Epidemic Algorithms for Replicated Database Maintenance
Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency
Weighted Voting for Replicated Data
A Quorum-Consensus Replication Method for Abstract Data Types
(Raft) In Search of an Understandable Consensus Algorithm
Google
MapReduce: Simplified Data Processing on Large Clusters
Bigtable: A Distributed Storage System for Structured Data
The Chubby lock service for loosely-coupled distributed systems
Dremel: Interactive Analysis of Web-Scale Datasets
Omega: flexible, scalable schedulers for large compute clusters
MillWheel: Fault-Tolerant Stream Processing at Internet Scale
Large-scale cluster management at Google with Borg
Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications
Spanner: Google’s Globally-Distributed Database
F1: A Distributed SQL Database That Scales
Pregel: A System for Large-Scale Graph Processing
Amazon
Dynamo: Amazon’s Highly Available Key-value Store
Facebook
Cassandra – A Decentralized Structured Storage System
Hive – A Warehousing Solution Over a Map-Reduce Framework
Riffle: Optimized Shuffle Service for Large-Scale Data Analytics
Streaming
S4: Distributed Stream Computing Platform
Microsoft
Schema-Agnostic Indexing with Azure DocumentDB
Apache Spark
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
Spark: Cluster Computing with Working Sets
GraphX: Graph Processing in a Distributed Dataflow Framework
MLlib: Machine Learning in Apache Spark
Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling
Shark: SQL and Rich Analytics at Scale
Spark SQL: Relational Data Processing in Spark
Discretized Streams: Fault-Tolerant Streaming Computation at Scale
Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark
Apache Hadoop
The Hadoop Distributed File System
Apache Hadoop YARN: Yet Another Resource Negotiator
Apache Flink
Apache Flink™: Stream and Batch Processing in a Single Engine
Lightweight Asynchronous Snapshots for Distributed Dataflows
Apache ZooKeeper
ZooKeeper’s atomic broadcast protocol: Theory and practice
ZooKeeper: Wait-free coordination for Internet-scale systems
Apache Mesos
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Apache Kafka
Kafka: a Distributed Messaging System for Log Processing
KV Database
Serving Large-scale Batch Computed Data with Project Voldemort
Schedulers
Column-Stores vs. Row-Stores: How Different Are They Really?