Links to the full videos and slides (PPT)
For the Weiyun share password, follow the WeChat public account "bigdata_summit"
and reply "kafka-sf2017" to receive it.
Building Event-Driven Services with Stateful Streams
by Benjamin Stopford, Engineer, Confluent
video, slide
Event-driven services come in many shapes and sizes, from tiny event-driven functions that dip into an event stream, right through to heavy, stateful services that can facilitate request-response. This practical talk makes the case for building this style of system using stream processing tools. We also walk through a number of patterns for how we actually put these things together.
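As a rough, hedged illustration of the stateful end of that spectrum, the sketch below uses the Kafka Streams DSL to build an event-driven service that keeps payment state in a local table and enriches incoming order events with it. The topic names ("orders", "payments", "validated-orders") and the string-encoded values are assumptions for illustration, not code from the talk.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;

    public class OrderValidationService {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-validation-service");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();

            // Latest payment state per order id, kept as a local, fault-tolerant table.
            KTable<String, String> payments = builder.table("payments");

            // Each order event is enriched with its payment state and re-emitted as a
            // new event for downstream services, instead of serving a request/response call.
            KStream<String, String> orders = builder.stream("orders");
            orders.leftJoin(payments, (order, payment) ->
                        payment == null ? order + ",UNPAID" : order + ",PAID")
                  .to("validated-orders");

            new KafkaStreams(builder.build(), props).start();
        }
    }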
Building Stateful Financial Applications with Kafka Streams
by Charles Reese, Senior Software Engineer, Funding Circle
video, slide
At Funding Circle, we are building a global lending platform with Apache Kafka and Kafka Streams to handle high-volume, real-time processing with rapid clearing times similar to a stock exchange. In this talk, we will provide an overview of our system architecture and summarize key results in edge-service connectivity, idempotent processing, and migration strategies.
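One common way to get idempotent processing on top of Kafka Streams is to deduplicate events on a business key in a local state store before applying them; the sketch below shows that idea using the classic Transformer API. The topic and store names are hypothetical, and the fragment assumes a StreamsBuilder/config setup (and imports) like the first sketch above; it is not Funding Circle's actual code.

    // Deduplicate payment instructions by key so that retries and replays
    // do not post the same ledger entry twice.
    StoreBuilder<KeyValueStore<String, String>> seenStore = Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("seen-payments"),
            Serdes.String(), Serdes.String());
    builder.addStateStore(seenStore);

    builder.<String, String>stream("payment-instructions")
        .transform(() -> new Transformer<String, String, KeyValue<String, String>>() {
            private KeyValueStore<String, String> seen;
            public void init(ProcessorContext context) {
                seen = (KeyValueStore<String, String>) context.getStateStore("seen-payments");
            }
            public KeyValue<String, String> transform(String key, String value) {
                if (seen.get(key) != null) return null;  // already processed: drop it
                seen.put(key, value);                    // remember it and forward exactly once
                return KeyValue.pair(key, value);
            }
            public void close() {}
        }, "seen-payments")
        .to("ledger-entries");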
Fast Data in Supply Chain Planning
by Jeroen Soeters, Lead Developer, ThoughtWorks
video, slide
We are migrating one of the top three consumer packaged goods companies from a batch-oriented systems architecture to a streaming microservices platform. In this talk I’ll explain how we leverage the Lightbend reactive stack and Kafka to achieve this, and how the four Kafka APIs fit into our architecture. I’ll also explain why Kafka Streams <3 Enterprise Integration Patterns.
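For reference, the four Kafka APIs are the Producer, Consumer, Streams, and Connect APIs. The hedged sketch below shows only the Producer API at the edge, with the roles of the other three noted in comments; the topic, key, and value are made-up examples, not details from the talk.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class EdgeEventPublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            // Producer API: edge services publish domain events rather than calling each other.
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("shipment-events", "order-42", "SHIPPED"));
            }

            // Consumer API: downstream services subscribe to the same topics at their own pace.
            // Streams API: stateful joins and aggregations over those topics (see the other sketches).
            // Connect API: ready-made source/sink connectors bridge the remaining batch systems.
        }
    }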
Kafka Stream Processing for Everyone with KSQL
by Nick Dearden, Director of Engineering, Confluent
video, slide
The rapidly expanding world of stream processing can be daunting, with new concepts (various types of time semantics, windowed aggregates, changelogs, and so on) and programming frameworks to master. KSQL is a new open-source project which aims to simplify all this and make stream processing available to everyone.
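To give a feel for what KSQL abstracts away, the fragment below writes a simple windowed aggregate by hand with the Kafka Streams DSL in Java. The KSQL statement in the comment is illustrative only (not taken from the talk), and the topic names, user-id keys, default String serdes, and the StreamsBuilder setup from the earlier sketch are all assumptions.

    // A KSQL statement roughly like this:
    //   CREATE TABLE clicks_per_user AS
    //     SELECT userid, COUNT(*) FROM pageviews
    //     WINDOW TUMBLING (SIZE 1 MINUTE) GROUP BY userid;
    // replaces hand-written DSL code along these lines:
    builder.<String, String>stream("pageviews")                  // key assumed to be the user id
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofMinutes(1)))
        .count()
        .toStream((windowedKey, count) -> windowedKey.key())     // drop the window from the key
        .to("clicks_per_user", Produced.with(Serdes.String(), Serdes.Long()));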
Portable Streaming Pipelines with Apache Beam
by Frances Perry, Software Engineer, Google
video, slide
Much as SQL stands as a lingua franca for declarative data analysis, Apache Beam aims to provide a portable standard for expressing robust, out-of-order data processing pipelines in a variety of languages across a variety of platforms. By cleanly separating the user’s processing logic from details of the underlying execution engine, the same pipelines will run on any Apache Beam runtime environment, whether it’s on-premise or in the cloud, on open source frameworks like Apache Spark or Apache Flink, or on managed services like Google Cloud Dataflow. In this talk, I will:
Briefly introduce the capabilities of the Beam model for data processing and its integration with IO connectors like Apache Kafka (a minimal sketch follows this list).
Discuss the benefits Beam provides regarding portability and ease-of-use.
Demo the same Beam pipeline running on multiple runners in multiple deployment scenarios (e.g. Apache Flink on Google Cloud, Apache Spark on AWS, Apache Apex on-premise).
Give a glimpse at some of the challenges Beam aims to address in the future.
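By way of illustration for the first two points, here is a minimal, hedged sketch of a Beam (Java SDK) pipeline that reads from Kafka with the KafkaIO connector and counts page views per minute. The topic, bootstrap servers, and output handling are placeholders, and the runner is selected entirely through the pipeline options passed at launch, which is the portability point of the talk.

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Count;
    import org.apache.beam.sdk.transforms.Values;
    import org.apache.beam.sdk.transforms.windowing.FixedWindows;
    import org.apache.beam.sdk.transforms.windowing.Window;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.joda.time.Duration;

    public class PageviewCounts {
        public static void main(String[] args) {
            // The runner (Flink, Spark, Dataflow, Apex, direct runner, ...) is chosen via
            // --runner=... on the command line; the pipeline code below does not change.
            PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
            Pipeline p = Pipeline.create(options);

            p.apply(KafkaIO.<String, String>read()
                        .withBootstrapServers("localhost:9092")
                        .withTopic("pageviews")
                        .withKeyDeserializer(StringDeserializer.class)
                        .withValueDeserializer(StringDeserializer.class)
                        .withoutMetadata())                         // KV<String, String> records
             .apply(Values.create())                                // keep just the page names
             .apply(Window.into(FixedWindows.of(Duration.standardMinutes(1))))
             .apply(Count.perElement());                            // per-window count per page
             // (a real pipeline would write these counts to a sink, e.g. back to Kafka)

            p.run().waitUntilFinish();
        }
    }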
Query the Application, Not a Database: “Interactive Queries” in Kafka’s Streams API
by Matthias Sax, Engineer, Confluent
video, slide
Kafka Streams lets you build scalable streaming apps without a cluster. This “Cluster-to-go” approach is extended by a “DB-to-go” feature: Interactive Queries let you query an app’s internal state directly, eliminating the need for an external DB to access this data. This avoids redundantly stored data and DB update latency, and it simplifies the overall architecture, e.g., for microservices.
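A minimal sketch of the idea: materialize an aggregate into a named state store and query it from inside the same application, for example behind an HTTP endpoint. The "page-views" topic, default String serdes, and the props/StreamsBuilder setup from the first sketch are assumptions, and the fragment uses the store() signature from the Kafka Streams releases current at the time of the talk (newer releases query via StoreQueryParameters).

    // Materialize a count per page into a named, queryable state store.
    builder.<String, String>stream("page-views")
           .groupByKey()
           .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("views-store"));

    KafkaStreams streams = new KafkaStreams(builder.build(), props);
    streams.start();

    // Later, e.g. inside an HTTP handler of this same service, read the local
    // state directly instead of asking an external database:
    ReadOnlyKeyValueStore<String, Long> views =
            streams.store("views-store", QueryableStoreTypes.<String, Long>keyValueStore());
    Long viewsOfPricingPage = views.get("/pricing");   // key is a made-up example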
Real-Time Document Rankings with Kafka Streams
by Hunter Kelly, Senior Software/Data Engineer, Zalando
video, slide
The HITS algorithm computes two scores for each document: one for “hubbiness”, the other for “authority”. Usually this is done as a batch operation, working on all the data at once. With careful consideration, however, it can be implemented in a streaming architecture using KStreams and KTables, allowing efficient real-time sampling of rankings at a frequency appropriate to the specific use case.
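As a hedged sketch of one building block of such an implementation (not the full iterative algorithm), the fragment below sums each page's inbound hub-score contributions with a KStream-KTable join followed by a re-keyed aggregation. The topics, the link encoding (key = source page, value = target page), and a separately maintained hub-score table are all assumptions, and the setup/imports are as in the first Kafka Streams sketch above.

    // One building block of streaming HITS: accumulate, per target page, the hub
    // scores of the pages that link to it (a running authority estimate).
    KStream<String, String> links = builder.stream("links");
    KTable<String, Double> hubs = builder.table("hub-scores",
            Consumed.with(Serdes.String(), Serdes.Double()));

    KTable<String, Double> authorities = links
        .join(hubs, (target, hubScore) -> KeyValue.pair(target, hubScore)) // attach the source's hub score
        .map((source, pair) -> pair)                                       // re-key by target page
        .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
        .reduce(Double::sum);                                              // sum inbound contributions

    authorities.toStream().to("authority-scores",
            Produced.with(Serdes.String(), Serdes.Double()));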
Streaming Processing in Python – 10 ways to avoid summoning Cuthulu
by Holden Karau, Principal Software Engineer, IBM
video, slide
<3 Python and want to process data from Kafka? This talk will look at how to make this awesome. In many systems the traditional approach involves first reading the data into the JVM and then passing it to Python, which can be a little slow and, on a bad day, almost impossible to debug. This talk will look at how to be more awesome in Spark and how to do this in Kafka Streams.