Kafka: The Definitive Guide, Chapter 1 "Meet Kafka": Questions and Answers

1. What is the publish/subscribe messaging model?

Publish/subscribe messaging is a pattern that is characterized by the sender (publisher) of a piece of data (message) not specifically directing it to a receiver. Instead, the publisher classifies the message somehow, and the receiver (subscriber) subscribes to receive certain classes of messages. Pub/sub systems often have a broker, a central point where messages are published, to facilitate this.

2. What are the advantages of a message queue over direct connections/RPC?

It decouples the two sides: publishers and subscribers no longer need direct connections to each other.

3. What is Kafka?

Apache Kafka is a publish/subscribe messaging system designed to solve this problem. It is often described as a “distributed commit log”. A filesystem or database commit log is designed to provide a durable record of all transactions so that they can be replayed to consistently build the state of a system. Similarly, data within Kafka is stored durably, in order, and can be read deterministically.

4. What does a Kafka message consist of?

key (optional): used for partitioning; messages with the same key are written to the same partition.

value: the content of the message.
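The key-to-partition mapping can be sketched as a stable hash of the key taken modulo the number of partitions. This is only an illustration: `choose_partition` is a hypothetical helper, and CRC32 is used here as a stand-in hash, not the hash function Kafka's own partitioner uses.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index with a stable hash.

    CRC32 is an assumption made for this sketch; it is a stand-in,
    not the hash used by Kafka's default partitioner.
    """
    return zlib.crc32(key) % num_partitions


keys = [b"user-1", b"user-2", b"user-1", b"user-3", b"user-1"]
assignments = [(key, choose_partition(key, 4)) for key in keys]

for key, partition in assignments:
    print(key.decode(), "-> partition", partition)
```

Because the hash depends only on the key, every `user-1` message lands in the same partition, no matter when it is sent.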

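5. What is a batch?

Sending every message on its own would mean a separate network round trip (and acknowledgment) per message. To write more efficiently and reduce network traffic, multiple messages can be sent together as one batch.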
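A minimal sketch of the batching idea, assuming a toy in-memory sender: `BatchingSender`, `batch_size`, and the `requests` list are all hypothetical names for this illustration and are not part of any Kafka client API. Each entry appended to `requests` stands for one network round trip.

```python
from collections import defaultdict


class BatchingSender:
    """Toy sender that buffers messages per (topic, partition)."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.buffers = defaultdict(list)
        self.requests = []  # each entry stands for one network round trip

    def send(self, topic: str, partition: int, value: bytes) -> None:
        buffer = self.buffers[(topic, partition)]
        buffer.append(value)
        # A full buffer is shipped as a single request.
        if len(buffer) >= self.batch_size:
            self._flush_one((topic, partition))

    def _flush_one(self, topic_partition) -> None:
        batch = self.buffers.pop(topic_partition, [])
        if batch:
            self.requests.append((topic_partition, batch))

    def flush(self) -> None:
        """Ship whatever is still buffered, even if the batch is not full."""
        for topic_partition in list(self.buffers):
            self._flush_one(topic_partition)


sender = BatchingSender(batch_size=3)
for i in range(7):
    sender.send("clicks", 0, f"event-{i}".encode())
sender.flush()

print(len(sender.requests), "requests for 7 messages")
```

The trade-off is latency versus throughput: a larger batch means fewer requests, but an individual message may wait longer before it is sent.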

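6. What format does a message have, as far as Kafka is concerned?

Raw bytes, so messages must be serialized before they are sent. JSON and XML are common, simple choices; the book notes that many Kafka developers favor Apache Avro.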
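As a rough illustration of what "serialize before sending" means, the sketch below turns a dictionary into JSON bytes and back using only the standard library. The `serialize` and `deserialize` helpers and the event fields are made up for this example.

```python
import json


def serialize(event: dict) -> bytes:
    """Encode an event as UTF-8 JSON bytes, the form a message value takes on the wire."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Decode UTF-8 JSON bytes back into a dictionary on the consumer side."""
    return json.loads(payload.decode("utf-8"))


event = {"user_id": 42, "action": "signup"}
payload = serialize(event)

print(type(payload).__name__, len(payload))
print(deserialize(payload) == event)
```

Producer and consumer have to agree on the format out of band, since the broker only sees opaque bytes.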

7. What is a topic?

A topic is the category of a message; consumers subscribe to messages by topic.

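8. Why are topics further divided into partitions?

For scalability: the partitions of a topic can be hosted on different machines.

Partitioning also has a downside: ordering is no longer guaranteed across the whole topic. Messages are ordered only within a single partition; when consuming from multiple partitions there is no global ordering guarantee.

There is also a Zhihu answer that explains this issue.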
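The ordering limitation can be reproduced with a small simulation. Here six messages, numbered in the order they were produced, are spread over two in-memory lists standing in for partitions; the modulo routing is an assumption chosen to keep the example deterministic.

```python
NUM_PARTITIONS = 2
partitions = [[] for _ in range(NUM_PARTITIONS)]

# Produce six messages in a known global order, alternating partitions.
for seq in range(6):
    partitions[seq % NUM_PARTITIONS].append(seq)

# Inside each partition, the produce order is preserved.
per_partition_ordered = all(log == sorted(log) for log in partitions)

# A consumer that drains partition 0 and then partition 1
# does not see the messages in their original global order.
consumed = [seq for log in partitions for seq in log]
globally_ordered = consumed == sorted(consumed)

print("ordered within each partition:", per_partition_ordered)
print("ordered across partitions:", globally_ordered)
print("consumed order:", consumed)
```

If a set of messages must stay in order relative to each other, giving them the same key keeps them in one partition.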

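9. Does a consumer need to store offsets while consuming?

Yes. A consumer needs to store an offset for each partition, that is, how far it has read in that partition. If the consumer restarts, it can then resume from the stored offset; without it, the consumer may skip messages or process them again.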
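The resume-after-restart behavior can be sketched with a toy consumer that tracks, per partition, the next offset to read. `OffsetTrackingConsumer` and its `committed` dictionary are hypothetical names for this illustration; a real consumer stores its offsets durably rather than in a Python dictionary.

```python
class OffsetTrackingConsumer:
    """Toy consumer that remembers the next offset to read for each partition."""

    def __init__(self, partitions, committed=None):
        self.partitions = partitions            # partition id -> list of messages
        self.committed = dict(committed or {})  # partition id -> next offset to read

    def poll(self, partition, max_messages=2):
        start = self.committed.get(partition, 0)
        batch = self.partitions[partition][start:start + max_messages]
        # Record progress so that a later restart can resume from here.
        self.committed[partition] = start + len(batch)
        return batch


log = {0: ["m0", "m1", "m2", "m3"], 1: ["n0", "n1"]}

first = OffsetTrackingConsumer(log)
first.poll(0)            # reads m0 and m1
saved = first.committed  # pretend this was stored durably before a crash

restarted = OffsetTrackingConsumer(log, committed=saved)
print(restarted.poll(0))
```

A consumer started without `saved` would begin again at offset 0 and reprocess `m0` and `m1`.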

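10. What is a consumer group?

A consumer group exists so that consumption can scale horizontally; it is a set of consumers that work together to consume a topic.

Within a group, each partition is consumed by exactly one consumer. This mapping is called ownership of the partition by the consumer.

If a consumer fails, the remaining consumers in the group take over its partitions.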
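The ownership rule and the takeover after a failure can be sketched as a simple reassignment. The round-robin `assign` function below is an assumption made for illustration; it is not meant to reproduce the assignment strategies that Kafka actually ships.

```python
def assign(partitions, consumers):
    """Give each partition to exactly one consumer, round-robin."""
    ownership = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        ownership[owner].append(partition)
    return ownership


partitions = [0, 1, 2, 3]

before = assign(partitions, ["c1", "c2", "c3"])
print("before failure:", before)

# Consumer c2 fails; the survivors are assigned all partitions again.
after = assign(partitions, ["c1", "c3"])
print("after failure: ", after)
```

In both assignments every partition has exactly one owner, so no message is consumed twice within the group and none is left unowned.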

11. What is a broker?

A single Kafka server is called a broker. The broker receives messages from producers, assigns offsets to them, and commits the messages to storage on disk. It also services consumers, responding to fetch requests for partitions and responding with the messages that have been committed to disk. Depending on the specific hardware and its performance characteristics, a single broker can easily handle thousands of partitions and millions of messages per second.
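The broker's two duties described above, assigning offsets on write and answering fetch requests, can be sketched with an in-memory toy. `ToyBroker` is a hypothetical class for illustration only: it keeps messages in Python lists instead of committing them to disk, and it ignores networking and replication entirely.

```python
class ToyBroker:
    """In-memory stand-in for a broker: append-only logs keyed by (topic, partition)."""

    def __init__(self):
        self.logs = {}  # (topic, partition) -> list of message values

    def produce(self, topic: str, partition: int, value: bytes) -> int:
        """Append a message and return the offset assigned to it."""
        log = self.logs.setdefault((topic, partition), [])
        log.append(value)
        return len(log) - 1

    def fetch(self, topic: str, partition: int, offset: int, max_messages: int = 10):
        """Return (offset, value) pairs starting at the requested offset."""
        log = self.logs.get((topic, partition), [])
        window = log[offset:offset + max_messages]
        return list(enumerate(window, start=offset))


broker = ToyBroker()
offsets = [broker.produce("clicks", 0, f"event-{i}".encode()) for i in range(4)]
print(offsets)
print(broker.fetch("clicks", 0, offset=2))
```

Offsets are just positions in the partition's log, which is why a consumer only needs to remember a single number per partition.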

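12. Can brokers be run as a cluster?

Yes, of course.

Within a cluster, one broker is automatically elected as the controller. It is responsible for administrative operations, such as assigning partitions to brokers and monitoring for broker failures.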

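13. What is a leader?

I have not fully understood this concept yet. As the book describes it, each partition is owned by a single broker in the cluster, and that broker is called the leader of the partition. A partition may also be assigned to multiple brokers, in which case it is replicated, and another broker can take over leadership if the leader fails.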

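14. What is retention?

Retention is how long messages are kept before they expire. Brokers retain messages either for a period of time or until the topic reaches a certain size, and different topics can be configured with different retention settings.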
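Time-based expiry can be sketched as a filter over timestamped messages. `apply_retention` and the integer timestamps are assumptions for this illustration; Kafka itself expires data in larger units (log segments) rather than message by message.

```python
def apply_retention(log, now, retention_seconds):
    """Keep only the messages whose timestamp falls inside the retention window."""
    cutoff = now - retention_seconds
    return [(timestamp, value) for timestamp, value in log if timestamp >= cutoff]


# (timestamp in seconds, value) pairs, oldest first
log = [(100, "old"), (500, "recent"), (900, "new")]

kept = apply_retention(log, now=1000, retention_seconds=600)
print(kept)
```

With a 600-second window evaluated at time 1000, anything stamped before 400 is dropped, so only the two newer messages remain.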

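15. What is Kafka used for in practice?

a. Activity tracking: for example, capturing user behavior in real time to feed real-time reports, machine learning, and so on.

b. Task queue: for example, sending registration emails.

c. Log collection, search, and monitoring.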

16. Which company open sourced Kafka?

LinkedIn.

17. Why is Kafka called Kafka?

I thought that since Kafka was a system optimized for writing, using a writer’s name would make sense. I had taken a lot of lit classes in college and liked Franz Kafka. Plus the name sounded cool for an open source project.

So basically there is not much of a relationship.

—Jay Kreps

    Original author: 衫秋南
    Original URL: https://zhuanlan.zhihu.com/p/27255092