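Building a Kafka Cluster with Docker
Environment Preparation
Operating System
CentOS 7.6
Installing Docker
Follow the Docker installation guide.
Single Instance (Without Docker)
Installing the JDK
Download the JDK 1.8 tar.gz from the official site. Installing via yum or from an rpm package leaves out some of the files that Scala 2.11 needs.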
mkdir -p /usr/lib/jvm
tar xf jdk-8u221-linux-x64.tar.gz -C /usr/lib/jvm
rm -rf /usr/bin/java
ln -s /usr/lib/jvm/jdk1.8.0_221/bin/java /usr/bin/java
Edit the file
vim /etc/profile.d/java.sh
Add the following
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_221
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=${JAVA_HOME}/lib:${JRE_HOME}/lib:$CLASSPATH
export PATH=${JAVA_HOME}/bin:$PATH
Then load the environment variables
source /etc/profile
Run the following command to verify the environment variables
[root@vm1 bin]# echo $JAVA_HOME
/usr/lib/jvm/jdk1.8.0_221
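Downloading the Package
Get the download URL from the official site. The closer.cgi mirror link returns an HTML page rather than the archive itself, so fetch the file from the Apache archive directly.
wget https://archive.apache.org/dist/kafka/2.3.0/kafka_2.11-2.3.0.tgz
Extract the archive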
tar xf kafka_2.11-2.3.0.tgz -C /opt/
Starting the Processes
Kafka depends on ZooKeeper at startup, so start ZooKeeper first
cd /opt/kafka_2.11-2.3.0/bin
./zookeeper-server-start.sh -daemon ../config/zookeeper.properties
Once ZooKeeper is up, start the Kafka process
./kafka-server-start.sh -daemon ../config/server.properties
Check ports 2181 and 9092 to confirm that ZooKeeper and Kafka have started
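One way to perform this check without extra tools is bash's built-in /dev/tcp redirection. The sketch below is only an illustration and not part of the original procedure: the function name port_open is made up for this example, and it assumes bash plus the coreutils timeout command.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# Hypothetical helper for illustration; relies on bash's /dev/tcp pseudo-device.
port_open() {
  local host=$1 port=$2
  # Attempt the connection in a child bash so the socket closes when it exits,
  # and give up after 2 seconds instead of hanging on a filtered port.
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# 2181 is ZooKeeper's client port, 9092 is Kafka's default listener port.
for port in 2181 9092; do
  if port_open localhost "$port"; then
    echo "port $port: open"
  else
    echo "port $port: closed"
  fi
done
```

If either port is reported as closed, the corresponding process most likely failed to start; the logs directory under the Kafka installation is the place to look.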
Testing
# Create a topic
./kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
# List topics
./kafka-topics.sh --list --bootstrap-server localhost:9092
# Produce messages
./kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
# Consume the messages
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Make sure all of the operations above work correctly
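Cluster Installation (Without Docker)
Preparation
Start three virtual machines, vm1, vm2, and vm3, on the same subnet. Install each of them following the single-instance procedure.
Starting the ZooKeeper Cluster
For the ZooKeeper cluster configuration, refer to the ZooKeeper cluster setup guide.
You can build the cluster with the ZooKeeper bundled with Kafka, or use a Docker-based ZooKeeper cluster.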
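The rest of this procedure uses the Docker-based ZooKeeper cluster, so the connection string is "localhost:2181,localhost:2182,localhost:2183". On a machine that does not host the ZooKeeper containers, localhost will not reach them; substitute the hostname of the machine that does.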
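Changing the Configuration
- Each Kafka broker needs a distinct broker id. Here the instances on vm1, vm2, and vm3 get broker ids 1, 2, and 3 respectively.
- Kafka's ZooKeeper address must be set to the cluster address.
- The meta.properties file under /tmp/kafka-logs records the broker id. It can conflict with the newly configured broker id, so it can be deleted.
Run the following commands to apply the changes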
#vm1
sed -i 's/broker.id=0/broker.id=1/g' /opt/kafka_2.11-2.3.0/config/server.properties
sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=localhost:2181,localhost:2182,localhost:2183\/kafka/g' /opt/kafka_2.11-2.3.0/config/server.properties
rm -rf /tmp/kafka-logs
#vm2
sed -i 's/broker.id=0/broker.id=2/g' /opt/kafka_2.11-2.3.0/config/server.properties
sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=localhost:2181,localhost:2182,localhost:2183\/kafka/g' /opt/kafka_2.11-2.3.0/config/server.properties
rm -rf /tmp/kafka-logs
#vm3
sed -i 's/broker.id=0/broker.id=3/g' /opt/kafka_2.11-2.3.0/config/server.properties
sed -i 's/zookeeper.connect=localhost:2181/zookeeper.connect=localhost:2181,localhost:2182,localhost:2183\/kafka/g' /opt/kafka_2.11-2.3.0/config/server.properties
rm -rf /tmp/kafka-logs
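Before deleting /tmp/kafka-logs, it can be useful to confirm that the stored broker id really differs from the configured one. The following is a minimal sketch under stated assumptions: the function names get_broker_id and check_broker_id are hypothetical, and the paths are the defaults used in this article.

```shell
#!/usr/bin/env bash
# get_broker_id FILE: print the value of broker.id from a properties file.
get_broker_id() {
  sed -n 's/^broker\.id=//p' "$1"
}

# check_broker_id CONFIG META: compare the configured id with the stored one.
# Prints "no-meta" when META does not exist, otherwise "match" or "mismatch".
check_broker_id() {
  local config=$1 meta=$2
  if [ ! -f "$meta" ]; then
    echo "no-meta"
  elif [ "$(get_broker_id "$config")" = "$(get_broker_id "$meta")" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

# Paths assumed from the installation steps in this article.
check_broker_id /opt/kafka_2.11-2.3.0/config/server.properties /tmp/kafka-logs/meta.properties
```

A result of "mismatch" means the data directory was written by a broker with a different id. Removing the directory, as the commands above do, discards any messages stored there, which is acceptable for a fresh test setup.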
Starting
Run the start command on vm1, vm2, and vm3
./kafka-server-start.sh -daemon ../config/server.properties
Run jps to check that the process exists
Then verify through ZooKeeper
docker run -it --rm zookeeper zkCli.sh -server vm1:2181
Once inside, type
ls /kafka/brokers/ids
Check that all three nodes 1, 2, and 3 are listed. If some are missing, a restart sometimes resolves it.
Testing
# Create a topic
./kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 3 --topic test-cluster
# List topics
./kafka-topics.sh --list --bootstrap-server localhost:9092
# Produce messages
./kafka-console-producer.sh --broker-list localhost:9092 --topic test-cluster
This is a message
This is another message
# Consume the messages
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-cluster --from-beginning
Docker Single Instance
Starting the ZooKeeper Cluster
Reuse the ZooKeeper cluster set up above
Starting the Image
docker run -d --name kafka --hostname kafka \
-p 9092:9092 --restart=always \
-e KAFKA_ADVERTISED_HOST_NAME=vm1 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=vm1:2181,vm1:2182,vm1:2183 \
wurstmeister/kafka:latest
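KAFKA_ADVERTISED_HOST_NAME is set to the address of the Docker host
KAFKA_ADVERTISED_PORT is set to the port exposed on the Docker host
Most settings in server.properties can be set through environment variables. For example, KAFKA_ADVERTISED_HOST_NAME corresponds to advertised.host.name.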
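The naming convention behind this mapping can be sketched in a few lines of shell: drop the KAFKA_ prefix, lowercase the rest, and turn underscores into dots. This only illustrates the convention and is not the image's actual startup script; the function name kafka_env_to_prop is hypothetical.

```shell
#!/usr/bin/env bash
# kafka_env_to_prop NAME: map an environment variable name such as
# KAFKA_ADVERTISED_HOST_NAME to the property name advertised.host.name.
kafka_env_to_prop() {
  local name=${1#KAFKA_}                      # drop the KAFKA_ prefix
  printf '%s\n' "$name" | tr 'A-Z_' 'a-z.'    # lowercase, "_" becomes "."
}

# Show the mapping for the variables used in the docker run command above.
for var in KAFKA_ADVERTISED_HOST_NAME KAFKA_ADVERTISED_PORT KAFKA_ZOOKEEPER_CONNECT; do
  echo "$var -> $(kafka_env_to_prop "$var")"
done
```

Reading the mapping in the other direction gives the variable to set for any property, for example broker.id becomes KAFKA_BROKER_ID.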
Testing
# Create a topic
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server vm1:9092 --replication-factor 1 --partitions 1 --topic test
# List topics
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server vm1:9092
# Produce messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-producer.sh --broker-list vm1:9092 --topic test
This is a message
This is another message
# Consume the messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server vm1:9092 --topic test --from-beginning
Monitoring Node
docker run -itd --name kafka-manager --hostname kafka-manager \
-p 9000:9000 --restart=always \
-e ZK_HOSTS=vm1:2181,vm1:2182,vm1:2183 \
sheepkiller/kafka-manager
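Access the monitoring node at http://vm1:9000
Overlay Network Cluster
Preparing the Base Environment
Deploy three machines: vm1, vm2, and vm3. The base environment is the same as for the single instance.
Creating the Swarm Overlay Network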
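The commands for this step are not listed, so the following is a sketch of one way to do it, under assumptions. The network name "overlay" is taken from the --network=overlay option used in the docker run commands later on; the run wrapper, the setup_overlay function, and the DRY_RUN variable are hypothetical helpers that print the commands by default so the snippet can be inspected safely.

```shell
#!/usr/bin/env bash
# Print the commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_overlay() {
  # On vm1: initialise the swarm, which makes vm1 a manager node.
  # Hosts with several network interfaces may also need --advertise-addr.
  run docker swarm init
  # On vm2 and vm3: run the "docker swarm join --token ..." command that
  # "docker swarm init" prints. The token is specific to each swarm.
  # On vm1: create an overlay network named "overlay" that standalone
  # containers started with "docker run" are allowed to attach to.
  run docker network create --driver overlay --attachable overlay
}

setup_overlay
```

The --attachable flag matters here because the containers in this section are started with docker run rather than as swarm services.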
Creating the Data Directories
Create the directories on vm1, vm2, and vm3 respectively
#vm1
mkdir -p /opt/volumns/kafka-1/kafka-logs-kafka
mkdir -p /opt/volumns/kafka-1/logs
#vm2
mkdir -p /opt/volumns/kafka-2/kafka-logs-kafka
mkdir -p /opt/volumns/kafka-2/logs
#vm3
mkdir -p /opt/volumns/kafka-3/kafka-logs-kafka
mkdir -p /opt/volumns/kafka-3/logs
Starting ZooKeeper
#vm1
docker run -d --name=zookeeper-1 --hostname=zookeeper-1 \
--network=overlay --restart=always \
-p 2181:2181 -p 8080:8080 -e ZOO_MY_ID=1 \
-e ZOO_SERVERS="server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181" \
-v /etc/localtime:/etc/localtime \
zookeeper
#vm2
docker run -d --name=zookeeper-2 --hostname=zookeeper-2 \
--network=overlay --restart=always \
-p 2181:2181 -p 8080:8080 -e ZOO_MY_ID=2 \
-e ZOO_SERVERS="server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181" \
-v /etc/localtime:/etc/localtime \
zookeeper
#vm3
docker run -d --name=zookeeper-3 --hostname=zookeeper-3 \
--network=overlay --restart=always \
-p 2181:2181 -p 8080:8080 -e ZOO_MY_ID=3 \
-e ZOO_SERVERS="server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181" \
-v /etc/localtime:/etc/localtime \
zookeeper
ZooKeeper must be started inside the overlay network here
Kafka Start Commands
#vm1
docker run -d --name kafka-1 --hostname kafka-1 -p 9092:9092 \
--restart=always --network=overlay \
-e KAFKA_ADVERTISED_HOST_NAME=vm1 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 \
-e KAFKA_HOST_NAME=0.0.0.0 -e KAFKA_BROKER_ID=1 \
-v /opt/volumns/kafka-1/kafka-logs-kafka:/kafka/kafka-logs-kafka \
-v /opt/volumns/kafka-1/logs:/opt/kafka/logs \
-v /etc/localtime:/etc/localtime \
wurstmeister/kafka:latest
#vm2
docker run -d --name kafka-2 --hostname kafka-2 -p 9092:9092 \
--restart=always --network=overlay \
-e KAFKA_ADVERTISED_HOST_NAME=vm2 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 \
-e KAFKA_HOST_NAME=0.0.0.0 -e KAFKA_BROKER_ID=2 \
-v /opt/volumns/kafka-2/kafka-logs-kafka:/kafka/kafka-logs-kafka \
-v /opt/volumns/kafka-2/logs:/opt/kafka/logs \
-v /etc/localtime:/etc/localtime \
wurstmeister/kafka:latest
#vm3
docker run -d --name kafka-3 --hostname kafka-3 -p 9092:9092 \
--restart=always --network=overlay \
-e KAFKA_ADVERTISED_HOST_NAME=vm3 -e KAFKA_ADVERTISED_PORT=9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181 \
-e KAFKA_HOST_NAME=0.0.0.0 -e KAFKA_BROKER_ID=3 \
-v /opt/volumns/kafka-3/kafka-logs-kafka:/kafka/kafka-logs-kafka \
-v /opt/volumns/kafka-3/logs:/opt/kafka/logs \
-v /etc/localtime:/etc/localtime \
wurstmeister/kafka:latest
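- KAFKA_ADVERTISED_HOST_NAME must be the IP of the Docker host or a DNS name, not the container's hostname. The advertised settings are the address returned to clients for data access, so they must be an IP and port that clients can reach.
- KAFKA_ADVERTISED_PORT must be the port on the Docker host or DNS address, for the same reason.
- KAFKA_HOST_NAME must be 0.0.0.0 so that the port published with -p can forward to Kafka's address.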
Checking the Nodes
Check that all Kafka nodes are registered in ZooKeeper
docker run -it --rm zookeeper zkCli.sh -server vm1:2181
Once inside, run
ls /brokers/ids
Output
[1, 2, 3]
Testing
# Create a topic
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server vm1:9092,vm2:9092,vm3:9092 --replication-factor 3 --partitions 3 --topic test
# List topics
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server vm1:9092,vm2:9092,vm3:9092
# Produce messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-producer.sh --broker-list vm1:9092,vm2:9092,vm3:9092 --topic test
This is a message
This is another message
# Consume the messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server vm1:9092,vm2:9092,vm3:9092 --topic test --from-beginning
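Building the Cluster with Docker Stack
Starting ZooKeeper
For the ZooKeeper cluster configuration, refer to the ZooKeeper cluster setup guide.
See the Docker Stack cluster deployment described in that guide.
Use the directories and configuration from that guide, but do not start ZooKeeper separately. Its startup configuration goes into the YAML file below together with Kafka's, and the two run as a single stack.
Writing the Configuration
Docker Stack does not support the depends_on syntax, so Kafka is not declared as depending on ZooKeeper here.
You can look into the dockerize tool to implement the dependency yourself.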
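As a plain-bash stand-in for what such a tool does, the sketch below polls a TCP port until it accepts connections or a timeout expires. It is not dockerize itself; the function name wait_for_port is hypothetical, and it assumes bash and the coreutils timeout command are available in the container.

```shell
#!/usr/bin/env bash
# wait_for_port HOST PORT [LIMIT]: poll until HOST:PORT accepts TCP
# connections, or fail once LIMIT seconds (default 60) have elapsed.
wait_for_port() {
  local host=$1 port=$2 limit=${3:-60}
  local deadline=$((SECONDS + limit))
  # Each attempt runs in a child bash so the test socket is closed right away.
  until timeout 1 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      echo "timed out waiting for ${host}:${port}" >&2
      return 1
    fi
    sleep 1
  done
}

# Example (commented out so the snippet has no side effects): block until
# ZooKeeper answers, then hand over to the broker's start command.
# The start command name is an assumption about the image, adjust as needed.
# wait_for_port zookeeper-1 2181 60 && exec start-kafka.sh
```

Because the restart policy in the stack file restarts failed containers, the stack usually converges even without such a wait; the wait only avoids the initial failed start attempts.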
version: "3"
services:
  zookeeper-1:
    image: zookeeper
    hostname: zookeeper-1
    networks:
      - overlay
    ports:
      - 2181:2181
      - 8080:8080
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/zookeeper-1/data:/data
      - /opt/volumns/zookeeper-1/datalog:/datalog
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm1
  zookeeper-2:
    image: zookeeper
    hostname: zookeeper-2
    networks:
      - overlay
    ports:
      - 2182:2181
      - 8081:8080
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/zookeeper-2/data:/data
      - /opt/volumns/zookeeper-2/datalog:/datalog
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm2
  zookeeper-3:
    image: zookeeper
    hostname: zookeeper-3
    networks:
      - overlay
    ports:
      - 2183:2181
      - 8082:8080
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zookeeper-1:2888:3888;2181 server.2=zookeeper-2:2888:3888;2181 server.3=zookeeper-3:2888:3888;2181
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/zookeeper-3/data:/data
      - /opt/volumns/zookeeper-3/datalog:/datalog
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm3
  kafka-1:
    image: wurstmeister/kafka
    hostname: kafka-1
    networks:
      - overlay
    ports:
      - 9092:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=vm1
      - KAFKA_ADVERTISED_PORT=9092
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      - KAFKA_HOST_NAME=0.0.0.0
      - KAFKA_BROKER_ID=1
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/kafka-1/kafka-logs-kafka:/kafka/kafka-logs-kafka
      - /opt/volumns/kafka-1/logs:/opt/kafka/logs
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm1
  kafka-2:
    image: wurstmeister/kafka
    hostname: kafka-2
    networks:
      - overlay
    ports:
      - 9093:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=vm1
      - KAFKA_ADVERTISED_PORT=9093
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      - KAFKA_HOST_NAME=0.0.0.0
      - KAFKA_BROKER_ID=2
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/kafka-2/kafka-logs-kafka:/kafka/kafka-logs-kafka
      - /opt/volumns/kafka-2/logs:/opt/kafka/logs
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm2
  kafka-3:
    image: wurstmeister/kafka
    hostname: kafka-3
    networks:
      - overlay
    ports:
      - 9094:9092
    environment:
      - KAFKA_ADVERTISED_HOST_NAME=vm1
      - KAFKA_ADVERTISED_PORT=9094
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181
      - KAFKA_HOST_NAME=0.0.0.0
      - KAFKA_BROKER_ID=3
    volumes:
      - /etc/localtime:/etc/localtime
      - /opt/volumns/kafka-3/kafka-logs-kafka:/kafka/kafka-logs-kafka
      - /opt/volumns/kafka-3/logs:/opt/kafka/logs
    deploy:
      restart_policy:
        condition: on-failure
      replicas: 1
      placement:
        constraints:
          - node.hostname==vm3
networks:
  overlay:
    driver: overlay
Starting the Cluster
docker stack deploy -c kafka.yaml kafka
Checking the Nodes
Check that all Kafka nodes are registered in ZooKeeper
docker run -it --rm zookeeper zkCli.sh -server vm2:2182
Once inside, run
ls /brokers/ids
Output
[1, 2, 3]
Testing
# Create a topic
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server vm1:9092,vm1:9093,vm1:9094 --replication-factor 3 --partitions 3 --topic test
# List topics
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server vm1:9092,vm1:9093,vm1:9094
# Produce messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-producer.sh --broker-list vm1:9092,vm1:9093,vm1:9094 --topic test
This is a message
This is another message
# Consume the messages
docker run -it --rm wurstmeister/kafka:latest /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server vm1:9092,vm1:9093,vm1:9094 --topic test --from-beginning