Hadoop Cluster Configuration

June 9, 2019. Source: Spike_3154

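Installing and configuring a Hadoop cluster roughly follows these steps:

Choose one machine to act as the Master
On the Master node, set up the hadoop user, install the SSH server, and install the Java environment
On the Master node, install Hadoop and complete its configuration
On the other Slave nodes, set up the hadoop user, install the SSH server, and install the Java environment
Copy the /usr/local/hadoop directory from the Master node to the other Slave nodes
Start Hadoop on the Master node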

To tell the machines apart, first change the hostname of each host:

sudo vim /etc/hostname

Set the hostname of the Master host to Master and the hostname of the Slave host to Slave1.

Then update the hostname-to-IP mappings on both hosts:

sudo vim /etc/hosts

Edit the file so that it contains the following mappings:

192.168.128.128   Master
192.168.128.129   Slave1

Afterwards, the Master and Slave nodes should be able to ping each other.

ping Master -c 3   # ping only 3 times; otherwise press Ctrl+C to stop
ping Slave1 -c 3
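
Before pinging, it can help to confirm that the mappings really made it into the hosts file. The following is a minimal sketch in POSIX sh; the helper name hosts_has_entry is made up for illustration and is not part of the original setup.

```shell
# hosts_has_entry FILE IP NAME  (hypothetical helper, for illustration only)
# Succeeds if FILE has a non-comment line that maps IP to NAME.
hosts_has_entry() {
    awk -v ip="$2" -v name="$3" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example usage on a real node:
#   hosts_has_entry /etc/hosts 192.168.128.128 Master && ping -c 3 Master
#   hosts_has_entry /etc/hosts 192.168.128.129 Slave1 && ping -c 3 Slave1
```

The check only looks at the file contents; ping is still what confirms that the network path works.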

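Set up passwordless SSH login
First, generate a key pair on the Master node:

cd ~/.ssh               # if this directory does not exist, run ssh localhost once first
rm -f ./id_rsa*         # remove previously generated keys (if any)
ssh-keygen -t rsa       # just keep pressing Enter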

Then add the public key to the Master node's list of authorized keys, so that the Master can SSH into itself:

eversilver@debian:~/.ssh$ cat ./id_rsa.pub >> ~/.ssh/authorized_keys
eversilver@debian:~/.ssh$ ssh Master
The authenticity of host 'master (192.168.128.128)' can't be established.
ECDSA key fingerprint is 67:e8:69:98:28:91:04:20:5c:00:bb:6b:e8:bb:51:94.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.128.128' (ECDSA) to the list of known hosts.
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
You have new mail.
Last login: Mon May  8 09:59:42 2017 from 192.168.128.1
eversilver@debian:~$

In the same way, copy the public key to the Slave1 node and add it to the list of authorized keys there.

eversilver@debian:~$ scp /home/eversilver/.ssh/id_rsa.pub eversilver@Slave1:/home/eversilver
The authenticity of host 'slave1 (192.168.128.129)' can't be established.
ECDSA key fingerprint is 67:e8:69:98:28:91:04:20:5c:00:bb:6b:e8:bb:51:94.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.128.129' (ECDSA) to the list of known hosts.
eversilver@slave1's password: 
id_rsa.pub                                    100%  399     0.4KB/s   00:00

Run on Slave1:

eversilver@debian:/usr/local/hadoop$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
eversilver@debian:/usr/local/hadoop$ rm ~/id_rsa.pub 
eversilver@debian:/usr/local/hadoop$

Then log in from the Master:

eversilver@debian:~$ ssh Slave1
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
You have new mail.
Last login: Mon May  8 12:05:08 2017 from 192.168.128.1
eversilver@debian:~$

No password prompt appears, which shows that the configuration succeeded.
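
The same check can be done without an interactive login. This is a minimal sketch, assuming the OpenSSH client; the function name can_ssh_without_password and the SSH_BIN override are hypothetical and exist only so the check can be tried with a stand-in command.

```shell
# SSH_BIN is a hypothetical override for dry testing; it defaults to the real ssh client.
SSH_BIN=${SSH_BIN:-ssh}

# can_ssh_without_password HOST
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a zero exit status means key-based login works.
can_ssh_without_password() {
    "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example usage on the Master node:
#   for h in Master Slave1; do
#       if can_ssh_without_password "$h"; then
#           echo "$h: key-based login works"
#       else
#           echo "$h: login failed or still needs a password"
#       fi
#   done
```

Hadoop's start scripts log in to each node over SSH, so a failure here is worth fixing before moving on.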

Configure the Hadoop environment variables

export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

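Configure the distributed environment
Cluster/distributed mode requires editing five configuration files in /usr/local/hadoop/etc/hadoop: slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml. Only the settings required for a normal startup are configured here; see the official documentation for further options.

File slaves: write the hostnames of the machines that act as DataNodes into this file, one per line. The default is localhost, which is why in a pseudo-distributed setup the node acts as both NameNode and DataNode. In a distributed setup you can keep localhost, or delete it so that the Master node serves only as the NameNode.
This tutorial uses the Master node only as the NameNode, so delete the original localhost line and add a single line: Slave1.
Change the file core-site.xml to the following configuration: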

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://Master:9000</value>
        </property>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/local/hadoop/tmp</value>
                <description>Abase for other temporary directories.</description>
        </property>
</configuration>

File hdfs-site.xml: dfs.replication is usually set to 3, but since there is only one Slave node here, it is set to 1:

<configuration>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>Master:50090</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
</configuration>

File mapred-site.xml (in Hadoop 2.x you may first need to create it by renaming mapred-site.xml.template):

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>Master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>Master:19888</value>
        </property>
</configuration>

File yarn-site.xml:

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>Master</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>
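
After editing the four XML files, it is easy to lose track of which value ended up where. The following is a rough sketch for reading a single property back out of a file; get_hadoop_property is a made-up helper name, and the code assumes that each name and value tag sits on its own line, as in the listings above. It is a line-based shortcut, not a general XML parser.

```shell
# get_hadoop_property FILE NAME  (hypothetical helper, for illustration only)
# Prints the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
get_hadoop_property() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { want = 1; next }
        want && /<value>/ {
            sub(/^.*<value>/, "")       # drop everything up to the opening tag
            sub(/<\/value>.*$/, "")     # drop the closing tag and anything after it
            print
            exit
        }
    ' "$1"
}

# Example usage:
#   get_hadoop_property /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS
#   get_hadoop_property /usr/local/hadoop/etc/hadoop/hdfs-site.xml dfs.replication
```

On a node where Hadoop is installed, hdfs getconf -confKey fs.defaultFS reports the value that Hadoop itself resolves, which is the more authoritative check.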

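Copy the /usr/local/hadoop directory on the Master to each of the other nodes. Because pseudo-distributed mode was run earlier, it is advisable to delete the old temporary files before switching to cluster mode. Run on the Master node:

cd /usr/local
sudo rm -r ./hadoop/tmp     # delete Hadoop temporary files
sudo rm -r ./hadoop/logs/*   # delete log files
tar -zcf ~/hadoop.master.tar.gz ./hadoop   # compress first, then copy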
cd ~
scp ./hadoop.master.tar.gz Slave1:/home/hadoop

Run on the Slave1 node:

sudo rm -r /usr/local/hadoop    # delete the old copy (if it exists)
sudo tar -zxf ~/hadoop.master.tar.gz -C /usr/local
sudo chown -R hadoop /usr/local/hadoop

If there are other Slave nodes, repeat the same steps for each of them: transfer hadoop.master.tar.gz to the Slave node and unpack it there.
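
With more than one Slave node, the copy and unpack steps become repetitive. The sketch below only prints the commands for each host instead of running them, so it is safe to try; the function name distribute_hadoop and the second hostname Slave2 are illustrative assumptions, and the paths mirror the ones used above.

```shell
# distribute_hadoop ARCHIVE HOST...  (hypothetical helper, for illustration only)
# Prints the commands that would copy ARCHIVE to each host and unpack it there.
distribute_hadoop() {
    archive=$1
    shift
    for host in "$@"; do
        echo "scp $archive $host:/home/hadoop"
        echo "ssh -t $host 'sudo rm -rf /usr/local/hadoop && sudo tar -zxf ~/$(basename "$archive") -C /usr/local && sudo chown -R hadoop /usr/local/hadoop'"
    done
}

# Slave2 is a made-up second node, shown only to illustrate the loop.
distribute_hadoop ./hadoop.master.tar.gz Slave1 Slave2
```

Review the printed commands and run them by hand once they look right; each new Slave hostname also has to be added to the slaves file on the Master.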

Before the first start, format the NameNode on the Master node:

hdfs namenode -format       # needed only before the first run, not afterwards

Hadoop can now be started. This has to be done on the Master node:

start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver

The jps command shows the processes started on each node. If everything is correct, the Master node shows the NameNode, ResourceManager, SecondaryNameNode, and JobHistoryServer processes, as follows:

eversilver@debian:/usr/local/hadoop$ jps
24178 NameNode
24356 SecondaryNameNode
24501 ResourceManager
13974 QuorumPeerMain
21607 NodeManager
19880 DataNode
24809 JobHistoryServer
24842 Jps

On the Slave1 node you should see DataNode and NodeManager:

eversilver@debian:/usr/local/hadoop$ jps
6579 QuorumPeerMain
9044 DataNode
9141 NodeManager
7176 Kafka
9244 Jps
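
Comparing jps listings by eye gets tedious with more nodes. Here is a minimal sketch that reads jps output on standard input and prints every expected daemon that is missing; the helper name missing_daemons is an assumption made for illustration.

```shell
# missing_daemons "NAME NAME ..."  (hypothetical helper, for illustration only)
# Reads jps output on stdin and prints each expected daemon that is not running.
missing_daemons() {
    awk -v expected="$1" '
        { running[$2] = 1 }             # jps prints "<pid> <main class>"
        END {
            n = split(expected, names, " ")
            for (i = 1; i <= n; i++)
                if (!(names[i] in running)) print names[i]
        }
    '
}

# Example usage:
#   jps | missing_daemons "NameNode SecondaryNameNode ResourceManager JobHistoryServer"   # on Master
#   jps | missing_daemons "DataNode NodeManager"                                           # on Slave1
```

Empty output means every expected daemon was found; extra processes in the listing are simply ignored.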

You should also run hdfs dfsadmin -report on the Master node to check whether the DataNode started correctly. If the Live datanodes count is not 0, the cluster started successfully.

eversilver@debian:/usr/local/hadoop$ hdfs dfsadmin -report
Configured Capacity: 20091629568 (18.71 GB)
Present Capacity: 10727268352 (9.99 GB)
DFS Remaining: 10727243776 (9.99 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Live datanodes (1):
Name: 192.168.128.129:50010 (Slave1)
Hostname: Slave1
Decommission Status : Normal
Configured Capacity: 20091629568 (18.71 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 9364361216 (8.72 GB)
DFS Remaining: 10727243776 (9.99 GB)
DFS Used%: 0.00%
DFS Remaining%: 53.39%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 08 14:11:54 CST 2017
eversilver@debian:/usr/local/hadoop$

The report above shows 1 live DataNode.
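
For a scripted health check, the live DataNode count can be pulled out of the report. This is a small sketch that assumes the "Live datanodes (N):" line format shown in the output above; live_datanodes is a made-up helper name.

```shell
# live_datanodes  (hypothetical helper, for illustration only)
# Reads an hdfs dfsadmin -report listing on stdin and prints N from the
# "Live datanodes (N):" line, or 0 if no such line is present.
live_datanodes() {
    awk '
        /^Live datanodes \([0-9]+\):/ {
            gsub(/[^0-9]/, "")          # keep only the digits of this line
            count = $0
        }
        END { print count + 0 }
    '
}

# Example usage on the Master node:
#   if [ "$(hdfs dfsadmin -report | live_datanodes)" -gt 0 ]; then
#       echo "cluster is up"
#   fi
```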

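You can also check the status of the DataNode and NameNode through the web page: http://Master:50070/.

Notes on switching between pseudo-distributed and distributed configurations

When switching from distributed to pseudo-distributed mode, do not forget to update the slaves configuration file.
When switching between the two, if the cluster fails to start, you can delete the temporary directories on the nodes involved. This deletes the existing data, but it usually lets the cluster start cleanly. So if the cluster used to start but no longer does, and in particular if the DataNode fails to start, try deleting the /usr/local/hadoop/tmp directory on all nodes (including the Slave nodes), run hdfs namenode -format again, and then restart.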

Run a distributed example
First create the user directory on HDFS:

hdfs dfs -mkdir -p /user/hadoop

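Copy the configuration files in /usr/local/hadoop/etc/hadoop into the distributed file system as input files. Checking the DataNode status afterwards (the space used changes) confirms that the input files were indeed copied to the DataNode: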

hdfs dfs -mkdir input
hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input

Now the MapReduce job can be run:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep input output 'dfs[a-z.]+'

The output during the run is similar to that of pseudo-distributed mode and shows the progress of the job.
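
To get a feel for what the grep example computes, the same counting can be approximated locally on the input files. This is only a sketch of the idea (extract every match of the pattern, then count occurrences, most frequent first), not how the MapReduce job is implemented; count_matches is a made-up name, and the -o and -h options assume GNU or BSD grep.

```shell
# count_matches PATTERN FILE...  (hypothetical helper, for illustration only)
# Extracts every match of the extended regular expression PATTERN from the
# given files and prints each distinct match with its count, highest first.
count_matches() {
    pattern=$1
    shift
    grep -Eoh -- "$pattern" "$@" | sort | uniq -c | sort -k1,1nr -k2
}

# Example usage on the Master node:
#   count_matches 'dfs[a-z.]+' /usr/local/hadoop/etc/hadoop/*.xml
```

Comparing this local result with the job's output directory is a quick sanity check that the job read the files that were uploaded.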

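You can also follow the job's progress through the web interface at http://master:8088/cluster. Clicking the History link in the "Tracking UI" column shows the job's run information, as in the figure below:

[Figure: job run information in the web interface]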

Output after the job finishes:

[Figure: output of the finished job]

Shutting down the Hadoop cluster is also done on the Master node:

stop-yarn.sh
stop-dfs.sh
mr-jobhistory-daemon.sh stop historyserver

As with pseudo-distributed mode, you can also choose not to start YARN, but remember to rename the mapred-site.xml file in that case.

    Original author: Spike_3154
    Original URL: https://www.jianshu.com/p/f83229af1898