# YARN-Based Hadoop Distributed Cluster Installation

## Prerequisites

### Preparation

**Edit the hosts file**, appending each node's IP address and corresponding hostname to the end of the file:

* `vim /etc/hosts`

```
192.168.2.8 master-8
192.168.2.5 slave-5
192.168.2.9 slave-9
```
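
Before moving on, it helps to confirm that the new hostnames actually resolve from every node. A minimal sketch using this article's hostnames:

```
# each hostname should resolve and answer a single ping
for h in master-8 slave-5 slave-9; do
    ping -c 1 "$h" > /dev/null && echo "$h is reachable"
done
```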

**Passwordless SSH between all cluster machines**
* Install SSH and generate a key pair on each node ([tutorial](http://www.jianshu.com/p/c3c87697d93c)); a key-generation sketch follows the code block below
* Send each slave node's public key to the master node:

```
scp ~/.ssh/id_rsa.pub root@master-8:~/.ssh/id_rsa.pub.slave-5
scp ~/.ssh/id_rsa.pub root@master-8:~/.ssh/id_rsa.pub.slave-9
```
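
For reference, the key pair that the linked tutorial walks through can also be generated non-interactively. A minimal sketch, assuming OpenSSH, run on every node before the `scp` step above:

```
# creates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub with an empty passphrase
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
```
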
* On the master, append the received slave public keys, together with the master's own public key, to the `authorized_keys` file:

```
cat ~/.ssh/id_rsa.pub* >> ~/.ssh/authorized_keys
```

* Distribute the authentication file to each slave node:

```
scp ~/.ssh/authorized_keys root@slave-5:~/.ssh/
scp ~/.ssh/authorized_keys root@slave-9:~/.ssh/
```

* Verify that every node can now be reached without a password:

```
ssh master-8
ssh slave-5
ssh slave-9
```
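
If any of the logins above still prompts for a password, the usual culprit is the permissions on `~/.ssh`: sshd refuses keys stored in files that are group- or world-writable. A sketch of the commonly required settings, applied on every node:

```
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```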

**Install Java (JDK)** on every node.
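
A quick check that the JDK is in place and matches the `JAVA_HOME` this article configures later in `hadoop-env.sh` (the `/usr/local/jdk1.8.0_111` path is the one assumed throughout):

```
# run on every node; the command should report Java 1.8
/usr/local/jdk1.8.0_111/bin/java -version
```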

## Download

* Official Apache Hadoop download page
* Version used in this article: hadoop-2.7.3.tar.gz
* `wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz`
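
A quick sanity check that the archive downloaded intact before unpacking (a sketch):

```
# listing the archive fails loudly if the download was truncated or corrupted
tar -tzf hadoop-2.7.3.tar.gz > /dev/null && echo "archive OK"
```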

## Install

* `tar zxvf hadoop-2.7.3.tar.gz -C /usr/local`

## Configure

* Edit the Hadoop configuration files:
* `cd /usr/local/hadoop-2.7.3/etc/hadoop`
* Set `JAVA_HOME` in both `hadoop-env.sh` and `yarn-env.sh`:

```
export JAVA_HOME=/usr/local/jdk1.8.0_111
```

* Add the slaves' IPs or hostnames to `slaves`:

```
slave-5
slave-9
```

* Configure `core-site.xml`:

```xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master-8:9000/</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop-2.7.3/tmp</value>
    </property>
</configuration>
```

* Configure `hdfs-site.xml`:

```xml
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master-8:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop-2.7.3/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop-2.7.3/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
```
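
The local paths referenced by `hadoop.tmp.dir`, `dfs.namenode.name.dir`, and `dfs.datanode.data.dir` above are often created up front. A minimal sketch (run on the master; the directories reach the slaves when the tree is distributed later):

```
mkdir -p /usr/local/hadoop-2.7.3/tmp
mkdir -p /usr/local/hadoop-2.7.3/dfs/name
mkdir -p /usr/local/hadoop-2.7.3/dfs/data
```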

* Configure `mapred-site.xml` (create it from the bundled template first):
* `sudo cp mapred-site.xml.template mapred-site.xml`

```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```

* Configure `yarn-site.xml`:

```xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master-8:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master-8:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master-8:8035</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master-8:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master-8:8088</value>
    </property>
</configuration>
```
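
A typo in any of these XML files only surfaces when the daemons start, so a quick well-formedness check can save a restart cycle. A sketch, assuming `xmllint` is available:

```
xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml
```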

## Distribute

Copy the configured Hadoop tree from the master to each slave:

```
cd /usr/local
scp -r hadoop-2.7.3/ root@slave-5:/usr/local
scp -r hadoop-2.7.3/ root@slave-9:/usr/local
```
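
Since `hadoop-env.sh` travels with the copied tree, the `JAVA_HOME` path it points to has to exist on the slaves as well; a quick check (a sketch):

```
ssh slave-5 "ls /usr/local/jdk1.8.0_111/bin/java"
ssh slave-9 "ls /usr/local/jdk1.8.0_111/bin/java"
```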


## Start

Format the NameNode (first run only), then start HDFS and YARN from the master:

```
cd /usr/local/hadoop-2.7.3
bin/hadoop namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh
```
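
If startup succeeded, `jps` should show the expected daemons on each node: NameNode, SecondaryNameNode, and ResourceManager on the master; DataNode and NodeManager on the slaves. A sketch (the explicit JDK path covers the case where `jps` is not on the remote PATH):

```
jps
ssh slave-5 /usr/local/jdk1.8.0_111/bin/jps
ssh slave-9 /usr/local/jdk1.8.0_111/bin/jps
```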

## Verify

* Open the ResourceManager web UI at `http://master-8:8088` (or `http://localhost:8088` on the master itself).
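
Beyond the web UI, an end-to-end check can be run from the master using the example job bundled with the 2.7.3 tarball; a sketch:

```
cd /usr/local/hadoop-2.7.3
# both reports should list the two slave nodes
bin/hdfs dfsadmin -report
bin/yarn node -list
# the example job should complete and print an estimate of pi
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10
```
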
Original author: yaohwang
Original article: https://www.jianshu.com/p/4155a75ce7b8