Building a Spark Cluster, Part 01 [VM15 + CentOS 7 + Hadoop + Scala + Spark + Zookeeper + HBase + Hive]

1. Purpose

This post records my self-study notes on building a Spark cluster framework and experimenting with it.

2. Preparation

  1. VMware 15 Pro
  2. CentOS 7
  3. JDK 1.8
  4. Hadoop 2.7.2
  5. SecureCRT version 8.5
  6. Scala 2.12.7
  7. Spark 2.3.1
  8. Zookeeper 3.4.10
  9. HBase 2.0.2
  10. Hive 2.3.4

3. Installation Process

3.1 Installing CentOS 7 in a Virtual Machine

3.1.1 Virtual machine setup

Open VMware 15 Pro and create a new virtual machine.


Choose the Typical installation.


Select "I will install the operating system later", using the CentOS 7 image already downloaded locally.


3.1.2 Installing the Linux system

Load the CentOS 7 installation image.


Power on the virtual machine; the system image is loaded automatically.


Configure the CentOS 7 installation settings.


The default software selection is "Minimal Install", which would require installing many packages by hand afterwards, so switch it to "GNOME Desktop".


User settings: to avoid repeatedly switching users and permissions while building the Hadoop cluster later, you can create only the root account.


Finish the installation.


3.2 Java Environment

3.2.1 Uninstall the JDK bundled with Linux

Check the JDK that ships with the system:

[root@master ~]# java -version
openjdk version "1.8.0_161"
OpenJDK Runtime Environment (build 1.8.0_161-b14)
OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)

List the bundled Java packages. Depending on the system version, run rpm -qa | grep jdk or rpm -qa | grep java:

[root@master ~]# rpm -qa | grep jdk
java-1.7.0-openjdk-headless-1.7.0.171-2.6.13.2.el7.x86_64
java-1.8.0-openjdk-headless-1.8.0.161-2.b14.el7.x86_64
java-1.7.0-openjdk-1.7.0.171-2.6.13.2.el7.x86_64
java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64
copy-jdk-configs-3.3-2.el7.noarch

Remove every package except the noarch one, using rpm -e --nodeps followed by the package name to uninstall:

[root@master ~]# rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.171-2.6.13.2.el7.x86_64
[root@master ~]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.161-2.b14.el7.x86_64
[root@master ~]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.171-2.6.13.2.el7.x86_64
[root@master ~]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64

Check that the removal is complete:

[root@master ~]# java -version
bash: /usr/bin/java: No such file or directory

3.2.2 Download and install the latest JDK

There are two ways to get the JDK onto the machine:
A. Use the Firefox browser inside the virtual machine to download the JDK file directly into the VM.


By default the file is saved to the Downloads directory of the Linux system.


B. Download the JDK to the local Windows host and transfer it into the VM with a tool such as SecureCRT. This is the approach used here.

[root@master ~]# rz
rz waiting to receive.
Starting zmodem transfer.  Press Ctrl+C to cancel.
Transferring jdk-8u181-linux-x64.tar.gz...
  100%  181295 KB    36259 KB/sec    00:00:05       0 Errors


Since we are logged in directly as root, the rz upload places the JDK in root's home directory (/root).


Move the JDK archive into a system directory: you can create the target directory with mkdir and move the archive from the command line, or locate the file in the file manager and move it by hand. Here we first create a java directory under /opt and then put the JDK under /opt/java (a sketch follows).

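A minimal sketch of that step, assuming the archive landed in /root after the rz upload above:

[root@master ~]# mkdir -p /opt/java
[root@master ~]# mv /root/jdk-8u181-linux-x64.tar.gz /opt/java/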

Extract the archive with tar -zxvf jdk-8u181-linux-x64.tar.gz:

[root@master ~]# cd /opt/java
[root@master java]# tar -zxvf jdk-8u181-linux-x64.tar.gz

3.2.3 Environment variables

Open the profile with vi /etc/profile or vim /etc/profile (look up basic vim editing commands if you are not familiar with them); you can also edit /etc/profile directly through the desktop file manager. Finally, append the following to the end of the file.

#java environment
export JAVA_HOME=/opt/java/jdk1.8.0_181
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin


Run source /etc/profile to make the changes take effect, then run java -version again to check that Java is installed.

[root@master ~]# source /etc/profile
[root@master ~]# java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

3.3 Passwordless SSH Login

3.3.1 Preparation

Check whether SSH is installed; most Linux distributions install it by default.

[root@master ~]# rpm -qa |grep ssh
openssh-clients-7.4p1-16.el7.x86_64
libssh2-1.4.3-10.el7_2.1.x86_64
openssh-7.4p1-16.el7.x86_64
openssh-server-7.4p1-16.el7.x86_64

Plan the hostname and IP address of each node; set the hostname on every machine and record the mappings in /etc/hosts (a sketch follows the list).

master  192.168.31.237
slave1  192.168.31.238
slave2  192.168.31.239
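
On each node, set its own hostname and add all three mappings to /etc/hosts. A minimal sketch for the master (note that /etc/hosts expects the IP address first; repeat with the matching hostname on slave1 and slave2):

[root@master ~]# hostnamectl set-hostname master
[root@master ~]# vi /etc/hosts
192.168.31.237  master
192.168.31.238  slave1
192.168.31.239  slave2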

3.3.2 Configure passwordless login

Generate the public/private key pair (press Enter at each prompt to accept the default path /root/.ssh/id_rsa and an empty passphrase).

[root@master ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:+cCJUbTOrw0ON9gjKK7D5rsdNRcWlrNFXxpZpDY2jM4 root@master
The key's randomart image is:
+---[RSA 2048]----+
|       +=. .++   |
|      .+.o+.=    |
|      .o=. X     |
|      .B+oo o    |
|     o..SE       |
|    ..oo +       |
|. ... + * o      |
|.+...  = *       |
|+*+.    o .      |
+----[SHA256]-----+
[root@master ~]# 

Merge the public keys into the authorized_keys file: on the master server, go into /root/.ssh and append each node's key over SSH.

[root@master ~]# cd /root/.ssh
[root@master .ssh]# cat id_rsa.pub >> authorized_keys
[root@master .ssh]# ssh root@192.168.31.238 cat ~/.ssh/id_rsa.pub >> authorized_keys
[root@master .ssh]# ssh root@192.168.31.239 cat ~/.ssh/id_rsa.pub >> authorized_keys

Copy the master's authorized_keys and known_hosts to /root/.ssh on each slave server.

scp -r /root/.ssh/authorized_keys root@192.168.31.238:/root/.ssh/  
scp -r /root/.ssh/known_hosts   root@192.168.31.238:/root/.ssh/

scp -r /root/.ssh/authorized_keys root@192.168.31.239:/root/.ssh/
scp -r /root/.ssh/known_hosts  root@192.168.31.239:/root/.ssh/
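
If a password prompt still appears afterwards, a common extra step is to tighten the .ssh permissions on every node (shown here for the master only):

[root@master ~]# chmod 700 /root/.ssh
[root@master ~]# chmod 600 /root/.ssh/authorized_keys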

Verify that passwordless login to the other machines works.

[root@master ~]# ssh slave1
Last login: Mon Oct  1 16:43:06 2018
[root@slave1 ~]# ssh master
Last login: Mon Oct  1 16:43:58 2018 from slave1
[root@master ~]# ssh slave2
Last login: Mon Oct  1 16:43:33 2018

Bug
How do you fix a virtual machine that cannot connect to the external network?

[root@master ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
# no IP address has been assigned
        inet6 fe80::20c:29ff:fe72:641f   prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:72:64:1f  txqueuelen 1000  (Ethernet)
        RX packets 12335  bytes 1908583 (1.8 MiB)
        RX errors 0  dropped 868  overruns 0  frame 0
        TX packets 11  bytes 828 (828.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:54:00:cb:c7:a8  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@master ~]# service network start
Restarting network (via systemctl):  Job for network.service failed because the control process exited with error code. See "systemctl status network.service" and "journalctl -xe" for details.
                                                           [FAILED]

[root@master ~]# systemctl status network.service
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2018-12-05 16:59:04 CST; 1min 7s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4546 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master network[4546]: RTNETLINK answers: File exists
Dec 05 16:59:04 master systemd[1]: network.service: control process exited, code...=1
Dec 05 16:59:04 master systemd[1]: Failed to start LSB: Bring up/down networking.
Dec 05 16:59:04 master systemd[1]: Unit network.service entered failed state.
Dec 05 16:59:04 master systemd[1]: network.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

[root@master ~]# tail -f /var/log/messages
Dec  5 16:59:04 master network: RTNETLINK answers: File exists
Dec  5 16:59:04 master network: RTNETLINK answers: File exists
Dec  5 16:59:04 master systemd: network.service: control process exited, code=exited status=1
Dec  5 16:59:04 master systemd: Failed to start LSB: Bring up/down networking.
Dec  5 16:59:04 master systemd: Unit network.service entered failed state.
Dec  5 16:59:04 master systemd: network.service failed.
Dec  5 17:00:01 master systemd: Started Session 10 of user root.
Dec  5 17:00:01 master systemd: Starting Session 10 of user root.
Dec  5 17:01:01 master systemd: Started Session 11 of user root.
Dec  5 17:01:01 master systemd: Starting Session 11 of user root.

[root@master ~]# cat /var/log/messages | grep network
Dec  5 14:09:20 master kernel: drop_monitor: Initializing network drop monitor service
Dec  5 14:09:43 master systemd: Starting Import network configuration from initramfs...
Dec  5 14:09:43 master systemd: Started Import network configuration from initramfs.
Dec  5 14:10:01 master systemd: Starting LSB: Bring up/down networking...
Dec  5 14:10:08 master network: Bringing up loopback interface:  [  OK  ]
Dec  5 14:10:09 master network: Bringing up interface ens33:  ERROR     : [/etc/sysconfig/network-scripts/ifup-eth] Error, some other host (70:85:C2:03:8E:AF) already uses address 192.168.31.237.
Dec  5 14:10:09 master /etc/sysconfig/network-scripts/ifup-eth: Error, some other host (70:85:C2:03:8E:AF) already uses address 192.168.31.237.
Dec  5 14:10:09 master network: [FAILED]
Dec  5 14:10:09 master systemd: network.service: control process exited, code=exited status=1
Dec  5 14:10:09 master systemd: Failed to start LSB: Bring up/down networking.
Dec  5 14:10:09 master systemd: Unit network.service entered failed state.
Dec  5 14:10:09 master systemd: network.service failed.
Dec  5 14:11:46 master pulseaudio: GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

Solution
The fix depends on the specific situation; roughly, the options are:

# 01 Edit the ifcfg-ens33 file (many online guides insist that ens33 must be renamed to eth0; that is not necessary)
[root@master ~]# cd /etc/sysconfig/network-scripts
[root@master network-scripts]# ls
ifcfg-ens33  ifdown-isdn      ifup          ifup-plip      ifup-tunnel
ifcfg-lo     ifdown-post      ifup-aliases  ifup-plusb     ifup-wireless
ifdown       ifdown-ppp       ifup-bnep     ifup-post      init.ipv6-global
ifdown-bnep  ifdown-routes    ifup-eth      ifup-ppp       network-functions
ifdown-eth   ifdown-sit       ifup-ib       ifup-routes    network-functions-ipv6
ifdown-ib    ifdown-Team      ifup-ippp     ifup-sit
ifdown-ippp  ifdown-TeamPort  ifup-ipv6     ifup-Team
ifdown-ipv6  ifdown-tunnel    ifup-isdn     ifup-TeamPort
[root@master network-scripts]# vi ifcfg-ens33
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
# use a static IP
BOOTPROTO="static"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="cecb46d8-4d6e-4678-b2f4-445b9f09c73d"
DEVICE="ens33"
# bring the interface up at boot
ONBOOT="yes"
IPADDR=192.168.31.237
NETMASK=255.255.255.0
GATEWAY=192.168.31.1
DNS1=192.168.31.1

# 02 Since the current IP address is already in use by another host, assign a new static IP and update it everywhere it appears, including /etc/hosts and /etc/sysconfig/network-scripts/ifcfg-ens33
[root@master ~]# vi /etc/hostname
[root@master ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens33
[root@master ~]# service network restart
Restarting network (via systemctl):                        [  OK  ]

# 03 Stop and disable the NetworkManager service
[root@master ~]# systemctl stop NetworkManager
[root@master ~]# systemctl disable NetworkManager
Removed symlink /etc/systemd/system/multi-user.target.wants/NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.NetworkManager.service.
Removed symlink /etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service.
[root@master ~]# systemctl restart network

# The steps above eventually resolved the problem
[root@master ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.31.237  netmask 255.255.255.0  broadcast 192.168.31.255
        inet6 fe80::20c:29ff:fe72:641f  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:72:64:1f  txqueuelen 1000  (Ethernet)
        RX packets 341  bytes 32414 (31.6 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 61  bytes 7540 (7.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 2  bytes 108 (108.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2  bytes 108 (108.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:54:00:cb:c7:a8  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

3.4 Hadoop 2.7.2 Installation and Cluster Configuration

3.4.1 Hadoop installation

As with the JDK, upload the Hadoop archive and extract it under /opt/hadoop; a sketch of the commands follows.

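A minimal sketch of those commands, assuming the hadoop-2.7.2.tar.gz archive was uploaded to /root with rz as before:

[root@master ~]# mkdir -p /opt/hadoop
[root@master ~]# mv /root/hadoop-2.7.2.tar.gz /opt/hadoop/
[root@master ~]# cd /opt/hadoop
[root@master hadoop]# tar -zxvf hadoop-2.7.2.tar.gz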

Configure the Hadoop environment variables.

[root@master ~]# vim /etc/profile
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
:x
[root@master ~]# source /etc/profile

Verify that the installation is complete.

[root@master ~]# hadoop version
Hadoop 2.7.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41
Compiled by jenkins on 2016-01-26T00:08Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/hadoop/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar

3.4.2 Distributed cluster configuration

Under /opt/hadoop, create the directories that will hold the data: tmp, dfs, dfs/data and dfs/name (a sketch follows).

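A minimal sketch of creating those directories:

[root@master ~]# mkdir -p /opt/hadoop/tmp /opt/hadoop/dfs/name /opt/hadoop/dfs/data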

Go to the Hadoop configuration file directory.

[root@master ~]# cd /opt/hadoop/hadoop-2.7.2/etc/hadoop
[root@master hadoop]# ls
capacity-scheduler.xml      httpfs-env.sh            mapred-env.sh
configuration.xsl           httpfs-log4j.properties  mapred-queues.xml.template
container-executor.cfg      httpfs-signature.secret  mapred-site.xml.template
core-site.xml               httpfs-site.xml          slaves
hadoop-env.cmd              kms-acls.xml             ssl-client.xml.example
hadoop-env.sh               kms-env.sh               ssl-server.xml.example
hadoop-metrics2.properties  kms-log4j.properties     yarn-env.cmd
hadoop-metrics.properties   kms-site.xml             yarn-env.sh
hadoop-policy.xml           log4j.properties         yarn-site.xml
hdfs-site.xml               mapred-env.cmd

Configure core-site.xml.

vi core-site.xml

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
</configuration>

Configure hdfs-site.xml.

 vi hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///opt/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///opt/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>

Configure mapred-site.xml. The distribution ships only a template, so copy it first and then edit the copy.

cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>master:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>http://master:9001</value>
    </property>
</configuration>

Configure yarn-site.xml.

vi yarn-site.xml

<configuration>
 <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
       <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
       <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
     <property>
       <name>yarn.resourcemanager.hostname</name>
       <value>master</value>
</property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
    </property>
</configuration>

Set JAVA_HOME in hadoop-env.sh and yarn-env.sh (the line to add is sketched after the commands below).

[root@master hadoop]# vi hadoop-env.sh
[root@master hadoop]# vi yarn-env.sh
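
In both files, point JAVA_HOME at the JDK installed earlier, for example:

export JAVA_HOME=/opt/java/jdk1.8.0_181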

Configure the slaves file, adding the two slave nodes.

# remove the default localhost entry, then list the slaves:
slave1
slave2

Use scp to copy the configured Hadoop directory from the master to the same location on each node:

[root@master hadoop]# scp -r /opt/hadoop  192.168.31.238:/opt/
[root@master hadoop]# scp -r /opt/hadoop  192.168.31.239:/opt/

3.4.3 Start Hadoop

On the master server, change into the Hadoop directory and format the NameNode:

[root@master ~]# cd /opt/hadoop/hadoop-2.7.2
[root@master hadoop-2.7.2]# bin/hdfs namenode -format


Start/stop commands:

sbin/start-dfs.sh
sbin/start-yarn.sh

sbin/stop-dfs.sh
sbin/stop-yarn.sh

Run jps to check the running daemons on each node.

  1. master
[root@master hadoop-2.7.2]# jps
8976 Jps
8710 ResourceManager
8559 SecondaryNameNode

  2. slave1
[root@slave1 ~]# jps
4945 Jps
3703 DataNode
4778 NodeManager
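
The cluster can also be checked through the standard web UIs (reachable once the firewall is stopped, as shown in 3.5.3):

http://master:50070   # HDFS NameNode UI
http://master:8088    # YARN ResourceManager UI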

3.5 Spark Installation and Environment Configuration

3.5.1 Scala installation
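
The steps mirror the JDK installation above. A minimal sketch, assuming the scala-2.12.7.tgz archive has been uploaded to /root with rz:

[root@master ~]# mkdir -p /opt/scala
[root@master ~]# mv /root/scala-2.12.7.tgz /opt/scala/
[root@master ~]# cd /opt/scala
[root@master scala]# tar -zxvf scala-2.12.7.tgz

Then append the following to /etc/profile, reload it, and verify:

#scala environment
export SCALA_HOME=/opt/scala/scala-2.12.7
export PATH=$PATH:${SCALA_HOME}/bin

[root@master ~]# source /etc/profile
[root@master ~]# scala -version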

3.5.2 Spark installation
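
Following the same pattern, a minimal sketch, assuming the spark-2.3.1-bin-hadoop2.7.tgz package has been uploaded to /root (the path matches the start-up commands used in 3.5.3):

[root@master ~]# mkdir -p /opt/spark
[root@master ~]# mv /root/spark-2.3.1-bin-hadoop2.7.tgz /opt/spark/
[root@master ~]# cd /opt/spark
[root@master spark]# tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz

Create the cluster configuration from the templates in conf/:

[root@master spark]# cd spark-2.3.1-bin-hadoop2.7/conf
[root@master conf]# cp spark-env.sh.template spark-env.sh
[root@master conf]# cp slaves.template slaves

In spark-env.sh, point Spark at Java, Scala and Hadoop and name the master node:

export JAVA_HOME=/opt/java/jdk1.8.0_181
export SCALA_HOME=/opt/scala/scala-2.12.7
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-2.7.2/etc/hadoop
export SPARK_MASTER_HOST=master

In slaves, list one worker hostname per line (slave1 and slave2). Finally, copy /opt/spark to both slaves with scp, as was done for Hadoop.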

3.5.3 Starting Spark

Stop or start the firewall (stopping it makes the Hadoop and Spark web UIs reachable from outside the VM).

# start the firewall
[root@master ~]# systemctl start firewalld.service
# stop the firewall
[root@master ~]# systemctl stop firewalld.service
# enable starting at boot
[root@master ~]# systemctl enable firewalld.service
# disable starting at boot
[root@master ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.

Start the Hadoop daemons.

[root@master ~]# cd /opt/hadoop/hadoop-2.7.2/
[root@master hadoop-2.7.2]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-namenode-master.out
slave1: starting datanode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-datanode-slave1.out
slave2: starting datanode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-datanode-slave2.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /opt/hadoop/hadoop-2.7.2/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/hadoop-2.7.2/logs/yarn-root-resourcemanager-master.out
slave2: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.2/logs/yarn-root-nodemanager-slave2.out
slave1: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.2/logs/yarn-root-nodemanager-slave1.out
[root@master hadoop-2.7.2]# jps
3648 SecondaryNameNode
4099 Jps
3801 ResourceManager

Start Spark.

[root@master hadoop-2.7.2]# cd /opt/spark/spark-2.3.1-bin-hadoop2.7
[root@master spark-2.3.1-bin-hadoop2.7]# sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/spark/spark-2.3.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.master.Master-1-master.out
slave1: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark/spark-2.3.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave1.out
slave2: starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark/spark-2.3.1-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave2.out

Test the Spark cluster (from the master node).

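A minimal sketch of a quick sanity test, an assumption rather than the original author's test: the standalone master listens on spark://master:7077 by default, and its web UI is at http://master:8080.

[root@master spark-2.3.1-bin-hadoop2.7]# bin/spark-shell --master spark://master:7077
scala> sc.parallelize(1 to 100).sum()   // should return 5050.0
scala> :quit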

    Original author: LivinLuo
    Original post: https://www.jianshu.com/p/2cd5eb0fb9f8