This guide is based on the Spark installation instructions from the big-data course TA; I am writing up the process as a homework assignment. It applies only to a wired network. Please complete the steps in hadoop_1_hadoop安装 (the Hadoop installation article) first, then come back to this one.
Put the two .tgz files downloaded from Baidu Netdisk into the vmshare shared folder set up in that article.
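If the shared folder is mounted correctly, both archives should be visible from the master (a quick sanity check before extracting):
ls -lh /mnt/hgfs/vmshare/spark-1.6.2-bin-hadoop2.6.tgz /mnt/hgfs/vmshare/scala-2.10.6.tgz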
On the master node, run:
sudo tar -zxvf /mnt/hgfs/vmshare/spark-1.6.2-bin-hadoop2.6.tgz -C /usr/local
sudo chown -R liudehua:liudehua /usr/local/spark-1.6.2-bin-hadoop2.6/
sudo tar -zxvf /mnt/hgfs/vmshare/scala-2.10.6.tgz -C /usr/local/
sudo chown -R liudehua:liudehua /usr/local/scala-2.10.6/
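To confirm both archives were extracted and the ownership change took effect:
ls -ld /usr/local/spark-1.6.2-bin-hadoop2.6 /usr/local/scala-2.10.6   # both should be owned by liudehua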
Environment variable configuration
vim ~/.bashrc
Append the following at the end of the file:
export SCALA_HOME=/usr/local/scala-2.10.6
export PATH=${SCALA_HOME}/bin:$PATH
export SPARK_HOME=/usr/local/spark-1.6.2-bin-hadoop2.6
export PATH=${SPARK_HOME}/bin:$PATH
Save and exit, then run:
source ~/.bashrc
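To confirm the new PATH entries took effect, two quick checks (both commands ship with the distributions installed above):
scala -version          # should report version 2.10.6
spark-submit --version  # should report version 1.6.2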
Configuring spark-env.sh and slaves
cd /usr/local/spark-1.6.2-bin-hadoop2.6/conf
cp spark-env.sh.template spark-env.sh
then open it for editing:
vim spark-env.sh
Append the following at the end (change the IP to your master's IP; the worker memory here is sized to Slave1's memory):
export JAVA_HOME=/opt/java/jdk1.8.0_60
export SCALA_HOME=/usr/local/scala-2.10.6
export SPARK_MASTER_IP=10.128.1.125
export SPARK_WORKER_MEMORY=2g
export HADOOP_CONF_DIR=/usr/local/hadoop-2.6.0/etc/hadoop
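If you are unsure which values to use, two standard Linux utilities can supply them (run the first on the master, the second on Slave1):
hostname -I   # on the master: prints its IP address(es) for SPARK_MASTER_IP
free -h       # on Slave1: shows total memory, to help size SPARK_WORKER_MEMORY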
Then run:
cp slaves.template slaves
and open it for editing:
vim slaves
Delete the localhost entry at the end of the file and add the following:
Master
Slave1
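Both hostnames must resolve to the correct IPs (they should already be in /etc/hosts from the Hadoop article); a quick check from the master:
ping -c 1 Master
ping -c 1 Slave1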
Copy /usr/local/spark-1.6.2-bin-hadoop2.6 and /usr/local/scala-2.10.6 to the corresponding locations on each slave:
sudo scp -r /usr/local/spark-1.6.2-bin-hadoop2.6/ liudehua@Slave1:/home/liudehua/Desktop/spark-1.6.2-bin-hadoop2.6/
Type yes at the prompt, then enter Slave1's password.
sudo scp -r /usr/local/scala-2.10.6/ liudehua@Slave1:/home/liudehua/Desktop/scala-2.10.6/
Enter Slave1's password again.
Then, on Slave1, run:
sudo cp -r /home/liudehua/Desktop/spark-1.6.2-bin-hadoop2.6/ /usr/local/spark-1.6.2-bin-hadoop2.6/
sudo cp -r /home/liudehua/Desktop/scala-2.10.6/ /usr/local/scala-2.10.6/
Fix the ownership on Slave1 (I am not sure whether this step is strictly required):
sudo chown -R liudehua:liudehua /usr/local/spark-1.6.2-bin-hadoop2.6/
sudo chown -R liudehua:liudehua /usr/local/scala-2.10.6/
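As on the master, you can verify the ownership on Slave1, and optionally remove the temporary copies left on the Desktop:
ls -ld /usr/local/spark-1.6.2-bin-hadoop2.6 /usr/local/scala-2.10.6   # both owned by liudehua
rm -rf /home/liudehua/Desktop/spark-1.6.2-bin-hadoop2.6 /home/liudehua/Desktop/scala-2.10.6   # optional cleanup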
The configuration is now complete.
Testing
First start Hadoop as described in the earlier article.
Then run:
cd /usr/local/spark-1.6.2-bin-hadoop2.6/sbin
./start-all.sh
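If the cluster came up, jps should show a Master and a Worker process on the master node (Master is listed in slaves too, so it also runs a Worker) and a Worker on Slave1; the standalone master's web UI is served on port 8080 by default:
jps   # master: Master + Worker; Slave1: Worker (alongside the Hadoop daemons)
# then browse to http://10.128.1.125:8080 and check that both workers are registered

To verify that the cluster actually accepts jobs, you can submit the bundled SparkPi example. Setting the MASTER variable is how the Spark 1.x README runs the examples against a cluster; 7077 is the standalone master's default port, and the IP is the one set in spark-env.sh:
cd /usr/local/spark-1.6.2-bin-hadoop2.6
MASTER=spark://10.128.1.125:7077 ./bin/run-example SparkPi 10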