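Articles in this series so far:
Building a Big Data Platform Series (0) - Machine Preparation
Building a Big Data Platform Series (1) - Hadoop Environment Setup [hdfs, yarn, mapreduce]
Building a Big Data Platform Series (2) - zookeeper Environment Setup
Building a Big Data Platform Series (3) - hbase Environment Setup
Building a Big Data Platform Series (4) - hive Environment Setup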
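0. Preparation
Hive runs on top of Hadoop, so unlike Hadoop or Spark it does not have to be installed on every node; a single installation on the Hadoop master node is enough. Before installing Hive, a working Hadoop environment and a MySQL instance (used here as the metastore database) must be available.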
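Before continuing, it can help to confirm that the prerequisites are reachable from the master's shell. The sketch below is a hypothetical helper, missing_commands (not part of Hive or Hadoop), that prints whichever of the given command names cannot be found on PATH. The command names in the usage line are assumptions about how the earlier parts of this series left the machine.

```shell
# Print each named command that cannot be found on PATH, one per line.
# missing_commands is an illustrative helper, not a Hive or Hadoop tool.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example (assumed command names): no output means both were found.
# missing_commands java hadoop
```

This only checks that the commands exist; it does not verify that HDFS or MySQL is actually running.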
1. Installation
1.1 Download and extract the package
# Download hive-1.1.0-cdh5.5.0.tar.gz to the ~/bigdataspace directory on the master machine
# Extract the package:
[hadoop@master ~]$ cd ~/bigdataspace
[hadoop@master bigdataspace]$ tar -zxvf hive-1.1.0-cdh5.5.0.tar.gz
# Delete the archive once extraction has finished:
[hadoop@master bigdataspace]$ rm hive-1.1.0-cdh5.5.0.tar.gz
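Deleting the archive is only safe if the extraction actually worked. As a minimal sketch, the hypothetical helper extract_then_remove below chains the two steps so that rm runs only when tar exits successfully.

```shell
# Extract a .tar.gz into a destination directory and delete the archive
# only if extraction succeeded. extract_then_remove is a hypothetical helper.
extract_then_remove() {
    archive=$1
    dest=$2
    tar -zxf "$archive" -C "$dest" && rm -- "$archive"
}

# Example, using the paths from this article:
# extract_then_remove ~/bigdataspace/hive-1.1.0-cdh5.5.0.tar.gz ~/bigdataspace
```

If tar fails, for example on a truncated download, the archive is left in place for inspection.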
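# Configure the HIVE_HOME environment variable
[hadoop@master ~]$ sudo vi /etc/profile
(Add the following; the HIVE_HOME line and the $HIVE_HOME/bin entry in PATH are the new settings)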
export HIVE_HOME=/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0
export PATH=$JAVA_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH
# Apply the environment variables to the current shell
[hadoop@master ~]$ source /etc/profile
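A quick way to confirm that the profile change took effect is to check both variables from the shell. The function below is a sketch of such a check; hive_env_ok is a hypothetical name, and it only inspects HIVE_HOME and PATH without running Hive.

```shell
# Succeed only if HIVE_HOME is set, points at a directory, and its bin
# directory is one of the PATH entries. hive_env_ok is a hypothetical helper.
hive_env_ok() {
    [ -n "${HIVE_HOME:-}" ] || return 1
    [ -d "$HIVE_HOME" ] || return 1
    case ":$PATH:" in
        *":$HIVE_HOME/bin:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# hive_env_ok && echo "Hive environment looks good"
```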
1.2 Edit the hive-env.sh configuration file
[hadoop@master ~]$ cd /home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/conf
[hadoop@master conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@master conf]$ vi hive-env.sh
# Append the following to the end of hive-env.sh:
export HADOOP_HOME=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0
export HIVE_CONF_DIR=/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/conf
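If this step is scripted rather than done in vi, the lines should not be appended a second time when the script is re-run. The sketch below shows one way to do that; append_once is a hypothetical helper, and it matches whole lines literally.

```shell
# Append a line to a file unless an identical line is already there, so the
# step can be repeated safely. append_once is a hypothetical helper.
append_once() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, with the HADOOP_HOME line from this section:
# append_once hive-env.sh 'export HADOOP_HOME=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0'
```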
1.3 Create the hive-site.xml configuration file
[hadoop@master conf]$ vi hive-site.xml
## The main settings are as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/data/hive-1.1.0-cdh5.5.0/hive-db/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/data/hive-1.1.0-cdh5.5.0/tmp/hive-${user.name}</value>
<description>Scratch space for Hive jobs</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/data/hive-1.1.0-cdh5.5.0/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/data/hive-1.1.0-cdh5.5.0/downloaded</value>
<description>
Temporary local directory for added resources in the remote file system.
</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/data/hive-1.1.0-cdh5.5.0/queryLogs/${user.name}</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://slave1:3306/hive?useUnicode=true&amp;characterEncoding=utf8</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
</configuration>
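Because hive-site.xml is XML, a literal & inside a value such as the JDBC URL has to be written as &amp;, otherwise the file is not well-formed. When the file is generated from a script, a small escaping step avoids that mistake. The function below is a sketch under that assumption; xml_escape is a hypothetical helper, not something shipped with Hive.

```shell
# Escape the characters that are special in XML text for use inside <value>.
# The & rule comes first so the entities produced by the later rules are not
# escaped a second time. xml_escape is a hypothetical helper.
xml_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example with the connection URL from this section:
# xml_escape 'jdbc:mysql://slave1:3306/hive?useUnicode=true&characterEncoding=utf8'
```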
1.4 Add the mysql-connector jar to the lib directory of the Hive installation
# $HIVE_HOME is the Hive installation directory configured earlier: /home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0
[hadoop@master ~]$ mv mysql-connector-java-5.1.33.jar $HIVE_HOME/lib
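Without the driver jar on Hive's classpath, Hive cannot load com.mysql.jdbc.Driver and the metastore connection fails, so it is worth confirming that the jar landed in the right place. The check below is a sketch; has_mysql_driver is a hypothetical helper that only looks at file names.

```shell
# Succeed if at least one mysql-connector-java jar is present in the given
# lib directory. has_mysql_driver is a hypothetical helper.
has_mysql_driver() {
    for jar in "$1"/mysql-connector-java-*.jar; do
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Example:
# has_mysql_driver "$HIVE_HOME/lib" || echo "MySQL JDBC driver is missing"
```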
1.5 Start the metastore service
[hadoop@master ~]$ cd ~/bigdataspace/hive-1.1.0-cdh5.5.0
[hadoop@master hive-1.1.0-cdh5.5.0]$ ./bin/hive --service metastore &
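1.6 Starting/stopping the Hive command line (CLI)
# Because $HIVE_HOME/bin was added to PATH earlier, the hive command can be run from any directory to enter the Hive console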
[hadoop@master bigdataspace]$ hive
Logging initialized using configuration in jar:file:/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/lib/hive-common-1.1.0-cdh5.5.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive (default)>
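The "Logging initialized using configuration in jar:file…" line above is not an error, but it shows that Hive found no log configuration file under conf and fell back to the copy inside the jar. To load the logging configuration from the conf directory instead, create the files from their templates:
$ cd ~/bigdataspace/hive-1.1.0-cdh5.5.0/conf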
$ cp beeline-log4j.properties.template beeline-log4j.properties
$ cp hive-log4j.properties.template hive-log4j.properties
$ cp hive-exec-log4j.properties.template hive-exec-log4j.properties
[hadoop@master bigdataspace]$ hive
Logging initialized using configuration in file:/home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/conf/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive (default)>
hive (default)> quit;   # leave the Hive console; exit; works as well
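The three cp commands in this section follow one pattern: strip the .template suffix. A loop can apply that pattern without overwriting files that were already customized. This is a sketch, assuming every *.properties.template in conf is a logging template (worth confirming with ls first); activate_templates is a hypothetical helper.

```shell
# For every *.properties.template in a directory, create the matching
# *.properties file unless it already exists. activate_templates is a
# hypothetical helper.
activate_templates() {
    for tpl in "$1"/*.properties.template; do
        [ -f "$tpl" ] || continue
        target=${tpl%.template}
        [ -e "$target" ] || cp "$tpl" "$target"
    done
}

# Example:
# activate_templates ~/bigdataspace/hive-1.1.0-cdh5.5.0/conf
```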
1.7 Starting/stopping the Beeline command line (CLI)
# Start:
[hadoop@master bigdataspace]$ beeline
# Stop:
beeline> !q
1.8 Using HiveServer2
[hadoop@master ~]$ cd ~/bigdataspace/hive-1.1.0-cdh5.5.0/bin/
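[hadoop@master bin]$ ./hiveserver2 &   # the trailing & runs the command in the background
(If the terminal does not return to the prompt after running the command above, press Ctrl+C; because of the &, hiveserver2 keeps running in the background.)
# Check for the HiveServer2 process (if none is listed, hiveserver2 failed to start or has stopped):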
[hadoop@master bin]$ ps -ef |grep HiveServer2
hadoop 25545 14762 3 17:02 pts/1 00:00:21 /home/hadoop/bigdataspace/jdk1.8.0_60/bin/java -Xmx256m -Djava.library.path=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0/lib/native/ -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/hadoop/bigdataspace/hadoop-2.6.0-cdh5.5.0 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/hadoop/bigdataspace/hive-1.1.0-cdh5.5.0/lib/hive-service-1.1.0-cdh5.5.0.jar org.apache.hive.service.server.HiveServer2
hadoop 26038 14762 0 17:14 pts/1 00:00:00 grep HiveServer2
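(The background hiveserver2 service can be stopped with "kill -9 PID"; a plain "kill PID" gives the JVM a chance to shut down cleanly and is worth trying first.)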
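To pick the PID out of the ps listing without also catching the grep line, the output can be filtered on the full main class name. The function below is a sketch that assumes the `ps -ef` column layout shown above, with the PID in column 2 and the command starting at column 8; hs2_pids is a hypothetical helper.

```shell
# Read `ps -ef` output on stdin and print the PID (column 2) of every process
# that has the HiveServer2 main class as one of its arguments. The
# `grep HiveServer2` line is skipped because its argument is not the full
# class name. hs2_pids is a hypothetical helper.
hs2_pids() {
    awk -v cls=org.apache.hive.service.server.HiveServer2 '
        { for (i = 8; i <= NF; i++) if ($i == cls) { print $2; break } }
    '
}

# Example:
# ps -ef | hs2_pids
```

The printed PIDs can then be passed to kill.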
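Test the connection to hiveserver2 with beeline:
(
jdbc:hive2: connects to hiveserver2
master: the host/IP of the machine where hiveserver2 is installed
10001: the port hiveserver2 listens on (set with hive.server2.thrift.port in hive-site.xml; the default is 10000 if it is not set)
)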
[hadoop@master hive-1.1.0-cdh5.5.0]$ beeline -u jdbc:hive2://master:10001
### SLF4J may complain here about multiple bindings on the class path; these are warnings, not errors, for example:
SLF4J: Class path contains multiple SLF4J bindings
SLF4J: Found binding in [jar:file:/home/hadoop/bigdataspace/had…)
Connecting to jdbc:hive2://master:10001
Connected to: Apache Hive (version 1.1.0-cdh5.5.0)
Driver: Hive JDBC (version 1.1.0-cdh5.5.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.1.0-cdh5.5.0 by Apache Hive
0: jdbc:hive2://master:10001>
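The connection string has a fixed shape: scheme, host, port, and optionally a database name. As a sketch, the hypothetical helper hs2_url below assembles it and falls back to port 10000, Hive's default for hive.server2.thrift.port, when no port is given.

```shell
# Build a HiveServer2 JDBC URL. The port defaults to 10000 (Hive's default for
# hive.server2.thrift.port); the optional third argument is a database name.
# hs2_url is a hypothetical helper.
hs2_url() {
    host=$1
    port=${2:-10000}
    db=${3:-}
    printf 'jdbc:hive2://%s:%s%s\n' "$host" "$port" "${db:+/$db}"
}

# Example:
# beeline -u "$(hs2_url master 10001)"
```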
This completes the basic installation and configuration of Hive.