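Because monitoring in the Apache release of Hadoop is incomplete, we run CDH in production and monitor it with Cloudera Manager. Cloudera Manager monitors not only the cluster but also the hosts themselves, and the free edition is sufficient. I sincerely recommend it.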
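Installing Cloudera Manager and CDH offline
Note: on each host, set server_host in /etc/cloudera-scm-agent/config.ini to the hostname of the machine where Cloudera Manager is installed. Cloudera Manager then recognizes the hosts to be managed directly, so you do not have to specify the hosts by hand when setting up the CDH cluster.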
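The change described in the note can be scripted. The following is only a minimal sketch: set_server_host is a hypothetical helper name (not part of any Cloudera tooling), and it assumes config.ini already contains a line beginning with server_host=.

```shell
#!/usr/bin/env bash
# Sketch: point a Cloudera Manager agent at the CM server by rewriting
# the server_host line in the agent's config.ini.
# set_server_host is a hypothetical helper, not a Cloudera command.
set_server_host() {
    local config_file="$1"   # e.g. /etc/cloudera-scm-agent/config.ini
    local cm_host="$2"       # hostname of the Cloudera Manager server
    local tmp_file

    [ -f "$config_file" ] || { echo "no such file: $config_file" >&2; return 1; }

    tmp_file="$(mktemp)" || return 1
    # Replace the value of an existing server_host= line; other lines are untouched.
    # If no such line exists, the file content is left unchanged.
    sed "s/^server_host=.*/server_host=${cm_host}/" "$config_file" > "$tmp_file" || return 1
    # Write back through the original path so ownership and mode are preserved.
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}

# Example (run as root on each agent host):
#   set_server_host /etc/cloudera-scm-agent/config.ini master01
```

The agent only reads config.ini at startup, so it has to be restarted after the change.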
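Configuring MySQL for Cloudera Manager
Install MySQL (omitted)
Create the cm user and grant it privileges (full privileges are granted here to avoid permission problems)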
create user 'cm' identified by 'cm';
grant all on *.* to 'cm'@'master01' identified by 'cm' with grant option;
flush privileges;
Initialize the Cloudera Manager database
/usr/share/cmf/schema/scm_prepare_database.sh -h master01 -ucm -pcm mysql --scm-host master01 scm cm cm
After the command finishes, /etc/cloudera-scm-server/db.properties looks like this:
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=master01
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=cm
com.cloudera.cmf.db.password=cm
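Before restarting the server it can be worth confirming that db.properties really points at MySQL. A small sketch of such a check follows; db_prop and scm_uses_mysql are hypothetical helper names introduced here for illustration, and the file is assumed to contain plain key=value lines like the ones above.

```shell
#!/usr/bin/env bash
# Sketch: read single values out of Cloudera Manager's db.properties.
# db_prop and scm_uses_mysql are hypothetical helpers, not Cloudera commands.

# Print the value of one key (everything after the first '=' on that line).
db_prop() {
    local props_file="$1"
    local key="$2"
    awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$props_file"
}

# Succeed only when the configured database type is mysql.
scm_uses_mysql() {
    [ "$(db_prop "$1" com.cloudera.cmf.db.type)" = "mysql" ]
}

# Example:
#   scm_uses_mysql /etc/cloudera-scm-server/db.properties && echo "using MySQL"
```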
Stop the embedded PostgreSQL database (it must be stopped)
/etc/init.d/cloudera-scm-server-db stop
Restart Cloudera Manager
/etc/init.d/cloudera-scm-server stop
/etc/init.d/cloudera-scm-server start
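Configuring clock synchronization (using NTP)
Assume there are 3 nodes (hadoop01, hadoop02, hadoop03):
hadoop01 acts as the NTP server. It synchronizes with external time servers and then provides time synchronization to hadoop02 and hadoop03.
hadoop02 and hadoop03 synchronize their time from hadoop01.
Install ntp on every node
yum install ntp
Configure /etc/ntp.conf on hadoop01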
server 0.cn.pool.ntp.org
server 0.asia.pool.ntp.org
server 3.asia.pool.ntp.org
# Allow the upstream time servers to adjust this machine's time
restrict 0.cn.pool.ntp.org nomodify notrap noquery
restrict 0.asia.pool.ntp.org nomodify notrap noquery
restrict 3.asia.pool.ntp.org nomodify notrap noquery
# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
# When the external time servers are unavailable, serve time from the local clock
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
Configure /etc/ntp.conf on hadoop02 and hadoop03
server 0.cn.pool.ntp.org
server 0.asia.pool.ntp.org
server 3.asia.pool.ntp.org
# Allow the upstream time servers to adjust this machine's time
restrict 0.cn.pool.ntp.org nomodify notrap noquery
restrict 0.asia.pool.ntp.org nomodify notrap noquery
restrict 3.asia.pool.ntp.org nomodify notrap noquery
# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
# When the external time servers are unavailable, serve time from the local clock
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
server hadoop01 prefer
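Start ntp
On hadoop01, synchronize manually once before starting:
ntpdate 0.cn.pool.ntp.org
On hadoop02 and hadoop03, synchronize manually once before starting:
ntpdate hadoop01
Then start the service:
service ntpd start
Enable start at boot (on every node)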
chkconfig ntpd on
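Once ntpd has been running for a few minutes, one way to check that a node has actually selected a time source is to look at the output of ntpq -p, where the peer chosen for synchronization is marked with a leading asterisk. The sketch below extracts that peer name; synced_peer is a hypothetical helper name, and it reads the ntpq -p output from standard input.

```shell
#!/usr/bin/env bash
# Sketch: report which peer ntpd is currently synchronized to.
# synced_peer is a hypothetical helper; feed it the output of `ntpq -p`.
synced_peer() {
    # In `ntpq -p` output, the selected system peer starts with '*'.
    # Print its name without the marker; exit non-zero when no peer is selected.
    awk '/^\*/ { print substr($1, 2); found = 1; exit }
         END   { exit found ? 0 : 1 }'
}

# Example (on hadoop02 or hadoop03 this should eventually report hadoop01):
#   ntpq -p | synced_peer || echo "not synchronized yet"
```

It can take several polling intervals after startup before any peer is marked, so an empty result shortly after service ntpd start is not necessarily an error.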
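Problems encountered while installing CDH
Unable to log in after installing Cloudera Manager
Access problems after installing Cloudera Manager
Note: this problem was never fully resolved. I worked around it by accessing Cloudera Manager from a remote Alibaba Windows system.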
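Hive problems
The Hive problems are all covered in the author's article, and somehow I ran into every single one of them.
The mysql-connector-java.jar file must be located at /usr/share/java/mysql-connector-java.jar
and it must be named exactly mysql-connector-java.jar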
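Downloaded connector jars usually carry a version in their file name, so they need to be copied under the fixed name. A minimal sketch of that step follows; install_mysql_connector is a hypothetical helper name, and the source path in the example is only a placeholder.

```shell
#!/usr/bin/env bash
# Sketch: place the MySQL JDBC driver where Hive expects it, under the
# exact file name mysql-connector-java.jar.
# install_mysql_connector is a hypothetical helper name.
install_mysql_connector() {
    local src_jar="$1"                        # versioned jar you downloaded
    local dest_dir="${2:-/usr/share/java}"    # default taken from the text above

    [ -f "$src_jar" ] || { echo "no such jar: $src_jar" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    # Copy under the fixed name, replacing any older copy.
    cp -f "$src_jar" "$dest_dir/mysql-connector-java.jar"
}

# Example (placeholder source path):
#   install_mysql_connector /path/to/downloaded-connector.jar
```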
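Configuring HA
Hive
Completely uninstalling Cloudera Manager
Accessing CM through nginx
Problem
Cloudera_Server startup: com.cloudera.server.web.cmf.csrf.CsrfRefererInterceptor: Rejecting request originating from
Solution
grep the installation directory for csrf, comment out the matching bean, and restart cloudera-scm-server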
[root@xx.xx.xx.xx cloudera-scm-server]# cd /usr/share/cmf/
[root@xx.xx.xx.xx cmf]# grep -i -r csrf ./
Binary file ./cloudera-navigator-server/libs/cdh5/hadoop-yarn-server-nodemanager-2.6.0-cdh5.5.0.jar matches
Binary file ./common_jars/hadoop-yarn-server-nodemanager-2.6.0-cdh5.5.0.jar matches
Binary file ./common_jars/server-5.6.0.jar matches
Binary file ./common_jars/hadoop-yarn-server-resourcemanager-2.5.0-cdh5.3.2.jar matches
./webapp/WEB-INF/spring/mvc-config.xml: <bean class="com.cloudera.server.web.cmf.csrf.CsrfRefererInterceptor" />
Binary file ./lib/cdh5-java6/hadoop-yarn-server-resourcemanager-2.5.0-cdh5.3.2.jar matches
Binary file ./lib/server-5.6.0.jar matches
Binary file ./lib/cdh5/hadoop-yarn-server-nodemanager-2.6.0-cdh5.5.0.jar matches
[root@xx.xx.xx.xx cmf]# vi ./webapp/WEB-INF/spring/mvc-config.xml
Comment out this bean, restart the server, and access CM through nginx again. The error is gone.
<!-- <bean class="com.cloudera.server.web.cmf.csrf.CsrfRefererInterceptor" /> -->
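The manual vi edit can also be done non-interactively. The following is a sketch only: comment_out_csrf_bean is a hypothetical helper name, it assumes the bean sits on a single line exactly as in the grep output above, and it keeps a .bak copy of the file the first time it runs.

```shell
#!/usr/bin/env bash
# Sketch: comment out the CsrfRefererInterceptor bean in mvc-config.xml.
# comment_out_csrf_bean is a hypothetical helper, not a Cloudera command.
comment_out_csrf_bean() {
    local xml_file="$1"   # e.g. /usr/share/cmf/webapp/WEB-INF/spring/mvc-config.xml
    local tmp_file

    [ -f "$xml_file" ] || { echo "no such file: $xml_file" >&2; return 1; }
    # Keep the original once; later runs do not overwrite the backup.
    [ -e "$xml_file.bak" ] || cp -p "$xml_file" "$xml_file.bak" || return 1

    tmp_file="$(mktemp)" || return 1
    # Only a line that starts (after indentation) with the bean is wrapped,
    # so a line that is already commented out is left alone.
    sed 's|^\([[:space:]]*\)\(<bean class="com\.cloudera\.server\.web\.cmf\.csrf\.CsrfRefererInterceptor"[[:space:]]*/>\)|\1<!-- \2 -->|' \
        "$xml_file" > "$tmp_file" || return 1
    cat "$tmp_file" > "$xml_file"
    rm -f "$tmp_file"
}

# Example:
#   comment_out_csrf_bean /usr/share/cmf/webapp/WEB-INF/spring/mvc-config.xml
#   /etc/init.d/cloudera-scm-server stop
#   /etc/init.d/cloudera-scm-server start
```

Because the pattern is anchored to the start of the line, running the helper twice does not nest the comment.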