Using the Hive Action in an Oozie Workflow: A Worked Example

Official documentation

http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6/DG_HiveActionExtension.html

Copy the bundled Hive example, rename it, and then modify it

cp -r examples/apps/hive oozie-apps/
mv oozie-apps/hive oozie-apps/hive-select

Edit job.properties in hive-select

nameNode=hdfs://hadoop-senior.beifeng.com:8020
jobTracker=hadoop-senior.beifeng.com:8032
queueName=default
examplesRoot=examples
oozieAppsRoot=user/beifeng/oozie-apps
oozieDataRoot=user/beifeng/oozie/datas

oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/${oozieAppsRoot}/hive-select/workflow.xml

inputDir=hive-select/input
outputDir=hive-select/output

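oozie.use.system.libpath=true makes the workflow use the shared dependency libraries (the share directory) stored on HDFS under the beifeng user.
Note: check that the port numbers are correct: 8020 for HDFS (nameNode) and 8032 for jobTracker.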
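A wrong port is easy to miss, so a small check can help. The following is a minimal sketch, assuming the two lines from job.properties above; the helper names prop_value and port_of are made up for illustration, and the temporary file stands in for the real job.properties.

```shell
#!/usr/bin/env bash
# Sketch: read the nameNode and jobTracker ports from job.properties-style lines.

# Print the value of a key from a properties file (first match only).
prop_value() {
    local key="$1" file="$2"
    grep -E "^${key}=" "$file" | head -n 1 | cut -d= -f2-
}

# Print the trailing port of a host:port or scheme://host:port string.
port_of() {
    printf '%s\n' "${1##*:}"
}

# Temporary copy of the two relevant lines; point this at the real file instead.
props="$(mktemp)"
cat > "$props" <<'EOF'
nameNode=hdfs://hadoop-senior.beifeng.com:8020
jobTracker=hadoop-senior.beifeng.com:8032
EOF

nn_port="$(port_of "$(prop_value nameNode "$props")")"
jt_port="$(port_of "$(prop_value jobTracker "$props")")"
echo "nameNode port: $nn_port, jobTracker port: $jt_port"
rm -f "$props"
```

If either value differs from 8020 or 8032, compare it with the cluster's own configuration before submitting the job.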

Check whether Hive uses the new or the old API

Create the dept table in Hive

CREATE TABLE IF NOT EXISTS default.dept
(
dept_no string COMMENT 'id',
dept_name string,
dept_url string
)
COMMENT 'dept'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/dept';

Write the Hive SQL script (dept-select.sql, the name referenced in the workflow XML)

load data local inpath '/opt/datas/dept.txt' overwrite into table dept;
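The script expects a tab-delimited file with three columns, matching the dept table definition. A minimal sketch of producing and checking such a file follows; the two rows are made-up placeholders rather than data from the original article, and DATA_FILE is a hypothetical variable that defaults to a temporary file so the sketch does not overwrite /opt/datas/dept.txt.

```shell
#!/usr/bin/env bash
# Sketch: write a small tab-delimited sample file and verify its shape.
# The rows are made-up placeholders, not data from the original article.

# Set DATA_FILE=/opt/datas/dept.txt to write to the path used by the script.
data_file="${DATA_FILE:-$(mktemp)}"

# Three columns per row: dept_no, dept_name, dept_url, separated by tabs.
printf '%s\t%s\t%s\n' \
    "10" "sales"   "http://example.com/sales" \
    "20" "finance" "http://example.com/finance" > "$data_file"

# Count rows that do not have exactly three tab-separated fields.
bad_rows="$(awk -F'\t' 'NF != 3 { n++ } END { print n + 0 }' "$data_file")"
row_count="$(wc -l < "$data_file" | tr -d ' ')"
echo "rows: $row_count, malformed rows: $bad_rows"
```

Because the statement uses LOAD DATA LOCAL, the file has to exist on the local filesystem of the machine where the Hive action actually runs.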

Write the workflow XML file

<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements.  See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership.  The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-wf">
    <start to="hive-node"/>

    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.5">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/${oozieAppsRoot}/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>dept-select.sql</script>
            <param>OUTPUT=${nameNode}/${oozieAppsRoot}/${outputDir}</param>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

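Note: pay attention to the schema versions declared for the workflow and the hive action (the xmlns values). Treat the official Cloudera Oozie documentation for your release as the authority.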
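To see at a glance which schema versions a workflow file declares, the xmlns values can be pulled out with grep. This is only a sketch: the temporary file reproduces the two relevant lines of the workflow above, and in practice the grep would be pointed at oozie-apps/hive-select/workflow.xml.

```shell
#!/usr/bin/env bash
# Sketch: list the Oozie schema URIs (and versions) declared in a workflow file.

wf="$(mktemp)"
cat > "$wf" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-wf">
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.5">
        </hive>
    </action>
</workflow-app>
EOF

# -o prints only the matching part of each line, one match per output line.
schemas="$(grep -o 'uri:oozie:[a-z-]*:[0-9.]*' "$wf")"
echo "$schemas"
rm -f "$wf"
```

The printed URIs can then be compared with the schema versions listed in the documentation for the installed Oozie release.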

Create the oozie-apps directory on HDFS

bin/hdfs dfs -mkdir -p  /user/beifeng/oozie-apps

Upload the hive-select workflow from the Oozie directory to HDFS

../hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/hive-select /user/beifeng/oozie-apps/

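Copy the Hive configuration file and update the workflow file to reference it (per the Hive action documentation, typically with a <job-xml>hive-site.xml</job-xml> element inside the <hive> action, after <prepare>)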

cp ../hive-0.13.1-cdh5.3.6/conf/hive-site.xml oozie-apps/hive-select/

Create the lib directory for Hive's dependency JARs (here the MySQL JDBC driver)

mkdir -p oozie-apps/hive-select/lib
cp ../hive-0.13.1-cdh5.3.6/lib/mysql-connector-java-5.1.27-bin.jar oozie-apps/hive-select/lib

Copy the updated hive-select contents to HDFS

bin/hdfs dfs -put -f ../oozie-4.0.0-cdh5.3.6/oozie-apps/hive-select/* /user/beifeng/oozie-apps/hive-select/
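Before uploading, it can be worth confirming that the local hive-select directory holds every file the earlier steps put there. The sketch below assumes the file list from this walkthrough (workflow.xml, job.properties, hive-site.xml, dept-select.sql and the MySQL connector JAR); missing_files is a hypothetical helper, and the throwaway directory only demonstrates what it reports.

```shell
#!/usr/bin/env bash
# Sketch: list required workflow files that are missing from a local app directory.

missing_files() {
    local dir="$1" f
    for f in workflow.xml job.properties hive-site.xml dept-select.sql \
             lib/mysql-connector-java-5.1.27-bin.jar; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}

# Demonstration on a throwaway directory that lacks hive-site.xml and the JAR.
demo="$(mktemp -d)"
touch "$demo/workflow.xml" "$demo/job.properties" "$demo/dept-select.sql"
missing="$(missing_files "$demo")"
echo "$missing"
rm -rf "$demo"
```

Against the real directory the call would be missing_files oozie-apps/hive-select, and an empty result means nothing from the list is missing.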

Set the Oozie server URL

export OOZIE_URL=http://hadoop-senior.beifeng.com:11000/oozie

Run the job

bin/oozie job -config oozie-apps/hive-select/job.properties -run

Check the job status

bin/oozie job -info 0000001-180315133250705-oozie-beif-W
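The job ID passed to -info comes from the "job: <id>" line that the -run command prints, so it can be captured instead of copied by hand. A minimal sketch follows; parse_job_id is a hypothetical helper, and the demonstration feeds it the job ID from this walkthrough rather than contacting a live Oozie server.

```shell
#!/usr/bin/env bash
# Sketch: extract the workflow ID from the "job: <id>" line printed on submission.

parse_job_id() {
    # Strip the "job:" prefix and keep the first matching line only.
    sed -n 's/^job:[[:space:]]*//p' | head -n 1
}

# Demonstration on a captured line, using the job ID from this walkthrough.
job_id="$(printf 'job: 0000001-180315133250705-oozie-beif-W\n' | parse_job_id)"
echo "$job_id"

# With a live Oozie server, the same helper could be used as:
#   job_id="$(bin/oozie job -config oozie-apps/hive-select/job.properties -run | parse_job_id)"
#   bin/oozie job -info "$job_id"
```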

Check whether the dept table in Hive now contains data (MySQL only holds the Hive metastore)

    Original author: 志辉聊码
    Original article: https://www.jianshu.com/p/7451e774254f