HIVE行级更新操作配置

在hive上执行事务操作(增删改查)是hive从0.14版本就支持,但并不是默认支持,需要修改参数,如果不修改参数配置,会报如下异常

hive>  update tablename set ORDER_NO='jinxin' where ID='21326';
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
hive> delete from tablename where ID='21326';
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.

一、创建表语句

Hive对使用Update功能的表有特定的语法要求, 语法要求如下:
(1)要执行Update的表中, 建表时必须带有buckets(分桶)属性
(2)要执行Update的表中, 需要指定格式,其余格式目前赞不支持, 如:parquet格式, 目前只支持ORCFileformat和AcidOutputFormat
(3)要执行Update的表中, 建表时必须指定参数(‘transactional’ = true);
例如:

create table student(  
  id int,  
  name String
)clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');  

二、修改配置文件hive-site.xml

由于我用的CDH版本,所以直接在CDH界面更改

1、进入hive点击配置

《HIVE行级更新操作配置》 Paste_Image.png

2、搜索hive-site

《HIVE行级更新操作配置》 Paste_Image.png

3、找到hive客户端高级配置代码段

《HIVE行级更新操作配置》 Paste_Image.png

4、增加配置

《HIVE行级更新操作配置》 Paste_Image.png

xml形式代码为

<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.enforce.bucketing</name>
<value>true</value>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name
<value>nonstrict</value>
</property>
<property>
<name>hive.txn.manager</name>
<value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value></property>
<property>
<name>hive.compactor.initiator.on</name>
<value>true</value>
</property>

保存更改的配置

《HIVE行级更新操作配置》 Paste_Image.png

5、重启hive并部署配置

重启

《HIVE行级更新操作配置》 Paste_Image.png

部署配置

《HIVE行级更新操作配置》 Paste_Image.png

《HIVE行级更新操作配置》 Paste_Image.png

OK配置完成

三、测试hive行级更新

hive> delete from user where ORDER_NO='12345';
Query ID = hdfs_20170418143939_9f98b068-9dee-4f15-a85e-61571837f1f6
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 2
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1491881598805_0017, Tracking URL = http://master2:8088/proxy/application_1491881598805_0017/
Kill Command = /soft/bigdata/clouderamanager/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/bin/hadoop job  -kill job_1491881598805_0017
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 2
2017-04-18 14:39:05,571 Stage-1 map = 0%,  reduce = 0%
2017-04-18 14:39:10,675 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 2.5 sec
2017-04-18 14:39:11,694 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.77 sec
2017-04-18 14:39:17,819 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 9.03 sec
MapReduce Total cumulative CPU time: 9 seconds 30 msec
Ended Job = job_1491881598805_0017
Loading data to table testsqoop.dwf_order
Table testsqoop.dwf_order stats: [numFiles=14, numRows=0, totalSize=25473, rawDataSize=4375]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 2  Reduce: 2   Cumulative CPU: 9.03 sec   HDFS Read: 59086 HDFS Write: 637 SUCCESS
Total MapReduce CPU Time Spent: 9 seconds 30 msec
OK
Time taken: 18.82 seconds
hive> 

OK

    原文作者:献给记性不好的自己
    原文地址: https://www.jianshu.com/p/1403536de981
    本文转自网络文章,转载此文章仅为分享知识,如有侵权,请联系博主进行删除。
点赞