Using Sqoop to Transfer Data Between Hive and MySQL in Both Directions

1. Configuration Overview

Hive arguments:
   --create-hive-table                         Fail if the target hive table exists
   --hive-database <database-name>             Sets the database name to use when importing to hive
   --hive-delims-replacement <arg>             Replace Hive record \0x01 and row delimiters (\n\r) from imported string fields with user-defined string
   --hive-drop-import-delims                   Drop Hive record \0x01 and row delimiters (\n\r) from imported string fields
   --hive-home <dir>                           Override $HIVE_HOME
   --hive-import                               Import tables into Hive (Uses Hive's default delimiters if none are set.)
   --hive-overwrite                            Overwrite existing data in the Hive table
   --hive-partition-key <partition-key>        Sets the partition key to use when importing to hive
   --hive-partition-value <partition-value>    Sets the partition value to use when importing to hive
   --hive-table <table-name>                   Sets the table name to use when importing to hive
   --map-column-hive <arg>                     Override mapping for specific column to hive types.
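
For example, several of these flags can be combined in a single partitioned import. A minimal sketch (the MySQL table orders, the Hive table hive_orders, and the partition key dt are hypothetical names used only for illustration):

[root@repo bin]# ./sqoop import \
--connect jdbc:mysql://192.168.9.100:3306/test \
--username root \
--password 123456 \
--table orders \
--hive-import \
--hive-database default \
--hive-table hive_orders \
--hive-partition-key dt \
--hive-partition-value '2023-01-01' \
--hive-drop-import-delims \
--num-mappers 1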

2. Importing Data from a MySQL Table into a Hive Table

drop table if exists hive_users;

create table hive_users (id string,name string,age int)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t';

[root@repo bin]# ./sqoop import \
--connect jdbc:mysql://192.168.9.100:3306/test \
--username root \
--password 123456 \
--table users \
--fields-terminated-by '\t' \
--num-mappers 1 \
--hive-import \
--hive-database default \
--hive-table hive_users \
--delete-target-dir 

hive> select * from hive_users;
OK
1   Jed     15
2   Tom     16
3   Tony    17
4   Bob     18
5   Harry   19
6   Jack    20
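
Re-running this import appends the same rows to hive_users again. To replace the table contents instead, the --hive-overwrite flag listed above can be added; a sketch based on the same command:

[root@repo bin]# ./sqoop import \
--connect jdbc:mysql://192.168.9.100:3306/test \
--username root \
--password 123456 \
--table users \
--fields-terminated-by '\t' \
--num-mappers 1 \
--hive-import \
--hive-overwrite \
--hive-database default \
--hive-table hive_users \
--delete-target-dir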

3. Exporting Data from a Hive Table into a MySQL Table

mysql> create table users_from_hive (id int,name varchar(10),age int,primary key (`id`));

[root@repo bin]# ./sqoop export \
--connect jdbc:mysql://192.168.9.100:3306/test \
--username root \
--password 123456 \
--table users_from_hive \
--input-fields-terminated-by '\t' \
--export-dir /hive_single_user/warehouse/hive_users \
--num-mappers 1

mysql> select * from users_from_hive;
+----+-------+------+
| id | name  | age  |
+----+-------+------+
|  1 | Jed   |   15 |
|  2 | Tom   |   16 |
|  3 | Tony  |   17 |
|  4 | Bob   |   18 |
|  5 | Harry |   19 |
|  6 | Jack  |   20 |
+----+-------+------+
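
Because users_from_hive has a primary key, running the same export a second time fails with duplicate-key errors. A sketch of an upsert-style re-run using Sqoop's --update-key and --update-mode options (with the MySQL connector, allowinsert should translate to INSERT ... ON DUPLICATE KEY UPDATE):

[root@repo bin]# ./sqoop export \
--connect jdbc:mysql://192.168.9.100:3306/test \
--username root \
--password 123456 \
--table users_from_hive \
--input-fields-terminated-by '\t' \
--export-dir /hive_single_user/warehouse/hive_users \
--update-key id \
--update-mode allowinsert \
--num-mappers 1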

Note
Before Sqoop 1.4.6, when moving data from MySQL into a Hive table you could not specify Parquet as the file format; the data had to be imported into HDFS first, and the Parquet files then loaded from HDFS into Hive.
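
A sketch of that two-step workflow, assuming a Sqoop version that supports --as-parquetfile and a Hive table whose schema matches the imported Parquet files (the table name hive_users_parquet and the target directory are illustrative):

[root@repo bin]# ./sqoop import \
--connect jdbc:mysql://192.168.9.100:3306/test \
--username root \
--password 123456 \
--table users \
--as-parquetfile \
--target-dir /user/root/SQOOP/import/users_parquet \
--num-mappers 1

hive> create table hive_users_parquet (id string, name string, age int) stored as parquet;
hive> load data inpath '/user/root/SQOOP/import/users_parquet' into table hive_users_parquet;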

4. Writing Sqoop Options to a File and Running Sqoop with --options-file

Unlike on the command line, an options file takes one token per line: each option and its value go on separate lines, and backslash line continuations are not used.

[root@repo myshell]# vim sqoop-options-test
--connect
jdbc:mysql://192.168.9.100:3306/test
--username
root
--password
123456
--target-dir
/user/root/SQOOP/import/users_options
--num-mappers
1

[root@repo bin]# ./sqoop import \
--options-file /root/myshell/sqoop-options-test \
--table users_from_hive

[root@repo bin]# hdfs dfs -cat /user/root/SQOOP/import/users_options/*
1,Jed,15
2,Tom,16
3,Tony,17
4,Bob,18
5,Harry,19
6,Jack,20

Notes
(1) Options in the file and options set manually on the command line can be used together.
(2) The options file may contain comments: lines beginning with # are treated as comments.
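
To illustrate note (2), the same options file could be written with comments:

# Options file for Sqoop import
# Connection string
--connect
jdbc:mysql://192.168.9.100:3306/test
# Credentials
--username
root
--password
123456
# Where the imported data is written
--target-dir
/user/root/SQOOP/import/users_options
--num-mappers
1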

    Original author: CoderJed
    Original article: https://www.jianshu.com/p/2e2b8894b3f9