python – impyla hangs when connecting to HiveServer2

I'm writing some ETL flows in Python, and parts of them use Hive. According to the documentation, Cloudera's impyla client works with both Impala and Hive.

In my experience, the client works for Impala, but it hangs when I try to connect to Hive:

from impala.dbapi import connect

conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz')
cursor = conn.cursor()          # <- hangs here
cursor.execute('show tables')
results = cursor.fetchall()
print(results)

If I step through the code, it hangs while trying to open a session (line #873 of hiveserver2.py).

At first I suspected that a firewall might be blocking the port, so I tried connecting from Java. To my surprise, this worked:

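import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Main {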
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";
    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        Connection connection = DriverManager.getConnection("jdbc:hive2://host_running_hs2_service:10000/default", "awoolford", "Bzzzzz");
        Statement statement = connection.createStatement();
        ResultSet resultSet = statement.executeQuery("SHOW TABLES");

        while (resultSet.next()) {
            System.out.println(resultSet.getString(1));
        }
    }
}
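
The firewall theory above can also be checked from Python itself, without going through JDBC. This is only a minimal sketch: the helper name port_is_reachable is made up for illustration, and it only shows whether a TCP connection can be opened, not whether the Thrift/SASL handshake will succeed.

```python
import socket


def port_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    sock.close()
    return True


# Example call (hostname taken from the question):
# port_is_reachable('host_running_hs2_service', 10000)
```

If this returns True while conn.cursor() still hangs, the port is open and the problem is more likely in the handshake between client and server than in the network.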

Since Hive and Python are both so commonly used, I'd like to know whether anyone else has run into this problem and, if so, what you did to fix it.

Versions:

> Hive 1.1.0-cdh5.5.1
> Python 2.7.11 | Anaconda 2.3.0
> Redhat 6.7

Best answer

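Start HiveServer2 with authentication set to NOSASL:

/path/to/bin/hive --service hiveserver2 --hiveconf hive.server2.authentication=NOSASL

Then pass the matching auth_mechanism when connecting from impyla:
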
from impala.dbapi import connect

conn = connect(host='host_running_hs2_service', port=10000, user='awoolford', password='Bzzzzz', auth_mechanism='NOSASL')
cursor = conn.cursor()
cursor.execute('show tables')
results = cursor.fetchall()
print(results)
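
The point of the fix is that the client's auth_mechanism has to match what the server expects. Here is a rough sketch of keeping the two in sync. The helper names and the mapping table are my own, not part of impyla. The 'NONE' -> 'PLAIN' entry is an assumption (that HiveServer2's default NONE setting uses a SASL PLAIN transport), so verify it against your own cluster. The timeout argument is passed so that a mismatch is more likely to show up as an error than as an indefinite hang.

```python
# Assumed mapping from the server's hive.server2.authentication value
# to the impyla auth_mechanism that uses the same transport.
SERVER_AUTH_TO_MECHANISM = {
    'NOSASL': 'NOSASL',  # plain Thrift transport, as in the answer above
    'NONE': 'PLAIN',     # assumption: Hive's default setting uses SASL PLAIN
}


def hs2_connect_kwargs(host, server_auth, user=None, password=None,
                       port=10000, timeout=30):
    """Build keyword arguments for impala.dbapi.connect (hypothetical helper)."""
    key = server_auth.upper()
    if key not in SERVER_AUTH_TO_MECHANISM:
        raise ValueError('no known auth_mechanism for %r' % server_auth)
    return {
        'host': host,
        'port': port,
        'user': user,
        'password': password,
        'auth_mechanism': SERVER_AUTH_TO_MECHANISM[key],
        'timeout': timeout,  # socket timeout in seconds
    }


def open_hs2_connection(**kwargs):
    """Open the connection; impyla is imported lazily so the helper above stays dependency-free."""
    from impala.dbapi import connect
    return connect(**kwargs)


# Example call (values taken from the question):
# conn = open_hs2_connection(**hs2_connect_kwargs(
#     'host_running_hs2_service', 'NOSASL', user='awoolford', password='Bzzzzz'))
```

If changing the server configuration is not an option, trying auth_mechanism='PLAIN' against an unmodified HiveServer2 may be worth a shot, subject to the same assumption.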