How to Use Beeline to connect to Impala

You can certainly connect to Impala using Hive Driver from beeline, like below command:

beeline -u 'jdbc:hive2://<impala-daemon-host>:21050/default;auth=noSasl'

However, the result output format does not work properly:

> show tables;
customers
dim_prod
mansi
sample_07
sample_08
small
web_logs
+-------+--+
| name  |
+-------+--+
+-------+--+

Notice the output is not inside the columns?

The better approach is to use Cloudera Impala JDBC Driver, please follow the steps below:

1. Download the driver from Cloudera Impala JDBC Driver
2. Extract the files and put it somewhere on the host that you need to run beeline command, in my case is under /root/impala-jdbc/jdbc
3. Run the following command to update the HADOOP_CLASSPATH to include the Impala JDBC Driver JARs

export HADOOP_CLASSPATH=`hadoop classpath`:/root/impala-jdbc/jdbc/*

4. Finally you are ready to start beeline:

beeline -d "com.cloudera.impala.jdbc41.Driver" -u 'jdbc:impala://<impala-daemon-host>:21050;AuthMech=0'

You will need to tell beeline the class name for the driver using “-d” option, in my case the driver class is com.cloudera.impala.jdbc41.Driver

The output looks much better:

> show tables;
+------------+--+
|    name    |
+------------+--+
| customers  |
| dim_prod   |
| mansi      |
| sample_07  |
| sample_08  |
| small      |
| web_logs   |
+------------+--+
7 rows selected (0.219 seconds)

However, the best way is to connect from impala-shell which is designed for Impala natively.

Hope this helps.

Syntax Error When Using Impala JDBC Driver

If you get the following error when running Impala query through Impala JDBC Driver:

com.cloudera.sqlengine.exceptions.SQLEngineException: [Simba][JSQLEngine](11000) 
Syntax error near character 3: use<<< ??? >>> database_name
at com.cloudera.sqlengine.parser.PTParseErrorInfo.makeSyntaxError(Unknown Source)
at com.cloudera.sqlengine.parser.generated.Parser.yyerror(Unknown Source)
at com.cloudera.sqlengine.parser.generated.Parser.yyparse(Unknown Source)
at com.cloudera.sqlengine.parser.generated.Parser.parse(Unknown Source)
at com.cloudera.sqlengine.parser.PTParser.parse(Unknown Source)
at com.cloudera.sqlengine.dsiext.dataengine.SqlDataEngine.prepare(Unknown Source)
at com.cloudera.hivecommon.dataengine.HiveJDBCDataEngine.prepare(Unknown Source)
at com.cloudera.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.cloudera.jdbc.common.SStatement.execute(Unknown Source)
at com.jpmc.risk.mspdb.GenerateHQL1.compareSumEbcdicResult(GenerateHQL1.java:870)
at com.jpmc.risk.mspdb.MSPDriver.execute(MSPDriver.java:366)
at com.jpmc.risk.mspdb.MSPDriver.main(MSPDriver.java:43)

com.cloudera.hivecommon.dataengine.HiveJDBCDataEngine.prepare: [Simba][JSQLEngine](11000) 
Syntax error near character 36: SELECT * FROM table_name LIMIT<<< ??? >>> 1
com.cloudera.sqlengine.exceptions.SQLEngineException: [Simba][JSQLEngine](11000) 
Syntax error near character 36: SELECT * FROM table_name LIMIT<<< ??? >>> 1
at com.cloudera.sqlengine.parser.PTParseErrorInfo.makeSyntaxError(Unknown Source)
at com.cloudera.sqlengine.parser.generated.Parser.yyerror(Unknown Source)
at com.cloudera.sqlengine.parser.generated.Parser.yyparse(Unknown Source)
at com.cloudera.sqlengine.parser.generated.Parser.parse(Unknown Source)
at com.cloudera.sqlengine.parser.PTParser.parse(Unknown Source)
at com.cloudera.sqlengine.dsiext.dataengine.SqlDataEngine.prepare(Unknown Source)
at com.cloudera.hivecommon.dataengine.HiveJDBCDataEngine.prepare(Unknown Source)
at com.cloudera.jdbc.common.SStatement.executeNoParams(Unknown Source)
at com.cloudera.jdbc.common.SStatement.executeQuery(Unknown Source)
at com.jpmc.risk.mspdb.GenerateHQL1.insExepRecords(GenerateHQL1.java:316)
at com.jpmc.risk.mspdb.MSPDriver.execute(MSPDriver.java:398)
at com.jpmc.risk.mspdb.MSPDriver.main(MSPDriver.java:43)

it is caused by Impala Driver tried to transform queries that were already in the Impala native form, and caused syntax error.

The solution here is to add UseNativeQuery=1 in the JDBC query string parameter, for example:

jdbc:impala://localhost:21050;AuthMech=3;UID=UserName;PWD=Password&useNativeQuery=1

So that the query will not be transformed and remain as what ever the original query was, this should help with the error.

Available options to connect to Hive and Impala from .NET application

Recently a customer has requested information regarding which one to use to connect to HS2 or Impala from .NET application, ODBC or JDBC? I will briefly summarise my findings below:

Both ODBC and .NET are Microsoft products, so it is natural that they will work nicely together. And according to my research, ODBC will also perform better in the case that you do not fetch large amounts of data. Even if you do, the performance difference will not be so great that you will notice the difference. You can have a look at some examples here: Using the Microsoft Hive ODBC Driver in a .NET client

JDBC Drivers are written in Java and designed for Java programs. It can work with .NET, but will require some extra work, and it might make later troubleshooting much harder down the track.

Thrift API is another choice, but the API changes from version to versions, and it adds extra layer between your application and database, so I would expect it will perform worse than ODBC and JDBC counterparts. For more details, please have a look at this page: https://thrift.apache.org/lib/csharp.

So in my opinion, ODBC is the one to go.