I have seen lots of CDH users now have trouble setting up Hive/Impala ODBC drivers on Windows 10 machine to connect to remote Kerberized cluster recently. Connection keeps getting Kerberos related error messages. Like below:
[Cloudera][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credential cache is empty).
[Cloudera][ImpalaODBC] (100) Error from the Impala Thrift API: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No credentials cache found)
To help CDH users to get it working without much hassle, I would like to compile a list of steps below for reference. I have tested this in my VM Windows 10.
1. For Kerberos authentication to work, you need to get a valid Kerberos ticket on your client machine, which is Windows 10. Hence, you will need to download and install MIT Kerberos client tool so that you can authenticate yourself against the remote cluster, much like running “kinit” on Linux.
To get the tool, please visit http://web.mit.edu/kerberos/dist and follow the links
2. In order for client machine to talk to remote KDC server that contains principal database, we need a valid krb5 configuration file on client side. This file normally sits under /etc/krb5.conf on Linux. On Windows 10, it should be under C:\ProgramData\MIT\Kerberos5\krb5.ini. Please copy the krb5.conf file in your cluster and then copy to this location on your Windows machine. Please be aware that the file name in Windows should be krb5.ini, not krb5.conf. Also note that C:\ProgramData is a hidden directory, so you will need to unhide it first from File Explorer before you can access the files underneath it.
3. Make sure that you connect to correct port number, for Hive, it is normally 10000 by default. For Impala, it should be 21050, NOT 21000, which is used by impala-shell.
If you have Load Balancer setup for either Hive or Impala, then the port number could also be different, please consult with your system admin to get the correct port number if this is the case.
4. Add Windows system variable KRB5CCNAME with value of “C:\krb5\krb5cc”, where “krb5cc” is a file name for the kerberos ticket cache, it can be anything, but we commonly use krb5cc or krb5cache. To do so, please follow steps below:
a. open “File Explorer”
b. right click on “This PC”
c. select “Properties”
d. next to “Computer name”, click on “Change settings”
e. click on “Advanced” tab and then “Environment Variables”
f. under “System Variables”, click on “New”
g. enter “KRB5CCNAME” in “Variable name” and “C:\krb5\krb5cc” in “Variable value” (without double quotes)
h. click on “OK” and then “OK” again
i. restart Windows
5. If you have SSL enabled for either Hive or Impala, you will also need to “Enable SSL” for ODBC driver. This can be found under “SSL Options” popup window, see below screenshot for details:
Please note that “SSL Options” is only available in newer version of ODBC driver, if you do not see this option, please upgrade ODBC driver to latest version. At the time of writing, Hive ODBC Driver is at 2.5.24.
That should be it. The above are the common missing steps by Windows users when trying to connect to Hive or Impala via ODBC. If you have encountered other problems that need extra steps, please leave a comment below and I will update my post.
Hope above helps.