Enabling Kerberos Debug for Hive

From time to time, we need to troubleshoot to locate the root cause of Kerberos failures in Hive. Below I outline the steps to turn on debug messages on both the client (beeline) side and the HiveServer2 side.

  1. To enable debugging on the Hive client side (beeline), simply run the following export commands before you run the beeline command:
    export HADOOP_JAAS_DEBUG=true
    export HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'
    

    Then the debug messages will be printed to the shell when you run beeline (see the example after these steps).

  2. To enable Kerberos debugging on the HiveServer2 side (assuming you are using Cloudera Manager)
    1. Go to CM > Hive > Configuration
    2. Locate “HiveServer2 Environment Advanced Configuration Snippet (Safety Valve)”
    3. Add the following to the textarea:
      HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'
      
    4. Save and restart the Hive service

    Once restarted, you will be able to find the Kerberos debug messages in HiveServer2’s process directory on the server host, under /var/run/cloudera-scm-agent/process/XXX-hive-HIVESERVER2/logs/stdout.log, where XXX is the largest number under the process directory for HiveServer2.
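As a quick end-to-end sketch, the commands below show a debug-enabled beeline session on the client and how to tail the latest HiveServer2 stdout.log on the server host. The JDBC URL, host name and principal are placeholders for illustration only, not values from any particular cluster.

# Client side: enable Kerberos/JGSS debugging, then connect with beeline
export HADOOP_JAAS_DEBUG=true
export HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default;principal=hive/hs2-host.example.com@REAL.COM"

# Server side: tail stdout.log of the most recently created HiveServer2 process directory
tail -f "$(ls -dt /var/run/cloudera-scm-agent/process/*-hive-HIVESERVER2 | head -1)/logs/stdout.log"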

A sample Kerberos debug message looks like the following:

Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
[UnixLoginModule]: succeeded importing info:
uid = 0
gid = 0
supp gid = 0
Debug is true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
>>>KinitOptions cache name is /tmp/krb5cc_0
>>>DEBUG client principal is impala/{host-name}@REAL.COM
>>>DEBUG server principal is krbtgt/REAL.COM@REAL.COM
>>>DEBUG key type: 23
>>>DEBUG auth time: Sun Aug 13 21:07:46 PDT 2017
>>>DEBUG start time: Sun Aug 13 21:07:46 PDT 2017
>>>DEBUG end time: Mon Aug 14 07:07:46 PDT 2017
>>>DEBUG renew_till time: Sun Aug 20 21:07:46 PDT 2017
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
>>>DEBUG client principal is impala/{host-name}@REAL.COM
>>>DEBUG server principal is X-CACHECONF:/krb5_ccache_conf_data/pa_type/krbtgt/REAL.COM@REAL.COM
>>>DEBUG key type: 0
>>>DEBUG auth time: Wed Dec 31 16:00:00 PST 1969
>>>DEBUG start time: null
>>>DEBUG end time: Wed Dec 31 16:00:00 PST 1969
>>>DEBUG renew_till time: null
>>> CCacheInputStream: readFlags()
Principal is impala/{host-name}@REAL.COM
[UnixLoginModule]: added UnixPrincipal,
UnixNumericUserPrincipal,
UnixNumericGroupPrincipal(s),
to Subject
Commit Succeeded

Search Subject for Kerberos V5 INIT cred (<>, sun.security.jgss.krb5.Krb5InitCredential)
Found ticket for impala/{host-name}@REAL.COM to go to krbtgt/REAL.COM@REAL.COM expiring on Mon Aug 14 07:07:46 PDT 2017
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for impala/{host-name}@REAL.COM to go to krbtgt/REAL.COM@REAL.COM expiring on Mon Aug 14 07:07:46 PDT 2017
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
default etypes for default_tgs_enctypes: 23.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KdcAccessibility: reset
>>> KrbKdcReq send: kdc=kdc-host.com TCP:88, timeout=3000, number of retries =3, #bytes=1607
>>> KDCCommunication: kdc=kdc-host.com TCP:88, timeout=3000,Attempt =1, #bytes=1607
>>>DEBUG: TCPClient reading 1581 bytes
>>> KrbKdcReq send: #bytes read=1581
>>> KdcAccessibility: remove kdc-host.com
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KrbApReq: APOptions are 00100000 00000000 00000000 00000000
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
Krb5Context setting mySeqNumber to: 789412608
Created InitSecContextToken:

From the above messages, you can see at least the following information:

  • Kerberos client config file: /etc/krb5.conf
  • Ticket cache file: /tmp/krb5cc_0
  • Client principal name: impala/{host-name}@REAL.COM
  • KDC server host: kdc-host.com, using a TCP connection via port 88 (TCP:88)
  • and a lot more that might be useful for your troubleshooting

Hope the above helps.

How to enable HiveServer2 audit log through Cloudera Manager

This article explains the steps required to enable the audit log for HiveServer2, so that all queries run through HiveServer2 will be audited to a central log file.

Please follow the steps below:

  1. Go to Cloudera Manager home page > Hive > Configuration
  2. Tick “Enable Audit Collection”
  3. Ensure the “Audit Log Directory” location points to a path that has enough disk space
  4. Go to Cloudera Manager home page > click on “Cloudera Management Service” > Instances
  5. Click on “Add Role Instances” button on the top right corner of the page
  6. Choose a host for Navigator Audit Server & Navigator Metadata Server
  7. Then follow on screen instructions to finish adding the new roles
  8. Once the roles are added successfully, Cloudera Manager will ask you to restart a few services, including Hive
  9. Go ahead and restart Hive

After restarting, Hive’s audit log will be enabled and written to the /var/log/hive/audit directory by default.
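As a quick check (a sketch only, since the exact file names inside the directory can vary), you can confirm that audit files are being created and updated as queries run through HiveServer2:

# Default location from the “Audit Log Directory” setting above; adjust if you changed it
ls -lt /var/log/hive/audit/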

Please note that you are not required to start the Navigator services, so if you don’t need them running, you can just leave them in the STOPPED state and Hive’s audit logs should still function as normal. However, Navigator does need to be installed for the audit log to function properly, as some libraries from Navigator are required for auditing to work.

How to query a multi-delimited table in Hive

This article explains how to query a multi-delimited Hive table in CDH 5.3.x.

The use case is as follows:

We have the following table definition:

CREATE TABLE test_multi 
(a string, b string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
WITH SERDEPROPERTIES (
    "field.delim"="#|",
    "collection.delim"=":",
    "mapkey.delim"="@"
);
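For context, the single row shown in the next query came from a plain text file whose fields are separated by the “#|” delimiter. The file path below is just an example, not part of the original setup:

-- /tmp/test_multi.txt contains one line:  eric#|test more
LOAD DATA LOCAL INPATH '/tmp/test_multi.txt' INTO TABLE test_multi;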

Querying all columns is OK:

select * from test_multi;
+---------------+---------------+
| test_multi.a  | test_multi.b  |
+---------------+---------------+
| eric          | test more     |
+---------------+---------------+
1 row selected (1.58 seconds)

However, querying a single column produces the following error in HS2:

select a from test_multi;
Error: Error while processing statement: FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

The corresponding stack trace in the HiveServer2 log shows the root cause:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: 
Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
        at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334)
        at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126)
        ... 22 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1953)
        at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:304)
        ... 24 more

This happens in CDH 5.3.3, which ships Hive 0.13, and I am not sure whether it also applies to CDH 5.4.x and CDH 5.5.x.

This is caused by the class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not being loaded into Hive’s classpath when HiveServer2 starts up. We need to add the JAR file that contains this class to Hive’s AUX directory.

The steps are as follows:

1) Locate the AUX directory for HiveServer2; if you don’t have one, create one and update Hive’s configuration through Cloudera Manager. If you don’t use Cloudera Manager, simply create a directory on the HiveServer2 host; in my case it is /hive/jars.

2) Create a symlink to the file /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar (or, if not using Cloudera Manager, /usr/lib/hive/lib/hive-contrib.jar) from within /hive/jars:

ln -s /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar /hive/jars/hive-contrib.jar

3) If you don’t use Cloudera Manager, add the following:

<property>
  <name>hive.aux.jars.path</name>
  <value>/hive/jars/hive-contrib.jar</value>
</property>

to hive-site.xml for HiveServer2.

If you use Cloudera Manager, simply go to step 4.

4) Restart HiveServer2.

This should remove the error we saw earlier in the post and get the Hive query working.
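As a quick verification after the restart, re-running the query that failed earlier should now succeed and return the row we saw with select *:

-- Previously failed with ClassNotFoundException for MultiDelimitSerDe;
-- after adding hive-contrib.jar to the AUX path and restarting HiveServer2
-- it should run as a MapReduce job and return the value "eric".
select a from test_multi;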

Hope this helps.

Kerberos connections to HiveServer2 not working cross-domain

The following is the scenario for the cross-domain problem with a Kerberized cluster:

1. The cluster is within realm “DEV.EXAMPLE.COM”
2. The client is outside the cluster, in realm “EXAMPLE.COM”
3. Connecting to Impala from the client machine works
4. Connecting to HS2 from the client machine does not work and produces the following error:

java.lang.IllegalArgumentException: Illegal principal name <user>@EXAMPLE.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.User.<init>(User.java:50)
	at org.apache.hadoop.security.User.<init>(User.java:43)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1221)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1205)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:689)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
	at org.apache.hadoop.security.User.<init>(User.java:48)
	... 8 more

This is caused by HDFS not resolving the principal from the cross-domain realm to a local user in the cluster. To fix the issue, follow the steps below (for clusters not managed by Cloudera Manager, see the equivalent auth_to_local sketch after the steps):

1. In Cloudera Manager go to HDFS > Configuration > search for “Trusted Kerberos Realms” > add “EXAMPLE.COM” to the list
2. Restart HS2 first
3. Confirm that we can now connect to HS2 from the client
4. Restart the rest of the services
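
For clusters that are not managed by Cloudera Manager, a roughly equivalent approach is to add auth_to_local rules to core-site.xml so that principals from the trusted realm are mapped to local short names. This is only a minimal sketch using the realm from this example; adjust the realm and rules to your environment:

<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    RULE:[2:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    DEFAULT
  </value>
</property>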

This should allow users to connect to HS2 from outside the cluster’s realm.