WebHCat Request Failed With Error “id: HTTP: no such user”

WebHCat, previously known as Templeton, is the REST API for HCatalog, a table and storage management layer for Hadoop. Users can use WebHCat to access metadata from HCatalog, as well as to submit MapReduce, Hive and Pig jobs.

Below is an example of how to retrieve the list of databases via the WebHCat API:

curl --negotiate -u: http://{webhcat-hostname}:50111/templeton/v1/ddl/database/

Please note that port 50111 is the default port for WebHCat. A sample output looks like below:

{"databases":["default","mytest","s3_test","udf_db"]}

However, recently I faced an issue where a WebHCat request failed with the below error:

2018-08-08 17:18:44,413 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hive (auth:PROXY) via HTTP/{webhcat-hostname}@CDH511.COM (auth:KERBEROS) cause:org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
2018-08-08 17:18:44,414 ERROR org.apache.hive.hcatalog.templeton.CatchallExceptionMapper: java.lang.reflect.UndeclaredThrowableException
java.io.IOException: java.lang.reflect.UndeclaredThrowableException
...
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
....
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:1882)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:1872)
....

From the stacktrace, we can see that WebHCat failed when trying to collect a delegation token from the Hive Metastore (HMS). Checking the HMS server log, I found the below error:

2018-08-08 17:18:19,675 WARN  org.apache.hadoop.security.ShellBasedUnixGroupsMapping: [pool-7-thread-2]: unable to return groups for user HTTP
PartialGroupNameException The user name 'HTTP' is not found. id: HTTP: no such user
id: HTTP: no such user

It is pretty clear that HMS failed because the "HTTP" user was missing. Adding the "HTTP" user on the HMS server host resolved the issue.

Researching further, I realized that this happened because, in the Hive configuration, hadoop.proxyuser.hive.groups was set to a list of groups rather than "*", and "HTTP" was one of the groups in that list. You will not get such an error if hadoop.proxyuser.hive.groups is set to "*"; it only fails when "HTTP" is added manually (it is required to be on this list if the value is not "*", because the "hive" user needs to be able to impersonate the "HTTP" user for the request to work).
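
For reference, the restricted setting looked conceptually like the snippet below. This is only a sketch: the group names are placeholders, and as explained above, "HTTP" has to be in the list whenever the value is not "*":

<property>
    <name>hadoop.proxyuser.hive.groups</name>
    <!-- Placeholder group list; per the above, "HTTP" must be included when the value is not "*" -->
    <value>hive,hue,HTTP</value>
</property>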

The reason for the failure is that when hadoop.proxyuser.hive.groups is set to "*", Hive will not bother checking for the user's existence, since every user is allowed. However, when a list of groups is defined here, Hive needs to resolve the impersonated user's group membership, which requires that the user exists on the host where Hive runs. In our case, the "HTTP" user did not exist on the HMS host, so HMS failed with the error we saw earlier. We just need to add this user to resolve the issue.
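
A minimal sketch of the check and the fix on the HMS host (the exact useradd options depend on your OS and on whether users come from a central directory such as LDAP/SSSD rather than local accounts):

# Check whether the OS on the HMS host can resolve the HTTP user and its groups.
id HTTP || echo "HTTP user is missing on this host"

# If it is missing, create it as a local system account (adjust to your environment).
sudo useradd -r HTTP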

Hope the above helps anyone who runs into the same issue.

Oozie Hive2 Action Failed with Error: “HiveSQLException: Failed to execute session hooks”

If you have an Oozie Hive2 job that fails randomly with the below error message, which can be found in Oozie's server log (located by default under /var/log/oozie):

2018-06-02 09:00:01,103 WARN org.apache.oozie.action.hadoop.Hive2Credentials: SERVER[hlp3058p.oocl.com] 
USER[dmsa_appln] GROUP[-] TOKEN[] APP[DMSA_CMTX_PCON_ETL_ONLY] JOB[0010548-180302135253124-oozie-oozi-W] 
ACTION[0010548-180302135253124-oozie-oozi-W@spark-6799] Exception in addtoJobConf
org.apache.hive.service.cli.HiveSQLException: Failed to execute session hooks
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:241)
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:232)
        at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:491)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:181)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
        at java.sql.DriverManager.getConnection(DriverManager.java:571)
        at java.sql.DriverManager.getConnection(DriverManager.java:233)
        at org.apache.oozie.action.hadoop.Hive2Credentials.addtoJobConf(Hive2Credentials.java:66)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.setCredentialTokens(JavaActionExecutor.java:1213)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1063)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1295)
        at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
        at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
        at org.apache.oozie.command.XCommand.call(XCommand.java:286)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to execute session hooks
        at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:308)
        at org.apache.hive.service.cli.CLIService.openSession(CLIService.java:178)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:422)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1253)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1238)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        ... 3 more
Caused by: java.lang.IllegalStateException: zip file closed
        at java.util.zip.ZipFile.ensureOpen(ZipFile.java:634)
        at java.util.zip.ZipFile.getEntry(ZipFile.java:305)
        at java.util.jar.JarFile.getEntry(JarFile.java:227)
        at sun.net.www.protocol.jar.URLJarFile.getEntry(URLJarFile.java:128)
        at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
        at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:150)
        at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:233)
        at javax.xml.parsers.SecuritySupport$4.run(SecuritySupport.java:94)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.xml.parsers.SecuritySupport.getResourceAsStream(SecuritySupport.java:87)
        at javax.xml.parsers.FactoryFinder.findJarServiceProvider(FactoryFinder.java:283)
        at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:255)
        at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:121)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2526)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2513)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2409)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:982)
        at org.apache.sentry.binding.hive.conf.HiveAuthzConf.<init>(HiveAuthzConf.java:162)
        at org.apache.sentry.binding.hive.HiveAuthzBindingHook.loadAuthzConf(HiveAuthzBindingHook.java:131)
        at org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook.run(HiveAuthzBindingSessionHook.java:108)
        at org.apache.hive.service.cli.session.SessionManager.executeSessionHooks(SessionManager.java:420)
        at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:300)
        ... 12 more

It is likely that you are hitting a possible issue with the JDK. Please refer to HADOOP-13809 for details. There is no proof at this stage that it is a JDK bug, but the workaround is at the JDK level. As mentioned in the JIRA, you can add the below parameter to HiveServer2's Java options:

-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl

If you are using Cloudera Manager, you can go to:

CM > Hive > Configuration > Search “Java configuration options for HiveServer2”

and add the above parameter to the end of the string (don't forget an extra space before it).

Then restart HiveServer2 through CM. This should help to avoid the issue.
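
After the restart, you can confirm the option was actually applied by checking the running HiveServer2 process. A quick sketch (the grep patterns are only examples):

# Look for the DocumentBuilderFactory flag on the HiveServer2 java command line.
ps -ef | grep '[H]iveServer2' | grep -o 'javax.xml.parsers.DocumentBuilderFactory=[^ ]*'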

Hive CLI Prints SLF4J Error to Standard Output

If you have both Hive and Spark running on the same cluster, chances are that Hive CLI will produce the following WARNING messages upon exit of each session:

WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked. 
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.

A sample of the full output looks like below:

[root@localhost ~]# hive -e "show tables;"

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/hive-common-1.1.0-cdh5.12.1.jar!/hive-log4j.properties
OK
sample_07
sample_08
web_logs
Time taken: 1.281 seconds, Fetched: 3 row(s)
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.

Even though Hive CLI has been deprecated in CDH, lots of enterprise users out there are still stuck with Hive CLI for legacy reasons, and it is not easy for them to migrate to Beeline or to query Hive through ODBC or JDBC in their applications.

This is not an issue if you just run Hive CLI from the command line and view the output. However, if you want to capture the result set from Hive CLI's stdout, it will be trouble. See the test case below:

[root@localhost ~]# output=`hive -e "show tables;"`

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/hive-common-1.1.0-cdh5.12.1.jar!/hive-log4j.properties
OK
Time taken: 1.281 seconds, Fetched: 3 row(s)

When you echo out the variable $output, it will contain the WARNING messages:

[root@localhost ~]# echo $output
sample_07
sample_08
web_logs
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.

And when you want to use this output as input elsewhere in your application, things will go wrong.

This happens due to the below code in Hive's launcher script (/opt/cloudera/parcels/CDH/lib/hive/bin/hive), which loads Spark's assembly JAR into Hive's CLASSPATH:

# add Spark assembly jar to the classpath
if [[ -n "$SPARK_HOME" && !("$HIVE_SKIP_SPARK_ASSEMBLY" = "true") ]]
then
  sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
  CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
fi

Luckily, from CDH 5.12.0 onwards, Cloudera has backported the upstream JIRA HIVE-12179, which added a check for an environment variable called "HIVE_SKIP_SPARK_ASSEMBLY". We can use this variable to disable the loading of the Spark JAR for Hive CLI if you do not need to use Hive on Spark.

So the workaround is as simple as setting "HIVE_SKIP_SPARK_ASSEMBLY" to "true" so that the "if" block will be skipped. See the example below:

[root@localhost ~]# output=`export HIVE_SKIP_SPARK_ASSEMBLY=true; hive -e "show tables;"`

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/hive-common-1.1.0-cdh5.12.1.jar!/hive-log4j.properties
OK
Time taken: 1.281 seconds, Fetched: 3 row(s)

And the output will be clean like below:

[root@localhost ~]# echo $output
sample_07
sample_08
web_logs

This workaround will not work if you need to use Hive on Spark in Hive CLI, because it essentially disables the loading of the Spark JAR. And of course, use of Hive CLI is strongly NOT recommended; migrating to Beeline or using ODBC/JDBC to connect to HiveServer2 is the right way to go in the long run.
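
If many of your scripts invoke Hive CLI, you may prefer to set the variable once instead of on every call. A minimal sketch, assuming Bash login shells that read /etc/profile.d (the file name is just an example):

# Make HIVE_SKIP_SPARK_ASSEMBLY=true the default for all login shells on this host.
echo 'export HIVE_SKIP_SPARK_ASSEMBLY=true' | sudo tee /etc/profile.d/hive_skip_spark_assembly.sh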

Hope the above information helps.

Hive Long Queries Causing ZooKeeper to Fail With OutOfMemory Error

I have seen lots of Hadoop users who are not using Big Data technology correctly. Hadoop is designed for Big Data, so it works well with large files, which is why the HDFS block size is set to 256MB or 512MB depending on the use case. However, lots of users, even from big corporate companies, are not utilizing the technology properly and end up with lots of small files and partitions for a given Hive table. Some users have millions of partitions and hundreds of millions of files stored in HDFS, each file only KBs in size. This puts lots of pressure on all components in the Hadoop ecosystem, including HDFS, HiveServer2, Impala and ZooKeeper, just to name a few.

In this particular post, I am going to discuss one side effect of such a usage pattern: combined with long Hive query strings, it can crash ZooKeeper with an OutOfMemory error.

The following was what happened:

1. A user runs a Hive query with a very long query string (around 100KB of characters)
2. The Hive query runs against a table with millions of partitions
3. The query scans through about 20,000 partitions
4. Since hive.support.concurrency=true, while the query is running Hive will try to create one ZNode per partition in ZooKeeper to indicate that those partitions are locked
5. Hive will also store the full query string against each ZNode for debugging purposes, so that when an issue happens, users can check the ZNode and see which query locked the partition (see the inspection sketch after this list)
6. With 20,000 partitions at 100KB each, we end up creating around 2GB of data in ZooKeeper, just for this one query
7. With multiple similar queries running, ZooKeeper can reach its memory limit in no time
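
When this is happening, you can see the lock ZNodes (and how much data each one holds) directly from the ZooKeeper CLI. A minimal sketch, assuming the default hive.zookeeper.namespace of "hive_zookeeper_namespace", a ZooKeeper server called zk-host, and an example path default/my_table (adjust all of these to your cluster):

# Inspect Hive's lock ZNodes in ZooKeeper; the paths below are examples only.
zookeeper-client -server zk-host:2181 <<'EOF'
ls /hive_zookeeper_namespace
ls /hive_zookeeper_namespace/default/my_table
EOF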

To overcome this problem, Hive introduced a new feature to control the number of characters stored against each ZNode in such a scenario, via the upstream JIRA HIVE-16334. This JIRA has been backported into CDH since 5.12.0.

However, the default size is 1MB (1,000,000 bytes), which is still big, so the above case will still happen. To work around this issue, we can simply reduce the number of characters of the Hive query string being stored, say to 10K. (Storing the query string is purely for debugging purposes, so in theory we could reduce it to a very small size, but that is probably not a good idea if you want to troubleshoot other issues, so 10K should be a good starting point.)

To do so, please follow the below steps (assuming that you are using Cloudera Manager):

1. Go to CM > Hive > Configuration > HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml
2. Enter the below into the text area (view as XML):

<property>
    <name>hive.lock.query.string.max.length</name>
    <value>10000</value>
    <description>The maximum length of the query string to store in the lock. Set it to 10K.</description>
</property>

3. Save and restart HiveServer2

After that, we should have less chance of hitting a ZooKeeper OutOfMemory error in the above scenario. However, the root cause was too many partitions, so the first priority is to reduce them as much as possible so that each query scans no more than about 1,000 partitions, which also gives better performance.
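
To get a quick idea of how many partitions a table carries in the first place, a simple check (a sketch; my_db.my_table is a hypothetical table name) is:

# Count the partitions of a table; the database and table names are just examples.
hive -e "USE my_db; SHOW PARTITIONS my_table;" 2>/dev/null | wc -l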

For users on CDH releases older than 5.12.0, the suggestion is to upgrade.

Hope the above helps.

Show Create Table Output Truncated for VIEW in Hive

Yesterday I was working on a very weird issue in Hive where SHOW CREATE TABLE for a VIEW returned partial output. This happened in Hue, Beeline as well as Hive CLI. Please see the test case below:

1. From the Hue interface (the output is truncated in the same way; screenshot omitted):

2. From Beeline:

0: jdbc:hive2://localhost:10000/default> SHOW CREATE TABLE test_view;
+------------------------------------+--+
|           createtab_stmt           |
+------------------------------------+--+
| CREATE VIEW `test_view` AS SELECT  |
|                                    |
| FROM `default`.`sample_07`         |
+------------------------------------+--+

3. From Hive CLI:

hive> SHOW CREATE TABLE test_view;
OK
CREATE VIEW `test_view` AS SELECT

FROM `default`.`sample_07`

After some research and testing on various CDH versions, I found out that the issue was caused by a "\t" (tab) character in the VIEW's create statement, and it only happens in CDH versions before 5.13.1. Any characters after the "\t" are removed from the output.
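
For reference, the behaviour can be reproduced on an affected CDH version (before 5.13.1) by creating a view whose definition contains a literal tab. A sketch, using the sample_07 table from the earlier examples:

# Create a view with a TAB character after SELECT ($(printf '\t') embeds the tab),
# then check the output; on affected versions it comes back truncated at the tab.
hive -e "CREATE VIEW test_view AS SELECT$(printf '\t')code, description FROM default.sample_07;"
hive -e "SHOW CREATE TABLE test_view;"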

What happens from CDH 5.13.1 is that Hive will normalize whitespace in the query string before saving it to the database, so this will not happen for newly created views. However, simply upgrading CDH will not resolve the issue for existing views, as the code that reads from the Hive Metastore database has not changed.

To fix the issue, we need to update the data stored in the Hive Metastore backend database. Please follow the below steps (this is for MySQL; for other database types, please consult your vendor):

1. STOP the Hive services, so that no one can use them to update Hive metadata by running any queries
2. Dump the whole MySQL database out into a text file using the "mysqldump" command line tool (please make sure all data is dumped correctly so that we can load it back)
3. Use a text editor to replace tab characters with spaces in the data stored in the "TBLS" table (see the sketch after this list)
4. Since we cannot RENAME databases in MySQL, just drop the database and re-create it
5. Reload the modified dump file back into the newly created database
6. Restart Hive and confirm that the issue is fixed.
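
A rough sketch of steps 2 to 5 for a MySQL-backed Metastore (the database name "metastore", the credentials and the file names are placeholders; review the modified dump carefully before dropping anything):

# 2. Dump the Metastore database (the name "metastore" is an example).
mysqldump -u root -p metastore > metastore_dump.sql

# 3. Replace literal tab characters with spaces, limited to the TBLS table rows.
sed -i '/INSERT INTO `TBLS`/s/\t/ /g' metastore_dump.sql

# 4 and 5. Re-create the database and load the modified dump back.
mysql -u root -p -e "DROP DATABASE metastore; CREATE DATABASE metastore;"
mysql -u root -p metastore < metastore_dump.sql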

It is better to test this in a DEV/TEST cluster before applying the same change in PROD, to minimize mistakes.

Hope the above helps.