Hive LPAD function hangs HiveServer2 and subsequent queries

Recently I noticed an issue in Hive where running the query below hangs HiveServer2 indefinitely until it is cancelled. And because of HiveServer2’s global compilation lock, it also prevents any further queries from being submitted to Hive, so all subsequent queries appear to be queued.

SELECT lpad("String",10,'');

After some research, I concluded that it was caused by an upstream Hive bug: HIVE-15792. This has been fixed in upstream Hive 2.3.0, and in CDH5.12.1 and CDH5.13.0 onwards.

Currently, the only workaround is to pass a non-empty string as the third parameter of the function, so the query below works:

SELECT lpad("String",10,'v');

The issue is caused by padding logic that loops infinitely when an empty string is passed in as the pad string. You can have a look at the patch attached to the JIRA for details of the change.

Hope the above information helps.

Redirect Docker/Kubernetes logs out of /var/log/messages for Cloudera Data Science Workbench

After you install Cloudera Data Science Workbench (CDSW) in a cluster, the log messages for Docker and Kubernetes all go into /var/log/messages by default. This is fine in most cases, but some users might not want these logs mixed in with those of other processes, so separating them out of /var/log/messages can make sense.

To do this, follow the steps below:

1. Create the file /etc/rsyslog.d/10-docker.conf
2. Put the following contents into that file:

# Docker logging
*.* {
 /var/log/docker.log
 stop
}

3. Configure logrotate to roll and archive the new log file by creating the file /etc/logrotate.d/docker
4. Put the following contents into /etc/logrotate.d/docker:

/var/log/docker.log {
    size 100M
    rotate 2
    missingok
    compress
}

5. Finally, restart rsyslog:

service rsyslog restart

After completing all of the above steps, the Docker and Kubernetes logging for CDSW should be redirected to /var/log/docker.log; no restart of CDSW is required. You can verify the result with the quick checks below.
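As an optional sanity check (the paths below simply match the configuration created in the steps above), confirm that new messages are landing in the new file and that the logrotate rule parses cleanly:

# Confirm Docker/Kubernetes messages are now written to the new file
tail -n 20 /var/log/docker.log

# Dry-run the logrotate rule; -d only reports what would happen, nothing is rotated
logrotate -d /etc/logrotate.d/docker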

Note: this post was based on the steps from Mark Wolfe’s blog. The difference is that instead of “daemon.*” in step 2, I needed to use “*.*”; otherwise only the Kubernetes logs go into /var/log/docker.log, not the dockerd logs.

HiveMetaStore Failed to Start in Cloudera Manager: cmf.service.config.ConfigGenException: Unable to generate config file creds.localjceks

Recently I was dealing with an issue where HiveMetaStore failed to start in a Cloudera Manager managed environment. It failed with the errors below:

Caused by: com.cloudera.cmf.service.config.ConfigGenException: Unable to generate config file creds.localjceks
        at com.cloudera.cmf.service.config.JceksConfigFileGenerator.generate(JceksConfigFileGenerator.java:63)
        at com.cloudera.cmf.service.HandlerUtil.emitConfigFiles(HandlerUtil.java:133)
        at com.cloudera.cmf.service.AbstractRoleHandler.generateConfiguration(AbstractRoleHandler.java:887)

This problem is very common if you have either of the following misconfigurations in your cluster:

1. A wrong version of Java is being used. For the list of Java versions supported by Cloudera, please refer to the link below:
CDH and Cloudera Manager Supported JDK Versions

2. Different versions of Java are used across the cluster hosts.

So run:

java -version

on every host, and check the symlinks under /usr/java/jdk****-cloudera to confirm they are consistent across the whole cluster.
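To save logging on to each machine by hand, a loop like the one below can compare the JDK on every node. This is only a sketch: it assumes passwordless SSH access and a hypothetical hosts.txt file listing one cluster hostname per line.

# hosts.txt is an assumed file containing one cluster hostname per line
while read -r host; do
  echo "===== ${host} ====="
  # java -version prints to stderr, hence the 2>&1 redirect
  ssh "${host}" 'java -version 2>&1 | head -n 1; ls -l /usr/java/ | grep -i cloudera'
done < hosts.txt

If any host reports a different JDK version or symlink target, align it with the rest of the cluster first.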

After all of the above has been checked, try to restart the failed service; most likely the issue will be resolved. If not, please let me know in the comments below.