Hive Long Queries Causing ZooKeeper Fail With OutOfMemory Error

I have seen lots of Hadoop users are not using Big Data technology correctly. Hadoop is designed for Big Data, so it works well with large file sizes and this is why we have block size for HDFS as 256MB or 512MB depending on use cases. However, lots of users, even from big corporate companies are not utilizing such technology by having lots of small files and partitions for a given Hive table. Some particular users have millions of partitions and hundreds of millions of files stored in HDFS, each file is in KB of size. This puts lots of pressure on all components in the Hadoop echo system, including HDFS, HiveServer2, Impala, ZooKeeper just to name a few.

In this particular post, I am going to discuss one of the side effect of such usage pattern that it will crash ZooKeeper with OutOfMemory error, combined with large string of Hive query being run.

The following was what happened:

1. User runs a Hive query with very long string (contains 100KB of characters)
2. This Hive query runs against a table with millions of partitions
3. The query will scan through about 20,000 partition
4. Since hive.support.concurrency=true, when the query is running, Hive will try to create one ZNode per partition in ZooKeeper to indicate that those partitions are locked
5. Hive will also store the full Hive query string against each ZNode for debugging purpose, so that when issue happened, user can check ZNode and see which query locks the partition
6. So we have 20,000 partitions * 100K each, we will end up creating 2GB of data in ZooKeeper, just for this query alone
7. If we have multiple similar queries, ZooKeeper can reach to memory limit easily in no time

To overcome this problem, Hive introduced a new feature to control the number of characters to be stored against each ZNode in such scenario, via upstream JIRA HIVE-16334. This JIRA has been backported into CDH since 5.12.0.

However, the default size is 1MB (1,000,000 bytes), which is still big and above case will still happen. To work around this issue, we can simply reduce the number of Hive query being stored, say to 10K. (Storing of the query string is purely for debugging purpose, so in theory we can reduce to a very small size, but probably not a good idea if you want to troubleshoot other issues, so 10K should be a good starting point).

To do so, please follow below steps (assuming that you are using Cloudera Manager):

1. Go go CM > Hive > Configuration > HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml
2. Enter below into textarea (view as XML):

<property>
    <name>hive.lock.query.string.max.length</name>
    <value>10000</value>
    <description>The maximum length of the query string to store in the lock. Set it to 10K.</description>
</property>

3. Save and restart HiveServer2

After that, we should have less chance of hitting ZooKeeper OutOfMemory in the above scenario. However, the root cause was due to too many partitions, so the first priority is to reduce as much as possible so that each query will not scan more than 1000 partitions to get good performance.

For users using CDH older than CDH 5.12.0, suggestion is to upgrade.

Hope above helps.

HiveServer2 Failed to Start With “Table/View ‘DBS’ does not exist”

Recently I was dealing with an issue that after user upgraded CDH to 5.12.x, HiveServer2 was unable to start up. Examining the HiveServer2 log we noticed that for some reason HiveServer2 was try to access HiveMetaStore database and returned table does not exist error. See below full stacktrace:

2017-12-01 20:17:15,419 WARN  org.apache.hadoop.hive.metastore.MetaStoreDirectSql: [Thread-13]: Self-test query [select "DB_ID" from "DBS"] failed; direct SQL is disabled
javax.jdo.JDODataStoreException: Error executing SQL query "select "DB_ID" from "DBS"".
        at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
        at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:230)
        at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.runTestQuery(MetaStoreDirectSql.java:226)
        at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:134)
        at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:347)
        at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:298)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:60)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:69)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:682)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:660)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:713)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:508)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:461)
        at org.apache.hive.service.auth.HiveAuthFactory.<init>(HiveAuthFactory.java:118)
        at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:58)
        at java.lang.Thread.run(Thread.java:745)
NestedThrowablesStackTrace:
java.sql.SQLSyntaxErrorException: Table/View 'DBS' does not exist.
        at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
        at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
        at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
        at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
        at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedPreparedStatement.<init>(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedPreparedStatement42.<init>(Unknown Source)
        at org.apache.derby.jdbc.Driver42.newEmbedPreparedStatement(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown Source)
        at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown Source)
        at com.jolbox.bonecp.ConnectionHandle.prepareStatement(ConnectionHandle.java:1193)
        at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:350)
        at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:194)
        at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:267)
        at org.datanucleus.store.query.Query.executeQuery(Query.java:1786)
        at org.datanucleus.store.query.AbstractSQLQuery.executeWithArray(AbstractSQLQuery.java:339)
        at org.datanucleus.store.query.Query.execute(Query.java:1654)
        at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
        at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.runTestQuery(MetaStoreDirectSql.java:226)
        at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:134)
        at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:347)
        at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:298)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:60)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:69)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:682)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:660)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:713)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:508)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:461)
        at org.apache.hive.service.auth.HiveAuthFactory.<init>(HiveAuthFactory.java:118)
        at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:58)
        at java.lang.Thread.run(Thread.java:745)
Caused by: ERROR 42X05: Table/View 'DBS' does not exist.
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
        at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
        at org.apache.derby.impl.sql.compile.FromBaseTable.bindTableDescriptor(Unknown Source)
        at org.apache.derby.impl.sql.compile.FromBaseTable.bindNonVTITables(Unknown Source)
        at org.apache.derby.impl.sql.compile.FromList.bindTables(Unknown Source)
        at org.apache.derby.impl.sql.compile.SelectNode.bindNonVTITables(Unknown Source)
        at org.apache.derby.impl.sql.compile.DMLStatementNode.bindTables(Unknown Source)
        at org.apache.derby.impl.sql.compile.DMLStatementNode.bind(Unknown Source)
        at org.apache.derby.impl.sql.compile.CursorNode.bindStatement(Unknown Source)
        at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown Source)
        at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source)
        at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown Source)
        ... 29 more

If you look closely at the stacktrace, you will notice that HiveServer2 was trying to access Derby database for those tables and hence failed with error. This does not make sense because:

1. The backend database was configured to use MySQL, not Derby
2. The database access should happen at HiveMetaStore service, not HiveServer2

After researching, I found out that this was actually caused by the following:

– A known issue in Cloudera Manager, that when you chose DBTokenStore as the HiveMetaStore Delegation Token Store, Cloudera Manager will also HiveServer2 to use DBTokenStore
– HiveServer2 does not support DBTokenStore until CDH 5.13.0 (introduced by HIVE-12270)
– When DBTokenStore was used by HiveServer2, the behaviour for HiveServer2 is that it will use Derby as the backend for the DBTokenStore and somehow it will try to query the tables that supposed to be used by HiveMetaStore (I have not figured out why yet)

Since user had HiveMetaStore HA, roll back to MemoryTokenStore was not an option, so the solution is to use ZookeeperTokenStore instead, which is supported by both HiveMetaStore as well as HiveServer2.

So simply go to CM > Hive > Configuration > “Hive Metastore Delegation Token Store” and select “org.apache.hadoop.hive.thrift.ZooKeeperTokenStore“, see below screenshot:

Save and then restart Hive services.

“No data or no sasl data in the stream” Error in HiveServer2 Log

I have seen lots of users complain about seeing lots of “No data or no sasl data in the stream” errors in the HiveServer2 server log, yet they have not noticed any performance impact nor query failure for Hive. So I think it would be good to write a blog about the possible reason behind this to clarify and remove the concerns that users have.

The following shows the full error message and stacktrace taken from HiveServer2 log:

ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-533556]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1687)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream

The likely cause is below:

  1. You have kerberos enabled
  2. You have multiple HiveServer2 hosts
  3. You have Load Balancer enabled in front of all HS2 servers that have such errors

If you have above setup, the error message you saw in HiveServer2 is harmless and can be safely ignored. This just indicated that SASL negotiation failed for one particular Hive client, which in this case would be the Load Balancer who pings regularly to those HiveServer2’s to check for connectivity. Those pings from LB were trying with PLAIN TCP connection, hence those messages.

There are a couple of ways to avoid those messages:

1. Reduce the frequency of pings from LB, this will reduce the errors in the log, however, won’t avoid it. I do not know a way to configure the LB to avoid PLAIN TCP connection, this is outside of scope of this blog, you might need to consult to F5 or HAProxy manual for further info.

2. Add filter to HiveServer2’s logging to filter out those exceptions:

a. Using Cloudera Manager, navigate to Hive > Configuration > “HiveServer2 Logging Advanced Configuration Snippet (Safety Valve)”
b. Copy and paste the the following configuration into the safety valve:

log4j.appender.RFA.filter.1=org.apache.log4j.filter.ExpressionFilter 
log4j.appender.RFA.filter.1.Expression=EXCEPTION ~= org.apache.thrift.transport.TSaslTransportException 
log4j.appender.RFA.filter.1.AcceptOnMatch=false

c. Then save and restart HiveServer2 service through Cloudera Manager.

Hope above helps.

Enabling Kerberos Debug for Hive

From time to time, we need to do troubleshooting steps for locating the root cause of Kerberos failure in Hive. I will outline below steps in order to turn on debugging message from both Client and HiveServer2 server side.

  1. To enable on Hive Client side (beeline), simply add the following export commands before you run beeline command:
    export HADOOP_JAAS_DEBUG=true;
    export HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'
    

    Then the debug message will be printed on the shell when you run beeline.

  2. To enable kerberos debug on HiveServer2 side (assuming you are using Cloudera Manager)
    1. To to CM > Hive > Configuration
    2. locate “HiveServer2 Environment Advanced Configuration Snippet (Safety Valve)”
    3. add following to the textarea:
      HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'
      
    4. Save and restart Hive service

    Once restarted, you will be able to locate the kerberos debug message from HiveServer2’s process directory on the server host, which is located under /var/run/cloudera-scm-agent/process/XXX-hive-HIVESERVER2/logs/stdout.log, where XXX is the largest number under the directory for HiveServer2

The sample debug message for kerberos looks like below:

Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
[UnixLoginModule]: succeeded importing info:
uid = 0
gid = 0
supp gid = 0
Debug is true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
>>>KinitOptions cache name is /tmp/krb5cc_0
>>>DEBUG client principal is impala/{host-name}@REAL.COM
>>>DEBUG server principal is krbtgt/REAL.COM@REAL.COM
>>>DEBUG key type: 23
>>>DEBUG auth time: Sun Aug 13 21:07:46 PDT 2017
>>>DEBUG start time: Sun Aug 13 21:07:46 PDT 2017
>>>DEBUG end time: Mon Aug 14 07:07:46 PDT 2017
>>>DEBUG renew_till time: Sun Aug 20 21:07:46 PDT 2017
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
>>>DEBUG client principal is impala/{host-name}@REAL.COM
>>>DEBUG server principal is X-CACHECONF:/krb5_ccache_conf_data/pa_type/krbtgt/REAL.COM@REAL.COM
>>>DEBUG key type: 0
>>>DEBUG auth time: Wed Dec 31 16:00:00 PST 1969
>>>DEBUG start time: null
>>>DEBUG end time: Wed Dec 31 16:00:00 PST 1969
>>>DEBUG renew_till time: null
>>> CCacheInputStream: readFlags()
Principal is impala/{host-name}@REAL.COM
[UnixLoginModule]: added UnixPrincipal,
UnixNumericUserPrincipal,
UnixNumericGroupPrincipal(s),
to Subject
Commit Succeeded

Search Subject for Kerberos V5 INIT cred (<>, sun.security.jgss.krb5.Krb5InitCredential)
Found ticket for impala/{host-name}@REAL.COM to go to krbtgt/REAL.COM@REAL.COM expiring on Mon Aug 14 07:07:46 PDT 2017
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for impala/{host-name}@REAL.COM to go to krbtgt/REAL.COM@REAL.COM expiring on Mon Aug 14 07:07:46 PDT 2017
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
default etypes for default_tgs_enctypes: 23.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KdcAccessibility: reset
>>> KrbKdcReq send: kdc=kdc-host.com TCP:88, timeout=3000, number of retries =3, #bytes=1607
>>> KDCCommunication: kdc=kdc-host.com TCP:88, timeout=3000,Attempt =1, #bytes=1607
>>>DEBUG: TCPClient reading 1581 bytes
>>> KrbKdcReq send: #bytes read=1581
>>> KdcAccessibility: remove kdc-host.com
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KrbApReq: APOptions are 00100000 00000000 00000000 00000000
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
Krb5Context setting mySeqNumber to: 789412608
Created InitSecContextToken:

From above message, you can see at least below info:

  • Client config file for kerberos /etc/krb5.conf
  • Ticket case file: /tmp/krb5cc_0
  • Client principal name: impala/{host-name}@REAL.COM
  • KDC server host: kdc=kdc-host.com and using TCP connection via port 88 (TCP:88)
  • and a lot more others that might be useful for your troubleshooting

Hope above helps.

How to enable HiveServer2 audit log through Cloudera Manager

This article explains the steps required to enable audit log for HiveServer2, so that all queries run through HiveServer2 will be audited into a central log file.

Please follow the steps below:

  1. Go to Cloudera Manager home page > Hive > Configuration
  2. Tick “Enable Audit Collection”
  3. Ensure “Audit Log Directory” location point to a path that has enough disk space
  4. Go to Cloudera Manager home page > click on “Cloudera Management Service” > Instances
  5. Click on “Add Role Instances” button on the top right corner of the page
  6. Choose a host for Navigator Audit Server & Navigator Metadata Server
  7. Then follow on screen instructions to finish adding the new roles
  8. Once the roles are added successfully, Cloudera Manager will ask you to restart a few services, including Hive
  9. Go ahead and restart Hive

After restarting, Hive’s audit log will be enabled and logged into /var/log/hive/audit directory by default.

Please note that you are not required start Navigator services, so if you don’t need them running, you can just leave them at STOP state, the Hive’s audit logs should still function as normal. However, it is a requirement to have Navigator installed for the audit log to function properly, as there are some libraries from Navigator are required for audit to work.