Recently I discovered that the performance logs were missing from both the HiveServer2 and HiveMetaStore server logs. This makes troubleshooting performance-related issues very hard. The log messages I expect look something like the following:

HiveServer2 Log:

2020-01-02 08:30:26,450 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-898872]: <PERFLOG method=getSplits from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2020-01-02 08:30:26,507 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-898872]: </PERFLOG method=getSplits start=1577928626450 end=1577928626507 duration=57 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>

The log above tells me that the getSplits operation took 57 milliseconds to complete.

HiveMetaStore Log:

2020-01-10 09:59:37,151 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [pool-5-thread-28]: <PERFLOG method=get_tables from=org.apache.hadoop.hive.metastore.RetryingHMSHandler>
2020-01-10 09:59:37,157 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [pool-5-thread-28]: </PERFLOG method=get_tables start=1578610777151 end=1578610777157 duration=6 from=org.apache.hadoop.hive.metastore.RetryingHMSHandler threadId=28 retryCount=0 error=false>

The log above tells me that HMS spent 6 milliseconds getting the list of tables for a certain database.

If the “duration” value is high, we know exactly which stage is causing the slowness while troubleshooting. However, this information is missing from the HiveServer2 and HiveMetaStore logs in CDH6.
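Once these entries are present, spotting the slow stage can be partly automated. Below is a minimal Python sketch, assuming the closing `</PERFLOG ...>` entries follow the `key=value` layout shown in the samples above. The helper names `parse_perflog` and `slowest` are made up for illustration and are not part of Hive.

```python
import re

# Matches only the closing tag, which is the one that carries the duration.
CLOSE_TAG = re.compile(r"</PERFLOG (?P<attrs>[^>]*)>")

def parse_perflog(line):
    """Return the key=value attributes of a closing PERFLOG entry, or None."""
    match = CLOSE_TAG.search(line)
    if match is None:
        return None
    attrs = dict(
        pair.split("=", 1)
        for pair in match.group("attrs").split()
        if "=" in pair
    )
    # start, end and duration are millisecond values in the sample logs.
    for key in ("start", "end", "duration"):
        if key in attrs:
            attrs[key] = int(attrs[key])
    return attrs

def slowest(lines, min_duration_ms=0):
    """Return (duration, method, from) tuples, slowest first."""
    rows = []
    for line in lines:
        entry = parse_perflog(line)
        if entry and entry.get("duration", -1) >= min_duration_ms:
            rows.append((entry["duration"], entry.get("method"), entry.get("from")))
    return sorted(rows, key=lambda row: row[0], reverse=True)

# Sample entries copied from the log excerpts in this post.
SAMPLE_LINES = [
    "2020-01-02 08:30:26,450 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: "
    "[HiveServer2-Background-Pool: Thread-898872]: <PERFLOG method=getSplits "
    "from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "2020-01-02 08:30:26,507 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: "
    "[HiveServer2-Background-Pool: Thread-898872]: </PERFLOG method=getSplits "
    "start=1577928626450 end=1577928626507 duration=57 "
    "from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>",
    "2020-01-10 09:59:37,157 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: "
    "[pool-5-thread-28]: </PERFLOG method=get_tables start=1578610777151 "
    "end=1578610777157 duration=6 "
    "from=org.apache.hadoop.hive.metastore.RetryingHMSHandler "
    "threadId=28 retryCount=0 error=false>",
]

for duration, method, source in slowest(SAMPLE_LINES):
    print(f"{duration} ms  {method}  ({source})")
```

With the sample entries, getSplits (57 ms) is listed ahead of get_tables (6 ms). Passing `min_duration_ms` keeps only the entries at or above that threshold, which helps when a real log contains thousands of fast calls.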

To remedy this, follow the steps below:

  1. Go to the Cloudera Manager home page
  2. Click through to the Hive service and then its Configuration page
  3. Search for the following two configurations:
    1. Hive Metastore Server Logging Advanced Configuration Snippet (Safety Valve)
    2. HiveServer2 Logging Advanced Configuration Snippet (Safety Valve)
  4. Enter the following into the text area of both settings:
    rootLogger.appenderRefs=root, console, DRFA, PerfLogger
    logger.PerfLogger.name = org.apache.hadoop.hive.ql.log.PerfLogger
    logger.PerfLogger.level = DEBUG
  5. Save, then restart the Hive services
  6. Check both the HS2 and HMS logs to confirm that performance logging is in place
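The final check can also be scripted. The sketch below counts closing PERFLOG entries per method in a log file; an empty result suggests the change has not taken effect. The function name `count_perflog_methods` is hypothetical, and the log file path is an assumption that depends on your deployment, so it is left as a parameter.

```python
import re
from collections import Counter

# Captures the method name from a closing PERFLOG entry.
METHOD = re.compile(r"</PERFLOG method=(\S+)")

def count_perflog_methods(log_path):
    """Count closing PERFLOG entries per method in the given log file."""
    counts = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = METHOD.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts

# Example call (the path is a placeholder, not a real default):
#   count_perflog_methods("/path/to/hiveserver2.log")
```

Run it once against the HS2 log and once against the HMS log after the restart; each should return at least one method once some queries have been executed.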
