Recently I dealt with an issue where HiveServer2 could not start up and kept failing with a NullPointerException. The full stack trace is below:

2019-07-10 17:34:55,243 INFO  org.apache.hive.service.server.HiveServer2: [main]: Exception caught when calling stop of HiveServer2 before retrying start
java.lang.NullPointerException
        at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:483)
        at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:571)
...
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.NullPointerException)
        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:220)
        at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:338)
        at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:299)
        at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:274)
        at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:256)
        at org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider.init(DefaultHiveAuthorizationProvider.java:29)
        at org.apache.hadoop.hive.ql.security.authorization.HiveAuthorizationProviderBase.setConf(HiveAuthorizationProviderBase.java:112)
        ... 21 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.NullPointerException)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3646)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:231)
        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:215)
        ... 27 more
Caused by: MetaException(message:java.lang.NullPointerException)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_all_functions_result$get_all_functions_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_all_functions_result$get_all_functions_resultStandardScheme.read(ThriftHiveMetastore.java)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_all_functions_result.read(ThriftHiveMetastore.java)
....

Since the stack trace showed that HiveServer2 failed to start while loading all functions (Hive.getAllFunctions, via the metastore's get_all_functions call), I focused my analysis on the list of functions in the FUNCS table in the HMS backend database.

After trying different ways to reproduce the issue, I narrowed it down to stale DB_ID values in the FUNCS table: they pointed to databases that no longer had a matching DB_ID in the DBS table, which stores the list of databases in Hive. See the output from MySQL below:

mysql> SELECT DISTINCT DB_ID FROM DBS;
+--------+
| DB_ID  |
+--------+
| 1      |
+--------+
1 row in set (0.00 sec)
mysql> SELECT FUNC_ID, f.DB_ID FROM FUNCS f LEFT JOIN DBS d ON (f.DB_ID = d.DB_ID) WHERE d.DB_ID IS NULL;
+---------+--------+
| FUNC_ID | DB_ID  |
+---------+--------+
|     278 |      1 |
|     362 |      1 |
|     363 |      1 |
|     366 |      1 |
|     367 |      1 |
|     371 |      1 |
|      21 |   2253 |
|     361 |   2259 |
|      36 |   2434 |
|     161 |   2481 |
....
+---------+--------+ 
47 rows in set (0.02 sec)

The query above returns every row in the FUNCS table whose DB_ID has no corresponding row in the DBS table. When HiveServer2 tried to load the functions with these broken database links, it could not retrieve them and failed with the NPE.
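To see what the LEFT JOIN ... IS NULL pattern picks up, here is a minimal sketch that runs the same query against a toy in-memory SQLite database instead of the real MySQL metastore. The table contents, the NAME and FUNC_NAME columns, and the helper name find_orphan_functions are made up for illustration; only the query shape comes from the post.

```python
import sqlite3


def find_orphan_functions(conn):
    """Return (FUNC_ID, DB_ID) pairs in FUNCS whose DB_ID has no row in DBS."""
    return conn.execute(
        "SELECT f.FUNC_ID, f.DB_ID "
        "FROM FUNCS f LEFT JOIN DBS d ON f.DB_ID = d.DB_ID "
        "WHERE d.DB_ID IS NULL "
        "ORDER BY f.FUNC_ID"
    ).fetchall()


# Toy stand-in for the metastore tables (assumed, heavily simplified columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE FUNCS (FUNC_ID INTEGER PRIMARY KEY, FUNC_NAME TEXT, DB_ID INTEGER);
    INSERT INTO DBS VALUES (1, 'default'), (2, 'sales');
    -- Function 11 points at database 99, which does not exist in DBS.
    INSERT INTO FUNCS VALUES (10, 'f_ok', 1), (11, 'f_stale', 99), (12, 'f_ok2', 2);
""")

orphans = find_orphan_functions(conn)
print(orphans)  # [(11, 99)]
```

Only the function whose DB_ID is missing from DBS comes back; functions attached to existing databases are filtered out by the IS NULL condition.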

To fix the issue, remove the rows in the FUNCS table that have the broken link:

DELETE FROM FUNCS WHERE DB_ID NOT IN (SELECT DISTINCT DB_ID FROM DBS);

Remember to back up your database before attempting to remove anything!
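As a rough illustration of "back up, then delete", the sketch below copies only the orphaned rows into a side table before removing them. It again uses a toy in-memory SQLite database, not the real metastore. The backup table name FUNCS_ORPHANS_BAK, the helper remove_orphan_functions and the sample rows are hypothetical, and this is not a substitute for a full database dump.

```python
import sqlite3

# Same condition as the DELETE statement in the post (DISTINCT is not needed here).
ORPHAN_FILTER = "DB_ID NOT IN (SELECT DB_ID FROM DBS)"


def remove_orphan_functions(conn):
    """Copy orphaned FUNCS rows to FUNCS_ORPHANS_BAK, delete them, return the count."""
    with conn:  # commit on success, roll back if an exception is raised
        conn.execute(
            f"CREATE TABLE FUNCS_ORPHANS_BAK AS SELECT * FROM FUNCS WHERE {ORPHAN_FILTER}"
        )
        deleted = conn.execute(f"DELETE FROM FUNCS WHERE {ORPHAN_FILTER}").rowcount
    return deleted


# Toy data (assumed): one real database, two functions with stale DB_IDs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE FUNCS (FUNC_ID INTEGER PRIMARY KEY, FUNC_NAME TEXT, DB_ID INTEGER);
    INSERT INTO DBS VALUES (1, 'default');
    INSERT INTO FUNCS VALUES (10, 'f_ok', 1), (11, 'f_stale', 99), (12, 'f_stale2', 2253);
""")

deleted = remove_orphan_functions(conn)
remaining = conn.execute("SELECT FUNC_ID FROM FUNCS ORDER BY FUNC_ID").fetchall()
backed_up = conn.execute(
    "SELECT FUNC_ID FROM FUNCS_ORPHANS_BAK ORDER BY FUNC_ID"
).fetchall()
print(deleted, remaining, backed_up)
```

On a real MySQL metastore the same two statements could be run by hand. If other tables reference FUNCS by FUNC_ID (FUNC_RU in the stock schema, as far as I know), their matching rows may need to be removed first, so check your own schema before deleting.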

Then restart HiveServer2; it should now start up normally.
