Sqoop Hive Import Failed After Upgrading to CDH5.4.x or CDH5.5.x

Sqoop Hive Import Failed After Upgrading to CDH5.4.x or CDH5.5.x

This article explains the root cause of Sqoop Hive Import failure after upgrading to CDH5.4.x or CDh5.5.x and the solution to fix the issue. After upgrading to CDH5.5.x from CDh5.3.x, Sqoop Hive Import got the following error:
16/01/12 18:27:25 WARN hive.TableDefWriter: Column EXPIRN_TS had to be cast to a less precise type in Hive
16/01/12 18:27:25 INFO hive.HiveImport: Loading uploaded data into Hive
16/01/12 18:27:25 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
16/01/12 18:27:25 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: 
org.apache.hadoop.hive.conf.HiveConf
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
... 12 more
After upgrading to CDH5.4.x from CDh5.3.x, Sqoop Hive Import got the following error:
16/01/18 16:42:40 INFO hive.HiveImport: Loading uploaded data into Hive
16/01/18 16:42:40 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
16/01/18 16:42:40 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
	at org.apache.sqoop.hive.HiveConfig.addHiveConfigs(HiveConfig.java:61)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
This is caused by the new code checking the /usr/lib/hive directory and set it as HIVE_HOME in configure-sqoop script under /opt/cloudera/parcels/CDH/lib/sqoop/bin directory:
if [ -z "${HIVE_HOME}" ]; then
  if [ -d "/usr/lib/hive" ]; then
    export HIVE_HOME=/usr/lib/hive
  elif [ -d ${SQOOP_HOME}/../hive ]; then
    export HIVE_HOME=${SQOOP_HOME}/../hive
  fi
fi
And on the sqoop edge node, we have /usr/lib/hive directory created with only mysql-connector-java.jar in it. When HIVE_HOME is set to this directory, because it is missing all other hive required libraries, hive import will fail. To fix this issue, simply rename /usr/lib/hive to /usr/lib/hive-bak, do the test and make sure that cluster is functioning OK for a couple of weeks. And if no further issues, it can be safe to remove /usr/lib/hive-bak for good.

Loading

Leave a Reply

Your email address will not be published. Required fields are marked *

My new Snowflake Blog is now live. I will not be updating this blog anymore but will continue with new contents in the Snowflake world!