Oozie Hive2 Action Failed with Error: “HiveSQLException: Failed to execute session hooks”

If you have an Oozie Hive2 job that fails randomly with the error message below, which can be found in Oozie’s server log (located by default under /var/log/oozie):

2018-06-02 09:00:01,103 WARN org.apache.oozie.action.hadoop.Hive2Credentials: SERVER[hlp3058p.oocl.com] 
USER[dmsa_appln] GROUP[-] TOKEN[] APP[DMSA_CMTX_PCON_ETL_ONLY] JOB[0010548-180302135253124-oozie-oozi-W] 
ACTION[0010548-180302135253124-oozie-oozi-W@spark-6799] Exception in addtoJobConf
org.apache.hive.service.cli.HiveSQLException: Failed to execute session hooks
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:241)
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:232)
        at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:491)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:181)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
        at java.sql.DriverManager.getConnection(DriverManager.java:571)
        at java.sql.DriverManager.getConnection(DriverManager.java:233)
        at org.apache.oozie.action.hadoop.Hive2Credentials.addtoJobConf(Hive2Credentials.java:66)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.setCredentialTokens(JavaActionExecutor.java:1213)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1063)
        at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1295)
        at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
        at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
        at org.apache.oozie.command.XCommand.call(XCommand.java:286)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to execute session hooks
        at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:308)
        at org.apache.hive.service.cli.CLIService.openSession(CLIService.java:178)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:422)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1253)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1238)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        ... 3 more
Caused by: java.lang.IllegalStateException: zip file closed
        at java.util.zip.ZipFile.ensureOpen(ZipFile.java:634)
        at java.util.zip.ZipFile.getEntry(ZipFile.java:305)
        at java.util.jar.JarFile.getEntry(JarFile.java:227)
        at sun.net.www.protocol.jar.URLJarFile.getEntry(URLJarFile.java:128)
        at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
        at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:150)
        at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:233)
        at javax.xml.parsers.SecuritySupport$4.run(SecuritySupport.java:94)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.xml.parsers.SecuritySupport.getResourceAsStream(SecuritySupport.java:87)
        at javax.xml.parsers.FactoryFinder.findJarServiceProvider(FactoryFinder.java:283)
        at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:255)
        at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:121)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2526)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2513)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2409)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:982)
        at org.apache.sentry.binding.hive.conf.HiveAuthzConf.<init>(HiveAuthzConf.java:162)
        at org.apache.sentry.binding.hive.HiveAuthzBindingHook.loadAuthzConf(HiveAuthzBindingHook.java:131)
        at org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook.run(HiveAuthzBindingSessionHook.java:108)
        at org.apache.hive.service.cli.session.SessionManager.executeSessionHooks(SessionManager.java:420)
        at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:300)
        ... 12 more

It is likely that you are hitting a possible JDK issue. Please refer to HADOOP-13809 for details. There is no proof at this stage that it is a JDK bug, but the workaround is at the JDK level. As mentioned in the JIRA, you can add the parameter below to HiveServer2’s Java options:

-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl

If you are using Cloudera Manager, you can go to:

CM > Hive > Configuration > Search “Java configuration options for HiveServer2”

and add the above parameter to the end of the string. Don’t forget to include an extra space before it.

Then restart HiveServer2 through CM. This should help to avoid the issue.
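For installations not managed by Cloudera Manager, the same change can be applied wherever HiveServer2’s Java options string is defined. As a minimal sketch, the helper below (the function name is illustrative, not part of any Hive tooling) appends the flag to an existing options string with the leading space mentioned above, and skips the append if the flag is already present:

```shell
# Hypothetical helper: append the DocumentBuilderFactory override to an
# existing HiveServer2 Java options string. Idempotent: if the flag is
# already in the string, the string is returned unchanged.
append_dbf_override() {
  local current_opts="$1"
  local flag='-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl'
  case "$current_opts" in
    *"$flag"*) printf '%s\n' "$current_opts" ;;
    *)         printf '%s %s\n' "$current_opts" "$flag" ;;
  esac
}

append_dbf_override '-Xms4g -Xmx4g'
```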

Oozie Spark Actions Fail with Error “Spark config without '=': --conf”

Currently Oozie provides an easy interface for Spark1 jobs via the Spark1 action, so that users do not have to embed spark-submit inside a shell action. However, I recently discovered a bug in the way Oozie parses Spark configurations, which caused it to generate an incorrect spark-submit command when submitting Spark jobs. By checking the Oozie launcher’s stderr.log, I found the error below:

Error: Spark config without '=': --conf
Run with --help for usage help or --verbose for debug output
Intercepting System.exit(1)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exit code [1]

Also, by checking the stdout.log, I could see the incorrect command generated for Spark:

  --conf
  spark.yarn.security.tokens.hive.enabled=false
  --conf
  --conf
  spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/*:$PWD/*
  --conf
  spark.driver.extraClassPath=$PWD/*

You can see that Oozie generated a doubled “--conf” in the Spark command. This explains the earlier error, “Spark config without '=': --conf”.

This is caused by a known issue reported upstream: OOZIE-2923.

This is a bug on Oozie’s side: it wrongly parses the following configs:

--conf spark.executor.extraClassPath=...
--conf spark.driver.extraClassPath=...

The workaround is to remove the “--conf” in front of the first instance of spark.executor.extraClassPath, so that Oozie will add it itself. For example, if you have the following:

<spark-opts>
--files /etc/hive/conf/hive-site.xml 
--driver-memory 4G 
--executor-memory 2G 
... 
--conf spark.yarn.security.tokens.hive.enabled=false 
--conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/*
</spark-opts>

Simply remove the first “--conf” before spark.executor.extraClassPath, so it becomes:

<spark-opts>
--files /etc/hive/conf/hive-site.xml 
--driver-memory 4G 
--executor-memory 2G 
... 
--conf spark.yarn.security.tokens.hive.enabled=false  
spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/*
</spark-opts>

This will allow you to avoid the issue.

However, the downside is that if you later upgrade to a version of CDH that contains the fix for this issue, you will need to re-add the “--conf”.

OOZIE-2923 affects CDH5.10.x, CDH5.11.0 and CDH5.11.1; CDH5.11.2, CDH5.12.x and later releases contain the fix.
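If you are unsure whether a failed job hit this bug, the symptom is easy to check for mechanically: two consecutive “--conf” tokens in the generated argument list. The sketch below (a hypothetical helper, not part of Oozie) reads one argument per line, the way the launcher’s stdout.log prints them, and reports whether the doubled token appears:

```shell
# Illustrative check for the doubled "--conf" symptom of OOZIE-2923.
# Reads one spark-submit argument per line on stdin; exits 0 if two
# consecutive "--conf" lines are found, non-zero otherwise.
has_doubled_conf() {
  awk 'prev == "--conf" && $0 == "--conf" { found = 1 }
       { prev = $0 }
       END { exit !found }'
}

# Example with a broken argument list like the one shown above:
printf '%s\n' --conf spark.a=1 --conf --conf spark.b=2 \
  | has_doubled_conf && echo "doubled --conf detected"
```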

Oozie SSH Action Failed With “externalId cannot be empty” Error

Last week I was working on an issue where a very simple SSH action run through Oozie kept failing with an “externalId cannot be empty” error. The workflow had only a single SSH action and nothing else. See the workflow example below:

<workflow-app name="SSH Action Test" xmlns="uri:oozie:workflow:0.5">
    <start to="ssh-5c4d"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="ssh-5c4d">
        <ssh xmlns="uri:oozie:ssh-action:0.1">
            <host>user1@another-server-url</host>
            <command>ls / >> /tmp/test.log</command>
            <capture-output/>
        </ssh>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

And the error message from the Oozie server looked like below:

2018-01-03 06:12:45,347 ERROR org.apache.oozie.command.wf.ActionStartXCommand: 
SERVER[{oozie-server-url}] USER[admin] GROUP[-] TOKEN[] APP[SSH Action Test] JOB[0000000-180103010440574-ooz
ie-oozi-W] ACTION[0000000-180103010440574-oozie-oozi-W@ssh-5c4d] Exception,
java.lang.IllegalArgumentException: externalId cannot be empty
        at org.apache.oozie.util.ParamChecker.notEmpty(ParamChecker.java:90)
        at org.apache.oozie.util.ParamChecker.notEmpty(ParamChecker.java:74)
        at org.apache.oozie.WorkflowActionBean.setStartData(WorkflowActionBean.java:503)
        at org.apache.oozie.command.wf.ActionXCommand$ActionExecutorContext.setStartData(ActionXCommand.java:387)
        at org.apache.oozie.action.ssh.SshActionExecutor.start(SshActionExecutor.java:269)
        at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:232)
        at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
        at org.apache.oozie.command.XCommand.call(XCommand.java:286)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:332)
        at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:261)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:179)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

We confirmed that the passwordless SSH connection from the Oozie server to the remote server worked correctly without issues.

After digging through the Oozie source code, I found that Oozie uses Java’s Runtime.exec API to execute commands remotely, and Runtime.exec does not behave like a shell; in particular, it does not support redirecting output to a file at all. What happened under the hood was that Oozie split the full command “ls / >> /tmp/test.log” into the tokens “ls”, “/”, “>>” and “/tmp/test.log”, and passed all of them to Runtime.exec. When Runtime.exec executed the command, it treated every token after “ls” as an argument to the “ls” command. As you would expect, “>>” is not a file, so “ls” failed, complaining that the file does not exist, and returned an exit status of 1 rather than 0.

Oozie then tried to capture the PID of the remote process but failed, hence the “externalId cannot be empty” error.
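The failure mode is easy to reproduce outside of Oozie by passing “>>” as a literal argument, the way Runtime.exec hands tokens to the command, instead of letting a shell interpret it as redirection:

```shell
# ">>" is passed as a literal argument here, so ls looks for a file
# literally named ">>" and exits with a non-zero status (the exact code
# varies between ls implementations).
ls '>>' 2>/dev/null || echo "ls failed: it treated '>>' as a file name"
```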

The workaround is simple: store the full command you want to run in a new script file and have Oozie execute that script instead:

1. Create a file “ssh-action.sh” on the target host, for example, under /home/{user}/scripts/ssh-action.sh
2. Add command “ls / >> /tmp/ssh.log” to the file
3. Make the file executable by running:

chmod 744 /home/{user}/scripts/ssh-action.sh

4. Update Oozie workflow to run the new shell script instead:

<ssh xmlns="uri:oozie:ssh-action:0.1">
    <host>user@remote-server-url</host>
    <command>/home/{user}/scripts/ssh-action.sh</command>
    <capture-output/>
</ssh>

And then the SSH action should work perfectly.
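Steps 1 to 3 above can be sketched as a short shell session. In this sketch a temporary directory stands in for /home/{user}/scripts so that it is safe to run anywhere; on the real target host you would use the actual path referenced by the workflow:

```shell
# Steps 1-3: create the wrapper script so that a shell, not
# Runtime.exec, interprets the ">>" redirection.
SCRIPT_DIR="$(mktemp -d)/scripts"       # stands in for /home/{user}/scripts
SCRIPT="$SCRIPT_DIR/ssh-action.sh"

mkdir -p "$SCRIPT_DIR"
cat > "$SCRIPT" <<'EOF'
#!/bin/sh
ls / >> /tmp/ssh.log
EOF
chmod 744 "$SCRIPT"                     # step 3: make it executable
```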

Oozie Server Failed to Start with Error java.lang.NoSuchFieldError: EXTERNAL_PROPERTY

This issue happens in the CDH distribution of Hadoop managed by Cloudera Manager (and possibly in other distributions as well, due to a known upstream JIRA, but I have not tested them). Oozie fails to start after enabling Oozie HA through the Cloudera Manager user interface.

The full error message from Oozie’s process stdout.log file (found under the /var/run/cloudera-scm-agent/process/XXX-oozie-OOZIE_SERVER/logs directory) looks like below:
Wed Jan 25 11:07:41 GST 2017 
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera 
using 5 as CDH_VERSION 
using /var/lib/oozie/tomcat-deployment as CATALINA_BASE 
Copying JDBC jar from /usr/share/java/oracle-connector-java.jar to /var/lib/oozie 

ERROR: Oozie could not be started 

REASON: java.lang.NoSuchFieldError: EXTERNAL_PROPERTY 

Stacktrace: 
----------------------------------------------------------------- 
java.lang.NoSuchFieldError: EXTERNAL_PROPERTY 
at org.codehaus.jackson.map.introspect.JacksonAnnotationIntrospector._findTypeResolver(JacksonAnnotationIntrospector.java:777) 
at org.codehaus.jackson.map.introspect.JacksonAnnotationIntrospector.findPropertyTypeResolver(JacksonAnnotationIntrospector.java:214) 
at org.codehaus.jackson.map.ser.BeanSerializerFactory.findPropertyTypeSerializer(BeanSerializerFactory.java:370) 
at org.codehaus.jackson.map.ser.BeanSerializerFactory._constructWriter(BeanSerializerFactory.java:772) 
at org.codehaus.jackson.map.ser.BeanSerializerFactory.findBeanProperties(BeanSerializerFactory.java:586) 
at org.codehaus.jackson.map.ser.BeanSerializerFactory.constructBeanSerializer(BeanSerializerFactory.java:430) 
at org.codehaus.jackson.map.ser.BeanSerializerFactory.findBeanSerializer(BeanSerializerFactory.java:343) 
at org.codehaus.jackson.map.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:287) 
at org.codehaus.jackson.map.ser.StdSerializerProvider._createUntypedSerializer(StdSerializerProvider.java:782) 
at org.codehaus.jackson.map.ser.StdSerializerProvider._createAndCacheUntypedSerializer(StdSerializerProvider.java:735) 
at org.codehaus.jackson.map.ser.StdSerializerProvider.findValueSerializer(StdSerializerProvider.java:344) 
at org.codehaus.jackson.map.ser.StdSerializerProvider.findTypedValueSerializer(StdSerializerProvider.java:420) 
at org.codehaus.jackson.map.ser.StdSerializerProvider._serializeValue(StdSerializerProvider.java:601) 
at org.codehaus.jackson.map.ser.StdSerializerProvider.serializeValue(StdSerializerProvider.java:256) 
at org.codehaus.jackson.map.ObjectMapper._configAndWriteValue(ObjectMapper.java:2566) 
at org.codehaus.jackson.map.ObjectMapper.writeValue(ObjectMapper.java:2056) 
at org.apache.oozie.util.FixedJsonInstanceSerializer.serialize(FixedJsonInstanceSerializer.java:65) 
at org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.internalRegisterService(ServiceDiscoveryImpl.java:201) 
at org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.registerService(ServiceDiscoveryImpl.java:186) 
at org.apache.oozie.util.ZKUtils.advertiseService(ZKUtils.java:217) 
at org.apache.oozie.util.ZKUtils.<init>(ZKUtils.java:141) 
at org.apache.oozie.util.ZKUtils.register(ZKUtils.java:154) 
at org.apache.oozie.service.ZKLocksService.init(ZKLocksService.java:70) 
at org.apache.oozie.service.Services.setServiceInternal(Services.java:386) 
at org.apache.oozie.service.Services.setService(Services.java:372) 
at org.apache.oozie.service.Services.loadServices(Services.java:305) 
at org.apache.oozie.service.Services.init(Services.java:213) 
at org.apache.oozie.servlet.ServicesLoader.contextInitialized(ServicesLoader.java:46) 
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4210) 
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4709) 
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:802) 
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779) 
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:583) 
at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:944) 
at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:779) 
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:505) 
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1322) 
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:325) 
at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142) 
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1068) 
at org.apache.catalina.core.StandardHost.start(StandardHost.java:822) 
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1060) 
at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463) 
at org.apache.catalina.core.StandardService.start(StandardService.java:525) 
at org.apache.catalina.core.StandardServer.start(StandardServer.java:759) 
at org.apache.catalina.startup.Catalina.start(Catalina.java:595) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) 
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414) 

To fix the issue, please follow the steps below:

  1. Delete or move the following files under CDH’s parcel directory (most likely they are symlinks):

    /opt/cloudera/parcels/CDH/lib/oozie/libserver/hive-exec.jar
    /opt/cloudera/parcels/CDH/lib/oozie/libtools/hive-exec.jar
    

  2. Download the hive-exec-{cdh version}-core.jar file from the Cloudera repo. For example, for CDH5.8.2, go to:
    https://repository.cloudera.com/cloudera/cloudera-repos/org/apache/hive/hive-exec/1.1.0-cdh5.8.2/

    and put the file under the following directories on the Oozie server:

    /opt/cloudera/parcels/CDH/lib/oozie/libserver/
    /opt/cloudera/parcels/CDH/lib/oozie/libtools/
    

  3. Download kryo-2.22.jar from the maven repository:
    http://repo1.maven.org/maven2/com/esotericsoftware/kryo/kryo/2.22/kryo-2.22.jar

    and put it under directories on the Oozie server:

    /opt/cloudera/parcels/CDH/lib/oozie/libserver/
    /opt/cloudera/parcels/CDH/lib/oozie/libtools/
    

  4. Finally restart Oozie service
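Steps 1 to 3 above can be sketched as a dry run. The paths and version below are the examples from this article; adjust OOZIE_LIB and CDH_HIVE_VER for your release, and replace the echo lines with wget or curl to actually download the jars:

```shell
# Dry-run sketch of the jar swap (run on the Oozie server).
OOZIE_LIB="${OOZIE_LIB:-/opt/cloudera/parcels/CDH/lib/oozie}"
CDH_HIVE_VER="${CDH_HIVE_VER:-1.1.0-cdh5.8.2}"
HIVE_JAR_URL="https://repository.cloudera.com/cloudera/cloudera-repos/org/apache/hive/hive-exec/${CDH_HIVE_VER}/hive-exec-${CDH_HIVE_VER}-core.jar"
KRYO_JAR_URL="http://repo1.maven.org/maven2/com/esotericsoftware/kryo/kryo/2.22/kryo-2.22.jar"

for d in libserver libtools; do
  dir="$OOZIE_LIB/$d"
  # step 1: move the old hive-exec.jar (likely a symlink) out of the way
  if [ -e "$dir/hive-exec.jar" ]; then
    mv "$dir/hive-exec.jar" "$dir/hive-exec.jar.bak"
  fi
  # steps 2 and 3: fetch the replacement jars into the directory
  echo "fetch into $dir: $HIVE_JAR_URL"
  echo "fetch into $dir: $KRYO_JAR_URL"
done
```

Step 4 (restarting Oozie) is then done through Cloudera Manager as usual.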

This is a known Oozie issue reported in the upstream JIRA OOZIE-2621, which has been resolved and targeted for the 4.3.0 release.

Hope this helps.

How to load a different version of Spark into Oozie

This article explains the steps needed to load Spark2 into Oozie under CDH5.9.x, which ships with Spark 1.6. Although this was tested under CDH5.9.0, the steps should be similar for earlier releases.

Please follow the steps below:

  1. Locate the current shared-lib directory by running:
    oozie admin -oozie http://<oozie-server-host>:11000/oozie -sharelibupdate
    

    you will get something like below:

    [ShareLib update status]
    host = http://<oozie-server-host>:11000/oozie
    status = Successful
    sharelibDirOld = hdfs://<oozie-server-host>:8020/user/oozie/share/lib/lib_20161202183044
    sharelibDirNew = hdfs://<oozie-server-host>:8020/user/oozie/share/lib/lib_20161202183044
    

    This tells me that the current sharelib directory is /user/oozie/share/lib/lib_20161202183044

  2. Create a new directory for Spark2 under this directory:
    hadoop fs -mkdir /user/oozie/share/lib/lib_20161202183044/spark2
    

  3. Put all your Spark 2 jars under this directory. Please also make sure that oozie-sharelib-spark-4.1.0-cdh5.9.0.jar is there too
  4. Update the sharelib by running:

    oozie admin -oozie http://<oozie-server-host>:11000/oozie -sharelibupdate
    
  5. Confirm that spark2 has been added to the sharelib path:

    oozie admin -oozie http://<oozie-server-host>:11000/oozie -shareliblist
    

    you should get something like below:

    [Available ShareLib]
    spark2
    oozie
    hive
    distcp
    hcatalog
    sqoop
    mapreduce-streaming
    spark
    pig
    
  6. Go back to the Spark workflow and add the following configuration under the Spark action:

    <property>
        <name>oozie.action.sharelib.for.spark</name>
        <value>spark2</value>
    </property>
    
  7. Save the workflow and run it to test whether it now picks up the correct JARs.
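The commands from the steps above, collected into one sketch. The Oozie URL here is a placeholder and the sharelib timestamp directory is the example from step 1; both must match your own cluster, which is why the cluster-touching commands are left commented:

```shell
# Placeholder host; substitute your real Oozie server URL.
OOZIE_URL="http://oozie-server-host.example.com:11000/oozie"
# Current sharelib directory, taken from the -sharelibupdate output (step 1).
SHARELIB="/user/oozie/share/lib/lib_20161202183044"
SPARK2_DIR="$SHARELIB/spark2"

# hadoop fs -mkdir "$SPARK2_DIR"                             # step 2
# hadoop fs -put /path/to/spark2-jars/*.jar "$SPARK2_DIR/"   # step 3
# oozie admin -oozie "$OOZIE_URL" -sharelibupdate            # step 4
# oozie admin -oozie "$OOZIE_URL" -shareliblist              # step 5
echo "$SPARK2_DIR"
```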

Please be advised that although this works, it puts the Oozie Spark action into a state that is not supported by Cloudera, because this setup is untested and not recommended. But if you are still willing to go ahead, the steps above should help.