- Locate the current shared-lib directory by running:
oozie admin -oozie http://<oozie-server-host>:11000/oozie -sharelibupdate
You will get something like below:
[ShareLib update status]
host = http://<oozie-server-host>:11000/oozie
status = Successful
sharelibDirOld = hdfs://<oozie-server-host>:8020/user/oozie/share/lib/lib_20161202183044
sharelibDirNew = hdfs://<oozie-server-host>:8020/user/oozie/share/lib/lib_20161202183044
This tells me that the current sharelib directory is /user/oozie/share/lib/lib_20161202183044
- Create a new directory for Spark 2.0 under this directory:
hadoop fs -mkdir /user/oozie/share/lib/lib_20161202183044/spark2
- Put all your Spark 2 JARs under this directory. Please also make sure that oozie-sharelib-spark-4.1.0-cdh5.9.0.jar is there too.
- Update the sharelib by running:
oozie admin -oozie http://<oozie-server-host>:11000/oozie -sharelibupdate
- Confirm that spark2 has been added to the shared lib path:
oozie admin -oozie http://<oozie-server-host>:11000/oozie -shareliblist
You should get something like below:
[Available ShareLib]
spark2
oozie
hive
distcp
hcatalog
sqoop
mapreduce-streaming
spark
pig
- Go back to the Spark workflow and add the following configuration under the Spark action:
<property>
  <name>oozie.action.sharelib.for.spark</name>
  <value>spark2</value>
</property>
- Save the workflow and run it to test whether it now picks up the correct JARs.
Please be advised that although this can work, it leaves the Spark action in Oozie in a configuration that Cloudera does not support, because it has not been tested, so it is not recommended. But if you are still willing to go ahead, the steps above should help.
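The steps above can also be strung together in a small shell script. This is only a rough sketch under a few assumptions: OOZIE_URL and JAR_DIR are placeholder variables I made up (JAR_DIR being a local folder that already holds your Spark 2 JARs plus the oozie-sharelib-spark JAR), the function names are my own, and the parsing assumes the command output looks like the samples shown above. Adjust it for your cluster before relying on it.

```shell
#!/bin/sh
# Sketch only: OOZIE_URL and JAR_DIR are hypothetical placeholders, not Oozie settings.
OOZIE_URL="${OOZIE_URL:-http://<oozie-server-host>:11000/oozie}"
JAR_DIR="${JAR_DIR:-./spark2-jars}"

# Reads `-sharelibupdate` output on stdin and prints the HDFS path
# (without the hdfs://host:port prefix) found after "sharelibDirNew =".
sharelib_dir() {
  sed -n 's|.*sharelibDirNew = hdfs://[^/]*\(/[^[:space:]]*\).*|\1|p' | head -n 1
}

# Reads `-shareliblist` output on stdin and succeeds only if the
# sharelib name given as $1 appears as a whole word.
has_sharelib() {
  tr -s '[:space:]' '\n' | grep -qxF -- "$1"
}

# Runs the manual steps in order; stops at the first failure.
add_spark2_sharelib() {
  lib_dir=$(oozie admin -oozie "$OOZIE_URL" -sharelibupdate | sharelib_dir)
  if [ -z "$lib_dir" ]; then
    echo "could not determine the current sharelib directory" >&2
    return 1
  fi
  hadoop fs -mkdir "$lib_dir/spark2" || return 1
  hadoop fs -put "$JAR_DIR"/*.jar "$lib_dir/spark2/" || return 1
  oozie admin -oozie "$OOZIE_URL" -sharelibupdate || return 1
  # Final check: spark2 should now be listed as an available sharelib.
  oozie admin -oozie "$OOZIE_URL" -shareliblist | has_sharelib spark2
}
```

The script only defines functions, so nothing touches the cluster until you call add_spark2_sharelib yourself, for example after setting OOZIE_URL and JAR_DIR in your shell. The workflow change (the oozie.action.sharelib.for.spark property) still has to be made by hand.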