Patch for SENTRY-2240 – DROP UDF Permission Issue in Sentry

Last week I discovered that Sentry does not check permissions properly when a user tries to DROP a function. It is easy to reproduce: create a function under a database using an admin account, and make sure that one particular user does not have ANY permissions on the database that the UDF was created under. Then try to DROP the function as that user.

I immediately checked whether an upstream JIRA had already been reported, but I could not find one, so I filed a new one: SENTRY-2240.

I have written patches for Sqoop and Hive before, but not yet for Sentry, so I thought this JIRA would be a good one to start with. I checked out the Sentry code from GitHub, read through it to see what was wrong, and found that for CREATE and DROP FUNCTION calls Sentry does not care which database the user is working in:

HiveAuthzBindingHook.java#L226
HiveAuthzBindingHook.java#L230

      case HiveParser.TOK_CREATEFUNCTION:
        ........

        // create/drop function is allowed with any database
        currDB = Database.ALL;
        break;
      case HiveParser.TOK_DROPFUNCTION:
        // create/drop function is allowed with any database
        currDB = Database.ALL;
        break;
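
To make the problem concrete: instead of falling back to all databases, the check needs to know which database the function actually belongs to. The following is only a minimal, self-contained sketch of that idea. The names FunctionDbResolver and resolveDatabase are hypothetical, the code does not use any Sentry or Hive classes, and it is not the patch itself.

```java
import java.util.Objects;

/** Illustrative only: works out which database a function name refers to. */
final class FunctionDbResolver {
    private FunctionDbResolver() {}

    /**
     * Returns the database part of a qualified function name ("db.func"),
     * or the session's current database when the name is unqualified.
     * Both parameters are assumptions made for this sketch.
     */
    static String resolveDatabase(String functionName, String currentDatabase) {
        Objects.requireNonNull(functionName, "functionName");
        Objects.requireNonNull(currentDatabase, "currentDatabase");
        int dot = functionName.indexOf('.');
        if (dot > 0) {
            // Qualified name: everything before the first dot is the database.
            return functionName.substring(0, dot);
        }
        // Unqualified name: fall back to the database the session is using.
        return currentDatabase;
    }
}
```

With a database name resolved this way, a privilege check could be made against that one database rather than against every database.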

I spent last weekend thinking about and applying fixes, and updating the test cases that are affected. I have forked the project into my own repository and created a branch to track my changes until the final version is ready. Please refer to https://github.com/ericlin05/sentry/tree/SENTRY-2240.

If you have any comments on my patch or want to discuss it, please add your comments below.

Drop Impala UDF function returned NoSuchObjectException error

Impala UDFs work a bit differently from Hive UDFs, as they are written in different languages. Recently I ran into an issue where dropping an Impala UDF written in C++ failed with the error below:

DROP FUNCTION udf_testing.has_vowels;
Query: DROP FUNCTION udf_testing.has_vowels
ERROR:
ImpalaRuntimeException: Error making 'dropFunction' RPC to Hive Metastore:
CAUSED BY: NoSuchObjectException: Function has_vowels does not exist

The “SHOW CREATE FUNCTION” query worked fine, as shown below:

SHOW CREATE FUNCTION has_vowels;
Query: SHOW CREATE FUNCTION has_vowels
+-----------------------------------------------------------------------------+
| result                                                                      |
+-----------------------------------------------------------------------------+
| CREATE FUNCTION udf_testing.has_vowels(STRING)                              |
|  RETURNS BOOLEAN                                                            |
|  LOCATION 'hdfs://nameservice1/user/hive/impala-udf/libudfsample.so'        |
|  SYMBOL='_Z9HasVowelsPN10impala_udf15FunctionContextERKNS_9StringValE'      |
|                                                                             |
+-----------------------------------------------------------------------------+
Fetched 1 row(s) in 0.03s

After some research, it turned out that when dropping a function in Impala you also need to specify the function's argument types, because the same function name can be overloaded with different signatures. So the query below works:

DROP FUNCTION udf_testing.has_vowels(STRING);
Query: DROP FUNCTION udf_testing.has_vowels(STRING)

This is not immediately obvious, but it is documented on Cloudera’s official documentation site: DROP FUNCTION Statement.
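
A toy model may help show why the argument types matter. If several functions can share one name as long as their argument types differ, then a name on its own does not identify a single function. The sketch below is not Impala's catalog code: FunctionRegistry and its methods are hypothetical names, and it uses only plain Java collections.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy registry in which a function is identified by name plus argument types. */
final class FunctionRegistry {
    private final Set<String> signatures = new HashSet<>();

    /** Builds a lookup key such as "has_vowels(STRING)". */
    private static String key(String name, List<String> argTypes) {
        return name + "(" + String.join(",", argTypes) + ")";
    }

    /** Registers one overload; returns false if that signature already exists. */
    boolean create(String name, String... argTypes) {
        return signatures.add(key(name, Arrays.asList(argTypes)));
    }

    /** Drops exactly one overload; returns false when no such signature exists. */
    boolean drop(String name, String... argTypes) {
        return signatures.remove(key(name, Arrays.asList(argTypes)));
    }
}
```

In this model, drop("has_vowels") looks for the signature has_vowels() and finds nothing, while drop("has_vowels", "STRING") matches the registered overload, which mirrors the behaviour of the two DROP FUNCTION statements above.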

Hope this helps.

How to Compile Hive UDF without MVN

Last week I needed to quickly compile a sample Hive UDF for testing purposes. I didn’t want to install Maven and create a pom.xml file just for this simple task. Thanks to my colleague YiBing, who is an expert in Hive, I got it working with just a few commands in the shell.

[root@localhost hive-udf]$ export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
[root@localhost hive-udf]$ export PATH=$PATH:$JAVA_HOME/bin
[root@localhost hive-udf]$ javac -cp `hadoop classpath`:"/opt/cloudera/parcels/CDH/lib/hive/lib/*" GenericUDFCurrentUser.java

[root@localhost hive-udf]$ ls
GenericUDFCurrentUser.class GenericUDFCurrentUser.java

[root@localhost hive-udf]$ mkdir -p com/elin
[root@localhost hive-udf]$ mv GenericUDFCurrentUser.class com/elin
[root@localhost hive-udf]$ jar cvf elin-udf.jar com
added manifest
adding: com/(in = 0) (out= 0)(stored 0%)
adding: com/elin/(in = 0) (out= 0)(stored 0%)
adding: com/elin/GenericUDFCurrentUser.class(in = 2505) (out= 1145)(deflated 54%)

Then I could use the UDF in Hive:

ADD JAR /user/ericlin/hive-udf/elin-udf.jar;
CREATE TEMPORARY FUNCTION get_user AS 'com.elin.GenericUDFCurrentUser';
SELECT get_user() FROM test;

It is that simple, with no need for any third-party build tools.