Sentry HDFS sync will NOT sync URI privilege

I have seen many cases where Hive users tried to give a user permission on a certain directory by using a Sentry GRANT on the URI, in a cluster with Sentry HDFS sync enabled. This seems logical, but it does not work.

Sentry HDFS sync only syncs Sentry privileges to HDFS ACLs at the database or table level; it ignores all privileges on URIs.

So suppose your username is “test”, your group is “test_group” and your role is “test_role”, which has the following privilege:

0: jdbc:hive2://ausplcdhedge03.us.dell.com:10> show grant role test_role;
+---------------------------------+-------+-----------+--------+----------------+----------------+------------+--------------+------------------+---------+
| database                        | table | partition | column | principal_name | principal_type | privilege  | grant_option | grant_time       | grantor |
+---------------------------------+-------+-----------+--------+----------------+----------------+------------+--------------+------------------+---------+
| hdfs://nameservice1/path/to/dir |       |           |        | test_role      | ROLE           | *          | false        | 1468340836037000 | --      |
+---------------------------------+-------+-----------+--------+----------------+----------------+------------+--------------+------------------+---------+

If you run “getfacl” on path “hdfs://nameservice1/path/to/dir”, it will not show that group “test_group” has READ and WRITE permissions. To get the Sentry privilege synced, a table needs to be linked to the URI.

Try the following:

CREATE DATABASE dummy; -- have any dummy tables under this database
USE dummy;
CREATE EXTERNAL TABLE dummy (a int) LOCATION "/path/to/dir";
GRANT ALL ON TABLE dummy TO ROLE test_role;

Now if you run “hdfs dfs -getfacl /path/to/dir”, test_group should show up with “rwx” permissions.
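The check in the last step can be scripted. Below is a minimal Python sketch; the `group_acl_perms` helper is my own, and it assumes the standard `getfacl` output format, where named-group ACL entries look like `group:test_group:rwx`:

```python
def group_acl_perms(getfacl_output, group):
    """Return the permission string for a named group ACL entry,
    or None if the group has no entry.

    Parses the output of `hdfs dfs -getfacl <path>`, where named
    group entries look like "group:test_group:rwx".
    """
    prefix = "group:%s:" % group
    for line in getfacl_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # split()[0] drops any trailing "#effective:..." comment
            return line[len(prefix):].split()[0]
    return None


# Example getfacl output after the GRANT has been synced
sample = """\
# file: /path/to/dir
# owner: hive
# group: hive
user::rwx
group::r-x
group:test_group:rwx
mask::rwx
other::--x
"""
print(group_acl_perms(sample, "test_group"))  # rwx
```

If the function returns None for your group, the Sentry privilege has not been synced to HDFS.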

Impala query failed with error “IllegalArgumentException: Value cannot be empty”

This article explains how to fix the issue where running “SHOW DATABASES” or other simple Impala queries fails with “ERROR: IllegalArgumentException: Value cannot be empty”.

The full stack trace in the Impala Daemon log looks like the following:

I0620 10:46:08.436385 47131 Frontend.java:818] analyze query show databases
I0620 10:46:08.437651 47131 jni-util.cc:177] java.lang.IllegalArgumentException: Value cannot be empty
        at org.apache.sentry.provider.file.KeyValue.<init>(KeyValue.java:41)
        at org.apache.sentry.policy.db.DBWildcardPrivilege.<init>(DBWildcardPrivilege.java:62)
        at org.apache.sentry.policy.db.DBWildcardPrivilege$DBWildcardPrivilegeFactory.createPrivilege(DBWildcardPrivilege.java:167)
        at org.apache.sentry.provider.common.ResourceAuthorizationProvider$2.apply(ResourceAuthorizationProvider.java:131)
        at org.apache.sentry.provider.common.ResourceAuthorizationProvider$2.apply(ResourceAuthorizationProvider.java:128)
        at com.google.common.collect.Iterators$8.next(Iterators.java:812)
        at org.apache.sentry.provider.common.ResourceAuthorizationProvider.doHasAccess(ResourceAuthorizationProvider.java:107)
        at org.apache.sentry.provider.common.ResourceAuthorizationProvider.hasAccess(ResourceAuthorizationProvider.java:91)
        at com.cloudera.impala.authorization.AuthorizationChecker.hasAccess(AuthorizationChecker.java:171)
        at com.cloudera.impala.service.Frontend.getDbNames(Frontend.java:630)
        at com.cloudera.impala.service.JniFrontend.getDbNames(JniFrontend.java:272)
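As the top frame suggests, the exception originates in Sentry's KeyValue parser: privileges are serialized as `->`-delimited `key=value` pairs (for example `server=server1->db=default->action=all`), and the parser rejects any pair whose key or value is empty. The rough Python re-implementation below is illustrative only, not Sentry's actual code, but it shows why a URI privilege with an empty URI breaks analysis of every query by that role:

```python
def parse_privilege(privilege):
    """Split a Sentry privilege string such as
    "server=server1->db=default->action=all" into a dict,
    rejecting empty keys or values the way Sentry's KeyValue does."""
    parts = {}
    for pair in privilege.split("->"):
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError("Invalid key value: " + pair)
        if not key or not value:
            # The "Value cannot be empty" case: a URI privilege with
            # an empty URI column serializes as "uri=", so every
            # query analyzed against this role blows up here.
            raise ValueError("Value cannot be empty")
        parts[key] = value
    return parts


print(parse_privilege("server=server1->db=default->action=all"))
# {'server': 'server1', 'db': 'default', 'action': 'all'}
try:
    parse_privilege("server=server1->uri=->action=all")
except ValueError as e:
    print(e)  # Value cannot be empty
```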

This happens in a cluster with both Kerberos and Sentry enabled.

To confirm whether your issue is the same, follow the steps below:

  1. Run “SHOW CURRENT ROLES;” in impala-shell, and note the role name
  2. Log into Sentry’s database. I am using MySQL, so the example queries below are MySQL-based
  3. Once logged in, run the following query:

    SELECT r.ROLE_ID, r.ROLE_NAME, p.DB_PRIVILEGE_ID, PRIVILEGE_SCOPE, SERVER_NAME,
           DB_NAME, TABLE_NAME, COLUMN_NAME, URI, ACTION
    FROM SENTRY_ROLE r
    JOIN SENTRY_ROLE_DB_PRIVILEGE_MAP m ON (r.ROLE_ID = m.ROLE_ID)
    JOIN SENTRY_DB_PRIVILEGE p ON (m.DB_PRIVILEGE_ID = p.DB_PRIVILEGE_ID)
    WHERE r.ROLE_NAME = '<role-name-from-step-1>';

  4. My output looks like the following:

    +---------+----------------+-----------------+-----------------+-------------+-------------+------------+-------------+--------------------------+--------+
    | ROLE_ID | ROLE_NAME      | DB_PRIVILEGE_ID | PRIVILEGE_SCOPE | SERVER_NAME | DB_NAME     | TABLE_NAME | COLUMN_NAME | URI                      | ACTION |
    +---------+----------------+-----------------+-----------------+-------------+-------------+------------+-------------+--------------------------+--------+
    | 1       | test_role      | 1               | URI             | server1     | __NULL__    | __NULL__   | __NULL__    | __NULL__                 | all    |
    +---------+----------------+-----------------+-----------------+-------------+-------------+------------+-------------+--------------------------+--------+

    There is only one privilege for this role, and its scope is URI. Notice that all the values for this role are “__NULL__”.
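It may also be worth checking whether other roles carry the same broken rows. A query sketch against the same MySQL schema as above (I am assuming here that Sentry stores the literal string '__NULL__' for unset columns, as the output above suggests, so both the empty and sentinel forms are checked; verify against your Sentry version):

```sql
SELECT DB_PRIVILEGE_ID, PRIVILEGE_SCOPE, SERVER_NAME, URI, ACTION
FROM SENTRY_DB_PRIVILEGE
WHERE PRIVILEGE_SCOPE = 'URI'
  AND (URI IS NULL OR URI = '' OR URI = '__NULL__');
```

Any row this returns is a URI-scope privilege with no URI, which is exactly the condition that triggers the exception.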

If your output looks similar to the above, the fix is simple. Follow the steps below:

  1. BACK UP the Sentry database just before you make the change (a must before making any change to any production database)
  2. Run the following query against the Sentry database to update the URI value:
    UPDATE SENTRY_DB_PRIVILEGE SET URI = 'hdfs:///dummy' WHERE DB_PRIVILEGE_ID = 1;

    Change the DB_PRIVILEGE_ID to match the one from your own output.

  3. Connect to impala-shell as a user who has Sentry admin access
  4. Run “INVALIDATE METADATA” so Impala picks up the metadata change
  5. Test again with a user who belongs to the affected role; the issue should be resolved

After this change, the role will only have access to the dummy URI “hdfs:///dummy” and no databases or tables, but you can grant whatever privileges you need to the role afterwards.

Hive MetaStore Server takes a long time to start up

This article explains the possible causes when the Hive Metastore Server (HMS) takes a long time to start up (more than 10 minutes).

Every time Hive is restarted through Cloudera Manager (CM), it takes more than 10 minutes for the Hive services to become green and for users to be able to use the beeline CLI.

One possible cause of the issue is:

  • Sentry HDFS sync is enabled
  • There are lots of tables or tables with lots of partitions (hundreds of thousands of partitions)

When both conditions are met, HMS at startup must scan through all the tables and partitions in the HMS database and then sync them with the HDFS directories one by one. If there are too many tables or partitions, there will be a lot of HDFS directories to sync, which takes time.

To confirm whether this is the cause, simply disable the HDFS sync, restart Hive, and see how long HMS takes to start up. If the symptom disappears, the cause is confirmed.

The fix is simply to keep the number of tables, and the number of partitions per table, down:

  • If possible, drop the tables that you do not need
  • If you need to keep tables that have lots of partitions, merge those partitions where possible by copying the data into a new table with coarser partitions
  • If there are hundreds of thousands of partitions that you cannot merge, it is time to redesign your tables so that fewer partitions are needed
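To find out which tables drive the startup cost, you can count partitions per table directly in the HMS backend database. A sketch for MySQL (the table and column names `TBLS`, `PARTITIONS` and `DBS` follow the standard HMS schema; verify them against your HMS version before running):

```sql
SELECT d.NAME AS db_name, t.TBL_NAME, COUNT(p.PART_ID) AS partition_count
FROM TBLS t
JOIN DBS d ON d.DB_ID = t.DB_ID
LEFT JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
GROUP BY d.NAME, t.TBL_NAME
ORDER BY partition_count DESC
LIMIT 20;
```

The tables at the top of this list are the best candidates for dropping, merging, or redesign.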

 

For reference, this is what the delay looks like in the HMS log:
2015-07-21 20:35:17,359 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: Starting hive metastore on port 9083
.........
2015-07-21 20:44:44,495 INFO org.apache.sentry.hdfs.MetastorePlugin: #### Metastore Plugin initialization complete !!
2015-07-21 20:44:44,495 INFO org.apache.sentry.hdfs.MetastorePlugin: #### Finished flushing queued updates to Sentry !!

You can see that HMS started at 8:35 PM but did not finish flushing the queued updates to Sentry until almost 8:45 PM.
