Impala Query fails with NoSuchObjectException error

In the last few months, I have seen CDH users frequently hitting a NoSuchObjectException error from Impala queries. It happens when querying a particular table with INT partition columns, and the query fails with the message below:

WARNINGS:
ImpalaRuntimeException: Error making 'alter_partitions' RPC to Hive Metastore:
CAUSED BY: InvalidOperationException: Alter partition operation failed: NoSuchObjectException(message:partition values=[2017, 6, 1, 8])

We have confirmed that the table has four partition columns of integer type, and that selecting individual partitions works.

The following scenario triggers this error (a minimal reproduction sketch follows this list):

  • The partition columns have INT data type
  • The partition data was inserted from Hive with leading zeros, using a query like the one below:
    INSERT OVERWRITE TABLE test_tbl PARTITION (year = '2017', month = '06'....) .....

  • The partition data is then stored under an HDFS location like:
    hdfs://nameservice1/user/hive/warehouse/test_tbl/year=2017/month=06/day=01/hour=08

  • When the table is queried through Impala, since the data type is INT, Impala converts the values "06" to 6, "01" to 1 etc., and looks for the location:
    hdfs://nameservice1/user/hive/warehouse/test_tbl/year=2017/month=6/day=1/hour=8

    instead of:

    hdfs://nameservice1/user/hive/warehouse/test_tbl/year=2017/month=06/day=01/hour=08

    hence the NoSuchObjectException error.
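
For reference, here is a minimal reproduction sketch run from Hive. The data column and the source table (id, source_tbl) are hypothetical names used only for illustration; the partition columns and values match the scenario above:

-- Hypothetical table matching the scenario (data column is illustrative)
CREATE TABLE test_tbl (id BIGINT)
PARTITIONED BY (year INT, month INT, day INT, hour INT);

-- Quoting zero-padded values for the INT partition columns makes Hive create
-- HDFS directories such as .../year=2017/month=06/day=01/hour=08
INSERT OVERWRITE TABLE test_tbl PARTITION (year = '2017', month = '06', day = '01', hour = '08')
SELECT id FROM source_tbl;

-- Impala, however, converts the partition values to 2017/6/1/8 and looks for
-- .../year=2017/month=6/day=1/hour=8, which does not exist on HDFS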

To fix the issue, there are two options:

  1. Convert the data type of the partition columns from Integer to String:
    ALTER TABLE test_tbl PARTITION COLUMN (year string);
    ALTER TABLE test_tbl PARTITION COLUMN (month string);
    ALTER TABLE test_tbl PARTITION COLUMN (day string);
    ALTER TABLE test_tbl PARTITION COLUMN (hour string);
    
  2. If the integer type needs to be kept, rebuild the table into a new one whose partition data is stored in locations without leading zeros. This can be done by running the following queries from Impala:
    CREATE TABLE new_test_tbl LIKE test_tbl;
    
    INSERT OVERWRITE TABLE new_test_tbl PARTITION (year, month, day, hour) SELECT * FROM test_tbl;
    

    The new table will have the leading zeros removed from its partition locations, and we can then switch over to the new table (a possible switch-over sketch follows this list). When writing more data into the new table through Hive, please be sure to remove all leading zeros from the partition values to prevent the issue from happening again.
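
For completeness, here is one possible way to do the switch-over described in option 2. The rename-based cut-over is an assumption about how the swap is managed, not part of the original fix; test_tbl_old is a hypothetical name:

-- Keep the old table around under a different name, then promote the rebuilt one
ALTER TABLE test_tbl RENAME TO test_tbl_old;
ALTER TABLE new_test_tbl RENAME TO test_tbl;

Note also that if the ALTER TABLE ... PARTITION COLUMN statements in option 1 are run from Hive, Impala will not see the change until the table metadata is refreshed, for example with INVALIDATE METADATA test_tbl;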

The above steps should resolve the issue. Hope they help.

Unable to Import Data as Parquet into Encrypted HDFS Zone

Recently I discovered an issue in Sqoop: when importing data into a Hive table whose location is in an encrypted HDFS zone, the Sqoop command fails with the following errors:

Command:

sqoop import --connect <postgres_url> --username <username> --password <password> \
--table sourceTable --split-by id --hive-import --hive-database staging \
--hive-table hiveTable --as-parquetfile

Errors:

2017-05-24 13:38:51,539 INFO [Thread-84] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: 
Setting job diagnostics to Job commit failed: org.kitesdk.data.DatasetIOException: Could not move contents of hdfs://nameservice1/tmp/staging/.
temp/job_1495453174050_1035/mr/job_1495453174050_1035 to 
hdfs://nameservice1/user/hive/warehouse/staging.db/hiveTable
        at org.kitesdk.data.spi.filesystem.FileSystemUtil.stageMove(FileSystemUtil.java:117)
        at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:406)
        at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:62)
        at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:387)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:274)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
/tmp/staging/.temp/job_1495453174050_1035/mr/job_1495453174050_1035/964f7b5e-2f55-421d-bfb6-7613cc4bf26e.parquet 
can't be moved into an encryption zone.
        at org.apache.hadoop.hdfs.server.namenode.EncryptionZoneManager.checkMoveValidity(EncryptionZoneManager.java:284)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedRenameTo(FSDirectory.java:564)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.renameTo(FSDirectory.java:478)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInternal(FSNamesystem.java:3929)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameToInt(FSNamesystem.java:3891)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3856)

After some research, I found that it is caused by a known Sqoop bug: SQOOP-2943. It happens because Sqoop currently uses the Kite SDK to generate the Parquet files, and the Kite SDK writes them under the /tmp directory on the fly. Because the /tmp directory is not encrypted while the Hive warehouse directory is, the final move of the Parquet files from /tmp into the Hive warehouse fails due to the encryption zone restriction.

The import only fails with the Parquet format; the text file format currently works as expected.

As SQOOP-2943 is not fixed at this stage and there are no direct workarounds, I would suggest the following two methods for importing the data into a Hive Parquet table inside the encrypted warehouse (a sketch of the first method follows this list):

  • Import the data in text file format into a temporary Hive table inside the Hive warehouse (encrypted), and then use a Hive query to copy the data into the destination Parquet table
  • Import the data as Parquet files into a non-encrypted temporary directory outside of the Hive warehouse, and then again use Hive to copy the data into the destination Parquet table inside the Hive warehouse (encrypted)
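
As an illustration of the first method: rerun the Sqoop command shown above without --as-parquetfile, pointing --hive-table at a temporary text-format table (hiveTable_txt below is a hypothetical name), and then copy the rows across from Hive. A minimal sketch of the Hive step, assuming the destination Parquet table staging.hiveTable already exists:

-- Copy from the temporary text-format table into the Parquet destination table,
-- which lives inside the encrypted Hive warehouse
INSERT OVERWRITE TABLE staging.hiveTable SELECT * FROM staging.hiveTable_txt;

-- Drop the temporary table once the copy has been verified
DROP TABLE staging.hiveTable_txt;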

Hope the above helps anyone who has run into similar issues.

Hive Query Failed with Token Renewer Error | Hive on Spark

If you run Hive on Spark on certain CDH versions, you might run into issues when Hive tries to renew HDFS delegation tokens. See the error message below (which you can find in the HiveServer2 server logs):

2017-06-27 17:04:08,836 INFO org.apache.hive.spark.client.SparkClientImpl: [stderr-redir-1]: 17/06/27 17:04:08 
WARN security.UserGroupInformation: PriviledgedActionException as:testuser (auth:PROXY) via 
hive/example.hadoop.com@REALM.COM (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: 
testuser tries to renew a token with renewer hive

If you do some googling, you should be able to locate the corresponding upstream Hive JIRA for this issue: HIVE-15485. From this JIRA, you can also see that the issue was introduced by HIVE-14383. The cause is that Spark needs the principal/keytab passed in via the --principal and --keytab options, and it handles renewal by copying the keytab to the cluster and logging in to Kerberos inside the application. However, the --principal and --keytab options cannot be used together with --proxy-user in spark-submit.sh, so at the moment we can support either token renewal or impersonation, but not both.

The only way to avoid this issue is to upgrade CDH to a version that contains the fix for HIVE-15485, which is included in the following releases:

CDH5.8.5
CDH5.9.2
CDH5.10.1, CDH5.10.2
CDH5.11.0, CDH5.11.1

Since HIVE-14383 was introduced in the following CDH releases:

CDH5.8.3, CDH5.8.4, CDH5.8.5
CDH5.9.1, CDH5.9.2
CDH5.10.0, CDH5.10.1, CDH5.10.2 
CDH5.11.0, CDH5.11.1

the following CDH releases are currently affected by this issue:

CDH5.8.3, CDH5.8.4, CDH5.9.1, CDH5.10.0

Please deploy the latest maintenance release for your major version to avoid this issue in Hive on Spark.

Hope the above helps.