Read Files under Sub-Directories for Hive and Impala

Sometimes you might want to store data under sub-directories in HDFS and then you want Hive or Impala to read from those sub-directories. For example, you have the following directory structure: root hdfs     231206 2017-06-30 02:45 /test/table1/000000_0 root hdfs          0 2017-06-30 02:45 /test/table1/child_directory root …

Oozie Server failed to Start with error java.lang.NoSuchFieldError: EXTERNAL_PROPERTY

This issue happens in CDH distribution of Hadoop that is managed by Cloudera Manager (possibly in other distributions as well, due to known upstream JIRA, but I have not tested). Oozie will fail to start after enabling Oozie HA through Cloudera Manager user interface. The full error message from Oozie’s …

Unable to Import Data as Parquet into Encrypted HDFS Zone | Sqoop Parquet Import

Recently I discovered an issue in Sqoop that when importing data into Hive table, whose location is in an encrypted HDFS zone, it will fail with “can’t be moved into an encryption zone” error: Command: sqoop import –connect <postgres_url> –username <username> –password <password> \ –table sourceTable –split-by id –hive-import –hive-database …