Read Files under Sub-Directories for Hive and Impala

Sometimes you might want to store data in sub-directories in HDFS and have Hive or Impala read from those sub-directories. For example, suppose you have the following directory structure:

root hdfs     231206 2017-06-30 02:45 /test/table1/000000_0
root hdfs          0 2017-06-30 02:45 /test/table1/child_directory
root hdfs     231206 2017-06-30 02:45 /test/table1/child_directory/000000_0

By default, Hive only looks for files directly under the table’s root directory, which in my test case is /test/table1. However, Hive can also read all data under the table’s sub-directories. This can be achieved by applying the following settings:

SET mapred.input.dir.recursive=true;
SET hive.mapred.supports.subdirectories=true;
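
As a rough sketch of how these settings could be used in a Hive session, the example below defines an external table over /test/table1 and checks which files Hive actually reads. The table name table1 and the single STRING column are assumptions for illustration only, since the real schema of the test files is not shown here. INPUT__FILE__NAME is Hive's built-in virtual column that holds the path of the file each row came from.

```sql
-- Hypothetical external table over the test directory; the one-column
-- schema is an assumption, adjust it to match the real file layout.
CREATE EXTERNAL TABLE IF NOT EXISTS table1 (line STRING)
LOCATION '/test/table1';

-- Enable recursive reads for the current session.
SET mapred.input.dir.recursive=true;
SET hive.mapred.supports.subdirectories=true;

-- Count rows per source file. With the settings above, files under
-- /test/table1/child_directory should appear in the result as well.
SELECT INPUT__FILE__NAME, COUNT(*) AS row_count
FROM table1
GROUP BY INPUT__FILE__NAME;
```

Running the same SELECT in a fresh session without the two SET statements is a simple way to compare the behaviour on your own Hive version.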

Impala, on the other hand, currently does not support reading files from a table’s sub-directories. This has been reported upstream as IMPALA-1944. There is currently no immediate plan to support this feature, but it might be added in a future release of Impala.

I hope the above information is useful.