Impala Query fails with NoSuchObjectException error

In the last few months, I have seen CDH users hit the Impala NoSuchObjectException error very often. This happens when running a query against a particular table with partition columns of type INT, and the query fails with the message below: WARNINGS: ImpalaRuntimeException: Error making 'alter_partitions' RPC to Hive Metastore: CAUSED BY: InvalidOperationException: Alter …

Read Files under Sub-Directories for Hive and Impala

Sometimes you might want to store data under sub-directories in HDFS and have Hive or Impala read from those sub-directories. For example, suppose you have the following directory structure:

root hdfs     231206 2017-06-30 02:45 /test/table1/000000_0
root hdfs          0 2017-06-30 02:45 /test/table1/child_directory
root …

Impala Reported Corrupt Parquet File After Failed With OutOfMemory Error

Recently I was dealing with an issue where Impala reported a corrupt Parquet file after a query failed with an OutOfMemory error; when the query did not fail, no corruption was reported. See the error message below, reported in the Impala Daemon logs: Memory limit exceeded HdfsParquetScanner::ReadDataPage() failed to allocate 65535 bytes for decompressed …

How to Use Beeline to connect to Impala

You can certainly connect to Impala using the Hive driver from beeline, with a command like the one below:

beeline -u 'jdbc:hive2://<impala-daemon-host>:21050/default;auth=noSasl'

However, the result output is not formatted properly:

> show tables;
customers
dim_prod
mansi
sample_07
sample_08
small
web_logs
+-------+--+
| name |
+-------+--+
+-------+--+

Notice that the output is not inside the columns? …