Tag: Impala

Big Compressed File Will Affect Query Performance for Impala

As we know, Hadoop, HDFS, MapReduce and Impala are designed to store and process large amounts of data, on the order of TBs or PBs. We also know that having too many small files hurts query performance, because the NameNode has to keep metadata for millions of files to hold the information about what is being stored …

Impala Query Failed with ERROR “AnalysisException: ORDER BY expression not produced by aggregation output”

Recently, I discovered a bug in Impala: when you use an expression in the ORDER BY clause, the query fails with the error message shown in the title. The customer was running a very complicated query, which I managed to simplify to a much smaller test case. The issue can be reproduced from CDH5.13.x onward. …

My new Snowflake blog is now live. I will no longer be updating this blog, but will continue with new content in the Snowflake world!