If you have just upgraded CDH or Impala, from CDH5.11 and before to CDH5.12 and after, or Impala 2.8 and before to Impala 2.9 and after, and you noticed that your Impala query against certain VIEWs failed with below error:
AnalysisException: Failed to load metadata for table: 'db.view_name' CAUSED BY: TableLoadingException: Failed to parse view-definition statement of view: db.view_name
Then please read on.
Firstly, please confirm if your VIEW contains a keyword like “SORT”, by running “SHOW CREATE TABLE view_name”. If the output looks something like below:
CREATE VIEW default.test_view AS SELECT a, sort FROM default.test
Notice the word “sort” in the query, then you are hitting the issue I am talking about in this post.
This is because in CDH5.12.x or Impala 2.9 version, the JIRA IMPALA-4166 was introduced. This JIRA adds support for “SORT BY” operation in Impala, which makes “SORT” a keyword in Impala from this version onwards. And because SORT is now a keyword, any reference to it in other part of the query will cause the query to be invalid, and cause the query to fail.
You can see below stacktrace from Impala Coordinator log:
CAUSED BY: TableLoadingException: Failed to parse view-definition statement of view: db.view_name @ 0x83efb9 impala::Status::Status() @ 0xb747c2 impala::JniUtil::GetJniExceptionMsg() @ 0xa7dfcb impala::Frontend::GetExecRequest() @ 0xa98fc8 impala::ImpalaServer::ExecuteInternal() @ 0xaa0278 impala::ImpalaServer::Execute() @ 0xadf28e impala::ImpalaServer::query() @ 0xd71cd5 beeswax::BeeswaxServiceProcessor::process_query() @ 0xd74fd4 beeswax::BeeswaxServiceProcessor::dispatchCall() @ 0x80ecdc apache::thrift::TDispatchProcessor::process() @ 0x1b596ab apache::thrift::server::TThreadPoolServer::Task::run() @ 0x1b40fc9 apache::thrift::concurrency::ThreadManager::Worker::run() @ 0x9d5c69 impala::ThriftThread::RunRunnable() @ 0x9d6a42 boost::detail::function::void_function_obj_invoker0<>::invoke() @ 0xbd6fe2 impala::Thread::SuperviseThread() @ 0xbd7744 boost::detail::thread_data<>::run() @ 0xe6417a (unknown)
To fix the issue, there are two ways:
1. Add “`” to the SORT keyword, by DROP and re-CREATE the view, like below:
DROP VIEW default.test_view; Query: DROP VIEW default.test_view Fetched 0 row(s) in 4.35s CREATE VIEW default.test_view AS SELECT a, `sort` FROM default.test; Query: CREATE VIEW default.test_view AS SELECT a, `sort` FROM default.test Fetched 0 row(s) in 3.87s SELECT * FROM default.test_view; Query: SELECT * FROM default.test_view Fetched 0 row(s) in 3.63s
2. DROP and re-CREATE the VIEW in beeline, without modifying the VIEW creation query, if your view is complex.
This will work because Hive will add “`” for you automatically, which Impala does not.
This should help to resolve your cluster upgrade issue.
If this post helps, please share your experience below in the comments, thanks.