Setting up Cloudera ODBC driver on Windows 10

Recently I have seen many CDH users having trouble setting up the Hive/Impala ODBC drivers on a Windows 10 machine to connect to a remote Kerberized cluster. The connection keeps failing with Kerberos related error messages like the ones below:

[Cloudera][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credential cache is empty).

OR

[Cloudera][ImpalaODBC] (100) Error from the Impala Thrift API: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No credentials cache found)

To help CDH users get this working without too much hassle, I have compiled the list of steps below for reference. I have tested them on a Windows 10 VM.

1. For Kerberos authentication to work, you need a valid Kerberos ticket on your client machine, which here is Windows 10. Hence, you will need to download and install the MIT Kerberos client tool so that you can authenticate yourself against the remote cluster, much like running “kinit” on Linux.

To get the tool, please visit http://web.mit.edu/kerberos/dist and follow the links.
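
Once the client is installed, and steps 2 and 4 below are also done, you can get a ticket from a Command Prompt much like on Linux. The principal below is only an example; use your own. The installer also provides a GUI, the MIT Kerberos Ticket Manager, which does the same thing.

rem myuser@EXAMPLE.COM is a made-up principal - replace it with your own user and realm
kinit myuser@EXAMPLE.COM
rem list the cached tickets; if this picks up the Windows built-in klist instead,
rem run MIT's klist from its installation directory
klist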

2. In order for the client machine to talk to the remote KDC server that contains the principal database, we need a valid krb5 configuration file on the client side. This file normally sits under /etc/krb5.conf on Linux. On Windows 10, it should be under C:\ProgramData\MIT\Kerberos5\krb5.ini. Please take the krb5.conf file from your cluster and copy it to this location on your Windows machine. Please be aware that the file name on Windows should be krb5.ini, not krb5.conf. Also note that C:\ProgramData is a hidden directory, so you will need to unhide it first in File Explorer before you can access the files underneath it.
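
For reference, a minimal krb5.ini looks something like the below. The realm and KDC host names here are made up; take the real values straight from the krb5.conf on your cluster.

# example values only - copy the real settings from /etc/krb5.conf on the cluster
[libdefaults]
  default_realm = EXAMPLE.COM

[realms]
  EXAMPLE.COM = {
    kdc = kdc01.example.com
    admin_server = kdc01.example.com
  }

[domain_realm]
  .example.com = EXAMPLE.COM
  example.com = EXAMPLE.COM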

3. Make sure that you connect to the correct port number. For Hive, it is normally 10000 by default. For Impala, it should be 21050, NOT 21000, which is used by impala-shell.

If you have a Load Balancer set up for either Hive or Impala, then the port number could also be different; please consult your system admin to get the correct port number if this is the case.

4. Add a Windows system variable KRB5CCNAME with the value “C:\krb5\krb5cc”, where “krb5cc” is the file name for the Kerberos ticket cache. It can be anything, but we commonly use krb5cc or krb5cache. To do so, please follow the steps below (a command-line alternative is shown after the list):

a. open “File Explorer”
b. right click on “This PC”
c. select “Properties”
d. next to “Computer name”, click on “Change settings”
e. click on “Advanced” tab and then “Environment Variables”
f. under “System Variables”, click on “New”
g. enter “KRB5CCNAME” in “Variable name” and “C:\krb5\krb5cc” in “Variable value” (without double quotes)
h. click on “OK” and then “OK” again
i. restart Windows
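
If you prefer the command line, the same system variable can be set from an administrator Command Prompt instead of steps a to h. The value below is the same example path as above; create the C:\krb5 folder first if it does not exist.

rem run these from an elevated (administrator) Command Prompt
mkdir C:\krb5
rem /M writes the variable at the system (machine) level rather than for the current user;
rem the new value only applies to newly started programs (or after the restart in step i)
setx KRB5CCNAME "C:\krb5\krb5cc" /M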

5. If you have SSL enabled for either Hive or Impala, you will also need to “Enable SSL” for the ODBC driver. This setting can be found under the “SSL Options” popup window of the driver setup.

Please note that “SSL Options” is only available in newer versions of the ODBC driver. If you do not see this option, please upgrade the ODBC driver to the latest version. At the time of writing, the Hive ODBC Driver is at version 2.5.24.

That should be it. The above are the steps most commonly missed by Windows users when trying to connect to Hive or Impala via ODBC. If you have encountered other problems that need extra steps, please leave a comment below and I will update my post.

Hope the above helps.

Impala query failed with error: “Incompatible Parquet Schema”

Yesterday, I was dealing with an issue where a very simple Impala SELECT query failed with an “Incompatible Parquet schema” error. I have confirmed the following workflow that triggered the error:

  1. A Parquet file is created by an external library
  2. The Parquet file is loaded into a Hive/Impala table
  3. Querying the table through Impala fails with the below error message:

    incompatible Parquet schema for column 'db_name.tbl_name.col_name'. Column type: DECIMAL(19, 0), Parquet schema:\noptional byte_array col_name [i:2 d:1 r:0]

  4. The same query works well in Hive

This is because Impala currently does not support all of the decimal specs that are supported by Parquet. Currently, Parquet supports the following specs:

  • int32: for 1 <= precision <= 9
  • int64: for 1 <= precision <= 18; precision < 10 will produce a warning
  • fixed_len_byte_array: precision is limited by the array size. Length n can store <= floor(log_10(2^(8*n - 1) - 1)) base-10 digits
  • binary: precision is not limited, but is required. The minimum number of bytes to store the unscaled value should be used.

Please refer to the Parquet Logical Type Definitions page for details.

However, Impala only supports fixed_len_byte_array and none of the others. This has been reported in the upstream JIRA: IMPALA-2494

The only workaround for now is to create a Parquet file that uses a supported spec for the DECIMAL column, or simply create the Parquet file through either Hive or Impala, as sketched below.
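
As a rough sketch (the table and column names are made up to match the error message above, and the table is assumed to have just the one DECIMAL column), re-writing the data through Hive produces Parquet files whose DECIMAL column uses the fixed_len_byte_array encoding that Impala understands:

-- run from Hive, since Hive can still read the original externally-generated file
CREATE TABLE db_name.tbl_name_fixed (
  col_name DECIMAL(19, 0)
)
STORED AS PARQUET;

-- Hive writes DECIMAL values using fixed_len_byte_array, which Impala supports
INSERT OVERWRITE TABLE db_name.tbl_name_fixed
SELECT col_name FROM db_name.tbl_name;

After running INVALIDATE METADATA on the new table in Impala, the same SELECT against db_name.tbl_name_fixed should no longer hit the schema error.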

Impala Query fails with NoSuchObjectException error

In the last few months, I have seen CDH users hitting an Impala query error, NoSuchObjectException, very often. This happens when running a query against a particular table with INT partition columns, and it fails with the below message:

WARNINGS: ImpalaRuntimeException: Error making 'alter_partitions' RPC to Hive Metastore: CAUSED BY: InvalidOperationException: Alter partition operation failed: NoSuchObjectException(message:partition values=[2017, 6, 1, 8])

We have confirmed that the table has four partition columns with the Integer data type, and selecting an individual partition works.

The following scenario will trigger such an error:

  • Partitions with INT data type
  • Partition data was inserted from Hive with leading zeros, using something like the below query:

    INSERT OVERWRITE TABLE test_tbl PARTITION (year = '2017', month = '06'....) .....

  • Partition data will be created under an HDFS location like below:

    hdfs://nameservice1/user/hive/warehouse/test_tbl/year=2017/month=06/day=01/hour=08

  • When queried through Impala, since the data type is INT, Impala will convert values from “06” to 6, “01” to 1 etc., and will be looking for the location:

    hdfs://nameservice1/user/hive/warehouse/test_tbl/year=2017/month=6/day=1/hour=8

    instead of:

    hdfs://nameservice1/user/hive/warehouse/test_tbl/year=2017/month=06/day=01/hour=08

    hence triggering the NoSuchObjectException error.

To fix the issue, there are two options:

  1. Convert the data type of partition columns to String, instead of Integer:
    ALTER TABLE test_tbl PARTITION COLUMN (year string);
    ALTER TABLE test_tbl PARTITION COLUMN (month string);
    ALTER TABLE test_tbl PARTITION COLUMN (day string);
    ALTER TABLE test_tbl PARTITION COLUMN (hour string);
    
  2. If the integer type needs to be kept, then we will need to re-build the table into a new one and store the data in locations without leading zeros. This can be done by running the following queries from Impala:
    CREATE TABLE new_test_tbl LIKE test_tbl;
    
    INSERT OVERWRITE TABLE new_test_tbl PARTITION (year, month, day, hour) SELECT * FROM test_tbl;
    

    The new table will have the leading zeros in its partition directories removed, and then we can switch over to use the new table. When writing more data into the new table through Hive, please be sure to remove all leading zeros (see the example below) to prevent the issue from happening again.
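
    For example, when loading new data into the new table from Hive, use plain integer partition values rather than zero-padded strings (staging_tbl and its columns below are just placeholders):

    -- creates .../year=2017/month=6/day=1/hour=8, which matches what Impala looks for
    INSERT OVERWRITE TABLE new_test_tbl PARTITION (year = 2017, month = 6, day = 1, hour = 8)
    SELECT col1, col2 FROM staging_tbl;
    -- writing PARTITION (year = '2017', month = '06', day = '01', hour = '08') instead
    -- would bring back the zero-padded directories and re-introduce the problem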

The above steps should resolve the issue. Hope they help.

Read Files under Sub-Directories for Hive and Impala

Sometimes you might want to store data under sub-directories in HDFS and then have Hive or Impala read from those sub-directories. For example, say you have the following directory structure:

root hdfs     231206 2017-06-30 02:45 /test/table1/000000_0
root hdfs          0 2017-06-30 02:45 /test/table1/child_directory
root hdfs     231206 2017-06-30 02:45 /test/table1/child_directory/000000_0

By default, Hive will only look for files in the root of the directory specified, which in my test case is /test/table1. However, Hive also supports reading all data under the table root's sub-directories. This can be achieved by updating the following settings:

SET mapred.input.dir.recursive=true;
SET hive.mapred.supports.subdirectories=true;
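
For example, assuming the table sitting on top of /test/table1 is called table1, the following Hive query would then read both files:

-- with the two properties above turned on, this counts rows from /test/table1/000000_0
-- as well as /test/table1/child_directory/000000_0
SELECT COUNT(*) FROM table1;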

Impala, on the other hand, currently does not support reading files from a table's sub-directories. This has been reported in the upstream JIRA IMPALA-1944. Currently there is no immediate plan to support such a feature, but it might come in a future release of Impala.

Hope the above information is useful.

Impala Auto Update Metadata Support

Lots of CDH users have requested that Impala support automatic metadata updates, so that they do NOT need to run “INVALIDATE METADATA” every time a table is created or data is updated through other components, like Hive or Pig.
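
For reference, these are the statements that currently have to be run manually from Impala after such changes (the table name is just an example):

-- after a table is created, dropped or has its structure changed outside of Impala
INVALIDATE METADATA db_name.tbl_name;

-- after new data files are added to an existing table, for example by a Hive INSERT
REFRESH db_name.tbl_name;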

I would like to share that this has been reported upstream and is tracked via JIRA: IMPALA-3124. Currently it is still not fixed, and it requires a detailed design to make sure that it will work properly.

There is no ETA at this stage on when this feature will be added. I advise anyone interested in this feature to add comments to the JIRA. The more votes, the better the chance that it will be prioritized.

Hope the above information is helpful.