Setting up Cloudera ODBC driver on Windows 10

Recently I have seen lots of CDH users having trouble setting up the Hive/Impala ODBC drivers on a Windows 10 machine to connect to a remote Kerberized cluster. The connection keeps failing with Kerberos related error messages like the below:

[Cloudera][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credential cache is empty).

OR

[Cloudera][ImpalaODBC] (100) Error from the Impala Thrift API: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No credentials cache found)

To help CDH users get this working without much hassle, I have compiled the list of steps below for reference. I have tested them on a Windows 10 VM.

1. For Kerberos authentication to work, you need a valid Kerberos ticket on your client machine, which in this case is Windows 10. Hence, you will need to download and install the MIT Kerberos client tool so that you can authenticate yourself against the remote cluster, much like running “kinit” on Linux.

To get the tool, please visit http://web.mit.edu/kerberos/dist and follow the download links.
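
Once the MIT client is installed, you can obtain and verify a ticket from a command prompt, much like on Linux. For example (the principal and realm below are placeholders, use your own):

kinit your_user@EXAMPLE.COM
klist

Note that Windows ships its own klist.exe, so if klist shows unexpected output, make sure the MIT Kerberos bin directory comes first on your PATH.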

2. In order for the client machine to talk to the remote KDC server that holds the principal database, we need a valid krb5 configuration file on the client side. This file normally sits under /etc/krb5.conf on Linux. On Windows 10, it should be under C:\ProgramData\MIT\Kerberos5\krb5.ini. Please take the krb5.conf file from your cluster and copy it to this location on your Windows machine. Please be aware that the file name on Windows should be krb5.ini, not krb5.conf. Also note that C:\ProgramData is a hidden directory, so you will need to unhide it first in File Explorer before you can access the files underneath it.
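
As a rough illustration, a minimal krb5.ini looks like the below. EXAMPLE.COM and kdc01.example.com are placeholders only; the real realm and KDC host must come from your cluster’s own krb5.conf:

[libdefaults]
  default_realm = EXAMPLE.COM
  dns_lookup_kdc = false

[realms]
  EXAMPLE.COM = {
    kdc = kdc01.example.com
    admin_server = kdc01.example.com
  }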

3. Make sure that you connect to the correct port number. For Hive, it is normally 10000 by default. For Impala, it should be 21050, NOT 21000, which is the port used by impala-shell.

If you have a Load Balancer set up in front of either Hive or Impala, then the port number could also be different; please consult your system admin to get the correct port number if this is the case.
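
If you are not sure whether the port is even reachable from your Windows machine, a quick check from PowerShell looks like the below (the host name is a placeholder):

Test-NetConnection -ComputerName impalad-host.example.com -Port 21050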

4. Add a Windows system variable KRB5CCNAME with the value “C:\krb5\krb5cc”, where “krb5cc” is the file name for the Kerberos ticket cache. The name can be anything, but we commonly use krb5cc or krb5cache. To do so, please follow the steps below (or use the command-line alternative shown after the list):

a. open “File Explorer”
b. right click on “This PC”
c. select “Properties”
d. next to “Computer name”, click on “Change settings”
e. click on “Advanced” tab and then “Environment Variables”
f. under “System Variables”, click on “New”
g. enter “KRB5CCNAME” in “Variable name” and “C:\krb5\krb5cc” in “Variable value” (without double quotes)
h. click on “OK” and then “OK” again
i. restart Windows
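
Alternatively, the same system variable can be set from an Administrator command prompt with setx (the /M switch makes it a system-wide variable). Make sure the C:\krb5 folder actually exists, and log off or restart afterwards so the change is picked up:

setx KRB5CCNAME "C:\krb5\krb5cc" /M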

5. If you have SSL enabled for either Hive or Impala, you will also need to tick “Enable SSL” for the ODBC driver. This checkbox can be found in the “SSL Options” popup window of the DSN setup dialog.

Please note that “SSL Options” is only available in newer versions of the ODBC driver. If you do not see this option, please upgrade the ODBC driver to the latest version. At the time of writing, the Hive ODBC driver is at version 2.5.24.
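
To tie steps 3 to 5 together, a DSN-less connection string for the Hive ODBC driver would look roughly like the below. Treat this only as a sketch: the host and realm are placeholders, and the exact key names (AuthMech, KrbHostFQDN, KrbServiceName, SSL and so on) can vary between driver versions, so confirm them against the driver’s installation guide:

Driver=Cloudera ODBC Driver for Apache Hive;Host=hs2-host.example.com;Port=10000;HiveServerType=2;AuthMech=1;KrbRealm=EXAMPLE.COM;KrbHostFQDN=hs2-host.example.com;KrbServiceName=hive;SSL=1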

That should be it. The above are the steps most commonly missed by Windows users when trying to connect to Hive or Impala via ODBC. If you have encountered other problems that need extra steps, please leave a comment below and I will update this post.

Hope the above helps.

“No data or no sasl data in the stream” Error in HiveServer2 Log

I have seen lots of users complain about large numbers of “No data or no sasl data in the stream” errors in the HiveServer2 server log, yet they have not noticed any performance impact or query failures in Hive. So I think it would be good to write a post about the likely reason behind this, to clarify things and remove the concern.

The following shows the full error message and stack trace taken from the HiveServer2 log:

ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-533556]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1687)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream

This is most likely to happen when all of the following are true:

  1. You have Kerberos enabled
  2. You have multiple HiveServer2 hosts
  3. You have a Load Balancer in front of the HiveServer2 servers that show these errors

If you have the above setup, the error messages you see in HiveServer2 are harmless and can be safely ignored. They just indicate that the SASL negotiation failed for one particular Hive client, which in this case is the Load Balancer that regularly pings those HiveServer2 instances to check connectivity. Those pings from the LB are plain TCP connections with no SASL handshake, hence the messages.

There are a couple of ways to avoid those messages:

1. Reduce the frequency of pings from the LB. This will reduce the number of errors in the log, but it will not eliminate them. I am not aware of a way to configure the LB to avoid plain TCP connections; that is outside the scope of this post, so you may need to consult the F5 or HAProxy manual for further information (see the HAProxy sketch below for where the check interval lives).
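
As an illustration only, if your LB happens to be HAProxy, the ping frequency is controlled by the check interval on the server lines, something like the below (the host names and the 60s interval are made up; adjust to your own configuration):

backend hiveserver2
    mode tcp
    balance source
    server hs2-1 hs2-host1.example.com:10000 check inter 60s
    server hs2-2 hs2-host2.example.com:10000 check inter 60s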

2. Add a filter to HiveServer2’s log4j configuration to filter out those exceptions:

a. Using Cloudera Manager, navigate to Hive > Configuration > “HiveServer2 Logging Advanced Configuration Snippet (Safety Valve)”
b. Copy and paste the following configuration into the safety valve:

log4j.appender.RFA.filter.1=org.apache.log4j.filter.ExpressionFilter 
log4j.appender.RFA.filter.1.Expression=EXCEPTION ~= org.apache.thrift.transport.TSaslTransportException 
log4j.appender.RFA.filter.1.AcceptOnMatch=false

c. Then save the change and restart the HiveServer2 service through Cloudera Manager.
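
After the restart, you can confirm the filter took effect by watching the HiveServer2 log on the HiveServer2 host. The path below is the typical Cloudera Manager default and may differ in your deployment; if the filter is working, the grep should stay quiet even while the LB keeps pinging:

tail -f /var/log/hive/hadoop-cmf-hive-HIVESERVER2-*.log.out | grep TSaslTransportException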

Hope the above helps.

Sqoop job failed with ClassNotFoundException

In the last few weeks, I was dealing with an issue where a Sqoop import from DB2 into HDFS kept failing with a NoClassDefFoundError. Below are the command details:

sqoop import \
    --connect jdbc:db2://<db2-host-url>:3700/db1 \
    --username user1 \
    --password changeme \
    --table 'ZZZ001$.part_table' \
    --target-dir /path/in/hdfs \
    --fields-terminated-by '\001' \
    -m 1 \
    --validate

And the error message was:

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
        at org.apache.sqoop.mapreduce.db.DBRecordReader.createValue(DBRecordReader.java:197)
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:230)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
        ... 14 more
Caused by: java.lang.NoClassDefFoundError: ZZZ001$_part_table$1
        at ZZZ001$_part_table.init0(ZZZ001$_part_table.java:43)
        at ZZZ001$_part_table.<init>(ZZZ001$_part_table.java:159)
        ... 19 more
Caused by: java.lang.ClassNotFoundException: ZZZ001$_part_table$1
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 21 more

Looking at the error message, the class name ZZZ001$_part_table$1 looked highly suspicious. This was caused by the table name in DB2 itself containing “$”: ZZZ001$.part_table. So when Sqoop generated the class, the name became ZZZ001$_part_table$1, which is an invalid Java class name.

To bypass this issue, the workaround is to force Sqoop to generate a custom class name by passing the “--class-name” parameter. So the new command becomes:

sqoop import \
    --connect jdbc:db2://<db2-host-url>:3700/db1 \
    --username user1 \
    --password changeme \
    --table 'ZZZ001$.part_table' \
    --target-dir /path/in/hdfs \
    --fields-terminated-by '\001' \
    -m 1 \
    --class-name ZZZ001_part_table \
    --validate

Hope the above helps.