“No data or no sasl data in the stream” Error in HiveServer2 Log

I have seen many users complain about lots of “No data or no sasl data in the stream” errors in the HiveServer2 log, even though they have not noticed any performance impact or query failures in Hive. So I think it is worth writing a post about the likely reason behind this, to clear up the concerns that users have.

The following shows the full error message and stacktrace taken from HiveServer2 log:

ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-533556]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:360)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1687)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream

The likely cause is the following setup:

  1. You have Kerberos enabled
  2. You have multiple HiveServer2 hosts
  3. You have a load balancer in front of all the HS2 servers that show these errors

If you have the above setup, the error message you saw in the HiveServer2 log is harmless and can be safely ignored. It only indicates that SASL negotiation failed for one particular client, which in this case is the load balancer that regularly pings each HiveServer2 to check connectivity. The LB makes those pings over a plain TCP connection, hence the messages.
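To get a feel for how often the load balancer triggers this, you can count the errors in the log and compare the number against the health-check interval. Below is a minimal sketch, assuming each occurrence logs the RuntimeException line shown in the stack trace above; the function name is made up for illustration and the log path is whatever your installation uses.

```shell
# Count how many times the SASL error was logged in a HiveServer2 log file.
# Only the "java.lang.RuntimeException: ..." line is matched, so the trailing
# "Caused by:" line of the same stack trace is not counted twice.
count_sasl_errors() {
    grep -c 'RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream' "$1"
}
```

If the count grows at roughly the same rate as the LB health checks, that supports the explanation above.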

There are a couple of ways to avoid those messages:

1. Reduce the frequency of pings from the LB. This reduces the number of errors in the log but does not eliminate them. I do not know of a way to configure the LB to avoid plain TCP connections; that is outside the scope of this blog, and you might need to consult the F5 or HAProxy manual for further information.

2. Add a filter to HiveServer2’s logging to filter out those exceptions:

a. Using Cloudera Manager, navigate to Hive > Configuration > “HiveServer2 Logging Advanced Configuration Snippet (Safety Valve)”
b. Copy and paste the following configuration into the safety valve:

log4j.appender.RFA.filter.1=org.apache.log4j.filter.ExpressionFilter
log4j.appender.RFA.filter.1.Expression=EXCEPTION ~= org.apache.thrift.transport.TSaslTransportException
log4j.appender.RFA.filter.1.AcceptOnMatch=false

c. Then save and restart HiveServer2 service through Cloudera Manager.

Hope the above helps.

Enabling Kerberos Debug for Hive

From time to time, we need to troubleshoot a Kerberos failure in Hive and locate its root cause. Below I outline the steps to turn on debug messages on both the client side and the HiveServer2 server side.

  1. To enable it on the Hive client side (beeline), simply run the following export commands before you run the beeline command:

    export HADOOP_JAAS_DEBUG=true; export HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'

    Then the debug messages will be printed on the shell when you run beeline.

  2. To enable Kerberos debug on the HiveServer2 side (assuming you are using Cloudera Manager):
    1. Go to CM > Hive > Configuration
    2. Locate “HiveServer2 Environment Advanced Configuration Snippet (Safety Valve)”
    3. Add the following to the text area:

      HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true'

    4. Save and restart Hive service

    Once restarted, you will be able to find the Kerberos debug messages in HiveServer2’s process directory on the server host, in /var/run/cloudera-scm-agent/process/XXX-hive-HIVESERVER2/logs/stdout.log, where XXX is the largest number among the HiveServer2 directories.
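The two steps above can be wrapped in small helpers. This is only a sketch: the function names are invented for illustration, the first helper sets the same variables as the export commands but only for a single command, and the second assumes the process directories follow the XXX-hive-HIVESERVER2 naming described above.

```shell
# Run a single command with Kerberos/JAAS debug output enabled,
# leaving the current shell's environment untouched.
with_krb_debug() {
    env HADOOP_JAAS_DEBUG=true \
        HADOOP_OPTS='-Dsun.security.krb5.debug=true -Dsun.security.jgss.debug=true' \
        "$@"
}

# Print the HiveServer2 process directory with the largest numeric prefix.
# The base directory defaults to the Cloudera agent path mentioned above.
latest_hs2_process_dir() {
    base=${1:-/var/run/cloudera-scm-agent/process}
    name=$(ls -1 "$base" 2>/dev/null \
        | grep -E '^[0-9]+-hive-HIVESERVER2$' \
        | sort -n | tail -n 1)
    [ -n "$name" ] && printf '%s/%s\n' "$base" "$name"
}
```

For example, `with_krb_debug beeline` followed by your usual beeline arguments runs the client with debugging on, and `less "$(latest_hs2_process_dir)/logs/stdout.log"` opens the server-side output on the HiveServer2 host.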

The sample debug message for Kerberos looks like this:

Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
[UnixLoginModule]: succeeded importing info:
uid = 0
gid = 0
supp gid = 0
Debug is true storeKey false useTicketCache true useKeyTab false doNotPrompt true ticketCache is null isInitiator true KeyTab is null refreshKrb5Config is false principal is null tryFirstPass is false useFirstPass is false storePass is false clearPass is false
Acquire TGT from Cache
>>>KinitOptions cache name is /tmp/krb5cc_0
>>>DEBUG client principal is impala/{host-name}@REAL.COM
>>>DEBUG server principal is krbtgt/REAL.COM@REAL.COM
>>>DEBUG key type: 23
>>>DEBUG auth time: Sun Aug 13 21:07:46 PDT 2017
>>>DEBUG start time: Sun Aug 13 21:07:46 PDT 2017
>>>DEBUG end time: Mon Aug 14 07:07:46 PDT 2017
>>>DEBUG renew_till time: Sun Aug 20 21:07:46 PDT 2017
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
>>>DEBUG client principal is impala/{host-name}@REAL.COM
>>>DEBUG server principal is X-CACHECONF:/krb5_ccache_conf_data/pa_type/krbtgt/REAL.COM@REAL.COM
>>>DEBUG key type: 0
>>>DEBUG auth time: Wed Dec 31 16:00:00 PST 1969
>>>DEBUG start time: null
>>>DEBUG end time: Wed Dec 31 16:00:00 PST 1969
>>>DEBUG renew_till time: null
>>> CCacheInputStream: readFlags()
Principal is impala/{host-name}@REAL.COM
[UnixLoginModule]: added UnixPrincipal, UnixNumericUserPrincipal, UnixNumericGroupPrincipal(s), to Subject
Commit Succeeded
Search Subject for Kerberos V5 INIT cred (<>, sun.security.jgss.krb5.Krb5InitCredential)
Found ticket for impala/{host-name}@REAL.COM to go to krbtgt/REAL.COM@REAL.COM expiring on Mon Aug 14 07:07:46 PDT 2017
Entered Krb5Context.initSecContext with state=STATE_NEW
Found ticket for impala/{host-name}@REAL.COM to go to krbtgt/REAL.COM@REAL.COM expiring on Mon Aug 14 07:07:46 PDT 2017
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
default etypes for default_tgs_enctypes: 23.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KdcAccessibility: reset
>>> KrbKdcReq send: kdc=kdc-host.com TCP:88, timeout=3000, number of retries =3, #bytes=1607
>>> KDCCommunication: kdc=kdc-host.com TCP:88, timeout=3000,Attempt =1, #bytes=1607
>>>DEBUG: TCPClient reading 1581 bytes
>>> KrbKdcReq send: #bytes read=1581
>>> KdcAccessibility: remove kdc-host.com
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
>>> KrbApReq: APOptions are 00100000 00000000 00000000 00000000
>>> EType: sun.security.krb5.internal.crypto.ArcFourHmacEType
Krb5Context setting mySeqNumber to: 789412608
Created InitSecContextToken:

From the above message, you can see at least the following information:

  • Client config file for Kerberos: /etc/krb5.conf
  • Ticket cache file: /tmp/krb5cc_0
  • Client principal name: impala/{host-name}@REAL.COM
  • KDC server host: kdc=kdc-host.com, using a TCP connection via port 88 (TCP:88)
  • and much more that might be useful for your troubleshooting
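Because the debug output is long, it can help to pull out just the lines that carry the items listed above. The following is a rough sketch that assumes the output was saved to a file and that the line prefixes match the sample shown earlier; the function name is hypothetical.

```shell
# Print the lines that reveal the krb5 config file, the ticket cache,
# the client principal and the KDC host, skipping exact duplicates.
krb_debug_summary() {
    awk '/Native config name:|KinitOptions cache name is|client principal is|KrbKdcReq send: kdc=/ && !seen[$0]++' "$1"
}
```

Calling `krb_debug_summary` with the saved file as its argument prints only those lines, in the order they first appear.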

Hope the above helps.

How to enable HiveServer2 audit log through Cloudera Manager

This article explains the steps required to enable the audit log for HiveServer2, so that all queries run through HiveServer2 are audited into a central log file.

Please follow the steps below:

  1. Go to Cloudera Manager home page > Hive > Configuration
  2. Tick “Enable Audit Collection”
  3. Ensure the “Audit Log Directory” location points to a path that has enough disk space
  4. Go to Cloudera Manager home page > click on “Cloudera Management Service” > Instances
  5. Click on “Add Role Instances” button on the top right corner of the page
  6. Choose a host for Navigator Audit Server & Navigator Metadata Server
  7. Then follow on screen instructions to finish adding the new roles
  8. Once the roles are added successfully, Cloudera Manager will ask you to restart a few services, including Hive
  9. Go ahead and restart Hive

After restarting, Hive’s audit log will be enabled and written to the /var/log/hive/audit directory by default.

Please note that you are not required to start the Navigator services. If you don’t need them running, you can leave them stopped and Hive’s audit logs should still function as normal. However, Navigator must be installed for the audit log to function properly, because auditing requires some libraries from Navigator.

How to query a multi-delimited table in Hive

This article explains how to query a multi-delimited Hive table in CDH5.3.x.

The use case is as follows:

Given the following table definition:

CREATE TABLE test_multi (a string, b string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES (
    "field.delim"="#|",
    "collection.delim"=":",
    "mapkey.delim"="@"
);
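To make the "#|" field delimiter concrete, here is a small illustration of what a raw data row for this table might look like and how it splits into the two columns. The row content is an assumption based on the query result shown in this post, the helper names are made up, and the awk split only approximates what the SerDe does for field.delim.

```shell
# Write one sample row to the given file; "#|" separates column a from column b.
make_sample_row() {
    printf 'eric#|test more\n' > "$1"
}

# Split each line on the literal two-character delimiter "#|" and print
# the resulting columns separated by a tab.
split_multi_delim() {
    awk -F'#[|]' -v OFS='\t' '{ $1 = $1; print }' "$1"
}
```

Running `make_sample_row` and then `split_multi_delim` on the same file name prints the two fields separated by a tab.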

Querying all columns works:

select * from test_multi;
+---------------+---------------+
| test_multi.a  | test_multi.b  |
+---------------+---------------+
| eric          | test more     |
+---------------+---------------+
1 row selected (1.58 seconds)

However, querying a single column produces the following error in HS2:

select a from test_multi;
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
	at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334)
	at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126)
	... 22 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1953)
	at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:304)
	... 24 more

This happens in CDH5.3.3, which ships Hive 0.13, and I am not sure whether it also applies to CDH5.4.x and CDH5.5.x.

This happens because the class org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe was not loaded into Hive’s UDF list when HiveServer2 started up. We need to make the JAR file that contains this class available in Hive’s AUX directory.

Steps as follows:

1) Locate the AUX directory for HiveServer2. If you don’t have one, create one and update Hive’s configuration through Cloudera Manager. If you don’t use Cloudera Manager, simply create a directory on the HiveServer2 host; in my case it is /hive/jars.

2) Create a symlink to the file /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar (or if not using Cloudera Manager, /usr/lib/hive/lib/hive-contrib.jar) from within /hive/jars:

ln -s /opt/cloudera/parcels/CDH/lib/hive/lib/hive-contrib.jar /hive/jars/hive-contrib.jar

3) If you don’t use Cloudera Manager, add the following:


to hive-site.xml for HiveServer2.

If you use Cloudera Manager, simply go to step 4.

4) Restart HiveServer2.

This should remove the error we saw earlier in the post and get the Hive query working.
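Step 3 refers to a hive-site.xml snippet that is not shown in the text. As a guess at what it contained, the helper below prints a property block for Hive's auxiliary JARs setting. Both the property name hive.aux.jars.path and the value are my assumption rather than something taken from the original, so check them against your Hive version before using them.

```shell
# Print a hive-site.xml property block for the given JAR path.
# ASSUMPTION: hive.aux.jars.path is the intended property; the original
# snippet is missing, so this is a reconstruction, not the author's text.
aux_jars_property() {
    cat <<EOF
<property>
  <name>hive.aux.jars.path</name>
  <value>$1</value>
</property>
EOF
}
```

For the directory used in this post, `aux_jars_property /hive/jars/hive-contrib.jar` prints a block that could be pasted into hive-site.xml on the HiveServer2 host.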

Hope this helps.

Kerberos connections to HiveServer2 not working cross domain

The following scenario shows the cross-domain problem with a Kerberized cluster:

1. The cluster is within realm “DEV.EXAMPLE.COM”
2. The client is outside the cluster, in realm “EXAMPLE.COM”
3. Connecting to Impala from the client machine works
4. Connecting to HS2 from the client machine does not work and fails with the following error:

java.lang.IllegalArgumentException: Illegal principal name <user>@EXAMPLE.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.User.<init>(User.java:50)
	at org.apache.hadoop.security.User.<init>(User.java:43)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1221)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1205)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:689)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
	at org.apache.hadoop.security.User.<init>(User.java:48)
	... 8 more

This is caused by HDFS not resolving the principal from the other realm to a local user in the cluster: no mapping rule matches <user>@EXAMPLE.COM, as the NoMatchingRule exception shows. To fix the issue, follow the steps below:

1. In Cloudera Manager, go to HDFS > Configuration > search for “Trusted Kerberos Realms” > add “EXAMPLE.COM” to the list
2. Restart HS2 first
3. Confirm that you can now connect to HS2 from the client
4. Restart the rest of the services

This should allow users to connect to HS2 from outside the cluster’s realm.
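To illustrate what "No rules applied" means, the sketch below imitates the mapping from a principal to a short user name for a list of trusted realms. It is a simplified stand-in written for this post, not Hadoop's actual rule engine, and the function name is hypothetical.

```shell
# Map user@REALM to "user" if REALM is one of the trusted realms given
# after the principal; otherwise fail like the NoMatchingRule case above.
short_name() {
    principal=$1
    shift
    for realm in "$@"; do
        case $principal in
            *@"$realm")
                printf '%s\n' "${principal%@*}"
                return 0
                ;;
        esac
    done
    echo "No rules applied to $principal" >&2
    return 1
}
```

With only DEV.EXAMPLE.COM trusted, `short_name alice@EXAMPLE.COM DEV.EXAMPLE.COM` fails, while adding EXAMPLE.COM to the list makes it print `alice`. On a cluster node, Hadoop's own mapping can be checked with `hadoop org.apache.hadoop.security.HadoopKerberosName` followed by the principal, which evaluates the configured rules.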