“No data or no sasl data in the stream” Error in HiveServer2 Log

I have seen many users complain about large numbers of “No data or no sasl data in the stream” errors in the HiveServer2 server log, even though they notice no performance impact or query failures in Hive. So I think it is worth writing a blog post about the likely reason behind this, to clarify what is happening and remove the concern.

The following shows the full error message and stacktrace taken from the HiveServer2 log:

ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-533556]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:360)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1687)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream

The likely cause is the combination of the following:

  1. You have Kerberos enabled
  2. You have multiple HiveServer2 hosts
  3. You have a Load Balancer enabled in front of the HS2 servers that show such errors

If you have the above setup, the error messages you see in HiveServer2 are harmless and can be safely ignored. They simply indicate that SASL negotiation failed for one particular Hive client, which in this case is the Load Balancer that regularly pings those HiveServer2 instances to check connectivity. Those pings from the LB use plain TCP connections and never complete the SASL handshake, hence the messages.
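If you want to confirm this yourself, opening and closing a plain TCP connection to the HS2 port without any SASL handshake should produce the same log entry. A minimal sketch is below; the host name is a placeholder and 10000 is only the default HiveServer2 port:

# Hypothetical check from any client machine; this mimics what an LB health check does,
# so a new "No data or no sasl data in the stream" line should appear in the HS2 log.
nc -z hs2-host.example.com 10000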

There are a couple of ways to reduce or suppress those messages:

1. Reduce the frequency of pings from the LB. This will reduce the number of errors in the log, but will not eliminate them. I do not know of a way to configure the LB to avoid plain TCP health checks; that is outside the scope of this blog, so please consult the F5 or HAProxy manual for further information (a HAProxy sketch is shown after these steps).

2. Add a filter to HiveServer2’s logging to filter out those exceptions:

a. Using Cloudera Manager, navigate to Hive > Configuration > “HiveServer2 Logging Advanced Configuration Snippet (Safety Valve)”
b. Copy and paste the following configuration into the safety valve:

log4j.appender.RFA.filter.1=org.apache.log4j.filter.ExpressionFilter
log4j.appender.RFA.filter.1.Expression=EXCEPTION ~= org.apache.thrift.transport.TSaslTransportException
log4j.appender.RFA.filter.1.AcceptOnMatch=false

c. Then save and restart the HiveServer2 service through Cloudera Manager.
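For the first option, if you manage the Load Balancer configuration yourself, the following is a minimal HAProxy sketch that simply lengthens the health-check interval. The backend name, host names and port are placeholders, and your actual LB setup will differ:

# Hypothetical HAProxy backend for HiveServer2.
# "check inter 60s" makes the LB health-check each HS2 server only once per minute,
# which reduces how often the TSaslTransportException shows up in the HS2 log.
backend hiveserver2
    mode tcp
    balance source
    server hs2-1 hs2-host1.example.com:10000 check inter 60s
    server hs2-2 hs2-host2.example.com:10000 check inter 60s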

Hope the above helps.

How to set up multiple KDCs through Cloudera Manager

Currently Cloudera Manager does not natively support setting up multiple KDCs in the krb5.conf file. This article explains the workarounds available using the existing features provided by Cloudera Manager.

This article also assumes that you have the krb5.conf file managed by Cloudera Manager.

If you are using a Cloudera Manager version prior to 5.7, follow the steps below:

  1. Go to CM > Administration > Settings > click on “Kerberos” on Filters on the left side > locate “KDC Server Host”, enter the KDC host in the text field:

    kdc-host1.com

  2. On the same page, locate “Advanced Configuration Snippet (Safety Valve) for the Default Realm in krb5.conf”, and enter the following into the text area:

    kdc = kdc-host2.com

  3. Save and then “Deploy Kerberos Client Configuration” (you might need to stop all services first before you can do this)
    The [realms] section in the krb5.conf will be updated as below:

    [realms]
    TEST.COM = {
    kdc = kdc-host1.com
    admin_server = kdc-host1.com
    kdc = kdc-host2.com
    }

If you are using CM5.7 and above, you can also do the following (the above steps should still work):

  1. Go to CM > Administration > Settings > click on “Kerberos” on Filters on the left side > locate “KDC Server Host”, and clear the KDC host text field so that it contains no value
  2. On the same page, locate “Advanced Configuration Snippet (Safety Valve) for the Default Realm in krb5.conf”, and enter the following into the text area:

    kdc = kdc-host1.com
    kdc = kdc-host2.com
    admin_server = kdc-host1.com

  3. Save and then “Deploy Kerberos Client Configuration” (you might need to stop all services first before you can do this)
    The [realms] section in the krb5.conf will be updated as below:

    [realms]
    TEST.COM = {
    kdc = kdc-host1.com
    kdc = kdc-host2.com
    admin_server = kdc-host1.com
    }
    

The reason the second option does not work prior to CM5.7 is that older versions of CM will generate the following line in krb5.conf if the KDC Server Host is empty:

kdc =

which will break the syntax of the krb5.conf file.
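Either way, once the client configuration is deployed, you can verify that the second KDC is actually used for failover. The following is a minimal sketch, assuming MIT Kerberos clients (krb5 1.9 or later, which support KRB5_TRACE) and a test principal of your own; KRB5_TRACE prints which KDC the libraries contact:

# Run on any cluster host after "Deploy Kerberos Client Configuration".
# Stop or block kdc-host1.com first if you want to see the failover to kdc-host2.com.
KRB5_TRACE=/dev/stdout kinit testuser@TEST.COM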

Unable to generate keytab from within Cloudera Manager

When generating credentials through Cloudera Manager, Cloudera Manager will sometimes return the following error:

/usr/share/cmf/bin/gen_credentials_ad.sh failed with exit code 53 and
output of <<
+ export PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/sbin:/usr/sbin:/bin:/usr/bin
+ PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/sbin:/usr/sbin:/bin:/usr/bin
+ KEYTAB_OUT=/var/run/cloudera-scm-server/cmf2781839247630884630.keytab
+ PRINC=sqoop2/<host>@REALM.COM
+ USER=kaupocSuFoZIOIDa
+ PASSWD=REDACTED
+ DIST_NAME=CN=kaupocSuFoZIOIDa,OU=Cloudera,OU=ServersUnix,OU=IT,OU=Basel,OU=AdminUnits,DC=emea,DC=XXXX,DC=com
+ '[' -z /etc/krb5-cdh.conf ']'
+ echo 'Using custom config path '\''/etc/krb5-cdh.conf'\'', contents below:'
+ cat /etc/krb5-cdh.conf
+ SIMPLE_PWD_STR=
+ '[' '' = '' ']'
+ kinit -k -t /var/run/cloudera-scm-server/cmf5575611164358256388.keytab
cdhad@REALM.COM
++ mktemp /tmp/cm_ldap.XXXXXXXX
+ LDAP_CONF=/tmp/cm_ldap.XRbR8Zco
+ echo 'TLS_REQCERT never'
+ echo 'sasl_secprops minssf=0,maxssf=0'
+ export LDAPCONF=/tmp/cm_ldap.XRbR8Zco
+ LDAPCONF=/tmp/cm_ldap.XRbR8Zco
++ ldapsearch -LLL -H ldaps://<ldap-host>:636 -b
OU=Cloudera,OU=ServersUnix,OU=IT,OU=Basel,OU=AdminUnits,DC=emea,DC=xxxx,DC=com
userPrincipalName=sqoop2/<host>@REALM.COM
SASL/GSSAPI authentication started
SASL username: cdhad@REALM.COM
SASL SSF: 0
+ PRINC_SEARCH=
+ set +e
+ echo
+ grep -q userPrincipalName
+ '[' 1 -eq 0 ']'
+ set -e
+ ldapmodify -H ldaps://<ldap-host>:636
++ echo sqoop2/<host>@REALM.COM
++ sed -e 's/\@REALM.COM//g'
++ echo -n '"REDACTED"'
++ iconv -f UTF8 -t UTF16LE
++ base64 -w 0
SASL/GSSAPI authentication started
SASL username: cdhad@REALM.COM
SASL SSF: 0
ldap_add: Server is unwilling to perform (53)
additional info: 0000052D: SvcErr: DSID-031A1248, problem 5003
(WILL_NOT_PERFORM), data 0

If you see a similar error and you know that you have AD enabled for your cluster, then you have come to the right place. This is likely caused by a bug in Cloudera Manager: it does not allow users to change the complexity of the generated password, so if the AD server has password complexity restrictions set up, Cloudera Manager’s request will be rejected.

The fix is simple, but it requires modifying a script shipped with Cloudera Manager. Follow the steps below:

  1. Back up the file /usr/share/cmf/bin/gen_credentials_ad.sh first on the CM host
  2. Add this line to /usr/share/cmf/bin/gen_credentials_ad.sh on line number 15:
    PASSWD="$PASSWD-"
    

    after line:

    PASSWD=$4
    

    Basically this appends a hyphen to CM-generated passwords.

  3. Run Generate Credentials again to see if this helps

If the same error still happens, go back to step 2 and try different variations of the password:

PASSWD="ABC=$PASSWD" # prepends "ABC=" to generated password.

The idea is to meet the AD password complexity requirements.
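For clarity, this is roughly what the edited part of the script looks like after step 2 with the second variation applied; only the relevant lines are shown, and the rest of gen_credentials_ad.sh stays untouched:

PASSWD=$4              # existing line: the password generated by Cloudera Manager
PASSWD="ABC=$PASSWD"   # added line: prepend characters so the password meets
                       # the AD domain's complexity policy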

This issue is likely already fixed in Cloudera Manager’s source code to support more flexibility when generating passwords, but the fix won’t be released until CM5.8 at the earliest.

Kerberos connections to HiveServer2 not working cross-domain

The following is the scenario for the cross-domain problem with a Kerberized cluster:

1. The cluster is within realm “DEV.EXAMPLE.COM”
2. The client is outside the cluster, in realm “EXAMPLE.COM”
3. Connecting to Impala from the client machine works
4. Connecting to HS2 from the client machine does not work and returns the following error:

java.lang.IllegalArgumentException: Illegal principal name <user>@EXAMPLE.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.User.<init>(User.java:50)
	at org.apache.hadoop.security.User.<init>(User.java:43)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1221)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1205)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:689)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
	at org.apache.hadoop.security.User.<init>(User.java:48)
	... 8 more

This is caused by HDFS not resolving the principal from the external realm to a local user in the cluster. To fix the issue, follow the steps below:

1. In Cloudera Manager, go to HDFS > Configuration > search for “Trusted Kerberos Realms” > add “EXAMPLE.COM” to the list
2. Restart HS2 first
3. Confirm that you can now connect to HS2 from the client
4. Restart the rest of the services

This should allow users to connect to HS2 from outside the cluster’s realm.
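For reference, the “Trusted Kerberos Realms” setting ends up as extra hadoop.security.auth_to_local rules in the generated client configuration. Below is a minimal sketch of what such rules might look like for the realm EXAMPLE.COM; the exact rules Cloudera Manager generates may differ:

# Hypothetical auth_to_local rules: strip "@EXAMPLE.COM" from one- and two-component
# principals so they map to local short names, then fall back to the default rule.
RULE:[1:$1@$0](.*@EXAMPLE\.COM)s/@EXAMPLE\.COM//
RULE:[2:$1@$0](.*@EXAMPLE\.COM)s/@EXAMPLE\.COM//
DEFAULT

On most Hadoop 2.x clusters you can check how a principal will be mapped by running something like “hadoop org.apache.hadoop.security.HadoopKerberosName someuser@EXAMPLE.COM”, which prints the resulting short name.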

What do I need to do to get Hive working after enabling Kerberos

This article explains some cleanup tasks that need to be done after Kerberos is enabled in the cluster so that Hive can continue functioning.

1) Clean up the YARN user cache directories at /yarn/nm/usercache/xxxxx. This needs to be done on all nodes in the cluster, and for all the directories defined under the config name “NodeManager Local Directory List” on the CM > YARN > Configuration page.

This is because when the cluster was running under simple AUTH, the YARN job directories were normally created by the yarn or nobody user, depending on the setup. After Kerberos AUTH is enabled, YARN jobs run as the user who submitted the job, and because the user changed, new jobs will not be able to overwrite the original directories or files.

To fix this:

a) Stop the YARN service
b) Remove the user cache directories by running “rm -fr /yarn/nm/usercache/*”; remember, this has to be done on all machines in the cluster (a sketch follows below)
c) Restart the YARN service
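A minimal sketch of step b) across all nodes is below, assuming passwordless SSH from an admin host and that /yarn/nm/usercache is your only NodeManager local directory; substitute your own host list and directories:

# Hypothetical cleanup loop; run only while the YARN service is stopped.
for host in node1.example.com node2.example.com node3.example.com; do
    ssh "$host" 'rm -rf /yarn/nm/usercache/*'
done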

2) You need to keep users in sync between Kerberos principals, OS system users, and HDFS user directories, again on all machines in the cluster.

For example, if you have a Kerberos principal for user “foo”, you will need to create a “foo” system user on all server nodes in the cluster. You will also need to create an HDFS home directory for this user at “/user/foo”, owned by “foo:foo”, so that the user has permission to write to it.
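A minimal sketch for the user “foo” is below; the keytab path and HDFS superuser principal are illustrative placeholders and will differ in your environment:

# On every node in the cluster: create the local OS account
useradd foo

# Once, authenticated as the HDFS superuser (keytab path and principal are examples):
kinit -kt /path/to/hdfs.keytab hdfs@YOUR.REALM.COM
hdfs dfs -mkdir -p /user/foo
hdfs dfs -chown foo:foo /user/foo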

After these changes, the Hive permission errors should be fixed.