How to set up multiple KDCs through Cloudera Manager

Cloudera Manager does not currently support setting up multiple KDCs in the krb5.conf file natively. This article explains the workarounds available using the existing features provided by Cloudera Manager.

This article also assumes that your krb5.conf file is managed by Cloudera Manager.

If you are using a Cloudera Manager version prior to 5.7, follow the steps below:

  1. Go to CM > Administration > Settings > click on “Kerberos” under Filters on the left side > locate “KDC Server Host”, and enter the KDC host in the text field:
    kdc-host1.com
    
  2. On the same page, locate “Advanced Configuration Snippet (Safety Valve) for the Default Realm in krb5.conf”, and enter the following into the text area:
    kdc = kdc-host2.com
    
  3. Save and then “Deploy Kerberos Client Configuration” (you might need to stop all services first before you can do this).
    The [realms] section in krb5.conf will be updated as below:

    [realms]
    TEST.COM = {
    kdc = kdc-host1.com
    admin_server = kdc-host1.com
    kdc = kdc-host2.com
    }
    

If you are using CM 5.7 or above, you can also do the following (the steps above should still work):

  1. Go to CM > Administration > Settings > click on “Kerberos” under Filters on the left side > locate “KDC Server Host”, and clear the text field so that it contains no value
  2. On the same page, locate “Advanced Configuration Snippet (Safety Valve) for the Default Realm in krb5.conf”, and enter the following into the text area:
    kdc = kdc-host1.com
    kdc = kdc-host2.com
    admin_server = kdc-host1.com
    
  3. Save and then “Deploy Kerberos Client Configuration” (you might need to stop all services first before you can do this).
    The [realms] section in krb5.conf will be updated as below:

    [realms]
    TEST.COM = {
    kdc = kdc-host1.com
    kdc = kdc-host2.com
    admin_server = kdc-host1.com
    }
    

The reason the second option does not work prior to CM 5.7 is that older versions of CM generate the following line in krb5.conf if the KDC Server Host is empty:

kdc =

which breaks the syntax of the krb5.conf file.
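
Whichever method you use, you can sanity-check that the second KDC is actually picked up once the client configuration is deployed. A minimal sketch, assuming a test principal such as alice@TEST.COM exists and that you can temporarily make kdc-host1.com unreachable:

# Trace which KDC each Kerberos request is sent to (MIT Kerberos 1.9+).
# With kdc-host1.com unreachable, the trace should show the request being
# retried against kdc-host2.com before the ticket is granted.
KRB5_TRACE=/dev/stdout kinit alice@TEST.COM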

Unable to generate keytab from within Cloudera Manager

When generating credentials through Cloudera Manager, you may sometimes get the following error:

/usr/share/cmf/bin/gen_credentials_ad.sh failed with exit code 53 and
output of <<
+ export PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/sbin:/usr/sbin:/bin:/usr/bin
+ PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/sbin:/usr/sbin:/bin:/usr/bin
+ KEYTAB_OUT=/var/run/cloudera-scm-server/cmf2781839247630884630.keytab
+ PRINC=sqoop2/<host>@REALM.COM
+ USER=kaupocSuFoZIOIDa
+ PASSWD=REDACTED
+ DIST_NAME=CN=kaupocSuFoZIOIDa,OU=Cloudera,OU=ServersUnix,OU=IT,OU=Basel,OU=AdminUnits,DC=emea,DC=XXXX,DC=com
+ '[' -z /etc/krb5-cdh.conf ']'
+ echo 'Using custom config path '\''/etc/krb5-cdh.conf'\'', contents below:'
+ cat /etc/krb5-cdh.conf
+ SIMPLE_PWD_STR=
+ '[' '' = '' ']'
+ kinit -k -t /var/run/cloudera-scm-server/cmf5575611164358256388.keytab
cdhad@REALM.COM
++ mktemp /tmp/cm_ldap.XXXXXXXX
+ LDAP_CONF=/tmp/cm_ldap.XRbR8Zco
+ echo 'TLS_REQCERT never'
+ echo 'sasl_secprops minssf=0,maxssf=0'
+ export LDAPCONF=/tmp/cm_ldap.XRbR8Zco
+ LDAPCONF=/tmp/cm_ldap.XRbR8Zco
++ ldapsearch -LLL -H ldaps://<ldap-host>:636 -b
OU=Cloudera,OU=ServersUnix,OU=IT,OU=Basel,OU=AdminUnits,DC=emea,DC=xxxx,DC=com
userPrincipalName=sqoop2/<host>@REALM.COM
SASL/GSSAPI authentication started
SASL username: cdhad@REALM
SASL SSF: 0
+ PRINC_SEARCH=
+ set +e
+ echo
+ grep -q userPrincipalName
+ '[' 1 -eq 0 ']'
+ set -e
+ ldapmodify -H ldaps://<ldap-host>:636
++ echo sqoop2/<host>@REALM.COM
++ sed -e 's/\@REALM.COM//g'
++ echo -n '"REDACTED"'
++ iconv -f UTF8 -t UTF16LE
++ base64 -w 0
SASL/GSSAPI authentication started
SASL username: cdhad@REALM.COM
SASL SSF: 0
ldap_add: Server is unwilling to perform (53)
additional info: 0000052D: SvcErr: DSID-031A1248, problem 5003
(WILL_NOT_PERFORM), data 0

If you see a similar error and you know that you have AD enabled for your cluster, then you are in the right place. This is likely caused by a bug in Cloudera Manager: it does not allow users to change the complexity of the passwords it generates, so if the AD server has password complexity restrictions set up, Cloudera Manager’s request will be rejected.

The fix for this issue is simple, but it requires changing some of Cloudera Manager’s source code. Follow the steps below:

  1. Back up the file /usr/share/cmf/bin/gen_credentials_ad.sh on the CM host first
  2. Add this line to /usr/share/cmf/bin/gen_credentials_ad.sh at line 15:
    PASSWD="$PASSWD-"
    

    after line:

    PASSWD=$4
    

    This simply appends a hyphen to CM-generated passwords.

  3. Run Generate Credentials again to see if this helps

If the same error still happens, go back to step 2 and try different variations of the password:

PASSWD="ABC=$PASSWD" # prepends "ABC=" to generated password.

The idea is to meet the AD password requirements.
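
For reference, here is a rough sketch of what the edited region of gen_credentials_ad.sh might look like after step 2. The surrounding lines are illustrative only and may differ between CM releases; the only change you make is the extra PASSWD line:

# ... earlier in gen_credentials_ad.sh the script reads its arguments,
# ending with (around line 15 in CM 5.x; exact position may vary):
PASSWD=$4

# Added line: append a hyphen (or prepend a string such as "ABC=") so that
# the generated password satisfies the AD password complexity policy.
PASSWD="$PASSWD-"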

This issue is likely already fixed in Cloudera Manager’s source code to support more flexibility when generating passwords, but the fix won’t be released until CM 5.8 at the earliest.

Kerberos connections to HiveServer2 not working cross domain

The following is the scenario of the cross-domain problem with a Kerberized cluster:

1. The cluster is within realm “DEV.EXAMPLE.COM”
2. The client is outside the cluster, in realm “EXAMPLE.COM”
3. Connecting to Impala from the client machine works
4. Connecting to HS2 from the client machine does not work and returns the following error:

java.lang.IllegalArgumentException: Illegal principal name <user>@EXAMPLE.COM: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.User.<init>(User.java:50)
	at org.apache.hadoop.security.User.<init>(User.java:43)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1221)
	at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1205)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:689)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to <user>@EXAMPLE.COM
	at org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
	at org.apache.hadoop.security.User.<init>(User.java:48)
	... 8 more

This is caused by Hadoop’s auth_to_local rules not mapping the principal from the foreign realm to a local user in the cluster. To fix the issue, follow the steps below:

1. In Cloudera Manager go to HDFS > Configuration > search for “Trusted Kerberos Realms” > add “EXAMPLE.COM” to the list
2. Restart HS2 first
3. Confirm that you can now connect to HS2 from the client
4. Restart the rest of the services

This should allow users to connect to HS2 from outside the cluster’s realm.
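
To verify the mapping after the change, you can test how Hadoop’s auth_to_local rules translate the foreign principal on a cluster node. A minimal check, with “alice” as a hypothetical user:

# Prints how the principal is mapped to a short (local) name using the
# cluster's current auth_to_local rules.
hadoop org.apache.hadoop.security.HadoopKerberosName alice@EXAMPLE.COM
# Expected once EXAMPLE.COM is trusted:
#   Name: alice@EXAMPLE.COM to alice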

What do I need to do to get Hive working after enabling Kerberos

This article explains some cleanup tasks that need to be done after Kerberos is enabled in the cluster so that Hive continues functioning.

1) Clean up the YARN user cache directories at /yarn/nm/usercache/xxxxx. This needs to be done on all nodes in the cluster, and for all the directories defined under the “NodeManager Local Directory List” configuration on the CM > YARN > Configuration page.

This is because when the cluster was running under simple authentication, the YARN job directories were normally created by the yarn or nobody user, depending on the setup. After Kerberos authentication is enabled, YARN jobs run as the user who triggered the job, and because the user has changed, new jobs will not be able to overwrite the original directories or files.

To fix this:

a) Stop the YARN service
b) Remove the user cache directories by running “rm -fr /yarn/nm/usercache/*”; remember, this has to be done on all machines in the cluster (see the scripted sketch after this list)
c) Restart the YARN service
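
A hypothetical way to script step b) across the cluster, assuming passwordless SSH as root and that /yarn/nm is the only directory in “NodeManager Local Directory List” (adjust the host names and paths to match your environment):

# Clear the YARN user cache on every NodeManager host while YARN is stopped.
for host in node1.example.com node2.example.com node3.example.com; do
  ssh root@"$host" 'rm -fr /yarn/nm/usercache/*'
done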

2) Sync all users between Kerberos principals, OS system users, and HDFS user directories, again on all machines in the cluster.

For example, if you have a Kerberos principal for user “foo”, you will need to create a “foo” system user on all server nodes in the cluster. You will also need to create an HDFS directory for this user at “/user/foo”, owned by “foo:foo”, so that the user has permission to write to it.
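
As an illustration for the hypothetical user “foo” (run the hdfs commands from a node with an HDFS superuser, such as the hdfs user, holding a valid Kerberos ticket):

# Create the matching OS user; repeat on every node in the cluster.
useradd foo

# Create the user's HDFS home directory and hand ownership to foo.
sudo -u hdfs hdfs dfs -mkdir -p /user/foo
sudo -u hdfs hdfs dfs -chown foo:foo /user/foo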

After these changes, the Hive permission errors should be fixed.