Unable to generate keytab from within Cloudera Manager

When generating credentials through Cloudera Manager, it sometimes returns the following error:

/usr/share/cmf/bin/gen_credentials_ad.sh failed with exit code 53 and
output of <<
+ export PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/sbin:/usr/sbin:/bin:/usr/bin
+ PATH=/usr/kerberos/bin:/usr/kerberos/sbin:/usr/lib/mit/sbin:/usr/sbin:/sbin:/usr/sbin:/bin:/usr/bin
+ KEYTAB_OUT=/var/run/cloudera-scm-server/cmf2781839247630884630.keytab
+ PRINC=sqoop2/<host>@REALM.COM
+ USER=kaupocSuFoZIOIDa
+ PASSWD=REDACTED
+ DIST_NAME=CN=kaupocSuFoZIOIDa,OU=Cloudera,OU=ServersUnix,OU=IT,OU=Basel,OU=AdminUnits,DC=emea,DC=XXXX,DC=com
+ '[' -z /etc/krb5-cdh.conf ']'
+ echo 'Using custom config path '\''/etc/krb5-cdh.conf'\'', contents below:'
+ cat /etc/krb5-cdh.conf
+ SIMPLE_PWD_STR=
+ '[' '' = '' ']'
+ kinit -k -t /var/run/cloudera-scm-server/cmf5575611164358256388.keytab
cdhad@REALM.COM
++ mktemp /tmp/cm_ldap.XXXXXXXX
+ LDAP_CONF=/tmp/cm_ldap.XRbR8Zco
+ echo 'TLS_REQCERT never'
+ echo 'sasl_secprops minssf=0,maxssf=0'
+ export LDAPCONF=/tmp/cm_ldap.XRbR8Zco
+ LDAPCONF=/tmp/cm_ldap.XRbR8Zco
++ ldapsearch -LLL -H ldaps://<ldap-host>:636 -b
OU=Cloudera,OU=ServersUnix,OU=IT,OU=Basel,OU=AdminUnits,DC=emea,DC=xxxx,DC=com
userPrincipalName=sqoop2/<host>@REALM.COM
SASL/GSSAPI authentication started
SASL username: cdhad@REALM
SASL SSF: 0
+ PRINC_SEARCH=
+ set +e
+ echo
+ grep -q userPrincipalName
+ '[' 1 -eq 0 ']'
+ set -e
+ ldapmodify -H ldaps://<ldap-host>:636
++ echo sqoop2/<host>@REALM.COM
++ sed -e 's/\@REALM.COM//g'
++ echo -n '"REDACTED"'
++ iconv -f UTF8 -t UTF16LE
++ base64 -w 0
SASL/GSSAPI authentication started
SASL username: cdhad@REALM.COM
SASL SSF: 0
ldap_add: Server is unwilling to perform (53)
additional info: 0000052D: SvcErr: DSID-031A1248, problem 5003
(WILL_NOT_PERFORM), data 0

If you see a similar error and you know that you have AD enabled for your cluster, then you have landed in the right place. This is likely caused by a bug in Cloudera Manager: it does not allow users to change the complexity of the generated password, so if the AD server has password complexity restrictions set up, Cloudera Manager's request will be rejected.

Fixing this issue is simple, but it requires changing some source code in Cloudera Manager. Follow the steps below:

  1. Back up the file /usr/share/cmf/bin/gen_credentials_ad.sh on the CM host first
  2. Add the following line to /usr/share/cmf/bin/gen_credentials_ad.sh at line 15:

    PASSWD="$PASSWD-"

    after the line:

    PASSWD=$4

    Basically, this appends a hyphen to the passwords that CM generates.

  3. Run Generate Credentials again to see if this helps

If the same error still happens, go back to step 2 and try different variations of the password, for example:

PASSWD="ABC=$PASSWD" # prepends "ABC=" to the generated password

The idea is to meet the criteria of the AD password requirements.
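
If you prefer to apply the edit from the command line, the following is a minimal sketch (it assumes the script still contains the plain line PASSWD=$4, as shown in the steps above; verify on your CM host before running):

# back up the original script first
cp /usr/share/cmf/bin/gen_credentials_ad.sh /usr/share/cmf/bin/gen_credentials_ad.sh.bak
# insert PASSWD="$PASSWD-" right after the line that reads the generated password
sed -i '/^PASSWD=\$4$/a PASSWD="$PASSWD-"' /usr/share/cmf/bin/gen_credentials_ad.sh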

This issue is likely already fixed in Cloudera Manager's source code to support more flexibility when generating passwords, but the fix won't be released until CM 5.8 at the earliest.

“Update Hive Metastore NameNodes” Through Cloudera Manager Timed-Out

After enabling HA for HDFS, Cloudera Manager provides a feature that allows you to update the existing HMS database so that all references to the old HDFS URL are updated to use the new nameservice name:

[screenshot: update-hivemetastore-namenode]

You can only run it after the Hive service has been stopped.

However, there is a hard limit of 150 seconds for this command to run in Cloudera Manager; once it is reached, the following error will be returned:

"Command aborted because of exception: Command timed-out after 150 seconds"

There is already a JIRA to get this limit removed; however, it won't happen until Cloudera Manager 5.5.

To solve this problem, we can manually run the Hive metatool from the command line. Follow the steps below:

1) Test the metatool from the command line on the Hive gateway host to verify that it is working:

hive --service metatool -listFSRoot

2) Run the Hive metatool to update the nameservice:

hive --service metatool -updateLocation hdfs://nameservice1 hdfs://oldnamenode.com -tablePropKey avro.schema.url

If you get errors such as the SQL driver not being found, or failing to connect to the HMS database, add the following steps and then try the commands above again on the HMS host:

3) Locate the HMS runtime configuration directory on the HMS host, /var/run/cloudera-scm-agent/process/XXX-hive-HIVEMETASTORE, where XXX is the process number; in the example below it is "521":

export HIVE_CONF_DIR=/var/run/cloudera-scm-agent/process/521-hive-HIVEMETASTORE

4) Determine which DB type you are using: MySQL, PostgreSQL, or Oracle. In the example below I used PostgreSQL:

export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/jars/postgresql-9.1-901.jdbc4.jar
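
Putting the pieces together, the full sequence on the HMS host might look like the sketch below (the process directory number 521, the PostgreSQL JDBC jar path and the old NameNode URI are the examples used in this article and will differ in your environment):

export HIVE_CONF_DIR=/var/run/cloudera-scm-agent/process/521-hive-HIVEMETASTORE
export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/jars/postgresql-9.1-901.jdbc4.jar
# confirm the metatool can reach the HMS database
hive --service metatool -listFSRoot
# rewrite references to the old NameNode so they point at the new nameservice
hive --service metatool -updateLocation hdfs://nameservice1 hdfs://oldnamenode.com -tablePropKey avro.schema.url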

Then just wait for the command to finish and restart the Hive services afterwards. As long as there are no other errors, you are good to go.

    Hive METASTORE exited unexpectedly, with no apparent error in the error log

    This article explains how to troubleshoot the issue when the Hive METASTORE dies with no apparent error message in the log.

    Cloudera Manager (CM for short) has a mechanism to kill processes that have exceeded their memory allocation. So when this scenario happens, it usually means that the Hive METASTORE has memory issues and the process got killed by CM.

    The reason there are no errors in the server logs is that the server process was killed by the CM agent process and did not get a chance to log its errors.

    The actual errors captured by CM are located in /var/run/cloudera-scm-agent/process/XXX-hive-HIVEMETASTORE/logs/*, where XXX is a number representing the process ID that is allocated to the server daemon each time it is restarted.

    If you can find the following error in those files:

    java.lang.OutOfMemoryError: Java heap space
    Dumping heap to /tmp/hive_hive-HIVEMETASTORE-592629fdad78ff2feb384c5dfbac4c5b_pid$$.hprof ...
    Heap dump file created [102740174 bytes in 0.777 secs]
    #
    # java.lang.OutOfMemoryError: Java heap space
    # -XX:OnOutOfMemoryError="/usr/lib64/cmf/service/common/killparent.sh"
    # Executing /bin/sh -c "/usr/lib64/cmf/service/common/killparent.sh"...

    then you can confirm that the METASTORE is hitting the OOM error.
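
    A quick way to check for this marker across the Hive Metastore process directories is to grep the logs captured by the CM agent (a small sketch; it assumes the default directory layout mentioned above):

    # list any captured Metastore logs that contain the OOM marker
    grep -l "java.lang.OutOfMemoryError" /var/run/cloudera-scm-agent/process/*-hive-HIVEMETASTORE/logs/* 2>/dev/null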

    To fix it, simply log into CM and increase the HIVEMETASTORE heap to 2-4 GB, depending on your cluster size; see the screenshot below:

    [screenshot: hive-server-memory-config]

    Then restart the server.

    The same troubleshooting technique can also be applied to HiveServer2 and other server processes that are managed by CM.

    Timestamp stored in Parquet file format in Impala Showing GMT Value

    This article explains why Impala and Hive return different timestamp values for the same table when the table was created, and its values inserted, from Hive. It also outlines the steps to force Impala to apply the local time zone conversion when reading timestamp fields stored in the Parquet file format.

    When Hive stores a timestamp value into Parquet format, it converts local time into UTC time, and when it reads data out, it converts back to local time.

    Impala, on the other hand, does no conversion when reading the timestamp field, so UTC time is returned instead of local time.

    Both behaviors are by design and work as intended. More information can be found at: TIMESTAMP Data Type
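
    To see the difference for yourself, you can compare what the two engines return for the same data. Below is a minimal sketch, assuming a Parquet table named events with a timestamp column ts that was populated from Hive (both names are hypothetical and used for illustration only):

    # Hive converts the stored UTC value back to local time on read
    hive -e "SELECT ts FROM events LIMIT 1;"
    # Impala returns the stored UTC value as-is (until the flag described below is enabled)
    impala-shell -q "SELECT ts FROM events LIMIT 1;"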

    However, Impala can also be configured to apply the conversion to timestamp fields stored in the Parquet file format (only available in Cloudera Manager 5.4), which is also mentioned in the link above. To do this, follow the steps below:

    1. Go to the Impala service home page
    2. Click on “Configuration”
    3. On the left side under “Filters”, click “Impala Daemon” under “Scope” and “Advanced” under “Category”
    4. Locate “Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)”, and then enter the following:
       --convert_legacy_hive_parquet_utc_timestamps=true
    5. Save the changes
    6. Restart all Impala Daemons

    [screenshot: impala-config]

    To confirm that the change takes effect, follow the steps below:

    1. Go to the Impala home page
    2. Click on the “Instances” tab
    3. Click on any “Impala Daemon” link (make sure you have restarted all of them)
    4. Under “Summary” > “Quick Links”, click on “Impala Daemon Web UI”
    5. A new page will open; click on the last tab at the top of the page, named “/varz”
    6. Search for “convert_legacy_hive_parquet_utc_timestamps” and confirm that it is set to “true”: --convert_legacy_hive_parquet_utc_timestamps=true

    [screenshot: impala-flags]


    This enables Impala to perform the time zone conversion when reading timestamp fields from Parquet files.
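
    If you prefer to verify from the command line rather than the web UI, you can query the same /varz page directly (a sketch; the host is a placeholder and 25000 is the default Impala Daemon web UI port, so adjust if your deployment differs):

    curl -s http://<impalad-host>:25000/varz | grep convert_legacy_hive_parquet_utc_timestamps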

    Update:

    Please be warned that there is a performance hit if you go down this path; refer to the upstream Impala JIRA IMPALA-3316 for more details.