Apache Sqoop became an Apache Top-Level Project in March 2012. Since then, Sqoop has developed a lot and become very popular in the Hadoop ecosystem. In this post, I will cover the ways to pass database passwords to Sqoop securely.
The following two ways are commonly used to pass database passwords to Sqoop:
sqoop import --connect jdbc:mysql://myexample.com/test \
  --username myuser -P \
  --table mytable

sqoop import --connect jdbc:mysql://myexample.com/test --username myuser \
  --password mypassword \
  --table mytable
The first one is secure because -P prompts for the password on the console, so other people can’t see it. However, it is only practical for interactive use on the command line.
The second one is insecure: the password appears in plain text on the command line, so anyone who can see the command (for example in the shell history or the process list) can see the password to the database.
A more secure way to pass the password is to use a so-called password file. The commands are as follows:
echo -n "password" > /home/ericlin/.mysql.password
chmod 400 /home/ericlin/.mysql.password
sqoop import --connect jdbc:mysql://myexample.com/test \
  --username myuser \
  --password-file /home/ericlin/.mysql.password \
  --table mytable
Please note that we need the “-n” option for the “echo” command so that no newline is added to the end of the password. Also, please do not use “vim” to create the file, because “vim” automatically adds a newline to the end of the file, which will cause Sqoop to fail because the password then contains a newline character.
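Since a stray newline is easy to miss, it can help to check the file after writing it. The following is only a minimal sketch: it uses printf instead of echo (echo -n is not handled identically by every shell), and it writes to a temporary file as a stand-in for /home/ericlin/.mysql.password. The variable names are my own and are not part of Sqoop.

```shell
#!/bin/sh
# Sketch: write a password file without a trailing newline, then verify it.
# PASSWORD_FILE is a temporary stand-in for the real password file path.
PASSWORD_FILE=$(mktemp)

# printf '%s' writes exactly the given string and appends no newline.
printf '%s' 'password' > "$PASSWORD_FILE"
chmod 400 "$PASSWORD_FILE"

# The byte count should equal the password length (8 for "password").
BYTE_COUNT=$(wc -c < "$PASSWORD_FILE" | tr -d ' ')

# wc -l counts newline characters; 0 means no newline crept in.
NEWLINE_COUNT=$(wc -l < "$PASSWORD_FILE" | tr -d ' ')

echo "bytes=$BYTE_COUNT newlines=$NEWLINE_COUNT"   # prints: bytes=8 newlines=0

# Remove the temporary file; a real password file would of course be kept.
rm -f "$PASSWORD_FILE"
```

If the newline count is anything other than 0, the file was probably written by an editor or by a plain echo, and it should be recreated before passing it to --password-file.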
However, storing the password in a text file is still not considered secure, even with the permissions restricted. As of Sqoop 1.4.5, Sqoop supports using a Java KeyStore to store passwords, so that you do not need to keep them in clear text in a file.
To generate the key:
[ericlin@localhost ~] $ hadoop credential create mydb.password.alias -provider jceks://hdfs/user/ericlin/mysql.password.jceks
Enter password:
Enter password again:
mysql.password has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.
When prompted, enter the password that will be used to access the database.
The “mydb.password.alias” is the alias that we pass to Sqoop when running the command, so that the password itself does not need to be supplied.
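Before using the alias, it may be worth confirming that it really exists in the keystore. The sketch below relies on the "hadoop credential list" subcommand and assumes that its output includes the alias names stored in the provider. The variable names (PROVIDER, ALIAS, STATUS) are only illustrative, and the script simply reports that Hadoop is unavailable when the hadoop command is not on the PATH.

```shell
#!/bin/sh
# Sketch: check that a credential alias is present in a JCEKS provider.
# PROVIDER and ALIAS mirror the values used in this post.
PROVIDER='jceks://hdfs/user/ericlin/mysql.password.jceks'
ALIAS='mydb.password.alias'

if command -v hadoop >/dev/null 2>&1; then
    # List the aliases in the provider and look for ours as a fixed string.
    # Assumption: the alias name appears verbatim in the list output.
    if hadoop credential list -provider "$PROVIDER" | grep -qF "$ALIAS"; then
        STATUS="found"
    else
        STATUS="missing"
    fi
else
    # No Hadoop client on this machine, so nothing can be checked here.
    STATUS="hadoop-not-installed"
fi

echo "alias $ALIAS: $STATUS"
```

A status of "missing" can also mean that the provider path is wrong or that HDFS is not reachable, so it is worth running the list command by hand in that case.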
Then you can run the following Sqoop command:
sqoop import -Dhadoop.security.credential.provider.path=jceks://hdfs/user/ericlin/mysql.password.jceks \
  --connect 'jdbc:mysql://myexample.com/test' \
  --table mytable \
  --username myuser \
  --password-alias mydb.password.alias
This way, the password is stored inside jceks://hdfs/user/ericlin/mysql.password.jceks rather than in a clear-text file or on the command line.
Hope this helps.