Recently I discovered a bug in Sqoop that it wrongly detects a malformed query in options file, but in fact the query is totally correct. Example as below sqoop command:

sqoop --options-file /path/to/options-file.txt

and in the /path/to/options-file.txt, it contains the following content:

import
--connect
jdbc:mysql://......
--username
XXXXXXXX
--password
XXXXXXXX
--query
SELECT * FROM test_table WHERE a = 'b'
....

Sqoop will fail with the following error:

16/11/22 16:08:59 ERROR sqoop.Sqoop: Error while expanding arguments
java.lang.Exception: Malformed option in options file(/path/to/options-file.txt): 
SELECT * FROM test_table WHERE a = 'b'
at org.apache.sqoop.util.OptionsFileUtil.removeQuoteCharactersIfNecessary(OptionsFileUtil.java:170)
at org.apache.sqoop.util.OptionsFileUtil.removeQuotesEncolosingOption(OptionsFileUtil.java:136)
at org.apache.sqoop.util.OptionsFileUtil.expandArguments(OptionsFileUtil.java:90)
at com.cloudera.sqoop.util.OptionsFileUtil.expandArguments(OptionsFileUtil.java:33)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:199)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Malformed option in options file(/path/to/options-file.txt): SELECT * FROM test_table WHERE a = 'b'

This is caused by a bug in Sqoop code that Sqoop checks for the beginning and ending quotes in the query section of the options file. If a query does not start with a quote but ends with a quote, then it will fail with the mentioned error.

From the example shown in the above, the query had an ending single quote “‘”, Sqoop will wrongly treat it as an invalid query and then Exception will be throw. If you look at the function that does the check for quotes in the query string:

https://github.com/apache/sqoop/blob/release-1.4.6-rc3/src/java/org/apache/sqoop/util/OptionsFileUtil.java#L156-L175

  private static String removeQuoteCharactersIfNecessary(String fileName,
      String option, char quote) throws Exception {
    boolean startingQuote = (option.charAt(0) == quote);
    boolean endingQuote = (option.charAt(option.length() - 1) == quote);

    if (startingQuote && endingQuote) {
      if (option.length() == 1) {
        throw new Exception("Malformed option in options file("
            + fileName + "): " + option);
      }
      return option.substring(1, option.length() - 1);
    }

    if (startingQuote || endingQuote) {
       throw new Exception("Malformed option in options file("
           + fileName + "): " + option);
    }

    return option;
  }

Based on the above code, the sqoop command using options file will fail if the it either only starts or ends with a quote (single or double). I have created upstream JIRA for this bug SQOOP-3061 and provided a patch to fix the issue.

Currently, the workaround is to NOT end the query with a quote, either double or single quote, so simply add something like “AND 1=1” will resolve the issue.

Leave a Reply

Your email address will not be published. Required fields are marked *