Yarn Job Failed with Error: “Split metadata size exceeded 10000000”

When you run a very large job in Hive, it may fail with the following error:

2016-06-28 18:55:36,830 INFO [Thread-58] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_1465344841306_1317
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1289)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1057)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1500)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1496)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1429)
Caused by: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_1465344841306_1317
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1563)
... 17 more

This indicates that the value of mapreduce.job.split.metainfo.maxsize (10000000 bytes by default) is too small for your job.

There are two options to fix this:

1. Set mapreduce.job.split.metainfo.maxsize to “-1” (unlimited) for this job only, just before running it:

SET mapreduce.job.split.metainfo.maxsize=-1;

This removes the limit. Be warned, however, that it effectively allows the job to create an unlimited amount of split metadata; if your cluster does not have enough resources, this could potentially bring down the host.

2. A safer way is to increase the value, for example to double the default of 10000000:

SET mapreduce.job.split.metainfo.maxsize=20000000;

You can then increase the value gradually while monitoring your cluster, to make sure that it does not bring down your machines.

I have seen other posts suggesting that mapreduce.job.split.metainfo.maxsize be set in the mapred-site.xml configuration file. In my opinion, the limit only affects a small number of queries that run against very large data sets, so it is better to set the value at job level, which also means that no cluster restart is required.
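
One way to keep the override at job level without editing the query script is to pass it on the Hive command line. The sketch below only builds and prints such a command, it does not run it. The helper name split_limit_cmd and the script name big_job.hql are made up for illustration; --hiveconf and -f are the standard Hive CLI options.

```shell
# Hypothetical helper: print a Hive CLI command that raises the split
# metadata limit for a single invocation, leaving mapred-site.xml untouched.
split_limit_cmd() {
    # $1 = new limit in bytes, $2 = path to the HiveQL script to run
    printf 'hive --hiveconf mapreduce.job.split.metainfo.maxsize=%s -f %s\n' "$1" "$2"
}

# Double the default limit for one (hypothetical) script.
split_limit_cmd 20000000 big_job.hql
```

Printing the command first makes it easy to review the value before actually running the job.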

Please note that if you are using MapReduce V1, the setting should be mapreduce.jobtracker.split.metainfo.maxsize instead, which does the same thing.

Hope this helps.

Hive query failed with error: Killing the Job. mapResourceReqt: 1638 maxContainerCapability:1200

This article explains how to fix the following error when running a Hive query:

MAP capability required is more than the supported max container capability in the cluster. 
Killing the Job. mapResourceRequest: <memory:1638, vCores:1> maxContainerCapability:<memory:1200, vCores:2>


The error might not be obvious, but it is caused by the following settings not being set up properly in YARN:

mapreduce.map.memory.mb = 1638
yarn.scheduler.maximum-allocation-mb = 1200
yarn.nodemanager.resource.memory-mb = 1300

The solution is to change the properties mentioned above so that they satisfy the following relationship:

mapreduce.map.memory.mb <= yarn.scheduler.maximum-allocation-mb <= yarn.nodemanager.resource.memory-mb

mapreduce.map.memory.mb is the amount of memory requested for the container that runs each map task.

yarn.scheduler.maximum-allocation-mb is the maximum amount of memory that can be allocated to a single container, so it needs to be at least as large as mapreduce.map.memory.mb.

yarn.nodemanager.resource.memory-mb is the total amount of memory on a NodeManager host that can be given to containers, which hold either mappers or reducers, so it should be no smaller than yarn.scheduler.maximum-allocation-mb.

In the example above, the map request of 1638 exceeds the maximum allocation of 1200, which is what triggers the error. Once the values are corrected, the problem will be solved. Hope this helps.
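
As a rough illustration, the check that the job fails can be sketched as a small shell function. This is not how YARN itself performs the check; it only assumes that a map container request must fit within both the per-container maximum and the memory of a NodeManager. The function name check_map_memory and the value 1024 are hypothetical.

```shell
# Hypothetical helper: report whether a map container request can be satisfied.
check_map_memory() {
    # $1 = mapreduce.map.memory.mb
    # $2 = yarn.scheduler.maximum-allocation-mb
    # $3 = yarn.nodemanager.resource.memory-mb
    if [ "$1" -gt "$2" ]; then
        echo "FAIL: map request $1 MB exceeds yarn.scheduler.maximum-allocation-mb $2 MB"
        return 1
    fi
    if [ "$1" -gt "$3" ]; then
        echo "FAIL: map request $1 MB exceeds yarn.nodemanager.resource.memory-mb $3 MB"
        return 1
    fi
    echo "OK"
}

# Values from the failing cluster in this post.
check_map_memory 1638 1200 1300 || true
# A hypothetical lower map request that fits.
check_map_memory 1024 1200 1300 || true
```

With the values from this post the first call reports a failure against the 1200 MB maximum, which matches the error message.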

Hive query failed with error: Killing the Job. mapResourceReqt: 1638 maxContainerCapability:1200

When running a Hive query, you get the following error in the job history:

MAP capability required is more than the supported max container capability in the cluster. 
Killing the Job. mapResourceRequest: <memory:1638, vCores:1> maxContainerCapability:<memory:1200, vCores:2>


This is caused by the following settings in YARN:

mapreduce.map.memory.mb => 1638
yarn.scheduler.maximum-allocation-mb => 1200
yarn.nodemanager.resource.memory-mb => 1300

The solution is to set up the settings mentioned above so that the map request fits within the largest container YARN will allocate, which in turn fits on a NodeManager:

mapreduce.map.memory.mb <= yarn.scheduler.maximum-allocation-mb <= yarn.nodemanager.resource.memory-mb

Then the problem should be resolved.
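
The two memory figures in the error line correspond to mapreduce.map.memory.mb and yarn.scheduler.maximum-allocation-mb, so they can be pulled out to see how far the request overshoots. A minimal sketch, assuming the line has exactly the format shown in this post; memory_shortfall is a hypothetical helper name.

```shell
# Hypothetical helper: extract the requested and maximum container memory (MB)
# from the "Killing the Job." line and report the difference.
memory_shortfall() {
    # $1 = the error line copied from the job history
    pair=$(printf '%s\n' "$1" | sed -n \
        's/.*mapResourceRequest: <memory:\([0-9][0-9]*\),.*maxContainerCapability:<memory:\([0-9][0-9]*\),.*/\1 \2/p')
    if [ -z "$pair" ]; then
        echo "could not parse error line" >&2
        return 1
    fi
    requested=${pair% *}   # text before the space
    max=${pair#* }         # text after the space
    echo "requested=${requested} max=${max} over_by=$((requested - max))"
}

# prints: requested=1638 max=1200 over_by=438
memory_shortfall 'Killing the Job. mapResourceRequest: <memory:1638, vCores:1> maxContainerCapability:<memory:1200, vCores:2>'
```

Either lowering the map request by at least the reported amount or raising the maximum allocation by that much should bring the two back in line.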

Sqoop Fails with FileNotFoundException in CDH

The following exceptions occur when executing Sqoop on a cluster managed by Cloudera Manager:

15/05/11 20:42:55 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: hdfs://nameservice1/mnt/var/opt/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/sqoop/lib/hsqldb-
15/05/11 20:42:55 ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://nameservice1/mnt/var/opt/CDH-5.3.3-1.cdh5.3.3.p0.5/lib/sqoop/lib/hsqldb-
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:267)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:388)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:481)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:198)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:171)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:268)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

This happens because Sqoop needs the client configuration to be deployed through a YARN Gateway.

To fix this problem, in Cloudera Manager, go to:

YARN > Instances > Add Role Instances > Gateway (Select Hosts) > (Click on target hosts) > OK

Then the client configuration needs to be deployed (“Deploy Client Configuration”); go to:

YARN > Instances > Actions (top right corner) > Deploy Client Configuration

Now run the Sqoop command again and it should work.

The YARN Gateway is needed because, when Sqoop runs on a particular host, that host has to know where the Resource Manager is, whether it is running in local mode or cluster mode, and so on, so that it knows how to submit MapReduce jobs. Deploying the client configuration to the host gives it this information, so it knows what to do when Sqoop is run, as Sqoop jobs are actually MapReduce jobs.
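
A quick way to see whether a host has received the client configuration is to look for the usual Hadoop client files. The sketch below assumes they are deployed to /etc/hadoop/conf, which is the usual location on CDH hosts but may differ on your cluster; has_client_config is a hypothetical helper name.

```shell
# Hypothetical helper: check that the basic Hadoop client config files exist.
has_client_config() {
    # $1 = client configuration directory (assumed: /etc/hadoop/conf)
    for f in core-site.xml yarn-site.xml mapred-site.xml; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            return 1
        fi
    done
    echo "client configuration present in $1"
}

# Run on the host where Sqoop is executed.
has_client_config /etc/hadoop/conf || true
```

If a file is reported missing, the host most likely has no Gateway role yet or the client configuration has not been deployed to it.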

The solution is not obvious, so I hope this helps.

    Enabling Snappy Compression Support in Hadoop 2.4 under CentOS 6.3

    After Hadoop is installed manually from the binary package on CentOS, Snappy compression is not supported by default, and extra steps are required for Snappy to work in Hadoop. The process is straightforward but might not be obvious if you don’t know what to do.

    Firstly, if you are using the 64-bit version of CentOS, you will need to replace the default native Hadoop library that is shipped with Hadoop (it is only compiled for 32-bit). You can try to download it from here, and then put it under the “$HADOOP_HOME/lib/native” directory. If there is a symlink, you can just replace the symlink with the actual file. If it still doesn’t work, you might need to compile the library yourself on your machine, which is out of scope for this post; you can follow the instructions on this site.

    Secondly, you will need to install the native Snappy library for your operating system (CentOS 6.3 in my case):

    $ sudo yum install snappy snappy-devel

    This will create a file called libsnappy.so under the /usr/lib64 directory. We need to create a link to this file under “$HADOOP_HOME/lib/native”:

    sudo ln -s /usr/lib64/libsnappy.so $HADOOP_HOME/lib/native/libsnappy.so

    Then update three configuration files:





    And finally, add the following line to $HADOOP_HOME/etc/hadoop/hadoop-env.sh to tell Hadoop to load the native library from the exact location (the path below assumes Hadoop is installed under /usr/local/hadoop):

    export JAVA_LIBRARY_PATH="/usr/local/hadoop/lib/native"

    That’s it. Just restart HDFS and YARN by running:


    Now you should be able to create Hive tables with Snappy compression.
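
Before testing with Hive, it can help to confirm that the link created earlier actually resolves to a file. A minimal sketch; check_snappy_link is a hypothetical helper name, and the fallback path /usr/local/hadoop is only the install location assumed in this post.

```shell
# Hypothetical helper: verify that libsnappy.so in the native library
# directory exists and is not a dangling symlink.
check_snappy_link() {
    # $1 = native library directory, e.g. "$HADOOP_HOME/lib/native"
    lib="$1/libsnappy.so"
    if [ -e "$lib" ]; then      # -e follows symlinks, so a broken link fails
        echo "found: $lib"
    else
        echo "missing or dangling link: $lib"
        return 1
    fi
}

check_snappy_link "${HADOOP_HOME:-/usr/local/hadoop}/lib/native" || true
```

On Hadoop 2.x, running "hadoop checknative -a" should also report whether the snappy library was loaded, which checks the library from Hadoop's side rather than just the file system.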