SYMPTOM:
The full stack trace in the ingestion task log looks like the following:
2018-02-06T23:40:58,830 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 0% reduce 0%
2018-02-06T23:40:58,841 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1517956798737_0003 failed with state FAILED due to: Application application_1517956798737_0003 failed 2 times due to AM Container for appattempt_1517956798737_0003_000002 exited with exitCode: 1
For more detailed output, check the application tracking page: http://ip-172-31-12-219.us-west-2.compute.internal:8088/cluster/app/application_1517956798737_0003 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e02_1517956798737_0003_02_000001
Exit code: 1
Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Failing this attempt. Failing the application.
2018-02-06T23:40:58,858 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Counters: 0
2018-02-06T23:40:58,860 ERROR [task-runner-0-priority-0] io.druid.indexer.DetermineHashedPartitionsJob - Job failed: job_1517956798737_0003
2018-02-06T23:40:58,860 INFO [task-runner-0-priority-0] io.druid.indexer.JobHelper - Deleting path[/tmp/druid-indexing/wikipedia-Hadoop/2018-02-06T234045.883Z_52c926c5479747fc96e3e474fa7a3b5c]
2018-02-06T23:40:58,880 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[HadoopIndexTask{id=index_hadoop_wikipedia-Hadoop_2018-02-06T23:40:45.883Z, type=index_hadoop, dataSource=wikipedia-Hadoop}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:218) ~[druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:177) ~[druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    ... 7 more
Caused by: io.druid.java.util.common.ISE: Job[class io.druid.indexer.DetermineHashedPartitionsJob] failed!
    at io.druid.indexer.JobHelper.runJobs(JobHelper.java:390) ~[druid-indexing-hadoop-0.11.0-iap5.jar:0.11.0-iap5]
    at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:91) ~[druid-indexing-hadoop-0.11.0-iap5.jar:0.11.0-iap5]
    at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:309) ~[druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
    at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:215) ~[druid-indexing-service-0.11.0-iap5.jar:0.11.0-iap5]
    ... 7 more
2018-02-06T23:40:58,886 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_wikipedia-Hadoop_2018-02-06T23:40:45.883Z] status changed to [FAILED].
2018-02-06T23:40:58,889 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_hadoop_wikipedia-Hadoop_2018-02-06T23:40:45.883Z",
  "status" : "FAILED",
  "duration" : 8701
}
The corresponding error in the YARN ResourceManager log reads:
2018-02-06 23:26:00,942 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2018-02-06 23:26:00,953 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.VerifyError: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize.(Lcom/fasterxml/jackson/core/JsonParser;Lcom/fasterxml/jackson/databind/DeserializationContext;)Ljava/lang/Object;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at com.fasterxml.jackson.datatype.guava.GuavaModule.setupModule(GuavaModule.java:22)
    at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:524)
    at io.druid.jackson.DefaultObjectMapper.<init>(DefaultObjectMapper.java:47)
    at io.druid.jackson.DefaultObjectMapper.<init>(DefaultObjectMapper.java:35)
    at io.druid.jackson.JacksonModule.jsonMapper(JacksonModule.java:46)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
ROOT CAUSE:
This error is caused by a conflict between the library versions used by Druid and those bundled with Hadoop. The java.lang.VerifyError above shows the MapReduce container loading an older Jackson version from the Hadoop classpath in place of the Jackson version Druid was built against, which breaks the jackson-datatype-guava module during class verification.
SOLUTION:
Force MapReduce to prefer the libraries shipped with Druid by adding the following jobProperties under tuningConfig in the ingestion spec:
"tuningConfig" : {
"type" : "hadoop",
"jobProperties": {
"mapreduce.job.user.classpath.first": "true",
"mapreduce.job.classloader.system.classes": "-javax.validation.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop."
},
"partitionsSpec" : {
"type" : "hashed",
"targetPartitionSize" : 5000000
},
"forceExtendableShardSpecs" : true
}
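For context, this tuningConfig fragment belongs inside the spec object of the index_hadoop task JSON. Below is a minimal sketch of a complete spec with the fix in place, assuming JSON input data; the dataSource name is taken from the failing task above, while the input path, timestamp column, dimensions, metrics, and interval are placeholders to replace with your own values:

{
  "type" : "index_hadoop",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia-Hadoop",
      "parser" : {
        "type" : "hadoopyString",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : { "column" : "timestamp", "format" : "auto" },
          "dimensionsSpec" : { "dimensions" : [] }
        }
      },
      "metricsSpec" : [ { "type" : "count", "name" : "count" } ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "DAY",
        "queryGranularity" : "NONE",
        "intervals" : [ "2018-01-01/2018-01-02" ]
      }
    },
    "ioConfig" : {
      "type" : "hadoop",
      "inputSpec" : {
        "type" : "static",
        "paths" : "/path/to/input/data.json"
      }
    },
    "tuningConfig" : {
      "type" : "hadoop",
      "jobProperties" : {
        "mapreduce.job.user.classpath.first" : "true",
        "mapreduce.job.classloader.system.classes" : "-javax.validation.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop."
      },
      "partitionsSpec" : {
        "type" : "hashed",
        "targetPartitionSize" : 5000000
      },
      "forceExtendableShardSpecs" : true
    }
  }
}

If forcing the user classpath first does not resolve the conflict, the documentation page linked below also describes an alternative approach: running the MapReduce job in an isolated classloader by setting "mapreduce.job.classloader" : "true" in jobProperties.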
For more details, please refer to: http://druid.io/docs/0.11.0/operations/other-hadoop.html