SYMPTOM:
In a scenario where segments were previously created by batch ingestion and users want to append new rows from Kafka to those segments, the ingestion may fail with the following error:
2018-08-22T10:39:58,587 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[AbstractTask{id='index_kafka_<DATA_SOURCE>_5e23bade0948741_fcddpdca', groupId='index_kafka_<DATA_SOURCE>', taskResource=TaskResource{availabilityGroup='index_kafka_<DATA_SOURCE>_5e23bade0948741', requiredCapacity=1}, dataSource='<DATA_SOURCE>', context={checkpoints={"0":{"0":40978243,"1":40978241,"2":40978242,"3":40978241,"4":40978243,"5":40978243,"6":40978242,"7":40972354,"8":40978243,"9":40978241}}, IS_INCREMENTAL_HANDOFF_SUPPORTED=true}}]
io.druid.java.util.common.ISE: Could not allocate segment for row with timestamp[2018-08-21T17:22:18.000Z]
	at io.druid.indexing.kafka.KafkaIndexTask.run(KafkaIndexTask.java:640) ~[?:?]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:444) [druid-indexing-service-0.12.2.jar:0.12.2]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:416) [druid-indexing-service-0.12.2.jar:0.12.2]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
However, ingesting the same data from Kafka into a brand-new dataSource succeeds.
ROOT CAUSE:
There are two possible reasons why appending fails:
1. Batch ingestion and Kafka ingestion do not use the same segment granularity. For example, one uses HOUR granularity while the other uses DAY (see the granularitySpec sketch after this list).
2. The batch ingestion spec left "forceExtendableShardSpecs" at its default value of false, which made the segments not appendable.
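For the first cause, the fix is to align the segmentGranularity in the batch spec's dataSchema with the one in the Kafka supervisor spec's dataSchema. Below is a minimal sketch of matching granularitySpec sections; the HOUR granularity and the interval are placeholders chosen for illustration, not values taken from the failing task.

Batch ingestion spec (inside dataSchema):

    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "NONE",
      "intervals": ["2018-08-21/2018-08-22"]
    }

Kafka supervisor spec (inside dataSchema):

    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "NONE"
    }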
RESOLUTION:
Setting forceExtendableShardSpecs to true in the batch ingestion spec is a necessary prerequisite for making the segments appendable later by Kafka: it forces the batch task to create segments with an extendable (numbered) shardSpec. If a segment was created with forceExtendableShardSpecs=false, it is unfortunately locked, and no more data can be appended to it; the data must be re-ingested with the corrected spec to produce appendable segments.
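As a sketch, a native batch (index task) spec with the flag enabled would look like the following; the dataSchema and ioConfig contents are elided, and everything except the forceExtendableShardSpecs field is shown only to give it context:

    {
      "type": "index",
      "spec": {
        "dataSchema": { ... },
        "ioConfig": { ... },
        "tuningConfig": {
          "type": "index",
          "forceExtendableShardSpecs": true
        }
      }
    }

With this flag set, the batch task creates segments with an extendable shardSpec, so later Kafka indexing tasks can allocate additional partitions in the same intervals.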