When creating a new Druid dataSource, admins may want to add dimensions that are not present in the raw data and fill those columns with default values. This can be achieved by using a `transformSpec` during data ingestion.
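Here is an example ingestion spec that adds two dummy columns, `dummyCol1` and `dummyCol2`, to the `wikipedia` dataSource. Each new column is listed under `dimensions` and given an `expression` transform that uses `nvl` to supply the default value: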
{
  "type" : "index",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "timestamp",
            "format" : "iso"
          },
          "dimensionsSpec" : {
            "dimensions" : [
              "diffUrl",
              "isRobot",
              { "name" : "added", "type" : "long" },
              .....
              { "name" : "metroCode", "type" : "long" },
              "dummyCol1",
              "dummyCol2"
            ]
          }
        }
      },
      "metricsSpec" : [ ],
      "granularitySpec" : {
        ...
        "rollup" : false,
        "intervals" : null
      },
      "transformSpec" : {
        "filter" : null,
        "transforms" : [
          { "type" : "expression", "name" : "dummyCol1", "expression" : "nvl(\"dummyCol1\", 'HAPPY')" },
          { "type" : "expression", "name" : "dummyCol2", "expression" : "nvl(\"dummyCol2\", 'JOY')" }
        ]
      }
    },
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "http",
        "uris" : [ "https://SERVER/data/wikipedia.json.gz" ],
        ...
      },
      "appendToExisting" : false
    },
    "tuningConfig" : { ... }
  },
  "dataSource" : "wikipedia"
}
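When several default-valued columns are needed, writing each transform by hand gets repetitive. The following is a minimal Python sketch of how the extra `dimensions` names and the `transformSpec` could be generated from a mapping of column name to default value. The helper name `build_default_transforms` is hypothetical and not part of Druid, and the sketch assumes names and defaults contain no quotes or backslashes, since it does no escaping.

```python
import json


def build_default_transforms(defaults):
    """Return (dimension_names, transform_spec) for constant-default columns.

    `defaults` maps each new column name to the string default it should take.
    Assumption: names and defaults contain no quotes or backslashes,
    because this sketch performs no escaping.
    """
    transforms = []
    for name, value in defaults.items():
        transforms.append({
            "type": "expression",
            "name": name,
            # Same shape as in the spec above: nvl("col", 'DEFAULT')
            "expression": f"nvl(\"{name}\", '{value}')",
        })
    return list(defaults), {"filter": None, "transforms": transforms}


dimensions, transform_spec = build_default_transforms(
    {"dummyCol1": "HAPPY", "dummyCol2": "JOY"}
)
print(dimensions)
print(json.dumps(transform_spec, indent=2))
```

The returned names would be appended to the `dimensions` list, and the returned object would be used as the `transformSpec` of the ingestion spec.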
After ingestion completes, the two new dimensions are present in the `wikipedia` dataSource, populated with their default values.
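To make the expected result concrete, the effect of the two `nvl` transforms can be imitated in plain Python. This is only an illustration, not Druid code: the input rows and the `apply_defaults` helper are made up for the example. A row that lacks a column gets the default, while a value that is already present is kept.

```python
def nvl(value, default):
    """Mimic the nvl expression: use `default` when the value is null."""
    return default if value is None else value


def apply_defaults(row, defaults):
    """Return a copy of `row` with every default column filled in."""
    out = dict(row)
    for name, default in defaults.items():
        out[name] = nvl(row.get(name), default)
    return out


DEFAULTS = {"dummyCol1": "HAPPY", "dummyCol2": "JOY"}

# Hypothetical input rows; values are invented for illustration only.
rows = [
    {"isRobot": "false", "added": 36},
    {"isRobot": "true", "added": 17, "dummyCol1": "CUSTOM"},
]

result = [apply_defaults(row, DEFAULTS) for row in rows]
for row in result:
    print(row)
```

Because the raw data in the example above contains neither column, every ingested row would end up with `HAPPY` and `JOY`, as in the first row here.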