I would like to create a single visual showing multiple histograms. I have simple arrays of values, like so:
"data": {"values": {"foo": [0,0,0,1,1,1,2,2,2], "baz": [2,2,2,3,3,3,4,4,4]}}
I want to use different color bars to show the spread of values for "foo" and "baz". I am able to make a single histogram for "foo" like so:
{
"data": {"values": {"foo": [0,0,0,1,1,1,2,2,2]}},
"mark": "bar",
"transform": [{"flatten": ["foo"]}],
"encoding": {
"x": {"field": "foo", "type": "quantitative"},
"y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
}
}
However, I cannot find the correct way to flatten out the arrays. This doesn't work:
{
"data": {"values": {"foo": [0,0,0,1,1,1,2,2,2], "bar": [0,0,0,1,1,1,2,2,2]}},
"mark": "bar",
"transform": [{"flatten": ["foo", "baz"]}],
"encoding": {
"x": {"field": "foo", "type": "quantitative"},
"y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
},
"layer": [{
"mark": "bar",
"encoding": {
"y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
}
}]
}
https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoBmA9gfANoAMANJZQIwV10BMFzjAuhSAEYQBep1KvWFMWLNgF8JnALYQATgGt43BSE5R5EAHZZC8maXwpoUDNtIhCxTj36SOIcwGMCYAJbaA5rhAAPXzx3DBQwFWt1ECgATwAHDBUARzQdKHcYNMQE6RBowODQ8KJImPiklO00jPcsyIgvL3kML2gEuBBXNEqQKU4TaIx5IxA5JRUeIc4XN08fBFz8kLD2uxK4tpBk1PToGoTOesbm1pVO7qkJSSA
Inspecting data_0, there are columns for foo and its counts, but nothing for baz.
This doesn't work, either:
{
"data": {
"values": {
"foo": [0, 0, 0, 1, 1, 1, 2, 2, 2],
"baz": [0, 0, 0, 1, 1, 1, 2, 2, 2]
}
},
"mark": "bar",
"transform": [{"flatten": ["foo"]},{"flatten": ["baz"]}],
"encoding": {
"x": {"field": "foo", "type": "quantitative"},
"y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
},
"layer": [
{
"mark": "bar",
"encoding": {
"y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
}
}
]
}
https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoBmA9gfANoAMANJZQIwV10BMFzjAuhSAEYQBep1KvWFMWLNgF8JnALYQATgGt43BSE5R5EAHZZC8maXwpoUDNtIhCxSRWOnzlnv0kcQ5gMYEwAS20BzXBAADyC8HwwUMBVrdRAoAE8ABwwVAEc0HSgfGGzEVOkQBLCIqJiiOMSU9MztbNyffLiIf395DH9oVLgQLzQ6kClOEwSMeSMQOSUVHnHOT28-QIQiksjonudK5O6QDKyc6EbUzha2jq6VPoGpCUkgA
That still only gives columns for foo and its count, but now the count is 27 for each bucket!
How can I accomplish a multi-histogram graphic starting with array data?
You can do this using a flatten transform followed by a fold transform, and then use a color encoding to separate the two datasets. For example (open in editor):
{
"data": {
"values": {
"foo": [0, 0, 1, 1, 1, 1, 2, 2, 2],
"baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]
}
},
"transform": [{"flatten": ["foo", "baz"]}, {"fold": ["foo", "baz"]}],
"mark": "bar",
"encoding": {
"x": {"field": "value", "type": "quantitative"},
"y": {
"field": "value",
"type": "quantitative",
"aggregate": "count",
"stack": null
},
"color": {"field": "key", "type": "nominal"}
}
}
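To see why this works, it can help to trace what the two transforms do to the rows. Below is a rough Python emulation. The helper names `flatten` and `fold` are mine and only mimic the documented behaviour of the Vega-Lite transforms; this is not how Vega-Lite is implemented.

```python
from collections import Counter
from itertools import zip_longest


def flatten(rows, fields):
    """Emulate flatten: zip the listed array fields in parallel, one output row per element.
    Shorter arrays are padded with None, mirroring the null padding of the real transform."""
    out = []
    for row in rows:
        for values in zip_longest(*(row[f] for f in fields)):
            new_row = dict(row)
            new_row.update(zip(fields, values))
            out.append(new_row)
    return out


def fold(rows, fields):
    """Emulate fold: one output row per (input row, field), with 'key' and 'value' columns added."""
    return [{**row, "key": f, "value": row[f]} for row in rows for f in fields]


# The inline data is a single record holding two arrays.
data = [{"foo": [0, 0, 1, 1, 1, 1, 2, 2, 2], "baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]}]

long_rows = fold(flatten(data, ["foo", "baz"]), ["foo", "baz"])

# Count per (key, value), which is what the bar heights show for each color.
counts = Counter((r["key"], r["value"]) for r in long_rows)
```

After the fold, every row carries a `key` (the name of the original array) and a `value`, which is why the chart can put `value` on x and color by `key`.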
As an aside, your layer approach also works if you put the encodings in separate layers, so that the outer foo aggregate doesn't clobber the baz data, but it's a bit more verbose than the approach based on fold:
{
"data": {
"values": {
"foo": [0, 0, 1, 1, 1, 1, 2, 2, 2],
"baz": [4, 4, 5, 5, 6, 6, 6, 6, 7]
}
},
"transform": [{"flatten": ["foo", "baz"]}],
"layer": [
{
"mark": {"type": "bar", "color": "orange"},
"encoding": {
"x": {"field": "foo", "type": "quantitative"},
"y": {"field": "foo", "type": "quantitative", "aggregate": "count"}
}
},
{
"mark": "bar",
"encoding": {
"x": {"field": "baz", "type": "quantitative"},
"y": {"field": "baz", "type": "quantitative", "aggregate": "count"}
}
}
]
}
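As for the count of 27 in your second attempt: two flatten transforms in a row do not zip the arrays, they nest. A small Python sketch of that behaviour (again an emulation, with a hypothetical helper `flatten_one`, not Vega-Lite code):

```python
from collections import Counter


def flatten_one(rows, field):
    """Emulate a flatten over a single array field: one output row per array element."""
    return [{**row, field: value} for row in rows for value in row[field]]


data = [{"foo": [0, 0, 0, 1, 1, 1, 2, 2, 2], "baz": [0, 0, 0, 1, 1, 1, 2, 2, 2]}]

# First flatten foo (9 rows, each still holding the full baz array), then flatten baz.
rows = flatten_one(flatten_one(data, "foo"), "baz")

# Count rows per foo value, as the outer "count" aggregate does.
counts = Counter(r["foo"] for r in rows)
```

Each of the 9 `foo` values is repeated once per `baz` value, so there are 81 rows and every `foo` bucket is counted 3 × 9 = 27 times.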
Related
I would like to implement an example of a RoseWid chart in Data Studio with the Vega community visualization, based on this great code from Stan Nowak. Once adapted for the Vega plugin, it looks like this:
{"$schema": "https://vega.github.io/schema/vega/v5.json",
"width": 400,
"height": 400,
"data": [
{
"name": "table",
"values": [
{"g": 1, "c": 0, "r": 11},
{"g": 1, "c": 1, "r": 22},
{"g": 1, "c": 2, "r": 13},
{"g": 2, "c": 0, "r": 24},
{"g": 2, "c": 1, "r": 35},
{"g": 2, "c": 2, "r": 36},
{"g": 3, "c": 0, "r": 42},
{"g": 3, "c": 1, "r": 32},
{"g": 3, "c": 2, "r": 32},
{"g": 4, "c": 0, "r": 6},
{"g": 4, "c": 1, "r": 27},
{"g": 4, "c": 2, "r": 16},
{"g": 5, "c": 0, "r": 52},
{"g": 5, "c": 1, "r": 79},
{"g": 5, "c": 2, "r": 38},
{"g": 6, "c": 0, "r": 19},
{"g": 6, "c": 1, "r": 83},
{"g": 6, "c": 2, "r": 3}
]
},
{
"name": "angles",
"source": "table",
"transform": [
{"type": "aggregate", "groupby": ["g"]},
{"type": "pie"}
]
},{
"name": "stack",
"source": "table",
"transform": [
{"type": "stack", "groupby": ["g"], "sortby": ["c"], "field": "r"},
{"type": "lookup", "from": "angles", "key": "g", "fields": ["g"], "as": ["obj"]}
]
}
],
"scales": [
{
"name": "color",
"type": "linear",
"domain": {"data": "stack", "field": "c"},
"range": {"scheme": "redyellowgreen"}
},
{
"name": "r",
"type": "sqrt",
"domain": {"data": "table", "field": "y"},
"range": [20, 200]
}
],
"marks": [
{
"type": "arc",
"from": {"data": "stack"},
"encode": {
"enter": {
"x": {"field": {"group": "width"}, "mult": 0.5},
"y": {"field": {"group": "height"}, "mult": 0.5},
"startAngle": {"data": "table", "field": "obj.startAngle"},
"endAngle": {"data": "table", "field": "obj.endAngle"},
"innerRadius": {"field": "y0"},
"outerRadius": {"field": "y1"},
"stroke": {"value": "black"}
},
"update": {"fill": {"scale": "color", "field": "c"}},
"hover": {"fill": {"value": "red"}}
}
}
],
"config": {}
}
The dataset has been arranged in Data Studio to have the same structure as the provided sample table (3 columns matching the inline table). However, I'm stuck on how to replace the table with the Data Studio bindings (namely $dimension0, $dimension1 and $metric0).
So far I tried:
"data": [
{
"name": "table",
"values": ["$dimension0","$dimension1","$metric0"],
"as": ["g","c","r"]
},
....
and some variations on this, all to no avail. The visualization stays blank and there's little information on what is failing.
Any help would be greatly appreciated.
EDIT: Here's a Google Sheet that reproduces the same structure as the table in the inline code; it would be used as the data source in Data Studio. You can find an example of the working visualization and my attempts to solve this here.
First of all, there is an error in your example Vega source: the stack transform has no "sortby" parameter. Instead, specify "sort":
"sort": {"field": ["c"], "order": ["ascending"]},
View example in Vega online editor
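To make the effect of the sort concrete, here is a rough Python emulation of what a stack transform with "groupby", "sort" and "field" computes. The function name `stack` and its arguments are my own; it assumes the default zero offset and the default output names y0 and y1, and it returns rows in sort order, which may differ from the row order Vega keeps.

```python
from collections import defaultdict


def stack(rows, groupby, sort_field, field):
    """Within each group, sort rows by sort_field and accumulate field into y0/y1."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[groupby]].append(row)

    out = []
    for members in groups.values():
        total = 0
        for row in sorted(members, key=lambda r: r[sort_field]):
            out.append({**row, "y0": total, "y1": total + row[field]})
            total += row[field]
    return out


# Three rows of group 1 from the question's table, deliberately out of order.
table = [
    {"g": 1, "c": 2, "r": 13},
    {"g": 1, "c": 0, "r": 11},
    {"g": 1, "c": 1, "r": 22},
]

stacked = stack(table, "g", "c", "r")
```

With the sort on "c", the row with c = 0 sits innermost (y0 = 0), and the others are layered outward in order, which is what the arc mark's innerRadius and outerRadius rely on.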
See the Vega documentation for the stack transform and compare.
I have no experience with Google Data Studio, but the "Visualizations" web page has examples of using Vega in Data Studio:
"Data Studio Vega Viz" by Jerry Chen
Based on his examples, "$dimension0","$dimension1" and "$metric0" can be used directly as Vega data fields.
Your question is how to use the Vega source that already has field names "g", "c" and "r". One way is to globally replace all "g", "c" and "r" with "$dimension0","$dimension1" and "$metric0" as shown in Jerry Chen's examples.
An alternative approach is to create formula fields "g", "c" and "r" as substitutes for "$dimension0","$dimension1" and "$metric0" as shown below. If this works then it is a general solution for converting any existing Vega spec to Google Data Studio and also for retaining meaningful field names in the Vega spec.
"data": [
{ "name": "table",
"transform": [
{
"type": "formula",
"as": "g",
"expr": "datum['$dimension0']"
},
{
"type": "formula",
"as": "c",
"expr": "datum['$dimension1']"
},
{
"type": "formula",
"as": "r",
"expr": "datum['$metric0']"
}
]
},
{
"name": "angles",
"source": "table",
"transform": [
{"type": "aggregate", "groupby": ["g"]},
{"type": "pie"}
]
}
... etc
I came across the same issue and managed to solve it after a lot of trial and error.
To bind to Data Studio, you have to name your dataset "default"; you can then rename the fields with a "type": "project" transform in Vega.
"data": [
{
"name": "default",
"transform" : [{ "type" : "project",
"fields" : ["$dimension0","$dimension1","$metric0"],
"as": ["g","c","r"]}
]
},
....
This sample code from vega transformations is not working.
I am not getting the expected results.
Can you please help me run this example in the vega editor?
[
{"foo": {"a": 5, "b": "abc"}, "bar": 0},
{"foo": {"a": 6, "b": "def"}, "bar": 1},
{"foo": {"a": 7, "b": "ghi"}, "bar": 2}
]
To extract the "bar" field along with the "a" and "b" sub-fields into new objects, use the transform:
{
"type": "project",
"fields": ["bar", "foo.a", "foo.b"],
"as": ["bar", "a", "b"]
}
This produces the following output:
[
{"bar":0, "a":5, "b":"abc"},
{"bar":1, "a":6, "b":"def"},
{"bar":2, "a":7, "b":"ghi"}
]
Ok, I used the very first sample in the editor (the bar chart one) to demonstrate the transformation for you. Here is the modified code you can paste in the editor:
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"description": "A basic bar chart example, with value labels shown upon mouse hover.",
"width": 400,
"height": 200,
"padding": 5,
"data": [
{
"name": "table",
"values": [
{"foo": {"a": 5, "b": "abc"}, "bar": 0},
{"foo": {"a": 6, "b": "def"}, "bar": 1},
{"foo": {"a": 7, "b": "ghi"}, "bar": 2}
],
"transform": [
{
"type": "project",
"fields": ["bar", "foo.a", "foo.b"],
"as": ["bar", "a", "b"]
}
]
}
],
"signals": [
{
"name": "tooltip",
"value": {},
"on": [
{"events": "rect:mouseover", "update": "datum"},
{"events": "rect:mouseout", "update": "{}"}
]
}
],
"scales": [
{
"name": "xscale",
"type": "band",
"domain": {"data": "table", "field": "bar"},
"range": "width",
"padding": 0.05,
"round": true
},
{
"name": "yscale",
"domain": {"data": "table", "field": "a"},
"nice": true,
"range": "height"
}
],
"axes": [
{ "orient": "bottom", "scale": "xscale" },
{ "orient": "left", "scale": "yscale" }
],
"marks": [
{
"type": "rect",
"from": {"data":"table"},
"encode": {
"enter": {
"x": {"scale": "xscale", "field": "bar"},
"width": {"scale": "xscale", "band": 1},
"y": {"scale": "yscale", "field": "a"},
"y2": {"scale": "yscale", "value": 0}
},
"update": {
"fill": {"value": "steelblue"}
},
"hover": {
"fill": {"value": "red"}
}
}
},
{
"type": "text",
"encode": {
"enter": {
"align": {"value": "center"},
"baseline": {"value": "bottom"},
"fill": {"value": "#333"}
},
"update": {
"x": {"scale": "xscale", "signal": "tooltip.bar", "band": 0.5},
"y": {"scale": "yscale", "signal": "tooltip.a", "offset": -2},
"text": {"signal": "tooltip.b"},
"fillOpacity": [
{"test": "datum === tooltip", "value": 0},
{"value": 1}
]
}
}
}
]
}
And here would be the expected result:
As you can see in the data viewer of the editor, "bar", "a" and "b" have the correct values. I have changed the bar chart to use those values for the bar positions and heights, as well as for the tooltip shown when you hover over a bar.
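If you just want to check the expected rows without opening the editor, the project step can be mimicked in a few lines of Python. This is only a sketch: `get_path` and `project` are hypothetical helpers, and the dotted-path handling is a simplification of Vega's field accessor syntax (no escaping or bracket notation).

```python
def get_path(obj, path):
    """Follow a dotted path such as 'foo.a' into nested dicts."""
    for part in path.split("."):
        obj = obj[part]
    return obj


def project(rows, fields, names):
    """Build new objects holding only the listed fields, renamed to the given output names."""
    return [
        {name: get_path(row, field) for field, name in zip(fields, names)}
        for row in rows
    ]


rows = [
    {"foo": {"a": 5, "b": "abc"}, "bar": 0},
    {"foo": {"a": 6, "b": "def"}, "bar": 1},
    {"foo": {"a": 7, "b": "ghi"}, "bar": 2},
]

projected = project(rows, ["bar", "foo.a", "foo.b"], ["bar", "a", "b"])
```

The result matches the output listed in the documentation excerpt from the question: flat objects with only "bar", "a" and "b".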
I'm fairly new to vega-lite. I'd really like to get the following nested bar chart working.
This nested bar chart depicts aggregated values across multiple categories. The input data is subdivided according to two fields (with uneven category membership). Each sub-group is then aggregated to show the average value of a third, quantitative field.
Example on Vega: Nested Bar Chart Example
How not to use the row function?
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"values": [
{"a0": 0,"a": 0, "b": "a", "c": 6.3},
{"a0": 0,"a": 0, "b": "a", "c": 4.2},
{"a0": 0,"a": 0, "b": "b", "c": 6.8},
{"a0": 0,"a": 0, "b": "c", "c": 5.1},
{"a0": 0,"a": 1, "b": "b", "c": 4.4},
{"a0": 0,"a": 2, "b": "b", "c": 3.5},
{"a0": 0,"a": 2, "b": "c", "c": 6.2}
]
},
"transform": [
{"window": [{"op": "count", "as": "room2"}]}
],
"vconcat": [
{
"facet": {"column": {"field": "a0"},"row": {"field": "a"}},
"spec": {
"width": 100,
"encoding": {
"y": {"field": "room2", "type": "nominal","axis": null},
"x": {"value": 100, "type": "quantitative"}
},
"layer": [
{
"mark": {"type": "bar", "cornerRadius": 10},
"encoding": {
"color": {
"field": "room2"
}
}
}
]
}
}
]
}
I have converted the Vega Nested Bar Chart Example to Vega-Lite, so I hope it helps you understand or resolve your case. If you need help with your own chart, I would need more detail about your inputs. Refer to the code below or open it in the editor:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
"values": [
{"a": 0, "b": "a", "c": 6.3},
{"a": 0, "b": "a", "c": 4.2},
{"a": 0, "b": "b", "c": 6.8},
{"a": 0, "b": "c", "c": 5.1},
{"a": 1, "b": "b", "c": 4.4},
{"a": 2, "b": "b", "c": 3.5},
{"a": 2, "b": "c", "c": 6.2}
]
},
"transform": [
{
"aggregate": [{"field": "c", "as": "avg_c", "op": "average"}],
"groupby": ["a", "b"]
}
],
"vconcat": [
{
"facet": {"row": {"field": "a", "title": null, "header": null}},
"spec": {
"width": 300,
"encoding": {
"y": {"field": "b", "type": "nominal", "axis": null},
"x": {"field": "avg_c", "type": "quantitative"}
},
"layer": [
{
"mark": {"type": "bar", "cornerRadius": 10},
"encoding": {"color": {"field": "a"}}
},
{
"mark": {"type": "text", "align": "right", "dx": -5},
"encoding": {
"text": {"field": "b"},
"x": {"datum": "0", "type": "quantitative"}
}
}
]
}
}
]
}
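For reference, the aggregate step in the spec above boils down to a group-by average. A quick Python sketch of the same computation (plain Python, not Vega-Lite; the variable names are mine) shows which rows the facets end up with:

```python
from collections import defaultdict

rows = [
    {"a": 0, "b": "a", "c": 6.3},
    {"a": 0, "b": "a", "c": 4.2},
    {"a": 0, "b": "b", "c": 6.8},
    {"a": 0, "b": "c", "c": 5.1},
    {"a": 1, "b": "b", "c": 4.4},
    {"a": 2, "b": "b", "c": 3.5},
    {"a": 2, "b": "c", "c": 6.2},
]

# Collect the c values of every (a, b) combination.
groups = defaultdict(list)
for row in rows:
    groups[(row["a"], row["b"])].append(row["c"])

# One output row per group, carrying the average of c as avg_c.
aggregated = [
    {"a": a, "b": b, "avg_c": sum(cs) / len(cs)}
    for (a, b), cs in groups.items()
]
```

Only the (a = 0, b = "a") group has more than one member, so seven input rows collapse into six bars spread over three row facets.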
Is there a way to dynamically calculate growth rates in Vega-Lite?
For example:
[
{"date": "1/1/2020", "b": 27},
{"date": "1/2/2020", "b": 30},
{"date": "1/3/2020", "b": 33}
]
How could I create data (and a chart) that shows the daily +3 (or the ~+10%)?
Edit: Thanks for the answer, @jakevdp.
I should have outlined the added complexity earlier, apologies: I need to aggregate prior to calculating changes. See below for the data and my attempt (the dates seem offset and the last date's drop doesn't make sense).
[Vega Editor][1]
{
"data": {
"values": [
{"date": "2020-01-01", "country": "CHN", "count": 0},
{"date": "2020-01-02", "country": "CHN", "count": 2},
{"date": "2020-01-03", "country": "CHN", "count": 4},
{"date": "2020-01-01", "country": "GER", "count": 0},
{"date": "2020-01-02", "country": "GER", "count": 2},
{"date": "2020-01-03", "country": "GER", "count": 4},
{"date": "2020-01-04", "country": "GER", "count": 6}
]
},
"transform": [
{
"aggregate": [{"op": "sum", "field":"count", "as":"daily_count"}],
"groupby": ["date"]
},
{
"window": [
{"op": "lead", "field": "daily_count", "as": "daily_count_tomorrow"}
]
},
{"filter": "isValid(datum.daily_count_tomorrow)"},
{"calculate": "datum.daily_count_tomorrow - datum.daily_count", "as": "change"}
],
"mark": "bar",
"encoding": {
"x": {"type": "ordinal", "field": "date", "timeUnit": "yearmonthdate"},
"y": {"type": "quantitative", "field": "change"}
}
}
[1]: https://vega.github.io/editor/#/url/vega-lite/N4KABGBEAmCGAutIC4yghSA3WAbArgKYDOKYA2uBhMDAoWZAEwAMrAtCwIydeQA0UAMYB7fADt4AJwCejAMIAJAHIDhYyWRYBfflWq048BqmZsWvTkzWRRE6XNNLVg2xvhkmu-RkP1GrBzcnADMNnaSsgoq4e5kACzePjR0xgHmltyx9lGmAOIAogBK2ZqoOnrUKUYmUIEWwWylDoyFJa4RHqhelVV+aab1mWEd7rlQbc0J3lVoqbVmQTws8c3jkJOj9mQAbNo+ALpU3pDSsOLEAGYiUgC2ZJQGyVCwAOavUoSv-qjktCIAB0YxHw91clwAloRcNAUG5tq5YKRkHQIbgZAB9TqQbQHXrUSAfMQAgBGjgo80gR2oM18z0gAHcIeJoCIGQ9nilAYxcIRYLDwVCYYw4GjMdjEcioKL0Vj3Bj4CJbjcpGycc9qRhaSlIbhjFJGBDiAA1PAQ6AACiMoIAdDLxfLFcqpKqGQBKHH4uZCPBCfC4H7ShC2+1y+wKpUqtlgdhga23O2wMVhzSSxhCAAW51eDH2EDxICokFusCkAGtGCTSwIi4RxKJoMzXmR0BhIAAPFunGQAhY3RviPA2SHQ2GmGo2eAQ26EACq4ghXSgMj5dxEkgzE+1y678B7CwAjvhzlPEFOsAxBaP01nxDn1RB9togA
Yes, you can do this using the window transform with the lead or lag operation. For example (vega editor):
{
"data": {
"values": [
{"date": "2020-01-01", "b": 29},
{"date": "2020-01-02", "b": 30},
{"date": "2020-01-03", "b": 32},
{"date": "2020-01-04", "b": 31},
{"date": "2020-01-05", "b": 34}
]
},
"transform": [
{"window": [{"op": "lead", "field": "b", "as": "b1"}]},
{"filter": "isValid(datum.b1)"},
{"calculate": "datum.b1 - datum.b", "as": "change"}
],
"mark": "bar",
"encoding": {
"x": {"type": "ordinal", "field": "date", "timeUnit": "yearmonthdate"},
"y": {"type": "quantitative", "field": "change"},
"color": {
"condition": {"test": "datum.change > 0", "value": "green"},
"value": "red"
}
}
}
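The three transform steps (lead, filter, calculate) can be traced with a short Python sketch. The function `lead_change` is a hypothetical helper, and the `pct_change` column is my addition to cover the percentage variant from the question; it is not part of the spec above.

```python
rows = [
    {"date": "2020-01-01", "b": 29},
    {"date": "2020-01-02", "b": 30},
    {"date": "2020-01-03", "b": 32},
    {"date": "2020-01-04", "b": 31},
    {"date": "2020-01-05", "b": 34},
]


def lead_change(rows, field):
    """Pair each row with the following one (lead) and compute the difference.
    The last row has no successor and is dropped, like the isValid filter does."""
    out = []
    for current, nxt in zip(rows, rows[1:]):
        change = nxt[field] - current[field]
        base = current[field]
        # Percentage change is undefined when the base value is 0.
        pct = 100 * change / base if base != 0 else None
        out.append({**current, "change": change, "pct_change": pct})
    return out


changes = lead_change(rows, "b")
```

Note that with lead, each change is attached to the earlier of the two dates. If you would rather attribute it to the later date, the lag operation with the subtraction reversed should give the mirror image.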
I'm trying to use values from the data in order to set the colors of the bars. I would also like this to be reflected in a legend.
So I've figured out how to use a specific color for a bar, based on a value in the data:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"description": "A bar chart that directly encodes color names in the data.",
"data": {
"values": [
{
"color": "rgb(0, 0, 0)",
"b": 28,
"type": "outside"
},
{
"color": "rgb(255, 0, 0)",
"b": 55,
"type": "inside"
},
{
"color": "rgb(0, 255, 0)",
"b": 43,
"type": "dew"
}
]
},
"mark": "bar",
"encoding": {
"x": {
"field": "type",
"type": "nominal"
},
"y": {
"field": "b",
"type": "quantitative"
},
"color": { "field": "color", "type": "nominal", "legend": {}, "scale": null}
}
}
Correctly colored bars:
The above only works because of "scale": null, which also prevents the legend from showing. If I remove it, the legend shows, but the custom colors are lost and the rgb values show up as legend labels:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"description": "A bar chart that directly encodes color names in the data.",
"data": {
"values": [
{
"color": "rgb(0, 0, 0)",
"b": 28,
"type": "outside"
},
{
"color": "rgb(255, 0, 0)",
"b": 55,
"type": "inside"
},
{
"color": "rgb(0, 255, 0)",
"b": 43,
"type": "dew"
}
]
},
"mark": "bar",
"encoding": {
"x": {
"field": "type",
"type": "nominal"
},
"y": {
"field": "b",
"type": "quantitative"
},
"color": { "field": "color", "type": "nominal", "legend": {}}
}
}
Colors lost, wrong legend labels:
I can obviously get the correct legend labels with:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"description": "A bar chart that directly encodes color names in the data.",
"data": {
"values": [
{
"color": "rgb(0, 0, 0)",
"b": 28,
"type": "outside"
},
{
"color": "rgb(255, 0, 0)",
"b": 55,
"type": "inside"
},
{
"color": "rgb(0, 255, 0)",
"b": 43,
"type": "dew"
}
]
},
"mark": "bar",
"encoding": {
"x": {
"field": "type",
"type": "nominal"
},
"y": {
"field": "b",
"type": "quantitative"
},
"color": { "field": "type", "type": "nominal", "legend": {}}
}
}
But still I don't get the colors I want:
Is it possible to have both custom colors and a legend?
The way to get custom colors to appear in a legend is to use a scale with a custom scheme. For example, you could create the chart you have in mind this way:
(view in vega editor)
{
"data": {
"values": [
{"b": 28, "type": "outside"},
{"b": 55, "type": "inside"},
{"b": 43, "type": "dew"}
]
},
"mark": "bar",
"encoding": {
"x": {"field": "type", "type": "nominal"},
"y": {"field": "b", "type": "quantitative"},
"color": {
"field": "type",
"type": "nominal",
"scale": {
"domain": ["outside", "inside", "dew"],
"range": ["rgb(0, 0, 0)", "rgb(255, 0, 0)", "rgb(0, 255, 0)"]
}
}
}
}
I don't know of any way to draw this color scheme definition from the data, or to force a legend to be drawn when setting scale to null, but you could hack it by essentially drawing the legend yourself. It might look something like this:
(view in vega editor)
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"description": "A bar chart that directly encodes color names in the data.",
"data": {
"values": [
{"color": "rgb(0, 0, 0)", "b": 28, "type": "outside"},
{"color": "rgb(255, 0, 0)", "b": 55, "type": "inside"},
{"color": "rgb(0, 255, 0)", "b": 43, "type": "dew"}
]
},
"hconcat": [
{
"mark": "bar",
"encoding": {
"x": {"field": "type", "type": "nominal"},
"y": {"field": "b", "type": "quantitative"},
"color": {
"field": "color",
"type": "nominal",
"legend": {},
"scale": null
}
}
},
{
"title": "type",
"mark": {"type": "point", "size": 80, "shape": "square", "filled": true},
"encoding": {
"y": {
"field": "type",
"type": "nominal",
"axis": {"orient": "right", "title": null}
},
"color": {"field": "color", "type": "nominal", "scale": null}
}
}
]
}
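If the spec is produced by a script rather than written by hand, another workaround outside of Vega-Lite itself is to derive the scale's domain and range from the data before emitting the spec. A minimal Python sketch, assuming each type always maps to the same color (the variable names are mine):

```python
import json

rows = [
    {"color": "rgb(0, 0, 0)", "b": 28, "type": "outside"},
    {"color": "rgb(255, 0, 0)", "b": 55, "type": "inside"},
    {"color": "rgb(0, 255, 0)", "b": 43, "type": "dew"},
]

# Map each type to its color, keeping first-seen order.
palette = {}
for row in rows:
    palette.setdefault(row["type"], row["color"])

spec = {
    "data": {"values": [{"b": r["b"], "type": r["type"]} for r in rows]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "type", "type": "nominal"},
        "y": {"field": "b", "type": "quantitative"},
        "color": {
            "field": "type",
            "type": "nominal",
            # Domain and range are generated from the data instead of typed by hand.
            "scale": {
                "domain": list(palette),
                "range": list(palette.values()),
            },
        },
    },
}

print(json.dumps(spec, indent=2))
```

The printed JSON has the same shape as the first spec in this answer, so the legend shows the type labels with the colors taken from the data.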