prevent histogram bars from overlapping - data-visualization

I am creating a display for 4 similar histograms. This is what I have:
{
"data": {
"values": {
"one":[8,8,7,8,7,8,8,8,8,8,8,8,8,8,9,9,8,8,8,8,8,7,9,8,8,8,8,9,8,7,8,7,8,8,8,8,7,9,8,8,8,8,8,7,8,7,9,8,8,7,9,7,8,8,8,8,8,8,7,9,8,8,8,9,8,8,8,8,8,8,7,8,8,8,9,8,8,8,9,8,8,8,8,9,8,8,8,8,9,8,9,8,8,7,8,9,8,8,8,9],
"two":[3,4,4,4,4,4,4,4,4,3,4,4,3,3,4,3,4,4,3,4,4,4,4,4,4,3,4,4,3,4,3,4,3,4,4,4,4,4,4,4,4,4,4,3,4,3,4,3,4,4,4,3,3,4,4,3,4,4,3,4,4,3,4,4,4,3,4,4,3,3,4,4,3,3,4,4,3,4,4,4,4,4,4,4,4,4,4,3,4,4,4,4,4,4,4,4,4,4,4,4],
"three": [3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3],
"four":[3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,3,3,3,3,3,3,3,3,3,3,3,3,3,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3]
}
},
"transform": [{"flatten": ["one", "two", "three", "four"]}, {"fold": ["one", "two", "three", "four"]}],
"mark": {"type": "bar"},
"encoding": {
"x": {"field": "value", "type": "quantitative"},
"y": {
"field": "value",
"type": "quantitative",
"aggregate": "count",
"stack": null
},
"color": {"field": "key", "type": "nominal"}
}
}
https://vega.github.io/editor/#/url/vega-lite/N4IgJghgLhIFygG4QDYFcCmBneoD2AdhvANoAcANJQOxUW2WN1MsUCc7zXNnrlHPQd3q9hQ2gLoSRfLtNaTZQporqqVY0aoaiVAXQogoAdzykAzBQAs12zft2Klh5edPHbh1-cvbbz47e3v5+ge4BER5hkTFRQWHBCUlBBkYAFgBOGMRwJK7hBflFhSXFZaUV5VWVNdV1tQ31TY21qQBmeGgZFi3Nfb0lNv3VQwPDYxPjdXoAvjOGUBkQBFgdGQC2pKBtKNBQGASkIITEC6YgC5nZFyAdXSCzFNt4KGBHJzcmZpdZp7ed3VmqXWEAyAGtcEYAJ4ABxyIAARqCQPMQAcAMZ4MAASwIAHNIQAPSFtbEYV7wEDIdB-KCw+EARzQyyg2JgrMQxFRUJJZIpcCpqEwn3plKZLLZ0GxnJuEDxeKyeOg8MxaAIUBuWBg6IhcAIaBQKFRmJQeG6CFufLeArBGB5C1FAoIeHWuNQKLmQA
The problem is that it is difficult to read where the histogram bars appear on top of each other. I tried adding opacity as well, but it seemed really messy and still difficult to read. I also tried adding a column attribute to encoding, however with the histograms in separate graphs it is not as easy to get a quick visual comparison of the distributions.
I would like to try placing the bars of the histograms next to each other, similar to this Matplotlib example:
How can I accomplish this in Vega Lite?

It sounds like you're looking for a Grouped Bar Chart. For your data, you could follow that example and do something like this (editor):
{
"data": {
"values": {
"one":[8,8,7,8,7,8,8,8,8,8,8,8,8,8,9,9,8,8,8,8,8,7,9,8,8,8,8,9,8,7,8,7,8,8,8,8,7,9,8,8,8,8,8,7,8,7,9,8,8,7,9,7,8,8,8,8,8,8,7,9,8,8,8,9,8,8,8,8,8,8,7,8,8,8,9,8,8,8,9,8,8,8,8,9,8,8,8,8,9,8,9,8,8,7,8,9,8,8,8,9],
"two":[3,4,4,4,4,4,4,4,4,3,4,4,3,3,4,3,4,4,3,4,4,4,4,4,4,3,4,4,3,4,3,4,3,4,4,4,4,4,4,4,4,4,4,3,4,3,4,3,4,4,4,3,3,4,4,3,4,4,3,4,4,3,4,4,4,3,4,4,3,3,4,4,3,3,4,4,3,4,4,4,4,4,4,4,4,4,4,3,4,4,4,4,4,4,4,4,4,4,4,4],
"three": [3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3],
"four":[3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,3,3,3,3,3,3,3,3,3,3,3,3,3,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3]
}
},
"transform": [
{"flatten": ["one", "two", "three", "four"]},
{"fold": ["one", "two", "three", "four"]}
],
"mark": {"type": "bar"},
"encoding": {
"x": {"field": "key", "type": "nominal", "axis": null},
"column": {
"field": "value",
"type": "quantitative",
"spacing": 2,
"header": {"titleOrient": "bottom", "labelOrient": "bottom"}
},
"y": {
"field": "value",
"type": "quantitative",
"aggregate": "count",
"stack": null
},
"color": {"field": "key", "type": "nominal"}
},
"width": {"step": 12},
"config": {"view": {"stroke": "transparent"}, "axis": {"domainWidth": 1}}
}
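As a possible alternative, newer Vega-Lite releases (5.2 and later, which is an assumption about your environment) offer an xOffset encoding channel that places bars side by side within a single view, without a column facet. The sketch below uses a shortened, made-up subset of the data to keep it readable; the flatten and fold transforms are the same as above.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": {
      "one": [8, 8, 7, 9],
      "two": [3, 4, 4, 4],
      "three": [3, 3, 3, 3],
      "four": [3, 3, 4, 3]
    }
  },
  "transform": [
    {"flatten": ["one", "two", "three", "four"]},
    {"fold": ["one", "two", "three", "four"]}
  ],
  "mark": {"type": "bar"},
  "encoding": {
    "x": {"field": "value", "type": "ordinal"},
    "xOffset": {"field": "key"},
    "y": {"aggregate": "count", "type": "quantitative", "stack": null},
    "color": {"field": "key", "type": "nominal"}
  }
}
```

Treating value as ordinal gives each distinct value its own band on the x-axis, and xOffset then positions one bar per key inside that band.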

Related

How would I add a tooltip to a multi series line chart

I'm new to WebStorm and vega/vegalite and I am working on creating a visual with different types of gasoline and their prices from 1996-2020.
I've been able to create a graph with all of the information, but it's pretty hard to discern anything.
I've been going over the Vega-Lite documentation and I see that I can use tooltips to zoom into the graphic. I tried implementing it, but I don't think I quite understand the scope of some of the properties.
Could someone show me how they might approach this task? Or perhaps even recommend videos or websites that might help me better understand how to do this?
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "Area charts of stock prices, with an interactive overview and filtered detail views.",
"width": 720,
"height": 480,
"padding": 5,
"data": {
"name": "gas_prices",
"url": "data/testInfo.csv",
"format": {"type": "csv", "parse": {"A1": "number", "date": "date"}}
},
"repeat": {
"layer": ["A1","A2","A3","R1","R2","R3","M1","M2","M3","P1","P2","P3","D1"]
},
"spec": {
"mark": "line",
"encoding": {
"x": {
"timeUnit": "yearmonth",
"title": "Date",
"field": "date"
},
"y": {
"field": {"repeat":"layer"},
"title": "Gas Prices",
"type": "quantitative"
},
"color": {
"datum": {"repeat": "layer"},
"type": "nominal"
}
}
}
}
You can refer to the tooltip documentation, which describes the options available for enabling tooltips on your chart.
Below is a sample config with the default tooltip (see also the editor):
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"repeat": ["Horsepower", "Miles_per_Gallon", "Acceleration", "Displacement"],
"columns": 2,
"spec": {
"data": {"url": "data/cars.json"},
"mark": {"type": "bar", "tooltip": true},
"encoding": {
"x": {"field": {"repeat": "repeat"}, "bin": true},
"y": {"aggregate": "count"},
"color": {"field": "Origin"}
}
}
}
For tooltips with some customization, see the code below (or the editor):
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"repeat": ["Horsepower", "Miles_per_Gallon", "Acceleration", "Displacement"],
"columns": 2,
"spec": {
"data": {"url": "data/cars.json"},
"mark": "bar",
"encoding": {
"tooltip": [
{"aggregate": "count", "title": "YAxis"},
{"field": {"repeat": "repeat"}, "title": "myXAxis"}
],
"x": {"field": {"repeat": "repeat"}, "bin": true},
"y": {"aggregate": "count"},
"color": {"field": "Origin"}
}
}
}
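The same idea could be carried over to the repeated line chart from the question. The following is only a sketch: it reuses the field names and file path from the question, lists just three of the series for brevity, and adds "point": true (an addition not in the original spec) so that each data point becomes a hover target with its own tooltip values.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "data/testInfo.csv",
    "format": {"type": "csv", "parse": {"date": "date"}}
  },
  "repeat": {"layer": ["A1", "A2", "A3"]},
  "spec": {
    "mark": {"type": "line", "point": true},
    "encoding": {
      "x": {"timeUnit": "yearmonth", "field": "date", "title": "Date"},
      "y": {
        "field": {"repeat": "layer"},
        "type": "quantitative",
        "title": "Gas Prices"
      },
      "color": {"datum": {"repeat": "layer"}, "type": "nominal"},
      "tooltip": [
        {"timeUnit": "yearmonth", "field": "date", "title": "Date"},
        {"field": {"repeat": "layer"}, "type": "quantitative", "title": "Price"}
      ]
    }
  }
}
```

Without the point overlay, each line is a single mark, so a tooltip on the line alone would not reflect the value under the cursor.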

How to add vertical rules as new layer and same x-axis?

When I add the strips as a new layer (to the 2-layer chart), it stops working: nothing is rendered, and I get the warning "WARN Cannot project a selection on encoding channel "y", which has no field".
The first two layer definitions below were working fine when there were only the two lines.
vglSpec.push(['#vis2a',{
$schema: vglVers,
data: {"url":"MyDataset1"},
// old "encoding": { x: {"field": "instant", "type": "temporal"} }
width:680,
layer: [
{
"mark": {"stroke": "#68C", "type": "line", "point": true},
"encoding": { x: {"field": "instant", "type": "temporal"}, "y": {
"field": "n_count",
"type": "quantitative"
}},
"selection": {"grid": {"type":"interval", "bind":"scales"}} //zoom
},
{
"mark": {"stroke": "red", "type": "line", "strokeOpacity": 0.4},
"encoding": { x: {"field": "instant", "type": "temporal"}, "y": {
"field": "instant_totmin",
"type": "quantitative"
}}
},
{
"mark": "rule",
"data": {"url":"MyDataset2"}, // little subset of instant of Dataset1
"encoding": {
"x": { "field": "instant", "type": "temporal"},
"color": {"value": "yellow"},
"size": {"value": 5}
},
//resolve:? x is same axis and the only visualization field
}
],
resolve: {"scale": {"y": "independent"}}
}]);
PS: I only removed names and titles; otherwise this is the real script.
Emulating with dummy data works fine!
Please click on the 3rd example of the rule guide, and replace or adapt it with this Vega-Lite script:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "data/movies.json"},
"layer": [
{
"mark": "bar",
"encoding": {
"x": {"bin": true, "field": "IMDB_Rating", "type": "quantitative"},
"y": {"aggregate": "count", "type": "quantitative"}
}
},
{
"mark": "rule",
"data": {"values": [{"IMDB_Rating":3.5},{"IMDB_Rating":7.8}]},
"encoding": {
"x": { "field": "IMDB_Rating","type": "quantitative" },
"color": {"value": "yellow"},
"size": {"value": 4}
}
}
]
}
You're using an independent y-scale, and the y-scale of a rule mark with no y encoding is not well defined. The best way to address this is probably to combine the rule mark with one of the other layers, so it can use that y scale:
vglSpec.push(['#vis2a',{
$schema: vglVers,
data: {"url":"MyDataset1"},
// old "encoding": { x: {"field": "instant", "type": "temporal"} }
width:680,
layer: [
{
"mark": {"stroke": "#68C", "type": "line", "point": true},
"encoding": { x: {"field": "instant", "type": "temporal"}, "y": {
"field": "n_count",
"type": "quantitative"
}},
"selection": {"grid": {"type":"interval", "bind":"scales"}} //zoom
},
{
layer: [
{
"mark": {"stroke": "red", "type": "line", "strokeOpacity": 0.4},
"encoding": { x: {"field": "instant", "type": "temporal"}, "y": {
"field": "instant_totmin",
"type": "quantitative"
}}
},
{
"mark": "rule",
"data": {"url":"MyDataset2"}, // little subset of instant of Dataset1
"encoding": {
"x": { "field": "instant", "type": "temporal"},
"color": {"value": "yellow"},
"size": {"value": 5}
},
//resolve:? x is same axis and the only visualization field
}
]
}
],
resolve: {"scale": {"y": "independent"}}
}]);
(note, I've not actually tried this solution because you didn't include data in your question, but the approach should work).

Clamping y-axis when layering aggregated charts in vega-lite

This is a follow-up to a previous question, for which I built a test case in a (hopefully now public) notebook and noticed the following behavior:
At the end of the notebook, in the "bugs" section, you will notice that the y-axis for max_precipitation in the layered chart is clamped to 10.
I tried changing the domain, but the bars do not go above 10.
Here is the code example from Vega-Lite's editor, reproduced below:
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"title": "Top Months by Mean Precipitation",
"data": {"url": "data/seattle-weather.csv"},
"transform": [
{"timeUnit": "month", "field": "date", "as": "month_date"},
{
"aggregate": [
{"op": "mean", "field": "precipitation", "as": "mean_precipitation"},
{"op": "max", "field": "precipitation", "as": "max_precipitation"}
],
"groupby": ["month_date"]
},
{
"window": [{"op": "row_number", "as": "rank"}],
"sort": [{"field": "mean_precipitation", "order": "descending"}]
}
],
"encoding": {
"x": {
"field": "month_date",
"type": "ordinal",
"timeUnit": "month",
"title": "month (descending by max precip)",
"sort": {
"field": "max_precipitation",
"op": "average",
"order": "descending"
}
}
},
"layer": [
{
"mark": {"type": "bar"},
"encoding": {
"y": {
"field": "max_precipitation",
"type": "quantitative",
"title": "precipitation (mean & max)"
}
}
},
{
"mark": "tick",
"encoding": {
"y": {"field": "mean_precipitation", "type": "quantitative"},
"color": {"value": "red"},
"size": {"value": 15}
}
}
]
}
Please help me understand what I am doing wrong.
It appears that the precipitation column is being parsed as strings rather than as numbers. You can specify the parsing format for the column using the format property of data:
"data": {
"url": "data/seattle-weather.csv",
"format": {"parse": {"precipitation": "number"}}
},
The result is here:

Why column facet in Vega Lite not working properly with layer?

I'm trying to create a 3-column plot, and it works when there's no layer.
But when I add the layer, the 3 columns get merged into one plot (open in editor).
How can I separate it into 3 columns by the duration field?
CODE
For the plot with the full data, please use the editor link above.
{
"encoding": {
"column": { "field": "duration", "type": "nominal" },
"x": { "field": "bin_i", "type": "ordinal" }
},
"layer": [
{
"mark": { "type": "bar", "size": 2 },
"encoding": {
"y": { "field": "min", "type": "quantitative" },
"y2": { "field": "max", "type": "quantitative" }
}
},
{
"mark": { "type": "tick" },
"encoding": {
"y": { "field": "mean", "type": "quantitative" }
}
}
],
"data": {
"values": [
{
"bin_i": 1,
"duration": 1,
"max": 1.9642835793718165,
"mean": 1.0781367168962268,
"min": 0.3111818864927448
},
...
]
}
}
A layered chart does not accept a faceted encoding. If you want to facet a layered chart, you should use the facet operator rather than a facet encoding.
For your example, it would look like this (Vega Editor):
{
"facet": {"column": {"field": "duration", "type": "nominal"}},
"spec": {
"encoding": {
"x": {"field": "bin_i", "type": "ordinal"}
},
"layer": [
{
"mark": {"type": "bar", "size": 2},
"encoding": {
"y": {"field": "min", "type": "quantitative"},
"y2": {"field": "max"}
}
},
{
"mark": {"type": "tick"},
"encoding": {
"y": {"field": "mean", "type": "quantitative"}
}
}
]
},
"data": {
"values": [
{
"bin_i": 1,
"duration": 1,
"max": 1.9642835793718165,
"mean": 1.0781367168962268,
"min": 0.3111818864927448
},
...
]
}
}

vega-lite line plot - color not getting applied in transform filter

Vega Editor link here
I have an overlay color change based on a filter condition in a multi-line chart. I got it working with a single line here, but the red overlay line (along with the red dot) doesn't show up in the multi-line example above. Could anyone help me out?
Short answer: your chart is working, except the filtered values are not colored red.
The core issue is that encodings always supersede mark properties, as you can see in this simpler example: editor link
{
"$schema": "https://vega.github.io/schema/vega-lite/v3.json",
"description": "A scatterplot showing horsepower and miles per gallons.",
"data": {"url": "data/cars.json"},
"mark": {"type": "point", "color": "red"},
"encoding": {
"x": {"field": "Horsepower", "type": "quantitative"},
"y": {"field": "Miles_per_Gallon", "type": "quantitative"},
"color": {"field": "Origin", "type": "nominal"},
"shape": {"field": "Origin", "type": "nominal"}
}
}
Notice that although we specify that the mark should have color red, this is overridden by the color encoding. This is by design within Vega-Lite, because encodings are more specific than properties.
Back to your chart: because you specify the color encoding in the parent chart, each individual layer inherits that color encoding, and those colors override the "color": "red" that you specify in the individual layers.
To make it do what you want, you can move the color encoding into the individual layers (and use a detail encoding to ensure the data are still grouped by that field). For example (editor link):
{
"$schema": "https://vega.github.io/schema/vega-lite/v3.json",
"data": {...},
"width": 1000,
"height": 200,
"autosize": {"type": "pad", "resize": true},
"transform": [
{
"window": [{"op": "rank", "as": "rank"}],
"sort": [{"field": "dateTime", "order": "descending"}]
},
{"filter": "datum.rank <= 100"}
],
"layer": [
{
"mark": {"type": "line"},
"encoding": {
"color": {
"field": "name",
"type": "nominal",
"legend": {"title": "Type"}
}
}
},
{
"mark": {"type": "line", "color": "red"},
"transform": [
{
"as": "count",
"calculate": "if(datum.anomaly == true, datum.count, null)"
},
{"calculate": "true", "as": "baseline"}
]
},
{
"mark": {"type": "circle", "color": "red"},
"transform": [
{"filter": "datum.anomaly == true"},
{"calculate": "true", "as": "baseline"}
]
}
],
"encoding": {
"x": {
"field": "dateTime",
"type": "temporal",
"timeUnit": "hoursminutesseconds",
"sort": {"field": "dateTime", "op": "count", "order": "descending"},
"axis": {"title": "Time", "grid": false}
},
"y": {
"field": "count",
"type": "quantitative",
"axis": {"title": "Count", "grid": false}
},
"detail": {"field": "name", "type": "nominal"}
}
}
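Another option that may work, shown here only as an untested sketch on the cars dataset rather than on the asker's data, is to leave the shared color encoding in the parent chart and override it in a single layer with a constant value. A layer-level encoding for a channel takes the place of the inherited one for that channel, so the filtered layer is drawn in red. The filter threshold below is an arbitrary choice for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"url": "data/cars.json"},
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
    "color": {"field": "Origin", "type": "nominal"}
  },
  "layer": [
    {"mark": {"type": "point"}},
    {
      "transform": [{"filter": "datum.Horsepower > 200"}],
      "mark": {"type": "point", "filled": true},
      "encoding": {"color": {"value": "red"}}
    }
  ]
}
```

For line marks, keep in mind that removing the color field from a layer also removes its grouping, so a detail encoding like the one above would still be needed to keep one line per series.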