Vegalite daily line chart wobble - vega

I've run into an issue with Vega-Lite where a line chart shows a 'wobbling line' when the line should be straight, and the dates appear unevenly spaced.
Can anyone verify that this is a bug, or am I making a mistake in my spec?
I have found that this issue becomes more severe when you increase the number of data points.
To replicate this issue, paste the following spec into the vega lite editor:
{
"description": "",
"data": {
"values": [
{
"date": "2017-01-23",
"value": 100
},
{
"date": "2017-01-24",
"value": 200
},
{
"date": "2017-01-25",
"value": 300
},
{
"date": "2017-01-26",
"value": 400
},
{
"date": "2017-01-27",
"value": 500
},
{
"date": "2017-01-28",
"value": 600
},
{
"date": "2017-01-29",
"value": 700
},
{
"date": "2017-01-30",
"value": 800
},
{
"date": "2017-01-31",
"value": 900
},
{
"date": "2017-02-01",
"value": 1000
},
{
"date": "2017-02-02",
"value": 1100
},
{
"date": "2017-02-03",
"value": 1200
},
{
"date": "2017-02-04",
"value": 1300
},
{
"date": "2017-02-05",
"value": 1400
},
{
"date": "2017-02-06",
"value": 1500
},
{
"date": "2017-02-07",
"value": 1600
}
]
},
"mark": "line",
"encoding": {
"x": {
"field": "date",
"type": "temporal"
},
"y": {
"field": "value",
"type": "quantitative"
}
},
"config": [],
"embed": {
"renderer": "canvas",
"actions": {
"export": false,
"source": false,
"editor": false
}
}
}
Edit: Followup - experimenting in Altair, it seems like the date aspect of this is irrelevant. You get the same problem with both of the following code blocks:
import pandas as pd
import numpy as np
import altair as alt

# Daily dates with linearly increasing values, so the line should be straight
s1 = pd.date_range(start="2017-01-23", end="2020-02-07")
s2 = np.arange(1, len(s1) + 1) * 100
df = pd.DataFrame({"date": s1, "value": s2})
alt.Chart(df).mark_line().encode(
    x='date',
    y='value'
)
and
import pandas as pd
import numpy as np
import altair as alt

# Same linear data, but with a plain integer x axis instead of dates
s1 = np.arange(1, 1000, 1)
s2 = np.arange(1, len(s1) + 1) * 100
df = pd.DataFrame({"x": s1, "value": s2})
alt.Chart(df).mark_line().encode(
    x='x',
    y='value'
)
Conversely the following produced a smooth plot (pandas and matplotlib):
%matplotlib inline
df.plot('date', 'value')

The wiggle is caused by rounding error when the data values are converted to pixel coordinates.
In the Vega code that Vega-Lite generates, the scales are defined with "round": true. Changing this to false solves the problem on my screen. You can make Vega-Lite do this by replacing the line
"config": [],
in the Vega-Lite spec with:
"config": {"scale": {"round": false}},
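The rounding effect can be reproduced outside Vega. The following is a minimal Python sketch, not Vega's actual scale code: pixel_positions is a hypothetical helper that maps values linearly onto a pixel range, and the 200-pixel width is an assumed chart size. With rounding on, consecutive points advance by either 0 or 1 pixel, which is the staircase that shows up as a wobble. Without rounding, the steps are uniform.

```python
def pixel_positions(values, domain, range_px, round_output):
    """Linearly map data values to pixel coordinates (hypothetical helper)."""
    d0, d1 = domain
    r0, r1 = range_px
    positions = []
    for v in values:
        p = r0 + (v - d0) / (d1 - d0) * (r1 - r0)
        positions.append(round(p) if round_output else p)
    return positions


def steps(positions):
    """Differences between consecutive pixel positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Same x data as the second Altair example: 999 evenly spaced points.
xs = list(range(1, 1000))

# Assumed chart width of 200 pixels.
rounded = pixel_positions(xs, (1, 999), (0, 200), round_output=True)
exact = pixel_positions(xs, (1, 999), (0, 200), round_output=False)

steps_rounded = steps(rounded)
steps_exact = steps(exact)

print("distinct rounded steps:", sorted(set(steps_rounded)))
print("spread of exact steps:", max(steps_exact) - min(steps_exact))
```

The more points share the same pixel width, the more of them collapse onto the same rounded pixel, which matches the observation that the problem gets worse as the number of data points grows.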

Related

How do I format currency in Vega-Lite?

I'm trying to format values as currency in the Vega-Lite editor. I'm attempting to copy the docs but I'm getting a weird error. The Y axis is a numerical value. Passing in a formatting string gives "value expected".
Here's the json:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"description": "Protocol Chart",
"width": 500,
"height": 225,
"data": {
"values": [
{
"asset": "eth",
"time": "2021-06-15T00:00:00Z",
"ReferenceRateUSD": "2577.04473863238"
},
{
"asset": "eth",
"time": "2021-06-16T00:00:00Z",
"ReferenceRateUSD": "2552.74103641146"
},
{
"asset": "eth",
"time": "2021-06-17T00:00:00Z",
"ReferenceRateUSD": "2360.99938690824"
}
]
},
"config": {
"view": {
"stroke": "transparent"
}
},
"mark": "line",
"encoding": {
"x": {
"axis": {
"domainColor": "#DDD",
"grid": false,
"labelColor": "#AEAEAE",
"ticks": false,
"labelPadding": 10
},
"field": "time",
"type": "temporal",
"title": ""
},
"y": {
"axis": {
"labelOffset": 2,
"domainColor": "white",
"labelColor": "#AEAEAE",
"ticks": false,
"labelPadding": 10,
"format": '$.2f'
},
"field": "ReferenceRateUSD",
"type": "quantitative",
"title": "",
"scale": {
"zero": false
}
},
"color": {
"field": "doesntmatter",
"type": "nominal",
"legend": null,
"scale": {
"range": ["#91DB97"]
}
}
}
}
What am I missing here? How do I get it to accept my formatting string?
"$.2f" looks like the correct d3-format string for currency, but note that this is only valid if the associated formatType is "number" (see the axis label docs).
Since you did not include a full reproducible example of the problem you're seeing, I can only venture a guess that your data type is not numeric, and this is why the formatting is failing. If that's not the case, I'd suggest editing your question to provide a complete example of the error you're seeing.
Edit: your full example appears to work correctly with the current version of vega/vega-lite (view in editor).
Perhaps you need to update your vega/vega-lite libraries?
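One more thing worth checking, as an assumption that is not confirmed in this thread: the spec as posted writes the format string in single quotes ('$.2f'), and strict JSON only accepts double-quoted strings, which could explain a parse error such as "value expected". The sketch below uses Python's json module to show the difference. It also shows that the string-valued ReferenceRateUSD can be converted to a number before formatting. The Python format spec here is only a rough analogue of the d3-format string "$.2f".

```python
import json

# The same fragment written with single and with double quotes.
single_quoted = """{"format": '$.2f'}"""
double_quoted = """{"format": "$.2f"}"""


def is_valid_json(text):
    """Return True if text parses as strict JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(single_quoted))
print(is_valid_json(double_quoted))

# The data values in the question are strings; convert before formatting.
rate = float("2577.04473863238")
label = f"${rate:.2f}"  # rough Python analogue of d3-format "$.2f"
print(label)
```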

How to prepare VOTT JSON dataset to retrain COCO SSD Tensorflow api?

Hi, I have annotated a multi-object dataset and exported it in TensorFlow format from VoTT. However, I have no clue how to use it with the TensorFlow API.
VoTT produces a JSON file for each annotated image, as follows:
"asset": {
"format": "jpg",
"id": "0b1e1aac9a6f2cc4e51d95ef368dbfe7",
"name": "lemon160.jpg",
"path": "file:/Volumes/Solix/Datasets/limedata/imgs_raw/lemon160.jpg",
"size": {
"width": 1280,
"height": 720
},
"state": 2,
"type": 1
},
"regions": [
{
"id": "HWaHAokRV",
"type": "RECTANGLE",
"tags": [
"Expired Lime"
],
"boundingBox": {
"height": 362.8564453125,
"width": 510.81555834378923,
"left": 625.7465495608532,
"top": 355.5866350446429
},
"points": [
{
"x": 625.7465495608532,
"y": 355.5866350446429
},
{
"x": 1136.5621079046425,
"y": 355.5866350446429
},
{
"x": 1136.5621079046425,
"y": 718.4430803571429
},
{
"x": 625.7465495608532,
"y": 718.4430803571429
}
]
}
}
Can somebody suggest a way to convert these files to TFRecords?
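As a possible first step, here is a minimal Python sketch that reads one VoTT annotation and turns each region into a labelled box. It assumes each per-image file is a JSON object with the "asset" and "regions" keys shown above, and that the target format wants box corners normalized to the range 0 to 1 by image width and height (which, to my understanding, is what the TensorFlow Object Detection API expects in its TFRecord examples). vott_regions_to_boxes is a hypothetical helper, and writing the actual TFRecord file is left out.

```python
def vott_regions_to_boxes(annotation):
    """Convert one VoTT annotation dict into normalized, labelled boxes."""
    width = annotation["asset"]["size"]["width"]
    height = annotation["asset"]["size"]["height"]
    boxes = []
    for region in annotation["regions"]:
        bb = region["boundingBox"]
        box = {
            "xmin": bb["left"] / width,
            "ymin": bb["top"] / height,
            "xmax": (bb["left"] + bb["width"]) / width,
            "ymax": (bb["top"] + bb["height"]) / height,
        }
        # A region may carry several tags; emit one box per tag.
        for tag in region["tags"]:
            boxes.append({"label": tag, **box})
    return boxes


# Sample taken from the annotation above (only the fields that are used).
sample = {
    "asset": {"name": "lemon160.jpg", "size": {"width": 1280, "height": 720}},
    "regions": [
        {
            "tags": ["Expired Lime"],
            "boundingBox": {
                "height": 362.8564453125,
                "width": 510.81555834378923,
                "left": 625.7465495608532,
                "top": 355.5866350446429,
            },
        }
    ],
}

boxes = vott_regions_to_boxes(sample)
print(boxes)
```

In practice you would load each file with json.load and loop over all of them, then feed the resulting boxes and the image bytes into whatever TFRecord writer you use.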

Vega-lite Duplicate x-axis labels

The image says it all really, there are duplicate labels for each day and what I really want is just one label per bar. My data set consists of 8 data points:
[
{
"date":"2019-06-21T00:00:00.000Z",
"value":44.6,
},
{
"date":"2019-06-22T00:00:00.000Z",
"value":916.4,
},
{
"date":"2019-06-23T00:00:00.000Z",
"value":948.4,
},
{
"date":"2019-06-24T00:00:00.000Z",
"value":872.4,
},
{
"date":"2019-06-25T00:00:00.000Z",
"value":952.4,
},
{
"date":"2019-06-26T00:00:00.000Z",
"value":1006.4,
},
{
"date":"2019-06-27T00:00:00.000Z",
"value":945.4,
},
{
"date":"2019-06-28T00:00:00.000Z",
"value":320.8,
}
]
And my chart definition is as follows:
{
'$schema': 'https://vega.github.io/schema/vega-lite/v3.json',
'description': 'Electricity consumption by month',
'height': 320,
'autosize': {
'type': 'fit',
'resize': false,
'contains': 'padding'
},
'layer': [{
'data': {
'name': 'data'
},
'layer': [{
'mark': 'bar',
'encoding': {
'x': { 'field': 'date', 'timeUnit': 'day', 'type': 'temporal', 'axis': { 'grid': false, 'labelAngle': 0 } },
'y': {
'field': 'value',
'type': 'quantitative',
'axis': {
'format': '.2f',
'title': 'kwh'
}
}
}
}
]
}]
}
Additionally, I think the values are being bucketed by day, as there are 7 bars and 8 data points, but what I really want is a bar per data point.
So I think I have two questions:
How can I remove the duplicate labels?
How can I remove the bucketing but still label the x-axis with the name of the day? This chart is supposed to show the "last 8 days", with a bar per day.
Both of your issues are caused by your specification of "timeUnit": "day". What this says is that you'd like all of your values put into seven buckets, labeled by day of the week, and that all axis labels should consist only of the day.
So I would try the following:
replace "timeUnit": "day" with "timeUnit": "yearmonthdate" so you are no longer binning your data into seven day-of-week categories
use "format": "%A" in the axis to format your labels as day of week (see d3-time-format for more information).
Additionally, bars look better with an ordinal type than with a temporal or quantitative type, and an ordinal axis ensures that there are no duplicate labels.
With a bit of additional cleanup (putting your data inline, using double quotes rather than single to make it valid in JSON, setting an explicit width for reproducibility regardless of embedding environment, and removing the two extraneous nested layer statements) this is the result (vega editor link):
{
"$schema": "https://vega.github.io/schema/vega-lite/v3.json",
"description": "Electricity consumption by month",
"height": 320,
"width": 320,
"data": {
"values": [
{
"date": "2019-06-21T00:00:00.000Z",
"value": 44.6
},
{
"date": "2019-06-22T00:00:00.000Z",
"value": 916.4
},
{
"date": "2019-06-23T00:00:00.000Z",
"value": 948.4
},
{
"date": "2019-06-24T00:00:00.000Z",
"value": 872.4
},
{
"date": "2019-06-25T00:00:00.000Z",
"value": 952.4
},
{
"date": "2019-06-26T00:00:00.000Z",
"value": 1006.4
},
{
"date": "2019-06-27T00:00:00.000Z",
"value": 945.4
},
{
"date": "2019-06-28T00:00:00.000Z",
"value": 320.8
}
]
},
"mark": "bar",
"encoding": {
"x": {
"field": "date",
"timeUnit": "yearmonthdate",
"type": "ordinal",
"axis": {
"grid": false,
"format": "%A",
"labelAngle": 0
}
},
"y": {
"field": "value",
"type": "quantitative",
"axis": {
"format": ".2f",
"title": "kwh"
}
}
}
}
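To see why "timeUnit": "day" yields seven bars for eight data points, the grouping can be imitated in plain Python. This is only an illustration of the bucketing idea with the standard datetime module, not Vega-Lite's implementation. Grouping by day of week merges the first and last dates, which fall on the same weekday, while grouping by full calendar date keeps all eight apart.

```python
from datetime import datetime

# The eight dates from the question: 2019-06-21 through 2019-06-28.
dates = [f"2019-06-{day}T00:00:00.000Z" for day in range(21, 29)]
parsed = [datetime.strptime(d, "%Y-%m-%dT%H:%M:%S.%fZ") for d in dates]

# Bucketing by day of week (like "timeUnit": "day").
day_of_week_buckets = {p.weekday() for p in parsed}

# Bucketing by full calendar date (like "timeUnit": "yearmonthdate").
calendar_date_buckets = {p.date() for p in parsed}

# Day-name labels, comparable to "format": "%A" on the axis.
labels = [p.strftime("%A") for p in parsed]

print(len(day_of_week_buckets), "day-of-week buckets")
print(len(calendar_date_buckets), "calendar-date buckets")
print(labels)
```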
Use labelSeparation in the axis configuration:
https://vega.github.io/vega-lite/docs/axis.html#labels
You can do something like:
"labelSeparation": 200
Adjust the value to suit your needs.

How to add legend for single or multi series chart in Vega Lite?

How do I add a legend to a basic chart in Vega?
I'm using Vega in a web app where I want all my charts to include a legend, even if it's a single series.
For example, Google Sheets shows a legend even for a single series.
Since Datum hasn't been implemented yet, I added an extra layer as a workaround. (This also works for multi-series charts if you add additional values to data.values for the rule.)
{
"mark": {
"type": "rule"
},
"data": {
"values": [
{
"color": "Total Units"
}
]
},
"encoding": {
"color": {
"field": "color",
// If you want to update the color of the legend...
"scale": {"range": ["blue", "#000"]},
"sort": false,
"type": "nominal",
"legend": { "title": "" }
}
}
}
For those who want to try a full example in the Vega-Lite editor (https://vega.github.io/editor/#/):
{
"layer": [
{
"mark": "bar",
"data": {
"values": [
{
"goal": 25,
"project": "a",
"score": 25
},
{
"goal": 47,
"project": "b",
"score": 57
},
{
"goal": 30,
"project": "c",
"score": 23
},
{
"goal": 27,
"project": "d",
"score": 19
}
]
},
"encoding": {
"x": {
"type": "nominal",
"field": "project"
},
"y": {
"type": "quantitative",
"field": "score"
}
},
"height": 300,
"width": 400
},
{
"mark": {
"type": "rule"
},
"data": {
"values": [
{
"color": "Goal"
}
]
},
"encoding": {
"color": {
"field": "color",
"sort": false,
"type": "nominal",
"legend": { "title": "" }
}
}
}
]
}

Calculation in vega

I use Vega in Kibana.
I select two values from two different indices in the "data" section. Now I need to sum these values and visualize the result in the "marks" section. Does anybody know how I can do this? Currently the "marks" section uses only the value from the first data source.
My code is the following:
{
"$schema": "https://vega.github.io/schema/vega/v3.0.json",
"title": {
"text": "Lead time, hr.",
"orient": "bottom"
},
"data": [
{
"name": "source_1",
"url": {
"index": "metrics-bitbucket-*",
"%context_query%": "#timestamp",
"body": {
"size": 0,
"aggs": {
"etb": {
"avg": {
"field": "elapsed_time",
"script": {"source": "_value/3600*10"}
}
}
}
}
},
"format": {"type": "json", "property": "aggregations.etb"}
},
{
"name": "source_2",
"url": {
"index": "metrics-jenkins-*",
"%context_query%": "#timestamp",
"body": {
"size": 0,
"aggs": {
"etj": {
"avg": {
"field": "elapsed_time",
"script": {"source": "_value/3600*10"}
}
}
}
}
},
"format": {"type": "json", "property": "aggregations.etj"}
}
],
"marks": {
"type": "text",
"from": {"data": "source_1"},
"encode": {
"update": {
"text": {"signal": "round(datum.value)/10"},
"fontSize": {"value": 60},
"fontStyle": {"value": "bold"},
"x": {"signal": "width/2-50"},
"y": {"signal": "height/2"}
}
}
}
}
You essentially have two lists of data objects: source_1: [{}, {}, ...] and source_2: [{}, {}, {}, ...]. When you draw items, you have to specify just one data source. That data source could be the concatenation of the first two, e.g. you create source_3 with its source parameter set to ["source_1", "source_2"], which includes all elements from both, but I suspect this is not what you want here. Rather, you need to merge the data using a lookup transform: iterate over the items in one data source, and pull the corresponding values from the other data source. Afterwards, add a formula transform to sum the values. The mark would then use the result of the formula for drawing.
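The lookup-then-formula idea can be sketched in plain Python to make the data flow concrete. This is only an imitation of what the two transforms do, not Vega code: lookup and formula here are hypothetical helper functions, and the "id" key and the numbers are made up for illustration. The aggregation results in the question have no shared key, so in a real spec you would probably need to add a constant key to both data sources first (an assumption on my part).

```python
def lookup(primary, secondary, key, fields):
    """For each primary row, copy selected fields from the matching secondary row."""
    index = {row[key]: row for row in secondary}
    merged = []
    for row in primary:
        match = index.get(row[key], {})
        new_row = dict(row)
        for source_field, target_field in fields.items():
            new_row[target_field] = match.get(source_field)
        merged.append(new_row)
    return merged


def formula(rows, as_name, fn):
    """Add a computed field to every row."""
    return [{**row, as_name: fn(row)} for row in rows]


# Made-up stand-ins for the two aggregation results.
source_1 = [{"id": 1, "value": 125.0}]
source_2 = [{"id": 1, "value": 40.0}]

# Pull source_2's value into source_1's rows, then sum the two values.
merged = lookup(source_1, source_2, "id", {"value": "value_2"})
result = formula(merged, "total", lambda d: d["value"] + d["value_2"])
print(result)
```

The text mark would then read the summed field (here called total) from the merged data source instead of datum.value from source_1.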