Tileset union aggregation sum results are incorrect and depend on zoom levels

I'm trying to create summary census statistics at different zoom levels for a state in the USA (census block group, tract, county and state). The total population is aggregated with a union, grouped by a concatenated key built from state, county, tract and census block group, where the parts of the key that are used depend on the zoom level. I've found that the totals can be incorrect if I set the zoom thresholds too low, e.g. for the census block group level (i.e. when I want to show more detail at lower zoom levels).
For example, testing the recipe below for Rhode Island gave me a total population of 343,405 for the state instead of the correct number of 1,097,379, which I've double-checked. No errors were shown in the tile service after using the tilesets API to upload. If I change the zoom thresholds to e.g.
>= 0 for STATEFP
>= 5 for COUNTYFP
>= 8 for TRACTCE
>= 10 for BLKGRPCE
then the total population for the state changes to the correct value of 1,097,379. What am I missing here? How can data in the GeoJSON dataset just get ignored?
Thanks
David
{
"version": 1,
"layers": {
"NAME": {
"source": SOURCE,
"minzoom": 0,
"maxzoom": 10,
"features": {
"simplification": {
"outward_only": true,
"distance": 1
},
"attributes": {
"set": {
"key": [
"concat",
[
"case",
[
">=",
[
"zoom"
],
0
],
[
"get",
"STATEFP"
],
""
],
[
"case",
[
">=",
[
"zoom"
],
3
],
[
"get",
"COUNTYFP"
],
""
],
[
"case",
[
">=",
[
"zoom"
],
5
],
[
"get",
"TRACTCE"
],
""
],
[
"case",
[
">=",
[
"zoom"
],
8
],
[
"get",
"BLKGRPCE"
],
""
]
]
},
"allowed_output": ["tpop","STATEFP","COUNTYFP","TRACTCE","BLKGRPCE"]
}
},
"tiles": {
"union": [
{
"group_by": [
"key"
],
"aggregate": {
"tpop": "sum"
},
"simplification": {
"distance": 4,
"outward_only": false
}
}
],
"layer_size": 2500
}
}
}
}
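To make the recipe's key expression easier to reason about, here is a minimal JavaScript sketch of what the concat/case expression and the union's sum appear to compute at a given zoom. The helper names (keyAtZoom, sumByKey) and the sample features with their tpop values are made up for illustration; the sketch only mimics the recipe's logic and says nothing about how the tiling service is implemented.

```javascript
// Zoom thresholds copied from the recipe's "case" expressions.
const KEY_PARTS = [
  ["STATEFP", 0],
  ["COUNTYFP", 3],
  ["TRACTCE", 5],
  ["BLKGRPCE", 8],
];

// Mirrors ["concat", ["case", [">=", ["zoom"], n], ["get", field], ""], ...].
function keyAtZoom(properties, zoom) {
  return KEY_PARTS
    .map(([field, minZoom]) => (zoom >= minZoom ? properties[field] : ""))
    .join("");
}

// Mirrors the union step: group features by key and sum tpop per group.
function sumByKey(features, zoom) {
  const totals = new Map();
  for (const { properties } of features) {
    const key = keyAtZoom(properties, zoom);
    totals.set(key, (totals.get(key) || 0) + properties.tpop);
  }
  return totals;
}

// Hypothetical block groups (made-up populations, not census data).
const sampleFeatures = [
  { properties: { STATEFP: "44", COUNTYFP: "001", TRACTCE: "030100", BLKGRPCE: "1", tpop: 100 } },
  { properties: { STATEFP: "44", COUNTYFP: "001", TRACTCE: "030100", BLKGRPCE: "2", tpop: 200 } },
  { properties: { STATEFP: "44", COUNTYFP: "003", TRACTCE: "020101", BLKGRPCE: "1", tpop: 300 } },
];

console.log(sumByKey(sampleFeatures, 0)); // one group keyed "44"
console.log(sumByKey(sampleFeatures, 3)); // one group per county
```

In this model, as long as every feature is present at a given zoom, the per-key sums add up to the same state total at every zoom level, which is the behaviour the question expects from the tileset.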


booleanPointInPolygon is returning wrong values at the border

I have the following polygon displayed on a map:
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-125.59984949809844,
45.262153541142055
],
[
-64.97100463461506,
39.503280047917194
],
[
-71.53494497281665,
25.360849581306127
],
[
-121.81059696453559,
26.995032595715646
],
[
-125.59984949809844,
45.262153541142055
]
]
]
},
"properties": {}
}
Calling booleanPointInPolygon on two points gives:
console.log(booleanPointInPolygon([-98.65195, 49.42827], polygon)); //logs false
console.log(booleanPointInPolygon([-106.53965, 27.69895], polygon)); //logs true
I expected the opposite results. I'm pretty sure my data is in the right form, [longitude, latitude]. What is giving me the wrong output?
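As a way to investigate, the following is a minimal sketch of a planar ray-casting test that treats longitude and latitude as plain x/y coordinates. The pointInRing helper is hypothetical and is not Turf's implementation; it is only meant to show what a purely planar check returns for the polygon above.

```javascript
// Planar ray-casting point-in-polygon test on a single closed ring.
// Coordinates are treated as flat [x, y] = [longitude, latitude] pairs.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray going right from the point cross edge (j -> i)?
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of the polygon from the question.
const ring = [
  [-125.59984949809844, 45.262153541142055],
  [-64.97100463461506, 39.503280047917194],
  [-71.53494497281665, 25.360849581306127],
  [-121.81059696453559, 26.995032595715646],
  [-125.59984949809844, 45.262153541142055],
];

console.log(pointInRing([-98.65195, 49.42827], ring)); // false
console.log(pointInRing([-106.53965, 27.69895], ring)); // true
```

The planar test agrees with the logged values: the first point's latitude is above every vertex of the ring, so it cannot be inside when edges are straight lines in longitude/latitude space. If the map draws the polygon's edges as geodesic (great-circle) curves, which is an assumption about the map setup, the drawn outline would bow northwards and could differ from the straight segments tested here.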

Query an array element in an JSONB Object

I have a jsonb column called data in a table called reports. Here is what report.id = 1 looks like:
[
{
"Product": [
{
"productIDs": [
"ABC1",
"ABC2"
],
"groupID": "Food123"
},
{
"productIDs": [
"EFG1"
],
"groupID": "Electronic123"
}
],
"Package": [
{
"groupID": "Electronic123"
}
],
"type": "Produce"
},
{
"Product": [
{
"productIDs": [
"ABC1",
"ABC2"
],
"groupID": "Clothes123"
}
],
"Package": [
{
"groupID": "Food123"
}
],
"type": "Wearables"
}
]
and here is what report.id = 2 looks like:
[
{
"Product": [
{
"productIDs": [
"XYZ1",
"XYZ2"
],
"groupID": "Food123"
}
],
"Package": [],
"type": "Wearable"
},
{
"Product": [
{
"productIDs": [
"ABC1",
"ABC2"
],
"groupID": "Clothes123"
}
],
"Package": [
{
"groupID": "Food123"
}
],
"type": "Wearables"
}
]
I am trying to get a list of all entries in the reports table where at least one element of the data column has the following:
type = Produce AND
where any elements of Product array OR any elements of Product array's groupID start with Food
So for the example above, the query should return only the first index, since:
The type = Produce
groupID starts with Food for the first element of the Product array
The second index will be filtered out because its type is not Produce.
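To pin down the intended semantics, the condition can be sketched as a JavaScript predicate over one report's data array. This is only an illustration: reportMatches is a hypothetical helper, the sample arrays are trimmed copies of the two reports above, and only Product groupIDs are checked, following the worked example.

```javascript
// True when a single element of the report's data array has type "Produce"
// AND at least one Product entry whose groupID starts with "Food".
function reportMatches(data) {
  return data.some(
    (item) =>
      item.type === "Produce" &&
      (item.Product || []).some((p) => String(p.groupID).startsWith("Food"))
  );
}

// Trimmed copies of the two sample reports (productIDs and Package omitted).
const report1 = [
  { type: "Produce", Product: [{ groupID: "Food123" }, { groupID: "Electronic123" }] },
  { type: "Wearables", Product: [{ groupID: "Clothes123" }] },
];
const report2 = [
  { type: "Wearable", Product: [{ groupID: "Food123" }] },
  { type: "Wearables", Product: [{ groupID: "Clothes123" }] },
];

console.log(reportMatches(report1)); // true
console.log(reportMatches(report2)); // false
```

The point of the sketch is that both conditions are tested on the same array element, so a report with a Food groupID in one element and type Produce in another would not match.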
I am not sure how to add the AND condition on groupID to the query. Here is what I have tried to get all entries of type Produce:
select * from reports r, jsonb_to_recordset(r.data) as items(type text) where items.type like 'Produce';
Sample structure and result: dbfiddle
select r.*
from reports r
cross join jsonb_array_elements(r.data) l1
cross join jsonb_array_elements(l1.value -> 'Product') l2
where l1.value ->> 'type' = 'Produce'
and l2.value ->> 'groupID' ~ '^Food';

Getting the last datum in a vega dataset

I have a data source A and I'd like to create a new data source B containing just the last element of A. What is the best way to do this in Vega?
This is relatively straightforward to do, although I am slightly confused by your use of "max" in the aggregation, since that isn't necessarily the last value.
Either way, here is my solution for obtaining the last value in a dataset, using this series of transforms:
"transform": [
  {
    "type": "window",
    "ops": ["row_number"]
  },
  {
    "type": "joinaggregate",
    "fields": ["row_number"],
    "ops": ["max"],
    "as": ["max_row_number"]
  },
  {
    "type": "filter",
    "expr": "datum.row_number == datum.max_row_number"
  }
]
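To see what these three transforms compute, here is a small plain-JavaScript sketch that mimics them on an array of rows. It does not use Vega itself; lastDatum and the sample rows are hypothetical and only illustrate the numbering, max and filter steps.

```javascript
// Mimics window(row_number) -> joinaggregate(max) -> filter on a plain array.
function lastDatum(rows) {
  // window + row_number: number the rows 1..n in their current order.
  const numbered = rows.map((row, i) => ({ ...row, row_number: i + 1 }));
  // joinaggregate + max: attach the largest row_number to every row.
  const maxRowNumber = numbered.reduce((m, r) => Math.max(m, r.row_number), 0);
  const joined = numbered.map((r) => ({ ...r, max_row_number: maxRowNumber }));
  // filter: keep only the row whose number equals the maximum.
  return joined.filter((r) => r.row_number === r.max_row_number);
}

const sampleRows = [{ v: "a" }, { v: "b" }, { v: "c" }]; // made-up data
console.log(lastDatum(sampleRows)); // a single row: v "c", row_number 3, max_row_number 3
```

Because the row numbers follow the current order of the data, the result is the last row as the dataset is ordered at that point, not the row with the largest value of any particular field.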
I was able to get this working in the Vega Editor using the following:
{
"$schema": "https://vega.github.io/schema/vega/v5.json",
"data": [
{
"name": "source",
"url": "https://raw.githubusercontent.com/vega/vega/master/docs/data/cars.json",
"transform": [
{
"type": "filter",
"expr": "datum['Horsepower'] != null && datum['Miles_per_Gallon'] != null && datum['Acceleration'] != null"
}
]
},
{
"name": "avg",
"source":"source",
"transform":[
{
"type":"aggregate",
"groupby":["Horsepower"],
"ops": ["average"],
"fields":["Miles_per_Gallon"],
"as":["Avg_Miles_per_Gallon"]
}
]
},
{
"name":"last",
"source": "avg",
"transform": [
{
"type": "aggregate",
"ops": ["max"],
"fields": ["Horsepower"],
"as": ["maxHorsepower"]
},
{
"type": "lookup",
"from": "avg",
"key": "Horsepower",
"fields": ["maxHorsepower"],
"values": ["Horsepower","Avg_Miles_per_Gallon"]
}
]
}
]
}
maxHorsepower  Horsepower  Avg_Miles_per_Gallon
230            230         16
I'd be interested to know if there are better ways, but this worked for me.

find object in nested array with lodash

I have json data similar to this:
{
"Sections": [
{
"Categories": [
{
"Name": "Book",
"Id": 1,
"Options": [
{
"Name": "AAAA",
"OptionId": 111
}
],
"Selected": 0
},
{
"Name": "Car",
"Id": 2,
"Options": [
{
"Name": "BBB",
"OptionId": 222
}
],
"Selected": 0
}
],
"SectionName": "Main"
},
... more sections like the one above
]
}
Given this data, I want to find a category inside a section based on its (Category) Id and set its selected option. I tried this, but couldn't get it to work. Note that the Category Id is unique across the whole data set.
_.find(model.Sections, { Categories: [ { Id: catId } ]});
According to your data model, it looks like you're trying to find an element that is inside a matrix: Sections can have multiple Categories, and a Category can have multiple types (car, book...).
I'm afraid there isn't a function in lodash that allows a deep find, so you'll have to implement it the 'traditional' way (a couple of for loops).
Here is a solution that is a bit more 'functional flavoured' than the traditional nested for loops. It takes advantage of the fact that explicitly returning false from the callback of lodash's _.forEach stops the iteration. Thus, once an element with the provided id is found, the loop ends and the element is returned (if it's not found, undefined is returned instead).
Hope it helps.
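// Returns the first category whose Id matches, or undefined if none does.
const findCategoryById = (sections, id) => {
  let category;
  _.forEach(sections, (section) => {
    category = _.find(section.Categories, ['Id', id]);
    // Returning false stops _.forEach, so keep iterating only while nothing was found.
    return _.isUndefined(category);
  });
  return category;
};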
const ex = {
"Sections": [{
"Categories": [{
"Name": "Book",
"Id": 1,
"Options": [{
"Name": "AAAA",
"OptionId": 111
}],
"Selected": 0
},
{
"Name": "Car",
"Id": 2,
"Options": [{
"Name": "BBB",
"OptionId": 222
}],
"Selected": 0
}
],
"SectionName": "Main"
}]
};
console.log(findCategoryById(ex.Sections, 2));
<script src="https://cdn.jsdelivr.net/npm/lodash@4.17.5/lodash.min.js"></script>
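As an alternative sketch without lodash, the same lookup can be written with the built-in Array.prototype.flatMap and find, followed by the "set its selected option" step from the question. The helper names findCategory and selectOption are hypothetical, and storing the chosen OptionId in Selected is an assumption, since the question does not say what value Selected should hold.

```javascript
// Flatten the categories of all sections into one list, then find by Id.
function findCategory(sections, id) {
  return sections
    .flatMap((section) => section.Categories)
    .find((category) => category.Id === id);
}

// Assumption: "Selected" stores the OptionId of the chosen option.
function selectOption(sections, categoryId, optionId) {
  const category = findCategory(sections, categoryId);
  if (category !== undefined) {
    category.Selected = optionId; // mutates the category in place
  }
  return category;
}

const model = {
  Sections: [
    {
      Categories: [
        { Name: "Book", Id: 1, Options: [{ Name: "AAAA", OptionId: 111 }], Selected: 0 },
        { Name: "Car", Id: 2, Options: [{ Name: "BBB", OptionId: 222 }], Selected: 0 },
      ],
      SectionName: "Main",
    },
  ],
};

selectOption(model.Sections, 2, 222);
console.log(model.Sections[0].Categories[1].Selected); // 222
```

Unlike the forEach version, this builds the flattened list before searching, which is simpler to read but does a little more work on large data sets.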

Improve search result based on field boost in elasticsearch

I am using ElasticSearch 1.7 for the first time, and I have set up weights based on fields. The weights might change as requirements change. My query returns results, but the issue is that even if I change a field weight dramatically, I don't see much effect on the ranking of the records. Please check my query below and let me know if I am doing anything wrong.
ElasticSearch Query :
{
"from": 0,
"size": 10,
"highlight": {
"pre_tags": [
"<b>"
],
"post_tags": [
"</b>"
],
"fields": {
"title": {},
"description": {}
}
},
"query": {
"function_score": {
"query": {
"query_string": {
"query": "any keyword",
"fields": [
"fullText",
"title^100",
"authors^4",
"pubYear^4",
"publisher^4",
"abstract^2",
"documentTypeName^2",
"topic^6",
"topicSynonym^6",
"quality_value^6",
"domain^2"
],
"default_operator": "AND",
"analyze_wildcard": true
}
},
"score_mode": "sum",
"boost_mode": "sum",
"max_boost": 100
}
}
}
Sample Data:
{
"took": 44,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1465,
"max_score": 14.961364,
"hits": [
{
"_index": "snData",
"_type": "report",
"_id": "159",
"_score": 14.961364,
"_source": {
"str_ID": "159",
"topic": [
"strategy",
"Consumer-targeted strategy"
],
"topicSynonym": [
"assistance",
"coping",
"coping strategies",
"encouragement",
"support"
],
"fullText": "Background: As the incidence and prevalence of prostate cancer continue to rise, the number of men needing help and support to assist them in coping with disease and treatment-related symptoms and their psychosocial effects is likely to increase.",
"quality_value": 1,
"ID": 24034,
"title": "Psychosocial interventions for men with prostate cancer",
"authors": "Parahoo K E Noyes",
"pubYear": "2013",
"publisher": "",
"abstractEN": "Background: As the incidence and prevalence of prostate cancer continue to rise, the number of men needing help and support to assist them in coping with disease and treatment-related symptoms and their psychosocial effects is likely to increase.",
"uniqueID": "",
"documentTypeName": "Review of effects",
"viewCount": 28
},
"highlight": {
"title": [
"Interventions for men with prostate <b>cancer</b>"
],
"abstract": [
"Background: As the incidence and prevalence of prostate <b>cancer</b> continue to rise, the number of men"
]
}
}
]
}
}
Use case: if I change the weight of quality_value or of any other field, the results should change according to the field weights. I am not sure whether my query is correct or whether I am missing something. I am using ElasticSearch 1.7.