Snowflake querying items of list of jsons - sql

I'm looking to query all items inside an array of JSON objects, similar to what Elasticsearch does with its nested data type.
For example, every row in my table looks like this, where a class has a column named students with a list of students:
{
"class": "history",
"students": [
{"first_name": "joe", "last_name": "doe", "age": 16},
{"first_name": "tony", "last_name": "helen", "age": 10},
{"first_name": "erica", "last_name": "kran", "age": 12}
]
}
{
"class": "math",
"students": [
{"first_name": "joe", "last_name": "no", "age": 16},
{"first_name": "yo", "last_name": "wha", "age": 18},
{"first_name": "dan", "last_name": "test", "age": 12}
]
}
I want to write a query that can look inside each item in the list, for example:
Get all classes where there is a student named joe who is over the age of 14 and there is a student named erica who is under the age of 14.
This query should only return the "history" class, since it's the only class that has students with these properties.
EDIT: I edited my example data to illustrate the problem:
I'm looking for classes that have a student named joe over 14 AND a student named erica under 14.
Only the first class should match, since it satisfies both conditions; the second class does not, because it doesn't have an erica, although there is a joe.

The equivalent with pure SQL and a flatten would be to group the flattened students by class and require both conditions:
with data as (
select parse_json('{
"class": "history",
"students": [
{"first_name": "joe", "last_name": "doe", "age": 16},
{"first_name": "tony", "last_name": "helen", "age": 10},
{"first_name": "erica", "last_name": "kran", "age": 12}
]
}') x
union all select parse_json('{
"class": "math",
"students": [
{"first_name": "joe", "last_name": "no", "age": 16},
{"first_name": "yo", "last_name": "wha", "age": 18},
{"first_name": "dan", "last_name": "test", "age": 12}
]
}')
)
select x:class::string
from data, table(flatten(x:students)) y
group by 1
-- each condition must hold for at least one student of the class
having count_if(y.value:first_name = 'joe' and y.value:age > 14) > 0
and count_if(y.value:first_name = 'erica' and y.value:age < 14) > 0
;
"history"

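The requirement in this thread is per class: at least one student must satisfy the first condition and at least one (possibly different) student must satisfy the second. Checking a single flattened student against "condition A or condition B" is not the same thing, because a class with only a matching joe would also pass. A minimal Python sketch of the intended semantics, using the question's sample data (the helper name has_student is hypothetical and this is not Snowflake code):

```python
# Sample rows from the question, as plain Python dicts.
classes = [
    {"class": "history", "students": [
        {"first_name": "joe", "last_name": "doe", "age": 16},
        {"first_name": "tony", "last_name": "helen", "age": 10},
        {"first_name": "erica", "last_name": "kran", "age": 12},
    ]},
    {"class": "math", "students": [
        {"first_name": "joe", "last_name": "no", "age": 16},
        {"first_name": "yo", "last_name": "wha", "age": 18},
        {"first_name": "dan", "last_name": "test", "age": 12},
    ]},
]


def has_student(students, first_name, age_ok):
    """True if at least one student has this name and passes the age test."""
    return any(
        s["first_name"] == first_name and age_ok(s["age"]) for s in students
    )


# Both "exists" checks must hold for the same class.
matching = [
    row["class"]
    for row in classes
    if has_student(row["students"], "joe", lambda age: age > 14)
    and has_student(row["students"], "erica", lambda age: age < 14)
]

print(matching)  # ['history']
```

The math class is dropped because it has a joe over 14 but no erica at all.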
Related

Find authors with age under 40 in my postgresql table

I have the following table with one row. I have tried to query this table to return authors under the age of 40 and have been unable to do so.
CREATE TABLE bookstuff (
data JSON
);
insert into bookstuff(data)
VALUES('
{
"the_books": {
"publishers": [
{
"name": "Dunder Mifflin",
"address": "Scranton, PA",
"country": "USA",
"CEO": "David Wallace"
},
{
"name": "Sabre",
"address": "Tallahassee, FL",
"country": "USA",
"CEO": "Jo Bennett"
},
{
"name": "Michael Scott Paper company",
"address": "Scranton, PA",
"country": "USA",
"CEO": "Michael Gary Scott"
},
{
"name": "Vance Refrigeration",
"address": "Scranton, PA",
"country": "USA",
"CEO": "Bob Vance"
}
],
"authors": [
{
"id": 1,
"name": "Michael Scott",
"age": 45,
"country": "USA",
"agentname": "Jan Levinson",
"books_written": "book1"
},
{
"id": 2,
"name": "Jim Halpert",
"age": 35,
"country": "USA",
"agentname": "Pam Beesly",
"books_written": "book3"
},
{
"id": 3,
"name": "Dwight Schrute",
"age": 40,
"country": "USA",
"agentname": "Angela Martin",
"books_written": "book2"
},
{
"id": 4,
"name": "Pam Halpert",
"age": 35,
"country": "USA",
"agentname": "Angela Martin",
"books_written": "book4"
}
],
"books": [
{
"isbn": "book1",
"title": "The Dundies",
"price": 10.99,
"year": 2005,
"publishername": "Dunder Mifflin"
},
{
"isbn": "book2",
"title": "Bears, Beets, Battlestar Galactica",
"price": 8.99,
"year": 2006,
"publishername": "Dunder Mifflin"
},
{
"isbn": "book3",
"title": "The Sabre Store",
"price": 12.99,
"year": 2007,
"publishername": "Sabre"
},
{
"isbn": "book4",
"title": "Branch Wars",
"price": 14.99,
"year": 2015,
"publishername": "Sabre"
}
]
}
}');
I have tried the following query to get the author's age
SELECT data->'the_books'->'authors'
FROM bookstuff
WHERE (data->'the_books'->'authors'->>'age')::integer > 40;
I expect it to return two values, 'Jim Halpert' and 'Pam Halpert', but instead I get no result back, not even null.
I have also tried this query, just to see if I could get anything back at all from the table, and still got no results:
SELECT data->'the_books'->'authors'
FROM bookstuff
where (data->'the_books'->'authors'->'name')::jsonb ? 'Michael Scott';
I'm new to PostgreSQL. Is there a different way I should be going about this?
Using json_array_elements:
select (v -> 'name')#>>'{}' from bookstuff b
cross join json_array_elements(b.data -> 'the_books' -> 'authors') v
where ((v -> 'age')#>>'{}')::int < 40
See fiddle
Another option, slightly more verbose:
select distinct(author->>'name') as author_name from
(select json_array_elements(b.data->'the_books'->'authors') author from bookstuff b) author
where (author->>'age')::int < 40
The distinct might be unnecessary if you really just have one database row and no duplicates in the authors array of that row.
Three reasons why your last query doesn't work:
where filters out rows, and it is evaluated before the select list. Your table has a single row that contains everything, so the filter can only keep or drop the whole document.
The ? operator answers "Does the key/element string exist within the JSON value?", so it matches an array that contains your chosen value as a top-level element. You don't have a simple array of strings here, and array->'key' doesn't collect that attribute from each element into a new array; applied to an array it yields null.
Your select list is never evaluated because no row passes the filter, but even if it were, it would return the whole authors array (remember that where doesn't transform anything, it only filters out rows).
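The difference between filtering the row and filtering the array elements can be sketched in plain Python. This is only an illustration of the logic, not PostgreSQL code; json_get is a hypothetical helper that mimics how -> with a text key behaves, and the data is the authors array from the question reduced to name and age:

```python
# The single row from the question, reduced to the fields used here.
row = {"the_books": {"authors": [
    {"name": "Michael Scott", "age": 45},
    {"name": "Jim Halpert", "age": 35},
    {"name": "Dwight Schrute", "age": 40},
    {"name": "Pam Halpert", "age": 35},
]}}


def json_get(value, key):
    """Mimic -> with a text key: only objects have keys, anything else gives None."""
    return value.get(key) if isinstance(value, dict) else None


authors = json_get(json_get(row, "the_books"), "authors")

# Row-level attempt: the array itself has no "age" key, so there is
# nothing to compare and the one and only row is filtered out.
age_of_array = json_get(authors, "age")

# Element-level filter: expand the array first, then test each author.
under_40 = [a["name"] for a in authors if a["age"] < 40]

print(age_of_array)  # None
print(under_40)      # ['Jim Halpert', 'Pam Halpert']
```

Expanding the array into one item per author before filtering is the step that json_array_elements performs in the answers above.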

Dataweave: How to concatenate values for a certain key in json array

My input is
[
{
"Id": 5,
"FirstName": "ALEX",
"LastName": "JOHNSON"
},
{
"Id": 4,
"FirstName": "BOB",
"LastName": "BROWN"
},
{
"Id": 2,
"FirstName": "JANE",
"LastName": "DOE"
},
{
"Id": 1,
"FirstName": "JOHN",
"LastName": "SMITH"
},
{
"Id": 6,
"FirstName": "JOHN",
"LastName": "WILKINS"
},
{
"Id": 3,
"FirstName": "TIMOTHY",
"LastName": "WALTERS"
}
]
The output I want is a string concatenating all the FirstName values in the order in which they are listed in the input:
"ALEX, BOB, JANE, JOHN, JOHN, TIMOTHY"
I'm new to DataWeave and not sure how to do this.
Thanks in advance.
You can also try it with the descendants selector:
%dw 2.0
output application/json
---
payload..FirstName joinBy ", "
One way to do it is to first map each element to just its FirstName value and then use the joinBy() function to concatenate the values, separated by a comma:
%dw 2.0
output application/json
---
payload map $.FirstName joinBy ", "
You can alternatively use the reduce() function.
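For readers who want to see the two approaches side by side, here is a minimal Python sketch of the same logic (map then join, and a reduce that builds the string step by step). It is an illustration in Python rather than DataWeave, and the variable names are made up for the example:

```python
from functools import reduce

# Input from the question.
payload = [
    {"Id": 5, "FirstName": "ALEX", "LastName": "JOHNSON"},
    {"Id": 4, "FirstName": "BOB", "LastName": "BROWN"},
    {"Id": 2, "FirstName": "JANE", "LastName": "DOE"},
    {"Id": 1, "FirstName": "JOHN", "LastName": "SMITH"},
    {"Id": 6, "FirstName": "JOHN", "LastName": "WILKINS"},
    {"Id": 3, "FirstName": "TIMOTHY", "LastName": "WALTERS"},
]

# Map each record to its FirstName, keeping the input order.
first_names = [person["FirstName"] for person in payload]

# Join approach: one call concatenates everything with the separator.
joined = ", ".join(first_names)

# Reduce approach: fold the names into an accumulator string.
# Without an initial value this raises TypeError on an empty list.
reduced = reduce(lambda acc, name: f"{acc}, {name}", first_names)

print(joined)   # ALEX, BOB, JANE, JOHN, JOHN, TIMOTHY
print(reduced)  # ALEX, BOB, JANE, JOHN, JOHN, TIMOTHY
```

The join version is usually the simpler choice; the reduce version is mainly useful when the accumulator needs extra logic per element.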

Unflatten an Array in Mulesoft Dataweave 2.0

I have a flat object of customers and could not find an appropriate way to unflatten it so that each customer and their age become nested.
INPUT
{
"1st Customer": "2216",
"Age": "90",
"2nd Customer": "2231",
"Age": "90",
"3rd Customer": "2249",
"Age": "120",
"4th Customer": "2302",
"Age": "150",
"5th Customer": "",
"Age": ""
}
OUTPUT
{
"customers": [
{
"CustomerSeq": "1",
"CustomerID": "2216",
"Age": 90
},
{
"CustomerSeq": "2",
"CustomerID": "2231",
"Age": 90
},
{
"CustomerSeq": "3",
"CustomerID": "2249",
"Age": 120
},
{
"CustomerSeq": "4",
"CustomerID": "2302",
"Age": 150
}
]
}
Thanks a lot, Alex.
divideBy is exactly what I was looking for.
The solution (including the removal of the empty values):
%dw 2.0
output application/json
import * from dw::core::Objects
---
customers:
((payload filterObject ((value, key, index) -> value != "")) divideBy 2) map (
(pCustomer, indexOfpCustomer) -> {
CustomerSeq:indexOfpCustomer+1,
CustomerID: pCustomer[0],
Age: pCustomer[1]
})
There is a good function for this in Mule: divideBy.
It splits the array of values plucked from the object into chunks of two, and the requested objects can then be built from each chunk:
%dw 2.0
var x={
"1st Customer": "2216",
"Age": "90",
"2nd Customer": "2231",
"Age": "90",
"3rd Customer": "2249",
"Age": "120",
"4th Customer": "2302",
"Age": "150",
"5th Customer": "",
"Age": ""
}
var keys=x pluck $$
var values=x pluck $
import * from dw::core::Arrays
output application/dw
---
values divideBy 2 map (item,index) ->
{
CustomerSeq:index+1,
CustomerID:item[0],
Age:item[1]
}
output
[
{
CustomerSeq: 1,
CustomerID: "2216",
Age: "90"
},
{
CustomerSeq: 2,
CustomerID: "2231",
Age: "90"
},
{
CustomerSeq: 3,
CustomerID: "2249",
Age: "120"
},
{
CustomerSeq: 4,
CustomerID: "2302",
Age: "150"
},
{
CustomerSeq: 5,
CustomerID: "",
Age: ""
}
]
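The chunking idea behind divideBy can be sketched in Python as follows. This is an illustration of the logic rather than DataWeave. One detail worth knowing: Python's json module keeps only the last value of a duplicated key by default, so the sketch passes object_pairs_hook=list to preserve every "Age" entry in order. Like the filterObject solution above, it assumes an empty customer ID always comes with an empty age.

```python
import json

# Input from the question; note the repeated "Age" key.
raw = """
{
  "1st Customer": "2216", "Age": "90",
  "2nd Customer": "2231", "Age": "90",
  "3rd Customer": "2249", "Age": "120",
  "4th Customer": "2302", "Age": "150",
  "5th Customer": "", "Age": ""
}
"""

# Keep all (key, value) pairs in order instead of collapsing duplicate keys.
pairs = json.loads(raw, object_pairs_hook=list)

# Drop empty values, then split what is left into chunks of two (ID, age).
values = [value for _key, value in pairs if value != ""]
chunks = [values[i:i + 2] for i in range(0, len(values), 2)]

result = {
    "customers": [
        {"CustomerSeq": str(index + 1), "CustomerID": chunk[0], "Age": int(chunk[1])}
        for index, chunk in enumerate(chunks)
    ]
}

print(json.dumps(result, indent=2))
```

This yields four customers with sequence numbers "1" to "4" and numeric ages, matching the shape of the requested output.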

Select game scores with a player's context from Postgres

I am developing a web app with Rails and Postgres as the backend for a shooting range game. It has an endpoint to save players' scores, which writes the scores to the DB and returns the first three places plus each player's rank with some context around it: the place before and the place after. For example, if a player has rank 23, the method should return the 1st, 2nd and 3rd places, the player's own rank, and the two records with ranks 22 and 24 beside it. I don't store the rank in the DB; it is calculated dynamically using the following SQL query:
SELECT RANK() OVER(ORDER BY score DESC) AS place, name, score
FROM scores
WHERE game_id=? AND rules_version=? AND game_mode=?
LIMIT 10
A POST request with the following data:
{
"players": [
{ "name": "Jack", "score": 100, "local_id": 1, "stats": {}},
{ "name": "Anna", "score": 200, "local_id": 2, "stats": {}}
]
}
should return the following result set:
{
"scores": [
{
"name": "Piter",
"place": 1,
"score": 800
},
{
"name": "Lisa",
"place": 2,
"score": 700
},
{
"name": "Philip",
"place": 3,
"score": 600
},
{
"name": "Max",
"place": 25,
"score": 204
},
{
"name": "Anna",
"place": 26,
"score": 200,
"local_id": 2
},
{
"name": "Ashley",
"place": 27,
"score": 193
},
{
"name": "Norman",
"place": 36,
"score": 103
},
{
"name": "Jack",
"place": 37,
"score": 100,
"local_id": 1
},
{
"name": "Chris",
"place": 38,
"score": 95
}
]
}
Every field except local_id and, as I said, place is stored in the DB. I can't figure out the right SQL query for such a select. Please help.
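One possible shape for such a query is to compute the ranks in a CTE first and only then filter on place, since a window function result cannot be filtered in the WHERE clause of the same SELECT. The sketch below runs against an in-memory SQLite database (3.25 or newer, for window functions) purely to make it executable; the WITH / RANK() OVER / EXISTS syntax is standard SQL and should carry over to Postgres. Several things here are assumptions made for the example: the table has only name and score, the game_id, rules_version and game_mode filters are omitted, players are identified by name, and the filler rows (Bob, Eve, Zed) are made up. With tied scores RANK() leaves gaps, so "place minus one" may not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [
        ("Piter", 800), ("Lisa", 700), ("Philip", 600), ("Bob", 500),
        ("Max", 204), ("Anna", 200), ("Ashley", 193), ("Eve", 150),
        ("Norman", 103), ("Jack", 100), ("Chris", 95), ("Zed", 50),
    ],
)


def top_three_with_context(conn, player_names):
    """Return the top 3 places plus each given player's place and its neighbours."""
    placeholders = ", ".join("?" for _ in player_names)
    query = f"""
        WITH ranked AS (
            -- rank every score first; filtering happens afterwards
            SELECT RANK() OVER (ORDER BY score DESC) AS place, name, score
            FROM scores
        ),
        mine AS (
            -- places of the players from the request
            SELECT place FROM ranked WHERE name IN ({placeholders})
        )
        SELECT r.place, r.name, r.score
        FROM ranked AS r
        WHERE r.place <= 3
           OR EXISTS (
               SELECT 1 FROM mine AS m
               WHERE r.place BETWEEN m.place - 1 AND m.place + 1
           )
        ORDER BY r.place
    """
    return conn.execute(query, list(player_names)).fetchall()


rows = top_three_with_context(conn, ["Anna", "Jack"])
for place, name, score in rows:
    print(place, name, score)
```

With this sample data the result contains places 1 to 3, then 5 to 7 around Anna and 9 to 11 around Jack. The local_id values are not stored, so they would have to be attached to the matching rows in application code.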

Inserting and removing values within a collection array

I'm new to working with MongoDB using Express. I currently have a collection whose documents contain an array within an object. The array is meant to hold an unlimited number of values.
My question is when I add a new item to that array in the collection, do I always have to pass all the values in the object?
For example, with the following collection. Say I wanted to add a new contact.
{
"owner": "Tom Smith",
"age": "29",
"contacts": [
{
"firstname": "Fred",
"lastname": "Anderson",
"age": "22"
},
{
"firstname": "Linda",
"lastname": "Smith",
"age": "32"
},
{
"firstname": "Tom",
"lastname": "James",
"age": "42"
},
{
"firstname": "Cal",
"lastname": "Hallaway",
"age": "57"
}
],
"city": "New York"
}
Do I need to explicitly declare all my values in the object I pass to the end point?
Example:
obj.owner = 'Tom Smith';
obj.age = '29';
obj.contacts.firstname = 'Fred';
obj.contacts.lastname = 'Anderson';
obj.contacts.age = '22';
... etc.
and then add my new contact and push the full object to the endpoint to update?
Is there a way that I can just add a new contact without pushing all the data that already exists in the collection?
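To add a new item to a nested array without resending the rest of the document, use $push:
Model.findOneAndUpdate({
// filter: which document to update
_id: 'THE ID OF YOUR DOCUMENT'
}, {
// $push appends one element to contacts; all other fields stay untouched
$push: {
contacts: {
firstname: 'TOTO',
lastname: 'TITI',
age: 42,
},
},
});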