Sum of column values ​inside Json array with group by - sql

I have next Django model.
class StocksHistory(models.Model):
wh_data = models.JsonField()
created_at = models.DateTimeField()
I store JSON data in wh_data.
[
{
"id":4124124,
"stocks":[
{
"wh":507,
"qty":2
},
{
"wh":2737,
"qty":1
}
],
},
{
"id":746457457,
"stocks":[
{
"wh":507,
"qty":3
}
]
}
]
Note: it's data for one row - 2022-06-06.
I need to calculate the sum inside stocks by grouping them by wh and by created_at so that the output is something like this
[
{
"wh":507,
"qty":5,
"created_at":"2022-06-06"
},
{
"wh":2737,
"qty":1,
"created_at":"2022-06-06"
},
{
"wh":507,
"qty":0,
"created_at":"2022-06-07"
},
{
"wh":2737,
"qty":2,
"created_at":"2022-06-07"
}
]
I know how to group by date, but I don't understand how to proceed with aggregations inside JsonField.
StocksHistory.objects.extra(select={'day': 'date( created_at )'})
.values('day')
.annotate(
???
)
A solution is suitable, both through Django ORM and through RAW SQL.

demo
WITH cte AS (
SELECT
jsonb_path_query(js, '$[*].stocks.wh')::numeric AS wh,
jsonb_path_query(js, '$[*].stocks.qty')::numeric AS b,
_date
FROM (
VALUES ('[
{
"id":4124124,
"stocks":[
{
"wh":507,
"qty":2
},
{
"wh":2737,
"qty":1
}
]
},
{
"id":746457457,
"stocks":[
{
"wh":507,
"qty":3
}
]
}
]'::jsonb)) v (js),
(
VALUES ('2022-06-06'), ('2022-06-07')) ss_ (_date)
),
cte2 AS (
SELECT
wh, sum(b) AS qty,
_date
FROM
cte
GROUP BY
1,
3
ORDER BY
1
)
SELECT
array_agg(row_to_json(cte2.*)::jsonb)
FROM
cte2;

Related

How to use PSQL to extract data from an object (inside an array inside an object inside an array)

This is data that is currently sitting in a single cell (e.g. inside warehouse table in warehouse_data column) in our database (I'm unable to change the structure/DB design so would need to work with this), how would I be able to select the name of the shirt with the largest width? In this case, would expect output to be tshirt_b (without quotation marks)
{
"wardrobe": {
"apparel": {
"variety": [
{
"data": {
"shirt": {
"size": {
"width": 30
}
}
},
"names": [
{
"name": "tshirt_a"
}
]
},
{
"data": {
"shirt": {
"size": {
"width": 40
}
}
},
"names": [
{
"name": "tshirt_b"
}
]
}
]
}
}
}
I've tried a select statement, being able to get out
"names": [
{
"name": "tshirt_b"
}
]
but not too much further than that e.g.:
select jsonb_array_elements(warehouse_data#>'{wardrobe,apparel,variety}')->>'names'
from 'warehouse'
where id = 1;
In this table, we'd have 2 columns, one with the data and one with a unique identifier. I imagine I'd need to be able to select into size->>width, order DESC and limit 1 (if that's able to then limit it to include the entire object with data & shirt or with the max() func?
I'm really stuck so any help would be appreciated, thank you!
You'll first want to normalise the data into a relational structure:
SELECT
(obj #>> '{data,shirt,size,width}')::int AS width,
(obj #>> '{names,0,name}') AS name
FROM warehouse, jsonb_array_elements(warehouse_data#>'{wardrobe,apparel,variety}') obj
WHERE id = 1;
Then you can do your processing on that as a subquery, e.g.
SELECT name
FROM (
SELECT
(obj #>> '{data,shirt,size,width}')::int AS width,
(obj #>> '{names,0,name}') AS name
FROM warehouse, jsonb_array_elements(warehouse_data#>'{wardrobe,apparel,variety}') obj
WHERE id = 1
) shirts
ORDER BY width DESC
LIMIT 1;

How to convert sql query with exist into mongodb query

I have two documents on mongodb, these are percentages and items. I'm good at SQL, I can write PLSql query as follows but i can not convert to mongodb query. Because my mongodb level of knowledge is at the beginning. Actually I know I have to use $gt for the and condition. But I don't know how I can say not exists or union keyword for mongodb. How can I write mongodb query? which keywords should i search for?
select p.*, "to_top" as list
from percentages p
where p.percentage > 5
and p.updatetime > sysdate - 1/24
and not exists (select 1
from items i
where i.id = p.p_id
and i.seller = p.seller)
order by p.percentage desc
union
select p2.*, "to_bottom" as list
from percentages p2
where p2.percentage > 5
and p2.updatetime > sysdate - 1/24
and exists (select 1
from items i2
where i2.id = p2.p_id
and i2.seller = p2.seller)
order by p2.percentage desc
There is no UNION for MongoDB. Luckely, each query is performed on the same collection and have very close condition, so we can implement "Mongo way" query.
Explanation
Normally, alsmost all complex SQL queries are done with the MongoDB aggregation framework.
We filter document by percentage / updatetime. Explanation why we need to use $expr
SQL JOIN / Subquery is done with the $lookup operator.
SQL SYSDATE in MongoDB way can be NOW or CLUSTER_TIME variable.
db.percentages.aggregate([
{
$match: {
percentage: { $gt: 5 },
$expr: {
$gt: [
"$updatetime",
{
$subtract: [
ISODate("2020-06-14T13:00:00Z"), //Change to $$NOW or $$CLUSTER_TIME
3600000
]
}
]
}
}
},
{
$lookup: {
from: "items",
let: {
p_id: "$p_id",
seller: "$seller"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [ "$$p_id", "$id"]
},
{
$eq: [ "$$seller", "$seller"]
}
]
}
}
},
{
$limit: 1
}
],
as: "items"
}
},
{
$addFields: {
list: {
$cond: [
{
$eq: [{$size: "$items"}, 0]
},
"$to_top",
"$to_bottom"
]
},
items: "$$REMOVE"
}
},
{
$sort: { percentage: -1 }
}
])
MongoPlayground
Note: The MongoDB aggregation has the $facet operator that allows to perform different queries on the same collection.
SCHEMA:
db.percentages.aggregate([
{$facet:{
q1:[...],
q2:[...],
}},
//We apply "UNION" the result documents for each pipeline into single array
{$project:{
data:{$concatArrays:["$q1","$q2"]}
}},
//Flatten array into single object
{$unwind:"$data"}
//Replace top-level document
{$replaceWith:"$data"}
])
MongoPlayground
why you don't import your mangoDB data into oracle and use sql(that is more easy and powerful than mango.)

Invalid column name on stream analytics, when column exists

Given this dataset:
[
{
"dataChannelId": 8516,
"measures": [
{
"dateTime": "2019-01-01T12:00:00",
},
{
"dateTime": "2019-01-02T12:00:00",
}
}]
And this query:
WITH
temp AS
(
SELECT
dataChannelId,
arrayElement.ArrayValue as element
FROM GriegInputStream
CROSS APPLY GetArrayElements([mesurasdfes]) AS arrayElement
)
SELECT
temp.dataChannelId as sensorId, temp.element.dateTime, temp.element.value,temp.element.unit,temp.element.maxValue, temp.element.minValue
INTO
Sensoroutput
FROM
temp
I get invalid column name, does not exist on dataChannelId, but measures seem to work fine. How can I access this value without stream analytics complaining?
Your sample json data lack a square bracket ].
sample data:
[
{
"dataChannelId": 8516,
"measures": [
{
"dateTime": "2019-01-01T12:00:00",
},
{
"dateTime": "2019-01-02T12:00:00",
}
]
}
]
Query sql:
WITH
temp AS
(
SELECT
jsoninput.dataChannelId,
arrayElement.ArrayValue as element
FROM jsoninput
CROSS APPLY GetArrayElements(jsoninput.measures) AS arrayElement
)
SELECT
temp.dataChannelId as sensorId, temp.element.dateTime
INTO
Sensoroutput
FROM
temp
Output:

filter data on jsonb with postgres

imagine you have website where people post their ads. So, each ad has some selected properties, for example cars has different engine types, gears, colors and etc. Those properties user selects before submiting a listing.
I store selected properties in a jsonb format in listings table, look at the data column:
.
So, each listing contains data like this:
{
"properties":[
{
"id":"1",
"value_id":"1"
},
{
"id":"2",
"value_id":"5"
},
{
"id":"3",
"value_id":"9"
},
{
"id":"4",
"value":"2.0"
},
{
"id":"7",
"value":"2017"
},
{
"id":"6",
"value":"180.000"
}
]
}
Now, the question is:
1) How to filter listings by those id's and value's which are in json? For example, show listings where id = 2 and it's value = 5 AND id = 3 and it's value = 9 and so on. I dont need OR, i need AND. So, filter data by multiple id's and value's.
2) First point + ability to compare id's and value's (greater or lower than).
answering first point, it's probably the first time when I find use for jsonb[]:
t=# with c(a,j) as (values(18,'{
"properties":[
{
"id":"1",
"value_id":"1"
},
{
"id":"2",
"value_id":"5"
},
{
"id":"3",
"value_id":"9"
},
{
"id":"4",
"value":"2.0"
},
{
"id":"7",
"value":"2017"
},
{
"id":"6",
"value":"180.000"
}
]
}'::jsonb), (19,'{"properties":[{"id": "1", "value_id": "1"}]}'))
, m as (select a, array_agg(jb.value)::jsonb[] ar from c, jsonb_array_elements(j->'properties') jb group by a)
select a
from m
where '{"id": "1", "value_id": "1"}'::jsonb = any(ar)
and '{"id": "3", "value_id": "9"}'::jsonb = any(ar);
a
----
18
(1 row)
and for the second requirement - it won't be that short, as you need to compare (and thus parse json):
t=# with c(a,j) as (values(18,'{
"properties":[
{
"id":"1",
"value_id":"1"
},
{
"id":"2",
"value_id":"5"
},
{
"id":"3",
"value_id":"9"
},
{
"id":"4",
"value":"2.0"
},
{
"id":"7",
"value":"2017"
},
{
"id":"6",
"value":"180.000"
}
]
}'::jsonb), (19,'{"properties":[{"id": "1", "value_id": "1"}]}'))
, m as (select a, jb.value->>'id' id,jb.value->>'value_id' value_id from c, jsonb_array_elements(j->'properties') jb)
, n as (select m.*, count(1) over (partition by m.a)
from m
join c on c.a = m.a and ((id::int >= 1 and value_id::int <2) or (id::int >2 and value_id::int <= 9)))
select distinct a from n
where count > 1;
a
----
18
(1 row)
with basic idea to use OR to get possible rows and then check if ALL of OR conditions were met

MDX: Query Data analysed

In this query i used were clause in that year is 2015 and quarter-[2013]&[Quarter1], how is it possible, and getting result set 10 records. actually result set is not displaying.
WITH MEMBER [Measures].[Test] AS ( [Measures].[ProgramAssessmentPatientCnt] + [Measures].[AssessmentPatientCnt] )
MEMBER [Measures].[Test1] AS ( [Measures].[CCMPatientCnt] + [Measures].[CareteamCnt] + [Measures].[CCMPatientCnt] )
SELECT ( ( { [DimEnrollStatus].[EnrollmentStatus].[EnrollmentStatus] } ),
{ [Measures].[AssessmentPatientCnt], [Measures].[Test], [Measures].[Test1] } ) ON COLUMNS,
Subset (
NonEmpty (
{
( { [DimAssessment].[AssessmentText].[AssessmentText] },
{ [DimAssessment].[QuestionText].[QuestionText] },
{ [DimAssessment].[AnswerText].[AnswerText] } )
},
{ [Measures].[AssessmentPatientCnt], [Measures].[Test], [Measures].[Test1] }
),
0,
10
) ON ROWS
FROM [NavigateCube]
WHERE (
{
( { [DimManagedPopulation].[ManagedPopulationName].&[1044]&[LTC Lincoln Centers] },
{ [DimAnchorDate].[Calender Year].&[2015] },
{ [DimAnchorDate].[Calendar Semester Des].[All] },
{ [DimAnchorDate].[Calendar Quarter Des].&[2013]&[Quarter1] },
{ [DimAnchorDate].[English Month Name Desc].[All] } )
} )
Does this return any rows?
WHERE
(
[DimManagedPopulation].[ManagedPopulationName].&[1044]&[LTC Lincoln Centers],
[DimAnchorDate].[Calender Year].&[2015],
//[DimAnchorDate].[Calendar Semester Des].[All],
[DimAnchorDate].[Calendar Quarter Des].&[2013]&[Quarter1],
[DimAnchorDate].[English Month Name Desc].[All]
);
Maybe the following:
WHERE
(
[DimManagedPopulation].[ManagedPopulationName].&[1044]&[LTC Lincoln Centers],
{
[DimAnchorDate].[Calender Year].&[2015],
[DimAnchorDate].[Calendar Semester Des].[All],
[DimAnchorDate].[Calendar Quarter Des].&[2013]&[Quarter1],
[DimAnchorDate].[English Month Name Desc].[All]
}
);