filter data on jsonb with postgres - sql

Imagine you have a website where people post their ads. Each ad has some selected properties; for example, cars have different engine types, gears, colors, etc. The user selects those properties before submitting a listing.
I store the selected properties in jsonb format in the data column of the listings table.
So, each listing contains data like this:
{
"properties":[
{
"id":"1",
"value_id":"1"
},
{
"id":"2",
"value_id":"5"
},
{
"id":"3",
"value_id":"9"
},
{
"id":"4",
"value":"2.0"
},
{
"id":"7",
"value":"2017"
},
{
"id":"6",
"value":"180.000"
}
]
}
Now, the questions are:
1) How to filter listings by the ids and values stored in the json? For example, show listings where id = 2 and its value = 5 AND id = 3 and its value = 9, and so on. I don't need OR, I need AND. So, filter data by multiple ids and values.
2) The first point, plus the ability to compare ids and values (greater than or less than).

To answer the first point (this is probably the first time I have found a use for jsonb[]):
t=# with c(a,j) as (values(18,'{
"properties":[
{
"id":"1",
"value_id":"1"
},
{
"id":"2",
"value_id":"5"
},
{
"id":"3",
"value_id":"9"
},
{
"id":"4",
"value":"2.0"
},
{
"id":"7",
"value":"2017"
},
{
"id":"6",
"value":"180.000"
}
]
}'::jsonb), (19,'{"properties":[{"id": "1", "value_id": "1"}]}'))
, m as (select a, array_agg(jb.value)::jsonb[] ar from c, jsonb_array_elements(j->'properties') jb group by a)
select a
from m
where '{"id": "1", "value_id": "1"}'::jsonb = any(ar)
and '{"id": "3", "value_id": "9"}'::jsonb = any(ar);
a
----
18
(1 row)
The second requirement won't be that short, as you need to compare values (and thus parse the json):
t=# with c(a,j) as (values(18,'{
"properties":[
{
"id":"1",
"value_id":"1"
},
{
"id":"2",
"value_id":"5"
},
{
"id":"3",
"value_id":"9"
},
{
"id":"4",
"value":"2.0"
},
{
"id":"7",
"value":"2017"
},
{
"id":"6",
"value":"180.000"
}
]
}'::jsonb), (19,'{"properties":[{"id": "1", "value_id": "1"}]}'))
, m as (select a, jb.value->>'id' id,jb.value->>'value_id' value_id from c, jsonb_array_elements(j->'properties') jb)
, n as (select m.*, count(1) over (partition by m.a)
from m
join c on c.a = m.a and ((id::int >= 1 and value_id::int <2) or (id::int >2 and value_id::int <= 9)))
select distinct a from n
where count > 1;
a
----
18
(1 row)
The basic idea is to use OR to collect the candidate rows and then check that all of the OR conditions were met (here, by requiring that more than one row matched).
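The "every condition must be met by some property" logic can also be sketched outside the database. The following is a minimal plain-JavaScript illustration, not a replacement for the SQL: the `listings` array is a hypothetical in-memory copy of the two sample rows, and `conditions` and `matchesAll` are names made up for this sketch.

```javascript
// Hypothetical in-memory copy of the two sample listings (a = listing id).
const listings = [
  { a: 18, properties: [
    { id: "1", value_id: "1" }, { id: "2", value_id: "5" }, { id: "3", value_id: "9" },
    { id: "4", value: "2.0" }, { id: "7", value: "2017" }, { id: "6", value: "180.000" },
  ] },
  { a: 19, properties: [{ id: "1", value_id: "1" }] },
];

// Each condition is a predicate over a single property object.
// These mirror the two OR branches of the SQL query above.
const conditions = [
  (p) => Number(p.id) >= 1 && Number(p.value_id) < 2,
  (p) => Number(p.id) > 2 && Number(p.value_id) <= 9,
];

// A listing qualifies only if every condition is satisfied by at least one property.
function matchesAll(listing, conds) {
  return conds.every((cond) => listing.properties.some(cond));
}

const matchingIds = listings.filter((l) => matchesAll(l, conditions)).map((l) => l.a);
console.log(matchingIds); // [ 18 ]
```

Checking each condition separately, rather than counting matched rows, avoids accepting a listing in which one condition matches twice and another never matches.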

Related

How to use PSQL to extract data from an object (inside an array inside an object inside an array)

This is data that is currently sitting in a single cell (e.g. in the warehouse_data column of the warehouse table) in our database. I'm unable to change the structure/DB design, so I need to work with this. How can I select the name of the shirt with the largest width? In this case, I would expect the output to be tshirt_b (without quotation marks).
{
"wardrobe": {
"apparel": {
"variety": [
{
"data": {
"shirt": {
"size": {
"width": 30
}
}
},
"names": [
{
"name": "tshirt_a"
}
]
},
{
"data": {
"shirt": {
"size": {
"width": 40
}
}
},
"names": [
{
"name": "tshirt_b"
}
]
}
]
}
}
}
I've tried a select statement and was able to get out
"names": [
{
"name": "tshirt_b"
}
]
but not too much further than that e.g.:
select jsonb_array_elements(warehouse_data#>'{wardrobe,apparel,variety}')->>'names'
from warehouse
where id = 1;
In this table, we'd have 2 columns, one with the data and one with a unique identifier. I imagine I'd need to select into size->>width, order DESC and limit 1 (if it's possible to then limit it to the entire object with data & shirt), or use the max() function?
I'm really stuck so any help would be appreciated, thank you!
You'll first want to normalise the data into a relational structure:
SELECT
(obj #>> '{data,shirt,size,width}')::int AS width,
(obj #>> '{names,0,name}') AS name
FROM warehouse, jsonb_array_elements(warehouse_data#>'{wardrobe,apparel,variety}') obj
WHERE id = 1;
Then you can do your processing on that as a subquery, e.g.
SELECT name
FROM (
SELECT
(obj #>> '{data,shirt,size,width}')::int AS width,
(obj #>> '{names,0,name}') AS name
FROM warehouse, jsonb_array_elements(warehouse_data#>'{wardrobe,apparel,variety}') obj
WHERE id = 1
) shirts
ORDER BY width DESC
LIMIT 1;
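The same two steps (flatten each array element into a row, then order and take the first) can be illustrated in plain JavaScript. This is only a sketch on an in-memory copy of the sample document; the variable names `warehouseData`, `shirts` and `widest` are made up for the example.

```javascript
// In-memory copy of the sample warehouse_data value from the question.
const warehouseData = { wardrobe: { apparel: { variety: [
  { data: { shirt: { size: { width: 30 } } }, names: [{ name: "tshirt_a" }] },
  { data: { shirt: { size: { width: 40 } } }, names: [{ name: "tshirt_b" }] },
] } } };

// Step 1: flatten each array element into a { width, name } row,
// like the inner SELECT with jsonb_array_elements.
const shirts = warehouseData.wardrobe.apparel.variety.map((obj) => ({
  width: obj.data.shirt.size.width,
  name: obj.names[0].name,
}));

// Step 2: ORDER BY width DESC LIMIT 1 (sorting a copy, so shirts is left untouched).
const widest = [...shirts].sort((x, y) => y.width - x.width)[0];
console.log(widest.name); // tshirt_b
```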

How to remove object by value from a JSONB type array?

I want to remove a JSONB object from a JSONB array by its unique 'id' value. I am no expert at writing SQL code, but I managed to write the concatenate function.
For example, remove this object from the array below:
{
"id": "ad26e2be-19fd-4862-8f84-f2f9c87b582e",
"title": "Wikipedia",
"links": [
"https://en.wikipedia.org/1",
"https://en.wikipedia.org/2"
]
},
Schema:
CREATE TABLE users (
url text not null,
user_id SERIAL PRIMARY KEY,
name VARCHAR,
list_of_links jsonb default '[]'
);
list_of_links format:
[
{
"id": "ad26e2be-19fd-4862-8f84-f2f9c87b582e",
"title": "Wikipedia",
"links": [
"https://en.wikipedia.org/1",
"https://en.wikipedia.org/2"
]
},
{
"id": "451ac172-b93e-4158-8e53-8e9031cfbe72",
"title": "Russian Wikipedia",
"links": [
"https://ru.wikipedia.org/wiki/",
"https://ru.wikipedia.org/wiki/"
]
},
{
"id": "818b99c8-479b-4846-ac15-4b2832ec63b5",
"title": "German Wikipedia",
"links": [
"https://de.wikipedia.org/any",
"https://de.wikipedia.org/any"
]
},
...
]
The concatenate function:
update users set list_of_links=(
list_of_links || (select *
from jsonb_array_elements(list_of_links)
where value->>'id'='ad26e2be-19fd-4862-8f84-f2f9c87b582e'
)
)
where url='test'
returning *
;
Your json data is structured so you have to unpack it, operate on the unpacked data, and then repack it again:
SELECT u.url, u.user_id, u.name,
jsonb_agg(
jsonb_build_object('id', l.id, 'title', l.title, 'links', l.links)
) as list_of_links
FROM users u
CROSS JOIN LATERAL jsonb_to_recordset(u.list_of_links) AS l(id uuid, title text, links jsonb)
WHERE l.id != 'ad26e2be-19fd-4862-8f84-f2f9c87b582e'::uuid
GROUP BY 1, 2, 3
The function jsonb_to_recordset is a set-returning function so you have to use it as a row source, joined to its originating table with the LATERAL clause so that the list_of_links column is available to the function to be unpacked. Then you can delete the records you are not interested in using the WHERE clause, and finally repack the structure by building the record fields into a jsonb structure and then aggregating the individual records back into an array.
I wrote this in JS, but that does not matter for how it works. Essentially, it gets all the items from the array and finds the index of the item with the matching id. With that index, I use the "-" operator, which takes the index and removes that element from the array.
//req.body is this JSON object
//{"url":"test", "id": "ad26e2be-19fd-4862-8f84-f2f9c87b582e"}
var { url, id } = req.body;
pgPool.query(
`
select list_of_links
from users
where url=$1;
`,
[url],
(error, result) => {
//block code executing further if error is true
if (error) {
res.json({ status: "failed" });
return;
}
if (result) {
// find the index of the array element whose id matches the id from the request
// (result.rows is an array of rows, so read the column from the first row)
var index_of_the_item = result.rows[0].list_of_links.findIndex(
({ id: db_id }) => db_id === id
);
// findIndex returns -1 when nothing matches; stop here so no element is removed
if (index_of_the_item === -1) {
res.json({ status: "failed" });
return;
}
//remove the array element by its index
pgPool.query(
`
update users
set list_of_links=(
list_of_links - $1::int
)
where url=$2
;
`,
[index_of_the_item, url], (e, r) => {...}
);
}
}
);
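The index lookup in the code above can be isolated and checked without a database. Below is a minimal sketch; `indexOfLink` is a hypothetical helper name and the array is a shortened copy of the sample data. The "not found" case is worth handling explicitly, because PostgreSQL's `jsonb - integer` operator counts negative indexes from the end of the array, so passing -1 would delete the last element.

```javascript
// Hypothetical helper: returns the position of the element with the given id,
// or null when no element matches.
function indexOfLink(listOfLinks, id) {
  const index = listOfLinks.findIndex((link) => link.id === id);
  return index === -1 ? null : index;
}

// Shortened copy of the sample list_of_links value.
const listOfLinks = [
  { id: "ad26e2be-19fd-4862-8f84-f2f9c87b582e", title: "Wikipedia" },
  { id: "451ac172-b93e-4158-8e53-8e9031cfbe72", title: "Russian Wikipedia" },
  { id: "818b99c8-479b-4846-ac15-4b2832ec63b5", title: "German Wikipedia" },
];

const found = indexOfLink(listOfLinks, "451ac172-b93e-4158-8e53-8e9031cfbe72");
const missing = indexOfLink(listOfLinks, "no-such-id");
console.log(found, missing); // 1 null
```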

How to convert sql query with exist into mongodb query

I have two collections in MongoDB: percentages and items. I'm good at SQL and can write the PL/SQL query below, but I cannot convert it to a MongoDB query because my MongoDB knowledge is at a beginner level. I know I have to use $gt for the AND conditions, but I don't know how to express NOT EXISTS or UNION in MongoDB. How can I write the MongoDB query? Which keywords should I search for?
select p.*, 'to_top' as list
from percentages p
where p.percentage > 5
and p.updatetime > sysdate - 1/24
and not exists (select 1
from items i
where i.id = p.p_id
and i.seller = p.seller)
order by p.percentage desc
union
select p2.*, 'to_bottom' as list
from percentages p2
where p2.percentage > 5
and p2.updatetime > sysdate - 1/24
and exists (select 1
from items i2
where i2.id = p2.p_id
and i2.seller = p2.seller)
order by p2.percentage desc
There is no UNION in MongoDB. Luckily, both queries run on the same collection and have very similar conditions, so we can write a "Mongo way" query.
Explanation
Normally, almost all complex SQL queries are done with the MongoDB aggregation framework.
We filter documents by percentage / updatetime. $expr is needed because the updatetime bound is computed with an aggregation expression ($subtract).
SQL JOIN / subquery is done with the $lookup operator.
SQL SYSDATE can be expressed in MongoDB with the $$NOW or $$CLUSTER_TIME variable.
db.percentages.aggregate([
{
$match: {
percentage: { $gt: 5 },
$expr: {
$gt: [
"$updatetime",
{
$subtract: [
ISODate("2020-06-14T13:00:00Z"), //Change to $$NOW or $$CLUSTER_TIME
3600000
]
}
]
}
}
},
{
$lookup: {
from: "items",
let: {
p_id: "$p_id",
seller: "$seller"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{
$eq: [ "$$p_id", "$id"]
},
{
$eq: [ "$$seller", "$seller"]
}
]
}
}
},
{
$limit: 1
}
],
as: "items"
}
},
{
$addFields: {
list: {
$cond: [
{
$eq: [{$size: "$items"}, 0]
},
"to_top",
"to_bottom"
]
},
items: "$$REMOVE"
}
},
{
$sort: { percentage: -1 }
}
])
Note: The MongoDB aggregation framework has the $facet operator, which lets you run different pipelines on the same collection.
SCHEMA:
db.percentages.aggregate([
{$facet:{
q1:[...],
q2:[...],
}},
//We apply "UNION" the result documents for each pipeline into single array
{$project:{
data:{$concatArrays:["$q1","$q2"]}
}},
//Flatten array into single object
{$unwind:"$data"},
//Replace top-level document
{$replaceWith:"$data"}
])
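To make the intended result of the two branches concrete, here is a plain-JavaScript sketch of the same logic on in-memory arrays. It is not MongoDB code: the sample documents are invented for illustration, the updatetime filter is left out for brevity, and `hasItem`, `q1`, `q2` and `union` are names chosen only for this example.

```javascript
// Hypothetical sample documents; field names follow the SQL in the question.
const percentages = [
  { p_id: 1, seller: "s1", percentage: 9 },
  { p_id: 2, seller: "s2", percentage: 7 },
  { p_id: 3, seller: "s3", percentage: 3 },
];
const items = [{ id: 2, seller: "s2" }];

// EXISTS subquery: is there an item with the same id and seller?
const hasItem = (p) => items.some((i) => i.id === p.p_id && i.seller === p.seller);

// Shared filter of both branches (percentage > 5).
const candidates = percentages.filter((p) => p.percentage > 5);

// q1 is the NOT EXISTS branch, q2 the EXISTS branch (the two sub-pipelines).
const q1 = candidates.filter((p) => !hasItem(p)).map((p) => ({ ...p, list: "to_top" }));
const q2 = candidates.filter((p) => hasItem(p)).map((p) => ({ ...p, list: "to_bottom" }));

// Concatenating the two arrays plays the role of UNION here.
const union = [...q1, ...q2];
console.log(union.map((d) => `${d.p_id}:${d.list}`)); // [ '1:to_top', '2:to_bottom' ]
```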
Why don't you import your MongoDB data into Oracle and use SQL? (It is easier and more powerful than MongoDB.)

Working with arrays with BigQuery LegacySQL

Each row in my table has a field that is an array, and I'd like to get a field from the first array entry.
For example, if my row is
[
{
"user_dim": {
"user_id": "123",
"user_properties": [
{
"key": "content_group",
"value": {
"value": {
"string_value": "my_group"
}
}
}
]
},
"event_dim": [
{
"name": "main_menu_item_selected",
"timestamp_micros": "1517584420597000"
},
{
"name": "screen_view",
"timestamp_micros": "1517584420679001"
}
]
}
]
I'd like to get
user_id: 123, content_group: my_group, timestamp: 1517584420597000
As Elliott mentioned, BigQuery Standard SQL has much better support for ARRAYs than legacy SQL, and in general the BigQuery team recommends using Standard SQL.
So, the query below is for BigQuery Standard SQL (including handling of the wildcard tables):
#standardSQL
SELECT
user_dim.user_id AS user_id,
(SELECT value.value.string_value
FROM UNNEST(user_dim.user_properties)
WHERE key = 'content_group' LIMIT 1
) content_group,
(SELECT event.timestamp_micros
FROM UNNEST(event_dim) event
WHERE name = 'main_menu_item_selected'
) ts
FROM `project.dataset.app_events_*`
WHERE _TABLE_SUFFIX BETWEEN '20180129' AND '20180202'
with result (for the dummy example from your question)
Row user_id content_group ts
1 123 my_group 1517584420597000
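For readers less familiar with UNNEST, the lookups performed by the two subqueries can be mirrored in plain JavaScript on the sample row. This is only an illustrative sketch on an in-memory object; `row`, `contentGroup`, `event` and `result` are names made up for the example.

```javascript
// In-memory copy of the sample row from the question.
const row = {
  user_dim: {
    user_id: "123",
    user_properties: [
      { key: "content_group", value: { value: { string_value: "my_group" } } },
    ],
  },
  event_dim: [
    { name: "main_menu_item_selected", timestamp_micros: "1517584420597000" },
    { name: "screen_view", timestamp_micros: "1517584420679001" },
  ],
};

// First subquery: the user property whose key is 'content_group'.
const contentGroup = row.user_dim.user_properties.find((p) => p.key === "content_group");
// Second subquery: the event named 'main_menu_item_selected'.
const event = row.event_dim.find((e) => e.name === "main_menu_item_selected");

// Missing entries become null, much like a scalar subquery that returns no rows.
const result = {
  user_id: row.user_dim.user_id,
  content_group: contentGroup?.value.value.string_value ?? null,
  ts: event?.timestamp_micros ?? null,
};
console.log(result);
```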

sql to mongodb translation

I wonder how we can do the below translation from sql to mongoDB:
Assume the table has below structure:
table
=====
id   contribution   time
1    300            Jan 2, 1990
2    1000           March 3, 1991
I want to find a ranking list of ids in descending order of their number of contributions.
This is what I do using SQL:
select id, count(*) c from table group by id order by c desc;
How can I translate this complex sql into mongoDB using count(), order() and group()?
Thank you very much!
Setting up test data with:
db.donors.insert({donorID:1,contribution:300,date:ISODate('1990-01-02')})
db.donors.insert({donorID:2,contribution:1000,date:ISODate('1991-03-03')})
db.donors.insert({donorID:1,contribution:900,date:ISODate('1992-01-02')})
You can use the new Aggregation Framework in MongoDB 2.2:
db.donors.aggregate(
{ $group: {
_id: "$donorID",
total: { $sum: "$contribution" },
donations: { $sum: 1 }
}},
{ $sort: {
donations: -1
}}
)
To produce the desired result:
{
"result" : [
{
"_id" : 1,
"total" : 1200,
"donations" : 2
},
{
"_id" : 2,
"total" : 1000,
"donations" : 1
}
],
"ok" : 1
}
Check the mongodb Aggregation
db.collection.group(
{key: { id:true},
reduce: function(obj,prev) { prev.sum += 1; },
initial: { sum: 0 }
});
After you get the result, sort it by sum.
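The group, count and sort steps that both answers rely on can be traced in plain JavaScript using the same three test documents. This is a sketch of the logic only, not MongoDB code; `groups` and `ranking` are names chosen for the example.

```javascript
// The three test documents from the answer above (dates omitted).
const donors = [
  { donorID: 1, contribution: 300 },
  { donorID: 2, contribution: 1000 },
  { donorID: 1, contribution: 900 },
];

// Group by donorID, accumulating the number of donations and their total.
const groups = new Map();
for (const { donorID, contribution } of donors) {
  const g = groups.get(donorID) ?? { _id: donorID, total: 0, donations: 0 };
  g.total += contribution;
  g.donations += 1;
  groups.set(donorID, g);
}

// Sort by donation count, descending (the "order by c desc" part).
const ranking = [...groups.values()].sort((x, y) => y.donations - x.donations);
console.log(ranking);
// [ { _id: 1, total: 1200, donations: 2 }, { _id: 2, total: 1000, donations: 1 } ]
```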