How to make an IN query with hstore? - sql

I have a field (content) in a table containing hstore keys and values, like this:
content: {"price"=>"15.2", "quantity"=>"3", "product_id"=>"27", "category_id"=>"2", "manufacturer_id"=>"D"}
I can easily select products having ONE category_id with:
SELECT * FROM table WHERE content @> 'category_id=>27';
I want to select all rows having (for example) a category_id IN a list of values.
In classic SQL it would be something like this:
SELECT * FROM table WHERE category_id IN (27, 28, 29, ...);
Thank you in advance

De-reference the key and test it with IN as normal.
CREATE TABLE hstoredemo(content hstore not null);
INSERT INTO hstoredemo(content) VALUES
('"price"=>"15.2", "quantity"=>"3", "product_id"=>"27", "category_id"=>"2", "manufacturer_id"=>"D"');
Then use one of these. The first is cleaner, as it casts the extracted value to integer rather than doing string comparisons on numbers.
SELECT *
FROM hstoredemo
WHERE (content -> 'category_id')::integer IN (2, 27, 28, 29);
SELECT *
FROM hstoredemo
WHERE content -> 'category_id' IN ('2', '27', '28', '29');
If you had to test more complex hstore containment operations, say with multiple keys, you could use @> ANY, e.g.
SELECT *
FROM hstoredemo
WHERE
content @> ANY(
ARRAY[
'"category_id"=>"27","product_id"=>"27"',
'"category_id"=>"2","product_id"=>"27"'
]::hstore[]
);
but it's not pretty, and it'll be a lot slower, so don't do this unless you have to.
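Outside the database, the dereference-cast-and-test pattern is just dictionary access. A minimal Python sketch of the same logic, with made-up rows standing in for the hstore column (all values are strings, just as hstore stores them):

```python
# Hypothetical in-memory stand-ins for the hstore rows
rows = [
    {"content": {"price": "15.2", "quantity": "3", "product_id": "27",
                 "category_id": "2", "manufacturer_id": "D"}},
    {"content": {"price": "9.9", "quantity": "1", "product_id": "30",
                 "category_id": "27", "manufacturer_id": "A"}},
]

wanted = {2, 27, 28, 29}

# Equivalent of (content -> 'category_id')::integer IN (2, 27, 28, 29):
# dereference the key, cast it to integer, then test membership
matches = [r for r in rows if int(r["content"]["category_id"]) in wanted]
print([r["content"]["product_id"] for r in matches])  # ['27', '30']
```

Casting before the membership test is what makes '2' and '02' compare equal as numbers, which a pure string IN would miss.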

category_ids = ["27", "28", "29"]
Tablename.where("content -> 'category_id' IN (?)", category_ids)

Related

Sum values in Athena table where column having key/value pair json

I have an Athena table with one column containing JSON key/value pairs.
Ex:
Select test_client, test_column from ABC;
test_client, test_column
john, {"d":13, "e":210}
mark, {"a":1,"b":10,"c":1}
john, {"e":100,"a":110,"b":10, "d":10}
mark, {"a":56,"c":11,"f":9, "e": 10}
I need to sum the values per key for each client and return something like the below (the exact output format doesn't matter, I just want the sums):
john: d: 23, e:310, a:110, b:10
mark: a:57, b:10, c:12, f:9, e:10
It is a combination of a few useful functions in Trino:
WITH example_table AS (
  SELECT 'john' AS person, '{"d":13, "e":210}' AS _json UNION ALL
  SELECT 'mark', '{"a":1,"b":10,"c":1}' UNION ALL
  SELECT 'john', '{"e":100,"a":110,"b":10, "d":10}' UNION ALL
  SELECT 'mark', '{"a":56,"c":11,"f":9, "e": 10}'
)
SELECT person, reduce(
    array_agg(CAST(json_parse(_json) AS MAP(VARCHAR, INTEGER))),
    MAP(ARRAY['a'], ARRAY[0]),
    (s, x) -> map_zip_with(
        s, x, (k, v1, v2) ->
            if(v1 IS NULL, 0, v1) +
            if(v2 IS NULL, 0, v2)
    ),
    s -> s
) AS totals
FROM example_table
GROUP BY person;
json_parse - parses the string into a JSON object
CAST ... AS MAP ... - creates a MAP from the JSON object
array_agg - aggregates the maps for each person based on the GROUP BY
reduce - steps through the aggregated array and reduces it to a single map
map_zip_with - applies a function to each matching key across two maps
if(... is null ...) - substitutes 0 for null when a key is not present in one of the maps
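The same per-key aggregation can be sketched in plain Python to see what the Trino pipeline computes: json_parse plus the CAST become json.loads, and the reduce/map_zip_with combination becomes a per-person, per-key running sum.

```python
import json
from collections import defaultdict

rows = [
    ("john", '{"d":13, "e":210}'),
    ("mark", '{"a":1,"b":10,"c":1}'),
    ("john", '{"e":100,"a":110,"b":10, "d":10}'),
    ("mark", '{"a":56,"c":11,"f":9, "e": 10}'),
]

# GROUP BY person + reduce(map_zip_with(...)) ≈ per-person, per-key sum;
# a missing key contributes 0, like the if(... is null ...) branches
totals = defaultdict(lambda: defaultdict(int))
for person, payload in rows:
    for key, value in json.loads(payload).items():
        totals[person][key] += value

print({person: dict(kv) for person, kv in totals.items()})
# {'john': {'d': 23, 'e': 310, 'a': 110, 'b': 10},
#  'mark': {'a': 57, 'b': 10, 'c': 12, 'f': 9, 'e': 10}}
```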

Querying based on JSON array sub element

Tried multiple answers from here and elsewhere and couldn't find the right answer yet.
create table mstore (
muuid uuid PRIMARY KEY,
msid text,
m_json JSONb[] not NULL
);
inserted first row:
insert into mstore (muuid, msid, m_json) values (
'3b691440-ee54-4d9d-a5b3-5f1863b78755'::uuid,
'<163178891004.4772968682254423915#XYZ-73SM>',
(array['{"m": 123, "mts": "2021-09-16T10:53:43.599012", "dstatus": "Dropped", "rcpt": "abc1#xyz.com"}']::jsonb[])
);
inserted second row:
insert into mstore (muuid, msid, m_json) values (
'3b691440-ee54-4d9d-a5b3-5f1863b78757'::uuid,
'<163178891004.4772968682254423915#XYZ-75SM>',
(array['{"m": 125, "mts": "2021-09-16T10:53:43.599022", "dstatus": "Dropped", "rcpt": "abc3#xyz.com"}']::jsonb[])
);
updated the first row:
update mstore
set m_json = m_json || '{"m": 124, "mts": "2021-09-16T10:53:43.599021", "dstatus": "Delivered", "rcpt": "abc2#xyz.com"}'::jsonb
where muuid = '3b691440-ee54-4d9d-a5b3-5f1863b78755';
Now the table looks like:
muuid                                | msid                                        | m_json
-------------------------------------+---------------------------------------------+----------------------------------------------------------------
3b691440-ee54-4d9d-a5b3-5f1863b78757 | <163178891004.4772968682254423915#XYZ-75SM> | {"{\"m\": 125, \"rcpt\": \"abc3#xyz.com\", \"mts\": \"2021-09-16T10:53:43.599022\", \"dstatus\": \"Dropped\"}"}
3b691440-ee54-4d9d-a5b3-5f1863b78755 | <163178891004.4772968682254423915#XYZ-73SM> | {"{\"m\": 123, \"rcpt\": \"abc1#xyz.com\", \"mts\": \"2021-09-16T10:53:43.599012\", \"dstatus\": \"Dropped\"}","{\"m\": 124, \"rcpt\": \"abc2#xyz.com\", \"mts\": \"2021-09-16T10:53:43.599021\", \"dstatus\": \"Delivered\"}"}
Now I need to query based on the status. I tried a few approaches; the most relevant ones were
select * from mstore,jsonb_array_elements(m_json) with ordinality arr(item_object, position) where item_object->>'{"dstatus": "Delivered"}';
and
select * from mstore where m_json #> '[{"dstatus": "Delivered"}]';
Neither works, as both have syntax errors. How can I run this query filtering on dstatus values?
Please note that mstore.m_json is a Postgres array of JSONB elements, not a JSONB array, so unnest must be used rather than jsonb_array_elements. Also have a look at the ->> operator in the documentation.
The same applies to your second example: it would work if mstore.m_json were a JSONB array instead of a Postgres array of JSONB elements.
select m.muuid, m.msid, l.item_object, l.pos
from mstore m
cross join lateral unnest(m.m_json) with ordinality l(item_object, pos)
where l.item_object ->> 'dstatus' = 'Delivered';
It would be better to use JSONB data type for column mstore.m_json rather than JSONB[] or - much better - normalize the data design.
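To make the unnest ... WITH ORDINALITY behaviour concrete, here is a small Python analogue (with shortened, made-up row identifiers): each array element is paired with its 1-based position and then filtered on dstatus.

```python
# Made-up stand-ins for the two table rows; each m_json is a list of dicts,
# like a Postgres array of jsonb elements
rows = [
    ("uuid-757", [{"m": 125, "dstatus": "Dropped",   "rcpt": "abc3#xyz.com"}]),
    ("uuid-755", [{"m": 123, "dstatus": "Dropped",   "rcpt": "abc1#xyz.com"},
                  {"m": 124, "dstatus": "Delivered", "rcpt": "abc2#xyz.com"}]),
]

# unnest(m_json) WITH ORDINALITY ≈ enumerate(..., start=1);
# the WHERE clause keeps only the 'Delivered' elements
hits = [(muuid, pos, item["m"])
        for muuid, arr in rows
        for pos, item in enumerate(arr, start=1)
        if item["dstatus"] == "Delivered"]
print(hits)  # [('uuid-755', 2, 124)]
```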

Select rows where nested json array field includes any of values from a provided array in PostgreSQL?

I'm trying to write an sql query that would find the rows in a table that match any value of the provided json array.
To put it more concretely, I have the following db table:
CREATE TABLE mytable (
name text,
id SERIAL PRIMARY KEY,
config json,
matching boolean
);
INSERT INTO "mytable"(
"name", "id", "config", "matching"
)
VALUES
(
E'Name 1', 50,
E'{"employees":[1,7],"industries":["1","3","4","13","14","16"],"levels":["1110","1111","1112","1113","1114"],"revenue":[0,5],"states":["AK","Al","AR","AZ","CA","CO","CT","DC","DE","FL","GA","HI","IA","ID","IL"]}',
TRUE
),
(
E'Name 2', 63,
E'{"employees":[3,5],"industries":["1"],"levels":["1110"],"revenue":[2,5],"states":["AK","AZ","CA","CO","HI","ID","MT","NM","NV","OR","UT","WA","WY"]}',
TRUE
),
(
E'Name 3', 56,
E'{"employees":[0,0],"industries":["14"],"levels":["1111"],"revenue":[7,7],"states":["AK","AZ","CA","CO","HI","ID","MT","NM","NV","OR","UT","WA","WY"]}',
TRUE
),
(
E'Name 4', 61,
E'{"employees":[3,8],"industries":["1"],"levels":["1110"],"revenue":[0,5],"states":["AK","AZ","CA","CO","HI","ID","WA","WY"]}',
FALSE
);
I need to perform search queries on this table with the given filtering params. The filtering params basically correspond to the json keys in config field. They come from the client side and can look something like this:
{"employees": [1, 8], "industries": ["12", "5"]}
{"states": ["LA", "WA", "CA"], "levels": ["1100", "1100"], "employees": [3]}
And given such filters, I need to find the rows in my table that include any of the array elements from the corresponding filter key for every filter key provided.
So given the filter {"employees": [1, 8], "industries": ["12", "5"]} the query would have to return all the rows where (employees key in config field contains either 1 or 8 AND where industries key in config field contains either 12 or 5);
I need to generate such a query dynamically from the JavaScript code so that I can include/exclude filtering by a certain parameter by adding/removing the AND operator.
What I have so far is a super long-running query that generates all possible combinations of array elements in config field which feels very wrong:
select * from mytable
cross join lateral json_array_elements(config->'employees') as e1
cross join lateral json_array_elements(config->'states') as e2
cross join lateral json_array_elements(config->'levels') as e3
cross join lateral json_array_elements(config->'revenue') as e4;
I've also tried to do something like this:
select * from mytable
where
matching = TRUE
and (config->'employees')::jsonb #> ANY(ARRAY ['[1, 7, 8]']::jsonb[])
and (config->'states')::jsonb #> ANY(ARRAY ['["AK", "AZ"]']::jsonb[])
and ........;
However, this didn't work, although it looked promising.
Also, I've tried playing with the ?| operator, but to no avail.
Basically, what I need is: given an array key in a json field check if this field contains any of the provided values in another array (which is my filtering parameter); and I have to do this for multiple filtering parameters dynamically.
So the logic is the following:
select all rows from the table
*where*
matching = TRUE
*and* config->key1 includes any of the keys from [5,6,8,7]
*and* config->key2 includes any of the keys from [8,6,2]
*and* so forth;
Could you help me with implementing such an sql query?
Or maybe such sql queries will always be extremely slow and its best to do such filtering outside of the database level?
I'd try something like this. There are probably edge cases (e.g. what happens if the comparison data is empty?) and I didn't test it on larger data sets; it was just the first approach that came to mind:
demo:db<>fiddle
SELECT
*
FROM
mytable t
JOIN (SELECT '{"states": ["LA", "WA", "CA"], "levels": ["1100", "1100"], "employees": [3]}'::json as data) c
ON
CASE WHEN c.data -> 'employees' IS NOT NULL THEN
ARRAY(SELECT json_array_elements_text(t.config -> 'employees')) && ARRAY(SELECT json_array_elements_text(c.data -> 'employees'))
ELSE TRUE END
AND
CASE WHEN c.data -> 'industries' IS NOT NULL THEN
ARRAY(SELECT json_array_elements_text(t.config -> 'industries')) && ARRAY(SELECT json_array_elements_text(c.data -> 'industries'))
ELSE TRUE END
AND
CASE WHEN c.data -> 'states' IS NOT NULL THEN
ARRAY(SELECT json_array_elements_text(t.config -> 'states')) && ARRAY(SELECT json_array_elements_text(c.data -> 'states'))
ELSE TRUE END
AND
CASE WHEN c.data -> 'revenue' IS NOT NULL THEN
ARRAY(SELECT json_array_elements_text(t.config -> 'revenue')) && ARRAY(SELECT json_array_elements_text(c.data -> 'revenue'))
ELSE TRUE END
AND
CASE WHEN c.data -> 'levels' IS NOT NULL THEN
ARRAY(SELECT json_array_elements_text(t.config -> 'levels')) && ARRAY(SELECT json_array_elements_text(c.data -> 'levels'))
ELSE TRUE END
Explanation of the join condition:
CASE WHEN c.data -> 'levels' IS NOT NULL THEN
ARRAY(SELECT json_array_elements_text(t.config -> 'levels')) && ARRAY(SELECT json_array_elements_text(c.data -> 'levels'))
ELSE TRUE END
If your comparison data does not contain the specific attribute, the condition is true and is therefore ignored. If it does contain an attribute, the table and comparison arrays for that attribute are checked for overlap by transforming both JSON arrays into plain Postgres arrays.
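The overall filter logic (every provided key must overlap the row's array; keys missing from the filter always pass) can be expressed compactly outside SQL. A Python sketch of the condition, not of the SQL itself:

```python
def matches(config: dict, filters: dict) -> bool:
    """True when, for every filter key, the row's array shares at least one
    element with the filter's array; keys absent from the filter always pass
    (the ELSE TRUE branch of the CASE expressions)."""
    for key, wanted in filters.items():
        # json_array_elements_text compares elements as text, hence str()
        row_values = {str(v) for v in config.get(key, [])}
        if not row_values & {str(v) for v in wanted}:
            return False
    return True

# Config mimicking one of the example rows
config = {"employees": [3, 5], "industries": ["1"], "levels": ["1110"],
          "states": ["AK", "AZ", "CA", "CO", "HI", "ID", "WA", "WY"]}

print(matches(config, {"states": ["LA", "WA", "CA"], "employees": [3]}))  # True
print(matches(config, {"industries": ["12", "5"]}))                       # False
```

An empty filter dict makes every row match, which mirrors the "comparison data is empty" edge case mentioned above.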

How do I search for particular key values in postgres jsonb columns?

I want to get all the rows which have the particular key-value pairs received in the input. For example, consider the data below:
CREATE TABLE places(id int, place jsonb) ;
insert into places values (1, '{"country": {"name": "Brazil", "speak_portuguese": true, "river": "amazon"}}');
insert into places values (2, '{"name": "USA", "speak_portuguese": false, "river": "missisipi"}');
insert into places values (3, '{"continent": "South America", "speak_portuguese": true, "river": "amazon"}');
Now, if I get the below key-value pairs as input, rows 1 and 3 should appear in the result, since those pairs are present in rows 1 and 3:
{
"speak_portuguese": true,
"river": "amazon"
}
I am trying using jsonb_path_exists but having a hard time. Can someone help?
Note: the keys can appear at any level of nesting (e.g. country->state->city etc) and one row can have data of multiple places.
As those values appear at different paths, you can use an or condition:
select *
from places
where place #> '{"speak_portuguese": true, "river": "amazon"}'
or place -> 'country' #> '{"speak_portuguese": true, "river": "amazon"}'
If you don't know the level and if you are using Postgres 12, you can use a JSON path expression:
select *
from places
where place @@ '$.**.speak_portuguese == true && $.**.river == "amazon"'
select * from places
where (place ->> 'speak_portuguese')::boolean is true
   or (place -> 'country' ->> 'speak_portuguese')::boolean is true;
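The $.** descent searches every nesting level of the document. A rough Python equivalent of that lookup (a recursive key search that assumes nothing about depth) looks like:

```python
def find_key(doc, key):
    """Recursively yield every value stored under `key`, at any nesting
    level, mirroring the $.** jsonpath descent."""
    if isinstance(doc, dict):
        for k, v in doc.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(doc, list):
        for item in doc:
            yield from find_key(item, key)

# Row 1's nested document: the keys live under "country"
place = {"country": {"name": "Brazil", "speak_portuguese": True, "river": "amazon"}}
ok = (True in find_key(place, "speak_portuguese")
      and "amazon" in find_key(place, "river"))
print(ok)  # True
```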

Strange Behaviour: SQL And operator with multiple IN operators

I am using multiple IN operators combined with AND in my SQL query's WHERE clause, as given below...
---
where ID in (1, 3, 234, 2332, 2123, 989) AND tag in ('wow', 'wonderful')
But surprisingly, the result behaves as if the operator were OR rather than AND; it seems to be ignoring the AND operator...
Can you please explain why?
I couldn't reproduce the result using SQL Server 2008.
SELECT * FROM
(
SELECT 0 AS ID, 'wow' as Tag
) X
WHERE ID in (1, 3, 234, 2332, 2123, 989) AND tag in ('wow', 'wonderful')
Result:
No records
SELECT * FROM
(
SELECT 1 AS ID, 'wow' as Tag
) X
WHERE ID in (1, 3, 234, 2332, 2123, 989) AND tag in ('wow', 'wonderful')
Result:
ID Tag
1 wow
Check your code again.
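The combined predicate is simply the conjunction of two membership tests, which can be reproduced outside SQL in a couple of lines; this makes it easy to convince yourself that AND cannot behave like OR here:

```python
ids = {1, 3, 234, 2332, 2123, 989}
tags = {"wow", "wonderful"}

def qualifies(row):
    # Both IN tests must hold for the row to survive the WHERE clause
    return row["ID"] in ids and row["tag"] in tags

print(qualifies({"ID": 0, "tag": "wow"}))  # False: ID fails the first IN
print(qualifies({"ID": 1, "tag": "wow"}))  # True: both IN tests pass
```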