How to loop through a JSON array of JSON objects to see if it contains a value I am looking for in Postgres?

Here is an example of the JSON array:
rawJSON = [
    {"a":0, "b":7},
    {"a":1, "b":8},
    {"a":2, "b":9}
]
And I have a table that essentially looks like this.
demo Table
id | ...(other columns) | rawJSON
------------------------------------
0 | ...(other columns info) | [{"a":0, "b":7},{"a":1, "b":8}, {"a":2, "b":9}]
1 | ...(other columns info) | [{"a":0, "b":17},{"a":11, "b":5}, {"a":12, "b":5}]
What I want is to return any row whose rawJSON contains a JSON object with an "a" value less than 2 AND a "b" value less than 8. THEY MUST BE FROM THE SAME JSON OBJECT.
Essentially the query would look something like this:
SELECT *
FROM demo
WHERE FOR ANY JSON OBJECT in rawJSON column -> "a" < 2 AND -> "b" < 8
And therefore it will return
id | ...(other columns) | rawJSON
------------------------------------
0 | ...(other columns info) | [{"a":0, "b":7},{"a":1, "b":8}, {"a":2, "b":9}]
I have searched several posts here but was not able to figure it out:
https://dba.stackexchange.com/questions/229069/extract-json-array-of-numbers-from-json-array-of-objects
https://dba.stackexchange.com/questions/54283/how-to-turn-json-array-into-postgres-array
I was thinking of creating a plpgsql function but wasn't able to figure it out.
Any advice I would greatly appreciate it!
Thank you!!
I would like to avoid CROSS JOIN LATERAL because it will slow things down a lot.

You can use a subquery that searches through the array elements together with EXISTS.
SELECT *
FROM demo d
WHERE EXISTS (SELECT *
              FROM jsonb_array_elements(d.rawjson) a(e)
              WHERE (a.e->>'a')::integer < 2
                AND (a.e->>'b')::integer < 8);
db<>fiddle
If the datatype for rawjson is json rather than jsonb, use json_array_elements() instead of jsonb_array_elements().
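If you are on PostgreSQL 12 or later, a jsonpath predicate can express the same per-object check without spelling out the unnesting yourself; a minimal sketch, assuming rawjson is jsonb:
SELECT *
FROM demo
WHERE rawjson @? '$[*] ? (@.a < 2 && @.b < 8)';
The filter runs once per array element, so both conditions are evaluated against the same object, just like the EXISTS version.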

Related

Athena query get the index of any element in a list

I need to access the elements in a list-typed column based on the location of an element in another list-like column. Say, my dataset is like:
WITH dataset AS (
    SELECT ARRAY ['hello', 'amazon', 'athena'] AS words,
           ARRAY ['john', 'tom', 'dave'] AS names
)
SELECT * FROM dataset
And I want to achieve:
SELECT element_at(words, index(names, 'john')) AS john_word
FROM dataset
Is there a way to have a function in Athena like "index"? Or how can I write one like this myself? The desired result should be like:
| john_word |
| --------- |
| hello     |
array_position:
array_position(x, element) → bigint
Returns the position of the first occurrence of the element in array x (or 0 if not found).
Note that in Presto, array indexes start from 1.
SELECT element_at(words, array_position(names, 'john')) AS john_word
FROM dataset
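Put together with the sample dataset, the whole query would look like this (a quick sketch, not verified against Athena):
WITH dataset AS (
    SELECT ARRAY ['hello', 'amazon', 'athena'] AS words,
           ARRAY ['john', 'tom', 'dave'] AS names
)
SELECT element_at(words, array_position(names, 'john')) AS john_word
FROM dataset
Since 'john' is at position 1 in names, element_at(words, 1) returns 'hello'.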

How can I find the last member of an array in PostgreSQL?

For example, suppose you have an array of numbers where each one is the id of a course in your list of chosen courses: [1,32,4,6]. I want the last one, the number 6. How can I find it?
You can use the array function array_upper() to get the index of the last element.
Consider:
with t as (select '{1,32,4,6}'::int[] a)
select a[array_upper(a, 1)] last_element from t
Demo on DB Fiddle:
| last_element |
| -----------: |
| 6 |
You can use ARRAY_UPPER to get the upper bound of your array. You can then retrieve the value at that returned index.
SELECT yourcolumn[ARRAY_UPPER(yourcolumn,1)] FROM yourtable;
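An equivalent using array_length() also works, assuming a one-dimensional array with the default lower bound of 1 (array_upper() is the safer choice when the lower bound may differ):
with t as (select '{1,32,4,6}'::int[] a)
select a[array_length(a, 1)] last_element from t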

KQL: mv-expand OR bag_unpack equivalent command to convert a list to multiple columns

According to mv-expand documentation:
Expands multi-value array or property bag.
mv-expand is applied on a dynamic-typed column so that each value in the collection gets a separate row. All the other columns in an expanded row are duplicated.
Just like the mv-expand operator creates a row for each element in the list, is there an equivalent operator/way to make each element in a list an additional column?
I checked the documentation and found Bag_Unpack:
The bag_unpack plugin unpacks a single column of type dynamic by treating each property bag top-level slot as a column.
However, it doesn't seem to work on a list; rather, it works on top-level JSON properties.
Using bag_unpack (like the below query):
datatable(d:dynamic)
[
    dynamic({"Name": "John", "Age":20}),
    dynamic({"Name": "Dave", "Age":40}),
    dynamic({"Name": "Smitha", "Age":30}),
]
| evaluate bag_unpack(d)
It will do the following:
|Name  |Age|
|------|---|
|John  |20 |
|Dave  |40 |
|Smitha|30 |
Is there a command/way (see some_command_which_helps) by which I can achieve the following (converting a list to columns):
datatable(d:dynamic)
[
dynamic(["John", "Dave"])
]
| evaluate some_command_which_helps(d)
That translates to something like:
|Col1|Col2|
|----|----|
|John|Dave|
Is there an equivalent where I can convert a list/array to multiple columns?
For reference: We can run the above queries online on Log Analytics in the demo section if needed (however, it may require login).
You could try something along the following lines (that said, from an efficiency standpoint, you may want to check your options for restructuring the data set to begin with, using a schema that matches how you plan to actually consume/query it):
datatable(d:dynamic)
[
    dynamic(["John", "Dave"]),
    dynamic(["Janice", "Helen", "Amber"]),
    dynamic(["Jane"]),
    dynamic(["Jake", "Abraham", "Gunther", "Gabriel"]),
]
| extend r = rand()
| mv-expand with_itemindex = i d
| summarize b = make_bag(pack(strcat("Col", i + 1), d)) by r
| project-away r
| evaluate bag_unpack(b)
which will output:
|Col1 |Col2 |Col3 |Col4 |
|------|-------|-------|-------|
|John |Dave | | |
|Janice|Helen |Amber | |
|Jane | | | |
|Jake |Abraham|Gunther|Gabriel|
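A note on the design: extend r = rand() only manufactures a per-record grouping key, since the datatable has no natural row id. A variant of the same idea (a sketch, not verified) uses serialize with row_number() instead, which avoids the small chance of two records drawing the same random value:
datatable(d:dynamic)
[
    dynamic(["John", "Dave"]),
    dynamic(["Jane"]),
]
| serialize
| extend r = row_number()
| mv-expand with_itemindex = i d
| summarize b = make_bag(pack(strcat("Col", i + 1), d)) by r
| project-away r
| evaluate bag_unpack(b)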
To extract key-value pairs from text and convert them to columns without hardcoding the key names in the query:
print message="2020-10-15T15:47:09 Metrics: duration=2280, function=WorkerFunction, count=0, operation=copy_into, invocationId=e562f012-a994-4fc9-b585-436f5b2489de, tid=lct_b62e6k59_prd_02, table=SALES_ORDER_SCHEDULE, status=success"
| extend Properties = extract_all(#"(?P<key>\w+)=(?P<value>[^, ]*),?", dynamic(["key","value"]), message)
| mv-apply Properties on (summarize make_bag(pack(tostring(Properties[0]), Properties[1])))
| evaluate bag_unpack(bag_)
| project-away message

In Postgres: Select columns from a set of arrays of columns and check a condition on all of them

I have a table with an id column and indicator columns x1...x4 and y1...y4 holding 0/1 values.
I want to perform a count over different sets of columns (all subsets where there is at least one element from X and one element from Y). How can I do that in Postgres?
For example, I may have {x1,x2,y3}, {x4,y1,y2,y3}, etc. I want to count the number of "id"s having 1 in every column of each set. So for the first set:
SELECT COUNT(id) FROM table WHERE x1=1 AND x2=1 AND y3=1;
and for the second set the same:
SELECT COUNT(id) FROM table WHERE x4=1 AND y1=1 AND y2=1 AND y3=1;
Is it possible to write a loop that goes over all these sets and queries the table accordingly? There will be more than 10,000 sets, so it cannot be done manually.
You should be able to convert the table columns to an array using ARRAY[col1, col2, ...], then use the array_positions function, setting the second parameter to the value you're checking for. So, given your example above, this query:
SELECT id, array_positions(array[x1,x2,x3,x4,y1,y2,y3,y4], 1)
FROM tbl
ORDER BY id;
Will yield this result:
+----+-------------------+
| id | array_positions |
+----+-------------------+
| a | {1,4,5} |
| b | {1,2,4,7} |
| c | {1,2,3,4,6,7,8} |
+----+-------------------+
Here's a SQL Fiddle.
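Building on that, one way to avoid looping over the 10,000+ sets one query at a time (a sketch under assumptions: the sets are stored in a hypothetical table named sets with a set_id column and a positions int[] column using the same column ordering as the array above):
SELECT s.set_id, COUNT(t.id) AS cnt
FROM sets s
LEFT JOIN (
    SELECT id,
           array_positions(array[x1,x2,x3,x4,y1,y2,y3,y4], 1) AS pos
    FROM tbl
) t ON t.pos @> s.positions
GROUP BY s.set_id;
A row counts toward a set when its positions array contains every position in that set, which is exactly the "1 in every column of the set" condition.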

How to unwind jsonb array into object per jsonb column based on object id?

Given an existing data structure similar to the following:
CREATE TEMP TABLE sample (id int, metadata_array jsonb, text_id_one jsonb, text_id_two jsonb);
INSERT INTO sample
VALUES ('1', '[{"id": "textIdOne", "data": "foo"},{"id": "textIdTwo", "data": "bar"}]'),
       ('2', '[{"id": "textIdOne", "data": "baz"},{"id": "textIdTwo", "data": "fiz"}]');
I'm trying to unwind the jsonb array of objects from the existing metadata column into new jsonb columns in the same table, which I've already created based on the known fixed list of id keys (textIdOne, textIdTwo, etc.).
I thought I was close using jsonb_populate_recordset(), but then realized that it populates columns from all of the jsonb object's keys; not what I want. The desired result is one object per column, matched on the object's id.
The only other tricky part of this operation is that my JSON objects' id values use camelCase and it seems one should avoid quoted/cased column names, BUT I don't mind quoting or modifying the column names as a means to an end, and once the update query is completed I can manually change the column names as needed.
I'm using PostgreSQL 9.5.2
Existing data & structure:
id | metadata_array jsonb | text_id_one jsonb | text_id_two jsonb
---------------------------------------------------------------------------------------------
1 | [{"id": "textIdOne"...}, {"id": "textIdTwo"...}] | NULL | NULL
2 | [{"id": "textIdOne"...}, {"id": "textIdTwo"...}] | NULL | NULL
Desired result:
id | metadata_array jsonb | text_id_one jsonb | text_id_two jsonb
-------------------------------------------------------------------------------
1 | [{"id": "textIdOne",... | {"id": "textIdOne"...} | {"id": "textIdTwo"...}
2 | [{"id": "textIdOne",... | {"id": "textIdOne"...} | {"id": "textIdTwo"...}
Clarifications:
Thanks for the answers thus far, everyone! Though I do know the complete list of keys (about 9), I cannot count on the ordering being consistent.
If all of the json arrays contain two elements for the two new columns then use fixed paths like in dmfay's answer. Otherwise you should unnest the arrays using jsonb_array_elements() twice, for text_id_one and text_id_two separately.
update sample t
set
    text_id_one = value1,
    text_id_two = value2
from sample s,
    jsonb_array_elements(s.metadata_array) as e1(value1),
    jsonb_array_elements(s.metadata_array) as e2(value2)
where s.id = t.id
    and value1->>'id' = 'textIdOne'
    and value2->>'id' = 'textIdTwo'
returning t.*
Test the query in SqlFiddle.
In case of more than two elements of the arrays this variant may be more efficient (and more convenient too):
update sample t
set
    text_id_one = arr1->0,
    text_id_two = arr2->0
from (
    select
        id,
        jsonb_agg(value) filter (where value->>'id' = 'textIdOne') as arr1,
        jsonb_agg(value) filter (where value->>'id' = 'textIdTwo') as arr2
    from sample,
        jsonb_array_elements(metadata_array)
    group by id
) s
where t.id = s.id
returning t.*
SqlFiddle.
You said the id list is 'fixed'; is the ordering of objects in metadata_array consistent? If so, you can do it with plain traversal:
UPDATE sample
SET text_id_one = metadata_array->0,
    text_id_two = metadata_array->1;
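Given the clarification that the ordering is not consistent, a minimal scalar-subquery variant may help (a sketch; it takes the first element matching each id, and should work on 9.5):
UPDATE sample
SET text_id_one = (SELECT value
                   FROM jsonb_array_elements(metadata_array)
                   WHERE value->>'id' = 'textIdOne'
                   LIMIT 1),
    text_id_two = (SELECT value
                   FROM jsonb_array_elements(metadata_array)
                   WHERE value->>'id' = 'textIdTwo'
                   LIMIT 1);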