PostgreSQL get json as record - sql

I would like to be able to get a json object as a record.
SELECT row_number() OVER () AS gid, feature->'properties' FROM dataset
The output of this query looks like this:
 gid | ?column? (json)
-----+-------------------------------
   1 | {"a": "1", "b":"2", "c": "3"}
   2 | {"a": "3", "b":"2", "c": "1"}
   3 | {"a": "1"}
The desired result:
 gid | a | b    | c
-----+---+------+------
   1 | 1 | 2    | 3
   2 | 3 | 2    | 1
   3 | 1 | null | null
I can't use json_to_record because I don't know the number of fields. However, all my fields are text.

There is no generic way to do that, because the number, type and name of columns have to be known at query parse time. So you would have to do:
SELECT row_number() OVER () AS gid,
CAST(feature #>> '{properties,a}' AS integer) AS a,
CAST(feature #>> '{properties,b}' AS integer) AS b,
CAST(feature #>> '{properties,c}' AS integer) AS c
FROM dataset;
Essentially, you have to know the columns ahead of time and hard code them in the query.
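The "know the columns ahead of time" requirement can still be satisfied dynamically with a two-pass approach: one pass to discover the key set, one to build the records. Here is a minimal plain-Python sketch of that idea (hypothetical data, not a PostgreSQL feature), the same work a query generator would do before hard-coding the columns into SQL:

```python
# Pass 1 discovers every key occurring in any JSON object; pass 2 emits one
# fixed-width record per row, padding absent keys with None (SQL NULL).
rows = [
    {"a": "1", "b": "2", "c": "3"},
    {"a": "3", "b": "2", "c": "1"},
    {"a": "1"},
]

# Pass 1: union of all keys, in a stable order.
columns = sorted({key for row in rows for key in row})

# Pass 2: one record per row, missing keys become None.
records = [
    {"gid": gid, **{col: row.get(col) for col in columns}}
    for gid, row in enumerate(rows, start=1)
]
```

A generator would then interpolate `columns` into `feature #>> '{properties,...}'` expressions and run the resulting statement as a second query.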

Related

Is there a way to search for an element inside a JSON array using sql in snowflake?

I have a column in a table which contains, for each row, a JSON array. I need to extract the same elements from it for each row; however, as it is an array, the order of the elements inside the array is not always the same and I can't call these elements by their names. Is there a way for me to do a for loop or something similar that goes through every index of the array and breaks when it doesn't return null?
An extension to Lukasz's great answer:
With a CTE with a couple of rows of "id, json" we can see how FLATTEN pulls it apart:
WITH fake_data(id, json) as (
SELECT column1, parse_json(column2) FROM VALUES
(1, '[1,2,3]'),
(2, '{"4":4, "5":5}')
)
SELECT t.*
,f.*
FROM fake_data AS t
,LATERAL FLATTEN(INPUT => t.json) f
 ID | JSON               | SEQ | KEY | PATH  | INDEX | VALUE | THIS
----+--------------------+-----+-----+-------+-------+-------+--------------------
  1 | [ 1, 2, 3 ]        |   1 |     | [0]   |     0 |     1 | [ 1, 2, 3 ]
  1 | [ 1, 2, 3 ]        |   1 |     | [1]   |     1 |     2 | [ 1, 2, 3 ]
  1 | [ 1, 2, 3 ]        |   1 |     | [2]   |     2 |     3 | [ 1, 2, 3 ]
  2 | { "4": 4, "5": 5 } |   2 | 4   | ['4'] |       |     4 | { "4": 4, "5": 5 }
  2 | { "4": 4, "5": 5 } |   2 | 5   | ['5'] |       |     5 | { "4": 4, "5": 5 }
FLATTEN gives SEQ, KEY, PATH, INDEX, VALUE, and THIS:
Seq: the row number of the input, which is super useful if you are pulling rows apart and want to merge them back together without mixing up different rows.
Key: the name of the property if the thing being FLATTEN'ed was an object, which is the case for the second row.
Path: how that value can be accessed, e.g. t.json[2] would give you 3.
Index: the position in the container if it's an array.
Value: the value itself.
This: the thing being looped over, useful for getting things like the next element, etc.
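To make those column semantics concrete, here is a rough plain-Python emulation (not Snowflake itself; a sketch of the observed behavior) of what FLATTEN emits for one input value: arrays yield INDEX with KEY null, objects yield KEY with INDEX null, and THIS is the container being looped:

```python
# Emulate one FLATTEN call per input row: lists produce positional rows,
# dicts produce per-property rows; SEQ tags which input row each came from.
def flatten(value, seq):
    if isinstance(value, list):
        for i, v in enumerate(value):
            yield {"SEQ": seq, "KEY": None, "PATH": f"[{i}]",
                   "INDEX": i, "VALUE": v, "THIS": value}
    elif isinstance(value, dict):
        for k, v in value.items():
            yield {"SEQ": seq, "KEY": k, "PATH": f"['{k}']",
                   "INDEX": None, "VALUE": v, "THIS": value}

rows = list(flatten([1, 2, 3], seq=1)) + list(flatten({"4": 4, "5": 5}, seq=2))
```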
There is no need to know the size of the array:
CREATE OR REPLACE TABLE tab_name
AS
SELECT 1 AS id, PARSE_JSON('[1,2,3]') AS col_array
UNION ALL
SELECT 2 AS id, PARSE_JSON('[1]') AS col_array;
Query:
SELECT t.id
,f.INDEX
,f.VALUE
FROM tab_name t
, LATERAL FLATTEN(INPUT => t.col_array) f
-- WHERE f.VALUE::INT = 1;
Output:
 ID | INDEX | VALUE
----+-------+------
  1 |     0 |     1
  1 |     1 |     2
  1 |     2 |     3
  2 |     0 |     1
Lateral flatten can help extract the fields of a JSON object and is a very good alternative to extracting them one by one using the respective names. However, sometimes the JSON object can be nested and normally extracting those nested objects requires knowing their names.
Here is an article that might help you to DYNAMICALLY EXTRACT THE FIELDS OF A MULTI-LEVEL JSON OBJECT USING LATERAL FLATTEN

SQL select all rows within a group by

With the following table design:
Table "devices"
 model | serial_number | active
-------+---------------+--------
 A     | 11111         | 1
 A     | 22222         | 1
 A     | 33333         | 1
 A     | 44444         | 0
 B     | XXXXX         | 1
 B     | YYYYY         | 1
I would like to retrieve the model, a count of the number of active devices (active = 1) for each model, and a list of all serial numbers for each model.
Expected output would be something like this:
[{
"model": "A",
"count": 3,
"serials": ["11111", "22222", "33333"]
}, {
"model": "B",
"count": 2,
"serials": ["XXXXX", "YYYYY"]
}]
I am able to retrieve the (grouped) models and count but how do I get the serial numbers?
SELECT model, count(*) as count
FROM devices
WHERE active = 1
GROUP BY model
I suspect I need a sub-query but I can't wrap my head around this. Thanks.
You can try FOR JSON PATH together with the STRING_AGG function, which concatenates the serial_number strings within each model.
QUOTENAME wraps the aggregated string in [] array brackets.
SELECT model,
count(*) 'count',
JSON_QUERY(QUOTENAME(STRING_AGG('"' + STRING_ESCAPE(serial_number, 'json') + '"', ','))) serial_number
FROM devices
WHERE active = 1
GROUP BY model
FOR JSON PATH
sqlfiddle
If you don't want a JSON result you can use the STRING_AGG function directly.
SELECT model,
count(*) 'count',
STRING_AGG(serial_number, ',') serial_number
FROM devices
WHERE active = 1
GROUP BY model
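The aggregation both queries perform can be sketched in plain Python (hypothetical data matching the table above): filter the active rows, group by model, then count and collect serials, which is what STRING_AGG / FOR JSON PATH assemble on the SQL Server side.

```python
# Group active devices by model, producing count + serial list per group.
devices = [
    ("A", "11111", 1), ("A", "22222", 1), ("A", "33333", 1),
    ("A", "44444", 0), ("B", "XXXXX", 1), ("B", "YYYYY", 1),
]

groups = {}
for model, serial, active in devices:
    if active == 1:                               # WHERE active = 1
        groups.setdefault(model, []).append(serial)  # ~ STRING_AGG

result = [{"model": m, "count": len(s), "serials": s}
          for m, s in sorted(groups.items())]     # GROUP BY model
```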

Expand nested JSONB data in SELECT query result

Assuming I have a table with JSONB data:
create table x (
id integer primary key generated always as identity,
name text,
data jsonb
);
Assuming data can have nested data, I would like to display all data inside data to have this kind of result:
 id | name  | data.a | data.b.0 | data.b.1 | data.c
----+-------+--------+----------+----------+--------
  1 | test  | 1      | foo      | bar      | baz
  2 | test2 | 789    | pim      | pam      | boom
Is there a way to do this without specifying all the JSONB properties names?
The JSONB_TO_RECORDSET() function might be used within such a SELECT statement:
SELECT a AS "data.a",
(b::JSONB) ->> 0 AS "data.b.0", (b::JSONB) ->> 1 AS "data.b.1",
c AS "data.c"
FROM x,
JSONB_TO_RECORDSET(data) AS j(a INT, b TEXT, c TEXT)
ORDER BY id
Presuming you have such JSONB values in the data column
[ { "a": 1, "b": ["foo","bar"], "c": "baz" }]
[ { "a": 789, "b": ["pim","pam"], "c": "boom" }]
Demo
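Note that the SQL above still names a, b, and c explicitly. The fully name-free variant, flattening arbitrarily nested JSON into dotted column names like data.b.0, is easiest to see as a recursive walk; here is a plain-Python sketch of that idea (not a built-in PostgreSQL function):

```python
# Recursively flatten nested JSON into {"dotted.path": scalar} pairs:
# dict keys and list indices each contribute one path segment.
def flatten_json(value, prefix="data"):
    out = {}
    if isinstance(value, dict):
        for k, v in value.items():
            out.update(flatten_json(v, f"{prefix}.{k}"))
    elif isinstance(value, list):
        for i, v in enumerate(value):
            out.update(flatten_json(v, f"{prefix}.{i}"))
    else:
        out[prefix] = value                # scalar: emit one column
    return out

row = flatten_json({"a": 1, "b": ["foo", "bar"], "c": "baz"})
```

Collecting the union of keys across all rows then yields the dynamic column list for the final result.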

EAV on postgres using jsonb

I'm trying to implement the EAV pattern using Attribute->Value tables, but unlike the standard way, the values are stored in a jsonb field like {"attrId": [values]}. It makes search requests easy, like:
SELECT * FROM products p WHERE p.attributes @> '{"1": [2]}' AND p.attributes @> '{"1": [4]}'
Now I'm wondering whether this is a good approach, and what an effective way is to calculate the count of available variations, for example:
-p1- {"width":[1]}
-p2- {"width":[2],"height":[3]}
-p3- {"width":[1]}
Output will be:
width: 1 (count 2); 2 (count 1)
height: 3 (count 1)
and when width 2 is selected:
width: 1 (count 0); 2 (count 1)
height: 3 (count 1)
"Flat is better than nested" -- the zen of python
I think you would be better served to use simple key/value pairs and in the rare event you have a complex value, then make it a list. But I don't see that use case.
Here is an example which answers your question. It could be modified to use your structure, but let's keep it simple:
First create a table and insert some JSON:
# create table foo (a jsonb);
# insert into foo values ('{"a":"1", "b":"2"}');
# insert into foo values ('{"c":"3", "d":"4"}');
# insert into foo values ('{"e":"5", "a":"6"}');
Here are the records:
# select * from foo;
a
----------------------
{"a": "1", "b": "2"}
{"c": "3", "d": "4"}
{"a": "6", "e": "5"}
(3 rows)
Here is the output of the json_each_text() function from https://www.postgresql.org/docs/9.6/static/functions-json.html
# select jsonb_each_text(a) from foo;
jsonb_each_text
-----------------
(a,1)
(b,2)
(c,3)
(d,4)
(a,6)
(e,5)
(6 rows)
Now we need to put it in a table expression to be able to get access to the individual fields:
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, (rec).value from t1;
key | value
-----+-------
a | 1
b | 2
c | 3
d | 4
a | 6
e | 5
(6 rows)
And lastly here is a grouping with the SUM function. Notice that the a key, which appeared twice in the database, has been properly summed.
# with t1 as (select jsonb_each_text(a) as rec from foo)
select (rec).key, sum((rec).value::int) from t1 group by (rec).key;
key | sum
-----+-----
c | 3
b | 2
a | 7
e | 5
d | 4
(5 rows)
As a final note, (rec) has parentheses around it because otherwise it is incorrectly looked at as a table and will result in this error:
ERROR: missing FROM-clause entry for table "rec"
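The unpack-then-aggregate pipeline above translates directly to plain Python (same hypothetical rows as the example): jsonb_each_text becomes iteration over each object's key/value pairs, and the GROUP BY with SUM becomes a Counter keyed on the JSON key.

```python
from collections import Counter

foo = [
    {"a": "1", "b": "2"},
    {"c": "3", "d": "4"},
    {"e": "5", "a": "6"},
]

sums = Counter()
for obj in foo:
    for key, value in obj.items():   # ~ jsonb_each_text(a)
        sums[key] += int(value)      # ~ sum((rec).value::int) GROUP BY key
```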

Find duplicated values on array column

I have a table with an array column like this:
my_table
id array
-- -----------
1 {1, 3, 4, 5}
2 {19,2, 4, 9}
3 {23,46, 87, 6}
4 {199,24, 93, 6}
And as a result I want to know which values are repeated and where, like this:
value_repeated is_repeated_on
-------------- -----------
4 {1,2}
6 {3,4}
Is it possible? I don't know how to do this. I don't even know how to start! I'm lost!
Use unnest to convert the array to rows, and then array_agg to build an array from the ids
It should look something like this:
SELECT v AS value_repeated, array_agg(id) AS is_repeated_on
FROM (SELECT id, unnest("array") AS v FROM my_table) t
GROUP BY v
HAVING count(DISTINCT id) > 1
Note that HAVING count(DISTINCT id) > 1 filters out values that do not appear in more than one row.
The clean way to call a set-returning function like unnest() is in a LATERAL join, available since Postgres 9.3:
SELECT value_repeated, array_agg(id) AS is_repeated_on
FROM my_table
, unnest(array_col) value_repeated
GROUP BY value_repeated
HAVING count(*) > 1
ORDER BY value_repeated; -- optional
About LATERAL:
Call a set-returning function with an array argument multiple times
There is nothing in your question to rule out duplicates within a single array (the same element more than once in the same array, as IMSoP commented), so it must be count(*), not count(DISTINCT id).
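The unnest + array_agg approach can be rendered in plain Python (hypothetical data matching the table above): explode each array into (value, id) pairs, group the ids by value, and keep values that occurred more than once overall, which is the count(*) variant of the HAVING clause.

```python
# Explode arrays to (value, id) pairs, group ids per value, keep repeats.
my_table = {1: [1, 3, 4, 5], 2: [19, 2, 4, 9],
            3: [23, 46, 87, 6], 4: [199, 24, 93, 6]}

seen = {}
for row_id, arr in my_table.items():           # ~ unnest(array_col)
    for v in arr:
        seen.setdefault(v, []).append(row_id)  # ~ array_agg(id)

repeated = {v: ids for v, ids in seen.items()  # ~ HAVING count(*) > 1
            if len(ids) > 1}
```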