Get Postgres Table Data as Json Without Field Names - sql

I want to convert Postgres table data to JSON without repeating the field names in the result. With the current PostgreSQL JSON functions, the result looks like this: [{"id":"1234","name":"XYZ"},....]. This way, every field name is repeated unnecessarily on every row, so we would rather avoid it to save network bandwidth.
We want a JSON result such as [["1234","XYZ"],....], so that the total length of the resulting JSON string is much smaller.

Well, you could use json(b)_build_array() to turn each record into an array. This requires you to enumerate the column names:
select jsonb_build_array(id, name) js from mytable
If you want all rows in a single array of arrays, then you can use aggregation on top of this:
select jsonb_agg(jsonb_build_array(id, name)) res from mytable
Demo on DB Fiddle:
select jsonb_agg(jsonb_build_array(id, name)) res
from (values(1, 'foo'), (2, 'bar')) as t(id, name)
| res |
| :----------------------- |
| [[1, "foo"], [2, "bar"]] |
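If enumerating the columns is impractical, one possible variation is to go through row_to_json() and json_each() so that the values are collected in column order. This is only a sketch and not part of the answer above: it assumes the json (not jsonb) functions, so that key order follows column order, and the aliases e, ord, row_vals and s are made up for illustration.

```sql
-- Sketch: array of arrays without listing the columns by name.
-- Assumes PostgreSQL 9.4+ (WITH ORDINALITY); all aliases are illustrative.
select json_agg(s.row_vals) as res
from (
    select (
        -- expand the row into (key, value) pairs and keep only the values,
        -- ordered by their position in the row
        select json_agg(e.value order by e.ord)
        from json_each(row_to_json(t)) with ordinality as e(key, value, ord)
    ) as row_vals
    from (values (1, 'foo'), (2, 'bar')) as t(id, name)
) s;
```

The per-row expansion does more work than jsonb_build_array(id, name), so listing the columns explicitly is probably still the better choice when the column list is known.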

Related

how to get the first tuple in a string column using presto

I have a column in a table whose data type is varchar, but it contains an array of tuples. What I need is to extract the first tuple of the array in each row.
This is the original table:
| userid | comments |
| :----- | :------- |
| 1 | [["hello world",1],["How did you",1],[" this is the one",1]] |
| 2 | [["hello ",1],["How ",1],[" this",1]] |
And this is what I am looking for. Please note that the data type of the 'comments' column is varchar:
| userid | comments |
| :----- | :------- |
| 1 | hello world |
| 2 | hello |
json_extract_scalar should do the trick:
WITH dataset (userid, comments) AS (
VALUES (1, json '[["hello world",1],["How did you",1],[" this is the one",1]]'),
(2, json '[["hello ",1],["How ",1],[" this",1]]')
)
-- query
select userid,
json_extract_scalar(comments, '$[0][0]') as comments
from dataset
Output:
| userid | comments |
| :----- | :------- |
| 1 | hello world |
| 2 | hello |
Note that this extracts only a single value. If you want multiple values, you will need to do some casting (similar to the one done here, but using arrays, for example array(json)).
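As a rough sketch of both cases, the query below keeps comments as varchar (as in the question) and then collects the first element of every tuple by casting to array(json). The column aliases first_comment and all_comments are made up for illustration, and the array variant is one possible approach rather than the only one.

```sql
-- Sketch for Presto; comments stays a varchar column, as in the question.
WITH dataset (userid, comments) AS (
    VALUES (1, '[["hello world",1],["How did you",1],[" this is the one",1]]'),
           (2, '[["hello ",1],["How ",1],[" this",1]]')
)
SELECT userid,
       -- single value: the function parses the varchar as JSON itself
       json_extract_scalar(comments, '$[0][0]') AS first_comment,
       -- multiple values: parse, cast to array(json), then take the first
       -- element of every inner tuple
       transform(
           CAST(json_parse(comments) AS array(json)),
           t -> json_extract_scalar(t, '$[0]')
       ) AS all_comments
FROM dataset
```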

PostgreSQL - Create an array of jsonb using group by clause

I have an SQL query that returns the data attached in the image. I would like to group the information and basically have just 2 rows (in this particular case). The row #1 would be: "evaluation_id": 1, "nombre": "Método: Jackson, Pollock & Ward" and "formulario" should be an array of json objects (in this case, records: 1,2,3 and 4). The row #2 would be: "evaluation_id": 2, "nombre": "Método: Medición Antropométrica Estándar" and "formulario" should be an array with only one json object (record #5).
| evaluation_id | nombre | formulario |
| :------------ | :----- | :--------- |
| 1 | Jackson, Pollock & Ward | [{json1},{json2},{json3},{json4}] |
| 2 | Medición Antropométrica Estándar | [{json5}] |
I have tried with functions like: array_to_json, json_object, json_aggr, json_array_elements but I couldn't make it work.
Please check whether the output of this query is what you expect:
-- if the column is of type json
select evaluation_id, nombre, json_agg(formulario)
from your_table
group by 1, 2
-- if the column is of type jsonb
select evaluation_id, nombre, jsonb_agg(formulario)
from your_table
group by 1, 2
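To try the jsonb variant without the real table, a self-contained version could look like the sketch below. The sample rows and the "campo" key are invented placeholders, since the original data was only available as an image.

```sql
-- Sketch with made-up sample rows; your_table stands in for the real query result.
with your_table (evaluation_id, nombre, formulario) as (
    values
        (1, 'Jackson, Pollock & Ward', '{"campo": "a"}'::jsonb),
        (1, 'Jackson, Pollock & Ward', '{"campo": "b"}'),
        (2, 'Medición Antropométrica Estándar', '{"campo": "c"}')
)
-- one output row per (evaluation_id, nombre), with the jsonb objects
-- collected into a single jsonb array
select evaluation_id, nombre, jsonb_agg(formulario) as formulario
from your_table
group by evaluation_id, nombre
order by evaluation_id;
```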

How to get unique values from a column of Arrays PostgreSQL

I am trying to extract the DISTINCT values from a column of arrays.
For example, If I have two rows:
{jonathan,michelle}
{jonathan,michael}
The output should be:
{jonathan,michelle,michael}
The output can be an array or a "virtual column"; either is fine.
The following seems to do the trick:
SELECT DISTINCT(unnest(tbl.ar)) FROM tbl
You can unnest and aggregate back, ignoring duplicates:
select array_agg(distinct u.val) new_ar
from mytable t
cross join lateral unnest(t.ar) as u(val)
Note that this does not guarantee the order in which elements will appear in the final array (there are options, but you did not specify what you wanted in that regard).
Demo on DB Fiddle:
| new_ar |
| :-------------------------- |
| {jonathan,michael,michelle} |
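If a deterministic order is wanted, one option is to add an ORDER BY inside the aggregate. The sketch below sorts alphabetically, which is an assumption about the desired order; with DISTINCT, the ORDER BY expression has to be the aggregated expression itself.

```sql
-- Sketch: distinct elements across all rows, sorted alphabetically.
with mytable (ar) as (
    values (array['jonathan', 'michelle']),
           (array['jonathan', 'michael'])
)
select array_agg(distinct u.val order by u.val) as new_ar
from mytable t
cross join lateral unnest(t.ar) as u(val);
```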

BigQuery - nested json - select where nested item equals

I have the following table in a BigQuery database:
Row | f0_
1 | {"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}
2 | {"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}
3 | {"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}
4 | {"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}
5 | {"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}
6 | {"configuration":[{"param1":"value1"}]}
f0_ column is a pure string.
Is there a way to write a select query where the "param2" value is equal to the array [3.0, 45], meaning it would only return rows 1 and 2? Preferably, I would like to accomplish this without directly indexing the first element in the "configuration" array, as the order might not be guaranteed.
Below is for BigQuery Standard SQL:
#standardSQL
SELECT line
FROM `project.dataset.table`
WHERE REGEXP_EXTRACT(JSON_EXTRACT(line, '$.configuration'), r'{"param2":(.*?)}') = '[3.0,45]'
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}' line UNION ALL
SELECT '{"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}' UNION ALL
SELECT '{"configuration":[{"param1":"value1"}]}'
)
SELECT line
FROM `project.dataset.table`
WHERE REGEXP_EXTRACT(JSON_EXTRACT(line, '$.configuration'), r'{"param2":(.*?)}') = '[3.0,45]'
with the result:
Row | line
1 | {"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}
2 | {"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}
Regarding the request to avoid directly indexing into the "configuration" array: this solution does not depend on the position of "param2" in the array.
You can use some of BQ's neat JSON functions as described here.
Based on that, you can locate param2 and check if its value matches what you're looking for. If you aren't sure of the configuration order, you can iterate through the array to find param2, but it's not particularly efficient. I recommend you try to find a way where param2 is always the second field in the array. I was able to get the correct results like so:
SELECT json_text AS correct_configurations
FROM UNNEST([
'{"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}',
'{"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}',
'{"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}',
'{"configuration":[{"param1":"value1"},{"param2":[3.0,46]}]}',
'{"configuration":[{"param1":"value1"},{"param2":[3.0,30]}]}',
'{"configuration":[{"param1":"value1"}]}'
])
AS json_text
WHERE JSON_EXTRACT(json_text, '$.configuration[1].param2') LIKE "[3.0,45]";
Gives a result of:
Row | correct_configurations
1 | {"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}
2 | {"configuration":[{"param1":"value2"},{"param2":[3.0,45]}]}
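The idea of iterating through the array to find param2, mentioned above, could be sketched as follows with JSON_EXTRACT_ARRAY and UNNEST. The second sample row, with param2 moved to the front, is an invented variation added to show that the position does not matter. The comparison is still a plain string comparison, so it assumes the stored array is formatted exactly as [3.0,45].

```sql
#standardSQL
-- Sketch: check every element of "configuration" instead of a fixed index.
WITH `project.dataset.table` AS (
  SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,45]}]}' AS line UNION ALL
  SELECT '{"configuration":[{"param2":[3.0,45]},{"param1":"value2"}]}' UNION ALL
  SELECT '{"configuration":[{"param1":"value1"},{"param2":[3.0,36]}]}' UNION ALL
  SELECT '{"configuration":[{"param1":"value1"}]}'
)
SELECT line
FROM `project.dataset.table`
WHERE EXISTS (
  -- each item is one element of the array, as a JSON-formatted string
  SELECT 1
  FROM UNNEST(JSON_EXTRACT_ARRAY(line, '$.configuration')) AS item
  WHERE JSON_EXTRACT(item, '$.param2') = '[3.0,45]'
)
```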

How to get index of an array value in PostgreSQL?

I have a table called pins like this:
id (int) | pin_codes (jsonb)
--------------------------------
1 | [4000, 5000, 6000]
2 | [8500, 8400, 8600]
3 | [2700, 2300, 2980]
Now, I want the row with pin_code 8600 and with its array index. The output must be like this:
pin_codes | index
------------------------------
[8500, 8400, 8600] | 2
If I want the row with pin_code 2700, the output :
pin_codes | index
------------------------------
[2700, 2300, 2980] | 0
What I've tried so far:
SELECT pin_codes FROM pins WHERE pin_codes @> '[8600]'
It only returns the row with the wanted value. I don't know how to get the index of the value in the pin_codes array.
Any help would be greatly appreciated.
P.S:
I'm using PostgreSQL 10
If you were storing the array as a real array rather than as JSON, you could use array_position() to find the (first) index of a given element:
select array_position(array['one', 'two', 'three'], 'two')
returns 2
With some text mangling you can cast the JSON array into a text array:
select array_position(translate(pin_codes::text,'[]','{}')::text[], '8600')
from the_table;
This also allows you to use the ANY operator:
select *
from pins
where '8600' = any(translate(pin_codes::text,'[]','{}')::text[])
The contains operator @> expects arrays on both sides of the operator. You could use it to search for two pin codes at a time:
select *
from pins
where translate(pin_codes::text,'[]','{}')::text[] @> array['8600','8400']
Or use the overlaps operator && to find rows with any of multiple elements:
select *
from pins
where translate(pin_codes::text,'[]','{}')::text[] && array['8600','2700']
would return
id | pin_codes
---+-------------------
2 | [8500, 8400, 8600]
3 | [2700, 2300, 2980]
If you do that a lot, it would be more efficient to store the pin_codes as text[] rather than JSON. Then you can also index that column to make searches more efficient.
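A minimal sketch of that layout is shown below. The table name pins_arr and the index name are hypothetical, and the GIN index is one common choice for array containment searches, not something prescribed above.

```sql
-- Sketch: pin codes stored as a native text[] column (hypothetical table pins_arr).
create table pins_arr (
    id        int primary key,
    pin_codes text[] not null
);

insert into pins_arr (id, pin_codes) values
    (1, array['4000', '5000', '6000']),
    (2, array['8500', '8400', '8600']),
    (3, array['2700', '2300', '2980']);

-- A GIN index supports the array operators @>, <@ and &&.
create index pins_arr_pin_codes_idx on pins_arr using gin (pin_codes);

-- array_position() is 1-based; subtract 1 for the zero-based index asked for.
select id, pin_codes, array_position(pin_codes, '8600') - 1 as index
from pins_arr
where pin_codes @> array['8600'];
```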
Use the function jsonb_array_elements_text() with ordinality:
with my_table(id, pin_codes) as (
values
(1, '[4000, 5000, 6000]'::jsonb),
(2, '[8500, 8400, 8600]'),
(3, '[2700, 2300, 2980]')
)
select id, pin_codes, ordinality - 1 as index
from my_table, jsonb_array_elements_text(pin_codes) with ordinality
where value::int = 8600;
id | pin_codes | index
----+--------------------+-------
2 | [8500, 8400, 8600] | 2
(1 row)
As has been pointed out previously, the array_position function is only available in Postgres 9.5 and greater.
Here is a custom function that achieves the same, derived from nathansgreen at GitHub.
-- The array_position function was added in Postgres 9.5.
-- For older versions, you can get the same behavior with this function.
create function array_position(arr ANYARRAY, elem ANYELEMENT, pos INTEGER default 1) returns INTEGER
language sql
as $BODY$
select row_number::INTEGER
from (
select unnest, row_number() over ()
from ( select unnest(arr) ) t0
) t1
where row_number >= greatest(1, pos)
and (case when elem is null then unnest is null else unnest = elem end)
limit 1;
$BODY$;
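So in this specific case, after creating the function, the following worked for me (note that the function takes a native array, so this assumes pin_codes is stored as an integer array rather than jsonb):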
SELECT
pin_codes,
array_position(pin_codes, 8600) AS index
FROM pins
WHERE array_position(pin_codes, 8600) IS NOT NULL;
Bear in mind that it only returns the index of the first occurrence of 8600. You can use the pos argument to start the search from a later position and so reach subsequent occurrences.
In short, normalize your data structure, or don't do this in SQL. If you want this index of the sub-data element given your current data structure, then do this in your application code (take result, cast to list/array, get index).
Try to unnest the string and assign numbers as follows:
with dat as
(
select 1 id, '8700, 5600, 2300' pins
union all
select 2 id, '2300, 1700, 1000' pins
)
select dat.*, t.rn as index
from
(
select id, t.pins, row_number() over (partition by id) rn
from
(
select id, trim(unnest(string_to_array(pins, ','))) pins from dat
) t
) t
join dat on dat.id = t.id and t.pins = '2300'
If you insist on storing arrays, I'd defer to klin's answer.
As an alternative answer and an extension to my comment: don't store SQL data in arrays. Normalize your data in advance and SQL will handle it significantly better. Klin's answer is good, but may suffer in performance, as it is outside of what SQL does best.
I'd break up the array prior to storing it. If the number of pin codes is known, then simply having a table with the columns pin_id, pin1, pin2, pin3, pinetc... is functional.
If the number of pins is unknown, use a first table, pin, that stores the pin_id and any info columns related to that pin ID, and a second table with the columns pin_id, pin_seq, pin_value (though you may need to pivot this later on to make sense of the data). In this case, select pin_seq where pin_value = 260 would work.
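The two-table layout could be sketched as below. The table name pin_values, the column types and the zero-based pin_seq are assumptions made for illustration; the sample rows reuse the pin codes from the question.

```sql
-- Sketch of the normalized layout; names and types are assumptions.
create table pin (
    pin_id int primary key
    -- any info columns related to this pin ID would go here
);

create table pin_values (
    pin_id    int not null references pin (pin_id),
    pin_seq   int not null,   -- position of the value within the original array
    pin_value int not null,
    primary key (pin_id, pin_seq)
);

insert into pin (pin_id) values (1), (2), (3);

insert into pin_values (pin_id, pin_seq, pin_value) values
    (1, 0, 4000), (1, 1, 5000), (1, 2, 6000),
    (2, 0, 8500), (2, 1, 8400), (2, 2, 8600),
    (3, 0, 2700), (3, 1, 2300), (3, 2, 2980);

-- The index lookup becomes a plain filter on pin_value.
select pin_id, pin_seq as index
from pin_values
where pin_value = 8600;
```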