How to convert array elements from string to int in Postgres?

I have a jsonb column in a table holding values like [14,21,31],
and I want to get all the rows that contain any of the selected elements, e.g.
SELECT *
FROM t_customers
WHERE tags ?| array['21','14']
but the jsonb elements are integers, not strings.
How do I convert the SQL array elements into integers?
I tried removing the quotes from the array, but that gives an error.

A naive solution would be:
t=# with t_customers(tags) as (values('[14,21,31]'::jsonb))
select
tags
, translate(tags::text,'[]','{}')::int[] jsonb_int_to_arr
, translate(tags::text,'[]','{}')::int[] @> array['21','14']::int[] includes
from
t_customers;
tags | jsonb_int_to_arr | includes
--------------+------------------+----------
[14, 21, 31] | {14,21,31} | t
(1 row)
https://www.postgresql.org/docs/current/static/functions-array.html
If you cast to an array, use the @> operator to check containment.
(At first I proposed this because I misunderstood the question, so it goes the opposite way: it turns the jsonb into an array and checks whether that array contains the values. But maybe this naive approach is the shortest.)
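Note that @> requires every listed value to be present, whereas the ?| in the question asks for any of them. A minimal sketch of the difference on plain int[] literals (the values are made up for illustration), using the array overlap operator && for the "any" case:

```sql
SELECT ARRAY[14,21,31] @> ARRAY[21,14] AS contains_all,     -- true: both 21 and 14 are present
       ARRAY[14,21,31] @> ARRAY[21,99] AS contains_all_99,  -- false: 99 is missing
       ARRAY[14,21,31] && ARRAY[21,99] AS overlaps_any;     -- true: 21 is shared
```

So if "any of these values" is what you need after casting to int[], && is probably the operator to reach for.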
The right approach here would probably be:
t=# with t_customers(tags) as (values('[14,21,31]'::jsonb))
, t as (select tags,jsonb_array_elements(tags) from t_customers)
select jsonb_agg(jsonb_array_elements::text) ?| array['21','14'] tags from t group by tags;
tags
------
t
(1 row)
which basically "repacks" the jsonb array with text representations of the integers.
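Another option, sketched here under the assumption that tags always holds a JSON array of numbers, is to leave the stored data alone and compare jsonb with jsonb. The id column and the second row are invented for the example:

```sql
-- Hypothetical sample rows standing in for the asker's t_customers table.
WITH t_customers(id, tags) AS (
    VALUES (1, '[14,21,31]'::jsonb),
           (2, '[5,6]'::jsonb)
)
SELECT id, tags
FROM t_customers
-- jsonb containment: a primitive on the right may match an element of an array on the left.
WHERE tags @> '21'::jsonb
   OR tags @> '14'::jsonb;
```

This should return only the row with id 1. Since both sides are jsonb, no conversion between strings and integers is involved.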

Related

Search strings in json array column

I have this PostgreSQL table:
id | something
1 | ["something1", "something2", "something3"]
2 | ["something1"]
3 | ["something2", "something4"]
I am using this query to get all the rows that have the string something1 in the something column:
select * from my_table where (something)::jsonb ? 'something1'
How can I modify this query (or is there a better way) to get all the rows that contain something1 OR something2?
To check for a single value, you can use ?:
where something::jsonb ? 'something1'
To check for several possible values, use ?| against a text array:
where something::jsonb ?| array['something1', 'something2']
This checks if any value from the array exists in the jsonb array. If you want to check if all array elements exist in the jsonb payload, then use ?& instead.
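To make the difference between the two operators concrete, here is a small sketch on a literal value (the sample array is made up):

```sql
SELECT '["something1", "something3"]'::jsonb ?| array['something1', 'something2'] AS any_match,  -- true: something1 is present
       '["something1", "something3"]'::jsonb ?& array['something1', 'something2'] AS all_match;  -- false: something2 is missing
```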

Remove numeric item from JsonB array

I have a jsonb value with a nested JSON array and need to remove an element:
{"values": ["11", "22", "33"]}
jsonb_set(column_name, '{values}', ((column_name -> 'values') - '33')) -- WORKS!
I also have a similar jsonb value with numbers, not strings:
{"values": [11, 22, 33]}
jsonb_set(column_name, '{values}', ((column_name -> 'values') - 33)) -- FAILS!
In this case, 33 is used as an index into the array.
How can I remove items from a JSON array when those items are numbers?
Two assertions:
1. Many Postgres JSON functions and operators target the key in key/value pairs. Strings ("abc" or "33") in JSON arrays are treated like keys without value. But numeric (33 or 123.45) array elements are treated as values.
2. There are currently three variants of the - operator. Two of them apply here. As the recently clarified manual describes (currently /devel):
| Operator | Description | Example(s) |
| :--- | :--- | :--- |
| jsonb - text → jsonb | Deletes a key (and its value) from a JSON object, or matching string value(s) from a JSON array. | '{"a": "b", "c": "d"}'::jsonb - 'a' → {"c": "d"}; '["a", "b", "c", "b"]'::jsonb - 'b' → ["a", "c"] |
| ... | | |
| jsonb - integer → jsonb | Deletes the array element with specified index (negative integers count from the end). Throws an error if JSON value is not an array. | '["a", "b"]'::jsonb - 1 → ["a"] |
With the right operand being a numeric literal, Postgres operator type resolution arrives at the latter variant.
Unfortunately, we cannot use the former variant to begin with, due to assertion 1.
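A quick sketch of how the two variants behave on literal arrays (the explicit ::text casts are only there to make the chosen variant obvious):

```sql
SELECT '["11", "22", "33"]'::jsonb - '33'::text AS strings_by_value,  -- ["11", "22"]
       '[11, 22, 33]'::jsonb - '33'::text       AS numbers_by_text,   -- [11, 22, 33] (no string element matches)
       '[11, 22, 33]'::jsonb - 1                AS numbers_by_index;  -- [11, 33] (element at index 1 removed)
```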
So we have to use a workaround like:
SELECT jsonb_set(column_name
, '{values}'
, (SELECT jsonb_agg(val)
FROM jsonb_array_elements(t.column_name -> 'values') x(val)
WHERE val <> jsonb '33')
) AS column_name
FROM tbl t;
db<>fiddle here -- with extended test case
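One edge case to keep in mind with this pattern: if every element is filtered out, jsonb_agg() returns NULL, and jsonb_set() (being strict) then returns NULL for the whole value. A possible guard, assuming you want an empty array in that case, is COALESCE. The one-row tbl below is made up for the demonstration:

```sql
-- Hypothetical row whose only array element is the one being removed.
WITH tbl(column_name) AS (
    VALUES ('{"values": [33]}'::jsonb)
)
SELECT jsonb_set(t.column_name
               , '{values}'
               , COALESCE((SELECT jsonb_agg(val)
                           FROM   jsonb_array_elements(t.column_name -> 'values') x(val)
                           WHERE  val <> jsonb '33')
                        , '[]'::jsonb)  -- fall back to an empty array instead of NULL
                ) AS column_name
FROM   tbl t;
-- {"values": []}
```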
Do not cast unnested elements to integer (like another answer suggests).
Numeric values may not fit integer.
JSON arrays (unlike Postgres arrays) can hold a mix of element types. So some array elements may be numeric, but others string, etc.
It's more expensive to cast all array elements (on the left). Just cast the value to replace (on the right).
So this works for any types, not just integer (JSON numeric). Example:
'{"values": ["abc", "22", 33]}'
Unfortunately, the Postgres jsonb operator - only supports string values, as explained in the documentation:
operator: -
right operand type: text
description: Delete key/value pair or string element from left operand. Key/value pairs are matched based on their key value.
On the other hand, if you pass an integer value as right operand, Postgres considers it the index of the array element that needs to be removed.
An alternative option is to unnest the array with jsonb_array_elements() and a lateral join, filter out the unwanted value, then re-aggregate:
select jsonb_set(column_name, '{values}', new_values) new_column_name
from mytable t
left join lateral (
select jsonb_agg(val) new_values
from jsonb_array_elements(t.column_name -> 'values') x(val)
where val::int <> 33
) x on 1 = 1
Demo on DB Fiddle:
with mytable as (select '{"values": [11, 22, 33]}'::jsonb column_name)
select jsonb_set(column_name, '{values}', new_values) new_column_name
from mytable t
left join lateral (
select jsonb_agg(val) new_values
from jsonb_array_elements(t.column_name -> 'values') x(val)
where val::int <> 33
) x on 1 = 1
| new_column_name |
| :------------------- |
| {"values": [11, 22]} |

unnest() not exploding array, returns error Column alias list has 1 entries but 't' has 2 columns available

I have some json data which includes a property 'characters' and it looks like this:
select json_data['characters'] from latest_snapshot_events
Returns: [{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]
This is returned on a single row. I would like a single row for each item within the array.
I found several SO posts and other blogs advising me to use unnest(). I've tried this several times and cannot get a result to return. For example, here is the documentation from presto. The bottom covers unnest as a stand in for hive's lateral view explode:
SELECT student, score
FROM tests
CROSS JOIN UNNEST(scores) AS t (score);
So I tried to apply this to my table:
characters as (
select
jdata.characters
from latest_snapshot_events
cross join unnest(json_data) as t(jdata)
)
select * from characters;
where json_data is the field in latest_snapshot_events that contains the property 'characters', which is an array like the one shown above.
This returns an error:
[Simba]AthenaJDBC An error has been thrown from the AWS Athena client. SYNTAX_ERROR: line 69:12: Column alias list has 1 entries but 't' has 2 columns available
How can I unnest/explode latest_snapshot_events.json_data['characters'] onto multiple rows?
Since characters is a JSON array in textual representation, you'll have to:
1. Parse the JSON text with json_parse to produce a value of type JSON.
2. Convert the JSON value into a SQL array using CAST.
3. Explode the array using UNNEST.
For instance:
WITH data(characters) AS (
VALUES '[{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]'
)
SELECT entry
FROM data, UNNEST(CAST(json_parse(characters) AS array(json))) t(entry)
which produces:
entry
-----------------------------------------------------------------------
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,...
In the example above, I convert the JSON value into an array(json), but
you can further convert it to something more concrete if the values inside each
array entry have a regular schema. For example, for your data, it is
possible to cast it to an array(map(varchar, json)) since every element in the
array is a JSON object.
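As a rough sketch of that more concrete cast (Presto/Athena SQL; the two-element sample and the chosen keys are a trimmed-down, hypothetical version of the data above), each entry becomes a map whose values can be looked up by key and cast individually:

```sql
WITH data(characters) AS (
    -- Shortened sample with only two keys per object, for readability.
    VALUES '[{"ITEM":10,"ITEM_LEVEL":3},{"ITEM":39,"ITEM_LEVEL":2}]'
)
SELECT CAST(entry['ITEM'] AS bigint)        AS item,
       CAST(entry['ITEM_LEVEL'] AS integer) AS item_level
FROM data
CROSS JOIN UNNEST(CAST(json_parse(characters) AS array(map(varchar, json)))) AS t(entry)
```

This should give one row per array element, with ITEM and ITEM_LEVEL as ordinary typed columns.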
json_parse works if your initial data is a JSON string. However, for array(row) types (i.e. an array of objects/dictionaries), casting to array(json) will convert each row into an array, removing all keys from the object and preventing you from using dot notation or json_extract functions.
To unnest array(row) data, the syntax is much simpler:
CROSS JOIN UNNEST(my_array) AS my_row
I got stuck with this error trying to unpivot data.
This might help someone:
SELECT a_col, b_col
FROM
(
SELECT MAP(
ARRAY['a', 'b', 'c', 'd'],
ARRAY[1, 2, 3, 4]
) my_col
) CROSS JOIN UNNEST(my_col) as t(a_col, b_col)
t() allows you define multiple columns as outputs.

Get max on comma separated values in column

How can I get the max of the comma-separated values in the Original_Ids column, with the max value in one column and the remaining ids in a different column?
|Original_Ids | Max_Id| Remaining_Ids |
|123,534,243,345| 534 | 123,243,345 |
Update:
What if I already have Max_id and just need the equation below?
Remaining_Ids = Original_Ids - Max_id
Thanks
Thanks to the excellent array manipulation features in Postgres, this can be done relatively easily by converting the string to an array and from there to a set.
Then regular queries on that set are possible. With max() the maximum can be selected, and with EXCEPT ALL the maximum can be removed from the set.
The set can then be converted back to an array, and with array_to_string() the array can be converted to a delimited string again.
SELECT ids original_ids,
(SELECT max(un.id::integer)
FROM unnest(string_to_array(ids,
',')) un(id)) max_id,
array_to_string(ARRAY((SELECT un.id::integer
FROM unnest(string_to_array(ids,
',')) un(id)
EXCEPT ALL
SELECT max(un.id::integer)
FROM unnest(string_to_array(ids,
',')) un(id))),
',') remaining_ids
FROM elbat;
Another option would have been regexp_split_to_table(), which directly produces a set (or regexp_split_to_array(), but then we'd have the possible regular expression overhead and would still have to convert the array to a set).
Nevertheless, you should (almost) never use delimited lists (nor arrays). Use a table; that's (almost) always the best option.
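To illustrate what "use a table" could look like, here is a minimal sketch with one row per id. The ids relation and its list_id column are hypothetical names, not part of the question:

```sql
-- Hypothetical normalized layout: one row per id instead of a delimited string.
WITH ids(list_id, id) AS (
    VALUES (1, 123), (1, 534), (1, 243), (1, 345)
)
SELECT list_id,
       max(id) AS max_id,                                               -- 534
       array_remove(array_agg(id ORDER BY id), max(id)) AS remaining_ids -- {123,243,345}
FROM ids
GROUP BY list_id;
```

With this layout no string splitting or casting is needed, and the max is a plain aggregate.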
SQL Fiddle
You can use a window function (https://www.postgresql.org/docs/current/static/tutorial-window.html) to get the max element per unnested array. After that you can reaggregate the elements and remove the calculated max value from the array.
Result:
| a | max_elem | remaining |
| :--- | :--- | :--- |
| 123,534,243,345 | 534 | 123,243,345 |
| 3,23,1 | 23 | 3,17 |
| 42 | 42 | |
| 56,123,234,345,345 | 345 | 56,123,234 |
This query needs only one split/unnest as well as only one max calculation.
SELECT
a,
max_elem,
array_remove(array_agg(elements), max_elem) as remaining -- C
FROM (
SELECT
*,
MAX(elements) OVER (PARTITION BY a) as max_elem -- B
FROM (
SELECT
a,
unnest((string_to_array(a, ','))::int[]) as elements -- A
FROM arrays
)s
)s
GROUP BY a, max_elem
A: string_to_array converts the string list into an array. Because the result is a text array, you need to cast it to an integer array by adding ::int[]. unnest() then expands the array elements into their own rows.
B: the window function MAX gives the maximum value of each array as max_elem.
C: array_agg reaggregates the elements through the GROUP BY a, max_elem. After that, array_remove removes the max_elem value from the array.
If you would rather store them as a string list again instead of pure arrays, you could add array_to_string. But I wouldn't recommend this, because your data are integer arrays and not strings; every further calculation would need this string cast again. An even better way (as already stated by @stickybit) is not to store the elements as arrays but as unnested data. As you can see, nearly every operation has to do the unnest first.
Note:
It would be better to use an ID to address the columns/arrays instead of the original string, as in SQL Fiddle with IDs
If you install the extension intarray this is quite easy.
First you need to create the extension (you have to be superuser to do that):
create extension intarray;
Then you can do the following:
select original_ids,
original_ids[1] as max_id,
sort(original_ids - original_ids[1]) as remaining_ids
from (
select sort_desc(string_to_array(original_ids,',')::int[]) as original_ids
from bad_design
) t
But you shouldn't be storing comma-separated values to begin with.

Remove n elements from array using start and end index

I have the following table in a Postgres database:
CREATE TABLE test(
id SERIAL NOT NULL,
arr int[] NOT NULL
)
And the array contains about 500k elements.
I would like to know if there is an efficient way to update arr column by removing a set of elements from the array given the start and end index or just the number of "n first elements" to remove.
You can access individual elements or ranges of elements:
For example, if you want to remove elements 5 to 8, you can do:
select arr[1:4]||arr[9:]
from test;
or as an update:
update test
set arr = arr[1:4]||arr[9:];
To remove the "first n elements", just use the slice starting at element n+1, e.g. to remove the first 5 elements:
select arr[6:]
from test;
The syntax arr[6:] requires Postgres 9.6 or later; for earlier versions you need
select arr[6:cardinality(arr)]
from test;
cardinality() was introduced in 9.4; if you are using an even older version, you need:
select arr[6:array_length(arr,1)]
from test;
You can use slices (see 8.15.3. Accessing Arrays).
create table example
as select array[1,2,3,4,5,6,7,8] arr;
Remove first 3 elements:
select arr[4:8]
from example;
arr
-------------
{4,5,6,7,8}
(1 row)
Remove elements from 4 to 5:
select arr[1:3] || arr[6:8] as arr
from example;
arr
---------------
{1,2,3,6,7,8}
(1 row)
Remove first 5 elements if the length of the array is unknown:
select arr[6:array_length(arr,1)]
from example;
arr
---------
{6,7,8}
(1 row)