PostgreSQL: query an array text[] by a specific element

This is the data that I'm trying to query:
Table name: "test", column "data"
7;"{{Hello,50},{Wazaa,90}}"
8;"{{Hello,50},{"Dobar Den",15}}"
To query this data I'm using this SQL query:
SELECT *, pg_column_size(data) FROM test WHERE data[1][1] = 'Hello'
How can I search across all parent elements, but only in the first sub-element and not in the second? For example:
SELECT *, pg_column_size(data) FROM test WHERE data[][1] = 'Hello'
because if I search like this:
SELECT *, pg_column_size(data) FROM test WHERE data[1][1] = "Wazaa"
it won't return anything because I'm hardcoding to look at first sub element and I have to modify it like this:
SELECT *, pg_column_size(data) FROM test WHERE data[2][1] = 'Wazaa'
How can I make it check all parent elements, but only the first sub-element of each?
There is a solution using "ANY" to query all elements, but I don't want the WHERE clause to touch the second sub-element: if I have numbers in the first sub-element, it would also match the second parameter, which is a number as well.
SELECT * FROM test WHERE '90' = ANY (data);

PostgreSQL's support for arrays is not particularly good. You can unnest a 1-dimensional array easily enough, but an n-dimensional array is completely flattened, rather than unnested along only the first dimension. Still, you can use this approach to find the desired set of records, but it is rather ugly:
SELECT test.*, pg_column_size(test.data) AS column_size
FROM test
JOIN (SELECT id, unnest(data) AS strings FROM test) AS id_strings USING (id)
WHERE id_strings.strings = 'Wazaa';
Alternatively, write a function to reduce a 2-dimensional array to records of 1-dimensional arrays; then you can basically use all of the SQL queries in your question.
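A minimal sketch of such a helper, assuming Postgres 9.3+ for the implicit LATERAL join (the name unnest_2d_1d is my own choice; this is a common community idiom, not a built-in):

CREATE OR REPLACE FUNCTION unnest_2d_1d(anyarray)
  RETURNS SETOF anyarray
  LANGUAGE sql IMMUTABLE STRICT AS
$$
-- Emit one 1-dimensional array per row of the 2-dimensional input
SELECT array_agg($1[d1][d2] ORDER BY d2)
FROM   generate_subscripts($1, 1) AS d1,
       generate_subscripts($1, 2) AS d2
GROUP  BY d1
ORDER  BY d1;
$$;

With it, matching only the first sub-element of each parent element becomes:

SELECT t.*, pg_column_size(t.data)
FROM test t, unnest_2d_1d(t.data) AS elem
WHERE elem[1] = 'Wazaa';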

Related

How to check if array contains an item in JSON column using SQLite?

I'm using SQLite to store JSON data that I have no control over. I have a logs table that looks like this:
id                         value
s8i13s85e8f34zm8vikkcv5n   {"key":["a","b"]}
m2abxfn2n9pkyc9kjmko5462   {"key": "sometext"}
Then I use the following query to get the rows where value.key contains a:
SELECT * FROM logs WHERE EXISTS (SELECT * FROM json_each(json_extract(logs.value,'$.key')) WHERE json_each.value = 'a')
The query works fine if key is an array or if it doesn't exist. But it fails if it is a string (like in the second row of the table).
The error I get is:
SQL error or missing database (malformed JSON)
That is because json_each throws if the parameter is a string.
Because of the requirements I can't control the user data or the queries.
Ideally I would like to figure out a query that either doesn't fail or that detects that the value is a string instead of an array and uses LIKE to see if the string contains 'a'.
Any help would be appreciated. Happy holidays :)
Use a CASE expression in the WHERE clause which checks if the value is an array or not:
SELECT *
FROM logs
WHERE CASE
        WHEN value LIKE '{"key":[%]}' THEN
          EXISTS (
            SELECT *
            FROM json_each(json_extract(logs.value, '$.key'))
            WHERE json_each.value = 'a'
          )
        ELSE json_extract(value, '$.key') = 'a'
      END;
See the demo.
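If whitespace inside the stored JSON is a concern, the LIKE test can be swapped for json_type(), which is also part of SQLite's JSON1 functions; a hedged variant of the same idea:

SELECT *
FROM logs
WHERE CASE
        WHEN json_type(value, '$.key') = 'array' THEN
          EXISTS (
            SELECT 1
            FROM json_each(logs.value, '$.key')
            WHERE json_each.value = 'a'
          )
        ELSE json_extract(value, '$.key') = 'a'
      END;

Like the original, this still assumes each value cell is well-formed JSON.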

Convert strings into table columns in BigQuery

I would like to convert this table to something like this (both were shown as screenshots in the original post). The long string can be dynamic, so it's important to me that it's not a fixed solution for these specific values.
Please help, I'm using BigQuery.
You could start by using SPLIT(value[, delimiter]) to convert your long string into separate key-value pairs in an array.
This will be sensitive to commas appearing as part of your values.
SPLIT(session_experiments, ',')
Then you could either FLATTEN that array or access each element, and then use some regular expressions to separate the key and the value.
If you share more context on your restrictions and intended result I could try and put together a query for you that does exactly what you want.
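A hedged sketch of that idea in standard SQL (the sample string and column aliases are made up for illustration):

SELECT
  REGEXP_EXTRACT(pair, r'^([^-]+)') AS experiment,   -- everything before the first '-'
  REGEXP_EXTRACT(pair, r'^[^-]+-(.*)$') AS value     -- everything after it
FROM UNNEST(SPLIT('test1-val1,test2-val2,test3-val3', ',')) AS pair;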
What you want is not possible as such; however, there is a better practice for BigQuery.
You can use arrays of structs to store that information in a table.
Let's say you have a table like that (shown as a screenshot in the original post). You can use this sample query to understand how to use it:
with rawdata AS
(
  SELECT 1 as id, 'test1-val1,test2-val2,test3-val3' as experiments union all
  SELECT 2 as id, 'test1-val1,test3-val3,test5-val5' as experiments
)
select
  id,
  (select array_agg(struct(split(param, '-')[offset(0)] as experiment,
                           split(param, '-')[offset(1)] as value))
   from unnest(split(experiments)) as param) as experiments
from rawdata
The output will be one row per id, with experiments as an array of (experiment, value) structs (shown as a screenshot in the original post). Having that shape, the data is much more convenient to manipulate.
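For example, filtering on a given experiment then reads naturally (a sketch; t stands for a table materialized with the (id, experiments) shape above):

SELECT id
FROM t, UNNEST(experiments) AS e
WHERE e.experiment = 'test3' AND e.value = 'val3';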

How to transfer a column into an array using PostgreSQL, when the column's data type is a composite type?

I'm using PostgreSQL 9.4 and I'm currently trying to transfer a column's values into an array. For "normal" (not user-defined) data types I get it to work.
To explain my problem in detail, I made up a minimal example.
Let's assume we define a composite type "compo" and create a table "test_rel" and insert some values. Looks like this and works for me:
CREATE TYPE compo AS(a int, b int);
CREATE TABLE test_rel(t1 compo[],t2 int);
INSERT INTO test_rel VALUES('{"(1,2)"}',3);
INSERT INTO test_rel VALUES('{"(4,5)","(6,7)"}',3);
Next, we try to get an array with column t2's values. The following also works:
SELECT array(SELECT t2 FROM test_rel WHERE t2='3');
Now, we try to do the same thing with column t1 (the column with the composite type). My problem is that the following doesn't work:
SELECT array(SELECT t1 FROM test_rel WHERE t2='3');
ERROR: could not find array type for data type compo[]
Could someone please give me a hint why the same statement doesn't work with the composite type? I'm not only new to Stack Overflow, but also to PostgreSQL and PL/pgSQL, so please tell me when I'm doing something the wrong way.
There was some discussion about this on the PostgreSQL mailing list.
Long story short, both
select array(select array_type from ...)
select array_agg(array_type) from ...
represent the concept of an array of arrays, which PostgreSQL doesn't support. PostgreSQL supports multidimensional arrays, but they have to be rectangular. For example, ARRAY[[0,1],[2,3]] is valid, but ARRAY[[0],[1,2]] is not.
There were some improvements to both the ARRAY constructor and the array_agg() function in 9.5.
The docs now explicitly state that they will accumulate array arguments into a multidimensional array, but only if all of the parts have equal dimensions.
array() constructor: If the subquery's output column is of an array type, the result will be an array of the same type but one higher dimension; in this case all the subquery rows must yield arrays of identical dimensionality, else the result would not be rectangular.
array_agg(any array type): input arrays concatenated into array of one higher dimension (inputs must all have same dimensionality, and cannot be empty or NULL)
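A minimal illustration of the 9.5+ behavior:

SELECT array_agg(a) FROM (VALUES (ARRAY[1,2]), (ARRAY[3,4])) v(a);
-- => {{1,2},{3,4}}
-- aggregating ARRAY[1,2] together with ARRAY[3] raises an error instead,
-- because the result would not be rectangular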
For 9.4, you could wrap the array into a row: this way, you can create something which is almost an array of arrays:
SELECT array(SELECT ROW(t1) FROM test_rel WHERE t2='3');
SELECT array_agg(ROW(t1)) FROM test_rel WHERE t2='3';
Or, you could use a recursive CTE (and array concatenation) to work around the problem, like:
with recursive inp(arr) as (
  values (array[0,1]), (array[1,2]), (array[2,3])
),
idx(arr, idx) as (
  select arr, row_number() over ()
  from inp
),
agg(arr, idx) as (
  select array[[0, 0]] || arr, idx
  from idx
  where idx = 1
  union all
  select agg.arr || idx.arr, idx.idx
  from agg
  join idx on idx.idx = agg.idx + 1
)
select arr[array_lower(arr, 1) + 1 : array_upper(arr, 1)]
from agg
order by idx desc
limit 1;
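-- Result: {{0,1},{1,2},{2,3}}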
But of course, this solution is highly dependent on your data's dimensions.

Get an average value for an element in a column of arrays of JSON data in Postgres

I have some data in a postgres table that is a string representation of an array of json data, like this:
[
{"UsageInfo"=>"P-1008366", "Role"=>"Abstract", "RetailPrice"=>2, "EffectivePrice"=>0},
{"Role"=>"Text", "ProjectCode"=>"", "PublicationCode"=>"", "RetailPrice"=>2},
{"Role"=>"Abstract", "RetailPrice"=>2, "EffectivePrice"=>0, "ParentItemId"=>"396487"}
]
This is the data in one cell, from a single column of similar data in my database.
The data type this is stored as in the db is varchar(max).
My goal is to find the average RetailPrice of EVERY json item with "Role"=>"Abstract", including all of the json elements in the array, and all of the rows in the database.
Something like:
SELECT avg(json_extract_path_text(json_item, 'RetailPrice'))
FROM (
SELECT cast(json_items to varchar[]) as json_item
FROM my_table
WHERE json_extract_path_text(json_item, 'Role') like 'Abstract'
)
Now, obviously this particular query wouldn't work for a few reasons. Postgres doesn't let you directly convert a varchar to a varchar[]. Even after I had an array, this query would do nothing to iterate through the array. There are probably other issues with it too, but I hope it helps to clarify what it is I want to get.
Any advice on how to get the average retail price from all of these arrays of json data in the database?
It does not seem like Redshift would support the json data type per se. At least, I found nothing in the online manual.
But I found a few JSON functions in the manual, which should be instrumental:
JSON_ARRAY_LENGTH
JSON_EXTRACT_ARRAY_ELEMENT_TEXT
JSON_EXTRACT_PATH_TEXT
Since generate_series() is not supported, we have to substitute for that ...
SELECT tbl_id
     , round(avg((json_extract_path_text(elem, 'RetailPrice'))::numeric), 2) AS avg_retail_price
FROM (
  SELECT *, json_extract_array_element_text(json_items, pos) AS elem
  FROM (VALUES (0),(1),(2),(3),(4),(5)) a(pos)
  CROSS JOIN tbl
) sub
WHERE json_extract_path_text(elem, 'Role') = 'Abstract'
GROUP BY 1;
I substituted with a poor man's solution: a dummy table counting from 0 to n (the VALUES expression). Make sure you count up to the maximum possible number of elements in your array. If you need this on a regular basis, create an actual numbers table.
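For instance, a minimal sketch of such a numbers table (the table name is my own choice):

CREATE TABLE numbers (pos int);
INSERT INTO numbers (pos) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
-- then CROSS JOIN numbers instead of the VALUES expression above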
Modern Postgres has much better options, like json_array_elements() to unnest a json array. Compare to your sibling question for Postgres:
Can get an average of values in a json array using postgres?
I tested in Postgres with the related operator ->>, where it works:
SQL Fiddle.
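For reference, a hedged sketch of the modern-Postgres (9.3+) approach with json_array_elements(), assuming the stored text is valid JSON (the => in the sample data would have to be : for the cast to succeed; table and column names taken from the question):

SELECT round(avg((elem->>'RetailPrice')::numeric), 2) AS avg_retail_price
FROM my_table, json_array_elements(json_items::json) AS elem
WHERE elem->>'Role' = 'Abstract';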

Check if value exists in Postgres array

Using Postgres 9.0, I need a way to test if a value exists in a given array. So far I came up with something like this:
select '{1,2,3}'::int[] #> (ARRAY[]::int[] || value_variable::int)
But I keep thinking there should be a simpler way to do this, I just can't see it. This seems better:
select '{1,2,3}'::int[] #> ARRAY[value_variable::int]
I believe it will suffice. But if you have other ways to do it, please share!
Simpler with the ANY construct:
SELECT value_variable = ANY ('{1,2,3}'::int[])
The right operand of ANY (between parentheses) can either be a set (result of a subquery, for instance) or an array. There are several ways to use it:
SQLAlchemy: how to filter on PgArray column types?
IN vs ANY operator in PostgreSQL
Important difference: Array operators (<@, #>, && et al.) expect array types as operands and support GIN or GiST indices in the standard distribution of PostgreSQL, while the ANY construct expects an element type as left operand and can be supported with a plain B-tree index (with the indexed expression to the left of the operator, not the other way round like it seems to be in your example). Example:
Index for finding an element in a JSON array
None of this works for NULL elements. To test for NULL:
Check if NULL exists in Postgres array
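For example, one option in Postgres 9.5+ is array_position(), which treats NULL like a comparable value, whereas = ANY never matches NULL:

SELECT array_position(ARRAY[1, NULL, 3], NULL) IS NOT NULL;  -- true
SELECT NULL = ANY (ARRAY[1, NULL, 3]);                       -- null, not true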
Watch out for the trap I got into: when checking if a certain value is not present in an array, you shouldn't do:
SELECT value_variable != ANY('{1,2,3}'::int[])
but use
SELECT value_variable != ALL('{1,2,3}'::int[])
instead.
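A quick illustration of the difference (sample values made up):

SELECT 2 != ANY ('{1,2,3}'::int[]);  -- true: 2 differs from at least one element
SELECT 2 != ALL ('{1,2,3}'::int[]);  -- false: 2 equals one of the elements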
"But if you have other ways to do it, please share!"
You can compare two arrays. If any of the values in the left array overlap the values in the right array, then it returns true. It's kind of hackish, but it works.
SELECT '{1}' && '{1,2,3}'::int[]; -- true
SELECT '{1,4}' && '{1,2,3}'::int[]; -- true
SELECT '{4}' && '{1,2,3}'::int[]; -- false
In the first and second query, value 1 is in the right array.
Notice that the second query is true, even though the value 4 is not contained in the right array.
For the third query, no values in the left array (i.e., 4) are in the right array, so it returns false.
unnest can be used as well.
It expands an array to a set of rows, and then checking whether a value exists is as simple as using IN or NOT IN.
E.g., given:
id => uuid
exception_list_ids => uuid[]
select * from table where id NOT IN (select unnest(exception_list_ids) from table2)
This one works fine for me, and may be useful for someone:
select * from your_table where array_column ::text ilike ANY (ARRAY['%text_to_search%'::text]);
"Any" works well. Just make sure that the any keyword is on the right side of the equal to sign i.e. is present after the equal to sign.
Below statement will throw error: ERROR: syntax error at or near "any"
select 1 where any('{hello}'::text[]) = 'hello';
Whereas the example below works fine:
select 1 where 'hello' = any('{hello}'::text[]);
When looking for the existence of an element in an array, proper casting is required to get past the Postgres SQL parser. Here is one example query using the array contains operator in the join clause.
For simplicity, I only list the relevant part:
table1 other_name text[]; -- is an array of text
The join part of the SQL is shown here:
from table1 t1 join table2 t2 on t1.other_name::text[] #> ARRAY[t2.panel::text]
The following also works:
on t2.panel = ANY(t1.other_name)
I am just guessing that the extra casting is required because the parser does not fetch the table definition to figure out the exact type of the column. Others, please comment on this.