Is there any way in MariaDB to search for a less-than value in an array of JSON objects?

Here's my json doc:
[
  {
    "ID": 1,
    "Label": "Price",
    "Value": 399
  },
  {
    "ID": 2,
    "Label": "Company",
    "Value": "Apple"
  },
  {
    "ID": 3,
    "Label": "Model",
    "Value": "iPhone SE"
  }
]
Here's my table:
+----+------------------------------------------------------------------------------------------------------------------------------------+
| ID | Properties |
+----+------------------------------------------------------------------------------------------------------------------------------------+
| 1 | [{"ID":1,"Label":"Price","Value":399},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone SE"}] |
| 2 | [{"ID":1,"Label":"Price","Value":499},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone X"}] |
| 3 | [{"ID":1,"Label":"Price","Value":699},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone 11"}] |
| 4 | [{"ID":1,"Label":"Price","Value":999},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone 11 Pro"}] |
+----+------------------------------------------------------------------------------------------------------------------------------------+
Here's what I want the search query to do:
SELECT *
FROM mobiles
WHERE ($.Label = "Price" AND $.Value < 400)
AND ($.Label = "Model" AND $.Value = "iPhone SE")
The above query is for illustration purposes only; I just wanted to convey what I want to perform.
Also, I know the table could be normalized into two. But this table is a placeholder, and let's just say it is going to stay the same.
I need to know if it's possible to query the given JSON structure with the following operators: >, >=, <, <=, BETWEEN AND, IN, NOT IN, LIKE, NOT LIKE, <>

Since MariaDB does not support JSON_TABLE(), and its JSON path syntax supports only the member/object selector, filtering JSON here is not so straightforward. You can try this query, which works around those limitations:
with a as (
    select 1 as id, '[{"ID":1,"Label":"Price","Value":399},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone SE"}]' as properties union all
    select 2 as id, '[{"ID":1,"Label":"Price","Value":499},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone X"}]' as properties union all
    select 3 as id, '[{"ID":1,"Label":"Price","Value":699},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone 11"}]' as properties union all
    select 4 as id, '[{"ID":1,"Label":"Price","Value":999},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone 11 Pro"}]' as properties
)
select *
from a
where json_value(a.properties,
        /* Get the path to the Price property and repoint it from Label to Value */
        replace(replace(json_search(a.properties, 'one', 'Price'), '"', ''), 'Label', 'Value')
      ) < 400
  and json_value(a.properties,
        /* And the same for the model name */
        replace(replace(json_search(a.properties, 'one', 'Model'), '"', ''), 'Label', 'Value')
      ) = "iPhone SE"
+----+---------------------------------------------------------------------------------------------------------------------------------+
| id | properties                                                                                                                      |
+----+---------------------------------------------------------------------------------------------------------------------------------+
| 1  | [{"ID":1,"Label":"Price","Value":399},{"ID":2,"Label":"Company","Value":"Apple"},{"ID":3,"Label":"Model","Value":"iPhone SE"}] |
+----+---------------------------------------------------------------------------------------------------------------------------------+
db<>fiddle here.
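The same json_search()/json_value() pattern should extend to the other operators you listed, since the comparison itself is ordinary SQL once the value has been extracted. A rough, untested sketch reusing the CTE a from above, this time with BETWEEN and LIKE:
select *
from a
where json_value(a.properties,
        replace(replace(json_search(a.properties, 'one', 'Price'), '"', ''), 'Label', 'Value')
      ) between 300 and 500
  and json_value(a.properties,
        replace(replace(json_search(a.properties, 'one', 'Model'), '"', ''), 'Label', 'Value')
      ) like 'iPhone%'
Keep in mind that json_value() returns a string, so numeric comparisons rely on implicit conversion, just as in the < 400 example above.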

I would not use string functions. What is missing in MariaDB is the ability to unnest the array to rows - but it has all the JSON accessors we need to access the data. Using these methods rather than string methods avoids edge cases, for example when the values contain embedded double quotes.
You would typically unnest the array with the help of a table of numbers that has at least as many rows as there are elements in the biggest array. One method to generate that on the fly is row_number() against a table with sufficient rows - say sometable.
You can unnest the arrays as follows:
select t.id,
       json_unquote(json_extract(t.properties, concat('$[', n.rn, '].Label'))) as label,
       json_unquote(json_extract(t.properties, concat('$[', n.rn, '].Value'))) as value
from mytable t
inner join (select row_number() over() - 1 as rn from sometable) n
    on n.rn < json_length(t.properties)
The rest is just aggregation:
select t.id
from (
    select t.id,
           json_unquote(json_extract(t.properties, concat('$[', n.rn, '].Label'))) as label,
           json_unquote(json_extract(t.properties, concat('$[', n.rn, '].Value'))) as value
    from mytable t
    inner join (select row_number() over() - 1 as rn from sometable) n
        on n.rn < json_length(t.properties)
) t
group by id
having
    max(label = 'Price' and value + 0 < 400) = 1
    and max(label = 'Model' and value = 'iPhone SE') = 1
Demo on DB Fiddle
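If your MariaDB build includes the SEQUENCE storage engine, you can also borrow one of its ready-made sequence tables instead of deriving the numbers with row_number(). A sketch of the unnesting join using it (assuming no array has more than 32 elements):
select t.id,
       json_unquote(json_extract(t.properties, concat('$[', n.seq, '].Label'))) as label,
       json_unquote(json_extract(t.properties, concat('$[', n.seq, '].Value'))) as value
from mytable t
inner join seq_0_to_31 n
    on n.seq < json_length(t.properties)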

Related

Redshift Postgresql - How to Parse Nested JSON

I am trying to parse JSON text using the JSON_EXTRACT_PATH_TEXT() function.
JSON sample:
{
  "data": [
    {
      "name": "ping",
      "idx": 0,
      "cnt": 27,
      "min": 16,
      "max": 33,
      "avg": 24.67,
      "dev": 5.05
    },
    {
      "name": "late",
      "idx": 0,
      "cnt": 27,
      "min": 8,
      "max": 17,
      "avg": 12.59,
      "dev": 2.63
    }
  ]
}
I tried the JSON_EXTRACT_PATH_TEXT(event, '{"name":"late"}', 'avg') function to get 'avg' for name = "late", but it returns blank.
Can anyone help, please?
Thanks
This is a rather complicated task in Redshift, which, unlike Postgres, has poor support for managing JSON and no function to unnest arrays.
Here is one way to do it using a numbers table; you need to populate the table with incrementing numbers starting at 0, like:
create table nums as
select 0 as i union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
;
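If ten numbers are not enough for your arrays, the same table can be cross-joined with itself to cover a larger range; a quick sketch for 0-99:
create table nums100 as
select a.i * 10 + b.i as i
from nums a
cross join nums b;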
Once the table is created, you can use it to walk the JSON array using json_extract_array_element_text(), and check its content with json_extract_path_text():
select json_extract_path_text(item, 'avg') as my_avg
from (
    select json_extract_array_element_text(t.items, n.i, true) as item
    from (
        select json_extract_path_text(mycol, 'data', true) as items
        from mytable
    ) t
    inner join nums n on n.i < json_array_length(t.items, true)
) t
where json_extract_path_text(item, 'name') = 'late';
You'll need to use json_array_elements for that:
select obj->'avg'
from foo f, json_array_elements(f.event->'data') obj
where obj->>'name' = 'late';
Working example
create table foo (id int, event json);
insert into foo values (1,'{
"data":[
{
"name":"ping",
"idx":0,
"cnt":27,
"min":16,
"max":33,
"avg":24.67,
"dev":5.05
},
{
"name":"late",
"idx":0,
"cnt":27,
"min":8,
"max":17,
"avg":12.59,
"dev":2.63
}]}');

Case statement with four columns, i.e. attributes

I have a table with values "1", "0" or "". The table has four columns: p, q, r and s.
I need help creating a case statement that returns values when the attribute is equal to 1.
For ID 5 the case statement should return "p s".
For ID 14 the case statement should return "s".
For ID 33 the case statement should return "p r s". And so on.
Do I need to come up with a case statement that has every possible combination, or is there a simpler way? Below is what I have come up with thus far.
case
when p = 1 and q =1 then "p q"
when p = 1 and r =1 then "p r"
when p = 1 and s =1 then "p s"
when r = 1 then r
when q = 1 then q
when r = 1 then r
when s = 1 then s
else ''
end
One solution could be this, which uses a CASE expression for each attribute to return the correct letter, wrapped in TRIM to remove the trailing space.
with tbl(id, p, q, r, s) as (
select 5,1,0,0,1 from dual union all
select 14,0,0,0,1 from dual
)
select id,
trim(regexp_replace(case p when 1 then 'p' end ||
case q when 1 then 'q' end ||
case r when 1 then 'r' end ||
case s when 1 then 's' end, '(.)', '\1 '))
from tbl;
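With the two sample rows in the CTE, this should return 'p s' for ID 5 and 's' for ID 14.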
The real solution would be to fix the database design. This design technically violates fourth normal form in that it stores more than one independent multi-valued attribute. The fact that an ID "has" or "is part of" attribute p or q, etc. should be split out. The design should be three tables: the main table with the ID, a lookup table containing info about the attributes a main ID could have (p, q, r or s), and an associative table that joins the two where appropriate (assuming an ID row could have more than one attribute and an attribute could belong to more than one ID), which is how to model a many-to-many relationship.
main_tbl             main_attr             attribute_lookup
ID   col1   col2     main_id   attr_id     attr_id   attr_desc
 5                      5         1            1         p
14                      5         4            2         q
                       14         4            3         r
                                               4         s
Then it would be simple to query this model to build your list, easy to maintain if an attribute description changes (only 1 place to change it), etc.
Select from it like this:
select m.ID, m.col1, listagg(al.attr_desc, ' ') within group (order by al.attr_desc) as attr_desc
from main_tbl m
join main_attr ma
on m.ID = ma.main_id
join attribute_lookup al
on ma.attr_id = al.attr_id
group by m.id, m.col1;
You can use concatenation with decode() functions:
select id, decode(p,1,'p','')||decode(q,1,'q','')
||decode(r,1,'r','')||decode(s,1,'s','') as "String"
from t;
Demo
If you need spaces between the letters, consider using:
with t(id,p,q,r,s) as
(
select 5,1,0,0,1 from dual union all
select 14,0,0,0,1 from dual union all
select 31,null,0,null,1 from dual union all
select 33,1,0,1,1 from dual
), t2 as
(
select id, decode(p,1,'p','')||decode(q,1,'q','')
||decode(r,1,'r','')||decode(s,1,'s','') as str
from t
), t3 as
(
select id, substr(str,level,1) as str, level as lvl
from t2
connect by level <= length(str)
and prior id = id
and prior sys_guid() is not null
)
select id, listagg(str,' ') within group (order by lvl) as "String"
from t3
group by id;
Demo
In my opinion, it's bad practice to use columns for relationships.
You should have two tables: one called art and another called mapping. art looks like this:
ID - ART
1 - p
2 - q
3 - r
4 - s
...
and mapping maps your base IDs to your art IDs and looks like this:
MYID - ARTID
5 - 1
5 - 4
Afterwards, you should make use of Oracle's PIVOT operator; it's more dynamic.
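A rough sketch of what that PIVOT could look like against the two tables above (names as used in this answer; untested), producing one indicator column per attribute:
select *
from (
    select m.myid, a.art
    from mapping m
    join art a on a.id = m.artid
)
pivot (
    count(*) for art in ('p' as p, 'q' as q, 'r' as r, 's' as s)
);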

BigQuery : filter repeated fields with legacy SQL

I have the following table:
row | query_params | query_values
 1  | foo          | bar
    | param        | val
 2  | foo          | baz
JSON:
{
  "query_params" : [ "foo", "param" ],
  "query_values" : [ "bar", "val" ]
}, {
  "query_params" : [ "foo" ],
  "query_values" : [ "baz" ]
}
Using legacy SQL, I want to filter repeated fields on their value, something like
SELECT * FROM table WHERE query_params = 'foo'
Which would output
row | query_params | query_values
 1  | foo          | bar
 2  | foo          | baz
PS: this question is related to the same question using standard SQL, answered here
I can't think of any better ideas for legacy SQL aside from using a JOIN after flattening each array separately. If you have a table T with the contents indicated above, you can do:
SELECT
  [t1.row],
  t1.query_params,
  t2.query_values
FROM
  FLATTEN((SELECT [row], query_params, POSITION(query_params) AS pos
           FROM T WHERE query_params = 'foo'), query_params) AS t1
JOIN
  FLATTEN((SELECT [row], query_values, POSITION(query_values) AS pos
           FROM T), query_values) AS t2
ON [t1.row] = [t2.row] AND
   t1.pos = t2.pos;
The idea is to associate the elements of the two arrays by row and position after filtering for query_params that are equal to 'foo'.
Try the version below:
SELECT [row], query_params, query_values
FROM (
SELECT [row], query_params, param_pos, query_values, POSITION(query_values) AS value_pos
FROM FLATTEN((
SELECT [row], query_params, POSITION(query_params) AS param_pos, query_values
FROM YourTable
), query_params)
WHERE query_params = 'foo'
)
WHERE param_pos = value_pos

Returning result even for elements in IN list that don't exist in table

I am trying to find the easiest way to return a result set that indicates if some values are or are not present in a table. Consider this table:
id
------
1
2
3
7
23
I'm going to receive a list of IDs and I need to respond with the same list, indicating which are present in the table. If the list I get looks like this: '1','2','3','4','8','23', I need to produce a result set that looks like this:
id | status
-------------
1 | present
2 | present
3 | present
4 | missing
8 | missing
23 | present
So far, I've managed to come up with something using UNPIVOT:
select id, 'present' as status
from my_table
where id in ('1','2','3')
union
select subq.v as id, 'missing' as status
from (
    select v
    from (
        (select '1' v1, '2' v2, '3' v3 from dual)
        unpivot (v for x in (v1, v2, v3))
    )
) subq
where subq.v not in (
    select id
    from my_table
    where id in ('1','2','3')
);
It looks a little weird, but it does work. The problem with this is the select '1' v1, '2' v2, '3' v3 from dual part: I have no idea how I can populate this with a JDBC prepared statement. The list of IDs is not fixed, so each call to the function that uses this query could pass a different list of IDs.
Are there any other ways to get this done? I think I'm missing something obvious, but I'm not sure...
(working with Oracle 11)
From the SQL side you could define a table type and use that to join to your real data, something like:
create type my_array_type as table of number
/
create or replace function f42 (in_array my_array_type)
return sys_refcursor as
rc sys_refcursor;
begin
open rc for
select a.column_value as id,
case when t.id is null then 'missing'
else 'present' end as status
from table(in_array) a
left join t42 t on t.id = a.column_value
order by id;
return rc;
end f42;
/
SQL Fiddle demo with a wrapper function so you can query it directly, which gives:
ID STATUS
---------- --------------------
1 present
2 present
3 present
4 missing
8 missing
23 present
From Java you can define an ARRAY based on the table type, populate from a Java array, and call the function directly; your single parameter bind variable is the ARRAY, and you get back a result set you can iterate over as normal.
As an outline of the Java side:
int[] ids = { 1, 2, 3, 4, 8, 23 };
ArrayDescriptor aDesc = ArrayDescriptor.createDescriptor("MY_ARRAY_TYPE",
conn);
oracle.sql.ARRAY ora_ids = new oracle.sql.ARRAY(aDesc, conn, ids);
cStmt = (OracleCallableStatement) conn.prepareCall("{ call ? := f42(?) }");
cStmt.registerOutParameter(1, OracleTypes.CURSOR);
cStmt.setArray(2, ora_ids);
cStmt.execute();
rSet = (OracleResultSet) cStmt.getCursor(1);
while (rSet.next())
{
System.out.println("id " + rSet.getInt(1) + ": " + rSet.getString(2));
}
Which gives:
id 1: present
id 2: present
id 3: present
id 4: missing
id 8: missing
id 23: present
As Maheswaran Ravisankar mentions, this allows any number of elements to be passed; you don't need to know how many elements there are at compile time (or deal with a theoretical maximum), you aren't limited by the maximum number of expressions allowed in an IN or by the length of a single delimited string, and you don't have to compose and decompose a string to pass multiple values.
As ThinkJet pointed out, if you don't want to create your own table type you can use a predefined collection, demonstrated here; the main function is the same apart from the declaration of the parameter:
create or replace function f42 (in_array sys.odcinumberlist)
return sys_refcursor as
...
The wrapper function populates the array slightly differently, but on the Java side you only need to change this line:
ArrayDescriptor aDesc =
ArrayDescriptor.createDescriptor("SYS.ODCINUMBERLIST", conn );
Using this also means (as ThinkJet also pointed out!) that you can run your original stand-alone query without defining a function:
select a.column_value as id,
case when t.id is null then 'missing'
else 'present' end as status
from table(sys.odcinumberlist(1, 2, 3, 4, 8, 23)) a
left join t42 t on t.id = a.column_value
order by id;
(SQL Fiddle).
And that means you can call the query directly from Java:
int[] ids = { 1, 2, 3, 4, 8, 23 };
ArrayDescriptor aDesc = ArrayDescriptor.createDescriptor("SYS.ODCINUMBERLIST", conn );
oracle.sql.ARRAY ora_ids = new oracle.sql.ARRAY(aDesc, conn, ids);
sql = "select a.column_value as id, "
+ "case when t.id is null then 'missing' "
+ "else 'present' end as status "
+ "from table(?) a "
+ "left join t42 t on t.id = a.column_value "
+ "order by id";
pStmt = (OraclePreparedStatement) conn.prepareStatement(sql);
pStmt.setArray(1, ora_ids);
rSet = (OracleResultSet) pStmt.executeQuery();
while (rSet.next())
{
System.out.println("id " + rSet.getInt(1) + ": " + rSet.getString(2));
}
... which you might prefer.
There's a pre-defined ODCIVARCHAR2LIST type too, if you're actually passing strings - your original code seems to be working with strings even though they contain numbers, so not sure which you really need.
Because these types are defined as VARRAY(32767) you are limited to 32k values, while defining your own table removes that restriction; but obviously that only matters if you're passing a lot of values.
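Since the original query passes the IDs as strings, a sketch of the stand-alone query with the VARCHAR2 collection might look like this (untested; note the explicit conversion when joining to the numeric id column):
select a.column_value as id,
       case when t.id is null then 'missing'
            else 'present' end as status
from table(sys.odcivarchar2list('1', '2', '3', '4', '8', '23')) a
left join t42 t on t.id = to_number(a.column_value)
order by to_number(a.column_value);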
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table IDs (id) AS
SELECT 1 FROM DUAL
UNION ALL SELECT 2 FROM DUAL
UNION ALL SELECT 3 FROM DUAL
UNION ALL SELECT 7 FROM DUAL
UNION ALL SELECT 23 FROM DUAL
/
Query 1:
Input the IDs as a string containing a list of numbers and then use a hierarchical query and regular expressions to split the string into rows:
WITH input AS (
SELECT '1,2,3,4,8,23' AS input FROM DUAL
),
split_inputs AS (
SELECT TO_NUMBER( REGEXP_SUBSTR( input, '\d+', 1, LEVEL ) ) AS id
FROM input
CONNECT BY LEVEL <= REGEXP_COUNT( input, '\d+' )
)
SELECT s.id,
CASE WHEN i.id IS NULL THEN 'Missing' ELSE 'Present' END AS status
FROM split_inputs s
LEFT OUTER JOIN
IDs i
ON ( s.id = i.id )
ORDER BY
s.id
Results:
| ID | STATUS |
|----|---------|
| 1 | Present |
| 2 | Present |
| 3 | Present |
| 4 | Missing |
| 8 | Missing |
| 23 | Present |

GROUP BY or COUNT Like Field Values - UNPIVOT?

I have a table with test fields. Example:
id | test1 | test2 | test3 | test4 | test5
+----------+----------+----------+----------+----------+----------+
12345 | P | P | F | I | P
So for each record I want to know how many are Passed, Failed or Incomplete (P, F or I).
Is there a way to GROUP BY value?
Pseudo:
SELECT ('P' IN (fields)) AS pass
WHERE id = 12345
I have about 40 test fields that I need to somehow group together and I really don't want to write this super ugly, long query. Yes I know I should rewrite the table into two or three separate tables but this is another problem.
Expected Results:
passed | failed | incomplete
+----------+----------+----------+
3 | 1 | 1
Suggestions?
Note: I'm running PostgreSQL 7.4 and yes we are upgrading
I may have come up with a solution:
SELECT id
,l - length(replace(t, 'P', '')) AS nr_p
,l - length(replace(t, 'F', '')) AS nr_f
,l - length(replace(t, 'I', '')) AS nr_i
FROM (SELECT id, test::text AS t, length(test::text) AS l FROM test) t
The trick works like this:
Transform the rowtype into its text representation.
Measure character-length.
Replace the character you want to count and measure the change in length.
Compute the length of the original row in the subselect for repeated use.
This requires that P, F, I are present nowhere else in the row. Use a sub-select to exclude any other columns that might interfere.
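A quick worked check of the trick on the sample row, whose text representation is '(12345,P,P,F,I,P)' (17 characters):
select length('(12345,P,P,F,I,P)')                   -- 17
     - length(replace('(12345,P,P,F,I,P)', 'P', '')) -- 14 once the P's are stripped
       as nr_p;                                       -- => 3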
Tested in 8.4 - 9.1. Nobody uses PostgreSQL 7.4 anymore, so you'll have to test it yourself. I only use basic functions, but I am not sure whether casting the rowtype to text is feasible in 7.4. If that doesn't work, you'll have to concatenate all test columns once by hand:
SELECT id
,length(t) - length(replace(t, 'P', '')) AS nr_p
,length(t) - length(replace(t, 'F', '')) AS nr_f
,length(t) - length(replace(t, 'I', '')) AS nr_i
FROM (SELECT id, test1||test2||test3||test4||test5 AS t FROM test) t
This requires all columns to be NOT NULL.
Essentially, you need to unpivot your data by test:
id | test | result
+----------+----------+----------+
12345 | test1 | P
12345 | test2 | P
12345 | test3 | F
12345 | test4 | I
12345 | test5 | P
...
- so that you can then group it by test result.
Unfortunately, PostgreSQL doesn't have pivot/unpivot functionality built in, so the simplest way to do this would be something like:
select id, 'test1' test, test1 result from mytable union all
select id, 'test2' test, test2 result from mytable union all
select id, 'test3' test, test3 result from mytable union all
select id, 'test4' test, test4 result from mytable union all
select id, 'test5' test, test5 result from mytable union all
...
There are other ways of approaching this, but with 40 columns of data this is going to get really ugly.
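For completeness, a sketch of the grouping step on top of that union (only the first two branches written out; the pattern repeats for the remaining columns):
select count(case when result = 'P' then 1 end) as passed,
       count(case when result = 'F' then 1 end) as failed,
       count(case when result = 'I' then 1 end) as incomplete
from (
    select id, 'test1' test, test1 result from mytable union all
    select id, 'test2' test, test2 result from mytable
    -- ... remaining test columns ...
) u
where id = 12345;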
EDIT: an alternative approach - concatenate all the test columns, strip out the two characters you are not counting, and the length of what remains is the count of the one you are:
select r.result, sum(char_length(replace(replace(test1||test2||test3||test4||test5,excl1,''),excl2,'')))
from mytable m,
(select 'P' result, 'F' excl1, 'I' excl2 union all
select 'F' result, 'P' excl1, 'I' excl2 union all
select 'I' result, 'F' excl1, 'P' excl2) r
group by r.result
You could use an auxiliary on-the-fly table to turn columns into rows, then you would be able to apply aggregate functions, something like this:
SELECT
COUNT(CASE WHEN fields = 'P' THEN 1 END) AS passed,
COUNT(CASE WHEN fields = 'F' THEN 1 END) AS failed,
COUNT(CASE WHEN fields = 'I' THEN 1 END) AS incomplete
FROM (
SELECT
t.id,
CASE x.idx
WHEN 1 THEN t.test1
WHEN 2 THEN t.test2
WHEN 3 THEN t.test3
WHEN 4 THEN t.test4
WHEN 5 THEN t.test5
END AS fields
FROM atable t
CROSS JOIN (
SELECT 1 AS idx
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
) x
WHERE t.id = 12345
) s
Edit: just saw the comment about 7.4, I don't think this will work with that ancient version (unnest() came a lot later). If anyone thinks this is not worth keeping, I'll delete it.
Taking Erwin's idea to use the "row representation" as a base for the solution a bit further and automatically "normalize" the table on-the-fly:
select id,
sum(case when flag = 'F' then 1 else null end) as failed,
sum(case when flag = 'P' then 1 else null end) as passed,
sum(case when flag = 'I' then 1 else null end) as incomplete
from (
select id,
unnest(string_to_array(trim(trailing ')' from substr(all_column_values,strpos(all_column_values, ',') + 1)), ',')) flag
from (
SELECT id,
not_normalized::text AS all_column_values
FROM not_normalized
) t1
) t2
group by id
The heart of the solution is Erwin's trick to make a single value out of the complete row using the cast not_normalized::text. The string functions are applied to strip off the leading id value and the brackets around it.
The result of that is transformed into an array and that array is transformed into a result set using the unnest() function.
To understand that part, simply run the inner selects step by step.
Then the result is grouped and the corresponding values are counted.