How to extract a json value inside an array inside a json in SQL - sql

I have a JSON being returned in my query called MetaDataJSON, inside which is an array of JSONs called Results. Inside each JSON in Results are two values, Chronic and Probability. There are a couple other tables that have been joined too. Is there a way to get Chronic in a column by itself? Right now I have gotten this far (table and variable names have been made generic):
SELECT DISTINCT
JSON_QUERY(mdj.value, '$.Results[0]') [Results]
FROM table1 t1
JOIN table2 t2 ON t2.parameter1 = t1.parameter1
AND t2.parameter2 = 'ASDF'
JOIN table3 t3 ON oad.parameter3 = oa.parameter3
AND t3.parameter4 = 11
CROSS APPLY OPENJSON(t3.MetaDataJSON) as mdj
This gets me a column called Results where each entry looks like:
{"Chronic": 0, "Probability": 0.0016}
Is there an efficient way to get Chronic in a column by itself? Thanks!

You can do it like this,
WITH jsons (x)
AS
(
-- replace this part with your query
Select a.x from (select '{"Chronic": 0, "Probability": 0.0016}' x ) as a, (SELECT 1 as y union select 2 as y ) as b
)
select
(SELECT value FROM OPENJSON(x,'$') where [key]='Chronic') as "Chronic",
(SELECT value FROM OPENJSON(x,'$') where [key]='Probability') as "Probability"
from jsons;
I think you can change the #json equation to use your query. But I cannot try since I don't have your tables...
BTW, I assume you are using MSSQL...

Related

check if all elements in hive array contain a string pattern

I have two columns in a hive table that look something like this:
code codeset
AB AB123,MU124
LM LM123,LM234
I need to verify that all elements in codeset column contain the value in code column so in the above example the first row would be false and the second row would be true.
Is there a simple way to do this that I am missing? I already read about array_contains but that returns true if just one element matches, I need all elements to contain what's in the code column.
Thanks in advance.
split the string, explode and use lateral view to unpivot the data. Then check using locate if the split codeset contains each code (which is done with group by and having).
select code,codeset
from tbl
lateral view explode(split(codeset,',')) t as split_codeset
group by code,codeset
having sum(cast(locate(code,split_codeset)>0 as int))=count(*)
select a.pattern,b.listOfInputs from ( select * from (select a.pattern, case when inputCount = sumPatternMatchResult then true else false end finalResult from (select pattern , sum(patternMatchResult) as sumPatternMatchResult from (select pattern,case when locate(pattern,input) !=0 then 1 else 0 end patternMatchResult from (select pattern,explode(split(listOfInputs,',')) as input from tbl)a ) b group by pattern) a join (select pattern , count(input) inputCount from (select pattern,explode(split(listOfInputs,',')) as input from tbl)a group by pattern) b on a.pattern=b.pattern )c where finalResult=true)a join (select * from tbl) b on a.pattern=b.pattern
This works too.
column mapping details for your table:
code -> pattern
codeset -> listOfInputs

How to group by more than one row value?

I am working with POSTGRESQL and I can't find out how to solve a problem. I have a model called Foobar. Some of its attributes are:
FOOBAR
check_in:datetime
qr_code:string
city_id:integer
In this table there is a lot of redundancy (qr_code is not unique) but that is not my problem right now. What I am trying to get are the foobars that have same qr_code and have been in a well known group of cities, that have checked in at different moments.
I got this by querying:
SELECT * FROM foobar AS a
WHERE a.city_id = 1
AND EXISTS (
SELECT * FROM foobar AS b
WHERE a.check_in < b.check_in
AND a.qr_code = b.qr_code
AND b.city_id = 2
AND EXISTS (
SELECT * FROM foobar as c
WHERE b.check_in < c.check_in
AND c.qr_code = b.qr_code
AND c.city_id = 3
AND EXISTS(...)
)
)
where '...' represents more queries to get more persons with the same qr_code, different check_in date and those well known cities.
My problem is that I want to group this by qr_code, and I want to show the check_in fields of each qr_code like this:
2015-11-11 14:14:14 => [2015-11-11 14:14:14, 2015-11-11 16:16:16, 2015-11-11 17:18:20] (this for each different qr_code)
where the data at the left is the 'smaller' date for that qr_code, and the right part are all the other dates for that qr_code, including the first one.
Is this possible to do with a sql query only? I am asking this because I am actually doing this app with rails, and I know that I can make a different approach with array methods of ruby (a solution with this would be well received too)
You could solve that with a recursive CTE - if I interpret your question correctly:
Assuming you have a given list of cities that must be visited in order by the same qr_code. Your text doesn't say so, but your query indicates as much.
WITH RECURSIVE
c AS (SELECT '{1,2,3}'::int[] AS cities) -- your list of city_id's here
, route AS (
SELECT f.check_in, f.qr_code, 2 AS idx
FROM foobar f
JOIN c ON f.city_id = c.cities[1]
UNION ALL
SELECT f.check_in, f.qr_code, r.idx + 1
FROM route r
JOIN foobar f USING (qr_code)
JOIN c ON f.city_id = c.cities[r.idx]
WHERE r.check_in < f.check_in
)
SELECT qr_code, array_agg(check_in) AS check_in_list
FROM (
SELECT *
FROM route
ORDER BY qr_code, idx -- or check_in
) sub
HAVING count(*) = (SELECT array_length(cities) FROM c);
GROUP BY 1;
Provide the list as array in the first (non-recursive) CTE c.
In the recursive part start with any rows in the first city and travel along your array until the last element.
In the final SELECT aggregate your check_in column in order. Only return qr_code that have visited all cities of the array.
Similar:
Recursive query used for transitive closure

How to use a variable AS a where clause?

I have one where clause which I have to use multiple times. I am quite new to Oracle SQL, so please forgive me for my newbe mistakes :). I have read this website, but could not find the answer :(. Here's the SQL statement:
var condition varchar2(100)
exec :condition := 'column 1 = 1 AND column2 = 2, etc.'
Select a.content, b.content
from
(Select (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,3)) as content
from table_name
where category = X AND :condition
group by (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,3))
) A
,
(Select (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,100)) as content
from table_name
where category = Y AND :condition
group by (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,100))) B
GROUP BY
a.content, b.content
The content field is a CLOB field and unfortunately all values needed are in the same column. My query does not work ofcourse.
You can't use a bind variable for that much of a where clause, only for specific values. You could use a substitution variable if you're running this in SQL*Plus or SQL Developer (and maybe some other clients):
define condition = 'column 1 = 1 AND column2 = 2, etc.'
Select a.content, b.content
from
(Select (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,3)) as content
from table_name
where category = X AND &condition
...
From other places, including JDBC and OCI, you'd need to have the condition as a variable and build the query string using that, so it's repeated in the code that the parser sees. From PL/SQL you could use dynamic SQL to achieve the same thing. I'm not sure why just repeating the conditions is a problem though, binding arguments if values are going to change. Certainly with two clauses like this it seems a bit pointless.
But maybe you could approach this from a different angle and remove the need to repeat the where clause. Querying the table twice might not be efficient anyway. You could apply your condition once as a subquery, but without knowing your indexes or the selectivity of the conditions this could be worse:
with sub_table as (
select category, content
from my_table
where category in (X, Y)
and column 1 = 1 AND column2 = 2, etc.
)
Select a.content, b.content
from
(Select (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,3)) as content
from sub_table
where category = X
group by (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,3))
) A
,
(Select (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,100)) as content
from sub_table
where category = Y
group by (DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,100))) B
GROUP BY
a.content, b.content
I'm not sure what the grouping is for - to eliminate duplicates? This only really makes sense if you have a single X and Y record matching the other conditions, doesn't it? Maybe I'm not following it properly.
You could also use a case statement:
select max(content_x), max(content_y)
from (
select
case when category = X
then DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,3) end as content_x,
case when category = Y
then DBMS_LOB.SUBSTR(ost_bama_vrij_veld.inhoud,100) end as content_y,
from my_table
where category in (X, Y)
and column 1 = 1 AND column2 = 2, etc.
)

Different results on the basis of different foreign key value using a falg in where clause

Please see attached image.
alt text http://img241.imageshack.us/img241/3585/customcost.png
Can you please tell me what query will work. Please ignore isdefinite and productpriceid columns.
Thanks
If you want a single query, this should do it if I've interpreted your question properly:
SELECT DISTINCT t1.SupplierVenueProductID, [...]
FROM table t1
LEFT JOIN
table t2
ON t1.SupplierVenueProductID = t2.SupplierVenueProductID
AND t2.iscustomcost = 1
WHERE t2.SupplierVenueProductID IS NULL
OR t1.iscustomcost = 1
I don't know your table name, but you join it to itself.
I'm a bit lost on what you want to accomplish here,
going by your requirement if isCustomCost = 1 then return record #3 from SupplierVenueProductId 1 and both records from SupplierVenueProductId 2
Trying to generalize this, I think what you need is :
return all rows from the table, unless when there is a record for a SupplierVenueProductId that has isCustomCost = 1, then only return that record for this SupplierVenueProductId
Which then becomes something along the lines of :
SELECT t1.*
FROM myTable t1
WHERE t1.isCustomCost = 1
OR NOT EXISTs (SELECT *
FROM t2
WHERE t2.SupplierVenueProductId = t1.SupplierVenueProductId
AND t2.isCustomCost = 1)
Hope this helps.

Selecting elements that don't exist

I am working on an application that has to assign numeric codes to elements. This codes are not consecutives and my idea is not to insert them in the data base until have the related element, but i would like to find, in a sql matter, the not assigned codes and i dont know how to do it.
Any ideas?
Thanks!!!
Edit 1
The table can be so simple:
code | element
-----------------
3 | three
7 | seven
2 | two
And I would like something like this: 1, 4, 5, 6. Without any other table.
Edit 2
Thanks for the feedback, your answers have been very helpful.
This will return NULL if a code is not assigned:
SELECT assigned_codes.code
FROM codes
LEFT JOIN
assigned_codes
ON assigned_codes.code = codes.code
WHERE codes.code = #code
This will return all non-assigned codes:
SELECT codes.code
FROM codes
LEFT JOIN
assigned_codes
ON assigned_codes.code = codes.code
WHERE assigned_codes.code IS NULL
There is no pure SQL way to do exactly the thing you want.
In Oracle, you can do the following:
SELECT lvl
FROM (
SELECT level AS lvl
FROM dual
CONNECT BY
level <=
(
SELECT MAX(code)
FROM elements
)
)
LEFT OUTER JOIN
elements
ON code = lvl
WHERE code IS NULL
In PostgreSQL, you can do the following:
SELECT lvl
FROM generate_series(
1,
(
SELECT MAX(code)
FROM elements
)) lvl
LEFT OUTER JOIN
elements
ON code = lvl
WHERE code IS NULL
Contrary to the assertion that this cannot be done using pure SQL, here is a counter example showing how it can be done. (Note that I didn't say it was easy - it is, however, possible.) Assume the table's name is value_list with columns code and value as shown in the edits (why does everyone forget to include the table name in the question?):
SELECT b.bottom, t.top
FROM (SELECT l1.code - 1 AS top
FROM value_list l1
WHERE NOT EXISTS (SELECT * FROM value_list l2
WHERE l2.code = l1.code - 1)) AS t,
(SELECT l1.code + 1 AS bottom
FROM value_list l1
WHERE NOT EXISTS (SELECT * FROM value_list l2
WHERE l2.code = l1.code + 1)) AS b
WHERE b.bottom <= t.top
AND NOT EXISTS (SELECT * FROM value_list l2
WHERE l2.code >= b.bottom AND l2.code <= t.top);
The two parallel queries in the from clause generate values that are respectively at the top and bottom of a gap in the range of values in the table. The cross-product of these two lists is then restricted so that the bottom is not greater than the top, and such that there is no value in the original list in between the bottom and top.
On the sample data, this produces the range 4-6. When I added an extra row (9, 'nine'), it also generated the range 8-8. Clearly, you also have two other possible ranges for a suitable definition of 'infinity':
-infinity .. MIN(code)-1
MAX(code)+1 .. +infinity
Note that:
If you are using this routinely, there will generally not be many gaps in your lists.
Gaps can only appear when you delete rows from the table (or you ignore the ranges returned by this query or its relatives when inserting data).
It is usually a bad idea to reuse identifiers, so in fact this effort is probably misguided.
However, if you want to do it, here is one way to do so.
This the same idea which Quassnoi has published.
I just linked all ideas together in T-SQL like code.
DECLARE
series #table(n int)
DECLARE
max_n int,
i int
SET i = 1
-- max value in elements table
SELECT
max_n = (SELECT MAX(code) FROM elements)
-- fill #series table with numbers from 1 to n
WHILE i < max_n BEGIN
INSERT INTO #series (n) VALUES (i)
SET i = i + 1
END
-- unassigned codes -- these without pair in elements table
SELECT
n
FROM
#series AS series
LEFT JOIN
elements
ON
elements.code = series.n
WHERE
elements.code IS NULL
EDIT:
This is, of course, not ideal solution. If you have a lot of elements or check for non-existing code often this could cause performance issues.