PostgreSQL ALL(A) <@ ANY(B) - sql

The objective is to solve the following use case:
The table contains many numrange[] fields. Let A be one of those fields.
We need to select rows using a parameter B of type numrange[], according to this rule: ALL(A) <@ ANY(B)
A sample request on table dt.t with B = {[1,3],[9,10]} would be:
select * from dt.t where ALL(A) <@ ANY(ARRAY[numrange(1,3),numrange(9,10)])
So it seems feasible. But the ALL operator can only be used on the right side of the condition...
After turning it around for about a day, I still have no clue how to solve this use case (without using functions, if possible).
The real use case will filter on many fields, so the solution needs to work for multiple fields in the same where clause:
select *
from dt.t
where ALL(A1) <@ ANY(ARRAY[numrange(1,3),numrange(9,10)])
and ALL(A2) <@ ANY(ARRAY[numrange(10,13),numrange(20,20)])

Found this solution :
select *
from dt.t t1
where (
select count(1)
from (
select unnest(A) a
from dt.t t2
where t2.id=t1.id
) t
where t.a <@ ANY(ARRAY['[1,3)'::numrange])) = array_length(A,1);
The idea is :
select unnest(A) a from dt.t t2 where t2.id=t1.id => gives each element of ARRAY field A
t.a <@ ANY(ARRAY['[1,3)'::numrange]) => tests if this element is contained in one of the parameter's ranges => the <@ ANY(B) part
(select count(1) [...]) = array_length(A,1) => checks that all elements of A are valid => the ALL(A) part of the problem
Tried it, it works, and it seems legit. The one really important requirement is that B is the minimal union of itself (there must be no numrange[] equivalent to B with fewer numranges in it).
Apart from that, it seems to work. Thank you all for your help and time.
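To illustrate why B has to be given as a minimal union, here is a small sketch in Python rather than SQL. It models ranges as closed (lo, hi) tuples, which is a simplification of numrange (no open bounds, no empty ranges), and the helper names contained_in, all_in_any and merge are made up for this illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi]; simplified stand-in for numrange


def contained_in(a: Interval, b: Interval) -> bool:
    """Rough analogue of `a <@ b` for closed intervals."""
    return b[0] <= a[0] and a[1] <= b[1]


def all_in_any(a_list: List[Interval], b_list: List[Interval]) -> bool:
    """Every element of A must fit inside a single element of B."""
    return all(any(contained_in(a, b) for b in b_list) for a in a_list)


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching closed intervals into a minimal union."""
    merged: List[Interval] = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


a = [(2, 4)]
b_split = [(1, 3), (3, 5)]   # covers [1, 5], but as two pieces

split_result = all_in_any(a, b_split)          # (2, 4) fits in neither piece
merged_result = all_in_any(a, merge(b_split))  # (2, 4) fits in the merged (1, 5)
```

With B split into two adjacent pieces, the element (2, 4) is covered by the union but not contained in any single element, so the per-element test fails; after merging it passes.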

Related

How to extract a json value inside an array inside a json in SQL

I have a JSON being returned in my query called MetaDataJSON, inside which is an array of JSONs called Results. Inside each JSON in Results are two values, Chronic and Probability. There are a couple other tables that have been joined too. Is there a way to get Chronic in a column by itself? Right now I have gotten this far (table and variable names have been made generic):
SELECT DISTINCT
JSON_QUERY(mdj.value, '$.Results[0]') [Results]
FROM table1 t1
JOIN table2 t2 ON t2.parameter1 = t1.parameter1
AND t2.parameter2 = 'ASDF'
JOIN table3 t3 ON oad.parameter3 = oa.parameter3
AND t3.parameter4 = 11
CROSS APPLY OPENJSON(t3.MetaDataJSON) as mdj
This gets me a column called Results where each entry looks like:
{"Chronic": 0, "Probability": 0.0016}
Is there an efficient way to get Chronic in a column by itself? Thanks!
You can do it like this,
WITH jsons (x)
AS
(
-- replace this part with your query
Select a.x from (select '{"Chronic": 0, "Probability": 0.0016}' x ) as a, (SELECT 1 as y union select 2 as y ) as b
)
select
(SELECT value FROM OPENJSON(x,'$') where [key]='Chronic') as "Chronic",
(SELECT value FROM OPENJSON(x,'$') where [key]='Probability') as "Probability"
from jsons;
I think you can replace the placeholder part of the jsons CTE with your own query. But I cannot try it since I don't have your tables...
BTW, I assume you are using MSSQL...
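For comparison, if the JSON is pulled into application code instead, the same extraction can be sketched in Python. The sample string below is invented to match the shape described in the question (a Results array whose items carry Chronic and Probability), and chronic_values is a hypothetical helper name.

```python
import json

# Made-up sample shaped like the MetaDataJSON column described in the question.
meta_data_json = (
    '{"Results": ['
    '{"Chronic": 0, "Probability": 0.0016}, '
    '{"Chronic": 1, "Probability": 0.42}'
    ']}'
)


def chronic_values(raw: str) -> list:
    """Return the Chronic value of every item in the Results array."""
    results = json.loads(raw).get("Results", [])
    return [item.get("Chronic") for item in results]


values = chronic_values(meta_data_json)
```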

How to find all pair of polygons which only touch each other in a point and only list each pair once

How to find all pair of polygons which only touch each other in a point and only list each pair once in PostgreSQL using PostGIS?
like the cycle shown on the picture:
I have written the following query:
with kms as (
select
a.county as cn1,
b.county as cn2
from spatial.us_counties as a, spatial.us_counties as b
where ST_Touches(a.geom, b.geom) = 'true' and a.id != b.id and ST_GeometryType(ST_Intersection(a.geom,b.geom)) = 'ST_Point'
)
/** below is for remove reversed pairs **/
SELECT t1.cn1
,t1.cn2
FROM kms AS t1
LEFT OUTER JOIN kms AS t2
ON t1.cn1 = t2.cn2
AND t1.cn2 = t2.cn1
WHERE t2.cn1 IS NULL
OR t1.cn1 < t2.cn1
But this query caused serious performance issues, and it returned all pairs twice (as reversed pairs).
This approach is not a solution at all.
Can anyone help me with this or give me any hints?
I'm not absolutely sure, so I need your feedback on this answer.
Try:
SELECT DISTINCT A.county
FROM spatial.us_counties AS A, spatial.us_counties AS B
WHERE ST_Touches(A.geom, B.geom) = 'true'
According to: https://postgis.net/docs/ST_Touches.html ST_Touches should return touching polygons only and not intersecting so this should eliminate the need for the where statement that checks if it's a point intersection. Selecting DISTINCT should help with the duplicates.
Adding an index https://postgis.net/docs/using_postgis_dbmanagement.html#idm2269 to the table will help speed up the geometry queries. Let me know if you've already done all this, I can edit my answer.
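As a side note on the "list each pair once" part: a common trick is to use a.id < b.id instead of a.id != b.id in the self-join, so that only one ordering of each pair survives. The sketch below shows just that idea with Python's sqlite3 module. Since SQLite has no PostGIS, a made-up table touches(id1, id2) stands in for the ST_Touches / point-intersection test, and the county names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE us_counties (id INTEGER PRIMARY KEY, county TEXT);
    -- Hypothetical stand-in for the spatial predicate; symmetric like ST_Touches.
    CREATE TABLE touches (id1 INTEGER, id2 INTEGER);
    INSERT INTO us_counties VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO touches VALUES (1, 2), (2, 1), (2, 3), (3, 2);
""")


def touching_pairs(comparison: str) -> list:
    """Self-join the counties using the given id comparison ('!=' or '<')."""
    sql = f"""
        SELECT a.county, b.county
        FROM us_counties AS a
        JOIN us_counties AS b ON a.id {comparison} b.id
        JOIN touches AS t ON t.id1 = a.id AND t.id2 = b.id
        ORDER BY a.id, b.id
    """
    return conn.execute(sql).fetchall()


both_orders = touching_pairs("!=")  # every pair appears twice, once reversed
once = touching_pairs("<")          # each unordered pair appears once
```

In the PostGIS query this would amount to changing the a.id != b.id condition, which would also make the second self-join for removing reversed pairs unnecessary; that is untested against the original data.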

Nested Query Alternatives in AWS Athena

I am running a query that gives a non-overlapping set of first_party_id's - ids that are associated with one third party but not another. This query does not run in Athena, however, giving the error: Correlated queries not yet supported.
Was looking at prestodb docs, https://prestodb.io/docs/current/sql/select.html (Athena is prestodb under the hood), for an alternative to nested queries. The with statement example given doesn't seem to translate well for this not in clause. Wondering what the alternative to a nested query would be - Query below.
SELECT
COUNT(DISTINCT i.third_party_id) AS uniques
FROM
db.ids i
WHERE
i.third_party_type = 'cookie_1'
AND i.first_party_id NOT IN (
SELECT
i.first_party_id
WHERE
i.third_party_id = 'cookie_2'
)
There may be a better way to do this - I would be curious to see it too! One way I can think of would be to use an outer join. (I'm not exactly sure about how your data is structured, so forgive the contrived example, but I hope it would translate ok.) How about this?
with
a as (select *
from (values
(1,'cookie_n',10,'cookie_2'),
(2,'cookie_n',11,'cookie_1'),
(3,'cookie_m',12,'cookie_1'),
(4,'cookie_m',12,'cookie_1'),
(5,'cookie_q',13,'cookie_1'),
(6,'cookie_n',13,'cookie_1'),
(7,'cookie_m',14,'cookie_3')
) as db_ids(first_party_id, first_party_type, third_party_id, third_party_type)
),
b as (select first_party_type
from a where third_party_type = 'cookie_2'),
c as (select a.third_party_id, b.first_party_type as exclude_first_party_type
from a left join b on a.first_party_type = b.first_party_type
where a.third_party_type = 'cookie_1')
select count(distinct third_party_id) from c
where exclude_first_party_type is null;
Hope this helps!
You can use an outer join:
SELECT
COUNT(DISTINCT a.third_party_id) AS uniques
FROM
db.ids a
LEFT JOIN
db.ids b
ON a.first_party_id = b.first_party_id
AND b.third_party_id = 'cookie_2'
WHERE
a.third_party_type = 'cookie_1'
AND b.third_party_id is null -- this line means we select only rows where there is no match
You should also be careful with NOT IN when the subquery may return NULL values. Comparing a.first_party_id to NULL yields NULL (unknown) rather than true or false, so as soon as the subquery returns a single NULL, the NOT IN condition can never be true and no rows are returned. Nasty little gotcha.
One way to avoid this is to avoid NOT IN altogether, or to add a condition to your subquery, e.g. AND first_party_id IS NOT NULL.
See here for a longer explanation.
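The NULL behaviour and the outer-join alternative can be tried out with a small sketch using Python's sqlite3 module, which follows the same three-valued logic for NOT IN. The table ids and its rows are invented for the demonstration and only loosely mirror db.ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: first_party_id 2 and a NULL id are both tied to 'cookie_2'.
    CREATE TABLE ids (first_party_id INTEGER, third_party_id TEXT, third_party_type TEXT);
    INSERT INTO ids VALUES
        (1,    'x1',       'cookie_1'),
        (2,    'x2',       'cookie_1'),
        (2,    'cookie_2', 'other'),
        (NULL, 'cookie_2', 'other');
""")


def count(sql: str) -> int:
    return conn.execute(sql).fetchone()[0]


# Subquery returns {2, NULL}: the NOT IN test is never true, so nothing is counted.
not_in = count("""
    SELECT COUNT(DISTINCT third_party_id) FROM ids
    WHERE third_party_type = 'cookie_1'
      AND first_party_id NOT IN (
          SELECT first_party_id FROM ids WHERE third_party_id = 'cookie_2')
""")

# Same query with the NULLs filtered out of the subquery.
not_in_guarded = count("""
    SELECT COUNT(DISTINCT third_party_id) FROM ids
    WHERE third_party_type = 'cookie_1'
      AND first_party_id NOT IN (
          SELECT first_party_id FROM ids
          WHERE third_party_id = 'cookie_2' AND first_party_id IS NOT NULL)
""")

# Outer-join form: keep only rows of a that found no match in b.
left_join = count("""
    SELECT COUNT(DISTINCT a.third_party_id)
    FROM ids a
    LEFT JOIN ids b
      ON a.first_party_id = b.first_party_id
     AND b.third_party_id = 'cookie_2'
    WHERE a.third_party_type = 'cookie_1'
      AND b.third_party_id IS NULL
""")
```

On this data the unguarded NOT IN counts nothing, while the guarded NOT IN and the outer join agree with each other.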

How to group by more than one row value?

I am working with POSTGRESQL and I can't find out how to solve a problem. I have a model called Foobar. Some of its attributes are:
FOOBAR
check_in:datetime
qr_code:string
city_id:integer
In this table there is a lot of redundancy (qr_code is not unique) but that is not my problem right now. What I am trying to get are the foobars that have same qr_code and have been in a well known group of cities, that have checked in at different moments.
I got this by querying:
SELECT * FROM foobar AS a
WHERE a.city_id = 1
AND EXISTS (
SELECT * FROM foobar AS b
WHERE a.check_in < b.check_in
AND a.qr_code = b.qr_code
AND b.city_id = 2
AND EXISTS (
SELECT * FROM foobar as c
WHERE b.check_in < c.check_in
AND c.qr_code = b.qr_code
AND c.city_id = 3
AND EXISTS(...)
)
)
where '...' represents more queries to get more persons with the same qr_code, different check_in date and those well known cities.
My problem is that I want to group this by qr_code, and I want to show the check_in fields of each qr_code like this:
2015-11-11 14:14:14 => [2015-11-11 14:14:14, 2015-11-11 16:16:16, 2015-11-11 17:18:20] (this for each different qr_code)
where the value on the left is the earliest date for that qr_code, and the right part lists all the dates for that qr_code, including the first one.
Is this possible to do with a sql query only? I am asking this because I am actually doing this app with rails, and I know that I can make a different approach with array methods of ruby (a solution with this would be well received too)
You could solve that with a recursive CTE - if I interpret your question correctly:
Assuming you have a given list of cities that must be visited in order by the same qr_code. Your text doesn't say so, but your query indicates as much.
WITH RECURSIVE
c AS (SELECT '{1,2,3}'::int[] AS cities) -- your list of city_id's here
, route AS (
SELECT f.check_in, f.qr_code, 2 AS idx
FROM foobar f
JOIN c ON f.city_id = c.cities[1]
UNION ALL
SELECT f.check_in, f.qr_code, r.idx + 1
FROM route r
JOIN foobar f USING (qr_code)
JOIN c ON f.city_id = c.cities[r.idx]
WHERE r.check_in < f.check_in
)
SELECT qr_code, array_agg(check_in) AS check_in_list
FROM (
SELECT *
FROM route
ORDER BY qr_code, idx -- or check_in
) sub
GROUP BY 1
HAVING count(*) = (SELECT array_length(cities, 1) FROM c);
Provide the list as array in the first (non-recursive) CTE c.
In the recursive part start with any rows in the first city and travel along your array until the last element.
In the final SELECT aggregate your check_in column in order. Only return qr_code that have visited all cities of the array.
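The three steps above can be tried out with a rough sketch in Python's sqlite3 module. SQLite has no array type, so a made-up helper table cities(idx, city_id) replaces the array in c, idx here means "position already reached" rather than "next position", and the final aggregation is done in Python instead of array_agg. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foobar (check_in TEXT, qr_code TEXT, city_id INTEGER);
    -- Hypothetical replacement for the int[] of city ids: position -> city.
    CREATE TABLE cities (idx INTEGER, city_id INTEGER);
    INSERT INTO cities VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO foobar VALUES
        ('2015-11-11 14:14:14', 'q1', 1),
        ('2015-11-11 16:16:16', 'q1', 2),
        ('2015-11-11 17:18:20', 'q1', 3),
        ('2015-11-11 10:00:00', 'q2', 1),
        ('2015-11-11 09:00:00', 'q2', 2);  -- city 2 visited before city 1
""")

rows = conn.execute("""
    WITH RECURSIVE route(check_in, qr_code, idx) AS (
        -- Start with every check-in in the first city of the list.
        SELECT f.check_in, f.qr_code, 1
        FROM foobar f
        JOIN cities c ON f.city_id = c.city_id
        WHERE c.idx = 1
        UNION ALL
        -- Step to the next city, but only to a later check-in of the same qr_code.
        SELECT f.check_in, f.qr_code, r.idx + 1
        FROM route r
        JOIN cities c ON c.idx = r.idx + 1
        JOIN foobar f ON f.qr_code = r.qr_code AND f.city_id = c.city_id
        WHERE r.check_in < f.check_in
    )
    SELECT qr_code, check_in FROM route ORDER BY qr_code, idx
""").fetchall()

city_count = conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]

# Collect check-ins per qr_code and keep only codes that reached every city.
by_code = {}
for qr_code, check_in in rows:
    by_code.setdefault(qr_code, []).append(check_in)
complete = {qr: stamps for qr, stamps in by_code.items() if len(stamps) == city_count}
```

Here q2 drops out because its city 2 check-in precedes its city 1 check-in, so the recursion never gets past the first step for it.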
Similar:
Recursive query used for transitive closure

Squeryl Select Duplicates

I would like to find overlapping data with a Squeryl query. I can do so by using the method found here with normal SQL, but can't figure out how to do so using Squeryl.
Basically I need to convert this line that finds Non-Distinct rows to Squeryl
SELECT *
FROM myTable L1
JOIN(
SELECT myField1,myField2
FROM myTable
GROUP BY myField1,myField2
HAVING COUNT(*) >= 2
) L2
ON L1.myField1 = L2.myField1 AND L1.myField2 = L2.myField2;
EDIT : More importantly I need to be able to do this dynamically. I have a bit of a complex dynamic query that I call that may rely on different options being passed. If an Option is defined then it should call this, otherwise inhibit if null. But groupBy does not support an inhibitBy method. To see a full explanation of my current method look here
def getAllJoined(
hasFallback:Option[String] = None,
showDuplicates:Option[String] = None):List[(Type1,Type2)] = transaction{
join(mainTable,
table2,
table3,
table3,
table4.leftOuter,
table4.leftOuter,
table5,
table6)((main, attr1, attr2, attr3, attr4, attr5, attr6, attr7) =>
where(
main.fallBack.isNotNull.inhibitWhen(!hasFallback.isDefined)
)
//What to do here to only find duplicates when showDuplicates.isDefined? AKA Non-Distinct
select(main,attr1,attr2,attr3,attr4,attr5,attr6,attr7)
on(
(main.attr1Col === attr1.id) ,
(main.attr2Col === attr2.id) ,
(main.attr3Col === attr3.id) ,
(main.attr4Col === attr4.map(_.id)) ,
(main.attr5Col === attr5.map(_.id)) ,
(main.attr6Col === attr6.id) ,
(main.attr7Col === attr7.id)
)
).toList
Check out this discussion on Google Groups. Looks like they had fixed a bug related to inhibited having in 2011, but not sure why it still persists in your case. They also have an example query using the having clause in the same thread.