MySQL - Grouping using MIN not working as expected

I have the following union query where I'm setting a dummy row number:
SELECT @row_num:=2 as row_num, id, name, admin1, admin3, admin4
FROM locations
WHERE feature_class IN ('P', 'A')
AND name LIKE 'cornwall%'
UNION ALL
SELECT @row_num:=1 as row_num, id, name, admin1, admin3, admin4
FROM locations
WHERE feature_class = 'L' AND feature_code = 'RGN'
AND name LIKE 'cornwall%'
This returns the following results:
row_num | id | name | admin1 | admin3 | admin4
-------------------------------------------------------------------------
2 | 2652355 | Cornwall | ENG | |
1 | 11609029 | Cornwall | ENG | |
Now when I add this query as a subquery and use MIN to select the row with the lowest row_num, I get incorrect results:
SELECT MIN(t0.row_num),
t0.id,
t0.name,
t4.name AS town_admin4,
t3.name AS county_admin3,
t1.name AS admin1
FROM (
SELECT @row_num:=2 as row_num, id, name, admin1, admin3, admin4
FROM locations
WHERE feature_class IN ('P', 'A')
AND name LIKE 'cornwall%'
UNION ALL
SELECT @row_num:=1 as row_num, id, name, admin1, admin3, admin4
FROM locations
WHERE feature_class = 'L' AND feature_code = 'RGN'
AND name LIKE 'cornwall%'
) t0
LEFT JOIN locations t1 ON t1.admin1 = t0.admin1 AND t1.feature_code = 'ADM1'
LEFT JOIN locations t3 ON t3.admin3 = t0.admin3 AND t3.feature_code = 'ADM3'
LEFT JOIN locations t4 ON t4.admin4 = t0.admin4 AND t4.feature_code = 'ADM4'
GROUP BY t0.name,
t4.name,
t3.name,
t1.name
ORDER BY t0.name
I get the following results:
MIN(t0.row_num) | id | name | town_admin4 | county_admin3 | admin1
--------------------------------------------------------------------------------------------
1 | 2652355 | Cornwall | | | England
As you can see, the above is incorrect: the row with row_num 1 should be the record with id 11609029.
Why is it behaving like this? Why is MIN not working as expected?
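One likely explanation, assuming ONLY_FULL_GROUP_BY is disabled on this server: MIN(t0.row_num) is computed per group, but t0.id is neither aggregated nor listed in the GROUP BY, so MySQL is free to return the id of any row in the group. A common workaround is to join back to the per-group minimum so that the other columns come from the same row. Below is a minimal sketch of that idea using Python's sqlite3 module rather than MySQL; the candidates table is a hypothetical stand-in for the UNION ALL subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the result of the UNION ALL subquery (t0).
conn.execute("CREATE TABLE candidates (row_num INTEGER, id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?, ?)",
    [(2, 2652355, "Cornwall"), (1, 11609029, "Cornwall")],
)

# Compute the minimum row_num per name, then join back so that id
# is taken from the row that actually holds that minimum.
rows = conn.execute(
    """
    SELECT c.row_num, c.id, c.name
    FROM candidates AS c
    JOIN (
        SELECT name, MIN(row_num) AS min_row_num
        FROM candidates
        GROUP BY name
    ) AS m
      ON m.name = c.name AND m.min_row_num = c.row_num
    ORDER BY c.name
    """
).fetchall()
print(rows)
```

If two rows share the same minimum row_num for a name, this returns both, so a further tie-breaker would be needed in that case.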

Related

Tie-breaking multiple matches on MAX() in SQL

I have a table that looks like this:
| client_id | program_id | provider_id | date_of_service | data_entry_date | data_entry_time |
| --------- | ---------- | ----------- | --------------- | --------------- | --------------- |
| 2 | 5 | 6 | 02/02/2022 | 02/02/2022 | 0945 |
| 2 | 5 | 6 | 02/02/2022 | 02/07/2022 | 0900 |
| 2 | 5 | 6 | 02/04/2022 | 02/04/2022 | 1000 |
| 2 | 5 | 6 | 02/04/2022 | 02/04/2022 | 1700 |
| 2 | 5 | 6 | 02/04/2022 | 02/05/2022 | 0800 |
| 2 | 5 | 6 | 02/04/2022 | 02/05/2022 | 0900 |
I need to get the most recent date_of_service entered. From the table above, the desired result/row is:
date_of_service = 02/04/2022, data_entry_date = 02/05/2022, data_entry_time = 0900
This resulting date_of_service will be left joined to the master table.
This query mostly works:
SELECT t1.client_id, t1.program_id, t1.provider_id, t2.date_of_service
FROM table1 as t1
LEFT JOIN
(SELECT client_id, program_id, provider_id, date_of_service
FROM table2) as t2
ON t2.client_id = t1.client_id
AND t2.program_id = t1.program_id
AND t2.provider_id = t1.provider_id
AND t2.date_of_service =
(SELECT MAX(date_of_service)
FROM table2 as t3
WHERE t3.client_id = t1.client_id
AND t3.program_id = t1.program_id
AND t3.provider_id = t1.provider_id
)
WHERE t1.provider_id = '6'
But it also returns multiple rows whenever there is more than one match on the max(date_of_service).
To solve this, I need to use the max data_entry_date to break any ties whenever there is more than one row that matches the max(date_of_service). Likewise, I also need to use the max data_entry_time to break any ties whenever there is more than one row that also matches the max data_entry_date.
I tried the following:
SELECT t1.client_id, t1.program_id, t1.provider_id, t2.date_of_service
FROM table1 as t1
WHERE provider_id = '6'
LEFT JOIN
(SELECT TOP(1) client_id, program_id, provider_id, date_of_service, data_entry_date, data_entry_time
FROM table2
ORDER BY date_of_service DESC, data_entry_date DESC, data_entry_time DESC
) as t2
ON t2.client_id = t1.client_id
AND t2.program_id = t1.program_id
AND t2.provider_id = t1.provider_id
But I can only get it to return null values for the date_of_service.
Likewise, this:
SELECT t1.client_id, t1.program_id, t1.provider_id, t2.date_of_service
FROM table1 as t1
WHERE provider_id = '6'
LEFT JOIN
(
SELECT TOP(1) client_id AS client_id2, program_id AS program_id2, provider_id AS provider_id2, date_of_service, data_entry_date, data_entry_time
FROM table2 AS t3
JOIN
(SELECT
MAX(date_of_service) AS max_date_of_service
,MAX(data_entry_date) AS max_data_entry_date
FROM table1
WHERE date_of_service = (SELECT MAX(date_of_service) FROM table2)
) AS t4
ON t3.date_of_service = t4.max_date_of_service
AND t3.data_entry_date = t4.max_data_entry_date
ORDER BY data_entry_time
) AS t2
ON t2.client_id2 = t1.client_id
AND t2.program_id2 = t1.program_id
AND t2.provider_id2 = t1.provider_id
... works (meaning it doesn't throw any errors), but it only seems to return null values for me.
I've tried various combinations of MAX, ORDER BY, and multiple variations of JOINs, but haven't found one that works yet.
I don't know what version my SQL database is, but it doesn't appear to handle window functions like OVER and PARTITION or other things like COALESCE. I've been using DBeaver 22.2.0 to test the SQL scripts.
Based on what you've provided, it looks like you can simply query table2:
SELECT client_id, program_id, provider_id, MAX(date_of_service), MAX(data_entry_date), MAX(data_entry_time)
FROM table2
GROUP BY client_id, program_id, provider_id
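If you need to join this result set to table1, just JOIN to the statement above on client_id, program_id, provider_id. Note that each MAX is evaluated independently, so the three values can come from different rows: with the sample data this gives data_entry_date = 02/07/2022 and data_entry_time = 1700 rather than the desired 02/05/2022 and 0900.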
Try the query below. It uses only joins and a subquery.
SELECT TOP 1 * FROM table1 t1
JOIN (
SELECT
MAX(date_of_Service) AS Max_date_of_Service
,MAX(data_entry_date) AS Max_data_entry_date
FROM table1
WHERE date_of_Service = (SELECT MAX(date_of_Service) FROM table1)
)t2
ON t1.date_of_Service = t2.Max_date_of_Service
AND t1.data_entry_date = t2.Max_data_entry_date
ORDER BY data_entry_time DESC
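Since window functions are not available, another way to express the tie-break from the question is "keep the row for which no later row exists", comparing date_of_service, then data_entry_date, then data_entry_time. Here is a rough sketch of that pattern, run through Python's sqlite3 module so it can be tried locally. It assumes dates are stored in a sortable form (ISO strings here, which is not how the question shows them), and the second table1 row is invented to show the NULL case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (client_id INTEGER, program_id INTEGER, provider_id INTEGER);
    CREATE TABLE table2 (
        client_id INTEGER, program_id INTEGER, provider_id INTEGER,
        date_of_service TEXT, data_entry_date TEXT, data_entry_time TEXT
    );
    -- Client 3 is a made-up row with no services, to show the LEFT JOIN NULLs.
    INSERT INTO table1 VALUES (2, 5, 6), (3, 5, 6);
    INSERT INTO table2 VALUES
        (2, 5, 6, '2022-02-02', '2022-02-02', '0945'),
        (2, 5, 6, '2022-02-02', '2022-02-07', '0900'),
        (2, 5, 6, '2022-02-04', '2022-02-04', '1000'),
        (2, 5, 6, '2022-02-04', '2022-02-04', '1700'),
        (2, 5, 6, '2022-02-04', '2022-02-05', '0800'),
        (2, 5, 6, '2022-02-04', '2022-02-05', '0900');
    """
)

# A table2 row survives only if no other row for the same key is "later"
# on (date_of_service, data_entry_date, data_entry_time), compared in that order.
rows = conn.execute(
    """
    SELECT t1.client_id, t1.program_id, t1.provider_id,
           latest.date_of_service, latest.data_entry_date, latest.data_entry_time
    FROM table1 AS t1
    LEFT JOIN (
        SELECT t2.*
        FROM table2 AS t2
        WHERE NOT EXISTS (
            SELECT 1
            FROM table2 AS t3
            WHERE t3.client_id = t2.client_id
              AND t3.program_id = t2.program_id
              AND t3.provider_id = t2.provider_id
              AND (t3.date_of_service > t2.date_of_service
                   OR (t3.date_of_service = t2.date_of_service
                       AND t3.data_entry_date > t2.data_entry_date)
                   OR (t3.date_of_service = t2.date_of_service
                       AND t3.data_entry_date = t2.data_entry_date
                       AND t3.data_entry_time > t2.data_entry_time))
        )
    ) AS latest
      ON latest.client_id = t1.client_id
     AND latest.program_id = t1.program_id
     AND latest.provider_id = t1.provider_id
    WHERE t1.provider_id = 6
    ORDER BY t1.client_id
    """
).fetchall()
print(rows)
```

If two table2 rows are identical on all three columns, both survive, so exact duplicates would still need DISTINCT or a further tie-breaker.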

Postgres group by empty string question to include empty string in output

I have the following table in Postgres:
| phone | group | spec |
| 1 | 1 | 'Lock' |
| 1 | 2 | 'Full' |
| 1 | 3 | 'Face' |
| 2 | 1 | 'Lock' |
| 2 | 3 | 'Face' |
| 3 | 2 | 'Scan' |
I tried this:
SELECT phone, string_agg(spec, ', ')
FROM mytable
GROUP BY phone;
I need this output for each phone, with an empty string for each missing group:
| phone | spec |
| 1 | Lock, Full, Face |
| 2 | Lock, '', Face |
| 3 | '', Scan, '' |
You need a CTE which returns all possible combinations of phone and group and a left join to the table so you can group by phone:
with cte as (
select *
from (
select distinct phone from mytable
) m cross join (
select distinct "group" from mytable
) g
)
select c.phone, string_agg(coalesce(t.spec, ''''''), ',' order by c."group") spec
from cte c left join mytable t
on t.phone = c.phone and t."group" = c."group"
group by c.phone
See the demo.
Results:
| phone | spec |
| ----- | -------------- |
| 1 | Lock,Full,Face |
| 2 | Lock,'',Face |
| 3 | '',Scan,'' |
You can use conditional aggregation:
select phone,
(coalesce(max(case when "group" = 1 then spec end), '''''') || ', ' ||
coalesce(max(case when "group" = 2 then spec end), '''''') || ', ' ||
coalesce(max(case when "group" = 3 then spec end), '''''')
) as specs
from mytable t
group by phone;
Alternatively, you can generate all the groups using generate_series() and then aggregate:
select p.phone,
string_agg(coalesce(t.spec, ''''''), ', ' order by gs.grp) as specs
from (select distinct phone from mytable) p cross join
generate_series(1, 3, 1) gs(grp) left join
mytable t
on t.phone = p.phone and t."group" = gs.grp
group by p.phone
You can also use a self RIGHT JOIN against the distinct groups (the subquery right after the RIGHT JOIN keywords) together with a correlated subquery on your table:
WITH mytable1 AS
(
SELECT distinct t1.phone, t2."group",
( SELECT spec FROM mytable WHERE phone = t1.phone AND "group"=t2."group" )
FROM mytable t1
RIGHT JOIN ( SELECT distinct "group" FROM mytable ) t2
ON t2."group" = coalesce(t2."group",t1."group")
)
SELECT phone, string_agg(coalesce(spec,''''''), ', ') as spec
FROM mytable1
GROUP BY phone;
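The answers above share one idea: build every (phone, group) pair first, then left join the real rows so missing pairs show up as NULL and can be replaced. A small sketch of that idea follows, using Python's sqlite3 module instead of Postgres so it runs anywhere. The joining of strings is done in Python here only because string_agg with ORDER BY is Postgres syntax; that substitution is an assumption of this sketch, not part of the answers.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (phone INTEGER, "group" INTEGER, spec TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 1, "Lock"), (1, 2, "Full"), (1, 3, "Face"),
     (2, 1, "Lock"), (2, 3, "Face"), (3, 2, "Scan")],
)

# Cross join distinct phones with distinct groups, then left join the data.
# The SQL literal '''''' is a two-character string made of two single quotes.
cells = conn.execute(
    """
    SELECT p.phone, COALESCE(t.spec, '''''') AS spec
    FROM (SELECT DISTINCT phone FROM mytable) AS p
    CROSS JOIN (SELECT DISTINCT "group" FROM mytable) AS g
    LEFT JOIN mytable AS t
      ON t.phone = p.phone AND t."group" = g."group"
    ORDER BY p.phone, g."group"
    """
).fetchall()

# Rows arrive sorted by phone, then group, so groupby sees each phone once.
result = {
    phone: ", ".join(spec for _, spec in grp)
    for phone, grp in groupby(cells, key=lambda row: row[0])
}
print(result)
```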

Count missing values

I have the following table called Test:
Id | SomeId | Value
-----------------------------------------------------
019D9E52-41D1-45DF-81B6-C7CC484115A7 | 1 | 1
262640CA-65C2-4E30-8654-E187ACA1EEF4 | 1 | 1
53710AFC-4E19-4B1C-B68B-CDB713EC3D62 | 1 | 2
8FF7E77C-D04C-4961-82D9-87C2E5A1A096 | 1 | 2
-----------------------------------------------------
119D9E52-41D1-45DF-81B6-C7CC484115A7 | 2 | 1
762640CA-65C2-4E30-8654-E187ACA1EEF4 | 2 | 1
93710AFC-4E19-4B1C-B68B-CDB713EC3D62 | 2 | 2
4FF7E77C-D04C-4961-82D9-87C2E5A1A096 | 2 | 2
And there is a view called TestView:
SomeId | Value | Description
----------------------------
1 | 1 | 'One'
1 | 2 | 'Two'
1 | 3 | 'Three'
----------------------------
2 | 1 | 'One'
2 | 2 | 'Two'
These are just pseudo code examples.
I want to count all the values from the Test table (for a specific [SomeId]), and if a value from TestView (with that [SomeId]) is not in the Test table, I want to display 0 as its count.
If I wanted to count values WHERE [Test].[SomeId] = 1, here's the expected result:
Value | Count
-----------------
One | 2
Two | 2
Three | 0
This is my query so far:
SELECT
tv.[Description] AS [Value],
COUNT(t.[Id]) - COUNT(tv.[Value]) AS [Count]
FROM [TestView] AS tv
LEFT JOIN [Test] AS t ON
t.[SomeId] = tv.[SomeId]
AND t.[Value] = tv.[Value]
WHERE
t.[SomeId] = 1
GROUP BY
tv.[Description]
But this gives me the wrong result.
EDIT:
This is just an addition to the Test table. What if the Test table has one more foreign key Id, let's call it OtherId? Now when I use the query from the answer, I don't get the result I wanted. Here's the modified query:
SELECT
t1.Description AS Value,
COUNT(t2.Value) AS Count
FROM TestView t1
LEFT JOIN test t2
ON t1.Value = t2.Value AND t1.SomeId = t2.SomeId
WHERE t1.SomeId = 1
AND t2.[OtherId] = *something* -- this is the addition
GROUP BY t1.Value, t1.Description
ORDER BY t1.Value;
Try this:
SELECT
t1.Description AS Value,
COUNT(t2.Value) AS Count
FROM TestView t1
LEFT JOIN test t2
ON t1.Value = t2.Value AND t1.SomeId = t2.SomeId
WHERE t1.SomeId = 1
GROUP BY t1.Value, t1.Description
ORDER BY t1.Value;
Below is a solution:
SELECT
tv.[Description] AS [Value],
COUNT(t.[Id]) AS [Count]
FROM [TestView] AS tv
LEFT OUTER JOIN [Test] AS t ON tv.SomeId = t.SomeId
AND t.Value = tv.value
WHERE tv.[SomeId] = 1
GROUP BY
tv.[Description]
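Regarding the EDIT in the question: a filter on the outer-joined table (such as OtherId) placed in WHERE discards the NULL-extended rows, which effectively turns the LEFT JOIN into an inner join, so the zero counts disappear. Moving that filter into the ON clause keeps them. Here is a small sketch of the difference using Python's sqlite3 module; the OtherId values and the short Id strings are made up for illustration, and TestView is modelled as a plain table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Test (Id TEXT, SomeId INTEGER, Value INTEGER, OtherId INTEGER);
    CREATE TABLE TestView (SomeId INTEGER, Value INTEGER, Description TEXT);
    -- OtherId values are invented for this sketch.
    INSERT INTO Test VALUES
        ('a', 1, 1, 10), ('b', 1, 1, 20), ('c', 1, 2, 10), ('d', 1, 2, 10);
    INSERT INTO TestView VALUES
        (1, 1, 'One'), (1, 2, 'Two'), (1, 3, 'Three'), (2, 1, 'One'), (2, 2, 'Two');
    """
)

# Filter on the outer-joined table inside ON: unmatched values keep a 0 count.
on_clause = conn.execute(
    """
    SELECT tv.Description, COUNT(t.Id)
    FROM TestView AS tv
    LEFT JOIN Test AS t
      ON t.SomeId = tv.SomeId AND t.Value = tv.Value AND t.OtherId = ?
    WHERE tv.SomeId = 1
    GROUP BY tv.Value, tv.Description
    ORDER BY tv.Value
    """,
    (20,),
).fetchall()

# Same filter in WHERE: rows where t is NULL are removed before grouping.
where_clause = conn.execute(
    """
    SELECT tv.Description, COUNT(t.Id)
    FROM TestView AS tv
    LEFT JOIN Test AS t
      ON t.SomeId = tv.SomeId AND t.Value = tv.Value
    WHERE tv.SomeId = 1 AND t.OtherId = ?
    GROUP BY tv.Value, tv.Description
    ORDER BY tv.Value
    """,
    (20,),
).fetchall()

print(on_clause)
print(where_clause)
```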

How to get Order numbers where collection number has a top and a corresponding bottom?

This is how the main order table looks :-
| Order_num | Collection_Num |
+--------------+----------------+
| 20143045585 | 123456 |
| 20143045585 | 789012 |
| 20143045585 | 456897 |
| 20143758257 | 546465 |
+--------------+----------------+
These are the collections:-
| tops | bottom |
+--------------+----------------+
| 353735 | 745758 |
| 123456 | 789012 |
| 456456 | 456897 |
| 323456 | 546465 |
+--------------+----------------+
Desired Output:-
| Order_num |
+--------------+
| 20143045585 |
Here Order number 20143045585 has both a top and a bottom from the same row in table number 2 (each row in 2nd table forms a particular combination called 'A Collection' i.e. 1 top and corresponding bottom ).
What I want to know -
All the order numbers which have a top and a corresponding bottom in 'Collection_num' column.
Can anyone help me with the SQL for this?
Let me know if any of this is unclear.
select distinct A.Order_num
from table_1 as A
join table_2 as B on B.tops = A.Collection_num
where exists
(select 1 from table_1 as C
where C.Order_num = A.Order_num
and C.Collection_num = B.bottom)
I am assuming you just have the first table of data and each order can only have the two relevant collections or less. Perhaps:
select T1.Order_Num
,T1.Collection_Num AS Tops
,T2.Collection_Num AS Bottom
from Table1 T1
inner join Table1 T2
on T1.Order_Num = T2.Order_Num
and T1.Collection_Num < T2.Collection_Num
order by T1.Order_Num
You can try using subquery
select distinct order_num from #yourorder where collection_num
in (select tops from #yourcollections)
and order_num in
( select order_num from #yourorder where collection_num in
(select bottom from #yourcollections) )
Pretty sure that something like this should work for you. I am just using the ctes here to create the test data so it can be queried.
with Orders (OrderNum, CollectionNum) as
(
select 20143045585, 123456 union all
select 20143045585, 789012 union all
select 20143045585, 456897 union all
select 20143758257, 546465
)
, Collections (CollectionID, tops, bottoms) as
(
select 1, 353735, 745758 union all
select 2, 123456, 789012 union all
select 3, 456456, 456897 union all
select 4, 323456, 546465
)
select o.OrderNum
, t.tops
, b.bottoms
from Orders o
join Collections t on t.tops = o.CollectionNum
join
(
select o.OrderNum
, b.bottoms
, b.CollectionID
from Orders o
join Collections b on b.bottoms = o.CollectionNum
) b on b.CollectionID = t.CollectionID and b.OrderNum = o.OrderNum
Here is the query that I used:
Select *
From (select A.Order_num, B.Coll_ID, B.Bottoms from Orders_table as A
Join Collections_Table as B
on A.Collection_num = B.Bottoms
) as C
join
(select K.Order_num, M.Coll_ID, M.Tops from Orders_table as K
Join Collections_Table as M
on K.Collection_num = M.Tops
) as D
on C.Order_num = D.Order_num AND C.Coll_ID = D.Coll_ID
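The same requirement can also be written as a single pass over the collections table: for each collection row, look for one order line matching the top and another line of the same order matching the bottom. A minimal sketch with the sample data follows, using Python's sqlite3 module; the table names orders and collections are placeholders, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Placeholder table names; the data is the sample from the question.
    CREATE TABLE orders (order_num INTEGER, collection_num INTEGER);
    CREATE TABLE collections (tops INTEGER, bottom INTEGER);
    INSERT INTO orders VALUES
        (20143045585, 123456), (20143045585, 789012),
        (20143045585, 456897), (20143758257, 546465);
    INSERT INTO collections VALUES
        (353735, 745758), (123456, 789012), (456456, 456897), (323456, 546465);
    """
)

# An order qualifies when one of its lines is the top and another of its
# lines is the bottom of the SAME collections row.
rows = conn.execute(
    """
    SELECT DISTINCT o_top.order_num
    FROM collections AS c
    JOIN orders AS o_top
      ON o_top.collection_num = c.tops
    JOIN orders AS o_bot
      ON o_bot.collection_num = c.bottom
     AND o_bot.order_num = o_top.order_num
    ORDER BY o_top.order_num
    """
).fetchall()
print(rows)
```

Order 20143758257 is excluded because it only contains a bottom (546465) and not the matching top (323456).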

how to query range?

Raw Data
| ID | STATUS |
| 1 | A |
| 2 | A |
| 3 | B |
| 4 | B |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
| 9 | C |
Result
| START | END |
| 1 | 2 |
| 6 | 8 |
The result should list the ID ranges where STATUS is A.
How can I query this?
This should give you the correct ranges:
SELECT
STATUS,
MIN(ID),
max_id
FROM (
SELECT
t1.STATUS,
t1.ID,
COALESCE(MAX(t2.ID), t1.ID) max_id
FROM
yourtable t1 LEFT JOIN yourtable t2
ON t1.STATUS=t2.STATUS AND t1.ID<t2.ID
WHERE
NOT EXISTS (SELECT NULL
FROM yourtable t3
WHERE
t3.STATUS!=t1.STATUS
AND t3.ID>t1.ID AND t3.ID<t2.ID)
GROUP BY
t1.ID,
t1.STATUS
) s
WHERE
status = 'A'
GROUP BY
STATUS,
max_id
Please see fiddle here.
You are probably better off with a cursor-based solution or a client-side function.
However, if you were using Oracle - the following would work.
WITH LOWER_VALS AS
( -- All the Ids with no immediate predecessor
SELECT ROWNUM AS RN, STATUS, ID AS LOWER FROM
(
SELECT STATUS, ID
FROM RAWDATA RD1
WHERE RD1.ID -1 NOT IN
(SELECT ID FROM RAWDATA PRED_TABLE WHERE PRED_TABLE.STATUS = RD1.STATUS)
ORDER BY STATUS, ID
)
) ,
UPPER_VALS AS
( -- All the Ids with no immediate successor
SELECT ROWNUM AS RN, STATUS, ID AS UPPER FROM
(
SELECT STATUS, ID
FROM RAWDATA RD2
WHERE RD2.ID +1 NOT IN
(SELECT ID FROM RAWDATA SUCC_TABLE WHERE SUCC_TABLE.STATUS = RD2.STATUS)
ORDER BY STATUS, ID
)
)
SELECT
L.STATUS, L.LOWER, U.UPPER
FROM
LOWER_VALS L
JOIN UPPER_VALS U ON
U.RN = L.RN;
Results in the set
A 1 2
A 6 8
B 3 5
C 9 9
http://sqlfiddle.com/#!4/10184/2
There is not a lot to go on in your question, but I think this might work. I am using T-SQL because I don't know which database you are using.
SELECT
min(ID)
, max(ID)
FROM RawData
WHERE [Status] = 'A'
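Note that a plain MIN/MAX over all rows with status A yields a single row (1, 8) rather than the separate ranges 1 to 2 and 6 to 8. To get one row per run without window functions, one option is to find each run's first ID (no row with the same status directly before it) and pair it with the nearest ID that has no same-status successor. Below is a rough sketch using Python's sqlite3 module. It assumes IDs are consecutive integers, as in the sample data; the table name rawdata is borrowed from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rawdata (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO rawdata VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "B"), (5, "B"),
     (6, "A"), (7, "A"), (8, "A"), (9, "C")],
)

# Outer query: rows that start a run (the previous id has a different
# status or does not exist). Scalar subquery: the closest id at or after
# the start whose next id does not continue the run.
ranges = conn.execute(
    """
    SELECT s.id AS start_id,
           (SELECT MIN(e.id)
            FROM rawdata AS e
            WHERE e.status = s.status
              AND e.id >= s.id
              AND NOT EXISTS (SELECT 1 FROM rawdata AS n
                              WHERE n.id = e.id + 1
                                AND n.status = e.status)) AS end_id
    FROM rawdata AS s
    WHERE s.status = ?
      AND NOT EXISTS (SELECT 1 FROM rawdata AS p
                      WHERE p.id = s.id - 1
                        AND p.status = s.status)
    ORDER BY s.id
    """,
    ("A",),
).fetchall()
print(ranges)
```

If the IDs can have gaps, the id + 1 and id - 1 comparisons no longer identify neighbours, and the predecessor/successor lookups would have to be rewritten in terms of the nearest lower and higher ID.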