MySQL - Getting summary of multiple grouped rows? - sql

2 tables: owners & cars
An owner can have many cars. A car can be marked as usable_offroad, usable_onroad, or both. The cars table has usable_offroad and usable_onroad fields, which can be set to 0 or 1 (no or yes).
Consider the following query:
SELECT *
FROM owners
LEFT JOIN cars on cars.owner_id = owners.id
GROUP BY owners.id
ORDER BY owners.last_name
My goal is to return a list of owners, and whether or not each owns an onroad or offroad vehicle, or both:
Last Name   First Name   Has Offroad Car   Has Onroad Car
----------------------------------------------------------
Smith       Todd         Yes               No
Smith       Tom          Yes               Yes
Test        Sue          No                Yes
Thumb       Joe          No                No
White       Al           Yes               No
How do I query this? Was thinking of using ROLLUP, but would prefer if the summary wasn't an appended row but an actual field on the already grouped owner row instead.

Use:
SELECT DISTINCT
o.lastname,
o.firstname,
CASE WHEN COALESCE(y.num_offroad, 0) > 0 THEN 'yes' ELSE 'no' END AS "Has Offroad Car",
CASE WHEN COALESCE(x.num_onroad, 0) > 0 THEN 'yes' ELSE 'no' END AS "Has Onroad Car"
FROM OWNERS o
LEFT JOIN (SELECT c.owner_id,
COUNT(*) AS num_onroad
FROM CARS c
WHERE c.usable_onroad = 1
GROUP BY c.owner_id) x ON x.owner_id = o.id
LEFT JOIN (SELECT c.owner_id,
COUNT(*) AS num_offroad
FROM CARS c
WHERE c.usable_offroad = 1
GROUP BY c.owner_id) y ON y.owner_id = o.id

Try this:
SELECT
T1.lastname,
T1.firstname,
T1.id in (SELECT owner_id FROM cars WHERE usable_offroad) AS `Has Offroad Car`,
T1.id in (SELECT owner_id FROM cars WHERE usable_onroad) AS `Has Onroad Car`
FROM owners T1
ORDER BY T1.lastname, T1.firstname
Results:
'Smith', 'Todd', 1, 0
'Smith', 'Tom', 1, 1
'Test', 'Sue', 0, 1
'Thumb', 'Joe', 0, 0
'White', 'Al', 1, 0
Here's my test data:
CREATE TABLE owners (id INT NOT NULL, firstname NVARCHAR(100) NOT NULL, lastname NVARCHAR(100) NOT NULL);
INSERT INTO owners (id, firstname, lastname) VALUES
(1, 'Todd', 'Smith'),
(2, 'Tom', 'Smith'),
(3, 'Sue', 'Test'),
(4, 'Joe', 'Thumb'),
(5, 'Al', 'White');
CREATE TABLE cars (id INT NOT NULL, owner_id INT NOT NULL, usable_onroad INT NOT NULL, usable_offroad INT NOT NULL);
INSERT INTO cars (id, owner_id, usable_offroad, usable_onroad) VALUES
(1, 1, 1, 0),
(2, 2, 1, 0),
(3, 2, 0, 1),
(4, 3, 0, 1),
(5, 3, 0, 1),
(6, 5, 1, 0);

Using a sub-query to sum all the flags, and then checking whether each sum is > 0, should do the job:
SELECT last_name, first_name,
CASE WHEN usable_offroad_count > 0 THEN 'Yes' ELSE 'No' END has_offroad_car,
CASE WHEN usable_onroad_count > 0 THEN 'Yes' ELSE 'No' END has_onroad_car
FROM (
SELECT owners.last_name, owners.first_name,
SUM( cars.usable_offroad ) usable_offroad_count,
SUM( cars.usable_onroad ) usable_onroad_count
FROM owners
LEFT JOIN cars on cars.owner_id = owners.id
GROUP BY owners.id
) AS owner_totals
ORDER BY last_name
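The summed-flag idea can be checked end to end against the test data posted earlier in this thread. What follows is a minimal sketch rather than the answerer's exact query: it assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, uses the lastname/firstname column names from that test data, and folds the sub-query into a single grouped SELECT.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owners (id INTEGER PRIMARY KEY, firstname TEXT NOT NULL, lastname TEXT NOT NULL);
INSERT INTO owners (id, firstname, lastname) VALUES
  (1, 'Todd', 'Smith'), (2, 'Tom', 'Smith'), (3, 'Sue', 'Test'),
  (4, 'Joe', 'Thumb'), (5, 'Al', 'White');
CREATE TABLE cars (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL,
                   usable_onroad INTEGER NOT NULL, usable_offroad INTEGER NOT NULL);
INSERT INTO cars (id, owner_id, usable_offroad, usable_onroad) VALUES
  (1, 1, 1, 0), (2, 2, 1, 0), (3, 2, 0, 1), (4, 3, 0, 1), (5, 3, 0, 1), (6, 5, 1, 0);
""")

# One row per owner; each flag is summed across that owner's cars.
rows = conn.execute("""
SELECT o.lastname, o.firstname,
       CASE WHEN COALESCE(SUM(c.usable_offroad), 0) > 0 THEN 'Yes' ELSE 'No' END AS has_offroad_car,
       CASE WHEN COALESCE(SUM(c.usable_onroad), 0) > 0 THEN 'Yes' ELSE 'No' END AS has_onroad_car
FROM owners o
LEFT JOIN cars c ON c.owner_id = o.id
GROUP BY o.id, o.lastname, o.firstname
ORDER BY o.lastname, o.firstname
""").fetchall()

for row in rows:
    print(row)
```

The COALESCE covers owners with no cars at all, whose SUM is NULL after the LEFT JOIN.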

SELECT * ,
if ( cars.usable_offroad = 1 and cars.usable_onroad = 1 , 'Both'
, if( cars.usable_offroad = 1 and cars.usable_onroad = 0 , 'Offroad' , 'Onroad')
) as Status
FROM owners
LEFT JOIN cars on cars.owner_id = owners.id
GROUP BY owners.id
ORDER BY owners.last_name

Related

Transpose many aggregates in SQL

I have two tables, cases, my main table, and activities, which shows work being done against certain cases.
CREATE TABLE cases
([caseno] int, [case_detail] varchar(8), [date_received] datetime)
;
INSERT INTO cases
([caseno], [case_detail], [date_received])
VALUES
(1, 'DETAIL A', '2018-04-01 00:00:00'),
(2, 'DETAIL B', '2018-05-01 00:00:00'),
(3, 'DETAIL C', '2018-06-01 00:00:00')
;
CREATE TABLE activities
([caseno] int, [activity] int, [team] varchar(1))
;
INSERT INTO activities
([caseno], [activity], [team])
VALUES
(1, 00, 'A'),
(1, 10, 'A'),
(1, 00, 'A'),
(1, 00, 'B'),
(1, 90, 'C'),
(1, 00, 'C'),
(1, 00, 'A'),
(2, 10, 'A'),
(2, 00, 'A'),
(2, 00, 'B'),
(3, 90, 'C'),
(3, 00, 'C')
;
I'm interested in aggregating the activities data, for activity = '00', split by team, and attaching to the cases data.
I've achieved this in the following way but I suspect it is not optimal. The cases table is about 1million rows and activities table is 200million rows or so.
SELECT T.*, A.A, B.B, C.C FROM cases T
LEFT JOIN (SELECT caseno, COUNT(*) AS A FROM activities WHERE activity = '00' AND team = 'A' GROUP BY caseno) A ON T.[caseno] = A.[caseno]
LEFT JOIN (SELECT caseno, COUNT(*) AS B FROM activities WHERE activity = '00' AND team = 'B' GROUP BY caseno) B ON T.[caseno] = B.[caseno]
LEFT JOIN (SELECT caseno, COUNT(*) AS C FROM activities WHERE activity = '00' AND team = 'C' GROUP BY caseno) C ON T.[caseno] = C.[caseno]
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=92632c9af821935790a7986e6f654b13
You could use conditional aggregation:
SELECT c.caseno, case_detail, date_received,
COUNT(CASE WHEN team = 'A' THEN 1 END) AS a,
COUNT(CASE WHEN team = 'B' THEN 1 END) AS b,
COUNT(CASE WHEN team = 'C' THEN 1 END) AS c
FROM cases c
LEFT JOIN activities a
ON c.caseno = a.caseno
AND a.activity = '00'
GROUP BY c.caseno, case_detail, date_received;
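To see what conditional aggregation returns on the sample rows from the question, here is a small sketch. It assumes SQLite (through Python's sqlite3 module) in place of SQL Server, stores activity 00 as the integer 0, and uses the hypothetical column aliases team_a, team_b and team_c.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (caseno INTEGER, case_detail TEXT, date_received TEXT);
INSERT INTO cases VALUES
  (1, 'DETAIL A', '2018-04-01'), (2, 'DETAIL B', '2018-05-01'), (3, 'DETAIL C', '2018-06-01');
CREATE TABLE activities (caseno INTEGER, activity INTEGER, team TEXT);
INSERT INTO activities VALUES
  (1, 0, 'A'), (1, 10, 'A'), (1, 0, 'A'), (1, 0, 'B'), (1, 90, 'C'), (1, 0, 'C'), (1, 0, 'A'),
  (2, 10, 'A'), (2, 0, 'A'), (2, 0, 'B'),
  (3, 90, 'C'), (3, 0, 'C');
""")

# A single pass over activities: COUNT ignores the NULL that CASE yields
# for rows belonging to other teams.
team_counts = conn.execute("""
SELECT c.caseno,
       COUNT(CASE WHEN act.team = 'A' THEN 1 END) AS team_a,
       COUNT(CASE WHEN act.team = 'B' THEN 1 END) AS team_b,
       COUNT(CASE WHEN act.team = 'C' THEN 1 END) AS team_c
FROM cases c
LEFT JOIN activities act
       ON act.caseno = c.caseno AND act.activity = 0
GROUP BY c.caseno
ORDER BY c.caseno
""").fetchall()

for row in team_counts:
    print(row)
```

Keeping the activity filter in the ON clause rather than in WHERE preserves cases that have no matching activity rows.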
EDIT
Without typing all columns in GROUP BY:
WITH cte AS (
SELECT c.caseno,
COUNT(CASE WHEN team = 'A' THEN 1 END) AS a,
COUNT(CASE WHEN team = 'B' THEN 1 END) AS b,
COUNT(CASE WHEN team = 'C' THEN 1 END) AS c
FROM cases c
LEFT JOIN activities a
ON c.caseno = a.caseno
AND a.activity = '00'
GROUP BY c.caseno -- only PK
)
SELECT * FROM cte JOIN cases c ON cte.caseno = c.caseno;
Pivot solution
select *
from
( select cs.*,A.team
from cases cs
join activities a on cs.caseno=a.caseno and a.activity = 00
) C
pivot
(count(team)
for team in ([A],[B],[C])
) pvt
This gives the same result.

Best SQL query to retrieve the data which has all required data

I have a transaction table with item details for each company. I want to write a query that retrieves only the companies that have item numbers 1, 2 and 3 (see my sample code below). A selected company must have all of items 1, 2 and 3. If a company has only item 1, it shouldn't be returned. How can I write this?
CREATE TABLE #TmpTran
(
ID BIGINT IDENTITY,
COMPANY_ID BIGINT,
ITEM_NAME VARCHAR(50),
ITEM_NUMBER INT
)
INSERT INTO #TmpTran (COMPANY_ID, ITEM_NAME, ITEM_NUMBER)
VALUES (1, 'ABC', 1), (1, 'DEF', 2), (1, 'HIJ', 3),
(2, 'KLM', 4), (2, 'KLM', 5), (2, 'ABC', 1)
How can I get only Company 1 data using WHERE or JOIN query?
You can do this with group by and having:
select company_id
from #tmptran tt
where item_number in (1, 2, 3)
group by company_id
having count(distinct item_number) = 3;
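A quick way to try the GROUP BY / HAVING form on the sample rows is sketched below. It assumes SQLite (through Python's sqlite3 module) rather than SQL Server, so the temp table #TmpTran becomes an ordinary table with the hypothetical name tmp_tran, and the required item numbers are passed as bound parameters.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tmp_tran (id INTEGER PRIMARY KEY, company_id INTEGER,
                       item_name TEXT, item_number INTEGER);
INSERT INTO tmp_tran (company_id, item_name, item_number) VALUES
  (1, 'ABC', 1), (1, 'DEF', 2), (1, 'HIJ', 3),
  (2, 'KLM', 4), (2, 'KLM', 5), (2, 'ABC', 1);
""")

required = (1, 2, 3)
placeholders = ",".join("?" * len(required))  # "?,?,?"

# Keep only the required items, then demand that every one of them is present.
sql = f"""
SELECT company_id
FROM tmp_tran
WHERE item_number IN ({placeholders})
GROUP BY company_id
HAVING COUNT(DISTINCT item_number) = ?
ORDER BY company_id
"""
companies = [row[0] for row in conn.execute(sql, (*required, len(required)))]
print(companies)
```

Because the count is tied to len(required), changing the tuple is the only edit needed to test a different item set.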
Another way (more flexible approach)
select company_id
from #tmptran tt
group by company_id
having count(case when item_number = 1 then 1 end) > 0
and count(case when item_number = 2 then 1 end) > 0
and count(case when item_number = 3 then 1 end) > 0;
select tt.company_id
from #tmptran tt
where tt.item_number in (1, 2, 3)
group by tt.company_id
having max(case tt.item_number when 1 then 1 end) +
max(case tt.item_number when 2 then 1 end) +
max(case tt.item_number when 3 then 1 end) = 3
You said you have a lot of fields. Probably the easiest for the reader to follow would be something like:
select distinct tt.company_id
from #tmptran tt
where tt.item_number in (1, 2, 3)
and exists(select 1
from #tmptran ttSub
where ttSub.company_id = tt.company_id and ttSub.item_number = 1)
and exists(select 1
from #tmptran ttSub
where ttSub.company_id = tt.company_id and ttSub.item_number = 2)
and exists(select 1
from #tmptran ttSub
where ttSub.company_id = tt.company_id and ttSub.item_number = 3)

Logic to check if exact ids (3+ records) are present in a group in SQL Server

I have some sample data like:
INSERT INTO mytable
([FK_ID], [TYPE_ID])
VALUES
(10, 1),
(11, 1), (11, 2),
(12, 1), (12, 2), (12, 3),
(14, 1), (14, 2), (14, 3), (14, 4),
(15, 1), (15, 2), (15, 4)
Now, here I am trying to check if in each group by FK_ID we have exact match of TYPE_ID values for 1, 2 & 3.
So, the expected output is like:
(10, 1) this should fail
As in group FK_ID = 10 we only have one record
(11, 1), (11, 2) this should also fail
As in group FK_ID = 11 we have two records.
(12, 1), (12, 2), (12, 3) this should pass
As in group FK_ID = 12 we have three records.
And all the TYPE_ID are exactly matching 1, 2 & 3 values.
(14, 1), (14, 2), (14, 3), (14, 4) this should also fail
As we have 4 records here.
(15, 1), (15, 2), (15, 4) this should also fail
Even though we have three records, it should fail as the TYPE_ID here (1, 2, 4) are not matching with required match (1, 2, 3).
Here is my attempt:
select * from mytable t1
where exists (select COUNT(t2.TYPE_ID)
from mytable t2 where t2.FK_ID = t1.FK_ID
and t2.TYPE_ID IN (1, 2, 3)
group by t2.FK_ID having COUNT(t2.TYPE_ID) = 3);
This is not working as expected, because it also passes for FK_ID = 14, which has four records.
Also, how can we make it generic, so that if we need to check for 4 or more TYPE_ID values like (1,2,3,4) or (1,2,3,4,5), we can do that easily by updating a few values?
The following query will do what you want:
select fk_id
from t
group by fk_id
having sum(case when type_id in (1, 2, 3) then 1 else 0 end) = 3 and
sum(case when type_id not in (1, 2, 3) then 1 else 0 end) = 0;
This assumes that you have no duplicate pairs (although depending on how you want to handle duplicates, it might be as easy as using, from (select distinct * from t) t).
As for "genericness", you need to update the in lists and the 3.
If you want something more generic:
with vals as (
select id
from (values (1), (2), (3)) v(id)
)
select fk_id
from t
group by fk_id
having sum(case when type_id in (select id from vals) then 1 else 0 end) = (select count(*) from vals) and
sum(case when type_id not in (select id from vals) then 1 else 0 end) = 0;
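The generic version can be exercised against the sample rows from the question with a sketch like the one below. It assumes SQLite (through Python's sqlite3 module) instead of SQL Server, uses the question's table name mytable, and wraps the query in a hypothetical helper, groups_matching_exactly, that builds the vals CTE from a Python list.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (fk_id INTEGER, type_id INTEGER);
INSERT INTO mytable VALUES
  (10, 1),
  (11, 1), (11, 2),
  (12, 1), (12, 2), (12, 3),
  (14, 1), (14, 2), (14, 3), (14, 4),
  (15, 1), (15, 2), (15, 4);
""")


def groups_matching_exactly(conn, required):
    """Return fk_id values whose type_id set equals `required` (no duplicate pairs assumed)."""
    values_rows = ",".join("(?)" for _ in required)  # e.g. "(?),(?),(?)"
    sql = f"""
    WITH vals(id) AS (VALUES {values_rows})
    SELECT fk_id
    FROM mytable
    GROUP BY fk_id
    HAVING SUM(CASE WHEN type_id IN (SELECT id FROM vals) THEN 1 ELSE 0 END)
             = (SELECT COUNT(*) FROM vals)
       AND SUM(CASE WHEN type_id NOT IN (SELECT id FROM vals) THEN 1 ELSE 0 END) = 0
    ORDER BY fk_id
    """
    return [row[0] for row in conn.execute(sql, tuple(required))]


print(groups_matching_exactly(conn, [1, 2, 3]))
print(groups_matching_exactly(conn, [1, 2, 3, 4]))
```

The first condition requires every wanted value to be present; the second rejects groups that carry anything extra, which is what rules out FK_ID = 14 for the (1, 2, 3) check.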
You can use this code:
SELECT y.fk_id FROM
(SELECT x.fk_id, COUNT(x.type_id) AS count, SUM(x.type_id) AS sum
FROM mytable x GROUP BY (x.fk_id)) AS y
WHERE y.count = 3 AND y.sum = 6
To make it generic, compare y.count with N and y.sum with N*(N+1)/2, where the values you are looking for are 1, 2, ..., N.
You can try this query. COUNT with DISTINCT is used to eliminate duplicate records.
SELECT
[FK_ID]
FROM
#mytable T
GROUP BY
[FK_ID]
HAVING
COUNT(DISTINCT CASE WHEN [TYPE_ID] IN (1,2,3) THEN [TYPE_ID] END) = 3
AND COUNT(CASE WHEN [TYPE_ID] NOT IN (1,2,3) THEN [TYPE_ID] END) = 0
Try this:
select FK_ID,count(distinct TYPE_ID) from mytable
where TYPE_ID<=3
group by FK_ID
having count(distinct TYPE_ID)=3
You can use a chain of CTEs and pass in the number of values you mentioned in the question.
WITH CTE
AS (
SELECT FK_ID,
COUNT(*) CNT
FROM #mytable
GROUP BY FK_ID
HAVING COUNT(*) = 3), -- set this to the number of TYPE_ID values you require
CTE1
AS (
SELECT T.[ID],
T.[FK_ID],
T.[TYPE_ID],
ROW_NUMBER() OVER(PARTITION BY T.[FK_ID] ORDER BY
(
SELECT NULL
)) RN
FROM #mytable T
INNER JOIN CTE C ON C.FK_ID = T.FK_ID),
CTE2
AS (
SELECT C1.FK_ID
FROM CTE1 C1
GROUP BY C1.FK_ID
HAVING SUM(C1.TYPE_ID) = SUM(C1.RN))
SELECT TT1.*
FROM CTE2 C2
INNER JOIN #mytable TT1 ON TT1.FK_ID = C2.FK_ID;
The SQL above produces the following result (I passed 3):
ID FK_ID TYPE_ID
4 12 1
5 12 2
6 12 3

Update column based on IF Else Condition

I have two tables A and B
Table A
ID_number as PK
first_name,
L_Name
Table B
ID_number,
Email_id,
Flag
I have several people who have multiple email IDs and are already flagged as X in table B.
I am trying to find the people who have one or more email IDs but were never flagged.
E.g. John Clark might have 2 emails in table B, but was never flagged.
Simply use not exists:
select a.*
from a
where not exists (select 1
from b
where b.id_number = a.id_number and b.flag = 'X'
);
You may want to perform an update, but your question seems to be only about selecting (probably to update based on select). It should be something like this:
SELECT A.L_Name
FROM A
WHERE NOT EXISTS (
SELECT 1
FROM B
WHERE B.ID_number = A.ID_number AND B.Flag = 'X'
)
Or the LEFT JOIN version:
SELECT A.L_Name
FROM A
LEFT JOIN B ON B.ID_number = A.ID_number AND B.Flag = 'X'
WHERE B.ID_number IS NULL
Usually, the first version is faster than the second one.
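The two forms above can be compared side by side on a few rows. This is only a sketch: it assumes SQLite (through Python's sqlite3 module), and apart from John Clark every name and email address below is made-up sample data.

```python
import sqlite3

# In-memory SQLite database; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (ID_number INTEGER PRIMARY KEY, first_name TEXT, L_Name TEXT);
CREATE TABLE B (ID_number INTEGER, Email_id TEXT, Flag TEXT);
INSERT INTO A VALUES (1, 'John', 'Clark'), (2, 'Jane', 'Doe'), (3, 'Sam', 'Lee');
INSERT INTO B VALUES
  (1, 'john@example.com', NULL), (1, 'jclark@example.com', NULL),
  (2, 'jane@example.com', 'X'),  (2, 'jdoe@example.com', NULL);
""")

# Anti-join written with NOT EXISTS.
not_exists_rows = conn.execute("""
SELECT A.L_Name
FROM A
WHERE NOT EXISTS (SELECT 1 FROM B
                  WHERE B.ID_number = A.ID_number AND B.Flag = 'X')
ORDER BY A.ID_number
""").fetchall()

# The same anti-join written as LEFT JOIN ... IS NULL.
left_join_rows = conn.execute("""
SELECT A.L_Name
FROM A
LEFT JOIN B ON B.ID_number = A.ID_number AND B.Flag = 'X'
WHERE B.ID_number IS NULL
ORDER BY A.ID_number
""").fetchall()

print(not_exists_rows)
print(left_join_rows)
```

Both forms also return Lee, who has no rows in B at all; if only people with at least one email are wanted, an additional EXISTS check against B would be needed.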
Forget Table A...
SELECT DISTINCT ID_number FROM table_b t1
WHERE NOT EXISTS(
SELECT NULL FROM table_b t2 WHERE t1.ID_number=t2.ID_number AND t2.flag='X'
)
Judging by your responses in the comments, I believe this is what you are looking for:
--drop table update_test;
create table update_test
(
id_num number,
email_id number,
flag varchar2(1) default null
);
insert into update_test values (1, 1, null);
insert into update_test values (1, 2, null);
insert into update_test values (2, 3, null);
insert into update_test values (2, 7, null);
insert into update_test values (3, 2, null);
insert into update_test values (3, 3, 'X');
insert into update_test values (3, 7, null);
select * from update_test;
select id_num, min(email_id)
from update_test
group by id_num;
update update_test ut1
set flag = case
when email_id = (
select min(email_id)
from update_test ut2
where ut2.id_num = ut1.id_num
) then 'X'
else null end
where id_num not in (
select id_num
from update_test
where Flag is not null);
The last UPDATE statement sets the Flag field on the record with the lowest email_id in each id_num group. If any record in an id_num group already has Flag set, the whole group is left untouched.
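The effect of that UPDATE can be reproduced on the same rows with the sketch below. It assumes SQLite (through Python's sqlite3 module) instead of Oracle, so number and varchar2 become INTEGER and TEXT, and the outer table is referenced by name rather than through the ut1 alias.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE update_test (id_num INTEGER, email_id INTEGER, flag TEXT DEFAULT NULL);
INSERT INTO update_test VALUES
  (1, 1, NULL), (1, 2, NULL),
  (2, 3, NULL), (2, 7, NULL),
  (3, 2, NULL), (3, 3, 'X'), (3, 7, NULL);
""")

# Flag the lowest email_id in every group that has no flag yet.
conn.execute("""
UPDATE update_test
SET flag = CASE
             WHEN email_id = (SELECT MIN(ut2.email_id)
                              FROM update_test AS ut2
                              WHERE ut2.id_num = update_test.id_num)
             THEN 'X'
             ELSE NULL
           END
WHERE id_num NOT IN (SELECT ut3.id_num
                     FROM update_test AS ut3
                     WHERE ut3.flag IS NOT NULL)
""")

flags = conn.execute(
    "SELECT id_num, email_id, flag FROM update_test ORDER BY id_num, email_id"
).fetchall()
for row in flags:
    print(row)
```

Group 3 keeps its existing flag on email_id 3, while groups 1 and 2 get an X on their lowest email_id.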

Stuck on this union / except

Trying to find the best way to proceed with this, for some reason it is really tripping me up.
I have data like this:
transaction_id(pk) decision_id(pk) accepted_ind
A 1 NULL
A 2 <blank>
A 4 Y
B 1 <blank>
B 2 Y
C 1 Y
D 1 N
D 2 O
D 3 Y
Each transaction is guaranteed to have decision 1
A transaction can have multiple decision possibilities (what-if scenarios)
accepted_ind can hold several values, or be blank or NULL, but only one row per transaction can have accepted_ind = Y
I am trying to write a query to:
Return one row for each transaction_id
Return the decision_id where the accepted_ind = Y or if the transaction has no rows accepted_ind = Y, then return the row with decision_id = 1 (regardless of value in the accepted_ind)
I have tried:
1. Using logical "or" to pull the records, kept getting duplicates.
2. Using a union and except but can not quite get the logic down correctly.
Any assistance is appreciated. I am not sure why this is tripping me up so much!
Adam
Try this. Basically the WHERE clause says:
Where Accepted = 'Y'
OR
There is no accepted row for this transaction and the decision_id = 1
SELECT Transaction_id, Decision_id, Accepted_ind
FROM MyTable t
WHERE t.Accepted_ind = 'Y'
OR (NOT EXISTS (SELECT 1 FROM MyTable t2
WHERE t2.Accepted_ind = 'Y'
AND t2.Transaction_id = t.Transaction_id)
AND t.Decision_id = 1)
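The WHERE logic described above can be tried out with the following sketch. It assumes SQLite (through Python's sqlite3 module) in place of SQL Server and keeps the table name MyTable from this answer. One deliberate change to the question's data is also an assumption: transaction B's second decision is set to N instead of Y, so that the decision_id = 1 fallback is actually exercised.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (transaction_id TEXT, decision_id INTEGER, accepted_ind TEXT,
                      PRIMARY KEY (transaction_id, decision_id));
INSERT INTO MyTable VALUES
  ('A', 1, NULL), ('A', 2, ''), ('A', 4, 'Y'),
  ('B', 1, ''),   ('B', 2, 'N'),            -- changed from the question so B has no Y row
  ('C', 1, 'Y'),
  ('D', 1, 'N'),  ('D', 2, 'O'), ('D', 3, 'Y');
""")

# Take the accepted row, or decision 1 when the transaction has no accepted row.
chosen = conn.execute("""
SELECT t.transaction_id, t.decision_id, t.accepted_ind
FROM MyTable t
WHERE t.accepted_ind = 'Y'
   OR (t.decision_id = 1
       AND NOT EXISTS (SELECT 1 FROM MyTable t2
                       WHERE t2.transaction_id = t.transaction_id
                         AND t2.accepted_ind = 'Y'))
ORDER BY t.transaction_id
""").fetchall()

for row in chosen:
    print(row)
```

Each transaction yields exactly one row as long as at most one of its decisions carries a Y, which is the guarantee stated in the question.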
This approach uses ROW_NUMBER() and therefore will only work on SQL Server 2005 or later
I have modified your sample data because, as it stands, every transaction_id has a Y indicator!
CREATE TABLE #t (
transaction_id NCHAR(1),
decision_id INT,
accepted_ind NCHAR(1) NULL
)
INSERT #t VALUES
( 'A' , 1 , NULL ),
( 'A' , 2 , '' ),
( 'A' , 4 , 'Y' ),
( 'B' , 1 , '' ),
( 'B' , 2 , 'N' ), -- change from your sample data
( 'C' , 1 , 'Y' ),
( 'D' , 1 , 'N' ),
( 'D' , 2 , 'O' ),
( 'D' , 3 , 'Y' )
And here is the query itself:
SELECT transaction_id, decision_id, accepted_ind FROM (
SELECT transaction_id, decision_id, accepted_ind,
ROW_NUMBER() OVER (
PARTITION BY transaction_id
ORDER BY
CASE
WHEN accepted_ind = 'Y' THEN 1
WHEN decision_id = 1 THEN 2
ELSE 3
END
) rn
FROM #t
) Raw
WHERE rn = 1
Results:
transaction_id decision_id accepted_ind
-------------- ----------- ------------
A 4 Y
B 1
C 1 Y
D 3 Y
The ROW_NUMBER() clause gives a 'priority' to each criterion you mention; we then ORDER BY to pick the best, and take the first row.
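The same priority ordering can be run outside SQL Server as a rough check. The sketch below assumes SQLite 3.25 or later (the first release with window functions) through Python's sqlite3 module, replaces the temp table with an ordinary table carrying the hypothetical name decisions, and loads the modified sample data from this answer.

```python
import sqlite3

# In-memory SQLite database; window functions need SQLite 3.25+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE decisions (transaction_id TEXT, decision_id INTEGER, accepted_ind TEXT);
INSERT INTO decisions VALUES
  ('A', 1, NULL), ('A', 2, ''), ('A', 4, 'Y'),
  ('B', 1, ''),   ('B', 2, 'N'),
  ('C', 1, 'Y'),
  ('D', 1, 'N'),  ('D', 2, 'O'), ('D', 3, 'Y');
""")

# Rank rows within each transaction: accepted first, then decision 1, then the rest.
best_rows = conn.execute("""
SELECT transaction_id, decision_id, accepted_ind
FROM (
  SELECT transaction_id, decision_id, accepted_ind,
         ROW_NUMBER() OVER (
           PARTITION BY transaction_id
           ORDER BY CASE
                      WHEN accepted_ind = 'Y' THEN 1
                      WHEN decision_id = 1 THEN 2
                      ELSE 3
                    END
         ) AS rn
  FROM decisions
) ranked
WHERE rn = 1
ORDER BY transaction_id
""").fetchall()

for row in best_rows:
    print(row)
```

A NULL accepted_ind fails the first WHEN and falls through to the decision_id test, so decision 1 still ranks second even when its indicator is NULL.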
There's probably a neater/more efficient query, but I think this will get the job done. It assumes the table name is Decision:
SELECT CASE
WHEN accepteddecision.transaction_id IS NOT NULL THEN
accepteddecision.transaction_id
ELSE firstdecision.transaction_id
END AS transaction_id,
CASE
WHEN accepteddecision.decision_id IS NOT NULL THEN
accepteddecision.decision_id
ELSE firstdecision.decision_id
END AS decision_id,
CASE
WHEN accepteddecision.accepted_ind IS NOT NULL THEN
accepteddecision.accepted_ind
ELSE firstdecision.accepted_ind
END AS accepted_ind
FROM decision
LEFT OUTER JOIN (SELECT *
FROM decision AS accepteddecision
WHERE accepteddecision.accepted_ind = 'Y') AS
accepteddecision
ON accepteddecision.transaction_id = decision.transaction_id
LEFT OUTER JOIN (SELECT *
FROM decision AS firstdecision
WHERE firstdecision.decision_id = 1) AS firstdecision
ON firstdecision.transaction_id = decision.transaction_id
GROUP BY accepteddecision.transaction_id,
firstdecision.transaction_id,
accepteddecision.decision_id,
firstdecision.decision_id,
accepteddecision.accepted_ind,
firstdecision.accepted_ind
Out of interest, the following uses UNION and EXCEPT (plus a JOIN) as specified in the question title:
WITH T AS (SELECT * FROM (
VALUES ('A', 1, NULL),
('A', 2, ''),
('A', 4, 'Y'),
('B', 1, ''),
('B', 2, 'Y'),
('C', 1, 'Y'),
('D', 1, 'N'),
('D', 2, 'O'),
('D', 3, 'Y'),
('E', 2, 'O'), -- sample data extended
('E', 1, 'N') -- sample data extended
) AS T (transaction_id, decision_id, accepted_ind)
)
SELECT *
FROM T
WHERE accepted_ind = 'Y'
UNION
SELECT T.*
FROM (
SELECT transaction_id
FROM T
WHERE decision_id = 1
EXCEPT
SELECT transaction_id
FROM T
WHERE accepted_ind = 'Y'
) D
JOIN T
ON T.transaction_id = D.transaction_id
AND T.decision_id = 1;