Order SQL query by the sum of specific columns - sql

Here is an extract from the fairly large table (SQL Server 2005) I'm querying against:
id (primary key) | account | phone | employee | address
------------------------------------------------------------------
1 | 123 | Y | Y | N
2 | 456 | N | N | N
3 | 789 | Y | Y | Y
I need to only return the rows that have at least one Y in phone, employee, or address (there are about 10 others not shown here). Then I need to order those results by the number of Y's they have in any of the three.
I've tried getting the "tagTotal" like this:
SELECT
SUM(
CASE WHEN [phone] = 'Y' THEN 1 ELSE 0 END
+ CASE WHEN [employee] = 'Y' THEN 1 ELSE 0 END
+ CASE WHEN [address] = 'Y' THEN 1 ELSE 0 END
)
FROM table
GROUP BY id
this returns:
tagTotal
---------------
2
0
3
I'm at a loss on how to combine this with my existing giant query and order by it without adding each column to the group by at the end.

Since the values you want to add up are on the same row, you don't need to aggregate the results, which eliminates the need for the GROUP BY.
SELECT
CASE WHEN [phone] = 'Y' THEN 1 ELSE 0 END +
CASE WHEN [employee] = 'Y' THEN 1 ELSE 0 END +
CASE WHEN [address] = 'Y' THEN 1 ELSE 0 END as Total
FROM table

You can just do the addition as a column and then order the results. The aggregation seems unnecessary, at least with the sample data in the question. There is only one row per id.
SELECT t.*
FROM (SELECT t.*,
((CASE WHEN [phone] = 'Y' THEN 1 ELSE 0 END) +
(CASE WHEN [employee] = 'Y' THEN 1 ELSE 0 END) +
(CASE WHEN [address] = 'Y' THEN 1 ELSE 0 END)
) as NumYs
FROM table t
) t
WHERE NumYs > 0
ORDER BY NumYs DESC;
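The row-wise sum, filter, and sort described above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, and a hypothetical table name `accounts` holding the three sample rows from the question.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `accounts` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, account INTEGER, "
    "phone TEXT, employee TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?, ?, ?)",
    [
        (1, 123, "Y", "Y", "N"),
        (2, 456, "N", "N", "N"),
        (3, 789, "Y", "Y", "Y"),
    ],
)

# The CASE expressions are added per row, so no SUM() or GROUP BY is needed.
# The derived table lets the outer query filter and sort on the computed column.
query = """
SELECT t.id, t.NumYs
FROM (SELECT a.*,
             (CASE WHEN phone = 'Y' THEN 1 ELSE 0 END) +
             (CASE WHEN employee = 'Y' THEN 1 ELSE 0 END) +
             (CASE WHEN address = 'Y' THEN 1 ELSE 0 END) AS NumYs
      FROM accounts a) t
WHERE t.NumYs > 0
ORDER BY t.NumYs DESC
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(3, 3), (1, 2)]
```

Row 2 has no Y values and is filtered out; the remaining rows come back with the highest count first.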

Try selecting the id and ordering by the sum. Because SUM is an aggregate, the query also needs a GROUP BY on id:
SELECT id,
SUM(
CASE WHEN [phone] = 'Y' THEN 1 ELSE 0 END
+ CASE WHEN [employee] = 'Y' THEN 1 ELSE 0 END
+ CASE WHEN [address] = 'Y' THEN 1 ELSE 0 END
) as numsum
FROM table
GROUP BY id
ORDER BY numsum

This should work:
select *
from
(
SELECT
id,
SUM(
CASE WHEN [phone] = 'Y' THEN 1 ELSE 0 END
+ CASE WHEN [employee] = 'Y' THEN 1 ELSE 0 END
+ CASE WHEN [address] = 'Y' THEN 1 ELSE 0 END
) tagTotal
FROM table
GROUP BY id
) x
where x.tagTotal <> 0
order by x.tagTotal desc
The inner query is basically yours, with the addition of the Id (which I assume you need) and giving the sum a name. This is then used as the input to the outer query, excluding those with a zero total and sorting with highest sum first.
(Incidentally, this is not a large query. The largest single select statement I have written covered over 250 lines, took 20 minutes to run, and did the daily P&L of a commodity trading company. That was large...)

Related

Adding a dummy identifier to data that varies by position and value

I am working on a project in SQL Server with diagnosis codes and a patient can have up to 4 codes but not necessarily more than 1 and a patient cannot repeat a code more than once. However, codes can occur in any order. My goal is to be able to count how many times a Diagnosis code appears in total, as well as how often it appears in a set position.
My data currently resembles the following:
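PtKey | Order # | Order Date | Diagnosis1 | Diagnosis2 | Diagnosis3 | Diagnosis4
------------------------------------------------------------------------------
345 | 1527 | 7/12/20 | J44.9 | R26.2 | NULL | NULL
367 | 1679 | 7/12/20 | R26.2 | H27.2 | G47.34 | NULL
325 | 1700 | 7/12/20 | G47.34 | NULL | NULL | NULL
327 | 1710 | 7/12/20 | I26.2 | J44.9 | G47.34 | NULL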
I would think the best approach would be to create a dummy column here that would match up the diagnosis by position. For example, Diagnosis 1 with A, and Diagnosis 2 with B, etc.
My current plan is to rollup the diagnosis using an unpivot:
UNPIVOT ( Diag for ColumnALL IN (Diagnosis1, Diagnosis2, Diagnosis3, Diagnosis4)) as unpvt
However, this still doesn’t provide a way to count the diagnoses by position on a sales order.
I want it to look like this:
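Diagnosis | Total Count | Diag1 Count | Diag2 Count | Diag3 Count | Diag4 Count
-------------------------------------------------------------------------------
J44.9 | 2 | 1 | 1 | 0 | 0
R26.2 | 1 | 1 | 0 | 0 | 0
H27.2 | 1 | 0 | 1 | 0 | 0
I26.2 | 1 | 1 | 0 | 0 | 0
G47.34 | 3 | 1 | 0 | 2 | 0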
You can unpivot using apply and aggregate:
select v.diagnosis, count(*) as cnt,
sum(case when pos = 1 then 1 else 0 end) as pos_1,
sum(case when pos = 2 then 1 else 0 end) as pos_2,
sum(case when pos = 3 then 1 else 0 end) as pos_3,
sum(case when pos = 4 then 1 else 0 end) as pos_4
from data d cross apply
(values (diagnosis1, 1),
(diagnosis2, 2),
(diagnosis3, 3),
(diagnosis4, 4)
) v(diagnosis, pos)
where diagnosis is not null
group by v.diagnosis;
Another way is to use UNPIVOT to transform the columns into groupable entities:
SELECT Diagnosis, [Total Count] = COUNT(*),
[Diag1 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis1' THEN 1 ELSE 0 END),
[Diag2 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis2' THEN 1 ELSE 0 END),
[Diag3 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis3' THEN 1 ELSE 0 END),
[Diag4 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis4' THEN 1 ELSE 0 END)
FROM
(
SELECT * FROM #x UNPIVOT (Diagnosis FOR DiagGroup IN
([Diagnosis1],[Diagnosis2],[Diagnosis3],[Diagnosis4])) up
) AS x GROUP BY Diagnosis;
Example db<>fiddle
You can also manually unpivot via UNION before doing the conditional aggregation:
SELECT Diagnosis, COUNT(*) As [Total Count]
, SUM(CASE WHEN Position = 1 THEN 1 ELSE 0 END) As [Diag1 Count]
, SUM(CASE WHEN Position = 2 THEN 1 ELSE 0 END) As [Diag2 Count]
, SUM(CASE WHEN Position = 3 THEN 1 ELSE 0 END) As [Diag3 Count]
, SUM(CASE WHEN Position = 4 THEN 1 ELSE 0 END) As [Diag4 Count]
FROM
(
SELECT PtKey, Diagnosis1 As Diagnosis, 1 As Position
FROM [MyTable]
UNION ALL
SELECT PtKey, Diagnosis2 As Diagnosis, 2 As Position
FROM [MyTable]
WHERE Diagnosis2 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis3 As Diagnosis, 3 As Position
FROM [MyTable]
WHERE Diagnosis3 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis4 As Diagnosis, 4 As Position
FROM [MyTable]
WHERE Diagnosis4 IS NOT NULL
) d
GROUP BY Diagnosis
Borrowing Aaron's fiddle to avoid rebuilding the schema from scratch, we get this:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d1f7f525e175f0f066dd1749c49cc46d
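For readers without access to the fiddle, here is a minimal local sketch of the UNION ALL unpivot followed by conditional aggregation. It assumes Python's built-in sqlite3 module in place of SQL Server and a hypothetical table name `orders` that keeps only the key and the four diagnosis columns from the sample data.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `orders` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (PtKey INTEGER, Diagnosis1 TEXT, Diagnosis2 TEXT, "
    "Diagnosis3 TEXT, Diagnosis4 TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (345, "J44.9", "R26.2", None, None),
        (367, "R26.2", "H27.2", "G47.34", None),
        (325, "G47.34", None, None, None),
        (327, "I26.2", "J44.9", "G47.34", None),
    ],
)

# Step 1: turn each diagnosis column into (Diagnosis, Position) rows.
# Step 2: count rows per diagnosis, overall and per position.
query = """
SELECT Diagnosis,
       COUNT(*) AS total,
       SUM(CASE WHEN Position = 1 THEN 1 ELSE 0 END) AS diag1,
       SUM(CASE WHEN Position = 2 THEN 1 ELSE 0 END) AS diag2,
       SUM(CASE WHEN Position = 3 THEN 1 ELSE 0 END) AS diag3,
       SUM(CASE WHEN Position = 4 THEN 1 ELSE 0 END) AS diag4
FROM (
    SELECT Diagnosis1 AS Diagnosis, 1 AS Position FROM orders
    UNION ALL SELECT Diagnosis2, 2 FROM orders
    UNION ALL SELECT Diagnosis3, 3 FROM orders
    UNION ALL SELECT Diagnosis4, 4 FROM orders
) d
WHERE Diagnosis IS NOT NULL
GROUP BY Diagnosis
ORDER BY Diagnosis
"""
counts = {row[0]: row[1:] for row in conn.execute(query)}
for diagnosis, values in counts.items():
    print(diagnosis, values)
```

Each tuple holds the total followed by the four per-position counts. With the four sample orders, R26.2 appears once in position 1 and once in position 2, so its total comes out as 2.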

Oracle SQL Developer question from newbie

Sorry could not think of more descriptive title. I have data that looks like:
MEMBERID | TICKETID | STATUS
----------------------------
A | 123 | Y
A | 012 | N
A | 456 | Y
B | XYZ | N
B | ABC | N
C | DEF | Y
C | 789 | Y
I want to separate the above into three tables:
(1) Members that ONLY have tickets with Status=Y
(2) Members that have mixed status tickets (so at least one ticket with status=Y and at least one ticket with status=N)
(3) Members that ONLY have tickets with Status=N
In Excel I would just do a pivot table that results in something like:
MEMBERID | "Y" | "N"
--------------------
A | 2 | 1
B | 0 | 2
C | 2 | 0
...then add a 4th column with a formula that allows me to separate member IDs by "Only Y", "Only N", and "Y/N". I'm new to SQL though, and can't seem to get "pivot" to run correctly, or maybe there's a "where" clause that could resolve this without using pivot? Help!
You could pivot but it's probably simpler to just do the aggregation yourself:
select memberid,
count(case when status = 'Y' then ticketid end) as y,
count(case when status = 'N' then ticketid end) as n
from your_table
group by memberid
order by memberid;
To get the fourth column you can either repeat the counts within another case expression:
select memberid,
count(case when status = 'Y' then ticketid end) as y,
count(case when status = 'N' then ticketid end) as n,
case
when count(case when status = 'Y' then ticketid end) > 0
and count(case when status = 'N' then ticketid end) > 0
then 'Y/N'
when count(case when status = 'Y' then ticketid end) > 0
then 'Only Y'
when count(case when status = 'N' then ticketid end) > 0
then 'Only N'
end as yn
from your_table
group by memberid
order by memberid;
Or put the initial query into a CTE or inline view which is clearer and has less repetition, so easier to maintain:
select memberid, y, n,
case
when y > 0 and n > 0 then 'Y/N'
when y > 0 then 'Only Y'
when n > 0 then 'Only N'
end as yn
from (
select memberid,
count(case when status = 'Y' then ticketid end) as y,
count(case when status = 'N' then ticketid end) as n
from your_table
group by memberid
)
order by memberid;
Either way you end up with:
MEMBERID Y N YN
-------- - - ------
A 2 1 Y/N
B 0 2 Only N
C 2 0 Only Y
SQL Fiddle
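To go from that labeled result to the three separate member lists the question asks for, one option is to group the rows by the label on the client side. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Oracle and a hypothetical table name `tickets`; the SQL is the inline-view version from above.

```python
import sqlite3
from collections import defaultdict

# Assumption: SQLite stands in for Oracle; `tickets` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (memberid TEXT, ticketid TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        ("A", "123", "Y"), ("A", "012", "N"), ("A", "456", "Y"),
        ("B", "XYZ", "N"), ("B", "ABC", "N"),
        ("C", "DEF", "Y"), ("C", "789", "Y"),
    ],
)

# COUNT ignores NULLs, so each CASE counts only tickets with the given status.
query = """
SELECT memberid, y, n,
       CASE
         WHEN y > 0 AND n > 0 THEN 'Y/N'
         WHEN y > 0 THEN 'Only Y'
         WHEN n > 0 THEN 'Only N'
       END AS yn
FROM (
    SELECT memberid,
           COUNT(CASE WHEN status = 'Y' THEN ticketid END) AS y,
           COUNT(CASE WHEN status = 'N' THEN ticketid END) AS n
    FROM tickets
    GROUP BY memberid
)
ORDER BY memberid
"""
labeled = conn.execute(query).fetchall()

# Split the members into one list per category.
members_by_category = defaultdict(list)
for memberid, y, n, yn in labeled:
    members_by_category[yn].append(memberid)

print(dict(members_by_category))
```

The same split could be done in SQL by filtering the labeled query on `yn`; grouping in the client is just one way to get the three lists from a single pass.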

Combine multiple rows into 1 row

Say for example I have a table that contains a description of a customer's activities while in a cafe. (Metaphor of the actual table I am working on)
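Customer | Borrowed Book | Ordered Drink | Has Company
------------------------------------------------------
1 | 1 | |
1 | | 1 |
1 | | | Yes
2 | | 1 |
3 | | 1 |
3 | | | Yes
4 | 1 | 1 |
4 | | 1 |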
I wish to combine the rows in this way
Customer | Borrowed Book | Ordered Drink | Has Company
------------------------------------------------------
1 | 1 | 1 | Yes
2 | | 1 |
3 | | 1 | Yes
4 | 1 | 2 |
I did self join with coalesce, but it did not give my desired results.
You can do this with GROUP BY:
select Customer,sum([borrowed book]), sum([ordered drink]), max([has company])
from customeractivity group by Customer
As per your comment, the initial table is a temp table.
Try wrapping the query that produces that result set in a CTE, then aggregate over it, as in the query below.
; WITH cte_1
AS
(
-- your query to return the result set
)
SELECT customer,sum([borrowed book]) BorrowedBook,
sum([ordered drink]) OrderedDrink,
max([has company]) HasCompany
FROM cte_1
GROUP BY Customer
Use Group By:
DECLARE @tblTest as Table(
Customer INT,
BorrowedBook INT,
OrderedDrink INT,
HasCompany BIt
)
INSERT INTO @tblTest VALUES
(1,1,NULL,NULL)
,(1,NULL,1,NULL)
,(1,NULL,NULL,1)
,(2,NULL,1,NULL)
,(3,NULL,1,NULL)
,(3,NULL,NULL,1)
,(4,1,1,NULL)
,(4,NULL,1,NULL)
SELECT
Customer,
SUM(ISNULL(BorrowedBook,0)) AS BorrowedBook,
SUM(ISNULL(OrderedDrink,0)) AS OrderedDrink,
-- MAX, so that a single row with company is enough to report 'YES'
CASE MAX(CAST(HasCompany AS INT)) WHEN 1 THEN 'YES' ELSE '' END AS HasCompany
FROM @tblTest
GROUP BY Customer
I'm not sure why you are getting an error with GROUP BY.
Your coalesce approach should work as well; see the query below.
Select customer
, case when [borrowed] = 0 then NULL else [borrowed] end as [borrowed]
, case when [ordered] = 0 then NULL else [ordered] end as [ordered]
, case when [company] = 1 then 'Yes' end as company
from
(
Select customer,
coalesce(
case when (case when borrowed = '' then null else borrowed end) = 1 then 'borrowed' end,
case when (case when ordered = '' then null else ordered end) = 1 then 'ordered' end,
case when (case when company = '' then null else company end) = 'Yes' then 'company' end
) val
from Table
) main
PIVOT
(
COUNT (val)
FOR val IN ( [borrowed], [ordered], [company] )
) piv
OUTPUT:
customer | borrowed | ordered | company
---------------------------------------
1 1 1 Yes
2 NULL 1 NULL
3 NULL 1 Yes
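The GROUP BY approach from the earlier answers can be checked against the question's sample rows with a short sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, a hypothetical table name `customer_activity`, and underscore column names in place of the spaced ones.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_activity (customer INTEGER, borrowed_book INTEGER, "
    "ordered_drink INTEGER, has_company TEXT)"
)
conn.executemany(
    "INSERT INTO customer_activity VALUES (?, ?, ?, ?)",
    [
        (1, 1, None, None), (1, None, 1, None), (1, None, None, "Yes"),
        (2, None, 1, None),
        (3, None, 1, None), (3, None, None, "Yes"),
        (4, 1, 1, None), (4, None, 1, None),
    ],
)

# SUM and MAX skip NULLs, so each customer's partial rows collapse into one row.
# A column that is NULL on every row of a customer stays NULL in the result.
query = """
SELECT customer,
       SUM(borrowed_book) AS borrowed_book,
       SUM(ordered_drink) AS ordered_drink,
       MAX(has_company)   AS has_company
FROM customer_activity
GROUP BY customer
ORDER BY customer
"""
combined = conn.execute(query).fetchall()
for row in combined:
    print(row)
```

Customer 4 ends up with 2 ordered drinks because the two rows are summed, which matches the desired result in the question.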

SQL Server: Using COUNT with IN and NOT IN

I have a data table as follows :
file_id | action code
1 | 10
1 | 20
2 | 10
2 | 12
3 | 10
3 | 20
4 | 10
4 | 10
4 | 20
The output is:
file_id | Warning
1 | 0
2 | 0 <- this should be 1 instead
3 | 0
4 | 1
The first count works as expected and sets Warning to 1 if there are any action_code duplicates, but I can't get it to display a warning when an action_code is not evenly divisible by 10.
@exported [int] = NULL,
@bin_id [int] = NULL,
@date_start [DateTime],
@date_stop [DateTime],
@action_code [int] = NULL,
@action_description [varchar](43) = NULL
SELECT
dbo.Tf.file_id AS 'ID',
dbo.Tf.file_name AS 'NAME',
MAX(dbo.TFD.action_date) AS 'DATE',
MAX(dbo.TFD.file_length) AS 'SIZE',
dbo.Bins.name AS 'BIN',
dbo.TFD.action_description,
CASE
WHEN (COUNT(DISTINCT dbo.TFD.action_code) <> COUNT(dbo.TFD.action_code) )
AND
((SELECT COUNT ( dbo.TFD.action_code ) FROM TFD WHERE action_code IN (10,20,30,40,50)) > 0
AND
(SELECT COUNT ( dbo.TFD.action_code ) FROM TFD WHERE action_code NOT IN (10,20,30,40,50)) > 0 ) THEN 1
ELSE 0
END AS 'Warning'
FROM
( SELECT
dbo.Tf.file_id,
MAX(dbo.TFD.action_code) AS 'action_code'
FROM Tf
INNER JOIN TFD
ON Tf.file_id = TFD.file_id INNER JOIN Bins ON Tf.bin_id = Bins.bin_id
WHERE
(@bin_id IS NULL OR Tf.bin_id = @bin_id)
AND Tf.file_id IN
(
SELECT H.file_id
FROM Tf AS H INNER JOIN TFD AS D ON H.file_id = D.file_id
WHERE ((D.action_date >= @date_start AND D.action_date <= @date_stop) OR (H.file_date >= @date_start AND H.file_date <= @date_stop))
AND (H.bin_id = @bin_id OR @bin_id IS NULL)
AND H.file_type = @exported
AND ((@action_description IS NULL) OR (D.action_description LIKE @action_description + '%'))
)
AND (@exported IS NULL OR Tf.file_type = @exported)
GROUP BY dbo.Tf.file_id) AS TempSelect
INNER JOIN Tf
ON Tf.file_id = TempSelect.file_id
INNER JOIN TFD
ON (TFD.file_id = TempSelect.file_id
AND TFD.action_code = TempSelect.action_code)
INNER JOIN Bins ON Tf.bin_id = Bins.bin_id
WHERE
(
(@action_code IS NULL ) OR (@action_code <> -1 AND TempSelect.action_code = @action_code)
OR (@action_code = -1 AND TempSelect.action_code NOT IN (10,20,30,40) )
)
)
GROUP BY
dbo.Tf.file_id,
dbo.Tf.file_name,
dbo.Bins.name,
dbo.Tf.bin_id,
dbo.TFD.action_description
EDIT: I added the whole procedure. My main goal, among others, is to set the Warning field to 1 if either of the following conditions is met:
if there are any action_code duplicates (as it's the case for file 4)
if there is an action_code not divisible by 10 among the other action_codes for each file (as it's the case with file 2)
If your logic is: Set a flag to 1 if there are duplicates or if a code is not divisible by 10, then I would suggest:
select (case when count(distinct d.action_code) <> count(*) then 1
else max(case when d.action_code % 10 <> 0 then 1 else 0 end)
end)
Notice that I replaced dbo.Detail with the table alias d. Table aliases make a query easier to write, read, and understand.
Hope this helps you:
SELECT FILE_ID,
MAX(CASE WHEN action_code % 10 != 0 THEN 1 END) not_divisible,
CASE WHEN COUNT(*)!=COUNT(DISTINCT action_Code) THEN 1 END not_unique
FROM #test
GROUP BY FILE_ID
Putting it all together you can use:
SELECT file_id,
CASE WHEN COUNT(*)!=COUNT(DISTINCT action_Code) THEN 1
ELSE MAX(CASE WHEN action_code % 10 != 0 THEN 1 ELSE 0 END) END Warning
FROM #test
GROUP BY file_id
Try the query below.
CREATE TABLE #t (FileID INT,ActionCode INT)
INSERT INTO #t
VALUES (1,10),(1,20),(2,10),(2,12),(3,10),(3,20),(4,10),(4,10),(4,20)
;WITH cte_1
as (
SELECT *,COUNT(1) OVER(PARTITION BY FileID,ActionCode) CNT
FROM #T)
-- flag a file when any single code is not divisible by 10, or when a code repeats
SELECT FileID,case WHEN MAX(CASE WHEN ActionCode % 10 <> 0 THEN 1 ELSE 0 END) = 1 THEN 1 WHEN MAX(CNT)<>1 THEN 1 ELSE 0 END AS Warning
FROM CTE_1
GROUP BY FileID
Thanks all for your answers; they were helpful. I modified the following section as shown, and now it works:
...
dbo.TFD.action_description,
CASE
WHEN (COUNT(DISTINCT dbo.TFD.action_code) <> COUNT(dbo.TFD.action_code)) OR err_ac > 0
THEN 1 ELSE 0 END AS 'Warning'
FROM
(
SELECT
dbo.Tf.file_id,
MAX(dbo.TFD.action_code) AS 'action_code',
CASE
WHEN SUM(dbo.TFD.action_code) %10 <> 0 THEN 1 ELSE 0 END AS 'err_ac'
...
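A minimal sketch for checking the Warning logic locally, assuming Python's built-in sqlite3 module as a stand-in for SQL Server and a hypothetical table name `file_actions`. Files 1 to 4 are the sample rows from the question. File 5 (codes 12 and 18) is an extra hypothetical row added only to show that a per-row divisibility check and a check on the summed codes, such as SUM(action_code) % 10, can disagree.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; `file_actions` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_actions (file_id INTEGER, action_code INTEGER)")
conn.executemany(
    "INSERT INTO file_actions VALUES (?, ?)",
    [
        (1, 10), (1, 20),
        (2, 10), (2, 12),
        (3, 10), (3, 20),
        (4, 10), (4, 10), (4, 20),
        (5, 12), (5, 18),  # hypothetical: neither code is divisible by 10, but the sum is
    ],
)

query = """
SELECT file_id,
       -- duplicates, or any single code not divisible by 10
       CASE WHEN COUNT(*) <> COUNT(DISTINCT action_code) THEN 1
            ELSE MAX(CASE WHEN action_code % 10 <> 0 THEN 1 ELSE 0 END)
       END AS warning,
       -- the same test applied to the sum of the codes, for comparison
       CASE WHEN SUM(action_code) % 10 <> 0 THEN 1 ELSE 0 END AS sum_check
FROM file_actions
GROUP BY file_id
ORDER BY file_id
"""
flags = conn.execute(query).fetchall()
for row in flags:
    print(row)
```

For file 5 the per-row check raises the warning while the sum-based check does not, because 12 + 18 = 30. Whether that case can occur depends on which action codes the real data allows.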

Checking if the row has the max value in a group

I'm trying to find out if a row has the max value in a group. Here's a really simple example:
Data
VoteCount LocationId UserId
3 1 1
4 1 2
3 2 2
4 2 1
Pseudo-query
select
LocationId,
sum(case
when UserId = 1 /* and has max vote count*/
then 1 else 0
end) as IsUser1Winner,
sum(case
when UserId = 2 /* and has max vote count*/
then 1 else 0
end) as IsUser2Winner
from LocationVote
group by LocationID
It should return:
LocationId IsUser1Winner IsUser2Winner
1 0 1
2 1 1
I also couldn't find a way to generate dynamic column names here. What would be the simplest way to write this query?
You could also do this with a CASE expression, joining each row to its location's maximum and collapsing to one row per location:
WITH CTE as
(SELECT
MAX(VoteCount) max_votes
, LocationId
FROM LocationVote
group by LocationId
)
SELECT
A.LocationId
, MAX(Case When UserId=1
THEN 1
ELSE 0
END) IsUser1Winner
, MAX(Case when UserId=2
THEN 1
ELSE 0
END) IsUser2Winner
from LocationVote A
inner join
CTE B
on A.VoteCount = B.max_votes
and A.LocationId = B.LocationId
group by A.LocationId
Try this:
select *
from table t
cross apply (
select max(votes) max_value
from table ref
where ref.group = t.group
)votes
where votes.max_value = t.votes
However, if your table is huge and lacks appropriate indexes, performance may be poor.
Another way is to store the max value for each group in a table variable or temp table and then join it to the original table.
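The join-to-the-group-maximum idea can be sketched as follows, using a derived table rather than a temp table. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server and reuses the question's `LocationVote` table and sample rows; the two user ids are hard-coded, so this does not address dynamic column names.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LocationVote (VoteCount INTEGER, LocationId INTEGER, UserId INTEGER)"
)
conn.executemany(
    "INSERT INTO LocationVote VALUES (?, ?, ?)",
    [(3, 1, 1), (4, 1, 2), (3, 2, 2), (4, 2, 1)],
)

# m holds the highest vote count per location; a row "wins" when it matches that value.
query = """
SELECT v.LocationId,
       MAX(CASE WHEN v.UserId = 1 AND v.VoteCount = m.max_votes
                THEN 1 ELSE 0 END) AS IsUser1Winner,
       MAX(CASE WHEN v.UserId = 2 AND v.VoteCount = m.max_votes
                THEN 1 ELSE 0 END) AS IsUser2Winner
FROM LocationVote v
JOIN (SELECT LocationId, MAX(VoteCount) AS max_votes
      FROM LocationVote
      GROUP BY LocationId) m
  ON m.LocationId = v.LocationId
GROUP BY v.LocationId
ORDER BY v.LocationId
"""
winners = conn.execute(query).fetchall()
for row in winners:
    print(row)
```

For the sample rows this marks user 2 as the winner at location 1 and user 1 as the winner at location 2. If two users tie for the maximum at a location, both flags come out as 1.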