Remove duplicate rows when using CASE WHEN statement - sql

I am currently generating a table that pivots row values into new columns. The following is my code:
SELECT ref_no,
(CASE WHEN code = 1 THEN code END) AS 'count_1',
(CASE WHEN code = 2 THEN code END) AS 'count_2',
(CASE WHEN code = 3 THEN code END) AS 'count_3',
(CASE WHEN code = 4 THEN code END) AS 'count_4',
(CASE WHEN code = 5 THEN code END) AS 'count_5',
(CASE WHEN code = 6 THEN code END) AS 'count_6'
FROM data
The output is:
However, I need those duplicated rows to be combined. Is there any way to do this? I don't need to sum the values, as there is no overlap among them.
I've tried GROUP BY, but it does not work as expected:
My expected output looks like this:
ref c_1 c_2 c_3 c_4 c_5 c_6
1 1 2 3 - - -
This shows adding ORDER BY clause does not work in my context.
Updated: complete query in sqldf

The answer is: YES
By using GROUP BY and MAX like this:
SELECT ref_no,
max(CASE WHEN code = 1 THEN code END) AS 'count_1',
max(CASE WHEN code = 2 THEN code END) AS 'count_2',
max(CASE WHEN code = 3 THEN code END) AS 'count_3',
max(CASE WHEN code = 4 THEN code END) AS 'count_4',
max(CASE WHEN code = 5 THEN code END) AS 'count_5',
max(CASE WHEN code = 6 THEN code END) AS 'count_6'
FROM data
GROUP BY ref_no
ORDER BY ref_no
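To see why MAX collapses the rows, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The sample rows are made up for illustration, and the aliases are written without quotes. MAX ignores NULLs, so each group keeps the single non-NULL value per column:

```python
import sqlite3

# In-memory database with a hypothetical sample of the "data" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ref_no INTEGER, code INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 4), (2, 6)],
)

query = """
SELECT ref_no,
       MAX(CASE WHEN code = 1 THEN code END) AS count_1,
       MAX(CASE WHEN code = 2 THEN code END) AS count_2,
       MAX(CASE WHEN code = 3 THEN code END) AS count_3,
       MAX(CASE WHEN code = 4 THEN code END) AS count_4,
       MAX(CASE WHEN code = 5 THEN code END) AS count_5,
       MAX(CASE WHEN code = 6 THEN code END) AS count_6
FROM data
GROUP BY ref_no
ORDER BY ref_no
"""

# One row per ref_no; columns with no matching code stay NULL (None).
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 1, 2, 3, None, None, None)
# (2, None, None, None, 4, None, 6)
```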

You could use PIVOT for this
SELECT *
FROM (
SELECT ref_no, code FROM data
) data
PIVOT (
max(code) FOR code IN ([1], [2], [3], [4], [5], [6])
) pivoted

The easiest would either be to use GROUP BY or a PIVOT function.
GROUP BY example below:
SELECT ref_no,
sum(CASE WHEN code = 1 THEN code ELSE 0 END) AS 'count_1',
sum(CASE WHEN code = 2 THEN code ELSE 0 END) AS 'count_2',
sum(CASE WHEN code = 3 THEN code ELSE 0 END) AS 'count_3',
sum(CASE WHEN code = 4 THEN code ELSE 0 END) AS 'count_4',
sum(CASE WHEN code = 5 THEN code ELSE 0 END) AS 'count_5',
sum(CASE WHEN code = 6 THEN code ELSE 0 END) AS 'count_6'
FROM data
GROUP BY ref_no
A really long way of doing this using your existing code and a CTE table:
WITH results as (
SELECT ref_no,
(CASE WHEN code = 1 THEN code END) AS 'count_1',
(CASE WHEN code = 2 THEN code END) AS 'count_2',
(CASE WHEN code = 3 THEN code END) AS 'count_3',
(CASE WHEN code = 4 THEN code END) AS 'count_4',
(CASE WHEN code = 5 THEN code END) AS 'count_5',
(CASE WHEN code = 6 THEN code END) AS 'count_6'
FROM data)
SELECT
ref_no
, sum(coalesce(count_1, 0)) -- for sum
, max(coalesce(count_1, 0)) -- for just the highest value
-- Repeat for other ones
FROM
results
GROUP BY
ref_no

Related

Aggregate function with case statement

I am trying to use the aggregate function with a CASE statement but I am not getting the syntax right.
I am attaching the sample code with the question on what I am trying to achieve but I get the syntax error.
count(case when weekminus1 = 0 and week0 = 1 then distinct(asin) end) as asin_added,
count(case when weekminus1 = 0 and week0 = 1 then distinct(fnsku) end) as fnsku_added,
Any leads will be helpful.
You want to express this as:
count(distinct case when weekminus1 = 0 and week0 = 1 then asin end) as asin_added,
count(distinct case when weekminus1 = 0 and week0 = 1 then fnsku end) as fnsku_added,
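A small sketch of that pattern, run through SQLite from Python. The table name weekly_flags and its rows are hypothetical; only the column names come from the question. DISTINCT goes inside COUNT and outside the CASE, and rows that fail the condition yield NULL, which COUNT skips:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the question does not show its real name or data.
conn.execute(
    "CREATE TABLE weekly_flags (asin TEXT, fnsku TEXT, weekminus1 INTEGER, week0 INTEGER)"
)
conn.executemany(
    "INSERT INTO weekly_flags VALUES (?, ?, ?, ?)",
    [
        ("A1", "F1", 0, 1),
        ("A1", "F2", 0, 1),
        ("A2", "F3", 1, 1),  # not newly added: weekminus1 = 1
        ("A3", "F4", 0, 1),
        ("A3", "F4", 0, 1),  # duplicate row, counted once by DISTINCT
    ],
)

query = """
SELECT COUNT(DISTINCT CASE WHEN weekminus1 = 0 AND week0 = 1 THEN asin END)  AS asin_added,
       COUNT(DISTINCT CASE WHEN weekminus1 = 0 AND week0 = 1 THEN fnsku END) AS fnsku_added
FROM weekly_flags
"""

result = conn.execute(query).fetchone()
print(result)  # (2, 3)
```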

Sum a column and perform more calculations on the result? [duplicate]

In my query below I am counting occurrences in a table based on the Status column. I also want to perform calculations based on the counts I am returning. For example, let's say I want to add 100 to the Snoozed value... how do I do this? Below is what I thought would do it:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
Snoozed + 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
I get this error:
Invalid column name 'Snoozed'.
How can I take the value of the previous SUM statement, add 100 to it, and return it as another column? What I was aiming for is an additional column labeled Test that has the Snooze count + 100.
You can't use one column's alias to define another column in the same SELECT list, which is what you are attempting. You have two options:
1. Do the full calculation inline (as @forpas mentioned in the comments above).
2. Use a temp table or table variable to store the first five columns, then either add the last column to it or select from it and compute the last column there.
You cannot reference a column alias elsewhere in the same SELECT list. The correct script is:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+100 AS Snoozed
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
MSSQL does not allow you to reference fields (or aliases) in the SELECT statement from within the same SELECT statement.
To work around this:
Use a CTE. Define the columns you want to select from in the CTE, and then select from them outside the CTE.
;WITH OurCte AS (
SELECT
5 + 5 - 3 AS OurInitialValue
)
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM OurCte
Use a temp table. This is very similar in functionality to using a CTE, however, it does have different performance implications.
SELECT
5 + 5 - 3 AS OurInitialValue
INTO #OurTempTable
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM #OurTempTable
Use a subquery. This tends to be more difficult to read than the above. I'm not certain what the advantage is to this - maybe someone in the comments can enlighten me.
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM (
SELECT
5 + 5 - 3 AS OurInitialValue
) OurSubquery
Embed your calculations. Opinion warning: this is really sloppy and not a great approach, because you end up duplicating code and can easily throw columns out of sync if you update the calculation in one location and not the other.
SELECT
5 + 5 - 3 AS OurInitialValue
, (5 + 5 - 3) / 2 AS OurFinalValue
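You can't use a column alias in the same SELECT list. Column aliases have no precedence or sequence among themselves; they are all assigned when the SELECT list is evaluated, which logically happens after GROUP BY and before ORDER BY. That is why ORDER BY can refer to an alias but another SELECT expression cannot.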
You must repeat code :
SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+ 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
If you don't want to repeat the code, use a subquery
SELECT
ID, Name, LeadCount, Working, Uninterested,Converted, Snoozed, Snoozed +100 AS test
FROM
(SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
FROM Prospects p
INNER JOIN ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE p.Store = '108'
GROUP BY pu.Name, pu.Id) t
ORDER BY Name
or a view
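Here is a cut-down sketch of the subquery approach, run against SQLite from Python rather than SQL Server. The table and column names follow the question, but the rows and the reduced column list are assumptions for illustration. The inner query defines Snoozed; the outer query can then use it like any other column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the two tables from the question.
conn.execute("CREATE TABLE ProspectsUsers (Id INTEGER, Name TEXT, SalesForceId TEXT)")
conn.execute("CREATE TABLE Prospects (OwnerId TEXT, Store TEXT, Status TEXT, SnoozedId INTEGER)")
conn.executemany(
    "INSERT INTO ProspectsUsers VALUES (?, ?, ?)",
    [(1, "Ann", "sf1"), (2, "Bob", "sf2")],
)
conn.executemany(
    "INSERT INTO Prospects VALUES (?, ?, ?, ?)",
    [
        ("sf1", "108", "Working", 0),
        ("sf1", "108", "Converted", 5),
        ("sf1", "108", "Working", 7),
        ("sf2", "108", "Uninterested", 0),
        ("sf2", "999", "Working", 3),  # other store, filtered out
    ],
)

query = """
SELECT Id, Name, LeadCount, Snoozed, Snoozed + 100 AS Test
FROM (
    SELECT pu.Id AS Id, pu.Name AS Name,
           COUNT(*) AS LeadCount,
           SUM(CASE WHEN p.SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
    FROM Prospects p
    INNER JOIN ProspectsUsers pu ON p.OwnerId = pu.SalesForceId
    WHERE p.Store = '108'
    GROUP BY pu.Name, pu.Id
) t
ORDER BY Name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 'Ann', 3, 2, 102)
# (2, 'Bob', 1, 0, 100)
```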

Exclude Row with Same ID but Different Secondary Column Value

I'm trying to exclude a record if the ID (PK) is the same but a secondary column value is different.
In my example below, a single ID has two different codes, E03_Port and E12_PortfNotDef. Basically, I want to exclude the E12_PortfNotDef record whenever any of the E01 through E04 codes is triggered for that ID.
SELECT *
FROM (
SELECT ID, Code,
MAX(CASE WHEN Code = 'E01_Matured' THEN 1 Else NULL END) AS Matured,
MAX(CASE WHEN Code = 'E02_Terminated' THEN 2 Else NULL END) AS Terminated,
MAX(CASE WHEN Code = 'E03_Port' THEN '3' Else NULL END) AS Port,
MAX(CASE WHEN Code = 'E04_Swap' THEN 4 Else NULL END) AS Swap,
MAX(CASE WHEN Code = 'E12_PortfNotDef' THEN '12' Else NULL END) AS Port_Not_Def
FROM EXCLUDED
GROUP BY ID, Code
)
WHERE COALESCE(Matured, Terminated, Port, Swap, Port_Not_Def) IS NOT NULL
AND ID = '120320AC'
ORDER BY ID;
Actual Results:
ID Code Matured Terminated Port Swap Port_Not_Def
120320AC E03_Port 3
120320AC E12_PortfNotDef 12
Expected Results:
ID Code Matured Terminated Port Swap Port_Not_Def
120320AC E03_Port 3
One simple way is to use the row_number() analytic (window) function:
SELECT *
FROM (
SELECT ID, Code,
MAX(CASE WHEN Code = 'E01_Matured' THEN 1 Else NULL END) AS Matured,
MAX(CASE WHEN Code = 'E02_Terminated' THEN 2 Else NULL END) AS Terminated,
MAX(CASE WHEN Code = 'E03_Port' THEN '3' Else NULL END) AS Port,
MAX(CASE WHEN Code = 'E04_Swap' THEN 4 Else NULL END) AS Swap,
MAX(CASE WHEN Code = 'E12_PortfNotDef' THEN '12' Else NULL END) AS Port_Not_Def,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Code) AS RN
FROM EXCLUDED
GROUP BY ID, Code)
WHERE COALESCE(Matured, Terminated, Port, Swap, Port_Not_Def) IS NOT NULL
AND ID = '120320AC'
AND RN = 1
ORDER BY ID
I ended up restricting the ID's in the FROM Clause to not show those where Code was E01-E04.
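Another way to express the rule from the question (drop the E12_PortfNotDef row only when the same ID also has one of the E01 through E04 codes) is a correlated EXISTS check. This is just a sketch under that reading of the requirement, run against SQLite from Python; the rows other than 120320AC are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EXCLUDED (ID TEXT, Code TEXT)")
conn.executemany(
    "INSERT INTO EXCLUDED VALUES (?, ?)",
    [
        ("120320AC", "E03_Port"),
        ("120320AC", "E12_PortfNotDef"),  # should be dropped: same ID has E03
        ("220101ZZ", "E12_PortfNotDef"),  # should stay: no E01-E04 for this ID
        ("330505QQ", "E01_Matured"),
    ],
)

query = """
SELECT e.ID, e.Code
FROM EXCLUDED e
WHERE NOT (
    e.Code = 'E12_PortfNotDef'
    AND EXISTS (
        SELECT 1
        FROM EXCLUDED x
        WHERE x.ID = e.ID
          AND x.Code IN ('E01_Matured', 'E02_Terminated', 'E03_Port', 'E04_Swap')
    )
)
ORDER BY e.ID, e.Code
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('120320AC', 'E03_Port')
# ('220101ZZ', 'E12_PortfNotDef')
# ('330505QQ', 'E01_Matured')
```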

Trying to get count of votes in SQL based on ID

Table structures:
Solution_Votes:
ID int
SolutionID string
Vote int
Solution:
ID int
Solution
VotesUp
VotesDown
Code:
SELECT
*,
(SELECT SUM(CASE WHEN voteUp = 1 THEN 1 ELSE 0 END)
FROM Solutions_Votes) AS VoteCountUp,
(SELECT SUM(CASE WHEN voteDown = 0 THEN 1 ELSE 0 END)
FROM Solutions_Votes) AS VoteCountDown
FROM
Solution
When I run this query it gives me the count on each row for voteUpCount and voteDownCount. I need the count to be based on the solution ID so that each solution has its count of up votes and down votes. If anybody can help it would be appreciated. Thanks in advance!
Just use conditional aggregation. In your case this is simple:
select sv.solutionid,
sum(case when sv.voteUp = 1 then 1 else 0 end) as VoteCountUp,
sum(case when sv.voteDown = 0 then 1 else 0 end) as VoteCountDown
from solutions_votes sv
group by sv.solutionid;
You only need the solutions table if some solutions have no votes and you want to include them.
EDIT:
You can include solutions in various ways. Here is one:
select s.*, ss.VoteCountUp, ss.VoteCountDown
from Solution s left join
(select sv.solutionid,
sum(case when sv.voteUp = 1 then 1 else 0 end) as VoteCountUp,
sum(case when sv.voteDown = 0 then 1 else 0 end) as VoteCountDown
from solutions_votes sv
group by sv.solutionid
) ss
on s.ID = ss.solutionid;
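A runnable sketch of the left join idea, using SQLite from Python. It assumes a single Vote column where 1 means an up vote and 0 means a down vote (as in the table structure listed in the question), an integer SolutionID, and made-up rows. COALESCE is added so that solutions without any votes show 0 instead of NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: Vote = 1 is an up vote, Vote = 0 is a down vote.
conn.execute("CREATE TABLE Solution (ID INTEGER, Solution TEXT)")
conn.execute("CREATE TABLE Solutions_Votes (ID INTEGER, SolutionID INTEGER, Vote INTEGER)")
conn.executemany("INSERT INTO Solution VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany(
    "INSERT INTO Solutions_Votes VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 2, 0)],
)

query = """
SELECT s.ID, s.Solution,
       COALESCE(v.VoteCountUp, 0)   AS VoteCountUp,
       COALESCE(v.VoteCountDown, 0) AS VoteCountDown
FROM Solution s
LEFT JOIN (
    SELECT SolutionID,
           SUM(CASE WHEN Vote = 1 THEN 1 ELSE 0 END) AS VoteCountUp,
           SUM(CASE WHEN Vote = 0 THEN 1 ELSE 0 END) AS VoteCountDown
    FROM Solutions_Votes
    GROUP BY SolutionID
) v ON s.ID = v.SolutionID
ORDER BY s.ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 'A', 2, 1)
# (2, 'B', 0, 1)
# (3, 'C', 0, 0)
```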

Count rows for two columns using two different clauses

I'm after a CTE which I want to return two columns, one with the total number of 1's and one with the total number of 0's. Currently I can get it to return one column with the total number of 1's using:
WITH getOnesAndZerosCTE
AS (
SELECT COUNT([message]) AS TotalNo1s
FROM dbo.post
WHERE dbo.checkletters([message]) = 1
--SELECT COUNT([message]) AS TotalNo0s
--FROM dbo.post
--WHERE dbo.checkletters([message]) = 0
)
SELECT * FROM getOnesAndZerosCTE;
How do I add a second column called TotalNo0s to the same CTE? I have commented out the query for it above to show what I mean.
Using conditional aggregation:
WITH getOnesAndZerosCTE AS(
SELECT
TotalNo1s = SUM(CASE WHEN dbo.checkletters([message]) = 1 THEN 1 ELSE 0 END),
TotalNo0s = SUM(CASE WHEN dbo.checkletters([message]) = 0 THEN 1 ELSE 0 END)
FROM post
)
SELECT * FROM getOnesAndZerosCTE;
If you use COUNT() directly, be aware that it counts only non-NULL values. You can omit the ELSE branch, because a CASE without ELSE implicitly returns NULL when no condition matches:
SELECT
COUNT(CASE WHEN dbo.checkletters([message]) = 1 THEN 1 END) TotalNo1s
, COUNT(CASE WHEN dbo.checkletters([message]) = 0 THEN 1 END) TotalNo0s
FROM post
or, explicitly state NULL
SELECT
COUNT(CASE WHEN dbo.checkletters([message]) = 1 THEN 1 ELSE NULL END) TotalNo1s
, COUNT(CASE WHEN dbo.checkletters([message]) = 0 THEN 1 ELSE NULL END) TotalNo0s
FROM post
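To check that the SUM and COUNT forms agree, here is a sketch using SQLite from Python. dbo.checkletters is the asker's own function and its logic is not shown, so the checkletters below is a hypothetical stand-in (1 if the message is purely alphabetic, else 0) registered just for the demo:

```python
import sqlite3

def checkletters(message):
    # Hypothetical stand-in for dbo.checkletters: 1 if all letters, else 0.
    return 1 if message is not None and message.isalpha() else 0

conn = sqlite3.connect(":memory:")
conn.create_function("checkletters", 1, checkletters)
conn.execute("CREATE TABLE post (message TEXT)")
conn.executemany(
    "INSERT INTO post VALUES (?)",
    [("hello",), ("abc123",), ("world",), ("42",), ("ok",)],
)

query = """
SELECT SUM(CASE WHEN checkletters(message) = 1 THEN 1 ELSE 0 END) AS ones_via_sum,
       COUNT(CASE WHEN checkletters(message) = 1 THEN 1 END)      AS ones_via_count,
       SUM(CASE WHEN checkletters(message) = 0 THEN 1 ELSE 0 END) AS zeros_via_sum,
       COUNT(CASE WHEN checkletters(message) = 0 THEN 1 END)      AS zeros_via_count
FROM post
"""

result = conn.execute(query).fetchone()
print(result)  # (3, 3, 2, 2)
```

One difference worth knowing: on an empty table SUM returns NULL while COUNT returns 0.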
You can do it without CTE
select count(message) total,
dbo.checkletters(message) strLength
from post
group by dbo.checkletters(message)
having dbo.checkletters(message) in (1, 2) -- all the messages with length 1 or 2