SQL aggregation with case statement and group by


I'm trying to understand the problem with this code:
SELECT COUNT(CASE liveIn.state
WHEN ("NY" OR "NJ") THEN "group1"
WHEN ("NC" or "SC") THEN "group2"
END) AS state_groups
FROM (SELECT DISTINCT user_guid, state
FROM users
WHERE country="US" AND country IS NOT NULL) AS liveIn
GROUP BY state_groups;
The error I get is: "Can't group on 'state_groups'".
I have other code that solves my problem, shown below, but I'm trying to understand what is wrong with the query above:
SELECT COUNT(DISTINCT user_guid),
CASE
WHEN (state="NY" OR state="NJ") THEN "group1"
WHEN (state="NC" OR state="SC") THEN "group2"
END AS state_groups
FROM users
WHERE country="US" AND country IS NOT NULL
GROUP BY state_groups;
My output should look like this:
Thanks!
P.S. This is part of a Coursera SQL course, so I'm working in Jupyter.

Your CASE expression returns group labels, but they are of no use once wrapped in COUNT: every non-null label simply adds 1 to a single total. Furthermore, you cannot group by an aggregate, because the aggregate is itself computed per group. Note that your first query has only one column. You need two: one identifying the group, and the other holding the count. You merged the two into one, and that does not make sense.
This would be a valid alternative:
SELECT CASE state
WHEN "NY" THEN "group1"
WHEN "NJ" THEN "group1"
WHEN "NC" THEN "group2"
WHEN "SC" THEN "group2"
ELSE "others"
END AS state_group,
COUNT(DISTINCT user_guid) AS user_count
FROM USERS
WHERE country = "US"
GROUP BY state_group
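Note that if you put an OR in this variant of the WHEN clause, the OR is evaluated on its own first, as a boolean expression over the two string literals, and state is then compared against that result rather than against either state name. In MySQL this comparison comes out true for every row (both the non-numeric state value and the OR result are treated as 0), so all values end up in group1.
You could use the other variant of the CASE WHEN syntax, where you can use OR or, even better, IN: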
SELECT CASE WHEN state IN ("NY", "NJ") THEN "group1"
WHEN state IN ("NC", "SC") THEN "group2"
ELSE "others"
END AS state_group,
COUNT(DISTINCT user_guid) AS user_count
FROM USERS
WHERE country = "US"
GROUP BY state_group
Note that in this syntax, there is nothing between the case and the first when.
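As a quick check of the two-column shape (one column for the group label, one for the count), here is a minimal sketch using Python's built-in sqlite3 module. The rows are invented for illustration, and SQLite only stands in for the asker's database, so string literals use single quotes. Like MySQL, SQLite accepts the alias in GROUP BY.

```python
import sqlite3

# Hypothetical sample data; the real users table is not shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_guid TEXT, state TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("u1", "NY", "US"),
        ("u1", "NY", "US"),  # duplicate row: counted once by COUNT(DISTINCT)
        ("u2", "NJ", "US"),
        ("u3", "NC", "US"),
        ("u4", "TX", "US"),
        ("u5", "NY", "CA"),  # filtered out by the WHERE clause
    ],
)

# One column labels the group, the other counts distinct users in it.
state_group_counts = conn.execute(
    """
    SELECT CASE WHEN state IN ('NY', 'NJ') THEN 'group1'
                WHEN state IN ('NC', 'SC') THEN 'group2'
                ELSE 'others'
           END AS state_group,
           COUNT(DISTINCT user_guid) AS user_count
    FROM users
    WHERE country = 'US'
    GROUP BY state_group
    ORDER BY state_group
    """
).fetchall()

for state_group, user_count in state_group_counts:
    print(state_group, user_count)
```

Each label appears once, with its own count next to it, which is what the single COUNT(CASE ...) column in the original query could not express.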

Related

SQL Group by CASE result

I have a simple SQL query on IBM DB2. I'm trying to run something as below:
select case when a.custID = 42285 then 'Credit' when a.unitID <> '' then 'Sales' when a.unitID = '' then 'Refund'
else a.unitID end TYPE, sum(a.value) as Total from transactions a
group by a.custID, a.unitID
This query runs, but I have a problem with group by a.custID: I'd prefer not to have it, yet the query won't run unless it's present. I want to group by the result of the CASE expression, not by the columns behind its conditions. So I'm looking for something like:
group by TYPE
However, adding group by TYPE produces the error message "Column or global variable TYPE not found". Removing a.custID from the GROUP BY clause produces "Column custID or expression in SELECT list not valid".
Is this possible at all, or do I need to rework my CASE expression to avoid the custID column? At the moment the results are also grouped by custID, even though it's not present in the SELECT list.
I understand why the grouping works as it does; I'm just wondering whether it's possible to get rid of the custID grouping while still using custID inside the CASE expression.

If you want terseness of code, you could use a subquery here:
SELECT TYPE, SUM(value) AS Total
FROM
(
SELECT CASE WHEN a.custID = 42285 THEN 'Credit'
WHEN a.unitID <> '' THEN 'Sales'
WHEN a.unitID = '' THEN 'Refund'
ELSE a.unitID END TYPE,
value
FROM transactions a
) t
GROUP BY TYPE;
The alternative to this would be to just repeat the CASE expression in the GROUP BY clause, which is ugly, but should still work. Note that some databases (e.g. MySQL) have overloaded GROUP BY and do allow aliases to be used in at least some cases.
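Both options can be compared side by side. The sketch below uses Python's sqlite3 module with invented rows, so it illustrates the pattern rather than DB2 itself. The alias is renamed to txn_type (a hypothetical name) to stay clear of anything that looks like a keyword, and the CASE text is kept in one Python string so the repeated version cannot drift out of sync.

```python
import sqlite3

# Hypothetical sample rows; the real transactions table is not shown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (custID INTEGER, unitID TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (42285, "U1", 10),
        (42285, "", 5),
        (1, "U1", 20),
        (2, "U2", 30),
        (3, "", -7),
    ],
)

# The CASE expression from the question, written once and reused below.
case_expr = """
    CASE WHEN custID = 42285 THEN 'Credit'
         WHEN unitID <> '' THEN 'Sales'
         WHEN unitID = '' THEN 'Refund'
         ELSE unitID
    END
"""

# Option 1: compute the label in a subquery, then group by it outside.
via_subquery = conn.execute(f"""
    SELECT txn_type, SUM(value) AS total
    FROM (SELECT {case_expr} AS txn_type, value FROM transactions) AS t
    GROUP BY txn_type
    ORDER BY txn_type
""").fetchall()

# Option 2: repeat the whole CASE expression in GROUP BY.
via_repeated_case = conn.execute(f"""
    SELECT {case_expr} AS txn_type, SUM(value) AS total
    FROM transactions
    GROUP BY {case_expr}
    ORDER BY txn_type
""").fetchall()

print(via_subquery)
print(via_repeated_case)
```

Neither query groups by custID, yet custID is still used inside the CASE, which is what the question asks for.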

AND OR SQL operator with multiple records

I have the following query. If brand1 or camp1 is taken individually, the query returns the correct value, but if I specify more than one brand or campaign, it returns some other number, and I am not sure what the math behind that is. It is not the total of the two either.
I think the IN operator treats the "," as OR, whereas I need it to be treated as AND.
select campaign,
sum(case when campaign in ('camp1', 'camp2') and description in ('brand1', 'brand2') then orders else 0 end) as brand_convs
from data.camp_results
where campaign in ('camp1', 'camp2') and channel='prog' and type='sbc'
group by campaign
having brand_convs > 0
order by brand_convs desc;
Any thoughts?

The problem is in the IN part, as you suspected: the two IN operators do not affect each other in any way, so campaign can be camp1 while description is brand2.
If your DBMS supports multiple columns in an IN predicate, you can use a single IN:
SELECT campaign, SUM(
CASE WHEN (campaign, description) IN (
('camp1', 'brand1'),
('camp2', 'brand2')
) THEN orders ELSE 0 END
) [rest of query...]
If not, you're probably going to have to use ANDs and ORs:
SELECT campaign, SUM(
CASE WHEN
(campaign='camp1' AND description='brand1')
OR (campaign='camp2' AND description='brand2')
THEN orders ELSE 0 END
) [rest of query...]
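The difference between two independent IN lists and explicit pairs can be seen on a small example. This is a sketch with invented rows in Python's sqlite3 module; the channel and type filters from the question are left out to keep it short, and only the portable AND/OR form is used.

```python
import sqlite3

# Hypothetical sample rows; the table name drops the "data." schema prefix.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE camp_results (campaign TEXT, description TEXT, orders INTEGER)")
conn.executemany(
    "INSERT INTO camp_results VALUES (?, ?, ?)",
    [
        ("camp1", "brand1", 10),
        ("camp1", "brand2", 4),   # cross combination, wanted by neither pair
        ("camp2", "brand1", 3),   # cross combination, wanted by neither pair
        ("camp2", "brand2", 7),
        ("camp1", "brand3", 100),
    ],
)

# Two independent IN lists: any listed campaign with any listed brand matches.
independent_in = conn.execute("""
    SELECT campaign,
           SUM(CASE WHEN campaign IN ('camp1', 'camp2')
                     AND description IN ('brand1', 'brand2')
                    THEN orders ELSE 0 END) AS brand_convs
    FROM camp_results
    GROUP BY campaign
    ORDER BY campaign
""").fetchall()

# Explicit pairs: only (camp1, brand1) and (camp2, brand2) match.
paired = conn.execute("""
    SELECT campaign,
           SUM(CASE WHEN (campaign = 'camp1' AND description = 'brand1')
                      OR (campaign = 'camp2' AND description = 'brand2')
                    THEN orders ELSE 0 END) AS brand_convs
    FROM camp_results
    GROUP BY campaign
    ORDER BY campaign
""").fetchall()

print(independent_in)
print(paired)
```

The first query also picks up the cross combinations, which would explain a total that matches neither individual result nor their sum.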

Using SELECT with a display condition

SELECT DISTINCT Invoice.InvNo, Invoice.OrderNo, Part.PartNo,
orders.orddate AS Order_Date, Invoice.InvDate AS Bill_Date,
MiscChg.Descr, MiscChg.RegFee, Invoice.InvAmt,
Orders.ClaimNo, Firm.FirmName AS Ordering_Firm,
**oppatty.attyid(WHERE oppatty.attyfor = 13)**, Location.Name1 AS Location
The bolded section is the part I'm having trouble with. I know what I have isn't right, but it demonstrates what I would like to accomplish. The oppatty table could contain several entries. I want to display only the AttyID of the entry that has ATTYFOR = 13.
Hope this makes sense, thanks
Jack

You need to add a CASE WHEN to the select statement.
SELECT DISTINCT
Invoice.InvNo,
Invoice.OrderNo,
Part.PartNo,
orders.orddate AS Order_Date,
Invoice.InvDate AS Bill_Date,
MiscChg.Descr,
MiscChg.RegFee,
Invoice.InvAmt,
Orders.ClaimNo,
Firm.FirmName AS Ordering_Firm,
CASE WHEN oppatty.AttyFor = 13
THEN oppatty.AttyId
ELSE '' END AS attyfor,
Location.Name1 AS Location
FROM
.........
This will display the AttyId field when the row's AttyFor field is equal to 13 and show an empty string when it's not.

Your query has no from or where clause and your question is a bit jumbled, but even so, I think I understand what you want to do. Assuming it's acceptable to fill the "AttyID" values with null where "AttyFor" isn't equal to 13, then you could just use a case statement. Try something like this
select
stuff.things,
case
when oppatty.attyfor <> 13 then null
else oppatty.attyid
end as attyid,
stuff.others
from
oppatty
join stuff on oppatty.ID = stuff.ID
If that's not your desired result, and you'd rather entirely exclude rows where "AttyFor" isn't equal to 13, then just use a where clause.
select
stuff.things,
oppatty.attyid,
stuff.others
from
oppatty
join stuff on oppatty.ID = stuff.ID
where
oppatty.attyfor = 13
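The two approaches return different row sets, which a small sketch makes visible. It uses Python's sqlite3 module and a cut-down, hypothetical oppatty table (the joins from the question are omitted because the other tables are not shown).

```python
import sqlite3

# Hypothetical, simplified oppatty table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oppatty (id INTEGER, attyfor INTEGER, attyid TEXT)")
conn.executemany(
    "INSERT INTO oppatty VALUES (?, ?, ?)",
    [(1, 13, "A100"), (1, 7, "B200"), (2, 13, "C300")],
)

# CASE keeps every row and blanks the value where attyfor is not 13.
with_case = conn.execute("""
    SELECT id,
           CASE WHEN attyfor = 13 THEN attyid END AS attyid
    FROM oppatty
    ORDER BY id, attyfor
""").fetchall()

# WHERE drops the non-matching rows entirely.
with_where = conn.execute("""
    SELECT id, attyid
    FROM oppatty
    WHERE attyfor = 13
    ORDER BY id
""").fetchall()

print(with_case)   # includes a row whose attyid is None (SQL NULL)
print(with_where)  # only the attyfor = 13 rows
```

Which one fits depends on whether the other columns of the non-matching rows still need to appear in the output.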

CASE and GROUP BY in SQL

I have been writing a query that allows me to select and count rows for specific product IDs and shipment types.
Within this data, what I am now trying to achieve is to count which rows have a specific field populated (second member name) and which do not, and then return this as a separate column in my query results.
Here's the query which I have written:
select count(job.JobID) as itemsCount, Lookup_Pack.PackDescription, Lookup_Pack.PackCode, Lookup_Pack.ID, job.shipping,
CASE
WHEN Job.secondMemForename <> '' THEN count(job.JobID)
ELSE 0
END AS [Extra card count]
from job
inner join Lookup_Pack on Lookup_Pack.ID = job.packTypeID
where Lookup_Pack.PackType = 'REN'
AND job.createDate >= '2015-06-01' and Job.createDate <= '2015-06-30'
GROUP BY Lookup_Pack.PackDescription, Lookup_Pack.PackCode, Lookup_Pack.ID, Job.shipping
If I run this query, I get an error returned as I am not grouping by Job.secondMemForename:
[FreeTDS][SQL Server]Column 'job.secondMemForename' is invalid in the
select list because it is not contained in either an aggregate
function or the GROUP BY clause.
although Job.secondMemForename does not form part of the query results.
I subsequently added this field to the GROUP BY clause. The problem is that the rows where the CASE applies then come back un-grouped, because Job.secondMemForename is different for each of them.
Any idea how I can resolve this?
Thanks.
Steeve.

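Move the aggregate outside the CASE: instead of calling COUNT() inside it, wrap the whole CASE in SUM() and have it return 1 for each row to count (and 0 otherwise, so that a group with no matches yields 0 rather than NULL):
SUM(CASE WHEN Job.secondMemForename <> '' THEN 1 ELSE 0 END) AS [Extra card count]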
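A minimal sketch of this conditional count, run in Python's sqlite3 module on an invented, cut-down job table (the join to Lookup_Pack and the date filter are left out, and the bracketed SQL Server alias is replaced by a plain one):

```python
import sqlite3

# Hypothetical, simplified job table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (JobID INTEGER, shipping TEXT, secondMemForename TEXT)")
conn.executemany(
    "INSERT INTO job VALUES (?, ?, ?)",
    [
        (1, "post", "Ann"),
        (2, "post", ""),
        (3, "post", "Bob"),
        (4, "courier", ""),
    ],
)

# The CASE sits inside the aggregate, so secondMemForename never has to
# appear in GROUP BY and the rows stay grouped by shipping only.
counts_per_shipping = conn.execute("""
    SELECT shipping,
           COUNT(JobID) AS itemsCount,
           SUM(CASE WHEN secondMemForename <> '' THEN 1 ELSE 0 END) AS extra_card_count
    FROM job
    GROUP BY shipping
    ORDER BY shipping
""").fetchall()

print(counts_per_shipping)
```

Because the column is only referenced inside an aggregate, the "not contained in either an aggregate function or the GROUP BY clause" error does not arise.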

multiple count(distinct)

I get an error unless I remove one of the count(distinct ...). Can someone tell me why and how to fix it?
I'm in vfp. iif([condition],[if true],[else]) is equivalent to case when
SELECT * FROM dpgift where !nocalc AND rectype = "G" AND sol = "EM112" INTO CURSOR cGift
SELECT
list_code,
count(distinct iif(language != 'F' AND renew = '0' AND type = 'IN',donor,0)) as d_Count_E_New_Indiv,
count(distinct iif(language = 'F' AND renew = '0' AND type = 'IN',donor,0)) as d_Count_F_New_Indiv /*it works if i remove this*/
FROM cGift gift
LEFT JOIN
(select didnumb, language, type from dp) d
on cast(gift.donor as i) = cast(d.didnumb as i)
GROUP BY list_code
ORDER by list_code
Edit:
Apparently, you can't use multiple DISTINCT clauses at the same level. Is there any way around this?

VFP does NOT support two "DISTINCT" clauses in the same query... PERIOD... I've even tested on a simple table of my own, DIRECTLY from within VFP, and a query such as
select count( distinct Col1 ) as Cnt1, count( distinct col2 ) as Cnt2 from MyTable
causes a crash. I don't know why you are trying to use DISTINCT, as you are just testing a condition... It appears, more accurately, that you just want a COUNT of entries per category of criteria rather than an actual DISTINCT.
Because you are not using "alias.field" references for the columns in your query, I don't know which column comes from which table. However, to help handle your DISTINCT (and it appears you are running from WITHIN a VFP app, as you are using the "INTO CURSOR" clause, which would not be associated with any OleDB .NET development), I would pre-query and group those criteria, something like...
select list_code,
donor,
max( iif( language != 'F' and renew = '0' and type = 'IN', 1, 0 )) as EQualified,
max( iif( language = 'F' and renew = '0' and type = 'IN', 1, 0 )) as FQualified
from
cGift gift
left join dp d on cast(gift.donor as i) = cast(d.didnumb as i)
group by
list_code,
donor
into
cursor cGroupedByDonor
So the above will ONLY give a value of 1 per donor per list code, no matter how many records qualify. In addition, if one record has an "F" and another does NOT, then you'll have a value of 1 in EACH of the columns... Then you can do something like...
select
list_code,
sum( EQualified ) as DistEQualified,
sum( FQualified ) as DistFQualified
from
cGroupedByDonor
group by
list_code
into
cursor cDistinctByListCode
then run from that...
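The same two-step idea (flag each donor once, then add up the flags) can be tried outside VFP. The sketch below uses Python's sqlite3 module, with CASE in place of iif() and a subquery in place of the intermediate cursor. The gifts table and its rows are hypothetical and stand in for the joined cGift/dp data.

```python
import sqlite3

# Hypothetical flattened table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gifts (list_code TEXT, donor INTEGER, language TEXT, renew TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO gifts VALUES (?, ?, ?, ?, ?)",
    [
        ("L1", 1, "E", "0", "IN"),
        ("L1", 1, "E", "0", "IN"),  # same donor twice: must count once
        ("L1", 2, "F", "0", "IN"),
        ("L1", 3, "E", "0", "IN"),
        ("L2", 4, "F", "0", "IN"),
        ("L2", 4, "F", "1", "IN"),  # renew is not '0': does not qualify
    ],
)

# Inner query: one row per (list_code, donor) with a 0/1 flag per criterion.
# Outer query: summing the flags gives the distinct donor counts.
distinct_by_list_code = conn.execute("""
    SELECT list_code,
           SUM(EQualified) AS DistEQualified,
           SUM(FQualified) AS DistFQualified
    FROM (
        SELECT list_code,
               donor,
               MAX(CASE WHEN language <> 'F' AND renew = '0' AND type = 'IN'
                        THEN 1 ELSE 0 END) AS EQualified,
               MAX(CASE WHEN language = 'F' AND renew = '0' AND type = 'IN'
                        THEN 1 ELSE 0 END) AS FQualified
        FROM gifts
        GROUP BY list_code, donor
    ) AS grouped_by_donor
    GROUP BY list_code
    ORDER BY list_code
""").fetchall()

print(distinct_by_list_code)
```

No DISTINCT keyword is needed at all, so the restriction on multiple DISTINCT clauses in one query never comes into play.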

You can try using either another derived table or two to do the calculations you need, or using projections (queries in the field list). Without seeing the schema, it's hard to know which one will work for you.
