Custom aggregation in GROUP BY clause

If I have a table with a schema like this
table(Category, SubCategory1, SubCategory2, Status)
I would like to group by Category, SubCategory1 and aggregate the Status such that
if not all Status values over the group have a certain value Status will be 0 otherwise 1.
So my result set will look like
(Category, SubCategory1, Status)
I don't want to write a function. I would like to do it inside the query.

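Assuming that status is a numeric data type, this returns 1 when every status in the group has the same value (whatever that value is) and 0 otherwise: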
SELECT t.category,
t.subcategory1,
CASE WHEN MIN(t.status) = MAX(t.status) THEN 1 ELSE 0 END AS status
FROM dbo.TABLE_1 t
GROUP BY t.category, t.subcategory1

You can test that both the minimum and maximum status for each group are equal to your desired value:
SELECT
category,
subcategory1,
CASE WHEN MIN(status) = 42 AND MAX(status) = 42 THEN 1 ELSE 0 END AS Status
FROM table1
GROUP BY category, subcategory1
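The difference between the two MIN/MAX tests can be checked on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server; the sample rows and the target value 42 are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with made-up sample rows (not from the original question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (category TEXT, subcategory1 TEXT, subcategory2 TEXT, status INTEGER);
    INSERT INTO table1 VALUES
        ('A', 'x', 'p', 42), ('A', 'x', 'q', 42),
        ('A', 'y', 'p', 42), ('A', 'y', 'q', 7),
        ('B', 'x', 'p', 7),  ('B', 'x', 'q', 7);
""")

TARGET = 42  # the "certain value" every status in a group must equal

# 1 only when every status in the group equals TARGET.
all_match_rows = conn.execute("""
    SELECT category, subcategory1,
           CASE WHEN MIN(status) = ? AND MAX(status) = ? THEN 1 ELSE 0 END AS status
    FROM table1
    GROUP BY category, subcategory1
    ORDER BY category, subcategory1
""", (TARGET, TARGET)).fetchall()

# 1 when all statuses in the group are equal to each other, whatever the value.
same_value_rows = conn.execute("""
    SELECT category, subcategory1,
           CASE WHEN MIN(status) = MAX(status) THEN 1 ELSE 0 END AS status
    FROM table1
    GROUP BY category, subcategory1
    ORDER BY category, subcategory1
""").fetchall()

print(all_match_rows)
print(same_value_rows)
```

Group ('B', 'x') is the interesting case: all of its statuses are 7, so the MIN = MAX test flags it, while the test against the target value does not.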

Let's say you want to find groups that have all status values under 100
SELECT category, subcategory1,
CASE WHEN MAX(status) < 100 THEN 0 ELSE 1 END AS Status
FROM table1
GROUP BY category, subcategory1
All groups with status under 100 will have Status set to 0, and all groups with at least one status >= 100 will be set to 1.
I think that's what you're asking for, but if not let me know.

"I would like to group by Category, SubCategory1 and aggregate the Status such that if not all Status values over the group have a certain value Status will be 0 otherwise 1."
I'm interpreting this as "If there exists a Status value in a given group not equal to a given parameter, the returned Status will be 0 otherwise 1".
Select T.Category, T.SubCategory1
, Case
When Exists(
Select 1
From Table As T2
Where T2.Category = T.Category
And T2.SubCategory1 = T.SubCategory1
And T2.Status <> @Param
) Then 0
Else 1
End As Status
From Table As T
Group By t.Category, T.SubCategory1

Something like this:
select
Category,
SubCategory1,
(
case
when good_record_count = all_record_count then 1
else 0
end
) as all_records_good
from (
select
t.Category,
t.SubCategory1,
sum( case when t.Status = 'GOOD' then 1 else 0 end ) good_record_count,
count(1) all_record_count
from
table_name t
group by
t.Category, t.SubCategory1
) g

Related

Check whether an employee is present on three consecutive days

I have a table called tbl_A with the following schema:
After insert, I have the following data in tbl_A:
Now the question is how to write a query for the following scenario:
Put 1 next to any employee who was present on three consecutive days.
Put 0 next to any employee who was not.
The output screenshot:
I think a CASE expression is needed, but I can't work out how to check for three consecutive days from the date column. Any help is appreciated.
Thank you

select name, case when max(cons_days) >= 3 then 1 else 0 end as presence
from (
select name, count(*) as cons_days
from tbl_A, (values (0),(1),(2)) as a(dd)
where available = 'Y' -- count only the days the employee was present
group by name, adate + dd
) x
group by name
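The offset trick above (every present day votes for itself and the next two days, so a day that collects three votes ends a three-day run) can be tried out as follows. This is a hedged sketch in Python's sqlite3, not SQL Server: the integer column day_no stands in for adate to avoid dialect-specific date arithmetic, and the rows are invented. It assumes at most one row per employee per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- day_no is a hypothetical integer day number used instead of a date column.
    CREATE TABLE tbl_A (name TEXT, day_no INTEGER, available TEXT);
    INSERT INTO tbl_A VALUES
        ('Ann', 1, 'Y'), ('Ann', 2, 'Y'), ('Ann', 3, 'Y'), ('Ann', 4, 'N'),
        ('Bob', 1, 'Y'), ('Bob', 2, 'N'), ('Bob', 3, 'Y'), ('Bob', 4, 'Y');
""")

presence_rows = conn.execute("""
    SELECT name, CASE WHEN MAX(cons_days) >= 3 THEN 1 ELSE 0 END AS presence
    FROM (
        -- Each present day d contributes one row for d, d + 1 and d + 2.
        SELECT name, COUNT(*) AS cons_days
        FROM tbl_A
        CROSS JOIN (SELECT 0 AS dd UNION ALL SELECT 1 UNION ALL SELECT 2) AS offsets
        WHERE available = 'Y'
        GROUP BY name, day_no + dd
    ) AS x
    GROUP BY name
    ORDER BY name
""").fetchall()

print(presence_rows)
```

Note that an employee with no 'Y' rows at all would not appear in this result; a left join from an employee list would be needed to report them with 0.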

Self-join the table on name, keeping only rows with available = 'Y' on both sides. For each of an employee's dates adate, count the matching rows whose date falls between adate and adate + 2. If all three days are present, the count for that adate is 3, and the outer query sets the flag to 1 for any name that reaches 3 at least once. Try the query below:
SELECT Z.NAME,
CASE WHEN MAX(Z.CONSEQ_AVAIL) >= 3 THEN 1 ELSE 0 END AS YOUR_FLAG
FROM
(
SELECT A.NAME, A.ADATE,
SUM(CASE WHEN B.ADATE >= A.ADATE AND B.ADATE <= A.ADATE + 2 THEN 1 ELSE 0 END) AS CONSEQ_AVAIL
FROM
tbl_A A INNER JOIN tbl_A B
ON A.NAME = B.NAME AND A.AVAILABLE = 'Y' AND B.AVAILABLE = 'Y'
GROUP BY A.NAME, A.ADATE -- one count per starting date, not one per name
) Z
GROUP BY Z.NAME;
Due to the complexity of the problem, I have not been able to test it. If something is really wrong, please let me know and I will be happy to take down my answer.

-- Below is my approach
select Name,
case when Max_Count >= 3 then 1 else 0 end as Presence
from
(
select Name, max(Coun) as Max_Count
from
(
select Name, (count(*) over (partition by Name, Ref_Date)) as Coun from
(
select Name, adate + row_number() over (partition by Name order by Adate desc) as Ref_Date
from temp
where available = 'Y'
) d
) c group by Name
) m;

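-- Note: sum(diff) counts adjacent-day pairs over the employee's whole history, and three consecutive days produce only two such pairs, so check the threshold against your definition.
select name as employee, case when sum(diff) >= 3 then 1 else 0 end as presence
from
(select id, name, Available, Adate, lead(Adate, 1) over (partition by name order by Adate) as next_date,
case when datediff(day, Adate, lead(Adate, 1) over (partition by name order by Adate)) = 1 then 1 else 0 end as diff
from table_A
where Available = 'Y') A
group by name;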

sql case statement IN with group by

I have a 2 column table with the columns : "user_name" and "characteristic". Each user_name may appear multiple times with a different characteristic.
The values in characteristic are:
Online
Instore
Account
Email
I want to write a sql statement that goes like this - but obviously this isn't working:
SELECT user_name,
case
when characteristic in ("online","instore") then 1
else 0
END as purchase_yn,
case
when characteristic in ("online","instore") and
characteristic in ("email",'account') then 1
else 0
END as purchaser_with_account
FROM my_table
GROUP BY user_name;
Essentially, the first field is a flag that checks for the presence of either value for that user_name.
The second field is 1 when the user meets that criterion AND also has either 'email' or 'account'.

An example of the structure of your data would make it easier to understand what you are trying to accomplish, but I think I get what you are trying to do.
Every selected column that is not listed in the GROUP BY must be wrapped in an aggregate function,
such as MAX or SUM.
First build a pivot of your data, then use that pivot to check your criteria.
This creates a pivot that shows, for each record, which criteria are met:
SELECT
user_name,
case when characteristic = 'online' then 1 else 0 end as online_yn,
case when characteristic = 'instore' then 1 else 0 end as instore_yn,
case when characteristic = 'account' then 1 else 0 end as account_yn,
case when characteristic = 'email' then 1 else 0 end as email_yn
FROM my_table
Next, aggregate these entries by user_name and use the aggregates to build the fields you want. MAX is used here because a flag should be 1 as soon as any row for the user meets the criterion; an average would reach 1 only if every row did. Use the statement above as an inline table:
Select
user_name,
case when max(online_yn + instore_yn) >= 1 then 1 else 0 end as purchase_yn,
case when max(online_yn + instore_yn) >= 1 and max(email_yn + account_yn) >= 1 then 1 else 0 end as purchaser_with_account
From
(SELECT
user_name,
case when characteristic = 'online' then 1 else 0 end as online_yn,
case when characteristic = 'instore' then 1 else 0 end as instore_yn,
case when characteristic = 'account' then 1 else 0 end as account_yn,
case when characteristic = 'email' then 1 else 0 end as email_yn
FROM my_table) flags
group by
user_name;
This should help.
It may not be the most efficient approach, but you'll get what you want.

You just have to enclose the CASE expressions in COUNT aggregates:
SELECT user_name,
COUNT(case when characteristic in ('online','instore') then 1 END) as purchase_yn,
COUNT(case when characteristic in ('email','account') then 1 END) as user_with_account
FROM my_table
GROUP BY user_name
If purchase_yn > 0 then your first flag is set. If purchase_yn > 0 and user_with_account > 0 then your second flag is set as well.
Note: You have to remove ELSE 0 from the CASE expressions because COUNT counts every non-NULL value, including 0.
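To turn those counts into the two 0/1 flags in a single statement, the COUNT results can be wrapped in an outer CASE. Below is a minimal sketch using Python's sqlite3 module; the user names and rows are invented for illustration, and the characteristic values are stored in lowercase to match the literals in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (user_name TEXT, characteristic TEXT);
    -- Made-up rows: alice buys and has an email, bob only has account data, carol only buys.
    INSERT INTO my_table VALUES
        ('alice', 'online'), ('alice', 'email'),
        ('bob', 'account'), ('bob', 'email'),
        ('carol', 'instore');
""")

flag_rows = conn.execute("""
    SELECT user_name,
           -- COUNT skips the NULLs produced when the CASE has no ELSE branch.
           CASE WHEN COUNT(CASE WHEN characteristic IN ('online', 'instore') THEN 1 END) > 0
                THEN 1 ELSE 0 END AS purchase_yn,
           CASE WHEN COUNT(CASE WHEN characteristic IN ('online', 'instore') THEN 1 END) > 0
                 AND COUNT(CASE WHEN characteristic IN ('email', 'account') THEN 1 END) > 0
                THEN 1 ELSE 0 END AS purchaser_with_account
    FROM my_table
    GROUP BY user_name
    ORDER BY user_name
""").fetchall()

print(flag_rows)
```

Each inner CASE is evaluated per row and the COUNT per group, which is why the two conditions on the same column can both hold for one user even though they can never both hold for one row.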

You haven't mentioned a specific RDBMS, but if SUM(DISTINCT ...) is available the following is quite nice:
SELECT
user_name,
SUM(DISTINCT
CASE
WHEN characteristic in ('online','instore') THEN 1
ELSE 0
END) AS purchase_yn,
CASE WHEN (
SUM(DISTINCT
CASE
WHEN characteristic in ('online','instore') THEN 1
WHEN characteristic in ('email','account') THEN 2
ELSE 0 END
)
) = 3 THEN 1 ELSE 0 END as purchaser_with_account
FROM
my_table
GROUP BY
user_name

If I understand correctly: if a user has 'online' or 'instore', you want 1 in the purchase_yn column for that user, and if the user also has 'email' or 'account', you want 1 in the purchaser_with_account column.
If this is correct, then one way is:
with your_table(user_name, characteristic) as(
select 1, 'online' union all
select 1, 'instore' union all
select 1, 'account' union all
select 1, 'email' union all
select 2, 'account' union all
select 2, 'email' union all
select 3, 'online'
)
-- below is actual query:
select your_table.user_name, coalesce(max(t1.purchase_yn), 0) as purchase_yn, coalesce(max(t2.purchaser_with_account), 0) as purchaser_with_account
from your_table
left join (SELECT user_name, 1 as purchase_yn from your_table where characteristic in('online','instore') ) t1
on your_table.user_name = t1.user_name
left join (SELECT user_name, 1 as purchaser_with_account from your_table where characteristic in('email', 'account') ) t2
on t1.user_name = t2.user_name
group by your_table.user_name

Calculation of occurrence of strings

I have a table with 3 columns: id, name and vote. It is populated with many records. I need to return the record with the best balance of votes. The vote types are 'yes' and 'no'.
Yes -> Plus 1
No -> Minus 1
This column vote is a string column. I am using SQL SERVER.
Example:
It must return Ann for me

Use conditional aggregation to tally the votes, as Kannan suggests in his answer.
If you really only want one record, you can do it like so:
SELECT TOP 1
name
,SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
FROM
@Table
GROUP BY
name
ORDER BY
VoteTotal DESC
This does not handle ties. The method below ranks the totals instead: filter on RowNum to get exactly one result, or on RankNum to include ties.
;WITH cteVoteTotals AS (
SELECT
name
,SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) DESC) as RowNum
,DENSE_RANK() OVER (PARTITION BY 1 ORDER BY SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) DESC) as RankNum
FROM
@Table
GROUP BY
name
)
SELECT name, VoteTotal
FROM
cteVoteTotals
WHERE
RowNum = 1
--RankNum = 1 --if you want with ties use this line instead
Here is the test data used. In the future, please do not post only an image of your test data; spend the two minutes to make a temp table or a table variable so that the people you are asking for help do not have to.
DECLARE @Table AS TABLE (id INT, name VARCHAR(25), vote VARCHAR(4))
INSERT INTO @Table (id, name, vote)
VALUES (1, 'John','no'),(2, 'John','no'),(3, 'John','yes')
,(4, 'Ann','no'),(5, 'Ann','yes'),(6, 'Ann','yes')
,(9, 'Marie','no'),(8, 'Marie','no'),(7, 'Marie','yes')
,(10, 'Matt','no'),(11, 'Matt','yes'),(12, 'Matt','yes')

Use this code:
;with cte as (
select id, name, case when vote = 'yes' then 1 else -1 end as votenum from register
) select name, sum(votenum) as vote_total from cte group by name
You can then take the maximum or minimum of these totals.

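This one gives the 'yes' rate for each person (multiplying by 1.0 avoids integer division, which would otherwise truncate the rate to 0 or 1):
SELECT Name, 1.0 * SUM(CASE WHEN Vote = 'yes' THEN 1 ELSE 0 END) / COUNT(*) AS Rate
FROM My_Table
GROUP BY Name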
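When a rate is computed as SUM(...) / COUNT(*), both operands are integers, and SQL Server divides integers with truncation. SQLite behaves the same way, so the effect can be seen in this small sketch with Python's sqlite3 module. It reuses the Ann and John rows from the test data earlier in this thread; the column aliases int_rate and rate are names chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE My_Table (id INTEGER, Name TEXT, Vote TEXT);
    INSERT INTO My_Table VALUES
        (1, 'John', 'no'), (2, 'John', 'no'), (3, 'John', 'yes'),
        (4, 'Ann', 'no'),  (5, 'Ann', 'yes'), (6, 'Ann', 'yes');
""")

rate_rows = conn.execute("""
    SELECT Name,
           -- integer / integer truncates toward zero
           SUM(CASE WHEN Vote = 'yes' THEN 1 ELSE 0 END) / COUNT(*) AS int_rate,
           -- multiplying by 1.0 first forces floating-point division
           1.0 * SUM(CASE WHEN Vote = 'yes' THEN 1 ELSE 0 END) / COUNT(*) AS rate
    FROM My_Table
    GROUP BY Name
    ORDER BY Name
""").fetchall()

for name, int_rate, rate in rate_rows:
    print(name, int_rate, round(rate, 3))
```

Here the truncated column is 0 for both people, while the floating-point column gives roughly 0.667 for Ann and 0.333 for John.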

SQL Server : do not Select all if true

I have these columns
Id Status
----------
1 pass
1 fail
2 pass
3 pass
How do I select only the Ids whose every status is pass? If an Id has at least one fail, it should not be selected at all.

If the same id can have multiple passes, use DISTINCT so that each id is returned once:
SELECT DISTINCT id
FROM table
WHERE status = 'pass'
AND id NOT IN (SELECT id FROM table WHERE status = 'fail')

You need to use GROUP BY & HAVING clause
SELECT Id
FROM yourtable
GROUP BY Id
HAVING Sum(case when status ='pass' then 1 else 0 end) = count(status)
HAVING clause can be changed to
HAVING Count(case when status ='pass' then 1 end) = count(status)

I just dislike chatty CASE expressions, so:
SELECT Id
FROM table1
GROUP BY Id
HAVING COUNT(DISTINCT [Status]) = 1 AND MIN([Status]) = 'pass'
or
SELECT Id
FROM table1
GROUP BY Id
HAVING COUNT(NULLIF([Status], 'fail')) > 0 AND COUNT(NULLIF([Status], 'pass')) = 0
The second query only works when status has only the two values 'pass' and 'fail'.
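The HAVING-based filters can be compared side by side on the question's data. This is a minimal sketch with Python's sqlite3 module (so the bracketed [Status] quoting is dropped); the extra second 'pass' row for Id 3 is an assumption added here to show that repeated passes are handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id INTEGER, Status TEXT);
    -- The question's rows, plus a second 'pass' for Id 3 (added for illustration).
    INSERT INTO table1 VALUES
        (1, 'pass'), (1, 'fail'), (2, 'pass'), (3, 'pass'), (3, 'pass');
""")

# Keep an Id only when its number of 'pass' rows equals its number of rows.
ids_by_count = [row[0] for row in conn.execute("""
    SELECT Id FROM table1
    GROUP BY Id
    HAVING COUNT(CASE WHEN Status = 'pass' THEN 1 END) = COUNT(Status)
    ORDER BY Id
""")]

# Keep an Id only when it has a single distinct status and that status is 'pass'.
ids_by_distinct = [row[0] for row in conn.execute("""
    SELECT Id FROM table1
    GROUP BY Id
    HAVING COUNT(DISTINCT Status) = 1 AND MIN(Status) = 'pass'
    ORDER BY Id
""")]

print(ids_by_count, ids_by_distinct)
```

Both variants drop Id 1 because of its single 'fail' row and keep Ids 2 and 3.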

Using Rank or OVER() to create 1 or zero column SQL SERVER [duplicate]

I think I need some guidance as to what is wrong in my query. I am trying to do
Watched_Gladiator=CASE WHEN FilmName IN (CASE WHEN FilmName LIKE '%Gladiator%' THEN 1 END) then OVER(PARTITION BY Cust_Nr) THEN 1 ELSE 0 END
Tried this one too:
Watched_Gladiator=CASE WHEN FilmName IN (CASE WHEN FilmName LIKE '%Gladiator%' THEN Filmnamne END) then OVER(PARTITION BY Cust_Nr) THEN 1 ELSE 0 END
The Error I am currently getting is this:
Incorrect syntax near the keyword 'OVER'.
This is basically what my data looks like:
Cust_Nr    Date      FilmName                          Watched Gladiator
157649306  20150430  Gladiator                         1
158470722  20150504  Nick Cave: 20,000 Days On Earth   0
158467945  20150504  Out Of The Furnace                0
158470531  20150504  FilmA                             0
157649306  20150510  Gladiator                         1
158470722  20150515  Gladiator                         1
I want to create a column (1 or zero) that shows if the customer has watched Gladiator then 1 ELSE 0. How can I do that?
I created a test column trying with a simple LIKE '%Gladiator%' THEN 1 ELSE 0. The problem with this solution is that it will show 1(one) more than once if the customer has watched multiple times. I only need 1 or zero.
I feel I am really close to finding a solution. I am very new to using OVER() and CASE WHEN but enjoying the thrill:=)

So you're saying that:
SELECT Cust_Nr, Date, FilmName,
CASE WHEN FilmName LIKE '%Gladiator%' THEN 1 ELSE 0 END as WatchedGladiator
FROM YourTable
WHERE YourColumn = @somevalue
Doesn't work? Because according to the data you've given, it should.
EDIT:
Well based on Tim's comment, I would simply add this bit to the query.
SELECT Cust_Nr, Date, FilmName, WatchedGladiator
FROM
(
SELECT Cust_Nr, Date, FilmName,
CASE WHEN FilmName LIKE '%Gladiator%' THEN 1 ELSE 0 END as WatchedGladiator
FROM YourTable
WHERE YourColumn = @somevalue
) as wg
WHERE WatchedGladiator = 1

The following does what you want for all films:
select r.*,
(case when row_number() over (partition by filmname order by date) = 1
then 1 else 0
end) as IsWatchedFirstAndGladiator
from results r;
For just Gladiator:
select r.*,
(case when filmname = 'Gladiator' and row_number() over (partition by filmname order by date) = 1
then 1 else 0
end) as IsWatchedFirst
from results r;

So you want to group by customer and add a column if this customer watched a specific film?
You could do:
SELECT Cust_Nr, MAX(Watched_Gladiator)
FROM( SELECT Cust_Nr,
Watched_Gladiator = CASE WHEN EXISTS
(
SELECT 1 FROM CustomerFilm c2
WHERE c2.Cust_Nr = c1.Cust_Nr
AND c2.FilmName LIKE '%Gladiator%'
) THEN 1 ELSE 0 END
FROM CustomerFilm c1 ) X
GROUP BY Cust_Nr
But it would be easier if you used the customer-table instead of this table, then you don't need the group-by.
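If the per-customer flag should appear on every row rather than in a grouped result, one option is to put the conditional aggregate inside a window, which is likely what the OVER(PARTITION BY Cust_Nr) attempts in the question were reaching for. The following is a sketch with Python's sqlite3 module, assuming a SQLite build with window-function support (3.25 or later); the table name CustomerFilm is taken from the answer above and the rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustomerFilm (Cust_Nr INTEGER, Date TEXT, FilmName TEXT);
    INSERT INTO CustomerFilm VALUES
        (157649306, '20150430', 'Gladiator'),
        (158470722, '20150504', 'Nick Cave: 20,000 Days On Earth'),
        (158467945, '20150504', 'Out Of The Furnace'),
        (158470531, '20150504', 'FilmA'),
        (157649306, '20150510', 'Gladiator'),
        (158470722, '20150515', 'Gladiator');
""")

watched_rows = conn.execute("""
    SELECT Cust_Nr, Date, FilmName,
           -- The CASE is evaluated per row; MAX ... OVER spreads the result
           -- across all rows of the same customer.
           MAX(CASE WHEN FilmName LIKE '%Gladiator%' THEN 1 ELSE 0 END)
               OVER (PARTITION BY Cust_Nr) AS Watched_Gladiator
    FROM CustomerFilm
    ORDER BY Date, Cust_Nr
""").fetchall()

# Index the flag by (customer, date) for easy lookup.
watched = {(cust, date): flag for cust, date, _film, flag in watched_rows}
for row in watched_rows:
    print(row)
```

With this data, customer 158470722 gets 1 on both rows, including the Nick Cave row, because the customer watched Gladiator on a later date, while customers 158467945 and 158470531 get 0.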

Try grouping up to the cust/film level:
select
cust_nbr,
case when film_name like '%Gladiator%' then 1 else 0 end
from
(
select
cust_nbr,
film_name
from
<your table>
group by
cust_nbr,
film_name
) t
Or, as an alternative:
select distinct cust_nbr
from
<your table>
where
filmname = 'Gladiator'
