Use aggregation only on rows where count(ID) is greater than one - sql

Hi I have the following table
Cash_table
ID Cash Rates
1 50 3
2 100 4
3 70 10
3 60 10
4 13 7
5 20 8
5 10 10
6 10 5
What I want as a result is to cumulate all the entries that have a Count(id)>1 like this:
ID New_Cash New_Rates
1 50 3
2 100 4
3 (70+60)/(10+10) 10+10
4 13 7
5 (20+10)/(8+10) 8+10
6 10 5
So I only want to change the rows where Count(id)>1 and leave the rest like it was.
For the rows with count(id)>1 I want to sum up the rates and take the sum of the cash and divide it by the sum of the rates. The Rates alone aren't a problem since I can sum them up and group by id and get the desired result.
The problem is with the cash column:
I am trying to do it with a case statement but it isn't working:
select id, sum(rates) as new_rates, case
when count(id)>1 then sum(cash)/nullif(sum(rates),0)
else cash
end as new_cash
from Cash_table
group by id

You only need group by id and aggregate:
select
id,
sum(cash) / (case count(*) when 1 then 1 else sum(rates) end) as new_cash,
sum(rates) as new_rates
from Cash_table
group by id
order by id
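A quick way to check the query above is to run it against the sample rows. The sketch below does that with Python's built-in sqlite3 module. SQLite is only an assumption for illustration, since the question does not name a database, and cash is declared REAL so that the division is not truncated to an integer.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# Assumption: cash is REAL so SQLite does not fall back to integer division.
conn = sqlite3.connect(":memory:")
conn.execute("create table Cash_table (id integer, cash real, rates integer)")
conn.executemany(
    "insert into Cash_table values (?, ?, ?)",
    [(1, 50, 3), (2, 100, 4), (3, 70, 10), (3, 60, 10),
     (4, 13, 7), (5, 20, 8), (5, 10, 10), (6, 10, 5)],
)

# Single-row ids divide by 1 (cash unchanged); other ids divide by sum(rates).
query = """
select id,
       sum(cash) / (case count(*) when 1 then 1 else sum(rates) end) as new_cash,
       sum(rates) as new_rates
from Cash_table
group by id
order by id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ids 1, 2, 4 and 6 keep their original cash, while ids 3 and 5 come back as 130/20 and 30/18.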

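You can aggregate the rates and cash columns with sum(), grouping by id:
select
id,
sum(cash)/decode( count(*), 1, 1, decode( sum( nvl(rates,0) ), 0, 1, sum( nvl(rates,0) ) ) ) as new_cash,
sum(rates) as new_rates
from cash_table
group by id
The outer decode() leaves cash unchanged for ids that have a single row.
The inner decode(), together with nvl(), guards against division by zero when the rates sum to 0 or are all null.
Oracle does support nullif(), so nullif(sum(rates),0) also avoids the error, but it returns null instead of the cash sum in that case.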

Related

Resetting a Count in SQL

I have data that looks like this:
ID num_of_days
1 0
2 0
2 8
2 9
2 10
2 15
3 10
3 20
I want to add another column that increments in value only if the num_of_days column is divisible by 5 or the ID number increases so my end result would look like this:
ID num_of_days row_num
1 0 1
2 0 2
2 8 2
2 9 2
2 10 3
2 15 4
3 10 5
3 20 6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.
SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum . . . with lag():
select t.*,
sum(case when prev_id = id and num_of_days % 5 <> 0
then 0 else 1
end) over (order by id, num_of_days) as row_num
from (select t.*,
lag(id) over (order by id, num_of_days) as prev_id
from t
) t;
If you have a different ordering column, then just use that in the order by clauses.
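To see the cumulative sum at work, the sketch below runs the same idea on the sample data through Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later) and names the table t as in the answer. The comparison is written with != rather than the equivalent SQL operator used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, num_of_days integer)")
conn.executemany(
    "insert into t values (?, ?)",
    [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)],
)

# A row contributes 1 to the running total when the id changes
# (prev_id differs or is null) or when num_of_days is divisible by 5.
query = """
select t.id, t.num_of_days,
       sum(case when prev_id = id and num_of_days % 5 != 0
                then 0 else 1
           end) over (order by id, num_of_days) as row_num
from (select t.*,
             lag(id) over (order by id, num_of_days) as prev_id
      from t
     ) t
order by id, num_of_days
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the first row prev_id is null, so the comparison is not true and the row falls into the else branch, which starts the counter at 1.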

Count rows according to 2 column with Group By

I have a database table of 3 columns
RecordID Deleted CardHolderID
1963 1 9
4601 1 9
6996 0 9
1532 1 11
1529 0 20
I want an SQL query that outputs, for each CardHolderID, the number of rows with each value of the Deleted column.
However, the query below returns a separate row for each Deleted value of a CardHolderID:
select c.CardHolderID, c.Deleted, COUNT(*) as Sum
from Card c
group by c.Deleted, c.CardHolderID
CardHolderID Deleted Sum
9 0 1
9 1 2
20 0 1
11 1 1
I want to include 2 columns as Deleted0 (count of rows with Deleted column equal to 0) and Deleted1 (count of rows with Deleted column equal to 1)
CardHolderID Deleted0 Deleted1
9 1 2
20 1 0
11 0 1
How should be the SQL query for such a result?
Kind regards
Using conditional count:
select c.CardHolderID,
count( case when c.deleted = 0 then 1 else null end ) deleted0,
count( case when c.deleted > 0 then 1 else null end ) deleted1
from Card c
group by c.CardHolderID
GROUP BY CardHolderID alone.
Use SUM(Deleted) to count the 1's.
Use SUM(1-deleted) to count the 0's.
select c.CardHolderID, sum(1-c.deleted) deleted0, sum(c.Deleted) deleted1
from Card c
group by c.CardHolderID
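The conditional count and the SUM variant can be compared side by side. The sketch below loads the sample rows into an in-memory SQLite database through Python's sqlite3 module (an assumption for illustration, the question does not say which database is used) and runs both forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Card (RecordID integer, Deleted integer, CardHolderID integer)"
)
conn.executemany(
    "insert into Card values (?, ?, ?)",
    [(1963, 1, 9), (4601, 1, 9), (6996, 0, 9), (1532, 1, 11), (1529, 0, 20)],
)

# count() ignores nulls, so each case expression counts only matching rows.
by_count = conn.execute("""
select c.CardHolderID,
       count(case when c.Deleted = 0 then 1 end) as Deleted0,
       count(case when c.Deleted = 1 then 1 end) as Deleted1
from Card c
group by c.CardHolderID
order by c.CardHolderID
""").fetchall()

# Works only because Deleted holds nothing but 0 and 1.
by_sum = conn.execute("""
select c.CardHolderID, sum(1 - c.Deleted) as Deleted0, sum(c.Deleted) as Deleted1
from Card c
group by c.CardHolderID
order by c.CardHolderID
""").fetchall()

print(by_count)
print(by_sum)
```

Both forms return one row per CardHolderID. CardHolderID 11 has no row with Deleted = 0, so its Deleted0 count is 0.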
If you are using MSSQL, you can use PIVOT:
select DtlPivot.CardHolderID, isnull(DtlPivot.[0],0) as Deleted0, isnull(DtlPivot.[1],0) as Deleted1 from
(
select c.CardHolderID, c.Deleted, COUNT(*) as Total from Card c
group by c.Deleted, c.CardHolderID
) aa
PIVOT
(
sum(Total) FOR Deleted IN([0],[1])
)AS DtlPivot

Can't use case & aggregation correctly

I have the following table
Cash_table
ID Cash Rates Amount
1 50 3 16
2 100 4 25
3 130 10 7
3 130 10 6
4 13 7 1.8
5 30 8 2.5
5 30 10 1
6 10 5 2
What I want as a result is to cumulate all the entries that have a Count(id)>1 like this:
ID New_Cash New_Rates New_Amount
1 50 3 16
2 100 4 25
3 130 10+10 130/(10+10)
4 13 7 1.8
5 30 8+10 30/(8+10)
6 10 5 2
So I only want to change the rows where Count(id)>1 and leave the rest like it was.
For the rows with count(id)>1 I want to sum up the rates and take the cash and divide it by the sum of the rates. The Rates alone aren't a problem since I can sum them up and group by id and get the desired result.
The problem is with the New_Amount column:
I am trying to do it with a case statement but it isn't working:
select id,
cash as new_cash,
sum(rates) as new_rates,
(case count(id)
when 1 then amount
else cash/sum(nvl(rates,null))
end) as new_amount
from Cash_table
group by id
As the cash value is always the same for an ID, you can group by that as well:
select id,
cash as new_cash,
sum(rates) as new_rates,
case count(id)
when 1 then max(amount)
else cash/sum(rates)
end as new_amount
from cash_table
group by id, cash
order by id
ID NEW_CASH NEW_RATES NEW_AMOUNT
---------- ---------- ---------- ----------
1 50 3 16
2 100 4 25
3 130 20 6.5
4 13 7 1.8
5 30 18 1.66666667
6 10 5 2
The first branch of the case expression needs an aggregate because you aren't grouping by amount; and the sum(nvl(rates,null)) can just be sum(rates). If you're expecting any null rates then you need to decide how you want the amount to be handled, but nvl(rates,null) isn't doing anything.
You can do the same thing without a case expression if you prefer, manipulating all the values - which might be more expensive:
select id,
cash as new_cash,
sum(rates) as new_rates,
sum(amount * rates)/sum(rates) as new_amount
from cash_table
group by id, cash
order by id
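The two queries above can be checked against each other. This is a minimal sketch using Python's sqlite3 module rather than Oracle, so the columns are declared REAL to avoid SQLite's integer division. The two forms agree on this sample because amount * rates sums to cash within each id, which may not hold for other data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table cash_table (id integer, cash real, rates integer, amount real)"
)
conn.executemany(
    "insert into cash_table values (?, ?, ?, ?)",
    [(1, 50, 3, 16), (2, 100, 4, 25), (3, 130, 10, 7), (3, 130, 10, 6),
     (4, 13, 7, 1.8), (5, 30, 8, 2.5), (5, 30, 10, 1), (6, 10, 5, 2)],
)

# Version with the case expression: single rows keep their amount.
with_case = conn.execute("""
select id, cash as new_cash, sum(rates) as new_rates,
       case count(id) when 1 then max(amount) else cash / sum(rates) end as new_amount
from cash_table
group by id, cash
order by id
""").fetchall()

# Version without case: a rates-weighted average of amount.
weighted = conn.execute("""
select id, cash as new_cash, sum(rates) as new_rates,
       sum(amount * rates) / sum(rates) as new_amount
from cash_table
group by id, cash
order by id
""").fetchall()

for a, b in zip(with_case, weighted):
    print(a, b)
```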

SQL: Grouping, Count & Average

Trying an SQL question here. I have a SQLite DB (table):
type_id value
1 26
1 24
2 30
3 5
3 15
I want to achieve the following. For each type_id, I would like to know the number of rows (count) with that type_id and the average value (average) of the group. In the example table I would end up with:
type_id count average
1 2 25
2 1 30
3 2 10
Any ideas? Thanks :)
Just GROUP BY type_id and take the COUNT and AVG:
SELECT type_id, COUNT(*) AS count, AVG(value) AS average
FROM test
GROUP BY type_id
Output:
type_id count average
1 2 25
2 1 30
3 2 10

SQL - Overall average Points

I have a table like this:
[challenge_log]
User_id | challenge | Try | Points
==============================================
1 1 1 5
1 1 2 8
1 1 3 10
1 2 1 5
1 2 2 8
2 1 1 5
2 2 1 8
2 2 2 10
I want the overall average points. To do so, I believe I need 3 steps:
Step 1 - Get the MAX value (of points) of each user in each challenge:
User_id | challenge | Points
===================================
1 1 10
1 2 8
2 1 5
2 2 10
Step 2 - SUM all the MAX values of one user
User_id | Points
===================
1 18
2 15
Step 3 - The average
AVG = SUM (Points from step 2) / number of users = 16.5
Can you help me find a query for this?
You can get the overall average by dividing the total number of points by the number of distinct users. However, you need the maximum per challenge, so the sum is a bit more complicated. One way is with a subquery:
select 1.0 * sum(Points) / count(distinct user_id)
from (select user_id, challenge, max(Points) as Points
from challenge_log
group by user_id, challenge
) cl;
You can also do this with one level of aggregation, by finding the maximum in the where clause (this assumes a user's top score in a challenge is not tied across tries, otherwise it is counted more than once):
select 1.0 * sum(Points) / count(distinct user_id)
from challenge_log cl
where not exists (select 1
from challenge_log cl2
where cl2.user_id = cl.user_id and
cl2.challenge = cl.challenge and
cl2.points > cl.points
);
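Both approaches can be tried on the sample data. The sketch below uses Python's sqlite3 module (an assumption for illustration), writes the user column as user_id to match the table in the question, and multiplies by 1.0 so the division is not done on integers. It then adds one hypothetical tied top score to show where the two queries diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'create table challenge_log '
    '(user_id integer, challenge integer, "try" integer, points integer)'
)
conn.executemany(
    "insert into challenge_log values (?, ?, ?, ?)",
    [(1, 1, 1, 5), (1, 1, 2, 8), (1, 1, 3, 10), (1, 2, 1, 5),
     (1, 2, 2, 8), (2, 1, 1, 5), (2, 2, 1, 8), (2, 2, 2, 10)],
)

SUBQUERY = """
select 1.0 * sum(points) / count(distinct user_id)
from (select user_id, challenge, max(points) as points
      from challenge_log
      group by user_id, challenge
     ) cl
"""

NOT_EXISTS = """
select 1.0 * sum(points) / count(distinct user_id)
from challenge_log cl
where not exists (select 1
                  from challenge_log cl2
                  where cl2.user_id = cl.user_id and
                        cl2.challenge = cl.challenge and
                        cl2.points > cl.points)
"""

avg_subquery = conn.execute(SUBQUERY).fetchone()[0]
avg_not_exists = conn.execute(NOT_EXISTS).fetchone()[0]
print(avg_subquery, avg_not_exists)

# Hypothetical extra row: user 1 reaches 10 points again on challenge 1.
conn.execute("insert into challenge_log values (1, 1, 4, 10)")
tied_subquery = conn.execute(SUBQUERY).fetchone()[0]
tied_not_exists = conn.execute(NOT_EXISTS).fetchone()[0]
print(tied_subquery, tied_not_exists)
```

On the original rows both queries give 16.5, the value from step 3 of the question. With the tied row the subquery version is unchanged, while the not exists version keeps both 10-point rows and counts them twice.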
Try these on for size.
Overall Mean
select avg( Points ) as mean_score
from challenge_log
Per-Challenge Mean
select challenge ,
avg( Points ) as mean_score
from challenge_log
group by challenge
If you want to compute the mean of each user's highest score per challenge, you're not exactly raising the level of complexity very much:
Overall Mean
select avg( high_score )
from ( select user_id ,
challenge ,
max( Points ) as high_score
from challenge_log
group by user_id ,
challenge
) t
Per-Challenge Mean
select challenge ,
avg( high_score )
from ( select user_id ,
challenge ,
max( Points ) as high_score
from challenge_log
group by user_id ,
challenge
) t
group by challenge
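For comparison with the 16.5 in the question, here is a sketch of what these high-score means return on the sample data, again using Python's sqlite3 module as an assumed test bed. The Try column is left out because the queries do not use it, and the inner query groups by user_id and challenge so that max() is taken per user and challenge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table challenge_log (user_id integer, challenge integer, points integer)"
)
conn.executemany(
    "insert into challenge_log values (?, ?, ?)",
    [(1, 1, 5), (1, 1, 8), (1, 1, 10), (1, 2, 5),
     (1, 2, 8), (2, 1, 5), (2, 2, 8), (2, 2, 10)],
)

# Highest score per user and challenge (step 1 in the question).
HIGH_SCORES = """
select user_id, challenge, max(points) as high_score
from challenge_log
group by user_id, challenge
"""

overall = conn.execute(
    f"select avg(high_score) from ({HIGH_SCORES}) t"
).fetchone()[0]

per_challenge = conn.execute(
    f"select challenge, avg(high_score) from ({HIGH_SCORES}) t "
    "group by challenge order by challenge"
).fetchall()

print(overall)
print(per_challenge)
```

The overall figure here is the mean over the four user/challenge pairs, so it differs from the per-user total averaged in step 3 of the question.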
After step 1 do
SELECT USER_ID, AVG(POINTS)
FROM STEP1
GROUP BY USER_ID
You can combine step 1 and 2 into a single query/subquery as follows:
Select BestShot.[User_ID], AVG(cast (BestShot.MostPoints as money))
from (select tLog.Challenge, tLog.[User_ID], MostPoints = max(tLog.points)
from dbo.tmp_Challenge_Log tLog
Group by tLog.User_ID, tLog.Challenge
) BestShot
Group by BestShot.User_ID
The subquery determines the most points for each user/challenge combo, and the outer query takes these max values and uses the AVG function to return the average value of them. The last Group By tells SQL to average all the values across each User_ID.