Partition by two columns and find sum SQL - sql

Assume there's a table students where we have student id, course id, role id, number of failed assignments and total number of assignments of the students in the course.
student role course fail total
1000 23 20022 1 5
1000 23 10055 2 8
1000 23 29000 0 10 fail = 3, total = 23
-------------------------------
1000 15 50003 1 7
1000 15 10299 3 8 fail = 4, total = 15
-------------------------------
1000 34 30042 0 5 fail = 0, total = 5
------------------------------------------
2035 34 90002 1 10
2035 34 55053 2 8 fail = 3, total = 18
-------------------------------
2035 10 80003 0 5 fail = 0, total = 5
...
What is the way to partition by two columns to find total number of fails and total number of assignments per each role for each student?
Expected output for above example would be:
student role sum_fail sum_total
1000 23 3 23
1000 15 4 15
1000 34 0 5
----------------------------------------
2035 34 3 18
2035 10 0 5
...
The code I tried so far produces incorrect numbers, where each student has exactly the same number of fails and total per role:
SELECT s.student, s.sum_fail, s.sum_total
FROM ( SELECT student, role, SUM(fail) OVER (PARTITION BY role) AS sum_fail,
SUM(total) OVER (PARTITION BY role) AS sum_total
FROM students ) s
WHERE s.student=student
GROUP BY s.student, s.sum_fail, s.sum_total;

You want one result row per student and role. "per" translates to GROUP BY in SQL. With two mere sums, this is a simple aggregation:
select student, role, sum(fail) as sum_fail, sum(total) as sum_total
from students
group by student, role
order by student, role;
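A minimal sketch to check the aggregation against the sample rows, assuming an in-memory SQLite database (3.25 or newer for the window-function variant) stands in for the real one. The column types are assumptions. The second query shows what the PARTITION BY form would look like with both columns in the partition; it needs DISTINCT because a window function keeps one row per input row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only shows the values.
conn.execute("CREATE TABLE students (student INT, role INT, course INT, fail INT, total INT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?, ?)",
    [
        (1000, 23, 20022, 1, 5),
        (1000, 23, 10055, 2, 8),
        (1000, 23, 29000, 0, 10),
        (1000, 15, 50003, 1, 7),
        (1000, 15, 10299, 3, 8),
        (1000, 34, 30042, 0, 5),
        (2035, 34, 90002, 1, 10),
        (2035, 34, 55053, 2, 8),
        (2035, 10, 80003, 0, 5),
    ],
)

# Plain aggregation: one row per (student, role).
grouped = conn.execute(
    """
    SELECT student, role, SUM(fail) AS sum_fail, SUM(total) AS sum_total
    FROM students
    GROUP BY student, role
    ORDER BY student, role
    """
).fetchall()

# Window-function variant: partition by BOTH columns, then de-duplicate.
windowed = conn.execute(
    """
    SELECT DISTINCT student, role,
           SUM(fail)  OVER (PARTITION BY student, role) AS sum_fail,
           SUM(total) OVER (PARTITION BY student, role) AS sum_total
    FROM students
    ORDER BY student, role
    """
).fetchall()

for row in grouped:
    print(row)
```

Both queries return the same rows here; the GROUP BY form is the simpler of the two when only the sums are needed.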

Related

Can't use case & aggregation correctly

I have the following table
Cash_table
ID Cash Rates Amount
1 50 3 16
2 100 4 25
3 130 10 7
3 130 10 6
4 13 7 1.8
5 30 8 2.5
5 30 10 1
6 10 5 2
What I want as a result is to cumulate all the entries that have a Count(id)>1 like this:
ID New_Cash New_Rates New_Amount
1 50 3 16
2 100 4 25
3 130 10+10 130/(10+10)
4 13 7 1.8
5 30 8+10 30/(8+10)
6 10 5 2
So I only want to change the rows where Count(id)>1 and leave the rest like it was.
For the rows with count(id)>1 I want to sum up the rates and take the cash and divide it by the sum of the rates. The Rates alone aren't a problem since I can sum them up and group by id and get the desired result.
The problem is with the New_Amount column:
I am trying to do it with a case statement but it isn't working:
select id,
cash as new_cash,
sum(rates) as new_rates,
(case count(id)
when 1 then amount
else cash/sum(nvl(rates,null))
end) as new_amount
from Cash_table
group by id
As the cash value is always the same for an ID, you can group by that as well:
select id,
cash as new_cash,
sum(rates) as new_rates,
case count(id)
when 1 then max(amount)
else cash/sum(rates)
end as new_amount
from cash_table
group by id, cash
order by id
ID NEW_CASH NEW_RATES NEW_AMOUNT
---------- ---------- ---------- ----------
1 50 3 16
2 100 4 25
3 130 20 6.5
4 13 7 1.8
5 30 18 1.66666667
6 10 5 2
The first branch of the case expression needs an aggregate because you aren't grouping by amount; and the sum(nvl(rates,null)) can just be sum(rates). If you're expecting any null rates then you need to decide how you want the amount to be handled, but nvl(rates,null) isn't doing anything.
You can do the same thing without a case expression if you prefer, manipulating all the values - which might be more expensive:
select id,
cash as new_cash,
sum(rates) as new_rates,
sum(amount * rates)/sum(rates) as new_amount
from cash_table
group by id, cash
order by id
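As a rough check, the two queries can be run side by side on the sample data. This sketch assumes SQLite rather than Oracle, and declares the numeric columns as REAL (an assumption) so that cash / SUM(rates) is not integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# REAL columns are an assumption, chosen to avoid integer division in SQLite.
conn.execute("CREATE TABLE cash_table (id INT, cash REAL, rates REAL, amount REAL)")
conn.executemany(
    "INSERT INTO cash_table VALUES (?, ?, ?, ?)",
    [
        (1, 50, 3, 16), (2, 100, 4, 25), (3, 130, 10, 7), (3, 130, 10, 6),
        (4, 13, 7, 1.8), (5, 30, 8, 2.5), (5, 30, 10, 1), (6, 10, 5, 2),
    ],
)

# Version with the CASE expression.
with_case = conn.execute(
    """
    SELECT id, cash AS new_cash, SUM(rates) AS new_rates,
           CASE COUNT(id) WHEN 1 THEN MAX(amount)
                          ELSE cash / SUM(rates) END AS new_amount
    FROM cash_table
    GROUP BY id, cash
    ORDER BY id
    """
).fetchall()

# Version without CASE: rate-weighted average of amount.
without_case = conn.execute(
    """
    SELECT id, cash AS new_cash, SUM(rates) AS new_rates,
           SUM(amount * rates) / SUM(rates) AS new_amount
    FROM cash_table
    GROUP BY id, cash
    ORDER BY id
    """
).fetchall()

for a, b in zip(with_case, without_case):
    print(a, b[3])
```

On this sample the two versions agree (up to floating-point rounding) because, for the duplicated IDs, the amount * rates products happen to add up to cash. That is a property of the data, not of the queries.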

Count distinct values of a Column based on Distinct values of First Column

I am dealing with a huge volume of traffic data. I want to identify the vehicles which have changed their lanes. I'm using Microsoft Access with VB.Net.
Traffic Data:
Vehicle_ID Lane_ID Frame_ID Distance
1 2 12 100
1 2 13 103
1 2 14 105
2 1 16 130
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
I have tried to select the distinct Vehicle_ID values and then count(distinct Lane_ID).
I could list the distinct Vehicle_ID values, but then it counts all the Lane_ID rows instead of the distinct Lane_ID values.
SELECT
Distinct Vehicle_ID, count(Lane_ID)
FROM Table1
GROUP BY Vehicle_ID
Shown Result:
Vehicle_ID Lane Count
1 3
2 3
3 2
Correct Result:
Vehicle_ID Lane Count
1 1
2 2
3 2
Further to that, I would like to get all the rows for each Vehicle_ID that has changed its lane (all data, including the previous lane and the new lane). The output would be something like:
Vehicle_ID Lane_ID Frame_ID Distance
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
Access does not support COUNT(DISTINCT columnname) so do this:
SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS [Lane Count]
FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS t
GROUP BY t.Vehicle_ID
So, to identify the vehicles which have changed their lanes, you need to add this to the above query:
HAVING COUNT(t.Lane_ID) > 1
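The following sketch runs the same query shape against the sample data. It uses SQLite as a stand-in, since Access itself cannot be scripted here, so treat it as a check of the logic rather than of Access syntax. The second query, which is an addition and not part of the answer above, uses the same subquery to pull every row of the vehicles that changed lanes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Vehicle_ID INT, Lane_ID INT, Frame_ID INT, Distance INT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [
        (1, 2, 12, 100), (1, 2, 13, 103), (1, 2, 14, 105),
        (2, 1, 16, 130), (2, 1, 17, 135), (2, 2, 18, 136),
        (3, 1, 19, 140), (3, 2, 20, 141),
    ],
)

# Distinct (vehicle, lane) pairs first, then count lanes per vehicle.
changed = conn.execute(
    """
    SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS lane_count
    FROM (SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1) AS t
    GROUP BY t.Vehicle_ID
    HAVING COUNT(t.Lane_ID) > 1
    ORDER BY t.Vehicle_ID
    """
).fetchall()

# All rows for the vehicles that appear in more than one lane.
changed_rows = conn.execute(
    """
    SELECT Vehicle_ID, Lane_ID, Frame_ID, Distance
    FROM Table1
    WHERE Vehicle_ID IN (
        SELECT t.Vehicle_ID
        FROM (SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1) AS t
        GROUP BY t.Vehicle_ID
        HAVING COUNT(t.Lane_ID) > 1
    )
    ORDER BY Frame_ID
    """
).fetchall()

print(changed)
print(changed_rows)
```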
SELECT
Table1.Vehicle_ID,
cTable1.LANE_COUNT
FROM Table1
INNER JOIN (
SELECT Vehicle_ID, COUNT(*) AS LANE_COUNT FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS dTable1
GROUP BY Vehicle_ID
) AS cTable1 ON cTable1.Vehicle_ID = Table1.Vehicle_ID
I think you should do it step by step:
Select the distinct combinations of vehicle ID and lane ID (dTable1),
count the distinct combinations per vehicle (cTable1),
and join the result back to the actual table.
If you want vehicles that have changed their lanes, then you can do:
SELECT Vehicle_ID,
IIF(MIN(Lane_ID) = MAX(Lane_ID), 0, 1) as change_lane_flag
FROM Table1
GROUP BY Vehicle_ID;
I think this is as good as counting the number of distinct lanes, because neither approach counts actual "lane changes". A distinct count would return "2" for the following data, even though the vehicle changes lanes multiple times:
2 1 16 130
2 1 17 135
2 2 18 136
2 1 16 140
2 1 17 145
2 2 18 146
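To make the distinction concrete, here is a small sketch that counts actual transitions for the lane sequence above. It assumes the rows are already in chronological order as listed (the Frame_ID values repeat in that sample, so they cannot be used for ordering), and the helper name count_lane_changes is hypothetical.

```python
def count_lane_changes(lanes):
    """Count positions where an observation differs from the previous one."""
    return sum(1 for prev, cur in zip(lanes, lanes[1:]) if prev != cur)


# Lane_ID column of the six sample rows for vehicle 2, in listed order.
lanes = [1, 1, 2, 1, 1, 2]

distinct_lanes = len(set(lanes))          # what COUNT over DISTINCT lanes measures
lane_changes = count_lane_changes(lanes)  # transitions 1->2, 2->1, 1->2

print(distinct_lanes, lane_changes)
```

For this sequence the distinct count is 2 while the number of transitions is 3.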

How do I make a query that selects where the SUM equals a fixed value

I've spent the last couple of days searching for a way to make a SQL query that searches the database and returns records where the SUM over rows with the same ID is equal to or greater than the value provided.
For this I've been using the W3schools database to test it out in the products table.
More so what I've been trying to do:
SELECT * FROM products
WHERE supplierid=? and SUM(price) > 50
The idea is that the "where supplier id" part would go through each supplier, sum that supplier's prices, and return the supplier's records only if the sum is higher than 50.
In this case it would read supplier ID 1 and add up the prices for that supplier: 18+19+10=47. Since 47 < 50, those records would not be printed. Next, supplier ID 2 gives 22+21.35=43.35, so again those records would not be printed. Only when the sum of the prices is higher than 50 would the records be printed.
I'm working with a DB2 database.
SAMPLE data:
ProductID ProductName SupplierID CategoryID Price
1 Chais 1 1 18
2 Chang 1 1 19
3 Aniseed 1 2 10
4 Chef Anton 2 2 22
5 Chef Anton 2 2 21.35
6 Grandma's 3 2 25
7 Uncle Bob 3 7 30
8 Northwoods 3 2 40
9 Mishi 4 6 97
10 Ikura 4 8 31
11 Queso 5 4 21
12 Queso 5 4 38
13 Konbu 6 8 6
14 Tofu 6 7 23.25
How about:
select * from products where supplierid in (
select supplierid
from products
group by supplierid
having sum(price) > 50
);
The subquery finds out all the supplierid values that match your condition. The main (external) query retrieves all rows that match the list of supplierids.
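A quick way to sanity-check this against the sample data, sketched with SQLite in place of DB2 (an assumption; the query itself is unchanged):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (productid INT, productname TEXT, "
    "supplierid INT, categoryid INT, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Chais", 1, 1, 18), (2, "Chang", 1, 1, 19), (3, "Aniseed", 1, 2, 10),
        (4, "Chef Anton", 2, 2, 22), (5, "Chef Anton", 2, 2, 21.35),
        (6, "Grandma's", 3, 2, 25), (7, "Uncle Bob", 3, 7, 30),
        (8, "Northwoods", 3, 2, 40), (9, "Mishi", 4, 6, 97), (10, "Ikura", 4, 8, 31),
        (11, "Queso", 5, 4, 21), (12, "Queso", 5, 4, 38),
        (13, "Konbu", 6, 8, 6), (14, "Tofu", 6, 7, 23.25),
    ],
)

# Inner query: suppliers whose prices sum to more than 50.
# Outer query: every product row belonging to those suppliers.
rows = conn.execute(
    """
    SELECT * FROM products
    WHERE supplierid IN (
        SELECT supplierid
        FROM products
        GROUP BY supplierid
        HAVING SUM(price) > 50
    )
    ORDER BY productid
    """
).fetchall()

product_ids = [r[0] for r in rows]
supplier_ids = sorted({r[2] for r in rows})
print(product_ids, supplier_ids)
```

Suppliers 1, 2 and 6 sum to 47, 43.35 and 29.25 respectively, so only the rows of suppliers 3, 4 and 5 come back.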
Not tested, but I would expect DB2 to have analytic functions and CTEs, so perhaps:
with
basedata as (
select t.*
, sum(t.price) over(partition by t.supplierid) sum_price
from products t
)
select *
from basedata
where supplierid = ?
and sum_price > 50
The analytic function aggregates the price information but does not group the resultset, so you get the rows from your initial result, but restricted to those with an aggregated price value > 50.
The difference from a solution with a subquery is that the analytic function should be more efficient, since it has to read the table only once to produce the result.
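Since the query above is marked as untested, here is a sketch that runs the same CTE on a trimmed copy of the sample data. It assumes SQLite 3.25 or newer as a stand-in for DB2, keeps only the three columns the query needs, and wraps the query in a hypothetical helper, products_for_supplier, to bind the ? parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the columns the query uses; the real table has more.
conn.execute("CREATE TABLE products (productid INT, supplierid INT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, 1, 18), (2, 1, 19), (3, 1, 10), (4, 2, 22), (5, 2, 21.35),
        (6, 3, 25), (7, 3, 30), (8, 3, 40), (9, 4, 97), (10, 4, 31),
        (11, 5, 21), (12, 5, 38), (13, 6, 6), (14, 6, 23.25),
    ],
)


def products_for_supplier(supplier_id, threshold=50):
    """Return product IDs of one supplier if that supplier's prices sum above threshold."""
    cursor = conn.execute(
        """
        WITH basedata AS (
            SELECT t.*,
                   SUM(t.price) OVER (PARTITION BY t.supplierid) AS sum_price
            FROM products t
        )
        SELECT productid
        FROM basedata
        WHERE supplierid = ?
          AND sum_price > ?
        ORDER BY productid
        """,
        (supplier_id, threshold),
    )
    return [row[0] for row in cursor]


print(products_for_supplier(3))  # supplier 3 sums to 95
print(products_for_supplier(1))  # supplier 1 sums to 47
```

Whether the single-pass plan is actually faster than the subquery form depends on the optimizer and the data, so it is worth comparing the two on the real table.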

Oracle SQL find row crossing limit

I have a table which has four columns as below
ID.
SUB_ID. one ID will have multiple SUB_IDs
Revenue
PAY where values of Pay is always less than or equal to Revenue
select * from Table A order by ID , SUB_ID will have data as below
ID SUB_ID REVENUE PAY
100 1 10 8
100 2 12 9
100 3 9 7
100 4 11 11
101 1 6 5
101 2 4 4
101 3 3 2
101 4 8 7
101 5 4 3
101 6 3 3
I have a constant LIMIT value of 20. For each ID, I need to find the SUB_ID at which the running SUM of Revenue (taken in increasing SUB_ID order) crosses the LIMIT, and then find the total Pay up to and including that row. In this example:
for ID 100 the limit is crossed by SUB_ID 2 (10+12), so the total Pay is 17 (8+9)
for ID 101 the limit is crossed by SUB_ID 4 (6+4+3+8), so the total Pay is 18 (5+4+2+7)
Basically I need to find the row which crosses the Limit.
Fiddle: http://sqlfiddle.com/#!4/4f12a/4/0
with sub as
(select x.*,
sum(revenue) over(partition by id order by sub_id) as run_rev,
sum(pay) over(partition by id order by sub_id) as run_pay
from tbl x)
select *
from sub s
where s.run_rev = (select min(x.run_rev)
from sub x
where x.id = s.id
and x.run_rev > 20);
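The query can be checked against the sample rows with a short script. This is a sketch assuming SQLite 3.25 or newer in place of Oracle, with the table name tbl taken from the query above and the limit passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INT, sub_id INT, revenue INT, pay INT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (100, 1, 10, 8), (100, 2, 12, 9), (100, 3, 9, 7), (100, 4, 11, 11),
        (101, 1, 6, 5), (101, 2, 4, 4), (101, 3, 3, 2),
        (101, 4, 8, 7), (101, 5, 4, 3), (101, 6, 3, 3),
    ],
)

LIMIT = 20

# Running totals per id, then keep the first row whose running revenue exceeds the limit.
crossing = conn.execute(
    """
    WITH sub AS (
        SELECT x.*,
               SUM(revenue) OVER (PARTITION BY id ORDER BY sub_id) AS run_rev,
               SUM(pay)     OVER (PARTITION BY id ORDER BY sub_id) AS run_pay
        FROM tbl x
    )
    SELECT s.id, s.sub_id, s.run_rev, s.run_pay
    FROM sub s
    WHERE s.run_rev = (SELECT MIN(x.run_rev)
                       FROM sub x
                       WHERE x.id = s.id
                         AND x.run_rev > ?)
    ORDER BY s.id
    """,
    (LIMIT,),
).fetchall()

for row in crossing:
    print(row)
```

The running revenue for ID 101 is 6, 10, 13, 21, so SUB_ID 4 is the first row above 20, with a running pay of 18.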

CTE to Group Employees that are in different departments Many to Many

Can anyone help, please? I just can't seem to solve this puzzle.
Input Table
DepartmentID EmpID
-----------------------
1 100
1 101
1 103
1 200
2 300
2 350
3 350
3 100
4 50
4 30
4 45
5 50
5 51
5 52
5 53
6 53
6 54
7 54
7 55
8 55
8 56
10 800
11 900
Output Table
Please note that GroupID is created by us to group departments that have common employees, with the condition that 1 employee cannot be in two departments.
GroupID Department
-----------------------
1000 1
1000 2
1000 3
1001 4
1001 5
1001 6
1001 7
1001 8
1002 10
1003 11
Example to show how and why Department 1, 2 & 3 are grouped:
EmpID 100 is common between Department 1 & 3, but wait! EmpID 350 is common between 2 & 3 as well, so group them too. Now the group formed by departments 1, 2 and 3 does not share any employee with any other department, so we can stop.
Note: This is not a 'normal' group by, because we don't want any 2 groups that we create to have the same employee.
LOGIC:
Step1: EmpID 50 is common between department 4 & 5. So Group 4 & 5 together
Step2: So this means Group of 4 & 5 have 50,30,45,51,52,53 unique employees
Step3: But wait! the Department 6 has EmpID 53 common with the Group of 4 & 5, formed in step2
Step4: Group department 4, 5, and 6. This new group has unique employees of 50,30,45,51,52,53,54
Step5: But wait! the department 7 has EmpID of 54, which is common with the Group formed in step4. So Group them together
This goes on until no employee appears in 2 groups. So in this case Department 7 and Department 8 will also need to be 'merged' into the group mentioned in Step 4.
This is a graph traversal problem that requires recursive CTEs. I think this is one approach:
with cte as (
select department, empid
from inputs
union all
select cte.department, i.empid
from inputs i join
cte
on i.empid = cte.empid and i.department <> cte.department
)
select department,
row_number() over (order by min(empid)) as groupid
from cte
group by department;
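Because this is a connected-components problem, the expected grouping can also be computed outside SQL. The sketch below is not the recursive CTE above; it swaps in a union-find structure in Python to produce a reference result to compare a query against. The function name group_departments and the starting ID of 1000 are assumptions taken from the example output.

```python
def group_departments(pairs, first_group_id=1000):
    """Group departments that share employees, directly or through a chain."""
    parent = {}

    def find(dept):
        # Follow parent links to the representative, shortening the path on the way.
        parent.setdefault(dept, dept)
        while parent[dept] != dept:
            parent[dept] = parent[parent[dept]]
            dept = parent[dept]
        return dept

    def union(a, b):
        # Keep the smallest department number as the representative.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_dept_of_emp = {}
    for dept, emp in pairs:
        find(dept)  # register the department even if it shares nobody
        if emp in first_dept_of_emp:
            union(dept, first_dept_of_emp[emp])
        else:
            first_dept_of_emp[emp] = dept

    roots = sorted({find(d) for d in parent})
    group_id = {root: first_group_id + i for i, root in enumerate(roots)}
    return sorted((group_id[find(d)], d) for d in parent)


pairs = [
    (1, 100), (1, 101), (1, 103), (1, 200), (2, 300), (2, 350), (3, 350),
    (3, 100), (4, 50), (4, 30), (4, 45), (5, 50), (5, 51), (5, 52), (5, 53),
    (6, 53), (6, 54), (7, 54), (7, 55), (8, 55), (8, 56), (10, 800), (11, 900),
]
groups = group_departments(pairs)
for gid, dept in groups:
    print(gid, dept)
```

Group IDs are handed out in order of the smallest department in each group, which matches the numbering in the expected output table.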