Extracting several math operations outputs from single select query - sql

I have three tables that I need to join for an analysis: Active, Students and Bills.
'Active' records which students have been active and in which subjects, with columns: id (student id) int, time (time they were active) timestamp, and subject (subject in which they were active) text
id time subject
1 2020-04-23 06:53:30 Math
2 2020-05-13 09:51:22 Physics
2 2020-02-26 17:34:56 History
'Students' is the full student table containing: id (student id) int, group (the group to which the student was assigned for the A/B test) text
id group
1 A
2 B
3 A
4 A
'Bills' keeps a record of all transactions for courses that students purchased: id (student id) int, sale_time (time when the student purchased the course) timestamp, subject (subject of the purchased course) text, money (amount paid).
id sale_time subject money
1 2020-03-04 08:54:55 Math 4300
1 2020-04-08 20:43:56 Math 3200
2 2020-05-09 13:43:12 Law 8900
Basically, we have a table of students (Students), some of whom purchased courses (Bills), and some of those who purchased remain active (Active).
I need to write ONE SINGLE query where I can extract the following grouped by whether they belong to A or B group:
average revenue per user: sum (money) / count (distinct Students.id)
average revenue per active user: sum (money) / count (distinct Active.id)
conversion rate (%): count (distinct Bills.id) / count (distinct Students.id)
conversion rate (active) (%): count (distinct Bills.id) / count (distinct Active.id)
conversion rate (Math) (%): (count (distinct Bills.id) where Bills.subject = Math) / (count (distinct Active.id) where Active.subject = Math)
All these in single query!
I used
select sum (money)/count (distinct Students.id)
from Students
left join Bills using (id)
left join Active using (id)
group by group, Students.id
but I don't know how to do these math calculations all in one right after select with filters.
Please help!
SQL fiddle: https://www.db-fiddle.com/f/NPQR6aBf8H36XvrefJY2J/0

All you need is this:
select s.[group],
       sum(money) / NULLIF(count(distinct s.id), 0) as AvgPerUser,
       sum(money) / NULLIF(count(distinct a.id), 0) as AvgActUser,
       1.0 * count(distinct b.id) / NULLIF(count(distinct s.id), 0) as CovRate,
       1.0 * count(distinct b.id) / NULLIF(count(distinct a.id), 0) as ConActRate,
       1.0 * (select count(distinct b2.id) from Bills as b2 where b2.subject = 'Math') /
       NULLIF((select count(distinct a2.id) from Active as a2 where a2.subject = 'Math'), 0) as ConRateMath
from Students as s
left join Bills as b on s.id = b.id
left join Active as a on s.id = a.id
group by s.[group]

I would recommend removing duplicates before joining and then aggregating:
select s."group",
       sum(b.money) * 1.0 / count(s.id) as AvgPerUser,
       sum(b.money) * 1.0 / nullif(count(a.id), 0) as AvgActUser,
       count(b.id) * 1.0 / count(s.id) as CovRate,
       count(b.id) * 1.0 / nullif(count(a.id), 0) as ConActRate,
       count(b.id) filter (where b.has_math) * 1.0 /
       nullif(count(a.id) filter (where a.has_math), 0) as ConRateMath
from Students s left join
     (select b.id, sum(b.money) as money, bool_or(b.subject = 'Math') as has_math
      from bills b
      group by b.id
     ) b
     on s.id = b.id left join
     (select a.id, bool_or(a.subject = 'Math') as has_math
      from active a
      group by a.id
     ) a
     on s.id = a.id
group by s."group";
Note: I don't think you want s.id in the GROUP BY. That really would not be aggregating anything.
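The "reduce each table to one row per student, then join" idea can be checked against the sample rows from the question. The following is only a sketch: it runs on SQLite through Python's sqlite3 module (not the asker's database), renames the group column to grp because group is a reserved word, replaces FILTER with 0/1 flags, and uses made-up output column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INTEGER, grp TEXT);  -- grp stands in for "group"
CREATE TABLE bills    (id INTEGER, sale_time TEXT, subject TEXT, money INTEGER);
CREATE TABLE active   (id INTEGER, time TEXT, subject TEXT);
INSERT INTO students VALUES (1,'A'),(2,'B'),(3,'A'),(4,'A');
INSERT INTO bills VALUES
  (1,'2020-03-04 08:54:55','Math',4300),
  (1,'2020-04-08 20:43:56','Math',3200),
  (2,'2020-05-09 13:43:12','Law',8900);
INSERT INTO active VALUES
  (1,'2020-04-23 06:53:30','Math'),
  (2,'2020-05-13 09:51:22','Physics'),
  (2,'2020-02-26 17:34:56','History');
""")

# Each subquery yields at most one row per student, so the joins cannot
# multiply rows and plain COUNT/SUM are safe in the outer query.
query = """
SELECT s.grp,
       COALESCE(SUM(b.money), 0) * 1.0 / COUNT(s.id)      AS avg_per_user,
       SUM(b.money) * 1.0 / NULLIF(COUNT(a.id), 0)        AS avg_per_active_user,
       COUNT(b.id) * 1.0 / COUNT(s.id)                    AS conv_rate,
       COUNT(b.id) * 1.0 / NULLIF(COUNT(a.id), 0)         AS conv_rate_active,
       SUM(b.has_math) * 1.0 / NULLIF(SUM(a.has_math), 0) AS conv_rate_math
FROM students s
LEFT JOIN (SELECT id, SUM(money) AS money, MAX(subject = 'Math') AS has_math
           FROM bills GROUP BY id) b ON b.id = s.id
LEFT JOIN (SELECT id, MAX(subject = 'Math') AS has_math
           FROM active GROUP BY id) a ON a.id = s.id
GROUP BY s.grp
ORDER BY s.grp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On this sample, group B has no Math activity at all, so its Math conversion rate comes back as NULL rather than raising a division-by-zero error.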

Related

How to use count() without the output changing when I use join -Postgresql

I need to find the sum of all the rental days of a car dealership (including period_begin and period_end, not unique) for all cars of that department, divided by the total number of different (unique) employees that department ever had.
I have 5 tables
Department(PK departmentnr, name, FK postcode, FK place_name)
Employee(PK employeenr, FK email)
Contract(PK periode_begin, PK periode_end, FK departmentnr, FK employeenr)
registerform (,PK periode_end,PK,periode_end,FKemail,FK,
Fk numberplate,periode_end,periode_begin)
car(PK numberplate,FK departmentnr,FK Brand,FK model)
when I go step by step
part 1
The total employees per department
select departmentnr,Count(employeenr)FROM contract
group by departmentnr
part 2
the amount of days the cars were hired
SELECT DISTINCT departmentnr,
Sum((( Date(periode_end) - Date(periode_begin) + 1 ))) AS
average
FROM registerform r
INNER JOIN car w using(numberplate)
GROUP BY departmentnr
I get the correct output, but when I try to put these two together
SELECT distinct departmentnr,
(
sum(((date(r.periode_end) - date(r.periode_begin) + 1))) / (
select
count(employeenr))
)
as average
from
registerform r
inner join
car w using(numberplate)
inner join
contract using(departmentnr)
inner join
employee using(employeenr)
group by
departmentnr
then my output gets absurd.
How can I fix this, and is there a way to make the code more efficient?
Aggregate before you JOIN. So, one method is:
SELECT c.departmentnr, co.num_employees,
       Sum( Date(r.periode_end) - Date(r.periode_begin) + 1 ) * 1.0 / co.num_employees AS average
FROM registerform r JOIN
     car c
     USING (numberplate) LEFT JOIN
     (SELECT co.departmentnr, Count(DISTINCT co.employeenr) as num_employees
      FROM contract co
      GROUP BY co.departmentnr
     ) co
     ON co.departmentnr = c.departmentnr
GROUP BY c.departmentnr, co.num_employees;
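To see why the combined query in the question goes wrong, it helps to run both shapes on a tiny data set. This is a sketch on SQLite via Python's sqlite3 module, with invented rows (one department, three rentals, two employees of whom one has two contracts) and julianday() standing in for the Date() subtraction; none of these values come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car (numberplate TEXT, departmentnr INTEGER);
CREATE TABLE registerform (numberplate TEXT, periode_begin TEXT, periode_end TEXT);
CREATE TABLE contract (departmentnr INTEGER, employeenr INTEGER,
                       periode_begin TEXT, periode_end TEXT);
INSERT INTO car VALUES ('AAA-111', 1), ('BBB-222', 1);
INSERT INTO registerform VALUES
  ('AAA-111', '2020-01-01', '2020-01-03'),   -- 3 days
  ('BBB-222', '2020-01-05', '2020-01-05'),   -- 1 day
  ('AAA-111', '2020-02-01', '2020-02-02');   -- 2 days
INSERT INTO contract VALUES
  (1, 10, '2019-01-01', '2019-12-31'),
  (1, 10, '2020-01-01', '2020-12-31'),       -- same employee, second contract
  (1, 11, '2020-01-01', '2020-12-31');
""")

# Joining first pairs every rental with every contract (3 x 3 = 9 rows),
# so both the day total and the employee count are inflated.
naive = conn.execute("""
SELECT c.departmentnr,
       SUM(julianday(r.periode_end) - julianday(r.periode_begin) + 1)
         / COUNT(co.employeenr)
FROM registerform r
JOIN car c USING (numberplate)
JOIN contract co ON co.departmentnr = c.departmentnr
GROUP BY c.departmentnr
""").fetchall()

# Aggregating each side to one row per department before joining
# keeps the two totals independent of each other.
fixed = conn.execute("""
SELECT d.departmentnr, d.total_days * 1.0 / e.num_employees
FROM (SELECT c.departmentnr,
             SUM(julianday(r.periode_end) - julianday(r.periode_begin) + 1) AS total_days
      FROM registerform r
      JOIN car c USING (numberplate)
      GROUP BY c.departmentnr) d
JOIN (SELECT departmentnr, COUNT(DISTINCT employeenr) AS num_employees
      FROM contract
      GROUP BY departmentnr) e ON e.departmentnr = d.departmentnr
""").fetchall()

print(naive)   # 18 inflated days / 9 inflated contract rows
print(fixed)   # 6 rental days / 2 distinct employees
```

The naive version reports 2.0 here while the pre-aggregated version reports 3.0, which is the 6 rental days divided by the 2 distinct employees.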

SELECT * FROM table in addition of aggregation function

Short context:
I would like to show a list of all companies except if they are in the sector 'defense' or 'government' and their individual total spent on training classes. Only the companies that have this total amount above 1000 must be shown.
So I wrote the following query:
SELECT NAME, ADDRESS, ZIP_CODE, CITY, SUM(FEE-PROMOTION) AS "Total spent on training at REX"
FROM COMPANY INNER JOIN PERSON ON (COMPANY_NUMBER = EMPLOYER) INNER JOIN ENROLLMENT ON (PERSON_ID = STUDENT)
WHERE SECTOR_CODE NOT IN (SELECT CODE
FROM SECTOR
WHERE DESCRIPTION = 'Government' OR DESCRIPTION = 'Defense')
GROUP BY NAME, ADDRESS, ZIP_CODE, CITY
HAVING SUM(FEE-PROMOTION) > 1000
ORDER BY SUM(FEE-PROMOTION) DESC
Now what I actually need is, instead of defining every single column in the COMPANY table, I would like to show ALL columns of the COMPANY table using *.
SELECT * (all columns from COMPANY here), SUM(FEE-PROMOTION) AS "Total spent on training at REX"
FROM COMPANY INNER JOIN PERSON ON (COMPANY_NUMBER = EMPLOYER) INNER JOIN ENROLLMENT ON (PERSON_ID = STUDENT)
WHERE SECTOR_CODE NOT IN (SELECT CODE
FROM SECTOR
WHERE DESCRIPTION = 'Government' OR DESCRIPTION = 'Defense')
GROUP BY * (How to fix it here?)
HAVING SUM(FEE-PROMOTION) > 1000
ORDER BY SUM(FEE-PROMOTION) DESC
I could define every single column from COMPANY in the SELECT and that solution will do the job (as in the first example), but how can I make the query shorter using "SELECT * from the table COMPANY"?
The key idea is to summarize in the subquery to get the total spend for each company. This allows you to remove the aggregation from the outer query:
select c.*, pe.total_spend
from company c join
     sector s
     on c.sector_code = s.code left join
     (select p.employer, sum(e.fee - e.promotion) as total_spend
      from person p join
           enrollment e
           on p.person_id = e.student
      group by p.employer
     ) pe
     on pe.employer = c.company_number
where s.description not in ('Government', 'Defense') and
      pe.total_spend > 1000
order by pe.total_spend desc;
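A quick way to convince yourself that c.* works once the aggregation lives in a subquery is to try it on a few rows. The sketch below uses SQLite through Python's sqlite3 module; the table layouts are reduced to the columns the query touches, and every row (company names, sector codes, fees) is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sector (code TEXT, description TEXT);
CREATE TABLE company (company_number INTEGER, name TEXT, sector_code TEXT);
CREATE TABLE person (person_id INTEGER, employer INTEGER);
CREATE TABLE enrollment (student INTEGER, fee INTEGER, promotion INTEGER);
INSERT INTO sector VALUES ('IT','Technology'), ('DEF','Defense'), ('GOV','Government');
INSERT INTO company VALUES (1,'Acme','IT'), (2,'Fort','DEF'), (3,'Tiny','IT');
INSERT INTO person VALUES (1,1), (2,1), (3,2), (4,3);
INSERT INTO enrollment VALUES
  (1, 800, 100),   -- Acme:  700
  (2, 600,   0),   -- Acme:  600
  (3, 5000,  0),   -- Fort: 5000, but Defense sector
  (4, 500,  50);   -- Tiny:  450, below the threshold
""")

# The subquery returns one total per employer, so the outer query needs
# no GROUP BY and may select every company column with c.*.
rows = conn.execute("""
SELECT c.*, pe.total_spend
FROM company c
JOIN sector s ON s.code = c.sector_code
JOIN (SELECT p.employer, SUM(e.fee - e.promotion) AS total_spend
      FROM person p
      JOIN enrollment e ON e.student = p.person_id
      GROUP BY p.employer) pe ON pe.employer = c.company_number
WHERE s.description NOT IN ('Government', 'Defense')
  AND pe.total_spend > 1000
ORDER BY pe.total_spend DESC
""").fetchall()
print(rows)
```

Only Acme survives both filters: Fort is excluded by its sector and Tiny by the 1000 threshold.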

SQL query to generate the following output

Display user id, user name, total amount, amount to be paid after discount and give alias name as User_ID, user_name, Total_amount, Paid_amount. Display record in descending order by user id. Click on TABLE SCHEMA to get the table
This is the code I've written. But this is not showing the expected result:
SELECT
USERS.USER_ID AS User_ID,
NAME AS user_name,
(FARE * NO_SEATS) AS Total_amount
FROM
USERS
INNER JOIN
TICKETS ON USERS.USER_ID = TICKETS.USER_ID
INNER JOIN
PAYMENTS ON TICKETS.TICKET_ID = PAYMENTS.TICKET_ID
INNER JOIN
DISCOUNTS ON PAYMENTS.DISCOUNT_ID = DISCOUNTS.DISCOUNT_ID
GROUP BY
USERS.USER_ID, NAME, FARE, NO_SEATS
ORDER BY
USERS.USER_ID DESC;
This is the expected output:
USER_ID USER_NAME TOTAL_AMOUNT PAID_AMOUNT
----------------------------------------------
5 Krena 700 625
4 Johan 800 775
3 Ivan 3000 2900
1 John 4000 3950
1 John 4000 3950
1 John 2000 1900
This will work, try it
SELECT
USERS.USER_ID AS USER_ID,
USERS.NAME AS USER_NAME,
(TICKETS.FARE * TICKETS.NO_SEATS) AS TOTAL_AMOUNT,
(TICKETS.FARE * TICKETS.NO_SEATS - DISCOUNTS.DISCOUNT_AMOUNT) AS PAID_AMOUNT
FROM
USERS
INNER JOIN TICKETS
ON USERS.USER_ID=TICKETS.USER_ID
INNER JOIN PAYMENTS
ON TICKETS.TICKET_ID=PAYMENTS.TICKET_ID
INNER JOIN DISCOUNTS
ON PAYMENTS.DISCOUNT_ID=DISCOUNTS.DISCOUNT_ID
ORDER BY USERS.USER_ID DESC;
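Since the table schema is not shown in the question, the following is only a guess at it: a sketch on SQLite via Python's sqlite3 module, assuming DISCOUNT_AMOUNT is a flat amount subtracted once per ticket. The fares, seat counts and discounts are invented so that two rows of the expected output (Krena and Johan) are reproduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: only the columns the query uses.
CREATE TABLE users (user_id INTEGER, name TEXT);
CREATE TABLE tickets (ticket_id INTEGER, user_id INTEGER, fare INTEGER, no_seats INTEGER);
CREATE TABLE payments (ticket_id INTEGER, discount_id INTEGER);
CREATE TABLE discounts (discount_id INTEGER, discount_amount INTEGER);
INSERT INTO users VALUES (4,'Johan'), (5,'Krena');
INSERT INTO tickets VALUES (101, 5, 350, 2), (102, 4, 800, 1);
INSERT INTO payments VALUES (101, 1), (102, 2);
INSERT INTO discounts VALUES (1, 75), (2, 25);
""")

# One output row per ticket: no GROUP BY is needed because nothing is aggregated.
rows = conn.execute("""
SELECT u.user_id                                   AS User_ID,
       u.name                                      AS user_name,
       t.fare * t.no_seats                         AS Total_amount,
       t.fare * t.no_seats - d.discount_amount     AS Paid_amount
FROM users u
JOIN tickets t   ON t.user_id = u.user_id
JOIN payments p  ON p.ticket_id = t.ticket_id
JOIN discounts d ON d.discount_id = p.discount_id
ORDER BY u.user_id DESC
""").fetchall()
print(rows)
```

If the discount were a percentage or applied per seat instead, only the Paid_amount expression would need to change.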
Try this code.
SELECT
USERS.USER_ID AS USER_ID,
USERS.NAME AS USER_NAME,
(TICKETS.FARE * TICKETS.NO_SEATS) AS TOTAL_AMOUNT,
(TICKETS.FARE * DISCOUNTS.DISCOUNT_AMOUNT) + TICKETS.FARE AS PAID_AMOUNT
FROM
USERS USERS
INNER JOIN TICKETS TICKETS
ON USERS.USER_ID=TICKETS.USER_ID
INNER JOIN PAYMENTS PAYMENTS
ON TICKETS.TICKET_ID=PAYMENTS.TICKET_ID
INNER JOIN DISCOUNTS DISCOUNTS
ON PAYMENTS.DISCOUNT_ID=DISCOUNTS.DISCOUNT_ID
ORDER BY USERS.USER_ID DESC;
I'm confused by your database structure: why are the amounts stored as a number datatype? You might want to change it to decimal with precision (11,2), because I assume the discount amount in your DISCOUNT table is a percentage, right? Or paste your sample data here so I can update my code.
If I had enough reputation, I would have written this as a comment (sigh):
You haven't quite defined what your expected results are.
I assume the following:
Print the total sums per user, both Total_amount (i.e. the official price) and Paid_amount (i.e. after discounts), so each includes all of the user's tickets and their payments.
I assume that the discount is a positive number to be deducted from the fare and applies per seat.
Then this should do the trick:
SELECT
u.USER_ID AS USER_ID,
u.NAME AS USER_NAME,
sum(FARE * NO_SEATS) AS TOTAL_AMOUNT,
sum((FARE - DISCOUNT_AMOUNT ) * NO_SEATS) AS PAID_AMOUNT
FROM USERS u
INNER JOIN TICKETS t
ON u.USER_ID=t.USER_ID
INNER JOIN PAYMENTS p
ON t.TICKET_ID=p.TICKET_ID
INNER JOIN DISCOUNTS d
ON p.DISCOUNT_ID=d.DISCOUNT_ID
group by u.USER_ID, u.NAME
ORDER BY u.USER_ID DESC;
This is the right answer for the question and would give the accurate output:
SELECT User_ID,
       (select name from users u where t.user_id = u.user_id) as user_name,
       ticket_id as Ticket_id,
       (no_seats * fare) as Total_amount,
       ((no_seats * fare) - (select discount_amount
                             from discounts d
                             where d.discount_id = (select discount_id
                                                    from payments p
                                                    where t.Ticket_id = p.Ticket_id))) as Paid_amount
FROM tickets t
ORDER BY User_ID desc;
Write a query to
display user id, user name, ticket id , total amount, amount to be
paid after discount and give alias name as User_ID, user_name,
Ticket_Id, Total_amount, Paid_amount. Display record in descending
order by user id.
Avoid duplicate records.
[Note : Total amount to be paid should be number of tickets * amount per ticket ]
this code runs 100%

I need a SQL query for comparing column values against rows in the same table

I have a table called BB_BOATBKG which holds passengers travel details with columns Z_ID, BK_KEY and PAXSUM where:
Z_ID = BookingNumber* LegNumber
BK_KEY = BookingNumber
PAXSUM = Total number passengers travelled in each leg for a particular booking
For Example:
Z_ID BK_KEY PAXSUM
001234*01 001234 2
001234*02 001234 3
001287*01 001287 5
001287*02 001287 5
002323*01 002323 7
002323*02 002323 6
I would like to get a list of all booking numbers (BK_KEY) from BB_BOATBKG where the total number of passengers (PAXSUM) differs between legs of the same booking.
For example, for booking number A, A*Leg01 might have 2 passengers and A*Leg02 might have 3 passengers.
Depending on your RDBMS there might be several options available. A solution that should work for most is:
SELECT A.Z_ID, A.BK_KEY, A.PAXSUM
FROM BB_BOATBKG A
JOIN (
SELECT BK_KEY
FROM BB_BOATBKG
GROUP BY BK_KEY
HAVING COUNT( DISTINCT PAXSUM ) > 1
) B
ON A.BK_KEY = B.BK_KEY
If your DBMS supports OLAP functions, have a look at RANK() OVER (...)
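The GROUP BY / HAVING COUNT(DISTINCT ...) approach can be tried directly on the six example rows from the question. This is a sketch on SQLite via Python's sqlite3 module rather than the asker's DBMS; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bb_boatbkg (z_id TEXT, bk_key TEXT, paxsum INTEGER);
INSERT INTO bb_boatbkg VALUES
  ('001234*01', '001234', 2),
  ('001234*02', '001234', 3),
  ('001287*01', '001287', 5),
  ('001287*02', '001287', 5),
  ('002323*01', '002323', 7),
  ('002323*02', '002323', 6);
""")

# The derived table keeps only bookings with more than one distinct PAXSUM;
# joining back returns every leg of those bookings.
rows = conn.execute("""
SELECT a.z_id, a.bk_key, a.paxsum
FROM bb_boatbkg a
JOIN (SELECT bk_key
      FROM bb_boatbkg
      GROUP BY bk_key
      HAVING COUNT(DISTINCT paxsum) > 1) b ON a.bk_key = b.bk_key
ORDER BY a.z_id
""").fetchall()

mismatched = sorted({bk_key for _, bk_key, _ in rows})
print(mismatched)
```

Booking 001287 is left out because both of its legs carry 5 passengers. If only the booking numbers are needed, the inner SELECT on its own is enough.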
It's a little counterintuitive, but you could join the table to itself on {BK_KEY, PAXSUM} and pull out only the records whose joined result is null.
I think this does it:
SELECT
a.BK_KEY
FROM
BB_BOATBKG a
LEFT OUTER JOIN BB_BOATBKG b ON a.BK_KEY = b.BK_KEY AND a.PAXSUM = b.PAXSUM
WHERE
b.Z_ID IS NULL
GROUP BY
a.BK_KEY
Edit: I think I missed anything beyond the trivial case. I think you can do it with some really nasty subselecting though, a la:
SELECT
b.BK_KEY
FROM
(
SELECT
a.BK_KEY,
Count = COUNT(*)
FROM
(
SELECT
a.BK_KEY,
a.PAXSUM
FROM
BB_BOATBKG a
GROUP BY
a.BK_KEY,
a.PAXSUM
HAVING
COUNT(*) = 1
) a
GROUP BY
a.BK_KEY
) b
INNER JOIN
(
SELECT
c.BK_KEY,
Count = COUNT(*)
FROM
BB_BOATBKG c
GROUP BY
c.BK_KEY
) c ON b.BK_KEY = c.BK_KEY AND b.Count = c.Count

SQL Oracle Select Group where all members are borrowed

I have two tables: Car and CarBorrowed.
The Car table contains all cars in the car pool with an ID and a group the car belongs to. For example:
ID 1, Car 1, Group Renault
ID 2, Car 3, Group Renault
ID 3, Car 4, Group VW
ID 4, Car 6, Group BMW
ID 5, Car 7, Group BMW
The CarBorrow table contains all cars which are borrowed on a particular day
Car 1, Borrowed on 23.08.2012
Car 3, Borrowed on 23.08.2012
Car 5, Borrowed on 23.08.2012
Now I want all groups, where no cars are left (today= 23.08.2012). So I should get "Group Renault"
First, join the tables so that every car is paired with its borrow record (if any) for the given day.
select c.id, c.GroupName, cb.day
from car c
left join (select * from CarBorrow where day = '23 Aug 2012') cb
on (c.id = cb.id)
All cars not borrowed will have null in day.
After this, we should select all groups that do not have any nulls.
Below is a trick to get them:
select GroupName
FROM(
select c.id, c.GroupName, cb.day
from car c
left join (select * from CarBorrow where day = '23 Aug 2012') cb
on (c.id = cb.id)
)
group by GroupName
having count(day) = count(*)
(Days that are null are not counted by COUNT)
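The COUNT(day) = COUNT(*) trick can be verified on the sample cars from the question. This is a sketch on SQLite via Python's sqlite3 module, not Oracle; the car_no column and the ISO date format are assumptions made so that "Car 1, Car 3, Car 5" in the borrow list can be matched to cars.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car (id INTEGER, car_no INTEGER, groupname TEXT);
CREATE TABLE carborrow (car_no INTEGER, day TEXT);
INSERT INTO car VALUES
  (1, 1, 'Renault'), (2, 3, 'Renault'),
  (3, 4, 'VW'),
  (4, 6, 'BMW'), (5, 7, 'BMW');
INSERT INTO carborrow VALUES
  (1, '2012-08-23'), (3, '2012-08-23'), (5, '2012-08-23');
""")

# COUNT(cb.day) skips the NULLs produced by the LEFT JOIN for cars that
# were not borrowed, so it equals COUNT(*) only when every car in the
# group has a borrow record for that day.
rows = conn.execute("""
SELECT c.groupname
FROM car c
LEFT JOIN carborrow cb
       ON cb.car_no = c.car_no AND cb.day = '2012-08-23'
GROUP BY c.groupname
HAVING COUNT(cb.day) = COUNT(*)
""").fetchall()
print(rows)
```

Putting the date condition in the ON clause has the same effect as the filtered subquery in the answer: unborrowed cars still appear, with a NULL day.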
SELECT distinct(D1.CARGROUP)
FROM den_car d1
MINUS
(SELECT D.CARGROUP
FROM den_car d
WHERE d.id IN (SELECT c.ID
FROM den_car c
MINUS
SELECT b.id
FROM den_car_borrow b
WHERE B.DATE_BORROW = TRUNC (SYSDATE)))
This may be optimized, but the idea is simple: find the borrowed cars, subtract them from all cars, then subtract the groups of the cars that are left from all groups.
Hope it helps. (By the way of course there are lots of other ways to do it.)
Hmmm . . . One way to approach this query is to count the cars in a group and also count the cars on a particular day, then take the groups where the borrowed equals the available:
select borrowed.BorrowedOn, available.CarGroup
from (select c.CarGroup, count(*) as cnt
from car c
group by c.CarGroup
) available left outer join
(select c.CarGroup, cb.BorrowedOn, count(*) as cnt
from CarBorrowed cb join
Car c
on cb.CarId = c.CarId
group by c.CarGroup, cb.BorrowedOn
) borrowed
on available.CarGroup = borrowed.CarGroup
where available.cnt = borrowed.cnt
By the way, "Group" is a bad name for a column, since it is a SQL reserved word. I've renamed it to CarGroup.
If the same car can be borrowed more than once on a given day, then change the count(*) in the second subquery to count(distinct cb.carId).
If you want just one day, you can add a condition on borrowed.BorrowedOn to the WHERE clause.
I think I have a solution now:
select x.groupname from
(select a.groupname, count(*) as cnt from car a group by a.groupname) x
inner join
(
select b.groupname, count(*) as cnt from car b where b.carid in (select caraid from modavail where day ='23.08.2012')
group by b.groupname
) y
on x.groupname = y.groupname
where x.cnt = y.cnt and y.cnt <> 0 ORDER BY GROUPNAME;
Thanks for your help!!!!