How to subtract mean between groups in SQL? - sql

I have a table like:
AVG_AMOUNT  WAS_MEMBER  YEAR
200.00      True        2018
100.00      False       2018
20.00       True        2019
300.00      False       2019
400.00      True        2020
10.00       False       2020
And I want a table like:
DIFF_AVG_AMOUNT  YEAR
100.00           2018
-280.00          2019
390.00           2020
How can I get the difference of means between members and non-members for each year? It probably uses PARTITION BY with OVER, but I'm not sure how to take a difference with a partition.

You can use conditional aggregation containing SUM() such as
SELECT SUM(CASE WHEN was_member='True' THEN avg_amount
WHEN was_member='False' THEN - avg_amount END) AS diff_avg_amount,
year
FROM tab
GROUP BY year
ORDER BY year
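To check the conditional aggregation against the sample data, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. It assumes was_member is stored as the text 'True'/'False' (the question does not state the column type), and the helper name diff_by_year is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (avg_amount, was_member, year).
# Assumption: was_member is stored as text, as the answer's comparison implies.
SAMPLE = [
    (200.0, "True", 2018), (100.0, "False", 2018),
    (20.0, "True", 2019), (300.0, "False", 2019),
    (400.0, "True", 2020), (10.0, "False", 2020),
]

def diff_by_year(rows):
    """Return [(year, member_amount - non_member_amount), ...] ordered by year."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (avg_amount REAL, was_member TEXT, year INTEGER)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT year,
               SUM(CASE WHEN was_member = 'True'  THEN avg_amount
                        WHEN was_member = 'False' THEN -avg_amount END) AS diff_avg_amount
        FROM tab
        GROUP BY year
        ORDER BY year
        """
    ).fetchall()
    conn.close()
    return result
```

Calling diff_by_year(SAMPLE) yields one row per year, so no window function is needed: the sign flip inside SUM() does the subtraction.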

Related

Aggregate multiple invoice numbers and invoice amount rows into one row

I have the following:
budget_id  invoice_number  April  June  August
004        11              NULL   690   NULL
004        12              1820   NULL  NULL
004        13              NULL   NULL  890
What I want to do is do the following:
budget_id  invoice_number  April  June  August
004        11, 12, 13      1820   690   890
However, when I try to do the following:
SELECT budget_id,
STRING_AGG(invoice_number, ',') AS invoice_number,
April,
June,
August
FROM invoice_table
GROUP BY budget_id,
April,
June,
August
Nothing happens: the table stays exactly the same. The code above works if I comment out the months, since it then aggregates the invoice numbers. But once I include the months, I still get 3 separate rows. I need the invoice amounts to be included with the months. Is it possible to get the invoice numbers aggregated as well as the invoice amounts in one row? I'm using BigQuery, if that helps.
Aggregate the month columns instead of grouping by them. Use the query below:
SELECT budget_id,
STRING_AGG(invoice_number, ',') invoice_number,
SUM(April) April,
SUM(June) June,
SUM(August) August
FROM invoice_table
GROUP BY 1;
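The reason this works is that SUM() skips NULLs, so each month column collapses to its single non-NULL amount once the months are no longer in GROUP BY. The sketch below illustrates that with Python's sqlite3 module. Note the swap: SQLite has no STRING_AGG in older versions, so GROUP_CONCAT stands in for it here, and storing invoice_number as text is an assumption.

```python
import sqlite3

# Rows from the question; None plays the role of NULL.
INVOICES = [
    ("004", "11", None, 690, None),
    ("004", "12", 1820, None, None),
    ("004", "13", None, None, 890),
]

def collapse_invoices(rows):
    """Return one row per budget_id with joined invoice numbers and summed months."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice_table ("
        "budget_id TEXT, invoice_number TEXT, "
        "April INTEGER, June INTEGER, August INTEGER)"
    )
    conn.executemany("INSERT INTO invoice_table VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT budget_id,
               GROUP_CONCAT(invoice_number, ', ') AS invoice_number,  -- stand-in for STRING_AGG
               SUM(April)  AS April,    -- SUM ignores NULLs
               SUM(June)   AS June,
               SUM(August) AS August
        FROM invoice_table
        GROUP BY budget_id
        """
    ).fetchall()
    conn.close()
    return result
```

The order of the concatenated invoice numbers is not guaranteed without an explicit ordering, so treat the joined string as a set rather than a sequence.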

Include "0" results in COUNT(*) aggregate

Good morning, I've searched in the forum one doubt that I have but the results that I've seen didn't give me a solution.
I have two tables.
CARS:
Id Model
1 Seat
2 Audi
3 Mercedes
4 Ford
BREAKDOWNS:
IdBd Description Date Price IdCar
1 Engine 01/01/2020 500 € 3
2 Battery 05/01/2020 0 € 1
3 Wheel's change 10/02/2020 110,25 € 4
4 Electronic system 15/03/2020 100 € 2
5 Brake failure 20/05/2020 0 € 4
6 Engine 25/05/2020 400 € 1
I want to write a query that shows, by month, the number of breakdowns with a cost of 0 €.
I have this query:
SELECT Year(breakdowns.[Date]) AS YEAR, StrConv(MonthName(Month(breakdowns.[Date])),3) AS MONTH, Count(*) AS [BREAKDOWNS]
FROM cars LEFT JOIN breakdowns ON (cars.Id = breakdowns.IdCar AND breakdowns.[Price]=0)
GROUP BY breakdowns.[Price], Year(breakdowns.[Date]), Month(breakdowns.[Date]), MonthName(Month(breakdowns.[Date]))
HAVING ((Year([breakdowns].[Date]))=[Insert a year:])
ORDER BY Year(breakdowns.[Date]), Month(breakdowns.[Date]);
And the result is (if I put year '2020'):
YEAR MONTH BREAKDOWNS
2020 January 1
2020 May 1
And I want:
YEAR MONTH BREAKDOWNS
2020 January 1
2020 February 0
2020 March 0
2020 May 1
Thanks!
The HAVING condition should be in WHERE (otherwise it changes the outer join into an inner join). But as long as you don't use columns from cars, there's no need to join it.
To get rows for months without a zero price, switch to conditional aggregation and don't group by Price. Access doesn't support the Standard SQL CASE expression, so use IIf() instead:
SELECT Year(breakdowns.[Date]) AS YEAR,
StrConv(MonthName(Month(breakdowns.[Date])),3) AS MONTH,
SUM(IIf(breakdowns.[Price]=0, 1, 0)) AS [BREAKDOWNS]
FROM breakdowns
WHERE ((Year([breakdowns].[Date]))=[Insert a year:])
GROUP BY Year(breakdowns.[Date]), Month(breakdowns.[Date]), MonthName(Month(breakdowns.[Date]))
ORDER BY Year(breakdowns.[Date]), Month(breakdowns.[Date]);
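Conditional aggregation counts zero-price rows per month while still keeping months where no such row exists. Here is a hedged sketch of the same idea run against SQLite from Python rather than Access. The column name bd_date, the ISO date strings, and the numeric month output are assumptions made for the example, and SQLite's CASE takes the place of Access's IIf().

```python
import sqlite3

# Breakdowns from the question, with dates rewritten as ISO strings (assumption).
BREAKDOWNS = [
    (1, "Engine", "2020-01-01", 500.0, 3),
    (2, "Battery", "2020-01-05", 0.0, 1),
    (3, "Wheel's change", "2020-02-10", 110.25, 4),
    (4, "Electronic system", "2020-03-15", 100.0, 2),
    (5, "Brake failure", "2020-05-20", 0.0, 4),
    (6, "Engine", "2020-05-25", 400.0, 1),
]

def zero_cost_breakdowns(rows, year):
    """Return [(year, month, zero_price_count), ...] for every month that has any breakdown."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE breakdowns ("
        "id_bd INTEGER, description TEXT, bd_date TEXT, price REAL, id_car INTEGER)"
    )
    conn.executemany("INSERT INTO breakdowns VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT CAST(strftime('%Y', bd_date) AS INTEGER) AS yr,
               CAST(strftime('%m', bd_date) AS INTEGER) AS mon,
               SUM(CASE WHEN price = 0 THEN 1 ELSE 0 END) AS breakdowns
        FROM breakdowns
        WHERE CAST(strftime('%Y', bd_date) AS INTEGER) = ?
        GROUP BY 1, 2      -- group by year and month only, never by price
        ORDER BY 1, 2
        """,
        (year,),
    ).fetchall()
    conn.close()
    return result
```

Months with no breakdown at all (April in the sample) still do not appear, which matches the output the question asks for; producing those would need a separate calendar table.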

Distinct count for entire dataset, grouped by month

I am dealing with a sales order table (ORDER) that looks roughly like this (updated 2018/12/20 to be closer to my actual data set):
SOID SOLINEID INVOICEDATE SALESAMOUNT AC
5 1 2018-11-30 100.00 01
5 2 2018-12-05 50.00 02
4 1 2018-12-12 25.00 17
3 1 2017-12-31 75.00 03
3 2 2018-01-03 25.00 05
2 1 2017-11-25 100.00 17
2 2 2017-11-27 35.00 03
1 1 2017-11-20 15.00 08
1 2 2018-03-15 30.00 17
1 3 2018-04-03 200.00 05
I'm able to calculate the average sales by SOID and SOLINEID:
SELECT SUM(SALESAMOUNT) / COUNT(DISTINCT SOID) AS 'Total Sales per Order ($)',
SUM(SALESAMOUNT) / COUNT(SOLINEID) AS 'Total Sales per Line ($)'
FROM ORDER
This seems to provide a perfectly good answer, but I was then given an additional constraint, that this count be done by year and month. I thought I could simply add
GROUP BY YEAR(INVOICEDATE), MONTH(INVOICEDATE)
But this aggregates the SOID and then performs the COUNT(DISTINCT SOID). This becomes a problem with SOIDs that appears across multiple months, which is fairly common since we invoice upon shipment.
I want to get something like this:
Year Month Total Sales Per Order Total Sales Per Line
2018 11 0.00
The sore thumb sticking out is that I need some way of defining in which month and year an SOID will be aggregated if it spans across multiple ones; for that purpose, I'd use MAX(INVOICEDATE).
From there, however, I'm just not sure how to tackle this. WITH? A subquery? Something else? I would appreciate any help, even if it's just pointing in the right direction.
You should select YEAR() and MONTH() of INVOICEDATE and group by them:
SELECT YEAR(INVOICEDATE) year
, MONTH(INVOICEDATE) month
, SUM(SALESAMOUNT) / COUNT(DISTINCT SOID) AS 'Total Sales per Order ($)'
, SUM(SALESAMOUNT) / COUNT(SOLINEID) AS 'Total Sales per Line ($)'
FROM ORDER
GROUP BY YEAR(INVOICEDATE), MONTH(INVOICEDATE)
Here is a query that assigns each SOID to the month of its MAX(INVOICEDATE), followed by its results. Note that the data sample does not have enough rows to show multiple months.
SELECT
mDateYYYY,
mDateMM,
SUM(SALESAMOUNT) / COUNT(DISTINCT t1.SOID) AS 'Total Sales per Order ($)',
SUM(SALESAMOUNT) / COUNT(SOLINEID) AS 'Total Sales per Line ($)'
FROM DCORDER as t1
left join
(Select
SOID
,Year(max(INVOICEDATE)) as mDateYYYY
,Month(max(INVOICEDATE)) as mDateMM
From DCOrder
Group By SOID
) as t2
On t1.SOID = t2.SOID
Group by mDateYYYY, mDateMM
mDateYYYY mDateMM Total Sales per Order ($) Total Sales per Line ($)
2018 12 87.50 58.33
With the updated 12/20 data, I used a new query (not shown above) that still relies on MAX(INVOICEDATE) and excludes AC=17:
YYYY MM Total Sales per Order ($) Total Sales per Line ($)
2017 11 35.00 35.00
2018 1 100.00 50.00
2018 4 215.00 107.50
2018 12 150.00 75.00
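The approach of assigning each SOID to the month of its latest invoice can be sketched as follows, using Python's sqlite3 module. The table name orders (chosen to avoid the reserved word ORDER), the function name, and the optional skip_ac filter are assumptions for illustration. The answer does not show its AC=17 query, so the filter here is only one plausible reading of it.

```python
import sqlite3

# Sample rows from the question: (SOID, SOLINEID, INVOICEDATE, SALESAMOUNT, AC).
ORDERS = [
    (5, 1, "2018-11-30", 100.0, "01"), (5, 2, "2018-12-05", 50.0, "02"),
    (4, 1, "2018-12-12", 25.0, "17"),
    (3, 1, "2017-12-31", 75.0, "03"), (3, 2, "2018-01-03", 25.0, "05"),
    (2, 1, "2017-11-25", 100.0, "17"), (2, 2, "2017-11-27", 35.0, "03"),
    (1, 1, "2017-11-20", 15.0, "08"), (1, 2, "2018-03-15", 30.0, "17"),
    (1, 3, "2018-04-03", 200.0, "05"),
]

QUERY = """
WITH kept AS (                      -- optionally drop one AC code (assumption)
    SELECT * FROM orders
    WHERE :skip_ac IS NULL OR ac <> :skip_ac
),
final_month AS (                    -- one row per order: month of its latest invoice
    SELECT soid,
           CAST(strftime('%Y', MAX(invoicedate)) AS INTEGER) AS yyyy,
           CAST(strftime('%m', MAX(invoicedate)) AS INTEGER) AS mm
    FROM kept
    GROUP BY soid
)
SELECT f.yyyy, f.mm,
       ROUND(SUM(k.salesamount) / COUNT(DISTINCT k.soid), 2) AS per_order,
       ROUND(SUM(k.salesamount) / COUNT(k.solineid), 2)      AS per_line
FROM kept AS k
JOIN final_month AS f ON f.soid = k.soid
GROUP BY f.yyyy, f.mm
ORDER BY f.yyyy, f.mm
"""

def sales_by_final_month(rows, skip_ac=None):
    """Aggregate every line of an order into the month of that order's last invoice."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "soid INTEGER, solineid INTEGER, invoicedate TEXT, salesamount REAL, ac TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY, {"skip_ac": skip_ac}).fetchall()
    conn.close()
    return result
```

Because every line of an order lands in a single month, COUNT(DISTINCT soid) no longer counts the same order in several months. Running it with skip_ac="17" gives the same four rows as the table posted above.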

SQL Creating a cumulative sum column in a table by a specific order

I apologize for the confusing title. I am dealing with an issue this morning that I thought I solved with everyone's help here but I can't do what I originally had hoped with just the master_line_num. Once again, below is a small subset of the data I am working with:
ID Proj_Id Year Quarter Value **Cumu_Value** Master_Line_Num
1 "C102" 2017 1 200.00 **200.00** 1
2 "C102" 2017 2 200.00 **400.00** 2
3 "C102" 2017 3 200.00 **600.00** 3
4 "C102" 2017 4 200.00 **800.00** 4
5 "C102" 2018 1 400.00 **1200.00** 5
6 "C102" 2018 2 400.00 **1600.00** 6
7 "C102" 2018 3 400.00 **2000.00** 7
8 "C102" 2018 4 400.00 **2400.00** 8
9 "B123" 2017 1 100.00 **100.00** 1
10 "B123" 2017 2 100.00 **200.00** 2
11 "B123" 2017 3 100.00 **300.00** 3
12 "B123" 2017 4 100.00 **400.00** 4
13 "B123" 2018 1 200.00 **600.00** 5
14 "B123" 2018 2 200.00 **800.00** 6
15 "B123" 2018 3 200.00 **1000.00** 7
16 "B123" 2018 4 200.00 **1200.00** 8
The desired values I am trying to get is the "Cumu_Value" column. I am trying to get those values by adding up the "value" column by year, by quarter for a specific "Proj_Id". I originally just tried to multiply the "value" column by the master_line_num column after getting that but then realized that it doesn't work due to the "value" column changing between years.
Is it possible to calculate this with T-SQL or do I need to do something more extravagant?
SQL supports the cumulative sum as a window function, so this is easy to express:
select . . . ,
sum(value) over (partition by proj_id order by year, quarter) as cumulative_sum
You need a Windowed Aggregate, this will return a Cumulative Sum:
sum(value)
over (partition by proj_id
order by Year, Quarter
rows unbounded preceding)
Caution: don't omit the ROWS clause, because the frame then defaults to RANGE, which might return a different result and has more overhead. RANGE includes all rows with the same ORDER BY value as the current row, so it differs from ROWS whenever the ordering has ties. For example, if you ordered by Year alone, the first project would return:
800
800
800
800
2400
2400
2400
2400
Edit:
After checking your other question I noticed that you don't have a Master_Line_Num in your data, so you better use ORDER BY Year, Quarter instead.
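To see the ROWS versus RANGE difference concretely, the following sketch computes both frames side by side with Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later) instead of SQL Server, and the table name proj_values and the helper names are invented for the example.

```python
import sqlite3

def build_rows():
    """Recreate the question's 16 sample rows as (id, proj_id, year, quarter, value)."""
    rows = []
    for proj_id, yearly_values in (("C102", {2017: 200.0, 2018: 400.0}),
                                   ("B123", {2017: 100.0, 2018: 200.0})):
        for year, value in yearly_values.items():
            for quarter in (1, 2, 3, 4):
                rows.append((len(rows) + 1, proj_id, year, quarter, value))
    return rows

def running_totals(proj_id):
    """Return [(rows_frame_sum, range_frame_sum_by_year), ...] for one project."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE proj_values ("
        "id INTEGER, proj_id TEXT, year INTEGER, quarter INTEGER, value REAL)"
    )
    conn.executemany("INSERT INTO proj_values VALUES (?, ?, ?, ?, ?)", build_rows())
    result = conn.execute(
        """
        SELECT SUM(value) OVER (PARTITION BY proj_id
                                ORDER BY year, quarter
                                ROWS UNBOUNDED PRECEDING) AS cumu_rows,
               -- no frame clause: default RANGE frame, ties on year are summed together
               SUM(value) OVER (PARTITION BY proj_id
                                ORDER BY year) AS cumu_range_by_year
        FROM proj_values
        WHERE proj_id = ?
        ORDER BY year, quarter
        """,
        (proj_id,),
    ).fetchall()
    conn.close()
    return result
```

The first column reproduces the desired Cumu_Value, while the second shows the plateau effect that appears when the ORDER BY columns do not uniquely order the rows within a partition.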
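You can also try a self-join, matching each row to all earlier rows of the same project:
select t1.id, t1.proj_ID, t1.Year, t1.Quarter, t1.Value, SUM(t2.Value) as Cumu_sum, t1.Master_Line_Num
from #tablename t1
inner join #tablename t2 on t1.proj_ID = t2.proj_ID and t1.id >= t2.id
group by t1.id, t1.proj_ID, t1.Year, t1.Quarter, t1.Value, t1.Master_Line_Num
order by t1.id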

sql server sum aggregate functions

suppose I have this record
empcode net year month
602256 3479.97 2014 1
602256 33125.98 2014 1
602256 5247.11 2014 2
602256 7698.39 2014 2
602256 2941.46 2013 3
602256 5515.57 2014 3
602256 5758.68 2014 3
602256 4966.89 2013 4
602256 4984.06 2013 4
602256 5951.63 2014 4
602256 19861.04 2014 4
What I want is to sum the net values that share the same year and month. I hope you can help me. Thank you in advance.
You should use the GROUP BY clause:
SELECT SUM(net) FROM table GROUP BY [year],[month]
It's very simple with the GROUP BY clause. Also, whatever columns you list in the GROUP BY clause can be included in the SELECT list, as below, which makes it clearer what each sum refers to.
SELECT year, month, SUM(net) FROM table GROUP BY [year],[month]
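As a quick check of the grouped sum on the sample data, here is a small sketch using Python's sqlite3 module. The table name payroll is a placeholder chosen for the example (the answers use the generic name table), and the added ORDER BY is only there to make the output order predictable.

```python
import sqlite3

# Sample rows from the question: (empcode, net, year, month).
PAYROLL = [
    (602256, 3479.97, 2014, 1), (602256, 33125.98, 2014, 1),
    (602256, 5247.11, 2014, 2), (602256, 7698.39, 2014, 2),
    (602256, 2941.46, 2013, 3), (602256, 5515.57, 2014, 3),
    (602256, 5758.68, 2014, 3), (602256, 4966.89, 2013, 4),
    (602256, 4984.06, 2013, 4), (602256, 5951.63, 2014, 4),
    (602256, 19861.04, 2014, 4),
]

def net_by_month(rows):
    """Return [(year, month, total_net), ...] ordered by year and month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payroll (empcode INTEGER, net REAL, year INTEGER, month INTEGER)")
    conn.executemany("INSERT INTO payroll VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT year, month, SUM(net) AS total_net
        FROM payroll
        GROUP BY year, month
        ORDER BY year, month
        """
    ).fetchall()
    conn.close()
    return result
```

If totals are needed per employee as well, empcode would have to be added to both the SELECT list and the GROUP BY clause.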