GROUP BY clause in SQL

Year Brand Amount
2018 Apple 45000
2019 Apple 35000
2020 Apple 75000
2018 Samsung 15000
2019 Samsung 20000
2020 Samsung 95000
2018 Nokia 21000
2019 Nokia 17000
2020 Nokia 14000
I want the expected output to be like:
Year Brand Amount
2018 Apple 45000
2019 Apple 35000
2020 Samsung 95000
THIS IS WHAT I TRIED:
Select Year, Brand, Max(Amount)as HighestPrice
from Practice
Group by Year
but it shows error:
"Column 'Practice.Brand' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause."
I would highly appreciate the help. Thanks

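The error happens because every non-aggregated column in the select list (here Brand) must also appear in the GROUP BY. Adding Brand to the GROUP BY would silence the error, but it would return one row per year and brand rather than the top row per year. I would instead write a standard select and filter on the max amount with a subquery correlated on year.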
SELECT YEAR
,BRAND
,Amount AS HighestPrice
FROM Practice B
WHERE Amount = (SELECT MAX(Amount) FROM Practice A
WHERE A.YEAR = B.YEAR)
ORDER BY YEAR ASC
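To check the correlated-subquery approach against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (the question's error message suggests SQL Server); the query itself is unchanged.

```python
import sqlite3

# In-memory copy of the question's Practice table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Practice (Year INTEGER, Brand TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Practice VALUES (?, ?, ?)",
    [
        (2018, "Apple", 45000), (2019, "Apple", 35000), (2020, "Apple", 75000),
        (2018, "Samsung", 15000), (2019, "Samsung", 20000), (2020, "Samsung", 95000),
        (2018, "Nokia", 21000), (2019, "Nokia", 17000), (2020, "Nokia", 14000),
    ],
)

# Keep each row whose Amount equals the maximum Amount of its own year.
top_per_year = conn.execute(
    """
    SELECT Year, Brand, Amount AS HighestPrice
    FROM Practice B
    WHERE Amount = (SELECT MAX(Amount) FROM Practice A WHERE A.Year = B.Year)
    ORDER BY Year
    """
).fetchall()

for row in top_per_year:
    print(row)
# (2018, 'Apple', 45000)
# (2019, 'Apple', 35000)
# (2020, 'Samsung', 95000)
```

If two brands tie for the maximum in a year, this query returns both rows.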

A generic SQL version would be:
select p.Year, p.Brand, p.Amount as HighestPrice
from Practice p
join (
select Year, max(Amount) as Amount
from Practice
group by Year
) as m on m.Year = p.Year and m.Amount = p.Amount

You can use a window function with PARTITION BY. An example for SQL Server (SSMS):
SELECT Year, Brand, Amount as HighestPrice
FROM(
SELECT
Year, Brand, Amount,
RANK()OVER (PARTITION BY year ORDER BY Amount DESC) AS rn
FROM Practice
) a
WHERE rn = 1
ORDER BY year
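One thing to keep in mind with the window-function approach is how ties are handled. The sketch below uses hypothetical rows (Apple and Nokia share the 2018 maximum, which is not the case in the question's data) and runs in SQLite via Python's sqlite3, assuming SQLite 3.25 or later for window functions. RANK() keeps both tied rows, while ROW_NUMBER() with a tie-breaker keeps exactly one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Practice (Year INTEGER, Brand TEXT, Amount INTEGER)")
# Hypothetical rows: Apple and Nokia tie for the 2018 maximum.
conn.executemany(
    "INSERT INTO Practice VALUES (?, ?, ?)",
    [
        (2018, "Apple", 45000), (2018, "Nokia", 45000), (2018, "Samsung", 15000),
        (2019, "Apple", 35000), (2019, "Samsung", 20000),
    ],
)

# RANK() gives tied rows the same rank, so both 2018 winners survive rn = 1.
rank_rows = conn.execute(
    """
    SELECT Year, Brand, Amount
    FROM (SELECT Year, Brand, Amount,
                 RANK() OVER (PARTITION BY Year ORDER BY Amount DESC) AS rn
          FROM Practice)
    WHERE rn = 1
    ORDER BY Year, Brand
    """
).fetchall()

# ROW_NUMBER() numbers rows uniquely; Brand is added as a tie-breaker
# so that the surviving row is deterministic.
row_number_rows = conn.execute(
    """
    SELECT Year, Brand, Amount
    FROM (SELECT Year, Brand, Amount,
                 ROW_NUMBER() OVER (PARTITION BY Year
                                    ORDER BY Amount DESC, Brand) AS rn
          FROM Practice)
    WHERE rn = 1
    ORDER BY Year
    """
).fetchall()

print(rank_rows)        # three rows: both 2018 winners plus 2019
print(row_number_rows)  # two rows: one per year
```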

You don't say which database you are using, so I'll assume it's PostgreSQL. You can use DISTINCT ON to get the row with the max amount per year.
For example:
select distinct on (year) * from practice order by year, amount desc

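Or you can use cross apply (SQL Server). The ORDER BY inside the applied subquery is dropped because it is not allowed there without TOP, and the inner table gets its own alias so the correlation on year is explicit:
select a.year, a.brand, b.amount
from yourtable a
cross apply (select max(t.amount) as amount from yourtable t
where t.year = a.year
group by t.year
having max(t.amount) = a.amount) b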

Try this :
https://dbfiddle.uk/Cr6kFqaE
select XX.Year,Table1.Brand ,XX.Amount from
(
select Year, Max(Amount) as Amount from Table1 group by Year
)as XX
left outer join Table1 on Table1.Year= XX.Year and Table1.Amount = XX.Amount

Related

SQL How to query customers who have made a purchase at least once a year since they first start purchasing?

I want to query customers who have made at least 10 purchases a year since they started shopping with us.
Customer 1 would not qualify because he made >10 purchases in 2018 and 2020 but 0 in 2019.
Customer 2 would qualify.
Customer 3 would not qualify because he made fewer than 10 purchases in 2018, even though he purchased in consecutive years. The time window depends on when the customer first shopped with us. What SQL query should I use to select customer 2 and exclude the others?
You can use group by and having as follows:
Select customerid
From t
Group by customerid
Having max(year) - min(year) + 1 = count(distinct year)
and min(purchase) >= 10
Presumably, you want every year up to the current year. Or, at least, up to the maximum year in the data.
select t.customerid
from (select t.*, max(year) over () as max_year
from t
) t
group by customerid, max_year
having max_year - min(year) + 1 = count(distinct year) and
min(purchase) >= 10;
This would disallow a customer who made more than 10 purchases in 2017 and 2018 but has no data in 2019 or 2020.
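The difference between the two HAVING conditions can be seen on a small hypothetical data set. The sketch below assumes a table t(customerid, year, purchase) with one row per customer and year, which the question does not confirm, and invents a customer 4 who bought in 2017 and 2018 and then stopped. It runs in SQLite via Python's sqlite3 (3.25 or later for the window function).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customerid INTEGER, year INTEGER, purchase INTEGER)")
# Hypothetical rows: one row per customer and year, purchase = number of purchases.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 2018, 12), (1, 2020, 15),                 # gap in 2019
        (2, 2018, 11), (2, 2019, 10), (2, 2020, 14),  # qualifies
        (3, 2018, 5), (3, 2019, 12), (3, 2020, 13),   # too few in 2018
        (4, 2017, 20), (4, 2018, 30),                 # stopped after 2018
    ],
)

# Variant 1: no gaps between the customer's own first and last year.
no_gaps = conn.execute(
    """
    SELECT customerid
    FROM t
    GROUP BY customerid
    HAVING MAX(year) - MIN(year) + 1 = COUNT(DISTINCT year)
       AND MIN(purchase) >= 10
    ORDER BY customerid
    """
).fetchall()

# Variant 2: no gaps between the first year and the latest year in the data.
up_to_max_year = conn.execute(
    """
    SELECT customerid
    FROM (SELECT t.*, MAX(year) OVER () AS max_year FROM t) AS t
    GROUP BY customerid, max_year
    HAVING max_year - MIN(year) + 1 = COUNT(DISTINCT year)
       AND MIN(purchase) >= 10
    ORDER BY customerid
    """
).fetchall()

print(no_gaps)         # [(2,), (4,)]
print(up_to_max_year)  # [(2,)]
```

Customer 4 passes the first check because his own years are contiguous, but fails the second because he has no rows for 2019 and 2020.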
In MySQL you could try:
SELECT * FROM (
SELECT t.CustomerID, COUNT(*) AS yearsPurchasing, t2.firstYear
FROM customers t
JOIN (
SELECT CustomerID, MIN(YEAR) AS firstYear
FROM customers
GROUP BY CustomerID
) t2 ON t2.CustomerID = t.CustomerID
WHERE t.Purchase >= 10
GROUP BY t.CustomerID
) r WHERE 2020 + 1 = firstYear + yearsPurchasing
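Here 2020 is the last year in which the customer must have purchased in order to qualify. To use the current year instead, replace 2020 with YEAR(NOW()); to require purchases only up to last year, replace 2020 + 1 with YEAR(NOW()). Note that COUNT(*) counts years only if there is one row per customer and year. Tell me if it works.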

Calculate profit of successive years by adding profit of previous year

Sample data and the expected result are provided in the image:
We need to add the profit of the previous year to that of the following year and display the data in the format given in the image.
Please help me with the SQL query to solve this problem.
You can also write this using the window function.
SELECT
Year,
SUM(Profit) OVER(ORDER BY Year) AS Total_Profit
FROM your_table
ORDER BY Year
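As a quick check of the window-function version, the sketch below runs it in SQLite through Python's sqlite3 (3.25 or later). The image from the question is not available, so the four rows are the sample values used elsewhere in this thread, and your_table is the placeholder name from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Year INTEGER, Profit INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(2015, 1000), (2016, 2000), (2017, 500), (2018, 1000)],
)

# SUM(...) OVER (ORDER BY Year) adds up every row from the first year
# through the current one, which gives the cumulative profit.
running = conn.execute(
    """
    SELECT Year, SUM(Profit) OVER (ORDER BY Year) AS Total_Profit
    FROM your_table
    ORDER BY Year
    """
).fetchall()

for row in running:
    print(row)
# (2015, 1000)
# (2016, 3000)
# (2017, 3500)
# (2018, 4500)
```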
This is probably the world's simplest recursive CTE which you could have googled.
But here it is:
declare @years table(y int, p int)
insert @years values (2015,1000),(2016,2000),(2017,500),(2018,1000)
; with cumulative as
(
-- anchor member: the earliest year
select y, p from @years where y = (select min(y) from @years)
union all
-- recursive member: add the running total to the next year's profit
select y.y, y.p+c.p
from @years y
join cumulative c on y.y=c.y+1
)
select * from cumulative
Result:
y p
2015 1000
2016 3000
2017 3500
2018 4500
Use Sum over partition :
WITH V1 AS (
SELECT 2015 AS YEAR, 1000 AS PROFIT FROM DUAL
UNION ALL SELECT 2016 AS YEAR, 2000 AS PROFIT FROM DUAL
UNION ALL SELECT 2017 AS YEAR, 500 AS PROFIT FROM DUAL
UNION ALL SELECT 2018 AS YEAR, 1000 AS PROFIT FROM DUAL
)
SELECT
V1.YEAR
, PROFIT --- You can comment it if not needed
, SUM(PROFIT) OVER (PARTITION BY 1 ORDER BY YEAR RANGE UNBOUNDED PRECEDING) AS PROFIT_CUM
FROM V1;

SQL monthly rolling sum

I am trying to calculate monthly balances of bank accounts from the following postgresql table, containing transactions:
# \d transactions
              View "public.transactions"
 Column |       Type       | Collation | Nullable | Default
--------+------------------+-----------+----------+---------
 year   | double precision |           |          |
 month  | double precision |           |          |
 bank   | text             |           |          |
 amount | numeric          |           |          |
By "rolling sum" I mean that each value should be the sum of all transactions from the beginning of time until the end of the given month, not just the transactions within the given month.
I came up with the following query:
select
a.year, a.month, a.bank,
(select sum(b.amount) from transactions b
where b.year < a.year
or (b.year = a.year and b.month <= a.month))
from
transactions a
order by
bank, year, month;
The problem is that the result contains one row per transaction: a bank with several transactions in a month gets several rows for that month, and a month without transactions gets no row at all.
I would like a query which contains exactly one row for each bank and month for the whole time interval including the first and last transaction.
How to do that?
An example dataset and a query can be found at https://rextester.com/WJP53830 , courtesy of @a_horse_with_no_name
You need to generate a list of months first, then you can outer join your transactions table to that list.
with all_years as (
select y.year, m.month, b.bank
from generate_series(2010, 2019) as y(year) --<< adjust here for your desired range of years
cross join generate_series(1,12) as m(month)
cross join (select distinct bank from transactions) as b(bank)
)
select ay.year, ay.month, ay.bank,
sum(sum(t.amount)) over (partition by ay.bank order by ay.year, ay.month) as balance
from all_years ay
left join transactions t on (ay.year, ay.month, ay.bank) = (t.year::int, t.month::int, t.bank)
group by ay.year, ay.month, ay.bank --<< one row per bank and month, even with several transactions
order by ay.bank, ay.year, ay.month;
The cross join with all banks is necessary so that the all_years CTE will also contain a bank for each month row.
Online example: https://rextester.com/ZZBVM16426
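Here is my suggestion in Oracle 10 SQL. The inner query aggregates per bank and month (using its own alias c), and the outer query lists each bank and month once:
select a.year, a.month, a.bank, (select sum(b.amount) from
(select c.year as year, c.month as month, c.bank as bank,
sum(c.amount) as amount from transactions c
group by c.year, c.month, c.bank
) b
where b.bank = a.bank
and (b.year<a.year or (b.year=a.year and b.month<=a.month))) as balance
from (select distinct year, month, bank from transactions) a
order by a.bank, a.year, a.month;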
Consider aggregating all transactions first by bank and month, then run a window SUM() OVER() for rolling monthly sum since earliest amount.
WITH agg AS (
SELECT t.year, t.month, t.bank, SUM(t.amount) AS Sum_Amount
FROM transactions t
GROUP BY t.year, t.month, t.bank
)
SELECT agg.year, agg.month, agg.bank,
SUM(agg.Sum_Amount) OVER (PARTITION BY agg.bank ORDER BY agg.year, agg.month) AS rolling_sum
FROM agg
ORDER BY agg.year, agg.month, agg.bank
Should you want YTD rolling sums, adjust the OVER() clause by adding year to partition:
SUM(agg.Sum_Amount) OVER (PARTITION BY agg.bank, agg.year ORDER BY agg.month)
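The aggregate-then-window approach can be tried on a few hypothetical transactions. This sketch uses SQLite through Python's sqlite3 (3.25 or later) rather than PostgreSQL, with integer amounts for readability. Bank A has two transactions in January, which collapse into one row, and bank B has nothing in February, so that month is simply absent because this approach does not fill gaps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (year INTEGER, month INTEGER, bank TEXT, amount INTEGER)"
)
# Hypothetical transactions, several per month for bank A.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        (2019, 1, "A", 100), (2019, 1, "A", 50), (2019, 2, "A", -30),
        (2019, 1, "B", 200), (2019, 3, "B", 25),
    ],
)

# Step 1 (agg): one row per bank and month.
# Step 2: running total of those monthly sums within each bank.
balances = conn.execute(
    """
    WITH agg AS (
        SELECT t.year, t.month, t.bank, SUM(t.amount) AS Sum_Amount
        FROM transactions t
        GROUP BY t.year, t.month, t.bank
    )
    SELECT agg.year, agg.month, agg.bank,
           SUM(agg.Sum_Amount) OVER (PARTITION BY agg.bank
                                     ORDER BY agg.year, agg.month) AS rolling_sum
    FROM agg
    ORDER BY agg.bank, agg.year, agg.month
    """
).fetchall()

for row in balances:
    print(row)
# (2019, 1, 'A', 150)
# (2019, 2, 'A', 120)
# (2019, 1, 'B', 200)
# (2019, 3, 'B', 225)
```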

Output top 3 most profitable products every quarter

I'm trying to output the top 3 products per quarter, which should give 12 rows in total (3 products for each of the 4 quarters).
The closest I have come is the query below; I have no idea how to partition it by quarter.
SELECT * FROM (SELECT QUARTER, PRODUCT_NAME, SUM(QUANTITY) "QTY_SOLD", SALES, SUM(PROFIT) "PROFIT_GENERATED" FROM DELIVERIES_FACT
WHERE EXTRACT(YEAR from SHIP_DATE) = 2015 GROUP BY QUARTER, PRODUCT_NAME, SALES ORDER BY "PROFIT_GENERATED" DESC)
WHERE rownum <= 3
getting an output of
I've written this SQL extracting the calendar quarter from SHIP_DATE; you can adjust as needed.
Similarly, RANK(), ROW_NUMBER(), and DENSE_RANK() all are different; you may wish to experiment with each analytical function to see which best fits your data and handles ties the way you want them to.
SELECT *
FROM (SELECT RANK() OVER (PARTITION BY SHIP_QUARTER
ORDER BY PROFIT_GENERATED desc) AS PROFIT_RANK_BY_Q,
ORIG.*
FROM
-- Oracle's EXTRACT has no QUARTER field, so TO_CHAR(..., 'Q') is used instead
(SELECT TO_NUMBER(TO_CHAR(SHIP_DATE, 'Q')) AS SHIP_QUARTER,
PRODUCT_NAME,
SUM(QUANTITY) "QTY_SOLD", SALES, SUM(PROFIT) "PROFIT_GENERATED"
FROM DELIVERIES_FACT
WHERE EXTRACT(YEAR from SHIP_DATE) = 2015
GROUP BY TO_NUMBER(TO_CHAR(SHIP_DATE, 'Q')), PRODUCT_NAME, SALES
) ORIG
)
WHERE PROFIT_RANK_BY_Q <= 3
order by SHIP_QUARTER, PROFIT_RANK_BY_Q
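The same aggregate-then-rank pattern, reduced to its core, can be tried in SQLite through Python's sqlite3 (3.25 or later). The deliveries table, its columns, and its rows below are hypothetical stand-ins for DELIVERIES_FACT, with the quarter stored directly so that no date function is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deliveries (quarter INTEGER, product_name TEXT, profit INTEGER)")
# Hypothetical delivery rows for two quarters and four products.
conn.executemany(
    "INSERT INTO deliveries VALUES (?, ?, ?)",
    [
        (1, "A", 100), (1, "A", 50), (1, "B", 120), (1, "C", 90), (1, "D", 10),
        (2, "A", 20), (2, "B", 300), (2, "C", 40), (2, "D", 60), (2, "D", 5),
    ],
)

# Inner query: total profit per quarter and product.
# Middle query: rank those totals within each quarter.
# Outer filter: keep ranks 1 to 3.
top3 = conn.execute(
    """
    SELECT quarter, product_name, profit_generated, profit_rank
    FROM (SELECT orig.*,
                 RANK() OVER (PARTITION BY quarter
                              ORDER BY profit_generated DESC) AS profit_rank
          FROM (SELECT quarter, product_name, SUM(profit) AS profit_generated
                FROM deliveries
                GROUP BY quarter, product_name) AS orig)
    WHERE profit_rank <= 3
    ORDER BY quarter, profit_rank
    """
).fetchall()

for row in top3:
    print(row)
# (1, 'A', 150, 1)
# (1, 'B', 120, 2)
# (1, 'C', 90, 3)
# (2, 'B', 300, 1)
# (2, 'D', 65, 2)
# (2, 'C', 40, 3)
```

With RANK(), a tie at third place would return more than three rows for that quarter; ROW_NUMBER() would cap it at exactly three.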

How to calculate average value based on duration between measurements?

I have data similar to this:
Price DateChanged Product
10 2012-01-01 A
12 2012-02-01 A
30 2012-03-01 A
10 2012-09-01 A
12 2013-01-01 A
110 2012-01-01 B
112 2012-02-01 B
130 2012-03-01 B
110 2012-09-01 B
112 2013-01-01 B
I want to calculate average value, but the challenge is this:
Look at the first record, price 10 is valid for a duration of one month, price 12 is valid for a duration of one month while price 30 is valid for a duration of six months.
So a plain average for product A, (10+12+30+10+12)/5, gives 14.8, whereas taking duration into account gives an average price of ~20.1.
What is the best approach to solve this?
I know I could create a sub-query with a row_number() to join against to calculate a duration, but is there a better way? SQL Server has powerful features like STDistance, so surely there is a function for this?
What you are looking for is called a weighted average, and AFAIK there is no built-in function in SQL Server that calculates it for you. However, it is not that hard to calculate by hand.
First, you need to find the weight of each data point, in this case, you need to find the duration of each price period. You might have some additional columns in your data that could enable easier lookup, but you could do it like this as well:
SELECT p1.Product, p1.Price, p1.DateChanged AS DateStart,
isnull(min(p2.DateChanged),getdate()) AS DateEnd
INTO #PricePlanStartEnd
FROM PricePlan p1
LEFT OUTER JOIN PricePlan p2
ON p1.DateChanged < p2.DateChanged
AND p1.Product =p2.Product
GROUP BY p1.Product, p1.Price, p1.DateChanged
ORDER BY p1.Product, p1.DateChanged
This creates a #PricePlanStartEnd temporary table that has the start and the end of each price period. I've used getdate() as the end of the current time period. If you need to just calculate an average up to the last price change, just use INNER JOIN instead of the LEFT OUTER JOIN.
After that you just need to divide the sum of (price * period) by the total length of the period, and get the answer.
Here is an SQL Fiddle with the calculation
Also, when you're working with months, remember that not all months have the same length, so a price that was active in December counted for more days than one that was active in February.
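To make the arithmetic concrete, here is a small Python sketch of the weighted average for product A, using the dates and prices from the question. It only considers closed periods (the last price has no end date and is left out), and the helper names are made up for illustration. Weighting by months reproduces the ~20.1 from the question; weighting by days gives a slightly different figure because months differ in length.

```python
from datetime import date

# Product A's price changes from the question: (DateChanged, Price).
changes = [
    (date(2012, 1, 1), 10),
    (date(2012, 2, 1), 12),
    (date(2012, 3, 1), 30),
    (date(2012, 9, 1), 10),
    (date(2013, 1, 1), 12),
]

def months_between(start, end):
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def days_between(start, end):
    """Calendar days from start to end."""
    return (end - start).days

def weighted_average(changes, weight):
    """sum(price * duration) / sum(duration) over consecutive change dates."""
    weighted_sum = 0
    total_weight = 0
    # Pair each price with the date of the next change.
    for (start, price), (end, _next_price) in zip(changes, changes[1:]):
        w = weight(start, end)
        weighted_sum += price * w
        total_weight += w
    return weighted_sum / total_weight

by_month = weighted_average(changes, months_between)
by_day = weighted_average(changes, days_between)

print(round(by_month, 2))  # 20.17  (242 / 12)
print(round(by_day, 2))    # 20.21  (7398 / 366)
```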
Using CTE and row_number() to get monthly average up to the last dateChanged. Fiddle-Demo
;with cte as (
select product, dateChanged, price,
row_number() over (partition by product order by datechanged) rn
from x
)
select t1.product,
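sum(t1.price *1.0 * datediff(month, t1.dateChanged,t2.dateChanged))
/ sum(datediff(month, t1.dateChanged,t2.dateChanged)) monthlyAvg -- divide by the months actually covered (12 here)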
from cte t1 join cte t2 on t1.product = t2.product
and t1.rn +1 = t2.rn
group by t1.product
--Results
Product MonthlyAvg
A 20.166666
B 120.166666
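OR if you need an up-to-date daily average, use a LEFT JOIN so that the latest price runs until today (Fiddle-Demo). Note that dividing by 365 assumes the prices span about one year, and the results depend on the day the query is run: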
;with cte as (
select product, dateChanged, price,
row_number() over (partition by product order by datechanged) rn
from x
)
select t1.product,
sum(t1.price *1.0 *
datediff(day, t1.dateChanged,isnull(t2.dateChanged,getdate())))/365 dailyAvg
from cte t1 left join cte t2 on t1.product = t2.product
and t1.rn +1 = t2.rn
group by t1.product
--Results
product dailyAvg
A 21.386301
B 130.975342