How to write a query for the previous year's total sales - sql

I have a table named Sales.OrderValues that contains two columns, orderdate and val (total sales per day).
This is a snippet of the records (I can't show all of them because there are 830 rows)
I want to show the result like this
But, my output is different with my expected output.
As you can see, the expected output of prevtotalsales in 2008 is 617085.30. But my output is 825169.29 (which is 208083.99 + 617085.30).
Below is my query
SELECT
YEAR(D1.orderdate) AS orderyear,
SUM(D1.val) AS curtotalsales,
(
SELECT
SUM(D2.val)
FROM
Sales.OrderValues D2
WHERE
YEAR(D1.orderdate) > YEAR(D2.orderdate)
)
AS prevtotalsales
FROM
Sales.OrderValues D1
GROUP BY
YEAR(D1.orderdate);
How can I show only the previous year's total sales, without adding the totals of earlier years as well?

Basically, you want an equality condition in the WHERE clause of the subquery. This:
WHERE YEAR(D1.orderdate) > YEAR(D2.orderdate)
Should be:
WHERE YEAR(D1.orderdate) = YEAR(D2.orderdate) + 1
But it is much simpler and more efficient to just use lag():
SELECT
YEAR(orderdate) AS orderyear,
SUM(val) AS curtotalsales,
LAG(SUM(val)) OVER(ORDER BY YEAR(orderdate)) AS prevtotalsales
FROM Sales.OrderValues
GROUP BY YEAR(orderdate)
ORDER BY orderyear
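The lag() approach can be tried on a few rows with a minimal sketch, assuming Python's built-in sqlite3 module. The table contents are made up for illustration. SQLite has no YEAR(), so strftime('%Y', ...) stands in for it, and the yearly totals are computed in a CTE before LAG is applied:

```python
import sqlite3

# Hypothetical miniature of Sales.OrderValues; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderValues (orderdate TEXT, val REAL)")
conn.executemany(
    "INSERT INTO OrderValues VALUES (?, ?)",
    [
        ("2006-07-04", 100.0),
        ("2006-12-30", 50.0),
        ("2007-03-15", 200.0),
        ("2007-09-01", 25.0),
        ("2008-01-10", 300.0),
    ],
)

# SQLite has no YEAR(); strftime('%Y', ...) is used as a stand-in.
query = """
WITH totals AS (
    SELECT CAST(strftime('%Y', orderdate) AS INTEGER) AS orderyear,
           SUM(val) AS curtotalsales
    FROM OrderValues
    GROUP BY CAST(strftime('%Y', orderdate) AS INTEGER)
)
SELECT orderyear,
       curtotalsales,
       LAG(curtotalsales) OVER (ORDER BY orderyear) AS prevtotalsales
FROM totals
ORDER BY orderyear
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first year has no predecessor, so its prevtotalsales comes back as NULL (None in Python); every other year shows only the total of the year immediately before it.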

You need to first SUM the values per year, and then use a cumulative SUM:
WITH Totals AS(
SELECT YEAR(OV.orderdate) AS OrderYear,
SUM(OV.Val) AS YearSum
FROM Sales.OrderValues OV
GROUP BY YEAR(OV.orderdate))
SELECT OrderYear,
YearSum,
SUM(YearSum) OVER (ORDER BY OrderYear ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PreviousCumulative
FROM Totals;

Related

How do I write a query to find highest earning day per quarter?

I need to write a SQL query to pull the single highest-earning day in each quarter of 2018 for a certain brand. I have the following, but it does not pull a single day per quarter; it returns the earnings for every day.
select distinct quarter, order_event_date, max(gc) as highest_day_gc
from (
select sum(commission) as gc, order_event_date,
extract(quarter from order_event_date) as quarter
from order_table
where advertiser_id ='123'
and event_year='2018'
group by 3,2
)
group by 1,2
order by 2 DESC
You can use window functions to find the highest earning day per quarter by using rank().
select rank() over (partition by quarter order by gc desc) as rank, quarter, order_event_date, gc
from (select sum(gross_commission) gc,
order_event_date,
extract(quarter from order_event_date) quarter
from order_aggregation
where advertiser_id = '123'
and event_year = '2018'
group by order_event_date, quarter) a
You could create the query above as view and filter it by using where rank = 1.
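A minimal sketch of the rank-then-filter idea, assuming Python's built-in sqlite3 module and made-up rows. SQLite has no extract(quarter from ...), so the quarter is derived from the month number here, and a CTE plays the role of the view:

```python
import sqlite3

# Hypothetical rows for order_aggregation; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_aggregation (order_event_date TEXT, gross_commission REAL)"
)
conn.executemany(
    "INSERT INTO order_aggregation VALUES (?, ?)",
    [
        ("2018-01-05", 10.0),
        ("2018-01-05", 15.0),
        ("2018-02-10", 20.0),
        ("2018-04-01", 5.0),
        ("2018-05-20", 30.0),
        ("2018-05-20", 1.0),
    ],
)

query = """
WITH daily AS (
    -- One row per day; (month + 2) / 3 is integer division giving 1..4.
    SELECT order_event_date,
           (CAST(strftime('%m', order_event_date) AS INTEGER) + 2) / 3 AS quarter,
           SUM(gross_commission) AS gc
    FROM order_aggregation
    GROUP BY order_event_date
),
ranked AS (
    SELECT quarter, order_event_date, gc,
           RANK() OVER (PARTITION BY quarter ORDER BY gc DESC) AS rnk
    FROM daily
)
SELECT quarter, order_event_date, gc
FROM ranked
WHERE rnk = 1
ORDER BY quarter
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because RANK() assigns the same rank to ties, two days with identical totals in a quarter would both be returned; ROW_NUMBER() could be used instead if exactly one row per quarter is required.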
You could add a LIMIT clause at the end of the statement. Also, change the last ORDER BY clause to ORDER BY highest_day_gc. Something like:
SELECT DISTINCT quarter
,order_event_date
,max(gc) as highest_day_gc
FROM (SELECT sum(gross_commission) as gc
,order_event_date
,extract(quarter from order_event_date) as quarter
FROM order_aggregation
WHERE advertiser_id ='123'
AND event_year='2018'
GROUP BY 3,2) as subquery
GROUP BY 1,2
ORDER BY 3 DESC
LIMIT 1

Postgres - AVG calculation

Please refer to the below query
SELECT sum(sales) AS "Sales",
sum(discount) AS "discount",
year
FROM Sales_tbl
GROUP BY year
Now I also want to display a column for AVG(sales) that has the same value on every row and is based on the totals in the Sales column
Output
Please advise
Use AVG() as a window function:
WITH t AS (
SELECT
SUM(sales) AS sales, SUM(discount) AS discount, year
FROM tbl_sales
GROUP BY year
)
SELECT *,AVG(sales) OVER w_total
FROM t
WINDOW w_total AS (RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY year;
The frame RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is pretty much optional in this case, but it is considered a good practice to be as explicit as possible in window functions. So you're also able to write the query like this:
WITH t AS (
SELECT
SUM(sales) AS sales, SUM(discount) AS discount, year
FROM tbl_sales
GROUP BY year
)
SELECT *,AVG(sales) OVER ()
FROM t
ORDER BY year;
Demo: db<>fiddle
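For a quick local check, here is a minimal sketch assuming Python's built-in sqlite3 module and made-up rows in tbl_sales. It uses the empty OVER () form, so the same average of the yearly totals appears on every row:

```python
import sqlite3

# Hypothetical rows for tbl_sales; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_sales (year INTEGER, sales REAL, discount REAL)")
conn.executemany(
    "INSERT INTO tbl_sales VALUES (?, ?, ?)",
    [
        (2019, 100.0, 10.0),
        (2019, 200.0, 20.0),
        (2020, 500.0, 50.0),
    ],
)

query = """
WITH t AS (
    SELECT year, SUM(sales) AS sales, SUM(discount) AS discount
    FROM tbl_sales
    GROUP BY year
)
-- OVER () with no partition or order treats all rows of t as one window.
SELECT year, sales, discount, AVG(sales) OVER () AS avg_sales
FROM t
ORDER BY year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```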

Referring to a field outside of a subselect's scope

I'm working on a piece of SQL at the moment and I need to retrieve every row of a dataset with a median and an average aggregated in it.
Example
I have the following set
ID;month;value
and I would like to retrieve something like:
ID;month;value;average for this month;median for this month
without having to group my result.
So it would be something like:
SELECT ID,month,value,
(SELECT AVG(value) FROM myTable) as "myAVG"
FROM myTable
but I would need that average to be the average for that month specifically. So rows where month = "January" will have the average and median for "January", and so on.
The issue is that I did not find a way to refer to the value of month in my subquery
(SELECT AVG(value) FROM myTable)
Does someone have a clue?
P.S.: It's a Redshift database I'm working on.
You would need to select all rows from the table, and do a left join with a select statement that does group by month. This way, you would get every row, and the group by results with them for that month.
Something like this:
SELECT * FROM myTable a
LEFT JOIN
(
SELECT Month, SUM(value) AS mySum
FROM myTable
GROUP BY Month
) b
ON a.Month = b.Month
Helpful?
with myavg as
(SELECT month, AVG(value) as avgval FROM myTable group by month)
, mymed as
(select month, median(value) as medval from myTable group by month)
select m.ID, m.month, m.value, ma.avgval, mm.medval
from mytable m left join myavg ma
on m.month = ma.month
left join mymed mm
on m.month = mm.month
You can use a cte to do this. However, you need a group by on month, as you are calculating an aggregate value.
In Redshift you can use a window function.
select month,
avg(value) over
(PARTITION BY month) as avg
from myTable
order by 1;
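A minimal sketch of the window-function approach, assuming Python's built-in sqlite3 module and made-up rows. Only the average is shown because SQLite has no MEDIAN function. The window is partitioned by month with no frame clause, so every row receives the average of its whole month:

```python
import sqlite3

# Hypothetical rows for myTable; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER, month TEXT, value REAL)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (1, "January", 10.0),
        (2, "January", 20.0),
        (3, "February", 5.0),
    ],
)

query = """
SELECT ID, month, value,
       AVG(value) OVER (PARTITION BY month) AS month_avg
FROM myTable
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

No GROUP BY is needed, so all original rows are kept alongside the per-month aggregate.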

How to Increment Total Sum/Count in Row SQL

I am having trouble getting TotalAmount to increase, row by row, by the number of policies in each row.
For Example consider the following code:
SELECT
Customer.custno,
Customer.enteredDate AS 'Date Entered',
COUNT(BasicPolInfo.polid) AS 'Number of Policies',
SUM( COUNT(BasicPolInfo.polid)) over() AS TotalAmount
FROM Customer
INNER JOIN BasicPolInfo ON Customer.custid = BasicPolInfo.custid
WHERE BasicPolInfo.polid IS NOT NULL
and Customer.firstname IS NOT NULL
AND Customer.enteredDate > '1/1/79'
GROUP BY Customer.custno, Customer.firstname, Customer.lastname, Customer.entereddate
ORDER BY Customer.enteredDate ASC
What I would like is for the TotalAmount column to be a running total of Number of Policies across the customers.
ex:
21 -- date -- 6 -- 6
24 -- date -- 13 -- 19
25 -- date -- 13 -- 32
29 -- date -- 16 -- 48
I don't care about the order of custno; I am more concerned with whether the total number of policies really is 159703. There are more than 1000 rows in this result.
How can I add each row's count to the preceding running total?
In SQL Server 2012 forward you can use ROWS in an analytic/window function to get a running aggregate:
SELECT Customer.custno
, Customer.enteredDate AS 'Date Entered'
, COUNT(BasicPolInfo.polid) AS 'Number of Policies'
, SUM(COUNT(BasicPolInfo.polid)) OVER (ORDER BY Customer.custno ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS TotalAmount
FROM Customer
INNER JOIN BasicPolInfo ON Customer.custid = BasicPolInfo.custid
WHERE BasicPolInfo.polid IS NOT NULL
AND Customer.firstname IS NOT NULL
AND Customer.enteredDate > '1/1/79'
GROUP BY Customer.custno
, Customer.firstname
, Customer.lastname
, Customer.entereddate
ORDER BY Customer.enteredDate ASC
Note that while you don't care about the order, an ORDER BY is required in order to determine which rows precede the current row.
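The effect of the ROWS frame can be seen in a minimal sketch, assuming Python's built-in sqlite3 module. The table policy_counts and its contents are hypothetical and stand in for the already grouped per-customer counts:

```python
import sqlite3

# Hypothetical per-customer policy counts; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy_counts (custno INTEGER, policies INTEGER)")
conn.executemany(
    "INSERT INTO policy_counts VALUES (?, ?)",
    [(1, 2), (2, 5), (3, 1)],
)

query = """
SELECT custno,
       policies,
       SUM(policies) OVER (
           ORDER BY custno
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS TotalAmount
FROM policy_counts
ORDER BY custno
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last row's TotalAmount equals the grand total, which gives a simple way to check the overall count.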
It appears you are looking for a cumulative total.
This can be done via a CTE, by joining the table to itself, with a subquery or, as of SQL Server 2012, by using "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW" in the aggregate window function.
This can be done with any aggregate window function. You need to use
OVER (ORDER BY ______ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
*Note that you need ORDER BY to specify the order in which the rows are accumulated.
The link below to another Stack Overflow question provides some clear examples.
how to get cumulative sum

Can I limit the amount of rows to be used for a group in a GROUP BY statement

I'm having an odd problem
I have a table with the columns product_id, sales and day
Not all products have sales every day. I'd like to get the average number of sales that each product had in the last 10 days where it had sales
Usually I'd get the average like this
SELECT product_id, AVG(sales)
FROM table
GROUP BY product_id
Is there a way to limit the amount of rows to be taken into consideration for each product?
I'm afraid it's not possible but I wanted to check if someone has an idea
Update to clarify:
Product may be sold on days 1,3,5,10,15,17,20.
Since I don't want the average over all days but only over the days where the product actually got sold, doing something like
SELECT product_id, AVG(sales)
FROM table
WHERE day > '01/01/2009'
GROUP BY product_id
won't work
If you want the last 10 calendar days up to each product's most recent sale:
SELECT t.product_id, AVG(t.sales)
FROM table t
JOIN (
SELECT product_id, MAX(sales_date) as max_sales_date
FROM table
GROUP BY product_id
) t_max ON t.product_id = t_max.product_id
AND DATEDIFF(day, t.sales_date, t_max.max_sales_date) < 10
GROUP BY t.product_id;
The date difference is SQL server specific, you'd have to replace it with your server syntax for date difference functions.
To get the last 10 days when the product had any sale:
SELECT product_id, AVG(sales)
FROM (
SELECT product_id, sales, DENSE_RANK() OVER
(PARTITION BY product_id ORDER BY sales_date DESC) AS rn
FROM Table
) As t_rn
WHERE rn <= 10
GROUP BY product_id;
This assumes sales_date is a date, not a datetime. You'd have to extract the date part if the field is a datetime.
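A minimal sketch of the DENSE_RANK approach, assuming Python's built-in sqlite3 module and made-up rows. To keep the data small it keeps the last 2 sale days per product instead of 10, and the table name sales_tbl is hypothetical:

```python
import sqlite3

# Hypothetical sales rows; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_tbl (product_id INTEGER, sales_date TEXT, sales REAL)"
)
conn.executemany(
    "INSERT INTO sales_tbl VALUES (?, ?, ?)",
    [
        (1, "2009-01-01", 10.0),
        (1, "2009-01-03", 20.0),
        (1, "2009-01-05", 30.0),
        (2, "2009-01-02", 4.0),
        (2, "2009-01-04", 8.0),
    ],
)

LAST_N_DAYS = 2  # the question asks for 10; 2 keeps the example short

query = """
SELECT product_id, AVG(sales) AS avg_sales
FROM (
    SELECT product_id, sales,
           DENSE_RANK() OVER (
               PARTITION BY product_id ORDER BY sales_date DESC
           ) AS rn
    FROM sales_tbl
) AS t_rn
WHERE rn <= ?
GROUP BY product_id
ORDER BY product_id
"""
rows = conn.execute(query, (LAST_N_DAYS,)).fetchall()
for row in rows:
    print(row)
```

DENSE_RANK numbers distinct sale dates rather than rows, so several sales on the same day all count toward one of the N days.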
And finally a version that does not use window functions:
SELECT product_id, AVG(sales)
FROM Table t
WHERE sales_date IN (
SELECT TOP(10) sales_date
FROM Table s
WHERE t.product_id = s.product_id
ORDER BY sales_date DESC)
GROUP BY product_id;
Again, sales_date is assumed to be a date, not a datetime. Use other limiting syntax if TOP is not supported by your server.
Give this a whirl. The sub-query selects the last ten days of a product where there was a sale, the outer query does the aggregation.
SELECT t1.product_id, SUM(t1.sales) / COUNT(*)
FROM table t1
CROSS APPLY (
SELECT TOP 10 t2.day, t2.product_id
FROM table t2
WHERE t2.product_id = t1.product_id
ORDER BY t2.day DESC
) t2
WHERE t2.day = t1.day
GROUP BY t1.product_id
BTW: This approach uses a correlated subquery, which may not be very performant, but it should work in theory.
I'm not sure I understood correctly, but if you'd like to get the average of sales for the last 10 days for your products, you can do as follows:
SELECT Product_Id,Sum(Sales)/Count(*) FROM (SELECT Product_Id,Sales FROM Table WHERE SaleDate>=@Date) t GROUP BY Product_Id HAVING Count(*)>0
Or you can use the AVG aggregate function, which is easier:
SELECT Product_Id,AVG(Sales) FROM (SELECT Product_Id,Sales FROM Table WHERE SaleDate>=@Date) t GROUP BY Product_Id
Updated
Now I understand what you meant. As far as I know, it is not possible to do this in one query. It could be possible if we could do something like this (Northwind database):
select a.CustomerId,count(a.OrderId)
from Orders a INNER JOIN(SELECT CustomerId,OrderDate FROM Orders Order By OrderDate) AS b ON a.CustomerId=b.CustomerId GROUP BY a.CustomerId Having count(a.OrderId)<10
but you can't use ORDER BY in subqueries unless you use TOP, which is not suitable for this case. But maybe you can do it as follows:
SELECT ProductId,Sales INTO #temp FROM table Order By Day
select a.ProductId,Sum(a.Sales) /Count(a.Sales)
from table a INNER JOIN #temp AS b ON a.ProductId=b.ProductId GROUP BY a.ProductId Having count(a.Sales)<=10
If this is a table of sales transactions, then there should not be any rows in there for days on which there were no Sales. I.e., If ProductId 21 had no sales on 1 June, then this table should not have any rows with productId = 21 and day = '1 June'... Therefore you should not have to filter anything out - there should not be anything to filter out
Select ProductId, Avg(Sales) AvgSales
From Table
Group By ProductId
should work fine. So if it's not, then you have not explained the problem completely or accurately.
Also, in your question, you show Avg(Sales) in the example SQL query, but then in the text you mention "average number of sales that each product ... ". Do you want the average sales amount, or the average count of sales transactions? And do you want this average by product alone (i.e., one output value reported for each product), or do you want the average per product per day?
If you want the average per product alone, do you want it for just those sales in the ten days prior to now, or in the ten days prior to the date of the last sale for each product?
If the latter then
Select ProductId, Avg(Sales) AvgSales
From Table T
Where day > (Select Max(Day) - 10
From Table
Where ProductId = T.ProductID)
Group By ProductId
If you want the average per product alone, for just those sales in the ten days with sales prior to the date of the last sale for each product, then
Select ProductId, Avg(Sales) AvgSales
From Table T
Where (Select Count(Distinct day) From Table
Where ProductId = T.ProductID
And Day > T.Day) < 10
Group By ProductId