Running total with if clause or do while - sql

Consider the following table with 3 columns.
Use this to create a SQL query to list the top products by revenue that make up 25% of the total revenue in 2020.
(i.e. If total revenue is 1000 then list of top products that account for <= 250)
Table ProductRevenue:
Date_DD ... date(YYYY-MM-DD)
Product_Name ... varchar(250)
Revenue ... decimal(10,2)
Sample data:
Date_DD Product_Name Revenue
-------------------------------------
2020-11-30 a 100
2020-10-02 b 100
2020-07-07 c 100
2020-04-04 d 100
2020-05-05 f 50
2020-06-06 g 120
2020-05-30 h 90
2020-11-13 k 120
2020-01-30 l 120
I used the following code but don't know how to add the WHERE clause. Can anyone help?
SELECT
product_name, revenue,
SUM(revenue) OVER (ORDER BY revenue DESC, product_name) AS running_total
FROM
TABLE_PRODUCT_REVENUE
new code
select product_name, revenue, running_total from
(SELECT product_name, revenue, SUM(revenue) OVER ( ORDER BY revenue DESC, product_name) AS running_total
FROM TABLE_PRODUCT_REVENUE ) o
where running_total<(select max(running_total) from
(SELECT product_name, revenue, SUM(revenue) OVER ( ORDER BY revenue DESC, product_name) AS running_total
FROM TABLE_PRODUCT_REVENUE ) o )*0.25
group by product_name, revenue, running_total
order by running_total
I just need to know where I can add a WHERE clause such as YEAR([Date_DD]) = 2020. Can anyone help?

The question is not very detailed, but the following might help you narrow down the issue.
The query below shows the running total for each product:
SELECT
product_name, revenue,
SUM(revenue) OVER (PARTITION BY product_name ORDER BY revenue DESC, product_name) AS running_total
FROM
TABLE_PRODUCT_REVENUE
The query below flags a product once its running total reaches a given amount:
select
*, case when running_total >= 1000 then 'top selling product' else null end
from
(
SELECT
product_name, revenue,
SUM(revenue) OVER (PARTITION BY product_name ORDER BY revenue DESC, product_name) AS running_total
FROM
TABLE_PRODUCT_REVENUE
) t
where running_total >= xxx -- xxx: the threshold amount you need
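For the original question (top products that stay within 25% of 2020 revenue), one possible approach is to put the year filter inside the subquery, so that both the running total and the grand total only see 2020 rows. The sketch below is an assumption-laden illustration, not the asker's final query: it runs the SQL through Python's sqlite3 so it can be checked, uses SQLite's strftime('%Y', ...) where T-SQL would use YEAR(Date_DD) = 2020, and adds one made-up 2019 row (product z) just to show the filter working.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical 2019 row ("z")
# that the year filter should exclude.
ROWS = [
    ("2020-11-30", "a", 100), ("2020-10-02", "b", 100), ("2020-07-07", "c", 100),
    ("2020-04-04", "d", 100), ("2020-05-05", "f", 50), ("2020-06-06", "g", 120),
    ("2020-05-30", "h", 90), ("2020-11-13", "k", 120), ("2020-01-30", "l", 120),
    ("2019-12-31", "z", 500),  # hypothetical, not in the original data
]

QUERY = """
SELECT product_name, revenue, running_total
FROM (
    SELECT product_name,
           revenue,
           -- running total from the highest revenue downwards
           SUM(revenue) OVER (ORDER BY revenue DESC, product_name) AS running_total,
           -- grand total over the same filtered rows
           SUM(revenue) OVER () AS total_revenue
    FROM ProductRevenue
    WHERE strftime('%Y', Date_DD) = '2020'   -- T-SQL: YEAR(Date_DD) = 2020
) AS t
WHERE running_total <= 0.25 * total_revenue
ORDER BY running_total
"""


def top_quarter_products():
    """Return the top products whose running total stays within 25% of 2020 revenue."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ProductRevenue (Date_DD TEXT, Product_Name TEXT, Revenue REAL)"
    )
    conn.executemany("INSERT INTO ProductRevenue VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in top_quarter_products():
        print(row)
```

With the sample data the 2020 total is 900, so the cutoff is 225 and only product g (running total 120) qualifies. The window functions are evaluated after the WHERE inside the subquery, which is why the filter on the running total has to sit in the outer query.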

Related

Find the second top selling product in terms of sales and quantity

There are two Tables - orders and item_line
orders:
order_id  created_at           total_amount
-------------------------------------------
123       2022-11-11 13:40:50  450.00
124       2022-10-30 00:40:50  1500.00
item_line:
order_id  product_id  product_name  quantity  unit_price
--------------------------------------------------------
123       a1b         milo          4         100.00
123       c2d         coke          5         10.00
124       c2d         coke          150       10.00
The question is:
Find the second top selling product in terms of sales and quantity in the current year sold between 6PM to 9PM.
My Take on This is -
SELECT * FROM (
SELECT i.product_name,
SUM(o.total_amount)sales,
SUM(i.quantity)total_qty,
ROW_NUMBER() OVER (ORDER BY SUM(o.total_amount) DESC,SUM(i.quantity)total_qty DESC) AS rn
FROM item_line i
WHERE o.created_at BETWEEN 18:00:00 AND 21:00:00
JOIN orders o on o.order_id = i.order_id
GROUP BY i.product_name ) temp
WHERE rn = 2;
But it's not correct. What am I doing wrong?
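-- The WHERE clause has to come after the JOIN, and the time filter needs a real time comparison.
-- In terms of net sales
SELECT * FROM (
SELECT i.product_name, SUM(o.total_amount) AS [Net Sales],
ROW_NUMBER() OVER (ORDER BY SUM(o.total_amount) DESC) AS rn
FROM item_line i
JOIN orders o ON o.order_id = i.order_id
-- 6PM to 9PM; DATEPART(HOUR, ...) BETWEEN 18 AND 21 would also include 21:01 to 21:59
WHERE CAST(o.created_at AS time) BETWEEN '18:00:00' AND '21:00:00'
AND YEAR(o.created_at) = YEAR(GETDATE()) -- current year only
GROUP BY i.product_name) temp
WHERE rn = 2;
-- Note: total_amount is per order, so a multi-item order counts in full for each of its products.
-- In terms of total quantity
SELECT * FROM (
SELECT i.product_name, SUM(i.quantity) AS [Total Quantity],
ROW_NUMBER() OVER (ORDER BY SUM(i.quantity) DESC) AS rn
FROM item_line i
JOIN orders o ON o.order_id = i.order_id
WHERE CAST(o.created_at AS time) BETWEEN '18:00:00' AND '21:00:00'
AND YEAR(o.created_at) = YEAR(GETDATE())
GROUP BY i.product_name) temp
WHERE rn = 2;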
select o.order_id, sum(quantity), total_amount from orders [o]
inner join item_line[i] on o.order_id = i.order_id
group by o.order_id, total_amount order by total_amount desc, sum(quantity) desc
OFFSET 1 ROWS
FETCH NEXT 1 ROWS ONLY;
You can add the target date and time range to the filter.
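To make the ranking idea concrete, here is a rough sketch that can be run end to end. It is only an illustration under several assumptions: it uses SQLite through Python's sqlite3 rather than SQL Server, it computes per-product sales as quantity * unit_price (an assumption about what "sales" means, since total_amount is per order), and orders 201 to 203 plus the product chips are made up, because neither of the two sample orders falls between 6PM and 9PM.

```python
import sqlite3
from datetime import date

QUERY = """
WITH per_product AS (
    SELECT i.product_name,
           SUM(i.quantity * i.unit_price) AS sales,
           SUM(i.quantity)                AS total_qty
    FROM item_line AS i
    JOIN orders AS o ON o.order_id = i.order_id
    WHERE strftime('%Y', o.created_at) = :year
      AND time(o.created_at) BETWEEN '18:00:00' AND '21:00:00'
    GROUP BY i.product_name
),
ranked AS (
    SELECT per_product.*,
           ROW_NUMBER() OVER (ORDER BY sales DESC)     AS sales_rank,
           ROW_NUMBER() OVER (ORDER BY total_qty DESC) AS qty_rank
    FROM per_product
)
SELECT product_name, sales_rank, qty_rank
FROM ranked
WHERE sales_rank = 2 OR qty_rank = 2
"""


def second_best_sellers(year=None):
    """Return the second-ranked product by sales and by quantity for one year."""
    year = str(year or date.today().year)
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER, created_at TEXT, total_amount REAL);
        CREATE TABLE item_line (order_id INTEGER, product_id TEXT,
                                product_name TEXT, quantity INTEGER, unit_price REAL);
    """)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        (123, "2022-11-11 13:40:50", 450.0),       # from the question
        (124, "2022-10-30 00:40:50", 1500.0),      # from the question
        (201, f"{year}-03-05 18:30:00", 450.0),    # hypothetical, inside the window
        (202, f"{year}-03-06 20:15:00", 1650.0),   # hypothetical, inside the window
        (203, f"{year}-03-07 21:30:00", 10000.0),  # hypothetical, after 9PM
    ])
    conn.executemany("INSERT INTO item_line VALUES (?, ?, ?, ?, ?)", [
        (123, "a1b", "milo", 4, 100.0), (123, "c2d", "coke", 5, 10.0),
        (124, "c2d", "coke", 150, 10.0),
        (201, "a1b", "milo", 4, 100.0), (201, "c2d", "coke", 5, 10.0),
        (202, "c2d", "coke", 150, 10.0), (202, "e3f", "chips", 30, 5.0),
        (203, "a1b", "milo", 100, 100.0),
    ])
    result = {}
    for name, sales_rank, qty_rank in conn.execute(QUERY, {"year": year}):
        if sales_rank == 2:
            result["by_sales"] = name
        if qty_rank == 2:
            result["by_quantity"] = name
    conn.close()
    return result


if __name__ == "__main__":
    print(second_best_sellers())
```

Aggregating first and ranking in a second step keeps the two rankings independent, so the second product by sales and the second by quantity can differ. ROW_NUMBER breaks ties arbitrarily; DENSE_RANK may be the better choice if ties should share a rank.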

How to do a Min and Max of date but following the changes in price points

I'm not really sure how to word this question better so I'll provide the data that I have and the result that I'm after.
This is the data that I have
sku sales qty date
A 100 1 1-Jan-19
A 200 2 2-Jan-19
A 100 1 3-Jan-19
A 240 2 4-Jan-19
A 360 3 5-Jan-19
A 360 4 6-Jan-19
A 200 2 7-Jan-19
A 90 1 8-Jan-19
B 100 1 9-Jan-19
B 200 2 10-Jan-19
And this is the result that I'm after
sku price sum(qty) sum(sales) min(date) max(date)
A 100 4 400 1-Jan-19 3-Jan-19
A 120 5 600 4-Jan-19 5-Jan-19
A 90 4 360 6-Jan-19 6-Jan-19
A 100 2 200 7-Jan-19 7-Jan-19
A 90 1 90 8-Jan-19 8-Jan-19
B 100 3 300 9-Jan-19 10-Jan-19
As you can see, I'm trying to get the min and max date of each price point, where price = sales/qty. At this point, I can get the min and max date of the same price but I can separate it when there's another price in between. I think I have to use some sort of min(date) over (partition by sales/qty order by date) but I can't figure it out yet.
I'm using Redshift SQL
This is a gaps-and-islands query. You can do this by generating a sequence and subtracting that from the date. Then aggregate:
select sku, price, sum(qty), sum(sales),
min(date), max(date)
from (select t.*,
sales / qty as price, -- derived column; cast to decimal first if sales and qty are integers
row_number() over (partition by sku, sales / qty order by date) as seqnum
from t
) t
group by sku, price, (date - seqnum * interval '1 day')
order by sku, min(date);
You can also do this with a subquery and LAG:
SELECT SKU, Price, SUM(Qty) SumQty, SUM(Sales) SumSales, MIN(date) MinDate, MAX(date) MaxDate
FROM (
SELECT SKU,Price,SUM(is_change) OVER(order by SKU, date) is_change,Sales, Qty,date
FROM (SELECT SKU, Sales/Qty AS Price, Sales, Qty,date,
CASE WHEN Sales/Qty = lag(Sales/Qty) over (order by SKU, date)
and SKU = lag(SKU) OVER (order by SKU, date) then 0 ELSE 1 END AS is_change
FROM Tbl
)InnerSelect
) X GROUP BY sku, price,is_change
ORDER BY SKU,MIN(date)
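As a cross-check on the expected result, here is a small runnable sketch of a third gaps-and-islands variant: the difference between two row numbers, which does not depend on the dates being consecutive. It is not either answerer's exact query. The assumptions are that it runs on SQLite through Python's sqlite3 instead of Redshift, that the table is called t, that the date column is renamed sale_date, and that the dates are stored as ISO strings.

```python
import sqlite3

# Sample data from the question, with dates rewritten in ISO form.
ROWS = [
    ("A", 100, 1, "2019-01-01"), ("A", 200, 2, "2019-01-02"),
    ("A", 100, 1, "2019-01-03"), ("A", 240, 2, "2019-01-04"),
    ("A", 360, 3, "2019-01-05"), ("A", 360, 4, "2019-01-06"),
    ("A", 200, 2, "2019-01-07"), ("A", 90, 1, "2019-01-08"),
    ("B", 100, 1, "2019-01-09"), ("B", 200, 2, "2019-01-10"),
]

QUERY = """
SELECT sku, price, SUM(qty), SUM(sales), MIN(sale_date), MAX(sale_date)
FROM (
    SELECT sku, sales, qty, sale_date,
           sales * 1.0 / qty AS price,
           -- rows in the same unbroken run of one price share this difference
           ROW_NUMBER() OVER (PARTITION BY sku ORDER BY sale_date)
         - ROW_NUMBER() OVER (PARTITION BY sku, sales * 1.0 / qty
                              ORDER BY sale_date) AS grp
    FROM t
) AS x
GROUP BY sku, price, grp
ORDER BY sku, MIN(sale_date)
"""


def price_islands():
    """Return one row per consecutive run of the same price within a sku."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (sku TEXT, sales INTEGER, qty INTEGER, sale_date TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in price_islands():
        print(row)
```

On the sample data this yields six rows, matching the result table in the question: price 100 appears twice for sku A because the run is broken by the 120 and 90 price points in between.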

SQL - How to count number of distinct values (payments), after sum of rows where they have another column value (Due Date) in common

My 'deals_payments' table is:
Due Date Payment ID
1-Mar-19 1,000.00 123
1-Apr-19 1,000.00 123
1-May-19 1,000.00 123
1-Jun-19 1,000.00 123
1-Jul-19 1,000.00 123
1-Aug-19 1,000.00 123
1-Jun-19 500.00 456
1-Jul-19 500.00 456
1-Aug-19 500.00 456
I have the SQL code:
select
count(*), payment
from (select deals_payments.*,
(row_number() over (order by due_date) -
row_number() over (partition by payment order by due_date)
) as grp
from deals_payments
where id = 123
) deals_payments
group by grp, payment
order by grp
which gives me what I want - the number of payments on each distinct amount - (here I only asked for ID 123):
COUNT(*) PAYMENT
6 1000.00
But now I need the sum of payments of the two IDs (123 and 456) where the due dates are the same, and then the number of payments for each distinct amount, as:
COUNT(*) PAYMENT
3 1000.00
3 1500.00
I tried the below but it gives me the 'missing right parenthesis' error. What is wrong??
select
count(*),
(select
sum(total) total
from (select distinct
due_date,
(select
sum(payment)
from deals_payments
where (due_date = a.due_date)) as total
from deals_payments a
where a.id in (123, 456)
and payment > 0)
group by due_date
order by due_date) b
from (select deals_payments.*,
(row_number() over (order by due_date) -
row_number() over (partition by payment order by due_date)
) as grp
from deals_payments
where id = 123
) deals_payments
group by grp, payment
order by grp
Taking your earlier comments into consideration, I agree that the SQL can be simplified to get the intended result. My understanding is that the expected output is the frequency of the total payment of a subset of IDs on any given date.
select count(*) as PaymentFrequency, TotalPaidOnDueDate
from
(
select due_date, sum(payment) as TotalPaidOnDueDate
from deals_payments -- the linked fiddle used a temp table, #deals_payments
where ID in (123, 456)
group by due_date
) a
group by a.TotalPaidOnDueDate
Here is a sql fiddle I used to verify: http://sqlfiddle.com/#!18/6b04f/1
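In case the fiddle link goes stale, the same two-step aggregation can be reproduced locally. This is a minimal sketch under a few assumptions: SQLite via Python's sqlite3 instead of the original database, due dates stored as ISO strings, and snake_case aliases in place of the ones above.

```python
import sqlite3

# Sample data from the question: (due_date, payment, id)
ROWS = [
    ("2019-03-01", 1000.0, 123), ("2019-04-01", 1000.0, 123),
    ("2019-05-01", 1000.0, 123), ("2019-06-01", 1000.0, 123),
    ("2019-07-01", 1000.0, 123), ("2019-08-01", 1000.0, 123),
    ("2019-06-01", 500.0, 456), ("2019-07-01", 500.0, 456),
    ("2019-08-01", 500.0, 456),
]

QUERY = """
SELECT COUNT(*) AS payment_frequency, total_paid_on_due_date
FROM (
    -- step 1: one row per due date with the combined payment
    SELECT due_date, SUM(payment) AS total_paid_on_due_date
    FROM deals_payments
    WHERE id IN (123, 456)
    GROUP BY due_date
) AS a
-- step 2: count how many due dates share each combined amount
GROUP BY total_paid_on_due_date
ORDER BY total_paid_on_due_date
"""


def payment_frequencies():
    """Return (count, combined payment) pairs for IDs 123 and 456."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE deals_payments (due_date TEXT, payment REAL, id INTEGER)")
    conn.executemany("INSERT INTO deals_payments VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in payment_frequencies():
        print(row)
```

March to May only have the 1,000.00 payment, while June to August combine to 1,500.00, so the query returns a count of 3 for each amount, as in the expected output.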
This seems really strange. I don't understand why your logic is so complicated.
How about this?
select id, count(*), max(payment)
from (select dp.*,
count(*) over (partition by due_date) as cnt
from deals_payments dp
where dp.id in (123, 456)
) dp
where cnt = 2
group by id;
An interesting question. Could this do the trick???
select payment, count(*)
from deals_payments
where due_date in
(select due_date
from deals_payments
group by due_date
having count(*) > 1)
group by payment;
You can add a filter by id if you want, of course.

SQL - "not contained in either an aggregate function or the GROUP BY clause."

Using SQL Server 2016. I have a table:
Product Qty OrderDate
--------------------------
Toys 100 2018-10-01
Toys 100 2018-10-01
Books 30 2018-10-01
Toys 150 2018-10-02
Toys 50 2018-10-02
Toys 20 2018-10-02
Toys 110 2018-10-03
Toys 90 2018-10-04
Toys 200 2018-10-05
Toys 100 2018-10-05
Toys 30 2018-10-08
Toys 50 2018-10-09
and I want to calculate the average quantity per product, for the last 5 days. I am close to this with this query:
SELECT
Product,
RowNumber,
OrderDate,
AVG(TotalQty) OVER (ORDER BY RowNumber DESC ROWS 5 PRECEDING) as RollingAvg
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY Product ORDER BY orderDate) AS RowNumber, Product, OrderDate, sum(Qty) as TotalQty
FROM Tbl
GROUP BY Product, OrderDate
) x
GROUP BY Product, RowNumber, OrderDate
The inner query works correctly, giving me the total per product/date pair. However my outer query reports a problem:
Column 'x.TotalQty' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
There's obviously something I'm doing wrong with my OVER clause, because when I remove that I get a valid result.
Syntactically valid query (that does the wrong thing):
SELECT
Product,
RowNumber,
OrderDate,
AVG(TotalQty) as RollingAvg
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY Product ORDER BY orderDate) AS RowNumber, Product, OrderDate, sum(Qty) as TotalQty
FROM Tbl
GROUP BY Product, OrderDate
) x
GROUP BY Product, RowNumber, OrderDate
Any help/pointers would be much appreciated please - I'm close but can't cross this final hurdle!
I think you want:
SELECT Product, RowNumber, OrderDate,
AVG(TotalQty) OVER (ORDER BY RowNumber DESC ROWS 5 PRECEDING) as RollingAvg
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY Product ORDER BY orderDate) AS RowNumber,
Product, OrderDate, sum(Qty) as TotalQty
FROM Tbl
GROUP BY Product, OrderDate
) x;
That is, the outer aggregation is unnecessary because AVG() is being used as a window function, not an aggregation function.
You should be able to do this without a subquery:
SELECT ROW_NUMBER() OVER (PARTITION BY Product ORDER BY orderDate) AS RowNumber,
Product, OrderDate, sum(Qty) as TotalQty,
AVG(SUM(Qty)) OVER (PARTITION BY Product ORDER BY orderDate ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) as avg_5
FROM Tbl
GROUP BY Product, OrderDate;
Note that this interprets "last five days" as the current day plus the preceding four days. Your version has six days for the average.
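A quick way to sanity-check the window frame is to run it on the sample data. The sketch below is an assumption-based illustration rather than the SQL Server query itself: it uses SQLite through Python's sqlite3 and aggregates in a CTE first, then applies the windowed average. Note that a ROWS frame counts rows, meaning dates that have orders, so the missing 2018-10-06 and 2018-10-07 are skipped rather than treated as zero-quantity days.

```python
import sqlite3

# Sample data from the question: (Product, Qty, OrderDate)
ROWS = [
    ("Toys", 100, "2018-10-01"), ("Toys", 100, "2018-10-01"),
    ("Books", 30, "2018-10-01"), ("Toys", 150, "2018-10-02"),
    ("Toys", 50, "2018-10-02"), ("Toys", 20, "2018-10-02"),
    ("Toys", 110, "2018-10-03"), ("Toys", 90, "2018-10-04"),
    ("Toys", 200, "2018-10-05"), ("Toys", 100, "2018-10-05"),
    ("Toys", 30, "2018-10-08"), ("Toys", 50, "2018-10-09"),
]

QUERY = """
WITH daily AS (
    -- total quantity per product and order date
    SELECT Product, OrderDate, SUM(Qty) AS TotalQty
    FROM Tbl
    GROUP BY Product, OrderDate
)
SELECT Product, OrderDate, TotalQty,
       AVG(TotalQty) OVER (
           PARTITION BY Product
           ORDER BY OrderDate
           ROWS BETWEEN 4 PRECEDING AND CURRENT ROW  -- current row plus 4 before it
       ) AS avg_5
FROM daily
ORDER BY Product, OrderDate
"""


def rolling_averages():
    """Return (Product, OrderDate, TotalQty, avg_5) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Tbl (Product TEXT, Qty INTEGER, OrderDate TEXT)")
    conn.executemany("INSERT INTO Tbl VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in rolling_averages():
        print(row)
```

The PARTITION BY Product matters here: without it, the Books row would be averaged together with the Toys rows.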

T-SQL calculate the percent increase or decrease between the earliest and latest for each project

I have a table like the one below. I am trying to run a query in T-SQL to get the earliest and latest costs for each project_id according to the date column, calculate the percent cost increase or decrease, and return the data set shown in the second table (I have simplified the table in this question).
project_id date cost
-------------------------------
123 7/1/17 5000
123 8/1/17 6000
123 9/1/17 7000
123 10/1/17 8000
123 11/1/17 9000
456 7/1/17 10000
456 8/1/17 9000
456 9/1/17 8000
876 1/1/17 8000
876 6/1/17 5000
876 8/1/17 10000
876 11/1/17 8000
Result:
(Edit: Fixed the result)
project_id "cost incr/decr pct"
------------------------------------------------
123 80% which is (9000-5000)/5000
456 -20%
876 0%
Whatever query I run I get duplicates.
This is what I tried:
select distinct
p1.Proj_ID, p1.date, p2.[cost], p3.cost,
(nullif(p2.cost, 0) / nullif(p1.cost, 0)) * 100 as 'OVER UNDER'
from
[PROJECT] p1
inner join
(select
[Proj_ID], [cost], min([date]) min_date
from
[PROJECT]
group by
[Proj_ID], [cost]) p2 on p1.Proj_ID = p2.Proj_ID
inner join
(select
[Proj_ID], [cost], max([date]) max_date
from
[PROJECT]
group by
[Proj_ID], [cost]) p3 on p1.Proj_ID = p3.Proj_ID
where
p1.date in (p2.min_date, p3.max_date)
Unfortunately, SQL Server does not have a first_value() aggregation function. It does have an analytic function, though. So, you can do:
select distinct project_id,
first_value(cost) over (partition by project_id order by date asc) as first_cost,
first_value(cost) over (partition by project_id order by date desc) as last_cost,
(first_value(cost) over (partition by project_id order by date desc) /
first_value(cost) over (partition by project_id order by date asc)
) - 1 as ratio
from project;
If cost is an integer, you may need to convert to a representation with decimal places.
You can use row_number and OUTER APPLY over top 1 ... on versions prior to SQL Server 2012:
select
min_.projectid,
(latest_.cost - min_.cost) * 100.0 / min_.cost [Calculation] -- percent change from earliest to latest cost
from
(select
row_number() over (partition by projectid order by date) rn
,projectid
,cost
from projectable) min_ -- get the first dates per project
outer apply (
select
top 1
cost
from projectable
where
projectid = min_.projectid -- get the latest cost for each project
order by date desc
) latest_
where min_.rn = 1
This might perform a little better
;with costs as (
select *,
ROW_NUMBER() over (PARTITION BY project_id ORDER BY date) mincost,
ROW_NUMBER() over (PARTITION BY project_id ORDER BY date desc) maxcost
from table1
)
select project_id,
min(case when mincost = 1 then cost end) as cost1,
max(case when maxcost = 1 then cost end) as cost2,
(max(case when maxcost = 1 then cost end) - min(case when mincost = 1 then cost end)) * 100 / min(case when mincost = 1 then cost end) as [OVER UNDER]
from costs a
group by project_id
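To check the numbers against the expected result, here is a small runnable sketch of the first_value approach. It makes several assumptions: SQLite via Python's sqlite3 rather than SQL Server, a table named project, the date column renamed cost_date, and the m/d/yy dates read as month/day/year and stored as ISO strings.

```python
import sqlite3

# Sample data from the question: (project_id, cost_date, cost)
ROWS = [
    (123, "2017-07-01", 5000), (123, "2017-08-01", 6000), (123, "2017-09-01", 7000),
    (123, "2017-10-01", 8000), (123, "2017-11-01", 9000),
    (456, "2017-07-01", 10000), (456, "2017-08-01", 9000), (456, "2017-09-01", 8000),
    (876, "2017-01-01", 8000), (876, "2017-06-01", 5000),
    (876, "2017-08-01", 10000), (876, "2017-11-01", 8000),
]

QUERY = """
SELECT DISTINCT project_id,
       -- 100.0 forces decimal arithmetic even when cost is an integer
       (last_cost - first_cost) * 100.0 / first_cost AS pct_change
FROM (
    SELECT project_id,
           FIRST_VALUE(cost) OVER (PARTITION BY project_id
                                   ORDER BY cost_date ASC)  AS first_cost,
           FIRST_VALUE(cost) OVER (PARTITION BY project_id
                                   ORDER BY cost_date DESC) AS last_cost
    FROM project
) AS x
ORDER BY project_id
"""


def cost_changes():
    """Return (project_id, percent change from earliest to latest cost)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE project (project_id INTEGER, cost_date TEXT, cost INTEGER)")
    conn.executemany("INSERT INTO project VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in cost_changes():
        print(row)
```

Every row of a project carries the same first_cost and last_cost, so DISTINCT collapses them to one row per project, which avoids the duplicates described in the question. A first_cost of zero would need separate handling, for example with NULLIF.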