SQL date subquery: get the last price

I get a "could not be bound" error while trying this SQL:
SELECT
promotions.id_product, price.value
FROM
promotions
LEFT OUTER JOIN
(SELECT TOP 1 id_product, date, value
FROM prices
WHERE date > promotions.date) AS price ON price.id_product = promotions.id_product
About the SQL... I have two tables and I need to get the correct price that applies while the promotion is running (not the last price)...
Table promotions
id_product | DATE | VALUE | Finish_date
1 | 2018-05-01 | 20 | 2018-06-03
1 | 2018-07-02 | 18 | 2018-08-01
Table prices
id_product | DATE | VALUE
1 | 2018-04-01 | 30
1 | 2018-06-02 | 25

A subquery in a JOIN cannot be correlated to other tables in the FROM clause.
Instead, use outer apply:
SELECT p.id_product, pr.value
FROM promotions p OUTER APPLY
(SELECT TOP 1 pr.id_product, pr.date, pr.value
FROM prices pr
WHERE pr.id_product = p.id_product AND pr.date > p.date
ORDER BY pr.date DESC
) pr;
I added the ORDER BY. Presumably, you want the "next" price after the promotion date, not an arbitrary price afterwards.
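The question text asks for the price that applies while the promotion is running; if that means the price already in effect when the promotion starts, a variant of the same OUTER APPLY (flipping the date comparison, still ordering descending) would pick the most recent price on or before the promotion date. This is only a sketch of that interpretation, not part of the original answer:
SELECT p.id_product, pr.value
FROM promotions p OUTER APPLY
(SELECT TOP 1 pr.id_product, pr.date, pr.value
 FROM prices pr
 WHERE pr.id_product = p.id_product AND pr.date <= p.date
 ORDER BY pr.date DESC
) pr;
With the sample data, that returns 30 for the 2018-05-01 promotion and 25 for the 2018-07-02 one.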

Related

Redshift: Add Row for each hour in a day

I have a table that contains item-wise quantity at different hours of a date. I am trying to add data for each hour (24 entries in a day), carrying forward the previously available quantity. For example, for hours 2-10 the quantity will be 5.
I created a table with hour entries (1-24) and full joined it with the shared table.
How can I add the previous available entry? Any suggestions?
item_id| date | hour| quantity
101 | 2022-04-25 | 2 | 5
101 | 2022-04-25 | 10 | 13
101 | 2022-04-25 | 18 | 67
101 | 2022-04-25 | 23 | 27
You can try using generate_series to generate the hour numbers and let that be the base table of the outer join.
Then use a correlated subquery to get your expected quantity column:
SELECT t1.*,
(SELECT quantity
FROM T tt
WHERE t1.item_id = tt.item_id
AND t1.date = tt.date
AND t1.hour >= tt.hour
ORDER BY tt.hour desc
LIMIT 1) quantity
FROM (
SELECT DISTINCT item_id,date,v.hour
FROM generate_series(1,24) v(hour)
CROSS JOIN T
) t1
ORDER BY t1.hour
Provided the table of ints 1..24 is all24(hour), you can use lead and a join:
select t.item_id, t.date, all24.hour, t.quantity
from all24
join (
select *,
lead(hour, 1, 25) over(partition by item_id, date order by hour) - 1 nxt_h
from tbl
) t on all24.hour between t.hour and t.nxt_h
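If you don't already have a numbers table like all24, one way to build it is with a recursive CTE - a minimal sketch, assuming your Redshift version supports WITH RECURSIVE (the table and column names are just illustrative):
WITH RECURSIVE all24(hour) AS (
    SELECT 1
    UNION ALL
    SELECT hour + 1 FROM all24 WHERE hour < 24
)
SELECT hour FROM all24;
You could materialize this into a small permanent table and reuse it for both answers above.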

Aggregate functions based on current Row value

I am working with data similar to below,
week | product | sale
1 | ABC | 2
1 | ABC | 1
2 | ABC | 1
3 | ABC | 5
4 | ABC | 1
2 | DEF | 5
Let us say that is my Orders table, named tblOrders. Now, in each row, I want to aggregate the total sales from the previous week for that product - for instance, if I am on week 2 of product "ABC", I need to show the aggregated sales amount of week 1 for product ABC. So, the output should look something like this:
week | product | sale | ProductPreviousWeekSales
1 | ABC | 2 | 0
1 | ABC | 1 | 0
2 | ABC | 1 | 3
3 | ABC | 5 | 1
4 | ABC | 1 | 5
2 | DEF | 5 | 0
I was originally thinking I could solve this using aggregates and a window function, but that doesn't look possible. Another thought I had was to use a conditional aggregate - something like sum(case when x = currentRow.x then sale else 0 end) - but that wouldn't work either.
Here is the SQLFiddle for above sample - http://sqlfiddle.com/#!18/890b7/2
Note: I need to calculate a similar value for the last 4 weeks, so I am trying to avoid doing this as a subquery or with multiple joins (if possible); the data set I am working with is very large, and I don't want to add too much performance overhead with this change.
Here is one approach which first aggregates your table in a separate CTE and uses LAG to find the previous week's amount, for each week and product:
WITH cte AS (
SELECT week, product,
LAG(SUM(sale)) OVER (PARTITION BY product ORDER BY week) AS lag_total_sales
FROM yourTable
GROUP BY week, product
)
SELECT t1.week, t1.product, t1.sale,
COALESCE(t2.lag_total_sales, 0) AS ProductPreviousWeekSales
FROM yourTable t1
INNER JOIN cte t2
ON t2.week = t1.week AND
t2.product = t1.product
ORDER BY
t1.product,
t1.week;
Demo
DISCLAIMER
The query I am showing below doesn't work in SQL Server, unfortunately. Up to SQL Server version 2019 the DBMS lacks full support of the RANGE clause that is essential for the query to work. Running the query in SQL Server results in
Msg 4194 Level 16 State 1 Line 1 RANGE is only supported with UNBOUNDED and CURRENT ROW window frame delimiters.
I am not deleting this answer, because this is standard SQL and the approach may help future readers. It runs fine in many DBMSs, and maybe a future version of SQL Server will be able to deal with this, too. I've added demos to show that it runs in PostgreSQL, MySQL and Oracle, but fails in SQL Server 2019.
ORIGINAL ANSWER
Your query shown in the fiddle (select a.*, sum(sale) over(partition by product) ProductPreviousWeekSales from tblOrder a) is merely lacking the appropriate windowing clause. As you are dealing with ties here (more than one row per product and week) this needs to be a RANGE clause:
select a.*,
sum(sale) over(partition by product
order by week range between 1 preceding and 1 preceding
) as ProductPreviousWeekSales
from tblOrder a
order by product, week;
(Use COALESCE if you want to see a zero instead of NULL.)
Demos:
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=149eddbff82500d539b2c615f4167cff
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=a8453970efac08ad69275914910bb13e
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=64ed21150142caa0acb7f8c7ca7d9022
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=149eddbff82500d539b2c615f4167cff
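Since the question also mentions needing the same figure for the last 4 weeks, the window frame can simply be widened, under the same standard-SQL caveat about SQL Server's RANGE support. A sketch (the alias ProductLast4WeeksSales is just illustrative):
select a.*,
       sum(sale) over(partition by product
                      order by week range between 4 preceding and 1 preceding
                     ) as ProductLast4WeeksSales
from tblOrder a
order by product, week;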
You can do it with the following:
; WITH cteorder AS
(
SELECT DISTINCT product, week FROM dbo.tblOrder
)
SELECT
cte.*,
SUM(ISNULL(b.sale,0)) ProductPreviousWeekSales
from tblOrder a
INNER JOIN cteorder cte ON cte.product = a.product AND cte.week = a.week
LEFT JOIN dbo.tblOrder b ON b.product = cte.product AND b.week = (a.week-1)
GROUP BY cte.product,
cte.week
You can run it here: Fiddle
You need to select from TblOrders twice: once grouping by week and product and summing the sales, and a second time as a row-by-row scan against TblOrders, left-joining it with the grouping query on the same product and the week offset by 1.
If the join fails, the sales value of the joined grouping query returns NULL. You can put in 0 instead of NULL using COALESCE(), but ISNULL() is likely to be faster, as it has a fixed number of parameters, while COALESCE() has a variable argument list, which comes at a certain cost.
WITH
tblorders(wk,product,sales) AS (
SELECT 1,'ABC',2
UNION ALL SELECT 1,'ABC',1
UNION ALL SELECT 2,'ABC',1
UNION ALL SELECT 3,'ABC',5
UNION ALL SELECT 4,'ABC',1
UNION ALL SELECT 2,'DEF',5
)
,
grp AS (
SELECT
wk
, product
, SUM(sales) AS sales
FROM tblorders
GROUP BY
wk
, product
)
SELECT
o.wk
, o.product
, o.sales
, ISNULL(g.sales,0) AS productpreviousweeksales
FROM tblorders o
LEFT
JOIN grp g
ON o.wk - 1 = g.wk
AND o.product= g.product
ORDER BY 2,1
;
wk | product | sales | productpreviousweeksales
----+---------+-------+--------------------------
1 | ABC | 2 | 0
1 | ABC | 1 | 0
2 | ABC | 1 | 3
3 | ABC | 5 | 1
4 | ABC | 1 | 5
2 | DEF | 5 | 0

How to create this query in BigQuery on a retail dataset

I have a table with user retail transactions. It includes sales and cancels: if Qty is positive it is a sale, if negative it is a cancel. I want to attach each cancel to the most appropriate sale. So, I have a table like this:
| CustomerId | StockId | Qty | Date |
|--------------+-----------+-------+------------|
| 1 | 100 | 50 | 2020-01-01 |
| 1 | 100 | -10 | 2020-01-10 |
| 1 | 100 | 60 | 2020-02-10 |
| 1 | 100 | -20 | 2020-02-10 |
| 1 | 100 | 200 | 2020-03-01 |
| 1 | 100 | 10 | 2020-03-05 |
| 1 | 100 | -90 | 2020-03-10 |
User with ID 1 has the following actions: buy 50 -> return 10 -> buy 60 -> return 20 -> buy 200 -> buy 10 -> return 90. For each cancel row (with negative Qty) I want to find the previous row (by Date) with a positive Qty greater than the cancel Qty.
So I need to create a BigQuery query to produce a table like this:
| CustomerId | StockId | Qty | Date | CancelQty |
|--------------+-----------+-------+------------+-------------|
| 1 | 100 | 50 | 2020-01-01 | -10 |
| 1 | 100 | 60 | 2020-02-10 | -20 |
| 1 | 100 | 200 | 2020-03-01 | -90 |
| 1 | 100 | 10 | 2020-03-05 | 0 |
Can anybody help me with these queries? I have created one candidate query (splitting cancels and sales, joining them, and doing some work for removal), but it works incorrectly in the above case.
I use BigQuery, so any BQ SQL features could be applied.
Any ideas will be helpful.
You can use the following query.
;WITH result AS (
select t1.*,t2.Qty as cQty,t2.Date as Date_t2 from
(select *,ROW_NUMBER() OVER (ORDER BY qty DESC) AS [ROW NUMBER] from Test) t1
join
(select *,ROW_NUMBER() OVER (ORDER BY qty) AS [ROW NUMBER] from Test) t2
on t1.[ROW NUMBER] = t2.[ROW NUMBER]
)
select CustomerId,StockId,Qty,Date,ISNULL(cQty, 0) As CancelQty,Date_t2
from (select CustomerId,StockId,Qty,Date,case
when cQty < 0 then cQty
else NULL
end AS cQty,
case
when cQty < 0 then Date_t2
else NULL
end AS Date_t2 from result) t
where qty > 0
order by cQty desc
result: https://dbfiddle.uk
You can do this as a gaps-and-islands problem. Basically, add a grouping column to the rows based on a cumulative reverse count of negative values. Then within each group, choose the first row where the sum is positive. So:
select t.* except (cancelqty, grp),
       (case when min(case when cancelqty + qty >= 0 then date end) over (partition by customerid, grp) = date
             then cancelqty
             else 0
        end) as cancelqty
from (select t.*,
             min(qty) over (partition by customerid, grp) as cancelqty
      from (select t.*,
                   countif(qty < 0) over (partition by customerid order by date desc) as grp
            from transactions t
           ) t
     ) t;
Note: This works for the data you have provided. However, there may be complicated scenarios where this does not work. In fact, I don't think there is a simple optimal solution assuming that the returns are not connected to the original sales. I would suggest that you fix the data model so you record where the returns come from.
The query below seems to satisfy the conditions and the output mentioned. The solution is based on mapping the base table (t) to the corresponding cancelled-qty row from the same table (t1).
First, a self join on CustomerId and StockId is done, since the rows need to correspond to the same customer and product.
Additionally, we bring in the cancelled transactions t1 that happened after the base row in table t (t.Dt <= t1.Dt), and the t1.Qty < 0 clause ensures the joined row is a negative qty.
Further, we cannot attribute a cancelled qty that exceeds the original Qty, so I check that the positive qty is at least the cancelled qty. This is done by negating the cancel qty so the two can be compared easily: -(t1.Qty) <= t.Qty.
After the join, we are interested only in the positive quantities from the base table t, so a WHERE clause t.Qty > 0 filters out the cancel rows.
Now every base row is joined to every cancelled-qty row that came on or after its transaction date. For example, the Qty 50 row could have all the cancelled quantities mapped to it, but we are only interested in the one that came immediately after, so we group the base quantity rows and keep only the earliest cancel date via the HAVING clause condition HAVING IFNULL(t1.dt, '0') = MIN(IFNULL(t1.dt, '0')).
Finally we get the rows we need; the last column can be excluded if required using an outer SELECT.
SELECT t.CustomerId,t.StockId,t.Qty,t.Dt,IFNULL(t1.Qty, 0) CancelQty
,t1.dt dt_t1
FROM tbl t
LEFT JOIN tbl t1 ON t.CustomerId=t1.CustomerId AND
t.StockId=t1.StockId
AND t.Dt<=t1.Dt AND t1.Qty<0 AND -(t1.Qty)<=t.Qty
WHERE t.Qty>0
GROUP BY 1,2,3,4
HAVING IFNULL(t1.dt, '0')=MIN(IFNULL(t1.dt, '0'))
ORDER BY 1,2,4,3
fiddle
Consider the approach below:
with sales as (
select * from `project.dataset.table` where Qty > 0
), cancels as (
select * from `project.dataset.table` where Qty < 0
)
select any_value(s).*,
ifnull(array_agg(c.Qty order by c.Date limit 1)[offset(0)], 0) as CancelQty
from sales s
left join cancels c
on s.CustomerId = c.CustomerId
and s.StockId = c.StockId
and s.Date <= c.Date
and s.Qty > abs(c.Qty)
group by format('%t', s)
If applied to the sample data in your question, the output matches the expected result shown above.

PostgreSQL Referencing Outer Query in Subquery

I have two Postgres tables (really, more than that, but simplified for the purpose of the question) - one a record of products that have been ordered by customers, and another a historical record of prices per customer and a date they went into effect. Something like this:
'orders' table
customer_id | timestamp | quantity
------------+---------------------+---------
1 | 2015-09-29 16:01:01 | 5
1 | 2015-10-23 14:33:36 | 3
2 | 2015-10-19 09:43:02 | 7
1 | 2015-11-16 15:08:32 | 2
'prices' table
customer_id | effective_time | price
------------+---------------------+-------
1 | 2015-01-01 00:00:00 | 15.00
1 | 2015-10-01 00:00:00 | 12.00
2 | 2015-01-01 00:00:00 | 14.00
I'm trying to create a query that will return every order and its unit price for that customer at the time of the order, like this:
desired result
customer_id | quantity | price
------------+----------+------
1 | 5 | 15.00
1 | 3 | 12.00
2 | 7 | 14.00
1 | 2 | 12.00
This is essentially what I want, but I know that you can't reference an outer query inside an inner query, and I'm having trouble figuring out how to re-factor:
SELECT
o.customer_id,
o.quantity,
p.price
FROM orders o
INNER JOIN (
SELECT price
FROM prices x
WHERE x.customer_id = o.customer_id
AND x.effective_time <= o.timestamp
ORDER BY x.effective_time DESC
LIMIT 1
) p
;
Can anyone suggest the best way to make this work?
Instead of joining an inline view based on the prices table, you can perform a subquery in the SELECT list:
SELECT customer_id, quantity, (
SELECT price
FROM prices p
WHERE
p.customer_id = o.customer_id
AND p.effective_time <= o.timestamp
ORDER BY p.effective_time DESC
LIMIT 1
) AS price
FROM orders o
That does rely on a correlated subquery, which could be bad for performance, but with the way your data are structured I doubt there's a substantially better alternative.
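As a side note - this is a sketch, not part of the original answers - PostgreSQL 9.3+ also supports LATERAL, which lets a derived table reference columns of earlier FROM items, much like the OUTER APPLY shown in the first question above:
SELECT o.customer_id, o.quantity, p.price
FROM orders o
LEFT JOIN LATERAL (
    SELECT x.price
    FROM prices x
    WHERE x.customer_id = o.customer_id
      AND x.effective_time <= o.timestamp
    ORDER BY x.effective_time DESC
    LIMIT 1
) p ON true;
The LEFT JOIN ... ON true keeps orders that have no matching price row, mirroring the scalar-subquery behaviour.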
You don't need the subquery; just a plain inner join will do (this assumes there are no duplicate effective_times per customer):
SELECT o.customer_id, o.quantity
,p.price
FROM orders o
JOIN prices p ON p.customer_id = o.customer_id
AND p.effective_time <= o.timestamp
AND NOT EXISTS ( SELECT * FROM prices nx
WHERE nx.customer_id = o.customer_id
AND nx.effective_time <= o.timestamp
AND nx.effective_time > p.effective_time
)
;

PostgreSQL Order By SubQuery

I have this database structure:
Tables:
Category
id | category
1 | fruit
2 | cars
3 | tables
Product
id | product | category_id
1 | banana | 1
2 | apple | 1
3 | orange | 1
4 | example 1 | 2
5 | example 2 | 3
User_List
id | product_id | user_id | bought_date
1 | 1 | 1 | 2012-06-21 11:00:00
2 | 2 | 1 | 2012-06-21 06:00:00
3 | 4 | 1 | 2012-06-21 08:00:00
4 | 5 | 1 | 2012-06-21 01:00:00
What I want is to create a query that does "order by bought_date (desc) by category".
In that case the expected result is:
banana
apple
example 1
example 2
My query:
SELECT c.id, u.bought_date
FROM category as c
left join product p on (c.id=p.category_id)
left join user_list u on (p.id=u.product_id)
WHERE u.user_id=3
ORDER BY u.bought_date DESC NULLS LAST
But this only does a simple sort by bought date...
with this result:
banana
example 1
apple
example 2
I can think of one interpretation of the ordering: you want to order by the earliest or latest date for each category. For that, use window functions:
SELECT c.id, u.bought_date, max(u.bought_date) over (partition by c.id) as category_bd
FROM category c left join
product p
on (c.id=p.category_id) left join
user_list u
on (p.id=u.product_id)
WHERE u.user_id = 3
ORDER BY category_bd DESC NULLS LAST, u.bought_date DESC NULLS LAST
It sounds as though you just need two columns in your order by clause:
SELECT c.id, u.bought_date
FROM category as c
left join product p on (c.id=p.category_id)
left join user_list u on (p.id=u.product_id)
WHERE u.user_id=3
ORDER BY category_id, u.bought_date DESC NULLS LAST
Assuming missing information:
bought_date is defined NOT NULL.
There can be multiple rows in user_list per product, and they can have a different bought_date.
Order products by the latest bought_date of their category first,
and by their latest bought_date next.
SELECT p.product
FROM product p
JOIN user_list u ON u.product_id = p.id
GROUP BY 1
ORDER BY max(max(u.bought_date)) OVER (PARTITION BY p.category_id) DESC
, max(u.bought_date) DESC;
You don't need the table categry in the query at all.
The window function to get the latest bought_date per category can go into the ORDER BY clause.
Yes, that's a window function over the aggregate max(u.bought_date).
Obviously you don't want oranges in the result, since nobody bought them.
Use JOIN, not LEFT JOIN.