Redshift: Add Row for each hour in a day

I have a table that contains the item-wise quantity at different hours of a date. I am trying to add a row for each hour (24 entries per day), where every missing hour takes the quantity of the most recent earlier hour that has one. For example, hours 2 through 9 would all show 5.
I created a table with the hour entries (1-24) and full joined it with the table below.
How can I fill in the previous available entry for the missing hours? Any suggestions are welcome.
item_id | date | hour | quantity
101 | 2022-04-25 | 2 | 5
101 | 2022-04-25 | 10 | 13
101 | 2022-04-25 | 18 | 67
101 | 2022-04-25 | 23 | 27

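You can use generate_series to generate the hour numbers and cross join them with the distinct item_id and date pairs, so that every item has a row for each hour.
Then use a correlated subquery to fetch the expected quantity column, i.e. the quantity from the latest hour at or before the current one: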
SELECT t1.*,
(SELECT quantity
FROM T tt
WHERE t1.item_id = tt.item_id
AND t1.date = tt.date
AND t1.hour >= tt.hour
ORDER BY tt.hour desc
LIMIT 1) quantity
FROM (
SELECT DISTINCT item_id,date,v.hour
FROM generate_series(1,24) v(hour)
CROSS JOIN T
) t1
ORDER BY t1.hour
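For anyone who wants to try the correlated-subquery idea locally, here is a minimal sketch that runs it against SQLite through Python's sqlite3 module. It is an illustration under assumptions, not Redshift code: a recursive CTE stands in for generate_series, the table name T is taken from the answer above, and the function name fill_hours is made up for this example.

```python
import sqlite3

# Sample rows from the question: (item_id, date, hour, quantity)
SAMPLE = [
    (101, "2022-04-25", 2, 5),
    (101, "2022-04-25", 10, 13),
    (101, "2022-04-25", 18, 67),
    (101, "2022-04-25", 23, 27),
]


def fill_hours(rows):
    """Return one row per hour (1-24), carrying the last known quantity forward."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T (item_id INTEGER, date TEXT, hour INTEGER, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", rows)
    query = """
    WITH RECURSIVE hours(hour) AS (
        -- stand-in for generate_series(1, 24)
        SELECT 1 UNION ALL SELECT hour + 1 FROM hours WHERE hour < 24
    ),
    base AS (
        -- every (item_id, date) pair combined with every hour
        SELECT DISTINCT T.item_id, T.date, hours.hour
        FROM hours CROSS JOIN T
    )
    SELECT b.item_id, b.date, b.hour,
           (SELECT tt.quantity
            FROM T tt
            WHERE tt.item_id = b.item_id
              AND tt.date = b.date
              AND tt.hour <= b.hour
            ORDER BY tt.hour DESC
            LIMIT 1) AS quantity
    FROM base b
    ORDER BY b.item_id, b.date, b.hour
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


filled = fill_hours(SAMPLE)
```

With the sample data, hour 1 comes back with a NULL quantity because no earlier entry exists for that day; hours 2 to 9 carry 5, and hour 10 switches to 13.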

Assuming the table of integers 1 to 24 is all24(hour), you can use lead() and a join:
select t.item_id, t.date, all24.hour, t.quantity
from all24
join (
select *,
lead(hour, 1, 25) over(partition by item_id, date order by hour) - 1 nxt_h
from tbl
) t on all24.hour between t.hour and t.nxt_h
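The lead-and-join query can be checked the same way. The sketch below is only an illustration on SQLite (window functions need SQLite 3.25 or later), not Redshift itself; the tables tbl and all24 follow the answer above, while the function name expand_with_lead is hypothetical.

```python
import sqlite3

# Sample rows from the question: (item_id, date, hour, quantity)
SAMPLE = [
    (101, "2022-04-25", 2, 5),
    (101, "2022-04-25", 10, 13),
    (101, "2022-04-25", 18, 67),
    (101, "2022-04-25", 23, 27),
]


def expand_with_lead(rows):
    """Expand each entry to cover every hour up to the hour before the next entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (item_id INTEGER, date TEXT, hour INTEGER, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE TABLE all24 (hour INTEGER)")
    conn.executemany("INSERT INTO all24 VALUES (?)", [(h,) for h in range(1, 25)])
    query = """
    SELECT t.item_id, t.date, all24.hour, t.quantity
    FROM all24
    JOIN (
        -- nxt_h is the last hour this entry covers; 25 - 1 = 24 for the final entry
        SELECT tbl.*,
               LEAD(hour, 1, 25) OVER (PARTITION BY item_id, date ORDER BY hour) - 1 AS nxt_h
        FROM tbl
    ) t ON all24.hour BETWEEN t.hour AND t.nxt_h
    ORDER BY t.item_id, t.date, all24.hour
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


expanded = expand_with_lead(SAMPLE)
```

Note that with an inner join the hours before the first entry of a day (hour 1 in the sample) produce no row at all, since no entry covers them.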

Related

Select first rows where condition [duplicate]

Here's what I'm trying to do. Let's say I have this table t:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
2 | 18 | 2012-05-19 | y
3 | 18 | 2012-08-09 | z
4 | 19 | 2009-06-01 | a
5 | 19 | 2011-04-03 | b
6 | 19 | 2011-10-25 | c
7 | 19 | 2012-08-09 | d
For each id, I want to select the row containing the minimum record_date. So I'd get:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
4 | 19 | 2009-06-01 | a
The only solutions I've seen to this problem assume that all record_date entries are distinct, but that is not the case in my data. Using a subquery and an inner join with two conditions would give me duplicate rows for some ids, which I don't want:
key_id | id | record_date | other_cols
1 | 18 | 2011-04-03 | x
5 | 19 | 2011-04-03 | b
4 | 19 | 2009-06-01 | a

How about something like:
SELECT mt.*
FROM MyTable mt INNER JOIN
(
SELECT id, MIN(record_date) AS MinDate
FROM MyTable
GROUP BY id
) t ON mt.id = t.id AND mt.record_date = t.MinDate
This gets the minimum date per ID, and then gets the values based on those values. The only time you would have duplicates is if there are duplicate minimum record_dates for the same ID.

I could get your expected result just by doing this in MySQL:
SELECT id, min(record_date), other_cols
FROM mytable
GROUP BY id
Does this work for you?

To get the cheapest product in each category, you use the MIN() function in a correlated subquery as follows:
SELECT categoryid,
productid,
productName,
unitprice
FROM products a WHERE unitprice = (
SELECT MIN(unitprice)
FROM products b
WHERE b.categoryid = a.categoryid)
The outer query scans all rows in the products table and returns the products whose unit price matches the lowest price in their category, as returned by the correlated subquery.

I would like to add to the other answers here: if you need not the first row but, say, the second one, you can use ROW_NUMBER() in a subquery and filter the result set on it.
SELECT * FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY Id ORDER BY record_date, other_cols) as rownum,
P.*
FROM products P
) ranked
WHERE rownum = 2
This also lets you order by multiple columns in the subquery, which helps when two record_dates have identical values. You can also partition by multiple columns if needed by separating them with commas.

This does it simply:
select t2.id,t2.record_date,t2.other_cols
from (select ROW_NUMBER() over(partition by id order by record_date)as rownum,id,record_date,other_cols from MyTable)t2
where t2.rownum = 1
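A quick way to see how ROW_NUMBER() behaves when two rows share the minimum record_date is to run it on a small data set. The following is a sketch on SQLite (3.25 or later) via Python; the row with key_id 8 is a hypothetical tie added for the demonstration, key_id is added to the ORDER BY as a tie-breaker, and the function name first_row_per_id is made up.

```python
import sqlite3

# (key_id, id, record_date, other_cols); key_id 8 is a hypothetical tie for id 19
ROWS = [
    (1, 18, "2011-04-03", "x"),
    (2, 18, "2012-05-19", "y"),
    (3, 18, "2012-08-09", "z"),
    (4, 19, "2009-06-01", "a"),
    (5, 19, "2011-04-03", "b"),
    (8, 19, "2009-06-01", "e"),
]


def first_row_per_id(rows):
    """Return exactly one row per id: earliest record_date, lowest key_id on ties."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (key_id INTEGER, id INTEGER, record_date TEXT, other_cols TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", rows)
    query = """
    SELECT key_id, id, record_date, other_cols
    FROM (
        SELECT m.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY record_date, key_id) AS rn
        FROM MyTable m
    ) ranked
    WHERE rn = 1
    ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


first_rows = first_row_per_id(ROWS)
```

Without the key_id tie-breaker, ROW_NUMBER() would still return one row per id, but which of the tied rows wins would be left to the database.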

If record_date has no duplicates within a group:
Think of it as filtering. Simply keep (WHERE) the one row that has MIN(record_date) in the current group:
SELECT * FROM t t1 WHERE record_date = (
select MIN(record_date)
from t t2 where t2.group_id = t1.group_id)
If there could be two or more rows with the minimum record_date within a group:
First filter out the non-minimum rows (see above), then (AND) pick only one of the tied rows within the given group_id, e.g. the one with the smallest unique key:
AND key_id = (select MIN(key_id)
from t t3 where t3.record_date = t1.record_date
and t3.group_id = t1.group_id)
So for this data
key_id | group_id | record_date | other_cols
1 | 18 | 2011-04-03 | x
4 | 19 | 2009-06-01 | a
8 | 19 | 2009-06-01 | e
the query will select key_ids 1 and 4.
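Putting the two fragments of this answer together gives a single query that needs no window functions. This is a sketch run on SQLite through Python; the table t and its columns follow the answer, the row with key_id 2 is borrowed from the question's data, and the function name min_row_per_group is hypothetical.

```python
import sqlite3

# (key_id, group_id, record_date, other_cols)
ROWS = [
    (1, 18, "2011-04-03", "x"),
    (2, 18, "2012-05-19", "y"),
    (4, 19, "2009-06-01", "a"),
    (8, 19, "2009-06-01", "e"),
]


def min_row_per_group(rows):
    """Keep the earliest row per group_id, breaking date ties by the smallest key_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (key_id INTEGER, group_id INTEGER, record_date TEXT, other_cols TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    query = """
    SELECT t1.key_id, t1.group_id, t1.record_date, t1.other_cols
    FROM t t1
    WHERE t1.record_date = (SELECT MIN(t2.record_date)
                            FROM t t2
                            WHERE t2.group_id = t1.group_id)
      AND t1.key_id = (SELECT MIN(t3.key_id)
                       FROM t t3
                       WHERE t3.record_date = t1.record_date
                         AND t3.group_id = t1.group_id)
    ORDER BY t1.key_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


selected = min_row_per_group(ROWS)
```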

SELECT p.* FROM tbl p
INNER JOIN(
SELECT t.id, MIN(record_date) AS MinDate
FROM tbl t
GROUP BY t.id
) t ON p.id = t.id AND p.record_date = t.MinDate
GROUP BY p.id
The final GROUP BY collapses the result to one row per id when several rows of the same id share the minimum record_date.
If you want to keep those duplicates, remove the last line, GROUP BY p.id.

This is an old question, but this may be useful for someone.
In my case I can't use a subquery, because I have a big query and need to apply min() to its result; with a subquery the database would have to re-execute the big query. I'm using MySQL.
select t.*
from (select m.*, @g := 0
from MyTable m -- here I have a big query
order by id, record_date) t
where (1 = case when @g = 0 or @g <> id then 1 else 0 end )
and (@g := id) IS NOT NULL
Basically, I order the result and then use a variable to keep only the first record in each group.

The query below takes the first date for each work order (in a table showing all status changes):
SELECT
WORKORDERNUM,
MIN(DATE)
FROM
WORKORDERS
WHERE
DATE >= to_date('2015-01-01','YYYY-MM-DD')
GROUP BY
WORKORDERNUM

select
department,
min_salary,
(select s1.last_name from staff s1 where s1.salary=s3.min_salary and s1.department=s3.department) lastname
from
(select department, min (salary) min_salary from staff s2 group by s2.department) s3

How to JOIN 2 tables and keep only the most recent records

I have 2 tables I want to join. One table is a historical record of inventory that has a "last updated" date associated with each "piece" of inventory. The other table has the prices for each of those pieces. I want to join the tables so that I get the historical records with each of their prices. eg.
TABLE 1
Date Item Location QTY
06/01/2020 ABC 123 10
06/01/2020 DEF 234 12
06/02/2020 ABC 345 13
06/06/2020 ABC 123 10
TABLE 2
ITEM Price
ABC 34.5
DEF 52.12
RESULT TABLE
Date Item Location QTY Price
06/01/2020 DEF 234 12 52.12
06/02/2020 ABC 345 13 34.5
06/06/2020 ABC 123 10 34.5
The result table keeps only the most recent record for each item and location. TABLE1 updates every minute to show new inventory levels, and it sits at the item/location level of granularity. However, because it is a historical table, older entries with the same item + location combination remain in it, so the same combination can appear many times. Sometimes the entries fall on different dates, sometimes on the same day.
The query I wrote to try to do this is:
SELECT DISTINCT
TB1.DATE
,TB1.ITEM
,TB1.LOCATION
,TB1.QTY
,TB2.ITEM_COST
FROM
(
SCHEMA_1.TABLE1 AS TB1
JOIN SCHEMA_1.TABLE2 AS TB2
ON TB1.ITEM = TB2.ITEM
JOIN (
SELECT ITEM AS ITM,
LOCATION AS LOC,
MAX(DATE) AS MAXDATE
FROM SCHEMA_1.TABLE1
GROUP BY ITEM, LOCATION
)TB3
ON TB1.ITEM = TB3.ITM AND TB1.LOCATION= TB3.LOC AND TB1.DATE= TB3.MAXDATE
)
This query does execute but it gives me duplicates and definitely does not filter for the most recent records only. Not sure what I'm doing wrong here.

A good old subselect should work, too.
This assumes a unique Date per Item, Location pair.
SELECT TB1.*, TB2.Price
FROM SCHEMA_1.TABLE1 AS TB1
JOIN SCHEMA_1.TABLE2 AS TB2 ON TB1.Item = TB2.Item
WHERE TB1.Date = (SELECT MAX(TB3.Date) FROM SCHEMA_1.TABLE1 AS TB3
WHERE TB1.Item = TB3.Item
AND TB1.Location = TB3.Location)
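To try the subselect approach on the question's sample data, here is a small sketch on SQLite via Python. It assumes the dates are stored in ISO format (YYYY-MM-DD) so that MAX() compares them correctly; the lowercase table names and the function name latest_inventory_with_price are stand-ins chosen for this example.

```python
import sqlite3


def latest_inventory_with_price():
    """Return the newest row per (item, location) together with the item's price."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (date TEXT, item TEXT, location INTEGER, qty INTEGER)"
    )
    conn.execute("CREATE TABLE table2 (item TEXT, price REAL)")
    # Sample data from the question, with dates rewritten as ISO strings (assumption)
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?, ?)",
        [
            ("2020-06-01", "ABC", 123, 10),
            ("2020-06-01", "DEF", 234, 12),
            ("2020-06-02", "ABC", 345, 13),
            ("2020-06-06", "ABC", 123, 10),
        ],
    )
    conn.executemany(
        "INSERT INTO table2 VALUES (?, ?)", [("ABC", 34.5), ("DEF", 52.12)]
    )
    query = """
    SELECT tb1.date, tb1.item, tb1.location, tb1.qty, tb2.price
    FROM table1 AS tb1
    JOIN table2 AS tb2 ON tb1.item = tb2.item
    WHERE tb1.date = (SELECT MAX(tb3.date)
                      FROM table1 AS tb3
                      WHERE tb3.item = tb1.item
                        AND tb3.location = tb1.location)
    ORDER BY tb1.date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


latest = latest_inventory_with_price()
```

If two rows for the same item and location share the same latest date, both are returned, which is why the answer's uniqueness assumption matters.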

I would suggest:
SELECT t1.*, t2.ITEM_PRICE
FROM SCHEMA_1.TABLE1 t1 JOIN
(SELECT t2.ITEM, t2.LOCATION,
MAX(t2.ITEM_PRICE) KEEP (DENSE_RANK FIRST ORDER BY t2.DATE DESC) as ITEM_PRICE
FROM SCHEMA_1.TABLE2 t2
GROUP BY t2.ITEM, t2.LOCATION
) t2
USING (ITEM, LOCATION);
Oracle has the convenient functionality to get the "first" or "last" value within a group. KEEP isn't the simplest syntax for this endeavor, but it does exactly what you want.

The column names are changed (dte = Date, loc = Location), but you can try this simple query to get the results:
Select dte dates, item, loc Locations, price, qty from
(Select a.dte, a.item, a.loc, b.price, a.qty,
max(a.dte) OVER (PARTITION BY a.item, a.loc) latest_dt
from table1 a LEFT JOIN table2 b ON a.item = b.item) where dte = latest_dt
order by 1;
Output:
+-----------+------+-----------+-------+-----+
| DATES | ITEM | LOCATIONS | PRICE | QTY |
+-----------+------+-----------+-------+-----+
| 01-JUN-20 | DEF | 234 | 52.12 | 12 |
+-----------+------+-----------+-------+-----+
| 02-JUN-20 | ABC | 345 | 34.5 | 13 |
+-----------+------+-----------+-------+-----+
| 06-JUN-20 | ABC | 123 | 34.5 | 10 |
+-----------+------+-----------+-------+-----+
You can also get the latest date as: max(a.dte) KEEP (DENSE_RANK FIRST order by dte desc) OVER (PARTITION BY a.item, a.loc )

Get the previous record based on conditions in Postgresql

I have two tables: the first contains transaction details and the second contains users' orders:
id | transaction_date
1 | 2019-01-01
2 | 2019-02-01
3 | 2019-01-01
id | transaction_id | amount | user_id
15 | 1 | 7 | 1
20 | 2 | 15 | 1
25 | 3 | 25 | 1
I would like to get the following result: for each of a user's orders, also show the previous amount the user paid, based on the transaction date.
user_id | amount | previous amount
1 | 7 | NULL
1 | 15 | 7
1 | 25 | 15
I tried several things, including the LAG function, but it doesn't seem possible with it because I have to join another table to get the transaction_date. I think I need a subquery with a left join, but I can't figure out how to get only the previous order.
Thanks

This is a join and lag():
select t2.user_id, t2.amount,
lag(t2.amount) over (partition by t2.user_id order by t1.transaction_date) as prev_amount
from table1 t1 join
table2 t2
on t2.transaction_id = t1.id;
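The join-plus-lag() idea can be tried out with the sketch below, which runs on SQLite (3.25 or later) rather than PostgreSQL. One assumption is labeled loudly: transaction 3 is given the date 2019-03-01 here, because that is what the expected output implies, whereas the sample as posted shows 2019-01-01. The function name amounts_with_previous is hypothetical.

```python
import sqlite3


def amounts_with_previous():
    """Return (user_id, amount, previous amount) ordered by transaction date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER, transaction_date TEXT)")
    conn.execute(
        "CREATE TABLE table2 (id INTEGER, transaction_id INTEGER, amount INTEGER, user_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?)",
        [
            (1, "2019-01-01"),
            (2, "2019-02-01"),
            (3, "2019-03-01"),  # assumed date; the question shows 2019-01-01
        ],
    )
    conn.executemany(
        "INSERT INTO table2 VALUES (?, ?, ?, ?)",
        [(15, 1, 7, 1), (20, 2, 15, 1), (25, 3, 25, 1)],
    )
    query = """
    SELECT t2.user_id, t2.amount,
           LAG(t2.amount) OVER (PARTITION BY t2.user_id
                                ORDER BY t1.transaction_date) AS prev_amount
    FROM table1 t1
    JOIN table2 t2 ON t2.transaction_id = t1.id
    ORDER BY t2.user_id, t1.transaction_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


history = amounts_with_previous()
```

When two transactions of the same user share a date, the ordering inside the window is ambiguous, so a second sort key such as the transaction id would be needed.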

Get Monthly Totals from Running Totals

I have a table in a SQL Server 2008 database with two columns that hold running totals called Hours and Starts. Another column, Date, holds the date of a record. The dates are sporadic throughout any given month, but there's always a record for the last hour of the month.
For example:
ContainerID | Date | Hours | Starts
1 | 2010-12-31 23:59 | 20 | 6
1 | 2011-01-15 00:59 | 23 | 6
1 | 2011-01-31 23:59 | 30 | 8
2 | 2010-12-31 23:59 | 14 | 2
2 | 2011-01-18 12:59 | 14 | 2
2 | 2011-01-31 23:59 | 19 | 3
How can I query the table to get the total number of hours and starts for each month between two specified years? (In this case 2011 and 2013.) I know that I need to take the values from the last record of one month and subtract the values from the last record of the previous month. However, I'm having a hard time coming up with a good way to do this in SQL.
As requested, here are the expected results:
ContainerID | Date | MonthlyHours | MonthlyStarts
1 | 2011-01-31 23:59 | 10 | 2
2 | 2011-01-31 23:59 | 5 | 1

Try this:
SELECT c1.ContainerID,
c1.Date,
c1.Hours-c3.Hours AS "MonthlyHours",
c1.Starts - c3.Starts AS "MonthlyStarts"
FROM Containers c1
LEFT OUTER JOIN Containers c2 ON
c1.ContainerID = c2.ContainerID
AND datediff(MONTH, c1.Date, c2.Date)=0
AND c2.Date > c1.Date
LEFT OUTER JOIN Containers c3 ON
c1.ContainerID = c3.ContainerID
AND datediff(MONTH, c1.Date, c3.Date)=-1
LEFT OUTER JOIN Containers c4 ON
c3.ContainerID = c4.ContainerID
AND datediff(MONTH, c3.Date, c4.Date)=0
AND c4.Date > c3.Date
WHERE
c2.ContainerID is null
AND c4.ContainerID is null
AND c3.ContainerID is not null
ORDER BY c1.ContainerID, c1.Date
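The month-end subtraction described in the question can also be sketched with window functions. This swaps in a different technique from the self joins above: ROW_NUMBER() picks the last record of each month and LAG() fetches the previous month's totals. LAG() is not available in SQL Server 2008 (it arrived in SQL Server 2012), so treat this purely as an illustration on SQLite 3.25 or later; the table name containers, the lowercase column names, and the function name monthly_totals are assumptions made for the example.

```python
import sqlite3

# Sample rows from the question: (container_id, date, hours, starts)
ROWS = [
    (1, "2010-12-31 23:59", 20, 6),
    (1, "2011-01-15 00:59", 23, 6),
    (1, "2011-01-31 23:59", 30, 8),
    (2, "2010-12-31 23:59", 14, 2),
    (2, "2011-01-18 12:59", 14, 2),
    (2, "2011-01-31 23:59", 19, 3),
]


def monthly_totals(rows):
    """Turn running totals into per-month totals using month-end records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE containers (container_id INTEGER, date TEXT, hours INTEGER, starts INTEGER)"
    )
    conn.executemany("INSERT INTO containers VALUES (?, ?, ?, ?)", rows)
    query = """
    WITH month_end AS (
        -- rn = 1 marks the last record of each month per container
        SELECT container_id, date, hours, starts,
               ROW_NUMBER() OVER (PARTITION BY container_id, substr(date, 1, 7)
                                  ORDER BY date DESC) AS rn
        FROM containers
    ),
    diffs AS (
        -- subtract the previous month-end totals from the current ones
        SELECT container_id, date,
               hours - LAG(hours) OVER (PARTITION BY container_id ORDER BY date) AS monthly_hours,
               starts - LAG(starts) OVER (PARTITION BY container_id ORDER BY date) AS monthly_starts
        FROM month_end
        WHERE rn = 1
    )
    SELECT container_id, date, monthly_hours, monthly_starts
    FROM diffs
    WHERE monthly_hours IS NOT NULL  -- drops the first month, which has no predecessor
    ORDER BY container_id, date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


totals = monthly_totals(ROWS)
```

The sketch relies on the question's guarantee that every month has a record for its last hour; if a month were missing entirely, the difference would silently span two months.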

Using a recursive CTE and a somewhat 'creative' JOIN condition, you can fetch the next month's value for each ContainerID:
WITH CTE_PREP AS
(
--RN will be 1 for last row in each month for each container
--MonthRank will be sequential number for each subsequent month (to increment easier)
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY ContainerID, YEAR(Date), MONTH(DATE) ORDER BY Date DESC) RN
,DENSE_RANK() OVER (ORDER BY YEAR(Date),MONTH(Date)) MonthRank
FROM Table1
)
, RCTE AS
(
--"Zero row": the last row in December 2010 for each container
SELECT *, Hours AS MonthlyHours, Starts AS MonthlyStarts
FROM CTE_Prep
WHERE YEAR(date) = 2010 AND MONTH(date) = 12 AND RN = 1
UNION ALL
--for each next row just join on MonthRank + 1
SELECT t.*, t.Hours - r.Hours, t.Starts - r.Starts
FROM RCTE r
INNER JOIN CTE_Prep t ON r.ContainerID = t.ContainerID AND r.MonthRank + 1 = t.MonthRank AND t.Rn = 1
)
SELECT ContainerID, Date, MonthlyHours, MonthlyStarts
FROM RCTE
WHERE Date >= '2011-01-01' --to eliminate "zero row"
ORDER BY ContainerID
SQLFiddle DEMO (I have added some data for February and March in order to test on different lengths of months)
