Number values in a column regardless of order - SQL

I have a table like this:
Date       | Week
-----------+-----
2021-01-01 | 53
2021-01-02 | 53
2021-01-03 | 53
2021-01-04 | 1
2021-01-05 | 1
2021-01-06 | 1
2021-01-07 | 1
...        | ...
2021-12-30 | 52
2021-12-31 | 52
I want to rank the weeks not by their values but in ascending Date order. I tried
dense_rank() over (order by Week)
and got this result:
Date       | Week
-----------+-----
2021-01-01 | 53
2021-01-02 | 53
2021-01-03 | 53
2021-01-04 | 1
2021-01-05 | 1
2021-01-06 | 1
2021-01-07 | 1
...        | ...
2021-12-30 | 52
2021-12-31 | 52
But the 53rd week gets rank 53, not rank 1 as I want. Do you know what I need to use in this case? Thanks.

You can try using the MOD function in the ORDER BY. Because the week numbers range from 1 to 53, MOD will calculate
MOD(53, 53) => 0
MOD(1, 53) => 1
and so on.
dense_rank() over (order by MOD(Week, 53))
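The MOD trick can be sketched end to end on a small, hypothetical dataset; this uses SQLite through Python as a stand-in for the asker's database (SQLite spells MOD as the % operator and needs version 3.25+ for window functions):

```python
import sqlite3

# Build a tiny illustrative table: ISO week numbers, where week 53
# belongs to the *start* of 2021.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dt TEXT, week INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("2021-01-01", 53), ("2021-01-02", 53), ("2021-01-04", 1),
    ("2021-01-11", 2), ("2021-12-30", 52),
])
# week 53 -> 53 % 53 = 0, so it sorts (and ranks) first;
# weeks 1..52 follow in their natural order.
rows = conn.execute("""
    SELECT dt, week,
           DENSE_RANK() OVER (ORDER BY week % 53) AS wk_rank
    FROM t
    ORDER BY dt
""").fetchall()
for r in rows:
    print(r)
```

Here week 53 gets rank 1, week 1 gets rank 2, and so on up the calendar.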

Use ORDER BY ... DESC:
select *, row_number() over (order by week desc) from table_name

You can simply play with Vertica's date/time functions - and add #D-Shih's clever idea with the modulo function to it; no dense_rank() is needed if the result you display is the one you want:
WITH indata (dt) AS (
            SELECT DATE '2020-12-30'
  UNION ALL SELECT DATE '2020-12-31'
  UNION ALL SELECT DATE '2021-01-01'
  UNION ALL SELECT DATE '2021-01-02'
  UNION ALL SELECT DATE '2021-01-03'
  UNION ALL SELECT DATE '2021-01-04'
  UNION ALL SELECT DATE '2021-01-05'
  [...]
  UNION ALL SELECT DATE '2021-12-30'
  UNION ALL SELECT DATE '2021-12-31'
  UNION ALL SELECT DATE '2022-01-01'
  UNION ALL SELECT DATE '2022-01-02'
  UNION ALL SELECT DATE '2022-01-03'
  UNION ALL SELECT DATE '2022-01-04'
)
SELECT
  dt
, WEEK(dt)              AS stdweek
, WEEK_ISO(dt)          AS isoweek
, MOD(WEEK(dt), 53)     AS stdwkmod53
, MOD(WEEK_ISO(dt), 53) AS isowkmod53
FROM indata;
-- out      dt     | stdweek | isoweek | stdwkmod53 | isowkmod53
-- out ------------+---------+---------+------------+------------
-- out  2020-12-30 |      53 |      53 |          0 |          0
-- out  2020-12-31 |      53 |      53 |          0 |          0
-- out  2021-01-01 |       1 |      53 |          1 |          0
-- out  2021-01-02 |       1 |      53 |          1 |          0
-- out  2021-01-03 |       2 |      53 |          2 |          0
-- out  2021-01-04 |       2 |       1 |          2 |          1
-- out  2021-01-05 |       2 |       1 |          2 |          1
[...]
-- out  2021-12-30 |      53 |      52 |          0 |         52
-- out  2021-12-31 |      53 |      52 |          0 |         52
-- out  2022-01-01 |       1 |      52 |          1 |         52
-- out  2022-01-02 |       2 |      52 |          2 |         52
-- out  2022-01-03 |       2 |       1 |          2 |          1
-- out  2022-01-04 |       2 |       1 |          2 |          1

Related

Oracle SQL compare dates less than two days old

Hi guys, I need your support with how to implement this logic.
I'm currently stuck and I really don't know how to proceed.
Target: compare ref_num, entry_date and status = 1
with
ref_num, change_status_date, status = 0 (always the last two rows).
If change_status_date and entry_date are less than 2 days apart, then update the status value from status = 1 to status = 2; if they are more than 3 days apart, change it to status = 0.
Any idea how to write a correct SELECT and UPDATE SQL?
+---------+---------------------+---------------------+--------+
| ref_num | entry_date          | change_status_date  | status |
+---------+---------------------+---------------------+--------+
| x326585 | 28/04/2020 16:54:14 |                     | 1      |
| x326585 | 25/04/2020 13:14:00 | 27/04/2020 23:44:00 | 0      |
| x326585 | 20/04/2020 11:15:02 | 20/04/2020 23:52:01 | 0      |
| A142585 | 28/04/2020 16:55:14 |                     | 1      |
| A142585 | 26/04/2020 11:54:04 | 27/04/2020 22:54:51 | 0      |
| A142585 | 24/04/2020 10:44:14 | 25/04/2020 13:17:23 | 0      |
| B188532 | 29/04/2020 11:34:41 |                     | 1      |
| B188532 | 14/04/2020 11:44:24 | 15/05/2020 23:11:10 | 0      |
| B188532 | 11/04/2020 08:34:10 | 13/05/2020 11:44:41 | 0      |
+---------+---------------------+---------------------+--------+
END RESULTS:
+---------+---------------------+---------------------+--------+
| ref_num | entry_date          | change_status_date  | status |
+---------+---------------------+---------------------+--------+
| x326585 | 28/04/2020 16:54:14 | 27/07/2020 23:47:31 | 2      |  less than 2 days (28/04/2020 16:54:14 - 27/04/2020 23:44:00) -> status 2
| x326585 | 25/04/2020 13:14:00 | 27/04/2020 23:44:00 | 0      |
| x326585 | 20/04/2020 11:15:02 | 20/04/2020 23:52:01 | 0      |
| A142585 | 28/04/2020 16:35:58 | 27/07/2020 23:47:31 | 2      |  less than 2 days (28/04/2020 16:35:58 - 27/04/2020 22:54:51) -> status 2
| A142585 | 26/04/2020 11:54:04 | 27/04/2020 22:54:51 | 0      |
| A142585 | 24/04/2020 10:44:14 | 25/04/2020 13:17:23 | 0      |
| B188532 | 29/04/2020 11:34:41 | 27/07/2020 23:47:31 | 0      |  more than 3 days (29/04/2020 11:34:41 - 15/05/2020 23:11:10) -> status 0
| B188532 | 14/04/2020 11:44:24 | 15/05/2020 23:11:10 | 0      |
| B188532 | 11/04/2020 08:34:10 | 13/05/2020 11:44:41 | 0      |
+---------+---------------------+---------------------+--------+
select x.ref_num, x.entry_date, x.change_status_date, x.status from kl_table x
Thank you for your support and advice
This is how I understood the question.
Sample data:
SQL> with test (ref_num, entry_date, change_status_date, status) as
       (select 'x3', to_date('28.04.2020 16:54', 'dd.mm.yyyy hh24:mi'), null, 1 from dual union all
        select 'x3', to_date('25.04.2020 13:14', 'dd.mm.yyyy hh24:mi'), to_date('27.04.2020 23:44', 'dd.mm.yyyy hh24:mi'), 0 from dual union all
        select 'x3', to_date('20.04.2020 11:15', 'dd.mm.yyyy hh24:mi'), to_date('20.04.2020 23:52', 'dd.mm.yyyy hh24:mi'), 0 from dual union all
        --
        select 'b1', to_date('29.04.2020 11:34', 'dd.mm.yyyy hh24:mi'), null, 1 from dual union all
        select 'b1', to_date('14.04.2020 11:44', 'dd.mm.yyyy hh24:mi'), to_date('15.05.2020 23:11', 'dd.mm.yyyy hh24:mi'), 0 from dual union all
        select 'b1', to_date('11.04.2020 08:34', 'dd.mm.yyyy hh24:mi'), to_date('13.05.2020 11:44', 'dd.mm.yyyy hh24:mi'), 0 from dual
       ),
Max change_status_date for that ref_num whose status = 0; it'll be compared to entry_date:
     temp as
       (select a.ref_num,
               a.entry_date,
               a.change_status_date,
               (select max(b.change_status_date)
                  from test b
                 where b.ref_num = a.ref_num
                   and b.status = 0
               ) compare_change_status_date,
               a.status
          from test a
       )
Finally: I presume that change_status_date (which was NULL) should be replaced by sysdate. The difference between those dates is wrapped in ABS to eliminate negative numbers.
     select t.ref_num,
            t.entry_date,
            nvl(t.change_status_date, sysdate) change_status_date,
            case when t.status = 1 then
                   case when abs(t.entry_date - t.compare_change_status_date) < 2 then 2
                        when abs(t.entry_date - t.compare_change_status_date) > 3 then 0
                   end
                 else t.status
            end status
       from temp t
      order by t.ref_num desc, t.entry_date desc;

RE ENTRY_DATE       CHANGE_STATUS_DA STATUS
-- ---------------- ---------------- ------
x3 28.04.2020 16:54 28.07.2020 08:21      2
x3 25.04.2020 13:14 27.04.2020 23:44      0
x3 20.04.2020 11:15 20.04.2020 23:52      0
b1 29.04.2020 11:34 28.07.2020 08:21      0
b1 14.04.2020 11:44 15.05.2020 23:11      0
b1 11.04.2020 08:34 13.05.2020 11:44      0

6 rows selected.
SQL>
If you want to update rows whose status = 1, the code I posted above can be reused, e.g. in a MERGE:
merge into test a
using (with temp as
         (select a.ref_num,
                 a.entry_date,
                 a.change_status_date,
                 (select max(b.change_status_date)
                    from test b
                   where b.ref_num = a.ref_num
                     and b.status = 0) compare_change_status_date,
                 a.status
            from test a)
       select t.ref_num,
              t.entry_date,
              nvl(t.change_status_date, sysdate) change_status_date,
              case when t.status = 1 then
                     case when abs(t.entry_date - t.compare_change_status_date) < 2 then 2
                          when abs(t.entry_date - t.compare_change_status_date) > 3 then 0
                     end
                   else t.status
              end status
         from temp t) x
on (a.ref_num = x.ref_num and a.entry_date = x.entry_date)
when matched then
  update set a.status = x.status
  where a.status = 1;
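For readers without an Oracle instance at hand, the same rule can be sketched as a plain UPDATE in SQLite via Python. This is a hedged stand-in, not the Oracle code: dates are stored as ISO strings (unlike the question's dd/mm/yyyy format) so that julianday() can replace Oracle date arithmetic, and only a subset of the sample rows is used.

```python
import sqlite3

# Illustrative table following the question's kl_table naming.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE kl_table (
    ref_num TEXT, entry_date TEXT, change_status_date TEXT, status INTEGER)""")
conn.executemany("INSERT INTO kl_table VALUES (?, ?, ?, ?)", [
    ("x326585", "2020-04-28 16:54:14", None,                  1),
    ("x326585", "2020-04-25 13:14:00", "2020-04-27 23:44:00", 0),
    ("B188532", "2020-04-29 11:34:41", None,                  1),
    ("B188532", "2020-04-14 11:44:24", "2020-05-15 23:11:10", 0),
])
# For each status=1 row, compare its entry_date with the latest
# change_status_date of the same ref_num's status=0 rows.
conn.execute("""
    UPDATE kl_table AS a SET status =
        CASE WHEN ABS(julianday(a.entry_date) -
                      julianday((SELECT MAX(b.change_status_date)
                                 FROM kl_table b
                                 WHERE b.ref_num = a.ref_num
                                   AND b.status = 0))) < 2 THEN 2
             WHEN ABS(julianday(a.entry_date) -
                      julianday((SELECT MAX(b.change_status_date)
                                 FROM kl_table b
                                 WHERE b.ref_num = a.ref_num
                                   AND b.status = 0))) > 3 THEN 0
             ELSE a.status END
    WHERE a.status = 1
""")
result = conn.execute(
    "SELECT ref_num, status FROM kl_table WHERE change_status_date IS NULL"
).fetchall()
print(result)
```

x326585's open row is within 2 days of its last status-0 change, so it becomes 2; B188532's is more than 3 days away, so it becomes 0.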
Your solution will look like this.
update [tablename] set status=2 where DATEDIFF(day, [tablename].entry_date, [tablename].change_status_date) < 2
update [tablename] set status=0 where DATEDIFF(day, [tablename].entry_date, [tablename].change_status_date) > 3
Thanks

Using LAG() function to find past value

I've used the LAG function to find the previous value. However, I have run into an issue that requires a more complex query.
Here is my scenario.
Our table keeps month-end data for each record, with the exception of the last 95 days: we like to keep daily records for the last 95 days. This is what I mean by month-end and daily records:
ID   Date        Amount
123  10/31/2019  52
123  11/30/2019  56
123  12/31/2019  59
123  01/25/2020  32
123  01/26/2020  28
123  ...         ..
123  03/12/2020  103
Imagine that the ... represents a daily record for id 123 up until yesterday.
My query worked perfectly for our month-end historical data, but I ran into an issue with our daily historical data.
What I want is to be able to get the value from the last day of the previous month, for all months.
This is what I currently have for my query:
Select ID, Date, Amount,
       LAG(Amount, 1, 0) OVER (PARTITION BY ID ORDER BY Date) AS SharePreviousBalance
from dbo.shares
where Date >= '20191031'
This is the output I would like to have, but my current query does not work the way I want:
ID   Date        Amount  SharePreviousBalance
123  10/31/2019  52      0
123  11/30/2019  56      52
123  12/31/2019  59      56
123  ...         ..      ..
123  01/25/2020  32      0
123  01/26/2020  28      0
123  01/27/2020  28      0
123  ...         ..      ..
123  01/31/2020  28      59
123  ...         ..      ..
123  02/15/2020  28      0
123  ...         ..      ..
123  02/29/2020  25      28
123  ...         ..      ..
123  03/05/2020  29      0
123  ...         ..      ..
123  03/10/2020  30      0
123  ...         ..      ..
123  03/12/2020  103     25
Any Ideas?
Thank you
With a little conditional logic, you can still do this with lag():
select
t.*,
case when date = eomonth(date) then
coalesce(
lag(amount) over(
partition by id, case when date = eomonth(date) then 1 else 0 end
order by date
),
0
)
end SharePreviousBalance
from mytable t
The idea is to build a partition for "end-of-month" rows (i.e. rows whose date is the last day of a month). Within that partition, an end-of-month row can access the previous end of month with lag().
Demo on DB Fiddle - I added a few rows to your sample data:
ID | Date | Amount | SharePreviousBalance
--: | :--------- | -----: | -------------------:
123 | 2019-10-31 | 52 | 0
123 | 2019-11-30 | 56 | 52
123 | 2019-12-31 | 59 | 56
123 | 2020-01-20 | 28 | null
123 | 2020-01-25 | 32 | null
123 | 2020-01-26 | 28 | null
123 | 2020-01-31 | 28 | 59
123 | 2020-02-12 | 103 | null
123 | 2020-02-28 | 103 | null
123 | 2020-02-29 | 103 | 28
If you also want to show the value of the previous end of month for the current date, then add that row to the "end-of-month" partition:
select
t.*,
case when date in (eomonth(date), cast(getdate() as date)) then
coalesce(
lag(amount) over(
partition by
id,
case when date in (eomonth(date), cast(getdate() as date)) then 1 else 0 end
order by date
),
0
)
end SharePreviousBalance
from mytable t
order by id, date
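The end-of-month partition idea can be sketched in SQLite via Python (an assumed stand-in, not SQL Server: the table name shares is illustrative, and date(d, 'start of month', '+1 month', '-1 day') plays the role of EOMONTH()):

```python
import sqlite3

# Mix of month-end and daily rows, mirroring the question's shape.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shares (id INTEGER, dt TEXT, amount INTEGER)")
conn.executemany("INSERT INTO shares VALUES (?, ?, ?)", [
    (123, "2019-10-31", 52), (123, "2019-11-30", 56), (123, "2019-12-31", 59),
    (123, "2020-01-25", 32), (123, "2020-01-31", 28),
])
# The boolean "is this date the last day of its month?" splits the rows
# into two partitions; LAG() within the end-of-month partition then skips
# straight over the daily rows.
rows = conn.execute("""
    SELECT id, dt, amount,
           CASE WHEN dt = date(dt, 'start of month', '+1 month', '-1 day') THEN
               COALESCE(LAG(amount) OVER (
                   PARTITION BY id,
                       dt = date(dt, 'start of month', '+1 month', '-1 day')
                   ORDER BY dt), 0)
           END AS prev_eom_amount
    FROM shares
    ORDER BY dt
""").fetchall()
for r in rows:
    print(r)
```

Daily rows (like 2020-01-25) come back NULL, while each month-end row sees the amount of the previous month-end.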

Showing date even zero value SQL

I have this SQL query:
SELECT Date, Hours, Counts FROM TRANSACTION_DATE
Example Output:
Date | Hours | Counts
----------------------------------
01-Feb-2018 | 20 | 5
03-Feb-2018 | 25 | 3
04-Feb-2018 | 22 | 3
05-Feb-2018 | 21 | 2
07-Feb-2018 | 28 | 1
10-Feb-2018 | 23 | 1
As you can see, there are days missing because there is no data for them, but I want the missing days to be shown with a value of zero:
Date | Hours | Counts
----------------------------------
01-Feb-2018 | 20 | 5
02-Feb-2018 | 0 | 0
03-Feb-2018 | 25 | 3
04-Feb-2018 | 22 | 3
05-Feb-2018 | 21 | 2
06-Feb-2018 | 0 | 0
07-Feb-2018 | 28 | 1
08-Feb-2018 | 0 | 0
09-Feb-2018 | 0 | 0
10-Feb-2018 | 23 | 1
Thank you in advance.
You need to generate a sequence of dates. If there are not too many, a recursive CTE is an easy method:
with dates as (
      select min(date) as dte, max(date) as last_date
      from transaction_date td
      union all
      select dateadd(day, 1, dte), last_date
      from dates
      where dte < last_date
)
select d.dte as date, coalesce(td.hours, 0) as hours, coalesce(td.counts, 0) as counts
from dates d left join
     transaction_date td
     on d.dte = td.date;
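A runnable sketch of the same approach in SQLite through Python (an assumed stand-in: WITH RECURSIVE plus the date() function replaces DATEADD, and the column names are illustrative):

```python
import sqlite3

# Sparse data: 2018-02-02 and 2018-02-04 are missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transaction_date (dt TEXT, hours INT, counts INT)")
conn.executemany("INSERT INTO transaction_date VALUES (?, ?, ?)", [
    ("2018-02-01", 20, 5), ("2018-02-03", 25, 3), ("2018-02-05", 21, 2),
])
# Recursive CTE generates every date from MIN(dt) to MAX(dt); the LEFT
# JOIN + COALESCE fills the gaps with zeros.
rows = conn.execute("""
    WITH RECURSIVE dates(dte) AS (
        SELECT MIN(dt) FROM transaction_date
        UNION ALL
        SELECT date(dte, '+1 day') FROM dates
        WHERE dte < (SELECT MAX(dt) FROM transaction_date)
    )
    SELECT d.dte,
           COALESCE(td.hours, 0)  AS hours,
           COALESCE(td.counts, 0) AS counts
    FROM dates d
    LEFT JOIN transaction_date td ON td.dt = d.dte
    ORDER BY d.dte
""").fetchall()
for r in rows:
    print(r)
```

Note that recursive CTEs have a depth limit in most engines (SQL Server's MAXRECURSION, SQLite's recursion limit), so for very long date ranges a calendar table is the sturdier choice.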

creating complete historical timeline from overlapping intervals

I have the table below, which contains a code, from, to and hours. The problem is that I have overlapping dates in the intervals. Instead I want to create a complete historical timeline, so when the code is identical and there is an overlap it should sum the hours, as in the desired result.
** table **
+------+------------+------------+-------+
| code | from       | to         | hours |
+------+------------+------------+-------+
|    1 | 2013-05-01 | 2013-09-30 |    37 |
|    1 | 2013-05-01 | 2014-02-28 |    10 |
|    1 | 2013-10-01 | 9999-12-31 |     5 |
+------+------------+------------+-------+
desired result:
+------+------------+------------+-------+
| code | from       | to         | hours |
+------+------------+------------+-------+
|    1 | 2013-05-01 | 2013-09-30 |    47 |
|    1 | 2013-10-01 | 2014-02-28 |    15 |
|    1 | 2014-02-29 | 9999-12-31 |     5 |
+------+------------+------------+-------+
Oracle Setup:
CREATE TABLE Table1 ( code, "FROM", "TO", hours ) AS
SELECT 1, DATE '2013-05-01', DATE '2013-09-30', 37 FROM DUAL UNION ALL
SELECT 1, DATE '2013-05-01', DATE '2014-02-28', 10 FROM DUAL UNION ALL
SELECT 1, DATE '2013-10-01', DATE '9999-12-31', 5 FROM DUAL;
Query:
SELECT *
FROM (
SELECT code,
dt AS "FROM",
LEAD( dt ) OVER ( PARTITION BY code ORDER BY dt ASC, value DESC, ROWNUM ) AS "TO",
hours
FROM (
SELECT code,
dt,
SUM( hours * value ) OVER ( PARTITION BY code ORDER BY dt ASC, VALUE DESC ) AS hours,
value
FROM table1
UNPIVOT ( dt FOR value IN ( "FROM" AS 1, "TO" AS -1 ) )
)
)
WHERE "FROM" + 1 < "TO";
Results:
CODE FROM       TO         HOURS
---- ---------- ---------- -----
   1 2013-05-01 2013-09-30    47
   1 2013-10-01 2014-02-28    15
   1 2014-02-28 9999-12-31     5
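The unpivot-and-running-sum idea behind this query can also be sketched outside the database. The following pure-Python sweep (merge_intervals is a hypothetical helper; a single code is assumed) turns each interval into a +hours event at from and a -hours event the day after to, then walks the sorted boundaries keeping a running total. Note the last segment starts 2014-03-01, the day after 2014-02-28 (2014 is not a leap year, so the 2014-02-29 in the question's desired result cannot exist).

```python
from datetime import date, timedelta

def merge_intervals(rows):
    """rows: (code, from, to, hours) tuples for one code; 'to' is inclusive."""
    events = {}
    for _, frm, to, hours in rows:
        events[frm] = events.get(frm, 0) + hours
        if to < date.max:                       # 9999-12-31 == open-ended
            end = to + timedelta(days=1)
            events[end] = events.get(end, 0) - hours
    out, running, prev = [], 0, None
    for day in sorted(events):
        if prev is not None and running != 0:
            # close the segment that ran from prev up to the day before
            out.append((prev, day - timedelta(days=1), running))
        running += events[day]
        prev = day
    if running != 0:                            # trailing open interval
        out.append((prev, date.max, running))
    return out

rows = [(1, date(2013, 5, 1), date(2013, 9, 30), 37),
        (1, date(2013, 5, 1), date(2014, 2, 28), 10),
        (1, date(2013, 10, 1), date(9999, 12, 31), 5)]
for seg in merge_intervals(rows):
    print(seg)
```

Gaps (where the running total drops to zero) simply produce no output row, matching what the SQL's WHERE clause filters out.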

Weekly Average Reports: Redshift

My sales data for the first two weeks of June (the Monday dates, 1st Jun and 8th Jun) is below:
date | count
2015-06-01 03:25:53 | 1
2015-06-01 03:28:51 | 1
2015-06-01 03:49:16 | 1
2015-06-01 04:54:14 | 1
2015-06-01 08:46:15 | 1
2015-06-01 13:14:09 | 1
2015-06-01 16:20:13 | 5
2015-06-01 16:22:13 | 1
2015-06-01 16:27:07 | 1
2015-06-01 16:29:57 | 1
2015-06-01 19:16:45 | 1
2015-06-08 10:54:46 | 1
2015-06-08 15:12:10 | 1
2015-06-08 20:35:40 | 1
I need to find the weekly average of sales in a given range.
Complex query:
(some_manipulation_part), ifact as
( select date, sales_count from final_result_set
)
select date_part('h', date) as h,
       date_part('dow', date) as day_of_week,
       count(sales_count)
from final_result_set
group by h, day_of_week;
Output :
h | day_of_week | count
3 | 1 | 3
4 | 1 | 1
8 | 1 | 1
10 | 1 | 1
13 | 1 | 1
15 | 1 | 1
16 | 1 | 8
19 | 1 | 1
20 | 1 | 1
If I try to apply avg on the above final result, it does not actually fetch the correct answer:
(some_manipulation_part), ifact as
( select date, sales_count from final_result_set
)
select date_part('h', date) as h,
       date_part('dow', date) as day_of_week,
       avg(sales_count)
from final_result_set
group by h, day_of_week;
h | day_of_week | count
3 | 1 | 1
4 | 1 | 1
8 | 1 | 1
10 | 1 | 1
13 | 1 | 1
15 | 1 | 1
16 | 1 | 1
19 | 1 | 1
20 | 1 | 1
So I have two Mondays in the given range, but it is not actually dividing by that. I am not even sure what is happening inside Redshift.
To get "weekly averages" use date_trunc():
SELECT date_trunc('week', my_date_column) as week
, avg(sales_count) AS avg_sales
FROM final_result_set
GROUP BY 1;
I hope you are not actually using date as the name of your date column. It's a reserved word in SQL and a basic type name; don't use it as an identifier.
If you group by the day of week (DOW) you get averages per weekday, and Sunday is 0. (Use ISODOW to get 7 for Sunday.)
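The date_trunc('week', ...) grouping can be sketched in SQLite via Python (an assumed stand-in for Redshift: SQLite has no date_trunc, but date(ts, '-6 days', 'weekday 1') snaps any timestamp back to its Monday, which is what the 'week' truncation does):

```python
import sqlite3

# Two sales events on each of two Mondays, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ts TEXT, sales_count INT)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [
    ("2015-06-01 03:25:53", 1), ("2015-06-01 16:20:13", 5),
    ("2015-06-08 10:54:46", 1), ("2015-06-08 15:12:10", 1),
])
# Grouping by the Monday of each week gives one row per week with the
# average sales_count for that week.
rows = conn.execute("""
    SELECT date(ts, '-6 days', 'weekday 1') AS week,
           AVG(sales_count) AS avg_sales
    FROM sales
    GROUP BY week
    ORDER BY week
""").fetchall()
print(rows)
```

Unlike the DOW grouping, every row of a week lands in the same bucket, so the average really is per week.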