date_format() unexpected results with <= or <= in where clause - hive

We are using a built in calendar table dm_reference.dim_date with a sequence of dates:
select * from dm_reference.dim_date limit 10;
calendar_date date_name, day_of_week
1999-01-01 January 1, 1999 1 5 Friday
1999-01-02 January 2, 1999 2 6 Saturday
1999-01-03 January 3, 1999 3 7 Sunday
1999-01-04 January 4, 1999 4 1 Monday
1999-01-05 January 5, 1999 5 2 Tuesday
1999-01-06 January 6, 1999 6 3 Wednesday
1999-01-07 January 7, 1999 7 4 Thursday
1999-01-08 January 8, 1999 8 5 Friday
1999-01-09 January 9, 1999 9 6 Saturday
1999-01-10 January 10, 1999 10 7 Sunday
I wanted to filter this to only include dates between August 2014 and the current year month.
If I select min(date_format(calendar_date, "YYYYMM")) from dm_reference.dim_date I get returned 199901
So, I tried the following query to format my calendar_date field as year and month and then to filter to include dates between august 14 now:
select
distinct date_format(calendar_date, "YYYY-MMM") as year_month
, date_format(calendar_date, "YYYYMM") as year_month_num -- for ordering in asc
from dm_reference.dim_date
where date_format(calendar_date, "YYYYMM") <= 201408
and date_format(calendar_date, "YYYYMM") <= date_format(from_unixtime(unix_timestamp()), "YYYYMM")
order by year_month_num;
This returns dates going all the way back to 1999 whereas I expected the earliest date in this query result to be august 2014.
Any idea why this is happening? How can I query our calendar to only include a filtered date range?

I think you are complicating the query. You can simply filter for the required date range using
select *
from dm_reference.dim_date
where calendar_date >= '2014-08-01' and calendar_date < trunc(current_date,'MM')
This outputs all dates on or after August 2014 until the end of last month. If you need data until today, use the ending condition as calendar_date <= current_date.
The reason your query doesn't return expected results is because the condition is year month of calendar_date <= '201408' and year month of calendar_date <= '201808', which is everything before 201408.

Related

SQL - POSTGRES - DATE_PART why is sql resulting in week 53 when it should be week 2

TABLE
INSERT INTO runners
("runner_id", "registration_date")
VALUES
(1, '2021-01-01'),
(2, '2021-01-03'),
(3, '2021-01-08'),
(4, '2021-01-15');
SQL Query
SELECT
DATE_PART('WEEK', R.registration_date) AS week_num,
COUNT(runner_id)
FROM
pizza_runner.runners R
GROUP BY
week_num
ORDER BY
week_num ASC;
I was expecting the query to return weeks 1 and 2 only but for some reason I am getting 53
]1
I was expecting the query to return weeks 1 and 2 only but for some reason I am getting 53
The documentation does a good job explaining the ISO rules for weeks - which Postgres follows:
The number of the ISO 8601 week-numbering week of the year. By definition, ISO weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
Using your dataset:
SELECT r.*,
extract(week from registration_date) AS week_num,
extract(isodow from registration_date) as day_of_week
FROM runners r
ORDER BY registration_date;
runner_id
registration_date
week_num
day_of_week
1
2021-01-01
53
5
2
2021-01-03
53
7
3
2021-01-08
1
5
4
2021-01-15
2
5
It turns out that January 3rd, 2021 was a Sunday (day of week 7). January 4st, 2021 was a Monday, and according to the ISO rules this is when the first week of that year began. Previous dates (January 3rd, 2nd, 1st, and so on) belong to the last week of 2020 (week 53), although the dates belong to year 2021.

Teradata SQL Week Number - Week 1 starting 1st Jan with weeks aligned to specific day of the week

my first post on here so please be gentle...
I'm trying to create a week number variable in Teradata (SQL) that does the following:
Week 1 always starts on 1st January of the given year
Week numbers increment on the specified day of the week
For example: If Saturday was the specified day of the week:
2019-01-01 would be the start of week 1, 2019, changing to week 2 on 2019-01-05
2020-01-01 would be the start of week 1, 2020, changing to week 2 on 2020-01-04
I have come up wit the following based on an Excel function however it doesn't quite work as expected:
ROUND(((DATE_SPECIFIED - CAST(EXTRACT(YEAR FROM DATE_SPECIFIED) || '-01-01' AS DATE) + 1) - ((DATE_SPECIFIED - DATE '0001-01-06') MOD 7 + 1) + 10) / 7) AS REQUIRED_WEEK
The last digit of the section - DATE '0001-01-06' deals with the specified day of the week, where '0001-01-01' would be Monday.
This works in some cases however for some years, the first week number is showing as 0 where it should be 1, e.g. 1st Jan 2018 / 2019 are fine whereas 1st Jan 2020 is not.
Any ideas to correct this would be gratefully received.
Many thanks,
Mike
You can apply NEXT_DAY for both the specified date and Jan 1st of that year, e.g. for Saturday as week start:
(Next_Day(DATE_SPECIFIED,'SAT') - Next_Day(Trunc(DATE_SPECIFIED,'yyyy'),'SAT')) / 7 +1
Hmmm . . . I'm a bit week on Teradata functions. But the idea is to get the start of the second week. This follows the rule:
Jan 1 weekday (TD) 2nd week
Sunday 1 01-02
Monday 2 01-08
Tuesday 3 01-07
Wednesday 4 01-06
Thursday 5 01-05
Friday 6 01-04
Saturday 7 01-03
I think the following logic calculates this:
select t.*,
(case when td_day_of_week(cast(extract(year from DATE_SPECIFIED) || '-01-01' as date) ) = 1
then cast(extract(year from DATE_SPECIFIED) + '-01-02' as date)
else extract(year from DATE_SPECIFIED) + 10 - cast(td_day_of_week(cast(extract(year from DATE_SPECIFIED) || '-01-01') as date)
from t;
Then do you your week calculate either from the second week or subtract one more week to get when the first week really starts.

How to Find Week number, Period and year from Date in Redshift? (Week Starting with Wednesday and ends up with Tuesday)

Need to find weekNumber like 1,2,3,4 but the week starts with Wednesday and ends with Tuesday from date column and after the 4th week, again the week restart by again as the 1st week and so on (no need to consider month).
Need to find the Period based on weekNumber only, 4 weeks as 1 Period and Periods end with 13 (period 1-13) will restart again 1st period.
(4 weeks = 1 period) (no need to consider month).
Now need to calculate the businessyear based on Period. 13 Periods as One businessyear. (13 periods = 1 year)
Calculation logic:
7 days * 4 weeks = 28 days = 1 period
13 periods = 1 businessyear
Example:
A year has 365 days normally
In my scenario, 4 weeks * 7 days = 28 days
28 days *13 periods = 364 days
The remaining days will come as the 5th week and period 14.
Datekey date Year semistor Quarter Month DayName DayNum Wnumber
20090101 01-01-2009 2009 1 1 January 1 Thursday 1 0
20090102 02-01-2009 2009 1 1 January 1 Friday 2 0
20090103 03-01-2009 2009 1 1 January 1 Saturday 3 0
20090104 04-01-2009 2009 1 1 January 1 Sunday 0
20090105 05-01-2009 2009 1 1 January 1 Monday 0
20090106 06-01-2009 2009 1 1 January 1 Tuesday 6 0
20090107 07-01-2009 2009 1 1 January 1 Wednesday 0 0
20090108 08-01-2009 2009 1 1 January 1 Thursday 1 1
20090109 09-01-2009 2009 1 1 January 1 Friday 2 1
20090110 10-01-2009 2009 1 1 January 1 Saturday 3 1
20090111 11-01-2009 2009 1 1 January 1 Sunday 4 1
20090112 12-01-2009 2009 1 1 January 1 Monday 5 1
20090113 13-01-2009 2009 1 1 January 1 Tuesday 6 1
20090114 14-01-2009 2009 1 1 January 1 Wednesday 0 1
No need to consider the month in my scenario, need to consider leap year also (2016, 2020).
The traditional way to do this type of thing is to create a calendar table in the database. Then, your queries can simply JOIN to the calendar table to extract the relevant value.
I find that the easiest way to create the calendar table is to use Excel. Simply write some formulas that provide the desired values and Copy Down for the next decade or so. Then, save the sheet as CSV and load it into the database.
This way, you can totally avoid complex calculations involving database functions and you can use whatever rules you wish.

Need list of weeknumbers between two dates

I need a resultset of weeknumers, year and startdate of all weeks between two dates. I need it to match other search results to the weeks. Since the report will span just over a year, I need it to match the calendars.
We're in Europe, so weeks start on MONDAY.
I use SQL Server via a JDBC connection. I cannot use the calander table.
I've come across various solutions, but none does just what I need it to. This is the kind of list I need, but somehow the results are not correct. I can't find my mistake:
WITH mycte AS
(
SELECT DATEADD(ww, DATEDIFF(ww,0,CAST('2010-12-01' AS DATETIME)), 0) DateValue
UNION ALL
SELECT DateValue + 7
FROM mycte
WHERE DateValue + 7 < '2016-12-31'
)
SELECT DATEPART(wk, DateValue) as week, DATEPART(year, DateValue) as year, DateValue
FROM mycte
OPTION (MAXRECURSION 0);
I used --SET DATEFIRST 1; to make sure weeks start on monday.
The result looks like:
week year DateValue
----------- ----------- -------------------------
49 2010 2010-11-29 00:00:00.0
50 2010 2010-12-06 00:00:00.0
51 2010 2010-12-13 00:00:00.0
52 2010 2010-12-20 00:00:00.0
53 2010 2010-12-27 00:00:00.0
2 2011 2011-01-03 00:00:00.0
3 2011 2011-01-10 00:00:00.0
4 2011 2011-01-17 00:00:00.0
5 2011 2011-01-24 00:00:00.0
6 2011 2011-01-31 00:00:00.0
The problem is obvious. 2010 hasn't 53 weeks, and week 1 is gone.
This hapens for other years as well. Only 2015 has 53 weeks.
(note: in iso weeks (Europe) there only 52 weeks in 2010, see wiki: https://en.wikipedia.org/wiki/ISO_week_date)
The following 71 years in a 400-year cycle (add 2000 for current
years) have 53 weeks (leap years, with February 29, are emphasized),
years not listed have 52 weeks: 004, 009, 015, 020, 026, 032, 037,
043, 048, 054, 060, 065, 071, 076, 082, 088, 093, 099,105, 111, 116,
122, 128, 133, 139, 144, 150, 156, 161, 167, 172, 178, 184, 189,
195,201, 207, 212, 218, 224, 229, 235, 240, 246, 252, 257, 263, 268,
274, 280, 285, 291, 296,303, 308, 314, 320, 325, 331, 336, 342, 348,
353, 359, 364, 370, 376, 381, 387, 392, 398.
The dates are correct though. 2012-12-27 is a monday and so is 2011-01-03.
But in Europe we always have a full week 1 (so there is always a monday with weeknumber 1)
Any ideas what hapend to week 1 or why there are so many years with 53 (which is wrong)?
Use iso_week in DATEPART:
ISO 8601 includes the ISO week-date system, a numbering system for
weeks. Each week is associated with the year in which Thursday occurs.
For example, week 1 of 2004 (2004W01) ran from Monday 29 December 2003
to Sunday, 4 January 2004. The highest week number in a year might be
52 or 53. This style of numbering is typically used in European
countries/regions, but rare elsewhere.
WITH mycte AS
(
SELECT DATEADD(ww, DATEDIFF(ww,0,CAST('2010-12-01' AS DATETIME)), 0) DateValue
UNION ALL
SELECT DateValue + 7
FROM mycte
WHERE DateValue + 7 < '2016-12-31'
)
SELECT DATEPART(iso_week, DateValue) as week, DATEPART(year, DateValue) as year,
DateValue
FROM mycte
OPTION (MAXRECURSION 0);
LiveDemo
You can also consider changing recursive CTE with tally table.
The reason you're seeing 53 weeks in 2010 is simply because, there are 53 weeks in 2010.
Let's take a closer look at how the weeks break down in that year:
Declare #FromDate Date = '2010-01-01',
#ToDate Date = '2011-01-03'
;With Date (Date) As
(
Select #FromDate Union All
Select DateAdd(Day, 1, Date)
From Date
Where Date < #ToDate
)
Select Date, DatePart(Week, Date) WeekNo, DateName(WeekDay, Date) WeekDay
From Date
Option (MaxRecursion 0)
SQL Fiddle
Here's how the beginning of the year is:
Date WeekNo WeekDay
---------- ----------- ------------------------------
2010-01-01 1 Friday
2010-01-02 1 Saturday
2010-01-03 2 Sunday
2010-01-04 2 Monday
2010-01-05 2 Tuesday
2010-01-06 2 Wednesday
2010-01-07 2 Thursday
2010-01-08 2 Friday
2010-01-09 2 Saturday
2010-01-10 3 Sunday
Since the year begins in the middle of a week, there are only two days for Week 1. This causes the year to have 53 total weeks.
Now, to answer your question for why you don't see a Week 1 value for 2011, let's look at how that year ends:
Date WeekNo WeekDay
---------- ----------- ------------------------------
2010-12-26 53 Sunday
2010-12-27 53 Monday
2010-12-28 53 Tuesday
2010-12-29 53 Wednesday
2010-12-30 53 Thursday
2010-12-31 53 Friday
2011-01-01 1 Saturday
2011-01-02 2 Sunday
2011-01-03 2 Monday
You are selecting your dates in 7-day increments. The last date that you pulled for 2010 was 2010-12-27, which was accurately being displayed as being in Week 53. But the beginning of the next year occurs within this week on the Saturday, making Saturday Week 1 of 2011, with the following day starting Week 2.
Since you are not selecting a new date until Monday, 2011-01-03, it will effectively skip the dates in the first week of 2011, and begin with Week 2.
SQL Server is using standard week numbers, which match with Outlook (and while 53 sounds odd it is valid).
Of course you could always create your own custom week numbers. All you need to do is to pick a starting Monday and calculate the weeknumber you would assign. Your CTE can then increment its own week count, which resets to 1 each time the year changes. But even this will return week 53 (Dec 31st 2012 was a Monday, making it the 53rd week of that year, even though the rest of the week feel in 2013).
Worth mentioning; your week numbers are unlikely to match those from other systems/processes. This could cause you problems further down the line.
SET DATEFIRST 1;
WITH [Week] AS
(
SELECT
CAST('2010-11-29' AS DATE) AS [Date],
48 AS WeekNumber
UNION ALL
SELECT
DATEADD(DAY, 7, [Date]) AS [Date],
CASE
-- Reset the week number count when the year changes.
WHEN YEAR([Date]) <> YEAR(DATEADD(DAY, 7, [Date])) THEN 1
ELSE WeekNumber + 1
END AS WeekNumber
FROM
[Week]
WHERE
[Date] < GETDATE()
)
SELECT
*
FROM
[Week]
OPTION
(MAXRECURSION 0)
;

Set first week of the year from the first day of the year until the following monday

I want that Oracle treats weeks the following way:
-The week starts on monday
-The first week of the year is from the first day of the year until the following monday
So for 2015 would be
01-jan-2015 (Thursday) would be 1
02-jan-2015 (Friday) would be 1
03-jan-2015 (Saturday) would be 1
04-jan-2015 (Sunday) would be 1
05-jan-2015 (Monday) would be 2
06-jan-2015 (Tuesday) would be 2
For 2017 would be
01-jan-2017 (Sunday) would be 1
02-jan-2017 (Monday) would be 2
03-jan-2017 (Tuesday) would be 2
04-jan-2017 (Wednesday) would be 2
and so on....
I need the numeration for the year so would go to 1 to 53-54
A little bit of arithmetics will probably do the trick (assuming d is your date):
select TRUNC((case to_char(d, 'DY')
when 'MON' then 6
when 'TUE' then 5
when 'WED' then 4
when 'THU' then 3
when 'FRI' then 2
when 'SAT' then 1
when 'SUN' then 0
end + to_number(to_char(d, 'DDD'))-1) / 7)+1
from v;
The case statement will build an offset according to the day of week
next I add the day of year
then I divide the result by 7 for the week number (starting at 0)
finally, the +1 will start numbering weeks at 1
Please take some time to experiment with it http://sqlfiddle.com/#!4/d41d8/38236 to check if I didn't make a stupid mistake as I didn't have time to test it thoughtfully. Sorry about that...
Use the ISO week date standard: 'iw'.
2014-01-01 was Wednesday actually, so:
select
to_number(to_char(date'2014-01-05', 'iw')) as weeknum,
to_char(date'2014-01-05', 'Day') as day
from dual;
WEEKNUM DAY
----------- ---------
1 Sunday
select
to_number(to_char(date'2014-01-06', 'iw')) as weeknum,
to_char(date'2014-01-06', 'Day') as day
from dual;
WEEKNUM DAY
----------- ---------
2 Monday