Postgres to_date() function returns wrong year - sql

I used a query:
select to_date(substring('0303653597' from 0 for 7),'DDMMYY')
Expected output: 1965-03-03
Actual output : 2065-03-03
When I give the string as 030370 or above it behaves correctly.
What's wrong with the predefined function?
Can we use any other function to achieve the same?

According to the documentation:
In to_timestamp and to_date, if the year format specification is less
than four digits, e.g., YYY, and the supplied year is less than four
digits, the year will be adjusted to be nearest to the year 2020,
e.g., 95 becomes 1995.
In your case, 2065 is closer to 2020 than 1965 is, so the year defaults to 2065.
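The rule quoted above can be sketched outside the database. This is a minimal Python illustration of "nearest to 2020" for two-digit years, not PostgreSQL's actual implementation; the function name `expand_two_digit_year` is made up for the example, and the tie-break for 70 is an assumption chosen to match the observation that 70 and above map to the 1900s.

```python
def expand_two_digit_year(yy: int) -> int:
    """Pick the century that puts a two-digit year nearest to 2020."""
    candidates = (1900 + yy, 2000 + yy)
    # min() returns the first minimal item, so an exact tie (yy == 70)
    # resolves to the 1900s (assumption, see lead-in).
    return min(candidates, key=lambda year: abs(year - 2020))


print(expand_two_digit_year(65))  # 2065
print(expand_two_digit_year(70))  # 1970
print(expand_two_digit_year(95))  # 1995
```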

I believe you won't get a better answer than the one from @whites11, but you can work around it by parsing the strings and adding the century yourself:
WITH j (dt) AS (
VALUES ('0303653597'),('0303701111'),('0510511111'),('0510051111')
)
SELECT
CASE
WHEN substring(dt from 5 for 2)::int > 49 THEN
to_date(substring(dt from 1 for 4) || '19' || substring(dt from 5 for 2), 'DDMMYYYY')
ELSE
to_date(substring(dt from 1 for 4) || '20' || substring(dt from 5 for 2), 'DDMMYYYY')
END
FROM j;
to_date
------------
1965-03-03
1970-03-03
1951-10-05
2005-10-05
Play with the case condition and see if it fits your needs.
Demo: db<>fiddle
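The same pivot idea can be sketched in Python for anyone who wants to try different cut-offs before settling on a CASE condition. The function name `parse_ddmmyy_with_pivot` and the `pivot` parameter are hypothetical; the default of 49 simply mirrors the query above.

```python
from datetime import date


def parse_ddmmyy_with_pivot(value: str, pivot: int = 49) -> date:
    """Parse the leading DDMMYY digits, choosing the century by a pivot year."""
    day, month, yy = int(value[0:2]), int(value[2:4]), int(value[4:6])
    # Two-digit years above the pivot go to the 1900s, the rest to the 2000s.
    century = 1900 if yy > pivot else 2000
    return date(century + yy, month, day)


for raw in ("0303653597", "0303701111", "0510511111", "0510051111"):
    print(raw, parse_ddmmyy_with_pivot(raw))
```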

A solution to this problem relies heavily on whether you can guarantee that your dates start after a certain year, so that their two-digit representation doesn't overlap with the years that have already passed in the 2000s.
E.g. if the earliest data point is in 1930, you know that anything with YY less than 30 must be 2000 or later and anything with YY of 30 or more must fall in 1930-1999. If you also have entries for years like 1919, the problem is not solvable, because any date with YY in [00-21] can't be uniquely associated with a century.
However, if you can state that your dates can't be later than today's date, then a possible solution is to check whether the extracted date is later than today and, if so, add a '19' prefix to the year, as in the example below:
with dt as (
select '0303653597' dt_str
union all
select '1705213597' dt_str
union all
select '1805213597' dt_str
)
select
to_date(
substring(dt_str from 0 for 5) ||
case when
to_date(substring(dt_str from 0 for 7),'DDMMYY' ) <= current_date
then '20'
else '19'
end
|| substring(dt_str from 5 for 2)
,'DDMMYYYY')
from dt;
For the three dates (03/03/65, 17/05/21 and 18/05/21), as of today (17/05/21), the output is correct:
to_date
------------
1965-03-03
2021-05-17
1921-05-18
(3 rows)
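The "no dates in the future" rule is easy to check in isolation. Below is a small Python sketch of the same logic; `parse_ddmmyy_not_in_future` is a made-up name, and `today` is passed in explicitly so the result does not depend on when the code runs.

```python
from datetime import date


def parse_ddmmyy_not_in_future(value: str, today: date) -> date:
    """Parse the leading DDMMYY digits, assuming no date lies after `today`."""
    day, month, yy = int(value[0:2]), int(value[2:4]), int(value[4:6])
    # Try the 2000s first; fall back to the 1900s if that would be in the future.
    candidate = date(2000 + yy, month, day)
    if candidate > today:
        candidate = date(1900 + yy, month, day)
    return candidate


reference_day = date(2021, 5, 17)
for raw in ("0303653597", "1705213597", "1805213597"):
    print(parse_ddmmyy_not_in_future(raw, reference_day))
```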

Related

Convert Year+WeekOfYear+DayOfWeek to a date

I have date values identified by a year, the week number within that year and the weekday and want to convert those into simple dates.
I couldn't find a function or another simple way to combine those, so I came up with a workaround using generate_series to get all dates in a range and JOIN the extracted values of those with my data:
SELECT data.*, days.d result
FROM ( VALUES (2017, 33, 3) ) data(d_year, d_week, d_weekday)
JOIN (
SELECT
-- the potential castdate
d::date d
-- year-week-dayofweek combination for JOINing
, EXTRACT('year' FROM d) d_year, EXTRACT('week' FROM d) d_week, EXTRACT('dow' FROM d) d_weekday
FROM generate_series('2015-01-01', '2019-12-31', INTERVAL '1day') AS days(d)
) days
USING(d_year, d_week, d_weekday)
Result is:
+--------+--------+-----------+------------+
| d_year | d_week | d_weekday | result |
+--------+--------+-----------+------------+
| 2017 | 33 | 3 | 16.08.2017 |
+--------+--------+-----------+------------+
While this works, this seems like overkill for such a simple task. Moreover, if one doesn't have a fixed range, this might not even work.
Is there an easier way to this?
demo:db<>fiddle
You can use the to_date() function, which takes a date string as its argument, as well as a format pattern. So if the date string is '2017-33-3', you can use this pattern to identify each date part:
'IYYY-IW-ID'
1. 'ID': The tricky part is whether your week starts on Sunday or on Monday. This matters because the wrong choice shifts the week numbers in unexpected ways. Your expected output shows that you need 'ID' (ISO day of the week, weeks start on Monday) instead of 'D' (day of the week, weeks start on Sunday).
2. 'IW': Because we are using the ISO day of the week, we need the ISO week of the year as well (instead of 'WW', week of year).
3. 'IYYY': Similar to (2): the ISO year goes with the ISO week.
More information about date patterns (especially the ISO thing): Postgres documentation
SELECT to_date(d_year || '-' || d_week || '-' || d_weekday, 'IYYY-IW-ID')
If you used the standard week pattern: 'YYYY-WW-D', your result would be 2017-08-13 (see fiddle)
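Of course, this also works without the - characters, but it is less readable, and the week then needs to be zero-padded to two digits so the parts stay unambiguous:
SELECT to_date(d_year::text || lpad(d_week::text, 2, '0') || d_weekday, 'IYYYIWID')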
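For comparison, the same ISO year / ISO week / ISO weekday conversion can be sketched in Python with `date.fromisocalendar` (available from Python 3.8). The wrapper name `date_from_iso_week` is only for illustration.

```python
from datetime import date


def date_from_iso_week(iso_year: int, iso_week: int, iso_weekday: int) -> date:
    """Convert an ISO year, ISO week number and ISO weekday to a calendar date."""
    # iso_weekday follows ISO 8601: 1 = Monday ... 7 = Sunday.
    return date.fromisocalendar(iso_year, iso_week, iso_weekday)


result = date_from_iso_week(2017, 33, 3)
print(result)  # 2017-08-16
```

The result matches the 16.08.2017 from the question, which is a quick way to confirm that the ISO interpretation is the one wanted.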

Presto SQL get yyyymm minus 2 months

I am using Presto. I have an integer column (let's call the column 'mnth_nbr') showing year and month as: yyyymm. For instance, 201901. I want to have records showing all dates AFTER 201901 as well as 2 months before the given date. In this example, it would return 201811, 201812, 201901, 201902, 201903, etc. Keep in mind that my data type here is integer.
This is what I have so far (I do a self join):
select ...
from table 1 as first_table
left join table 1 as second_table
on first_table.mnth_nbr = second_table.mnth_nbr
where first_table.mnth_nbr <= second_table.mnth_nbr
I know this gives me all dates AFTER 201901, including 201901. But, I don't know how to add the 2 previous months (201811 and 201812)as explained above.
According to the documentation, Presto's date_parse function expects a MySQL-like date format specifier.
So the proper condition for your use case should be:
SELECT ...
FROM mytable t
WHERE
date_parse(cast(t.mnth_nbr as varchar), '%Y%m') >= date '2019-01-01' - interval '2' month
Edit
As commented by Piotr, a more optimized (index-friendly) expression would be:
WHERE
mnth_nbr >= cast(date_format(date '2019-01-01' - interval '2' month, '%Y%m') as integer)
Something like this should help. First, parse your integer into a date:
date_parse(cast(first_table.mnth_nbr as varchar), '%Y%m') >= date '2019-01-01' - interval '2' month
Please keep in mind that you may run into indexing issues with this approach, because the column is wrapped in a function.
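If the lower bound is computed in application code before the query is built, the month arithmetic can stay entirely in integers. The following Python sketch shows one way to do that; `shift_yyyymm` is a hypothetical helper, not a Presto function.

```python
def shift_yyyymm(yyyymm: int, months: int) -> int:
    """Shift an integer of the form yyyymm by a number of months."""
    year, month = divmod(yyyymm, 100)
    # Count months from year 0 so that year boundaries roll over cleanly.
    index = year * 12 + (month - 1) + months
    new_year, new_month_zero_based = divmod(index, 12)
    return new_year * 100 + new_month_zero_based + 1


print(shift_yyyymm(201901, -2))  # 201811
```

The returned value can then be compared directly against mnth_nbr, for example mnth_nbr >= 201811, without converting the column.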

Get month,days difference between two date columns

I'm trying to fix some problems in my database and I want to recalculate a column based on two other date columns. This column is a float and I want to get the difference between the two dates in months, with the days as a decimal fraction.
For example, if I have two dates, '2016-01-15' and '2015-02-01', the difference should be 12.5: 12 for the 12 months' difference and 0.5 for the remaining 15 days.
Here is what I have tried so far based on my searches, but I think I'm missing something, as it tells me there is an error with my date column because it doesn't exist:
Select EXTRACT(year FROM vehicle_delivery(date, vehicle_received_date))*12 + EXTRACT(month FROM vehicle_delivery(date, vehicle_received_date));
Here vehicle_delivery is my table name, date is my end date and vehicle_received_date is my start date.
The same thing happens with this SQL:
select extract('years' from vehicle_delivery) * 12 + extract('months' from vehicle_delivery) + extract('days' from vehicle_delivery) / 30
from (select age(date::timestamp, vehicle_received_date::timestamp)) a;
The SQL should look like this:
select extract(year from diff) * 12 + extract(month from diff) + extract(day from diff) / 30
from (select age(date::timestamp, vehicle_received_date::timestamp) as diff
from vehicle_delivery
) vd;
I don't know what the purpose of the / 30 is, but you appear to want it.
Notes:
The FROM clause references the table.
The first argument in extract() is a keyword, not a string.
You want to reference the age() value in the extract().
age() returns an interval, so it is rather redundant to take out the parts (only needed if you want them in separate columns).
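To see what the months-plus-days/30 formula produces for a given pair of dates, here is a rough Python sketch. It is not a reimplementation of age(): the name `months_with_day_fraction` is invented, `start <= end` is assumed, and the rule for borrowing days when the end day is smaller than the start day (using the length of the start date's month) is an assumption that may differ from the database's result.

```python
import calendar
from datetime import date


def months_with_day_fraction(start: date, end: date) -> float:
    """Whole months between two dates plus leftover days expressed as days / 30."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    days = end.day - start.day
    if days < 0:
        # Borrow one month; its length is taken from the start date's month (assumption).
        months -= 1
        days += calendar.monthrange(start.year, start.month)[1]
    return months + days / 30


print(round(months_with_day_fraction(date(2015, 1, 1), date(2016, 1, 15)), 4))  # 12.4667
```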

SQL computing and reusing fiscal year calculation in sql query

I have a condition in my SQL query, using an Oracle 11g database, that depends on a plan starting or ending within a fiscal year:
(BUSPLAN.START_DATE BETWEEN (:YEAR || '-04-01') AND (:YEAR+1 || '-03-31')) OR
(BUSPLAN.END_DATE BETWEEN (:YEAR || '-04-01') AND (:YEAR+1 || '-03-31'))
For now, I am passing in YEAR as a parameter. It can be computed as (pseudocode):
IF CURRENT MONTH IN (JAN, FEB, MAR):
USE CURRENT YEAR // e.g. 2015
ELSE:
USE CURRENT YEAR + 1 // e.g. 2016
Is there a way I could compute the :YEAR parameter within an SQL query and reuse it for the :YEAR parameter?
CTEs make this easy: you can build little tables on the fly. With a one-row table you just cross join it, and then that value is available on every row:
WITH getyear as
(
SELECT
CASE WHEN to_char(sysdate,'mm') in ('01','02','03') THEN
EXTRACT(YEAR FROM sysdate)
ELSE
EXTRACT(YEAR FROM sysdate) + 1
END as ynum from dual
), mydates as
(
SELECT getyear.ynum || '-04-01' as startdate,
getyear.ynum+1 || '-03-31' as enddate
from getyear
)
select
-- your code here
from BUSPLAN, mydates -- this is a cross join
where
(BUSPLAN.START_DATE BETWEEN mydates.startdate AND mydates.enddate) OR
(BUSPLAN.END_DATE BETWEEN mydates.startdate AND mydates.enddate)
Note: a VALUES statement would probably be cleaner if Oracle supports it; the first CTE would then look like this:
VALUES(CASE WHEN to_char(sysdate,'mm') in ('01','02','03') THEN
EXTRACT(YEAR FROM sysdate)
ELSE
EXTRACT(YEAR FROM sysdate) + 1
END)
I don't have access to Oracle, so this is untested and may contain bugs or typos.
In the code you shared there is a problem and a potential problem.
Problem: implicit conversion to date without a format string.
In (BUSPLAN.START_DATE BETWEEN (:YEAR || '-04-01') AND (:YEAR+1 || '-03-31')) two strings are being formed and then converted to dates. The conversion to date is going to change depending on the value of NLS_DATE_FORMAT. To ensure that the string is converted correctly, use to_date(:YEAR || '-04-01', 'YYYY-MM-DD').
Potential problem: the boundary at the end of the year when the time is not midnight.
Oracle's date type holds both date and time. A test like someDate between startDate and endDate will miss all records that happened after midnight on endDate. One simple fix, which precludes the use of indexes on someDate, is trunc(someDate) between startDate and endDate.
A more general approach is to define date ranges as closed-open intervals: lowerBound <= aDate < upperBound, where lowerBound is the same as startDate above and upperBound is endDate plus one day.
Note: Some applications use Oracle date columns as pure dates and always store midnight. If your application is of that sort, then this is not a problem. Check constraints like check (trunc(dateColumn) = dateColumn) would make sure it stays that way.
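The boundary issue is independent of Oracle and can be demonstrated with plain timestamps. A small Python sketch, using made-up sample values, compares an inclusive BETWEEN-style test with a closed-open interval:

```python
from datetime import datetime, timedelta

start = datetime(2015, 4, 1)          # lower bound, midnight
end = datetime(2016, 3, 31)           # inclusive end date, also midnight
late = datetime(2016, 3, 31, 15, 30)  # an event on the afternoon of the last day

# BETWEEN-style test: misses the event because it is after midnight on `end`.
in_between = start <= late <= end

# Closed-open interval: upper bound is the end date plus one day, exclusive.
in_half_open = start <= late < end + timedelta(days=1)

print(in_between, in_half_open)  # False True
```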
And now, to answer the question actually asked.
Using subquery factoring (Oracle's terminology) / common table expressions (SQL Server's terminology), one can avoid repetition within a query.
Instead of figuring out the proper year and then using strings to put together dates, the code below starts by getting midnight on January 1 of the current calendar year, trunc(sysdate, 'YEAR'). Then it adds an offset in months. When the month is Jan, Feb or Mar, the current fiscal year started last year on 4/1, or nine months before the start of this year, so the offset is -9. Otherwise the current fiscal year started on 4/1 of this calendar year, which is the start of this year plus three months.
Instead of an end date, an upper bound is calculated in the same way as the lower bound, but with offsets 12 greater than those of the lower bound, to get 4/1 of the following year.
with current_fiscal_year as (select add_months(trunc(sysdate, 'YEAR')
, case when extract(month from sysdate) <= 3 then -9 else 3 end) as LowerBound
, add_months(trunc(sysdate, 'YEAR')
, case when extract(month from sysdate) <= 3 then 3 else 15 end) as UpperBound
from dual)
select *
from busplan
cross join current_fiscal_year CFY
where (CFY.LowerBound <= busplan.start_date and busplan.start_date < CFY.UpperBound)
or (CFY.LowerBound <= busplan.end_date and busplan.end_date < CFY.UpperBound)
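The offset logic can be checked quickly outside the database. This Python sketch computes the same lower and upper bounds for an April-to-March fiscal year; `fiscal_year_bounds` is a hypothetical helper name, and the reference date is passed in rather than read from the system clock.

```python
from datetime import date
from typing import Tuple


def fiscal_year_bounds(today: date) -> Tuple[date, date]:
    """Return (lower, upper) for the fiscal year containing `today`, upper exclusive."""
    # January to March still belong to the fiscal year that started last April.
    start_year = today.year - 1 if today.month <= 3 else today.year
    return date(start_year, 4, 1), date(start_year + 1, 4, 1)


print(fiscal_year_bounds(date(2015, 2, 10)))  # bounds 2014-04-01 and 2015-04-01
print(fiscal_year_bounds(date(2015, 7, 1)))   # bounds 2015-04-01 and 2016-04-01
```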
And yet more unsolicited advice.
The times I've had to deal with fiscal year stuff, avoiding repetition within a query was the low-hanging fruit. Keeping the fiscal year calculations consistent and correct across many queries was the essence of the work. So I'd recommend developing a PL/SQL package that centralizes fiscal calculations. It might include a function like:
create or replace function GetFiscalYearStart(v_Date in date default sysdate)
return date
as begin
return add_months(trunc(v_Date, 'YEAR')
, case when extract(month from v_Date) <= 3 then -9 else 3 end);
end GetFiscalYearStart;
Then the query above becomes:
select *
from busplan
where (GetFiscalYearStart() <= busplan.start_date
and busplan.start_date < add_months(GetFiscalYearStart(), 12))
or (GetFiscalYearStart() <= busplan.end_date
and busplan.end_date < add_months(GetFiscalYearStart(), 12))

AS400 SQL query to determine records for who is 70.5 years old for the current year

I am trying to find individuals that will turn 70.5 years old in the current year.
dob7 = DECIMAL(7) YYYYDDD
select acctno, name, address, status, year(curdate()) - year(date(digits(dob7))) as Age
from mydata.cdmast cdmast
left join mydata.cfmast cfmast
on cdmast.cifno = cfmast.cifno
where status <> 'R' and year(curdate()) - year(date(digits(dob7))) >= 70
The code above returns the following error:
[Error Code: -181, SQL State: 22008] [IBM][System i Access ODBC Driver][DB2 for i5/OS]SQL0181 - Value in date, time, or timestamp string not valid.
After seeing the other answers, I'm submitting my own. This should have the benefit of using any indices on dob7, and should work without too many 'tricks'.
I've modified the WHERE clause in your original query. I'm assuming '.5 years' means '6 months', although this is adjustable. I deliberately wrapped the calculations in CTEs to 'encapsulate' the logic; the operations should cost next to nothing.
WITH Youngest (dateOfBirth) as (
SELECT CURRENT_DATE - 70 YEARS - 6 MONTHS
FROM sysibm/sysdummy1),
Converted (dateOfBirth, formatted) as (
SELECT dateOfBirth, YEAR(dateOfBirth) * 1000 + DAYOFYEAR(dateOfBirth)
FROM Youngest)
SELECT acctno, name, address, status,
YEAR(CURRENT_DATE) - INT(dob7 / 1000)
- CASE WHEN DAYOFYEAR(CURRENT_DATE) < MOD(dob7, 1000)
THEN 1
ELSE 0 END as Age
FROM myData.cdMast cdMast
JOIN Converted
ON Converted.formatted >= dob7
LEFT JOIN myData.cfMast cfMast
ON cdMast.cifno = cfMast.cifno
WHERE status <> 'R'
Please note that it will consider people born on a leap day to have had their birthday on March 1st (due to DAYOFYEAR()).
From the DATE scalar function documentation:
A string with an actual length of 7 that represents a valid date in the form yyyynnn, where yyyy are digits denoting a year, and nnn are digits between 001 and 366 denoting a day of that year.
Reformat the date with:
DATE(SUBSTR(DIGITS(DOB7),4,4) || SUBSTR(DIGITS(DOB7),1,3))
To select 70.5 or older by the end of the current year:
(YEAR(CURRENT_DATE) - YEAR(DATE(SUBSTR(DIGITS(DOB7),4,4) || SUBSTR(DIGITS(DOB7),1,3))) = 70
AND MONTH(DATE(SUBSTR(DIGITS(DOB7),4,4) || SUBSTR(DIGITS(DOB7),1,3))) <= 6)
OR YEAR(CURRENT_DATE) - YEAR(DATE(SUBSTR(DIGITS(DOB7),4,4) || SUBSTR(DIGITS(DOB7),1,3))) > 70
The error message is saying that the contents of DOB7 cannot be converted to a date. Does the value of DOB7 match one of the valid formats? Note that many require quotation marks. http://publib.boulder.ibm.com/infocenter/iseries/v6r1m0/index.jsp?topic=/db2/rbafzscadate.htm
Try this instead:
(year(curdate()) - mod(dob7, 10000)) >= 70
This uses modular arithmetic to extract the year (assuming the year is stored in the last four digits), rather than trying to convert the value to a date.
By the way, storing the date this way seems very awkward. Databases have built-in support for dates and times, so it is usually better to store them in the native format.
If your date of birth is really yyyyddd, then the following should work for years:
(year(curdate()) - cast(dob7/1000 as int)) >= 70
For the half year:
(year(curdate()) - cast(dob7/1000 as int))+(1-mod(dob7,1000)/365.0) >= 70.5
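To make the yyyyddd arithmetic concrete, here is a Python sketch that decodes such a value and checks whether the person reaches 70.5 by the end of a given year. The names `decode_yyyyddd` and `reaches_70_5_by_year_end` are invented for the example, ".5 years" is taken to mean six months, and the validation step only illustrates one possible reason a stored value fails to convert (the actual cause of the SQL0181 error is not known from the question).

```python
import calendar
from datetime import date, timedelta


def decode_yyyyddd(value: int) -> date:
    """Decode an integer of the form yyyyddd (year plus day of year) into a date."""
    year, day_of_year = divmod(value, 1000)
    if not 1 <= year <= 9999:
        raise ValueError(f"invalid year in {value}")
    days_in_year = 366 if calendar.isleap(year) else 365
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError(f"invalid day of year in {value}")
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def reaches_70_5_by_year_end(dob7: int, year: int) -> bool:
    """True if someone born on dob7 is at least 70 years and 6 months old by Dec 31 of `year`."""
    # Anyone born on or before June 30 of (year - 70) qualifies (assumption: .5 = 6 months).
    return decode_yyyyddd(dob7) <= date(year - 70, 6, 30)


print(decode_yyyyddd(1950181))                  # 1950-06-30
print(reaches_70_5_by_year_end(1950181, 2020))  # True
print(reaches_70_5_by_year_end(1950182, 2020))  # False
```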