Oracle - How to convert Date represented by NUMBER(6,0) format - sql

I've got data from our third party partner and every date column is coded in this NUMBER(6,0) format:
118346
118347
118348
118351
119013
119035
119049
119051
118339
118353
119019
119028
119029
119031
The last 3 digits are never more than 365, so I reckon 118339 must mean 2018 plus 339 days, which is December 5, 2018: '2018-12-05'. I've never encountered this format before, so I'm not sure how to handle it. Is this some standardized format? Can I use a built-in conversion function, or should I just cut the number apart and convert it with some arithmetic?
I would like to group and sort my rows by week, so maybe I shouldn't even convert it, but I feel converting to a date type would be more elegant. Which approach is better?
EDIT: I've just checked my excel version of the data, and this format is in fact working as I've imagined. So the question stands.

This seems to be a 1900-based representation: the first three digits are the number of years since 1900, and the last three are the day of the year. Assuming your interpretation is right, you can convert it to a normal date with a bit of manipulation:
-- CTE for sample values
with your_table (num) as (
  select *
  from table(sys.odcinumberlist(118339, 118346, 118347, 118348, 118351, 119013, 119035,
                                119049, 119051, 118339, 118353, 119019, 119028, 119029, 119031))
)
-- actual query
select num,
       date '1899-12-31'
         + floor(num/1000) * interval '1' year
         + mod(num, 1000) * interval '1' day as converted
from your_table;
NUM CONVERTED
---------- ----------
118339 2018-12-05
118346 2018-12-12
118347 2018-12-13
118348 2018-12-14
118351 2018-12-17
119013 2019-01-13
119035 2019-02-04
119049 2019-02-18
119051 2019-02-20
118339 2018-12-05
118353 2018-12-19
119019 2019-01-19
119028 2019-01-28
119029 2019-01-29
119031 2019-01-31
This treats the first three digits - obtained with floor(num/1000) - as a number of years offset from 1900; multiplying by a one-year interval gives 118 or 119 years. It then treats the last three digits - from mod(num, 1000) - as the number of days into that year, by multiplying by a one-day interval. Both are added to the fixed date 1899-12-31. (You could use 1900-01-01 instead, but then you would have to subtract a day at the end.)
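If you prefer a format mask over interval arithmetic, an equivalent one-liner is possible - just a sketch, assuming every value really is years-since-1900 followed by a day-of-year: add 1900000 so that 118339 becomes 2018339, and let TO_DATE parse that with a YYYYDDD mask. TRUNC(..., 'IW') then gives a Monday-based week start for the weekly grouping you mentioned:
-- sketch: 118339 + 1900000 = 2018339 -> year 2018, day 339 of that year
select num,
       to_date(to_char(num + 1900000), 'YYYYDDD')              as converted,
       trunc(to_date(to_char(num + 1900000), 'YYYYDDD'), 'IW') as week_start
from your_table;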

Related

Converting a four digit month year string to date in Snowflake

So I have some data as follows:
data
1203
0323
0101
1005
1130
0226
So in every case, the first two digits are the month and the last two digits are the year. Is there any way to easily map these to dates without making an outright lookup table?
Here's what I am seeking
data date
1203 12/01/2003
0323 03/01/2023
0101 01/01/2001
1005 10/01/2005
1130 11/01/2030
0226 02/01/2026
In every case, I would like the day to be the first of that month.
We can try building a valid date string with a year, month, and day component and then converting to a bona fide date using the TO_DATE function:
WITH cte AS (
SELECT '1203' AS dt
)
SELECT dt, TO_DATE(CONCAT('01', dt), 'DDMMYY') AS dt_out
FROM cte;
Much the same as Tim's answer:
Cast from number to text (if your data is already text, skip that; but if it's variant, e.g. from JSON or XML, you still need to cast to TEXT), then parse with TO_DATE using the MM (month) and YY (year) format tokens. You will get the 1st of each of those months:
SELECT
column1,
to_date(column1::text, 'MMYY')
FROM VALUES
(1203),
(0323),
(0101),
(1005),
(1130),
(0226);
giving:
COLUMN1   TO_DATE(COLUMN1::TEXT, 'MMYY')
-------   ------------------------------
   1203   2003-12-01
    323   2023-03-01
    101   2001-10-01
   1005   2005-10-01
   1130   2030-11-01
    226   2026-02-01
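Note that because the values arrive as numbers, the leading zero is lost, which is why 101 above came out as 2001-10-01 rather than the desired January 2001. A possible fix - just a sketch, assuming the underlying layout really is MMYY - is to left-pad to four characters before parsing:
-- sketch: pad to 4 digits so 101 becomes '0101' before the MMYY parse
SELECT
column1,
to_date(lpad(column1::text, 4, '0'), 'MMYY') AS dt_out
FROM VALUES
(1203),
(0323),
(0101),
(1005),
(1130),
(0226);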

Convert arbitrary string values to timestamps SQL

I am wondering if there is a way to convert arbitrary string values (such as the examples below) to something that can be interpreted as a timestamp, perhaps in days.
Dropdown_values        Desired Output (days)
---------------------  ---------------------
12 weeks               84
1 Week 4 Days          11
1 Year                 365
1 Year 1 Week 2 Days   374
The idea I had was to split the values into parts, since they are all separated by spaces, and then do the addition in a separate column. Are there other (better) ways to do this? Thank you.
To expand on my comment as an answer:
select extract(epoch from '12 Week'::interval)/86400;               -- 84
select extract(epoch from '1 Week 4 Days'::interval)/86400;         -- 11
select extract(epoch from '1 Year 1 Week 2 Days'::interval)/86400;  -- 374.25
The above is how I usually deal with this sort of thing: extract the epoch value from the interval and then divide by the number of seconds in a day. It would be a good idea to read the docs on Interval input and Interval output to understand how an interval is constructed and returned, and the assumptions used. Note: the queries return a float value, not a timestamp; a value like 84 cannot be a timestamp. You could turn it back into an interval, e.g. 84 * '1 day'::interval gives 84 days. If at all possible, it is a good idea to store data as actual timestamps (start and end) and then derive intervals from that.
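To make this concrete, here is a minimal sketch (assuming PostgreSQL, that the strings are valid interval input, and using a hypothetical dropdown_value alias for the sample column) that returns both the day count and the value converted back into an interval:
SELECT dropdown_value,
       EXTRACT(epoch FROM dropdown_value::interval) / 86400 AS days,
       EXTRACT(epoch FROM dropdown_value::interval) / 86400 * '1 day'::interval AS as_interval
FROM (VALUES ('12 weeks'), ('1 Week 4 Days'), ('1 Year'), ('1 Year 1 Week 2 Days')) AS v(dropdown_value);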

How to select and group fortnightly in postgresql

I am trying to group the rows in a table fortnightly, but can't seem to work out how to do it - especially as the date_part function does not have a 'fortnight' keyword argument.
This is what I have so far:
CREATE TABLE foo(
dt DATE NOT NULL,
f1 REAL NOT NULL,
f2 REAL NOT NULL,
f3 REAL NOT NULL,
f4 REAL NOT NULL
);
SELECT AVG((f1+f2+f3+f4)/4) as fld_avg FROM
(
SELECT date_part('year', dt) AS year_part,
date_part('fortnight', dt) AS fortnight_part,
f1, f2, f3, f4
FROM foo
WHERE dt >= date_trunc('day', NOW() - '3 month')
) foo
GROUP BY year_part, fortnight_part
How may I rewrite (or modify) the query above so as to group data fortnightly?
Basic idea
What we need to do is take intervals of 14 consecutive days and map them to unique buckets, then group by those buckets. The buckets can be of any type - int, char, timestamp - as long as each fortnight gets a unique value.
Division
A simple way to accomplish this is division: divide by 14 days and truncate the result to an integer.
For example, we can extract the number of seconds since 1970-01-01, the UNIX epoch, and divide by the number of seconds in a fortnight: 14 * 24 * 60 * 60 = 14 * 86400 = 1209600. (I'll use Vao Tsun's example data)
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT (EXTRACT(EPOCH FROM d)::int/86400)/14 fortnight FROM c
which yields fortnights since 1970-01-01 (a Thursday):
fortnight
-----------
1251
1252
1254
1254
(4 rows)
The integer values we get represent the number of fortnights since 1970-01-01, but we don't have to care about that. The important thing is that each value uniquely identifies a fortnight.
Because 1970-01-01 was a Thursday, all fortnights will start on a Thursday. We can shift the starting point of our fortnights to a different day of the week (e.g. Monday) by adding an offset:
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT (EXTRACT(EPOCH FROM d)::int/86400 + 4)/14 fortnight FROM c
By adding four days to Thursday we end up at Monday.
If you rather want fortnights with respect to the beginning of the year, instead of some arbitrary absolute date, such as 1970-01-01, we can use the day of the year instead:
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT EXTRACT(year FROM d) * 26 + EXTRACT(doy FROM d)::int/14 AS fortnight FROM c;
which yields
fortnight
-----------
52467
52468
52469
52470
(4 rows)
We need to multiply the extracted year by 26, because there are 26.1… fortnights in a year.
Truncation
Instead of division another approach is truncation. We map each day of a specific fortnight to the first timestamp of that fortnight.
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT d - make_interval(secs => EXTRACT(EPOCH FROM d)::int % (86400 * 14)) AS fortnight FROM c;
which yields
fortnight
---------------------
2017-12-14 00:00:00
2017-12-28 00:00:00
2018-01-25 00:00:00
2018-01-25 00:00:00
(4 rows)
This might seem a bit more complicated, but it has some benefits: the result is still a date/time type, and other code does not need to worry about the fact that we used fortnights.
Again, instead of absolute fortnights, we can calculate this with respect to the beginning of the year:
WITH c(d) AS (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
SELECT d - make_interval(days => EXTRACT(dow FROM d)::int % 14) AS fortnight FROM c;
which yields
fortnight
---------------------
2017-12-17 00:00:00
2017-12-31 00:00:00
2018-01-21 00:00:00
2018-01-28 00:00:00
(4 rows)
The result is of type timestamp; you might want a date instead. This can be addressed by casting:
(d - make_interval(days => EXTRACT(dow FROM d)::int % 14))::date
or subtracting int instead of interval from date:
d - (EXTRACT(dow FROM d)::int % 14)
There are many more possibilities. With this scheme, we can calculate fortnights or any other interval with respect to the beginning of the month, some arbitrary date, etc.
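For instance, applying the division bucket back to the foo table from the question - a sketch using the Thursday-aligned buckets from above (86400 * 14 seconds per fortnight):
SELECT (EXTRACT(EPOCH FROM dt)::int / 86400) / 14 AS fortnight,
       AVG((f1 + f2 + f3 + f4) / 4) AS fld_avg
FROM foo
WHERE dt >= date_trunc('day', NOW() - INTERVAL '3 months')
GROUP BY 1
ORDER BY 1;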
update
A fortnight is a two-week period - one week even, the other odd; e.g. weeks 1 and 2, 3 and 4, 5 and 6.
Closer: 2 is even, mod(2,2)=0, and 1 is odd, mod(1,2)=1;
4 is even, mod(4,2)=0, and 3 is odd, mod(3,2)=1;
6 is even, mod(6,2)=0, and 5 is odd, mod(5,2)=1.
Thus you can assume that each odd week number in the year leaves a remainder of 1 when divided by two, and the next (even) week number leaves a remainder of 0.
The general idea is to use the sequential number of the week within the year. To avoid Jan 1st always being first while Dec 31 could be the 53rd week (and thus two odd weeks in a row), I use IW:
week number of ISO 8601 week-numbering year (01-53; the first Thursday
of the year is in week 1)
Then I assume that if one week number is odd, the next will be even, so we divide all time into parts of two weeks - even + odd.
SQL Example:
o=# with c(d) as (values('2017.12.21'::date),('2017.12.31'),('2018.01.26'),('2018.02.01'))
select d,to_char(d,'IW'),right(to_char(d,'IW'),1)::int,mod(right(to_char(d,'IW'),1)::int, 2) from c;
d | to_char | right | mod
------------+---------+-------+-----
2017-12-21 | 51 | 1 | 1
2017-12-31 | 52 | 2 | 0
2018-01-26 | 04 | 4 | 0
2018-02-01 | 05 | 5 | 1
(4 rows)
mod is either 0 or 1 - group by this column
https://www.postgresql.org/docs/current/static/functions-math.html
https://www.postgresql.org/docs/current/static/functions-formatting.html
Of course you would need to add outer join on generate_series if you want data without gaps...
I post another answer to explain how I was wrong and why my "smart-n-neat"
way failed...
the schema build and queries are at:
https://www.db-fiddle.com/f/j5i2Td8CvxCVXQQYePKzCe/0
the first (and correct) query:
select distinct w2, avg(c) over (partition by w2)
from d
join generate_series('2016.11.28'::date,'2017.02.23'::date,'2 weeks'::interval) w2
on gs >= w2 and gs < w2 + '2 weeks'::interval
order by w2;
This is the long, simple and correct approach: the idea is to join on a two-week interval. It works, it's reliable, and all is good.
Now the second query:
select distinct div(to_char(gs,'IW')::int,2), min(gs) over w, avg(c) over w
from d
window w as (partition by div(to_char(gs,'IW')::int,2))
order by min;
This one is much shorter, neater and smarter, yet it has a huge limitation and is unusable. Here's why:
My approach splits the next-to-last two-week interval into two parts: the last week of 2016 and the first week of 2017, thus cutting that result in half. If you multiply the sum of the averages for those two weeks by a half, the results of both queries match. Alas, introducing CASE WHEN logic for the edge weeks of the year makes the neat solution heavy and full of overhead, and thus the very point of it is lost.
TL;DR: the neat and lightweight solution works only within a single year, farther than two weeks from the start or end of the year, and only if our fortnightly interval starts on a Monday.
Now the idea behind the lightweight solution: round(2/2, 0)=1 and round(3/2, 0)=1, so you can divide the year into intervals of two weeks and use that for grouping.
Also, I deliberately did not pick a New Year transition here, because 2018 Jan 1 is a Monday, so IW is the same as WW - which usually is not the case.
Lastly, my first answer with odd and even weeks is not viable at all. It divides the year not into two-week intervals, but rather into two parts - even weeks and odd weeks... I deceived myself with the "something close" idea and worked with the remainder, when I should have done the opposite and used the whole value of the division...

Vertica date series is starting one month before specified date

I work with a Vertica database and I needed to make a query that, given two dates, would give me a list of all months between said dates. For example, if I were to give the query 2015-01-01 and 2015-12-31, it would output me the following list:
2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01
After a bit of digging, I was able to discover the following query:
SELECT date_trunc('MONTH', ts)::date as Mois
FROM
(
SELECT '2015-01-01'::TIMESTAMP as tm
UNION
SELECT '2015-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 month' OVER (ORDER BY tm)
This query works and gives me the following output:
2014-12-01
2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01
As you can see, by giving the query a starting date of '2015-01-01' (or anywhere in January, for that matter), I end up with an extra entry, namely 2014-12-01. In itself this unexpected behavior is easy to circumvent (just start in February), but I have to admit my curiosity is piqued. Why exactly is the series starting one month BEFORE the date I specified?
EDIT: Alright, after reading Kimbo's warning and confirming that indeed, long periods will eventually cause problems, I was able to come up with the following query that readjusts the dates correctly.
SELECT ts as originalMonth,
ts +
(
mod
(
day(first_value(ts) over (order by ts)) - day(ts) + day(last_day(ts)),
day(last_day(ts))
)
) as adjustedMonth
FROM
(
SELECT ts
FROM
(
SELECT '2015-01-01'::TIMESTAMP as tm
UNION
SELECT '2018-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 month' OVER (ORDER BY tm)
) as temp
The only problem I have is that I have no control over the initial day of the first record of the series. It's set automatically by Vertica to the current day. So if I run this query on the 31st of the month, I wonder how it'll behave. I guess I'll just have to wait for December to see, unless someone knows how to get TIMESERIES to behave in a way that would allow me to test it.
EDIT: Okay, so after trying out many different date combinations, I was able to determine that the day on which the series starts changes depending on the date you specify. This caused a whole lot of problems... until we decided to go the simple way. Instead of using a month interval, we used a day interval and only selected one specific day per month. WAY simpler, and it works all the time. Here's the final query:
SELECT ts as originalMonth
FROM
(
SELECT ts
FROM
(
SELECT '2000-02-01'::TIMESTAMP as tm
UNION
SELECT '2018-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 day' OVER (ORDER BY tm)
) as temp
where day(ts) = 1
I think it boils down to this statement from the doc: http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/SELECT/TIMESERIESClause.htm
TIME_SLICE can return the start or end time of a time slice, depending
on the value of its fourth input parameter (start_or_end). TIMESERIES,
on the other hand, always returns the start time of each time slice.
When you define a time interval with some start date (2015-01-01, for example), then TIMESERIES ts AS '1 month' will create, for its first time slice, a slice that starts one month before that first data point, so 2014-12-01. When you do DATE_TRUNC('MONTH', ts), that of course sets the first date value to 2014-12-01, even if your start date is 2015-01-03 or whatever.
Edit: I want to throw out one more warning - your use of DATE_TRUNC achieves what you need, I think. But, from the doc: "Unlike TIME_SLICE, the time slice length and time unit expressed in [TIMESERIES] length_and_time_unit_expr must be constants so gaps in the time slices are well-defined." This means that '1 month' is actually 30 days exactly, which obviously causes problems if you're going for more than a couple of years.

Change Character to time stamp in IBM informix DB

I am writing a query to convert a character string to a DATETIME value.
The following query extracts my time stamps in Character format.
select
(to_char(TO_CHAR(MDY(month(current- 1 units month), 1,year(current- 1 units month)),'%Y-%m-%d')||' 13:00:00')),
(to_char(TO_CHAR((DATE(DATE(extend(TODAY, YEAR TO MONTH)) - 1 UNITS DAY)+1),'%d-%m-%Y')||' 13:00:00'))
from dual
Output:
2015-08-01 13:00:00 01-09-2015 13:00:00
2015-08-01 13:00:00 01-09-2015 13:00:00
Now I am trying to convert the character value to a timestamp using the DATETIME(2001-12-31 15:32:55) YEAR TO SECOND syntax, but I am getting a syntax error.
select
DATETIME(to_char(TO_CHAR(MDY(month(current- 1 units month), 1,year(current- 1 units month)),'%Y-%m-%d')||' 13:00:00')) YEAR TO SECOND ,
DATETIME(to_char(TO_CHAR((DATE(DATE(extend(TODAY, YEAR TO MONTH)) - 1 UNITS DAY)+1),'%d-%m-%Y')||' 13:00:00') ) YEAR TO SECOND
from dual
However, the following works fine:
select DATETIME(2001-12-31 15:32:55) YEAR TO SECOND
from dual
Thanks in advance. Please do not suggest answers for Oracle; it's damn easy in Oracle.
Try using a CAST to convert your output to a DATETIME YEAR TO SECOND:
select
(to_char(TO_CHAR(MDY(month(current- 1 units month), 1,year(current- 1 units month)),'%Y-%m-%d')||' 13:00:00'))::DATETIME YEAR TO SECOND ,
(to_char(TO_CHAR((DATE(DATE(extend(TODAY, YEAR TO MONTH)) - 1 UNITS DAY)+1),'%Y-%m-%d')||' 13:00:00'))::DATETIME YEAR TO SECOND
from dual
That seems to work OK, but I'd suggest you don't do this. Also, note that DATETIME values use an ISO-style format: YYYY-MM-DD HH:MM:SS.FFF (or part thereof). Your second example, with the date in English (DD-MM-YYYY) format, is not going to parse to a DATETIME.
Your first algorithm is not leap-day safe, and the second example is horribly over-complicated. You can determine 13:00 on the first day of the current month using the far simpler construction:
(TODAY-DAY(TODAY)+1)::DATETIME YEAR TO SECOND + INTERVAL(13) HOUR TO HOUR
This also has the benefit of avoiding casting back and forth between DATE and CHAR. The calculation of the same time the previous month can be written as:
MDY(MONTH(TODAY-DAY(TODAY)),1,YEAR(TODAY-DAY(TODAY)))::DATETIME YEAR TO SECOND + INTERVAL(13) HOUR TO HOUR
... which will do the right thing on Feb 29/March 1 in a leap year, which your algorithm won't.
The construction TODAY-DAY(TODAY) will always produce the last day of the prior month.
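As a worked illustration - just a sketch reusing the dual table from the question, with hypothetical column aliases - if TODAY were 2015-09-15, then DAY(TODAY) is 15, TODAY - DAY(TODAY) is 2015-08-31 (last day of the prior month), and TODAY - DAY(TODAY) + 1 is 2015-09-01, so the two expressions come out as 2015-09-01 13:00:00 and 2015-08-01 13:00:00:
-- sketch: both expressions from above in one query
SELECT
(TODAY - DAY(TODAY) + 1)::DATETIME YEAR TO SECOND
+ INTERVAL(13) HOUR TO HOUR AS first_of_this_month_1pm,
MDY(MONTH(TODAY - DAY(TODAY)), 1, YEAR(TODAY - DAY(TODAY)))::DATETIME YEAR TO SECOND
+ INTERVAL(13) HOUR TO HOUR AS first_of_prior_month_1pm
FROM dual;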