Transform every row in a column to date, using first unix_timestamp - hive

I have rows with the following format and I would like to transform them into valid Hive timestamps. Format in my data:
28/04/2017 00:00:00|20550|22/05/2017 00:00:00|
I'm only interested in the first and third columns, separated by |. In my case the format is then:
dd/MM/yy HH:mm:ss
I've discovered this can't be used as a timestamp in Hive.
I find myself unable to transform all of the first and third columns to the proper format using queries similar to:
select from_unixtime(unix_timestamp('28/04/2017','dd/MM/yy HH:mm:ss'),'yyyy-MM-dd') from `20170428_f_pers_pers`
I'm trying different variations of that query, but since I can't access the documentation (internet is capped here at work), I can't see how to properly use these two functions, from_unixtime and unix_timestamp.
I've made the following assumptions:
I can reorder the days and years. If this isn't true, I have no idea how to transform my original data into proper Hive format
When I do this select, it affects the whole column. Further, after doing this with success I should be able to change the format of the whole column from string to timestamp (maybe I have to create a new column for that, not sure)
I do not care about doing both columns at once, but right now when I run the query shown first I get as many nulls as my table has rows, and I'm unsure my assumptions are even partially right, since every example I come across is simpler (they do not change days and years around, for instance).
I would like to know how to apply the query to a specific column, since I haven't understood how to do that from the examples studied so far. I do not see them using any type of column ID for that, which is weird to me, using data from the column to change the column itself.
Thanks in advance.
edit: I am now trying something like
select from_unixtime(unix_timestamp(f_Date, 'dd/MM/yyyy HH:mm:ss')) from `myTable`
But I get from HUE the following error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

The format should be completely covered by the input string.
In other words -
The format can be equal in length to the input string or shorter, but not longer.
28/04/2017 00:00:00
|||||||||||||||||||
dd/MM/yyyy HH:mm:ss
select from_unixtime(to_unix_timestamp('28/04/2017 00:00:00', 'dd/MM/yyyy HH:mm:ss'))
2017-04-28 00:00:00
28/04/2017 00:00:00
||||||||||
dd/MM/yyyy
select from_unixtime(to_unix_timestamp('28/04/2017 00:00:00', 'dd/MM/yyyy'))
2017-04-28 00:00:00
The result can be converted from string to timestamp using cast
select cast (from_unixtime(to_unix_timestamp('28/04/2017 00:00:00', 'dd/MM/yyyy HH:mm:ss')) as timestamp)
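To apply the same conversion to a whole column rather than a literal, the column name goes where the string literal was. The sketch below assumes the column and table names from the question's edit (f_Date, myTable); the alias f_date_ts and the target table myTable_ts are hypothetical names used only for illustration.

```sql
-- Convert every row of the string column f_Date to a timestamp.
-- f_date_ts is a hypothetical alias.
select cast(from_unixtime(unix_timestamp(f_Date, 'dd/MM/yyyy HH:mm:ss')) as timestamp) as f_date_ts
from `myTable`;

-- One possible way to keep the result: write it to a new table.
-- myTable_ts is a hypothetical table name.
create table `myTable_ts` as
select cast(from_unixtime(unix_timestamp(f_Date, 'dd/MM/yyyy HH:mm:ss')) as timestamp) as f_date_ts
from `myTable`;
```

A select does not change the stored column; it only returns converted values, which is why the second statement writes them to a new table.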

Related

Issues while converting timestamp to specific timezone and then converting it to date in bigquery

I am doing a simple conversion of a timestamp column value to a specific timezone and then extracting the date from it, to create analytical charts based on the output of the query.
I have a column of type timestamp in BigQuery, and the values in that column are in UTC. I need to convert them to PST (which is -8:00 GMT). I expected this to be straightforward, but some of the dates in the output come out differently than I expect.
From the output that I was getting I took one abnormal output and wrote a query out of it as below:
select "2021-05-27 18:10:10" as timestampvalue ,
Date(Timestamp("2021-05-27 18:10:10" ,"-8:00")) as completed_date1,
Date(Timestamp("2021-05-27 18:10:10","America/Los_Angeles")) as completed_date2,
Date(TIMESTAMP_SUB("2021-05-27 18:10:10", INTERVAL 8 hour)) as completed_date3,
Date(Timestamp("2021-05-27 18:10:10","America/Tijuana")) as completed_date4
The output that I get is as below:
Based on my understanding, I need to subtract 8 hours from the time to get the timestamp value for the timezone I want, and by that logic the completed_date3 column shows the correct value. But if I use the other timezone conversions suggested in the Google documentation, the output changes to 2021-05-28, and I am not able to understand how that can happen.
Can anyone let me know what I am doing wrong?
I was actually using it the wrong way. I need to use it as below:
select "2021-05-27 18:10:10" as timestampvalue ,
Date(Timestamp("2021-05-27 18:10:10") ,"-8:00") as completed_date1,
Date(Timestamp("2021-05-27 18:10:10"),"America/Los_Angeles") as completed_date2,
Date(TIMESTAMP_SUB("2021-05-27 18:10:10", INTERVAL 8 hour)) as completed_date3,
Date(Timestamp("2021-05-27 18:10:10"),"America/Tijuana") as completed_date4
Initially I was converting the string to a timestamp as if it were a local time in the given timezone, which is not what I wanted.
Now, if I convert the string to a timestamp first without the time zone parameter, and then apply the time zone parameter when extracting the date, it returns the correct date.
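The difference between the two forms can be seen side by side. This is a sketch using the same literal as above; the column aliases are illustrative. Note that on this date America/Los_Angeles observes daylight saving time, so its offset is -7:00 rather than -8:00.

```sql
SELECT
  -- The string is read as Los Angeles local time: 18:10:10 PDT = 01:10:10 UTC the next day.
  TIMESTAMP("2021-05-27 18:10:10", "America/Los_Angeles") AS read_as_la_time,
  -- DATE() without a time zone takes the UTC date of that instant: 2021-05-28.
  DATE(TIMESTAMP("2021-05-27 18:10:10", "America/Los_Angeles")) AS utc_date_of_la_time,
  -- The string is read as UTC, then the date is taken in Los Angeles: 2021-05-27.
  DATE(TIMESTAMP("2021-05-27 18:10:10"), "America/Los_Angeles") AS la_date_of_utc_time;
```

In the first form the time zone argument tells TIMESTAMP how to interpret the string; in the last form it tells DATE which calendar day to report for a fixed instant.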

Convert string to date with added time Oracle

I have two values I want to compare. Samples of those are the following:
'01/04/2020T07.08.45'
'2020-04-01 14:46'
I want to transform the first value into the same mask as the second, so I can use it to join two tables later on.
The first value is saved in a different time zone than the second. I also need to convert it by adding two hours to the first value, depending on the moment I run the comparison.
Using substring didn't solve my problem, as I wasn't able to add two hours to a string. The to_date function didn't get me anywhere either.
Can you help me?
You have to convert both values to dates.
TO_DATE('2020-04-01 14:46', 'YYYY-MM-DD HH24:MI')
TO_DATE('01/04/2020T07.08.45', 'DD/MM/YYYY"T"HH24.MI.SS')
I'm not sure about the time zone part; the T here looks like a literal separator, so it is quoted in the format mask.
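Once the first value is a DATE, the two hours can be added with interval arithmetic and the result formatted with the second value's mask. A minimal sketch, assuming the fixed two-hour offset mentioned in the question; the alias shifted_value is illustrative.

```sql
SELECT TO_CHAR(
         TO_DATE('01/04/2020T07.08.45', 'DD/MM/YYYY"T"HH24.MI.SS') + INTERVAL '2' HOUR,
         'YYYY-MM-DD HH24:MI'
       ) AS shifted_value
FROM dual;
-- 2020-04-01 09:08
```

For the join itself it may be simpler to compare DATE values rather than strings, for example by truncating the first value to the minute with TRUNC(..., 'MI'), since the second value carries no seconds.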

Oracle: One attribute with date, another with date and time?

I am taking an introductory course on databases, so I am a complete beginner. In the database I am supposed to create I have two tables, each containing a DATE column.
In the first table, I want it to display only a date (DD-MM-YY), and in the second table a date and time (DD-MM-YY HH24:MI).
How can I format each attribute to have these respective formats? I've looked around and tried the following command:
ALTER SESSION SET nls_date_format = 'YYYY-MM-DD HH24:MI:SS'
This works nicely for the date and time field but leaves 00:00:00 for the date-only field, which I do not want, so I reverted it back to nls_date_format = 'DD-MM-YY'.
As of right now the following:
INSERT INTO ITEM (ITEM_ENDDATEANDTIME)
VALUES ('13-AUG-13 23:56:00');
Gives me the error: date format picture ends before converting entire input string
Any ideas? Again, I'm a beginner and a lot of this is new to me! Thanks!
What you want to do is always store your date values as dates. Data manipulation is infinitely easier when you have stored them in this format instead of in a text based format.
Then, you output them into a more human readable format through a query using syntax similar to this:
SELECT DateField,
TO_CHAR(DateField, 'YYYY-MM-DD HH24:MI:SS') AS Date1,
TO_CHAR(Datefield, 'DD-MM-YY') AS Date2
FROM MyTable
This takes the date data and outputs it as a formatted string. I hope this helps.
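The insert error in the question (ORA-01830) most likely comes from the string being implicitly converted with the session's nls_date_format of 'DD-MM-YY', which has no time part. A sketch of the insert with an explicit format mask, using the table and column names from the question:

```sql
-- The format mask describes the string being inserted, so the session's
-- nls_date_format no longer matters for this statement.
-- MON assumes the session date language is English.
INSERT INTO ITEM (ITEM_ENDDATEANDTIME)
VALUES (TO_DATE('13-AUG-13 23:56:00', 'DD-MON-RR HH24:MI:SS'));
```

Both columns can then stay plain DATE columns, and the date-only or date-and-time appearance is chosen at query time with TO_CHAR.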

PLSQL: Query star schema time-dimension without stored date

I have a star schema database with an Hour-Dimension (Time-dimension), with the following columns in it:
ID, ON_HOUR, ON_DAY, IN_MONTH, IN_YEAR
I then query the database, and I want to find all entries within an interval of given dates, based on this Hour-Dimension.
However comparing the ON_DAY attribute with the interval days and so on with IN_MONTH and IN_YEAR, I can often reach a case where I receive no data, if the interval spans over several months. Thus I need to convert these values to a timestamp, however I am querying into the database, so how do I compare my given timestamps with the time data properly? I do not have a stored DATE nor TIMESTAMP in the database - should I change this?
Right now, my best bet is something like this:
to_timestamp('H.IN_YEAR-H.IN_MONTH-H.ON_DAY H.ON_HOUR:00:00', 'YYYY-MM-DD hh24:mi:ss')
This does not seem to work however, and it also looks dodgy, so I didn't really expect it to...
What is the best way to get the entries within my given interval of dates?
You appear to be passing a literal string into the timestamp function. You need to pass in the column values as a concatenated string, using the || concatenation operator. Try the code snippet below:
to_timestamp(H.IN_YEAR || '-' || H.IN_MONTH || '-' || H.ON_DAY || ' ' || H.ON_HOUR, 'YYYY-MM-DD HH24')
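Put into a complete query, the concatenation approach might look like the sketch below. The table name HOUR_DIM and the interval bounds are hypothetical placeholders; the column names are the ones listed in the question, assumed here to hold plain numbers.

```sql
-- Build a timestamp from the dimension columns and filter on an interval
-- that spans several months. HOUR_DIM and the bounds are hypothetical.
SELECT H.ID
FROM   HOUR_DIM H
WHERE  TO_TIMESTAMP(
         H.IN_YEAR || '-' || H.IN_MONTH || '-' || H.ON_DAY || ' ' || H.ON_HOUR,
         'YYYY-MM-DD HH24'
       ) BETWEEN TIMESTAMP '2013-11-28 00:00:00'
             AND TIMESTAMP '2014-02-03 23:00:00';
```

The separators between the parts keep single-digit months, days and hours unambiguous. Because the timestamp is computed for every row, a stored DATE or TIMESTAMP column in the dimension would probably be easier to query and to index.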

Convert date format in Oracle

I have a date in the format 2011-01-06T06:30:10Z in Excel. I want to load just the date part into a table from Excel. How do I get the date part from it?
i.e. 2011-01-06
Thanks
Try this:
select cast(TO_TIMESTAMP('2011-01-06T06:30:10Z', 'YYYY-MM-DD"T"HH24:MI:SS"Z"') as date) from dual
I think some more explanation is needed.
Loading data into the database is one part, and displaying it after fetching is another.
If you have loaded the data into the database, all you need to do is use TRUNC. It truncates the time portion and displays only the date portion.
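As a minimal sketch of TRUNC, using the value from the question converted inline so the statement is self-contained (the alias date_only is illustrative):

```sql
-- TRUNC resets the time portion of a DATE to midnight.
SELECT TRUNC(
         CAST(TO_TIMESTAMP('2011-01-06T06:30:10Z', 'YYYY-MM-DD"T"HH24:MI:SS"Z"') AS DATE)
       ) AS date_only
FROM dual;
```

How the result is displayed still depends on the session's date format or on an explicit TO_CHAR.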
A DATE always has both a date and a time part. TIMESTAMP is an extension of the DATE type. What the date looks like on screen is not the way it is stored in the database; the format is there for us human beings to understand. A date is stored in 7 bytes in an internal format.
More information, based on the OP's question in the comments:
NEVER store a DATE as a VARCHAR2 datatype. A date is not a string literal. Oracle provides a lot of FORMAT MODELS to display the datetime the way you want. Sooner or later, you will run into performance issues due to data conversion. Always use explicit conversion to convert a literal to a proper DATE before comparing it with another date value.
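An explicit conversion in a comparison could look like the sketch below. The table my_table and the column created_at are hypothetical names; created_at is assumed to be a DATE column.

```sql
-- Compare a DATE column against literals converted with an explicit format mask,
-- selecting all rows that fall on 2011-01-06.
SELECT *
FROM   my_table
WHERE  created_at >= TO_DATE('2011-01-06', 'YYYY-MM-DD')
AND    created_at <  TO_DATE('2011-01-07', 'YYYY-MM-DD');
```

Using a half-open range on the column, rather than applying a function to it, leaves any index on created_at usable.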