Convert timestamp value from string to timestamp in Hive

I have a timestamp value stored as a string in a table created in Hive, and I want to convert it to the timestamp type.
I tried the following code:
select date_value, FROM_UNIXTIME(UNIX_TIMESTAMP(date_value, 'dd-MMM-YY HH.mm.ss')) from sales limit 2;
The original values and the results are as follows:
Original time        Result
07-NOV-12 17.07.03   2012-01-01 17:07:03
25-FEB-13 04.26.53   2012-12-30 04:26:53
What's wrong with my script?

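Use yy instead of YY. In the Java-style date patterns that UNIX_TIMESTAMP accepts here, yy is the calendar year, while YY is the week-based year, which is why both of your results fell on the first day of a week-based year rather than on the intended date.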
select date_value
,FROM_UNIXTIME(UNIX_TIMESTAMP(date_value, 'dd-MMM-yy HH.mm.ss')) as ts
from sales
;
+--------------------+---------------------+
| date_value | ts |
+--------------------+---------------------+
| 07-NOV-12 17.07.03 | 2012-11-07 17:07:03 |
| 25-FEB-13 04.26.53 | 2013-02-25 04:26:53 |
+--------------------+---------------------+
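As a cross-check outside Hive, the same two strings can be parsed in Python. This is only a sketch: parse_sales_timestamp is a hypothetical helper name, and the strptime directives are my mapping of the dd-MMM-yy HH.mm.ss pattern, assuming an English (or C) locale for the month abbreviations.

```python
from datetime import datetime


def parse_sales_timestamp(text: str) -> datetime:
    """Parse strings such as '07-NOV-12 17.07.03' (hypothetical helper)."""
    # %d-%b-%y mirrors dd-MMM-yy and %H.%M.%S mirrors HH.mm.ss.
    # %b matches month abbreviations case-insensitively, but it is
    # locale-dependent, so this assumes an English (or C) locale.
    return datetime.strptime(text, "%d-%b-%y %H.%M.%S")


for raw in ("07-NOV-12 17.07.03", "25-FEB-13 04.26.53"):
    print(raw, "->", parse_sales_timestamp(raw))
```

The parsed values should agree with the ts column above if the Hive pattern is right.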

Related

extract date only from given date in oracle sql

What is the best Oracle sql query to extract date?
input entry - 2020-10-14T07:26:32.661Z ,
expected output - 2020-10-14
If you want a DATE data type where the time component is midnight then:
SELECT TRUNC(
TO_TIMESTAMP_TZ(
'2020-10-14T07:26:32.661Z',
'YYYY-MM-DD"T"HH24:MI:SS.FF3TZR'
)
) AS truncated_date
FROM DUAL;
Which (depending on your NLS_DATE_FORMAT) outputs:
| TRUNCATED_DATE |
| :------------------ |
| 2020-10-14 00:00:00 |
(Note: a DATE data type has year, month, day, hour, minute and second components. Whatever client program you are using to access the database may choose not to show the time component but it will still be there.)
If you want a YYYY-MM-DD formatted string then:
SELECT TO_CHAR(
TO_TIMESTAMP_TZ(
'2020-10-14T07:26:32.661Z',
'YYYY-MM-DD"T"HH24:MI:SS.FF3TZR'
),
'YYYY-MM-DD'
) AS formatted_date
FROM DUAL;
| FORMATTED_DATE |
| :------------- |
| 2020-10-14 |
The canonical way is probably trunc():
select trunc(input_entry)
This assumes that input_entry is a date or timestamp value.
EDIT:
If your input is just a string, use string operations:
select substr(input_entry, 1, 10)
You can also readily cast this to a date.
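If the value reaches application code as a string, the same extraction can be sketched in Python. The function name extract_date is made up for illustration, and the sketch assumes the input always carries milliseconds and a trailing Z as in the question.

```python
from datetime import datetime


def extract_date(text: str) -> str:
    """Return the YYYY-MM-DD part of an ISO-8601 UTC timestamp string."""
    # %f accepts the fractional seconds; %z accepts a literal "Z" as UTC
    # on Python 3.7 and later.
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")
    return parsed.date().isoformat()


print(extract_date("2020-10-14T07:26:32.661Z"))
```

Parsing first, rather than slicing the first 10 characters, rejects malformed input instead of passing it through silently.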

Postgres Array Issue

I have the table below and want to load its data, transformed as shown, into another table:
Input Table Data(Tempabc):
ID,COURSE,ENROLL_DT
'12345fgh-2bce-467f',array['BB','TT',''],array['01/07/2007 12:00:00 AM','15/09/2007 12:00:00 AM',''],
'1234rty-863d-4e4f',array['CRKT','HKY',''],array['01/01/2005 12:00:00 AM','01/07/2012 12:00:00 AM','']
Output Data:
ID,COURSE,ENROLL_DT
'12345fgh-2bce-467f',array['BB','TT'],array['01/07/2007','15/09/2007'],
'1234rty-863d-4e4f',array['CRKT','HKY'],array['01/01/2005','01/07/2012']
Can you please help? I have used the query below, but I am unable to extract the date from the third column. The third column is a varchar array when imported from a file, but I want to load it into a target table where it is a date array column:
SELECT ID,
ARRAY_REMOVE(COURSE,'') AS COURSE,ARRAY_REMOVE(ENROLL_DT,'') AS ENROLL_DT
FROM TEMPABC;
However, I am still unable to extract the date from the ENROLL_DT column. Is there a way to extract the date? Can someone please suggest an approach?
If you want to remove the blank elements of the arrays and change their data type, you could array_remove, unnest, cast the values and finally group them again with array_agg, e.g.
WITH tempabc (id,course,enroll_dt) AS (
VALUES
('12345fgh-2bce-467f',array['BB','TT',''],array['01/07/2007 12:00:00 AM','15/09/2007 12:00:00 AM','']),
('1234rty-863d-4e4f',array['CRKT','HKY',''],array['01/01/2005 12:00:00 AM','01/07/2012 12:00:00 AM',''])
)
SELECT id, array_agg(course) AS course, array_agg(enroll_dt) AS enroll_dt FROM (
SELECT id,
unnest(array_remove(course,'')) AS course,
unnest(array_remove(enroll_dt,''))::date AS enroll_dt
FROM tempabc) q
GROUP BY id;
id | course | enroll_dt
--------------------+------------+-------------------------
12345fgh-2bce-467f | {BB,TT} | {2007-07-01,2007-09-15}
1234rty-863d-4e4f | {CRKT,HKY} | {2005-01-01,2012-07-01}
If you're aiming to create a record for each array value, just array_remove and unnest, e.g.
WITH tempabc (id,course,enroll_dt) AS (
VALUES
('12345fgh-2bce-467f',array['BB','TT',''],array['01/07/2007 12:00:00 AM','15/09/2007 12:00:00 AM','']),
('1234rty-863d-4e4f',array['CRKT','HKY',''],array['01/01/2005 12:00:00 AM','01/07/2012 12:00:00 AM',''])
)
SELECT id,
unnest(array_remove(course,'')) AS course,
unnest(array_remove(enroll_dt,''))::date AS enroll_dt
FROM tempabc;
id | course | enroll_dt
--------------------+--------+------------
12345fgh-2bce-467f | BB | 2007-07-01
12345fgh-2bce-467f | TT | 2007-09-15
1234rty-863d-4e4f | CRKT | 2005-01-01
1234rty-863d-4e4f | HKY | 2012-07-01
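One thing to watch: the ::date cast interprets '01/07/2007' according to the session's DateStyle setting, and the output above corresponds to a day-first reading. The Python sketch below spells that interpretation out explicitly. clean_enrollment is a hypothetical helper, and the day-first format string is an assumption based on the value '15/09/2007' in the sample data.

```python
from datetime import date, datetime


def clean_enrollment(courses, enroll_dts):
    """Drop blank array elements and convert the date strings to dates."""
    kept_courses = [c for c in courses if c != ""]
    kept_dates = [
        # Day-first format, assumed from values such as '15/09/2007'.
        datetime.strptime(d, "%d/%m/%Y %I:%M:%S %p").date()
        for d in enroll_dts
        if d != ""
    ]
    return kept_courses, kept_dates


courses, dates = clean_enrollment(
    ["BB", "TT", ""],
    ["01/07/2007 12:00:00 AM", "15/09/2007 12:00:00 AM", ""],
)
print(courses, [d.isoformat() for d in dates])
```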
Further reading:
PostgreSQL Array Functions
PostgreSQL type cast :: operator

How do I subtract two columns with datetime in SQLite?

I am learning SQLite for work and I am trying to subtract the 'Startdate' column from the 'Enddate' column; both contain a date and time. Something like this:
Startdate 3/15/18 16:00 3/28/18 17:00
Enddate 3/19/18 00:00 3/20/18 00:00
My table's name is data1. I tried this:
select *,
strftime('%m/%d/%y %H:%M', 'data1.Enddate') -
(strftime('%m/%d/%y %H:%M', 'data1.Startdate')) as TimeOff
from data1;
But this gives me all 'Null' values.
If you could help me with this I would really appreciate it. Thank you so much!
Two possible reasons you got NULL (likely because of a silent error):
1) Your dates are malformed when you create them. They should be yyyy-mm-dd HH:MM:SS format instead.
2) Not having a closing semicolon in one of your queries. I see it in the one above, but if the one where you insert your test rows didn't close properly, you may not
My test query:
https://dbfiddle.uk/?rdbms=sqlite_3.8&fiddle=8b9a168291bbc08c74a895ce22ab41ac
Setup
CREATE TABLE data1 (foo int, StartDate datetime, EndDate datetime) ;
INSERT INTO data1 (foo, StartDate, EndDate)
VALUES (1,'2018-03-15 16:00:00', '2018-03-28 17:00:00')
, (2,'2018-03-19 00:00:00', '2018-03-20 00:00:00') ;
The Query
SELECT foo, StartDate, EndDate
, julianday(EndDate)-julianday(StartDate) AS TimeOffInDays
, CAST((julianday(EndDate) - julianday(StartDate))*24 AS real) AS TimeOffInHours
FROM data1 ;
Which gives us...
| foo | StartDate | EndDate | TimeOffInDays | TimeOffInHours |
=========================================================================================
| 1 | 2018-03-15 16:00:00 | 2018-03-28 17:00:00 | 13.041666666977 | 313.00000000745 |
| 2 | 2018-03-19 00:00:00 | 2018-03-20 00:00:00 | 1 | 24 |
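Note also that in the original query the column names are wrapped in single quotes, so SQLite treats 'data1.Enddate' as a string literal rather than a column, and strftime returns NULL for a string it cannot read as a date. If the stored values really look like 3/15/18 16:00, one option is to rewrite them into the yyyy-mm-dd HH:MM:SS form before loading. Below is a rough sketch using Python's sqlite3 module; to_iso is a hypothetical helper, and the two sample rows are the ones from my setup, written in the question's format.

```python
import sqlite3
from datetime import datetime


def to_iso(text: str) -> str:
    """Convert 'm/d/yy H:MM' text to SQLite's 'YYYY-MM-DD HH:MM:SS' form."""
    return datetime.strptime(text, "%m/%d/%y %H:%M").strftime("%Y-%m-%d %H:%M:%S")


rows = [
    (1, "3/15/18 16:00", "3/28/18 17:00"),
    (2, "3/19/18 00:00", "3/20/18 00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data1 (foo INTEGER, StartDate TEXT, EndDate TEXT)")
conn.executemany(
    "INSERT INTO data1 VALUES (?, ?, ?)",
    [(foo, to_iso(start), to_iso(end)) for foo, start, end in rows],
)

# julianday() returns fractional days, so multiply by 24 for hours.
hours_off = dict(
    conn.execute(
        "SELECT foo, (julianday(EndDate) - julianday(StartDate)) * 24 FROM data1"
    ).fetchall()
)
print(hours_off)
```

The hours should come out close to 313 and 24, matching the table above up to floating-point noise.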

Is it possible to group by day, month or year with timestamp values?

I have a table ORDERS(idOrder, idProduct, Qty, OrderDate) where OrderDate is a varchar column holding timestamp values. Is it possible to get the Qty for each day, week, month or year?
The table looks like this:
--------------------------------------------
| idOrder | idProduct | Qty | OrderDate  |
--------------------------------------------
| 1       | 5         | 20  | 1504011790 |
| 2       | 5         | 50  | 1504015790 |
| 3       | 5         | 60  | 1504611790 |
| 4       | 5         | 90  | 1504911790 |
--------------------------------------------
and I want something like this:
----------------------------------
| idProduct | Qty | OrderDate  |
----------------------------------
| 5         | 70  | 08/29/2017 |
| 5         | 60  | 09/05/2017 |
| 5         | 90  | 09/08/2017 |
----------------------------------
It looks like you want to do two things here. First, group by idProduct and OrderDate and sum Qty:
select idProduct, sum(Qty), OrderDate from [yourtable] group by idProduct, OrderDate
On its own this only combines rows that share the exact same timestamp, so to get daily sums the grouping has to use the converted date rather than the raw value. I assume that your stamps are in Epoch time (number of seconds since Jan 1, 1970), so converting them takes the form:
dateadd(s,[your time field],'19700101')
It also looks like you want your dates formatted as mm/dd/yyyy.
convert(NVARCHAR, [date],101) is the format for accomplishing that.
Together:
select idProduct, sum(Qty) as Qty, convert(NVARCHAR,dateadd(s,cast(OrderDate as int),'19700101'), 101) as OrderDate
from [yourtable]
group by idProduct, convert(NVARCHAR,dateadd(s,cast(OrderDate as int),'19700101'), 101)
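To sanity-check the expected daily totals, here is a small Python sketch of the same idea: convert each epoch value to a calendar day, then sum per product and day. qty_per_day is a hypothetical helper, the rows are copied from the question, and the conversion assumes the stamps are seconds since 1970 interpreted in UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone

# (idOrder, idProduct, Qty, OrderDate) rows from the question;
# OrderDate is kept as text because the column is a varchar.
orders = [
    (1, 5, 20, "1504011790"),
    (2, 5, 50, "1504015790"),
    (3, 5, 60, "1504611790"),
    (4, 5, 90, "1504911790"),
]


def qty_per_day(rows):
    """Sum Qty per (idProduct, mm/dd/yyyy day), assuming UTC epoch seconds."""
    totals = defaultdict(int)
    for _id_order, id_product, qty, order_date in rows:
        moment = datetime.fromtimestamp(int(order_date), tz=timezone.utc)
        totals[(id_product, moment.strftime("%m/%d/%Y"))] += qty
    return dict(totals)


print(qty_per_day(orders))
```

Grouping by week, month or year works the same way with a different strftime format as the key.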
Unfortunately, the TSQL TIMESTAMP data type isn't really a date. According to this SO question they're even changing the name because it's such a misnomer. You're much better off creating a DATETIME field with a DEFAULT = GETDATE() to keep an accurate record of when a line was created.
That being said, the most performant way I've seen to track dates down to the day/week/month/quarter/etc. is to use a date dimension table that just lists every date and has fields like WeekOfMonth and DayOfYear. Once you join your new DateCreated field to it you can get all sorts of information about that date. You can google scripts that will create a date dimension table for you.
Yes, it's very simple:
TRUNC ( date [, format ] )
The format can be, for example:
TRUNC(TO_DATE('22-AUG-03'), 'YEAR')
Result: '01-JAN-03'
TRUNC(TO_DATE('22-AUG-03'), 'MONTH')
Result: '01-AUG-03'
TRUNC(TO_DATE('22-AUG-03'), 'DDD')
Result: '22-AUG-03'
TRUNC(TO_DATE('22-AUG-03'), 'DAY')
Result: '17-AUG-03'
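For comparison, the four truncations above can be mimicked in Python. This is only an illustration of what each format does: trunc_date is a made-up helper, and the 'DAY' branch assumes the week starts on Sunday, which is what the 17-AUG-03 result implies (in Oracle the first day of the week depends on the session's territory setting).

```python
from datetime import date, timedelta


def trunc_date(d: date, unit: str) -> date:
    """Truncate a date to the start of its year, month, day or week."""
    if unit == "YEAR":
        return d.replace(month=1, day=1)
    if unit == "MONTH":
        return d.replace(day=1)
    if unit == "DDD":
        return d
    if unit == "DAY":
        # date.weekday() is Monday=0 .. Sunday=6; step back to Sunday.
        return d - timedelta(days=(d.weekday() + 1) % 7)
    raise ValueError(f"unsupported unit: {unit}")


sample = date(2003, 8, 22)
for unit in ("YEAR", "MONTH", "DDD", "DAY"):
    print(unit, trunc_date(sample, unit).isoformat())
```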

Django returns wrong results when selecting from a postgres view

I have a view defined in postgres, in a separate schema to the data it is using.
It contains three columns:
mydb=# \d "my_views"."results"
View "my_views.results"
Column | Type | Modifiers
-----------+-----------------------+-----------
Date | date |
Something | character varying(60) |
Result | numeric |
When I query it from psql or adminer, I get results like these:
bb_adminpanel=# select * from "my_views"."results";
Date | Something | Result
------------+-----------------------------+--------------
2015-09-14 | Foo | -3.36000000
2015-09-14 | Bar | -16.34000000
2015-09-12 | Foo | -11.55000000
2015-09-12 | Bar | 11.76000000
2015-09-11 | Bar | 2.48000000
However, querying it through django, I get a different set:
(c is a cursor object on the database)
c.execute('SELECT * from "my_views"."results"')
c.fetchall()
[(datetime.date(2015, 9, 14), 'foo', Decimal('-3.36000000')),
(datetime.date(2015, 9, 14), 'bar', Decimal('-16.34000000')),
(datetime.date(2015, 9, 11), 'foo', Decimal('-11.55000000')),
(datetime.date(2015, 9, 11), 'bar', Decimal('14.24000000'))]
Which doesn't match at all - the first two rows are correct, but the last two are really weird - they have a shifted date, and the Result of the last record is the sum of the last two.
I have no idea why that's happening, any suggestions welcome.
Here is the view definition:
SELECT a."Timestamp"::date AS "Date",
a."Something",
sum(a."x") AS "Result"
FROM my_views.another_view a
WHERE a.status::text = ANY (ARRAY['DONE'::character varying::text, 'CLOSED'::character varying::text])
GROUP BY a."Timestamp"::date, a."Something"
ORDER BY a."Timestamp"::date DESC;
and "another_view" looks like this:
Column | Type | Modifiers
---------------------------+--------------------------+-----------
Timestamp | timestamp with time zone |
Something | character varying(60) |
x | numeric |
status | character varying(100) |
(some columns omitted)
The simple explanation of the problem is: time zones.
In detail: you're not declaring any time zone setting when connecting from the PostgreSQL console, but Django sets one itself. As a result, the timestamp for some records will fall on a different day depending on the time zone in use. For example, with the data
+-------------------------+-----------+-------+--------+
| timestamp | something | x | status |
+-------------------------+-----------+-------+--------+
| 2015-09-11 12:00:00 UTC | foo | 2.48 | DONE |
| 2015-09-12 00:50:00 UTC | foo | 11.76 | DONE |
+-------------------------+-----------+-------+--------+
a query on your view executed with time zone UTC will give you 2 rows, but the same query executed with time zone GMT-2 will give you only one row, because in GMT-2 the timestamp from the second row still falls on 2015-09-11.
To fix that, you can edit your view, so it will always group days according to specified timezone:
SELECT (a."Timestamp" AT TIME ZONE 'UTC')::date AS "Date",
a."Something",
sum(a."x") AS "Result"
FROM my_views.another_view a
WHERE a.status::text = ANY (ARRAY['DONE'::character varying::text, 'CLOSED'::character varying::text])
GROUP BY (a."Timestamp" AT TIME ZONE 'UTC')::date, a."Something"
ORDER BY (a."Timestamp" AT TIME ZONE 'UTC')::date DESC;
That way, days will always be counted according to the 'UTC' time zone.
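The day shift is easy to reproduce outside the database. The sketch below takes the second sample timestamp and asks which calendar date it falls on in UTC and in a fixed UTC-2 offset. local_date is a hypothetical helper and the fixed offset stands in for whatever time zone the Django connection actually uses.

```python
from datetime import datetime, timedelta, timezone

# Second sample row from the example data: 2015-09-12 00:50:00 UTC.
instant = datetime(2015, 9, 12, 0, 50, tzinfo=timezone.utc)
utc_minus_2 = timezone(timedelta(hours=-2))  # stand-in for the session zone


def local_date(moment: datetime, tz: timezone) -> str:
    """Return the calendar date of an instant as seen in a given zone."""
    return moment.astimezone(tz).date().isoformat()


print("UTC:  ", local_date(instant, timezone.utc))
print("UTC-2:", local_date(instant, utc_minus_2))
```

The same instant lands on two different dates, which is the mismatch the ::date cast produces when the session time zone differs between clients.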