Google Dataprep Date Serial Number - google-bigquery

In Google Dataprep when I apply min() to a date it gives a long serial number e.g. 1304985600000. I'm trying to get the first order date of a customer but I can't seem to do anything with this number
Thanks

It's a UNIX timestamp represented in milliseconds. With this value in BigQuery, simply do:
SELECT TIMESTAMP_MILLIS(1304985600000)
Output:
Row f0_
1 2011-05-10 00:00:00.000 UTC

You can convert the unix timestamp to a Date in Cloud Dataprep using the UNIXTIMEFORMAT function: https://cloud.google.com/dataprep/docs/html/UNIXTIMEFORMAT-Function_57344709

Related

Convert date to dateTtime format in SQL

I am trying to convert a date column (ie. 2012-10-02) to the first day of the year with time (ie. 2012-01-01T00:00:00) in sql.
Is there a way to do so in the SELECT query?
for BigQuery use below
select timestamp_trunc('2012-10-02', year)
with output
2012-01-01 00:00:00 UTC
Note - if you column is of date type - the output will be
2012-01-01T00:00:00
and finally, you can use datetime_trunc instead of timestamp_trunc and you will get expected result - 2012-01-01T00:00:00
Look at the YEAR() function.
It would allow you to extract just the year, and then just as the date and time you need.

Extract time on sql gcp

I am extracting the time from timestamp using sql on gcp. The timestamp is
2020-05-10 06:14:25.276 UTC
My code is
SELECT
cast( Request_Timestamp as time) AS Time,
FROM conversation;
However, I got the result:
18:08:47.371000
What should I code so I will get the result like:
18:08:47
Thanks
Below is for BigQuery Standard SQL
#standardSQL
SELECT TIME_TRUNC(TIME(Request_Timestamp), SECOND) AS Time
FROM `project.dataset.conversation`
So when applied to 2020-05-10 06:14:25.276 UTC timestamp - output it
Row Time
1 06:14:25

Converting only time to unixtimestamp in Hive

I have a column eventtime that only stores the time of day as string. Eg:
0445AM - means 04:45 AM. I am using the below query to convert to UNIX timestamp.
select unix_timestamp(eventtime,'hhmmaa'),eventtime from data_raw limit 10;
This seems to work fine for test data. I always thought unixtimestamp is a combination of date and time while here I only have the time. My question is what date does it consider while executing the above function? The timestamps seem to be quite small.
Unix timestamp is the bigint number of seconds from Unix epoch (1970-01-01 00:00:00 UTC). The unix time stamp is a way to track time as a running total of seconds.
select unix_timestamp('0445AM','hhmmaa') as unixtimestamp
Returns
17100
And this is exactly 4hrs, 45min converted to seconds.
select 4*60*60 + 45*60
returns 17100
And to convert it back use from_unixtime function
select from_unixtime (17100,'hhmmaa')
returns:
0445AM
If you convert using format including date, you will see it assumes the date is 1970-01-01
select from_unixtime (17100,'yyyy-MM-dd hhmmaa')
returns:
1970-01-01 0445AM
See Hive functions dosc here.
Also there is very useful site about Unix timestamp

date_trunc in hive is working incorrectly

I am running below query:
select a.event_date,
date_format(date_trunc('month', a.event_date), '%m/%d/%Y') as date
from monthly_test_table a
order by 1;
Output:
2017-09-15 | 09/01/2017
2017-10-01 | 09/30/2017
2017-11-01 | 11/01/2017
Can anyone tell me why for date "2017-10-01" it is showing me date as "09/30/2017" after using date_trunc.
Thanks in Advance...!
You are reverse formatting so it is incorrect.
Use the below Code
select a.event_date,
date_format(date_trunc('month', a.event_date), '%Y/%m/%d') as date
from monthly_test_table a
order by 1;
You can use date_add with a logic to subtract 1-day(yourdate) to replicate trunc.
For eg:
2017-10-01 - day('2017-10-01') is 1 and you add 1-1=0 days
2017-08-30 - day('2017-08-30') is 30 and you add 1-30=-29 days
I faced the same issue recently and resorted to using this logic.
date_add(from_unixtime(unix_timestamp(event_date,'yyyy-MM-dd'),'yyyy-MM-dd'),
1-day(from_unixtime(unix_timestamp(event_date,'yyyy-MM-dd'),'yyyy-MM-dd'))
)
PS: As far as i know, there is no date_trunc function in Hive documentation.
As per the source code below: UTC_CHRONOLOGY time is translated w.r.t. locale, also in Description it is mentioned that session timezone will be the precision, also refer to below URL.
#Description("truncate to the specified precision in the session timezone")
#ScalarFunction("date_trunc")
#LiteralParameters("x")
#SqlType(StandardTypes.DATE)
public static long truncateDate(ConnectorSession session, #SqlType("varchar(x)") Slice unit, #SqlType(StandardTypes.DATE) long date)
{
long millis = getDateField(UTC_CHRONOLOGY, unit).roundFloor(DAYS.toMillis(date));
return MILLISECONDS.toDays(millis);
}
See https://prestodb.io/docs/current/release/release-0.66.html:::
Time Zones:
This release has full support for time zone rules, which are needed to perform date/time calculations correctly. Typically, the session time zone is used for temporal calculations. This is the time zone of the client computer that submits the query, if available. Otherwise, it is the time zone of the server running the Presto coordinator.
Queries that operate with time zones that follow daylight saving can
produce unexpected results. For example, if we run the following query
to add 24 hours using in the America/Los Angeles time zone:
SELECT date_add('hour', 24, TIMESTAMP '2014-03-08 09:00:00');
Output: 2014-03-09 10:00:00.000

bigquery standard sql error, invalid timestamp

I'm playing with some tables in bigquery and I receive this error:
Cannot return an invalid timestamp value of -62169990264000000 microseconds relative to the Unix epoch.
The range of valid timestamp values is [0001-01-1 00:00:00, 9999-12-31 23:59:59.999999]
Doing the query in legacy sql and sorting ascending, it displays as 0001-11-29 22:15:36 UTC
How does it get transformed into microseconds?
This is the query:
#standardSQL
SELECT
birthdate
FROM
X
WHERE
birthdate IS NOT NULL
ORDER BY
birthdate ASC
**strong text**Confirming , that in BigQuery Legacy SQL
SELECT USEC_TO_TIMESTAMP(-62169990264000000)
produces 0001-11-29 22:15:36 UTC timestamp
whereas in BigQuery Standard SQL
SELECT TIMESTAMP_MICROS(-62169990264000000)
produces error:
TIMESTAMP value is out of allowed range: from 0001-01-01 00:00:00.000000+00 to 9999-12-31 23:59:59.999999+00.
How does it get transformed in microseconds?
TIMESTAMP
You can describe TIMESTAMP data types as either UNIX timestamps or calendar datetimes. BigQuery stores TIMESTAMP data internally as a UNIX timestamp with microsecond precision.
See more about TIMESTAMP type
Midnight of January 1 of the year 0001 (the minimum possible timestamp value in standard SQL) is -62135596800000000 in microseconds relative to the UNIX epoch, which is greater than -62169990264000000. I don't have a good explanation for legacy SQL's behavior with that timestamp value, but you can read about some suggestions for dealing with it in standard SQL in this item on the issue tracker. We plan to add some content to the migration guide about this timestamp behavior in the future as well.