An earlier solution for getting the difference between two timestamps in milliseconds does not work in Hive 1.0 on Amazon EMR. In my testing today, Hive returns a blank column when casting a timestamp as double, and the CAST throws no errors. Being able to calculate the time difference between two columns of type "timestamp" in fractions of a second is critical to our analysis. Any ideas?
You could convert both values with unix_timestamp(timestamp), but since it returns whole seconds I think you will still lose the milliseconds.
select (unix_timestamp(DATE1)-unix_timestamp(DATE2)) TIMEDIFF from TABLE;
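To show why a seconds-based difference drops the fraction, here is a small sketch in Python rather than Hive. The two timestamp values are made up for illustration, and the epoch conversion only approximates what unix_timestamp() does; it is not a Hive solution.

```python
from datetime import datetime

# Hypothetical sample values, not taken from the original table.
date1 = datetime.strptime("2015-06-02 07:00:53.750", "%Y-%m-%d %H:%M:%S.%f")
date2 = datetime.strptime("2015-06-02 07:00:51.125", "%Y-%m-%d %H:%M:%S.%f")

# Whole-second epoch values, analogous to a seconds-only conversion.
epoch = datetime(1970, 1, 1)
secs1 = int((date1 - epoch).total_seconds())
secs2 = int((date2 - epoch).total_seconds())
diff_whole_seconds = secs1 - secs2  # the .750 and .125 fractions are gone

# Millisecond difference, computed with integer arithmetic on the timedelta.
delta = date1 - date2
diff_millis = (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000

print(diff_whole_seconds, diff_millis)
```

The real gap here is 2.625 seconds, but the seconds-only subtraction reports 2, which is the loss of precision mentioned above.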
BigQuery TIMESTAMP datatype has microsecond precision, 6 fractional seconds.
When I run the following query
SELECT CAST("2020-06-02 07:00:53.001000" AS TIMESTAMP) AS as_timestamp
I would expect 2020-06-02 07:00:53.001000 UTC
What I get instead is ... 2020-06-02 07:00:53.1000 UTC
Because there are two leading 0s in the fractional part, BigQuery omits them for some reason. Can anyone help me stop BigQuery from omitting these leading 0s? I'm trying to calculate time differences between timestamps and this is throwing my calculations off.
Thanks
I strongly believe this is a UI bug, not a BigQuery engine bug.
Below are two proofs of this.
Proof 1
Look at the JSON tab to see the actual value returned by BQ.
Proof 2
I ran the same query in another BigQuery IDE (I personally use Goliath BigQuery IDE), and you can see the correct result.
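One plausible way a display layer could produce ".1000" is by printing the microsecond field without zero padding. This is only a guess at the mechanism, not the console's actual code. A minimal Python sketch of the effect:

```python
from datetime import datetime

ts = datetime.strptime("2020-06-02 07:00:53.001000", "%Y-%m-%d %H:%M:%S.%f")
base = ts.strftime("%Y-%m-%d %H:%M:%S")

# The stored value is 1000 microseconds in both cases; only the rendering differs.
padded = f"{base}.{ts.microsecond:06d}"   # zero-padded to six digits
unpadded = f"{base}.{ts.microsecond}"     # padding omitted

print(padded)
print(unpadded)
```

The underlying value is the same either way, which fits the idea that calculations on the timestamp are unaffected and only the rendered text is wrong.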
I am currently using the function
select date_cmp_timestamp('2008-01-25', '2008-01-24 06:43:24')
But this one only compares a date and a timestamp. Is there a function for comparing two timestamps, so I can see whether the seconds, hours, or minutes are different? I haven't been able to find one. Thank you.
My solution is the following
SELECT TIMESTAMP_ADD('1970-01-01', INTERVAL 1551692341 SECOND) AS ts
Is there any other, more readable way to convert a Unix timestamp to a datetime?
Yes there is.
TIMESTAMP_SECONDS(int64_expression). Description: Interprets int64_expression as the number of seconds since 1970-01-01 00:00:00 UTC.
Example:
SELECT timestamp_seconds(1551692341)
returns
2019-03-04 09:39:01 UTC
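As a cross-check outside BigQuery, the same epoch value can be converted in Python. This sketch only confirms the arithmetic; it is not a substitute for the SQL. It also shows that adding the seconds to 1970-01-01 gives the same instant.

```python
from datetime import datetime, timedelta, timezone

epoch_seconds = 1551692341

# Direct conversion, interpreting the value as seconds since the Unix epoch in UTC.
ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Same result by adding the seconds to 1970-01-01, like the TIMESTAMP_ADD form.
ts_added = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=epoch_seconds)

formatted = ts.strftime("%Y-%m-%d %H:%M:%S UTC")
print(formatted)
```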
I am curious: why would you want anything simpler than what you already have?
Alternatively, BQ supports UDFs (user-defined functions): you can create TEMP functions that embed a JS snippet to parse the epoch value and convert it into a date value in whatever format suits your data.
In most cases, it is better to keep such formatting and value transformations in the app layer rather than create bespoke utilities in the DB layer.
In any case, you may want to have a quick read here
How can I find the last DML or DQL update timestamp for a Hive table? I can find TransientDDLid by using "formatted describe ", but that helps only with getting the modified date. How can I figure out the latest updated date for a Hive table (managed or external)?
Run show table extended like 'table_name';
It will give the number of milliseconds elapsed since the epoch.
Copy that number, remove the last 3 digits (which turns milliseconds into seconds), and run select from_unixtime(no. of seconds elapsed since epoch)
e.g. for 1532442615733: select from_unixtime(1532442615);
This will give you the timestamp of that moment in the current system's time zone.
I guess this is what you're looking for...
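The "remove the last 3 digits" step amounts to integer division by 1000. A minimal Python sketch of the same arithmetic follows; it uses UTC instead of the system time zone so the result is reproducible, which is a deliberate difference from from_unixtime.

```python
from datetime import datetime, timezone

millis = 1532442615733  # value from the example above

# Split into whole seconds and the leftover milliseconds.
seconds, leftover_ms = divmod(millis, 1000)

# Convert the seconds to a timestamp (UTC here, not the system time zone).
ts = datetime.fromtimestamp(seconds, tz=timezone.utc)
formatted = ts.strftime("%Y-%m-%d %H:%M:%S")

print(seconds, leftover_ms, formatted)
```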
I'm running a very small database that contains a table with a column of type INTERVAL HOUR TO MINUTE. Although this means the table will only store time intervals with minute precision, the database system I am using (PostgreSQL) returns an interval with microsecond precision from an aggregate function such as AVG(). Can I rely on this behavior, or is it possible that in the future the database system will return values with only minute precision? How do other DBMSs behave in this respect?
I'm asking because values in the table do not require finer than minute precision, but I expect higher precision when I use an aggregate function.
An aggregate function such as avg() has to return the general form of an interval, as the average of multiple values can lie in between. This will definitely not change in future releases. Also, the datatypes are identical internally. Just the least significant parts get truncated.
The behavior is similar with other datatypes. If you compute an average over an integer column, you get a result of type numeric that can hold exact results.
If you want the results to be truncated (not your request), you can always cast to interval hour to minute explicitly to be sure.
SELECT avg(i)::interval hour to minute from mytbl;
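To see why an average of minute-precision intervals needs finer precision, here is a small Python sketch using timedelta. The three sample values are hypothetical, and the final step only mirrors the truncation described above; it is not PostgreSQL's implementation.

```python
from datetime import timedelta

# Hypothetical minute-precision intervals.
intervals = [
    timedelta(hours=1, minutes=0),
    timedelta(hours=1, minutes=1),
    timedelta(hours=1, minutes=1),
]

# The average falls between two whole minutes: 3:02:00 / 3 = 1:00:40.
average = sum(intervals, timedelta()) / len(intervals)

# Truncating back to whole minutes drops the 40 seconds.
truncated = timedelta(minutes=average // timedelta(minutes=1))

print(average, truncated)
```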
I can't say much about other RDBMSes. Maybe additional answers can fill in here?