Hive SELECT records from 1 hour ago - hive

I have a Hive table that contains a column called timestamp. The timestamp is a bigint field generated from Java's System.currentTimeMillis(), so I assume it is in UTC. Right now I am trying to select records from 1 hour ago. I know in MySQL you can do something like:
SELECT * FROM table WHERE datetimefield >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
In Hive, it seems like NOW() is missing. I did some searching and found unix_timestamp(). I should be able to get the current UTC time in milliseconds with unix_timestamp()*1000.
So if I want to get records from 1 hour ago, I am thinking about doing something like:
SELECT * FROM hivetable WHERE datetimefield >= (unix_timestamp()*1000-3600000);
Can someone suggest if it's the right way to approach this problem? Also what if I want to select like 1 day ago? Seems inconvenient to convert that to milliseconds. Any help or suggested readings will be highly appreciated. Thanks in advance for your help.

Yes, unix_timestamp() returns the seconds elapsed since the Unix epoch. Multiply it by 1000 to get milliseconds, subtract 60*60*1000 milliseconds, and compare the result against your field to get the desired records.
For Hive 1.2.0 and higher you can use current_timestamp
select *
from hivetable
where
datetimefield >= ((unix_timestamp()*1000) - 3600000);
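For 1 day, convert the milliseconds to a timestamp string and use date_sub. Note that date_sub returns a date without a time part, so this selects everything since the start of yesterday rather than exactly the last 24 hours:
select *
from hivetable
where
from_unixtime(cast(datetimefield / 1000 as bigint)) >=
date_sub(from_unixtime(unix_timestamp()), 1);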
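On the inconvenience of converting "1 day" to milliseconds by hand: if the query is built by a client program, the cutoff can be computed there and passed in as a literal. This is only a sketch, assuming the column holds UTC epoch milliseconds as described in the question; cutoff_millis is a hypothetical helper name, not a Hive function, and the "now" value is an arbitrary example.

```python
from datetime import timedelta


def cutoff_millis(now_ms: int, age: timedelta) -> int:
    """Return the epoch-millisecond value that lies `age` before `now_ms`."""
    # Dividing one timedelta by another yields an exact integer count.
    return now_ms - age // timedelta(milliseconds=1)


# Arbitrary illustrative "now" in epoch milliseconds.
now_ms = 1_600_000_000_000
one_hour_ago = cutoff_millis(now_ms, timedelta(hours=1))
one_day_ago = cutoff_millis(now_ms, timedelta(days=1))
print(one_hour_ago, one_day_ago)
```

In real use, now_ms would come from something like int(time.time() * 1000), and the resulting number would replace the right-hand side of the WHERE comparison.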

Related

Timestamp update DB2 SQL

I would like to update a query that selects records within 24 hours of a specific date and time. The current query works fine, except that I need to update two timestamps manually. I am looking to reduce the number of timestamps to one, or to replace them with a dynamic expression, to minimize human error if possible.
Current query looks like this:
SELECT timestamp
FROM table
WHERE timestamp BETWEEN '2023-01-18-06.00.00.000000' AND '2023-01-19-06.00.00.000000'
I have been trying multiple recommended options but it does not work yet:
WHERE timestamp > '2023-01-19-06.00.00.000000' - 24 HOURS
WHERE timestamp > '2023-01-19-06.00.00.000000' - '24 HOURS'
WHERE timestamp ('2023-01-19-06.00.00.000000' - 24 HOURS)
WHERE timestamp > '2023-01-19-06.00.00.000000' - '24.00.00.000000'
WHERE timestamp BETWEEN '2023-01-04-06.00.00.000000' AND INTERVAL - 24 HOURS
WHERE timestamp > CURRENT DATE - 24 HOURS
WHERE timestamp ('2023-01-19' - 1 DAY, ('06.00.00.000000' - 24 HOURS))
Could anyone let me know what I am doing incorrectly?
'2023-01-19-06.00.00.000000' - 24 HOURS is close, but incorrect: DB2 treats the first value as a string, not a timestamp, even though it casts it automatically in your working query. Because you are subtracting a duration, you have to tell DB2 explicitly that the value is a timestamp. You can do that
with the timestamp keyword
WHERE yourtimestamp > timestamp '2023-01-19-06.00.00.000000' - 24 HOURS
or the timestamp function
WHERE yourtimestamp > timestamp('2023-01-19-06.00.00.000000') - 24 HOURS
or this notation
WHERE yourtimestamp > '2023-01-19-06.00.00.000000'::timestamp - 24 HOURS
If you are not using Db2 LUW, or you are on an old version, one or more of these options may not be available.
I suggest you try something like this:
SELECT timestamp
FROM table cross join (values timestamp '2023-01-19-06.00.00.000000') as ref (stamp)
WHERE timestamp between ref.stamp - 24 hours and ref.stamp
For the past 24hrs, as implied by your example
WHERE timestamp > CURRENT DATE - 24 HOURS
You'd want to use CURRENT TIMESTAMP not CURRENT DATE
For a specific period, you'll always need two dates specified in the WHERE clause like so
WHERE timestamp BETWEEN startTs AND endTs
For a specific 24hr period from a given starting timestamp, you can do something like so:
WHERE timestamp BETWEEN startTs AND startTs + 24 hours
You can define startTs as a global variable, and use it in your select
create variable startTs timestamp default('2023-01-18 06:00:00.000');
SELECT timestamp
FROM table
WHERE timestamp BETWEEN startTs AND startTs + 24 hours;
Or you could use a common table expression with a VALUES clause to hold it for use in the query:
WITH tmp (startTime) AS (
VALUES (timestamp('2023-01-19 06:00:00.000'))
)
select timestamp from table
where timestamp between (select startTime from tmp limit 1)
and (select startTime + 24 hours from tmp limit 1);
Depending on your use case, it might be worthwhile to encapsulate the statement as a stored procedure or a user defined table function (UDTF)...
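If the query text is generated by a client program, the same "one timestamp in, two bounds out" idea can be applied before the SQL is ever sent. A rough Python sketch, assuming the DB2 string format shown in the question; window_before and DB2_FORMAT are hypothetical names made up for this illustration.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Matches strings such as 2023-01-19-06.00.00.000000
DB2_FORMAT = "%Y-%m-%d-%H.%M.%S.%f"


def window_before(end_text: str, hours: int = 24) -> Tuple[str, str]:
    """Return (start, end) strings for the window ending at `end_text`."""
    end = datetime.strptime(end_text, DB2_FORMAT)
    start = end - timedelta(hours=hours)
    return start.strftime(DB2_FORMAT), end.strftime(DB2_FORMAT)


start_text, end_text = window_before("2023-01-19-06.00.00.000000")
print(start_text, end_text)
```

The two strings can then be bound as the BETWEEN parameters, so only the single end timestamp has to be maintained by hand.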

timestamp string to timestamp in sql

The start_date from Stripe is saved as a timestamp string like "1652789095".
Now I want to filter on this timestamp string to get rows from the last 12 months.
What should I do?
How can I filter with this timestamp string?
These are some examples - I'm sure there are plenty of options that would work.
convert to date
select *
from Table
where
from_unixtime(cast(start_date as unsigned)) > date_add(now(), interval -1 year);
work with unix timestamps
-- approx 1 year ago, by way of example
select *
from Table
where
start_date > '1621253095';
-- exactly one year ago, calculated dynamically
select *
from Table
where
start_date >
cast(unix_timestamp(date_add(now(), interval -1 year)) as char);
I'm not a MySQL guy really so forgive any syntax errors and fix up the sql as needed to work in MySQL.
Resources:
PostgreSQL: how to convert from Unix epoch to date?
https://www.postgresonline.com/article_pfriendly/3.html
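One caveat with comparing the epoch values as strings: it only behaves like a numeric comparison while both strings have the same number of digits. A small Python sketch of the conversion and of that pitfall, assuming the values are epoch seconds in UTC; the function names are hypothetical and 365 days is used as an approximation of "12 months".

```python
from datetime import datetime, timedelta, timezone


def parse_epoch_seconds(text: str) -> datetime:
    """Convert an epoch-seconds string such as "1652789095" to a UTC datetime."""
    return datetime.fromtimestamp(int(text), tz=timezone.utc)


def is_within_last_year(start_date: str, now: datetime) -> bool:
    # 365 days is an approximation of "the last 12 months".
    return parse_epoch_seconds(start_date) > now - timedelta(days=365)


# Fixed "now" so the example is reproducible.
now = datetime(2022, 6, 1, tzinfo=timezone.utc)
print(parse_epoch_seconds("1652789095"))
print(is_within_last_year("1652789095", now))

# String comparison is lexicographic: a 9-digit value sorts after a
# 10-digit one here even though it is numerically smaller.
print("999999999" > "1621253095", 999999999 > 1621253095)
```

Converting to a number or a real timestamp before comparing avoids depending on the digit count.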

Presto SQL / Athena: select between times across different days

I have a database that contains a series of events and their timestamp.
I find myself needing to select all events that happen between 11:00 and 11:10 and 21:00 and 21:05, for all days.
So what I would do is I extract from timestamp the hour and the minute, and:
SELECT *
WHERE (hour = 11 AND minute <= 10)
OR (hour = 21 AND minute <= 05)
However, I was wondering if there's a simpler / less verbose way to do this, such as when you query between dates:
SELECT *
WHERE date BETWEEN '2020-07-01' AND '2020-07-05'
I read here that this is doable in SQLite, I was wondering if it's possible to be done in presto as well. I've looked at the docs but couldn't find an analogue function that does what time() does in SQLite.
You could use date formatting functions, e.g. date_format, then string comparisons:
select *
from mytable
where
date_format(mydate, '%H:%i') between '11:00' and '11:09'
or date_format(mydate, '%H:%i') between '21:00' and '21:04'
Note that I subtracted one minute from the upper bound, since I assume you don't want to include the last minute. between '11:00' and '11:09' gives you everything from 11:00:00 to 11:09:59.
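To make the boundary behaviour concrete, here is the same filter expressed over plain Python datetimes. It is a sketch only, using half-open windows (start included, end excluded) to mirror the 11:00:00 to 11:09:59 interpretation above; WINDOWS and in_any_window are hypothetical names.

```python
from datetime import datetime, time

# Half-open windows: start is included, end is excluded.
WINDOWS = [
    (time(11, 0), time(11, 10)),
    (time(21, 0), time(21, 5)),
]


def in_any_window(ts: datetime, windows=WINDOWS) -> bool:
    """Return True if the time-of-day part of `ts` falls in any window."""
    t = ts.time()
    return any(start <= t < end for start, end in windows)


print(in_any_window(datetime(2020, 7, 1, 11, 9, 59)))
print(in_any_window(datetime(2020, 7, 1, 11, 10, 0)))
```

If the last minute should be included after all, the upper bounds in the SQL would be '11:10' and '21:05' instead.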

SQL - how to change the format of a current_timestamp to have 'mm ss' as zeros?

I want to check if a metric is still missing 4 hours later and return a single record if it exists. I wrote a query that checks if there were metrics in the last 4 hours. But I need to check if there is a metric for a certain hour that was expected to load 4 hours before.
-- Returns records that appeared within the last 4 hours
select * from main.basic_metrics
where metric_name = 'common_metric'
and transaction_time > current_timestamp - interval 4 hours
The problem is that transaction_time is in the following format 2019-10-30T12:00:00.000+0000 where mm ss are always zeros. So when I check it like transaction_time = current_timestamp - interval 4 hours it returns nothing since current_timestamp contains mm ss data.
How should I format timestamp to the format similar to transaction_time - 2019-10-30T12:00:00.000+0000 ?
UPD: There was a typo, mentioned in the comments below. fixed it
That should be very simple: cast the string to timestamp with time zone:
WHERE CAST(transaction_time AS timestamp with time zone)
> current_timestamp - INTERVAL '4 hours'
Try the following:
select * from main.basic_metrics
where metric_name = 'common_metric'
and transaction_time = date_trunc('hour',current_timestamp) - interval 4 hours
This is not necessarily the best query for what you're doing, but it does solve the problem you're having. My guess is that some version of "between" or > and < would solve it; however, without knowing exactly how the "transaction time" is populated, I can only venture guesses.
The trick in my example is to "truncate" everything after the "hours" off of the current_timestamp using date_trunc()
Note: It helps a lot to realize that timestamps are NOT formatted. Timestamps are a single long integer field that happens to get formatted on your screen so you can make sense of it. Text comparisons are nearly always the wrong way to do things, and datetime aware functions are the preferred method of doing any comparison.
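The truncation idea can be checked outside the database as well. A short Python sketch, assuming the stored value has the 2019-10-30T12:00:00.000+0000 shape from the question; expected_hour is a hypothetical helper and the "now" value is made up for the example.

```python
from datetime import datetime, timedelta, timezone


def expected_hour(now: datetime, hours_back: int = 4) -> datetime:
    """Truncate `now` to the top of the hour, then step back `hours_back` hours."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_hour - timedelta(hours=hours_back)


# Made-up current time with non-zero minutes and seconds.
now = datetime(2019, 10, 30, 16, 37, 12, tzinfo=timezone.utc)
target = expected_hour(now)

# Parse the stored text into a real timestamp instead of comparing strings.
stored = datetime.strptime("2019-10-30T12:00:00.000+0000", "%Y-%m-%dT%H:%M:%S.%f%z")
print(target, stored, target == stored)
```

Both sides are compared as timestamps here, which is the point of the note above: the text format only matters when parsing, not when comparing.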

How to convert Epoch time to date?

Hi, I have a column with a number datatype.
The data looks like 1310112000. This is a date, but I don't know how to convert it into a readable format,
e.g. 10-mar-2013 12:00:00 pm
Can anyone please help me?
That is EPOCH time: number of seconds since Epoch(1970-01-01). Use this:
SELECT CAST(DATE '1970-01-01' + ( 1 / 24 / 60 / 60 ) * '1310112003' AS TIMESTAMP) FROM DUAL;
Result:
08-JUL-11 08.00.03.000000000 AM
In MySQL, if the epoch value is stored in milliseconds rather than seconds, try:
select from_unixtime(floor(EPOCH_TIMESTAMP/1000)) from table;
This gives a result such as 2018-03-22 07:10:45.
In Microsoft SQL Server, the previous answers did not work for me. But the following does work.
SELECT created_time AS created_time_raw,
dateadd( second, created_time, CAST( '1970-01-01' as datetime ) ) AS created_time_dt
FROM person
person is a database table, and created_time is an integer field whose value is a number of seconds since epoch.
There may be other ways to do the datetime arithmetic. But this is the first thing that worked. I do not know if it is MSSQL specific.
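All of the answers above do the same arithmetic: start at 1970-01-01 and add the stored number of seconds. A minimal Python sketch of that calculation, assuming the value is epoch seconds in UTC; EPOCH and epoch_seconds_to_datetime are names chosen for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_seconds_to_datetime(seconds: int) -> datetime:
    """Add `seconds` to 1970-01-01 UTC, like the dateadd() approach above."""
    return EPOCH + timedelta(seconds=seconds)


value = epoch_seconds_to_datetime(1310112000)
# The month abbreviation and AM/PM marker depend on the current locale.
print(value.strftime("%d-%b-%Y %I:%M:%S %p"))
```

The standard library's datetime.fromtimestamp(seconds, tz=timezone.utc) performs the same conversion directly.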