Redshift: Running query using GETDATE() at specified list of times - sql

So, I have a query that uses GETDATE() in WHERE and HAVING clauses:
SELECT GETDATE(), COUNT(*) FROM (
SELECT 1 FROM events
WHERE (event_time > (GETDATE() - interval '25 hours'))
GROUP BY id
HAVING MAX(event_time) BETWEEN (GETDATE() - interval '25 hours') AND (GETDATE() - interval '24 hours')
)
I'm basically trying to find the number of unique ids that have their latest event_time between 25 and 24 hours ago with respect to the current time.
The problem: I have another table query_dts which contains one column containing timestamps. Instead of running the above query on the current time, using GETDATE(), I need to run in on the timestamp of every entry of the query_dts table. Any ideas?
Note: I'm not really storing query_dts anywhere. I've created it like this:
WITH query_dts AS (
SELECT (
DATEADD(hour,-(row_number() over (order by true)), getdate())
) as n
FROM events LIMIT 48
),
which I got from here

How about avoiding the generator altogether and instead just splitting the intervals:
SELECT
dateadd(hour, -distance, getdate()),
count(0) AS event_count
FROM (
SELECT
id,
datediff(hour, max(event_time), getdate()) AS distance
FROM events
WHERE event_time > getdate() - INTERVAL '2 days'
GROUP BY id) AS events_with_distance
GROUP BY distance;

You can use a JOIN to combine the two queries. Then you just need to substitute the values for your date expression. I think this is the logic:
WITH query_dts AS (
SELECT DATEADD(hour, -(row_number() over (order by true)), getdate()) as n
FROM events
LIMIT 48
)
SELECT d.n, COUNT(*)
FROM (SELECT d.n
FROM events e JOIN
query_dts d
WHERE e.event_time > d.n
GROUP BY id
HAVING MAX(event_time) BETWEEN n - interval '25 hours' AND n
) i;

Here's what I ended up doing:
WITH max_time_table AS
(
SELECT id, max(event_time) AS max_time
FROM events
WHERE (event_time > GETDATE() - interval '74 hours')
GROUP BY id
),
query_dts AS
(
SELECT (DATEADD(hour,-(row_number() over (ORDER BY TRUE) - 1), getdate()) ) AS n
FROM events LIMIT 48
)
SELECT query_dts.n, COUNT(*)
FROM max_time_table JOIN query_dts
ON max_time_table.max_time BETWEEN (query_dts.n - interval '25 hours') AND (query_dts.n - interval '24 hours')
GROUP BY query_dts.n
ORDER BY query_dts.n DESC
Here, I selected 74 hours because I wanted 48 hours ago + 25 hours ago = 73 hours ago.
The problem is that this isn't a general-purpose way of doing this. It's a highly specific solution for this particular problem. Can someone think of a more general way of running a query dependent on GETDATE() using a column of dates in another table?

Related

How to get the percentage change from same time 7 days ago?

I have a big PostgreSQL database with time series data.
I query the data with a resample to one hour. What I want is to compare the the mean value from the last hour to the value 7 days ago at the same time and I don't know how to do it.
This is what I use to get the latest value.
SELECT DATE_TRUNC('hour', datetime) AS time, AVG(value) as value, id FROM database
GROUP BY id, time
WHERE datetime > now()- '01:00:00'::interval
You can use a CTE to calculate last week's average in the same time period, then join on id and hour.
with last_week as
(
SELECT
id,
extract(hour from datetime) as time,
avg(value) as avg_value
FROM my_table
where DATE_TRUNC('hour', datetime) =
(date_trunc('hour', now() - interval '7 DAYS'))
group by 1,2
)
select n.id,
DATE_TRUNC('hour', n.datetime) AS time_now,
avg(n.value) as avg_now,
t.avg_value as avg_last_week
from my_table n
left join last_week t
on t.id = n.id
and t.time = extract(hour from n.datetime)
where datetime > now()- '01:00:00'::interval
group by 1,2,4
order by 1
I'm making a few assumptions on how your data appear.
**EDIT - JUST NOTICED YOU ASKED FOR PERCENT CHANGE
Showing change as decimal...
select id,
extract(hour from time_now) as hour_now,
avg_now,
avg_last_week,
coalesce(((avg_now - avg_last_week) / avg_last_week), 0) AS CHANGE
from (
with last_week as
(
SELECT
id,
extract(hour from datetime) as time,
avg(value) as avg_value
FROM my_table
where DATE_TRUNC('hour', datetime) =
(date_trunc('hour', now() - interval '7 DAYS'))
group by 1,2
)
select n.id,
DATE_TRUNC('hour', n.datetime) AS time_now,
avg(n.value) as avg_now,
t.avg_value as avg_last_week
from my_table n
left join last_week t
on t.id = n.id
and t.time = extract(hour from n.datetime)
where datetime > now()- '01:00:00'::interval
group by 1,2,4
)z
group by 1,2,3,4
order by 1,2
db-fiddle found here: https://www.db-fiddle.com/f/rWJATypGzHPZ8sG2vXAGXC/4

How can I get the updated record in 30 days till today?

I have a small question about SQL Server: how to get the last 30 days information from this column from table1:
created_at
updated_at
2020-02-05T01:25:42Z
2020-02-05T01:25:42Z
2020-05-05T02:31:56Z
2020-05-05T02:31:56Z
With the above data, I would need something like day count within 30 days.
I have tried
SELECT * FROM table1
DATEDIFF(CAST(SUBSTR(updated_at,1,10)) AS VARCHAR,CURRENT_TIMESTAMP) BETWEEN 0 AND 30 ;
and
SELECT * FROM table1
WHERE updated_at BETWEEN DATETIME('now', '-30 days') AND DATETIME('now', 'localtime')
Would need your expertise to help me with this query
Thank you!
SELECT *
FROM (
SELECT otherColumns
, DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), GETDATE()), updated_at) AS updated_at
FROM table1
) b
WHERE CAST(b.updated_at AS DATE) >= DATEADD(DAY,-30,GETDATE())
I think this will help you
If you want a count of updates by day for 30 (or so) days, then:
SELECT CONVERT(DATE, updated_at) as dte, COUNT(*)
FROM table1
WHERE updated_at >= DATEADD(DAY, -30, CONVERT(DATE, GETDATE()))
GROUP BY CONVERT(DATE, updated_at)
ORDER BY CONVERT(DATE, updated_at);
Note that SQLite date/time functions (which your code uses) are very peculiar to to SQLite. So are SQL Server's -- although I personally find them easier to remember.

how to get date different in postgres using date_part option

How to get date time difference in PostgreSQL
I am using below syntax
select id, A_column,B_column,
(SELECT count(*) AS count_days_no_weekend
FROM generate_series(B_column ::timestamp , A_column ::timestamp, interval '1 day') the_day
WHERE extract('ISODOW' FROM the_day) < 5) * 24 + DATE_PART('hour', B_column::timestamp-A_column ::timestamp ) as hrs
FROM table req where id='123';
If A_column=2020-05-20 00:00:00 and B_column=2020-05-15 00:00:00 I want to get 72(in hours).
Is there any possibility to skip weekends(Saturday and Sunday) in first one, it means to get the result as 72 hours(exclude weekend hours)
i am getting 0
But i need to get 72 hours
And if If A_column=2020-08-15 12:00:00 and B_column=2020-08-15 00:00:00 I want to get 12(in hours).
One option uses a lateral join and generate_series() to enumerate each and every hour between the two timestamps, while filtering out week-ends:
select t.a_column, t.b_column, h.count_hours_no_weekend
from mytable t
cross join lateral (
select count(*) count_hours_no_weekend
from generate_series(t.b_column::timestamp, t.a_column::timestamp, interval '1 hour') s(col)
where extract('isodow' from s.col) < 5
) h
where id = 123
I would attack this by calculating the weekend hours to let the database deal with daylight savings time. I would then subtract the intervening weekend hours from the difference between the two date values.
with weekend_days as (
select *, date_part('isodow', ddate) as dow
from table1
cross join lateral
generate_series(
date_trunc('day', b_column),
date_trunc('day', a_column),
interval '1 day') as gs(ddate)
where date_part('isodow', ddate) in (6, 7)
), weekend_time as (
select id,
sum(
least(ddate + interval '1 day', a_column) -
greatest(ddate, b_column)
) as we_ival
from weekend_days
group by id
)
select t.id,
a_column - b_column as raw_difference,
coalesce(we_ival, interval '0') as adjustment,
a_column - b_column -
coalesce(we_ival, interval '0') as adj_difference
from weekend_time w
left join table1 t on t.id = w.id;
Working fiddle.

Grab abandoned carters from the last hour in Oracle Responsys

I'm trying to grab people out of a table who have an abandon date between 20 minutes ago and 2 hours ago. This seems to grab the right amount of time, but is all 4 hours old:
SELECT *
FROM $A$
WHERE ABANDONDATE >= SYSDATE - INTERVAL '2' HOUR
AND ABANDONDATE < SYSDATE - INTERVAL '20' MINUTE
AND EMAIL_ADDRESS_ NOT IN(SELECT EMAIL_ADDRESS_ FROM $B$ WHERE ORDERDATE >= sysdate - 4)
also, it grabs every record for everyone and I only want the most recent product abandoned (highest abandondate) for each email address. I can't seem to figure this one out.
If the results are EXACTLY four hours old, it is possible that there is a time zone mismatch. What is the EXACT data type of ABANDONDATE in your database? Perhaps TIMESTAMP WITH TIMEZONE? Four hours seems like the difference between UTC and EDT (Eastern U.S. with daylight savings time offset).
For your other question, did you EXPECT your query to only pick up the most recent product abandoned? Which part of your query would do that? Instead, you need to add row_number() over (partition by [whatever identifies clients etc.] order by abandondate), make the resulting query into a subquery and wrap it within an outer query where you filter by (WHERE clause) rn = 1. We can help with this if you show us the table structure (name and data type of columns in the table - only the relevant columns - including which is or are Primary Key).
Try
SELECT * FROM (
SELECT t.*,
row_number()
over (PARTITION BY email_address__ ORDER BY ABANDONDATE DESC) As RN
FROM $A$ t
WHERE ABANDONDATE >= SYSDATE - INTERVAL '2' HOUR
AND ABANDONDATE < SYSDATE - INTERVAL '20' MINUTE
AND EMAIL_ADDRESS_ NOT IN(
SELECT EMAIL_ADDRESS_ FROM $B$
WHERE ORDERDATE >= sysdate - 4)
)
WHERE rn = 1
another approach
SELECT *
FROM $A$
WHERE (EMAIL_ADDRESS_, ABANDONDATE) IN (
SELECT EMAIL_ADDRESS_, MAX( ABANDONDATE )
FROM $A$
WHERE ABANDONDATE >= SYSDATE - INTERVAL '2' HOUR
AND ABANDONDATE < SYSDATE - INTERVAL '20' MINUTE
AND EMAIL_ADDRESS_ NOT IN(
SELECT EMAIL_ADDRESS_ FROM $B$
WHERE ORDERDATE >= sysdate - 4)
GROUP BY EMAIL_ADDRESS_
)

Include count 0 in my group by request

I have a COUNT + GROUP BY request for postgresql.
SELECT date_trunc('day', created_at) AS "Day" ,
count(*) AS "No. of actions"
FROM events
WHERE created_at > now() - interval '2 weeks'
AND "events"."application_id" = 7
AND ("what" LIKE 'ACTION%')
GROUP BY 1
ORDER BY 1
My request counts the number of "ACTION*" per day on my events table (a log table) in 2weeks for my application with the id 7. But the problem is it doesn't show when there is a Day without any actions recorded.
I know it is because of my WHERE clause, so I tried some stuff with JOIN requests, but nothing gave me the good answer.
Thank you for your help
Make a date table:
CREATE TABLE "myDates" (
"DateValue" date NOT NULL
);
INSERT INTO "myDates" ("DateValue")
select to_date('20000101', 'YYYYMMDD') + s.a as dates from generate_series(0,36524,1) as s(a);
Then left join on it:
SELECT d.DateValue AS "Day" ,
count(*) AS "No. of actions"
FROM myDates d left join events e on date_trunc('day', "events"."created_at") = d.DateValue
WHERE created_at > now() - interval '2 weeks' AND
"events"."application_id" = 7 AND
("what" LIKE 'ACTION%')
GROUP BY 1 ORDER BY 1
Ok a friend helped me, here is the answer:
SELECT "myDates"."DateValue" AS "Day" ,
(select count(*) from events WHERE date_trunc('day', "events"."created_at") = "myDates"."DateValue" AND
("events"."application_id" = 4) AND
("events"."what" LIKE 'ACTION%')) AS "No. of actions"
FROM "myDates"
where ("myDates"."DateValue" > now() - interval '2 weeks') AND ("myDates"."DateValue" < now())
So we need to ask all the date from the MyDates table, and ask the count on the second argument.