PostgreSQL DISTINCT Statement - sql

How can I get the distinct per-minute values of a timestamp column?
For example, if the table contains 100 records within one minute, I want the count of records for each minute.
For example,
SELECT DISTINCT(timestamp) FROM customers WHERE DATE(timestamp) = CURRENT_DATE
The result should look like this:
timestamp record
30-12-2019 11:30 5
30-12-2019 11:31 8

One option is to cast the timestamp column with ::date in the WHERE clause and use GROUP BY:
SELECT timestamp, count(*)
FROM tab
WHERE timestamp::date = current_date
GROUP BY timestamp
timestamp::date can be replaced with date(timestamp), as in your query.
Update: If the table contains data with precision up to microseconds, then
SELECT to_char(timestamp,'YYYY-MM-DD HH24:MI'), count(*)
FROM tab
WHERE date(timestamp) = current_date
GROUP BY to_char(timestamp,'YYYY-MM-DD HH24:MI')
might be considered.

Try something like the following:
SELECT DATE_TRUNC('minute', timestamp) as timestamp, COUNT(*) as record
FROM customers
WHERE DATE(timestamp) = CURRENT_DATE
GROUP BY DATE_TRUNC('minute', timestamp)
ORDER BY DATE_TRUNC('minute', timestamp)
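To make the per-minute grouping concrete, here is a minimal Python sketch of the same idea: truncate each timestamp to the minute, then count rows per bucket. The sample values and the function name count_per_minute are made up for illustration and are not taken from the question's table.

```python
from collections import Counter
from datetime import datetime


def count_per_minute(timestamps):
    """Count records per minute by zeroing seconds and microseconds."""
    buckets = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    # Sort by the truncated timestamp, like the ORDER BY in the query.
    return sorted(buckets.items())


# Hypothetical sample rows (assumption, not real data).
sample = [
    datetime(2019, 12, 30, 11, 30, 5),
    datetime(2019, 12, 30, 11, 30, 59, 123456),
    datetime(2019, 12, 30, 11, 31, 0),
]

for minute, record in count_per_minute(sample):
    print(minute.strftime("%d-%m-%Y %H:%M"), record)
```

Grouping on the raw timestamp would put the first two sample rows in separate groups, which is why the truncation step matters.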

Related

Oracle SQL Select Rows where column with date is at most one hour before now ( sysdate )

I have this table
invoice_number
creation_date
1
2023-02-06T10:38:37.000+00:00
2
2023-02-06T10:49:34.000+00:00
I want to fetch only the rows where creation_date is at most 1 hour ago. How can I do that?
Since you are comparing to a TIMESTAMP WITH TIME ZONE column, you can use:
SELECT *
FROM table_name
WHERE creation_date >= SYSTIMESTAMP - INTERVAL '1' HOUR;
Check this one;
... where creation_Date>sysdate - interval '1' hour
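The cutoff logic can be sketched outside the database as well. The following Python example assumes timezone-aware values, mirroring the TIMESTAMP WITH TIME ZONE column, and takes the current time as a parameter so the result is reproducible. The function name and the chosen "now" are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def created_within_last_hour(rows, now):
    """Keep rows whose creation_date is at most one hour before `now`."""
    cutoff = now - timedelta(hours=1)
    return [row for row in rows if row["creation_date"] >= cutoff]


# Rows from the question; the value of `now` is an assumption.
rows = [
    {"invoice_number": 1,
     "creation_date": datetime(2023, 2, 6, 10, 38, 37, tzinfo=timezone.utc)},
    {"invoice_number": 2,
     "creation_date": datetime(2023, 2, 6, 10, 49, 34, tzinfo=timezone.utc)},
]
now = datetime(2023, 2, 6, 11, 40, 0, tzinfo=timezone.utc)

recent = created_within_last_hour(rows, now)
print([row["invoice_number"] for row in recent])
```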

Window function for average

I have this table timestamp_table and I'm using Presto SQL
timestamp | id
2021-01-01 10:00:00 | 2456
I would like to compute the number of unique IDs in the last 24 and 48 hours. I thought this could be achieved with window functions, but I'm struggling. This is my proposed solution, but it needs work:
SELECT COUNT(id) OVER (PARTITION BY timestamp ORDER BY timestamp RANGE BETWEEN INTERVAL '24' HOUR PRECEDING AND CURRENT ROW)
You're probably having trouble due to the PARTITION BY clause, since the COUNT will only apply to rows within the same timestamp values.
Try something like this, as a starting point:
SELECT *
, COUNT(id) OVER (ORDER BY timestamp RANGE BETWEEN INTERVAL '24' HOUR PRECEDING AND CURRENT ROW)
, MIN(id) OVER (ORDER BY timestamp RANGE BETWEEN INTERVAL '24' HOUR PRECEDING AND CURRENT ROW)
FROM tbl
;
I don't think you can get the data for both time intervals in one table scan, because a row from the last 24 hours must be counted in both groups: 24 hours and 48 hours. So you need two queries, or a UNION of them:
select 'h24', count(distinct id)
from timestamp_table
where timestamp < current_timestamp and timestamp >= date_add(day, -1, current_timestamp)
union all
select 'h48', count(distinct id)
from timestamp_table
where timestamp < current_timestamp and timestamp >= date_add(day, -2, current_timestamp)
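As a side note, the 24-hour window is a subset of the 48-hour window, so both distinct counts can also be collected in a single pass by testing each row against both cutoffs. In SQL this would roughly correspond to conditional aggregation such as count(DISTINCT CASE WHEN ... THEN id END), which I have not tested against this table. The Python sketch below only illustrates the logic; the sample rows and the value of `now` are assumptions.

```python
from datetime import datetime, timedelta


def distinct_ids_by_window(rows, now):
    """Count distinct ids seen in the last 24 and 48 hours in one pass."""
    last_24h, last_48h = set(), set()
    for ts, id_ in rows:
        if ts >= now:
            continue  # same upper bound as `timestamp < current_timestamp`
        if ts >= now - timedelta(days=2):
            last_48h.add(id_)
            if ts >= now - timedelta(days=1):
                last_24h.add(id_)
    return {"h24": len(last_24h), "h48": len(last_48h)}


# Hypothetical data; only the first row appears in the question.
now = datetime(2021, 1, 2, 12, 0, 0)
rows = [
    (datetime(2021, 1, 1, 10, 0, 0), 2456),   # 26 hours ago
    (datetime(2021, 1, 1, 8, 0, 0), 7777),    # 28 hours ago
    (datetime(2021, 1, 2, 9, 0, 0), 2456),    # 3 hours ago
    (datetime(2021, 1, 2, 10, 0, 0), 1111),   # 2 hours ago
    (datetime(2020, 12, 30, 10, 0, 0), 9999), # older than 48 hours
]

print(distinct_ids_by_window(rows, now))
```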

How to get the percentage change from same time 7 days ago?

I have a big PostgreSQL database with time series data.
I query the data resampled to one hour. What I want is to compare the mean value from the last hour to the value 7 days ago at the same time, and I don't know how to do it.
This is what I use to get the latest value.
SELECT DATE_TRUNC('hour', datetime) AS time, AVG(value) as value, id FROM database
WHERE datetime > now()- '01:00:00'::interval
GROUP BY id, time
You can use a CTE to calculate last week's average in the same time period, then join on id and hour.
with last_week as
(
SELECT
id,
extract(hour from datetime) as time,
avg(value) as avg_value
FROM my_table
where DATE_TRUNC('hour', datetime) =
(date_trunc('hour', now() - interval '7 DAYS'))
group by 1,2
)
select n.id,
DATE_TRUNC('hour', n.datetime) AS time_now,
avg(n.value) as avg_now,
t.avg_value as avg_last_week
from my_table n
left join last_week t
on t.id = n.id
and t.time = extract(hour from n.datetime)
where datetime > now()- '01:00:00'::interval
group by 1,2,4
order by 1
I'm making a few assumptions about how your data looks.
Edit: I just noticed you asked for percent change.
This shows the change as a decimal:
select id,
extract(hour from time_now) as hour_now,
avg_now,
avg_last_week,
coalesce(((avg_now - avg_last_week) / avg_last_week), 0) AS CHANGE
from (
with last_week as
(
SELECT
id,
extract(hour from datetime) as time,
avg(value) as avg_value
FROM my_table
where DATE_TRUNC('hour', datetime) =
(date_trunc('hour', now() - interval '7 DAYS'))
group by 1,2
)
select n.id,
DATE_TRUNC('hour', n.datetime) AS time_now,
avg(n.value) as avg_now,
t.avg_value as avg_last_week
from my_table n
left join last_week t
on t.id = n.id
and t.time = extract(hour from n.datetime)
where datetime > now()- '01:00:00'::interval
group by 1,2,4
)z
group by 1,2,3,4
order by 1,2
db-fiddle found here: https://www.db-fiddle.com/f/rWJATypGzHPZ8sG2vXAGXC/4
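The change calculation itself is small enough to check by hand. Here is a minimal Python sketch of it, expressed as a percentage rather than a decimal. Unlike the coalesce(..., 0) in the query, this version returns None when last week's average is missing or zero, so "no baseline" is not confused with "no change"; that choice and the function name are my own assumptions.

```python
def percent_change(avg_now, avg_last_week):
    """Percent change from last week's average to the current average."""
    if avg_now is None or avg_last_week is None or avg_last_week == 0:
        return None  # no usable baseline, so no meaningful percentage
    return (avg_now - avg_last_week) * 100.0 / avg_last_week


print(percent_change(12.0, 8.0))
print(percent_change(6.0, 8.0))
print(percent_change(5.0, None))
```

In the SQL version, a similar guard against a zero divisor could be written with NULLIF(avg_last_week, 0).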

BigQuery Where Date is Less Than or Equal to 3 Days Minus Current Date

I'm trying to create a query to only return data where date is minus 3 days from the current date. I've tried:
date <= DATE_ADD(CURRENT_DATE(), -3, 'DAY')
But this returns Error: Expected INTERVAL expression
See the WHERE clause in the example below:
#standardSQL
WITH yourTable AS (
SELECT i, date
FROM UNNEST(GENERATE_DATE_ARRAY('2017-04-15', '2017-04-28')) AS date WITH OFFSET AS i
)
SELECT *
FROM yourTable
WHERE date <= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
-- ORDER BY date
By the way, in case you are still using Legacy SQL, see the example below:
#legacySQL
SELECT *
FROM -- yourTable
(SELECT 1 AS id, DATE('2017-04-20') AS date),
(SELECT 2 AS id, DATE('2017-04-21') AS date),
(SELECT 3 AS id, DATE('2017-04-22') AS date),
(SELECT 4 AS id, DATE('2017-04-23') AS date),
(SELECT 5 AS id, DATE('2017-04-24') AS date),
(SELECT 6 AS id, DATE('2017-04-25') AS date)
WHERE TIMESTAMP(date) <= DATE_ADD(TIMESTAMP(CURRENT_DATE()), -3, 'DAY')
-- ORDER BY date
This works with a string-formatted date:
DATE(TIMESTAMP(date)) <= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)
I just tested this and it seems to work.
I added this:
and DATE(TIMESTAMP(datevalue)) >= DATE_SUB(CURRENT_DATE(), INTERVAL 21 DAY)
and managed to get all records from the last 21 days. The only thing I changed from @ericbrownaustin's code was the 'date' in the second set of parentheses in the first piece of code.
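For readers who want to sanity-check the cutoff arithmetic, the same comparison can be sketched in Python: parse the string-formatted date, subtract three days from the current date, and keep rows on or before that cutoff. The sample rows reuse the dates from the Legacy SQL example; the value passed as `today` is an assumption so that the output is reproducible.

```python
from datetime import date, timedelta


def on_or_before_cutoff(rows, today, days=3):
    """Keep rows whose date is at least `days` days before `today`."""
    cutoff = today - timedelta(days=days)
    return [row for row in rows if date.fromisoformat(row["date"]) <= cutoff]


rows = [{"id": i, "date": f"2017-04-{19 + i}"} for i in range(1, 7)]
today = date(2017, 4, 25)  # hypothetical "current date"

print([row["date"] for row in on_or_before_cutoff(rows, today)])
```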

Grab abandoned carters from the last hour in Oracle Responsys

I'm trying to grab people out of a table who have an abandon date between 20 minutes ago and 2 hours ago. This seems to grab the right length of time, but the results are all 4 hours old:
SELECT *
FROM $A$
WHERE ABANDONDATE >= SYSDATE - INTERVAL '2' HOUR
AND ABANDONDATE < SYSDATE - INTERVAL '20' MINUTE
AND EMAIL_ADDRESS_ NOT IN(SELECT EMAIL_ADDRESS_ FROM $B$ WHERE ORDERDATE >= sysdate - 4)
Also, it grabs every record for everyone and I only want the most recent product abandoned (highest abandondate) for each email address. I can't seem to figure this one out.
If the results are EXACTLY four hours old, it is possible that there is a time zone mismatch. What is the EXACT data type of ABANDONDATE in your database? Perhaps TIMESTAMP WITH TIME ZONE? Four hours seems like the difference between UTC and EDT (Eastern U.S. with the daylight saving time offset).
For your other question, did you EXPECT your query to only pick up the most recent product abandoned? Which part of your query would do that? Instead, you need to add row_number() over (partition by [whatever identifies clients etc.] order by abandondate), make the resulting query into a subquery and wrap it within an outer query where you filter by (WHERE clause) rn = 1. We can help with this if you show us the table structure (name and data type of columns in the table - only the relevant columns - including which is or are Primary Key).
Try
SELECT * FROM (
SELECT t.*,
row_number()
over (PARTITION BY EMAIL_ADDRESS_ ORDER BY ABANDONDATE DESC) As RN
FROM $A$ t
WHERE ABANDONDATE >= SYSDATE - INTERVAL '2' HOUR
AND ABANDONDATE < SYSDATE - INTERVAL '20' MINUTE
AND EMAIL_ADDRESS_ NOT IN(
SELECT EMAIL_ADDRESS_ FROM $B$
WHERE ORDERDATE >= sysdate - 4)
)
WHERE rn = 1
Another approach:
SELECT *
FROM $A$
WHERE (EMAIL_ADDRESS_, ABANDONDATE) IN (
SELECT EMAIL_ADDRESS_, MAX( ABANDONDATE )
FROM $A$
WHERE ABANDONDATE >= SYSDATE - INTERVAL '2' HOUR
AND ABANDONDATE < SYSDATE - INTERVAL '20' MINUTE
AND EMAIL_ADDRESS_ NOT IN(
SELECT EMAIL_ADDRESS_ FROM $B$
WHERE ORDERDATE >= sysdate - 4)
GROUP BY EMAIL_ADDRESS_
)
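Both queries boil down to the same two steps: filter rows to the window between 2 hours and 20 minutes ago, then keep the row with the highest abandon date per email address. Here is a minimal Python sketch of that logic. It leaves out the exclusion of recent orders from $B$, and the field names, sample rows, and value of `now` are all hypothetical.

```python
from datetime import datetime, timedelta


def latest_abandon_per_email(rows, now):
    """Return the most recent in-window abandon row for each email."""
    lower = now - timedelta(hours=2)
    upper = now - timedelta(minutes=20)
    latest = {}
    for row in rows:
        if not (lower <= row["abandondate"] < upper):
            continue  # outside the 2-hours-to-20-minutes window
        current = latest.get(row["email"])
        if current is None or row["abandondate"] > current["abandondate"]:
            latest[row["email"]] = row
    return latest


now = datetime(2024, 1, 1, 12, 0, 0)  # assumed current time
rows = [
    {"email": "a@example.com", "product": "A", "abandondate": datetime(2024, 1, 1, 10, 30)},
    {"email": "a@example.com", "product": "B", "abandondate": datetime(2024, 1, 1, 11, 15)},
    {"email": "a@example.com", "product": "C", "abandondate": datetime(2024, 1, 1, 11, 50)},
    {"email": "b@example.com", "product": "D", "abandondate": datetime(2024, 1, 1, 9, 0)},
]

result = latest_abandon_per_email(rows, now)
print({email: row["product"] for email, row in result.items()})
```

If two rows for the same email share the same abandon date, this sketch keeps the first one it sees, whereas the IN (... MAX(ABANDONDATE) ...) query would return both.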