How to retrieve day from single column based on timestamp - postgresql-9.5

I am inserting data into the database daily. I need to retrieve a single column's data based on the timestamp, and display today's usage and yesterday's usage side by side.
My actual table
name usage timestamp
prod 200 2019-01-08 09:22:53.364366
Test 100 2019-01-08 09:22:53.364366
qality 50 2019-01-08 09:22:53.364366
prod 270 2019-01-09 08:22:53.364366
Test 300 2019-01-09 08:22:53.364366
qality 90 2019-01-09 08:22:53.364366
Expected output:
name usage(yesterday) usage(Today) timestamp
prod 200 270 2019-01-09 08:22:53.364366
Test 100 300 2019-01-09 08:22:53.364366
qality 50 90 2019-01-09 08:22:53.364366

Try aggregating by the name, and then check the timestamp:
SELECT
name,
MAX(CASE WHEN timestamp::date = NOW()::date - INTERVAL '1 DAY'
THEN usage END) AS u_yesterday,
MAX(CASE WHEN timestamp::date = NOW()::date
THEN usage END) AS u_today,
MAX(timestamp) AS timestamp
FROM yourTable
GROUP BY
name;
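To see what the conditional aggregation computes, here is a minimal Python sketch of the same logic run against the sample rows from the question. The function names (`pivot_usage`, `max_ignoring_none`) are made up for illustration, and "today" is passed in as 2019-01-09 instead of being read from the clock, so the result is reproducible.

```python
from datetime import date, datetime, timedelta

# Sample rows from the question: (name, usage, timestamp).
rows = [
    ("prod", 200, datetime(2019, 1, 8, 9, 22, 53, 364366)),
    ("Test", 100, datetime(2019, 1, 8, 9, 22, 53, 364366)),
    ("qality", 50, datetime(2019, 1, 8, 9, 22, 53, 364366)),
    ("prod", 270, datetime(2019, 1, 9, 8, 22, 53, 364366)),
    ("Test", 300, datetime(2019, 1, 9, 8, 22, 53, 364366)),
    ("qality", 90, datetime(2019, 1, 9, 8, 22, 53, 364366)),
]


def max_ignoring_none(current, candidate):
    # SQL MAX() skips NULLs; None plays the role of NULL here.
    return candidate if current is None else max(current, candidate)


def pivot_usage(rows, today):
    """Group by name and pick usage per day, like MAX(CASE WHEN ... END)."""
    yesterday = today - timedelta(days=1)
    result = {}
    for name, usage, ts in rows:
        entry = result.setdefault(
            name, {"u_yesterday": None, "u_today": None, "timestamp": ts}
        )
        if ts.date() == yesterday:
            entry["u_yesterday"] = max_ignoring_none(entry["u_yesterday"], usage)
        elif ts.date() == today:
            entry["u_today"] = max_ignoring_none(entry["u_today"], usage)
        entry["timestamp"] = max(entry["timestamp"], ts)  # MAX(timestamp)
    return result


report = pivot_usage(rows, today=date(2019, 1, 9))
for name, entry in report.items():
    print(name, entry["u_yesterday"], entry["u_today"], entry["timestamp"])
```

A name with no row for one of the two days keeps None in that column, which corresponds to the NULL the SQL query would return.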

Related

Sql query to select the records for past 60 seconds and compare the temperature of the selected records and if any record has higher value then ignore

I am trying to eliminate anomalies in the data I receive from Event Hub, so that only selected data is sent on to Azure Functions through Azure Stream Analytics. I am writing a SQL query for that and need some help.
Requirement: I need to collect the past 60 seconds of data, group it by Id, and compare the records received in that window. If any record's value is far from the others, that record should be ignored. Example 1: the data is 40 40 40 40 5, so we should drop the 5. Example 2: the data is 20 20 20 500, so we should drop the 500.
My sql table will be something like this:
id Temp date datetime
123 30 2023-01-01 2023-01-01 12:00:00
124 35 2023-01-01 2023-01-01 12:00:00
123 31 2023-01-01 2023-01-01 12:00:00
123 33 2023-01-01 2023-01-01 12:00:00
123 60 2023-01-01 2023-01-01 12:00:00
124 36 2023-01-01 2023-01-01 12:00:00
124 36 2023-01-01 2023-01-01 12:00:00
124 8 2023-01-01 2023-01-01 12:00:00
124 36 2023-01-01 2023-01-01 12:00:00
I need to eliminate the records that are out of range compared with the other records.
I'll leave the details of the comparison up to you, but you can use a CROSS APPLY to gather the data for comparison.
Something like:
SELECT *
FROM TemperatureData T
CROSS APPLY (
SELECT AVG(T2.Temp * 1.0) AS PriorAvgTemp, COUNT(*) As PriorCount
FROM TemperatureData T2
WHERE T2.id = T.id
AND T2.datetime >= DATEADD(second, -60, T.datetime)
AND T2.datetime < T.datetime
) P
WHERE T.Temp BETWEEN P.PriorAvgTemp - 10 AND P.PriorAvgTemp + 10
--OR P.PriorCount < 3 -- Should we allow if there is insufficient prior data
--AND P.PriorCount >= 3 -- Should we omit if there is insufficient prior data
Be sure you have an index on TemperatureData(id, datetime).
If you are willing to accept the last N rows instead of a time range, a windowed aggregate calculation may be more efficient:
SELECT *
FROM (
SELECT *,
AVG(T.Temp * 1.0)
OVER(PARTITION BY id ORDER BY datetime
ROWS BETWEEN 60 PRECEDING AND 1 PRECEDING)
AS PriorAvgTemp,
COUNT(*)
OVER(PARTITION BY id ORDER BY datetime
ROWS BETWEEN 60 PRECEDING AND 1 PRECEDING)
AS PriorCount
FROM TemperatureData T
) TT
WHERE TT.Temp BETWEEN TT.PriorAvgTemp - 10 AND TT.PriorAvgTemp + 10
--OR TT.PriorCount < 3 -- Should we allow if there is insufficient prior data
--AND TT.PriorCount >= 3 -- Should we omit if there is insufficient prior data
Please note: The above is untested code, which may need some syntax fixes and debugging. If you discover errors, please comment and I will correct the post.
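As a rough illustration of the time-range comparison, the following Python sketch applies the same rule: each reading is compared with the average of the same id's readings from the preceding 60 seconds. The function name, the readings, and their timestamps are hypothetical (the sample table in the question has identical timestamps, so it has no prior window to compare against). The tolerance of 10 matches the placeholder used in the queries above.

```python
from datetime import datetime, timedelta


def filter_outliers(readings, window=timedelta(seconds=60), tolerance=10):
    """Keep readings within `tolerance` of the prior-window average.

    readings: list of (id, temp, timestamp) tuples.
    A reading with no prior rows is dropped, mirroring the SQL where
    BETWEEN against a NULL average is not true.
    """
    kept = []
    for rid, temp, ts in readings:
        prior = [
            t for i, t, s in readings
            if i == rid and ts - window <= s < ts
        ]
        if not prior:
            continue
        avg = sum(prior) / len(prior)
        if avg - tolerance <= temp <= avg + tolerance:
            kept.append((rid, temp, ts))
    return kept


# Hypothetical readings for one device, ten seconds apart.
base = datetime(2023, 1, 1, 12, 0, 0)
readings = [
    (123, 30, base),
    (123, 31, base + timedelta(seconds=10)),
    (123, 33, base + timedelta(seconds=20)),
    (123, 60, base + timedelta(seconds=30)),
    (123, 32, base + timedelta(seconds=40)),
]

kept_temps = [temp for _, temp, _ in filter_outliers(readings)]
print(kept_temps)
```

Note that the rejected reading (60) still contributes to the prior average of later rows, exactly as it would in the queries above. Whether that is acceptable depends on how aggressive the anomaly filter needs to be.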

Extract 30 minutes from timestamp and group it by 30 mins time interval -PGSQL

In PostgreSQL I am extracting hour from the timestamp using below query.
select count(*) as logged_users, EXTRACT(hour from login_time::timestamp) as Hour
from loginhistory
where login_time::date = '2021-04-21'
group by Hour order by Hour;
And the output is as follows
logged_users | hour
--------------+------
27 | 7
82 | 8
229 | 9
1620 | 10
1264 | 11
1990 | 12
1027 | 13
1273 | 14
1794 | 15
1733 | 16
878 | 17
126 | 18
21 | 19
5 | 20
3 | 21
1 | 22
I want the same output from the same SQL, but for 30-minute intervals. Please suggest how.
SELECT to_timestamp((extract(epoch FROM login_time::timestamp)::bigint / 1800) * 1800)::timestamp AS interval_30_min
, count(*) AS logged_users
FROM loginhistory
WHERE login_time::date = '2021-04-21' -- inefficient!
GROUP BY 1
ORDER BY 1;
Extracting the epoch gets the number of seconds since the epoch. Integer division truncates. Multiplying back effectively rounds down, achieving the same as date_trunc() for arbitrary time intervals.
1800 because 30 minutes contain 1800 seconds.
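The truncation arithmetic can be checked outside the database. Here is a small Python sketch of the same idea, assuming naive timestamps that are treated as UTC. The function name and the login times are invented for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 1800  # 30 minutes


def floor_to_bucket(ts, bucket_seconds=BUCKET_SECONDS):
    """Round a naive timestamp (treated as UTC) down to its bucket start."""
    epoch = int(ts.replace(tzinfo=timezone.utc).timestamp())
    floored = epoch - epoch % bucket_seconds  # same as (epoch // n) * n
    return datetime.fromtimestamp(floored, tz=timezone.utc).replace(tzinfo=None)


# Hypothetical login times on the day used in the question.
logins = [
    datetime(2021, 4, 21, 7, 5, 0),
    datetime(2021, 4, 21, 7, 29, 59),
    datetime(2021, 4, 21, 7, 30, 0),
    datetime(2021, 4, 21, 8, 45, 12),
]

counts = Counter(floor_to_bucket(ts) for ts in logins)
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

Changing `BUCKET_SECONDS` to 900 or 3600 gives 15-minute or hourly buckets with no other change, which is the point of going through the epoch.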
Detailed explanation:
Truncate timestamp to arbitrary intervals
The cast to timestamp makes me wonder about the actual data type of login_time? If it's timestamptz, the cast depends on your current time zone setting and sets you up for surprises if that setting changes. See:
How do I match an entire day to a datetime field?
Subtract hours from the now() function
Ignoring time zones altogether in Rails and PostgreSQL
Depending on the actual data type, and exact definition of your date boundaries, there is a more efficient way to phrase your WHERE clause.
You can change the column on which you're aggregating to use the minute too:
select
count(*) as logged_users,
CONCAT(EXTRACT(hour from login_time::timestamp), '-', CASE WHEN EXTRACT(minute from login_time::timestamp) < 30 THEN 0 ELSE 30 END) as HalfHour
from loginhistory
where login_time::date = '2021-04-21'
group by HalfHour
order by HalfHour;

How to build in product expiration in SQL?

I have a table that looks like the following and from it I want to get days remaining of total doses:
USER|PURCHASE_DATE|DOSES
1111|2017-07-27|15
2222|2020-07-17|3
3333|2021-02-01|5
If the doses do not have an expiration and each can be used for 90 days then the SQL I use is:
SUM(DOSES)*90-DATEDIFF(DAY,MIN(DATE),GETDATE())
USER|DAYS_REMAINING
1111|0
2222|6
3333|385
But what if I want to impose an expiration of each dose at a year? What can I do to modify my SQL to get the following desired answer:
USER|DAYS_REMAINING
1111|-985
2222|6
3333|300
It probably involves taking the MIN between when doses expire and how long they would last but I don't know how to aggregate in the expiry logic.
MIN is an aggregate function; you want LEAST to pick the smaller of the two values:
WITH data(user,purchase_date, doses) AS (
SELECT * FROM VALUES
(1111,'2017-07-27',15),
(2222,'2020-07-17',3),
(3333,'2021-02-01',5)
)
SELECT
d.*,
d.doses * 90 AS doses_duration,
365::number AS year_duration,
least(doses_duration, year_duration) as max_duration,
DATEADD('day', max_duration, d.purchase_date)::date as last_dose_day,
DATEDIFF('day', current_date, last_dose_day) as day_remaining
FROM data AS d
ORDER BY 1;
gives:
USER PURCHASE_DATE DOSES DOSES_DURATION YEAR_DURATION MAX_DURATION LAST_DOSE_DAY DAY_REMAINING
1111 2017-07-27 15 1350 365 365 2018-07-27 -986
2222 2020-07-17 3 270 365 270 2021-04-13 5
3333 2021-02-01 5 450 365 365 2022-02-01 299
which can all be rolled together with a tiny fix on the date_diff, as:
WITH data(user,purchase_date, doses) AS (
SELECT * FROM VALUES
(1111,'2017-07-27',15),
(2222,'2020-07-17',3),
(3333,'2021-02-01',5)
)
SELECT
d.user,
DATEDIFF('day', current_date, DATEADD('day', least(d.doses * 90, 365::number), d.purchase_date)::date)+1 as day_remaining
FROM data AS d
ORDER BY 1;
giving:
USER DAY_REMAINING
1111 -985
2222 6
3333 300
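The arithmetic can be verified by hand or with a short Python sketch. Python's built-in `min()` plays the role of LEAST here. The function name is made up, and the value of "today" (2021-04-08) is an assumption inferred from the output shown above, since the query itself uses current_date.

```python
from datetime import date, timedelta


def days_remaining(purchase_date, doses, today,
                   days_per_dose=90, expiry_days=365):
    """Days left until the doses run out or expire, whichever is sooner."""
    duration = min(doses * days_per_dose, expiry_days)  # LEAST(...)
    last_dose_day = purchase_date + timedelta(days=duration)
    return (last_dose_day - today).days + 1  # +1 as in the rolled-up query


# Assumed run date, inferred from the DAY_REMAINING values above.
today = date(2021, 4, 8)

purchases = [
    (1111, date(2017, 7, 27), 15),
    (2222, date(2020, 7, 17), 3),
    (3333, date(2021, 2, 1), 5),
]

remaining = {
    user: days_remaining(purchased, doses, today)
    for user, purchased, doses in purchases
}
print(remaining)
```

For user 2222 the doses last 270 days, which is shorter than the one-year expiry, so the expiry has no effect; for the other two users the 365-day cap wins.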

Find discarded records postgresql

I have this query
select count(id)filter(where id>2 and id<=50) from table;
I want to find records that are eliminated by this filter
Yes! I can do this to find those records
select count(id)filter(where id<=2 or id>50) from table;
But suppose I have a complex query; in the query above I used id in place of my formula, as an example.
I have a formula that calculates three different times based on different values. If I want to filter each time on some condition, I can use filter.
These are my filters:
> start_time<= 40 mins and start_time> 5 mins
> end_time<= 10 mins and end_time> 1 mins
> journey_time<= 80 mins and journey_time> 10 mins
> Total_time(start_time+end_time+journey_time) <= 150 and Total_time(start_time+end_time+journey_time) > 15
If I want to filter, I have to write my formula 8 times (to filter on > and <= for each time and for the total time). This will be my query:
select
avg(start_time_formula)filter(where start_time_formula<= 40 and
start_time_formula>5),
avg(end_time_formula)filter(where end_time_formula<= 10 and
end_time_formula>1),
avg(journey_time_formula)filter(where journey_time_formula<= 80 and
journey_time_formula>10)
from table
where (start_time_formula+end_time_formula+journey_time_formula <=150 and
start_time_formula+end_time_formula+journey_time_formula > 15)
Now I also want to find all the discarded values.
Do I have to write the same formula 8 more times, replacing > with <= and "AND" with "OR", so that it gives me the discarded results, or is there another way to find the discarded values?
Update
My table values are
id start_time end_time journey_time Out_time
1 2018-04-06 01:37:36 2018-04-06 10:37:36 2018-04-06 04:37:36 2018-04-06 11:37:36
2 2018-04-16 02:37:36 2018-04-16 08:37:36 2018-04-16 06:37:36 2018-04-16 07:37:36
3 2018-05-10 01:37:36 2018-04-10 11:37:36 2018-04-06 09:37:36 2018-04-10 10:11:36
4 2018-05-10 04:37:36 2018-05-10 5:00:36 2018-05-10 04:47:36 2018-05-10 05:5:36
My Calculations are
start_time = journey_time - start_time
journey_time = end_time - journey_time
end_time = Out_time - end_time
This is my desired Output
start_time journey_time end_time discarded
10 mins 13 mins 5 mins 3
thanks
Use a conditional count with CASE WHEN, so your query would look like this:
select
sum(case when start_time_formula <= 40 and start_time_formula>5 then 1 else 0 end),
sum(case when end_time_formula<= 10 and end_time_formula>1 then 1 else 0 end) ,
sum(case when journey_time_formula <= 80 and journey_time_formula > 10 then 1 else 0 end)
from table
where (start_time_formula+end_time_formula+journey_time_formula <=150 and
start_time_formula+end_time_formula+journey_time_formula > 15)
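On the question of rewriting every condition by hand: the discarded set is the logical negation of the kept set, so the range test only has to be written once and then negated. A minimal Python sketch of that idea follows, with invented values in minutes and hypothetical function names. In PostgreSQL the corresponding form would be FILTER (WHERE NOT (...)), with the caveat that rows where the formula is NULL match neither the condition nor its negation.

```python
def in_range(value, low, high):
    """The range test from the question: low < value <= high."""
    return low < value <= high


def split_kept_discarded(values, low, high):
    """Write the condition once; the discarded rows are its negation."""
    kept = [v for v in values if in_range(v, low, high)]
    discarded = [v for v in values if not in_range(v, low, high)]
    return kept, discarded


# Hypothetical start_time values in minutes; filter is > 5 and <= 40.
start_times = [3, 5, 10, 40, 41]
kept, discarded = split_kept_discarded(start_times, low=5, high=40)

print("kept:", kept)
print("discarded:", discarded, "count:", len(discarded))
```

Negating the whole test is equivalent to flipping each comparison and swapping AND for OR (De Morgan's law), which is the manual rewrite the question describes.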

sql sliding window - finding max value over interval

i have a sliding window problem. specifically, i do not know where my window should start and where it should end. i do know the size of my interval/window.
i need to find the start/end of the window that delivers the best (or worst, depending on how you look at it) case scenario.
here is an example dataset:
value | tstamp
100 | 2013-02-20 00:01:00
200 | 2013-02-20 00:02:00
300 | 2013-02-20 00:03:00
400 | 2013-02-20 00:04:00
500 | 2013-02-20 00:05:00
600 | 2013-02-20 00:06:00
500 | 2013-02-20 00:07:00
400 | 2013-02-20 00:08:00
300 | 2013-02-20 00:09:00
200 | 2013-02-20 00:10:00
100 | 2013-02-20 00:11:00
let's say i know that my interval needs to be 5 minutes. so, i need to know the value and timestamps included in the 5 minute interval where the sum of 'value' is the highest. in my above example, the rows from '2013-02-20 00:04:00' to '2013-02-20 00:08:00' would give me a sum of 400+500+600+500+400 = 2400, which is the highest value over 5 minutes in that table.
i'm not opposed to using multiple tables if needed. but i'm trying to find a "best case scenario" interval. results can go either way, as long as they net the interval. if i get all data points over that interval, it still works. if i get the start and end points, i can use those as well.
i've found several sliding window problems for SQL, but haven't found any where the window size is the known factor, and the starting point is unknown.
SELECT *,
(
SELECT SUM(value)
FROM mytable mi
WHERE mi.tstamp BETWEEN m.tstamp - '5 minute'::INTERVAL AND m.tstamp
) AS maxvalue
FROM mytable m
ORDER BY
maxvalue DESC
LIMIT 1
In PostgreSQL 11 and above:
SELECT SUM(value) OVER (ORDER BY tstamp RANGE '5 minute' PRECEDING) AS maxvalue,
*
FROM mytable m
ORDER BY
maxvalue DESC
LIMIT 1
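One detail to check against your data: both queries include the row exactly five minutes back, so with one row per minute each window covers six rows, not five. The Python sketch below runs the sliding sum on the sample data with a half-open window (more than 5 minutes back is excluded, exactly 5 minutes back is excluded too), which reproduces the 2400 total from the question. The function name is made up for illustration.

```python
from datetime import datetime, timedelta

# Sample data from the question: one value per minute from 00:01 to 00:11.
values = [100, 200, 300, 400, 500, 600, 500, 400, 300, 200, 100]
rows = [(v, datetime(2013, 2, 20, 0, i + 1)) for i, v in enumerate(values)]


def best_window(rows, width=timedelta(minutes=5)):
    """Return (total, first_ts, last_ts) of the window with the highest sum.

    Each window ends at a row's timestamp and covers (end - width, end],
    so with per-minute data it holds exactly five rows.
    """
    best = None
    for _, end in rows:
        in_window = [(v, ts) for v, ts in rows if end - width < ts <= end]
        total = sum(v for v, _ in in_window)
        if best is None or total > best[0]:
            first_ts = min(ts for _, ts in in_window)
            best = (total, first_ts, end)
    return best


total, first_ts, last_ts = best_window(rows)
print(total, first_ts, last_ts)
```

This brute-force version is quadratic in the number of rows, which is fine for checking results on a small sample but not a substitute for the window-function query on a large table.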