I have a Postgres database table of 'events' (i.e. rows with a timestamp column). I want to count the number of events that are separated by more than a specified minimum time gap and by less than or equal to a specified maximum time gap.
For example, if there is an event on each of 6 consecutive days but I specify a minimum time gap of 2 days, I only want to register a count of 1 for those 6 events.
Similarly, if I specify a maximum time gap of 30 days and two events are exactly 30 days apart, I want to register a count of 2 for the pair, but if they are 31 days apart I want to register a count of 0.
The accepted answer to the following post gives a method for counting events and satisfying the 'maximum gap' requirement using the Postgres generate_series function:
Best way to count rows by arbitrary time intervals
Maybe it's possible to modify the suggested solution to also satisfy the 'minimum gap' requirement. Can anyone advise on how I can accomplish this? Thanks.
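To make the requirement concrete, here is a rough, untested sketch of the counting logic I have in mind, written with window functions rather than generate_series (assuming a table events(ts timestamptz) and the example gaps of 2 and 30 days):

WITH gaps AS (
    SELECT ts,
           ts - LAG(ts)  OVER (ORDER BY ts) AS gap_prev,
           LEAD(ts) OVER (ORDER BY ts) - ts AS gap_next
    FROM events
)
SELECT count(*)
FROM gaps
-- min gap: only the first event of a tight cluster is counted
WHERE (gap_prev IS NULL OR gap_prev > interval '2 days')
-- max gap: an event only counts if it has a neighbour within 30 days
-- (a NULL comparison makes the whole OR NULL, so lone events are excluded by WHERE)
  AND (gap_prev <= interval '30 days' OR gap_next <= interval '30 days');

Note this measures gaps between raw events rather than between the collapsed clusters, so it may need refinement when one cluster closely follows another; I include it only to pin down the intended semantics.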
Related
I need to write a SQL query that returns the highest order count in any given hourly range. The problem is that my table just logs orders as they come in and has no identifier that separates one hour from the next.
So basically, I need to find the highest number of orders in any given hour between 7/08/2022 and 7/15/2022, from a table that does not distinguish distinct hour sets and simply logs orders as they arrive.
I have tried a query combining MAX(), COUNT(), and DATETIME(), but to no avail.
Can I please receive some help?
I've had to tackle this kind of measurement in the past. Here's what I did for 15-minute intervals:
My datetime column is named datreg in my database log area.
cast(round(floor(cast(datreg as float(53))*24*4)/(24*4),5) as smalldatetime)
I multiply by 4 in this formula to get 4 intervals inside my 24-hour period. For you it would look like this to get just hourly intervals:
cast(round(floor(cast(datreg as float(53))*24)/(24),5) as smalldatetime)
This is a little piece of magic when it comes to dashboards and reports.
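Putting it together for the question's date range, a sketch (untested, SQL Server syntax; the orders table and datreg column names are assumptions) that returns the busiest hour:

SELECT TOP 1
    cast(round(floor(cast(datreg as float(53))*24)/(24),5) as smalldatetime) AS hour_start,
    COUNT(*) AS orders_in_hour
FROM orders
WHERE datreg >= '20220708' AND datreg < '20220716'
GROUP BY cast(round(floor(cast(datreg as float(53))*24)/(24),5) as smalldatetime)
ORDER BY orders_in_hour DESC;

Grouping on the truncated value buckets every order into its hour, and ORDER BY ... DESC with TOP 1 picks the busiest bucket.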
I have some experience with pandas, but I cannot figure out the following:
I have several weeks of timestamped data with multiple records within one day.
I want to add a column in which, for each day, the maximum value of the remaining records of that day is displayed.
So if 5 records remain in a particular day, I need the max of the next 5 records; after that, the max of the next 4 records, and so on.
I have tried to use groupby but this does not seem to do the trick.
Can somebody help me out?
This is not the fastest, but you can try this:
dt['mvalue'] = dt.sort_values('datetime', ascending=False).groupby('date')['value'].cummax()
It simply does a rolling max on the reverse-sorted series; pandas aligns the result back to dt by index, so each row gets the max of its own value and all later values on the same date. (Note: the old DataFrame.sort method has been replaced by sort_values in current pandas.)
I have a small application that continuously reads weights from a weigh scale.
I want users to only be able to capture a reading once the weight has been stable for about 3 seconds.
How can I achieve that?
You need to store the received values with their timestamps in a queue and then calculate the min, max, and average over the last three seconds.
First create a class to hold a value and its timestamp, for example called Measure.
Then create another class with a queue of Measure. Implement functions for adding a Measure to the class's internal queue and for calculating the min, max, and average over a timespan. A final function can then use min, max, and average to decide whether the last Measure is close enough to the average within that timespan.
Instead of a queue you may use a data table and then use SQL commands to get the scalars for min, max, and average; see the sketch after the example measures below.
If the values are delivered at a constant interval, you can avoid the timespan logic and simply calculate over the last x values. For example, if the scale delivers a new value every 0.5 seconds, you will have 6 values for the last three seconds.
A FIFO will store the values (use an array with a custom add function, or a queue). To know whether the last values are stable, you need the min, max, and average over the recent measures. That lets you decide whether the last value is near the average, or whether the difference to the min and max is too large.
E.g., measures:
3 4 8 2 5 4 gives min=2, max=8, avg=4.3. The last value is near the average but far from the max, so this is not yet stable.
5 4 6 4 5 5 gives min=4, max=6, avg=4.8. The last value is near the min, max, and average. That seems to be a good last measure.
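For the data-table variant mentioned above, a minimal SQL sketch (Postgres syntax; the readings table, weight column, and the 0.05 tolerance are all assumptions to adapt):

SELECT min(weight) AS min_w,
       max(weight) AS max_w,
       avg(weight) AS avg_w,
       max(weight) - min(weight) <= 0.05 AS stable  -- stable if the spread over the window is tiny
FROM readings
WHERE ts >= now() - interval '3 seconds';

Enable the capture button only when stable comes back true.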
I am unable to solve an Esper problem. I have to calculate the max and min over 24 hours and then check if the tick price goes above that value (this has to be done on multiple securities). Here is the code I am using, but I am taking a big performance hit and events are being fired more than once.
create context GroupSecurity partition by security from Tick;

context GroupSecurity
select currentData.last, max(groupedData.last)
from Tick as currentData unidirectional, Tick.win:time_batch(24 hour) as groupedData
having currentData.last > max(groupedData.last);
How can I improve this code?
The "Tick.win:time_batch(24 hour)" tells the engine to retain in memory all 24 hours of Tick events that may arrive, and only spit these out after 24 hours.
I think a better approach would be to have the engine compute say 1-minute maximums and take the 1-minute maximums for 24 hours and take the max of that, i.e. retain and build a max from no more then 24*60 rows where each row keeps a 1-minute max.
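One possible shape for that in EPL (an untested sketch; the MinuteMax stream name is an assumption, and the second statement could equally be expressed with the question's GroupSecurity context instead of the where clause):

insert into MinuteMax
select security, max(last) as maxLast
from Tick.win:time_batch(1 min)
group by security;

select t.security, t.last
from Tick as t unidirectional, MinuteMax.win:time(24 hour) as m
where m.security = t.security
having t.last > max(m.maxLast);

This keeps at most one row per security per minute in the 24-hour window instead of every tick.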
Suppose I have a table that holds all the billing records. Now I want to see the sales trend over a user-given time duration, grouped into 3-day periods. What should the SQL query for this be?
Please help, otherwise I am gone ...
I can only give a vague suggestion given the question, but you may want a derived column with a standardised date (as per the MS date format, just a number per day) on which you can use a modulus of 3, so that days fall evenly into 3-day periods. You can then group and aggregate over this column to get the values for each 3-day period. Obviously, to display the date nicely you would have to multiply back and convert the column as well.
Again I'm not sure of the specifics, but I think this general idea could be made to work (it may well not be the best way, so it would help to add more detail to the question); a sketch follows.
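A sketch of that idea in Postgres syntax (the billing table, the billing_date and amount columns, the anchor date, and the date range are all assumptions; billing_date is assumed to be a date column):

SELECT DATE '2022-01-01' + ((billing_date - DATE '2022-01-01') / 3) * 3 AS period_start,
       sum(amount) AS sales
FROM billing
WHERE billing_date >= DATE '2022-07-01'
  AND billing_date <  DATE '2022-08-01'
GROUP BY period_start
ORDER BY period_start;

Here billing_date - DATE '2022-01-01' is a plain day count, so the integer division by 3 is the modulus trick described above: every day lands in a fixed 3-day bucket, and multiplying back and adding to the anchor gives the bucket's start date for display.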