ESPER: Find Max and Min of 24 hours and check if price goes above the Max of previous 24 hours value - stock

I am unable to solve an Esper problem. I have to calculate the max and min over 24 hours and then check whether the tick price goes above that value (this has to be done for multiple securities). Here is the code I am using, but I am taking a big performance hit and an event is fired more than once.
create context GroupSecurity
    partition by security from Tick;

context GroupSecurity
select currentData.last, max(groupedData.last)
from Tick as currentData unidirectional,
     Tick.win:time_batch(24 hour) as groupedData
having currentData.last > max(groupedData.last);
How can I improve this code?

The "Tick.win:time_batch(24 hour)" tells the engine to retain in memory all 24 hours of Tick events that may arrive, and only spit these out after 24 hours.
I think a better approach would be to have the engine compute, say, 1-minute maximums, keep those 1-minute maximums for 24 hours, and take the max of them, i.e. retain and build a max from no more than 24*60 rows, where each row keeps a 1-minute max.
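The bucketing idea can be illustrated outside Esper with a minimal Python sketch. It is not EPL and not the engine's own mechanism; the class name `BucketedMax`, its parameters, and the assumption that ticks arrive in timestamp order (in seconds) are all made up for illustration. Each tick is compared against the max of the per-minute maximums still inside the 24-hour window, and then folded into its own minute bucket.

```python
from collections import OrderedDict


class BucketedMax:
    """Rolling maximum over a window, stored as one max per fixed-size bucket.

    Hypothetical helper: assumes ticks arrive in non-decreasing timestamp order.
    """

    def __init__(self, window_s=24 * 3600, bucket_s=60):
        self.window_s = window_s
        self.bucket_s = bucket_s
        self.buckets = OrderedDict()  # bucket start time -> max price in that bucket

    def _evict(self, now):
        # Drop buckets that ended at or before the start of the window.
        cutoff = now - self.window_s
        while self.buckets:
            start = next(iter(self.buckets))
            if start + self.bucket_s <= cutoff:
                self.buckets.popitem(last=False)
            else:
                break

    def current_max(self, now):
        # Max over at most window_s / bucket_s rows (1440 for 24 h of 1-minute buckets).
        self._evict(now)
        return max(self.buckets.values(), default=None)

    def add(self, ts, price):
        start = ts - ts % self.bucket_s
        prev = self.buckets.get(start)
        self.buckets[start] = price if prev is None else max(prev, price)

    def on_tick(self, ts, price):
        """Return True if price exceeds the max of the preceding window, then record it."""
        prior = self.current_max(ts)
        breakout = prior is not None and price > prior
        self.add(ts, price)
        return breakout
```

For multiple securities, one such object per security (for example in a dict keyed by the security symbol) would play the role of the partitioned context.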

Related

Postgres SQL count events with minimum and maximum time gap

I have a Postgres database table of 'events' (i.e. rows with a timestamp column). I want to count the number of events that are separated by more than a specified minimum time gap and by less than or equal to a specified maximum time gap.
For example, if there is an event on 6 consecutive days but I specify a minimum time gap of 2 days, I only want to register a count of 1 for those 6 events.
At the same time, if I specify a maximum time gap of 30 days, if two events are 30 days apart I want to register a count of 2 for the pair, but if they are 31 days apart I want to register a count of 0.
The accepted answer to the following post gives a method for counting events and satisfying the 'maximum gap' requirement using the Postgres generate_series function:
Best way to count rows by arbitrary time intervals
Maybe it's possible to modify the suggested solution to also satisfy the 'minimum gap' requirement. Can anyone advise on how I can accomplish this? Thanks.

How to deal with high frequency queries

I have a query that should run every 30 seconds. I am doing so because I want to track whether the cycle time, which is displayed in seconds, decreases or increases over a given time range. If it exceeds the accepted limit, I will get a notification.
In more detail: the aim is to have a cycle time < 60 s. Because we are working in seconds, a one-time increase or decrease of the cycle time would not be very meaningful, so I would take the last ten cycle times and calculate their average. If this is > 60 seconds, I will get notified.
But to make the tracking as accurate as possible, I need the query to run every 30 seconds. Query:
SELECT
CAST([Cycletime (s)] as Float) as [Cycletime (s)]
,Machine
,Date
FROM v_Analysis
WHERE Date >= CAST( GETDATE() AS Date )
My question is how much running the query at a 30-second interval affects the performance of the database and, if it does, how we can improve the performance.
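The averaging rule described in the question (mean of the last ten cycle times compared against a 60-second limit) could be sketched as follows. This is only an illustration of the rule, not an answer to the database-load question; the class name `CycleTimeMonitor` and its parameters are hypothetical.

```python
from collections import deque


class CycleTimeMonitor:
    """Hypothetical helper: flags when the mean of the last `size` cycle times exceeds a limit."""

    def __init__(self, size=10, limit_s=60.0):
        self.values = deque(maxlen=size)  # oldest value drops out automatically
        self.limit_s = limit_s

    def add(self, cycle_time_s):
        """Record one cycle time; return True if the rolling average exceeds the limit."""
        self.values.append(cycle_time_s)
        if len(self.values) < self.values.maxlen:
            return False  # not enough samples yet to judge
        return sum(self.values) / len(self.values) > self.limit_s
```

Because only the newest cycle time is needed on each run, a check like this would let each poll fetch just the rows added since the previous poll rather than the whole day.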

SUM of last 24 hour scores within a specific range in a sorted set (Redis)

Is there a way to calculate the SUM of scores saved within the last 24 hours without hurting the performance of the Redis server? (Around 1 million new rows are added per day.)
What is the right format for storing the timestamp and score of users in sorted sets?
Currently I am using this command:
ZADD allscores 1570658561 20
Here the sorted-set score is the current time in seconds, and the member is the real score.
But there is a problem: when another user gets the same score (20), it is not added as a new entry because that member is already present. Is there any solution for this?
I am thinking of using a Lua script, but there are two headaches:
First, the Lua script will block other commands until it has finished. That is not good practice in my case, since the script has to run 24/7 while many users fetch data (user scores, history info, etc.) from the Redis cache server at the same time. The script also has to process the many records saved each day under one key, so while it is running in a loop, users cannot fetch data.
Second, and related to the first problem, I cannot store the same score twice if I use the timestamp as the sorted-set score so that I can return 24 hours of data.
If you were in my position, how would you deal with this? Thanks
Since the data is needed for the last 24 hours (a sliding window) and there can be 1 million rows, a sorted set cannot compute the sum with high performance.
A high-performance design that also solves your duplicate-score issue:
If you can accept a small loss of accuracy, you can get a highly performant system by aggregating the data into fixed time windows.
Sample Input data:
input 1: user 1 wants to add time: 11:10:01 score: 20
input 2: user 2 wants to add time: 11:11:02 score: 20
input 3: user 1 wants to add time: 11:17:04 score: 50
You can have 1 minute, 5 minutes or 1 hour accuracy and decide window based on that.
If you accept an approximation of 1 hour, you can do this on insertion:
for input 1 :
INCRBY SCORES_11_hour 20
for input 2:
INCRBY SCORES_11_hour 20
for input 3:
INCRBY SCORES_11_hour 50
To get the data for last 24 hours, you need to sum up only 24 hourly keys.
MGET SCORES_previous_day_12_hour SCORES_previous_day_13_hour SCORES_previous_day_14_hour .... SCORES_current_day_10_hour SCORES_current_day_11_hour
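A minimal Python sketch of the hourly-bucket scheme follows. To stay self-contained it uses a plain dict as a stand-in for Redis; in a real deployment the two marked operations would be the INCRBY and MGET commands shown above. The key layout with an embedded date and the function names are assumptions made for illustration, not something the commands above prescribe.

```python
from datetime import datetime, timedelta

store = {}  # stand-in for Redis: key -> integer sum


def hour_key(ts):
    # Hypothetical key layout: date plus hour, so keys from different days never collide.
    return ts.strftime("SCORES_%Y%m%d_%H_hour")


def add_score(ts, score):
    # Equivalent of: INCRBY <hour key> <score>
    key = hour_key(ts)
    store[key] = store.get(key, 0) + score


def sum_last_24_hours(now):
    # Equivalent of one MGET over the 24 most recent hourly keys, summed client-side.
    keys = [hour_key(now - timedelta(hours=h)) for h in range(24)]
    return sum(store.get(k, 0) for k in keys)
```

Two users adding the same score no longer conflict, because each insert only increments a counter. With real Redis keys, an expiry slightly longer than 24 hours on each hourly key would keep old buckets from accumulating.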
If you accept an approximation of 5 minutes, then on insertion, along with incrementing the hourly keys, you also need to store the 5-minute window data:
for input 1 :
INCRBY SCORES_11_hour 20
INCRBY SCORES_11_hour_10_minutes 20
for input 2:
INCRBY SCORES_11_hour 20
INCRBY SCORES_11_hour_10_minutes 20
for input 3:
INCRBY SCORES_11_hour 50
INCRBY SCORES_11_hour_15_minutes 50
To get the data for the last 24 hours, you need to sum up only 23 hourly keys (whole-hour data) + 12 five-minute window keys.
If the time added is always the current time, you can optimize further (assuming that, when it is the 11th hour, the data for the 10th, 9th and earlier hours won't change at all).
Since you said this will run 24/7, we can also reuse computed values from previous iterations.
Say the sum is computed in the 11th hour; you would have fetched the values for the past 24 hours.
If it is computed again in the 12th hour, you can reuse the sum for the 22 intermediate hours whose data is unchanged and fetch only the 2 missing hours from Redis.
Further optimisations along these lines can be applied based on your needs.

.Net ; Read stable weight from RS232

I have a small application that can read weigh scale weights continuously.
I want users to only capture when the weight stabilizes for about 3 seconds.
How can I achieve that?
You need to store the received values with their timestamps in a queue and then calculate the min, max and average over the last three seconds.
First create a class to hold the value and the timestamp, for example called measure.
Then create another class with a queue of measure. Implement functions for adding a measure to the class's internal queue and for calculating the min, max and average over a time span. A final function can then use min, max and average to say whether the last measure is close enough to the average within that time span.
Instead of a queue you may use a data table and then use SQL commands to get the scalars for min, max and average.
If the values are delivered at a constant interval, you can drop the time-span parts and calculate over the last x values only. For example, if the scale delivers a new value every 0.5 seconds, you will have 6 values for the last three seconds.
A FIFO will store the values (use an array with a custom add function, or a queue). To know whether the last values are stable, you need to know the min, max and average over the last measures. That lets you decide whether the last value is near the average or whether the difference to min and max is too large.
Example measures:
3 4 8 2 5 4 gives min=2, max=8, avg=4.3. The last value is near the avg but far from the max.
5 4 6 4 5 5 gives min=4, max=6, avg=4.8. The last value is near min, max and avg. That seems to be a good last measure.
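The queue-based check described above could look roughly like this. The sketch is in Python rather than .NET, and the class name `StabilityDetector`, the 0.5 tolerance and the use of seconds as the timestamp unit are assumptions for illustration; the same structure carries over to a C# `Queue<T>`.

```python
from collections import deque


class StabilityDetector:
    """Hypothetical helper: reports when readings have stayed within a tolerance for window_s."""

    def __init__(self, window_s=3.0, tolerance=0.5):
        self.window_s = window_s
        self.tolerance = tolerance
        self.samples = deque()  # (timestamp in seconds, weight)

    def add(self, ts, weight):
        """Record a reading; return True once a full window of readings is stable."""
        self.samples.append((ts, weight))
        # Keep exactly one reading at or before the window start,
        # so we can tell that the whole window is covered by data.
        cutoff = ts - self.window_s
        while len(self.samples) >= 2 and self.samples[1][0] <= cutoff:
            self.samples.popleft()
        if self.samples[0][0] > cutoff:
            return False  # less than window_s of data so far
        weights = [w for _, w in self.samples]
        avg = sum(weights) / len(weights)
        # Stable if the spread is small and the last reading is close to the average.
        return (max(weights) - min(weights) <= self.tolerance
                and abs(weight - avg) <= self.tolerance)
```

A caller would feed every value read from the serial port into `add` and enable the capture button only while it returns True.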

Excel complicated formula for a user form

Some of my colleagues attend conferences, meetings and workshops in various cities.
An example of a made up itinerary is shown below
The itinerary includes stopovers and each location is marked by a trip number. (A:15 to A:22)
I am working on a user form which would give me the time spent in hours and minutes from the departure to the arrival time for each trip number. Note that some trips include a stopover which is why there are three trip number entries for number 1 (trip to Frankfurt via Paris)
I know that the overall time spent for all these trips is 185 hours and 45 minutes as stated in L:23.
In red, along row 23, there are five formulas as follows:
C:23 shows 24/06/2016 which is =C17
D:23 shows 19:15 which is =D17
H:23 shows 16/01/2016 which is =LOOKUP(2,1/(H17:H22<>""),H17:H22) it picks up the last date inserted between H17:H22
J:23 shows 13:00 which is =LOOKUP(2,1/(J17:J22<>""),J17:J22) it picks up the last time value inserted between J17:J22
L:23 shows 185:45 hours and minutes. It is the difference between the departure date and time of the first and the arrival date and time of the last trip. (Overall time in hours and minutes) =MAX(0,(H23+J23)-(C23+D23))
I need a way to work out the total time of 185:45 broken down between the various business trip numbers in C:26 to C:29. Note that trips will always be shown in a logical order, i.e. 1, 2, 3, but the number of legs per trip will vary depending on stopovers. The minimum number of trips is 1 and the maximum is 4.
Thanking you in advance
Abe
Try this:
=TEXT(MAX(IF($A$17:$A$22=A26,$I$17:$I$22+$J$17:$J$22))-MIN(IF($A$17:$A$22=A26,$C$17:$C$22+$D$17:$D$22)),"[hh]:mm")
It is an array formula and must be confirmed with Ctrl-Shift-Enter. Put in C26, hit Ctrl-Shift-Enter then copy down.
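The logic of the array formula (latest arrival minus earliest departure among the rows whose trip number matches) can be checked outside Excel with a small Python sketch. The function names and the tuple layout are made up for illustration.

```python
from datetime import datetime


def trip_durations(legs):
    """legs: iterable of (trip_number, departure, arrival) with datetime values."""
    spans = {}
    for trip, dep, arr in legs:
        first, last = spans.get(trip, (dep, arr))
        # MIN(IF(trip matches, departure)) and MAX(IF(trip matches, arrival))
        spans[trip] = (min(first, dep), max(last, arr))
    return {trip: last - first for trip, (first, last) in spans.items()}


def as_hhmm(delta):
    # Same presentation as Excel's "[hh]:mm": hours may exceed 24.
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60:02d}:{minutes % 60:02d}"
```

For example, a trip with two legs departing 24/06/2016 19:15 and finally arriving 25/06/2016 00:10 yields `04:55`, regardless of how many stopover rows lie in between.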
Edit: As per OP's comments, what was wanted was the total of time from beginning of leg to the beginning of the next leg. So the formula was changed to:
=IF(MIN(IF($A$17:$A$22=A26+1,$C$17:$C$22+$D$17:$D$22))=0,MAX(IF($A$17:$A$22=A26,$I$17:$I$22+$J$17:$J$22)),MIN(IF($A$17:$A$22=A26+1,$C$17:$C$22+$D$17:$D$22)))-MIN(IF($A$17:$A$22=A26,$C$17:$C$22+$D$17:$D$22))
This is still an array formula. It needs to be confirmed by hitting Ctrl-Shift-Enter. Then copied down.