How to alert on an event that normally happens once a day? (Librato)

I have a batch job that runs once per day.
At the end of the job I submit a meter metric with a count of the items processed.
I want to alert if one day this metric is not updated.
On http://metrics.librato.com, the maximum value I can set for "not reported for" when creating an alert is 60 minutes.
I thought maybe I could create a composite metric, take the average rate of change over the past 24 hours, and alert if that reaches zero.
I've been trying:
derive(s("my.metric", "%", {function:"sum", period:"86400"}))
However, it seems that, because I log only a single event per day, the rate of change simply drops to zero for period values above roughly 250 seconds... I guess the low reporting frequency means my single value is lost entirely in the sampling.
Maybe I am using the wrong tool for the job...
Is there a way to achieve this in Librato?

There is currently no way to achieve this, because composite metrics are subject to the same 60-minute limitation as alerts (as of 2015-05-15). You may need to look into configuring the metric (or a similar metric) to report within the 60-minute window, if possible.

Related

Laravel where clause based on conditions from value in database

I am building an event reminder page where people can set a reminder for certain events. There is an option for the user to set the amount of time before they need to be notified. It is stored in notification_time and notification_unit: notification_time holds the amount of time before they want to be notified, and notification_unit holds the PHP date-format character they selected, e.g. i for minutes, H for hours.
E.g. notification_time = 2 and notification_unit = H means they need to be notified 2 hours before the event.
I have cron jobs running in the background to handle the notifications. This function is hit once every minute.
Reminder::where(function ($query) {
    $query->where('event_time', '>=', now()->subMinutes(Carbon::createFromFormat('i', 60)->diffInMinutes() - 1)->format('H:i:s'));
    $query->where('event_time', '<=', now()->subMinutes(Carbon::createFromFormat('i', 60)->diffInMinutes())->format('H:i:s'));
})
In this function I am hard-coding the 'i', 60, while those values should be fetched from the notification_unit and notification_time columns. event_time is also part of the same table.
The table looks something like this -
id, event_time, ..., notification_unit, notification_time, created_at, updated_at
Is there any way to solve this issue? Is it possible to do the same logic with SQL instead?
A direct answer to this question is not possible, but I found two ways to resolve my issue.
First solution
MySQL has DATEDIFF/TIMESTAMPDIFF and DATE_SUB to compute the difference between timestamps and to subtract an interval from a timestamp. In my case the function runs every minute, so to use them I would have to refactor my database to store the notification offset in seconds, then do the calculation in the query. I chose not to use this approach because both operations are a bit heavy on the server side when run every minute.
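A minimal sketch of that first approach, assuming event_time is a full DATETIME and the offset is stored in a hypothetical notification_offset_seconds column:
-- Runs once per minute; selects reminders whose notification moment
-- falls within the last minute.
SELECT id
FROM reminders
WHERE DATE_SUB(event_time, INTERVAL notification_offset_seconds SECOND)
      BETWEEN DATE_SUB(NOW(), INTERVAL 60 SECOND) AND NOW();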
Second Solution
This is the solution I personally went with. Here I do the calculation while storing the reminder in the database. What does that mean? Let me explain. I created a new table notification_settings which is linked to the reminder (one-to-one relation). The table looks like this
id, unit, time, notify_at, repeating, created_at, updated_at
The unit and time columns are only used when displaying the reminder. I pre-calculate the moment the user should be notified and store it in the notify_at column, so the event scheduler only has to check for reminders whose notify_at is now (since it runs every minute). The repeating column keeps track of whether the reminder repeats; if it does, I recalculate notify_at at scheduling time. Once the user has been notified, notify_at is set to null.
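With that design, the per-minute scheduler query becomes trivial. A sketch, assuming notify_at is a DATETIME and the one-to-one link is a hypothetical reminder_id foreign key:
-- Runs every minute: fetch reminders due in the current minute.
-- notify_at is NULL once the user has been notified.
SELECT r.id
FROM reminders r
JOIN notification_settings ns ON ns.reminder_id = r.id
WHERE ns.notify_at IS NOT NULL
  AND ns.notify_at BETWEEN DATE_SUB(NOW(), INTERVAL 60 SECOND) AND NOW();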

Get the most recent time series message from Time Series Insights

Is there a TSI endpoint that provides the most recent message of a time series that arrived from, e.g., an IoT hub? Currently I have to poll over a certain window (for example, from now to 30 seconds in the past), and I wonder if there is a better way to do this.
Unfortunately there is no endpoint that lets you query only the last event. You can use GetEvents and query over the last 30 seconds, like you said, or peek the last message from the Event Hub/IoT Hub.

Dynamically filtering large query result for presentation in SSRS

We have a system that records data captured from field equipment every minute to a SQL Server DB. This data is used for a number of purposes, one of which is charting in reports via SSRS.
The issue is that, with such a high volume of data, a report run for a period of, for example, 3 months returns so much data that report rendering times become excessive.
I've been thinking of dynamically reducing the amount of data returned based on the start and end times chosen: a sliding scale where, depending on the duration between start and end, different levels of filtering apply, so that larger periods get more filtering and smaller periods get little or none.
There is still a need to be able to produce higher resolution (as in more data points returned) reports for troubleshooting purposes.
For example:
Scenario 1:
User is executing a report for a period of 3 months. Result set returned by the query is reduced for performance reasons without adversely affecting what information the user wants to see (the chart is still representative of the changes over time).
Scenario 2:
User executes the report for a period of 1 hour, in order to look for potential indicator(s) of problems with field devices while troubleshooting the system. For this short time period, no filtering is applied.
My first thought was to use a modulo operation on the primary key of the data (which is an identity field), whereby the divisor is chosen depending on the difference between the start and end dates.
For example, if the difference between the start and end dates of the report execution period is 5 weeks, choose a divisor of 5, apply a modulo to the PK, and select the rows where the result is zero.
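For illustration, a minimal T-SQL sketch of that idea, assuming a hypothetical readings table with an identity PK id, a timestamp column ts, and a value column; @StartDate and @EndDate are the SSRS report parameters:
-- Widen the divisor as the report window grows, so longer windows
-- return proportionally fewer rows; short windows are not filtered.
DECLARE @Divisor int =
    CASE
        WHEN DATEDIFF(DAY, @StartDate, @EndDate) <= 1  THEN 1
        WHEN DATEDIFF(DAY, @StartDate, @EndDate) <= 7  THEN 5
        WHEN DATEDIFF(DAY, @StartDate, @EndDate) <= 31 THEN 15
        ELSE 60
    END;

SELECT id, ts, value
FROM readings
WHERE ts BETWEEN @StartDate AND @EndDate
  AND id % @Divisor = 0;
One caveat: PK-modulo sampling keeps arbitrary rows; bucketing by a time interval and averaging each bucket may chart trends more faithfully, at the cost of a heavier query.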
I would love to get feedback as to whether this sounds like a valid approach or whether there is a better way to do this.
Thanks.

track sales for week/month and find the best sellers

Let's say I have a website that sells widgets. I would like to do something similar to a tag cloud tracking best sellers. However, because I am constantly acquiring and selling new widgets, I would like the sales to decay on a weekly time scale.
I'm having problems puzzling out how to store and manipulate this data, and have it decay properly over time, so that something that was an ultra-hot item 2 months ago but has since tapered off doesn't outrank the current best sellers. What would the logic and database design for this look like?
Part 1: You have to have tables storing the data you want to report on. Date/time sold is obviously key. If you need to work in decay factors, that raises the question: for how long is the data good and/or relevant? At what point in time has the "value" of the data decayed so much that you no longer care about it? When this point is reached for any given entry in the database, what do you do: keep it there but ensure it gets factored out of all subsequent computations, or archive it, i.e. copy it to a "history" table and delete it from your main "sales" table? This is relevant, as it has to be factored into your decay formula (as well as your capacity planning, annual reporting requirements, and who knows what else).
Part 2: How much thought has been given to the decay formula that you want to use? There's no end of detail you can work into this. Options and factors to wade through include but are not limited to:
Simple age-based. Everything before the cutoff date counts as 1; everything after counts as 0. Sum and you're done.
What's the cutoff date? Precisely 14 days ago, to the minute? Midnight as of two Saturdays ago from now?
Does the cutoff date depend on the item that was sold? If some items are hot but some are not, does that affect things? What if you want to emphasize some things (the expensive/hard to sell ones) over others (the fluff you'd sell anyway)?
Simple age-based decays are trivial, but can be insufficient. Time to go nuclear.
Perhaps you want some kind of half-life, Dr. Freeman?
Everything sold is "worth" X, where the value of X is either always the same or varies on the item sold. And the value of X can decay over time.
Perhaps the value of X decreases by one-half every week. Or every day. Or every month. Or (again) it may vary depending on the item.
If you do half-lives, the value of X may never reach zero, and you're stuck tracking it forever (which is why I wrote "part 1" first). At some point you probably need some kind of cut-off, a point after which you just don't care. X has decreased to one-tenth of the initial value? Three months have passed? Either/or, but the "range" may depend on the inherent value of the item.
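For concreteness, a minimal sketch of a half-life score in MySQL, assuming a hypothetical sales table with item_id and sold_at columns and a one-week half-life:
-- Each sale is worth 0.5^(age in weeks): a sale from exactly one week
-- ago counts half as much as one from today. The 3-month cut-off drops
-- rows whose weight (under 0.02%) no longer matters.
SELECT item_id,
       SUM(POW(0.5, TIMESTAMPDIFF(HOUR, sold_at, NOW()) / 168.0)) AS score
FROM sales
WHERE sold_at >= NOW() - INTERVAL 3 MONTH
GROUP BY item_id
ORDER BY score DESC
LIMIT 20;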
My real point here is that how you calculate your decay rate is far more important than how you store the data in the database. So long as the data the formula needs for its calculations is there, you should be good. And if you only need the last month's data for this, you should perhaps move everything older to some kind of archive table.
You could just count the sales for the last month/week/whatever, and sort your items according to that.
If you want, you can always add the total amount of items sold into your formula.
You might have a table that contains the definitions of the scoring criteria (most sales, most this, most that, etc.), then, for a given period, store in another table the points attributed for each criterion defined in the criteria table. A historical table can then store the score of each seller for a given period or promotion; call it whatever you want.
Does it help a little?

SQL query to calculate visit duration from log table

I have a MySQL table LOGIN_LOG with fields ID, PLAYER, TIMESTAMP and ACTION. ACTION can be either 'login' or 'logout'. Only around 20% of the logins have an accompanying logout row. For those that do, I want to calculate the average duration.
I'm thinking of something like
select avg(LL2.TIMESTAMP - LL1.TIMESTAMP)
from LOGIN_LOG LL1
inner join LOGIN_LOG LL2
    on LL1.PLAYER = LL2.PLAYER
    and LL2.TIMESTAMP > LL1.TIMESTAMP
left join LOGIN_LOG LL3
    on LL3.PLAYER = LL1.PLAYER
    and LL3.TIMESTAMP between LL1.TIMESTAMP + 1 and LL2.TIMESTAMP - 1
    and LL3.ACTION = 'login'
where LL1.ACTION = 'login'
    and LL2.ACTION = 'logout'
    and LL3.ID is null
Is this the best way to do it, or is there a more efficient one?
Given the data you have, there probably isn't anything much faster you can do because you have to look at a LOGIN and a LOGOUT record, and ensure there is no other LOGIN (or LOGOUT?) record for the same user between the two.
Alternatively, find a way to ensure that a disconnect records a logout, so that the data is complete (instead of 20% complete). However, the query probably still has to ensure that the criteria are all met, so it won't help the query all that much.
If you can get the data into a format where the LOGIN and corresponding LOGOUT times are both in the same record, then you can simplify the query immensely. I'm not clear if the SessionManager does that for you.
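For illustration, if each visit were one row in a hypothetical SESSIONS table with LOGIN_TIME and LOGOUT_TIME columns, the whole calculation would collapse to:
-- One row per session makes the average trivial.
SELECT AVG(TIMESTAMPDIFF(SECOND, LOGIN_TIME, LOGOUT_TIME)) AS avg_duration_seconds
FROM SESSIONS
WHERE LOGOUT_TIME IS NOT NULL;  -- only sessions that recorded a logout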
Do you have a SessionManager type object that can timeout sessions? Because a timeout could be logged there, and you could get the last activity time from that and the timeout period.
Or you log all activity on the website/service, and thus you can query website/service visit duration directly, and see what activities they performed. For a website, Apache log analysers can probably generate the required stats.
I agree with JeeBee, but another advantage to a SessionManager type object is that you can handle the sessionEnd event and write a logout row with the active time in it. This way you would likely go from 20% accompanying logout rows to 100% accompanying logout rows. Querying for the activity time would then be trivial and consistent for all sessions.
If only 20% of your users actually log out, this query will not give you a very accurate duration for each session. A better way to gauge the average session length would be to take the average time between actions, or the average time per page. This can then be multiplied by the average number of pages/actions per visit to give a more accurate figure.
Additionally, you can determine the average time for each page, and then estimate session end time as the session time up to that point plus the average time spent on the user's last page. This gives you a much more fine-grained (and accurate) measure of time spent per session.
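A sketch of the average-time-between-actions idea, assuming MySQL 8+ window functions and a hypothetical PAGE_VIEWS table with PLAYER and VIEWED_AT columns:
-- Gap between consecutive page views per player; averaging the gaps,
-- capped at 30 minutes to exclude gaps that span separate visits,
-- approximates the time spent per page.
SELECT AVG(gap_seconds) AS avg_seconds_per_page
FROM (
    SELECT TIMESTAMPDIFF(
               SECOND,
               VIEWED_AT,
               LEAD(VIEWED_AT) OVER (PARTITION BY PLAYER ORDER BY VIEWED_AT)
           ) AS gap_seconds
    FROM PAGE_VIEWS
) gaps
WHERE gap_seconds IS NOT NULL
  AND gap_seconds <= 1800;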
Regarding the given SQL: it seems more complicated than you really need. This sort of statistical operation is often better handled, and more maintainable, in code outside the database, where you have the full power of whichever language you choose rather than SQL's rather convoluted facilities for statistical calculations.