Grafana x-axis to show data with 10 seconds granularity instead of 1 sec - data-visualization

It's probably something easy, but I'm new to Grafana, so bear with me.
I have data collected every 10 seconds that I would like to display in Grafana.
select time, value from "metrics_value" where instance='Processor' and type='counter' and type_instance='message' and time> now() - 1m;
name: metrics_value
---------------
time value
2016-10-13T09:24:33Z 23583
2016-10-13T09:24:43Z 23583
2016-10-13T09:24:53Z 23583
2016-10-13T09:25:03Z 23583
2016-10-13T09:25:13Z 23583
But it's shown with the intermediate points filled in with interpolated values.
How could I set the interval of the x axis in Grafana to show only points every 10 seconds?
I know I could use the summarize aggregation function as described here: How to change the x axis in Graphite/Grafana (to graph by day)?
But I don't think I can use that here.

Works properly with:
select sum("value") from "metrics_value" where instance='Processor' and type='counter' and type_instance='message' and time> now() - 1m GROUP BY time(10s) fill(null);
Edit: I also changed the "sum" aggregation to "mean" so that Grafana calculates the mean of the values when zoomed out. (Otherwise it summed the values.)
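If you want the bucket width to track the panel's zoom level automatically, Grafana's templated interval variable can replace the hard-coded 10s. A sketch using Grafana's $__interval and $timeFilter macros, with the measurement and tag names copied from the query above:

```sql
SELECT mean("value")
FROM "metrics_value"
WHERE instance = 'Processor'
  AND type = 'counter'
  AND type_instance = 'message'
  AND $timeFilter
GROUP BY time($__interval) fill(null)
```

Setting the panel's Min interval to 10s then keeps Grafana from requesting buckets smaller than the collection rate.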

Related

How to set a max range condition with timescale time_bucket_gapfill() in order to not fill real missing values?

I'd like some advice to know whether what I need to do is achievable with Timescale functions.
I've just found out I can use time_bucket_gapfill() to complete missing data, which is amazing! I need data every 5 minutes, but I can receive 10-minute, 30-minute or 1-hour data, so the function helps me complete the missing points in order to have only 5-minute points. I also use locf() to fill each gap with the last value found.
My question is: can I set a max range when I set the last value found with locf(), so that it never reaches back more than 1 hour?
Example: if the last value found is older than 1 hour, I don't want to fill the gaps; I need to leave them empty to show that we have real missing values there.
I think I'm close to something with the query below, but apparently I'm not allowed to use locf() twice for the same column.
ERROR: multiple interpolate/locf function calls per resultset column not supported
Does somebody have an idea how I can resolve this?
How to reproduce:
Create table powers
CREATE table powers (
delivery_point_id BIGINT NOT NULL,
at timestamp NOT NULL,
value BIGINT NOT NULL
);
Create hypertable
SELECT create_hypertable('powers', 'at');
Create indexes
CREATE UNIQUE INDEX idx_dpid_at ON powers(delivery_point_id, at);
CREATE INDEX index_at ON powers(at);
Insert data for one day, one delivery point, one point every 10 minutes
INSERT INTO powers SELECT 1, at, round(random()*10000) FROM generate_series(TIMESTAMP '2021-01-01 00:00:00', TIMESTAMP '2021-01-02 00:00:00', INTERVAL '10 minutes') AS at;
Remove three hours of data from 4am to 7am
DELETE FROM powers WHERE delivery_point_id = 1 AND at < '2021-01-01 07:00:00' AND at > '2021-01-01 04:00:00';
The query that needs to be fixed
SELECT
    time_bucket_gapfill('5 minutes', at) AS point_five,
    avg(value) AS avg,
    CASE
        WHEN (locf(at) - at) > interval '1 hour' THEN null
        ELSE locf(avg(value))
    END AS gapfilled
FROM powers
GROUP BY point_five, at
ORDER BY point_five;
Actual: ERROR: multiple interpolate/locf function calls per resultset column not supported
Expected: Gapfilled values each 5 minutes except between 4am and 7 am (real missing values).
This is a great question! I'm going to provide a workaround for how to do this with the current functionality, but I think it'd be great if you'd open a GitHub issue as well, because there might be a way to add an option for this that doesn't require a workaround.
I also think your attempt was a good approach and just requires a few tweaks to get it right!
The error you're seeing means we can't have multiple locf calls for a single column. That limitation is pretty easy to work around by shifting both calls into a subquery, but that's not enough. The other thing we need to change is that locf only works on aggregates; right now you're trying to use it on a column (at) that isn't aggregated, which isn't going to work, because it wouldn't know which of the values of at in a time_bucket to "pull forward" for the gapfill.
Now, you said you want to fill data as long as the previous point wasn't more than one hour ago. We can take the last value of at in the bucket using last(at, at) (this is also max(at), so either aggregate would work), put that into a CTE (common table expression, or WITH query), and then do the CASE statement outside, like so:
WITH filled AS (
    SELECT
        time_bucket_gapfill('5 minutes', at) AS point_five,
        avg(value) AS avg,
        locf(last(at, at)) AS filled_from,
        locf(avg(value)) AS filled_avg
    FROM powers
    WHERE at BETWEEN '2021-01-01 01:30:00' AND '2021-01-01 08:30:00'
      AND delivery_point_id = 1
    GROUP BY point_five
    ORDER BY point_five
)
SELECT point_five,
       avg,
       filled_from,
       CASE
           WHEN point_five - filled_from > '1 hour'::interval THEN NULL
           ELSE filled_avg
       END AS gapfilled
FROM filled;
Note that I’ve tried to name my CTE expressively so that it’s a little easier to read!
Also, I wanted to point out a couple other hyperfunctions that you might think about using:
heartbeat_agg is a new/experimental one that will help you determine periods when your system is up or down, so if you're expecting points at least every hour, you can use it to find the periods where the delivery point was down or the like.
When you have more irregular sampling or want to deal with different data frequencies from different delivery points, I'd take a look at the time_weight family of functions. They can be more efficient than using something like gapfill to upsample, by letting you treat all the different sample rates similarly without having to create more points (and more work) to do so. Even if you want to, for instance, compare sums of values, you'd use something like integral to get the time-weighted sum over a period based on the locf interpolation.
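For instance, a time-weighted average per delivery point might look like this (a sketch assuming the timescaledb_toolkit extension is installed; check the hyperfunctions docs for the accessors available in your version):

```sql
-- time_weight('LOCF', ...) builds a time-weighted aggregate using
-- last-observation-carried-forward between points; average() then
-- extracts the time-weighted mean. (You may need to cast at to
-- timestamptz depending on your column types.)
SELECT delivery_point_id,
       average(time_weight('LOCF', at, value::double precision)) AS tw_avg
FROM powers
GROUP BY delivery_point_id;
```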
Anyway, hope all that is helpful!

Subtraction of dates with hours and minutes (result in float)

I would like some help with an SSIS problem.
I have two columns, one with a date of when demand was open and another when the demand was responded to.
My date comes in this way:
DT_ANSWERED_DATE         DT_CREATED_DATE
2021-02-04 19:48:00.000  2021-02-04 19:44:00.000
I would like to subtract DT_ANSWERED_DATE minus DT_CREATED_DATE, but I would like the result to be a float number, like the one I get when I do the subtraction in Excel:
DT_ANSWERED_DATE         DT_CREATED_DATE          DT_ANSWERED_DATE minus DT_CREATED_DATE
2021-02-04 19:48:00.000  2021-02-04 19:44:00.000  0,00277777777228039
I would like to do the same thing, but in a derived column in SSIS (Microsoft Visual Studio).
Thanks in advance for any responses.
It looks like your granularity is in minutes. This should get you the decimal number you are looking for (note the 60.0: with an all-integer divisor, the SSIS expression does integer division and the result truncates to zero):
DATEDIFF("mi", DT_CREATED_DATE, DT_ANSWERED_DATE) / (60.0 * 24)
(60 minutes per hour * 24 hours in a day)
Microsoft documentation... https://learn.microsoft.com/en-us/sql/integration-services/expressions/datediff-ssis-expression?view=sql-server-ver16
In your example above this results in:
4 / (60.0 * 24) = 0.00277777777
Note:
I highly recommend using decimal instead of float unless you really, really have a reason. Equality checks like 1 = 1 are often not true when using float values because of rounding; they will always be true with integers or decimals.
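If you want to sanity-check the arithmetic outside SSIS, the same calculation can be done in T-SQL (illustrative; the literals are the two timestamps from the example above):

```sql
-- 4 minutes difference divided by 1440 minutes per day;
-- the CAST forces decimal rather than integer division.
SELECT CAST(DATEDIFF(minute, '2021-02-04 19:44:00', '2021-02-04 19:48:00')
            AS decimal(18, 10)) / (60 * 24) AS day_fraction;
```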

How to create an SQL time-in-location table from location/timestamp SQL data stream

I have a question that I'm struggling with in SQL.
I currently have a series of location and timestamp data. It consists of devices in locations at varying timestamps. The locations are repeated, so while they are lat/long coordinates there are several that repeat. The timestamp comes in irregular intervals (sometimes multiple times a second, sometimes nothing for 30 seconds). For example see the below representational data (I am sorting by device name in this example, but could order by anything if it would help):
Device  Location  Timestamp
X       A         1
X       A         1.7
X       A         2
X       A         3
X       B         4
X       B         5.2
X       B         6
X       A         7
X       A         8
Y       A         2
Y       A         4
Y       C         6
Y       C         7
I wish to create a table based on the above data that would show entry/exit or first/last time in each location, with the total duration of that instance. i.e:
Device  Location  EntryTime  ExitTime  Duration
X       A         1          3         2
X       B         4          6         2
X       A         7          8         1
Y       A         2          4         2
Y       C         6          7         1
From here I could process it further to work out a total time in location for a given day, for example.
This is something I could do in Python or some other language with something like a while loop, but I'm really not sure how to accomplish this in SQL.
It's probably worth noting that this is in Azure SQL and I'm creating this table via a Stream Analytics Query to an Event Hubs instance.
The reason I don't want to just simply total all in a location is because it is going to be streaming data and rolling through for a display for say, the last 24 hrs.
Any hints, tips or tricks on how I might accomplish this would be greatly appreciated. I've looked and haven't been able to quite find what I'm looking for. I can see things like DATEDIFF for calculating the duration between two timestamps, or MAX and MIN for finding the first and last dates, but none quite seem to tick the box. The challenge is that the devices move around and come back to the same locations many times within the period, so taking the first occurrence/timestamp of device X at location A and subtracting it from the last doesn't take into account the other locations it may have traveled to in between. Complicating things further, the timestamps are irregular, so I can't simply count the number of occurrences for each location and add them up either.
Maybe I'm missing something simple or obvious, but this has got me stumped! Help would be greatly appreciated :)
I believe grouping would work
SELECT Device, Location, [EntryTime] = MIN(Timestamp), [ExitTime] = Max(Timestamp), [Duration] = MAX(Timestamp)- MIN(Timestamp)
FROM <table>
GROUP BY Device, Location
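One caveat: a plain GROUP BY Device, Location merges repeat visits to the same location into a single row, whereas the expected output above lists X at A twice. A gaps-and-islands variant keeps each visit separate (a sketch in T-SQL; the table name device_readings is assumed):

```sql
-- Mark the start of each visit (a location change per device), turn the
-- marks into a running visit number, then aggregate per visit.
WITH marked AS (
    SELECT *,
           CASE WHEN Location = LAG(Location) OVER
                     (PARTITION BY Device ORDER BY Timestamp)
                THEN 0 ELSE 1 END AS is_new_visit
    FROM device_readings
),
visits AS (
    SELECT *,
           SUM(is_new_visit) OVER
               (PARTITION BY Device ORDER BY Timestamp
                ROWS UNBOUNDED PRECEDING) AS visit_id
    FROM marked
)
SELECT Device, Location,
       MIN(Timestamp) AS EntryTime,
       MAX(Timestamp) AS ExitTime,
       MAX(Timestamp) - MIN(Timestamp) AS Duration  -- use DATEDIFF for real datetime columns
FROM visits
GROUP BY Device, Location, visit_id
ORDER BY Device, EntryTime;
```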
I was working on a similar issue, to some extent, in my dataset.
SELECT U.*,
       TO_DATE(U.WEND, 'DD-MM-YY HH24:MI') - TO_DATE(U.WSTART, 'DD-MM-YY HH24:MI') AS DURATION
FROM (
    SELECT EMPNAME, TLOC,
           TRUNC(TO_DATE(T.TDATETIME, 'DD-MM-YY HH24:MI')) AS WDATE,
           MIN(T.TDATETIME) AS WSTART,
           MAX(T.TDATETIME) AS WEND
    FROM EMPTRCK_RSMSM T
    GROUP BY EMPNAME, TLOC, TRUNC(TO_DATE(T.TDATETIME, 'DD-MM-YY HH24:MI'))
) U

In Crystal Report print only first record in group and leave it summable

I have a table that lists every task an operator completed during a day. This is gathered by a Shop Floor Control program. There is also a column that has the total hours worked that day, this field comes from their time punches. The table looks something like this:
Operator 1  Bestupid       0.5  8     5/12/1986
Operator 1  BeProductive   0.1  8     5/12/1986
Operator 1  Bestupidagain  3.2  8     5/12/1986
Operator 1  Belazy         0.7  8     5/13/1986
Operator 2  BetheBest      1.7  9.25  5/12/1986
I am trying to get an efficiency out of this by summing the process hours and comparing it to the hours worked. The problem is that when I do any kind of summary on the hours worked column it sums EVERY DETAIL LINE.
I have tried:
If Previous (groupingfield) = (groupingfield) Then
HoursWorked = 0
Else
HoursWorked = HoursWorked
I have tried a global three-formula trick, but neither of the above leaves me with a summable field; I get "A summary has been specified on a non-recurring field".
I currently use a global variable, reset in the group header, but not WhilePrinting anything. However, it is missing some records, and on occasion I will get two HoursWorked > 0 in the same group :(
Any ideas?
I just want to clarify, I have three groups:
Groups: Work Center --> Operator --> Date
I can summarize the process hours across any group and that's fine. However, the hours worked prints on every detail line even though it really should only print once per date. Therefore, when I summarize the hours worked for an operator, the total is WAY off because it adds up 8 hours for each entry instead of 8 hours for each day.
Try grouping by the operators. Then create a running total for the process hours that sums each record and resets on change of group. In the group footer you can display the running total and any other stats for that operator you care to.
Try another running total for the daily hours, but pick Maximum as the type of summary. Since all the records for the day will have the same hours worked, the maximum will be correct. Reset on the change of the date group and you should be good to go.

Mysql Datetime queries

I'm new to this forum. I've been having trouble constructing a MySQL query. Basically, I want to select data and use some sort of function to output the timestamp field in a certain way. I know that DATE_FORMAT can do this by minute, day, hour, etc. But consider the following:
Say it is 12:59pm. I want to be able to select data from the past day and have the data placed into two-hour-wide time 'bins' based on its timestamp.
So these bins would be 10:00am, 8:00am, 6:00am, 4:00am, etc., and the query would convert each data point's timestamp into one of these bins.
E.G.
data converted
4:45am becomes 4:00am,
6:30am becomes 6:00am,
9:55am becomes 8:00am,
10:03am becomes 10:00am,
11:00am becomes 10:00am
Make sense? The width of the bins needs to be dynamic as well. I hope I described the problem clearly, and any help is appreciated.
Examples:
Monthly buckets:
GROUP BY YEAR(datestampfield) desc, MONTH(datestampfield) desc
Hourly buckets, with number of hours configurable:
SET @rangehrs = 2;
SELECT *, FLOOR(HOUR(dateadded) / @rangehrs) * @rangehrs AS x
FROM mytable
GROUP BY FLOOR(HOUR(dateadded) / @rangehrs) * @rangehrs
LIMIT 5;
(MySQL user variables need the @ prefix; # would start a comment.)
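Note that bucketing on HOUR() alone folds the same hour of different days into one bucket if the range spans more than a day. Bucketing on the full Unix timestamp avoids that (a sketch reusing the table and column names assumed above):

```sql
SET @rangehrs = 2;
-- Round each timestamp down to the start of its N-hour bucket.
SELECT FROM_UNIXTIME(
           FLOOR(UNIX_TIMESTAMP(dateadded) / (@rangehrs * 3600)) * (@rangehrs * 3600)
       ) AS bucket_start,
       COUNT(*) AS points
FROM mytable
WHERE dateadded >= NOW() - INTERVAL 1 DAY
GROUP BY bucket_start;
```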
Sounds like you're looking for a histogram over time. Not sure that's a real thing, but the term "histogram" might get you to a good place, like this related question:
Getting data for histogram plot