SignalFX detector data().count() based on condition - splunk

Is it possible to implement count() MTS based on condition?
For instance:
We need to monitor the number of times RDS CPU utilization peaks above 95% over the last 3 days.
A = data('CPU_Utilization').count(...when point > 95%).
detector(when(A > {number_of_times_breached}, lasting='3d')).publish(...)
Update: a colleague found the solution:
A = data('CPU_Utilization').above({condition_value}, inclusive=True).count(...)
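Putting the pieces together, a full detector in SignalFlow might look like the sketch below. The metric name, the 95 threshold, the 3-day window, and the breach count of 10 are all placeholders: above() keeps only datapoints exceeding the threshold, and count(over='3d') counts the surviving datapoints in a rolling 3-day window.

```
# Keep only datapoints where CPU utilization exceeded 95 (inclusive)
A = data('CPU_Utilization').above(95, inclusive=True).count(over='3d')
# Alert when the number of breaches in the window exceeds the allowed count
detect(when(A > 10)).publish('rds-cpu-95-breaches')
```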

You can use eval() with a boolean expression inside count() in your SPL query.
Something like
| <your search> | stats count(eval(point>0.95))
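For instance, to count only the events where the value exceeded the threshold and then filter on that count, grouped by host (the host field and the threshold of 10 are assumptions):

```
<your search>
| stats count(eval(point > 0.95)) as breaches by host
| where breaches > 10
```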

Related

KQL performance of WHERE condition

Is there any difference in KQL performance between joining WHERE conditions with and versus adding them as separate where operators?
Will something like
Events
| where Source == "myapp"
and Timestamp > ago(7d)
and isnotnull(DeviceId)
and isnotnull(UserId)
be faster than
Events
| where Source == "myapp"
| where Timestamp > ago(7d)
| where isnotnull(DeviceId)
| where isnotnull(UserId)
?
No difference whatsoever - both queries are semantically equivalent and the engine optimizes them the same way.
That said, according to the docs you should put the time filter first, because Kusto is highly optimized to prune data by time.
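In practice, then, the only thing worth adjusting is filter order. A version that follows the docs' advice (same hypothetical Events table as in the question):

```
Events
| where Timestamp > ago(7d)   // time filter first, so Kusto can prune by time
| where Source == "myapp"
| where isnotnull(DeviceId) and isnotnull(UserId)
```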

Cloudwatch log insights sum() set default 0 value when no logs are present

I'm trying to run the following CloudWatch Logs Insights query against two different log stream sources. However, when one or both streams have no entries, the sum() function returns an empty (null) result instead of 0. Because of that, I can't use the result in another stats operation. Is it possible to work around this behavior and make sum() return 0 when there are no results? Thanks!
stats
sum(raw.stream_1.TotalBill) as stream_1_bill,
sum(raw.stream_2.TotalBill) as stream_2_bill,
stream_1_bill + stream_2_bill as total_bill
Expected result:
stream_1_bill: 0
stream_2_bill: 1
total_bill: 1
Received result:
stream_1_bill:
stream_2_bill: 1
total_bill:
I was able to achieve the desired result by using the coalesce function:
COALESCE(SUM(stream_1_bill),0)
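Applied to the original query, the workaround would look something like this (same field names as the question; a sketch, not verified against every edge case):

```
stats coalesce(sum(raw.stream_1.TotalBill), 0) as stream_1_bill,
      coalesce(sum(raw.stream_2.TotalBill), 0) as stream_2_bill,
      stream_1_bill + stream_2_bill as total_bill
```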

Splunk conditional distinct count

I'm running a distinct count, stats dc(src_ip) by <field>, and it returns the number of distinct source IPs, but I would like a conditional statement (eval?) so that it only returns the stats if the count is greater than 50.
I tried something like the following, but no joy. Any idea how to make a conditional distinct count where the count has to be more than X?
stats dc(src_ip) | eval status=if(count>50) => doesn't work
The stats command will always return results (although sometimes they'll be null). You can, however, suppress results that meet your conditions.
stats dc(src_ip) as ip_count
| where ip_count > 50
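The same pattern works with a split-by field, in which case groups below the threshold are suppressed individually (the dest field name is an assumption):

```
stats dc(src_ip) as ip_count by dest
| where ip_count > 50
```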

SQL Optimization in Oracle

We are using Oracle 11 and I recently acquired Dell SQL Optimizer (included with the Xpert Toad package). We had a statement this morning that was taking longer than normal to run, and after we eventually got it running (it was missing some conditions from when it was created), I was curious, having never used a SQL optimizer before, what it would change. It came back with over 150 variations of the same statement, but the one with the lowest cost simply appended to the following line:
AND o.curdate > 0 + UID * 0
We already had o.curdate > 0, and the "+ UID * 0" was added. This decreased the runtime from over a minute to 3 seconds. I assume it has something to do with how Oracle translates and processes the conditions, but I was curious whether any of the Oracle gurus could provide some insight as to how this addition to the greater-than-zero check cut the runtime roughly twentyfold. Thanks!
The UID * 0 is used to hide the 0 from the optimizer. The optimizer would normally use its statistics to decide whether an index scan on o.curdate > 0 makes sense. As long as the optimizer knows the value in o.curdate > value, it will do so. But when the value is unknown (here because the function UID is called only at execution time and folded into the value), the optimizer cannot foresee what percentage of rows will be accessed and therefore chooses an average-best access method.
Example: you have a table with IDs 1 to 100. Asking for ID > 0 will result in a full table scan, whereas asking for ID > 99 will likely result in an index range scan. Asking for ID > 0 + UID * 0, however, makes the optimizer blind to the value, and it may choose the index plan rather than the full table scan.
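If the goal is simply to steer the optimizer toward a known-better plan, a hint expresses that intent more transparently than arithmetic obfuscation. A sketch (the table name, alias, and choice of FULL hint are hypothetical; the right hint depends on which plan is actually faster):

```sql
-- Example: explicitly request a full table scan instead of relying on UID * 0
SELECT /*+ FULL(o) */ o.*
FROM   orders o
WHERE  o.curdate > 0;
```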

SQL Server aggregate performance

I am wondering whether SQL Server knows to 'cache', if you like, aggregates within a query when they are used again.
For example,
Select Sum(Field),
Sum(Field) / 12
From Table
Would SQL Server know that it has already calculated the Sum function on the first field and then just divide it by 12 for the second? Or would it run the Sum function again then divide it by 12?
Thanks
It calculates it once:
Select
Sum(Price),
Sum(Price) / 12
From
MyTable
The plan gives:
|--Compute Scalar(DEFINE:([Expr1004]=[Expr1003]/(12.)))
|--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1010]=(0) THEN NULL ELSE [Expr1011] END))
|--Stream Aggregate(DEFINE:([Expr1010]=Count(*), [Expr1011]=SUM([myDB].[dbo].[MyTable].[Price])))
|--Index Scan(OBJECT:([myDB].[dbo].[MyTable].[IX_SomeThing]))
This table has 1.35 million rows
Expr1011 = SUM
Expr1003 = some internal handling of the "no rows" case, but it is basically Expr1011
Expr1004 = Expr1011 / 12
According to the execution plan, it doesn't re-sum the column.
Good question. I think the answer is no, it doesn't cache it.
I ran a test query with around 3000 counts in it, and it was much slower than one with only a few. I still want to test whether the query would be just as slow selecting plain columns.
Edit: OK, I just tried selecting a large number of columns versus just one, and the number of columns (when thousands are returned) does affect the speed.
Overall, unless you are using that aggregate number a ton of times in your query, you should be fine. If push comes to shove, you could always save the outcome to a variable and do the math after the fact.
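The variable approach mentioned above might look like this in T-SQL (reusing the MyTable/Price names from the earlier answer; the money type is an assumption):

```sql
DECLARE @total money;

-- One scan computes the aggregate...
SELECT @total = SUM(Price)
FROM   MyTable;

-- ...and every derived value is cheap arithmetic afterwards
SELECT @total      AS total,
       @total / 12 AS monthly;
```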