In Amazon CloudWatch Insights, how do you take a statistic of a statistic?

I am using AWS CloudWatch Insights and running a query like this:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount by bin(10m) as tenMinuteTime
| stats max(interestCount) by datefloor(tenMinuteTime, 1d)
However, on the last line, I get the following error:
mismatched input 'stats' expecting {K_PARSE, K_SEARCH, K_FIELDS, K_DISPLAY, K_FILTER, K_SORT, K_ORDER, K_HEAD, K_LIMIT, K_TAIL}
It would seem from this that I cannot use multiple layers of stats queries in Insights, and thus cannot take a statistic of a statistic. Is there a way around this?

You cannot currently use multiple stats commands, and as far as I know there is no direct way around that at this time. You can, however, pack more into your single stats command by separating aggregations with commas, like so:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount,
    max(interestCount) as maxInterest
    by bin(10m) as tenMinuteTime
You define aliases and apply aggregate functions after the single stats, then process those result fields.
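If the goal is still "max count per day," a hedged partial workaround (a sketch, assuming the default @timestamp field) is to group a single stats by both the 10-minute bin and the day, then sort, so the busiest bins within each day float to the top:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount by bin(10m) as tenMinuteTime, datefloor(@timestamp, 1d) as day
| sort interestCount desc
This does not collapse each day to a single row, but sort (which the parser does accept after stats, per the error message above) makes the per-day maxima easy to read off.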

Related

How to find the time duration between two Splunk events which have a unique key

First Event
17:09:05:362 INFO com.a.b.App - Making a GET Request and req-id: [123456]
Second Event
17:09:06:480 INFO com.a.b.App - Output Status Code: 200 req-id:"123456"
I tried to use
index="xyz" container="service-name"
| transaction "req-id" startswith="Making a GET Request" endswith="Output Status Code"
| table duration
but it is not working either.
I want to calculate the duration between these two events for every request. I went over some solutions on Splunk Answers and Stack Overflow, but still can't get the proper result.
Try doing it with stats instead:
index=ndx sourcetype=srctp
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| rex field=_raw "(?<sequence>Making a GET Request)"
| rex field=_raw "(?<sequence>Output Status Code)"
| eval sequence=sequence+";"+_time
| stats values(sequence) as sequence by req_id
| mvexpand sequence
| rex field=sequence "(?<sequence>[^;]+);(?<time>\d+)"
| eval time=strftime(time,"%c")
This will extract the "req-id" into a field named req_id, and the start and end of the sequence into a field named sequence.
Presuming the sample data you shared is correct, when you stats values(sequence) as sequence, it will put the "Making..." entry first and the "Output..." entry second (values() returns its distinct values in lexicographical order).
Because values() will do this, when you mvexpand and then split the values()'d field back into sequence and time, they'll be in the proper order.
If the sample data is incomplete, you may need to tweak the regexes that populate sequence.
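To get from there to an actual duration per request, one hedged continuation (a sketch that replaces the final strftime line, so time stays in epoch seconds):
| stats min(time) as start, max(time) as end by req_id
| eval duration = end - start
| table req_id duration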
It seems you're going with my previously suggested approach 😉
Now you have two possibilities:
1. SPL
Below is the simplest query, invoking only one rex and assuming the _time field is correctly populated:
index=<your_index> source=<your_source>
("*Making a GET Request*" OR "*Output Status Code*")
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| stats max(_time) as end, min(_time) as start by req_id
| eval duration = end - start
| table req_id duration
Note that depending on the amount of data to scan, this one can be resource-intensive for your Splunk cluster.
2. Log the response time directly in the API (more efficient)
It seems you are working on an API. You should have the capability to measure the response time of each call and trace it directly in your logs.
Then you can exploit it easily in SPL without any calculation.
It is always preferable to persist data at index time rather than performing systematic calculations at search time.
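For example, if each response log carried a numeric response time (response_time below is a hypothetical field name, not something from the logs shown), the search becomes a plain aggregation; a minimal sketch:
index=<your_index> source=<your_source> "Output Status Code"
| stats avg(response_time) as avg_time, perc95(response_time) as p95_time, max(response_time) as max_time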

Count how often logs with dynamic field value that is the same gets emitted?

I don't think this is possible, but wanted to ask anyway: is there a way in CloudWatch Insights to count how often a log with a dynamic value is emitted with the same value from distinct logs? The use case I have is that we want to compare log statements from two different code paths, so we attach the same requestID to both log statements. To illustrate what might happen, two logs may get emitted:
Log 1:
{
  "message": "SPECIAL_LOG_EMITTED Operation1_Emitted",
  "requestID": "123456"
}
Log 2:
{
  "message": "SPECIAL_LOG_EMITTED Operation2_Emitted",
  "requestID": "123456"
}
So ideally I could do something like
fields @timestamp, @message, requestID
| filter @message like "SPECIAL_LOG_EMITTED"
| parse @message '*_Emitted' as operation
| stats count(*) as all, sum(operation LIKE 'Operation1') as Op1, sum(operation LIKE 'Operation2') as Operation2 by bin(5m)
And then from this find out where the requestID is matching. The requestID is dynamic, though, so I can't just hard-code it -- I want to find how often logs are emitted with matching requestIDs.
I've considered looking into count_distinct, but that seems like the wrong approach (correct me if I'm wrong).
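One hedged sketch of the grouping idea (field names as in the question; note that the parser does accept filter after stats, per the error message in the first question above): group by requestID itself and keep only the IDs that appear more than once:
fields @timestamp, @message, requestID
| filter @message like "SPECIAL_LOG_EMITTED"
| stats count(*) as emissions by requestID
| filter emissions > 1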

Splunk searching event logs to find values exceeding a given threshold

I want to search the log event
"Closure request counts: startAssets: "
and find occurrences where startAssets is larger than 50.
How would I do that?
Something like:
Closure request counts: startAssets: 51
would maybe give a search similar to
"Closure request counts: startAssets: {num} AND num >=50"
perhaps?
What does that look like in SPL?
That's pretty simple, but you'll need to extract the number to do it. I like to use the rex command for that, but there may be other ways.
index=foo "Closure request counts: startAssets: *"
| rex "startAssets: (?<startAssets>\d+)"
| where startAssets > 50
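If the extracted startAssets ends up being compared as a string, a hedged variant coerces it explicitly before the comparison:
index=foo "Closure request counts: startAssets: *"
| rex "startAssets: (?<startAssets>\d+)"
| where tonumber(startAssets) > 50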

Determine if field is in a subset of values

I'm writing a query to determine what percentage of events are error events for a camera-based system.
To narrow logged events down to camera events, I have event=camera* in the initial query.
What I want to do next is treat the event as bad if it's in a subset, so I want something like:
event=camera* | eval bad_event=IF(event IN (camera-failed, camera-error, ...))
but I am not sure of the correct syntax for this in Splunk.
I tried eval bad_event=IF(event=camera-failed OR event=camera-error), but got the message Error in 'eval' command: The arguments to the 'if' function are invalid.
How do I check if the event is in a subset of its possible values?
You could do this with coalesce and case, or with if and match:
Using case:
| eval event_type=coalesce(case(event=="camera-failed","bad",event=="camera-error","bad"), "good")
Using match:
| eval event_type=if(match(event, "camera-(failed|error)"), "bad", "good")
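Since the question reached for IN, note that newer Splunk versions (6.6+) also offer the eval in() function; a hedged equivalent sketch:
| eval event_type=if(in(event, "camera-failed", "camera-error"), "bad", "good")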

Need to query splunk using rest api call and pull mean and stdev

I am trying to query Splunk using the REST API with the following:
curl -u "<user>":"<pass>" -k https://splunkserver.com:8089/services/search/jobs/export -d'search=search index%3d"<index_name>" sourcetype%3d"access_combined_wcookie" starttime%3d06/02/2013:0:0:0 endtime%3d06/10/2013:0:0:0 uri_path%3d"<uri1>" OR uri_path%3d"<uri2>" user!%3d"-" referer!%3d"-" | eval Time %3d request_time_length%2f1000000 | stats stdev%28Time%29 as stdev, mean%28Time%29 as mean, count%28uri_path%29 as count by uri_path'
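URL-decoded, the search portion reads:
search index="<index_name>" sourcetype="access_combined_wcookie" starttime=06/02/2013:0:0:0 endtime=06/10/2013:0:0:0 uri_path="<uri1>" OR uri_path="<uri2>" user!="-" referer!="-" | eval Time = request_time_length/1000000 | stats stdev(Time) as stdev, mean(Time) as mean, count(uri_path) as count by uri_path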
However I do not get the computed mean and stdev, I only see count. How can I add the mean and stdev?
The query looks about right; I tried a similar query on my end and it seemed to give me all three aggregates. The only thing I can think of is to make sure you have events that match the search criteria. It could be your time boundaries: try expanding those, or maybe removing one or both of them, to see if you get any data for mean and stdev.
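Another hedged check: if request_time_length is not extracted for the matching events, Time ends up null, so stdev(Time) and mean(Time) would show blank cells while count(uri_path) still counts rows, which matches the symptom described. A quick sketch to verify field coverage:
search index="<index_name>" sourcetype="access_combined_wcookie"
| stats count as events, count(request_time_length) as with_request_time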