How do I get a percentile of an aggregated sum in CloudWatch Logs Insights?

I have multiple logs in which I want to aggregate time_taken by sum. Next, I would like to compute the p90 of these aggregated sums.
Example of my logs
{"identifier" : 1,"time_taken": 10}
{"identifier" : 1,"time_taken": 20}
{"identifier" : 2,"time_taken": 30}
{"identifier" : 2,"time_taken": 40}
What I essentially want is to combine logs with the same identifier and then take the p90 of the summed time_taken. For the logs above, the sums are 30 (identifier 1: 10 + 20) and 70 (identifier 2: 30 + 40), so I expect p90 of time_taken = 70.
Invalid query
fields values
| stats sum(time_taken) as summed_time_taken by identifier
| stats pct(summed_time_taken, 90)
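Note: the two-step form above was historically rejected, but CloudWatch Logs Insights has since added support for chaining a second stats command. Where that feature is available, a sketch along these lines should run (not verified against your log group):
fields time_taken, identifier
| stats sum(time_taken) as summed_time_taken by identifier
| stats pct(summed_time_taken, 90) as p90_of_summed_time_taken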
Query works but gives the wrong output: with the by identifier clause, the percentile is computed separately inside each group, so each row is just that identifier's own sum.
fields values
| stats pct(sum(time_taken),90) as p90_of_summed_time_taken by identifier

Related

Splunk Query Recommendation

I have below log from my application:
BookData, {
id: 12312
}, appID : 'APP1', Relation_ID : asdas-12312
host = aws#asd. sourcetype=service_name
The entire log above is a single string. I want to create a table with the number of times each appID has hit the service, i.e. I want to count the number of events grouped by appID.
Basically, something like:
appID Count
APP1 23
APP2 25
APP3 100
I tried the query below, but it is not working; it gives 0 records found.
index=my_index sourcetype=service_name * | table appID Count | addColTotals labelfield=appID label="appID" count
As per my understanding, the above query is not working because appID is not a label; but in that case, how do I go about forming a query that produces my desired result?
The query doesn't work in part because there is no Count field for the table command to display and no count field for the addcoltotals command to add to the results. To get a count you must tell Splunk to count fields by using the stats, eventstats, streamstats, or timechart command.
Try this:
index=my_index sourcetype=service_name
| stats count as Count by appID
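If appID isn't extracted automatically from the raw string, pull it out with rex first. A sketch, assuming the appID : 'APP1' quoting shown in the sample event:
index=my_index sourcetype=service_name
| rex field=_raw "appID\s*:\s*'(?<appID>[^']+)'"
| stats count as Count by appID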

Count the number of different values of a field, and get the average per minute

I have domains like this:
domain
A
B
C
D
...
One domain is called per request. Now I want to know the average number of requests per minute for a domain (no matter which domain it is), so I split it into three steps:
get the total request number per minute
get the number of distinct domains called per minute
avg = total request number per minute / number of domains per minute
I have got the result of the first step by:
index="whatever" source="sourceurl"
| bin _time span=1m
| stats count as requestsPerMin by _time
However, I don't know how to get the number of domains that have been called. For example, if in one minute domain A was called twice and domain B was called once, the number of domains called in that minute should be two. But I don't know which query gets this result.
If I understand you correctly, you probably want a timechart instead:
index=ndx sourcetype=srctp domain=* source="sourceurl"
| timechart span=1m dc(domain) as count by source
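Alternatively, to complete all three steps from the question in a single search, a sketch (index and source copied from the question):
index="whatever" source="sourceurl"
| bin _time span=1m
| stats count as requestsPerMin, dc(domain) as domainsPerMin by _time
| eval avgPerDomain = round(requestsPerMin / domainsPerMin, 2)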

Display empty bin as a zero value in AWS Log Insights graph

With this count query by bin:
filter @message like / error /
| stats count() as exceptionCount by bin(30m)
I get a discontinuous graph, which is hard to grasp.
Is it possible for AWS CloudWatch Logs Insights to treat an empty bin as a zero count, to get a continuous graph?
Found your question looking for my own answer to this.
The best that I came up with is to calculate a 'presence' field and then use sum to get 0's in the time bins.
I used strcontains, which returns a 1 when matched or 0 when not. https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html#CWL_QuerySyntax-operations-functions
Mine looks like this:
fields @timestamp, @message
| fields strcontains(@message, 'Exit status 1') as is_exit_message
| stats sum(is_exit_message) as is_exit_message_count by bin(15m) as time_of_crash
| sort time_of_crash desc
So, yours would be:
fields strcontains(@message, 'error') as is_error
| stats sum(is_error) as exceptionCount by bin(30m)
Use strcontains + sum, or parse + count.
The point is to avoid filter: you should query all of the logs, so that every time bin still contains events.
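For completeness, a sketch of the parse + count variant (the regex and the error_token field name are assumptions; count(error_token) only counts records where the capture matched, so bins whose events never match come out as 0):
fields @timestamp, @message
| parse @message /(?<error_token> error )/
| stats count(error_token) as exceptionCount by bin(30m)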

Using dedup to find unique hosts. How can I find an average for the selected time frame?

The goal is to provide percent availability. I would like to check every 15 minutes whether the distinct count of server1, server2, and server3 equals 3 for each interval (indicating the system is fully healthy). From this count I want to compute the average over whatever time period is selected in Splunk, and convert it to a percentage.
index="os" sourcetype=ps host="server1" OR host="server2" OR host="server3"
| search "/logs/temp/random/path" OR "application_listener"
| dedup host
| timechart span=30m count
The count should be 3 for each interval.
It's not clear how much of your requirements the example SPL solves, so I'll assume it does nothing.
Having dedup followed by timechart means the timechart command will only see 3 events, one for each host. That doesn't make for a helpful chart. I suggest using dc(host) instead, to get a distinct count of hosts for each interval.
The appendpipe command can be used to add average and percentage values on the end.
index="os" sourcetype=ps host="server1" OR host="server2" OR host="server3"
| search "/logs/temp/random/path" OR "application_listener"
| timechart span=30m dc(host) as count
| appendpipe [ stats avg(count) as Avg | eval Pct=round(Avg*100/3,2) ]
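If you also want a per-interval availability figure rather than only the overall average, an extra eval can convert each interval's host count to a percentage (a sketch; 3 is the expected host count):
index="os" sourcetype=ps host="server1" OR host="server2" OR host="server3"
| search "/logs/temp/random/path" OR "application_listener"
| timechart span=30m dc(host) as count
| eval pctAvailable=round(count*100/3,2)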

How to count and plot several searches at once?

I am counting the number of hits on my website using Splunk. My current search looks for keywordA as follows:
index=mydata keywordA |bucket _time span=day |stats count by _time
However, I would like to add several other searches to the output, say for other keywords (keywordB for instance):
index=mydata keywordB |bucket _time span=day |stats count by _time
Note: these searches are not necessarily mutually exclusive, so they need to be run independently.
I would like to have the total daily count for each search at once, so that I avoid running each search separately.
Output should be:
day keyA keyB
2020-01-01 423 354
2020-01-02 523 254
What is the best way to proceed?
Thanks!
Try this search, which combines your two. Apart from the stats command, it doesn't scale well to many keywords, since the case() must be extended for each new one.
index=mydata (keywordA OR keywordB)
| bin span=1d _time
| eval keyword = case(match(_raw, "keywordA"), "keywordA", match(_raw, "keywordB"), "keywordB", 1==1, "other")
| stats count by _time, keyword
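To get the pivoted day-by-keyword table shown in the question, a timechart can replace the final stats (a sketch built on the same case() extraction):
index=mydata (keywordA OR keywordB)
| eval keyword = case(match(_raw, "keywordA"), "keywordA", match(_raw, "keywordB"), "keywordB", 1==1, "other")
| timechart span=1d count by keyword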