Splunk Concurrency Calculation - splunk

I have some data from logs in Splunk where I need to determine what other requests were running concurrently at the time of any single event.
Using the following query, I was able to have it return a column for the number of requests that ran at the same time within my start time and duration.
index="sfdc" source="sfdc_event_log://EventLog_SFDC_Production_eventlog_hourly" EVENT_TYPE IN (API, RestAPI) RUN_TIME>20000
| eval endTime=_time
| eval permitTimeInSecs=(RUN_TIME-20000)/1000
| eval permitAcquiredTime=endTime-permitTimeInSecs
| eval dbTotalTime=DB_TOTAL_TIME/1000000
| concurrency start=permitAcquiredTime duration=permitTimeInSecs
| table _time API_TYPE EVENT_TYPE ENTITY_NAME apimethod concurrency permitAcquiredTime permitTimeInSecs RUN_TIME CPU_TIME dbTotalTime REQUEST_ID USER_ID
| fieldformat dbTotalTime=round(dbTotalTime,0)
| rename permitAcquiredTime as "Start Time", permitTimeInSecs as "Concurrency Duration", concurrency as "Concurrent Running Events", API_TYPE as "API Type", EVENT_TYPE as "Event Type", ENTITY_NAME as "Entity Name", apimethod as "API Method", RUN_TIME as "Run Time", CPU_TIME as "CPU Time", dbTotalTime as "DB Total Time", REQUEST_ID as "Request ID", USER_ID as "User ID"
| sort "Concurrent Running Events" desc
I am now trying to investigate a single event in these results. For example, the top event says that at the time it ran, there were 108 concurrent requests running in the 20 second window of time.
How can I identify those 108 events using this data?
I imagine it would involve querying the events that fall within a specific time range, but I am not sure whether I need to check something like _time ± 10 seconds to see what was running within the 20-second window.
I just need to understand the data behind these 108 events a little more for this top example. My end goal here is to be able to add a drill-down to the dashboard so that when I click on the 108, I can see the events that were running.

Essentially, you are on the right lines. What you want to do is create a search (presumably on the original data) using earliest=<beginning of 20 second window> latest=<end of 20 second window>, built from your calculated values.
You have the start time and can calculate the end time. Then pass these as variables into a new search.
| search earliest=start_time latest=end_time index="sfdc" etc..
I can't check this here right now, but it's probably something along those lines. There are quite likely more elegant ways to do the same. Hope I'm not wildly off the mark and this at least helps a little.
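As I understand the concurrency command, the 108 are the events whose own [permitAcquiredTime, permitAcquiredTime + permitTimeInSecs] span contains the clicked event's start time. So one possible drill-down sketch (untested; $clicked_start$ is a hypothetical token your drill-down would set from the clicked row's permitAcquiredTime) would be:
index="sfdc" source="sfdc_event_log://EventLog_SFDC_Production_eventlog_hourly" EVENT_TYPE IN (API, RestAPI) RUN_TIME>20000
| eval permitTimeInSecs=(RUN_TIME-20000)/1000
| eval permitAcquiredTime=_time-permitTimeInSecs
| where permitAcquiredTime<=$clicked_start$ AND (permitAcquiredTime+permitTimeInSecs)>=$clicked_start$
| table _time REQUEST_ID USER_ID EVENT_TYPE ENTITY_NAME RUN_TIME permitAcquiredTime permitTimeInSecs
If you would rather see everything that overlapped any part of the 20-second window, replace the where clause with a check against both window boundaries (start and end tokens) instead of just the start.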

Related

How to find the time duration between two Splunk events which have a unique key

First Event
17:09:05:362 INFO com.a.b.App - Making a GET Request and req-id: [123456]
Second Event
17:09:06:480 INFO com.a.b.App - Output Status Code: 200 req-id:"123456"
I tried to use index="xyz" container="service-name" | transaction "req-id" startswith="Making a GET Request" endswith="Output Status Code" | table duration but it is also not working.
I want to calculate the duration between the above two events for every request. I went over some solutions in Splunk and on Stack Overflow, but still can't get the proper result.
Try doing it with stats instead:
index=ndx sourcetype=srctp
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| rex field=_raw "(?<sequence>Making a GET Request)"
| rex field=_raw "(?<sequence>Output Status Code)"
| eval sequence=sequence+";"+_time
| stats values(sequence) as sequence by req_id
| mvexpand sequence
| rex field=sequence "(?<sequence>[^;]+);(?<time>\d+)"
| eval time=strftime(time,"%c")
This will extract the "req-id" into a field named req_id, and the start and end markers of the sequence into a field named sequence.
Presuming the sample data you shared is correct, when you stats values(sequence) as sequence, it will put the "Making..." entry first and the "Output..." entry second.
Because values() will do this, when you mvexpand and then split the values()'d field into sequence and time, they'll be in the proper order.
If the sample data is incomplete, you may need to tweak the regexes for populating sequence
It seems you're going with my previously suggested approach 😉
Now you have two possibilities:
1. SPL
Below is the simplest query, invoking only one rex and assuming the _time field is correctly populated:
index=<your_index> source=<your_source>
("*Making a GET Request*" OR "*Output Status Code*")
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| stats max(_time) as end, min(_time) as start by req_id
| eval duration = end - start
| table req_id duration
Note that depending on the amount of data to scan, this one can be resource-consuming for your Splunk cluster.
2. Log the response time directly in the API (more efficient)
It seems you are working on an API. You should have the capability to get the response time of each call and write it directly into your log.
Then you can exploit it easily in SPL without any calculation.
It is always preferable to persist data at index time rather than perform systematic calculations at search time.

Avoid using Transaction in splunk queries

I am looking for an alternate way to write a Splunk query without using transaction.
Example
assuming r is a unique field in both the searches
(sourcetype=* "search log 1") OR (sourcetype=* "search log 2") | transaction r startswith="X" endsWith="y" maxspan=4s
Typically, stats will be your friend here
However, without seeing sample data or what actual SPL you have tried so far, any answer is mostly going to be speculation :)
I'll happily update this answer if/when you provide such, but here's a possible start:
(index=ndxA sourcetype=srctpA "search log 1" r=*) OR (index=ndxB sourcetype=srctpB "search log 2" r=*)
| stats min(_time) as begintime max(_time) as endtime values(index) as rindex values(sourcetype) as rsourcetype by r
| eval begintime=strftime(begintime,"%c"), endtime=strftime(endtime,"%c")
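If you also need to honour the original maxspan=4s, one possible follow-on (a sketch built on the same assumed index, sourcetype, and field names, not tested) is to compute the span before formatting the times and filter on it:
(index=ndxA sourcetype=srctpA "search log 1" r=*) OR (index=ndxB sourcetype=srctpB "search log 2" r=*)
| stats min(_time) as begintime max(_time) as endtime by r
| eval duration=endtime-begintime
| where duration<=4
| eval begintime=strftime(begintime,"%c"), endtime=strftime(endtime,"%c")
The where clause drops any r whose two events are more than 4 seconds apart, which is roughly what maxspan=4s did inside transaction.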

In Amazon Cloudwatch Insights, how do you take a statistic of a statistic?

I am using AWS Cloudwatch Insights and running a query like this:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount by bin(10m) as tenMinuteTime
| stats max(interestCount) by datefloor(tenMinuteTime, 1d)
However, on the last line, I get the following error:
mismatched input 'stats' expecting {K_PARSE, K_SEARCH, K_FIELDS, K_DISPLAY, K_FILTER, K_SORT, K_ORDER, K_HEAD, K_LIMIT, K_TAIL}
It would seem to mean from this that I cannot take multiple layers of stat queries in Insights, and thus cannot take a statistic of a statistic. Is there a way around this?
You cannot currently use multiple stat commands and from what I know there is no direct way around that at this time. You can however thicken up your single stat command and separate by comma, like so:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount,
max(interestCount) as maxInterest,
interestCount by bin(10m) as tenMinuteTime
You define fields and use functions after stats and then process those result fields.
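If all you actually need is the single busiest 10-minute bucket over the queried range (rather than a per-day maximum), another workaround worth trying is to sort the one allowed stats result and keep the top row. This is only a sketch, not part of the answer above:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount by bin(10m) as tenMinuteTime
# keep only the 10-minute bucket with the highest count
| sort interestCount desc
| limit 1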

Splunk search issue

I have a search query like below.
index = abc_dev sourcetype = data RequestorSystem = * Description="Request Receieved from Consumer Service"
OR Description="Total Time taken in sending response"
| dedup TId
| eval InBoundCount=if(Description="Request Receieved from Consumer Service",1,0)
| eval OutBoundCount=if(Description="Total Time taken in sending response",1,0)
| stats sum(InBoundCount) as "Inbound Count",sum(OutBoundCount) as "Outbound Count"
I am not sure why the inbound count is always showing as 0, while the outbound count works perfectly.
There is a typo in your eval InBoundCount=... Received is spelled wrong, and if your events have it spelled correctly it won't match!
If that's not the case:
try running the query for both counts separately and make sure you are getting events. Also, posting some example input events will make our answer more precise.
Splunk queries are joined by an implicit AND, which means that your OR needs to either be included in parentheses or (if you are using Splunk 6.6 or newer) use the IN keyword like so:
index = abc_dev sourcetype = data RequestorSystem = *
Description IN ("Request Receieved from Consumer Service", "Total Time taken in sending response")
Using IN is more portable in case you want to add other strings later on. With some tweaks, you could even use a variation of stats count by Description with this, as sketched below.
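For example, a sketch of that variation (double-check that the Description strings match the spelling in your actual events, per the first answer) could be:
index=abc_dev sourcetype=data RequestorSystem=*
Description IN ("Request Receieved from Consumer Service", "Total Time taken in sending response")
| dedup TId
| stats count by Description
That returns one row per Description value, so the inbound and outbound counts appear side by side without the two eval/if steps.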

Need a count for a field from different timezones (have multiple fields from .csv uploaded file)

I am a little confused. I have some events ingested from a .csv file in Splunk, coming from several different timezones: China, Pacific, Eastern, Europe, etc.
I have fields like start time, end time, TimeZone, TimeZoneID, sitename, conferenceID, hostname, and so on.
For reference, conferenceID values look like 131146947830496273, 130855971227450408, etc.
I was wondering: if I run something like "... | stats count(conferenceID)" for a particular time interval (e.g., 12:00 pm to 3:00 pm today) while sitting in the Pacific timezone, the search should collect all events using the start time and end time from the events, i.e. based on the time interval in their originating timezone, not the Splunk timezone.
Below are some samples of the logs I have:
testincsso,130878564690050083, Shona,"GMT-07:00, Pacific (San Francisco)",4,06/17/2019 09:33:17,06/17/2019 09:42:23,10,0,0,0,0,0,0,9,0,0,1,0,0,1,1
host = usloghost1.example.com sourcetype =webex_cdr
6/17/19
12:29:03.060 AM
testincsso,129392485072911500,Meng,"GMT+08:00, China (Beijing)",45,06/17/2019 07:29:03,06/17/2019 07:59:22,31,0,0,0,0,0,0,0,0,30,1,1,0,0,1
host = usloghost1.corp.example.com sourcetype = webex_cdr
6/17/19
12:19:11.060 AM
testincsso,131121310031953680,sarah ward,"GMT-07:00, Pacific (San Francisco)",4,06/17/2019 07:19:11,06/17/2019 07:52:54,34,0,0,0,0,0,0,0,0,34,3,3,0,0,2
host = usloghost1.corp.example.com sourcetype = webex_cdr
6/17/19
12:00:53.060 AM
testincsso,130878909569842780,Patrick Janesch,"GMT+02:00, Europe (Amsterdam)",22,06/17/2019 07:00:53,06/17/2019 07:04:50,4,0,0,0,0,0,0,4,0,2,3,2,0,1,2
host = usloghost1.corp.example.com sourcetype = webex_cdr
Update:
There are two fields in the events, start time and end time, for every conference, given in their local timezone (the event's originating TZ).
Also, _time refers to the Splunk time, which I don't need in this case. What I can use are the date_hour, date_minute, date_second, etc. fields, which show the event's local timezone time (China, Europe, Asia, etc.).
So when I sit here in the Pacific TZ and search for
index=test "testincsso" | stats count(conferenceID) by _time
with the time interval set to the last 4 hours, the output should display the count of conferences by comparing against each event's local TZ time for the last 4 hours.
So do I need to use "| eval hour = strftime(_time,"%H")" or "| eval mytime=_time | convert timeformat="%H" ctime(mytime)" before the stats?
Thanks.
Also, changing the time picker's default behavior may give correct results.
I have events with fields "start time" and "end time" from different TZs. So when I search for events in, e.g., the date range "06-16-2019" using the time picker, I should get all events based on the "start time" field in the events, not Splunk's "_time".
I want to change my Splunk time picker's default behavior so that it produces output based on the events' fields (e.g., "start time" & "end time"). Below is the query I changed in the source XML:
index=test sourcetype=webex "testinc" | eval earliest = $toearliest$ | eval latest=if($tolatest$ < 0, now(), $tolatest$) | eval datefield = strptime($Time$, "%m/%d/%Y %H:%M:%S") | stats count(Conference)
If you have any control over how the logs are generated, it's best to include the time zone as part of the timestamp. For example, "06/17/2019 07:00:53+0200". Then Splunk can easily convert the time.
If that's not an option, perhaps you can specify the time zone when the logs are read. Assuming each log is stored on a system in the originating time zone, the props.conf stanza for the Universal Forwarder should include a TZ attribute telling Splunk where in the world the log is from.
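For example, a minimal props.conf sketch for a forwarder located in Beijing (the sourcetype name is taken from the sample events; the TZ and TIME_FORMAT values are assumptions you would adjust per site):
# props.conf on the forwarder in the originating time zone
[webex_cdr]
TZ = Asia/Shanghai
TIME_FORMAT = %m/%d/%Y %H:%M:%S
Each forwarder carries its own TZ value, so events from every site get the correct epoch time at index time.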
If this doesn't help, please edit your question to say what problem you are trying to solve.