First Event
17:09:05:362 INFO com.a.b.App - Making a GET Request and req-id: [123456]
Second Event
17:09:06:480 INFO com.a.b.App - Output Status Code: 200 req-id:"123456"
I want to calculate the duration between these two events for every request. I tried index="xyz" container="service-name" | transaction "req-id" startswith="Making a GET Request" endswith="Output Status Code" | table duration, but it is not working either. I went over some solutions in Splunk and on Stack Overflow, but still can't get the proper result.
Try doing it with stats instead:
index=ndx sourcetype=srctp
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| rex field=_raw "(?<sequence>Making a GET Request)"
| rex field=_raw "(?<sequence>Output Status Code)"
| eval sequence=sequence+";"+_time
| stats values(sequence) as sequence by req_id
| mvexpand sequence
| rex field=sequence "(?<sequence>[^;]+);(?<time>\d+)"
| eval time=strftime(time,"%c")
This will extract the "req-id" into a field named req_id, and the start and end of the sequence into a field named sequence.
Presuming the sample data you shared is accurate, when you stats values(sequence) as sequence, it will put the "Making..." entry first and the "Output..." entry second.
Because values() returns its results in lexicographical order, when you mvexpand and then split the values()'d field into sequence and time, they'll be in the proper order.
If the sample data is incomplete, you may need to tweak the regexes that populate sequence.
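If the duration itself is what you ultimately need rather than the formatted start/end times, here is a hedged variation of the same stats idea that computes it directly. It reuses the placeholder index/sourcetype from above and assumes each req_id has exactly one "Making..." and one "Output..." event:
index=ndx sourcetype=srctp
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| eval start_time=if(searchmatch("Making a GET Request"),_time,null())
| eval end_time=if(searchmatch("Output Status Code"),_time,null())
| stats min(start_time) as start_time max(end_time) as end_time by req_id
| eval duration=end_time-start_time
| table req_id duration
duration here comes out in seconds, since _time is epoch time and stats min/max simply skip the null entries.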
It seems you’re going with my previously suggested approach 😉
Now you have two possibilities:
1. SPL
Below is the simplest query, invoking only one rex and assuming the _time field is correctly populated:
index=<your_index> source=<your_source>
("*Making a GET Request*" OR "*Output Status Code*")
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| stats max(_time) as end, min(_time) as start by req_id
| eval duration = end - start
| table req_id duration
Note that depending on the amount of data to scan, this one can be resource-intensive for your Splunk cluster.
2. Log the response time directly in the API (more efficient)
It seems you are working on an API. You should have the ability to measure the response time of each call and write it directly to your log.
Then you can exploit it easily in SPL without any calculation.
It is always preferable to persist data at index time rather than performing systematic calculations at search time.
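For illustration only, suppose the API wrote a hypothetical response_time_ms field on the response line (the field name is my invention, not something in your logs). The search then needs no event pairing at all:
index=<your_index> source=<your_source> "response_time_ms="
| rex field=_raw "response_time_ms=(?<response_time_ms>\d+)"
| stats avg(response_time_ms) as avg_ms max(response_time_ms) as max_ms
One event per request instead of two also means half as many events to scan at search time.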
I am looking for an alternate way to write a Splunk query without using transaction.
Example:
Assuming r is a unique field in both searches:
(sourcetype=* "search log 1") OR (sourcetype=* "search log 2") | transaction r startswith="X" endswith="y" maxspan=4s
Typically, stats will be your friend here.
However, without seeing sample data or what actual SPL you have tried so far, any answer is mostly going to be speculation :)
I'll happily update this answer if/when you provide such, but here's a possible start:
(index=ndxA sourcetype=srctpA "search log 1" r=*) OR (index=ndxB sourcetype=srctpB "search log 2" r=*)
| stats min(_time) as begintime max(_time) as endtime values(index) as rindex values(sourcetype) as rsourcetype by r
| eval begintime=strftime(begintime,"%c"), endtime=strftime(endtime,"%c")
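And if the elapsed time per r is what you are after, a hedged extension of the same search computes it before the formatting step (duration is in seconds; it assumes both the start and end events exist for every r):
(index=ndxA sourcetype=srctpA "search log 1" r=*) OR (index=ndxB sourcetype=srctpB "search log 2" r=*)
| stats min(_time) as begintime max(_time) as endtime by r
| eval duration=endtime-begintime
| eval begintime=strftime(begintime,"%c"), endtime=strftime(endtime,"%c")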
I am using AWS CloudWatch Logs Insights and running a query like this:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount by bin(10m) as tenMinuteTime
| stats max(interestCount) by datefloor(tenMinuteTime, 1d)
However, on the last line, I get the following error:
mismatched input 'stats' expecting {K_PARSE, K_SEARCH, K_FIELDS, K_DISPLAY, K_FILTER, K_SORT, K_ORDER, K_HEAD, K_LIMIT, K_TAIL}
This would seem to mean that I cannot chain multiple stats commands in Insights, and thus cannot take a statistic of a statistic. Is there a way around this?
You cannot currently use multiple stats commands, and as far as I know there is no direct way around that at this time. You can, however, thicken up your single stats command and separate the aggregations with commas, like so:
fields #message, #timestamp
| filter strcontains(#message, "Something of interest happened")
| stats count() as @interestCount,
max(@interestCount) as @maxInterest,
interestCount by bin(10m) as @tenMinuteTime
You define fields and use functions after stats and then process those result fields.
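If what you are really after is the busiest 10-minute bucket over the day, one workaround I would expect to work (a sketch, not tested against your log group) is to sort the output of the single stats and keep the top row. This assumes the query's time range is set to the one day you care about:
fields @message, @timestamp
| filter strcontains(@message, "Something of interest happened")
| stats count() as interestCount by bin(10m) as tenMinuteTime
# keep only the busiest 10-minute bucket in the queried range
| sort interestCount desc
| limit 1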
I have a search query like below.
index = abc_dev sourcetype = data RequestorSystem = * Description="Request Receieved from Consumer Service"
OR Description="Total Time taken in sending response"
| dedup TId
| eval InBoundCount=if(Description="Request Receieved from Consumer Service",1,0)
| eval OutBoundCount=if(Description="Total Time taken in sending response",1,0)
| stats sum(InBoundCount) as "Inbound Count",sum(OutBoundCount) as "Outbound Count"
I am not sure why the inbound count always shows as 0; the outbound count works perfectly.
There is a typo in your eval InBoundCount=...: "Received" is spelled wrong, and if your events have it spelled correctly, it won't match!
If that's not the case, try running the query for each count separately and make sure you are getting events. Also, posting some example input events will make our answers more precise.
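For example, check each leg on its own; if the first search returns zero events, the spelling in your query does not match your data (the misspelling is kept deliberately here, matching your query as posted):
index = abc_dev sourcetype = data Description="Request Receieved from Consumer Service" | stats count
index = abc_dev sourcetype = data Description="Total Time taken in sending response" | stats count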
Splunk search terms are joined by an implicit AND, which means your OR needs either to be enclosed in parentheses or (if you are using Splunk 6.6 or newer) replaced with the IN operator, like so:
index = abc_dev sourcetype = data RequestorSystem = *
Description IN ("Request Receieved from Consumer Service", "Total Time taken in sending response")
Using IN is more portable in case you want to add other strings later on. With some tweaks, you could even use a variation of stats count by Description with this.
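For instance, here is a hedged variation that counts each leg in one pass; dc(TId) assumes TId is your per-transaction identifier, as your dedup suggests:
index = abc_dev sourcetype = data RequestorSystem = *
Description IN ("Request Receieved from Consumer Service", "Total Time taken in sending response")
| stats dc(TId) as "Transaction Count" by Description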
I am a little confused. I have events ingested into Splunk from a .csv file, originating from several different time zones: China, Pacific, Eastern, Europe, etc.
I have fields like start time, end time, TimeZone, TimeZoneID, sitename, conferenceID, hostname, etc.
For reference, conferenceID values look like 131146947830496273, 130855971227450408, and so on.
I was wondering: if I run a "... | stats count(conferenceID)" for a particular time interval (e.g., 12:00 to 15:00 today) while sitting in the Pacific time zone, the search should collect all events whose start time and end time fall within that interval in their originating time zones, not within Splunk's time zone interval.
Below are some samples of the logs I have:
testincsso,130878564690050083, Shona,"GMT-07:00, Pacific (San Francisco)",4,06/17/2019 09:33:17,06/17/2019 09:42:23,10,0,0,0,0,0,0,9,0,0,1,0,0,1,1
host = usloghost1.example.com sourcetype = webex_cdr
6/17/19
12:29:03.060 AM
testincsso,129392485072911500,Meng,"GMT+08:00, China (Beijing)",45,06/17/2019 07:29:03,06/17/2019 07:59:22,31,0,0,0,0,0,0,0,0,30,1,1,0,0,1
host = usloghost1.corp.example.com sourcetype = webex_cdr
6/17/19
12:19:11.060 AM
testincsso,131121310031953680,sarah ward,"GMT-07:00, Pacific (San Francisco)",4,06/17/2019 07:19:11,06/17/2019 07:52:54,34,0,0,0,0,0,0,0,0,34,3,3,0,0,2
host = usloghost1.corp.example.com sourcetype = webex_cdr
6/17/19
12:00:53.060 AM
testincsso,130878909569842780,Patrick Janesch,"GMT+02:00, Europe (Amsterdam)",22,06/17/2019 07:00:53,06/17/2019 07:04:50,4,0,0,0,0,0,0,4,0,2,3,2,0,1,2
host = usloghost1.corp.example.com sourcetype = webex_cdr
Update:
There are two fields in the events, start time and end time, for every conference, recorded in its local time zone (the event-originating TZ).
Also, _time refers to the Splunk time, which I don't need in this case. What I do have are date_hour, date_minute, date_second, etc., which show the event's local time zone time (China, Europe, Asia, etc.).
So when I sit here in the Pacific TZ and search
index=test "testincsso" | stats count(conferenceID) by _time
over a time interval of the last 4 hours, the output should display the count of conferences by comparing against each event's local TZ time for the last 4 hours.
So do I need to use "| eval hour = strftime(_time,"%H")" or "| eval mytime=_time | convert timeformat="%H" ctime(mytime)" before the stats?
Thanks.
Also, changing the time picker's default behavior may give correct results.
I have events with the fields "start time" and "end time" from different TZs. So when I search events for, e.g., the date range "06-16-2019" using the time picker, I should get all events based on the "start time" field in the events, not on Splunk's _time.
I want to change my Splunk time picker's default behavior so that it returns output based on the events' fields (e.g., "start time" & "end time"). Below is the query I changed in the source XML:
index=test sourcetype=webex "testinc" | eval earliest = $toearliest$ | eval latest=if($tolatest$ < 0, now(), $tolatest$) | eval datefield = strptime($Time$, "%m/%d/%Y %H:%M:%S") | stats count(Conference)
If you have any control over how the logs are generated, it's best to include the time zone as part of the timestamp. For example, "06/17/2019 07:00:53+0200". Then Splunk can easily convert the time.
If that's not an option, perhaps you can specify the time zone when the logs are read. Assuming each log is stored on a system in the originating time zone, the props.conf stanza for the Universal Forwarder should include a TZ attribute telling Splunk where in the world the log is from.
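A minimal sketch of such a stanza, assuming the webex_cdr sourcetype from your samples and an instance that reads the Beijing file (the TZ value must be an IANA zone name, and TIME_FORMAT matches the timestamps in your samples):
# props.conf on the instance that first parses the Beijing log
[webex_cdr]
TIME_FORMAT = %m/%d/%Y %H:%M:%S
TZ = Asia/Shanghai
Each originating system gets its own TZ value, so Splunk converts every event to the correct epoch time at index time.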
If this doesn't help, please edit your question to say what problem you are trying to solve.