Splunk search - How to loop on multi values field

Splunk search - How to loop on multi values field - splunk

My use case is analysing ticket in order to attribute a state regarding all the status of a specific ticket.
Raw data look like this :
Id
Version
Status
Event Time
0001
1
New
2021-01-07T09:14:00Z
0001
1
Completed - Action Performed
2021-01-07T09:38:00Z
Data looks like this after transaction command:
Id
Version
Status
Event Time
state
0001, 0001
1, 1
New, Completed - Action Performed
2021-01-07T09:14:00Z, 2021-01-07T09:38:00Z
Acknowlegdement, Work
I'm using transcation command in order to calculate the duration of acknnowlegdement and resolution of the ticket.
I have predefine rule to choose the correct state. This rules compare the n-1 status (New), and the current status (Completed - Action Performed) to choose the state.
Issue
Each ticket has a different number of status. We can not know in advance the max status number. I can not write a static search comparing each value of the Status field.
Expected Solution
I have a field that inform me the number of index on the status (number of status of a ticket) field.
I want to use a loop (Why not a loop for), to iterate on each index of the field Status and compare the value i-1 and i.
I can not find how to do this. Is this possible ?
Thank you

Update to reflect more details
Here's a method with streamstats that should get you towards an answer:
index=ndx sourcetype=srctp Id=* Version=* Status=* EventTime=* state=*
| eval phash=sha256(Version.Status)
| sort 0 _time
| streamstats current=f last(phash) as chash by Id state
| fillnull value="noprev"
| eval changed=if(chash!=phash OR chash="noprev","true","false")
| search NOT changed="false"
| table *
original answer
Something like the following should work to get the most-recent status:
index=ndx sourcetype=srctp Id=* Version=* Status=* EventTime=* state=*
| stats latest(Status) as Status latest(Version) as Version latest(state) state latest(EventTime) as "Event Time" by Id
edit in light of mentioning g the transaction command
Don't use transaction unless you really really really need to.
99% of the time, stats will accomplish what transaction does faster and more efficiently.
For example:
index=ndx sourcetype=srctp Id=* Version=* Status=* EventTime=* state=*
| stats earliest(Status) as eStatus latest(Status) as lStatus earliest(Version) as eVersion latest(Version) as lVersion earliest(status) as estate latest(state) lstate earliest(EventTime) as Opened latest(EventTime) as MostRecent by Id
Will yield a table you can then manipulate further with eval and such. Eg (presuming the time format is subtractable (ie still in Unix epoch format)):
| eval ticketAge=MostRecent-Opened
| eval Versions=eVersion+" - "+lVersion
| eval Statuses=eStatus+" - "+lStatus
| eval State=estate+", ",lstate
| eval Opened=strftime(Opened,"%c"), MostRecent=strftime(MostRecent,"%c")
| eval D=if(ticketAge>86400,round(ticketAge/86400),0)
| eval ticketAge=if(D>0,round(ticketAge-(D*86400)),ticketAge)
| eval H=if(ticketAge>3600,round(ticketAge/3600),0)
| eval ticketAge=if(H>0,round(ticketAge-(H*3600)),ticketAge)
| eval M=if(ticketAge>60,round(ticketAge/60),0)
| eval ticketAge=if(M>0,round(ticketAge-(M*60)),ticketAge)
| rename ticketAge as S
| eval Age=D+" days "+H+" hours"+M+" minutes"+S+" seconds"
| table Id Versions Statuses Opened MostRecent State Age
| rename MostRecent as "Most Recent"
Note: I may have gotten the conversion from raw seconds into days, hours, minutes, seconds off - but it should be close

Related

Regex count capture group members

I have multiple log messages each containing a list of JobIds -
IE -
1. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890039","db7a18ae-ea59-4987-87d5-c80adefa4475"]}`
2. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890040","db7a18ae-ea59-4987-87d5-c80adefa4489"]}`
3. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890070"]}`
I have a rex to get those jobIds. Next I want to count the number of jobIds
My query looks like this -
| rex field=message "\"(?<job_ids>(?:\w+-\w+-\w+-\w+-\w+)+),?\""
| stats count(job_ids)
But this will only give me a count of 3 when I am looking for 5. How can I get a count of all jobIds? I am not sure if this is a splunk limitation or I am missing something in my regex.
Here is my regex - https://regex101.com/r/vqlq5j/1

Also with max-match=0 but with mvcount() instead of mvexpand():
| makeresults count=3 | streamstats count
| eval message=case(count=1, "{\"JobIds\":[\"a1a2a2-b23-b34-d4d4d4\", \"x1a2a2-y23-y34-z4z4z4\"]}", count=2, "{\"JobIds\":[\"a1a9a9-b93-b04-d4d4d4\", \"x1a9a9-y93-y34-z4z4z4\"]}", count=3, "{\"JobIds\":[\"a1a9a9-b93-b04-d14d14d14\"]}")
``` above is test data setup ```
``` below is the actual query ```
| rex field=message max_match=0 "\"(?<id>[\w\d]+\-[\w\d]+\-[\w\d]+\-[\w\d]+\")"
| eval cnt=mvcount(id)
| stats sum(cnt)

In Splunk, to capture multiple matches from a single event, you need to add max_match=0 to your rex, per docs.Splunk
But to get them then separated into a singlevalue field from the [potential] multivalue field job_ids that you made, you need to mvxepand or similar
So this should get you closer:
| rex field=message max_match=0 "\"(?<job_id>(?:\w+-\w+-\w+-\w+-\w+)+),?\""
| mvexpand job_id
| stats dc(job_id)
I also changed from count to dc, as it seems you're looking for a unique count of job IDs, and not just a count of how many in total you've seen
Note: if this is JSON data (and not JSON-inside-JSON) coming into Splunk, and the sourcetype is configured correctly, you shouldn't have to manually extract the multivalue field, as Splunk will do it automatically
Do you have a full set of sample data (a few entire events) you can share?

Extracting a count from raw splunk data by id

I am trying to get a count from transactional information that is retained within raw data in splunk. I have 3-5 transactions that occur.
One has raw data stating: pin match for id 12345678-1234-1234-abcd-12345678abcd or pin mismatched for id etc.
I'm trying to count the number of times the pin match occurs within the transaction time window of 180sec.
I was trying to do something like:
|eval raw=_raw |search index=transa
|eval pinc= if((raw like "%pin match%"),1,0) |stats count(pinc) as Pincount by ID
The issue I'm having is it is counting cumulatively over whatever time I am looking at those transactions. Is there a way to attach it to the ID that is within the message or have it count every one that occurs within that time window?
Thanks!

Presuming the pin status and ID have not been extracted:
index=ndx sourcetype=srctp "pin" "match" OR "mismatched"
| rex field=_raw "pin (?<pin_status>\w+)"
| rex field=_raw "id (?<id>\S+)"
| eval status_time=pin_status+"|"+_time
| stats earliest(status_time) as beginning latest(status_time) as ending by id
| eval beginning=split(beginning,"|"), ending=split(ending,"|")
| eval begining=mvindex(beginning,-1), ending=mvindex(ending,-1)
| table id beginning ending
| sort 0 id
| eval beginning=strftime(beginning,"%c"), ending=strftime(ending,"%c")
After extracting the status ("match" or "mismatched") and the id, append the individual event's _time to the end of the status - we'll pull that value back out after statsing
Using stats, find the earliest and latest status_time entries (fields just created on the previous line) by id, saving them into new fields beginning and ending
Next, split() beginning and ending on the pipe we added to separate the status from the timestamp into a multivalue field
Then assign the last item from the multivalue field (which we know is the timestamp) into itself (because we know that the earliest entry for a status_time should always be "match", and the latest entry for a status_time should always be "mismatched")
Lastly, table the id and time stamps, sort by id, and format the timestamp into something human readable (strftime takes many arguments, %c just happens to be quick)

Using dedup to find unique hosts. How can I find an average for the selected time frame?

The goal is to provide percent availability. I would like to check every 15 minutes if the unique count for server1, server2, and server3 is equal to 3 for each interval (indicating the system is fully healthy). From this count I want to check on the average for whatever time period is selected in splunk to output an average and convert to percent.
index="os" sourcetype=ps host="server1" OR host="server2" OR host="server3"
| search "/logs/temp/random/path" OR "application_listener"
| dedup host
| timechart span=30m count
The count should be 3 for each interval.

It's not clear how much of your requirements the example SPL solves, so I'll assume it does nothing.
Having dedup followed by timechart means the timechart command will only see 3 events - one for each host. That doesn't make for a helpful chart. I suggest using dc(host), instead to get a count of hosts for each interval.
The appendpipe command can be used to add average and percentage values on the end.
index="os" sourcetype=ps host="server1" OR host="server2" OR host="server3"
| search "/logs/temp/random/path" OR "application_listener"
| timechart span=30m dc(host) as count
| appendpipe [ stats avg(count) as Avg | eval Pct=round(Avg*100/3,2) ]

Query for calculating duration between two different logs in Splunk

As part of my requirements, I have to calculate the duration between two different logs using Splunk query.
For example:
Log 2:
2020-04-22 13:12 ADD request received ID : 123
Log 1 :
2020-04-22 12:12 REMOVE request received ID : 122
The common String between two logs is " request received ID :" and unique strings between two logs are "ADD", "REMOVE". And the expected output duration is 1 hour.
Any help would be appreciated. Thanks

You can use the transaction command, https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Transaction
Assuming you have the field ID extracted, you can do
index=* | transaction ID
This will automatically produce a field called duration, which is the time between the first and last event with the same ID

While transaction will work, it's very inefficient
This stats should show you what you're looking for (presuming the fields are already extracted):
(index=ndxA OR index=ndxB) ID=* ("ADD" OR "REMOVE")
| stats min(_time) as when_added max(_time) as when_removed by ID
| eval when_added=strftime(when_added,"%c"), when_removed(when_removed,"%c")
If you don't already have fields extracted, you'll need to modify thusly (remove the "\D^" in the regex if the ID value isn't at the end of the line):
(index=ndxA OR index=ndxB) ("ADD" OR "REMOVE")
| rex field=_raw "ID \s+:\s+(?<ID>\d+)\D^"
| stats min(_time) as when_added max(_time) as when_removed by ID
| eval when_added=strftime(when_added,"%c"), when_removed(when_removed,"%c")

Splunk - Get Prefefined Outputs Based on the event count and event data

I have a query as below. The result is always predefined as -
If the query result has 3 events and if the 3rd event has event="delivered" as value then the whole transaction needs to be returned as "COMPLETE".
If the 3rd event is present and event!="delivered" then the status becomes "PENDING"
If the 3rd event is not present at all, then the transaction is marked as ERROR
My Query -
index=myindex OR index=myindex2 uuid=98as786-ffe6-4de1-929y-080e99bc2e6r (status="202") OR (TransactionStatus="PUBLISHED") | append [search index=myindex2 (logMessage="Producer created new event") event="delivered" OR event="processed" serviceName="abc" [search index=myindex uuid=98as786-ffe6-4de1-929y-080e99bc2e6r AND status="SUCCESS" AND serviceName="abc" | top limit=1 headerId | fields + headerId | rename headerId as message_id]]
Result events -
Event1 - 202 Accepted
Event 2 - Adapter Success
Event 3 - delivered or error or processed
My high level dashboard should look like below -
Complete - 6378638
Pending - 2173
Error - 6356
The unique ID will be the UUID on which the count to be performed.
What can be the possible way we can do this - eval ? Lookup ? not sure as I am new to splunk.
Please let me know if more information is needed if I am missing something.

See if this helps. The terminology in your question is a little inconsistent so you may need to adjust the field names in this query.
index=myindex OR index=myindex2 uuid=98as786-ffe6-4de1-929y-080e99bc2e6r ((status="202") OR (TransactionStatus="PUBLISHED")) OR (index=myindex2 (logMessage="Producer created new event") event="delivered" OR event="processed" serviceName="abc") (index=myindex uuid=98as786-ffe6-4de1-929y-080e99bc2e6r AND status="SUCCESS" AND serviceName="abc" )
| stats count, latest(event) as event by headerId
| eval result=case(count=3 AND event="delivered", "COMPLETE", count=3 AND event!="delivered", "PENDING", count!=3, "ERROR", 1=1, "UNKNOWN")
| stats count by result
| table result count

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Splunk search - How to loop on multi values field - splunk

Related

Regex count capture group members

Extracting a count from raw splunk data by id

Using dedup to find unique hosts. How can I find an average for the selected time frame?

Query for calculating duration between two different logs in Splunk

Splunk - Get Prefefined Outputs Based on the event count and event data

Categories

Resources