how to write splunk query for xml - splunk

<!DOCTYPE EmployeeInventory SYSTEM 'EmployeeInventory.dtd'><EmployeeInventory version="2.0"><ProductInventoryInfo><Product>7781105882846</Product><EmployeeID>12151</EmployeeID><Quantity>28</Quantity><CenterID>167551</CenterID></ProductInventoryInfo></EmployeeInventory>
<!DOCTYPE EmployeeInventory SYSTEM 'EmployeeInventory.dtd'><EmployeeInventory version="2.0"><ProductInventoryInfo><Product>1781305782846</Product><EmployeeID>12152</EmployeeID><Quantity>18</Quantity><CenterID>167552</CenterID></ProductInventoryInfo></EmployeeInventory>
How to write splunk query from above splunk log which will fetch table like this .
Product EmployeeID Quantity CenterID
7781105882846 12151 28 167551
1781305782846 12152 18 167552

It would help to know what you've tried so far and how those attempts failed to meet your needs.
The trick is extracting fields from the XML. You could use a series of rex commands, but spath is simpler.
| makeresults
| eval data="<!DOCTYPE EmployeeInventory SYSTEM 'EmployeeInventory.dtd'><EmployeeInventory version=\"2.0\"><ProductInventoryInfo><Product>7781105882846</Product><EmployeeID>12151</EmployeeID><Quantity>28</Quantity><CenterID>167551</CenterID></ProductInventoryInfo></EmployeeInventory>;<!DOCTYPE EmployeeInventory SYSTEM 'EmployeeInventory.dtd'><EmployeeInventory version=\"2.0\"><ProductInventoryInfo><Product>1781305782846</Product><EmployeeID>12152</EmployeeID><Quantity>18</Quantity><CenterID>167552</CenterID></ProductInventoryInfo></EmployeeInventory>"
| eval data=split(data,";")
| mvexpand data
```The above is just for setting up test data```
```Parse the data```
| spath input=data ```Replace "data" with the name of the field containing the data, perhaps "_raw"```
```Simplify the field names```
| rename EmployeeInventory.ProductInventoryInfo.* as *
```Display the data```
| table Product EmployeeID Quantity CenterID

Related

Regex count capture group members

I have multiple log messages each containing a list of JobIds -
IE -
1. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890039","db7a18ae-ea59-4987-87d5-c80adefa4475"]}`
2. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890040","db7a18ae-ea59-4987-87d5-c80adefa4489"]}`
3. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890070"]}`
I have a rex to get those jobIds. Next I want to count the number of jobIds
My query looks like this -
| rex field=message "\"(?<job_ids>(?:\w+-\w+-\w+-\w+-\w+)+),?\""
| stats count(job_ids)
But this will only give me a count of 3 when I am looking for 5. How can I get a count of all jobIds? I am not sure if this is a splunk limitation or I am missing something in my regex.
Here is my regex - https://regex101.com/r/vqlq5j/1
Also with max-match=0 but with mvcount() instead of mvexpand():
| makeresults count=3 | streamstats count
| eval message=case(count=1, "{\"JobIds\":[\"a1a2a2-b23-b34-d4d4d4\", \"x1a2a2-y23-y34-z4z4z4\"]}", count=2, "{\"JobIds\":[\"a1a9a9-b93-b04-d4d4d4\", \"x1a9a9-y93-y34-z4z4z4\"]}", count=3, "{\"JobIds\":[\"a1a9a9-b93-b04-d14d14d14\"]}")
``` above is test data setup ```
``` below is the actual query ```
| rex field=message max_match=0 "\"(?<id>[\w\d]+\-[\w\d]+\-[\w\d]+\-[\w\d]+\")"
| eval cnt=mvcount(id)
| stats sum(cnt)
In Splunk, to capture multiple matches from a single event, you need to add max_match=0 to your rex, per docs.Splunk
But to get them then separated into a singlevalue field from the [potential] multivalue field job_ids that you made, you need to mvxepand or similar
So this should get you closer:
| rex field=message max_match=0 "\"(?<job_id>(?:\w+-\w+-\w+-\w+-\w+)+),?\""
| mvexpand job_id
| stats dc(job_id)
I also changed from count to dc, as it seems you're looking for a unique count of job IDs, and not just a count of how many in total you've seen
Note: if this is JSON data (and not JSON-inside-JSON) coming into Splunk, and the sourcetype is configured correctly, you shouldn't have to manually extract the multivalue field, as Splunk will do it automatically
Do you have a full set of sample data (a few entire events) you can share?

How to extract values from json array and validate in Splunk

I am new to Splunk, trying to fetch the values from json request body. I am able to fetch values one by one by using
"json_extract(json,path)"
but I have more than 10 fields so I am trying to use
"json_extract(json,path1,path2..pathN)"
which is returning the json array.
But I am not getting how to read the values from json array and check if it is null or not.
eval keyValues ="json_extract(json,"firstname","lastname","dob")"
| table keyValues
output: ["testfirstname","testlastname","1/1/1999"]
["testfirstname","testlastname",null]
[null,"testlastname",null]
Can someone please help how to loop above json array and check the value, if it is null or not(eval isnotnull(firstname))
if your events are in JSON and it is easily accessible through field names then I suggest using the Spath command like below. Just replace your condition with YOUR_REQUIRED_CONDITION.
| makeresults
| eval json="{ \"firstname\": \"firstname123\", \"lastname\": \"lastname123\", \"dob\": \"1/1/1999\"}"
| spath input=json
| where YOUR_REQUIRED_CONDITION
You can also go with your approach but here you have to extract the same 10 fields and need to compare them as per your requirements, see the below code.
| makeresults
| eval json="{ \"firstname\": \"firstname123\", \"lastname\": \"lastname123\", \"dob\": \"1/1/1999\"}"
| eval keyValues =json_extract(json,"firstname","lastname","dob")
| table keyValues
| eval keyValues = json_array_to_mv(keyValues)
| eval firstname=mvindex(keyValues,0),lastname=mvindex(keyValues,1),dob=mvindex(keyValues,2)
| where YOUR_REQUIRED_CONDITION
If you share your SAMPLE events and the exact where conditions with us then we can provide the optimum solution.
Thanks
KV
You don't need to loop through the values, you need to treat them as multivalue fields, and expand/filter appropriately
For example:
index=ndx sourcetype=srctp fieldname.subname{}.value=*
| rename fieldname.subname{}.value as subname
| mvexpand subname
| stats count by subname
| fields - count

Splunk query with conditions of an object

I need a Splunk query to fetch the counts of each field used in my dashboard.
Splunk sample data for each search is like this
timestamp="2022-11-07 02:06:38.427"
loglevel="INFO" pid="1"
thread="http-nio-8080-exec-10"
appname="my-test-app"
URI="/testapp/v1/mytest-app/dashboard-service"
RequestPayload="{\"name\":\"test\",\"number\":\"\"}"
What would a search look like to print a table with the number of times the name and number is used to search data (at a time only either number/name data can be given by user).
Expected output in table format with counts for Name and Number
#Hanuman
Can you please try this? You can change regular expression as per your events and match with JSON data.
YOUR_SEARCH | rex field=_raw "RequestPayload=\"(?<data>.*[}])\""
| spath input=data
|table name number
My Sample Search:
| makeresults | eval _raw="*timestamp=\"2022-11-07 02:06:38.427\" loglevel=\"INFO\" pid=\"1\" thread=\"http-nio-8080-exec-10\" appname=\"my-test-app\" URI=\"/testapp/v1/mytest-app/dashboard-service\" RequestPayload=\"{\"name\":\"test\",\"number\":\"1\"}\"*"
| rex field=_raw "RequestPayload=\"(?<data>.*[}])\""
| spath input=data
|table name number
Screen
Thanks

Splunk left jion is not giving as exepcted

Requirement: I want to find out, payment card information used in a particular day are there any tele sales order placed with the same payment card information.
I tried with below query it is supposed to give me all the payment card information from online orders and matching payment info from telesales. But i am not giving correct results basically results shows there are no telesales for payment information, but when i search splunk i am finding telesales as well. So the query wrong.
index="orders" "Online order received" earliest=-9d latest=-8d
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| rename timestamp as onlineOrderTime
| table payHash, onlineOrderTime
| join type=left payHash [search index="orders" "Telesale order received" earliest=-20d latest=-5m | rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))" | rename timestamp as TeleSaleTime | table payHash, TeleSaleTime]
| table payHash, onlineOrderTime, TeleSaleTime
Please help me in fixing the query or a query to find out results for my requirement.
If you do want to do this with a join, what you had, slightly changed, should be correct:
index="orders" "Online order received" earliest=-9d latest=-8d
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| stats values(_time) as onlineOrderTime by payHash
| join type=left payHash
[search index="orders" "Telesale order received" earliest=-20d latest=-5m
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| rename timestamp as TeleSaleTime
| stats values(TeleSaleTime) by payHash ]
| rename timestamp as onlineOrderTime
Note the added | stats values(...) by in the subsearch: you need to ensure you've removed any duplicates from the list, which this will do. By using values(), you'll also ensure if there're repeated entries for the payHash field, they get grouped together. (Similarly, added a | stats values... before the subsearch to speed the whole operation.)
You should be able to do this without a join, too:
index="orders" (("Online order received" earliest=-9d latest=-8d) OR "Telesale order received" earliest=-20d))
| rex field=_raw "(?<order_type>\w+) order received"
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| stats values(order_type) as order_type values(_time) as orderTimes by payHash
| where mvcount(order_type)>1
After you've ensured your times are correct, you can format them - here's one I use frequently:
| eval onlineOrderTime=strftime(onlineOrderTime,"%c"), TeleSaleTime=strftime(TeleSaleTime,"%c")
You may also need to do further reformatting, but these should get you close
fwiw - I'd wonder why you were trying to look at Online orders from only 9 days ago, but Telesale orders from 20 days ago to now: but that's just me.
The join command expects a list of field names on which events from each search will be matched. If no fields are listed then all fields are used. In the example, the fields 'onlineOrderTime' and 'TeleSaleTime' exist only on one side of the join so no matches can be made. The fix is simple: specify the common field name. ... | join type=left payHash ....
First of all, you can delete the last row | table payHash, onlineOrderTime, TeleSaleTime beacuse it doesn't do anything(the join command already joins both tables you created).
Secondly, when running both queries separately - both queries have the same "payHash"es? both queries return back a table with the true results?
Because by the looks of it, you used the join command correctly...

How to use rex command to extract two fields and chart the count for both in one search query?

I have a log statement like 2017-06-21 12:53:48,426 INFO transaction.TransactionManager.Info:181 -{"message":{"TransactionStatus":true,"TransactioName":"removeLockedUser-1498029828160"}} .
How can i extract TransactionName and TranscationStatus and print in table form TransactionName and its count.
I tried below query but didn't get any success. It is always giving me 0.
sourcetype=10.240.204.69 "TransactionStatus" | rex field=_raw ".TransactionStatus (?.)" |stats count((status=true)) as success_count
Solved it with this :
| makeresults
| eval _raw="2017-06-21 12:53:48,426 INFO transaction.TransactionManager.Info:181 -{\"message\":{\"TransactionStatus\":true,\"TransactioName\":\"removeLockedUser-1498029828160\"}}"
| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"
| rex "{\"TransactionStatus\":(?[^,]),\"TransactioName\":\"(?[^\"])\""
| chart count OVER TransactioName BY TransactionStatus