Splunk left join is not giving results as expected - splunk

Requirement: for the payment cards used on a particular day, I want to find out whether any telesales orders were placed with the same payment card information.
I tried the query below. It is supposed to give me all the payment card information from online orders, plus the matching payment info from telesales. But it is not giving correct results: the results show no telesales for the payment information, yet when I search Splunk directly I do find telesales. So the query must be wrong.
index="orders" "Online order received" earliest=-9d latest=-8d
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| rename timestamp as onlineOrderTime
| table payHash, onlineOrderTime
| join type=left payHash [search index="orders" "Telesale order received" earliest=-20d latest=-5m | rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))" | rename timestamp as TeleSaleTime | table payHash, TeleSaleTime]
| table payHash, onlineOrderTime, TeleSaleTime
Please help me fix this query, or suggest another query that meets my requirement.

If you do want to do this with a join, what you had, slightly changed, should be correct:
index="orders" "Online order received" earliest=-9d latest=-8d
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| stats values(_time) as onlineOrderTime by payHash
| join type=left payHash
[search index="orders" "Telesale order received" earliest=-20d latest=-5m
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| rename timestamp as TeleSaleTime
| stats values(TeleSaleTime) as TeleSaleTime by payHash ]
Note the added | stats values(...) by payHash in the subsearch: it removes duplicates from the list, and because it uses values(), repeated entries for the same payHash get grouped together. (Similarly, I added a | stats values(...) before the subsearch to speed up the whole operation.)
You should be able to do this without a join, too:
index="orders" (("Online order received" earliest=-9d latest=-8d) OR ("Telesale order received" earliest=-20d))
| rex field=_raw "(?<order_type>\w+) order received"
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| stats values(order_type) as order_type values(_time) as orderTimes by payHash
| where mvcount(order_type)>1
After you've ensured your times are correct, you can format them - here's one I use frequently:
| eval onlineOrderTime=strftime(onlineOrderTime,"%c"), TeleSaleTime=strftime(TeleSaleTime,"%c")
You may also need to do further reformatting, but these should get you close.
FWIW, I'd wonder why you were looking at Online orders from only 9 days ago, but Telesale orders from 20 days ago to now. But that's just me.
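To make the join-free idea concrete, here is a minimal Python sketch of what the `stats ... by payHash | where mvcount(order_type)>1` step does. It is only a model of the logic, not SPL; the event tuples and the function name `cards_used_in_both` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical events: (order_type, payHash, epoch time). Values are invented.
events = [
    ("Online", "abc123", 1000),
    ("Telesale", "abc123", 1500),
    ("Online", "def456", 1100),
    ("Telesale", "zzz999", 1200),
]


def cards_used_in_both(events):
    """Group events by payHash and keep hashes seen with more than one order type."""
    types = defaultdict(set)    # roughly: values(order_type) by payHash
    times = defaultdict(list)   # roughly: values(_time) by payHash
    for order_type, pay_hash, ts in events:
        types[pay_hash].add(order_type)
        times[pay_hash].append(ts)
    # Same filter as `where mvcount(order_type)>1`.
    return {h: sorted(times[h]) for h in types if len(types[h]) > 1}


result = cards_used_in_both(events)
print(result)
```

Only `abc123` survives here, because it is the only hash that appears in both an Online and a Telesale event.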

The join command expects a list of field names on which events from each search will be matched. If no fields are listed then all fields are used. In the example, the fields 'onlineOrderTime' and 'TeleSaleTime' exist only on one side of the join so no matches can be made. The fix is simple: specify the common field name. ... | join type=left payHash ....

First of all, you can delete the last row | table payHash, onlineOrderTime, TeleSaleTime because it doesn't do anything (the join command already joins both tables you created).
Secondly, when you run both queries separately, do they return the same payHash values? Does each query return a table with the correct results?
Because by the looks of it, you used the join command correctly...

Related

Regex count capture group members

I have multiple log messages each containing a list of JobIds -
IE -
1. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890039","db7a18ae-ea59-4987-87d5-c80adefa4475"]}`
2. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890040","db7a18ae-ea59-4987-87d5-c80adefa4489"]}`
3. `{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890070"]}`
I have a rex to get those jobIds. Next I want to count the number of jobIds
My query looks like this -
| rex field=message "\"(?<job_ids>(?:\w+-\w+-\w+-\w+-\w+)+),?\""
| stats count(job_ids)
But this will only give me a count of 3 when I am looking for 5. How can I get a count of all jobIds? I am not sure if this is a splunk limitation or I am missing something in my regex.
Here is my regex - https://regex101.com/r/vqlq5j/1
Also with max_match=0, but with mvcount() instead of mvexpand():
| makeresults count=3 | streamstats count
| eval message=case(count=1, "{\"JobIds\":[\"a1a2a2-b23-b34-d4d4d4\", \"x1a2a2-y23-y34-z4z4z4\"]}", count=2, "{\"JobIds\":[\"a1a9a9-b93-b04-d4d4d4\", \"x1a9a9-y93-y34-z4z4z4\"]}", count=3, "{\"JobIds\":[\"a1a9a9-b93-b04-d14d14d14\"]}")
``` above is test data setup ```
``` below is the actual query ```
| rex field=message max_match=0 "\"(?<id>[\w\d]+\-[\w\d]+\-[\w\d]+\-[\w\d]+\")"
| eval cnt=mvcount(id)
| stats sum(cnt)
In Splunk, to capture multiple matches from a single event, you need to add max_match=0 to your rex, per the Splunk docs.
But to then separate them into single-value fields from the [potentially] multivalue field job_ids that you made, you need to use mvexpand or similar.
So this should get you closer:
| rex field=message max_match=0 "\"(?<job_id>(?:\w+-\w+-\w+-\w+-\w+)+),?\""
| mvexpand job_id
| stats dc(job_id)
I also changed from count to dc, as it seems you're looking for a unique count of job IDs, and not just a count of how many you've seen in total.
Note: if this is JSON data (and not JSON-inside-JSON) coming into Splunk, and the sourcetype is configured correctly, you shouldn't have to manually extract the multivalue field, as Splunk will do it automatically.
Do you have a full set of sample data (a few entire events) you can share?
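The difference between taking only the first match per event and taking every match can be shown outside Splunk. Below is a small Python sketch using the three sample messages from the question; `re.search` stands in for rex without max_match, and `re.findall` for rex with max_match=0. This is an analogy for the behaviour, not Splunk's implementation.

```python
import re

# The three sample messages from the question.
messages = [
    '{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890039","db7a18ae-ea59-4987-87d5-c80adefa4475"]}',
    '{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890040","db7a18ae-ea59-4987-87d5-c80adefa4489"]}',
    '{"JobIds":["661ce07c-b5f3-4b37-8b4c-a0b76d890070"]}',
]

# Five hyphen-separated word groups inside double quotes.
JOB_ID = re.compile(r'"(\w+-\w+-\w+-\w+-\w+)"')

# One match per message, like rex without max_match.
first_only = []
for msg in messages:
    m = JOB_ID.search(msg)
    if m:
        first_only.append(m.group(1))

# Every match per message, like rex with max_match=0.
all_matches = [job_id for msg in messages for job_id in JOB_ID.findall(msg)]

total = len(all_matches)          # comparable to summing mvcount() per event
distinct = len(set(all_matches))  # comparable to dc() after mvexpand

print(len(first_only), total, distinct)  # prints: 3 5 5
```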

Splunk query with conditions of an object

I need a Splunk query to fetch the counts of each field used in my dashboard.
Splunk sample data for each search is like this
timestamp="2022-11-07 02:06:38.427"
loglevel="INFO" pid="1"
thread="http-nio-8080-exec-10"
appname="my-test-app"
URI="/testapp/v1/mytest-app/dashboard-service"
RequestPayload="{\"name\":\"test\",\"number\":\"\"}"
What would a search look like that prints a table with the number of times name and number are each used to search data? (The user can supply only one of name/number at a time.)
Expected output in table format with counts for Name and Number
#Hanuman
Can you please try this? You can change regular expression as per your events and match with JSON data.
YOUR_SEARCH | rex field=_raw "RequestPayload=\"(?<data>.*[}])\""
| spath input=data
|table name number
My Sample Search:
| makeresults | eval _raw="*timestamp=\"2022-11-07 02:06:38.427\" loglevel=\"INFO\" pid=\"1\" thread=\"http-nio-8080-exec-10\" appname=\"my-test-app\" URI=\"/testapp/v1/mytest-app/dashboard-service\" RequestPayload=\"{\"name\":\"test\",\"number\":\"1\"}\"*"
| rex field=_raw "RequestPayload=\"(?<data>.*[}])\""
| spath input=data
|table name number
Thanks
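The search above extracts name and number but does not yet count them. As a rough sketch of the counting step, here is the same logic in Python: pull out the RequestPayload, parse it as JSON, and count which field was non-empty. The two log lines are shortened, hypothetical samples in the shape shown in the question, and the simple unescaping assumes the payload values contain no escaped quotes of their own.

```python
import json
import re
from collections import Counter

# Hypothetical, shortened log lines; quotes inside the payload are backslash-escaped.
LINE_NAME = r'URI="/testapp/v1/mytest-app/dashboard-service" RequestPayload="{\"name\":\"test\",\"number\":\"\"}"'
LINE_NUMBER = r'URI="/testapp/v1/mytest-app/dashboard-service" RequestPayload="{\"name\":\"\",\"number\":\"42\"}"'

# Same idea as the rex in the answer: capture everything up to the last closing brace.
PAYLOAD = re.compile(r'RequestPayload="(?P<data>.*\})"')


def search_field_counts(lines):
    """Count how often each of name/number carried a non-empty value."""
    counts = Counter()
    for line in lines:
        m = PAYLOAD.search(line)
        if not m:
            continue
        # Turn \" back into " so the payload parses as plain JSON.
        payload = json.loads(m.group("data").replace('\\"', '"'))
        for field in ("name", "number"):
            if payload.get(field):
                counts[field] += 1
    return counts


counts = search_field_counts([LINE_NAME, LINE_NAME, LINE_NUMBER])
print(counts["name"], counts["number"])  # prints: 2 1
```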

Splunk - I want to add a value from stats count() to a value from a lookup table and show that value in a table

The objective of the query I'm trying to write is to take a count of raw data from the previous month and add it to a count from a lookup table (.csv).
What I have attempted to do is…
index=*** source=***
| stats count(_raw) as monthCount
| join
[ | inputlookup Log_Count_YTD.csv]
| eval countYTD = toNumber(monthCount) + toNumber(TOTAL_COUNT_YTD)
| table countYTD
This query doesn’t return any value on a table. The TOTAL_COUNT_YTD is the only field from the inputlookup file. Let me know if there is any other information you need to help me out with this one. Thanks!
The stats command transforms the data so it has only 1 field: monthCount. The inputlookup returns only the TOTAL_COUNT_YTD field. The join command works by comparing values of common fields between the main search and the subsearch. Since there are no common fields no events are joined.
There is no need for join in this case. The appendcols command will do, assuming the CSV contains a single field in a single row.
index=*** source=***
| stats count as monthCount
| appendcols
[ | inputlookup Log_Count_YTD.csv]
| eval countYTD = tonumber(monthCount) + tonumber(TOTAL_COUNT_YTD)
| table countYTD
FWIW, the tonumber function is unnecessary, but doesn't hurt.

Join two Splunk queries without predefined fields

I am trying to join 2 splunk queries. However in this case the common string between the 2 queries is not a predefined splunk field and is logged in a different manner. I have created the regex which individually identifies the string but when I try to combine using join, I do not get the result.
I have logs like this -
Logline 1 -
21-04-2019 11:01:02.001 server1 app1 1023456789 1205265352567565 1234567Z-1234-1234-1234-123456789123 Application Completed
Logline 2 -
21-04-2019 11:00:00.000 journey_ends server1 app1 1035625855585989 .....(lots of text) commonID:1234567Z-1234-1234-1234-123456789123 .....(lots of text) status(value) OK
the second Logline can be NOTOK as well
Logline 2 -
21-04-2019 11:00:00.000 journey_ends server1 app1 1035625855585989 .....(lots of text) commonID:1234567Z-1234-1234-1234-123456789123 .....(lots of text) status(value) NOTOK
I have tried multiple things but the best that I can come up with is -
index=test "journey_ends" | rex "status(value) (?<StatusType>[A-Z][A-Z]*)" | rex "commonID\:(?<commonID>[^\t]{37})" | table StatusType, commonID | join type=inner commonID [ search index=test "Application Completed" | rex "^(?:[^\t\n]*\t){7}(?P<commonID>[^\t]+)" | table _time, commonID] | chart count over StatusType by commonID
However the above query does not provide me the stats. In verbose mode, I can just see the events of query 1. Please note that the above 2 queries run correctly individually.
Currently I have to first run one query to fetch the commonIDs from the "Application Completed" logline, and then paste that list of commonIDs into a second query to find the status value for each commonID from logline 2.
Expected Result (in a table):
StatusType commonID
OK 1234567Z-1234-1234-1234-123456789123
NOTOK 1234567Z-1234-1234-1234-985625623541
Can you try the below query,
index=main
AND "Application Completed"
| rex "(?<common_id>[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+)"
| table _time, common_id
| join type=inner common_id [
search index=main
| rex "status\(value\)\s+(?<status>.+)$"
| rex "(?<common_id>[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+)"
| table status, common_id
]
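For anyone who wants to check the regexes before running this in Splunk, the following Python sketch models the same inner join on the extracted ID. The log lines are shortened versions of the samples in the question (the "lots of text" parts are dropped), the third line is an invented NOTOK journey with no matching "Application Completed" line, and the helper names are hypothetical.

```python
import re

# Shortened sample lines; the NOTOK journey has no matching "Application Completed" line.
completed_lines = [
    "21-04-2019 11:01:02.001 server1 app1 1023456789 1205265352567565 "
    "1234567Z-1234-1234-1234-123456789123 Application Completed",
]
journey_lines = [
    "21-04-2019 11:00:00.000 journey_ends server1 app1 1035625855585989 "
    "commonID:1234567Z-1234-1234-1234-123456789123 status(value) OK",
    "21-04-2019 11:00:05.000 journey_ends server1 app1 1035625855585990 "
    "commonID:1234567Z-1234-1234-1234-985625623541 status(value) NOTOK",
]

# Five alphanumeric groups separated by hyphens, as in the [[:alnum:]] rex above.
COMMON_ID = re.compile(r"[0-9A-Za-z]+(?:-[0-9A-Za-z]+){4}")
# The parentheses in "status(value)" must be escaped to match literally.
STATUS = re.compile(r"status\(value\)\s+(\S+)$")


def extract_id(line):
    m = COMMON_ID.search(line)
    return m.group(0) if m else None


def join_on_common_id(completed_lines, journey_lines):
    """Inner join: keep journey rows whose ID also appears in a completed line."""
    completed_ids = {extract_id(line) for line in completed_lines}
    completed_ids.discard(None)
    rows = []
    for line in journey_lines:
        common_id = extract_id(line)
        status = STATUS.search(line)
        if common_id in completed_ids and status:
            rows.append((status.group(1), common_id))
    return rows


rows = join_on_common_id(completed_lines, journey_lines)
print(rows)
```

The date prefix (21-04-2019) has only three hyphen-separated groups, so it does not match the five-group ID pattern.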

Stats Count Splunk Query

I wonder whether someone can help me please.
I'd made the following post about Splunk query I'm trying to write:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html
I received some great help, but despite working on this for a few days now, concentrating on using eval if statements, I still have the same issue with the "Successful" and "Unsuccessful" columns showing blank results. So I thought I'd cast the net a little wider and ask whether someone may be able to look at this and offer some guidance on how I might get around the problem.
Many thanks and kind regards
Chris
I tried exploring your use-case with splunkd-access log and came up with a simple SPL to help you.
In this query I am actually joining the output of 2 searches which aggregate the required results (Not concerned about the search performance).
Give it a try. If you've access to _internal index, this will work as is. You should be able to easily modify this to suit your events (eg: replace user with ClientID).
index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| stats count as All sum(eval(if(status <= 303,1,0))) as Successful sum(eval(if(status > 303,1,0))) as Unsuccessful by user
| join user type=left
[ search index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| chart count BY user status ]
I updated your search from splunk community answers (should look like this):
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| stats count as All sum(eval(if(statusCode <= 303,1,0))) as Successful sum(eval(if(statusCode > 303,1,0))) as Unsuccessful by ClientID
| join ClientID type=left
[ search `w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| chart count BY ClientID statusCode ]
I answered on Splunk Answers:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html?childToView=729492#answer-729492
Using dummy encoding, the search looks like this:
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientId as ClientID, detail.statusCode as Status
| eval X_{Status}=1
| stats count as Total sum(X_*) as X_* by ClientID
| rename X_* as *
Will give you ClientID, count and then a column for each status code found, with a sum of each code in that column.
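If the `eval X_{Status}=1` trick is unfamiliar, this Python sketch models what it produces: one counter column per status code, summed per client, plus a total. The (ClientID, status) pairs are invented sample data and `status_counts_by_client` is a hypothetical name; unlike Splunk, which leaves missing columns empty, this model simply omits keys for codes a client never returned.

```python
from collections import defaultdict

# Hypothetical (ClientID, status code) pairs.
events = [
    ("client-a", 200),
    ("client-a", 200),
    ("client-a", 404),
    ("client-b", 500),
]


def status_counts_by_client(events):
    """Build one row per client with a Total and a count per status code."""
    table = defaultdict(lambda: defaultdict(int))
    for client_id, status in events:
        row = table[client_id]
        row["Total"] += 1      # like: stats count as Total
        row[str(status)] += 1  # like: eval X_{Status}=1, then sum(X_*)
    return {client: dict(row) for client, row in table.items()}


summary = status_counts_by_client(events)
print(summary)
```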
As I gather you can't get this working, this query should show dummy encoding in action
index=_internal sourcetype=*access
| eval X_{status}=1
| stats count as Total sum(X_*) as X_* by source, user
| rename X_* as *
This would give an output of something like