Multisearch not doing what I expect - splunk

The message format we chose uses a field called scope to control the level of aggregation you want (by request_type, site, zone, or cluster). The scope is set with a dropdown and passed in as a token. I wanted to use multisearch to coalesce the results of 4 different searches, so that if the scope was site, only the results from the site search would be shown.
Actual Search:
index=cloud_aws namespace=cloudship lambda=SCScloudshipStepFunctionStats metric_type=*_v0.3
| spath input=message
| multisearch
[search $request_type_token$ | where "$scope_token$" == "request_type" ]
[search $request_type_token$ $site_token$ | where "$scope_token$" == "site"]
[search $request_type_token$ $site_token$ $zone_token$ | where "$scope_token$" == "zone"]
[search scope=$scope_token$ $request_type_token$ $site_token$ $zone_token$ $cluster_token$ | where "$scope_token$" == "cluster"]
| timechart cont=FALSE span=$span_token$ sum(success) by request_type
The search after token substitution, with literal values:
index=cloud_aws namespace=cloudship lambda=SCScloudshipStepFunctionStats metric_type=*_v0.3
| spath input=message
| multisearch
[search request_type="*"
| where "site" == "request_type" ]
[search request_type="*" site="RTP"
| where "site" == "site"]
[search request_type="*" site="RTP" zone="*"
| where "site" == "zone"]
[search scope=site request_type="*" site="RTP" zone="*" cluster="*"
| where "site" == "cluster"]
| timechart cont=FALSE span=hour sum(success) by request_type
BUT ... the results of this query are equivalent to running no search at all; basically, nothing gets filtered.
index=cloud_aws namespace=cloudship lambda=SCScloudshipStepFunctionStats metric_type=*_v0.3
| spath input=message
| timechart cont=FALSE span=hour sum(success) by request_type
This query and the one above give the same result. What am I missing here? When I execute each part of the multisearch separately, the results are correct: I get empty results for all but the 'where "site" == "site"' search. But when I run the whole query, I get no filtering at all. Help!

First, I think what you're looking for is for the value of site to match request_type (in the initial multisearch search line) - but what you're actually checking in the where clause is whether the text "site" equals the text "request_type". And, of course, that is never the case!
Start by removing the second subsearch of the multisearch (since comparing "site" to "site" will always be true), and use upper() and match():
index=cloud_aws namespace=cloudship lambda=SCScloudshipStepFunctionStats metric_type=*_v0.3
| spath input=message
| multisearch
[search request_type="*" site=*
| eval request_type=upper(request_type), site=upper(site)
| where match(site, request_type) ]
[search request_type="*" site="RTP" zone="*"
| eval zone=upper(zone), site=upper(site)
| where match(site,zone)]
[search scope=site request_type="*" site="RTP" zone="*" cluster="*"
| eval cluster=upper(cluster), site=upper(site)
| where match(site,cluster)]
| timechart cont=FALSE span=hour sum(success) by request_type
(In the last subsearch it would be even easier to do cluster="rtp" instead of cluster=*, but I've kept the upper()/match() idiom for reading consistency.)
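The literal-versus-field distinction is the whole bug; a quick Python analogy (rows invented, not from the actual data) shows why the quoted comparison can never filter per event:

```python
# Sample rows are invented for illustration; only the comparison logic matters.
rows = [{"request_type": "GET", "site": "RTP"},
        {"request_type": "RTP", "site": "RTP"}]

# Comparing two quoted literals never looks at the row, so the predicate is
# constant: it keeps everything or nothing, regardless of the data.
literal_result = [r for r in rows if "site" == "request_type"]

# Comparing field *values* varies per row, which is what the corrected
# search does with upper() and match().
field_result = [r for r in rows if r["site"] == r["request_type"]]
```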

Related

How better I can optimize this Kusto Query to get my logs

I have the below query, which I am running to get logs for Azure K8s, but it takes an hour to generate the logs, and I am hoping there is a better way to write what I have already written. Can some Kusto experts advise how I can improve the performance?
AzureDiagnostics
| where Category == 'kube-audit'
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27")))
| where (strlen(log_s) >= 32000
and not(log_s has "aksService")
and not(log_s has "system:serviceaccount:crossplane-system:crossplane"))
or strlen(log_s) < 32000
| extend op = parse_json(log_s)
| where not(tostring(op.verb) in ("list", "get", "watch"))
| where substring(tostring(op.responseStatus.code), 0, 1) == "2"
| where not(tostring(op.requestURI) in ("/apis/authorization.k8s.io/v1/selfsubjectaccessreviews"))
| extend user = op.user.username
| extend decision = tostring(parse_json(tostring(op.annotations))["authorization.k8s.io/decision"])
| extend requestURI = tostring(op.requestURI)
| extend name = tostring(parse_json(tostring(op.objectRef)).name)
| extend namespace = tostring(parse_json(tostring(op.objectRef)).namespace)
| extend verb = tostring(op.verb)
| project TimeGenerated, SubscriptionId, ResourceId, namespace, name, requestURI, verb, decision, ['user']
| order by TimeGenerated asc
You could try starting your query as follows.
Please note the additional condition at the end.
AzureDiagnostics
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27")))
| where Category == 'kube-audit'
| where log_s hasprefix '"code":2'
I assumed that code is an integer; in case it is a string, use the following (note the added quote):
| where log_s hasprefix '"code":"2'
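The reasoning behind the extra where is the usual cheap-prefilter-before-parse pattern: discard rows with a raw substring test before paying for JSON parsing. A rough Python analogy (log lines invented):

```python
import json

logs = [
    '{"responseStatus": {"code": 200}, "verb": "create"}',
    '{"responseStatus": {"code": 404}, "verb": "get"}',
]

# Cheap substring check first: lines that cannot contain a 2xx code are
# skipped before the (relatively expensive) JSON parse.
candidates = [line for line in logs if '"code": 2' in line]
parsed = [json.loads(line) for line in candidates]
codes = [p["responseStatus"]["code"] for p in parsed]
```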

how to write splunk query to create a dashboard

I have a Splunk log which contains messages at different timestamps with a case number:
"message":"Welcome home user case num 1ABCD-201901-765-2  UserId - 1203 XV - 543 UserAd - 76542 Elect - 5789875 Later Code - QWERZX"
In the log below, a few messages also get printed at different timestamps if certain conditions are met:
"message":"Passed First class case num 1ABCD-201901-765-2"
"message":"Failed First class case num 1ABCD-201901-765-2"
"message":"Passed Second class case num 1ABCD-201901-765-2"
"message":"Fully Failed case num 1ABCD-201901-765-2"
"message":"Saved case num 1ABCD-201901-765-2"
"message":"Not saved case num 1ABCD-201901-765-2"
"message":"Not user to us case num 1ABCD-201901-765-2"
I want to create a table in a Splunk dashboard, using a Splunk query, that lists all the case numbers with these details:
Case Num | XV | UserId | UserAd | Elect | Later Code | Passed First class | Passed Second class | Failed First class | Saved | Not saved | Not user to us
How do I print true and false for the columns Passed First class | Passed Second class | Failed First class | Saved | Not saved | Not user to us? For each case num, I want to check whether it is present in those logs; if it is, print true for that column, else false.
I'm going to presume you have no field extractions built yet (except for message) for the sample data you provided, and that - as provided - it's in the correct format (though, since it seems to be missing timestamps, I can tell something is likely amiss).
This should get you down the right road:
index=ndx sourcetype=srctp message=*
| rex field=message "Passed (?<passed_attempt>\w+)"
| rex field=message "Failed (?<failed_attempt>\w+)"
| rex field=message "case num (?<case_num>\S+)"
| rex field=message "(?<saved>Not saved)"
| rex field=message "(?<saved>Saved)"
| rex field=message "UserId - (?<userid>\w+)"
| rex field=message "XV - (?<xv>\w+)"
| rex field=message "UserAd - (?<userad>\w+)"
| rex field=message "Elect - (?<elect>\w+)"
| rex field=message "Later Code - (?<later_code>\w+)"
| fields passed_attempt failed_attempt _time case_num xv userid elect later_code saved userad
| stats max(_time) as _time values(*) as * by userid case_num
I've used separate regular expressions to pull the fields because they're easier to read - they may (or may not) be more performant to combine.
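For anyone wanting to sanity-check the patterns outside Splunk, the same extractions can be reproduced with Python's re module against the sample message from the question:

```python
import re

message = ("Welcome home user case num 1ABCD-201901-765-2  UserId - 1203 "
           "XV - 543 UserAd - 76542 Elect - 5789875 Later Code - QWERZX")

# Mirrors of the rex patterns above, one search per field.
case_num   = re.search(r"case num (\S+)", message).group(1)
userid     = re.search(r"UserId - (\w+)", message).group(1)
xv         = re.search(r"XV - (\w+)", message).group(1)
later_code = re.search(r"Later Code - (\w+)", message).group(1)
```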

Display result count of multiple search query in Splunk table

I want to display a table in my dashboard with 3 columns called Search_Text, Count, Count_Percentage
How do I formulate the Splunk query so that I can display 2 search queries with their result counts and percentages in table format?
Example,
Heading Count Count_Percentage
SearchText1 4 40
SearchText2 6 60
The below query will create a column named SearchText1, which is not what I want:
index=something "SearchText1" | stats count AS SearchText1
Put each query after the first in an append and set the Heading field as desired. Then use the stats command to count the results and group them by Heading. Finally, get the total and compute percentages.
index=foo "SearchText1" | eval Heading="SearchText1"
| append [ search index=bar "SearchText2" | eval Heading="SearchText2" ]
| stats count as Count by Heading
| eventstats sum(Count) as Total
| eval Count_Percentage=(Count*100/Total)
| table Heading Count Count_Percentage
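The eventstats/eval steps are just shared-total percentage arithmetic; with the example counts from the question, the math works out as this Python sketch shows:

```python
counts = {"SearchText1": 4, "SearchText2": 6}  # example counts from the question

# eventstats sum(Count) as Total computes the shared total...
total = sum(counts.values())
# ...and eval Count_Percentage=(Count*100/Total) derives each percentage.
table = [(heading, n, n * 100 / total) for heading, n in counts.items()]
```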
Showing the absence of search results is a little tricky and changes the above query a bit. Each search will need its own stats command and an appendpipe command to detect the lack of results and create some. Try this:
index=main "SearchText1"
| eval Heading="SearchText1"
| stats count as Count by Heading
| appendpipe
[ stats count
| eval Heading="SearchText1", Count=0
| where count==0
| fields - count]
| append
[| search index=main "SearchText2"
| eval Heading="SearchText2"
| stats count as Count by Heading
| appendpipe
[ stats count
| eval Heading="SearchText2", Count=0
| where count==0
| fields - count] ]
| eventstats sum(Count) as Total
| eval Count_Percentage=(Count*100/Total)
| table Heading Count Count_Percentage

Splunk dbxquery merge with splunk search

I am trying to merge a Splunk search query with a database query result set. Basically I have a Splunk dbxquery 1 which returns userid and email from the database as follows, for a particular user id:
| dbxquery connection="CMDB009" query="SELECT dra.value, z.email FROM DRES_PRINTABLE z, DRES.CREDENTIAL bc, DRES.CRATTR dra WHERE z.userid = bc.drid AND z.drid = dra.dredid AND dra.value in ('xy67383') "
Above query outputs
VALUE EMAIL
xv67383 xyz@test.com
Another query is a Splunk query 2 that provides the user ids as follows:
index=index1 (host=xyz OR host=ABC) earliest=-20m@m
| rex field=_raw "samlToken\=(?<user>.+?):"
| join type=outer usetime=true earlier=true username,host,user
[search index=index1 source="/logs/occurences.log" SERVER_SERVER_CONNECT NOT AMP earliest=@w0
| rex field=_raw "Origusername\((?<username>.+?)\)"
| rex field=username "^(?<user>.+?)\:"
| rename _time as epoch1]
| stats count by user | sort -count | table user
This query 2 returns a column called user, but not email.
What I want to do is add an email column, taken from dbxquery 1 and matched by userid, to the output of query 2 - basically, add email as an additional field for each user returned in query 2.
What I tried so far is this, but it does not give me any results. Any help would be appreciated.
index=index1 (host=xyz OR host=ABC) earliest=-20m@m
| rex field=_raw "samlToken\=(?<user>.+?):"
| join type=outer usetime=true earlier=true username,host,user
[search index=index1 source="/logs/occurences.log" SERVER_SERVER_CONNECT NOT AMP earliest=@w0
| rex field=_raw "Origusername\((?<username>.+?)\)"
| rex field=username "^(?<user>.+?)\:"
| rename _time as epoch1]
| stats count by user | sort -count
| table user
| map search="| dbxquery connection=\"CMDB009\" query=\"SELECT dra.value, z.email FROM DRES_PRINTABLE z, DRES.CREDENTIAL bc, DRES.CRATTR dra WHERE z.userid = bc.drid AND z.drid = dra.dredid AND dra.value in ('$user')\""
Replace $user with $user$ in the map command. Splunk uses a $ on each end of a token.
The username field is not available at the end of the query because the stats command stripped it out. The only fields available after stats are the ones mentioned in the command (user and count in this case). To make the username field available, add it to the stats command. That may, however, change your results.
| rex field=_raw "samlToken\=(?<user>.+?):"
| join type=outer usetime=true earlier=true username,host,user
[search index=index1 source="/logs/occurences.log" SERVER_SERVER_CONNECT NOT AMP earliest=@w0
| rex field=_raw "Origusername\((?<username>.+?)\)"
| rex field=username "^(?<user>.+?)\:"
| rename _time as epoch1]
| stats count by user, username | sort -count
| table user, username
| map search="| dbxquery connection=\"CMDB009\" query=\"SELECT dra.value, z.email FROM DRES_PRINTABLE z, DRES.CREDENTIAL bc, DRES.CRATTR dra WHERE z.userid = bc.drid AND z.drid = dra.dredid AND dra.value in ('$user$')\""

Splunk get inner Query results with in the time frame provided by outer Query

Successfully scheduled PushNotification in UserMessageChanelMap LINK_MORE_ACCOUNTS
| eval fields=split(_raw,"|")
| eval messageKey_=mvindex(fields,2)
| eval num=mvindex(fields,5)
| table messageKey_, num
| eval scheduledDate = replace(num, "scheduledDate:", "")
| eval messageKey = replace(messageKey_,"messageKey:","")
| eval newTS=strftime(strptime(scheduledDate, "%a %b %d %H:%M:%S %Z %Y"), "%Y-%m-%d %H:%M:%S")
| stats count by newTS,messageKey
| stats min(newTS) as fromScheduledDate, max(newTS) as toScheduledDate
| appendcols [search ("Could not send PushNotification") messageKey:LINK_MORE_ACCOUNTS NOT ("*|reason:Failed to Deliver|")
| extract pairdelim="|" kvdelim=":"
| table userId,userMessageId,messageKey
| stats count by userId,userMessageId,messageKey
| table userId,userMessageId, messageKey
| stats count as pushFallOffPoints by messageKey ]
Here I want to run my subquery within the time range of fromScheduledDate - toScheduledDate. I was trying to pass these dates to earliest and latest, but that did not work. Help is appreciated.
Subsearches run first so there is no such thing as passing fields into a subsearch. A subsearch, however, can return fields to the main search using the format or return command. Run the subsearch by itself to see what exactly it returns and to verify the returned string makes sense when combined with the main search.
I was able to figure out the solution
( [ search Successfully scheduled PushNotification in LINK_MORE_ACCOUNTS | eval fields=split(_raw,"|") | eval messageKey_=mvindex(fields,2) | eval num=mvindex(fields,5) | table messageKey_, num | eval scheduledDate = replace(num, "scheduledDate:", "") | eval messageKey = replace(messageKey_,"messageKey:","") | eval newTS=strptime(scheduledDate, "%a %b %d %H:%M:%S %Z %Y") | stats count by newTS,messageKey | stats min(newTS) as earliest | return earliest ]
, [ search Successfully scheduled PushNotification in UserMessageChanelMap LINK_MORE_ACCOUNTS | eval fields=split(_raw,"|") | eval messageKey_=mvindex(fields,2) | eval num=mvindex(fields,5) | table messageKey_, num | eval scheduledDate = replace(num, "scheduledDate:", "") | eval messageKey = replace(messageKey_,"messageKey:","") | eval newTS=strptime(scheduledDate, "%a %b %d %H:%M:%S %Z %Y") | stats count by newTS,messageKey | stats max(newTS) as latest | return latest] )
( container_name="ace-service") ("Could not send PushNotification") messageKey:LINK_MORE_ACCOUNTS NOT ("*|reason:Failed to Deliver|") | extract pairdelim="|" kvdelim=":" | table userId,userMessageId,messageKey| stats count by userId,userMessageId,messageKey | table userId,userMessageId, messageKey | stats count as pushFallOffPoints by messageKey
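The return earliest / return latest subsearches hand epoch values to the outer search, and that hinges on the strptime conversion working. A small Python check of the same format string (timestamps invented):

```python
from datetime import datetime, timezone

fmt = "%a %b %d %H:%M:%S %Z %Y"  # same format as the strptime in the search
scheduled = ["Sat Mar 26 10:00:00 UTC 2022",
             "Sun Mar 27 18:30:00 UTC 2022"]

# Parse each timestamp, pin it to UTC, and convert to an epoch value,
# mirroring what strptime produces for the min()/max() stats.
epochs = [datetime.strptime(s, fmt).replace(tzinfo=timezone.utc).timestamp()
          for s in scheduled]
earliest, latest = min(epochs), max(epochs)
```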