Use Parameters in Table in Search Query in Splunk

I have a saved table dataset in Splunk. When I choose to "Investigate in Search" this table dataset, I see
sample 1
| from datamodel:"My_Table_ForDay"
The SPL My_Table_ForDay looks like the following:
sample 2
index="my_index"
sourcetype="*"
earliest=@d
latest=now
| fields
_time
statusCode
result
| table
_time
statusCode
result
I would like to reuse My_Table_ForDay for separate days. In other words, I would like to pass a value to the datamodel that's used in the query. I want to use a parameter for the earliest attribute. For example, I would pass the following parameter values:
For today: @d
For yesterday: -1d@d
Two days ago: -2d@d
How do I a) pass a value from sample 1 and b) use a parameter in sample 2?
Thank you.

The from command does not support passing arguments. The savedsearch command does, however. You could save sample 2 as the following saved search:
index="my_index"
sourcetype="*"
earliest=$earliest_time$
latest=now
| fields
_time
statusCode
result
| table
_time
statusCode
result
And then invoke it using `| savedsearch My_Table_ForDay earliest_time="@d"`. See https://docs.splunk.com/Documentation/Splunk/8.2.6/SearchReference/Savedsearch for details.
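As a rough sketch of how the three cases from the question might be invoked, assuming the saved search is named My_Table_ForDay, contains the $earliest_time$ placeholder shown above, and that the intended snap-to-day modifier is Splunk's @d. Each line is a separate search (today, yesterday, two days ago):

```spl
| savedsearch My_Table_ForDay earliest_time="@d"

| savedsearch My_Table_ForDay earliest_time="-1d@d"

| savedsearch My_Table_ForDay earliest_time="-2d@d"
```

Because latest=now is hard-coded in the saved search, passing -1d@d returns everything from the start of yesterday up to now. Limiting the result to a single day would need a second placeholder for latest (say, a hypothetical $latest_time$) passed the same way.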

Related

Extracting a count from raw splunk data by id

I am trying to get a count from transactional information that is retained within raw data in splunk. I have 3-5 transactions that occur.
One has raw data stating: pin match for id 12345678-1234-1234-abcd-12345678abcd or pin mismatched for id etc.
I'm trying to count the number of times the pin match occurs within the transaction time window of 180sec.
I was trying to do something like:
|eval raw=_raw |search index=transa
|eval pinc= if((raw like "%pin match%"),1,0) |stats count(pinc) as Pincount by ID
The issue I'm having is it is counting cumulatively over whatever time I am looking at those transactions. Is there a way to attach it to the ID that is within the message or have it count every one that occurs within that time window?
Thanks!
Presuming the pin status and ID have not been extracted:
index=ndx sourcetype=srctp "pin" "match" OR "mismatched"
| rex field=_raw "pin (?<pin_status>\w+)"
| rex field=_raw "id (?<id>\S+)"
| eval status_time=pin_status."|"._time
| stats earliest(status_time) as beginning latest(status_time) as ending by id
| eval beginning=split(beginning,"|"), ending=split(ending,"|")
| eval beginning=mvindex(beginning,-1), ending=mvindex(ending,-1)
| table id beginning ending
| sort 0 id
| eval beginning=strftime(beginning,"%c"), ending=strftime(ending,"%c")
After extracting the status ("match" or "mismatched") and the id, append the individual event's _time to the end of the status; we'll pull that value back out after running stats.
Using stats, find the earliest and latest status_time entries (the field created on the previous line) by id, saving them into the new fields beginning and ending.
Next, split() beginning and ending on the pipe we added between the status and the timestamp, turning each into a multivalue field.
Then assign the last item of each multivalue field (which we know is the timestamp) back into the field itself (this presumes the earliest status_time entry is always "match" and the latest is always "mismatched").
Lastly, table the id and timestamps, sort by id, and format the timestamps into something human readable (strftime accepts many format specifiers; %c just happens to be quick).
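To check the two rex extractions before running them against real data, a minimal sketch using makeresults can feed one of the sample messages from the question through them. The event is synthetic and nothing here touches an index:

```spl
| makeresults
| eval _raw="pin match for id 12345678-1234-1234-abcd-12345678abcd"
| rex field=_raw "pin (?<pin_status>\w+)"
| rex field=_raw "id (?<id>\S+)"
| table _raw pin_status id
```

For this message, pin_status should come out as match and id as the full identifier. Swapping in a message that starts with "pin mismatched" should yield mismatched instead.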

Filtering out holidays in Splunk

I am attempting to use a search lookup table csv to filter out holidays for some Splunk queries.
To do this, I created a holidays.csv in the following style:
dateof,dateafter,description
01/17/2022,01/18/2022,MLK Day 22
02/21/2022,02/22/2022,Presidents Day 22
05/30/2022,05/31/2022,Memorial Day 22
[...]
Some of the queries run the day after the holiday, which is why I created both a dateof and a dateafter field. I am trying to pipe the condition onto the end of the existing queries.
environment=staging "message=\"This line would contain the original search query"
| eval date=strftime(_time,"%m/%d/%Y")
| search NOT [ inputlookup holidays.csv | fields dateof ]
Note that an Event from the original query will look something like this:
time=2022-08-31T12:01:39,495Z [...] message="This line would contain the original search query"
Despite giving read access to the csv, the above condition does not filter out anything, regardless of whether it is a holiday listed in the csv file or not.
I suspect something is missing. However, I have limited knowledge of the Splunk query language. Would anyone be able to give guidance on this? Thanks in advance!
Subsearches can be tricky. If the result of the subsearch isn't just right, the query will fail. It helps to run the subsearch by itself to confirm the string produced makes sense as part of a query. In this case, check that
| inputlookup holidays.csv | fields dateof | rename dateof as date | format
produces something that works with search NOT.
An alternative to try is to explicitly look for the date field in the lookup.
environment=staging "message=\"This line would contain the original search query"
| eval date=strftime(_time,"%m/%d/%Y")
| where NOT [ | inputlookup holidays.csv | fields dateof | rename dateof as date | format ]
Here is another way to do it without a subsearch. A null description field tells us the date was not found in the lookup file and so is not a holiday.
environment=staging "message=\"This line would contain the original search query"
| eval date=strftime(_time,"%m/%d/%Y")
| lookup holidays.csv dateof as date output description
| where isnull(description)
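For the queries that run the day after a holiday, the same lookup approach could presumably be pointed at the dateafter column instead. This is a sketch under the assumption that holidays.csv has the dateof, dateafter, and description columns shown in the question:

```spl
environment=staging "message=\"This line would contain the original search query"
| eval date=strftime(_time,"%m/%d/%Y")
| lookup holidays.csv dateafter AS date OUTPUT description
| where isnull(description)
```

Here an event survives only if its date does not appear in the dateafter column, so events logged on the day following a listed holiday are dropped.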

Merge url with parameters into 1 in Splunk

I am creating a dashboard for our service. And I want to create metrics for url requests.
Let's say I have a URL like this one:
/api/v1/users/{userId}/settings
And I have following query in Splunk
url=*/api/v1/users/*/settings
| stats avg(timeTaken) as avg_latency, p99(timeTaken) as "p99(ms)", perc75(timeTaken) as "p75(ms)", count as total_requests, count(eval(responseStatus=500)) as failed_requests by url
| eval "success_rate"=round((total_requests - failed_requests)/total_requests*100,2)
| eval avg_latency = round(avg_latency)
| sort success_rate
All I want is to have a table with one common url showing all the metrics. But instead, I get a table with a list of all urls with different parameters.
You want to create a field that is the URL minus the userId part, so that stats groups by which URL is called.
You can do this by using split(url,"/") to turn the URL into a multivalue field, and then take out the userId in one of two ways, depending on the URLs:
With mvfilter, e.g. mvfilter(NOT match(parts,"^\d+$")), where parts is the field returned by split and the userId is assumed to be the only purely numeric segment.
Or create a new multivalue field with the userId removed by its index in the multivalue field (for example with mvindex).
Instead of removing it, you could also replace the userId with "{userId}", as long as you do the same for all URLs.
Then you can rejoin the URL using mvjoin(parts,"/").
I hope I understood your question correctly and this helps you!
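The split, filter, and rejoin steps described above could look roughly like this. It is a sketch only: parts is a hypothetical field name, the sample URL is made up, and the filter assumes the userId is the only purely numeric path segment:

```spl
| makeresults
| eval url="/api/v1/users/12345/settings"
| eval parts=split(url,"/")
| eval parts=mvfilter(NOT match(parts,"^\d+$"))
| eval url=mvjoin(parts,"/")
| table url
```

In a real search the first two lines would be replaced by the base search, and the stats ... by url clause would follow the mvjoin. It is worth confirming on real data that the leading slash survives the split and rejoin.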
You could try doing a replace() on your URL field with eval before calling stats:
| eval url=replace(url,"\/\d+\/settings","/settings")
If it turns out the userid is important to hold onto, pull it into its own field prior to running replace():
| rex field=url "\/(?<userid>\d+)\/settings"
Expanding on this in response to a comment: for multiple possible endings of your URL, try something like this:
index=ndx sourcetype=srctp url IN ("*/api/v1/users/*/settings","*/api/v1/users/*/logout","*/api/v1/users/*/profile")
| rex field=url "\/(?<url_type>\w+)$"
| eval url=replace(url,"\/\d+\/\w+$","")
| stats avg(timeTaken) as avg_latency, p99(timeTaken) as "p99(ms)", perc75(timeTaken) as "p75(ms)", count as total_requests, count(eval(responseStatus=500)) as failed_requests by url url_type
| eval "success_rate"=round((total_requests - failed_requests)/total_requests*100,2)
| eval avg_latency = round(avg_latency)
| sort success_rate
This will extract the "type" (logout, profile, settings) into a new field, url_type, then clean up the URL by removing everything from the userid to the end.

Take output from query and use in subsequent KQL query

I'm using Azure Log Analytics to review certain events of interest.
I would like to obtain timestamps from data that meets a certain criteria, and then reuse these timestamps in further queries, i.e. to see what else occurred around these times.
The following query returns the desired results, but I'm stuck at how to use the interestingTimes var to then perform further searches and show data within X minutes of each previously returned timestamp.
let interestingTimes =
Event
| where TimeGenerated between (datetime(2021-04-01T11:57:22) .. datetime('2021-04-01T15:00:00'))
| where EventID == 1
| parse EventData with * '<Data Name="Image">' ImageName "<" *
| where ImageName contains "MicrosoftEdge.exe"
| project TimeGenerated
;
Any pointers would be greatly appreciated.
interestingTimes will only be available for use in the query where you declare it. You can't use it in another query, unless you define it there as well.
By the way, you can make your query much more efficient by adding a filter that utilizes the built-in index for the EventData column, so that the parse operator runs on a much smaller number of records:
let interestingTimes =
Event
| where TimeGenerated between (datetime(2021-04-01T11:57:22) .. datetime('2021-04-01T15:00:00'))
| where EventID == 1
| where EventData has "MicrosoftEdge.exe" // <-- OPTIMIZATION that will filter out most records
| parse EventData with * '<Data Name="Image">' ImageName "<" *
| where ImageName contains "MicrosoftEdge.exe"
| project TimeGenerated
;

How to aggregate logs by field and then by bin in AWS CloudWatch Insights?

I'm trying to write a query that first aggregates by field count and then by bin(1h). For example, I would like to get a result like:
# Date Field Count
1 2019-01-01T10:00:00.000Z A 123
2 2019-01-01T11:00:00.000Z A 456
3 2019-01-01T10:00:00.000Z B 567
4 2019-01-01T11:00:00.000Z B 789
Not sure if it's possible though, the query should be something like:
fields Field
| stats count() by Field by bin(1h)
Any ideas how to achieve this?
Is this what you need?
fields Field | stats count() by Field, bin(1h)
If you want to create a line chart, you can do it by separately counting each value that your field could take.
fields
Field = 'A' as is_A,
Field = 'B' as is_B
| stats sum(is_A) as A, sum(is_B) as B by bin(1hour)
This solution requires your query to include a string literal of each value ('A' and 'B' in OP's example). It works as long as you know what those possible values are.
This might be what Hugo Mallet was looking for, except the avg() function won't work here so he'd have to calculate the average by dividing by a total
Not able to group by a certain field and create visualizations.
fields Field
| stats count() by Field, bin(1h)
Keep getting this message
No visualization available. Try this to get started:
stats count() by bin(30s)