I am able to get multiple events (API logs) in a Splunk dashboard, like below:
event-1:
{ "corrId":"12345", "traceId":"srh-1", "apiName":"api1" }
event-2:
{ "corrId":"69863", "traceId":"srh-2", "apiName":"api2" }
event-3:
{ "corrId":"12345", "traceId":"srh-3", "apiName":"api3" }
I want to retrieve corrId (e.g. "corrId":"12345") dynamically from one event (API log) by providing apiName, and then build a Splunk search query based on the retrieved corrId value, so that it pulls all the event logs which contain the same corrId ("corrId":"12345").
Output
In the above scenario, the expected results would be like below:
event-1:
{ "corrId":"12345", "traceId":"srh-1", "apiName":"api1" }
event-3:
{ "corrId":"12345", "traceId":"srh-3", "apiName":"api3" }
I am new to Splunk, so please help me out here: how do I fetch "corrId":"12345" dynamically by providing another field like apiName, and build a Splunk search query based on that?
I have tried the following, but with no luck.
index = "test_srh source=policy.log [ search index = "test_srh source=policy.log | rex field=_raw "apiName":|s+"(?[^"]+)" | search name="api1" | table corrId]
This query gives the event-1 log only, but we need all the other events which contain the same corrId ("corrId":"12345"). Appreciate quick help here.
Given you're explicitly extracting the apiName field, I'll assume the corrId field is not automatically extracted, either. That means putting corrId="12345" in the base query won't work. Try index=test_srh source=policy.log corrId="12345" to verify that.
If the corrId field needs to be extracted, then try this query:
index=test_srh source=policy.log
| rex "corrId\\":\\"(?<corrId>[^\\"]+)"
| where [ search index = "test_srh source=policy.log
| rex "apiName\":\"(?<name>[^\"]+)"
| search name="api1"
| rex "corrId\\":\\"(?<corrId>[^\\"]+)"
| fields corrId | format ]
Note: I also corrected the regex to properly extract the apiName field.
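To see why this works: with your sample data, the subsearch finds api1's event, extracts corrId=12345, and format turns that into a search-string fragment, so the outer query effectively becomes

index=test_srh source=policy.log
| rex "corrId\":\"(?<corrId>[^\"]+)"
| where ( ( corrId="12345" ) )

which returns event-1 and event-3, as expected.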
First Event
17:09:05:362 INFO com.a.b.App - Making a GET Request and req-id: [123456]
Second Event
17:09:06:480 INFO com.a.b.App - Output Status Code: 200 req-id:"123456"
I tried to use index="xyz" container="service-name" | transaction "req-id" startswith="Making a GET Request" endswith="Output Status Code" | table duration but it is also not working.
I want to calculate the duration between these two events for every request. I went over some solutions in Splunk and on Stack Overflow, but still can't get the proper result.
Try doing it with stats instead:
index=ndx sourcetype=srctp
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| rex field=_raw "(?<sequence>Making a GET Request)"
| rex field=_raw "(?<sequence>Output Status Code)"
| eval sequence=sequence.";"._time
| stats values(sequence) as sequence by req_id
| mvexpand sequence
| rex field=sequence "(?<sequence>[^;]+);(?<time>\d+)"
| eval time=strftime(time,"%c")
This will extract the "req-id" into a field named req_id, and the start and end of the sequence into a field named sequence.
Presuming the sample data you shared is correct, when you run stats values(sequence) as sequence, it will put the "Making..." entry first and the "Output..." entry second, because values() sorts its results lexicographically.
Because values() orders things this way, when you mvexpand and then split the values()'d field into sequence and time, they'll be in the proper order.
If the sample data is incomplete, you may need to tweak the regexes for populating sequence.
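For illustration, after the stats line a single req_id row would hold two ordered values in sequence, roughly like this (the epoch times are made up to match your sample timestamps):

req_id   sequence
123456   Making a GET Request;1583161745
         Output Status Code;1583161746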
It seems you're going with my previously suggested approach 😉
Now you have two possibilities:
1. SPL
Below is the simplest query, invoking only one rex and assuming the _time field is correctly filled:
index=<your_index> source=<your_source>
("*Making a GET Request*" OR "*Output Status Code*")
| rex field=_raw "req\-id\D+(?<req_id>\d+)"
| stats max(_time) as end, min(_time) as start by req_id
| eval duration = end - start
| table req_id duration
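With the two sample events above, this yields a single row, since 17:09:06:480 minus 17:09:05:362 is 1.118 seconds:

req_id   duration
123456   1.118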
Note that depending on the amount of data to scan, this one can be resource-consuming for your Splunk cluster.
2. Log the response time directly in the API (more efficient)
It seems you are working on an API. You must have the capability to get the response time of each call and trace it directly in your log.
Then you can exploit it easily in SPL without any calculation.
It is always preferable to persist data at index time rather than perform systematic calculations at search time.
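For example, if each call logged its own elapsed time, say in a hypothetical response_time field (milliseconds):

17:09:06:480 INFO com.a.b.App - Output Status Code: 200 req-id:"123456" response_time=1118

then a single rex with no event pairing would do; a sketch assuming that log format:

index=<your_index> source=<your_source> "response_time="
| rex field=_raw "req\-id\D+(?<req_id>\d+)\D+response_time=(?<response_time>\d+)"
| table req_id response_time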
Full disclosure: I am very new to Splunk, so I may explain my question incorrectly.
I have two data sources and was given a query to pull data from each individually. I am trying to join this data together so I can create some type of chart, but I am unsure whether this calls for a join, a subsearch, etc.
My initial query is as follows:
This allows me to search through the mail logs by sender address and show all emails with bcSendAction=1, which indicates a successful send.
index=mail sourcetype=barracuda [search index=mail sourcetype=barracuda bcSender="someemail#domain.com" | table bcMsgId] bcSendAction=1
Now, my other search is a log that shows all of the sender email addresses during a certain time period. I would like to use the result of this (the email value) in the first search so that I don't have to hard-code the bcSender, but rather have it use the results from the other source.
// Returns an email address
index=mail sourcetype=sendmail_syslog *#sfdc.net |
rex field=from "<(?<from>.*)>" |
table from | dedup from
I was able to parse the log and pull out just the email addresses that I want to use to plug into my first search.
I followed a few examples and tutorials, but a lot of the joins I was seeing only used two different sources/datasets and didn't use a subsearch as I did in my first query.
My attempt at this was something like:
index=mail sourcetype=sendmail_syslog *#sfdc.net
| rex field=from "<(?<from>.*)>"
| table from | dedup from
| join from
[search index=mail sourcetype=barracuda [search index=mail sourcetype=barracuda bcSender=from | table bcMsgId] bcSendAction=1]
I don't know that I am referencing the email from the first result set correctly.
Can someone point me in the right direction with how to approach this search?
If I understand your request properly, then you need 3 steps:
get the sender addresses from index=mail sourcetype=sendmail_syslog
use these sender addresses to get a list of message IDs from index=mail sourcetype=barracuda
use these message IDs to finally get the events you are looking for
This sounds like you need a subsearch (for getting the sender addresses) inside of another subsearch (for getting the message IDs), meaning your own attempt was pointing in the right direction already.
Try something along these lines:
index=mail sourcetype=barracuda bcSendAction=1
[ search
index=mail sourcetype=barracuda
[ search
index=mail sourcetype=sendmail_syslog *#sfdc.net
| rex field=from "<(?<bcSender>.*)>"
| stats count by bcSender
| fields bcSender
| format
]
| stats count by bcMsgId
| fields bcMsgId
| format
]
I cannot really verify it without having your data, but I'll try to explain what it's supposed to do. Let's start from the innermost subsearch.
Line 4 starts the innermost subsearch
Line 5 selects the events from which you generate the address list.
Line 6 extracts the addresses directly into the field bcSender. (We could extract it to the field from first and then rename it, but this is more direct.)
We need the fieldname to be bcSender for the outer search.
Line 7 is a different way to deduplicate by bcSender and, at the same time, reduce the amount of data which needs to be sent back from the indexers to the search head (if you have a distributed environment).
Line 8 gets rid of all the fields we don't require. They would be problematic with the following format command.
Line 9 passes the results back to the enclosing search in such a way that they can be used as part of the search string.
Line 10, of course, closes the innermost subsearch.
Now let's have a look at the outer subsearch.
Line 2 starts the subsearch.
Line 3 selects the events from which we can get the message IDs. This is, of course, augmented by the enclosed subsearch we've just discussed.
Line 11 again is a way to dedup the message IDs.
Line 12 again limits things to the field we need.
Line 13 passes the found message IDs to the outermost (main) search in such a way that they become part of the search string.
Line 14, you already know, closes the subsearch.
And the outermost search:
Line 1 selects the data you are targeting and is augmented by what the subsearches pass to it.
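To make the mechanics concrete: if the inner subsearch found the two addresses from your sendmail logs and those matched, say, two message IDs (the ID values below are hypothetical), the outermost search would effectively run as

index=mail sourcetype=barracuda bcSendAction=1 ( ( bcMsgId="msg-001" ) OR ( bcMsgId="msg-002" ) )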
The fact that one side of the join is a single field indicates it is a good candidate for a subsearch. Subsearches run first, and their results then become part of the main search.
index=mail sourcetype=barracuda bcSendAction=1
[ search index=mail sourcetype=sendmail_syslog *#sfdc.net
| rex field=from "<(?<from>.*)>"
| fields from | rename from as bcSender | format ]
It's important that the result of the subsearch contain a field present in the main search. That's why I used rename.
After the subsearch runs, you get a search that's equivalent to this:
index=mail sourcetype=barracuda bcSendAction=1 (bcSender="someemail#domain.com" OR bcSender="anotheremail#domain.com")
I have a list of usernames that I have to monitor, and the list is growing every day. I read the Splunk documentation, and it seems like a lookup is the best way to handle this situation.
The goal is for my query to leverage the lookup and print out all the download events from all the users in the list.
Sample logs
index=proxy123 activity="download"
{
"machine":"1.1.1.1",
"username":"ABC#xyz.com",
"activity":"download"
}
{
"machine":"2.2.2.2",
"username":"ASDF#xyz.com",
"activity":"download"
}
{
"machine":"3.3.3.3",
"username":"GGG#xyz.com",
"activity":"download"
}
Sample Lookup (username.csv)
users
ABC#xyz.com
ASDF#xyz.com
BBB#xyz.com
Current query:
index=proxy123 activity="download" | lookup username.csv users OUTPUT users | where not isnull(users)
Result: 0 (which is not correct)
I probably don't understand lookup correctly. Can someone correct me and teach me the correct way?
In the lookup file, the name of the field is users, whereas in the event it is username. Fortunately, the lookup command has a mechanism for renaming fields during the lookup. Try the following:
index=proxy123 activity="download" | lookup username.csv users AS username OUTPUT users | where isnotnull(users)
Now, depending on the volume of data you have in your index and how much data is being discarded when not matching a username in the CSV, there may be alternate approaches you can try, for example, this one using a subsearch.
index=proxy123 activity="download" [ | inputlookup username.csv | rename users AS username | return username ]
What happens here in the subsearch (the bit in the []) is that the subsearch will be expanded first, in this case, to (username="ABC#xyz.com" OR username="ASDF#xyz.com" OR username="BBB#xyz.com"). So your main search will turn into
index=proxy123 activity="download" (username="ABC#xyz.com" OR username="ASDF#xyz.com" OR username="BBB#xyz.com")
which may be more efficient than returning all the data in the index, then discarding anything that doesn't match the list of users.
This approach assumes that you have the username field extracted in the first place. If you don't, you can try the following.
index=proxy123 activity="download" [ | inputlookup username.csv | rename users AS search | format ]
This expanded search will be
index=proxy123 activity="download" "ABC#xyz.com" OR "ASDF#xyz.com" OR "BBB#xyz.com")
which may be more suitable to your data.
While using the data-driven feature in the Karate framework, I see that the generated report shows only the title as configured in the Scenario Outline, NOT the values used from the Examples table. This confuses testers about which data is in use, and it takes time to expand each scenario to find out; so I want the report to be able to pass variables into the Scenario/Scenario Outline title. Please take a look at the example below.
E.g.
Feature: Login Feature
Background:
* configure headers = { 'Webapp-Version': '1.0.0'}
Scenario Outline: As a <description> user, I want to get the corresponding response_code <status_code>
Given def path = 'classpath:features/Authentication/authentication.feature'
And def signIn = call read(path) {username: '<username>', password: '1234567890'}
Then match signIn.status == <status_code>
Examples:
| username       | status_code | description  |
| test#gmail.com | 200         | valid user   |
| null           | 400         | invalid user |
My expected result: the generated report should fill in the values from the Examples table for the "status_code" and "description" fields.
-> As a valid user user, I want to get the corresponding response_code 200.
Please share your ideas and comments on it.
Thanks,
Learn.
Not supported. Just use the print syntax and you will see it in the report.
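For example, a print line inside the Scenario Outline (a sketch reusing the columns from your Examples table) will show each row's values in the report next to that step:

Scenario Outline: As a <description> user, I want to get the corresponding response_code <status_code>
  * print 'running as a', '<description>', 'user, expecting status', <status_code>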
EDIT: okay this will be possible in the next version: https://github.com/intuit/karate/issues/553
I am gathering performance metrics for each API that we have. With the below query I get results like this:
method           response_time
Create Billing   2343.2323
index="dev-uw2" logger_name="*Aspect*" message="*ApiImpl*" | rex field=message "PerformanceMetrics - method='(?<method>.*)' execution_time=(?<response_time>.*)" | table method, response_time | replace "public com.xyz.services.billingservice.model.Billing com.xyz.services.billingservice.api.BillingApiImpl.createBilling(java.lang.String)” WITH "Create Billing” IN method
If the user clicks the API text in a table cell to drill down further, it will open a new search for "Create Billing"; obviously this gives zero results, since we don't have any log with that string.
I want splunk to search with original text that was replaced earlier.
You can use click.value to get around this.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Viz/tokens
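If it helps, here is a minimal Simple XML sketch of that idea: keep the original method string in a helper field, hide it from the table, and reference it from the drilldown with a $row.*$ token (the method_orig field name and the link target are assumptions, not taken from your dashboard):

<table>
  <search>
    <query>index="dev-uw2" logger_name="*Aspect*" message="*ApiImpl*"
| rex field=message "PerformanceMetrics - method='(?&lt;method&gt;.*)' execution_time=(?&lt;response_time&gt;.*)"
| eval method_orig=method
| replace "public com.xyz.services.billingservice.model.Billing com.xyz.services.billingservice.api.BillingApiImpl.createBilling(java.lang.String)" WITH "Create Billing" IN method
| table method, response_time, method_orig</query>
  </search>
  <!-- hide the helper column; it remains available to drilldown tokens -->
  <fields>["method","response_time"]</fields>
  <drilldown>
    <!-- search on the original text, not the displayed replacement -->
    <link>search?q=index%3D"dev-uw2" "$row.method_orig|u$"</link>
  </drilldown>
</table>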