Splunk query to find previous requests from different ip - splunk

Suppose I've got the following log entries:
timestamp ip path
1668603956000 1.1.1.1 /some/path
1668603955000 2.2.2.2 /some/path
1668603954000 2.2.2.2 /some/other/path
1668603953000 3.3.3.3 /some/path
1668603952000 3.3.3.3 /some/other/path
1668603951000 4.4.4.4 /some/path
1668603950000 5.5.5.5 /some/path
I want to end up with a table that shows for ip=2.2.2.2, the ip from the previous request to the same path but with a different IP.
Expected results:
L.time R.time R.ip R.path
1668603954200 1668603954400 3.3.3.3 /some/path
1668603954300 1668603954600 4.4.4.4 /some/other/path
What I've tried:
source=my_log
| where ip = "2.2.2.2"
| table path, ip, _time
| join type=inner left=L right=R usetime=true earlier=true where L.path = R.path [
| where L.ip != R.ip
]
| table L._time, R._time, R.ip, R.path
But this does not give me any tabular results. I get raw events back, but the join doesn't seem to be working.

You seem to be trying to write SQL, but in SPL
I suggest starting here for how to change your approach - https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/SQLtoSplunk
That said, I believe this will get you toward your goal:
index=ndx sourcetype=srctp ip=* path=*
| fields path ip _time
| fields - _raw
| sort 0 path -_time +ip
| streamstats reset_on_change=true current=true latest(_time) as tick by path ip
| eval tick=strftime(tick,"%c")
| stats count by path ip tick
| fields - count

Related

Extract data from splunk

I have a Post query where I want to extract request payload or parameters and print a table. In the query, I am trying to extract the user_search name field
I have written a Splunk query but it is not working for me
"Parameters: {\"user_search\"=>{\"name\"=>*" | rex field=_raw "/\"user_search\"=>{\"name\"=>/(?<result>.*)" | table result
Splunk Data
I, [2021-09-23T00:46:31.172197 #44154] INFO -- : [651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: {"user_search"=>{"name"=>"aniket", "has_primary_phone"=>"false", "query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}
host = qa-1132-lx02source = /src/project.logsourcetype = data:log
I, [2021-09-23T00:48:31.162197 #44154] INFO -- : [651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: {"user_search"=>{"name"=>"shivam", "has_primary_phone"=>"false", "query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}
host = qa-1132-lx02source = /src/project.logsourcetype = data:log
I, [2021-09-23T00:52:27.171197 #44154] INFO -- : [651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: {"user_search"=>{"name"=>"tiwari", "has_primary_phone"=>"false", "query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}
host = qa-1132-lx02source = /src/project.logsourcetype = data:log
I have 2 questions
How to write a splunk query to extract request payload in post query
In my above query I am not not sure what I am doing wrong. I would really appreciate if someone has any suggestion.
At the least, your regular expression has an error
You have:
"/\"user_search\"=>{\"name\"=>/(?<result>.*)"
There is an extra "/" after the "=>"
This seems to pull what you're looking for:
user_search\"=>{\"name\"=>(?<result>.*)
Edit per comment "I only want to fetch the values such as aniket & shivam from the name key"
There're a couple ways to do what you're asking, and which is going to be mroe performant will depend on your environment and data
Option 1
index=ndx sourcetype=srctp ("aniket" OR "shivam")
| rex field=_raw "user_search\"=>{\"name\"=>(?<result>.*)"
| stats count by result
Option 2
index=ndx sourcetype=srctp
| rex field=_raw "user_search\"=>{\"name\"=>(?<result>.*)"
| search result="aniket" OR result="shivam"
| stats count by result

Splunk left jion is not giving as exepcted

Requirement: I want to find out, payment card information used in a particular day are there any tele sales order placed with the same payment card information.
I tried with below query it is supposed to give me all the payment card information from online orders and matching payment info from telesales. But i am not giving correct results basically results shows there are no telesales for payment information, but when i search splunk i am finding telesales as well. So the query wrong.
index="orders" "Online order received" earliest=-9d latest=-8d
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| rename timestamp as onlineOrderTime
| table payHash, onlineOrderTime
| join type=left payHash [search index="orders" "Telesale order received" earliest=-20d latest=-5m | rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))" | rename timestamp as TeleSaleTime | table payHash, TeleSaleTime]
| table payHash, onlineOrderTime, TeleSaleTime
Please help me in fixing the query or a query to find out results for my requirement.
If you do want to do this with a join, what you had, slightly changed, should be correct:
index="orders" "Online order received" earliest=-9d latest=-8d
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| stats values(_time) as onlineOrderTime by payHash
| join type=left payHash
[search index="orders" "Telesale order received" earliest=-20d latest=-5m
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| rename timestamp as TeleSaleTime
| stats values(TeleSaleTime) by payHash ]
| rename timestamp as onlineOrderTime
Note the added | stats values(...) by in the subsearch: you need to ensure you've removed any duplicates from the list, which this will do. By using values(), you'll also ensure if there're repeated entries for the payHash field, they get grouped together. (Similarly, added a | stats values... before the subsearch to speed the whole operation.)
You should be able to do this without a join, too:
index="orders" (("Online order received" earliest=-9d latest=-8d) OR "Telesale order received" earliest=-20d))
| rex field=_raw "(?<order_type>\w+) order received"
| rex field=message "paymentHashed=(?<payHash>.([a-z0-9_\.-]+))"
| stats values(order_type) as order_type values(_time) as orderTimes by payHash
| where mvcount(order_type)>1
After you've ensured your times are correct, you can format them - here's one I use frequently:
| eval onlineOrderTime=strftime(onlineOrderTime,"%c"), TeleSaleTime=strftime(TeleSaleTime,"%c")
You may also need to do further reformatting, but these should get you close
fwiw - I'd wonder why you were trying to look at Online orders from only 9 days ago, but Telesale orders from 20 days ago to now: but that's just me.
The join command expects a list of field names on which events from each search will be matched. If no fields are listed then all fields are used. In the example, the fields 'onlineOrderTime' and 'TeleSaleTime' exist only on one side of the join so no matches can be made. The fix is simple: specify the common field name. ... | join type=left payHash ....
First of all, you can delete the last row | table payHash, onlineOrderTime, TeleSaleTime beacuse it doesn't do anything(the join command already joins both tables you created).
Secondly, when running both queries separately - both queries have the same "payHash"es? both queries return back a table with the true results?
Because by the looks of it, you used the join command correctly...

How to get two fields using rex from log file?

I'm newbie with Splunk. My goal is take two or more fields from logs. I must check if one field is true and so use another field to make a counter. The counter is about how many requests is make by client using user-agent attribute.
My logic desired:
int count1, count2;
count1 = 0;
count2 = 0;
if (GW == true) {
if (UA == "user-agent1") count1++;
if (UA == "user-agent2") count2++;
}
At the moment I can get just one field and make a counter without if-condition.
This query works fine, and return the correct requests counter:
source="logfile.log" | rex "UA=(?<ua>\w+)" | stats count(eval(ua="user-agent1")) as USER-AGENT1
But, when I try get the second field (GW) to make the logic, the query returns 0.
source="logsfile.log" | rex "UA=(?<ua>\w+) GW=(?<gw>\w+)" |stats count(eval(ua="user-agent1")) as USER-AGENT1
So, how I get more fields and how make if-condition on query?
Sample log:
2020-01-10 14:38:44,539 INFO [http-nio-8080-exec-8] class:ControllerV1, UA=user-agent1, GW=true
2020-01-10 14:23:51,818 INFO [http-nio-8080-exec-3] class:ControllerV1, UA=user-agent2, GW=true
It will be something like this:
source="logsfile.log" UA GW
| rex "UA=(?<ua>\w+), GW=(?<gw>\w+)"
| stats count(eval(gw="true" AND ua="user-agent1")) as AGENT1,
count(eval(gw="true" AND ua="user-agent2")) as AGENT2
If, for example, you do not know the order of variables or you have more than 2, you can use separate rex statements:
source="logsfile.log" UA GW
| rex "UA=(?<ua>\w+)"
| rex "GW=(?<gw>\w+)"
| stats count(eval(gw="true" AND ua="user-agent1")) as AGENT1,
count(eval(gw="true" AND ua="user-agent2")) as AGENT2
This could be a bit slower since _raw will be parsed twice.

Join two Splunk queries without predefined fields

I am trying to join 2 splunk queries. However in this case the common string between the 2 queries is not a predefined splunk field and is logged in a different manner. I have created the regex which individually identifies the string but when I try to combine using join, I do not get the result.
I have logs like this -
Logline 1 -
21-04-2019 11:01:02.001 server1 app1 1023456789 1205265352567565 1234567Z-1234-1234-1234-123456789123 Application Completed
Logline 2 -
21-04-2019 11:00:00.000 journey_ends server1 app1 1035625855585989 .....(lots of text) commonID:1234567Z-1234-1234-1234-123456789123 .....(lots of text) status(value) OK
the second Logline can be NOTOK as well
Logline 2 -
21-04-2019 11:00:00.000 journey_ends server1 app1 1035625855585989 .....(lots of text) commonID:1234567Z-1234-1234-1234-123456789123 .....(lots of text) status(value) NOTOK
I have tried multiple things but the best that I can come up with is -
index=test "journey_ends" | rex "status(value) (?<StatusType>[A-Z][A-Z]*)" | rex "commonID\:(?<commonID>[^\t]{37})" | table StatusType, commonID | join type=inner commonID [ search index=test "Application Completed" | rex "^(?:[^\t\n]*\t){7}(?P<commonID>[^\t]+)" | table _time, commonID] | chart count over StatusType by commonID
However the above query does not provide me the stats. In verbose mode, I can just see the events of query 1. Please note that the above 2 queries run correctly individually.
However currently I have to initially run the query to fetch the commonIDs from "Application Completed" logline and then in another query give the list of commonIDs found in the result first query as input and find the status value for each commonId from logline 2.
Expected Result (in a table):
StatusType commonID OK 1234567Z-1234-1234-1234-123456789123 NOTOK 1234567Z-1234-1234-1234-985625623541
Can you try the below query,
index=main
AND "Application Completed"
| rex "(?<common_id>[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+)"
| table _time, common_id
| join type=inner common_id [
search index=main
| rex "status\(value\)\s+(?<status>.+)$"
| rex "(?<common_id>[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+-[[:alnum:]]+)"
| table status, common_id
]

Return first entry in table for every row given a specifc column sort order [duplicate]

This question already has answers here:
Select first row in each GROUP BY group?
(20 answers)
Closed 5 years ago.
I have a table with three columns, hostname, address, and virtual. The address column is unique, but a host can have up to two address entries, one virtual and one non-virtual. In other words, the hostname and virtual column pair are also unique. I want to produce a result set that contains one address entry for a host giving priority to the virtual address. For example, I have:
hostname | address | virtual
---------+---------+--------
first | 1.1.1.1 | TRUE
first | 1.1.1.2 | FALSE
second | 1.1.2.1 | FALSE
third | 1.1.3.1 | TRUE
fourth | 1.1.4.2 | FALSE
fourth | 1.1.4.1 | TRUE
The query should return the results:
hostname | address
---------+--------
first | 1.1.1.1
second | 1.1.2.1
third | 1.1.3.1
fourth | 1.1.4.1
Which is the virtual address for every host, and the non-virtual address for hosts lacking a virtual address. The closest I've come is asking for one specific host:
SELECT hostname, address
FROM system
WHERE hostname = 'first'
ORDER BY virtual DESC NULLS LAST
LIMIT 1;
Which gives this:
hostname | address
---------+--------
first | 1.1.1.1
I would like to get this for every host in the table with a single query if possible.
What you're looking for is a RANK function. It would look something like this:
SELECT * FROM (
SELECT hostname, address
, RANK() OVER (PARTITION BY hostname ORDER BY virtual DESC NULLS LAST) AS rk
FROM system
)
WHERE rk = 1
This is a portable solution that also works in Oracle and SQL Server.
In Postgres, the simplest way is distinct on:
SELECT DISTINCT ON (hostname) hostname, address
FROM system
ORDER BY hostname, virtual DESC NULLS LAST