Need help with Splunk regex field extraction

I have a Splunk query (index=sat sourcetype="sat_logs" Message="application message published for") which returns a list of messages published by different applications. I need to extract specific field values from the messages. Please let me know the query to get the expected results. Thanks
Splunk query results:
Message:Alpha application message published for UserId: 12345678, UID: 92345678, Date: 2019-10-04, Message: {"Application":"Alpha","ID":"123"}
Message:Beta application message published for UserId: 12345670, UID: 92345670,Date: 2019-10-03, Message: {"Application":"Beta","ID":"623"}
Message:Zeta application message published for UserId: 12345677, UID: 92345677,Date: 2019-10-02, Message: {"Application":"Zeta","ID":"523"}
Expected fields, extracted and displayed as a table:
Application  UserId    UID       ID
Alpha        12345678  92345678  123
Beta         12345670  92345670  623
Zeta         12345677  92345677  523

The rex command can do that for you. Assuming the fields are always in the same order, this should do the job.
index=sat sourcetype="sat_logs" Message="application message published for"
| rex field=Message "UserId: (?<UserId>[^,]+), UID: (?<UID>[^,]+).*\{\"Application\":\"(?<Application>[^\"]+)\",\"ID\":\"(?<ID>[^\"]+)\""
| table Application UserId UID ID
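
To check the pattern outside Splunk, here is a minimal Python sketch run against the three sample messages from the question. It assumes Python's re module behaves like rex for this pattern; the only change is Python's (?P<name>...) spelling for named groups. The names PATTERN, MESSAGES and extract_fields are made up for this illustration.

```python
import re

# Same pattern as the rex call, using Python's (?P<name>...) named groups.
PATTERN = re.compile(
    r'UserId: (?P<UserId>[^,]+), UID: (?P<UID>[^,]+).*'
    r'\{"Application":"(?P<Application>[^"]+)","ID":"(?P<ID>[^"]+)"'
)

# Sample messages copied from the question.
MESSAGES = [
    'Alpha application message published for UserId: 12345678, UID: 92345678, '
    'Date: 2019-10-04, Message: {"Application":"Alpha","ID":"123"}',
    'Beta application message published for UserId: 12345670, UID: 92345670,'
    'Date: 2019-10-03, Message: {"Application":"Beta","ID":"623"}',
    'Zeta application message published for UserId: 12345677, UID: 92345677,'
    'Date: 2019-10-02, Message: {"Application":"Zeta","ID":"523"}',
]


def extract_fields(message):
    """Return (Application, UserId, UID, ID), or None if nothing matches."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return (match["Application"], match["UserId"], match["UID"], match["ID"])


rows = [extract_fields(m) for m in MESSAGES]
for row in rows:
    print(*row)
```

The pattern relies on the fields appearing in a fixed order, as the answer assumes; a message with UID before UserId would not match.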

Splunk Query Recommendation

I have below log from my application:
BookData, {
id: 12312
}, appID : 'APP1', Relation_ID : asdas-12312
host = aws#asd. sourcetype=service_name
The entire log above is in the form of a single String. I want to create a table with the no. of times an appID has hit the service. i.e. I want to count the no. of events and group them by appID.
Basically, something like:
appID Count
APP1 23
APP2 25
APP3 100
I tried the query below, but it is not working. It returns 0 records.
index=my_index sourcetype=service_name * | table appID Count | addColTotals labelfield=appID label="appID" count
As I understand it, the query does not work because appID is not a label. In that case, how do I form a query that gives my desired result?
The query doesn't work in part because there is no Count field for the table command to display and no count field for the addcoltotals command to add to the results. To get a count you must tell Splunk to count events by using the stats, eventstats, streamstats, or timechart command.
Try this:
index=my_index sourcetype=service_name
| stats count as Count by appID
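
The stats answer presumes appID is already available as a field. As a rough illustration of the two steps involved (pull appID out of the raw string, then count per value), here is a small Python sketch. The sample events are hypothetical lines shaped like the log in the question, and the regex is an assumption about that format, not something Splunk generates.

```python
import re
from collections import Counter

# Hypothetical events shaped like the single-string log in the question.
EVENTS = [
    "BookData, {\nid: 12312\n}, appID : 'APP1', Relation_ID : asdas-12312",
    "BookData, {\nid: 12313\n}, appID : 'APP2', Relation_ID : asdas-12313",
    "BookData, {\nid: 12314\n}, appID : 'APP1', Relation_ID : asdas-12314",
]

# Assumed format: appID : 'VALUE' with optional spaces around the colon.
APP_ID = re.compile(r"appID\s*:\s*'(?P<appID>[^']+)'")


def count_by_app_id(events):
    """Count events per appID, skipping events without an appID."""
    counts = Counter()
    for event in events:
        match = APP_ID.search(event)
        if match:
            counts[match["appID"]] += 1
    return counts


counts = count_by_app_id(EVENTS)
for app_id, count in sorted(counts.items()):
    print(app_id, count)
```

If appID turns out not to be extracted in Splunk, the same pattern could in principle be placed in a rex call ahead of stats, though that depends on how the sourcetype is configured.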

Extract data from splunk

I have logs for a POST request from which I want to extract the request payload (parameters) and print a table. In the query, I am trying to extract the name field under user_search.
I have written a Splunk query but it is not working for me
"Parameters: {\"user_search\"=>{\"name\"=>*" | rex field=_raw "/\"user_search\"=>{\"name\"=>/(?<result>.*)" | table result
Splunk Data
I, [2021-09-23T00:46:31.172197 #44154] INFO -- : [651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: {"user_search"=>{"name"=>"aniket", "has_primary_phone"=>"false", "query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}
host = qa-1132-lx02 source = /src/project.log sourcetype = data:log
I, [2021-09-23T00:48:31.162197 #44154] INFO -- : [651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: {"user_search"=>{"name"=>"shivam", "has_primary_phone"=>"false", "query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}
host = qa-1132-lx02 source = /src/project.log sourcetype = data:log
I, [2021-09-23T00:52:27.171197 #44154] INFO -- : [651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: {"user_search"=>{"name"=>"tiwari", "has_primary_phone"=>"false", "query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}
host = qa-1132-lx02 source = /src/project.log sourcetype = data:log
I have 2 questions:
1. How do I write a Splunk query to extract the request payload of a POST request?
2. What am I doing wrong in my query above? I would really appreciate any suggestions.
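At the least, your regular expression has an error.
You have:
"/\"user_search\"=>{\"name\"=>/(?<result>.*)"
There is an extra "/" after the "=>". The rex command does not use "/" as a regex delimiter, so that slash and the leading "/" are both treated as literal characters that would have to appear in the event.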
This seems to pull what you're looking for:
user_search\"=>{\"name\"=>(?<result>.*)
Edit per comment "I only want to fetch the values such as aniket & shivam from the name key"
There are a couple of ways to do what you're asking, and which is more performant will depend on your environment and data.
Option 1
index=ndx sourcetype=srctp ("aniket" OR "shivam")
| rex field=_raw "user_search\"=>\{\"name\"=>\"(?<result>[^\"]+)\""
| stats count by result
Option 2
index=ndx sourcetype=srctp
| rex field=_raw "user_search\"=>\{\"name\"=>\"(?<result>[^\"]+)\""
| search result="aniket" OR result="shivam"
| stats count by result
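
When only the name itself is wanted, the capture group has to stop at the closing quote; a capture ending in .* keeps everything up to the end of the line. The Python sketch below compares the two forms on the first sample event from the question. It assumes Python's re module matches the way rex does here; GREEDY and QUOTED are names invented for this example.

```python
import re

# First sample event from the question, split only for readability.
LINE = (
    'I, [2021-09-23T00:46:31.172197 #44154] INFO -- : '
    '[651235bf-7ad5-4a2e-a3b8-7737a3af9fc3] Parameters: '
    '{"user_search"=>{"name"=>"aniket", "has_primary_phone"=>"false", '
    '"query_params"=>{"searchString"=>"", "start"=>"0", "filters"=>[""]}}}'
)

# Capture everything after "name"=> up to the end of the line.
GREEDY = re.compile(r'user_search"=>\{"name"=>(?P<result>.*)')

# Capture only the text between the quotes that follow "name"=>.
QUOTED = re.compile(r'user_search"=>\{"name"=>"(?P<result>[^"]+)"')

greedy_result = GREEDY.search(LINE)["result"]
quoted_result = QUOTED.search(LINE)["result"]

print(greedy_result)  # the quoted name plus the rest of the payload
print(quoted_result)  # just the name
```

A filter such as result="aniket" can only match the second form, because the first form's value still contains the quotes and the remaining parameters.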

What's the most efficient way to query all the messages in a group chat application?

i will use an example to illustrate my question.
you have a group-chat table that stores data about group chat.
id | name | owner_id
---+------+---------
33 | code | 45
you have a messages table that hold messages
id | content | user_id | chat_room_id
---+---------+---------+-------------
5  | "hello" | 41      | 33
2  | "hi"    | 43      | 33
you have a users table that holds user information and which group chat they are part of:
id | name   | chat_room_id
---+--------+-------------
5  | "nick" | 33
2  | "mike" | 33
is this the right way to set up the database?
without joins or foreign keys, what's the most efficient way to load all the messages and user data and have it in a form that allows you to construct a ui where the user data is displayed next to the message?
My solutions:
if you query the messages database and retrieved all the messages where chat room id is equal to 33, you're gonna get an array that looks like
[
{
id : 5,
user_id : 41,
content : "hello"
},
{
id : 2,
user_id : 43,
content : "hi"
}
]
as you can see the user ids are part of the message object.
solution 1 : (naive) :
loop through the messages array and query the database using the user id.
this is a bad solution since querying the database from a loop is never a good idea.
solution 2 : (efficient but less data to send in the response) :
loop through the messages array and construct an array of user ids and use that in a query
using WHERE user_id IN
then loop through the array of users and construct a hash table using the user id as a key since it is unique.
on the front end just loop through the messages array and lookup the user.
is this solution going to be very slow if you have a large number of messages? will it scale well, given that it's O(n)?
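
Solution 2 can be sketched roughly as follows, using Python's built-in sqlite3 module as a stand-in for whatever database is actually used. The schema, the function name load_messages_with_users and the user ids 41 and 43 are assumptions made for the demo (the users table in the question uses ids 5 and 2, which would not match the messages).

```python
import sqlite3


def load_messages_with_users(conn, chat_room_id):
    """Return (messages, users_by_id) for one chat room using two queries."""
    messages = [
        dict(row)
        for row in conn.execute(
            "SELECT id, content, user_id FROM messages "
            "WHERE chat_room_id = ? ORDER BY id",
            (chat_room_id,),
        )
    ]
    # Collect each user id once, however many messages that user wrote.
    user_ids = sorted({m["user_id"] for m in messages})
    if not user_ids:
        return messages, {}
    # One placeholder per id; the values are bound, never string-formatted.
    placeholders = ",".join("?" * len(user_ids))
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    )
    # Hash table keyed by user id, so each lookup is O(1).
    users_by_id = {row["id"]: dict(row) for row in rows}
    return messages, users_by_id


# Demo with an in-memory database and hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT,
                           user_id INTEGER, chat_room_id INTEGER);
    INSERT INTO users VALUES (41, 'nick'), (43, 'mike');
    INSERT INTO messages VALUES (5, 'hello', 41, 33), (2, 'hi', 43, 33);
    """
)
messages, users_by_id = load_messages_with_users(conn, 33)
for m in messages:
    print(users_by_id[m["user_id"]]["name"], m["content"])
```

The work is two queries plus one pass over each result set, so it grows linearly with the number of messages; for very large chats, paginating the messages query is probably a bigger win than changing the lookup.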
solution 3 : (efficient but more data to send in the response) :
its the same as before but the difference here is adding properties to the messages object that store user data.
the problem with this solution is that you will have duplicate data since one user can publish multiple messages.
these are my solutions i hope to hear yours.
for context : system design videos on youtube don't address this part of chat apps. if you found one that does please post the link.

Splunk: Unable to get the correct min and max values

I'm a newbie as far as Splunk is concerned with modest regex skills.
We have events with the following patterns:
fallbackAPIStatus={api1=133:..., api2=472:...,api3=498:...}
fallbackAPIStatus={api1=3535:...}
fallbackAPIStatus={api2=252:...,api3=655:...}
The numeric value indicates the response time, and the ellipsis indicates fields that I'm not interested in.
The number of apis within the braces is dynamic (between 1 and 4)
I want to be able to create a table as follows:
apiName  TotalRequests  Max-Response-Time  Min-Response-Time
api1     2              3535               133
api2     2              472                252
api3     2              655                498
Here's my search:
index=my_logs sourcetype=my_sourcetype | rex field=_raw "fallbackAPIStatus=\{(?P<fallBackApis>[^\}]+)\}" | eval temp=split(fallBackApis,",") | rex field=temp "(?P<apiName>[a-zA-Z-]+)=(?P<responseTime>[0-9]+):"|stats count as TotalRequests max(responseTime) as Max-Response-Time min(responseTime) as Min-Response-Time by apiName
I'm able to get the TotalRequests right but I'm not able to get the correct max and min response times
Can someone advise what I'm doing wrong here?
I think there is an issue with your field extraction: the character class [a-zA-Z-]+ contains no digits, so it cannot match API names such as api1. The following works fine:
| eval temp=split(fallBackApis,",") | rex field=temp "(?<apiName>\S+)=(?<responseTime>\d+):"
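
As a sanity check on the expected numbers, here is a small Python sketch that applies the same split-then-extract idea to the three sample events from the question. It is not a translation of the Splunk search; BLOCK, ENTRY and summarize are names invented for this example, and the response times are converted to integers so that max and min compare numerically.

```python
import re
from collections import defaultdict

# Sample events copied from the question.
EVENTS = [
    "fallbackAPIStatus={api1=133:..., api2=472:...,api3=498:...}",
    "fallbackAPIStatus={api1=3535:...}",
    "fallbackAPIStatus={api2=252:...,api3=655:...}",
]

# Text between the braces, then one name=number: pair per comma-separated part.
BLOCK = re.compile(r"fallbackAPIStatus=\{(?P<apis>[^}]+)\}")
ENTRY = re.compile(r"(?P<apiName>\S+)=(?P<responseTime>\d+):")


def summarize(events):
    """Map apiName -> (total requests, max response time, min response time)."""
    times = defaultdict(list)
    for event in events:
        block = BLOCK.search(event)
        if block is None:
            continue
        for part in block["apis"].split(","):
            entry = ENTRY.search(part)
            if entry:
                times[entry["apiName"]].append(int(entry["responseTime"]))
    return {name: (len(v), max(v), min(v)) for name, v in sorted(times.items())}


summary = summarize(EVENTS)
for name, (total, slowest, fastest) in summary.items():
    print(name, total, slowest, fastest)
```

The \S+ in ENTRY accepts digits in the API name, which is the difference from the [a-zA-Z-]+ class in the original search.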

Query for calculating duration between two different logs in Splunk

As part of my requirements, I have to calculate the duration between two different logs using Splunk query.
For example:
Log 2:
2020-04-22 13:12 ADD request received ID : 123
Log 1 :
2020-04-22 12:12 REMOVE request received ID : 122
The common String between two logs is " request received ID :" and unique strings between two logs are "ADD", "REMOVE". And the expected output duration is 1 hour.
Any help would be appreciated. Thanks
You can use the transaction command, https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Transaction
Assuming you have the field ID extracted, you can do
index=* | transaction ID
This will automatically produce a field called duration, which is the time between the first and last event with the same ID
While transaction will work, it's very inefficient
This stats search should show you what you're looking for (presuming the fields are already extracted):
(index=ndxA OR index=ndxB) ID=* ("ADD" OR "REMOVE")
| stats min(_time) as when_added max(_time) as when_removed by ID
| eval duration=tostring(when_removed-when_added,"duration")
| eval when_added=strftime(when_added,"%c"), when_removed=strftime(when_removed,"%c")
If you don't already have fields extracted, you'll need to modify it as follows (remove the "\s*$" in the regex if the ID value isn't at the end of the line):
(index=ndxA OR index=ndxB) ("ADD" OR "REMOVE")
| rex field=_raw "ID\s*:\s*(?<ID>\d+)\s*$"
| stats min(_time) as when_added max(_time) as when_removed by ID
| eval duration=tostring(when_removed-when_added,"duration")
| eval when_added=strftime(when_added,"%c"), when_removed=strftime(when_removed,"%c")
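
The idea behind the stats approach (earliest and latest timestamp per ID, then the difference) can be sketched in Python as below. The two log lines are hypothetical and share one ID so that they pair up; in the question's example the IDs are 123 and 122. The line format and the name durations_by_id are assumptions for this illustration.

```python
import re
from datetime import datetime

# Hypothetical log lines in the question's format, sharing one ID.
LOGS = [
    "2020-04-22 13:12 ADD request received ID : 123",
    "2020-04-22 12:12 REMOVE request received ID : 123",
]

# Assumed layout: timestamp, action, fixed text, then the ID at end of line.
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}) (?P<action>ADD|REMOVE) "
    r"request received ID\s*:\s*(?P<ID>\d+)\s*$"
)


def durations_by_id(lines):
    """Return ID -> timedelta between the earliest and latest event."""
    seen = {}
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M")
        first, last = seen.get(match["ID"], (ts, ts))
        seen[match["ID"]] = (min(first, ts), max(last, ts))
    return {event_id: last - first for event_id, (first, last) in seen.items()}


durations = durations_by_id(LOGS)
for event_id, duration in durations.items():
    print(event_id, duration)
```

Like min(_time) and max(_time) in the search, this measures the span between the first and last event for an ID regardless of which one is the ADD and which the REMOVE.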