Splunk: count by Id

I did a query in Splunk which looks like this:
source="/log/ABCDE/cABCDEFGH/ABCDE.log" doSomeTasks
I now want to count the entries in the logfile by Id (Id is an extracted field). But I only want to count each Id once, not every time doSomeTasks is executed. How can I do this?

To count unique instances of field values, use the distinct_count or dc function.
source="/log/ABCDE/cABCDEFGH/ABCDE.log" doSomeTasks
| stats dc(Id) as IdCount
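To see the difference between counting events and counting distinct Ids, here is a small run-anywhere sketch. It does not read the real logfile: it generates five synthetic events with makeresults, and the field n and the Id values A, B and C are made up for illustration.

```spl
| makeresults count=5
| streamstats count as n
| eval Id=case(n<=2, "A", n<=4, "B", true(), "C")
| stats count as EventCount, dc(Id) as IdCount
```

With this sample data, EventCount is 5 while IdCount is 3, because count tallies every event and dc(Id) counts each Id value only once.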

Related

Splunk Query Recommendation

I have below log from my application:
BookData, {
id: 12312
}, appID : 'APP1', Relation_ID : asdas-12312
host = aws#asd. sourcetype=service_name
The entire log above is in the form of a single String. I want to create a table with the no. of times an appID has hit the service. i.e. I want to count the no. of events and group them by appID.
Basically, something like:
appID Count
APP1 23
APP2 25
APP3 100
I tried the query below, but it is not working. It returns 0 records.
index=my_index sourcetype=service_name * | table appID Count | addColTotals labelfield=appID label="appID" count
As I understand it, the query above does not work because appID is not a label. In that case, how do I form a query that gives my desired result?
The query doesn't work in part because there is no Count field for the table command to display and no count field for the addcoltotals command to add to the results. To get a count you must tell Splunk to count fields by using the stats, eventstats, streamstats, or timechart command.
Try this:
index=my_index sourcetype=service_name
| stats count as Count by appID
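The stats query above assumes that appID is already available as a field. If it is not extracted automatically (the event is one string with appID : 'APP1' inside it), one possible approach is to extract it with rex first. The following run-anywhere sketch uses three synthetic events built with makeresults; the regular expression and the sample values are assumptions based on the log line in the question, not a tested extraction for the real data.

```spl
| makeresults count=3
| streamstats count as n
| eval _raw="BookData, { id: 12312 }, appID : 'APP".if(n<=2, "1", "2")."', Relation_ID : asdas-12312"
| rex "appID\s*:\s*'(?<appID>[^']+)'"
| stats count as Count by appID
| addcoltotals labelfield=appID label="Total" Count
```

On the sample events this yields APP1 with a Count of 2, APP2 with a Count of 1, and a Total row of 3. The addcoltotals step is optional and works here only because stats has already created the Count field.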

Extracting a count from raw splunk data by id

I am trying to get a count from transactional information that is retained within raw data in splunk. I have 3-5 transactions that occur.
One has raw data stating: pin match for id 12345678-1234-1234-abcd-12345678abcd or pin mismatched for id etc.
I'm trying to count the number of times the pin match occurs within the transaction time window of 180sec.
I was trying to do something like:
|eval raw=_raw |search index=transa
|eval pinc= if((raw like "%pin match%"),1,0) |stats count(pinc) as Pincount by ID
The issue I'm having is it is counting cumulatively over whatever time I am looking at those transactions. Is there a way to attach it to the ID that is within the message or have it count every one that occurs within that time window?
Thanks!
Presuming the pin status and ID have not been extracted:
index=ndx sourcetype=srctp "pin" "match" OR "mismatched"
| rex field=_raw "pin (?<pin_status>\w+)"
| rex field=_raw "id (?<id>\S+)"
| eval status_time=pin_status."|"._time
| stats earliest(status_time) as beginning latest(status_time) as ending by id
| eval beginning=split(beginning,"|"), ending=split(ending,"|")
| eval beginning=mvindex(beginning,-1), ending=mvindex(ending,-1)
| table id beginning ending
| sort 0 id
| eval beginning=strftime(beginning,"%c"), ending=strftime(ending,"%c")
After extracting the status ("match" or "mismatched") and the id, append the individual event's _time to the end of the status - we'll pull that value back out after statsing
Using stats, find the earliest and latest status_time entries (fields just created on the previous line) by id, saving them into new fields beginning and ending
Next, split() beginning and ending on the pipe we added to separate the status from the timestamp into a multivalue field
Then assign the last item of each multivalue field (which we know is the timestamp) back into the field itself. The status part can be dropped because the earliest status_time entry for an id should always be "match", and the latest should always be "mismatched"
Lastly, table the id and timestamps, sort by id, and format the timestamps into something human readable (strftime accepts many format specifiers, %c just happens to be quick)
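If the goal is a count of pin matches per id rather than the first and last timestamps, a variation along these lines might work. It reuses the placeholder index and sourcetype and the two rex extractions from the query above. Note the assumption: bin groups events into fixed 180-second buckets aligned to the clock, which only approximates a 180-second window that starts with each transaction.

```spl
index=ndx sourcetype=srctp "pin" "match" OR "mismatched"
| rex field=_raw "pin (?<pin_status>\w+)"
| rex field=_raw "id (?<id>\S+)"
| bin _time span=180s
| stats count(eval(pin_status="match")) as Pincount by id, _time
```

Each result row is one id in one 180-second bucket, so the count no longer accumulates across the whole search time range. An id that only has mismatches in a bucket shows a Pincount of 0.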

Splunk interesting field exclusion

I have 4 fields (Name, age, class, subject) in one index (Student_Entry). I want to count the total number of events, but I want to exclude the events that have any value in the subject field.
I tried the two queries below:
index=Student_Entry Subject !=* | stats count by event
index=Student_Entry NOT Subject= * | stats count by event
The NOT and != operators are similar, but not equivalent. NOT will return events with no value in the Subject field, whereas != will not. In your case, use NOT, because the events you want are the ones that have no Subject value. See https://docs.splunk.com/Documentation/Splunk/8.0.4/Search/NOTexpressions
stats count by event does nothing because there is no field called 'event'. To count events, just use stats count.
It looks like your second attempt was close: use index=Student_Entry NOT Subject=*
Then you only need to add | stats count
You can do it this way, too:
index=Student_Entry
| where isnull(subject)
| stats count
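The difference between NOT and != can be checked with a small run-anywhere sketch. It generates three synthetic events with makeresults and gives only the first one a Subject value; the field n and the value "math" are made up for illustration.

```spl
| makeresults count=3
| streamstats count as n
| eval Subject=if(n==1, "math", null())
| search NOT Subject=*
| stats count
```

This returns a count of 2, the two events without a Subject. Replacing the search line with | search Subject!=* returns a count of 0, because != only matches events where the field exists and differs from the value, and * matches every existing value.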

How to compare a value with the number of matches for a second query?

"daily unique entry text"
| spath input=stats
| where ('expectedCount' != _???_)
I have one daily unique log entry that states the expectedCount of items that will be processed for that day.
Let's say this daily entry contains the unique text daily unique entry text, and each time an item gets processed successfully, I log item processed.
I'd like an alert that fires if expectedCount is not equal to the number of item processed log entries that follow, in that day.
Can this be accomplished with something at _???_? Or: what's the best way to do this? Thanks in advance!
index=* "item processed" | stats count | append [ search index=* "daily unique entry text" | spath input=stats | fields expectedCount ] | stats values(count) as c, values(expectedCount) as ec | where c != ec
I think this is the most straightforward approach, but there are other ways. The first search just gets a count of all the items processed. You may need to restrict it to a day if required. Then we append another search, which just returns the value of the expected count. We use another stats command to get the actual count and the expected count together. Then, just compare them.
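As a sketch of the "restrict it to a day" step, the same search can be bounded with time modifiers. The choice of earliest=@d (midnight today) and latest=now is an assumption about what "that day" means for the alert; adjust it to the schedule the alert runs on.

```spl
index=* "item processed" earliest=@d latest=now
| stats count
| append
    [ search index=* "daily unique entry text" earliest=@d latest=now
    | spath input=stats
    | fields expectedCount ]
| stats values(count) as c, values(expectedCount) as ec
| where c != ec
```

The search returns a row only when the two numbers differ, so an alert can be set to fire when the number of results is greater than zero.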

How to aggregate logs by field and then by bin in AWS CloudWatch Insights?

I'm trying to write a query that first aggregates by field count and then by bin(1h). For example, I would like to get a result like:
# Date Field Count
1 2019-01-01T10:00:00.000Z A 123
2 2019-01-01T11:00:00.000Z A 456
3 2019-01-01T10:00:00.000Z B 567
4 2019-01-01T11:00:00.000Z B 789
Not sure if it's possible though, the query should be something like:
fields Field
| stats count() by Field by bin(1h)
Any ideas how to achieve this?
Is this what you need?
fields Field | stats count() by Field, bin(1h)
If you want to create a line chart, you can do it by separately counting each value that your field could take.
fields
Field = 'A' as is_A,
Field = 'B' as is_B
| stats sum(is_A) as A, sum(is_B) as B by bin(1hour)
This solution requires your query to include a string literal of each value ('A' and 'B' in OP's example). It works as long as you know what those possible values are.
This might be what Hugo Mallet was looking for, except the avg() function won't work here so he'd have to calculate the average by dividing by a total
Not able to group by a certain field and create visualizations.
fields Field
| stats count() by Field, bin(1h)
Keep getting this message
No visualization available. Try this to get started:
stats count() by bin(30s)