How can I create a splunk query to show when there is activity from a country which has not shown activity in the prior 90 days? - splunk

I'm generally new to using Splunk and I've been assigned a task to create a stats table that would show countries that have not been active for at least 90 days but are showing activity the day of the search. Looking for some guidance to point me in the right direction in terms of what commands/searches to use in order to build a query that would meet this requirement.

Finding what's not there is not Splunk's strong suit, so building a list of countries not heard from will be a challenge. Try turning it around: build a list of countries active in the last 90 days, then alert when there's activity from a country not on the list.
Build the list by running a daily report that searches over the last 90 days for "activity". Look up the source countries, remove duplicates, then save only the country names to a lookup file.
index=foo <<search for activity>>
| iplocation ip
| fields Country
| stats count by Country
| fields - count
| outputlookup countries.csv
To find new countries, use the lookup file to rule out countries seen recently.
index=foo <<search for activity>>
| iplocation ip
| search NOT [ | inputlookup countries.csv | format ]
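Putting the two pieces together with explicit time ranges (the earliest/latest values below are assumptions, adjust them to your own schedule): schedule the lookup-building report daily over earliest=-90d@d latest=@d, and run the detection search over just the current day, something like
index=foo <<search for activity>> earliest=@d latest=now
| iplocation ip
| search NOT [ | inputlookup countries.csv | fields Country | format ]
| stats count by Country
The final stats gives the table the task asks for: one row per country that is active today but absent from the 90-day lookup.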

Splunk index usage search adding column titled NULL to results

I'm running a fairly simple search to identify index usage on my Splunk install by source, as we're running through the Enterprise 30-day trial with the intention of using Splunk Free after it expires:
index=_internal source=*license_usage.log | eval MB=b/1024/1024 | timechart span=1d sum(MB) by s where count in top50
The results for all of my data sources are returned as expected but there's an additional column titled "NULL" at the end of the results:
(screenshot: Splunk index search results showing the NULL column)
All of my data has an input source and when I click on the column and choose to view the data, it brings back no results.
Can anyone help me understand what this NULL column is please? If it's correct it suggests I'm using over the 500MB/day limit for Splunk Free, which I need to address before the trial period ends.
The NULL column appears because some events do not have an 's' field. You only want to sum the events that do have an s field, so modify your query to:
index=_internal source=*license_usage.log type=Usage
| eval MB=b/1024/1024
| timechart span=1d sum(MB) by s where count in top50
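If you just want to hide the extra series rather than exclude those events, timechart also has a usenull option, sketched here on the original search; filtering on type=Usage as above is still the cleaner fix, because the per-source s field only appears on Usage events:
index=_internal source=*license_usage.log
| eval MB=b/1024/1024
| timechart span=1d sum(MB) by s usenull=f where count in top50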

Stats Count Splunk Query

I wonder whether someone can help me please.
I'd made the following post about a Splunk query I'm trying to write:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html
I received some great help, but despite working on this for a few days now, concentrating on using eval if statements, I still have the same issue with the "Successful" and "Unsuccessful" columns showing blank results. So I thought I'd cast the net a little wider and ask whether someone may be able to look at this and offer some guidance on how I might get around the problem.
Many thanks and kind regards
Chris
I tried exploring your use case with the splunkd_access log and came up with a simple SPL query to help you.
In this query I am actually joining the output of two searches which aggregate the required results (not concerned about search performance here).
Give it a try. If you have access to the _internal index, this will work as is. You should be able to easily modify this to suit your events (e.g. replace user with ClientID).
index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| stats count as All sum(eval(if(status <= 303,1,0))) as Successful sum(eval(if(status > 303,1,0))) as Unsuccessful by user
| join user type=left
[ search index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| chart count BY user status ]
I updated your search from the Splunk community answers post (it should look like this):
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| stats count as All sum(eval(if(statusCode <= 303,1,0))) as Successful sum(eval(if(statusCode > 303,1,0))) as Unsuccessful by ClientID
| join ClientID type=left
[ search `w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| chart count BY ClientID statusCode ]
I answered this over on Splunk Answers:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html?childToView=729492#answer-729492
but using dummy encoding, it looks like this:
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientId as ClientID, detail.statusCode as Status
| eval X_{Status}=1
| stats count as Total sum(X_*) as X_* by ClientID
| rename X_* as *
This will give you ClientID, Total, and then a column for each status code found, with the sum for that code in the column.
As I gather you can't get this working, this query should show dummy encoding in action
index=_internal sourcetype=*access
| eval X_{status}=1
| stats count as Total sum(X_*) as X_* by source, user
| rename X_* as *
This would give an output with source, user, a Total column, and one column per status code found.

How do I create a Splunk query for unused event types?

I have found that I can create a Splunk query to show how many results of each event type appear:
severity=error | stats count by eventtype
This creates a table like so:
eventtype | count
------------------------
myEventType1 | 5
myEventType2 | 12
myEventType3 | 30
So far so good. However, I would like to find event types with zero results. Unfortunately, those with a count of 0 do not appear in the query above, so I can't just filter by that.
How do I create a Splunk query for unused event types?
There are lots of different ways to do that, depending on what you mean by "event types". Somewhere, you have to get a list of whatever you are interested in and roll it into the query.
Here's one version, assuming you had a csv that contained a list of eventtypes you wanted to see...
severity=error
| stats count as mycount by eventtype
| inputcsv append=t mylist.csv
| eval mycount=coalesce(mycount,0)
| stats sum(mycount) as mycount by eventtype
Here's another version, assuming that you wanted a list of all eventtypes that had occurred in the last 90 days, along with the count of how many had occurred yesterday:
earliest=-90d@d latest=@d severity=error
| addinfo
| stats count as totalcount count(eval(_time>=info_max_time-86400)) as yesterdaycount by eventtype
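And if you specifically want the unused ones, you can filter that result down to the event types with no hits yesterday, for example:
earliest=-90d@d latest=@d severity=error
| addinfo
| stats count as totalcount count(eval(_time>=info_max_time-86400)) as yesterdaycount by eventtype
| where yesterdaycount=0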

SSRS report to display Outlook type calendar - People x Days x Activities

I am after a pattern/view/opinion/tip/weblink from an SSRS/SQL expert on how I might create a report that enables me to list something like the following:
[----Person----] | [29-Sep-11] | [30-Sep-11] | [01-Oct-11] | [02-Oct-11] | [03-Oct-11]... and so on...
Bob Bobertson | Activity A | Activity B | ----Empty--- | Activity C | ---Empty--- |
Rob Robertson | Activity D | Activity E | Activity F | Activity G | ---Empty--- |
...and so on...
Date columns are dynamic - for example, 10 days on from today (a rolling window report)
The Person column is a dynamic list based on a user collection
So the above table looks pretty simple, but there's an extra dimension/depth to it.
I need to get the details for the Activity to create a link to it and style it in the report based on other flags.
I'm currently stumped on how to structure my resultset, and then how to Group/Pivot the data into a report structure.
Has anyone done similar before?
I'm using CRM 4.0 for records including Date, Person, ActivityTitle, Billable, etc.
SSRS 2008 for the report building via VS2008 BI studio
Turns out to be very simple! You can just insert a tablix and choose the date field for the column headers, and the Username for the row labels. Add a formula for the intersection to use more than one of the remaining values, and that's the three dimensions solved! – BennIT
Adolf Garlic:
Think of the tablix (matrix) control as a "design time pivot table" and you can't go wrong
Many thanks for your input Adolf!
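For what it's worth, the dataset behind such a tablix can stay completely flat, one row per person/date/activity; the tablix does the pivoting. A minimal sketch (the table and column names here are invented, since the real data lives in CRM 4.0):
SELECT
    a.PersonName,                                    -- row group in the tablix
    CAST(a.ActivityDate AS date) AS ActivityDate,    -- column group in the tablix
    a.ActivityTitle,                                 -- shown at the intersection
    a.Billable                                       -- used for links/conditional styling
FROM Activities a
WHERE a.ActivityDate >= CAST(GETDATE() AS date)
  AND a.ActivityDate < DATEADD(day, 10, CAST(GETDATE() AS date))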

Cumulative average number of records created for specific day of week or date range

Yeah, so I'm filling out a requirements document for a new client project and they're asking for growth trends and performance expectations calculated from existing data within our database.
The best source of data for something like this would be our logs table as we pretty much log every single transaction that occurs within our application.
Now, here's the issue: I don't have a whole lot of experience with MySQL when it comes to calculating cumulative sums and running averages. I've thrown together the following query, which kind of makes sense to me, but it just keeps locking up the command console. The thing takes forever to execute and there are only 80k records in the test sample.
So, given the following basic table structure:
id | action | date_created
1 | 'merp' | 2007-06-20 17:17:00
2 | 'foo' | 2007-06-21 09:54:48
3 | 'bar' | 2007-06-21 12:47:30
... thousands of records ...
3545 | 'stab' | 2007-07-05 11:28:36
How would I go about calculating the average number of records created for each given day of the week?
day_of_week | average_records_created
1 | 234
2 | 23
3 | 5
4 | 67
5 | 234
6 | 12
7 | 36
I have the following query which makes me want to murderdeathkill myself by casting my body down an elevator shaft... and onto some bullets:
SELECT
DISTINCT(DAYOFWEEK(DATE(t1.datetime_entry))) AS t1.day_of_week,
AVG((SELECT COUNT(*) FROM VMS_LOGS t2 WHERE DAYOFWEEK(DATE(t2.date_time_entry)) = t1.day_of_week)) AS average_records_created
FROM VMS_LOGS t1
GROUP BY t1.day_of_week;
Halps? Please, don't make me cut myself again. :'(
How far back do you need to go when sampling this information? This solution works as long as it's less than a year.
Because day of week and week number are constant for a record, create a companion table that has the ID, WeekNumber, and DayOfWeek. Whenever you want to run this statistic, just generate the "missing" records from your master table.
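In MySQL that companion table and its refresh might look something like this (a sketch only; VMS_LOGS and datetime_entry are taken from the question, the other names are made up):
-- one row per log record, keyed by the log id
CREATE TABLE MyCompanionTable (
    ID         INT PRIMARY KEY,
    WeekNumber INT,
    DayOfWeek  INT
);
-- INSERT IGNORE skips ids already present, so only the "missing" records are added
INSERT IGNORE INTO MyCompanionTable (ID, WeekNumber, DayOfWeek)
SELECT t.id, WEEK(t.datetime_entry), DAYOFWEEK(t.datetime_entry)
FROM VMS_LOGS t;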
Then, your report can be something along the lines of:
select
DayOfWeek
, count(*)/count(distinct(WeekNumber)) as Average
from
MyCompanionTable
group by
DayOfWeek
Of course if the table is too large, then you can instead pre-summarize the data on a daily basis and just use that, and add in "today's" data from your master table when running the report.
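The pre-summarised variant mentioned above might look like this (again just a sketch with made-up names):
-- one row per calendar day
CREATE TABLE DailySummary (
    SummaryDate DATE PRIMARY KEY,
    RecordCount INT
);
-- refresh; REPLACE keeps it safe to re-run
REPLACE INTO DailySummary (SummaryDate, RecordCount)
SELECT DATE(t.datetime_entry), COUNT(*)
FROM VMS_LOGS t
GROUP BY DATE(t.datetime_entry);
-- report: average records created per day of week
SELECT DAYOFWEEK(SummaryDate) AS day_of_week,
       AVG(RecordCount) AS average_records_created
FROM DailySummary
GROUP BY DAYOFWEEK(SummaryDate);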
I rewrote your query as:
SELECT x.day_of_week,
AVG(x.count) 'average_records_created'
FROM (SELECT DAYOFWEEK(t.datetime_entry) 'day_of_week',
COUNT(*) 'count'
FROM VMS_LOGS t
GROUP BY DAYOFWEEK(t.datetime_entry)) x
GROUP BY x.day_of_week
The reason your query takes so long is the inner select: it runs once for every row of the outer query, so with 80k records you are essentially doing 80,000 × 80,000 ≈ 6,400,000,000 row evaluations. With a query like this your best solution may be to develop a timed reporting system, where the user receives an email when the query is done and the report is constructed, or the user logs in and checks the report afterwards.
Even with the optimization written by OMG Ponies (below) you are still looking at around the same number of queries.
SELECT x.day_of_week,
AVG(x.count) 'average_records_created'
FROM (SELECT DAYOFWEEK(t.datetime_entry) 'day_of_week',
COUNT(*) 'count'
FROM VMS_LOGS t
GROUP BY DAYOFWEEK(t.datetime_entry)) x
GROUP BY x.day_of_week