Best method to keep lookup file values fresh - Splunk

Say I have to monitor user activity in 3 specific departments: Science, History, and Math.
The goal is to send an alert if any user in any of those departments downloads a file from site XYZ.
Currently, I have a lookup file of all the users in those three departments.
users
----------------------
user1@organization.edu
user2@organization.edu
user3@organization.edu
user4@organization.edu
user5@organization.edu
One problem: users can join, leave, or transfer to another department anytime.
Fortunately, those activities (join and leave) are tracked and they are Splunk-able.
index=directory status=*
-----------------------------------------------
{
"username":"user1@organization.edu",
"department":"Science",
"status":"added"
}
{
"username":"user1@organization.edu",
"department":"Science",
"status":"removed"
}
{
"username":"user2@organization.edu",
"department":"History",
"status":"added"
}
{
"username":"user3@organization.edu",
"department":"Math",
"status":"added"
}
{
"username":"MRROBOT@organization.edu",
"department":"Math",
"status":"added"
}
In this example, assuming I forgot to update the lookup file, I won't get an alert when MRROBOT@organization.edu downloads a file, and at the same time I will still get an alert when user1@organization.edu downloads a file.
One solution I could think of is to update the lookup manually using the inputlookup and outputlookup commands, like:
| inputlookup users.csv | where users!="user1@organization.edu" | outputlookup users.csv
But I don't think this is an efficient method, especially since it's highly likely I'd miss a user or two.
Is there a better way to keep the lookup file up to date? I googled around, and one suggestion is to use a cron job with curl to update the list. But I was wondering if there's a simpler or better alternative than that.

Here's a search that should automate the maintenance of the lookup file using the activity events in Splunk.
`comment("Read in the lookup file. Force them to have old timestamps")`
| inputlookup users.csv | eval _time=1, status="added"
`comment("Add in activity events")`
| append [ search index=foo ]
`comment("Keep only the most recent record for each user")`
| stats latest(_time) as _time, latest(status) as status by username
`comment("Throw out users with status of 'removed'")`
| where NOT status="removed"
`comment("Save the new lookup")`
| table username
| outputlookup users.csv
After the append command, you should have a list that looks like this:
user1@organization.edu added
user2@organization.edu added
user3@organization.edu added
user4@organization.edu added
user5@organization.edu added
user1@organization.edu added
user1@organization.edu removed
user2@organization.edu added
user3@organization.edu added
MRROBOT@organization.edu added
The stats command will reduce it to:
user4@organization.edu added
user5@organization.edu added
user1@organization.edu removed
user2@organization.edu added
user3@organization.edu added
MRROBOT@organization.edu added
with the where command further reducing it to:
user4@organization.edu added
user5@organization.edu added
user2@organization.edu added
user3@organization.edu added
MRROBOT@organization.edu added
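Saved as a scheduled search (hourly, say), that maintenance job keeps users.csv in sync with the directory events, so nobody has to edit it by hand. The alert search can then pull the fresh list from the lookup with a subsearch. A minimal sketch, where the proxy index, activity field, and site field are illustrative assumptions (the 10000 in return caps how many users get expanded):
index=proxy activity="download" site="XYZ"
    [ | inputlookup users.csv | rename users AS username | return 10000 username ]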

Related

Search using Lookup from a single field CSV file

I have a list of usernames that I have to monitor, and the list is growing every day. I read the Splunk documentation, and it seems like a lookup is the best way to handle this situation.
The goal is for my query to leverage the lookup and print out all the download events from all the users in the list.
Sample logs
index=proxy123 activity="download"
{
"machine":"1.1.1.1",
"username":"ABC@xyz.com",
"activity":"download"
}
{
"machine":"2.2.2.2",
"username":"ASDF@xyz.com",
"activity":"download"
}
{
"machine":"3.3.3.3",
"username":"GGG@xyz.com",
"activity":"download"
}
Sample Lookup (username.csv)
users
ABC@xyz.com
ASDF@xyz.com
BBB@xyz.com
Current query:
index=proxy123 activity="download" | lookup username.csv users OUTPUT users | where not isnull(users)
Result: 0 (which is not correct)
I probably don't understand lookup correctly. Can someone correct me and teach me the correct way?
In the lookup file, the name of the field is users, whereas in the event it is username. Fortunately, the lookup command has a mechanism for renaming fields during the lookup. Try the following:
index=proxy123 activity="download" | lookup username.csv users AS username OUTPUT users | where isnotnull(users)
Now, depending on the volume of data you have in your index and how much data is being discarded when not matching a username in the CSV, there may be alternate approaches you can try, for example, this one using a subsearch.
index=proxy123 activity="download" [ | inputlookup username.csv | rename users AS username | return 10 username ]
What happens here in the subsearch (the bit in the []) is that the subsearch is expanded first, in this case to (username="ABC@xyz.com" OR username="ASDF@xyz.com" OR username="BBB@xyz.com"); note that return needs a count argument (10 here), or it returns only the first row. So your main search will turn into
index=proxy123 activity="download" (username="ABC@xyz.com" OR username="ASDF@xyz.com" OR username="BBB@xyz.com")
which may be more efficient than returning all the data in the index, then discarding anything that doesn't match the list of users.
This approach assumes that you have the username field extracted in the first place. If you don't, you can try the following.
index=proxy123 activity="download" [ | inputlookup username.csv | rename users AS search | format ]
This expanded search will be
index=proxy123 activity="download" ( "ABC@xyz.com" OR "ASDF@xyz.com" OR "BBB@xyz.com" )
which may be more suitable to your data.

Are there any KQL queries to extract page views and download counts from the W3C IIS logs on Azure Log Analytics?

We're trying to extract page views, file download counts, and a list of users from W3C IIS logs. We want to define what counts as a page view, i.e. a user staying on the same page for more than 10 seconds counts as one page view; anything less is not a page view. The W3C logs don't seem to have enough data to extract this. Is it possible with what's already available?
This is the data available to extract the above info from:
Datatable operator
datatable (TimeGenerated:datetime, csUriStem:string, scStatus:string, csUserName:string, sSiteName:string)
[datetime(2019-04-12T11:55:13Z),"/Account/","302","-","WebsiteName",
datetime(2019-04-12T11:55:16Z),"/","302","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Account/","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Content/site.css","200","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Scripts/modernizr-2.8.3.js","200","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Scripts/bootstrap.js","200","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Content/bootstrap.css","200","-","WebsiteName",
datetime(2019-04-12T11:55:18Z),"/Scripts/jquery-3.3.1.js","200","-","WebsiteName",
datetime(2019-04-12T11:55:23Z),"/","302","-","WebsiteName",
datetime(2019-04-12T11:56:39Z),"/","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:57:13Z),"/Home/About","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:58:16Z),"/Home/Contact","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:59:03Z),"/","200","myemail@mycom.com","WebsiteName"]
I am not sure I got all your requirements right, but here is something to get you started and provide initial direction.
datatable (TimeGenerated:datetime, csUriStem:string, scStatus:string, csUserName:string, sSiteName:string)
[datetime(2019-04-12T11:55:13Z),"/Account/","302","-","WebsiteName",
datetime(2019-04-12T11:55:16Z),"/","302","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Account/","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Content/site.css","200","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Scripts/modernizr-2.8.3.js","200","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Scripts/bootstrap.js","200","-","WebsiteName",
datetime(2019-04-12T11:55:17Z),"/Content/bootstrap.css","200","-","WebsiteName",
datetime(2019-04-12T11:55:18Z),"/Scripts/jquery-3.3.1.js","200","-","WebsiteName",
datetime(2019-04-12T11:55:23Z),"/","302","-","WebsiteName",
datetime(2019-04-12T11:56:39Z),"/","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:57:13Z),"/Home/About","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:58:16Z),"/Home/Contact","200","myemail@mycom.com","WebsiteName",
datetime(2019-04-12T11:59:03Z),"/","200","myemail@mycom.com","WebsiteName"]
| where scStatus !in ('302') // exclude status 302
| where csUriStem !startswith '/Scripts' and csUriStem !endswith ".css" // exclude pages coming from '/Script' and .css files
| order by TimeGenerated asc
| summarize t=make_list(TimeGenerated) by csUriStem, csUserName // create time-series of visit events
| mv-apply t to typeof(datetime) on // run subquery on each of the series
(
project isVisit = iff(t - prev(t) > 1min, 1, 0) // 1 if more than 1min passed since the previous timestamp (sum() needs a numeric, not a bool)
| summarize Visits=sum(isVisit)
)
| project csUriStem, csUserName, Visits
Here are links to make_list() (aggregation function), prev() (window function), summarize operator, and mv-apply operator
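Since the question defines a page view as staying on a page for more than 10 seconds, here is a rough variant of the same idea: a sketch that assumes dwell time can be approximated by the gap until that user's next request, appended after the same two where filters:
| order by csUserName asc, TimeGenerated asc // serialize so window functions work
| extend nextUser = next(csUserName), nextTime = next(TimeGenerated)
| extend dwell = iff(nextUser == csUserName, nextTime - TimeGenerated, timespan(null)) // gap to the same user's next request
| where dwell > 10s // count only stays longer than 10 seconds
| summarize PageViews = count() by csUriStem, csUserName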

Splunk: Find the difference between 2 events

I have a server with 2 APIs: /migrate/start and /migrate/end.
For each request, I log the userID (field usrid="") of the user being migrated and the API called (field api="").
Users call /migrate/start, then call /migrate/end. I would like to write a Splunk query to list the userIDs that are being migrated, i.e. those that have called /migrate/start but have yet to call /migrate/end. How would I write that query?
Thank you
Assuming you have only 2 api calls (start/end) in the logs, you can use a stats command to do this.
| your_search
| stats values(api) as api by usrid
| where api!="/migrate/end"
This groups all API calls made per user and removes the users that have called /migrate/end.
The general method is to get all the start and end events and match them up by user ID. Take the most recent event for each user and throw out the ones that are "/migrate/end". What's left are all the in-progress migrations. Something like this:
index = foo (api="/migrate/start" OR api="/migrate/end")
| stats latest(api) as api by usrid
| where api="/migrate/start"

Search with original text that was replaced earlier

I am gathering performance metrics for each API that we have. With the query below, I get results like:
method response_time
Create Billing 2343.2323
index="dev-uw2" logger_name="*Aspect*" message="*ApiImpl*" | rex field=message "PerformanceMetrics - method='(?<method>.*)' execution_time=(?<response_time>.*)" | table method, response_time | replace "public com.xyz.services.billingservice.model.Billing com.xyz.services.billingservice.api.BillingApiImpl.createBilling(java.lang.String)” WITH "Create Billing” IN method
If the user clicks the API text in a table cell to drill down further, it will open a new search with "Create Billing", which obviously gives zero results since we don't have any log with that string.
I want Splunk to search with the original text that was replaced earlier.
You can use click.value to get around this.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Viz/tokens
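The page above documents the predefined drilldown tokens, including $click.value$ and $row.<fieldname>$. One hedged sketch of the idea in Simple XML: keep the original method string in a hidden column and point the drilldown at it instead of the display label (the orig_method field and the panel structure are illustrative):
<table>
  <search>
    <query>index="dev-uw2" logger_name="*Aspect*" message="*ApiImpl*"
| rex field=message "PerformanceMetrics - method='(?&lt;method&gt;.*)' execution_time=(?&lt;response_time&gt;.*)"
| eval orig_method=method
| table method, orig_method, response_time
| replace "public com.xyz.services.billingservice.model.Billing com.xyz.services.billingservice.api.BillingApiImpl.createBilling(java.lang.String)" WITH "Create Billing" IN method</query>
  </search>
  <fields>method, response_time</fields>
  <drilldown>
    <!-- $row.orig_method$ reads the hidden column, so the new search uses the original string -->
    <link>search?q=search%20index%3D%22dev-uw2%22%20method%3D%22$row.orig_method|u$%22</link>
  </drilldown>
</table>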

Trying to filter an AD export script in powershell by user type?

I've been asked to pull a report containing each user's name, username, enabled/disabled status, and last login time from our Windows Server 2008 domain. I'm using the script below and it's working, but the problem is that it's pulling built-in security accounts and some system accounts, and I need just users. Does anyone know if this filtering is possible? The script I'm using is below. Thanks in advance!
$ADUserParams=@{
'Server' = 'servername.domain.local'
'Searchbase' = 'DC=domain,DC=local'
'Searchscope'= 'Subtree'
'Filter' = '*'
'Properties' = '*'
}
$SelectParams=@{
'Property' = 'CN', 'SAMAccountname', 'DisplayName', 'enabled', 'lastlogondate'
}
get-aduser @ADUserParams | select-object @SelectParams | export-csv "c:\temp\users.csv"
At the very least you'll want to modify your filter to something like:
'(&(|(objectclass=person)(objectclass=inetorgperson))(!(objectclass=computer)))'.
That will still leave Administrator, Guest, and any domain/realm trusts you've got, but otherwise it's pretty clean.
'(&(sAMAccountType=805306368)(!(isCriticalSystemObject=TRUE)))' is even cleaner, and may be exactly what you need. This uses sAMAccountType, whose value I pulled from existing AD users rather than building it from scratch.
Also there is no Enabled attribute. The closest you can get is userAccountControl. lastLogonDate is actually lastLogonTimestamp.
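For example, here is a sketch combining the cleaner LDAP filter with computed enabled/last-logon columns; the matching-rule OID 1.2.840.113556.1.4.803 does a bitwise AND, and bit 0x2 of userAccountControl is ACCOUNTDISABLE:
# Sketch: normal user accounts, excluding critical system objects.
# 'Enabled' is derived from userAccountControl; lastLogonTimestamp is a FILETIME.
Get-ADUser -Server 'servername.domain.local' `
    -LDAPFilter '(&(sAMAccountType=805306368)(!(isCriticalSystemObject=TRUE)))' `
    -Properties CN, DisplayName, userAccountControl, lastLogonTimestamp |
    Select-Object CN, SamAccountName, DisplayName,
        @{ n = 'Enabled';   e = { -not ($_.userAccountControl -band 2) } },
        @{ n = 'LastLogon'; e = { [datetime]::FromFileTime($_.lastLogonTimestamp) } } |
    Export-Csv 'c:\temp\users.csv' -NoTypeInformation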
Part of your requirements for the report is to show all users in AD, which would include system and built-in accounts. That being said, if you can exclude the OUs or containers holding the built-in/system accounts you don't want in the report, that would be easiest. It looks like you're trying to audit the whole AD DS, so either use exclusions or only include the OU that contains the user accounts, as long as user accounts can't exist anywhere else.
It really depends on what you can use to separate your built-ins and system accounts.
The easiest way would be to add a SearchBase to your $ADUserParams:
$ADUserParams=@{
'Server' = 'servername.domain.local'
...
'SearchBase' = 'OU=Lemmings,DC=contoso,DC=com'
}
If there's one OU that you need to filter out, try adding a Where-Object:
get-aduser @ADUserParams | ?{$_.DistinguishedName -notlike '*ou=Beancounters,*'} | select-object @SelectParams | export-csv "c:\temp\users.csv"
The ?{ } bit is an alias for the Where-Object command. $_ represents the objects passed along the pipe.
This is all assuming that these accounts are cleanly separated by OU, however. I know this isn't true in my environment.
You might have to play around for a while before finding something that will separate your users cleanly. It might help to store your initial query as a variable, $users = Get-ADUser @ADUserParams, and see what you can pick apart:
$users | ?{$_.SomeProperty -eq 'SomeValue'}
Try running $users[0] to get an idea of what properties there might be to help you filter through these users. If you need to wrap your head around things like -eq and -like, take a look here.
If all the accounts you want to filter out contain a character like $, you could filter the output like so (single quotes keep the $ literal):
$users | ?{$_.SamAccountName -notlike '*$*'}