Match partial string from list with field - kql

I'm trying to check whether a field contains a value from a list, using Kusto in Log Analytics/Sentinel in Azure.
The list contains top-level domains, but I only want matches for subdomains of these domains. The list value example.com should match values such as forum.example.com or api.example.com.
I have the following code, but it only does exact matches.
let domains = dynamic(["example.com", "amazon.com", "microsoft.com", "google.com"]);
DeviceNetworkEvents
| where RemoteUrl in~ (domains)
| project TimeGenerated, DeviceName, InitiatingProcessAccountUpn, RemoteUrl
I tried with endswith, but couldn't get that to work with the list.

It seems that has_any() would work for you:
let domains = dynamic(["example.com", "amazon.com", "microsoft.com", "google.com"]);
DeviceNetworkEvents
| where RemoteUrl has_any(domains)
| project TimeGenerated, DeviceName, InitiatingProcessAccountUpn, RemoteUrl
Note that you can also use has_any_index() to find out which item in the array was matched.

In order to correctly match URLs with a list of domains, you need to build a regex from these domains, and then use the matches regex operator.
Make sure you build the regex correctly, in order not to allow these:
example.com.hacker.com
hackerexample.com
hacker.com/example.com
Etc...
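As a rough illustration of the regex-building step, here is a minimal Python sketch. It assumes the field holds a bare hostname (no scheme, port, or path), and the helper names build_subdomain_regex and is_subdomain are made up for this example. The resulting pattern uses only anchors, non-capturing groups, and character classes, so something similar should be usable with matches regex, but that is not tested here.

```python
import re

DOMAINS = ["example.com", "amazon.com", "microsoft.com", "google.com"]

def build_subdomain_regex(domains):
    """Compile a regex that matches only strict subdomains of the listed domains."""
    # Escape each domain so that "." is treated literally, not as "any character".
    alternatives = "|".join(re.escape(d) for d in domains)
    # ^(label.)+ demands at least one subdomain label before the listed domain;
    # $ rejects anything appended after it, such as ".hacker.com" or a path.
    return re.compile(r"^(?:[a-z0-9-]+\.)+(?:" + alternatives + r")$", re.IGNORECASE)

PATTERN = build_subdomain_regex(DOMAINS)

def is_subdomain(host):
    """Return True if host is a subdomain of one of DOMAINS (hypothetical helper)."""
    return PATTERN.match(host) is not None

for host in ["forum.example.com", "example.com.hacker.com",
             "hackerexample.com", "hacker.com/example.com"]:
    print(host, is_subdomain(host))
```

The bare domain example.com is rejected on purpose, since the question asks for subdomains only; changing the `+` after the label group to `*` would accept it as well.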

user wants to apply a quite complex "User Search Filter" in his LDAP Configuration

A user has to apply a fairly complex "User Search Filter" in their LDAP configuration.
The filter is too long and exceeds the 256 characters allowed. Because of the customer's business policy, it is not possible to modify the LDAP structure or data. How can we proceed?
Here there is a sample of the filter:
(&
(|
(memberOf=CN=Applicazione_DocB_AmmApplicativo,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_AmmPiattaforma,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_ArchFIRead,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_ArchFIWrite,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_AreaFinanza,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_Arm,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_BoGestCanc,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_BoUpdDocum,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_Crif,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_VisualBase,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
(memberOf=CN=Applicazione_DocB_VisualEsteso,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
)(|
(userAccountControl=512)
(userAccountControl=544)
(userAccountControl=66048)
)
)
Have the customer create one single group to control access to the application, then they can add all of those groups to that one group. Then you only need to look at that one group. However, you will need to use the LDAP_MATCHING_RULE_IN_CHAIN operator so that it will look at the members of nested groups.
If the name of that new group is Applicazione_DocB, that would look something like this:
(memberOf:1.2.840.113556.1.4.1941:=CN=Applicazione_DocB,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)
Your conditions on userAccountControl can also be simplified. That attribute is a set of bit flags, which means that each bit in the binary value is a flag with its own meaning. Those values are listed in the documentation for userAccountControl. The three conditions you are using are:
512: ADS_UF_NORMAL_ACCOUNT
544: ADS_UF_NORMAL_ACCOUNT | ADS_UF_PASSWD_NOTREQD (password not required)
66048: ADS_UF_NORMAL_ACCOUNT | ADS_UF_DONT_EXPIRE_PASSWD (password does not expire)
If the intent is to exclude disabled accounts (514: ADS_UF_NORMAL_ACCOUNT | ADS_UF_ACCOUNTDISABLE), then you can do that by using the LDAP_MATCHING_RULE_BIT_AND operator to check that the second bit (value 2, which marks a disabled account when set) is not set. Note that standard LDAP filter syntax requires the negated condition to be wrapped in its own parentheses:
(!(userAccountControl:1.2.840.113556.1.4.803:=2))
Putting that all together, you get a query that is less than 256 characters:
(&(memberOf:1.2.840.113556.1.4.1941:=CN=Applicazione_DocB,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT)(!(userAccountControl:1.2.840.113556.1.4.803:=2)))
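To make the bit arithmetic concrete, here is a small Python sketch that checks the userAccountControl values mentioned above against the disabled-account bit and confirms the length of the combined filter. The helper names is_enabled and build_filter are invented for this example, the group Applicazione_DocB is the hypothetical new group from the answer, and the NOT clause is written in the fully parenthesized form `(!(...))`.

```python
ACCOUNT_DISABLE = 0x2  # ADS_UF_ACCOUNTDISABLE, the second bit of userAccountControl

# Matching rule OIDs quoted in the answer above.
IN_CHAIN = "1.2.840.113556.1.4.1941"   # LDAP_MATCHING_RULE_IN_CHAIN
BIT_AND = "1.2.840.113556.1.4.803"     # LDAP_MATCHING_RULE_BIT_AND

# DN of the single (hypothetical) group that nests all the application groups.
GROUP_DN = "CN=Applicazione_DocB,OU=Intranet,OU=Gruppi,DC=CBMAIN,DC=CBDOM,DC=IT"

def is_enabled(uac):
    """True when the disabled bit is clear, which is what the bitwise-AND filter tests."""
    return (uac & ACCOUNT_DISABLE) == 0

def build_filter(group_dn):
    """Assemble the nested-group membership test plus the 'not disabled' test."""
    return (
        f"(&(memberOf:{IN_CHAIN}:={group_dn})"
        f"(!(userAccountControl:{BIT_AND}:={ACCOUNT_DISABLE})))"
    )

for value in (512, 544, 66048, 514):
    print(value, "enabled" if is_enabled(value) else "disabled")

ldap_filter = build_filter(GROUP_DN)
print(len(ldap_filter), ldap_filter)
```

One difference from the original three equality tests: the bitwise check accepts any enabled account, not only those with exactly the values 512, 544, or 66048.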

Azure Workbook parameter for resource and resource group

I've been defeated by Kusto on what I thought to be a simple query...
I'm making my first workbook and playing with parameters. I can list and select a Resource Group, but I can't make the following parameter (Virtual Machines) populate with the VMs when more than one Resource Group is selected. The ResourceGroup parameter passes a comma-delimited string of the group names as the property resourcegroup just fine. I cannot figure out how to translate that string into a usable where clause. My query works just fine when I manually string several Resource Groups together, so I assume I'm getting burned by my understanding of let and arrays in Kusto. If there is a better way of doing what I'm trying to do, please let me know.
//This will work so long as 1 Resource Group is passed from the previous parameter
resources
| where resourceGroup in ('{ResourceGroup:resourcegroup}') and type =~ 'microsoft.compute/virtualmachines'
| project value = id , label = name
I've figured out I can get a proper array with split('{ResourceGroup:resourcegroup}',","), but, again, I haven't been able to combine that array with a where clause.
Any help is much appreciated!
The workbooks documentation at https://github.com/microsoft/Application-Insights-Workbooks/blob/master/Documentation/Parameters/DropDown.md#special-casing-all
describes the usual way to do this:
let resourceGroups = dynamic([{ResourceGroup:resourcegroup}]);// turns even an empty string into a valid array
resources
| where (array_length(resourceGroups)==0 // allows 0 length array to be "all"
or resourceGroup in (resourceGroups)) // or filters to only those in the set
and type =~ 'microsoft.compute/virtualmachines'
| project value = id , label = name
However, I don't think Azure Resource Graph allows using let this way.
If you get an error that let isn't allowed, you'll have to repeat the dynamic(...) expression inline instead:
resources
| where (array_length(dynamic([{ResourceGroup:resourcegroup}]))==0 // allows 0 length array to be "all"
or resourceGroup in (dynamic([{ResourceGroup:resourcegroup}]))) // or filters to only those in the set
and type =~ 'microsoft.compute/virtualmachines'
| project value = id , label = name

Splunk Host header overrides host key from log messages

How can I stop Splunk considering hostname "host" more important than "host" key?
Let's suppose that I have the following logs:
color = red ; host = localhost
color = blue ; host = newhost
The following query works fine:
index=myindex | stats count by color
but the following doesn't:
index=myindex | stats count by host
because instead of considering "host" being the key from the log, it sees the Host header as "host".
How can I deal with this?
When there are two fields with the same name one of them has to "win". In this case, it's the one Splunk defines before it processes the event itself. As you probably know, every event is given 4 fields at input time: index, host, source, and sourcetype. Data from the event won't override these unless specifically told to do so in the config files.
To override the settings, put this in your transforms.conf file
[sethost]
REGEX = host\s*=\s*(\w+)
DEST_KEY = MetaData:Host
FORMAT = host::$1
You'll also need to reference the transform in your props.conf file
[mysourcetype]
TRANSFORMS-host = sethost
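Before deploying the transform, it can help to sanity-check the REGEX against sample events. The Python sketch below applies the same pattern to the two log lines from the question and formats the capture the way FORMAT = host::$1 would. The function name extract_host is made up for this example, and the check only exercises the regular expression, not Splunk's index-time behavior.

```python
import re

# Same pattern as the REGEX line in the [sethost] stanza.
HOST_RE = re.compile(r"host\s*=\s*(\w+)")

def extract_host(event):
    """Return the value the transform would write, or None if the event has no host key."""
    match = HOST_RE.search(event)
    return f"host::{match.group(1)}" if match else None

events = [
    "color = red ; host = localhost",
    "color = blue ; host = newhost",
]
for event in events:
    print(extract_host(event))
```

Because \w+ stops at the first character that is not a letter, digit, or underscore, a value such as web-01.example.com would be cut short at "web"; a wider character class may be needed for such hostnames.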
I would have thought this solution would be more prominent, but I found it buried deep in the Splunk docs.
https://docs.splunk.com/Documentation/Splunk/8.2.6/Metrics/Search
You can use reserved fields such as "source", "sourcetype", or "host" as dimensions. However, when extracted dimension names are reserved names, the name is prefixed with "extracted_" to avoid name collision. For example, if a dimension name is "host", search for "extracted_host" to find it.
So, in your case:
index=myindex | stats count by extracted_host

group by part of url using regex splunk

I have multiple URLs that all start with /api/net, and I want to group by the next couple of path segments, which are separated by /, like
/api/net/abc/def?key=value
/api/net/c/d?key1=value1
/api/net/j/h?key2=value2
I have the regular expression below, which parses all the URLs, but I have to list the required segments explicitly in the regular expression.
| rex field=requestPath "(?<volga>.+?(\/abc\/def)|(\/c\/d)|(\/j\/h).+?)"
volga is a named capturing group. I want to group by volga without hard-coding /abc/def, /c/d, and /j/h in the regular expression, so that I can find out how many distinct values there are.
There are other values that I would not know to add, so I want to group by the next two segments (split by /) after "net" and ignore the rest of the URL. Let me know if this is unclear and I can explain more.
If I understand the question correctly, this regex will parse the URL and return the two path segments after /api/net as 'dom1' and 'dom2', respectively. Then you can group/sort on them.
... | rex field=requestPath "\/api\/net\/(?<dom1>[^\/]+)\/(?<dom2>[^\/\?]+)"
| stats values(*) as * by dom1,dom2
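The same extraction can be tried outside Splunk. This Python sketch applies an equivalent pattern to the sample paths from the question and counts events per (dom1, dom2) pair; Python spells named groups (?P<name>...), and the function name count_by_segments is invented for this example.

```python
import re
from collections import Counter

# Equivalent of the rex pattern: two path segments after /api/net,
# with the second one stopping at "/" or at the start of the query string.
PATH_RE = re.compile(r"/api/net/(?P<dom1>[^/]+)/(?P<dom2>[^/?]+)")

def count_by_segments(paths):
    """Count how many paths share each (dom1, dom2) pair; non-matching paths are skipped."""
    counts = Counter()
    for path in paths:
        match = PATH_RE.search(path)
        if match:
            counts[(match.group("dom1"), match.group("dom2"))] += 1
    return counts

sample = [
    "/api/net/abc/def?key=value",
    "/api/net/c/d?key1=value1",
    "/api/net/j/h?key2=value2",
    "/api/net/abc/def?key=other",
]
for pair, total in count_by_segments(sample).items():
    print(pair, total)
```

If only the number of events per group is needed, replacing the stats clause with stats count by dom1, dom2 may be closer to what the question asks for.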

Using REGEXP_EXTRACT to get domain and subdomains

I have only managed to extract the TLD of the list of websites that I have using
REGEXP_EXTRACT(Domain_name, r'(\.[^.:]*)\.?:?[0-9]*$') AS web_tld
Example:
I have
www.example1.abc.com
www.example2.efg.123.net
I want the result
Subdomain
example1
efg
Domain
abc
123
TLD
.com
.net
EDIT:
Encountered an error in my query
'Exactly one capturing group must be specified'
when I use (\.?([^.:]+)\.([^.:]+)\.([^.:]+):?[0-9]*$) as regex
SELECT
REGEXP_EXTRACT(Domain, r'(\.?([^.:]+)\.([^.:]+)\.([^.:]+):?[0-9]*$)'),
FROM [weblist.domain]
ORDER BY 1
LIMIT 250;
As you can only use one capturing group, I think you can actually use 3 separate regular expressions to get the values you want:
SELECT
REGEXP_EXTRACT(Domain, r'([^.:]+):?[0-9]*$'),
REGEXP_EXTRACT(Domain, r'([^.:]+)\.[^.:]+:?[0-9]*$'),
REGEXP_EXTRACT(Domain, r'([^.:]+)\.[^.:]+\.[^.:]+:?[0-9]*$')
FROM [weblist.domain]
ORDER BY 1
LIMIT 250;
Note that you may be better off using the built-in HOST, DOMAIN, and TLD functions rather than custom regular expressions.
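The one-capturing-group-per-expression approach can be checked quickly in Python against the two hostnames from the question. This is only a sketch: the dots between labels are escaped here, the function name split_host is made up, and Python's re module stands in for the database's regex engine, which may differ in details.

```python
import re

# One capturing group per expression, each anchored at the end of the string.
# The optional ":port" suffix is consumed but not captured.
TLD_RE = re.compile(r"([^.:]+):?[0-9]*$")
DOMAIN_RE = re.compile(r"([^.:]+)\.[^.:]+:?[0-9]*$")
SUBDOMAIN_RE = re.compile(r"([^.:]+)\.[^.:]+\.[^.:]+:?[0-9]*$")

def split_host(host):
    """Return (subdomain, domain, tld) using three separate single-group searches."""
    def first_group(pattern):
        match = pattern.search(host)
        return match.group(1) if match else None
    return first_group(SUBDOMAIN_RE), first_group(DOMAIN_RE), first_group(TLD_RE)

for host in ["www.example1.abc.com", "www.example2.efg.123.net"]:
    print(host, split_host(host))
```

The TLD comes back without its leading dot, unlike the ".com" shown in the desired output, so prepend one if that form is required.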