Deep dive into Azure Log Analytics cost using a KQL query

I'm running the following Log Analytics Kusto query to see which data is ingested and thus generates our Log Analytics cost:
Usage
| where IsBillable == true
| summarize BillableDataGB = sum(Quantity) by Solution, DataType
| sort by Solution asc, DataType asc
The output then lists BillableDataGB per Solution and DataType.
What kind of query should I use if I want to dig deeper, e.g. into ContainerInsights/InfrastructureInsights/ServiceMap/VMInsights/LogManagement, so as to see in more detail which names or namespaces really drive the cost?
The InsightsMetrics table has, for example, these names and namespaces.
I was maybe able to get something out using the following query, but something is still missing. I'm not totally sure whether I'm on the right track:
union withsource = tt *
| where _IsBillable == true
| extend Namespace, Name

Here is a Kusto query for getting the name and namespace details:
let startTimestamp = ago(1h);
KubePodInventory
| where TimeGenerated > startTimestamp
| project ContainerID, PodName=Name, Namespace
| where PodName contains "name" and Namespace startswith "namespace"
| distinct ContainerID, PodName
| join
(
ContainerLog
| where TimeGenerated > startTimestamp
)
on ContainerID
// at this point before the next pipe, columns from both tables are available to be "projected". Due to both
// tables having a "Name" column, we assign an alias as PodName to one column which we actually want
| project TimeGenerated, PodName, LogEntry, LogEntrySource
| summarize by TimeGenerated, LogEntry
| order by TimeGenerated desc
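If the goal is to see which tables, and within Container Insights which namespaces, actually drive the bill, one option is to aggregate on the standard _BilledSize column. The two queries below are a minimal sketch and are meant to be run separately; the one-day window and the KubePodInventory/Namespace breakdown are illustrative assumptions, not part of the original answer:
// Billable volume per table over the last day
union withsource = SourceTable *
| where TimeGenerated > ago(1d)
| where _IsBillable == true
| summarize BillableDataGB = sum(_BilledSize) / pow(1024, 3) by SourceTable
| sort by BillableDataGB desc
// Billable volume per Kubernetes namespace, using Container Insights inventory data
KubePodInventory
| where TimeGenerated > ago(1d)
| where _IsBillable == true
| summarize BillableDataGB = sum(_BilledSize) / pow(1024, 3) by Namespace
| sort by BillableDataGB desc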
For more information you can go through the Microsoft documentation, and here is the Kusto Query Tutorial.

Related

Self-join Kusto Query in Analytics Rule

I am working within Microsoft Sentinel Analytics Rules with the Kusto Query Language (KQL).
I need to work with a table called CrowdstrikeReplicatorLogs_CL, which contains a) data rows that I need to alert on and b) metadata rows that contain information about the subject of the alert.
This means I need to join the table with itself to get the final result.
The column used to join the table with itself is the aid_g column.
ThreatIntelligenceIndicator
| where foo == bar
| join kind=innerunique (
CrowdstrikeReplicatorLogs_CL
| where TimeGenerated >= ago(dt_lookBack)
| where event_simpleName_s has_any ("NetworkConnectIP4", "NetworkConnectIP6")
| extend json=parse_json(custom_fields_message_s)
| extend ip4 = json["RemoteAddressIP4"], ip6=json["RemoteAddressIP6"]
| extend CS_ipEntity = tostring(iff(isnotempty(ip4), ip4, ip6))
| extend CommonSecurityLog_TimeGenerated = TimeGenerated
) on $left.TI_ipEntity == $right.CS_ipEntity
| join kind=innerunique (
CrowdstrikeReplicatorLogs_CL
| where custom_fields_message_s has "ComputerName"
| extend customFields=parse_json(custom_fields_message_s)
| project Hostname=customFields['ComputerName'], Platform=event_platform_s, aid_g
) on $left.aid_g == $right.aid_g
;
However, this raises a "Query contains incompatible 'set' commands" error in Sentinel.
Is there a proper way to self-join tables?
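No answer is recorded here, but one common pattern for this kind of self-join is to pull the shared table expression into a let statement wrapped in materialize(), so the same filtered data set is evaluated once and joined against twice. The sketch below only illustrates that pattern using the question's column names; the dt_lookBack value is a placeholder, and this is not a confirmed fix for the 'set' commands error:
let dt_lookBack = 1h; // placeholder; use the value defined in the rule
let csEvents = materialize(
    CrowdstrikeReplicatorLogs_CL
    | where TimeGenerated >= ago(dt_lookBack)
);
ThreatIntelligenceIndicator
| where foo == bar
| join kind=innerunique (
    csEvents
    | where event_simpleName_s has_any ("NetworkConnectIP4", "NetworkConnectIP6")
    | extend json = parse_json(custom_fields_message_s)
    | extend CS_ipEntity = tostring(iff(isnotempty(json["RemoteAddressIP4"]), json["RemoteAddressIP4"], json["RemoteAddressIP6"]))
    | extend CommonSecurityLog_TimeGenerated = TimeGenerated
) on $left.TI_ipEntity == $right.CS_ipEntity
| join kind=innerunique (
    csEvents
    | where custom_fields_message_s has "ComputerName"
    | extend customFields = parse_json(custom_fields_message_s)
    | project Hostname = tostring(customFields["ComputerName"]), Platform = event_platform_s, aid_g
) on $left.aid_g == $right.aid_g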

Create a Join in Azure Resource Graph

I'm new to KQL and wondering if anyone knows how to do a join using the query tables below.
policyresources
| where type == "microsoft.authorization/policysetdefinitions" |join advisorresources
Just looking at the graph resources
https://learn.microsoft.com/en-us/azure/governance/resource-graph/samples/samples-by-table?tabs=azure-cli#advisorresources
&
https://learn.microsoft.com/en-us/azure/governance/resource-graph/samples/samples-by-table?tabs=azure-cli#advisorresources
policyresources
| where type == "microsoft.authorization/policysetdefinitions" | extend resourceId = tostring(properties.resourceId)
| join (advisorresources | extend resourceId = tostring(properties.resourceMetadata.resourceId)) on resourceId
This may work. I'm not 100% sure whether resourceId is the common key, but it's worth a try.
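If that join does line up, a slightly fuller sketch that also projects a readable column from each side could look like the following; the properties.category field on advisorresources and the chosen output columns are assumptions, not verified against the schema:
policyresources
| where type == "microsoft.authorization/policysetdefinitions"
| extend resourceId = tostring(properties.resourceId)
| join (
    advisorresources
    | extend resourceId = tostring(properties.resourceMetadata.resourceId)
    | project resourceId, recommendationCategory = tostring(properties.category)
) on resourceId
| project name, resourceId, recommendationCategory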

Take output from query and use in subsequent KQL query

I'm using Azure Log Analytics to review certain events of interest.
I would like to obtain timestamps from data that meets a certain criteria, and then reuse these timestamps in further queries, i.e. to see what else occurred around these times.
The following query returns the desired results, but I'm stuck on how to use the interestingTimes variable to perform further searches and show data within X minutes of each previously returned timestamp.
let interestingTimes =
Event
| where TimeGenerated between (datetime(2021-04-01T11:57:22) .. datetime('2021-04-01T15:00:00'))
| where EventID == 1
| parse EventData with * '<Data Name="Image">' ImageName "<" *
| where ImageName contains "MicrosoftEdge.exe"
| project TimeGenerated
;
Any pointers would be greatly appreciated.
interestingTimes will only be available for use in the query where you declare it. You can't use it in another query, unless you define it there as well.
By the way, you can make your query much more efficient by adding a filter that will utilize the built-in index for the EventData column, so that the parse operator will run on a much smaller amount of records:
let interestingTimes =
Event
| where TimeGenerated between (datetime(2021-04-01T11:57:22) .. datetime('2021-04-01T15:00:00'))
| where EventID == 1
| where EventData has "MicrosoftEdge.exe" // <-- OPTIMIZATION that will filter out most records
| parse EventData with * '<Data Name="Image">' ImageName "<" *
| where ImageName contains "MicrosoftEdge.exe"
| project TimeGenerated
;
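To actually reuse those timestamps in the same query, one option is to cross-join interestingTimes back against Event and keep only the rows that fall inside a window around each timestamp. This is a sketch under stated assumptions: the 5-minute window, the dummy join key, and searching the same Event table for the surrounding activity are all illustrative choices:
let lookAround = 5m;
let interestingTimes =
    Event
    | where TimeGenerated between (datetime(2021-04-01T11:57:22) .. datetime(2021-04-01T15:00:00))
    | where EventID == 1
    | where EventData has "MicrosoftEdge.exe"
    | parse EventData with * '<Data Name="Image">' ImageName "<" *
    | where ImageName contains "MicrosoftEdge.exe"
    | project InterestingTime = TimeGenerated;
Event
| where TimeGenerated between (datetime(2021-04-01T11:57:22) .. datetime(2021-04-01T15:00:00))
| extend dummy = 1
| join kind=inner (interestingTimes | extend dummy = 1) on dummy
| where TimeGenerated between ((InterestingTime - lookAround) .. (InterestingTime + lookAround))
| project-away dummy, dummy1
| order by TimeGenerated asc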

How to fix: cannot retrieve all fields from a MongoDB collection with an Apache Drill SQL expression query

I'm trying to retrieve all (*) columns from a MongoDB collection with Apache Drill, using the SQL expression:
`_id`.`$oid`
Background: I'm using Apache Drill to query MongoDB collections. By default, Drill retrieves the ObjectId values in a different format than the one stored in the database. For example:
Mongo: ObjectId("59f2c3eba83a576fe07c735c")
Drill query result: [B#3149a…]
In order to get the data in String format (59f2c3eba83a576fe07c735c) from the object, I changed the Drill config "store.mongo.bson.record.reader" to "false".
ALTER SESSION SET store.mongo.bson.record.reader = false
Drill query result after config set to false:
select * from calc;
+--------------------------------------+---------+
| _id | name |
+--------------------------------------+---------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
+--------------------------------------+---------+
Running a query by _id:
select `o`.`_id`.`$oid` , `o`.`name` from mongo.od_teste.calc o where `o`.`_id`.`$oid`='5cb0e161f0849231dfe16d99';
Result:
+---------------------------+---------+
| EXPR$0 | name |
+---------------------------+---------+
| 5cb0e161f0849231dfe16d99 | thiago |
+---------------------------+---------+
For an object with a few columns like the one above (_id, name), it's OK to specify all the columns in the select-by-id query. However, in my production database, the objects have hundreds of columns.
If I try to query all (*) columns from the collection, this is the result:
select `o`.* from mongo.od_teste.calc o where `o`.`_id`.`$oid`='5cb0e161f0849231dfe16d99';
or
select * from mongo.od_teste.calc o where `o`.`_id`.`$oid`='5cb0e161f0849231dfe16d99';
+-----+
| ** |
+-----+
+-----+
No rows selected (6.112 seconds)
Expected result: retrieve all columns from a MongoDB collection instead of declaring all of them in the SQL query.
I have no suggestions here, because it is a bug in the Mongo storage plugin.
I have created a Jira ticket for it; please take a look and feel free to add any related info there: DRILL-7176

Stats Count Splunk Query

I wonder whether someone can help me please.
I made the following post about a Splunk query I'm trying to write:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html
I received some great help, but despite working on this for a few days now, concentrating on using eval if statements, I still have the same issue with the "Successful" and "Unsuccessful" columns showing blank results. So I thought I'd cast the net a little wider and ask whether someone may be able to look at this and offer some guidance on how I might get around the problem.
Many thanks and kind regards
Chris
I tried exploring your use case with the splunkd access log and came up with a simple SPL query to help you.
In this query I am joining the output of two searches which aggregate the required results (not concerned about search performance here).
Give it a try. If you have access to the _internal index, this will work as is. You should be able to modify it easily to suit your events (e.g. replace user with ClientID).
index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| stats count as All sum(eval(if(status <= 303,1,0))) as Successful sum(eval(if(status > 303,1,0))) as Unsuccessful by user
| join user type=left
[ search index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| chart count BY user status ]
I updated your search from the Splunk community answers; it should look like this:
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| stats count as All sum(eval(if(statusCode <= 303,1,0))) as Successful sum(eval(if(statusCode > 303,1,0))) as Unsuccessful by ClientID
| join ClientID type=left
[ search `w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| chart count BY ClientID statusCode ]
I answered on Splunk Answers:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html?childToView=729492#answer-729492
but using dummy encoding, it looks like this:
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientId as ClientID, detail.statusCode as Status
| eval X_{Status}=1
| stats count as Total sum(X_*) as X_* by ClientID
| rename X_* as *
Will give you ClientID, count and then a column for each status code found, with a sum of each code in that column.
As I gather you can't get this working, this query should show dummy encoding in action:
index=_internal sourcetype=*access
| eval X_{status}=1
| stats count as Total sum(X_*) as X_* by source, user
| rename X_* as *
This would give an output with one row per source and user, a Total count column, and a column for each status code found.