Kusto: How to unpivot - turn columns into rows?

Using the StormEvents table on the Samples database on the help cluster:
StormEvents
| where State startswith "AL"
| where EventType has "Wind"
| where StartTime == "2007-01-02T02:16:00Z"
| project StartTime, State, EventType, InjuriesDirect, InjuriesIndirect, DeathsDirect, DeathsIndirect
I would like row-based output of the form:
I see the pivot() function, but it appears to only go the other direction, from rows to columns.
I've been trying various pack() ideas, but can't seem to get the required output.
Example:
StormEvents
| where State startswith "AL"
| where EventType has "Wind"
| where StartTime == "2007-01-02T02:16:00Z"
| project StartTime, State, EventType, InjuriesDirect, InjuriesIndirect, DeathsDirect, DeathsIndirect
| extend Packed = pack(
"CasualtyType", "InjuriesDirect", "CasualtyCount", InjuriesDirect,
"CasualtyType", "InjuriesIndirect", "CasualtyCount", InjuriesIndirect,
"CasualtyType", "DeathsDirect", "CasualtyCount", DeathsDirect,
"CasualtyType", "DeathsIndirect", "CasualtyCount", DeathsIndirect
)
| project-away InjuriesDirect, InjuriesIndirect, DeathsDirect, DeathsIndirect
| mv-expand Packed
This gives me too many rows, and it's not clear to me how to convert them to columns anyway.
What's a correct pattern to use for the required output?

You could try something along the following lines:
let casualty_types = dynamic(["InjuriesDirect", "DeathsDirect", "InjuriesIndirect", "DeathsIndirect"]);
StormEvents
| where State startswith "AL"
| where EventType has "Wind"
| where StartTime == "2007-01-02T02:16:00Z"
| project StartTime, State, EventType, properties = pack_all()
| mv-apply casualty_type = casualty_types to typeof(string) on (
project casualty_type, casualty_count = tolong(properties[casualty_type])
)
| project-away properties
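To make the row-per-column expansion concrete, here is a minimal plain-Python sketch of what the mv-apply step above does: each input row is repeated once per casualty column, with the column name and its value moved into two new fields. The function name unpivot and the sample row are made up for illustration; the values are not taken from StormEvents.

```python
# Illustrative only: a plain-Python analogue of the mv-apply unpivot above.
CASUALTY_TYPES = ["InjuriesDirect", "DeathsDirect", "InjuriesIndirect", "DeathsIndirect"]


def unpivot(rows, value_columns):
    """Yield one output row per (input row, value column) pair."""
    for row in rows:
        # Columns that are not unpivoted are repeated on every output row.
        keys = {k: v for k, v in row.items() if k not in value_columns}
        for column in value_columns:
            yield {**keys, "casualty_type": column, "casualty_count": int(row[column])}


# Hypothetical sample row (values invented for the example).
sample = [{
    "StartTime": "2007-01-02T02:16:00Z",
    "State": "ALABAMA",
    "EventType": "Thunderstorm Wind",
    "InjuriesDirect": 1,
    "InjuriesIndirect": 0,
    "DeathsDirect": 0,
    "DeathsIndirect": 2,
}]

result = list(unpivot(sample, CASUALTY_TYPES))
for r in result:
    print(r["State"], r["casualty_type"], r["casualty_count"])
```

One input row with four casualty columns becomes four output rows, which is the shape the question asks for.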

Defender KQL to show blocked Bluetooth Devices with all relevant fields

I'm trying to write a query to report on BlueToothPolicyTriggered events, that will return all the details to show when a device was blocked by policy AND the details of that device.
Our BT policy should basically allow everything but block file transfer over BT. That seems to be working as expected, but before rolling it out more widely, we want a quick way to see whether any other devices are being blocked incorrectly, or to refer to if a user reports an issue, so we can get all the details of the blocked device to add an exception etc.
However (and I'm new to KQL), it seems that once I filter a table using an 'ActionType', the columns available to report on are restricted, and in this case we lose the details of the BT device that was blocked.
This shows all events that have triggered the policy and whether the connection was 'accepted' or 'blocked', but not the details of the device:
search in (DeviceEvents) ActionType == "BluetoothPolicyTriggered"
| extend parsed=parse_json(AdditionalFields)
| extend Result = tostring(parsed.Accepted)
| extend BluetoothMACAddress = tostring(parsed.BluetoothMacAddress)
| extend PolicyName = tostring(parsed.PolicyName)
| extend PolicyPath = tostring(parsed.PolicyPath)
| summarize arg_max(Timestamp, *) by DeviceName, BluetoothMACAddress
| sort by Timestamp desc
| project Timestamp, DeviceName, DeviceId, Result, ActionType, BluetoothMACAddress, PolicyPath, PolicyName, ReportId
Then I have this, which shows every BT connection with the device details I'm looking for, but not whether it was blocked or accepted:
DeviceEvents
| extend parsed=parse_json(AdditionalFields)
| extend MediaClass = tostring(parsed.ClassName)
| extend MediaDeviceId = tostring(parsed.DeviceId)
| extend MediaDescription = tostring(parsed.DeviceDescription)
| extend MediaSerialNumber = tostring(parsed.SerialNumber)
| where MediaClass == "Bluetooth"
| project Timestamp, DeviceId, DeviceName, MediaClass, MediaDeviceId, MediaDescription, parsed
| order by Timestamp desc
I've been trying to somehow join these together (despite both coming from the same DeviceEvents table) without much success. I don't trust the output, as I'm seeing entries saying a device was blocked when I know it wasn't.
DeviceEvents
| where ActionType == "BluetoothPolicyTriggered"
| extend parsed=parse_json(AdditionalFields)
| extend Result = tostring(parsed.Accepted)
| extend BluetoothMACAddress = tostring(parsed.BluetoothMacAddress)
| extend PolicyName = tostring(parsed.PolicyName)
| extend PolicyPath = tostring(parsed.PolicyPath)
| project Timestamp, DeviceName, DeviceId, Result, ActionType, BluetoothMACAddress, PolicyPath, PolicyName, ReportId
| join kind =inner (DeviceEvents
| extend parsed=parse_json(AdditionalFields)
| extend MediaClass = tostring(parsed.ClassName)
| extend MediaDeviceId = tostring(parsed.DeviceId)
| extend MediaDescription = tostring(parsed.DeviceDescription)
| extend MediaSerialNumber = tostring(parsed.SerialNumber)
) on DeviceName
| where MediaClass == "Bluetooth"
| project Timestamp, DeviceName, Result, ActionType, MediaClass, MediaDeviceId, MediaDescription,BluetoothMACAddress
| sort by Timestamp desc
Am I going about this completely wrong?

KQL :: return only tags with more than 4 records

I have created a Kusto query that allows me to return all our database park. The query only takes 10 lines of code:
Resources
| join kind=inner (
resourcecontainers
| where type == 'microsoft.resources/subscriptions'
| project subscriptionId, subscriptionName = name)
on subscriptionId
| where subscriptionName in~ ('Subscription1','Subscription2')
| where type =~ 'microsoft.sql/servers/databases'
| where name != 'master'
| project subscriptionName, resourceGroup, name, type, location,sku.tier, properties.requestedServiceObjectiveName, tags.customerCode
By contract we are supposed to provide only 4 Azure SQL Databases per customer, but sometimes developers take a copy of one and rename it _old or _backup, and suddenly a customer has 5 or 6 databases.
This increases the overall cloud costs, and I would like to have a list of all customers that have more than 4 databases.
To do so I can use the tag tags.customerCode, which holds the three-letter identifier for each customer.
The code should work like this: if a customer is called ABC and there are 4 Azure SQL Databases with tags.customerCode ABC the query should return nothing. If there are 5 or 6 databases with tags.customerCode ABC the query should return all of them.
Not sure if Kusto can be that flexible.
Here is a possible solution.
It should be noted that Azure resource graph supports only a limited subset of KQL.
resourcecontainers
| where type == 'microsoft.resources/subscriptions'
//and name in~ ('Subscription1','Subscription2')
| project subscriptionId, subscriptionName = name
| join kind=inner
(
resources
| where type =~ 'microsoft.sql/servers/databases'
and name != 'master'
)
on subscriptionId
| project subscriptionId, subscriptionName, resourceGroup, name, type, location
,tier = sku.tier
,requestedServiceObjectiveName = properties.requestedServiceObjectiveName
,customerCode = tostring(tags.customerCode)
| summarize dbs = count(), details = make_list(pack_all()) by customerCode
| where dbs > 4
| mv-expand with_itemindex=db_seq ['details']
| project customerCode
,dbs
,db_seq = db_seq + 1
,subscriptionId = details.subscriptionId
,subscriptionName = details.subscriptionName
,resourceGroup = details.resourceGroup
,name = details.name
,type = details.type
,location = details.location
,tier = details.tier
,requestedServiceObjectiveName = details.requestedServiceObjectiveName
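The summarize / where / mv-expand sequence above amounts to "group by customer, keep only groups larger than the limit, then list every member of those groups". A minimal Python sketch of that logic follows, assuming each database is a dict with a customerCode key; the function name customers_over_limit and the sample data are hypothetical.

```python
from collections import defaultdict


def customers_over_limit(databases, limit=4):
    """Return every database belonging to a customer with more than `limit` databases."""
    # Equivalent of: summarize dbs = count(), details = make_list(...) by customerCode
    groups = defaultdict(list)
    for db in databases:
        groups[db["customerCode"]].append(db)

    rows = []
    for code, dbs in groups.items():
        # Equivalent of: where dbs > 4
        if len(dbs) > limit:
            # Equivalent of: mv-expand with_itemindex=db_seq, then db_seq + 1
            for seq, db in enumerate(dbs, start=1):
                rows.append({**db, "dbs": len(dbs), "db_seq": seq})
    return rows


# Hypothetical sample: ABC has exactly 4 databases, XYZ has 5.
databases = (
    [{"customerCode": "ABC", "name": f"abc-db{i}"} for i in range(1, 5)]
    + [{"customerCode": "XYZ", "name": f"xyz-db{i}"} for i in range(1, 6)]
)

rows = customers_over_limit(databases)
for row in rows:
    print(row["customerCode"], row["dbs"], row["db_seq"], row["name"])
```

With this sample, nothing is returned for ABC and all five XYZ databases are listed, matching the behaviour described in the question.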

Self-join Kusto Query in Analytics Rule

I am working within Microsoft Sentinel Analytics Rules with the Kusto Query Language (KQL).
I need to work with a table called CrowdstrikeReplicatorLogs_CL, which contains both a) data rows that I need to alert on and b) metadata rows that hold information about the subject of the alert.
This means I need to join the table with itself to get the final result.
The column to join on is aid_g.
ThreatIntelligenceIndicator
| where foo == bar
| join kind=innerunique (
CrowdstrikeReplicatorLogs_CL
| where TimeGenerated >= ago(dt_lookBack)
| where event_simpleName_s has_any ("NetworkConnectIP4", "NetworkConnectIP6")
| extend json=parse_json(custom_fields_message_s)
| extend ip4 = json["RemoteAddressIP4"], ip6=json["RemoteAddressIP6"]
| extend CS_ipEntity = tostring(iff(isnotempty(ip4), ip4, ip6))
| extend CommonSecurityLog_TimeGenerated = TimeGenerated
) on $left.TI_ipEntity == $right.CS_ipEntity
| join kind=innerunique (
CrowdstrikeReplicatorLogs_CL
| where custom_fields_message_s has "ComputerName"
| extend customFields=parse_json(custom_fields_message_s)
| project Hostname=customFields['ComputerName'], Platform=event_platform_s, aid_g
) on $left.aid_g == $right.aid_g
;
However, this raises a "Query contains incompatible 'set' commands." error in Sentinel.
Is there a proper way to self-join tables?

Splunk query filter out based on other event in same index

I have an index named Events.
It contains a bunch of different events; all events have a property called EventName.
Now I want to do a query where I return everything that matches the following:
IF AccountId exists in event with EventName AccountCreated AND there is at least 1 event with EventName FavoriteCreated with the same AccountId -> return all events where EventName == AccountCreated
Example events:
AccountCreated
{
"AccountId": 1234,
"EventName": "AccountCreated",
"SomeOtherProperty": "Some value",
"Brand": "My Brand",
"DeviceType": "Mobile",
"EventTime": "2020-06-01T12:13:14Z"
}
FavoriteCreated
{
"AccountId": 1234,
"EventName": "FavoriteCreated",
"Brand": "My Brand",
"DeviceType": "Mobile",
"EventTime": "2020-06-01T12:13:14Z"
}
Given these two events, I would like to create one query that returns the AccountCreated event.
I've tried the following but it does not work, surely I must be missing something simple?
index=events EventName=AccountCreated
[search index=events EventName=FavoriteCreated | dedup AccountId | fields AccountId]
| table AccountId, SomeOtherProperty
I'm expecting ~6000 hits here but I'm only getting 2298 events. What am I missing?
UPDATE
Based on the answer given by #warren below, the following query works. The only problem is that it's using a JOIN which limits us to 50K results from the subsearch. When running this query I get 5900 results in total = Correct.
index=events EventName=AccountCreated AccountId=*
| stats count by AccountId, EventName
| fields - count
| join AccountId
[ | search index=events EventName=FavoriteCreated AccountId=*
| stats count by AccountId ]
| fields - count
| table AccountId, EventName
I then tried to use his updated example like this but the problem seems to be that it returns FavoriteCreated events instead of AccountCreated.
When running this query I get 25 494 hits = Incorrect.
index=events AccountId=* (EventName=AccountCreated OR EventName=FavoriteCreated)
| stats values(EventName) as EventName by AccountId
| eval EventName=mvindex(EventName,-1)
| search EventName="FavoriteCreated"
| table AccountId, EventName
Update 2 - WORKING
#warren is awesome, here is a full working query that only returns data from the AccountCreated events IF 1 or more FavoriteCreated event exists.
index=events AccountId=* (EventName=AccountCreated OR EventName=FavoriteCreated)
| stats
values(Brand) as Brand,
values(DeviceType) as DeviceType,
values(Email) as Email,
values(EventName) as EventName,
values(EventTime) as EventTime,
values(Locale) as Locale,
values(ClientIp) as ClientIp
by AccountId
| where mvcount(EventName)>1
| eval EventName=mvindex(EventName,0)
| eval EventTime=mvindex(EventTime,0)
| eval ClientIp=mvindex(ClientIp,0)
| eval DeviceType=mvindex(DeviceType,0)
You have perhaps found one cause of your issue: subsearches are capped at 50,000 events when used with join (or at 60 seconds of run time), and at 10,000 results when you use a "normal" subsearch.
Start by dumping dedup in favor of stats:
index=events EventName=AccountCreated AccountId=*
| stats count by AccountId, SomeOtherProperty [, more, fields, as, desired]
| fields - count
| search
[ | search index=events EventName=FavoriteCreated AccountId=*
| stats count by AccountId
| fields - count]
<rest of search>
If that doesn't get you where you want to be (ie, you still have too many results in your subsearch), you can try join:
index=events EventName=AccountCreated AccountId=*
| stats count by AccountId, SomeOtherProperty [, more, fields, as, desired]
| fields - count
| join AccountId
[ | search index=events EventName=FavoriteCreated AccountId=*
| stats count by AccountId ]
| fields - count
<rest of search>
There are yet more ways of doing what you're looking for, but these two should get you a long way toward your goal.
Here's a join-less approach which'll show only "FavoriteCreated" events:
index=events AccountId=* (EventName=AccountCreated OR EventName=FavoriteCreated)
| stats values(EventName) as EventName values(SomeOtherProperty) as SomeOtherProperty by AccountId
| eval EventName=mvindex(EventName,-1)
| search EventName="FavoriteCreated"
And here's one that shows "FavoriteCreated" only if there was also an "AccountCreated" event in the same timeframe:
index=events AccountId=* (EventName=AccountCreated OR EventName=FavoriteCreated)
| stats values(EventName) as EventName values(SomeOtherProperty) as SomeOtherProperty by AccountId
| where mvcount(EventName)>1
And if you want to 'pretend' the values() didn't happen (i.e., throw out the "FavoriteCreated" entry), add this:
| eval EventName=mvindex(EventName,0)
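The join-less pattern above can be sketched in plain Python to show why it works. In Splunk, values() returns the distinct values in lexicographical order, so with these two event names index 0 is "AccountCreated" and mvcount(EventName)>1 means the account has both kinds of event. The function name accounts_with_favorites and the sample events are made up for illustration; unlike the stats version, this sketch returns the original AccountCreated events rather than one aggregated row per AccountId.

```python
from collections import defaultdict


def accounts_with_favorites(events):
    """Return AccountCreated events whose AccountId also has a FavoriteCreated event."""
    # Equivalent of: stats values(EventName) as EventName by AccountId
    names_by_account = defaultdict(set)
    for event in events:
        if event["EventName"] in ("AccountCreated", "FavoriteCreated"):
            names_by_account[event["AccountId"]].add(event["EventName"])

    # Equivalent of: where mvcount(EventName)>1
    keep = {acc for acc, names in names_by_account.items() if len(names) > 1}

    return [
        e for e in events
        if e["EventName"] == "AccountCreated" and e["AccountId"] in keep
    ]


# Hypothetical sample events.
events = [
    {"AccountId": 1234, "EventName": "AccountCreated", "Brand": "My Brand"},
    {"AccountId": 1234, "EventName": "FavoriteCreated", "Brand": "My Brand"},
    {"AccountId": 5678, "EventName": "AccountCreated", "Brand": "My Brand"},   # no favorite
    {"AccountId": 9999, "EventName": "FavoriteCreated", "Brand": "My Brand"},  # no account event
]

matched = accounts_with_favorites(events)
print([e["AccountId"] for e in matched])

# Sorted distinct values put "AccountCreated" first, mirroring mvindex(EventName,0).
first_value = sorted({"FavoriteCreated", "AccountCreated"})[0]
print(first_value)
```

Because everything is computed in a single pass over one event set, no subsearch is involved and the subsearch result limits do not apply.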
Turns out that Splunk truncates the subsearch result if it's bigger than 10 000 results...

SQL - Multiple select filter: Combine filter conditions to get proper results

I'm working on a filter where the user can choose different conditions for the end output. Right now I'm doing the construction of the SQL query, but whenever more conditions are selected, it doesn't work.
Example of the advalues table.
+----+-----------+---------------+------------+
| id | listingId | value | identifier |
+----+-----------+---------------+------------+
| 1 | 1a | Alaskan Husky | race |
+----+-----------+---------------+------------+
| 2 | 1a | Højt | activity |
+----+-----------+---------------+------------+
| 3 | 1c | Akita | race |
+----+-----------+---------------+------------+
| 4 | 1c | Mellem | activity |
+----+-----------+---------------+------------+
As you can see, there's a different row for each advalue.
The outcome I expect
Let's say the user has checked/ticked the checkbox for the race where it says "Alaskan Husky", then it should return the listingId for the match (once). If the user has selected both "Alaskan Husky" and activity level to "Low" then it should return nothing, if the activity level is either "Mellem" or "Højt" (medium, high), then it should return the listingId for where the race is "Alaskan Husky" only, not "Akita". I hope you understand what I'm trying to accomplish.
I tried something like this, which returns nothing.
SELECT * FROM advalues WHERE (identifier="activity" AND value IN("Mellem","Højt")) AND (identifier="race" AND value IN("Alaskan Husky"))
By the way, I want to select distinct listingId as well, so it only returns unique listingId's.
I will continue to search around for solutions, which I've been doing for the past few hours, but wanted to post here too, since I haven't been able to find anything that helped me yet. Thanks!
You can split the restrictions on identifier across two aliases of the table, one for each type. Then you join on listingId to obtain the listingIds that have both types of identifier.
SELECT ad.listingId
FROM advalues ad
JOIN advalues ad2
ON ad.listingId = ad2.listingId
WHERE ( ad.identifier = 'activity' AND ad.value IN( 'Mellem', 'Højt' ) )
AND ( ad2.identifier = 'race' AND ad2.value IN( 'Alaskan Husky' ) )
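As a quick check of the self-join approach, the sketch below loads the four example rows into an in-memory SQLite database (an assumption made for illustration; the question does not say which database engine is used) and compares the original single-row AND filter with the self-join. DISTINCT is added here because the question asks for unique listingIds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE advalues (id INTEGER PRIMARY KEY, listingId TEXT, value TEXT, identifier TEXT)"
)
conn.executemany(
    "INSERT INTO advalues VALUES (?, ?, ?, ?)",
    [
        (1, "1a", "Alaskan Husky", "race"),
        (2, "1a", "Højt", "activity"),
        (3, "1c", "Akita", "race"),
        (4, "1c", "Mellem", "activity"),
    ],
)

# Original attempt: both conditions are tested against the same row,
# and no single row can be 'activity' and 'race' at once.
single_row_and = conn.execute(
    """
    SELECT listingId FROM advalues
    WHERE (identifier = 'activity' AND value IN ('Mellem', 'Højt'))
      AND (identifier = 'race' AND value IN ('Alaskan Husky'))
    """
).fetchall()

# Self-join: one alias per filter, matched on listingId.
self_join = conn.execute(
    """
    SELECT DISTINCT ad.listingId
    FROM advalues ad
    JOIN advalues ad2 ON ad.listingId = ad2.listingId
    WHERE ad.identifier = 'activity' AND ad.value IN ('Mellem', 'Højt')
      AND ad2.identifier = 'race' AND ad2.value IN ('Alaskan Husky')
    """
).fetchall()

print(single_row_and)  # []
print(self_join)       # [('1a',)]
```

Each additional filter type would need one more alias of advalues joined on listingId.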
The question isn't exactly clear, but I think you want this:
WHERE (identifier="activity" AND value IN("Mellem","Højt")) OR (identifier="race" AND value IN("Alaskan Husky"))
If I got you right you are trying to fetch data with different "filters".
Your Query
SELECT listingId FROM advalues
WHERE identifier="activity"
AND value IN("Mellem","Højt")
AND identifier="race"
AND value IN("Alaskan Husky")
will always return 0 results, because it asks for a single row where identifier = "activity" AND identifier = "race".
I think you wanted to do something like this instead:
SELECT listingId FROM advalues
WHERE
(identifier="activity" AND value IN("Mellem","Højt"))
OR
(identifier="race" AND value IN("Alaskan Husky"))