How can I better optimize this Kusto query to get my logs? - kql

I am running the query below to get logs for Azure K8s, but it takes an hour to generate the results, and I am hoping there is a better way to write what I have already written. Can some Kusto experts advise on how I can improve the performance?
AzureDiagnostics
| where Category == 'kube-audit'
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27")))
| where (strlen(log_s) >= 32000
and not(log_s has "aksService")
and not(log_s has "system:serviceaccount:crossplane-system:crossplane"))
or strlen(log_s) < 32000
| extend op = parse_json(log_s)
| where not(tostring(op.verb) in ("list", "get", "watch"))
| where substring(tostring(op.responseStatus.code), 0, 1) == "2"
| where not(tostring(op.requestURI) in ("/apis/authorization.k8s.io/v1/selfsubjectaccessreviews"))
| extend user = op.user.username
| extend decision = tostring(parse_json(tostring(op.annotations)).["authorization.k8s.io/decision"])
| extend requestURI = tostring(op.requestURI)
| extend name = tostring(parse_json(tostring(op.objectRef)).name)
| extend namespace = tostring(parse_json(tostring(op.objectRef)).namespace)
| extend verb = tostring(op.verb)
| project TimeGenerated, SubscriptionId, ResourceId, namespace, name, requestURI, verb, decision, ['user']
| order by TimeGenerated asc

You could try starting your query as follows.
Please note the additional condition at the end.
AzureDiagnostics
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27")))
| where Category == 'kube-audit'
| where log_s hasprefix '"code":2'
I assumed that code is an integer. If it is a string, use the following instead (note the added quotation mark):
| where log_s hasprefix '"code":"2'
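The idea behind this suggestion, a cheap text filter on the raw column before any JSON parsing, can be illustrated outside Kusto. The following is a minimal Python sketch, not KQL: the sample records and the function name `successful_writes` are hypothetical, the logs are assumed to be serialized without spaces after colons, and a plain substring test is only a rough stand-in for `hasprefix`.

```python
import json

# Hypothetical raw audit log lines (stand-ins for log_s values).
RAW_LOGS = [
    '{"verb":"create","responseStatus":{"code":201},"requestURI":"/api/v1/pods"}',
    '{"verb":"get","responseStatus":{"code":200},"requestURI":"/api/v1/pods"}',
    '{"verb":"delete","responseStatus":{"code":404},"requestURI":"/api/v1/pods/x"}',
]

READ_VERBS = {"list", "get", "watch"}


def successful_writes(raw_logs):
    """Return parsed records with a 2xx status code and a non-read verb."""
    results = []
    for raw in raw_logs:
        # Cheap text check first: skip rows that cannot match before parsing.
        if '"code":2' not in raw:
            continue
        record = json.loads(raw)  # the expensive step now runs on fewer rows
        if record["verb"] not in READ_VERBS:
            results.append(record)
    return results


matches = successful_writes(RAW_LOGS)
print([m["verb"] for m in matches])
```

The structured checks still run after parsing, so the text filter only has to be a safe over-approximation of the final condition.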

Related

Combine KQL Queries into a single query

I have two separate KQL queries I'm using on Azure Log Analytics to find blocks on an Azure WAF. The first one finds the transaction IDs of requests that were blocked.
//Query 1
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayFirewall"
| where hostname_s in ('www.website.com')
| where Message !startswith "Mandatory"
| where action_s in ('Blocked', 'Detected')
| project transactionId_g
To find all the parts of the request that combine to generate the block, I then copy/paste the transaction IDs into a second query that returns all log entries that are part of that transaction. This gives me the data I need to create exceptions on the WAF if it's a false positive.
//Query 2
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayFirewall"
| where Message !startswith "Mandatory"
| where transactionId_g == "8eb316fd-4a5b-66c7-0136-fc67e21d282b"
| project transactionId_g, TimeGenerated, clientIp_s, hostname_s, requestUri_s, ruleId_s, ruleSetVersion_s, ruleGroup_s, action_s, Message, details_message_s, details_data_s
| sort by transactionId_g
Is there a way to combine these two queries together into a single query that finds the transactionid's of blocked requests, then shows ALL log lines containing those transactionid's?
If there's a single transaction ID returned by the first query, you could use the toscalar() function:
let tid = toscalar(
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName ==
"ApplicationGatewayFirewall"
| where hostname_s in ('www.website.com')
| where action_s in ('Blocked', 'Detected')
| where Message !startswith "Mandatory"
| project transactionId_g
| take 1
);
AzureDiagnostics
| where transactionId_g == tid // <-- this is where you use the result of the 1st subquery
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayFirewall"
| where Message !startswith "Mandatory"
| project transactionId_g, TimeGenerated, clientIp_s, hostname_s, requestUri_s, ruleId_s, ruleSetVersion_s, ruleGroup_s, action_s, Message, details_message_s, details_data_s
| sort by transactionId_g
Or, if the first query returns a set of transaction IDs and not a single one, you could use the in operator:
let tids =
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName ==
"ApplicationGatewayFirewall"
| where hostname_s in ('www.website.com')
| where action_s in ('Blocked', 'Detected')
| where Message !startswith "Mandatory"
| project transactionId_g
;
AzureDiagnostics
| where transactionId_g in (tids) // <-- this is where you use the result of the 1st subquery
| where ResourceType == "APPLICATIONGATEWAYS" and OperationName == "ApplicationGatewayFirewall"
| where Message !startswith "Mandatory"
| project transactionId_g, TimeGenerated, clientIp_s, hostname_s, requestUri_s, ruleId_s, ruleSetVersion_s, ruleGroup_s, action_s, Message, details_message_s, details_data_s
| sort by transactionId_g
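For readers more used to SQL, the same "find the IDs, then fetch every row that shares them" shape can be sketched with a subquery inside IN. This is only an analogy to the Kusto query above, written in Python with an in-memory SQLite database; the table `waf_log`, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE waf_log (transaction_id TEXT, action TEXT, message TEXT)")
conn.executemany(
    "INSERT INTO waf_log VALUES (?, ?, ?)",
    [
        ("t1", "Matched", "SQL injection pattern"),
        ("t1", "Blocked", "Inbound anomaly score exceeded"),
        ("t2", "Matched", "Missing user agent"),
        ("t3", "Detected", "XSS pattern"),
    ],
)

# Inner query: IDs of blocked/detected transactions (the role of `tids`).
# Outer query: every log row that shares one of those IDs.
rows = conn.execute(
    """
    SELECT transaction_id, action, message
    FROM waf_log
    WHERE transaction_id IN (
        SELECT transaction_id FROM waf_log
        WHERE action IN ('Blocked', 'Detected')
    )
    ORDER BY transaction_id, action
    """
).fetchall()

for row in rows:
    print(row)
```

Transaction t2 is left out because none of its rows were blocked or detected, while t1 keeps its "Matched" row because another row of the same transaction was blocked.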

Condition with SQL

I have a question. Is there a condition in SQL that lets you do something like this:
IF(sup = void) {
}
Database
id | name | lastname | city | mail | number | picture | ...
1 | kiwi | kiwi | USA | kiwi#gmail.com | 0000000000 | default.img | (empty)
SELECT * FROM your_table WHERE sup IS NULL
https://www.w3schools.com/sql/sql_null_values.asp
Update after reading your comment.
$result = $db->query("SELECT * FROM ressource_view WHERE ID = 1");
This will give you the result of your query. Be careful, as there could be multiple rows returned.
To fetch the first row:
$row = $result->fetch_array();
And then to check if the sup column of your row is null you can use:
if(is_null($row['sup']))
{
}
Or this will have the same effect
if($row['sup'] === NULL)
{
}
But best to tag your question with PHP, MySQL. Your problem seems to be more on the PHP side and someone else could provide a better answer.
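The two approaches above (filtering with IS NULL in the query, or checking the fetched value in application code) can be compared side by side. The following is a minimal sketch in Python with an in-memory SQLite database rather than PHP and MySQL; the table name is taken from the answer, while the reduced column list and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute("CREATE TABLE ressource_view (id INTEGER PRIMARY KEY, name TEXT, sup TEXT)")
conn.executemany(
    "INSERT INTO ressource_view (id, name, sup) VALUES (?, ?, ?)",
    [(1, "kiwi", None), (2, "mango", "x")],
)

# Option 1: let the database filter with IS NULL.
# Note that "sup = NULL" would never match, because NULL is not equal to anything.
null_ids = [r["id"] for r in conn.execute("SELECT id FROM ressource_view WHERE sup IS NULL")]

# Option 2: fetch the row, then test the column in application code.
row = conn.execute("SELECT * FROM ressource_view WHERE id = ?", (1,)).fetchone()
sup_is_empty = row["sup"] is None

print(null_ids, sup_is_empty)
```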

Is there a conditional where statement in SQLite?

I have a Todo application with a database for the todos. It contains a column done which will be set to false or true. Now I want to filter the data by all or unfinished. So either done is irrelevant or it has to be false.
I am using SQLite3 in a TypeScript application. Currently I do this by using string templates but I'd prefer an SQL-based solution.
db.prepare(`
select
rowid as id,
title,
description
from todo
${selectAll ? '' : 'where done = 0'}
limit ?
offset ?
`).all(limit, offset);
My idea was to use the CASE clause but it seems not to work around the WHERE clause.
Is there any better solution?
selectAll is a TypeScript variable that is set depending on the query parameters of the app.
Output when selectAll is true
id | title | description | done
1 | Clean the kitchen | as the title says... | 1
2 | Do the shopping | potatoes, tomatoes | 0
3 | Program stuff | Todo app | 1
Output when selectAll is false
id | title | description | done
2 | Do the shopping | potatoes, tomatoes | 0
You can use boolean logic:
where ? = '' or done = 0
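A variant of this idea, sketched in Python with the standard sqlite3 module instead of TypeScript: the flag is bound as an extra parameter, so the SQL text never changes. The comparison `? = 1` and the helper name `list_todos` are assumptions made for this sketch; the table and sample rows follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE todo (title TEXT, description TEXT, done INTEGER)")
conn.executemany(
    "INSERT INTO todo VALUES (?, ?, ?)",
    [
        ("Clean the kitchen", "as the title says...", 1),
        ("Do the shopping", "potatoes, tomatoes", 0),
        ("Program stuff", "Todo app", 1),
    ],
)

QUERY = """
    SELECT rowid AS id, title, description
    FROM todo
    WHERE ? = 1 OR done = 0
    LIMIT ? OFFSET ?
"""


def list_todos(select_all, limit=10, offset=0):
    # The first parameter is 1 to switch the filter off, 0 to keep it.
    return conn.execute(QUERY, (int(select_all), limit, offset)).fetchall()


print(list_todos(True))
print(list_todos(False))
```

When the flag is 1 the left side of the OR is true for every row, so the `done = 0` test no longer restricts the result.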

Composite key where one of the ID components doesn't always update

I'm using Entity Framework 6.
Given the following models:
MyType (Id+Language are keys)
- Id
- Language // will be represented by values like "en", "no", "dk", "de"
- Label
MyValue (Id+SomeOtherEntityId+Language are keys)
- Id
- SomeOtherEntityId
- Language
- Value
I'd like the Id column in MyTypes to auto increment whenever a new language is given for that Id, if that makes sense.
Example:
context.MyTypes.Add(new MyType { Label = "Label", Language = "no" });
context.MyTypes.Add(new MyType { Label = "Label", Language = "da" });
context.MyTypes.Add(new MyType { Label = "Label", Language = "nl" });
This gives me a database that looks like the following:
| Id | Language | Label |
|----|----------|-------|
| 1 | no | Label |
| 2 | da | Label |
| 3 | nl | Label |
While what I really want is this:
| Id | Language | Label |
|----|----------|--------|
| 1 | no | Label |
| 1 | da | Label |
| 1 | nl | Label |
| 2 | no | Label2 |
| 2 | da | Label2 |
| 2 | nl | Label2 |
I have currently set my ID column as an Identity, thus making it auto incrementing.
Is it possible, from a database perspective, to configure this to be auto incrementing, depending on the Language column?
Is this configurable via EntityFramework?
I guess an alternative would be to have a separate Type-table with just an ID, and then link the Labels for different languages to this. However, I was hoping to avoid that simply because this structure is so simple, and having another table just for the ID's sake seems to me to just add some unnecessary complexity.
Another alternative could be to control this from code (when inserting new MyTypes, get the correct ID first and then add the rest).
Are there any best practices concerning this type of problem?
What you are trying to do can't be done, to my knowledge (an identity column produces one distinct value per row).
Two tables seem right to me:
tableHead(Id, Code, Blabla)
tableTrans(Id, ExtId, Language, Value), with ExtId as a FK
or
tableTrans(ExtId, Language, Value), with a composite key comprising the FK
Then you get your translation by a Code value and a Language value, and you get readable code.
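The second layout (a head table plus a translation table keyed on the foreign key and the language) could look roughly like this. It is a minimal sketch in Python with an in-memory SQLite database rather than Entity Framework; the table names `type_head` and `type_trans`, the column names, and the helper `add_type` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE type_head (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        code TEXT NOT NULL UNIQUE
    );
    CREATE TABLE type_trans (
        ext_id   INTEGER NOT NULL REFERENCES type_head(id),
        language TEXT NOT NULL,
        label    TEXT NOT NULL,
        PRIMARY KEY (ext_id, language)
    );
    """
)


def add_type(code, labels):
    """Insert one head row, then one translation row per language."""
    head_id = conn.execute("INSERT INTO type_head (code) VALUES (?)", (code,)).lastrowid
    conn.executemany(
        "INSERT INTO type_trans (ext_id, language, label) VALUES (?, ?, ?)",
        [(head_id, lang, label) for lang, label in labels.items()],
    )
    return head_id


first = add_type("LABEL", {"no": "Label", "da": "Label", "nl": "Label"})
second = add_type("LABEL2", {"no": "Label2", "da": "Label2", "nl": "Label2"})

# Look up a translation by code and language.
label = conn.execute(
    "SELECT t.label FROM type_trans t JOIN type_head h ON h.id = t.ext_id "
    "WHERE h.code = ? AND t.language = ?",
    ("LABEL2", "da"),
).fetchone()[0]
print(first, second, label)
```

The identity value is generated once in the head table and shared by all language rows, which gives the "same Id, different Language" layout from the question.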

Assistance with SQL multi-table query - returning duplicate results

We use an online project management system, and I'm trying to extend it somewhat.
It has the following tables of interest:
todo_itemStatus:
+--------------+-----------------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------------------+----------------+
| itemStatusId | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| itemId | int(10) unsigned | NO | MUL | 0 | |
| statusDate | datetime | NO | | 0000-00-00 00:00:00 | |
| statusKey | tinyint(3) unsigned | NO | | 0 | |
| memberId | mediumint(8) unsigned | NO | | 0 | |
+--------------+-----------------------+------+-----+---------------------+----------------+
This table keeps track of when a task is complete, and also keeps the status of all task changes.
There's then a project table, and an 'item' (or task) table.
I basically want to be able to extract a list of projects, with details on the percentage of tasks complete. However, for now I'd be happy if I could just list each task in a project with details on whether they're complete.
As far as I'm aware, the best way to get the most recent status of a task is to choose the todo_itemStatus row where the statusDate is the newest, or the itemStatusId is the largest, while itemId equals the task I'm interested in.
I tried a query like this:
select todo_item.itemId, todo_item.title, todo_itemStatus.statusKey, todo_itemStatus.statusDate
from todo_item, todo_project, todo_itemStatus
where todo_item.projectId = todo_project.projectId
and todo_project.projectId = 13
and todo_itemStatus.itemId = todo_item.itemId
and todo_itemStatus.statusDate = (
select MAX(todo_itemStatus.statusDate)
from todo_itemStatus key1 where todo_itemStatus.itemId = key1.itemId);
However, this yields all status updates with output like this:
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
| itemId | title | statusKey | statusDate |
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
| 579 | test complete item - delete me | 1 | 2009-07-28 13:04:38 |
| 579 | test complete item - delete me | 0 | 2009-07-28 14:12:12 |
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
Which isn't what I want, as I only want one task entry returning with the statusKey / statusDate from the most recent entry in the todo_itemStatus table.
I know I've been a bit vague in my description, but I didn't want to write a massively long message. I can provide much more detail if necessary.
Please can someone suggest what I'm doing wrong? It's been a long time since I've done any real database stuff, so I'm a bit unsure what I'm doing wrong here...
Many thanks!
Dave
You should look into using the DISTINCT keyword (Microsoft SQL Server)
EDIT: I've just re-read your question and I think that the GROUP BY clause is more suited in this situation. You should read http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/, however essentially what you need to do is first select the columns that you are interested in using a GROUP BY clause:
SELECT todo_itemStatus.itemId, MAX(todo_itemStatus.statusDate) AS maxStatusDate
FROM todo_item, todo_project, todo_itemStatus
WHERE todo_item.projectId = todo_project.projectId
AND todo_itemStatus.itemId = todo_item.itemId
AND todo_project.projectId = 13
GROUP BY todo_itemStatus.itemId
Grouping by itemId gives one row per task with its latest statusDate. We then join back to this set to get the rest of the columns we are interested in:
SELECT
todo_item.itemId,
todo_item.title,
todo_itemStatus.statusKey,
todo_itemStatus.statusDate
FROM todo_item
JOIN todo_itemStatus
ON todo_itemStatus.itemId = todo_item.itemId
JOIN
(SELECT todo_itemStatus.itemId, MAX(todo_itemStatus.statusDate) AS maxStatusDate
FROM todo_item, todo_project, todo_itemStatus
WHERE todo_item.projectId = todo_project.projectId
AND todo_itemStatus.itemId = todo_item.itemId
AND todo_project.projectId = 13
GROUP BY todo_itemStatus.itemId) AS x
ON todo_itemStatus.itemId = x.itemId
AND todo_itemStatus.statusDate = x.maxStatusDate
I've experimented some more and the following query does what I want:
select todo_item.itemId, todo_item.title, todo_itemStatus.statusKey, todo_itemStatus.statusDate
from todo_itemStatus, todo_item
where todo_item.itemId = todo_itemStatus.itemId
and todo_item.projectId = 13
and todo_itemStatus.statusDate = (
select MAX(status.statusDate)
from todo_itemStatus as status
where status.itemId = todo_item.itemId);
So I'm now happy. Thanks for all the help and the suggestions.
Dave.
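The correlated-subquery approach that ended up working can be tried on a small data set. The following is a minimal sketch in Python with an in-memory SQLite database instead of MySQL; the table and column names follow the question, the schemas are reduced to the columns used, and item 580 is a hypothetical extra row added to show that each task yields exactly one result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE todo_item (itemId INTEGER PRIMARY KEY, projectId INTEGER, title TEXT);
    CREATE TABLE todo_itemStatus (
        itemStatusId INTEGER PRIMARY KEY AUTOINCREMENT,
        itemId INTEGER, statusDate TEXT, statusKey INTEGER
    );
    INSERT INTO todo_item VALUES (579, 13, 'test complete item - delete me');
    INSERT INTO todo_item VALUES (580, 13, 'second item');
    INSERT INTO todo_itemStatus (itemId, statusDate, statusKey) VALUES
        (579, '2009-07-28 13:04:38', 1),
        (579, '2009-07-28 14:12:12', 0),
        (580, '2009-07-29 09:00:00', 1);
    """
)

# For each item, keep only the status row whose date equals that item's latest date.
rows = conn.execute(
    """
    SELECT i.itemId, i.title, s.statusKey, s.statusDate
    FROM todo_item AS i
    JOIN todo_itemStatus AS s ON s.itemId = i.itemId
    WHERE i.projectId = ?
      AND s.statusDate = (
          SELECT MAX(latest.statusDate)
          FROM todo_itemStatus AS latest
          WHERE latest.itemId = i.itemId
      )
    ORDER BY i.itemId
    """,
    (13,),
).fetchall()

for row in rows:
    print(row)
```

If two status rows for the same item could share an identical statusDate, both would be returned; comparing on MAX(itemStatusId) instead would avoid that, as the question itself hints.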