Azure Kusto language query through all tables - azure-log-analytics

I am trying to build a Kusto query to verify whether logs are arriving in Azure Log Analytics tables. This is my code. The command works and returns the number of records each table received. The issue is that the output leaves out the names of tables that received zero logs.
union withsource=sourceTable kind=outer Table1, Table2, Table3
| summarize AggregatedValue=count() by bin(TimeGenerated, 5m), sourceTable
Expected output:
| Table Name | Count |
----------------------
| Table1 | 5 |
| Table2 | 3 |
| Table3 | 0 | //If the count is zero, query output does not show the table name
----------------------

You didn't specify the values in the column for bin(TimeGenerated, 5m) in your expected output. I assume you didn't really want this column there (otherwise, I'm not sure what exactly you wish to see in the expected output for Table3, which has 0 records).
To get the output you want, use the following trick:
let DefaultResult = datatable(['Table Name']: string, Count: long) [
"Table1", 0,
"Table2", 0,
"Table3", 0
];
union withsource=sourceTable kind=outer Table1, Table2, Table3
| summarize Count = count() by ['Table Name'] = sourceTable
| union DefaultResult
| summarize Count = sum(Count) by ['Table Name']
| order by ['Table Name'] asc

This might help:
let reCount = union withsource=sourceTable kind=outer AppServiceFileAuditLogs,AzureDiagnostics, BaiClusterEvent
| summarize AggregatedValue=count() by sourceTable;
let tableList = datatable (name:string)
[
'AppServiceFileAuditLogs',
'AzureDiagnostics',
'BaiClusterEvent'
];
tableList
| join kind=leftouter reCount on $left.name == $right.sourceTable
| project name, Count = iff(isnull(AggregatedValue), 0, AggregatedValue)
The idea here is to left-join the table of expected names (tableList) to the counts, then substitute 0 wherever AggregatedValue is null.
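The left-join-and-default idea is not specific to Kusto. Below is a minimal sketch of the same pattern in Python using the standard-library sqlite3 module. The table_list and log_rows tables and their contents are hypothetical stand-ins for the Log Analytics tables, chosen to match the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_list (name TEXT);        -- hypothetical: names we expect to see
CREATE TABLE log_rows (source_table TEXT);  -- hypothetical: one row per log record
INSERT INTO table_list VALUES ('Table1'), ('Table2'), ('Table3');
INSERT INTO log_rows VALUES
    ('Table1'), ('Table1'), ('Table1'), ('Table1'), ('Table1'),
    ('Table2'), ('Table2'), ('Table2');
""")

# Left-join the expected names to the per-table counts; a name with no
# matching count row gets NULL, which COALESCE turns into 0.
zero_filled = conn.execute("""
    SELECT l.name, COALESCE(c.n, 0) AS cnt
    FROM table_list AS l
    LEFT JOIN (SELECT source_table, COUNT(*) AS n
               FROM log_rows
               GROUP BY source_table) AS c
           ON c.source_table = l.name
    ORDER BY l.name
""").fetchall()

print(zero_filled)  # [('Table1', 5), ('Table2', 3), ('Table3', 0)]
```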

Related

PostgreSQL query to select records which a specific value doesn't include in text array

I have a table like this
| id | data |
|---------------|---------------------|
| org:abc:basic | {org,org:abc:basic} |
| org:xyz:basic | {org,basic} |
| org:efg:basic | {org} |
I need to write a query that selects all the rows that don't have the id inside the data column.
Or at least I need to query all the records whose data contains no text starting with org: and ending with :basic.
Currently I run
SELECT * FROM t_permission WHERE 'org:%:basic' NOT LIKE ANY (data)
but this query returns every row, even the first one.
You can use the <> operator with ALL against the array:
select *
from the_table
where id <> all(data);

PostgreSQL CTE UPDATE-FROM query skips rows

2 tables
table_1 rows: NOTE: id 2 has two rows
-----------------------
| id | counts | track |
-----------------------
| 1 | 10 | 1 |
| 2 | 10 | 2 |
| 2 | 10 | 3 |
-----------------------
table_2 rows
---------------
| id | counts |
---------------
| 1 | 0 |
| 2 | 0 |
---------------
Query:
with t1_rows as (
select id, sum(counts) as counts, track
from table_1
group by id, track
)
update table_2 set counts = (coalesce(table_2.counts, 0) + t1.counts)::float
from t1_rows t1
where table_2.id = t1.id;
select * from table_2;
When I ran the above query, I got this table_2 output:
---------------
| id | counts |
---------------
| 1 | 10 |
| 2 | 10 | (expected counts as 20 but got 10)
---------------
I noticed that the above update query only considers the first match and skips the rest.
I can make it work by changing the query as shown below. Now table_2 is updated as expected, since the CTE no longer produces duplicate rows per id from table_1.
But I would like to know why my previous query does not work. Is there anything wrong with it?
with t1_rows as (
select id, sum(counts) as counts, array_agg(track) as track
from table_1
group by id
)
update table_2 set counts = (coalesce(table_2.counts, 0) + t1.counts)::float
from t1_rows t1
where table_2.id = t1.id;
Schema
CREATE TABLE IF NOT EXISTS table_1(
id varchar not null,
counts integer not null,
track integer not null
);
CREATE TABLE IF NOT EXISTS table_2(
id varchar not null,
counts integer not null
);
insert into table_1(id, counts, track) values(1, 10, 1), (2, 10, 2), (2, 10, 3);
insert into table_2(id, counts) values(1, 0), (2, 0);
The problem is that an UPDATE in PostgreSQL creates a new version of the row rather than changing the row in place, but the new row version is not visible in the snapshot of the current query. So from the point of view of the query, the row “vanishes” when it is updated the first time.
The documentation says:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
So if I read your question correctly, you expect rows 2 and 3 from table_1 to be added together? If so, the reason your first approach didn't work is that it grouped by id, track.
Since rows 2 and 3 have different values in the track column, the GROUP BY clause kept them as separate rows instead of adding them together.
Your second approach worked because it grouped only by id.
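To make the "at most one source row per target row" rule concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for PostgreSQL. It reuses the schema and data from the question but replaces UPDATE ... FROM with a correlated subquery (a swapped-in technique, not the query from the question), so each table_2 row receives exactly one pre-aggregated value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id TEXT NOT NULL, counts INTEGER NOT NULL, track INTEGER NOT NULL);
CREATE TABLE table_2 (id TEXT NOT NULL, counts INTEGER NOT NULL);
INSERT INTO table_1 VALUES ('1', 10, 1), ('2', 10, 2), ('2', 10, 3);
INSERT INTO table_2 VALUES ('1', 0), ('2', 0);
""")

# The subquery collapses all table_1 rows for the current id into a single
# sum, so every target row is updated once with the full total.
conn.execute("""
    UPDATE table_2
    SET counts = counts + (
        SELECT COALESCE(SUM(t1.counts), 0)
        FROM table_1 AS t1
        WHERE t1.id = table_2.id
    )
""")

updated = conn.execute("SELECT id, counts FROM table_2 ORDER BY id").fetchall()
print(updated)  # [('1', 10), ('2', 20)]
```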

Greatest N Per Group with JOIN and multiple order columns

I have two tables:
Table0:
| ID | TYPE | TIME | SITE |
|----|------|-------|------|
| aa | 1 | 12-18 | 100 |
| aa | 1 | 12-10 | 101 |
| bb | 2 | 12-10 | 102 |
| cc | 1 | 12-09 | 100 |
| cc | 2 | 12-12 | 103 |
| cc | 2 | 12-01 | 109 |
| cc | 1 | 12-07 | 101 |
| dd | 1 | 12-08 | 100 |
and
Table1:
| ID |
|----|
| aa |
| cc |
| cc |
| dd |
| dd |
I'm trying to output results where:
ID must exist in both tables.
TYPE must be the maximum for each ID.
TIME must be the minimum value for the maximum TYPE for each ID.
SITE should be the value from the same row as the minimum TIME value.
Given my sample data, my results should look like this:
| ID | TYPE | TIME | SITE |
|----|------|-------|------|
| aa | 1 | 12-10 | 101 |
| cc | 2 | 12-01 | 109 |
| dd | 1 | 12-08 | 100 |
I've tried these statements:
INSERT INTO "NuTable"
SELECT DISTINCT(QTS."ID"), "SITE",
CASE WHEN MAS.MAB=1 THEN 'B'
WHEN MAS.MAB=2 THEN 'F'
ELSE NULL END,
"TIME"
FROM (SELECT DISTINCT("ID") FROM TABLE1) AS QTS,
TABLE0 AS MA,
(SELECT "ID", MAX("TYPE") AS MASTY, MIN("TIME") AS MASTM
FROM TABLE0
GROUP BY "ID") AS MAS,
WHERE QTS."ID" = MA."ID"
AND QTS."ID" = MAS."ID"
AND MSD.MASTY =MA."TYPE"
...which generates a syntax error
INSERT INTO "NuTable"
SELECT DISTINCT(QTS."ID"), "SITE",
CASE WHEN MAS.MAB=1 THEN 'B'
WHEN MAS.MAB=2 THEN 'F'
ELSE NULL END,
"TIME"
FROM (SELECT DISTINCT("ID") FROM TABLE1) AS QTS,
TABLE0 AS MA,
(SELECT "ID", MAX("TYPE") AS MAB
FROM TABLE0
GROUP BY "ID") AS MAS,
((SELECT "ID", MIN("TIME") AS MACTM, MIN("TYPE") AS MACTY
FROM TABLE0
WHERE "TYPE" = 1
GROUP BY "ID")
UNION
(SELECT "ID", MIN("TIME"), MAX("TYPE")
FROM TABLE0
WHERE "TYPE" = 2
GROUP BY "ID")) AS MACU
WHERE QTS."ID" = MA."ID"
AND QTS."ID" = MAS."ID"
AND MACU."ID" = QTS."ID"
AND MA."TIME" = MACU.MACTM
AND MA."TYPE" = MACU.MACTB
... which is getting the wrong results.
Answering your direct question "how to avoid...":
You get this error when a column in the SELECT list is neither listed in the GROUP BY clause nor wrapped in an aggregate function such as MAX, MIN or AVG.
With your data, I cannot say:
SELECT
ID, site, min(time)
FROM
table
GROUP BY
id
because I didn't say what to do with SITE. It is either a key of the group (in which case I get every unique combination of ID and site, with the min time in each) or it should be aggregated (e.g. max site per ID).
These are ok:
SELECT
ID, max(site), min(time)
FROM
table
GROUP BY
id
SELECT
ID, site, min(time)
FROM
table
GROUP BY
id,site
I cannot leave it unspecified: what should the database return in that case? (If you're still struggling, tell me in the comments what you think the db should do, and I'll better understand your thinking so I can tell you why it can't do that.) The programmer of the database cannot make this decision for you; you must make it.
Usually people ask this when they want to identify:
The min time per ID, and get all the other row data as well. eg "What is the full earliest record data for each id?"
In this case you have to write a query that identifies the min time per id and then join that subquery back to the main data table on id=id and time=mintime. The db runs the subquery, builds a list of min time per id, then that effectively becomes a filter of the main data table
SELECT * FROM
(
SELECT
ID, min(time) as mintime
FROM
table
GROUP BY
id
) findmin
INNER JOIN table t ON t.id = findmin.id and t.time = findmin.mintime
What you cannot do is start putting the other data you want into the query that does the grouping, because you either have to group by the columns you add in (makes the group more fine grained, not what you want) or you have to aggregate them (and then it doesn't necessarily come from the same row as other aggregated columns - min time is from row 1, min site is from row 3 - not what you want)
Looking at your actual problem:
The ID value must exist in two tables.
The Type value must be largest group by id.
The Time value must be smallest in the largest type group.
Leaving out a solution that involves having or analytics for now, so you can get to grips with the theory here:
You need to find the max type group by id, and then join it back to the table to get the other relevant data also (time is needed) for that id/maxtype and then on this new filtered data set you need the id and min time
SELECT t.id,min(t.time) FROM
(
SELECT
ID, max(type) as maxtype
FROM
table
GROUP BY
id
) findmax
INNER JOIN table t ON t.id = findmax.id and t.type = findmax.maxtype
GROUP BY t.id
If you can't see why, let me know
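The join-back idea can be carried one step further to fetch SITE from the same row. The following is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for PostgreSQL, with the sample data from the question. The CTE names findmax and findmin follow the aliases used above; the extra join on table0 and the IN filter on table1 are additions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table0 (id TEXT, type INTEGER, time TEXT, site INTEGER);
CREATE TABLE table1 (id TEXT);
INSERT INTO table0 VALUES
    ('aa', 1, '12-18', 100), ('aa', 1, '12-10', 101), ('bb', 2, '12-10', 102),
    ('cc', 1, '12-09', 100), ('cc', 2, '12-12', 103), ('cc', 2, '12-01', 109),
    ('cc', 1, '12-07', 101), ('dd', 1, '12-08', 100);
INSERT INTO table1 VALUES ('aa'), ('cc'), ('cc'), ('dd'), ('dd');
""")

query = """
WITH findmax AS (              -- step 1: max type per id
    SELECT id, MAX(type) AS maxtype
    FROM table0
    GROUP BY id
),
findmin AS (                   -- step 2: min time within that max type
    SELECT t.id, t.type, MIN(t.time) AS mintime
    FROM table0 AS t
    JOIN findmax ON t.id = findmax.id AND t.type = findmax.maxtype
    GROUP BY t.id, t.type
)
SELECT t.id, t.type, t.time, t.site   -- step 3: join back for the full row
FROM table0 AS t
JOIN findmin ON t.id = findmin.id
            AND t.type = findmin.type
            AND t.time = findmin.mintime
WHERE t.id IN (SELECT id FROM table1) -- id must exist in both tables
ORDER BY t.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('aa', 1, '12-10', 101), ('cc', 2, '12-01', 109), ('dd', 1, '12-08', 100)]
```

Note that if two rows tie on id, type and time, this sketch returns both of them.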
SELECT DISTINCT ON (t0.id)
t0.id,
type,
time,
first_value(site) OVER (PARTITION BY t0.id ORDER BY time) as site
FROM table0 t0
JOIN table1 t1 ON t0.id = t1.id
ORDER BY t0.id, type DESC, time
ID must exist in both tables
This can be achieved by joining both tables against their ids. The result of inner joins are rows that exist in both tables.
SITE should be the value from the same row as the minimum TIME value.
This is the same as "Give me the first value of each group of ids ordered by time". This can be done by using the first_value() window function. Window functions can group your data set (PARTITION BY). So you are getting groups of ids which can be ordered separately. first_value() gives the first value of these ordered groups.
TYPE must be the maximum for each ID.
To get the maximum type per id you'll first have to ORDER BY id, type DESC. You are getting the maximum type as first row per id...
TIME must be the minimum value for the maximum TYPE for each ID.
... Then you can order this result by time additionally to assure this condition.
Now you have an ordered data set: For each id, the row with the maximum type and its minimum time is the first one.
DISTINCT ON gives you exactly the first row of each group. In this case the group you defined is (id). The result is your expected one.
I would write this using distinct on and in/exists:
select distinct on (t0.id) t0.*
from table0 t0
where exists (select 1 from table1 t1 where t1.id = t0.id)
order by t0.id, type desc, time asc;

CASE...WHEN in WHERE clause in Postgresql

My query looks like:
SELECT *
FROM table
WHERE t1.id_status_notatka_1 = ANY (selected_type)
AND t1.id_status_notatka_2 = ANY (selected_place)
here I would like to add CASE WHEN
so my query is:
SELECT *
FROM table
WHERE t1.id_status_notatka_1 = ANY (selected_type)
AND t1.id_status_notatka_2 = ANY (selected_place)
AND CASE
WHEN t2.id_bank = 12 THEN t1.id_status_notatka_4 = ANY (selected_effect)
END
but it doesn't work. The syntax is valid, but the query finds nothing. So my question is: how do I use CASE WHEN in a WHERE clause? Short example: if a=0, then add some condition to WHERE (AND condition); if not, don't add it.
No need for a CASE expression; simply use OR with parentheses:
AND (t2.id_bank <> 12 OR t1.id_status_notatka_4 = ANY (selected_effect))
For those looking to use a CASE in the WHERE clause, in the above adding an else true condition in the case block should allow the query to work as expected. In the OP, the case will resolve as NULL, which will result in the WHERE clause effectively selecting WHERE ... AND NULL, which will always fail.
SELECT *
FROM table
WHERE t1.id_status_notatka_1 = ANY (selected_type)
AND t1.id_status_notatka_2 = ANY (selected_place)
AND CASE
WHEN t2.id_bank = 12 THEN t1.id_status_notatka_4 = ANY (selected_effect)
ELSE true
END
The accepted answer works, but I'd like to share input for those who are looking for a different answer. Thanks to sagi, I've come up with the following query, but I'd like to give a test case as well.
Let us assume this is the structure of our table
tbl
id | type | status
-----------------------
1 | Student | t
2 | Employee | f
3 | Employee | t
4 | Student | f
Say we want to select all Student rows that have status = 't', but we also want to retrieve all Employee rows regardless of their status.
If we run SELECT * FROM tbl WHERE type = 'Student' AND status = 't', we get only the following result and no Employee rows:
tbl
id | type | status
-----------------------
1 | Student | t
Running SELECT * FROM tbl WHERE status = 't' returns only the result below. It includes one Employee row, but the other Employee row is missing. One could argue that IN might work, but SELECT * FROM tbl WHERE type IN('Student', 'Employee') AND status = 't' gives the same result set.
tbl
id | type | status
-----------------------
1 | Student | t
3 | Employee | t
Remember, we want to retrieve all Employee rows regardless of their status. To do that, we run the query
SELECT * FROM tbl WHERE (type = 'Student' AND status = 't') OR (type = 'Employee')
result will be
table
id | type | status
-----------------------
1 | Student | t
2 | Employee | f
3 | Employee | t
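The difference between the three WHERE forms discussed above can be checked side by side. This is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for PostgreSQL, with the tbl data from the test case. The ids helper is hypothetical, and ELSE 1 plays the role of ELSE true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INTEGER, type TEXT, status TEXT);
INSERT INTO tbl VALUES
    (1, 'Student', 't'), (2, 'Employee', 'f'),
    (3, 'Employee', 't'), (4, 'Student', 'f');
""")

def ids(where):
    """Hypothetical helper: return the ids matching a WHERE expression."""
    sql = f"SELECT id FROM tbl WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

# Without ELSE, the CASE yields NULL for Employee rows, so they are dropped.
no_else = ids("CASE WHEN type = 'Student' THEN status = 't' END")

# With ELSE 1 (true), rows that skip the WHEN branch pass the filter.
with_else = ids("CASE WHEN type = 'Student' THEN status = 't' ELSE 1 END")

# The OR form expresses the same condition without CASE.
with_or = ids("type <> 'Student' OR status = 't'")

print(no_else, with_else, with_or)  # [1] [1, 2, 3] [1, 2, 3]
```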

Access query to grab +5 or more duplicates

I have a small problem with an Access query (don't ask me why, but I cannot use a real DBMS, only Access).
I have a huge table with about 920k records.
I have to go through all of that data and grab each ref that occurs more than 5 times on the same date.
table = myTable
--------------------------------------------------------------
| id | ref | date | C_ERR_ANO |
--------------------------------------------|-----------------
| 1 | A12345678 | 2012/02/24 | A 4565 |
| 2 | D52245708 | 2011/05/02 | E 5246 |
| ... | ......... | ..../../.. | . .... |
--------------------------------------------------------------
So to sum up, I have 900,000+ records.
There are duplicates on the same date (by the way, there is another column, named C_ERR_ANO, that I forgot to add at first).
So I have to go through all those rows and group each ref by date AND error number,
and if the same error number occurs more than 5 times, I have to grab those rows and display them in the result.
I ended up using this query:
SELECT DISTINCT Centre.REFERENCE, Centre.DATESE, Centre.C_ERR_ANO
FROM Centre INNER JOIN (SELECT
Centre.[REFERENCE],
COUNT(*) AS `toto`,
Centre.DATESE
FROM Centre
GROUP BY REFERENCE
HAVING COUNT(*) > 5) AS Centre_1
ON Centre.REFERENCE = Centre_1.REFERENCE
AND Centre.DATESE <> Centre_1.DATESE;
But this query isn't right.
Then I tried:
SELECT DATESE, REFERENCE, C_ERR_ANO, COUNT(REFERENCE) AS TOTAL
FROM (
SELECT *
FROM Centre
WHERE (((Centre.[REFERENCE]) NOT IN (SELECT [REFERENCE]
FROM [Centre] AS Tmp
GROUP BY [REFERENCE],[DATESE],[C_ERR_ANO]
HAVING Count(*)>1 AND [DATESE] = [Centre].[DATESE]
AND [C_ERR_ANO] = [Centre].[C_ERR_ANO]
AND [LIBELLE] = [Centre].[LIBELLE])))
ORDER BY Centre.[REFERENCE], Centre.[DATESE], Centre.[C_ERR_ANO])
GROUP BY REFERENCE, DATESE, C_ERR_ANO
Still not working.
I'm struggling.
Your group by clause needs to include all of the items in your select. Why not use:
SELECT Centre.DATESE, Centre.C_ERR_ANO, COUNT(*) AS TOTAL
FROM Centre
GROUP BY Centre.DATESE, Centre.C_ERR_ANO
HAVING COUNT(*) > 5
If you need other fields then you can add them, as long as you ensure the same fields appear in the select as the group by.
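As a rough check of the GROUP BY ... HAVING approach, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for Access. The Centre rows are made-up sample data, and adding REFERENCE to both the SELECT list and the GROUP BY is an assumption about what the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Centre (REFERENCE TEXT, DATESE TEXT, C_ERR_ANO TEXT)")

# Made-up sample data: one combination occurs 6 times, another only 5.
sample = [("A12345678", "2012/02/24", "A 4565")] * 6
sample += [("D52245708", "2011/05/02", "E 5246")] * 5
conn.executemany("INSERT INTO Centre VALUES (?, ?, ?)", sample)

# Every non-aggregated column in SELECT also appears in GROUP BY;
# HAVING then keeps only the groups with more than 5 rows.
frequent = conn.execute("""
    SELECT REFERENCE, DATESE, C_ERR_ANO, COUNT(*) AS TOTAL
    FROM Centre
    GROUP BY REFERENCE, DATESE, C_ERR_ANO
    HAVING COUNT(*) > 5
""").fetchall()

print(frequent)  # [('A12345678', '2012/02/24', 'A 4565', 6)]
```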