Add a Dummy Row for Each Row in the Table - azure-log-analytics

I have the below query, which returns the % CPU of each Computer in 1-hour bins:
Query
Perf
| where TimeGenerated > ago(1h)
| where CounterName == "% Processor Time"
| where Computer endswith "XYZ"
| summarize avg(CounterValue) by bin(TimeGenerated, 1h), Computer
Result
I want to append a dummy row for each row in the table, with fixed values except for TimeGenerated, which should be the same as in the original row. The expected result should look something like this.
Expected Result

you could try something like this (note that you'll need to explicitly order your records as you wish):
let T =
Perf
| where TimeGenerated > ago(1h)
| where CounterName == "% Processor Time"
| where Computer endswith "XYZ"
| summarize avg(CounterValue) by bin(TimeGenerated, 1h), Computer
;
T
| union (T | extend Computer = "Dummy", avg_CounterValue = 10)
| order by TimeGenerated
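If it helps to see the pattern outside Kusto, here is a minimal Python sketch of the same union-with-constants idea; the sample timestamps and values are invented for illustration:

```python
# Append a "dummy" companion row for each original row: fixed Computer
# and value, but the same TimeGenerated as the row it shadows.
rows = [
    {"TimeGenerated": "2023-01-01T10:00", "Computer": "VM-XYZ", "avg_CounterValue": 42.5},
    {"TimeGenerated": "2023-01-01T11:00", "Computer": "VM-XYZ", "avg_CounterValue": 37.0},
]

dummies = [{**r, "Computer": "Dummy", "avg_CounterValue": 10} for r in rows]

# Equivalent of `T | union (...) | order by TimeGenerated`
combined = sorted(rows + dummies, key=lambda r: r["TimeGenerated"])
```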

Related

How better I can optimize this Kusto Query to get my logs

I have the below query, which I am running to get logs for Azure K8s, but it takes an hour to generate the logs, and I am hoping there is a better way to write what I have already written. Can some Kusto experts advise how I can improve the performance?
AzureDiagnostics
| where Category == 'kube-audit'
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27")))
| where (strlen(log_s) >= 32000
    and not(log_s has "aksService")
    and not(log_s has "system:serviceaccount:crossplane-system:crossplane"))
    or strlen(log_s) < 32000
| extend op = parse_json(log_s)
| where not(tostring(op.verb) in ("list", "get", "watch"))
| where substring(tostring(op.responseStatus.code), 0, 1) == "2"
| where not(tostring(op.requestURI) in ("/apis/authorization.k8s.io/v1/selfsubjectaccessreviews"))
| extend user = op.user.username
| extend decision = tostring(parse_json(tostring(op.annotations)).["authorization.k8s.io/decision"])
| extend requestURI = tostring(op.requestURI)
| extend name = tostring(parse_json(tostring(op.objectRef)).name)
| extend namespace = tostring(parse_json(tostring(op.objectRef)).namespace)
| extend verb = tostring(op.verb)
| project TimeGenerated, SubscriptionId, ResourceId, namespace, name, requestURI, verb, decision, ['user']
| order by TimeGenerated asc
You could try starting your query as follows.
Please note the additional condition at the end.
AzureDiagnostics
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27")))
| where Category == 'kube-audit'
| where log_s hasprefix '"code":2'
I assumed that code is an integer; in case it is a string, use the following (note the added quote):
| where log_s hasprefix '"code":"2'
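The idea behind that pre-filter (a cheap textual check before the expensive parse_json) can be illustrated in Python; the sample log lines are invented:

```python
import json

# Invented sample log lines standing in for log_s
logs = [
    '{"code":200,"verb":"get"}',
    '{"code":500,"verb":"create"}',
    '{"code":201,"verb":"create"}',
]

# Cheap substring pre-filter (analogous to the `hasprefix '"code":2'` idea)
# so the costly JSON parse only runs on likely matches.
candidates = [s for s in logs if '"code":2' in s]
ops = [json.loads(s) for s in candidates]
kept = [op for op in ops if op["verb"] not in ("list", "get", "watch")]
```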

PostgreSQL Compare value from row to value in next row (different column)

I have a table of encounters called user_dates that is ordered by 'user' and 'start' like below. I want to create a column indicating whether an encounter was followed up by another encounter within 30 days. So basically I want to go row by row checking if "encounter_stop" is within 30 days of "encounter_start" in the following row (as long as the following row is the same user).
user | encounter_start | encounter_stop
A | 4-16-1989 | 4-20-1989
A | 4-24-1989 | 5-1-1989
A | 6-14-1993 | 6-27-1993
A | 12-24-1999 | 1-2-2000
A | 1-19-2000 | 1-24-2000
B | 2-2-2000 | 2-7-2000
B | 5-27-2001 | 6-4-2001
I want a table like this:
user | encounter_start | encounter_stop | subsequent_encounter_within_30_days
A | 4-16-1989 | 4-20-1989 | 1
A | 4-24-1989 | 5-1-1989 | 0
A | 6-14-1993 | 6-27-1993 | 0
A | 12-24-1999 | 1-2-2000 | 1
A | 1-19-2000 | 1-24-2000 | 0
B | 2-2-2000 | 2-7-2000 | 1
B | 5-27-2001 | 6-4-2001 | 0
You can use select ..., exists (select ... criteria), which returns a boolean (always true or false); if you really want 1 or 0, just cast the result to integer: true => 1 and false => 0. See Demo.
select ts1.user_id
     , ts1.encounter_start
     , ts1.encounter_stop
     , (exists ( select null
                 from test_set ts2
                 where ts1.user_id = ts2.user_id
                   and ts2.encounter_start
                       between ts1.encounter_stop
                           and (ts1.encounter_stop + interval '30 days')::date
               )
       )::integer as subsequent_encounter_within_30_days
from test_set ts1
order by user_id, encounter_start;
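As a sanity check on the logic, here is a small Python sketch of the same 30-day lookup over the sample data from the question:

```python
from datetime import date, timedelta

rows = [  # (user, encounter_start, encounter_stop), from the question
    ("A", date(1989, 4, 16), date(1989, 4, 20)),
    ("A", date(1989, 4, 24), date(1989, 5, 1)),
    ("A", date(1993, 6, 14), date(1993, 6, 27)),
    ("A", date(1999, 12, 24), date(2000, 1, 2)),
    ("A", date(2000, 1, 19), date(2000, 1, 24)),
    ("B", date(2000, 2, 2), date(2000, 2, 7)),
    ("B", date(2001, 5, 27), date(2001, 6, 4)),
]

def flags(rows, window=timedelta(days=30)):
    # 1 if any same-user encounter starts within `window` after this stop
    # (a row's own start precedes its stop, so it never matches itself)
    return [
        int(any(u == user and stop <= s <= stop + window
                for u, s, _ in rows))
        for user, _, stop in rows
    ]
```

Note the last B row in the expected output comes out 0 here, matching the "Difference" remark below.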
Difference: The above (and demo) disagree with your expected result:
B | 2-2-2000 | 2-7-2000| 1
subsequent_encounter (last column) should be 0. This entry starts and ends in Feb 2000, the other B entry starts In May 2001. Please explain how these are within 30 days (other than just a simple typo that is).
Caution: Do not use user as a column name. It is both a Postgres and SQL-standard reserved word. You can sometimes get away with it, or double-quote it; but if you double-quote it you MUST always do so. The big problem is that it has a predefined meaning (run select user;), and if you forget to double-quote it, it does not necessarily produce an error or exception; it is much worse: wrong results.

How to sum the minutes of each activity in Postgresql?

The column "activitie_time_enter" has the times.
The column "activitie_still" indicates the type of activity.
The column "activitie_walking" indicates the other type of activity.
Table example:
activitie_time_enter | activitie_still | activitie_walking
17:30:20 | Still |
17:31:32 | Still |
17:32:24 | | Walking
17:33:37 | | Walking
17:34:20 | Still |
17:35:37 | Still |
17:45:13 | Still |
17:50:23 | Still |
17:51:32 | | Walking
What I need is to sum up the total minutes for each activity separately.
Any suggestions or solution?
First calculate the duration of each row (the with CTE) and then do a conditional sum:
with t as
(
select
*, lead(activitie_time_enter) over (order by activitie_time_enter) - activitie_time_enter as duration
from _table
)
select
sum (duration) filter (where activitie_still = 'Still') as total_still,
sum (duration) filter (where activitie_walking = 'Walking') as total_walking
from t;
/** Result:
total_still|total_walking|
-----------+-------------+
00:19:16| 00:01:56|
*/
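The same lead-and-sum logic can be checked with a quick Python sketch over the sample rows, using a single activity column:

```python
from datetime import datetime, timedelta

rows = [  # (activitie_time_enter, activity) from the question
    ("17:30:20", "Still"), ("17:31:32", "Still"),
    ("17:32:24", "Walking"), ("17:33:37", "Walking"),
    ("17:34:20", "Still"), ("17:35:37", "Still"),
    ("17:45:13", "Still"), ("17:50:23", "Still"),
    ("17:51:32", "Walking"),
]

times = [datetime.strptime(t, "%H:%M:%S") for t, _ in rows]
totals = {}
# lead(): each row's duration runs until the next row's timestamp;
# the last row has no successor, so it contributes nothing (NULL in SQL).
for i in range(len(rows) - 1):
    activity = rows[i][1]
    totals[activity] = totals.get(activity, timedelta()) + (times[i + 1] - times[i])
```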
BTW, do you really need two columns (activitie_still and activitie_walking)? A single activity column with those values would do, and it would allow more activities (Running, Sleeping, Working, etc.) without having to change the table structure.

How to GROUP BY a new index in a new VIEW

I have 2 tables (load + road) and I want to make a new view, creating a new column (flag) that indexes the COUNT of the lines, and then GROUP BY this new index.
I have tried this (but it doesn't work):
sprintf(my_cmd,
        "CREATE VIEW myVIEW(id, Rlength, Llength, flag) AS "
        "SELECT road.id, road.length, load.length, COUNT(*) AS flag "
        "FROM road, load "
        "WHERE road.id=load.id; "
        "SELECT id, Rlength, Llength "
        "FROM myVIEW "
        "GROUP BY flag");
ERROR:
Error executing query: ERROR: column "road.id" must appear in the GROUP BY clause or be used in an aggregate function
I am using MySQL.
*edit:
I don't want the new column (flag) to appear in the last SELECT, but I do want to group by it; I don't know if that can be done. If not, what I want to achieve is to use GROUP BY on "SELECT id, Rlength, Llength" and get all the lines into only one group, but I don't have a common parameter between these lines, so I tried to add this "flag".
The full code (sorry for the long question):
sprintf(my_cmd,
        "CREATE VIEW myVIEW3(id, Rlength, Llength, flag) AS "
        "SELECT road.id, road.length, load.length, COUNT(*) AS flag "
        "FROM road, load "
        "WHERE road.id=load.id; "
        "SELECT id, Rlength, Llength FROM myVIEW3 "
        "GROUP BY flag "
        "HAVING COUNT(*) <= %d "
        "ORDER BY (CAST(Llength AS float) / CAST(Rlength AS float)) DESC, id DESC", k);
What I am trying to do is to get the first k lines after some ORDER, without using LIMIT/TOP (it's an assignment). So I have tried using a new VIEW with some indicator that I can use to group all lines into one group and then use HAVING COUNT(flag) <= k.
road:
.--------.----------------.----------------.
| Id | length | speed |
.--------.----------------.----------------.-
| 9 | 55 | 90 |
| 10 | 44 | 80 |
| 11 | 70 | 100 |
load:
.--------.----------------.----------------.
| Id | length | speed |
.--------.----------------.----------------.-
| 9 | 10 | 20 |
| 10 | 15 | 30 |
| 11 | 30 | 60 |
COMMAND:
loadRanking 2
(k=2, so I want to get the first 2 lines after some ORDER; let's not talk about the ORDER in this result)
result:
.--------.----------------.----------------.
| Id | length | speed |
.--------.----------------.----------------.-
| 9 | 10/55 | 20/90 |
| 10 | 15/44 | 30/80 |
Your group by should contain all columns that are being selected that are not part of the aggregate function. So your GROUP BY should look like this:
GROUP BY road.id, road.length, load.length
That being said, I am quite confused by why you have two queries here. I suspect your query should look something like this:
SELECT road.id, road.length, load.length, COUNT(*) AS flag
FROM road, load
WHERE road.id=load.id
GROUP BY road.id, road.length, load.length
HAVING COUNT(*) <= %d
ORDER BY (CAST(load.length AS float) / CAST(road.length AS float)) DESC, road.id DESC
The GROUP BY Statement
Additional note: Try making sure your query works before making it into a view.
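As a sanity check on the join-and-order logic (leaving aside the HAVING/top-k part), here is a small Python sketch over the sample tables from the question:

```python
# Sample road/load tables from the question: id -> (length, speed)
road = {9: (55, 90), 10: (44, 80), 11: (70, 100)}
load = {9: (10, 20), 10: (15, 30), 11: (30, 60)}

# Equivalent of the join on id plus
# ORDER BY (Llength / Rlength) DESC, id DESC
rows = [(i, road[i][0], load[i][0]) for i in road if i in load]
rows.sort(key=lambda r: (r[2] / r[1], r[0]), reverse=True)
```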

PLSQL Double Split using delimiter and then transpose somehow?

so this is where I realize the difference between theory and practice: while I can theoretically picture how it should look, I can't for the life of me figure out how to actually do it. I have tens of thousands of observations that look like this:
>+--------+-------------------------------+--+
>| ID | CALLS | |
>+--------+-------------------------------+--+
>| 162743 | BAD DVR-3|NO PIC-1 | |
>| 64747 | NO PIC-1|BOX HIT-4|PPV DROP-1 | |
>+--------+-------------------------------+--+
And the end results should be something like this:
+--------+---------+--------+---------+----------+--+
| ID | BAD DVR | NO PIC | BOX HIT | PPV DROP | |
+--------+---------+--------+---------+----------+--+
| 162743 | 3 | 1 | 0 | 0 | |
| 64747 | 0 | 1 | 4 | 1 | |
+--------+---------+--------+---------+----------+--+
I'm using PLSQL passthrough in SAS, so if I need to do transposing I can always use proc transpose. But getting to that point is quite honestly beyond me. I know I will probably have to create a function like in the question asked here: T-SQL: Opposite to string concatenation - how to split string into multiple records
Any ideas?
Do you have any reference material that describes all the possible values for those PIPE delimited values in the CALLS column? Or do you already know the particular values you need to keep and can ignore others?
If so, you can just process the entire thing in a data step; here is an example:
data have;
input #1 ID 6. #9 CALLS $50.;
datalines;
162743 BAD DVR-3|NO PIC-1
64747 NO PIC-1|BOX HIT-4|PPV DROP-1
run;
data want;
  set have; /* point to your Oracle source here */
  length field $50;
  idx = 1;
  BAD_DVR = 0;
  NO_PIC = 0;
  BOX_HIT = 0;
  PPV_DROP = 0;
  do i=1 to 5 while(idx ne 0);
    field = scan(calls,idx,'|');
    if field = ' ' then idx=0;
    else do;
      if field =: 'BAD DVR' then BAD_DVR = input(substr(field,9),8.);
      else if field =: 'NO PIC' then NO_PIC = input(substr(field,8),8.);
      else if field =: 'BOX HIT' then BOX_HIT = input(substr(field,9),8.);
      else if field =: 'PPV DROP' then PPV_DROP = input(substr(field,10),8.);
      idx + 1;
    end;
  end;
  output;
  keep ID BAD_DVR NO_PIC BOX_HIT PPV_DROP;
run;
The SCAN function steps through the CALLS column token by token; the "=:" operator means "begins with"; and SUBSTR with only two arguments returns everything from that position onward, i.e. the digits following the hyphen, which the INPUT function then reads as a number.
Of course, I'm making a few assumptions about your source data but you get the idea.
I can think of at least two ways to achieve this:
1. Read the entire data from SQL into SAS. Then use DATA STEP to manipulate the data i.e.,
convert data that is in two columns:
>+--------+-------------------------------+--+
>| ID | CALLS | |
>+--------+-------------------------------+--+
>| 162743 | BAD DVR-3|NO PIC-1 | |
>| 64747 | NO PIC-1|BOX HIT-4|PPV DROP-1 | |
>+--------+-------------------------------+--+
to something that looks like this:
result of DATA STEP manipulation:
ID CALLS COUNT
162743 BAD_DVR 3
162743 NO_PIC 1
64747 NO_PIC 1
64747 BOX_HIT 4
64747 PPV_DROP 1
From then it would be a simple matter of passing the above dataset to PROC TRANSPOSE
to get a table like this:
+--------+---------+--------+---------+----------+--+
| ID | BAD DVR | NO PIC | BOX HIT | PPV DROP | |
+--------+---------+--------+---------+----------+--+
| 162743 | 3 | 1 | 0 | 0 | |
| 64747 | 0 | 1 | 4 | 1 | |
+--------+---------+--------+---------+----------+--+
2. If you want to do everything in pass-through SQL, that too should be easy IF the number of categories such as {BAD DVR, NO PIC, BOX HIT, etc.} is small.
The code will look like:
SELECT
ID
,CASE WHEN SOME_FUNC_TO_FIND_LOCATION_OF_SUBSTRING(CALLS, 'BAD DVR-')>0 THEN <SOME FUNCTION TO EXTRACT EVERYTHING FROM - TO |> ELSE 0 END AS BAD_DVR__COUNT
,CASE WHEN SOME_FUNC_TO_FIND_LOCATION_OF_SUBSTRING(CALLS, 'NO PIC-')>0 THEN <SOME FUNCTION TO EXTRACT EVERYTHING FROM - TO |> ELSE 0 END AS NO_PIC__COUNT
,<and so on>
FROM YOUR_TABLE
You just need to look up the string manipulation functions available in your database to make everything work.
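The parsing step that those placeholder functions stand for amounts to the following, sketched in Python (category names taken from the question):

```python
calls = {  # ID -> CALLS, from the question
    162743: "BAD DVR-3|NO PIC-1",
    64747: "NO PIC-1|BOX HIT-4|PPV DROP-1",
}
categories = ["BAD DVR", "NO PIC", "BOX HIT", "PPV DROP"]

result = {}
for id_, s in calls.items():
    counts = dict.fromkeys(categories, 0)       # unseen categories stay 0
    for token in s.split("|"):                  # split on the pipe delimiter
        name, _, n = token.rpartition("-")      # "BAD DVR-3" -> ("BAD DVR", "3")
        counts[name] = int(n)
    result[id_] = counts
```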