Is there a Log Analytics query to get the ADF pipeline details that are running more than 24 hours?

I tried the query below to get the pipelines that have been in progress for more than one day;
however, it returned pipelines that were at some point in progress during the past 24 hours, not the ones still running.
ADFActivityRun
| where TimeGenerated > ago(1d)
| where Status contains "progress"
| extend dataFactory = split(ResourceId, '/')[-1]
| project TimeGenerated, dataFactory, OperationName, Status, PipelineName
| summarize count() by PipelineName, tostring(dataFactory), Status, TimeGenerated
My requirement is to get only those pipeline results that are running more than 24 hours.
Could anyone please let me know if this is even possible?
Thanks!

The below query may work for you.
ADFPipelineRun
| where TimeGenerated > ago(1d)
| where Status == 'InProgress'
| where RunId !in ((ADFPipelineRun | where Status in ("Succeeded", "Failed", "Cancelled") | project RunId))
| where datetime_diff('hour', now(), Start) > 24
| extend dataFactory = split(ResourceId, '/')[-1]
| project TimeGenerated, dataFactory, OperationName, Status, PipelineName
| summarize count() by PipelineName, tostring(dataFactory), Status, TimeGenerated
It returns the pipelines that are InProgress and still not completed after 24 hours.
Please check this output for your reference: as I don't have any pipelines that have been running for more than 24 hours, it does not display any details.
For comparison, see the result below, where my pipelines were InProgress for some time and then failed, but the execution time there is more than 1 second.
You can use the query above to get details of pipelines that started more than 24 hours ago and are still running.
Reference:
https://www.techtalkcorner.com/long-running-azure-data-factory-pipelines/

I'd recommend using summarize arg_max(...) by ... to find the latest state of every ADF pipeline run. See the Kusto documentation for arg_max() for more info.
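For example, a minimal sketch of that approach (same ADFPipelineRun table as above; the 24-hour threshold is from the question):
ADFPipelineRun
| summarize arg_max(TimeGenerated, *) by RunId // keep only the latest record per pipeline run
| where Status == 'InProgress'
| where datetime_diff('hour', now(), Start) > 24
| extend dataFactory = tostring(split(ResourceId, '/')[-1])
| project TimeGenerated, dataFactory, PipelineName, RunId, Start, Status
Because arg_max keeps only the latest record per RunId, a run whose latest status is still InProgress has by definition not completed, which makes the !in subquery unnecessary.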

Related

How can I find the last time an Azure Function was executed in a Log Analytics workspace?

Trying to find the last execution time for all my functions in log analytics
I wrote this simple query to start
AppRequests
| distinct OperationName
| take 10
But how can I get the last execution time? I tried adding TimeGenerated as well.
I would like the final result to be:
OperationName, LastExecutionTime
Function1 2022-01-01
Function2 2021-05-05
And so on
Use the summarize operator:
AppRequests
| summarize LastExecutionTime = max(TimeGenerated) by OperationName
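If you also want other columns from that latest invocation, not just the time, a variation worth trying is arg_max:
AppRequests
| summarize arg_max(TimeGenerated, *) by OperationName // keeps the entire latest row per function
| project OperationName, LastExecutionTime = TimeGenerated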

Kusto query to get percentage value of events over time

I have a Kusto / KQL query in Azure Log Analytics that aggregates a count of events over time, e.g.:
customEvents
| where name == "EventICareAbout"
| extend channel = customDimensions["ChannelName"]
| summarize events=count() by bin(timestamp, 1m), tostring(channel)
This gives a good result set: a count of the events in each minute bucket.
But the count on its own is fairly meaningless; what I want to know is whether that count differs from the average over, say, the last hour.
I'm not even sure how to start constructing something like that.
Any pointers?
There are a couple of ways to achieve this. First, calculate the hourly average as an additional column, then compute the difference from that hourly average:
let minuteValues = customEvents
| where name == "EventICareAbout"
| extend channel = customDimensions["ChannelName"]
| summarize events = count() by bin(timestamp, 1m), tostring(channel)
| extend Day = startofday(timestamp), hour = hourofday(timestamp);
let hourlyAverage = customEvents
| where name == "EventICareAbout"
| extend channel = customDimensions["ChannelName"]
| summarize events = count() by bin(timestamp, 1m), tostring(channel)
| summarize hourlyAvgEvents = avg(events) by bin(timestamp, 1h), tostring(channel)
| extend Day = startofday(timestamp), hour = hourofday(timestamp);
minuteValues
| lookup hourlyAverage on hour, Day, channel // match on channel too, so averages aren't mixed across channels
| extend Diff = events - hourlyAvgEvents
Another option is to use the built-in anomaly detection functions.
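For instance, here is a minimal sketch using make-series and series_decompose_anomalies (the event name and dimension come from the question; the one-day range and 1.5 threshold are placeholder choices):
customEvents
| where name == "EventICareAbout"
| extend channel = tostring(customDimensions["ChannelName"])
| make-series events = count() default = 0 on timestamp from ago(1d) to now() step 1m by channel
| extend (anomalies, score, baseline) = series_decompose_anomalies(events, 1.5, -1, 'linefit')
| render anomalychart with (anomalycolumns=anomalies)
Rather than comparing each minute to a single hourly average, this scores each minute against a learned baseline and flags the outliers.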

Setting up an alert for Long Running Pipelines in ADF v2 using Kusto Query

I have a pipeline in ADF V2 which generally takes 3 hours to run, but sometimes it takes more than 3 hours. I want to set up an alert if the pipeline runs for more than 3 hours, using Azure Log Analytics (Kusto query). I have written a query, but it only shows results once the pipeline has succeeded or failed. I want an alert while the pipeline is still in progress and has been running for more than 3 hours.
My query is
ADFPipelineRun
| where PipelineName == "XYZ"
| where (End - Start) > 3h
| project information = 'Expected Time : 3 Hours, Pipeline took more than 3 hours', PipelineName, (End - Start)
Could you please help me to solve this issue?
Thanks in Advance.
Lalit
Updated:
Please change your query as shown below:
ADFPipelineRun
| where PipelineName == "pipeline11"
| top 1 by TimeGenerated
| where Status in ("Queued", "InProgress")
| where (now() - Start) > 3h //please change the time to 3h in your case
| project information = 'Expected Time : 3 Hours, Pipeline took more than 3 hours', PipelineName, Duration = now() - Start
Explanation:
The pipeline has statuses like Succeeded, Failed, Queued, and InProgress. If the pipeline is currently running and not completed, its status must be one of two: Queued or InProgress.
So we just need to get the latest record by using top 1 by TimeGenerated, then check whether its status is Queued or InProgress (in the query, that's where Status in ("Queued","InProgress")).
Finally, we check whether it has been running for more than 3 hours by using where (now() - Start) > 3h.
I tested it myself and it works. Please let me know if you still have any issues.
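Note that top 1 by TimeGenerated assumes you are watching a single pipeline whose runs don't overlap. If you need the same alert across many pipelines or concurrent runs, a sketch of a variation (using the arg_max idea from the first question above) would be:
ADFPipelineRun
| summarize arg_max(TimeGenerated, *) by RunId // latest record per run
| where Status in ("Queued", "InProgress")
| where (now() - Start) > 3h
| project PipelineName, RunId, Start, Duration = now() - Start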

Splunk search - How to loop on a multivalue field

My use case is analysing tickets in order to attribute a state based on all the statuses of a specific ticket.
Raw data looks like this:
Id    Version  Status                        Event Time
0001  1        New                           2021-01-07T09:14:00Z
0001  1        Completed - Action Performed  2021-01-07T09:38:00Z
Data looks like this after the transaction command:
Id          Version  Status                             Event Time                                  state
0001, 0001  1, 1     New, Completed - Action Performed  2021-01-07T09:14:00Z, 2021-01-07T09:38:00Z  Acknowledgement, Work
I'm using the transaction command in order to calculate the duration of acknowledgement and resolution of the ticket.
I have predefined rules to choose the correct state. These rules compare the previous status (New) and the current status (Completed - Action Performed) to choose the state.
Issue
Each ticket has a different number of statuses, and we cannot know the maximum number in advance, so I cannot write a static search comparing each value of the Status field.
Expected Solution
I have a field that tells me the number of values in the Status field (the number of statuses of a ticket).
I want to use a loop (why not a for loop) to iterate over each index of the Status field and compare value i-1 with value i.
I cannot find how to do this. Is it possible?
Thank you
Update to reflect more details
Here's a method with streamstats that should get you towards an answer:
index=ndx sourcetype=srctp Id=* Version=* Status=* EventTime=* state=*
| eval phash=sha256(Version.Status)
| sort 0 _time
| streamstats current=f last(phash) as chash by Id state
| fillnull value="noprev" chash
| eval changed=if(chash!=phash OR chash="noprev","true","false")
| search NOT changed="false"
| table *
original answer
Something like the following should work to get the most-recent status:
index=ndx sourcetype=srctp Id=* Version=* Status=* EventTime=* state=*
| stats latest(Status) as Status latest(Version) as Version latest(state) as state latest(EventTime) as "Event Time" by Id
Edit, in light of the mention of the transaction command:
Don't use transaction unless you really really really need to.
99% of the time, stats will accomplish what transaction does faster and more efficiently.
For example:
index=ndx sourcetype=srctp Id=* Version=* Status=* EventTime=* state=*
| stats earliest(Status) as eStatus latest(Status) as lStatus earliest(Version) as eVersion latest(Version) as lVersion earliest(state) as estate latest(state) as lstate earliest(EventTime) as Opened latest(EventTime) as MostRecent by Id
This will yield a table you can then manipulate further with eval and such, e.g. (presuming the time format is subtractable, i.e. still in Unix epoch format):
| eval ticketAge=MostRecent-Opened
| eval Versions=eVersion+" - "+lVersion
| eval Statuses=eStatus+" - "+lStatus
| eval State=estate.", ".lstate
| eval Opened=strftime(Opened,"%c"), MostRecent=strftime(MostRecent,"%c")
| eval D=if(ticketAge>86400,floor(ticketAge/86400),0)
| eval ticketAge=if(D>0,round(ticketAge-(D*86400)),ticketAge)
| eval H=if(ticketAge>3600,floor(ticketAge/3600),0)
| eval ticketAge=if(H>0,round(ticketAge-(H*3600)),ticketAge)
| eval M=if(ticketAge>60,floor(ticketAge/60),0)
| eval ticketAge=if(M>0,round(ticketAge-(M*60)),ticketAge)
| rename ticketAge as S
| eval Age=D." days ".H." hours ".M." minutes ".S." seconds"
| table Id Versions Statuses Opened MostRecent State Age
| rename MostRecent as "Most Recent"
Note: I may have gotten the conversion from raw seconds into days, hours, minutes, and seconds off, but it should be close.
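An aside, not part of the original answer: Splunk's tostring() can format a number of seconds as a duration directly, which avoids the manual days/hours/minutes decomposition above. Right after computing ticketAge you could do:
| eval Age=tostring(ticketAge,"duration")
This renders the age as [days+]HH:MM:SS rather than a sentence, which may or may not suit your table.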

Stats Count Splunk Query

I wonder whether someone can help me please.
I'd made the following post about Splunk query I'm trying to write:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html
I received some great help, but despite working on this for a few days now, concentrating on eval if statements, I still have the same issue: the "Successful" and "Unsuccessful" columns show blank results. So I thought I'd cast the net a little wider and ask whether someone may be able to look at this and offer some guidance on how I might get around the problem.
Many thanks and kind regards
Chris
I tried exploring your use case with the splunkd access log and came up with some simple SPL to help you.
In this query I am actually joining the output of 2 searches which aggregate the required results (not concerned about search performance here).
Give it a try. If you have access to the _internal index, this will work as-is. You should be able to easily modify it to suit your events (e.g. replace user with ClientID).
index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| stats count as All sum(eval(if(status <= 303,1,0))) as Successful sum(eval(if(status > 303,1,0))) as Unsuccessful by user
| join type=left user
[ search index=_internal source="/opt/splunk/var/log/splunk/splunkd_access.log"
| chart count BY user status ]
I updated your search from the Splunk community answers (it should look like this):
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| stats count as All sum(eval(if(statusCode <= 303,1,0))) as Successful sum(eval(if(statusCode > 303,1,0))) as Unsuccessful by ClientID
| join type=left ClientID
[ search `w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientID as ClientID detail.statusCode AS statusCode
| chart count BY ClientID statusCode ]
I answered this on Splunk Answers:
https://answers.splunk.com/answers/724223/in-a-table-powered-by-a-stats-count-search-can-you.html?childToView=729492#answer-729492
Using dummy encoding, it looks like this:
`w2_wmf(RequestCompleted)` request.detail.Context="*test"
| dedup eventId
| rename request.ClientId as ClientID, detail.statusCode as Status
| eval X_{Status}=1
| stats count as Total sum(X_*) as X_* by ClientID
| rename X_* as *
This will give you ClientID, a count, and then a column for each status code found, with a sum of each code in that column.
As I gather you can't get this working, this query should show dummy encoding in action
index=_internal sourcetype=*access
| eval X_{status}=1
| stats count as Total sum(X_*) as X_* by source, user
| rename X_* as *
This would give an output with one row per source and user: a Total column, plus one column for each status code found, containing the count of events with that status.