Clarification on how to calculate watermark for Azure Stream Analytics

I'm trying to understand watermarks in Azure Stream Analytics. Per the Microsoft documentation, there are two ways a watermark can be calculated.
https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-time-handling#how-time-progresses-in-azure-stream-analytics
The second method states: "When there's no incoming event, the watermark is the current estimated arrival time minus the late arrival tolerance window. The estimated arrival time is the time that has elapsed from the last time an input event was seen plus that input event's arrival time."
Questions:
What is meant by "No incoming events"? Does this mean that the source (e.g. Event Hub) is verified to be empty?
What is meant by "The last time an input event was seen"? Does this mean when it has exited the processing engine to the source?
Currently, this is how I interpret the calculation:
(7) Watermark = (5) [Estimated Arrival Time] - (6) [Late Arrival Tolerance Window]
(5) Estimated Arrival Time = (1) [Elapsed Time] + (4) [Last Arrival Time]
(1) Elapsed Time = Time elapsed between (2) [the last time an input event was seen] and (3) [the current time]

Yes, "No incoming events" means that, ideally, there is nothing left to process on the Event Hub.
For the second part, I think you have already gone through the doc, which states:
"When there's no incoming event, the watermark is the current estimated arrival time minus the late arrival tolerance window. The estimated arrival time is the time that has elapsed from the last time an input event was seen plus that input event's arrival time."
For example, suppose we are at the 0:45 mark and the last event was expected at 0:40 (let's assume an event comes in every 5 units of time). The watermark will then be 45 - 15, where 15 is the late arrival tolerance window, (6) in your example.
https://learn.microsoft.com/en-us/answers/questions/42145/clarification-on-how-to-calculate-watermark-for-az.html
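As a rough sketch of the arithmetic above (not an official implementation), the no-incoming-event watermark could be written in Python like this. The function and parameter names are made up for illustration, times are plain numbers in the same arbitrary unit as the example, and it assumes the last event both arrived and was seen at 40.

```python
def estimated_arrival_time(last_arrival_time, last_seen_at, now):
    """(5) = (1) elapsed time since the last event was seen + (4) its arrival time."""
    elapsed = now - last_seen_at  # (1) = (3) current time - (2) last time seen
    return last_arrival_time + elapsed


def watermark_when_idle(last_arrival_time, last_seen_at, now, late_arrival_tolerance):
    """(7) = (5) estimated arrival time - (6) late arrival tolerance window."""
    estimate = estimated_arrival_time(last_arrival_time, last_seen_at, now)
    return estimate - late_arrival_tolerance


# Assumed numbers from the example: last event at 40, current time 45, tolerance 15.
wm = watermark_when_idle(last_arrival_time=40, last_seen_at=40, now=45,
                         late_arrival_tolerance=15)
print(wm)
```

With these assumed inputs the estimated arrival time is 45 and the watermark is 45 - 15.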

Related

AWS Cloudwatch alarm set to NonBreaching (or notBreaching) is not triggering, based on a log filter

With the following Metric and Alarm combination
Metric
Comes from a Cloudwatch log filter (when a match is found on the log)
Metric value: "1"
Default value: None
Unit: Count
Alarm
Statistic: Sum
Period: 1 minute
Treat missing data as: notBreaching
Threshold: [Metric] > 0 for 1 datapoints within 1 minute
The alarm goes to:
State changed to OK at 2018/12/17.
Reason: Threshold Crossed: no datapoints were received for 1 period and 1 missing datapoint was treated as [NonBreaching].
And then it doesn't trigger, even though I force the metric > 0
Why is the alarm stuck in OK? How can the alarm become triggered again?

Solution
Remove the "Unit" property from the stack template Alarm config.
The source of the problem was actually the "Unit" property. This being set to "Count" actually made the alarm become stuck :(
Ensure the stack is producing the same result as a manual alarm setup by checking with the describe-alarms API.
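One way to do that comparison is to save the describe-alarms output for both alarms and diff the relevant fields. The Python sketch below is only an illustration: the helper name and the sample dictionaries are hypothetical, and it assumes each dictionary is one entry of the MetricAlarms list in the describe-alarms response.

```python
# Fields of a MetricAlarms entry that affect how the alarm is evaluated.
FIELDS = (
    "Namespace", "MetricName", "Statistic", "Period", "Unit",
    "EvaluationPeriods", "DatapointsToAlarm", "Threshold",
    "ComparisonOperator", "TreatMissingData",
)


def diff_alarms(stack_alarm, manual_alarm, fields=FIELDS):
    """Return {field: (stack_value, manual_value)} for every field that differs."""
    diffs = {}
    for field in fields:
        stack_value = stack_alarm.get(field)    # None when the key is absent
        manual_value = manual_alarm.get(field)
        if stack_value != manual_value:
            diffs[field] = (stack_value, manual_value)
    return diffs


# Hypothetical sample data: the stack-created alarm carries "Unit", the manual one does not.
manual = {"MetricName": "LogMatches", "Statistic": "Sum", "Period": 60,
          "Threshold": 0.0, "ComparisonOperator": "GreaterThanThreshold",
          "TreatMissingData": "notBreaching"}
from_stack = dict(manual, Unit="Count")

differences = diff_alarms(from_stack, manual)
print(differences)
```

An empty result would suggest the two alarms are configured the same way for the listed fields.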

AnyLogic selectOutput condition

I'm simulating a queuing system where customers join one queue called RDQueue with a capacity of 5, and then move to a different queue called TDQueue when RDQueue is full (has reached its capacity).
I used a selectOutput block with RDQueue on the true branch and TDQueue on the false branch with the condition: RDQueue.size()<5
There should be customers going to TDQueue, but when I run this simulation no customers ever go through the false branch.
(for some reason the image of what I've done won't upload)
I have a source with arrival rate of 0.361 per minute and a delay for RD with a delay time: exponential(8.76) minutes.
According to queuing theory, 68.5% of arriving customers should find RDQueue full and go to TDQueue.
TIA

If your delay time is exponential(8.76), the delay will almost always be much shorter than the time between arrivals, because the argument is the rate λ, not the mean:
Random sample from exponential distribution: x = log(1-u)/(−λ)
With λ = 8.76 and u a uniform random number, the expected value of your delay time is 1/λ ≈ 0.114 minutes, against a mean interarrival time of 1/0.361 ≈ 2.77 minutes, so the probability of RDQueue being full is nearly 0%.
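To see this outside of AnyLogic, here is a small Python sketch that draws samples with the same inverse-transform formula and averages them. It only illustrates the formula above and is not AnyLogic's own implementation; the function name, seed, and sample size are arbitrary choices.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inverse-transform sample: x = log(1 - u) / (-rate), with u uniform in [0, 1)."""
    u = rng.random()
    return math.log(1.0 - u) / (-rate)


rng = random.Random(42)   # fixed seed so the run is repeatable
n = 100_000
mean_delay = sum(sample_exponential(8.76, rng) for _ in range(n)) / n

print(f"mean delay        ~ {mean_delay:.3f} minutes (theory: {1 / 8.76:.3f})")
print(f"mean interarrival = {1 / 0.361:.2f} minutes")
```

The sample mean lands close to 1/8.76, far below the mean interarrival time, which is why RDQueue practically never fills up.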

eWAM - In Wynsure - Invalid time format error in aWFOperationAssignment object

When processes like GBP Subscription/Member Enrollment/Member Endorsement are performed and then accepted, the system throws the following error:
“Object of the class type aWFOperationAssignment cannot be stored in the database with the corresponding NSID, ID & Version”
and the transaction is rolled back with the error below shown in the error report.
“The transaction is roll backed. Err Code= 22007. ErrMsg=SQLState=22007 . [Microsoft][SQL Server Native Client 10.0]Invalid time format”.
This happens only in a few of the environments. I'm not sure whether this is a code or a configuration issue.

This issue is caused by the “Bank Holidays Context” configuration in Wynsure.
In the Bank Holidays settings (Business Administration -> General Settings -> Bank Holiday), the End Time is supposed to be configured in 24-hour format. If it is configured as, for example, 8 for the start time and 5 for the end time, instead of 8 and 17, the duration is calculated incorrectly: Wynsure subtracts the start time from the end time, so in this case it subtracts 8 from 5 and gets an incorrect duration.
This configuration causes a problem when processing any transaction, because on completion of a transaction a corresponding operation is created with two fields, “Expected Limit Date” and “Expected Limit Time”, and these fields use the difference between the “End Time” and the “Start Time” to calculate the expected date and time limit.
Because that difference is incorrect, an invalid date and time are calculated, the system throws the invalid time format error, and the transaction is rolled back.
To fix this issue, configure the “End Time” in 24-hour format (17 rather than 5 in the example above).
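The subtraction described above can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not Wynsure's actual code, and the function name is hypothetical.

```python
def working_duration(start_hour, end_hour):
    """Duration as described above: end time minus start time, in hours."""
    return end_hour - start_hour


correct = working_duration(8, 17)   # end time entered in 24-hour format
wrong = working_duration(8, 5)      # end time entered as 5 instead of 17

print(correct)  # 9 hours
print(wrong)    # -3 hours: a negative duration, which cannot yield a valid time
```

A negative duration like this is the kind of value that would lead to an invalid expected limit time downstream.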

Perf: what do [<n percent>] records mean in perf stat output?

Running perf stat -e <events> <command> with many different events usually returns output like this:
127.352.815.472 r53003c [23,76%]
65.712.112.871 r53019c [23,81%]
178.027.463.861 r53010e [23,88%]
162.854.142.303 r5302c2 [24,05%]
...
What do the percentage records mean?

The percentages show the share of the run during which each specific event was actually being measured, which matters when perf has to multiplex events. Event multiplexing is explained in more detail on the perf wiki, and I've included a brief quote below:
If there are more events than counters, the kernel uses time
multiplexing (switch frequency = HZ, generally 100 or 1000) to give
each event a chance to access the monitoring hardware. Multiplexing
only applies to PMU events. With multiplexing, an event is not
measured all the time. At the end of the run, the tool scales the
count based on total time enabled vs time running.
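The scaling mentioned in the quote can be sketched in Python as follows. The function names and the sample numbers are made up for illustration; the formula is the one the quote describes (count scaled by total time enabled over time running).

```python
def scaled_count(raw_count, time_enabled, time_running):
    """Estimate the full-run count from a count measured only part of the time."""
    if time_running == 0:
        return None  # the event was never scheduled, so nothing can be estimated
    return raw_count * time_enabled / time_running


def percent_measured(time_enabled, time_running):
    """Share of the enabled time during which the event was really counted."""
    return 100.0 * time_running / time_enabled


# Hypothetical event that was on a hardware counter for a quarter of the run.
estimate = scaled_count(raw_count=1_000, time_enabled=4_000_000, time_running=1_000_000)
share = percent_measured(time_enabled=4_000_000, time_running=1_000_000)

print(estimate)  # 4000.0
print(share)     # 25.0
```

Here the share corresponds to the bracketed percentage in the perf stat output, and the estimate to the count printed next to it.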

When does the first occurrence of a recurring timed BackgroundTask run?

If you register a BackgroundTask with a recurring TimeTrigger (OneShot set to false), when does the first occurrence run? After the first FreshnessTime minutes or before?

Microsoft documentation states:
If FreshnessTime is set to 15 minutes and OneShot is false, the task
will run every 15 minutes starting between 0 and 15 minutes from the
time it is registered.
Edit:
I tested this a few times and it seems to run the first occurrence at any time during the 15 minute period after registration. It then runs future occurrences at regular 15 minute intervals, each 15 minutes after the start time of the previous run.
I'm not sure how the OS schedules the timer cycles internally, but the answer to your question is: not after, not before, but during.
NB: You cannot get any timer background task to fire immediately.
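The observed behaviour can be modelled with a short Python sketch. It is only a model of the timing described above, not of how the OS schedules tasks internally; the function name is hypothetical and the first-run offset is an input because it cannot be predicted.

```python
def expected_run_times(first_run_offset, freshness_time=15, count=4):
    """Minutes after registration at which the task would run.

    first_run_offset: when the first run happened, somewhere in [0, freshness_time].
    Later runs follow freshness_time minutes after the previous run started.
    """
    if not 0 <= first_run_offset <= freshness_time:
        raise ValueError("first run is expected within the first freshness period")
    return [first_run_offset + i * freshness_time for i in range(count)]


# Example: the first run happened 7 minutes after registration.
print(expected_run_times(7))  # [7, 22, 37, 52]
```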
