Spinnaker Automated Canary Analysis results in either a 0 or 100 score, nothing in between

I am using Spinnaker/Kayenta for canary analysis. When the canary stage runs, it results in a score of either 0 or 100, nothing in between.
Is this the expected behavior?
How is the scoring done?
Looking at the pattern, it seems that if the Run Canary # phase fails for a genuine reason ('Canary score of previous interval (it doesn't matter whether you have intervals or not) is less than marginal score'), the Aggregate Canary Results phase never runs and it just produces a score of 0. Example snapshot below.
Steps to Reproduce:
Set up a canary pipeline in spinnaker.
Set it to fail during canary analysis.
Additional Details:
When the Run Canary # phase is successful, it executes the Aggregate Canary Results phase and produces a score of 100.

Just found out it was a configuration setting that was causing Kayenta to terminate the canary build and skip the Aggregate Canary Results phase.
The 'Criticality: Fail the canary if this metric fails' option was toggled on. It should be off to get the score.

Related

Number of Concurrent pipeline Execution config

I have been looking at the Spinnaker code to see where the maximum number of concurrent pipeline executions is defined.
The closest I found was new JedisPipelineStack("PIPELINE_QUEUE", jedisPool) in OrcaPersistenceConfiguration.groovy. I'm also trying to find the ConfigMap for Orca. Any pointers?

Partitioned Search in Optaplanner

I have been trying to test the new partitioned search feature from the OptaPlanner 7 release. I implemented a custom SolutionPartitioner according to the example provided in the documentation.
It seems to work fine, as I can see the scores improving in each thread (no hard rules broken, soft scores improving)... however, at every merge/reduce step back to the main thread, the solution suddenly gets hit by a large negative hard score from a hard constraint, and I can't figure out how it developed; it's hard to debug.
This hard constraint is similar to the 'requiredCpuPowerTotal' rule in the CloudBalance example - it simply checks whether an allocation exceeds the available capacity.
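For illustration, here is a minimal plain-Java sketch of that kind of capacity check (class and field names are hypothetical; the real constraint is a score rule like the CloudBalance one, not this helper):

import java.util.List;

public class CapacityCheck {

    static class Computer {
        final int cpuPower;
        Computer(int cpuPower) { this.cpuPower = cpuPower; }
    }

    static class Process {
        final int requiredCpuPower;
        Computer computer; // planning variable: the computer this process is assigned to
        Process(int requiredCpuPower, Computer computer) {
            this.requiredCpuPower = requiredCpuPower;
            this.computer = computer;
        }
    }

    // Each CPU unit over a computer's capacity costs one hard point.
    static int capacityHardScore(List<Computer> computers, List<Process> processes) {
        int hardScore = 0;
        for (Computer computer : computers) {
            int requiredTotal = 0;
            for (Process process : processes) {
                if (computer.equals(process.computer)) {
                    requiredTotal += process.requiredCpuPower;
                }
            }
            if (requiredTotal > computer.cpuPower) {
                hardScore -= (requiredTotal - computer.cpuPower);
            }
        }
        return hardScore;
    }
}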
This is the output at the end from 2 threads:
[OptaPool-1-PartThread-2] INFO Local Search phase (1) ended: time
spent (120000), best score (0hard/-28317medium/4671205soft), score
calculation speed (1544/sec), step total (60272).
[OptaPool-1-PartThread-1] INFO Local Search phase (1) ended: time
spent (120000), best score (0hard/-16medium/3362676soft), score
calculation speed (1807/sec), step total (112408).
So I am expecting the final score in the main thread to be the sum of those two threads, i.e. 0hard/-28333medium/8033881soft.
But the actual result is very different:
[main] DEBUG PS step (229), time spent (120103), score
(-3458hard/-159medium/10503603soft), best score
(0hard/-7837556medium/0soft), picked move
(org.optaplanner.core.impl.partitionedsearch.scope.PartitionChangeMove#5fe73332).
[main] DEBUG PS step (230), time spent (120214), score
(-3452hard/-160medium/10511701soft), best score
(0hard/-7837556medium/0soft), picked move
(org.optaplanner.core.impl.partitionedsearch.scope.PartitionChangeMove#5dd30a7d).
[main] INFO Partitioned Search phase (0) ended: time spent (121216),
best score (0hard/-7837556medium/0soft), score calculation speed
(2/sec), step total (231), partCount (2), runnablePartThreadLimit
(2).
As you can see, when the results from the two threads are reduced to the main thread, there is a -3452hard score, so the main thread has to pick the starting score as the best score.
Any idea how this could happen and how I should debug it? Thanks.

JMeter fail tests if threshold exceeded

I'm hoping that I can find help here because I didn't find anything on the internet. I have multiple JMeter plans and I want to fail a plan if a throughput threshold for a group of requests is exceeded. How can I get the real threshold value from JMeter and fail the test if it is exceeded? I need to do this per request, like the threshold value displayed in the Summary Report for each group of requests.
Thank you in advance.
You cannot fail the "plan"; you can only fail a sampler, using an Assertion.
The options are:
Using the JMeter AutoStop Plugin, stop the test if the average response time exceeds the threshold. After the test finishes you can compare the anticipated duration with the real duration and, if it is less, state that the test has failed somehow (i.e. return a non-zero exit code).
Using the Taurus tool as a wrapper for your JMeter test, you can use the Pass/Fail Criteria subsystem to set the desired failure conditions. Taurus will automatically fail the test if the specified criteria are met.

How to control the exact number of tests to run with caliper

I am trying to understand the proper way to control the number of runs: is it the trial or the rep? It is confusing: I run the benchmark with --trial 1 and receive the output:
0% Scenario{vm=java, trial=0, benchmark=SendPublisher} 1002183670.00 ns; σ=315184.24 ns # 3 trials
It looks like 3 trials were run. What are those trials? What are reps? I can control the rep value with the options --debug & --debug-reps, but what is the value when running without debug? I need to know exactly how many times my tested method was called.
Between Caliper 0.5 and 1.0 a bit of the terminology has changed, but this should explain it for both. Keep in mind that things were a little murky in 0.5, so most of the changes made for 1.0 were to make things clearer and more precise.
Caliper 0.5
A single invocation of Caliper is a run. Each run has some number of trials, which is just another iteration of all of the work performed in a run. Within each trial, Caliper executes some number of scenarios. A scenario is the combination of VM, benchmark, etc. The runtime of a scenario is measured by timing the execution of some number of reps, which is the number passed to your benchmark method at runtime. Multiple reps are, of course, necessary because it would be impossible to get precise measurements for a single invocation in a microbenchmark.
Caliper 1.0
Caliper 1.0 follows a pretty similar model. A single invocation of Caliper is still a run. Each run consists of some number of trials, but a trial is more precisely defined as an invocation of a scenario measured with an instrument.
A scenario is roughly defined as what you're measuring (host, VM, benchmark, parameters) and the instrument is what code performs the measurement and how it was configured. The idea being that if a perfectly repeatable benchmark were a function of the form f(x)=y, Caliper would be defined as instrument(scenario)=measurements.
Within the execution of the runtime instrument (it's similar for others), there is still the same notion of reps, which is the number of iterations passed to the benchmark code. You can't control the rep value directly since each instrument will perform its own calculation to determine what it should be.
At runtime, Caliper plans its execution by calculating some number of experiments, which is the combination of instrument, benchmark, VM and parameters. Each experiment is run --trials number of times and reported as an individual trial with its own ID.
How to use the reps parameter
Traditionally, the best way to use the reps parameter is to include a loop in your benchmark code that looks more or less like:
for (int i = 0; i < reps; i++) {…}
This is the most direct way to ensure that the number of reps scales linearly with the reported runtime. That is a necessary property because Caliper is attempting to infer the cost of a single, uniform operation based on the aggregate cost of many. If runtime isn't linear with the number of reps, the results will be invalid. This also implies that reps should not be passed directly to the benchmarked code.
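For example, a hypothetical benchmark built around that loop might look roughly like this (the class and operation are made up for illustration; Caliper 1.0 marks benchmark methods with the @Benchmark annotation, while 0.5 used timeXxx(int reps) methods on a SimpleBenchmark subclass):

import com.google.caliper.Benchmark;

public class StringBuilderBenchmark {

    // Caliper picks the number of reps; the loop body is the single operation being timed.
    // Accumulating and returning a value keeps the JIT from optimizing the work away.
    @Benchmark
    int appendChars(int reps) {
        int dummy = 0;
        for (int i = 0; i < reps; i++) {
            StringBuilder sb = new StringBuilder();
            sb.append("caliper").append(i);
            dummy += sb.length();
        }
        return dummy;
    }
}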

How do I do the Delayed::Job equivalent of Process#waitall?

I have a large task that proceeds in several major steps: Step A must complete before Step B can be started, etc. But each major step can be divided up across multiple processes, in my case, using Delayed::Job.
The question: Is there a simple technique for starting Step B only after all the processes have completed working on Step A?
Note 1: I don't know a priori how many external workers have been spun up, so keeping a reference count of completed workers won't help.
Note 2: I'd prefer not to create a worker whose sole job is to busy wait for the other jobs to complete. Heroku workers cost money!
Note 3: I've considered having each worker examine the Delayed::Job queue in the after callback to decide if it's the last one working on Step A, in which case it could initiate Step B. This could work, but seems potentially fraught with gotchas. (In the absence of better answers, this is the approach I'm going with.)
I think it really depends on the specifics of what you are doing, but you could set priority levels such that any jobs from Step A run first. Depending on your setup, that might be enough. From the GitHub page:
By default all jobs are scheduled with priority = 0, which is top
priority. You can change this by setting
Delayed::Worker.default_priority to something else. Lower numbers have
higher priority.
So if you set Step A to run at priority = 0, and Step B to run at priority = 100, nothing in Step B will run until Step A is complete.
There are some cases where this will be problematic -- in particular, if you have a lot of jobs and are running a lot of workers, you will probably have some workers running Step B before the work in Step A is finished. Ideally, in this setup, Step B has some sort of check to verify whether it can run or not.