Determine the running time of an algorithm with two parameters

Determine the running time of an algorithm with two parameters - time-complexity

I have implemented an algorithm that uses two other algorithms for calculating the shortest path in a graph: Dijkstra and Bellman-Ford. Based on the time complexity of the these algorithms, I can calculate the running time of my implementation, which is easy giving the code.
Now, I want to experimentally verify my calculation. Specifically, I want to plot the running time as a function of the size of the input (I am following the method described here). The problem is that I have two parameters - number of edges and number of vertices.
I have tried to fix one parameter and change the other, but this approach results in two plots - one for varying number of edges and the other for varying number of vertices.
This leads me to my question - how can I determine the order of growth based on two plots? In general, how can one experimentally determine the running time complexity of an algorithm that has more than one parameter?

It's very difficult in general.
The usual way you would experimentally gauge the running time in the single variable case is, insert a counter that increments when your data structure does a fundamental (putatively O(1)) operation, then take data for many different input sizes, and plot it on a log-log plot. That is, log T vs. log N. If the running time is of the form n^k you should see a straight line of slope k, or something approaching this. If the running time is like T(n) = n^{k log n} or something, then you should see a parabola. And if T is exponential in n you should still see exponential growth.
You can only hope to get information about the highest order term when you do this -- the low order terms get filtered out, in the sense of having less and less impact as n gets larger.
In the two variable case, you could try to do a similar approach -- essentially, take 3 dimensional data, do a log-log-log plot, and try to fit a plane to that.
However this will only really work if there's really only one leading term that dominates in most regimes.
Suppose my actual function is T(n, m) = n^4 + n^3 * m^3 + m^4.
When m = O(1), then T(n) = O(n^4).
When n = O(1), then T(n) = O(m^4).
When n = m, then T(n) = O(n^6).
In each of these regimes, "slices" along the plane of possible n,m values, a different one of the terms is the dominant term.
So there's no way to determine the function just from taking some points with fixed m, and some points with fixed n. If you did that, you wouldn't get the right answer for n = m -- you wouldn't be able to discover "middle" leading terms like that.
I would recommend that the best way to predict asymptotic growth when you have lots of variables / complicated data structures, is with a pencil and piece of paper, and do traditional algorithmic analysis. Or possibly, a hybrid approach. Try to break the question of efficiency into different parts -- if you can split the question up into a sum or product of a few different functions, maybe some of them you can determine in the abstract, and some you can estimate experimentally.

Luckily two input parameters is still easy to visualize in a 3D scatter plot (3rd dimension is the measured running time), and you can check if it looks like a plane (in log-log-log scale) or if it is curved. Naturally random variations in measurements plays a role here as well.
In Matlab I typically calculate a least-squares solution to two-variable function like this (just concatenates different powers and combinations of x and y horizontally, .* is an element-wise product):
x = log(parameter_x);
y = log(parameter_y);
% Find a least-squares fit
p = [x.^2, x.*y, y.^2, x, y, ones(length(x),1)] \ log(time)
Then this can be used to estimate running times for larger problem instances, ideally those would be confirmed experimentally to know that the fitted model works.
This approach works also for higher dimensions but gets tedious to generate, maybe there is a more general way to achieve that and this is just a work-around for my lack of knowledge.

I was going to write my own explanation but it wouldn't be any better than this.

Related

How to calculate continuous effect of gravitational pull between simulated planets

so I am making a simple simulation of different planets with individual velocity flying around space and orbiting each other.
I plan to simulate their pull on each other by considering each planet as projecting their own "gravity vector field." Each time step I'm going to add the vectors outputted from each planets individual vector field equation (V = -xj + (-yj) or some notation like it) except the one being effected in the calculation, and use the effected planets position as input to the equations.
However this would inaccurate, and does not consider the gravitational pull as continuous and constant. Bow do I calculate the movement of my planets if each is continuously effecting the others?
Thanks!

In addition to what Blender writes about using Newton's equations, you need to consider how you will be integrating over your "acceleration field" (as you call it in the comment to his answer).
The easiest way is to use Euler's Method. The problem with that is it rapidly diverges, but it has the advantage of being easy to code and to be reasonably fast.
If you are looking for better accuracy, and are willing to sacrifice some performance, one of the Runge-Kutta methods (probably RK4) would ordinarily be a good choice. I'll caution you that if your "acceleration field" is dynamic (i.e. it changes over time ... perhaps as a result of planets moving in their orbits) RK4 will be a challenge.
Update (Based on Comment / Question Below):
If you want to calculate the force vector Fi(tn) at some time step tn applied to a specific object i, then you need to compute the force contributed by all of the other objects within your simulation using the equation Blender references. That is for each object, i, you figure out how all of the other objects pull (apply force) and those vectors when summed will be the aggregate force vector applied to i. Algorithmically this looks something like:
for each object i
Fi(tn) = 0
for each object j ≠ i
Fi(tn) = Fi(tn) + G * mi * mj / |pi(tn)-pj(tn)|2
Where pi(tn) and pj(tn) are the positions of objects i and j at time tn respectively and the | | is the standard Euclidean (l2) normal ... i.e. the Euclidean distance between the two objects. Also, G is the gravitational constant.
Euler's Method breaks the simulation into discrete time slices. It looks at the current state and in the case of your example, considers all of the forces applied in aggregate to all of the objects within your simulation and then applies those forces as a constant over the period of the time slice. When using
ai(tn) = Fi(tn)/mi
(ai(tn) = acceleration vector at time tn applied to object i, Fi(tn) is the force vector applied to object i at time tn, and mi is the mass of object i), the force vector (and therefore the acceleration vector) is held constant for the duration of the time slice. In your case, if you really have another method of computing the acceleration, you won't need to compute the force, and can instead directly compute the acceleration. In either event, with the acceleration being held as constant, the position at time tn+1, p(tn+1) and velocity at time tn+1, v(tn+1), of the object will be given by:
pi(tn+1) = 0.5*ai(tn)*(tn+1-tn)2 + vi(tn)*(tn+1-tn)+pi(tn)
vi(tn+1) = ai(tn+1)*(tn+1-tn) + vi(tn)
The RK4 method fits the driver of your system to a 2nd degree polynomial which better approximates its behavior. The details are at the wikipedia site I referenced above, and there are a number of other resources you should be able to locate on the web. The basic idea is that instead of picking a single force value for a particular timeslice, you compute four force vectors at specific times and then fit the force vector to the 2nd degree polynomial. That's fine if your field of force vectors doesn't change between time slices. If you're using gravity to derive the vector field, and the objects which are the gravitational sources move, then you need to compute their positions at each of the four sub-intervals in order compute the force vectors. It can be done, but your performance is going to be quite a bit poorer than using Euler's method. On the plus side, you get more accurate motion of the objects relative to each other. So, it's a challenge in the sense that it's computationally expensive, and it's a bit of a pain to figure out where all the objects are supposed to be for your four samples during the time slice of your iteration.

There is no such thing as "continuous" when dealing with computers, so you'll have to approximate continuity with very small intervals of time.
That being said, why are you using a vector field? What's wrong with Newton?
And the sum of the forces on an object is that above equation. Equate the two and solve for a
So you'll just have to loop over all the objects one by one and find the acceleration on it.

Optimize MATLAB code (nested for loop to compute similarity matrix)

I am computing a similarity matrix based on Euclidean distance in MATLAB. My code is as follows:
for i=1:N % M,N is the size of the matrix x for whose elements I am computing similarity matrix
for j=1:N
D(i,j) = sqrt(sum(x(:,i)-x(:,j)).^2)); % D is the similarity matrix
end
end
Can any help with optimizing this = reducing the for loops as my matrix x is of dimension 256x30000.
Thanks a lot!
--Aditya

The function to do so in matlab is called pdist. Unfortunately it is painfully slow and doesnt take Matlabs vectorization abilities into account.
The following is code I wrote for a project. Let me know what kind of speed up you get.
Qx=repmat(dot(x,x,2),1,size(x,1));
D=sqrt(Qx+Qx'-2*x*x');
Note though that this will only work if your data points are in the rows and your dimensions the columns. So for example lets say I have 256 data points and 100000 dimensions then on my mac using x=rand(256,100000) and the above code produces a 256x256 matrix in about half a second.

There's probably a better way to do it, but the first thing I noticed was that you could cut the runtime in half by exploiting the symmetry D(i,j)==D(i,j)
You can also use the function norm(x(:,i)-x(:,j),2)

I think this is what you're looking for.
D=zeros(N);
jIndx=repmat(1:N,N,1);iIndx=jIndx'; %'# fix SO's syntax highlighting
D(:)=sqrt(sum((x(iIndx(:),:)-x(jIndx(:),:)).^2,2));
Here, I have assumed that the distance vector, x is initalized as an NxM array, where M is the number of dimensions of the system and N is the number of points. So if your ordering is different, you'll have to make changes accordingly.

To start with, you are computing twice as much as you need to here, because D will be symmetric. You don't need to calculate the (i,j) entry and the (j,i) entry separately. Change your inner loop to for j=1:i, and add in the body of that loop D(j,i)=D(i,j);
After that, there's really not much redundancy left in what that code does, so your only room left for improvement is to parallelize it: if you have the Parallel Computing Toolbox, convert your outer loop to a parfor and before you run it, say matlabpool(n), where n is the number of threads to use.

approximating log10[x^k0 + k1]

Greetings. I'm trying to approximate the function
Log10[x^k0 + k1], where .21 < k0 < 21, 0 < k1 < ~2000, and x is integer < 2^14.
k0 & k1 are constant. For practical purposes, you can assume k0 = 2.12, k1 = 2660. The desired accuracy is 5*10^-4 relative error.
This function is virtually identical to Log[x], except near 0, where it differs a lot.
I already have came up with a SIMD implementation that is ~1.15x faster than a simple lookup table, but would like to improve it if possible, which I think is very hard due to lack of efficient instructions.
My SIMD implementation uses 16bit fixed point arithmetic to evaluate a 3rd degree polynomial (I use least squares fit). The polynomial uses different coefficients for different input ranges. There are 8 ranges, and range i spans (64)2^i to (64)2^(i + 1).
The rational behind this is the derivatives of Log[x] drop rapidly with x, meaning a polynomial will fit it more accurately since polynomials are an exact fit for functions that have a derivative of 0 beyond a certain order.
SIMD table lookups are done very efficiently with a single _mm_shuffle_epi8(). I use SSE's float to int conversion to get the exponent and significand used for the fixed point approximation. I also software pipelined the loop to get ~1.25x speedup, so further code optimizations are probably unlikely.
What I'm asking is if there's a more efficient approximation at a higher level?
For example:
Can this function be decomposed into functions with a limited domain like
log2((2^x) * significand) = x + log2(significand)
hence eliminating the need to deal with different ranges (table lookups). The main problem I think is adding the k1 term kills all those nice log properties that we know and love, making it not possible. Or is it?
Iterative method? don't think so because the Newton method for log[x] is already a complicated expression
Exploiting locality of neighboring pixels? - if the range of the 8 inputs fall in the same approximation range, then I can look up a single coefficient, instead of looking up separate coefficients for each element. Thus, I can use this as a fast common case, and use a slower, general code path when it isn't. But for my data, the range needs to be ~2000 before this property hold 70% of the time, which doesn't seem to make this method competitive.
Please, give me some opinion, especially if you're an applied mathematician, even if you say it can't be done. Thanks.

You should be able to improve on least-squares fitting by using Chebyshev approximation. (The idea is, you're looking for the approximation whose worst-case deviation in a range is least; least-squares instead looks for the one whose summed squared difference is least.) I would guess this doesn't make a huge difference for your problem, but I'm not sure -- hopefully it could reduce the number of ranges you need to split into, somewhat.
If there's already a fast implementation of log(x), maybe compute P(x) * log(x) where P(x) is a polynomial chosen by Chebyshev approximation. (Instead of trying to do the whole function as a polynomial approx -- to need less range-reduction.)
I'm an amateur here -- just dipping my toe in as there aren't a lot of answers already.

One observation:
You can find an expression for how large x needs to be as a function of k0 and k1, such that the term x^k0 dominates k1 enough for the approximation:
x^k0 +k1 ~= x^k0, allowing you to approximately evaluate the function as
k0*Log(x).
This would take care of all x's above some value.

I recently read how the sRGB model compresses physical tri stimulus values into stored RGB values.
It basically is very similar to the function I try to approximate, except that it's defined piece wise:
k0 x, x < 0.0031308
k1 x^0.417 - k2 otherwise
I was told the constant addition in Log[x^k0 + k1] was to make the beginning of the function more linear. But that can easily be achieved with a piece wise approximation. That would make the approximation a lot more "uniform" - with only 2 approximation ranges. This should be cheaper to compute due to no longer needing to compute an approximation range index (integer log) and doing SIMD coefficient lookup.
For now, I conclude this will be the best approach, even though it doesn't approximate the function precisely. The hard part will be proposing this change and convincing people to use it.

Simplification / optimization of GPS track

I've got a GPS track produced by gpxlogger(1) (supplied as a client for gpsd). GPS receiver updates its coordinates every 1 second, gpxlogger's logic is very simple, it writes down location (lat, lon, ele) and a timestamp (time) received from GPS every n seconds (n = 3 in my case).
After writing down a several hours worth of track, gpxlogger saves several megabyte long GPX file that includes several thousands of points. Afterwards, I try to plot this track on a map and use it with OpenLayers. It works, but several thousands of points make using the map a sloppy and slow experience.
I understand that having several thousands of points of suboptimal. There are myriads of points that can be deleted without losing almost anything: when there are several points making up roughly the straight line and we're moving with the same constant speed between them, we can just leave the first and the last point and throw away anything else.
I thought of using gpsbabel for such track simplification / optimization job, but, alas, it's simplification filter works only with routes, i.e. analyzing only geometrical shape of path, without timestamps (i.e. not checking that the speed was roughly constant).
Is there some ready-made utility / library / algorithm available to optimize tracks? Or may be I'm missing some clever option with gpsbabel?

Yes, as mentioned before, the Douglas-Peucker algorithm is a straightforward way to simplify 2D connected paths. But as you have pointed out, you will need to extend it to the 3D case to properly simplify a GPS track with an inherent time dimension associated with every point. I have done so for a web application of my own using a PHP implementation of Douglas-Peucker.
It's easy to extend the algorithm to the 3D case with a little understanding of how the algorithm works. Say you have input path consisting of 26 points labeled A to Z. The simplest version of this path has two points, A and Z, so we start there. Imagine a line segment between A and Z. Now scan through all remaining points B through Y to find the point furthest away from the line segment AZ. Say that point furthest away is J. Then, you scan the points between B and I to find the furthest point from line segment AJ and scan points K through Y to find the point furthest from segment JZ, and so on, until the remaining points all lie within some desired distance threshold.
This will require some simple vector operations. Logically, it's the same process in 3D as in 2D. If you find a Douglas-Peucker algorithm implemented in your language, it might have some 2D vector math implemented, and you'll need to extend those to use 3 dimensions.
You can find a 3D C++ implementation here: 3D Douglas-Peucker in C++
Your x and y coordinates will probably be in degrees of latitude/longitude, and the z (time) coordinate might be in seconds since the unix epoch. You can resolve this discrepancy by deciding on an appropriate spatial-temporal relationship; let's say you want to view one day of activity over a map area of 1 square mile. Imagining this relationship as a cube of 1 mile by 1 mile by 1 day, you must prescale the time variable. Conversion from degrees to surface distance is non-trivial, but for this case we simplify and say one degree is 60 miles; then one mile is .0167 degrees. One day is 86400 seconds; then to make the units equivalent, our prescale factor for your timestamp is .0167/86400, or about 1/5,000,000.
If, say, you want to view the GPS activity within the same 1 square mile map area over 2 days instead, time resolution becomes half as important, so scale it down twice further, to 1/10,000,000. Have fun.

Have a look at Ramer-Douglas-Peucker algorithm for smoothening complex polygons, also Douglas-Peucker line simplification algorithm can help you reduce your points.

OpenSource GeoKarambola java library (no Android dependencies but can be used in Android) that includes a GpxPathManipulator class that does both route & track simplification/reduction (3D/elevation aware).
If the points have timestamp information that will not be discarded.
https://sourceforge.net/projects/geokarambola/
This is the algorith in action, interactively
https://lh3.googleusercontent.com/-hvHFyZfcY58/Vsye7nVrmiI/AAAAAAAAHdg/2-NFVfofbd4ShZcvtyCDpi2vXoYkZVFlQ/w360-h640-no/movie360x640_05_82_05.gif
This algorithm is based on reducing the number of points by eliminating those that have the greatest XTD (cross track distance) error until a tolerated error is satisfied or the maximum number of points is reached (both parameters of the function), wichever comes first.
An alternative algorithm, for on-the-run stream like track simplification (I call it "streamplification") is:
you keep a small buffer of the points the GPS sensor gives you, each time a GPS point is added to the buffer (elevation included) you calculate the max XTD (cross track distance) of all the points in the buffer to the line segment that unites the first point with the (newly added) last point of the buffer. If the point with the greatest XTD violates your max tolerated XTD error (25m has given me great results) then you cut the buffer at that point, register it as a selected point to be appended to the streamplified track, trim the trailing part of the buffer up to that cut point, and keep going. At the end of the track the last point of the buffer is also added/flushed to the solution.
This algorithm is lightweight enough that it runs on an AndroidWear smartwatch and gives optimal output regardless of if you move slow or fast, or stand idle at the same place for a long time. The ONLY thing that maters is the SHAPE of your track. You can go for many minutes/kilometers and, as long as you are moving in a straight line (a corridor within +/- tolerated XTD error deviations) the streamplify algorithm will only output 2 points: those of the exit form last curve and entry on next curve.

I ran in to a similar issue. The rate at which the gps unit takes points is much larger that needed. Many of the points are not geographically far away from each other. The approach that I took is to calculate the distance between the points using the haversine formula. If the distance was not larger than my threshold (0.1 miles in my case) I threw away the point. This quickly gets the number of points down to a manageable size.
I don't know what language you are looking for. Here is a C# project that I was working on. At the bottom you will find the haversine code.
http://blog.bobcravens.com/2010/09/gps-using-the-netduino/
Hope this gets you going.
Bob

This is probably NP-hard. Suppose you have points A, B, C, D, E.
Let's try a simple deterministic algorithm. Suppose you calculate the distance from point B to line A-C and it's smaller than your threshold (1 meter). So you delete B. Then you try the same for C to line A-D, but it's bigger and D for C-E, which is also bigger.
But it turns out that the optimal solution is A, B, E, because point C and D are close to the line B-E, yet on opposite sides.
If you delete 1 point, you cannot be sure that it should be a point that you should keep, unless you try every single possible solution (which can be n^n in size, so on n=80 that's more than the minimum number of atoms in the known universe).
Next step: try a brute force or branch and bound algorithm. Doesn't scale, doesn't work for real-world size. You can safely skip this step :)
Next step: First do a determinstic algorithm and improve upon that with a metaheuristic algorithm (tabu search, simulated annealing, genetic algorithms). In java there are a couple of open source implementations, such as Drools Planner.
All in all, you 'll probably have a workable solution (although not optimal) with the first simple deterministic algorithm, because you only have 1 constraint.
A far cousin of this problem is probably the Traveling Salesman Problem variant in which the salesman cannot visit all cities but has to select a few.

You want to throw away uninteresting points. So you need a function that computes how interesting a point is, then you can compute how interesting all the points are and throw away the N least interesting points, where you choose N to slim the data set sufficiently. It sounds like your definition of interesting corresponds to high acceleration (deviation from straight-line motion), which is easy to compute.

Try this, it's free and opensource online Service:
https://opengeo.tech/maps/gpx-simplify-optimizer/

I guess you need to keep points where you change direction. If you split your track into the set of intervals of constant direction, you can leave only boundary points of these intervals.
And, as Raedwald pointed out, you'll want to leave points where your acceleration is not zero.

Not sure how well this will work, but how about taking your list of points, working out the distance between them and therefore the total distance of the route and then deciding on a resolution distance and then just linear interpolating the position based on each step of x meters. ie for each fix you have a "distance from start" measure and you just interpolate where n*x is for your entire route. (you could decide how many points you want and divide the total distance by this to get your resolution distance). On top of this you could add a windowing function taking maybe the current point +/- z points and applying a weighting like exp(-k* dist^2/accuracy^2) to get the weighted average of a set of points where dist is the distance from the raw interpolated point and accuracy is the supposed accuracy of the gps position.

One really simple method is to repeatedly remove the point that creates the largest angle (in the range of 0° to 180° where 180° means it's on a straight line between its neighbors) between its neighbors until you have few enough points. That will start off removing all points that are perfectly in line with their neighbors and will go from there.
You can do that in Ο(n log(n)) by making a list of each index and its angle, sorting that list in descending order of angle, keeping how many you need from the front of the list, sorting that shorter list in descending order of index, and removing the indexes from the list of points.
def simplify_points(points, how_many_points_to_remove)
angle_map = Array.new
(2..points.length - 1).each { |next_index|
removal_list.add([next_index - 1, angle_between(points[next_index - 2], points[next_index - 1], points[next_index])])
}
removal_list = removal_list.sort_by { |index, angle| angle }.reverse
removal_list = removal_list.first(how_many_points_to_remove)
removal_list = removal_list.sort_by { |index, angle| index }.reverse
removal_list.each { |index| points.delete_at(index) }
return points
end

How to design acceptance probability function for simulated annealing with multiple distinct costs?

I am using simulated annealing to solve an NP-complete resource scheduling problem. For each candidate ordering of the tasks I compute several different costs (or energy values). Some examples are (though the specifics are probably irrelevant to the question):
global_finish_time: The total number of days that the schedule spans.
split_cost: The number of days by which each task is delayed due to interruptions by other tasks (this is meant to discourage interruption of a task once it has started).
deadline_cost: The sum of the squared number of days by which each missed deadline is overdue.
The traditional acceptance probability function looks like this (in Python):
def acceptance_probability(old_cost, new_cost, temperature):
if new_cost < old_cost:
return 1.0
else:
return math.exp((old_cost - new_cost) / temperature)
So far I have combined my first two costs into one by simply adding them, so that I can feed the result into acceptance_probability. But what I would really want is for deadline_cost to always take precedence over global_finish_time, and for global_finish_time to take precedence over split_cost.
So my question to Stack Overflow is: how can I design an acceptance probability function that takes multiple energies into account but always considers the first energy to be more important than the second energy, and so on? In other words, I would like to pass in old_cost and new_cost as tuples of several costs and return a sensible value .
Edit: After a few days of experimenting with the proposed solutions I have concluded that the only way that works well enough for me is Mike Dunlavey's suggestion, even though this creates many other difficulties with cost components that have different units. I am practically forced to compare apples with oranges.
So, I put some effort into "normalizing" the values. First, deadline_cost is a sum of squares, so it grows exponentially while the other components grow linearly. To address this I use the square root to get a similar growth rate. Second, I developed a function that computes a linear combination of the costs, but auto-adjusts the coefficients according to the highest cost component seen so far.
For example, if the tuple of highest costs is (A, B, C) and the input cost vector is (x, y, z), the linear combination is BCx + Cy + z. That way, no matter how high z gets it will never be more important than an x value of 1.
This creates "jaggies" in the cost function as new maximum costs are discovered. For example, if C goes up then BCx and Cy will both be higher for a given (x, y, z) input and so will differences between costs. A higher cost difference means that the acceptance probability will drop, as if the temperature was suddenly lowered an extra step. In practice though this is not a problem because the maximum costs are updated only a few times in the beginning and do not change later. I believe this could even be theoretically proven to converge to a correct result since we know that the cost will converge toward a lower value.
One thing that still has me somewhat confused is what happens when the maximum costs are 1.0 and lower, say 0.5. With a maximum vector of (0.5, 0.5, 0.5) this would give the linear combination 0.5*0.5*x + 0.5*y + z, i.e. the order of precedence is suddenly reversed. I suppose the best way to deal with it is to use the maximum vector to scale all values to given ranges, so that the coefficients can always be the same (say, 100x + 10y + z). But I haven't tried that yet.

mbeckish is right.
Could you make a linear combination of the different energies, and adjust the coefficients?
Possibly log-transforming them in and out?
I've done some MCMC using Metropolis-Hastings. In that case I'm defining the (non-normalized) log-likelihood of a particular state (given its priors), and I find that a way to clarify my thinking about what I want.

I would take a hint from multi-objective evolutionary algorithm (MOEA) and have it transition if all of the objectives simultaneously pass with the acceptance_probability function you gave. This will have the effect of exploring the Pareto front much like the standard simulated annealing explores plateaus of same-energy solutions.
However, this does give up on the idea of having the first one take priority.
You will probably have to tweak your parameters, such as giving it a higher initial temperature.

I would consider something along the lines of:
If (new deadline_cost > old deadline_cost)
return (calculate probability)
else if (new global finish time > old global finish time)
return (calculate probability)
else if (new split cost > old split cost)
return (calculate probability)
else
return (1.0)
Of course each of the three places you calculate the probability could use a different function.

It depends on what you mean by "takes precedence".
For example, what if the deadline_cost goes down by 0.001, but the global_finish_time cost goes up by 10000? Do you return 1.0, because the deadline_cost decreased, and that takes precedence over anything else?
This seems like it is a judgment call that only you can make, unless you can provide enough background information on the project so that others can suggest their own informed judgment call.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas