Laplace Transform. Can we write the solution without the limit as A approaches infinity? - differential-equations

Laplace Transform. Can we write the solution without the limit as A approaches infinity? That is, just the integral from 0 to infinity, with the functions written without the lim and A -> infinity?
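For reference, the improper integral notation is itself defined as that limit, so writing the integral to infinity without the explicit lim is just shorthand for it:

\mathcal{L}\{f\}(s) = \int_0^\infty e^{-st} f(t)\,dt = \lim_{A\to\infty} \int_0^A e^{-st} f(t)\,dt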

Related

Is there a way to use fewer decimals in the xgb.cv loss calculation to allow 'early_stopping_rounds' to trigger sooner?

I am using xgb.cv to determine a correct number of estimators for my problem and I am using 'multi:softprob' and 'mlogloss'. Originally in my code I set:
num_boost_round = 999
early_stopping_rounds = 10
The problem is that the loss is returned with many decimals, and even though the last decimals change, this has no practical effect on model goodness for me. This is an example of the losses from around boost round 170 of my run:
0.012855
0.012855
0.012855
0.012854666666666667
0.012854666666666667
0.012853999999999999
0.012853999999999999
0.012853666666666666
0.012853666666666666
0.012853666666666666
0.012852999999999998
You can see that there is little or no point in continuing anymore. My cv got down to these figures after only 15-20 boosting rounds.
Is there a way to use fewer decimals for the loss comparisons (or reporting) and that way make 'early_stopping_rounds' trigger sooner and stop the cv?
Any ideas would be appreciated.
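One possibility (a rough sketch, not tested against any specific xgboost version): pass a custom evaluation function via feval that reports mlogloss rounded to a few decimals, so 'early_stopping_rounds' compares the rounded values. Here rounded_mlogloss and ROUND_DECIMALS are illustrative names, params and dtrain are assumed to be your existing parameter dict and DMatrix, and the prediction layout for 'multi:softprob' may differ between xgboost versions.

import xgboost as xgb
from sklearn.metrics import log_loss

ROUND_DECIMALS = 4  # illustrative precision for the comparison

def rounded_mlogloss(preds, dtrain):
    labels = dtrain.get_label()
    # 'multi:softprob' predictions may arrive flattened; reshape to (rows, classes)
    preds = preds.reshape(len(labels), -1)
    return 'rounded-mlogloss', round(log_loss(labels, preds), ROUND_DECIMALS)

cv_results = xgb.cv(
    params,                      # assumed to contain 'multi:softprob' and 'num_class'
    dtrain,
    num_boost_round=999,
    nfold=5,
    feval=rounded_mlogloss,      # early stopping then tracks the rounded metric
    early_stopping_rounds=10,
)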

Analysing time complexity when the input is doubled for merge sort

I'm trying to understand theoretically how much longer it would take when the input size passed to merge sort is doubled. I was reading a textbook which stated that:
"Since the runtime for mergesort (for large N) is O(N log_2 N), we should consider the ratio, r = N^{1.1} log_2(N^{1.1})/(N log_2(N)). This simplifies to 1.1 N^{0.1} which is around 3.5"
I wanted to ask how they computed that it would take roughly 3.5 times longer for merge sort to execute when the input size is doubled. Essentially, how they went about that transformation.
Using logarithm arithmetic:
r = N^{1.1} log_2(N^{1.1}) / (N log_2(N)) = N^{0.1} * 1.1 log_2(N) / log_2(N) = 1.1 N^{0.1}
which still depends on N.
My guess is that they assumed that a large number for N is 100000, so:
1.1 * 100000^{0.1} ≈ 1.1 * 3.16 ≈ 3.5
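A quick numerical check of that guess (assuming N = 100000, as above):

N = 100_000
ratio = 1.1 * N ** 0.1   # 100000 ** 0.1 = 10 ** 0.5 ≈ 3.162
print(ratio)             # ≈ 3.48, i.e. roughly 3.5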

TensorFlow: limit exp to a max value instead of infinity?

A loss function I'm using has an exp term in it which blows up the loss to infinity, which then causes the gradients to go to NaN. Is there a way to handle this?
s = tf.exp(n)
# s becomes nan when n is large
Exponential terms in loss functions are usually handled in machine learning by minimising not the exponential itself, but its logarithm. Both functions are monotonically increasing, so minimising the logarithm brings you to the same minimum as minimising the exponential. However, the logarithm grows much more slowly, avoiding huge increases in your loss function.
Here it seems that you need to minimise n directly, but that is probably only an example.
For example, you could use this:
loss = tf.minimum(tf.exp(n), MAX_VALUE)
This takes the element-wise minimum, i.e. it caps the value at MAX_VALUE, so you'll need to account for that.
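If clamping the loss itself is too blunt, another option along the same lines is to clamp the argument before exponentiating, so both the value and the gradient stay finite (a sketch; MAX_EXP is an arbitrary illustrative bound, and n is the tensor from the snippet above):

import tensorflow as tf

MAX_EXP = 50.0  # e^50 is huge but still finite in float32
s = tf.exp(tf.clip_by_value(n, -MAX_EXP, MAX_EXP))  # gradient is zero where n is clipped, never NaN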

Is multiplying y by 2^x and subtracting y faster than multiplying y by [(2^x)-1] directly?

I have a rather theoretical question:
Is multiplying y by 2^x and subtracting y faster than
multiplying y by [(2^x)-1] directly?
(y*(2^x) - y) vs (y*((2^x)-1))
I implemented a moving average filter on some data I get from a sensor. The basic idea is that I want to average the last 2^x values by taking the old average, multiplying that by [(2^x)-1], adding the new value, and dividing again by 2^x. But because I have to do this more than 500 times a second, I want to optimize it as much as possible.
I know that floating-point numbers are represented in IEEE 754 and therefore multiplying and dividing by a power of 2 should be rather fast (basically just changing the exponent), but how do I do that most efficiently? Should I simply stick with multiplying by ((2^x)-1), or is multiplying by 2^x and subtracting y better, or could I even do it more efficiently by manipulating the exponent bits directly? And if that is possible, how do I implement it properly?
Thank you very much!
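For concreteness, here is the update written out both ways in plain Python (an illustrative sketch only; it shows that the two expressions compute the same value, not which one compiles to faster machine code):

def update_average_a(old_avg, new_value, x):
    n = 2 ** x
    return (old_avg * n - old_avg + new_value) / n    # the y*(2^x) - y form

def update_average_b(old_avg, new_value, x):
    n = 2 ** x
    return (old_avg * (n - 1) + new_value) / n        # the y*((2^x)-1) form

# Both give the same running average, up to floating-point rounding:
avg_a = avg_b = 0.0
for sample in [1.0, 2.0, 3.0, 4.0]:
    avg_a = update_average_a(avg_a, sample, 3)
    avg_b = update_average_b(avg_b, sample, 3)
print(avg_a, avg_b)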
I don't think that multiplying a floating-point number by a power of two is faster in practice than a generic multiplication (though I agree that in theory it could be, assuming no overflow/underflow). In other words, I don't think there is a hardware optimization for it.
Now, I can assume that you have a modern processor, i.e. one with FMA. In this case, (y*(2^x) - y) is faster if performed as fma(y, 2^x, -y) (how you write the expression depends on your language and implementation): an FMA should be as fast as a multiplication in practice.
Note also that the speed may depend on the context. For instance, I've observed in simple code that doing more work can, surprisingly, yield faster code! So you need to test (on your real code, not with an arbitrary benchmark).

Efficiency of the modulo operator for bounds checking

I was reading this page on operation performance in .NET and saw that there's a huge difference between the division operation and the rest.
So the modulo operator is slow, but how slow is it compared with the cost of a conditional block we could use for the same purpose?
Let's assume we have a positive number y that can't be >= 20. Which one is more efficient as a general rule (not only in .NET)?
This:
x = y % 10
or this:
x = y
if (x >= 10)
{
x -= 10
}
How many times are you calling the modulo operation? If it's in some tight inner loop that's getting called many times a second, maybe you should look at other ways of preventing array overflow. If it's being called < (say) 10,000 times, I wouldn't worry about it.
As for performance between your snippets - test them (with some real-world data if possible). You don't know what the compiler/JITer and CPU are doing under the hood. The % could be getting optimized to an & if the 2nd argument is constant and a power of 2. At the CPU level you're talking about the difference between division and branch prediction, which is going to depend on the rest of your code.
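To make the power-of-two remark concrete (a small sketch; mod_pow2 and the modulus 8 are illustrative): for non-negative y and a power-of-two modulus m, y % m equals y & (m - 1), which is the kind of strength reduction a compiler or JIT can apply on its own.

def mod_pow2(y, m):
    # valid only when m is a power of two and y is non-negative
    assert m > 0 and (m & (m - 1)) == 0
    return y & (m - 1)

for y in range(64):
    assert mod_pow2(y, 8) == y % 8   # identical results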