Scaling parameters that span many orders of magnitude (1e-26 to 1e-15)?

I have parameters that span several orders of magnitude:
T1 = 1e-26;
T2 = 1e-19;
T3 = 1e-18;
T4 = 1e-17;
T5 = 1e-16;
They are calculated under certain conditions, e.g. at a given temperature.
I want to scale them so that the values fall approximately in the range 0.1 to 1.5.
That way, when the scaled values are put into an equation, they can show how this quantity depends on temperature, etc.
For instance, the equation looks like S = T*1.54/240, where T is the scaled value of one of the quantities T1, T2, T3, T4, T5 above.
How should I do the scaling so that I get a reasonable value of S?

Why not use the logarithm of T or S if you're not worried about small differences between parameters?
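A minimal sketch of the logarithm suggestion, assuming the goal is to map the values linearly in log10 space onto the 0.1 to 1.5 range from the question. The helper name log_scale and the choice of a min/max linear map are illustrative assumptions, not something stated in the thread.

```python
import math

def log_scale(values, lo=0.1, hi=1.5):
    """Map positive values onto [lo, hi], linearly in log10 space (hypothetical helper)."""
    logs = [math.log10(v) for v in values]
    lmin, lmax = min(logs), max(logs)
    span = lmax - lmin
    if span == 0:
        # All values share one magnitude; place them in the middle of the range.
        return [(lo + hi) / 2 for _ in logs]
    return [lo + (x - lmin) / span * (hi - lo) for x in logs]

# Values from the question.
T_raw = [1e-26, 1e-19, 1e-18, 1e-17, 1e-16]
T_scaled = log_scale(T_raw)

# Equation from the question, applied to the scaled values.
S = [t * 1.54 / 240 for t in T_scaled]
```

Because the map works on exponents, T1 (seven decades below T2) lands at the bottom of the range while T2 to T5 stay evenly spaced near the top. Small relative differences within one decade are compressed, which is the trade-off the answer mentions.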


Calculate intercepting vector?

I am trying to calculate an intercepting vector based on the velocity, location and time of two objects.
I found a post covering my problem but was left with some technical questions I could not ask there because my reputation is below 50.
Calculating Intercepting Vector
The answer marked as best goes over the process of solving my problem. However, when I tried to do the calculation myself, I could not understand how the position and velocity vectors are converted to a real number.
Using the data provided here for the positions and speeds of the target and the interceptor, the equation to solve is a quadratic in t, a*t^2 + b*t + c = 0. The vectors become real numbers through dot products.
Plugging in the numbers, the coefficients of the quadratic equation in t are:
s_t = [120, 40]; v_t = [5,2]; s_i = [80, 80]; v_i = 10;
a = dot(v_t, v_t)-10^2
b = 2*dot((s_t - s_i),v_t)
c = dot(s_t - s_i, s_t - s_i)
Solving for t yields:
delta = sqrt(b^2 - 4*a*c)
t1 = (-b + delta)/(2*a)
t2 = (-b - delta)/(2*a)
With the data at hand, a = -71, b = 240 and c = 3200, so t1 ≈ -5.23 is negative and can be discarded, leaving t2 ≈ 8.61.
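The calculation above can be sketched in plain Python, assuming a target moving at constant velocity and an interceptor with constant speed. The function name intercept_times is an illustrative choice, and the equal-speed case (a = 0) is deliberately left unhandled in this sketch.

```python
import math

def intercept_times(s_t, v_t, s_i, speed_i):
    """Return the positive intercept times, sorted ascending (hypothetical helper)."""
    dx, dy = s_t[0] - s_i[0], s_t[1] - s_i[1]
    # Dot products turn the 2D vectors into the scalar coefficients.
    a = v_t[0] ** 2 + v_t[1] ** 2 - speed_i ** 2
    b = 2 * (dx * v_t[0] + dy * v_t[1])
    c = dx * dx + dy * dy
    if a == 0:
        raise ValueError("equal speeds: the equation is linear, not handled here")
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # the interceptor can never reach the target
    delta = math.sqrt(disc)
    roots = ((-b + delta) / (2 * a), (-b - delta) / (2 * a))
    return sorted(t for t in roots if t > 0)

# Data from the question.
s_t, v_t, s_i, speed_i = (120, 40), (5, 2), (80, 80), 10

times = intercept_times(s_t, v_t, s_i, speed_i)
t_hit = times[0]

# Meeting point, then the interceptor velocity needed to get there at t_hit.
hit = (s_t[0] + v_t[0] * t_hit, s_t[1] + v_t[1] * t_hit)
v_i = ((hit[0] - s_i[0]) / t_hit, (hit[1] - s_i[1]) / t_hit)
```

The magnitude of v_i comes out equal to the interceptor speed, which is a quick sanity check on the root that was kept.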

How to speed up word2vec similarity calculation?

I trained a Word2Vec model using Gensim, and I have two sets of words:
S1 = {'','','' ...}
S2 = {'','','' ...}
For each word w1 in S1, I want to find the top 5 words in S2 that are most similar to w1. I am currently doing it this way:
model = w2v_model
word_similarities = {}
for w1 in S1:
    similarities = {}
    for w2 in S2:
        if w1 in model.wv and w2 in model.wv:
            similarity = model.wv.similarity(w1, w2)
            similarities[w2] = similarity
    word_similarities[w1] = similarities
Then for each word in word_similarities, I can get the top N from its dict values. If S1 and S2 are large, this becomes very slow.
Is there a quicker way to compute similarities for a large number of word pairs in Word2Vec, either in gensim or tensorflow?
Depending on the relative sizes of your model, S1, and S2, you may want to use the most_similar() method of gensim's various word-vector classes, which uses bulk, optimized vector-comparison operations to check against all vectors in your model, and then filter down to just the results in S2.
Alternatively, if S2 is much smaller than the full size of model.wv, and especially if you'll be re-using the same S2 set of word-vectors many times, you could consider creating your own KeyedVectors instance with just the S2 words in it, by first creating an empty KeyedVectors, then adding all the S2 words to it, then using s2.most_similar(positive=[target_word_vector], topn=5).
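To illustrate what a bulk vector comparison looks like, here is a minimal NumPy sketch that computes all S1-to-S2 cosine similarities in one matrix product instead of a nested loop. It is not gensim's implementation. The function name top_n_similar and the toy vectors are assumptions; with a real model the rows would presumably come from model.wv[word] for the words present in the vocabulary.

```python
import numpy as np

def top_n_similar(vecs1, vecs2, words2, n=5):
    """For each row of vecs1, return the n most cosine-similar words from words2."""
    # Normalise rows so that a dot product equals cosine similarity.
    a = vecs1 / np.linalg.norm(vecs1, axis=1, keepdims=True)
    b = vecs2 / np.linalg.norm(vecs2, axis=1, keepdims=True)
    sims = a @ b.T  # shape: (len(S1), len(S2))
    n = min(n, sims.shape[1])
    top = np.argsort(-sims, axis=1)[:, :n]  # indices of the best matches per row
    return [[(words2[j], float(sims[i, j])) for j in top[i]]
            for i in range(sims.shape[0])]

# Toy stand-ins for word vectors (hypothetical data, not from a trained model).
words2 = ["x", "y", "xy"]
vecs2 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vecs1 = np.array([[2.0, 0.0]])

result = top_n_similar(vecs1, vecs2, words2, n=2)
```

For very large S1 and S2 the similarity matrix may not fit in memory; processing S1 in row batches keeps the same idea while bounding memory use.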

Optimization with constraint programming

I want to express and solve the equations below in a constraint programming language.
I have time values t and am trying to find the best multipliers k that minimize my objective function.
Time: t1, t2, t3... (given as input)
Multipliers: k1, k2, k3... (continuous variables that need to be found)
c1, c2,.. cN are constants
Main equation k1*sin(c1*x)+k2*sin(c2*x)+k3*sin(c3*x)+k4*cos(c1*x)...
The problem is to minimize the results of all the equations below with the best possible values of (k1, k2, k3..). It is also known that there is no exact solution to the problem. So,
when x is t1 --> P1-k1*sin(c1*t1)-k2*sin(c2*t1)-k3*sin(c3*t1)-k4*cos(c1*t1)...
when x is t2 --> P2-k1*sin(c1*t2)-k2*sin(c2*t2)-k3*sin(c3*t2)-k4*cos(c1*t2)...
when x is t3 --> P3-k1*sin(c1*t3)-k2*sin(c2*t3)-k3*sin(c3*t3)-k4*cos(c1*t3)...
P1 is the value bound to time t1. P(t) is not an analytic function; I only have sampled values for it, for example:
t1 = 5 P1=0.7
t2= 6 P2= 0.3 etc..
Is it possible to solve this in minizinc or any other CP system?
I don't think that CP is particularly suited to this problem, as you don't really have constraints here. All you have are functions you want to minimize (f1,.., fi), and a few degrees of freedom to do so (k1,.., ki).
I feel like the problem is a pretty good candidate for the least squares method. Instead of trying to "fit" your functions f to a given value, you are trying to minimize them. So what you can do is try to fit f² to 0. (So we would be dealing with non-linear least squares in that case.)
Here is what it would look like written in Python:
import numpy as np
from scipy.optimize import curve_fit
xdata = np.array([t1, t2, t3, t4, ..., t10])
ydata = np.zeros(10) # this is your "target". 10 = Number of ti
def func(x, k1, k2, k3, k4):
    # P(x) must return the known values P1, P2, ... for the times in x.
    # Add further terms here following the same pattern.
    return (P(x)-k1*np.sin(c1*x)-k2*np.sin(c2*x)-k3*np.sin(c3*x)-k4*np.cos(c1*x))**2 # The square is a trick to minimize the function
popt, pcov = curve_fit(func, xdata, ydata, p0=(1.0, 1.0, 1.0, 1.0)) # Initial set of ki
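Since the model is linear in the multipliers k (only the fixed constants c sit inside sin and cos), a simpler variant of the least squares idea is ordinary linear least squares, which needs no initial guess. This is a swapped-in alternative to the curve_fit approach above, not the answerer's method. The sketch assumes four terms, and the values of c, t and the "true" k used to generate test data are hypothetical.

```python
import numpy as np

def design_matrix(t, c):
    """One column per basis function; extend with more terms as needed."""
    c1, c2, c3 = c
    return np.column_stack([
        np.sin(c1 * t),
        np.sin(c2 * t),
        np.sin(c3 * t),
        np.cos(c1 * t),
    ])

# Hypothetical inputs: ten time samples and three constants.
t = np.arange(1.0, 11.0)
c = (0.5, 1.1, 1.7)

# Synthetic P values built from known multipliers, so the fit can be checked.
k_true = np.array([0.4, -0.2, 0.1, 0.3])
A = design_matrix(t, c)
P = A @ k_true

# Minimise sum_i (P_i - sum_j k_j * basis_j(t_i))^2 over k.
k_fit, residuals, rank, _ = np.linalg.lstsq(A, P, rcond=None)
```

With real measurements P would hold the observed values (0.7 at t = 5, 0.3 at t = 6, and so on), and the fit would return the k that minimises the sum of squared differences rather than an exact match.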

np.where function

I've got a little problem understanding the where function in numpy.
The ‘times’ array contains the discrete epochs at which GPS measurements exist (rounded to the nearest second).
The ‘locations’ array contains the discrete values of the latitude, longitude and altitude of the satellite, interpolated from 10-second intervals to 1-second intervals at the ‘times’ epochs.
The ‘tracking’ array contains an array for each epoch in ‘times’ (array within an array). The arrays have 5 columns and 32 rows. The 32 rows correspond to the 32 satellites of the GPS constellation. The 0th row corresponds to the 1st satellite, the 31st to the 32nd. The columns contain the following (in order): is the satellite tracked (0), is L1 locked (1), is L2 locked (2), is L1 unexpectedly lost (3), is L2 unexpectedly lost (4).
We need to find all the unexpected losses and put them in an array so we can plot it on a map.
What we tried to do is:
i = 0
with np.load(r'folderpath\%i.npz' % i) as oneday_data:  # replace folderpath with your directory
    times = oneday_data['times']
    locations = oneday_data['locations']
    tracking = oneday_data['tracking']
A = np.where(tracking[:][:][4] == 1)
This should give us all the positions of the losses. With these indices it would be easy to get the right locations. But it keeps returning useless data.
Can someone help us?
I think the problem is your dual slices. Further, having an array of arrays could lead to weird problems (I assume you mean an object array of 2D arrays).
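To see why the chained slices are the problem, here is a small sketch on a made-up 3D array (6 epochs, 32 satellites, 5 columns; the sizes and the single marked loss are assumptions for illustration). Each `[:]` returns the whole array again, so `tracking[:][:][4]` picks epoch 4, not column 4.

```python
import numpy as np

# Hypothetical data: 6 epochs x 32 satellites x 5 status columns.
tracking = np.zeros((6, 32, 5), dtype=int)
tracking[2, 7, 4] = 1  # L2 unexpectedly lost: epoch index 2, satellite row 7

# Chained slicing: [:] is a no-op view, so this is simply tracking[4].
chained = tracking[:][:][4]

# Multi-axis indexing: selects column 4 for every epoch and satellite.
column4 = tracking[:, :, 4]
```

Here chained has shape (32, 5) and contains no loss at all, while column4 has shape (6, 32) and holds the marked entry at position (2, 7).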
So I think you need to dstack tracking into a 3D array, then do where on that. If the array is already 3D, then you can skip the dstack part. This will get the places where L2 is unexpectedly lost, which is what you did in your example:
tracking3d = np.dstack(tracking)
A0, A2 = np.where(tracking3d[:, 4, :]==1)
A0 is the position of the 1 along axis 0 (satellite), while A2 is the position of the same 1 along axis 2 (time epoch).
If the values of tracking can only be 0 or 1, you can simplify this by just doing np.where(tracking3d[:, 4, :]).
You can also roll the axes back into the configuration you were using (0: time epoch, 1: satellite, 2: tracking status)
tracking3d = np.rollaxis(np.dstack(tracking), 2, 0)
A0, A1 = np.where(tracking3d[:, :, 4]==1)
If you want to find the locations where L1 or L2 are unexpectedly lost, you can do this:
tracking3d = np.rollaxis(np.dstack(tracking), 2, 0)
A0, A1, _ = np.where(tracking3d[:, :, 3:]==1)
In this case it is the same, except that a dummy variable _ is used for the location along the last axis, since you don't care whether it was lost for L1 or L2 (if you do care, you could just do np.where independently for each of the two columns).
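A self-contained sketch of the whole lookup on synthetic data, assuming tracking is a sequence of (32, 5) arrays, one per epoch. It uses np.stack(..., axis=0) as a shortcut that yields the (epoch, satellite, status) layout directly, which is an alternative to the dstack plus rollaxis combination above. The epoch values and the two marked losses are made up.

```python
import numpy as np

# Hypothetical epochs and per-epoch tracking tables (32 satellites x 5 columns).
times = np.array([100, 101, 102])
tracking = [np.zeros((32, 5), dtype=int) for _ in times]
tracking[1][7, 4] = 1  # L2 unexpectedly lost, satellite row 7, epoch index 1
tracking[2][0, 3] = 1  # L1 unexpectedly lost, satellite row 0, epoch index 2

# Stack into one 3D array: axis 0 = epoch, axis 1 = satellite, axis 2 = status.
tracking3d = np.stack(tracking, axis=0)

# L2 losses only (column 4).
epochs_l2, sats_l2 = np.where(tracking3d[:, :, 4] == 1)

# L1 or L2 losses (columns 3 and 4); the third index array is ignored.
epochs_any, sats_any, _ = np.where(tracking3d[:, :, 3:] == 1)

# The epoch indices select the matching entries of any per-epoch array.
loss_times = times[epochs_any]
```

The same epochs_any indices could be used on the locations array to get the coordinates to plot, provided it has one row per epoch.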

Sparse dot product in SQL

Imagine I have a table which stores a series of sparse vectors. A sparse vector means that it stores only the nonzero values explicitly in the data structure. I could have a 1 million dimensional vector, but I only store the values for the dimensions which are nonzero. So the size is proportional to the number of nonzero entries, not the dimensionality of the vector.
Table definition would be something like this:
vector_id : int
dimension : int
value : float
Now, in normal programming land I can compute the inner product or dot product of two vectors in O(|v1| + |v2|) time. Basically the algorithm is to store the sparse vectors sorted by dimension and iterate through the dimensions in each until you find collisions between dimensions and multiply the values of the shared dimension and keep adding those up until you get to the end of either one of the vectors.
What's the fastest way to pull this off in SQL?
You should be able to replicate this algorithm in one query:
select sum(v1.value * v2.value)
from vectors v1
inner join vectors v2
on v1.dimension = v2.dimension
where v1.vector_id = ...
and v2.vector_id = ...
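The query can be tried end to end with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name vectors follows the answer, while the sample rows, the composite primary key and the COALESCE wrapper (so that vectors with no shared dimension give 0 instead of NULL) are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vectors (
        vector_id INTEGER,
        dimension INTEGER,
        value     REAL,
        PRIMARY KEY (vector_id, dimension)  -- assumed; also serves as an index for the join
    )
    """
)

# Two hypothetical sparse vectors; only nonzero entries are stored.
rows = [
    (1, 3, 2.0), (1, 10, 1.5), (1, 999999, 4.0),
    (2, 3, 0.5), (2, 7, 9.0), (2, 999999, 2.0),
]
conn.executemany("INSERT INTO vectors VALUES (?, ?, ?)", rows)

query = """
    SELECT COALESCE(SUM(v1.value * v2.value), 0)
    FROM vectors v1
    INNER JOIN vectors v2
        ON v1.dimension = v2.dimension
    WHERE v1.vector_id = ?
      AND v2.vector_id = ?
"""
(dot,) = conn.execute(query, (1, 2)).fetchone()
conn.close()
```

Only dimensions 3 and 999999 appear in both vectors, so the join produces two rows and the sum is 2.0*0.5 + 4.0*2.0. Whether the database reaches the merge-like O(|v1| + |v2|) behaviour depends on its query planner and on an index covering (vector_id, dimension).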