I'm doing a data science project, and I was wondering how to handle a music key (scale) as a feature in the KNN algorithm.
I know KNN is based on distances, so giving each key a number from 1 to 24 doesn't make much sense (musically, key 24 is as close to key 1 as key 7 is to key 8, but the numbers put them far apart).
I have thought about making one column for "Major/Minor" and another for the note itself,
but I still face the same problem: I need to represent the note with a number, and because notes are cyclic I cannot number them linearly 1-12.
For people who have no idea how music keys work, my question is equivalent to handling states in KNN: you can't just number them linearly 1-50.
One way you could think about the distance between scales is to think of each scale as a 12-element binary vector where there's a 1 wherever a note is in the scale and a zero otherwise.
Then you can compute the Hamming distance between scales. The Hamming distance, for example, between a major scale and its relative minor scale should be zero because they both contain the same notes.
Here's a way you could set this up in Python:
from enum import IntEnum
import numpy as np
from scipy.spatial.distance import hamming
class Note(IntEnum):
    C = 0
    Db = 1
    D = 2
    Eb = 3
    E = 4
    F = 5
    Gb = 6
    G = 7
    Ab = 8
    A = 9
    Bb = 10
    B = 11
major = np.array((1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1))
minor = np.array((1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0)) #WHWWHWW Natural Minor
# Transpose the basic scale form to a key using Numpy's `roll` function
cMaj = np.roll(major, Note.C) # Rolling by zero changes nothing
aMin = np.roll(minor, Note.A)
gMaj = np.roll(major, Note.G)
fMaj = np.roll(major, Note.F)
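# Note: scipy's hamming returns the fraction of differing positions, not the count
print('Distance from cMaj to aMin', hamming(cMaj, aMin)) # 0.0: relative minor, same notes
print('Distance from cMaj to gMaj', hamming(cMaj, gMaj)) # One step clockwise on circle of fifths: 2 of 12 positions differ
print('Distance from cMaj to fMaj', hamming(cMaj, fMaj)) # One step counter-clockwise on circle of fifths: 2 of 12 positions differ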
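For anyone without NumPy/SciPy at hand, the same idea can be sketched with the standard library only. The helper names `scale_vector` and `hamming_count` are made up for this illustration, and the distance here is the raw count of differing positions rather than the fraction that SciPy's `hamming` returns.

```python
# Interval patterns as semitone offsets from the root (natural minor assumed).
MAJOR = (0, 2, 4, 5, 7, 9, 11)
MINOR = (0, 2, 3, 5, 7, 8, 10)

def scale_vector(root, pattern):
    """Return a 12-element 0/1 list marking the pitch classes in the scale."""
    notes = {(root + step) % 12 for step in pattern}
    return [1 if pc in notes else 0 for pc in range(12)]

def hamming_count(u, v):
    """Number of positions where the two vectors differ."""
    return sum(a != b for a, b in zip(u, v))

c_major = scale_vector(0, MAJOR)
a_minor = scale_vector(9, MINOR)
g_major = scale_vector(7, MAJOR)
f_sharp_major = scale_vector(6, MAJOR)

print(hamming_count(c_major, a_minor))        # relative minor: same notes
print(hamming_count(c_major, g_major))        # neighbours on the circle of fifths
print(hamming_count(c_major, f_sharp_major))  # opposite side of the circle
```

Keys that are far apart on the circle of fifths share few notes, so they end up far apart under this distance as well.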
IIUC, you can convert your feature to something like a sine as follows. Here I have 11 values, 0-10, and I am transforming them to keep their circular relation.
import numpy as np
import matplotlib.pyplot as plt
# 11 evenly spaced angles from 0 to 180 degrees, passed through sin
a = np.around(np.sin(np.deg2rad(np.arange(11) * 18)), 3)
plt.plot(a)
plt.show()
Output:
Through this feature engineering you can see that the circularity of your feature is encoded: the transformed value at 0 is equal to the one at 10.
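One caveat with a single sine over half a period is that distinct inputs can collide (for example, 1 and 9 both map to sin(18°) = sin(162°)). A common variant, sketched here as an assumption rather than as what the answer above does, spreads the values over a full period and keeps both the sine and the cosine, so each note becomes a point on the unit circle. The function names and the numbering of notes as 0-11 are illustrative only.

```python
import math

def cyclic_encode(value, period):
    """Map a cyclic value to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * value / period
    return (math.sin(angle), math.cos(angle))

def euclidean(p, q):
    """Plain Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Twelve pitch classes, numbered 0..11 (hypothetical numbering).
notes = [cyclic_encode(n, 12) for n in range(12)]

print(round(euclidean(notes[0], notes[1]), 3))   # adjacent notes
print(round(euclidean(notes[0], notes[11]), 3))  # wraps around: same as adjacent
print(round(euclidean(notes[0], notes[6]), 3))   # opposite notes: farthest apart
```

With the two columns as features, KNN's Euclidean distance treats note 11 and note 0 as neighbours.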
The problem is:
For a positive integer n, define f(n) as the least positive multiple of n that, written in base 10, uses only digits ≤ 2.
Thus f(2)=2, f(3)=12, f(7)=21, f(42)=210, f(89)=1121222.
To solve it in Mathematica, I wrote a function f which calculates f(n)/n :
f[n_] := Module[{i}, i = 1;
While[Mod[FromDigits[IntegerDigits[i, 3]], n] != 0, i = i + 1];
Return[FromDigits[IntegerDigits[i, 3]]/n]]
The principle is simple: enumerate all numbers whose digits are only 0, 1 and 2 by counting in the ternary numeral system, until one of them is divisible by n.
It correctly gives 11363107 for 1~100, and I tested for 1~1000 (calculation took roughly a minute, and gives 111427232491), so I started to calculate the answer of the problem.
However, this method is too slow. The computer has been calculating the answer for two hours and hasn't finished computing.
How can I improve my code to calculate faster?
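For readers who don't use Mathematica, the brute-force principle from the question can be sketched in Python as follows. The helper name `ternary_as_decimal` is made up for this illustration, and unlike the Mathematica function this `f` returns f(n) itself rather than f(n)/n. It is only meant to show the enumeration and is just as slow for the hard cases.

```python
def ternary_as_decimal(i):
    """Write i in base 3 and read the resulting digit string as a base-10 number."""
    digits = []
    while i:
        i, d = divmod(i, 3)
        digits.append(str(d))
    return int("".join(reversed(digits)))

def f(n):
    """Least positive multiple of n that uses only the digits 0, 1, 2 (brute force)."""
    i = 1
    while True:
        candidate = ternary_as_decimal(i)
        if candidate % n == 0:
            return candidate
        i += 1

# Examples quoted in the problem statement: 2, 12, 21, 210, 1121222
print([f(n) for n in (2, 3, 7, 42, 89)])
# Sum of f(n)/n for n = 1..100, quoted in the question as 11363107
print(sum(f(n) // n for n in range(1, 101)))
```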
hammar's comment makes it clear that the calculation time is disproportionately spent on values of n that are multiples of 99. I would suggest finding an algorithm that targets those cases (I have left this as an exercise for the reader) and using Mathematica's pattern matching to direct the calculation to the appropriate one.
f[n_Integer?Positive]/; Mod[n,99]==0 := (* magic here *)
f[n_] := (* case for all other numbers *) Module[{i}, i = 1;
While[Mod[FromDigits[IntegerDigits[i, 3]], n] != 0, i = i + 1];
Return[FromDigits[IntegerDigits[i, 3]]/n]]
Incidentally, you can speed up the fast easy ones by doing it a slightly different way, but that is of course a second-order improvement. You could perhaps set the code up to use ff initially, breaking the While loop if i reaches a certain point, and then switching to the f function you have already provided. (Notice I'm returning n i not i here - that was just for illustrative purposes.)
ff[n_] :=
Module[{i}, i = 1; While[Max[IntegerDigits[n i]] > 2, i++];
Return[n i]]
Table[Timing[ff[n]], {n, 80, 90}]
{{0.000125, 1120}, {0.001151, 21222}, {0.001172, 22222}, {0.00059,
11122}, {0.000124, 2100}, {0.00007, 1020}, {0.000655,
12212}, {0.000125, 2001}, {0.000119, 2112}, {0.04202,
1121222}, {0.004291, 122220}}
This is at least a little faster than your version (reproduced below) for the short cases, but it's much slower for the long cases.
Table[Timing[f[n]], {n, 80, 90}]
{{0.000318, 14}, {0.001225, 262}, {0.001363, 271}, {0.000706,
134}, {0.000358, 25}, {0.000185, 12}, {0.000934, 142}, {0.000316,
23}, {0.000447, 24}, {0.006628, 12598}, {0.002633, 1358}}
A simple thing that you can do is compile your function to C and make it parallelizable.
Clear[f, fCC]
f[n_Integer] := f[n] = fCC[n]
fCC = Compile[{{n, _Integer}}, Module[{i = 1},
While[Mod[FromDigits[IntegerDigits[i, 3]], n] != 0, i++];
Return[FromDigits[IntegerDigits[i, 3]]]],
Parallelization -> True, CompilationTarget -> "C"];
Total[ParallelTable[f[i]/i, {i, 1, 100}]]
(* Returns 11363107 *)
The problem is that eventually your integers will be larger than a long integer and Mathematica will revert to the non-compiled arbitrary precision arithmetic. (I don't know why the Mathematica compiler does not include an arbitrary precision C library...)
As ShreevatsaR commented, the Project Euler problems are often designed to run quickly if you write smart code (and think about the math), but take forever if you want to brute force it. See the about page. Also, spoilers posted on their message boards are removed and it's considered bad form to post spoilers on other sites.
Aside:
You can test that the compiled code is using 32-bit longs by running:
In[1]:= test = Compile[{{n, _Integer}}, {n + 1, n - 1}];
In[2]:= test[2147483646]
Out[2]= {2147483647, 2147483645}
In[3]:= test[2147483647]
During evaluation of In[53]:= CompiledFunction::cfn: Numerical error encountered at instruction 1; proceeding with uncompiled evaluation. >>
Out[3]= {2147483648, 2147483646}
In[4]:= test[2147483648]
During evaluation of In[52]:= CompiledFunction::cfsa: Argument 2147483648 at position 1 should be a machine-size integer. >>
Out[4]= {2147483649, 2147483647}
and similar for the negative numbers.
I am sure there must be better ways to do this, but this is as far as my inspiration got me.
The following code finds all values of f[n] for n 1-10,000 except the most difficult one, which happens to be n = 9999. I stop the loop when we get there.
ClearAll[f];
i3 = 1;
divNotFound = Range[10000];
While[Length[divNotFound] > 1,
i10 = FromDigits[IntegerDigits[i3++, 3]];
divFound = Pick[divNotFound, Divisible[i10, divNotFound]];
divNotFound = Complement[divNotFound, divFound];
Scan[(f[#] = i10) &, divFound]
] // Timing
Divisible may work on lists for both arguments, and we make good use of that here. The whole routine takes about 8 min.
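A rough Python analogue of this single-pass idea, for readers without Mathematica, might look like the following. The names `candidates` and `all_f` and the `skip` parameter are made up for this sketch. It is demonstrated on n = 1..100 only, since pure Python would be far slower than the vectorized `Divisible` on the full range.

```python
from itertools import count, product

def candidates():
    """Yield positive numbers whose digits are all 0, 1 or 2, in increasing order."""
    for length in count(1):
        for first in "12":
            for rest in product("012", repeat=length - 1):
                yield int(first + "".join(rest))

def all_f(limit, skip=()):
    """Find f(n) for every n in 1..limit, except the values listed in skip."""
    remaining = set(range(1, limit + 1)) - set(skip)
    found = {}
    for c in candidates():
        # every still-unresolved n that divides this candidate is now solved
        hits = [n for n in remaining if c % n == 0]
        for n in hits:
            found[n] = c
        remaining.difference_update(hits)
        if not remaining:
            return found

table = all_f(100)
print(table[89])
print(sum(table[n] // n for n in table))
```

Because the candidates are generated in increasing order, the first candidate that a given n divides is its least such multiple.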
For 9999 a bit of thinking is necessary. It is not brute-forceable in a reasonable time.
Let P be the factor we are looking for and T (consisting only of 0's, 1's and 2's) the result of multiplying P by 9999, that is,
9999 P = T
then
P(10,000 - 1) = 10,000 P - P = T
==> 10,000 P = P + T
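Let P1, ..., PL be the digits of P, and Ti the digits of T; then, writing 10,000 P = P + T as a column addition, the digits of P (right-aligned) plus the digits T1, ..., TL+4 sum to P1, ..., PL followed by four zeros.
The last four zeros in the sum originate, of course, from the multiplication by 10,000. Hence TL+1,...,TL+4 and PL-3,...,PL are each other's complements. Where the former consist only of 0, 1, 2, the latter allows: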
last4 = IntegerDigits[#][[-4 ;; -1]] & /@ (10000 - FromDigits /@ Tuples[{0, 1, 2}, 4])
==> {{0, 0, 0, 0}, {9, 9, 9, 9}, {9, 9, 9, 8}, {9, 9, 9, 0}, {9, 9, 8, 9},
{9, 9, 8, 8}, {9, 9, 8, 0}, {9, 9, 7, 9}, ..., {7, 7, 7, 9}, {7, 7, 7, 8}}
There are only 81 allowable sets, with 7's, 8's, 9's and 0's (not all possible combinations of them) instead of 10,000 numbers, a speed gain of a factor of 120.
One can see that P1-P4 can only be ternary digits, each being the sum of a ternary digit and nothing. There can be no carry-over from the addition of T5 and P1. A further reduction can be gained by realizing that P1 cannot be 0 (the first digit must be something), and if it were a 2, multiplication by 9999 would cause an 8 or 9 (if a carry occurs) in the result T, which is not allowed either. It must be a 1, then. Twos may also be excluded for P2-P4.
Since P5 = P1 + T5 it follows that P5 < 4 as T5 < 3, same for P6-P8.
Since P9 = P5 + T9 it follows that P9 < 6, same for P10-P11
In all these cases the additions don't need to include a carry-over, as none can occur (Pi+Ti is always < 8). This may not be true for P12 if L = 16: in that case we can have a carry-over from the addition of the last 4 digits, so P12 < 7. This also excludes P12 from being in the last block of 4 digits. The solution must therefore be at least 16 digits long.
Combining all this we are going to try to find a solution for L=16:
Do[
If[Max[IntegerDigits[
9999 FromDigits[{1, 1, 1, 1, i5, i6, i7, i8, i9, i10, i11, i12}~
Join~l4]]
] < 3,
Return[FromDigits[{1, 1, 1, 1, i5, i6, i7, i8, i9, i10, i11, i12}~Join~l4]]
],
{i5, 0, 3}, {i6, 0, 3}, {i7, 0, 3}, {i8, 0, 3}, {i9, 0, 5},
{i10, 0, 5}, {i11, 0, 5}, {i12, 0, 6}, {l4,last4}
] // Timing
==> {295.372, 1111333355557778}
and indeed 1,111,333,355,557,778 x 9,999 = 11,112,222,222,222,222,222
We could have guessed this as
f[9] = 12,222
f[99] = 1,122,222,222
f[999] = 111,222,222,222,222
The pattern apparently is that the number of 1's increases by 1 each step and the number of consecutive 2's by 4.
With 13 min, this is over the 1 min limit for Project Euler. Perhaps I'll look into it some time soon.
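The guessed pattern is easy to sanity-check. The sketch below (in Python, with a made-up helper name `guessed_multiple`) only confirms that k ones followed by 4k twos is a multiple of 10^k - 1 for k = 1..4; it does not prove that it is the least such multiple.

```python
def guessed_multiple(k):
    """k ones followed by 4k twos, the pattern guessed above."""
    return int("1" * k + "2" * (4 * k))

for k in range(1, 5):
    n = 10**k - 1                  # 9, 99, 999, 9999
    t = guessed_multiple(k)
    assert t % n == 0              # t really is a multiple of n
    print(n, t, t // n)
```

For k = 4 the quotient printed is the factor 1111333355557778 found by the search above.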
Try something smarter.
Build a function F(N) which finds out the smallest number with {0, 1, 2} digits which is divisible by N.
So for a given N the number which we are looking for can be written as SUM = 10^n * dn + 10^(n-1) * dn-1 + ... + 10^1 * d1 + 1 * d0 (where di are the digits of the number).
So you have to find the digits such that SUM % N == 0.
Basically, each digit contributes (10^i * di) % N to SUM % N.
I am not giving any more hints, but the next hint would be to use DP. Try to figure out how to use DP to find out the digits.
For all numbers between 1 and 10000 it took under 1 sec in C++ (in total).
Good luck.
I am processing a series of points which all have the same Y value, but different X values. I go through the points by incrementing X by one. For example, I might have Y = 50 and X is the integers from -30 to 30. Part of my algorithm involves finding the distance to the origin from each point and then doing further processing.
After profiling, I've found that the sqrt call in the distance calculation is taking a significant amount of my time. Is there an iterative way to calculate the distance?
In other words:
I want to efficiently calculate: r[n] = sqrt(x[n]*x[n] + y*y). I can save information from the previous iteration. Each iteration changes by incrementing x, so x[n] = x[n-1] + 1. I cannot use sqrt or trig functions, except at the beginning of each scanline, because they are too slow.
I can use approximations as long as they are good enough (less than 0.1% error) and the errors introduced are smooth (I can't bin to a pre-calculated table of approximations).
Additional information:
x and y are always integers between -150 and 150
I'm going to try a couple ideas out tomorrow and mark the best answer based on which is fastest.
Results
I did some timings
Distance formula: 16 ms / iteration
Pete's interpolating solution: 8 ms / iteration
wrang-wrang pre-calculation solution: 8 ms / iteration
I was hoping the test would decide between the two, because I like both answers. I'm going to go with Pete's because it uses less memory.
Just to get a feel for it: for your range, y = 50, x = 0 gives r = 50 and y = 50, x = +/- 30 gives r ~= 58.3. You want an approximation good to +/- 0.1%, or about +/- 0.05 absolute. That's a lot lower accuracy than most library sqrts provide.
Two approximate approaches: calculate r by interpolating from the previous value, or use a few terms of a suitable series.
Interpolating from previous r
r = ( x^2 + y^2 )^(1/2)
dr/dx = 1/2 . 2x . ( x^2 + y^2 )^(-1/2) = x/r
double r = 50;
for ( int x = 0; x <= 30; ++x ) {
    double r_true = Math.sqrt ( 50*50 + x*x );
    // arguments in the same order as the labels: x, r_true, r (the approximation)
    System.out.printf ( "x: %d r_true: %f r_approx: %f error: %f%%\n", x, r_true, r, 100 * Math.abs ( r_true - r ) / r );
    // step to x + 1 using the slope dr/dx = x/r, with x taken at the midpoint x + 0.5
    r = r + ( x + 0.5 ) / r;
}
Gives:
x: 0 r_true: 50.000000 r_approx: 50.000000 error: 0.000000%
x: 1 r_true: 50.009999 r_approx: 50.010000 error: 0.000002%
....
x: 29 r_true: 57.801384 r_approx: 57.825065 error: 0.040953%
x: 30 r_true: 58.309519 r_approx: 58.335225 error: 0.044065%
which seems to meet the requirement of 0.1% error, so I didn't bother coding the next one, as it would require quite a bit more calculation steps.
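One way to motivate the `x + 0.5` in the update: since r[n+1]^2 - r[n]^2 = 2x + 1, the exact step is (2x + 1) / (r[n+1] + r[n]), and replacing r[n+1] + r[n] with 2 r[n] gives (x + 0.5) / r[n]. Here is a small Python sketch of the same loop that reports the worst relative error over a scanline; the function names are made up for this illustration, and y is assumed to be nonzero.

```python
import math

def scanline_distances(y, x_max):
    """Approximate sqrt(x*x + y*y) for x = 0..x_max using a single sqrt at x = 0."""
    r = math.sqrt(y * y)          # exact value at the start of the scanline
    out = []
    for x in range(x_max + 1):
        out.append(r)
        r += (x + 0.5) / r        # step from x to x + 1 without a sqrt
    return out

def max_relative_error(y, x_max):
    """Worst relative error of the approximation over x = 0..x_max."""
    approx = scanline_distances(y, x_max)
    return max(abs(a - math.hypot(x, y)) / math.hypot(x, y)
               for x, a in enumerate(approx))

print(max_relative_error(50, 30))
```

The error accumulates along the scanline and grows when |y| is small compared with x, so it may be worth checking the worst case for your actual range before relying on it.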
Truncated Series
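The Taylor series for sqrt ( 1 + x ) for x near zero is
sqrt ( 1 + x ) = 1 + 1/2 x - 1/8 x^2 + 1/16 x^3 - ..., with the n-th term having magnitude on the order of ( 1 / 2 )^(n+1) x^n
Using r = y sqrt ( 1 + (x/y)^2 ), you're looking for a term t of magnitude about ( 1 / 2 )^(n+1) 0.36^n that is less than 0.001 (0.36 = (30/50)^2 is the largest value of (x/y)^2 in the example), so log ( 0.002 ) > n log ( 0.18 ) or n > 3.6, so taking terms up to the fourth power of (x/y)^2 should be OK.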
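Since the series approach wasn't coded above, here is a minimal Python sketch of it, assuming |x| < |y| so that the series converges. `COEFFS` holds the first five binomial coefficients of sqrt(1 + u), and the function name is made up for this example.

```python
import math

# Coefficients of sqrt(1 + u) = 1 + u/2 - u^2/8 + u^3/16 - 5u^4/128 + ...
COEFFS = (1.0, 0.5, -0.125, 0.0625, -0.0390625)

def series_distance(x, y):
    """Approximate sqrt(x*x + y*y) as |y| * sqrt(1 + u) with u = (x/y)^2, |x| < |y|."""
    u = (x / y) ** 2
    total = 0.0
    for c in reversed(COEFFS):    # Horner evaluation of the polynomial in u
        total = total * u + c
    return abs(y) * total

worst = max(abs(series_distance(x, 50) - math.hypot(x, 50)) / math.hypot(x, 50)
            for x in range(-30, 31))
print(worst)
```

Unlike the interpolating loop, each value here is computed independently, so the error does not accumulate along the scanline.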
Y=10000
Y2=Y*Y
// only indices 0..Y are ever looked up, since x <= y when the table is used
for x=0..Y do
  D[x]=sqrt(Y2+x*x)
norm(x,y)=
  if (y==0) x
  else if (x>y) norm(y,x)
  else {
     s=Y/y
     D[round(x*s)]/s
  }
If your coordinates are smooth, then the idea can be extended with linear interpolation. For more precision, increase Y.
The idea is that s*(x,y) is on the line y=Y, which you've precomputed distances for. Get the distance, then divide it by s.
I assume you really do need the distance and not its square.
You may also be able to find a general sqrt implementation that sacrifices some accuracy for speed, but I have a hard time imagining that beating what the FPU can do.
By linear interpolation, I mean to change D[round(x)] to:
f=floor(x)
a=x-f
D[f]*(1-a)+D[f+1]*a
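A runnable Python sketch of the pseudocode above, including the optional linear interpolation, could look like this. Taking absolute values to handle negative coordinates and adding one extra table entry so that `D[f + 1]` is always valid are assumptions added for this illustration.

```python
import math

Y = 10000
# D[x] = distance from the origin to (x, Y); one extra entry for interpolation
D = [math.sqrt(Y * Y + x * x) for x in range(Y + 2)]

def norm(x, y, interpolate=False):
    """Approximate sqrt(x*x + y*y) by scaling (x, y) onto the line y = Y."""
    x, y = abs(x), abs(y)
    if x > y:
        x, y = y, x               # ensure x <= y so the scaled x stays within 0..Y
    if y == 0:
        return 0.0
    s = Y / y                     # scale factor that moves the point to y = Y
    xs = x * s
    if interpolate:
        f = math.floor(xs)
        a = xs - f
        d = D[f] * (1 - a) + D[f + 1] * a
    else:
        d = D[round(xs)]
    return d / s

print(norm(30, 50), norm(30, 50, interpolate=True), math.hypot(30, 50))
```

In Python this is only illustrative; whether a table lookup beats a hardware sqrt depends on the target platform.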
This doesn't really answer your question, but may help...
The first questions I would ask would be:
"do I need the sqrt at all?".
"If not, how can I reduce the number of sqrts?"
then yours: "Can I replace the remaining sqrts with a clever calculation?"
So I'd start with:
Do you need the exact radius, or would radius-squared be acceptable? There are fast approximations to sqrt, but probably not accurate enough for your spec.
Can you process the image using mirrored quadrants or eighths? By processing all pixels at the same radius value in a batch, you can reduce the number of calculations by 8x.
Can you precalculate the radius values? You only need a table that is a quarter (or possibly an eighth) of the size of the image you are processing, and the table would only need to be precalculated once and then re-used for many runs of the algorithm.
So clever maths may not be the fastest solution.
Well, there's always trying to optimize your sqrt; the fastest one I've seen is the old Carmack Quake 3 sqrt:
http://betterexplained.com/articles/understanding-quakes-fast-inverse-square-root/
That said, since sqrt is non-linear, you're not going to be able to do simple linear interpolation along your line to get your result. The best idea is to use a table lookup since that will give you blazing fast access to the data. And, since you appear to be iterating by whole integers, a table lookup should be exceedingly accurate.
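For reference, the bit trick described at that link can be sketched in Python with the `struct` module. This only illustrates the arithmetic (it will not be fast in Python), and with a single Newton-Raphson step the relative error is on the order of 0.2%, so a second step would likely be needed to meet a 0.1% requirement. The function names are made up for this example.

```python
import math
import struct

def fast_inv_sqrt(x):
    """Quake III style approximation of 1/sqrt(x) for positive x (32-bit float trick)."""
    i = struct.unpack("<I", struct.pack("<f", x))[0]   # reinterpret float bits as uint32
    i = 0x5F3759DF - (i >> 1)                            # magic-constant initial guess
    y = struct.unpack("<f", struct.pack("<I", i))[0]   # back to a float
    return y * (1.5 - 0.5 * x * y * y)                   # one Newton-Raphson step

def fast_sqrt(x):
    """sqrt(x) = x * (1/sqrt(x)) for positive x."""
    return x * fast_inv_sqrt(x)

for v in (2500.0, 3400.0):
    print(v, fast_sqrt(v), math.sqrt(v))
```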
Well, you can mirror around x=0 to start with (you need only compute n>=0, and then dupe those results to the corresponding n<0). After that, I'd take a look at using the derivative of sqrt(a^2+b^2) (or the corresponding sin) to take advantage of the constant dx.
If that's not accurate enough, may I point out that this is a pretty good job for SIMD, which will provide you with a reciprocal square root op on both SSE and VMX (and shader model 2).
This is sort of related to a HAKMEM item:
ITEM 149 (Minsky): CIRCLE ALGORITHM
Here is an elegant way to draw almost
circles on a point-plotting display:
NEW X = OLD X - epsilon * OLD Y
NEW Y = OLD Y + epsilon * NEW(!) X
This makes a very round ellipse
centered at the origin with its size
determined by the initial point.
epsilon determines the angular
velocity of the circulating point, and
slightly affects the eccentricity. If
epsilon is a power of 2, then we don't
even need multiplication, let alone
square roots, sines, and cosines! The
"circle" will be perfectly stable
because the points soon become
periodic.
The circle algorithm was invented by
mistake when I tried to save one
register in a display hack! Ben Gurley
had an amazing display hack using only
about six or seven instructions, and
it was a great wonder. But it was
basically line-oriented. It occurred
to me that it would be exciting to
have curves, and I was trying to get a
curve display hack with minimal
instructions.
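A small Python sketch of the update rule quoted above, for anyone who wants to try it. The starting point and epsilon are arbitrary choices for this example, and `minsky_circle` is a made-up name. The key detail is that the y update uses the already-updated x.

```python
import math

def minsky_circle(x, y, epsilon, steps):
    """Generate points with the HAKMEM item 149 update; note that y uses the NEW x."""
    points = []
    for _ in range(steps):
        x = x - epsilon * y
        y = y + epsilon * x       # x here is already the updated value
        points.append((x, y))
    return points

pts = minsky_circle(100.0, 0.0, 1 / 16, 1000)
radii = [math.hypot(px, py) for px, py in pts]
print(min(radii), max(radii))     # stays close to 100: a "very round ellipse"
```

The points trace an ellipse rather than an exact circle, which is why the distance to the origin wobbles slightly instead of staying constant.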
My inner loop contains a calculation that profiling shows to be problematic.
The idea is to take a greyscale pixel x (0 <= x <= 1), and "increase its contrast". My requirements are fairly loose, just the following:
for x < .5, 0 <= f(x) < x
for x > .5, x < f(x) <= 1
f(0) = 0
f(x) = 1 - f(1 - x), i.e. it should be "symmetric"
Preferably, the function should be smooth.
So the graph must look something like an S-shaped curve through (0, 0), (.5, .5) and (1, 1) (figure not preserved).
I have two implementations (their results differ but both are conformant):
float cosContrastize(float i) {
    // pi is assumed to be defined elsewhere, e.g. as a float constant
    return .5 - cos(i * pi) / 2;
}
float mulContrastize(float i) {
    if (i < .5) return i * i * 2;
    i = 1 - i;
    return 1 - i * i * 2;
}
So I request either a microoptimization for one of these implementations, or an original, faster formula of your own.
Maybe one of you can even twiddle the bits ;)
Consider the following sigmoid-shaped functions (properly translated to the desired range):
error function
normal CDF
tanh
logit
I generated the above figure using MATLAB. If interested here's the code:
x = -3:.01:3;
plot( x, 2*(x>=0)-1, ...
x, erf(x), ...
x, tanh(x), ...
x, 2*normcdf(x)-1, ...
x, 2*(1 ./ (1 + exp(-x)))-1, ...
x, 2*((x-min(x))./range(x))-1 )
legend({'hard' 'erf' 'tanh' 'normcdf' 'logit' 'linear'})
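As an example of "properly translated to the desired range", here is one way the tanh variant could be mapped onto [0, 1], sketched in Python. The steepness parameter `k` and the function name are assumptions for this illustration, and no claim is made that it is faster than the implementations in the question.

```python
import math

def tanh_contrast(x, k=6.0):
    """S-shaped map of [0, 1] onto [0, 1]; k (assumed parameter) sets the steepness."""
    # Centre on 0.5, then rescale so the endpoints land exactly on 0 and 1.
    return 0.5 + 0.5 * math.tanh(k * (x - 0.5)) / math.tanh(k / 2)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, round(tanh_contrast(x), 4))
```

Dividing by tanh(k / 2) is what pins f(0) = 0 and f(1) = 1, and the oddness of tanh gives the required symmetry f(x) = 1 - f(1 - x).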
Trivially you could simply threshold, but I imagine this is too dumb:
return i < 0.5 ? 0.0 : 1.0;
Since you mention 'increasing contrast' I assume the input values are luminance values. If so, and they are discrete (perhaps it's an 8-bit value), you could use a lookup table to do this quite quickly.
Your 'mulContrastize' looks reasonably quick. One optimization would be to use integer math. Let's say, again, your input values could actually be passed as an 8-bit unsigned value in [0..255]. (Again, possibly a fine assumption?) You could do something roughly like...
int mulContrastize(int i) {
    if (i < 128) return (i * i) >> 7;
    // The shift is really: * 2 / 256
    i = 255 - i;
    return 255 - ((i * i) >> 7);
}
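The lookup-table idea mentioned above can be sketched as follows, in Python for brevity and assuming 8-bit pixel values. The names `LUT` and `apply_contrast` are made up for this example.

```python
def mul_contrastize(i):
    """Integer version of mulContrastize for 8-bit input, mirroring the C snippet."""
    if i < 128:
        return (i * i) >> 7
    i = 255 - i
    return 255 - ((i * i) >> 7)

# Build the 256-entry table once...
LUT = [mul_contrastize(i) for i in range(256)]

def apply_contrast(pixels):
    """...then each pixel costs a single table lookup."""
    return bytes(LUT[p] for p in pixels)

print(list(apply_contrast(bytes([0, 64, 128, 192, 255]))))
```

With a table, the cost per pixel no longer depends on which contrast formula was used to fill it.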
A piecewise interpolation can be fast and flexible. It requires only a few decisions followed by a multiplication and addition, and can approximate any curve. It also avoids the coarseness that can be introduced by lookup tables (or the additional cost of two lookups followed by an interpolation to smooth this out), though the LUT might work perfectly fine for your case.
With just a few segments, you can get a pretty good match. Here there will be coarseness in the color gradients, which will be much harder to detect than coarseness in the absolute colors.
As Eamon Nerbonne points out in the comments, segmentation can be optimized by "choos[ing] your segmentation points based on something like the second derivative to maximize detail", that is, where the slope is changing the most. Clearly, in my posted example, having three segments in the middle of the five segment case doesn't add much more detail.
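To make this concrete, here is a minimal Python sketch of a piecewise-linear approximation, assuming evenly spaced breakpoints and using the cosine curve from the question as the target. `make_piecewise` is a made-up helper; a production version would hard-code the segments and replace the search with a couple of comparisons.

```python
import bisect
import math

def make_piecewise(func, breakpoints):
    """Precompute (slope, intercept) per segment of a piecewise-linear approximation."""
    segments = []
    for x0, x1 in zip(breakpoints, breakpoints[1:]):
        slope = (func(x1) - func(x0)) / (x1 - x0)
        segments.append((slope, func(x0) - slope * x0))

    def approx(x):
        # pick the segment containing x, then one multiply and one add
        k = min(max(bisect.bisect_right(breakpoints, x) - 1, 0), len(segments) - 1)
        slope, intercept = segments[k]
        return slope * x + intercept

    return approx

def cos_contrast(x):
    """The cosine-based contrast curve from the question."""
    return 0.5 - math.cos(x * math.pi) / 2

approx = make_piecewise(cos_contrast, [0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
worst = max(abs(approx(i / 1000) - cos_contrast(i / 1000)) for i in range(1001))
print(worst)
```

Moving the breakpoints toward the regions where the curve bends most, as suggested in the comment quoted above, would reduce the worst-case error for the same number of segments.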