Table and Sum function in Mathematica

I have a very simple question. I don't use Mathematica very often and I got stuck on one task. I need to compute this:
T=5;
y (* it represents 54 numbers*);
h = 2;
c (*starting at 3, see below*);
Table[Sum[(y[[i]]*((i - c)/h)*((i - c)/h)), {i, T}]/
Sum[((i - c)/h)*((i - c)/h), {i, T}], {c, 3, 54, 2}]
I need to compute the "sum…/sum…" expression 26 times, where "c" starts at 3 and increases by 2 at each step (3, 5, 7 and so on). I managed to implement this with the Table function.
The problem is that I also need "i" to run from 1 to 54, but each step should use just 5 numbers: the first step uses i=1,2,3,4,5; the second uses i=3,4,5,6,7; and so on. In the Sum function I set T to 5, so the first step is fine, but I have no idea how to create a loop where the ranges of "i" overlap like that. I hope someone can follow my explanation.

You could write T as c+2, but your table is too long for a y of 54 numbers, i.e.
z = Table[c, {c, 3, 54, 2}]
{3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53}
z + 2
{5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55}
So again, if you wrote T as c+2, (and minimum i as c-2) . . .
Table[Sum[(y[[i]]*((i - c)/h)*((i - c)/h)), {i, c - 2, c + 2}]/
Sum[((i - c)/h)*((i - c)/h), {i, c - 2, c + 2}], {c, 3, 54, 2}]
. . . you would need y to represent a list of 55 numbers, not 54.
For example, this works:
y = Array[RandomInteger[10] &, 55];
Table[Sum[(y[[i]]*((i - c)/h)*((i - c)/h)), {i, c - 2, c + 2}]/
Sum[((i - c)/h)*((i - c)/h), {i, c - 2, c + 2}], {c, 3, 54, 2}]
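For anyone who wants to check the windowing outside Mathematica, here is a minimal Python sketch of the same calculation. The helper name window_ratio, its half_width parameter, and the placeholder data are assumptions for illustration only; they are not part of the original post.

```python
def window_ratio(y, center, half_width=2, h=2):
    """Weighted average of y over center-half_width..center+half_width (1-based),
    using the weights ((i - center) / h) ** 2 from the Mathematica expression."""
    indices = range(center - half_width, center + half_width + 1)
    weights = [((i - center) / h) ** 2 for i in indices]
    values = [y[i - 1] for i in indices]  # shift to Python's 0-based indexing
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Placeholder data: 55 values, because the last window (center 53) reaches index 55.
y = list(range(1, 56))

# Centers 3, 5, ..., 53 give 26 overlapping windows of 5 elements each.
results = [window_ratio(y, c) for c in range(3, 54, 2)]
print(len(results), results[:3])
```

Because the weights are symmetric around the center, linear placeholder data like this simply returns the center value of each window, which makes it a convenient sanity check.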

Outliers in data

I have a dataset like this:
15643, 14087, 12020, 8402, 7875, 3250, 2688, 2654, 2501, 2482, 1246, 1214, 1171, 1165, 1048, 897, 849, 579, 382, 285, 222, 168, 115, 92, 71, 57, 56, 51, 47, 43, 40, 31, 29, 29, 29, 29, 28, 22, 20, 19, 18, 18, 17, 15, 14, 14, 12, 12, 11, 11, 10, 9, 9, 8, 8, 8, 8, 7, 6, 5, 5, 5, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
Based on domain knowledge, I know that the larger values are the only ones we want to include in our analysis. How do I determine where to cut off? Should we exclude 15 and lower, or 50 and lower, etc.?
You can check the distribution with a quantile function, then remove values below the 1st or 2nd percentile. For example:
import numpy as np
data = np.array(data)
print(np.quantile(data, (.01, .02)))
Another method is to calculate the interquartile range (IQR) and set the lower bound for the analysis to Q1 - 1.5*IQR:
Q1, Q3 = np.quantile(data, (0.25, 0.75))
data_floor = Q1 - 1.5 * (Q3 - Q1)
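As a rough sketch of how a quantile cutoff could then be applied, the snippet below keeps only the values at or above a chosen quantile. The small data array and the 0.25 level are placeholders chosen for illustration, not a recommendation for the dataset in the question.

```python
import numpy as np

# Placeholder data (an assumption for this sketch, not the dataset above).
data = np.array([1, 1, 2, 3, 5, 8, 13, 21, 34, 55])

# Pick a quantile level as the cutoff; 0.25 here is arbitrary.
cutoff = np.quantile(data, 0.25)

# Boolean masking keeps only the values at or above the cutoff.
kept = data[data >= cutoff]
print(cutoff, kept)
```

Whatever level is used, it is worth printing the cutoff and checking it against domain knowledge before dropping anything.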

This prime generating function using generateSequence in Kotlin is not easy to understand. :(

val primes = generateSequence(2 to generateSequence(3) {it + 2}) {
val currSeq = it.second.iterator()
val nextPrime = currSeq.next()
nextPrime to currSeq.asSequence().filter { it % nextPrime != 0}
}.map {it.first}
println(primes.take(10).toList()) // prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
I tried to work out how this function works, but it is not easy for me.
Could someone explain it? Thanks.
It generates an infinite sequence of primes using the "Sieve of Eratosthenes" (see here: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes).
This implementation uses a sequence of pairs to do this. The first element of every pair is the current prime, and the second element is a sequence of the integers larger than that prime that are not divisible by it or by any previous prime.
It starts with the pair 2 to [3, 5, 7, 9, 11, 13, 15, 17, ...], which is given by 2 to generateSequence(3) { it + 2 }.
Using this pair, we create the next pair of the sequence by taking the first element of the sequence (which is now 3), and then removing all numbers divisible by 3 from the sequence (removing 9, 15, 21 and so on). This gives us this pair: 3 to [5, 7, 11, 13, 17, ...]. Repeating this pattern will give us all primes.
After creating a sequence of pairs like this, we are finally doing .map { it.first } to pick only the actual primes, and not the inner sequences.
The sequence of pairs will evolve like this:
2 to [3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, ...]
3 to [5, 7, 11, 13, 17, 19, 23, 25, 29, ...]
5 to [7, 11, 13, 17, 19, 23, 29, ...]
7 to [11, 13, 17, 19, 23, 29, ...]
11 to [13, 17, 19, 23, 29, ...]
13 to [17, 19, 23, 29, ...]
// and so on
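The same lazy filtering pattern can be sketched in Python with a generator, which may make the evolution above easier to trace. This is only an illustrative analogue of the Kotlin code, not a translation anyone in the thread provided; the function name primes is an assumption.

```python
from itertools import count, islice

def primes():
    """Yield primes lazily by stacking one filter per prime found so far."""
    yield 2
    candidates = count(3, 2)  # the odd numbers 3, 5, 7, 9, ...
    while True:
        prime = next(candidates)  # the head of the remaining sequence is prime
        yield prime
        # Wrap the remaining candidates in a filter that drops multiples of
        # this prime. The default argument p=prime freezes the current value;
        # a plain closure would see later values of `prime`.
        candidates = filter(lambda n, p=prime: n % p != 0, candidates)

first_ten = list(islice(primes(), 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Each loop iteration corresponds to one pair in the Kotlin sequence: the yielded prime is the first element, and candidates is the filtered remainder.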

Appending numpy arrays using numpy.insert

I have a numpy array (inputs) of shape (30,1). I want to insert a 31st value (e.g. x = 2). I am trying to use the np.insert function, but it gives me an out-of-bounds error.
np.insert(inputs,b+1,x)
IndexError: index 31 is out of bounds for axis 0 with size 30
Short answer: you need to insert it at index b, not b+1.
The index you pass to np.insert(..) is the position at which the element should be added. Indexes are zero-based, so an array with 30 elements has 29 as its last index, and inserting at index 30 places the new element last. For example:
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29])
>>> np.insert(a,30,42)
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 42])
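One detail worth checking for a (30,1) array like the one in the question: without an axis argument, np.insert works on a flattened copy, so the result is one-dimensional. A minimal sketch, assuming a zero-filled placeholder for inputs, of how the column shape could be kept:

```python
import numpy as np

inputs = np.zeros((30, 1))  # placeholder standing in for the asker's array
x = 2

# Without axis, the array is flattened first, so the result has shape (31,).
flat = np.insert(inputs, 30, x)

# With axis=0, a new row is inserted and the column shape is preserved.
column = np.insert(inputs, 30, x, axis=0)

# np.append along axis 0 gives the same result when adding at the end.
appended = np.append(inputs, [[x]], axis=0)

print(flat.shape, column.shape, appended.shape)
```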

MultiPoint crossover using Numpy

I am trying to do crossover on a Genetic Algorithm population using numpy.
I have sliced the population into parent 1 and parent 2.
population = np.random.randint(2, size=(4,8))
p1 = population[::2]
p2 = population[1::2]
But I am not able to figure out any lambda or numpy command to do a multi-point crossover over parents.
The concept is to take the ith row of p1 and randomly swap some bits with the ith row of p2.
I think you want to select from p1 and p2 at random, cell by cell.
To make it easier to follow, I've changed p1 to hold values from 10 to 15 and p2 to hold values from 20 to 25; both were generated at random in these ranges.
In [66]: p1
Out[66]:
array([[15, 15, 13, 14, 12, 13, 12, 12],
[14, 11, 11, 10, 12, 12, 10, 12],
[12, 11, 14, 15, 14, 10, 13, 10],
[11, 12, 10, 13, 14, 13, 12, 13]])
In [67]: p2
Out[67]:
array([[23, 25, 24, 21, 24, 20, 24, 25],
[21, 21, 20, 20, 25, 22, 24, 22],
[24, 22, 25, 20, 21, 22, 21, 22],
[22, 20, 21, 22, 25, 23, 22, 21]])
In [68]: sieve=np.random.randint(2, size=(4,8))
In [69]: sieve
Out[69]:
array([[0, 1, 0, 1, 1, 0, 1, 0],
[1, 1, 1, 0, 0, 1, 1, 1],
[0, 1, 1, 0, 0, 1, 1, 0],
[0, 0, 0, 1, 1, 1, 1, 1]])
In [70]: not_sieve=sieve^1 # Complement of sieve
In [71]: pn = p1*sieve + p2*not_sieve
In [72]: pn
Out[72]:
array([[23, 15, 24, 14, 12, 20, 12, 25],
[14, 11, 11, 20, 25, 12, 10, 12],
[24, 11, 14, 20, 21, 10, 13, 22],
[22, 20, 21, 13, 14, 13, 12, 13]])
The numbers in the teens come from p1, where sieve is 1.
The numbers in the twenties come from p2, where sieve is 0.
This could probably be made more efficient, but is this the output you expect?
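One possible alternative to the multiply-and-add step is np.where, which picks from one array or the other according to a mask. The sketch below reuses the first rows of p1, p2 and sieve shown above; the variable name child is an assumption for illustration.

```python
import numpy as np

# First rows of p1, p2 and sieve from the session above.
p1 = np.array([[15, 15, 13, 14, 12, 13, 12, 12]])
p2 = np.array([[23, 25, 24, 21, 24, 20, 24, 25]])
sieve = np.array([[0, 1, 0, 1, 1, 0, 1, 0]])

# Take the gene from p1 where sieve is 1, otherwise from p2.
child = np.where(sieve == 1, p1, p2)
print(child)  # [[23 15 24 14 12 20 12 25]]
```

This gives the same result as p1*sieve + p2*not_sieve, and it also works for values that cannot be multiplied, such as strings.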

Why is there an extremely large value at 0 frequency in the FFT (numpy.fft.fft method)?

I have a signal ts with a mean of roughly 40, and I applied an FFT to it with this code:
from numpy import array
from numpy.fft import fft, fftfreq, fftshift
ts = array([25, 40, 30, 40, 29, 48, 36, 32, 34, 38, 15, 33, 40, 32, 41, 25, 37, 49, 41, 35, 23, 22, 36, 44, 28, 36, 32, 37, 39, 51])
index = fftshift(fftfreq(len(ts)))
ft_ts = fftshift(fft(ts))
output
ft_ts = array([ -76.00000000 +8.34887715e-14j, -57.72501110 +1.17054586e+01j,
7.69492662 +9.79582336e+00j, -29.11145618 -7.22493645e+00j,
14.92140414 +4.58471353e+01j, -26.00000000 -4.67653718e+01j,
-39.61803399 -2.83601821e+01j, -11.34044003 +8.66215368e+00j,
23.68703939 +1.57391882e+01j, -64.88854382 -2.44499549e+01j,
50.00000000 -3.98371686e+01j, 4.09382150 -6.27663403e+00j,
-37.38196601 -3.06708342e+01j, 35.97162964 +1.31929223e+01j,
18.69662985 -2.20453671e+00j, 1048.00000000 +0.00000000e+00j,
18.69662985 +2.20453671e+00j, 35.97162964 -1.31929223e+01j,
-37.38196601 +3.06708342e+01j, 4.09382150 +6.27663403e+00j,
50.00000000 +3.98371686e+01j, -64.88854382 +2.44499549e+01j,
23.68703939 -1.57391882e+01j, -11.34044003 -8.66215368e+00j,
-39.61803399 +2.83601821e+01j, -26.00000000 +4.67653718e+01j,
14.92140414 -4.58471353e+01j, -29.11145618 +7.22493645e+00j,
7.69492662 -9.79582336e+00j, -57.72501110 -1.17054586e+01j])
At 0 frequency, ft_ts has a value of 1048. Shouldn't that be the mean of my original signal ts, which is roughly 40? What happened here?
Many thanks
The FFT is not normalized, so the first term should be the sum, not the mean.
For example, look at the definition of the discrete Fourier transform, X_k = sum over n of x_n * exp(-2*pi*i*k*n/N).
You can see that when k=0 the exponential term is 1, so you just get the sum of x_n.
This is why the first item in fft(np.ones(10)) is 10, not 1. 1 is the mean (since it's an array of ones), and 10 is the sum.
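A quick way to confirm this on the signal from the question is to compare the zero-frequency term with the sum, then divide by the length to recover the mean. This is a minimal sketch; the names dc and mean_from_fft are assumptions for illustration.

```python
import numpy as np

ts = np.array([25, 40, 30, 40, 29, 48, 36, 32, 34, 38, 15, 33, 40, 32, 41,
               25, 37, 49, 41, 35, 23, 22, 36, 44, 28, 36, 32, 37, 39, 51])

spectrum = np.fft.fft(ts)

# The k=0 term of the unnormalized FFT is the sum of the samples.
dc = spectrum[0].real

# Dividing by the number of samples recovers the mean.
mean_from_fft = dc / len(ts)

print(dc, ts.sum())              # both are 1048 (up to floating point)
print(mean_from_fft, ts.mean())
```

Note that the actual mean of this particular signal comes out a little under 35 rather than 40, which matches 1048 / 30.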