optimise distance calculation in matlab - optimization

I am a newbie with Matlab and I have the following scenario (which is part of a larger problem).
Matrix A is 4754x1024 and matrix B is 6800x1024.
For every row in matrix A I need to calculate the Euclidean distance to every row in matrix B. I am using the following technique to calculate the distance, but I find it very inefficient and time-consuming in Matlab.
for i=1:row_A
    for j=1:row_B
        %calculate distance
    end
end
Any suggestions to optimise this? The final step involves performing this operation on 50 such sets of A and B.
I'm not sure what your code is actually doing.
Assuming your data has the following properties
assert(size(A,2) == size(B,2))
d = zeros(size(A,1), size(B,1));
for i = 1:size(A,1)
    % subtract row i of A from every row of B, then take the row-wise norm
    d(i,:) = sqrt(sum(bsxfun(@minus, B, A(i,:)).^2, 2));
end
Or possibly better organised by columns (See "Store and Access Data in Columns" in http://www.mathworks.co.uk/company/newsletters/news_notes/june07/patterns.html):
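At = A.'; Bt = B.';
d = zeros(size(At,2), size(Bt,2));
for i = 1:size(At,2)
    % column-wise counterpart of the loop above (body reconstructed from context)
    d(i,:) = sqrt(sum(bsxfun(@minus, Bt, At(:,i)).^2, 1));
end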
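For readers who work in NumPy rather than Matlab, the row-at-a-time approach in this answer might look roughly like the sketch below. NumPy broadcasting plays the role of bsxfun here. The function name pairwise_euclidean and the small demo matrices are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def pairwise_euclidean(A, B):
    """Distance from every row of A to every row of B (one loop over A)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    assert A.shape[1] == B.shape[1], "A and B need the same number of columns"
    d = np.empty((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        # Broadcasting subtracts row i of A from every row of B at once.
        d[i, :] = np.sqrt(np.sum((B - A[i]) ** 2, axis=1))
    return d

# Small stand-ins for the 4754x1024 and 6800x1024 matrices in the question.
rng = np.random.default_rng(0)
A_demo = rng.standard_normal((4, 6))
B_demo = rng.standard_normal((5, 6))
d_demo = pairwise_euclidean(A_demo, B_demo)
```

Only the outer loop remains; the inner loop over rows of B is replaced by a single array operation per iteration.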
numpy matmul is very very slow

I'm trying to implement the Gradient descent method for solving $Ax = b$ for a positive definite symmetric matrix $A$ of size about $9600 \times 9600$. I thought my code was relatively simple
import numpy as np

# Solves the problem Ax = b for x within epsilon tolerance or until MAX_ITERATION is reached
def GradientDescent(Amat, target, epsilon=.01, MAX_ITERATION=100, x=np.zeros(9604)):
    CurrentRes = target - np.matmul(Amat, x)
    count = 0
    while np.linalg.norm(CurrentRes) > epsilon and count < MAX_ITERATION:
        Ar = np.matmul(Amat, CurrentRes)
        alpha = CurrentRes.T.dot(CurrentRes) / CurrentRes.T.dot(Ar)
        x = x + alpha * CurrentRes
        Ax = np.matmul(Amat, x)
        CurrentRes = target - Ax
        count = count + 1
    return x
#A is square matrix about 9600x9600 and b is about 9600x1
GDSum = GradientDescent(A,b)
but the above takes almost 3 minutes to run a single iteration of the main while loop.
I didn't think that $9600 \times 9600$ was too big for NumPy to handle effectively, but even the step of computing alpha which is just the quotient of two dot products is taking over 30 seconds.
I tried error-testing the code by timing each action in the while loop, and they are all running much slower than expected. A single matrix multiplication is taking almost a minute. The steps involving vector addition or subtraction at least seem to be running quickly.
Perhaps the most relevant bit of information is missing.
Your function is fast when A and b are NumPy arrays, but it is terribly slow when they are Python lists, because every call to np.matmul then has to convert the list into an array first.
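A minimal sketch of that fix, assuming the inputs might arrive as nested Python lists: convert them once with np.asarray before the loop starts. The names gradient_descent, A_small, and b_small are illustrative and do not appear in the question, and the small test system stands in for the 9600x9600 one.

```python
import numpy as np

def gradient_descent(A, b, epsilon=1e-2, max_iteration=100, x0=None):
    """Steepest descent for Ax = b with A symmetric positive definite."""
    # Convert once up front, so the loop never works on Python lists.
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).ravel()
    x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float).ravel()
    r = b - A @ x                        # current residual
    for _ in range(max_iteration):
        if np.linalg.norm(r) <= epsilon:
            break
        Ar = A @ r
        alpha = (r @ r) / (r @ Ar)       # exact line search along r
        x = x + alpha * r
        r = b - A @ x
    return x

# Small symmetric positive definite system, passed in as lists on purpose.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A_small = M @ M.T + 50 * np.eye(50)
b_small = rng.standard_normal(50)
x_small = gradient_descent(A_small.tolist(), b_small.tolist(),
                           epsilon=1e-8, max_iteration=10_000)
```

Checking type(A) and type(b) before calling the function is a quick way to confirm whether this is what is slowing the original code down.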
Optimizing PowerBI query for creating a "to date" maximum

I have a time series data set that is almost monotonically increasing. The values of the series dip every now and then, but it generally increases. The dips are due to errors in a sensor reading: either previous values are too high or later values are too low. My goal is to apply a pre-processing transformation to get a more stable signal. Here is an image for reference:
[Image: pre-processed vs. raw signal]
I can do this relatively easily in Python, but I'm having a difficult time doing this efficiently in Power BI. Here is an example table of raw and processed data to show what I'm hoping to do:
[Image: data table]
Here is the DAX code that I've tried to apply to create the ProcessedValue column:
ProcessedValue =
VAR CurrentIndex = Query1[Index]
RETURN
    CALCULATE (
        MAX ( Query1[Value] ),
        ALL ( Query1 ),
        Query1[Index] < CurrentIndex
            && Query1[Index] > CurrentIndex - 50
    )
The two issues that I'm running into are 1) I run out of memory and 2) I don't know if the code is even doing what I intend it to do (because it doesn't complete). The table has ~1M data points. I'm fairly new to Power BI and I'm not aware of the tips and tricks needed to do fast calculations, so any help would be greatly appreciated. For reference, here is the code that I'm using in Python:
def backward_max(a, axis=0):
    # for each position i, the maximum of the values before it (a[0] for i = 0)
    bmax = np.array([np.max(a[: max(1, i)], axis=axis) for i in range(len(a))])
    return bmax
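As an aside on the Python reference code: the list comprehension recomputes a maximum over a growing prefix at every step, which is quadratic in the number of points. A linear-time sketch using np.maximum.accumulate is shown below. The function names running_max and backward_max_fast are illustrative, and this is NumPy only, so it says nothing about how to express the same thing in DAX.

```python
import numpy as np

def running_max(a):
    """Maximum of all values up to and including each position."""
    return np.maximum.accumulate(np.asarray(a), axis=0)

def backward_max_fast(a):
    """Same output as backward_max above: the maximum of strictly earlier
    values, with the first element kept as-is."""
    a = np.asarray(a)
    cummax = np.maximum.accumulate(a, axis=0)
    # Shift the running maximum one step to the right.
    return np.concatenate([a[:1], cummax[:-1]], axis=0)

raw = np.array([1, 3, 2, 5, 4])
print(running_max(raw))        # [1 3 3 5 5]
print(backward_max_fast(raw))  # [1 1 3 3 5]
```

Note that this is an unbounded "to date" maximum, whereas the DAX attempt above restricts the lookback to a window of 50 rows.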
Singular Value Decomposition algorithm not working

I wrote the following function to perform SVD according to page 45 of 'the deep learning book' by Ian Goodfellow and co.
def SVD(A):
    AT = np.transpose(A)
    AAT = A.dot(AT)
    ATA = AT.dot(A)
    # Left singular vectors
    LSV = np.linalg.eig(AAT)[1]
    U = LSV  # some values of U have the wrong sign
    # Right singular vectors
    RSV = np.linalg.eig(ATA)[1]
    V = RSV
    V[:,0] = V[:,0]  # V isn't arranged properly
    values = np.sqrt(np.linalg.eig(ATA)[0])
    # descending order
    values = np.sort(values)[::-1]
    rows = A.shape[0]
    columns = A.shape[1]
    D = np.zeros((rows, columns))
    # assumed reconstruction: put the singular values on the diagonal of D
    np.fill_diagonal(D, values[:min(rows, columns)])
    return U, D, V
However, for any given matrix the results are not the same as using NumPy's built-in SVD, and I have no idea why.
I tested my algorithm by saying
abs(UDV^T - A) < 0.0001
to check if it decomposed properly and it hasn't. The problem seems to lie with the V and U components but I can't see what's going wrong. D seems to be correct.
If anyone can see the problem it would be much appreciated.
I think you have a problem with the order of the eigenpairs that eig(ATA) and eig(AAT) return. The documentation of np.linalg.eig states that no order is guaranteed. Replacing eig with eigh, which returns the eigenpairs in ascending order, should help. Also, don't rearrange the values separately from their eigenvectors.
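One way to act on this is sketched below. It goes a step beyond the answer: instead of running two independent eigendecompositions, it takes V from eigh(A^T A), reorders eigenvalues and eigenvectors together, and derives U from A and V so that the signs of the two factors agree. It assumes A has full column rank, so no singular value is zero, and it returns the thin form with the singular values as a vector. The name svd_via_eigh is illustrative.

```python
import numpy as np

def svd_via_eigh(A):
    """Thin SVD of an m x n matrix with full column rank, via eigh(A^T A)."""
    A = np.asarray(A, dtype=float)
    evals, V = np.linalg.eigh(A.T @ A)   # eigenvalues come back ascending
    order = np.argsort(evals)[::-1]      # descending, applied to both
    evals, V = evals[order], V[:, order]
    s = np.sqrt(np.clip(evals, 0.0, None))
    U = (A @ V) / s                      # u_i = A v_i / s_i keeps signs consistent
    return U, s, V

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
U, s, V = svd_via_eigh(A)
reconstruction = U @ np.diag(s) @ V.T
```

Comparing reconstruction with A, rather than comparing U and V entry by entry with np.linalg.svd, is the more meaningful check, because singular vectors are only determined up to sign.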
What is faster to calculate: Large looped calculations or Vlookup on multiple large data grids?

This is a conceptual question that will help me before I start coding my next project. Which approach do you think will be faster?
Loop Calculations: An example of how it would be set up:
For i = 0 To 67
    While x < 350
        y = 0
        While y < 600
            Call solved() 'solves and returns "concentration"
            If c(y, x) <> Empty = True Then
                c(y, x) = c(row, x) + concentration
            Else
                c(y, x) = concentration
            End If
            y = y + 1
        Wend
        x = x + 1
    Wend
Next i
Using Matlab I can generate millions of solved data points and store them into a matrix. This can be stored in a database.
Restrictions: Only Access is available as a database, and with the amount of data that needs to be stored it will hit its memory limit. Excel is not a good place to store this much data either.
Taking the restrictions into account, I have thought of using multiple text files to store the data and using Excel to search and pull values.
Is it possible to optimize this Matlab code for doing vector quantization with centroids from k-means?

I've created a codebook using k-means of size 4000x300 (4000 centroids, each with 300 features). Using the codebook, I then want to label an input vector (for purposes of binning later on). The input vector is of size Nx300, where N is the total number of input instances I receive.
To compute the labels, I calculate the closest centroid for each of the input vectors. To do so, I compare each input vector against all centroids and pick the centroid with the minimum distance. The label is then just the index of that centroid.
My current Matlab code looks like:
function labels = assign_labels(centroids, X)
    labels = zeros(size(X, 1), 1);
    % for each X, calculate the distance from each centroid
    for i = 1:size(X, 1)
        % distance of X_i from all j centroids is: sum((X_i - centroid_j)^2)
        % note: we leave off the sqrt as an optimization
        distances = sum(bsxfun(@minus, centroids, X(i, :)) .^ 2, 2);
        [value, label] = min(distances);
        labels(i) = label;
    end
end
However, this code is still fairly slow (for my purposes), and I was hoping there might be a way to optimize the code further.
One obvious issue is that there is a for-loop, which is the bane of good performance in Matlab. I've been trying to come up with a way to get rid of it, but with no luck (I looked into using arrayfun in conjunction with bsxfun, but haven't gotten that to work). Alternatively, if someone knows of any other way to speed this up, I would greatly appreciate it.
After doing some searching, I couldn't find a great solution using Matlab, so I decided to look at what is used in Python's scikits.learn package for 'euclidean_distance' (shortened):
XX = sum(X * X, axis=1)[:, newaxis]
YY = Y.copy()
YY **= 2
YY = sum(YY, axis=1)[newaxis, :]
distances = XX + YY
distances -= 2 * dot(X, Y.T)
distances = maximum(distances, 0)
which uses the binomial form of the euclidean distance ((x-y)^2 -> x^2 + y^2 - 2xy), which from what I've read usually runs faster. My completely untested Matlab translation is:
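XX = sum(data .* data, 2);       % N x 1 squared norms of the inputs
YY = sum(center .^ 2, 2)';       % 1 x K squared norms of the centroids
% the nearest centroid has the smallest squared distance in each row
[~, labels] = min(bsxfun(@plus, XX, YY) - 2*data*center', [], 2);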
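To make the expanded form concrete, here is a minimal NumPy sketch of the whole labelling step, following the scikits.learn excerpt above and adding an argmin over centroids. The function name assign_labels mirrors the Matlab function in the question. The returned labels are 0-based, unlike Matlab's 1-based indices, and the demo sizes are small stand-ins for the 4000x300 codebook.

```python
import numpy as np

def assign_labels(centroids, X):
    """Index of the nearest centroid for each row of X, using
    ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x.c (no explicit loop)."""
    centroids = np.asarray(centroids, dtype=float)
    X = np.asarray(X, dtype=float)
    xx = np.sum(X * X, axis=1)[:, np.newaxis]                  # shape (N, 1)
    cc = np.sum(centroids * centroids, axis=1)[np.newaxis, :]  # shape (1, K)
    sq_dist = xx + cc - 2.0 * (X @ centroids.T)                # shape (N, K)
    np.maximum(sq_dist, 0.0, out=sq_dist)  # clamp tiny negatives from rounding
    return np.argmin(sq_dist, axis=1)

rng = np.random.default_rng(0)
centroids_demo = rng.standard_normal((7, 5))
X_demo = rng.standard_normal((20, 5))
labels_demo = assign_labels(centroids_demo, X_demo)
```

The N x K distance matrix is held in memory all at once, so for a very large N it may be worth processing X in chunks of rows.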
Use the following function to calculate your distances. You should see an order-of-magnitude speed-up.
The two matrices A and B have the dimensions as columns and one point per row.
A is your matrix of centroids. B is your matrix of datapoints.
function D=getSim(A,B)
You can vectorize it by converting to cells and using cellfun:
We assign each row of X to its own cell in the second line
This piece @(x)(sum(bsxfun(@minus,centroids,x).^2,2)) is an anonymous function which is the same as your distances=... line, and using cellfun, we apply it to each row of X.
The labels are then the indices of the minimum row along each column.
For a true matrix implementation, you may consider trying something along the lines of:
P2 = kron(centroids, ones(size(X,1),1));
Q2 = kron(ones(size(centroids,1),1), X);
distances = reshape(sum((Q2-P2).^2,2), size(X,1), size(centroids,1));
This assumes the data is organized as [x1 y1 ...; x2 y2 ...;...]
You can use a more efficient algorithm for nearest neighbor search than brute force.
The most popular approach is the k-d tree, which gives O(log(n)) average query time instead of the O(n) complexity of brute force.
Regarding a Matlab implementation of k-d trees, you can have a look here