How to minimize an objective function containing Frobenius and nuclear norms?

I want to minimize ||Xs - Ys·Ds||_F² + lamda1·||Ds||_* (the squared Frobenius norm of the residual plus a nuclear-norm penalty on Ds), subject to the constraint that the square of the Frobenius norm of the Ds matrix has to be less than or equal to 1.
Currently I am using the CVXPY library to solve this. My code sample looks like this:
import cvxpy as cp
import numpy as np
np.random.seed(1)
Xs = np.random.randn(100, 4096)
Ys = np.random.randn(100, 300)
# Define and solve the CVXPY problem.
Ds = cp.Variable(shape=(300, 4096))
lamda1 = 1
obj = cp.Minimize(cp.square(cp.norm(Xs - (Ys * Ds), "fro")) + lamda1 * cp.norm(Ds, "nuc"))
constraints = [cp.square(cp.norm(Ds, "fro")) <= 1]
prob = cp.Problem(obj, constraints)
prob.solve(solver=cp.SCS, verbose=True)
The console gives an error that
Solver 'SCS' failed. Try another solver or solve with verbose=True for more information. Try re-centering the problem data around 0 and re-scaling to reduce the dynamic range.
I even tried experimenting with different solvers such as cp.ECOS, but they do not optimize the function either.
Any suggestions?
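One reformulation worth trying (a sketch, not a guaranteed fix: cp.sum_squares is equivalent to squaring the Frobenius norm but is often friendlier to conic solvers, and the SCS error message itself suggests re-scaling the data):
import cvxpy as cp
import numpy as np
np.random.seed(1)
# scale the data down to reduce the dynamic range, as the SCS message suggests
Xs = np.random.randn(100, 4096) / 100
Ys = np.random.randn(100, 300) / 100
Ds = cp.Variable(shape=(300, 4096))
lamda1 = 1
# sum_squares(A) == square(norm(A, "fro")) for a matrix argument
obj = cp.Minimize(cp.sum_squares(Xs - Ys @ Ds) + lamda1 * cp.norm(Ds, "nuc"))
constraints = [cp.sum_squares(Ds) <= 1]
prob = cp.Problem(obj, constraints)
prob.solve(solver=cp.SCS, verbose=True)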


Why does numpy and pytorch give different results after mean and variance normalization?

I am working on a problem in which a matrix has to be mean-variance normalized row-wise. It is also required that the normalization is applied after splitting each row into tiny batches.
The code seems to work with NumPy, but fails with PyTorch (which is required for training).
It seems the PyTorch and NumPy results differ. Any help will be greatly appreciated.
Example code:
import numpy as np
import torch
def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise Exception('Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    s = x.std(1).reshape(-1, 1)
    n = (x - m) / (eps + s)
    n = n.reshape(-1, nc)
    return n
# numpy
a = np.float32(np.random.randn(8, 8))
n1 = normalize(a, 4)
# torch
b = torch.tensor(a)
n2 = normalize(b, 4)
n2 = n2.numpy()
print(abs(n1-n2).max())
In the first example you are calling normalize with a, a numpy.ndarray, while in the second you call normalize with b, a torch.Tensor.
According to the documentation page of torch.std, Bessel’s correction is used by default to measure the standard deviation. As such, the default behaviors of numpy.ndarray.std and torch.Tensor.std are different.
If unbiased is True, Bessel’s correction will be used. Otherwise, the sample deviation is calculated, without any correction.
torch.std(input, dim, unbiased, keepdim=False, *, out=None) → Tensor
Parameters
input (Tensor) – the input tensor.
unbiased (bool) – whether to use Bessel’s correction (δN = 1).
You can try yourself:
>>> a.std(), b.std(unbiased=True), b.std(unbiased=False)
(0.8364538, tensor(0.8942), tensor(0.8365))
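If you want the two frameworks to agree, a minimal sketch of a fix (my adaptation, assuming you want NumPy's default biased estimator in both cases) is to disable Bessel's correction for tensors inside normalize:
import numpy as np
import torch
def normalize(x, bsize, eps=1e-6):
    nc = x.shape[1]
    if nc % bsize != 0:
        raise Exception('Number of columns must be a multiple of bsize')
    x = x.reshape(-1, bsize)
    m = x.mean(1).reshape(-1, 1)
    # match numpy's default (biased) estimator when x is a torch.Tensor
    if isinstance(x, torch.Tensor):
        s = x.std(1, unbiased=False).reshape(-1, 1)
    else:
        s = x.std(1).reshape(-1, 1)
    return ((x - m) / (eps + s)).reshape(-1, nc)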

How to use df.rolling(window, min_periods, win_type='exponential').sum()

I would like to calculate the rolling exponentially weighted mean with df.rolling().mean(), but I get stuck at win_type='exponential'.
I have tried other win_types such as 'gaussian'; I think 'exponential' must work a little differently.
dfTemp.rolling(window=21, min_periods=10, win_type='gaussian').mean(std=1)
# works fine
but when it comes to 'exponential',
dfTemp.rolling(window=21, min_periods=10, win_type='exponential').mean(tau=10)
# ValueError: The 'exponential' window needs one or more parameters -- pass a tuple.
How do I use win_type='exponential'? Thanks~~~
I faced the same issue and asked about it on the Russian SO, where I got the following answer:
x.rolling(window=(2,10), min_periods=1, win_type='exponential').mean(std=0.1)
You should pass the tau value directly in the window=(2, 10) parameter, where 10 is the value for tau.
I hope it will help! Thanks to @MaxU.
You can easily implement any kind of window by defining your kernel function.
Here's an example for a backward-looking exponential average:
import pandas as pd
import numpy as np

# Kernel function (backward-looking exponential)
def K(x):
    return np.exp(-np.abs(x)) * np.where(x <= 0, 1, 0)

# Exponential average function
def exp_average(values):
    N = len(values)
    exp_weights = list(map(K, np.arange(-N, 0) / N))
    return values.dot(exp_weights) / N

# Create a sample DataFrame
df = pd.DataFrame({
    'date': [pd.Timestamp(2020, 1, 1)] * 50 + [pd.Timestamp(2020, 1, 2)] * 50,
    'x': np.random.randn(100)
})

# Finally, compute the exponential moving average using `rolling` and `apply`
df['mu'] = df.groupby(['date'])['x'].rolling(5).apply(exp_average, raw=True).values
df.head(10)
Notice that, if N is fixed, you can significantly reduce the execution time by keeping the weights constant:
N = 10
exp_weights = list(map(K, np.arange(-N, 0) / N))

def exp_average(values):
    return values.dot(exp_weights) / N
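With the weights precomputed, the call is the same as before, as long as the rolling window size matches N:
df['mu'] = df.groupby(['date'])['x'].rolling(N).apply(exp_average, raw=True).values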
Short answer: you should pass tau to the applied function, e.g. rolling(d, win_type='exponential').sum(tau=10). Note that the mean function does not respect the exponential window as expected, so you may need to use sum(tau=10) / window_size to calculate the exponential mean. This is a bug in the current version of pandas (1.0.5).
Full example:
# To calculate the rolling exponential mean
import numpy as np
import pandas as pd
window_size = 10
tau = 5
a = pd.Series(np.random.rand(100))
rolling_mean_a = a.rolling(window_size, win_type='exponential').sum(tau=tau) / window_size
The answer of @Илья Митусов is not correct. With pandas 1.0.5, running the following code raises ValueError: exponential window requires tau:
import pandas as pd
import numpy as np
pd.Series(np.arange(10)).rolling(window=(4, 10), min_periods=1, win_type='exponential').mean(std=0.1)
This code has many problems. First, the 10 in window=(4, 10) is not tau, and will lead to wrong answers. Second, the exponential window does not need the parameter std -- only the gaussian window does. Last, tau should be provided to mean (although mean does not respect the win_type).
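If you want to see which weights are actually applied, here is a small sketch under the assumption (per the pandas docs) that win_type windows are built from scipy.signal, so win_type='exponential' with sum(tau=...) corresponds to scipy.signal.windows.exponential:
import numpy as np
import pandas as pd
from scipy.signal.windows import exponential
window_size, tau = 10, 5
a = pd.Series(np.random.rand(100))
# the symmetric exponential window pandas should be using
w = exponential(window_size, tau=tau)
rolled = a.rolling(window_size, win_type='exponential').sum(tau=tau)
manual = a.rolling(window_size).apply(lambda v: (v * w).sum(), raw=True)
print(np.allclose(rolled.dropna(), manual.dropna()))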

tensorflow logistic regression matrix

Hello, I'm new to TensorFlow and I'm getting a feel for it. I was given a task to multiply these 4 matrices, which I was able to do. Now I'm being asked to take the (16,4) output from the multiplication of the (16,8) and (8,4) matrices and apply a logistic function to all of its entries, then multiply this new (16,4) matrix by the (4,2) matrix, take those (16,2) outputs and apply a logistic function to them, and finally multiply the resulting (16,2) matrix by the (2,1) matrix. I'm supposed to be able to do all this with matrix manipulation.
I'm kind of confused about how to go about it, because I only sort of understand linear regression; I know they are similar, but I wouldn't know how to apply it here. Any tips please. No, I'm not asking for someone to finish it for me; I just would like a better example than what I was given, because I can't figure out how to apply a logistic function to a matrix. This is what I have so far:
import tensorflow as ts
import numpy as np
import os

# the warning messages were getting annoying,
# so avoid warnings about compilation
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

# the four matrices we were asked to multiply,
# filled with random numbers
m1 = np.random.rand(16, 8)
m2 = np.random.rand(8, 4)
m3 = np.random.rand(4, 2)
m4 = np.random.rand(2, 1)

# using matmul to multiply (could use @ or dot(), but using tensorflow)
c = ts.matmul(m1, m2)
d = ts.matmul(c, m3)
e = ts.matmul(d, m4)

# attempting to create log regression
arf = ts.Variable(m1, name="ARF")

with ts.Session() as s:
    r1 = s.run(c)
    print("M1 * M2: \n", r1)
    r2 = s.run(d)
    print("Result of C * M3: \n", r2)
    r3 = s.run(e)
    print("Result of D * M4: \n", r3)
    # learned I can't reshape just that easily
    # r4 = ts.reshape(m1, (16, 4))
    # print("Result of New M1: \n", r4)
I think you have the right idea. The logistic function is just 1 / (1 + exp(-z)) where z is the matrix you want to apply it to. With that in mind you can simply do:
logistic = 1 / (1 + ts.exp(-c))
This will apply the formula element-wise to the input. The result is that this:
lg = s.run(logistic)
print("Result of logistic function \n ", lg)
…will print a matrix the same size as c (16,4), where all values are between 0 and 1. You can then go on to the rest of the multiplications the assignment is asking for.
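For what it's worth, here is a sketch of the full chain the assignment describes, using ts.sigmoid (which computes the same 1 / (1 + exp(-z)) element-wise) together with your existing m1..m4 and session setup:
act1 = ts.sigmoid(ts.matmul(m1, m2))    # logistic of the (16, 4) product
act2 = ts.sigmoid(ts.matmul(act1, m3))  # logistic of the (16, 2) product
out = ts.matmul(act2, m4)               # final (16, 1) result
with ts.Session() as s:
    print(s.run(out))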

Sampling from posterior using custom likelihood in pymc3

I'm trying to create a custom likelihood using pymc3. The distribution is the generalized extreme value (GEV) distribution, which has location (loc), scale (scale) and shape (c) parameters.
The main idea is to choose a beta distribution as a prior for the shape parameter and to fix the location and scale parameters in the GEV likelihood.
The GEV distribution is not among the pymc3 standard distributions, so I have to create a custom likelihood. I googled it and found out that I should use the DensityDist method, but I don't know why my attempt is incorrect.
See the code below:
import pymc3 as pm
import numpy as np
from theano.tensor import exp

data = np.random.randn(20)

with pm.Model() as model:
    c = pm.Beta('c', alpha=6, beta=9)
    loc = 1
    scale = 2
    gev = pm.DensityDist('gev', lambda value: exp(-1 + c * (((value - loc) / scale) ^ (1 / c))), testval=1)
    modelo = pm.gev(loc=loc, scale=scale, c=c, observed=data)
    step = pm.Metropolis()
    trace = pm.sample(1000, step)

pm.traceplot(trace)
I'm sorry in advance if this is a dumb question, but I couldn't figure it out.
I'm studying annual maximum flows and I'm trying to implement the methodology described in "Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data" by Martins and Stedinger.
If you mean the generalized extreme value distribution (https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution), then something like this should work (for c != 0):
import pymc3 as pm
import numpy as np
import theano.tensor as tt
from pymc3.distributions.dist_math import bound

data = np.random.randn(20)

with pm.Model() as model:
    c = pm.Beta('c', alpha=6, beta=9)
    loc = 1
    scale = 2

    def gev_logp(value):
        scaled = (value - loc) / scale
        logp = -(tt.log(scale)
                 + ((c + 1) / c) * tt.log1p(c * scaled)
                 + (1 + c * scaled) ** (-1/c))
        # the support of the GEV depends on the sign of c:
        # value > loc - scale/c when c > 0, value < loc - scale/c when c < 0
        alpha = loc - scale / c
        bounds = tt.switch(c > 0, value > alpha, value < alpha)
        return bound(logp, bounds, c != 0)

    gev = pm.DensityDist('gev', gev_logp, observed=data)
    trace = pm.sample(2000, tune=1000, njobs=4)

pm.traceplot(trace)
Your logp function was invalid. Exponentiation is ** in python, and part of the expression wasn't valid for negative values.
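To make the first point concrete, ^ is Python's bitwise XOR operator, not exponentiation:
>>> 2 ^ 3   # bitwise XOR
1
>>> 2 ** 3  # exponentiation
8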

inverse of FFT not the same as original function

I don't understand why ifft(fft(myFunction)) is not the same as my function. It seems to have the same shape but is a factor of 2 out (ignoring the constant y-offset). All the documentation I can see says there is some normalisation that fft doesn't do, but that ifft should take care of that. Here's some example code below -- you can see where I've bodged in the factor of 2 to give me the right answer. Thanks for any help -- it's driving me nuts.
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt

def fourier_series(x, y, wn, n=None):
    # get FFT
    myfft = fftp.fft(y, n)
    # kill higher freqs above wavenumber wn
    myfft[wn:] = 0
    # make new series
    y2 = fftp.ifft(myfft).real
    # find constant y offset
    myfft[1:] = 0
    c = fftp.ifft(myfft)[0]
    # remove c, apply factor of 2 and re-apply c
    y2 = (y2 - c) * 2 + c
    plt.figure(num=None)
    plt.plot(x, y, x, y2)
    plt.show()

if __name__ == '__main__':
    x = np.array([float(i) for i in range(0, 360)])
    y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
    fourier_series(x, y, 3, 360)
You're removing half the spectrum when you do myfft[wn:] = 0. The negative frequencies are those in the top half of the array and are required.
You have a second fudge to get your results which is taking the real part to find y2: y2 = fftp.ifft(myfft).real (fftp.ifft(myfft) has a non-negligible imaginary part due to the asymmetry in the spectrum).
Fix it with myfft[wn:-wn] = 0 instead of myfft[wn:] = 0, and remove the fudges. So the fixed code looks something like:
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt

def fourier_series(x, y, wn, n=None):
    # get FFT
    myfft = fftp.fft(y, n)
    # kill higher freqs above wavenumber wn,
    # keeping the mirrored negative frequencies
    myfft[wn:-wn] = 0
    # make new series
    y2 = fftp.ifft(myfft)
    plt.figure(num=None)
    plt.plot(x, y, x, y2)
    plt.show()

if __name__ == '__main__':
    x = np.array([float(i) for i in range(0, 360)])
    y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
    fourier_series(x, y, 3, 360)
It's really worth paying attention to the interim arrays that you are creating when trying to do signal processing. Invariably, there are clues as to what is going wrong that should direct you to the problem. In this case, taking the real part masked the problem and made your task more difficult.
Just to add another quick point: Sometimes taking the real part of the resultant array is exactly the correct thing to do. It's often the case that you end up with an imaginary part to the signal output which is just down to numerical errors in the input to the inverse FFT. Typically this manifests itself as very small imaginary values, so taking the real part is basically the same array.
You are killing the negative frequencies between 0 and -wn.
I think what you mean to do is to set myfft to 0 for all frequencies outside [-wn, wn].
Change the following line:
myfft[wn:] = 0
to:
myfft[wn:-wn] = 0
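As a side note (my addition, not part of the answers above), numpy's real-input FFT avoids the negative-frequency bookkeeping entirely, since rfft stores only the non-negative frequencies and irfft returns a real array:
import numpy as np

def lowpass(y, wn):
    # rfft stores only the non-negative half of the spectrum,
    # so zeroing from wn upward is enough; there is no mirrored half
    spec = np.fft.rfft(y)
    spec[wn:] = 0
    return np.fft.irfft(spec, n=len(y))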