NumPy: select checkerboard pattern into a new array

I have a square 2D matrix with an odd number of rows and columns. For example:
11 12 13 14 15
21 22 23 24 25
31 32 33 34 35
41 42 43 44 45
51 52 53 54 55
I need to rotate it 45 degrees clockwise and select the maximal square matrix. In this case:
13 24 35
22 33 44
31 42 53
I can do this with two nested loops:
new_arr = np.zeros(((orig_range+1)//2, (orig_range+1)//2))
for new_h in range((orig_range+1)//2):
    for new_w in range((orig_range+1)//2):
        old_h = new_h + new_w
        old_w = orig_range//2 - new_h + new_w  # orig_range//2 is the center column index
        new_arr[new_h, new_w] = orig_arr[old_h, old_w]
But this approach is very slow. Rotation in cv2 is reasonably fast, but the "pixels" don't align well: a forward rotation with sqrt(2) scaling followed by a backward rotation with sqrt(2) scaling results in altered pixel colors in the center region of the image due to rounding errors.
What is the efficient way to rotate such a matrix?

You can use Numba's JIT to drastically speed up the operation, especially by running it in parallel and natively. Moreover, note that the output array does not need to be filled with zeros, so np.empty is enough. Here is an untested example:
import numba as nb
import numpy as np

@nb.njit(parallel=True)
def compute(orig_range, orig_arr):
    new_arr = np.empty(((orig_range+1)//2, (orig_range+1)//2))
    for new_h in nb.prange((orig_range+1)//2):
        for new_w in range((orig_range+1)//2):
            old_h = new_h + new_w
            old_w = orig_range//2 - new_h + new_w
            new_arr[new_h, new_w] = orig_arr[old_h, old_w]
    return new_arr
You can specify the types of the inputs to compile the function ahead of time and thus avoid a significantly slower first call. This operation should be very fast on arrays fitting in cache. For big arrays, tiling and a different read/write ordering can speed the operation up a bit more.
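For instance, here is a minimal sketch of eager compilation with an explicit signature (the int64/float64 types below are assumptions; adjust them to match your actual data):

import numba as nb
import numpy as np

# Declaring the signature makes Numba compile at definition time instead of
# on the first call. The assumed types are an int64 scalar and a 2D float64 array.
@nb.njit('float64[:,:](int64, float64[:,:])', parallel=True)
def compute_eager(orig_range, orig_arr):
    new_arr = np.empty(((orig_range+1)//2, (orig_range+1)//2))
    for new_h in nb.prange((orig_range+1)//2):
        for new_w in range((orig_range+1)//2):
            new_arr[new_h, new_w] = orig_arr[new_h + new_w,
                                             orig_range//2 - new_h + new_w]
    return new_arr

Calling compute_eager(orig_range, orig_arr) then runs already-compiled machine code even on the first call.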

Related

Efficiently plot coordinate set to numpy (bitmap) array excluding offscreen coordinates

This question follows from Efficiently plot set of {coordinate+value}s to (numpy array) bitmap
A solution for plotting from x, y, color lists to a bitmap is given:
import numpy as np
import matplotlib.pyplot as plt

bitmap = np.zeros((10, 10, 3))
s_x = (0, 1, 2)  ## tuple
s_y = (1, 2, 3)  ## tuple
pixel_val = np.array([[0,0,1],[1,0,0],[0,1,0]])  ## numpy array
bitmap[s_x, s_y] = pixel_val
plt.imshow(bitmap)
But how to handle the case where some (x,y) pairs lie outside the bitmap?
Efficiency is paramount.
If I could map offscreen coords to the first row/col of the bitmap (-42, 7) -> (0, 7), (15, -6) -> (15, 0), I could simply black out the first row&col with bitmap[:,0,:] = 0; bitmap[0,:,:] = 0.
Is this doable?
Is there a smarter way?
Are you expecting offscreen coords? If so, don't worry; I was just wondering whether you were using a non-traditional coordinate system where the origin might be in the center of the image for whatever reason.
Anyway, once the coordinates are stored as numpy arrays, mapping outliers to the first row/col is straightforward, e.g. s_x[s_x < 0] = 0. However, I believe the most efficient way is to use a boolean mask to find the indices of the pixels that actually fall inside the bitmap, so only those are assigned - see below:
import numpy as np
import matplotlib.pyplot as plt

bitmap = np.zeros((15, 16, 3))

## generate test data (coordinates deliberately run offscreen on both sides)
s_x = np.arange(-3, 22)
s_y = np.arange(-4, 21)
np.random.shuffle(s_x)
np.random.shuffle(s_y)
print(s_x)
print(s_y)
pixel_val = np.random.rand(25, 3)
## generation done

## keep only the coordinates that fall inside the bitmap
use = (s_x >= 0) & (s_x < bitmap.shape[1]) & (s_y >= 0) & (s_y < bitmap.shape[0])
bitmap[s_y[use], s_x[use]] = pixel_val[use]
plt.imshow(bitmap)
output:
coordinates:
[ 8 3 21 9 -2 -3 5 14 -1 18 13 16 0 11 7 1 2 12 15 6 19 10 4 17 20]
[ 8 14 1 9 2 4 7 15 3 -3 19 16 6 -1 0 17 5 13 -2 20 -4 11 10 12 18]
image: (rendered bitmap omitted)
I ran a test where it had to assign 3145728 pixels (four times the size of the bitmap you gave in your other question), around half of which were outside the image. On average the mask approach took around 140 ms, whereas remapping the outliers and then setting the first row/col to zero took 200 ms for the same task.
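For reference, here is a rough sketch of that clipping alternative (it assumes the same bitmap, s_x, s_y and pixel_val as above; note it clips high as well as low coordinates, so the last row/column is sacrificed along with the first):

import numpy as np

# Clip offscreen coordinates onto the border, assign everything,
# then black out the border rows/columns that absorbed the outliers.
cx = np.clip(s_x, 0, bitmap.shape[1] - 1)
cy = np.clip(s_y, 0, bitmap.shape[0] - 1)
bitmap[cy, cx] = pixel_val
bitmap[0, :, :] = 0    # first row
bitmap[-1, :, :] = 0   # last row
bitmap[:, 0, :] = 0    # first column
bitmap[:, -1, :] = 0   # last column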

How number of MFCC coefficients depends on the length of the file

I have voice data 1.85 seconds long, and I extract its features using MFCC (with the library from James Lyson). It returns 184 x 13 features. I am using a 10 millisecond frame step, a 25 millisecond frame size, and 13 coefficients from the DCT. How can it return 184 frames? I still cannot understand, because the last frame's length is not 25 milliseconds. Is there a formula that explains how it can return 184? Thank you in advance.
There is a picture that explains this: basically, the last window runs past the end of the signal (the remainder is padded), unlike the previous ones.
If you have 184 windows, the region you cover is 183 * 10 + 25 = 1855 ms, slightly more than the 1850 ms of audio, which is why the last frame has to be padded.
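As a sanity check, here is a small sketch of the frame-count arithmetic (assuming the common convention of rounding the number of steps up and padding the final frame, which matches the 184 figure):

import math

signal_ms = 1850   # 1.85 s of audio
frame_ms = 25      # frame size
step_ms = 10       # frame step

# One frame for the first 25 ms, then one more per 10 ms step,
# rounding up so the tail is covered by a padded final frame.
num_frames = 1 + math.ceil((signal_ms - frame_ms) / step_ms)
print(num_frames)                              # 184
print((num_frames - 1) * step_ms + frame_ms)   # 1855 ms covered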

does GrADS have an "astd" (similar to aave) command I could use?

I would like to have the spatial standard deviation for a variable (let's say temperature). In other words, does GrADS have an "astd" command (similar to aave) that I could use?
There is no command like this in GRADS. But you can actually compute the standard deviation in two ways:
[1] Compute manually. For example:
*compute the mean
x1 = ave(ts1.1,t=1,t=120)
*compute stdev
s1 = sqrt(ave(pow(ts1.1-x1,2),t=1,t=120)*(n1/(n1-1)))
n1 here is the number of samples (120 in this example).
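For clarity, the same n/(n-1) correction written in NumPy terms (a sketch only, not GrADS syntax): the biased variance mean((x - mean(x))^2) is rescaled by n/(n-1) before taking the square root.

import numpy as np

x = np.random.rand(120)   # e.g. 120 time samples of a variable
n = x.size

# Biased (population) variance, rescaled by n/(n-1) as in the formula above.
s = np.sqrt(np.mean((x - x.mean())**2) * n / (n - 1))

# Same result as NumPy's built-in sample standard deviation.
assert np.isclose(s, x.std(ddof=1))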
[2] You can use the built-in 'stat' function in GrADS.
You can use 'set stat on' or 'set gxout stat'.
These commands will give you statistics such as the following:
Data Type = grid
Dimensions = 0 1
I Dimension = 1 to 73 Linear 0 5
J Dimension = 1 to 46 Linear -90 4
Sizes = 73 46 3358
Undef value = -2.56e+33
Undef count = 1763 Valid count = 1595
Min, Max = 243.008 302.818
Cmin, cmax, cint = 245 300 5
Stats[sum,sumsqr,root(sumsqr),n]: 452778 1.29046e+08 11359.8 1595
Stats[(sum,sumsqr,root(sumsqr))/n]: 283.874 80906.7 284.441
Stats[(sum,sumsqr,root(sumsqr))/(n-1)]: 284.052 80957.4 284.53
Stats[(sigma,var)(n)]: 17.9565 322.437
Stats[(sigma,var)(n-1)]: 17.9622 322.64
Contouring: 245 to 300 interval 5
Sigma here is the standard deviation and Var is variance.
Is this what you are looking for?

Longitudinal Hierarchical Bayesian regression with JAGS

I'm completely new to JAGS/OpenBUGS, so I would really appreciate a push in the right direction when it comes to specifying my model. I'm using unbalanced longitudinal data covering 103 countries over 15 years, of which 12 years are used in this case. The DV is the Gini coefficient, which probably shouldn't be modeled as log-Normal but rather as Beta, although right now the focus is just on understanding how to compile the model in JAGS. I'm using a fixed-effects model for the time being.
The data and code I'm running:
> head(x)
Year II2 II3 II4 ..... II24
1 1 2.956233 40.90458 4.475183 16.443553
8 1 1.257794 85.47378 2.395186 19.333433
19 1 4.139706 141.07899 2.544640 25.555404
37 1 2.233664 98.51313 3.902835 42.533333
49 1 2.879734 61.39000 1.471334 18.884444
71 1 3.381762 60.23783 3.432614 16.334222
> head(y)
Year II1
1 1 0.3240000
8 1 0.2576667
19 1 0.3132500
37 1 0.2700000
49 1 0.2744286
71 1 0.3250000
dim(x)
1224 23
length(y)
1224
Time <- 12; N <- length(y$II1)  # no. of obs.
dat <- list(x=x, y=y, N=N, Time=Time, p=dim(x)[2])
inits <- function(){list(tau.1=1, tau.2=1, eta=1, alpha=0, beta1=0, beta2=0, beta3=0)}
model6 <- "model{
for(i in 1:N){for(t in 1:Time){
y[i,t]~dlnorm(mu[i,t],tau.1)
mu[i,t] <- inprod(x[i,t],beta[])+alpha[i]}
alpha[i]~dnorm(eta, tau.2)}
for (j in 1:p) {
b[j]~dnorm(0,0.001)
}
eta~dnorm(0, 0.0001)
tau.2~dgamma(0.01,0.01)
tau.1~dgamma(0.01,0.01)
}"
reg.jags <- jags.model(textConnection(model), data=dat, inits=inits, n.chains=1, n.adapt=1000)
And I keep getting this runtime error:
Error in jags.model(textConnection(model), data = dat, inits = inits, :
RUNTIME ERROR:
Compilation error on line 3.
Index out of range taking subset of y
Any suggestions on what I should do differently would be hugely appreciated! I know there are 3 "tricks" you can apply to unbalanced data, but I'm still a little confused about how all of this works, i.e. how JAGS reads the data input.
Cheers
J
Your dataframe y only has 2 columns. But Time is 12. Where you have
y[i,t]~dlnorm(mu[i,t],tau.1)
inside a loop
for(t in 1:Time){
think about what happens when t goes up to 3 (on its way to Time=12).
You are asking JAGS to look at y[i,3], which doesn't exist. Hence "Index out of range".

How to get fitted values from clogit model

I am interested in getting the fitted values at set locations from a clogit model. This includes the population level response and the confidence intervals around it. For example, I have data that looks approximately like this:
set.seed(1)
data <- data.frame(Used = rep(c(1,0,0,0), 1250),
                   Open = round(runif(5000, 0, 50), 0),
                   Activity = rep(sample(runif(24, .5, 1.75), 1250, replace=T), each=4),
                   Strata = rep(1:1250, each=4))
Within the clogit model, Activity does not vary within a stratum, so there is no Activity main effect.
library(survival)
mod <- clogit(Used ~ Open + I(Open*Activity) + strata(Strata), data=data)
What I want to do is build a newdata frame at which I can eventually plot marginal fitted values at specified values of Open, similar to a newdata design for a traditional glm model, e.g.,
newdata <- data.frame(Open = seq(0, 50, 1),
                      Activity = rep(max(data$Activity), 51))
However, when I try to run a predict function on the clogit, I get the following error:
fit<-predict(mod,newdata=newdata,type = "expected")
Error in Surv(rep(1, 5000L), Used) : object 'Used' not found
I realize this is because clogit in R is run through coxph, and thus the predict function is trying to predict relative risks between pairs of subjects within the same strata (in this case, Used).
My question, however, is whether there is a way around this. This is easily done in Stata (using the margins command) and manually in Excel, but I would like to automate it in R since everything else is programmed there. I have also built this manually in R (example code below); however, I keep ending up with what appear to be incorrect CIs in my real data, so I would like to rely on the predict function if possible. My code for manual prediction is:
coef <- data.frame(coef = summary(mod)$coefficients[,1],
                   se   = summary(mod)$coefficients[,3])
coef$se    <- summary(mod)$coefficients[,4]
coef$UpCI  <- coef[,1] + (coef[,2]*2)  ### this could be *1.96 but using 2 for simplicity
coef$LowCI <- coef[,1] - (coef[,2]*2)  ### this could be *1.96 but using 2 for simplicity
fitted <- data.frame(Open = seq(0, 50, 2),
                     Activity = rep(max(data$Activity), 26))
fitted$Marginal <- exp(coef[1,1]*fitted$Open +
                       coef[2,1]*fitted$Open*fitted$Activity) /
                   (1 + exp(coef[1,1]*fitted$Open +
                            coef[2,1]*fitted$Open*fitted$Activity))
fitted$UpCI <- exp(coef[1,3]*fitted$Open +
                   coef[2,3]*fitted$Open*fitted$Activity) /
               (1 + exp(coef[1,3]*fitted$Open +
                        coef[2,3]*fitted$Open*fitted$Activity))
fitted$LowCI <- exp(coef[1,4]*fitted$Open +
                    coef[2,4]*fitted$Open*fitted$Activity) /
                (1 + exp(coef[1,4]*fitted$Open +
                         coef[2,4]*fitted$Open*fitted$Activity))
My end product would ideally look something like this, but produced by the predict function:
(image: example output of fitted values)
Evidently Terry Therneau is less of a purist on the matter of predictions from clogit models: http://markmail.org/search/?q=list%3Aorg.r-project.r-help+predict+clogit#query:list%3Aorg.r-project.r-help%20predict%20clogit%20from%3A%22Therneau%2C%20Terry%20M.%2C%20Ph.D.%22+page:1+mid:tsbl3cbnxywkafv6+state:results
Here's a modification to your code that does generate the 51 predictions. I did need to put in a dummy Strata column.
newdata <- data.frame(Open = seq(0, 50, 1),
                      Activity = rep(max(data$Activity), 51),
                      Strata = 1)
risk <- predict(mod, newdata=newdata, type="risk")
> risk/(risk+1)
1 2 3 4 5 6 7
0.5194350 0.5190029 0.5185707 0.5181385 0.5177063 0.5172741 0.5168418
8 9 10 11 12 13 14
0.5164096 0.5159773 0.5155449 0.5151126 0.5146802 0.5142478 0.5138154
15 16 17 18 19 20 21
0.5133829 0.5129505 0.5125180 0.5120855 0.5116530 0.5112205 0.5107879
22 23 24 25 26 27 28
0.5103553 0.5099228 0.5094902 0.5090575 0.5086249 0.5081923 0.5077596
29 30 31 32 33 34 35
0.5073270 0.5068943 0.5064616 0.5060289 0.5055962 0.5051635 0.5047308
36 37 38 39 40 41 42
0.5042981 0.5038653 0.5034326 0.5029999 0.5025671 0.5021344 0.5017016
43 44 45 46 47 48 49
0.5012689 0.5008361 0.5004033 0.4999706 0.4995378 0.4991051 0.4986723
50 51
0.4982396 0.4978068
{Warning}: It's actually rather difficult for mere mortals to determine which of the R gods to believe on this one. I've learned so much R and statistics from each of those experts. I suspect there are matters of statistical concern or interpretation that I don't really understand.