I have a Table that does fitting operations for 10000 datasets that looks as follows:
ParallelTable[
NonlinearModelFit[data[[i]], func[t,a,b,c,d], {a,b,c,d}, t],
{i,1,10000}];
I can change this to a For loop if necessary; that is not a problem.
I would like to be able to catch errors in this statement. So if NonlinearModelFit returns any kind of error (saddle point, maximum iterations reached, non-convergence), I would like i to be printed or appended to another array, so that I know which datasets are not compatible with the fit and can debug them. How can I do that?
Just to paraphrase belisarius and make it an answer:
Use
Check[ mymaincommand, resultexpressioniferror, optionallistofspecificmessages]
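For instance, a minimal sketch built on the code from the question (the names data and func are taken from there): Check returns its second argument whenever the wrapped evaluation generates a message, so returning the index i there marks the failing datasets.
results = ParallelTable[
   Check[
    NonlinearModelFit[data[[i]], func[t, a, b, c, d], {a, b, c, d}, t],
    i (* returned instead of the fitted model if any message is generated *)],
   {i, 1, 10000}];
badIndices = Cases[results, _Integer] (* datasets whose fit produced messages *)
The good fits stay in results at their original positions, and the optional third argument of Check (the list of specific messages) lets you react only to particular failure modes such as non-convergence.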
I am new to LabVIEW and I am trying to read a code written in LabVIEW. The block diagram is this:
This is the program that feeds the x and y functions into the voltage input. It is meant to supply an input voltage of different forms (sine, heart shape, etc.) to the x and y axes of a fast-steering mirror or galvano mirror.
The x and y function controls are used to enter a formula for a function, and then the "evaluation single value" function is used to feed the result into a DAQ Assistant.
I understand that { 2*(|-Mpi|)/N }*i + -Mpi*pi goes into the x value. However, I don't understand why we use this kind of formula. Why do we need to assign a negative value and then take the absolute value of -M*pi? Also, I don't understand why we need to divide by N and then multiply by i. And finally, why do we need to add -Mpi again? If you can provide any hints about this, I would really appreciate it.
This is just a complicated way to write the code/formula. Given what the code looks like (unnecessary wire bends, duplicate loop input tunnels, hidden wires, unnecessary coercion dots, failure to use the appropriate built-in 'negate' function), not much care has been given to writing it. So while it probably yields the correct results, you should not expect it to do so in the most readable way.
To answer your specific questions:
Why we need to assign a negative value and then do the absolute value
We don't. We can just move the negation immediately before the last addition or change that to a subtraction:
{ 2*(|Mpi|)/N }*i - Mpi*pi
And as @Yair pointed out: we are not assigning a value here, we are basically flipping the sign of whatever value the user entered.
Why we need to divide by N and then multiply by i
This gives you a fraction between 0 and 1, no matter how many steps you do in your for-loop. Think of N as a sampling rate. I.e. your mirrors will always do the same movement, but a larger N just produces more steps in between.
Why need to add -Mpi again
I would strongly assume this is some kind of quick-and-dirty workaround for a bug that was never fixed properly. Looking at the code, it seems this -Mpi*pi term was added later in the development process. And while I don't know what the expected values are, I believe that multiplying only one of the summands by pi is probably wrong.
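To make the sweep concrete (assuming, as the loop suggests, that i runs from 0 to N and that Mpi holds the amplitude the user entered), the endpoints of the simplified formula are:
x(0) = { 2*(|Mpi|)/N }*0 - Mpi*pi = -Mpi*pi
x(N) = { 2*(|Mpi|)/N }*N - Mpi*pi = 2*Mpi - Mpi*pi
So i/N ramps linearly from 0 to 1 over the loop, and the -Mpi*pi term only shifts where the ramp starts. If the pi factor appeared in both terms (or in neither), the sweep would be symmetric, running from -Mpi*pi to +Mpi*pi (or from -Mpi to +Mpi), which is why multiplying just one summand by pi looks suspicious.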
I write a vector in Dymola mos script in a simple manner like this:
x_axis = cell.spatialSummary.x_cell;
output: x_axis={1,2,3,4,5} // row vector
I want to do the same thing in a function. 'x_cell' has 5 values, which I want to store in a row vector. I use the DymolaCommands.Trajectories.readTrajectory function to read the x_cell values one by one in a for loop (I use a for loop because readTrajectory throws an error when I try to read the entire x_cell):
Real x_axis[:],axis_value[:,:];
Integer len=5;
for i in 1:len loop
axis_value:=readTrajectory(result,{"cell.spatialSummary.x_cell["+String(i)+"]"},1); //This intermediate variable returns [1,1] matrix
x_axis[i]:=scalar(axis_value);
end for;
I get an error:
Assignment failed x_axis[i] = scalar(axis_value);
What's wrong here? All I want to do is read all the values of x_cell and write them into a vector. How can I do this in a Dymola function?
Thank you!
Solution: Initialize the vector with a certain value. In this case,
x_axis :=fill(0, len);
This solved the above problem for me.
Pre-filling as in the other solution works, and is generally the best solution. However, in some cases you might have to append to the vector instead, as follows:
x_axis :=fill(0.0, 0);
for i in 1:len loop
axis_value:=readTrajectory(result,{"cell.spatialSummary.x_cell["+String(i)+"]"},1); //This intermediate variable returns [1,1] matrix
x_axis:=cat(1, x_axis, {scalar(axis_value)});
end for;
(This takes x_axis and concatenates a new element at the end. It is generally slower.)
I have a SCIP project to solve a binary problem with a nonlinear objective function. It works well, but for some instances I get a message saying "the best solution is not feasible", and there is some violation in the constraints (the violation is mostly very small).
To solve this issue, I added SCIP_CALL_EXC( SCIPsetRealParam(scip, "numerics/feastol", 1e-5) ) to change the default value of feastol. But I get a segmentation fault!
Following your helpful suggestion, the violation value is now much lower. My objective function is of the form Min: AX + L*Sqrt(BX). In the previous version, I had used an auxiliary variable, say Q, such that Q^2 - L^2(BX) >= 0, and the objective function was expressed as Min: AX + Q. In the new version, I changed the inequality into an equality, and in combination with SCIPsetRealParam(scip, "numerics/feastol", 1e-8) the violation in the constraints is much lower. What more can I do to decrease the violation? Moreover, when I printed the value of feastol, I see numerics/lpfeastol=1e-8, but lpfeastol is different from feastol! So why is lpfeastol modified instead? I see the modified value of lpfeastol printed on screen several times while SCIP is solving the problem. I appreciate your help in advance.
Maybe try changing the define of UPGSCALE in src/scip/cons_soc.c to some larger value.
The output for lpfeastol is unfortunate, but normal. Reducing feastol automatically leads to adjusting lpfeastol, too. I cannot reproduce "I printed the value of feastol, I see numerics/lpfeastol=1e-8 but lpfeastol is different from feastol".
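For reference, a minimal C sketch of where the tolerance parameter is normally set. The thread never diagnoses the segmentation fault from the question; one possible cause of a crash in such a call is a scip pointer that has not been created yet, so the sketch sets the parameter right after SCIPcreate. The question uses the C++ SCIP_CALL_EXC wrapper; plain SCIP_CALL is shown here, but the call order is the same.
#include <scip/scip.h>
#include <scip/scipdefplugins.h>

static SCIP_RETCODE setup(SCIP** scip)
{
   SCIP_CALL( SCIPcreate(scip) );                 /* scip must exist before any parameter is set */
   SCIP_CALL( SCIPincludeDefaultPlugins(*scip) );
   /* tighten the feasibility tolerance (SCIP's default is 1e-6) */
   SCIP_CALL( SCIPsetRealParam(*scip, "numerics/feastol", 1e-8) );
   return SCIP_OKAY;
}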
I have a problem, a big problem, with threads in VB.NET.
First of all, I should say that I haven't worked with threads before (only at school). I have read a lot of pages about them, but none of them helped with my problem.
My main goal here is to understand the logic and then, if possible, solve the problem I have; the two are related. I will explain the problem below.
The code has no comments or related documentation, it is a program developed many years ago, the person who wrote it no longer works in the office, and nobody knows how it works :S
I have a list called listOfProccess, and when it contains only one item everything works fine.
The callback function passed to QueueUserWorkItem fills in the information about p and then executes it on a thread, I suppose.
Each item in the list contains a type field:
listOfProccess[].type = 'a/b/c/d/e/f/g/'
The list also includes an ID.
Code:
If listOfProccess.Count > 0 Then
    Threading.ThreadPool.SetMinThreads(1, 1)
    Threading.ThreadPool.SetMaxThreads(4, 4)
    For Each p In listOfProccess
        Try
            Threading.ThreadPool.QueueUserWorkItem(New Threading.WaitCallback(AddressOf p.function))
        Catch e As Exception
            sendMail("mail#mail", "alerts#mail.ie", "", e.StackTrace)
        End Try
    Next
End If
Problems:
I have two problems here:
Sometimes an item in the list, e.g. 'a', executes in an infinite loop and consumes all the resources of the machine, but if I close and restart the program it works again. I don't know whether this is a problem with the threads or not; honestly I think it is something else, because this problem started two or three weeks ago and the program had already been running for a year.
This one I think is related to the threads. If I have two (or more) p in the list, like this:
p[1].type = 'a/b/c/d/e/f/g/'
p[1].ID = 1
p[2].type = 'ww/xx/ff/yy/aa/rr/'
p[2].ID =2
When the system executes something like this, the order it follows is 'random', i.e. it takes a, b, c from the first one, then does ww, and then comes back to the first. The problem gets bigger if I have more items in the list, like 4 or 5. This is not a very big problem, because the program works; it doesn't work 100% fine, but it works. This part is more about trying to understand why it behaves this way.
Any help is welcome.
The second problem is a race condition: since you can't guarantee the order of execution of the threads, there is a non-zero probability that your threads will overwrite each other's values. There are a lot of ways to solve this issue: algorithms (both lock-based and lock-free), synchronization techniques, and so on, and there is no silver-bullet solution.
The first problem is unclear to me, as I can't tell exactly what you mean by an infinite loop. Maybe this can happen if you link items to ones that were deleted (from another thread) and there is no way for your task to exit, because the links in the list are broken. This should also be solved with synchronization.
I think you should start with the MSDN articles or some book about multithreading in general, and after that you should break your program down, step by step, into small parts you understand.
Update:
p.function - regarding the infinite loop, you should examine the code of this delegate. If there is a condition that restarts the work, you should check for a recursion limit. For example, with an optimistic locking algorithm, your code can find out that the update it tried isn't valid and restart it. And if the links are broken, it will never finish its task and will stay in an infinite loop forever.
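As a purely illustrative sketch of the kind of synchronization meant above (the names ProcessItem, DoWork, and results are hypothetical, not from the original program): any state shared between the queued work items needs to be guarded, e.g. with SyncLock, so that two pool threads cannot update it at the same time.
Imports System
Imports System.Collections.Generic

Module WorkerSketch
    Private ReadOnly lockObj As New Object()
    Private ReadOnly results As New List(Of String)()

    ' Placeholder standing in for the real per-item work (the question's p.function).
    Private Function ProcessItem(state As Object) As String
        Return state.ToString()
    End Function

    Private Sub DoWork(state As Object)
        Dim text As String = ProcessItem(state)
        SyncLock lockObj
            results.Add(text)   ' only one pool thread at a time may touch the shared list
        End SyncLock
    End Sub

    ' Queued once per item, mirroring the loop in the question.
    Sub QueueAll(listOfProccess As IEnumerable(Of Object))
        For Each p In listOfProccess
            Threading.ThreadPool.QueueUserWorkItem(New Threading.WaitCallback(AddressOf DoWork), p)
        Next
    End Sub
End Module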
This is a saga which began with the problem of how to do survey weighting. Now that I appear to be doing that correctly, I have hit a bit of a wall (see previous post for details on the import process and where the strata variable came from):
> require(foreign)
> ipums <- read.dta('/path/to/data.dta')
> require(survey)
> ipums.design <- svydesign(id=~serial, strata=~strata, data=ipums, weights=perwt)
Error in if (nbins > .Machine$integer.max) stop("attempt to make a table with >= 2^31 elements") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In pd * (as.integer(cat) - 1L) : NAs produced by integer overflow
2: In pd * nl : NAs produced by integer overflow
> traceback()
9: tabulate(bin, pd)
8: as.vector(data)
7: array(tabulate(bin, pd), dims, dimnames = dn)
6: table(ids[, 1], strata[, 1])
5: inherits(x, "data.frame")
4: is.data.frame(x)
3: rowSums(table(ids[, 1], strata[, 1]) > 0)
2: svydesign.default(id = ~serial, weights = ~perwt, strata = ~strata,
data = ipums)
1: svydesign(id = ~serial, weights = ~perwt, strata = ~strata, data = ipums)
This error seems to come from the tabulate function, which I hoped would be straightforward enough to circumvent, first by changing .Machine$integer.max
> .Machine$integer.max <- 2^40
and, when that didn't work, by replacing the whole source code of tabulate:
> tabulate <- function(bin, nbins = max(1L, bin, na.rm=TRUE))
{
if(!is.numeric(bin) && !is.factor(bin))
stop("'bin' must be numeric or a factor")
#if (nbins > .Machine$integer.max)
if (nbins > 2^40) #replacement line
stop("attempt to make a table with >= 2^31 elements")
.C("R_tabulate",
as.integer(bin),
as.integer(length(bin)),
as.integer(nbins),
ans = integer(nbins),
NAOK = TRUE,
PACKAGE="base")$ans
}
Neither circumvented the problem. Apparently this is one reason why the ff package was created, but what worries me is the extent to which this is a problem I cannot avoid in R. This post seems to indicate that even if I were to use a package that would avoid this problem, I would only be able to access 2^31 elements at a time. My hope was to use sql (either sqlite or postgresql) to get around the memory problems, but I'm afraid I'll spend a while getting that to work, only to run into the same fundamental limit.
Attempting to switch back to Stata doesn't solve the problem either. Again see the previous post for how I use svyset, but the calculation I would like to run causes Stata to hang:
svy: mean age, over(strata)
Whether throwing more memory at it will solve the problem I don't know. I run R on my desktop which has 16 gigs, and I use Stata through a Windows server, currently setting memory allocation to 2000MB, but I could theoretically experiment with increasing that.
So in sum:
Is this a hard limit in R?
Would sql solve my R problems?
If I split it up into many separate files would that fix it (a lot of work...)?
Would throwing a lot of memory at Stata do it?
Am I seriously barking up the wrong tree somehow?
Yes, R uses 32-bit indexes for vectors, so they can contain no more than 2^31 - 1 entries, and you are trying to create something with 2^40 elements. There is talk of introducing 64-bit indexes, but that is some way off from appearing in R. Vectors have the stated hard limit and that is it as far as base R is concerned.
I am too unfamiliar with the details of what you are doing to offer any further advice on the other parts of your question.
Why do you want to work with the full data set? Wouldn't a smaller sample that fits into the restrictions R places on you be just as useful? You could use SQL to store all the data and query it from R to return a random subset of a more appropriate size.
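As a rough sketch of that approach (the file name ipums.db, the table name ipums, and the column list are assumptions; any DBI backend would do in place of RSQLite):
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), "ipums.db")   # assumed SQLite file holding the full data

## pull a random subset that stays comfortably below R's vector-length limit
sub <- dbGetQuery(con,
  "SELECT serial, strata, perwt, incwage
     FROM ipums
    ORDER BY RANDOM()
    LIMIT 100000")

dbDisconnect(con)
## sub can now be passed to svydesign() as in the question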
Since this question was asked some time ago, I'd like to point out that my answer here uses version 3.3 of the survey package.
If you check the code of svydesign, you can see that the function causing the problem is inside a check step that determines whether you should set the nest parameter to TRUE or not. This step can be disabled by setting the option check.strata=FALSE.
Of course, you shouldn't disable a check step unless you know what you are doing. In this case, you should be able to decide yourself whether you need to set the nest option to TRUE or FALSE. nest should be set to TRUE when the same PSU (cluster) id is recycled in different strata.
Concretely for the IPUMS dataset, since you are using the serial variable for cluster identification and serial is unique for each household in a given sample, you may want to set nest to FALSE.
So, your survey design line would be:
ipums.design <- svydesign(id=~serial, strata=~strata, data=ipums, weights=perwt, check.strata=FALSE, nest=FALSE)
Extra advice: even after circumventing this problem you will find that the code is pretty slow unless you remap strata to a range from 1 to length(unique(ipums$strata)):
ipums$strata <- match(ipums$strata,unique(ipums$strata))
Both @Gavin and @Martin deserve credit for this answer, or at least for leading me in the right direction. I'm mostly answering it separately to make it easier to read.
In the order I asked:
Yes, 2^31 is a hard limit in R, though it seems to matter what type it is (which is a bit strange, given that the stated problem is the length of the vector rather than the amount of memory, of which I have plenty). Do not convert strata or id variables to factors; that will just fix their length and nullify the effects of subsetting (which is the way to get around this problem).
sql could probably help, provided I learn how to use it correctly. I did the following test:
library(multicore) # make svy fast!
ri.ny <- subset(ipums, statefips_num %in% c(36, 44))
ri.ny.design <- svydesign(id=~serial, weights=~perwt, strata=~strata, data=ri.ny)
svyby(~incwage, ~strata, ri.ny.design, svymean, data=ri.ny, na.rm=TRUE, multicore=TRUE)
ri <- subset(ri.ny, statefips_num==44)
ri.design <- svydesign(id=~serial, weights=~perwt, strata=~strata, data=ri)
ri.mean <- svymean(~incwage, ri.design, data=ri, na.rm=TRUE)
ny <- subset(ri.ny, statefips_num==36)
ny.design <- svydesign(id=~serial, weights=~perwt, strata=~strata, data=ny)
ny.mean <- svymean(~incwage, ny.design, data=ny, na.rm=TRUE, multicore=TRUE)
And found the means to be the same, which seems like a reasonable test.
So: in theory, provided I can split up the calculation by either using plyr or sql, the results should still be fine.
See 2.
Throwing a lot of memory at Stata definitely helps, but now I'm running into annoying formatting issues. I seem to be able to perform most of the calculation I want (much quicker and with more stability as well) but I can't figure out how to get it into the form I want. Will probably ask a separate question on this. I think the short version here is that for big survey data, Stata is much better out of the box.
In many ways, yes. Trying to do analysis with data this big is not something I should have taken on lightly, and I'm far from figuring it out even now. I was using the svydesign function correctly, but I didn't really know what was going on. I have a (very slightly) better grasp now, and it's heartening to know I was generally correct about how to solve the problem. @Gavin's general suggestion of trying out small data with external results to compare to is invaluable, something I should have started ages ago. Many thanks to both @Gavin and @Martin.