I was wondering if anyone has code for a BUGS/JAGS model for a repeated measures ANOVA? Basically, I have a response (y) that I want to model against Time of day, Day, and Treatment. I would also like to include two interaction terms, Treatment x Time of Day and Treatment x Day. There are about 20 individuals in the study, who were measured 4 times per day over about 1 week. I'm not entirely sure where to start, and I'm concerned that the Time of day covariate should also be nested within the Day covariate? If anyone has code for the likelihood portion of the BUGS/JAGS model, it would be greatly appreciated. I can take care of priors. Just can't seem to get off the ground with this one.
There are a few ambiguities in your question.
Do you want Time of Day and Day to enter as continuous covariates or as discrete factors?
Do you want individual identity to enter the model as a fixed or random effect?
If either Day or Time of Day is a factor, do you want to include it as a fixed or random effect?
You ask about whether Time of Day should be nested within Day. This is impossible to answer without knowing more about your data and your aims.
Here's an example of code that assumes that you want to treat individuals as a random effect.
Also assumed: Treatment, Time.of.day, and Day have constant slopes across all individuals. It would be straightforward to extend this model to a fixed- or random-slopes model where different individuals get separate modeled slopes. For example, for a random-slopes model, you'd just modify the beta parameters below to treat them in a manner similar to the alpha parameter.
Following the OP's request, this is the likelihood portion only, and does not include the priors.
for(i in 1:n.observations){
  y[i] ~ dnorm(alpha[individual[i]] + beta1*Day[i] + beta2*Time.of.day[i] + beta3*Treatment[i] + beta4*Treatment[i]*Day[i] + beta5*Treatment[i]*Time.of.day[i], tau.obs)
}
# individual[i] contains the numerical index representing the individual that corresponds to observation i.
for(j in 1:n.individuals){
  alpha[j] ~ dnorm(mu, tau)
}
I'm relatively new to R and have to perform a Linear Mixed Model-Analysis on some data for my university studies.
To describe my data ("data_complete_group"):
I tested flow (variable: "fss_M") for individual soccer players deriving from 4 different teams. Each team (and therefore each player in the related team) was allocated to one of two conditions of the variable "group".
Each person also completed three different surveys on three different days and therefore had a personalized "ID" (which is also a variable in the model).
The variable "team_num" represents the related team for each player.
I now want to test whether the group factor has a significant influence on the flow score.
The model looks as follows:
model1 <- lmer(fss_M ~ group + (1 + group|team_num/ID), data = data_complete_group)
summary(model1)
If I understood it correctly, this means that "fss_M ~ group" is the fixed effect with "1 + group|team_num/ID" as random effect.
Unfortunately I get an error message when I want to run the code:
Error: number of levels of each grouping factor must be < number of observations (problems: ID:team_num)
In contrast to that, the analysis works when I remove the term for the random effect.
How can I understand this? What's wrong with the code for the analysis with fixed + random effect?
I'm glad for every answer to this, thanks a lot!
I am wondering whether it is possible to create constraints over years and hours at the same time in Pyomo.
For example, my current time variable is:
model.T = pyo.RangeSet(len(hourly_data.index))
However, this does not allow me to distinguish between hours and years. I do have a timestamp variable, that contains the date and the time. So, I thought perhaps I could do:
model.T2 = pyo.Set(initialize=hourly_data.DateTime)
The problem now is how to manipulate this timestamp object. Consider that the parameters are given and the variables are outputs from the solver. Let's first assume that our objective function is a maximisation function. We would like to create the following constraint:
Get the maximum water used. In normal circumstances, if we want the maximum water usage over all hours, we can do:
model.c_maxWater = pyo.ConstraintList()
for t in model.T:
    model.c_maxWater.add(model.waterUsage[t] <= model.maxWater)
with a penalty in the objective function associated with model.maxWater. The problem is what to do if we want to penalise every year differently because water costs differ from year to year. I imagine the constraint would look something like:
model.c_maxWater = pyo.ConstraintList()
for t in model.T2:
    model.c_maxWater.add(model.waterUsage[t] <= model.maxWater[y])
My problem is: how can I associate the t variable with certain years y? One index is hourly (in this case t) and the other is annual (y).
Note: a multi-index set is possible, but how would it deal with leap years etc.? Can a multi-index set have different lengths in its hourly dimension for leap years?
You could double-index your variables and data with [hour, year], but that would be redundant information, right? As you should be able to calculate the years from the hours (with or without some initial offset for the yearly rollover, if that is important.)
I would go about this by making subsets of your time index associated with years. Do this outside of pyomo using set/list comprehensions and a little math and/or some of the functionality in DateTime if (as you say) you are spanning many years and leap years, etc. [Aside: If you are making an hourly model that spans years--it will probably collapse under its own weight, but that is a secondary issue. :) ] Then you can use these subsets to build constraints in your model without muddying things up with extra indices.
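As a rough sketch of the subset idea, using only the standard library: the timestamps below are made up and stand in for hourly_data.DateTime, and the names hours_in_year and hour_year_pairs are hypothetical. The Pyomo lines are left as comments because they depend on the rest of your model.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up stand-in for hourly_data.DateTime: 48 hourly timestamps
# that straddle a year boundary.
start = datetime(2019, 12, 31, 0)
timestamps = [start + timedelta(hours=h) for h in range(48)]

# Map each year to the 1-based positions of its hours, so the positions
# line up with model.T = pyo.RangeSet(len(hourly_data.index)).
hours_in_year = defaultdict(list)
for t, stamp in enumerate(timestamps, start=1):
    hours_in_year[stamp.year].append(t)

years = sorted(hours_in_year)

# Flat (t, y) pairs: each hour appears once, tied to its own year.
hour_year_pairs = [(t, y) for y in years for t in hours_in_year[y]]

# Assumed usage inside the model, with maxWater indexed by year:
# for t, y in hour_year_pairs:
#     model.c_maxWater.add(model.waterUsage[t] <= model.maxWater[y])
```

Leap years need no special handling here, since each year's list simply holds however many hours fall in that year.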
Comment back if stuck...
Let's assume a variation on Nurse Rostering example in which instead of assigning a nurse to a shift on a day, the nurse is assigned to a variable number of timeblocks on that day (which consists of 24 timeblocks). eg: Nurse1 is assigned to timeblocks [8,9,10,11,12,13,14]. Let's call these consecutive assignments a ShiftPeriod. There is a hard minimum and maximum on these shiftperiods. However, optaplanner has difficulties finding a feasible solution.
When having hard consecutive constraints, is it better to model the planning entity as a startTimeBlock with a duration instead of my current way with assignment to a timeblock and a day and then imposing min/max consecutive?
Take a look at the meeting scheduling example on github master for 6.4.0.Beta1 (but the example will work perfectly with 6.3.0.Final too). Video and docs coming soon. That example uses the design pattern TimeGrains, which is what you're looking for I think.
I'm looking for recommendations on a best practice here.
I have a requirement where on a given day I must have an arbitrary number of intervals (think buckets of time which are composed of transactions) where I can have at most N intervals per day. These intervals are like time but can be arbitrary lengths i.e. some are seconds, others are minutes.
How the intervals should be formed is based on my source data. On any given day, we always start with interval 1 and it is unknown the total number of intervals we will have by EOD, each interval is defined by a fixed number of transactions. For every interval I am going to need to know the end time as well.
What is the best approach here? Should I be bucketing my fact table and connecting to a standard hour/minute/second dimension or should I be using my transactional data to be making a dimension that accommodates it?
I appreciate your feedback.
If the buckets are on time, you probably have to do it on one of your dimensions. There is a property on the attributes called bucket that can do that for you
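If the intervals end up being derived from the transactions themselves (a fixed number of transactions per interval, restarting at 1 each day), the assignment could be sketched like this before loading. The function name assign_intervals, the per_interval parameter and the sample timestamps are all hypothetical, and the cap of N intervals per day is not enforced here.

```python
from datetime import datetime
from itertools import groupby

def assign_intervals(transaction_times, per_interval):
    """Return (day, interval_number, end_time) rows.

    Intervals restart at 1 every day and each holds per_interval
    transactions; the last interval of a day may be partial.
    """
    rows = []
    ordered = sorted(transaction_times)
    for day, group in groupby(ordered, key=lambda ts: ts.date()):
        day_times = list(group)
        for first in range(0, len(day_times), per_interval):
            chunk = day_times[first:first + per_interval]
            # The end time of an interval is its last transaction's time.
            rows.append((day, first // per_interval + 1, chunk[-1]))
    return rows

# Made-up transactions: five on one day, one on the next.
sample = [
    datetime(2020, 1, 1, 9, 0, 5),
    datetime(2020, 1, 1, 9, 0, 9),
    datetime(2020, 1, 1, 9, 4, 0),
    datetime(2020, 1, 1, 9, 30, 0),
    datetime(2020, 1, 1, 16, 0, 0),
    datetime(2020, 1, 2, 10, 0, 0),
]
interval_rows = assign_intervals(sample, per_interval=2)
```

Each row could then feed a small per-day interval dimension, with the fact rows keyed on (day, interval_number).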
Ok, I'm just curious what the formula would be for calculating an expected income over the next X weeks/months/etc, if the only data I have in mySQL DB is all past transactions (dates of transactions, amounts, etc)
I am thinking taking some averages and whatnot, but I can't think of a specific formula (there must be something along those lines) to take say average rise of income over time (weekly/monthly) and then apply it to a select future period and display it weekly/monthly/etc?
Any suggestions?
Use AVG() on the income in the past, and divide it into proper weekly/monthly amounts if necessary.
see http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html#function_avg for more info on AVG()
Linear regression + simple integration is probably sufficient for your needs. I leave sorting out the exact implementation for your DB up to you, but follow that link to the "Estimation Methods" section, and probably use Ordinary Least Squares.
Alternatively, you can always slurp your data into something like R where the details are already implemented.
EDIT:
For more detail: you're trying to model INCOME = BASE + SCALING*T where we are assuming that a linear model is "good" (it's probably not great, but it's probably good enough on a short time scale). For two value linear regression, you're pretty much just taking averages; follow that link to "Fitting the Regression Line" and you'll see which things you need to average (y = INCOME and x = T). There are some tricks you can play to simplify the calculation for the computer if you can enforce some other conditions (e.g., having equally spaced time periods + no missing data), but you'll need to math a bit more yourself first if you want to do that (and you'll be less flexible in the face of changing db assumptions).
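To make the averaging concrete, here is a minimal sketch of the two-variable least-squares fit in plain Python, assuming the transactions have already been summed into one income figure per week. The helper names fit_line and forecast_total and the numbers are made up, and the "integration" is just a sum of the predicted per-period incomes.

```python
def fit_line(t, income):
    """Ordinary least squares for INCOME = BASE + SCALING*T."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(income) / n
    # Slope = sum of co-deviations / sum of squared deviations in T.
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, income))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    scaling = sxy / sxx
    base = y_mean - scaling * t_mean
    return base, scaling

def forecast_total(base, scaling, first_period, n_periods):
    """Sum the predicted income over n_periods future periods."""
    periods = range(first_period, first_period + n_periods)
    return sum(base + scaling * t for t in periods)

# Made-up weekly income for weeks 0..4.
weeks = [0, 1, 2, 3, 4]
income = [100.0, 110.0, 120.0, 130.0, 140.0]

base, scaling = fit_line(weeks, income)
# Expected income over the next four weeks (weeks 5..8).
next_four = forecast_total(base, scaling, first_period=5, n_periods=4)
```

With this toy data the fit gives BASE = 100 and SCALING = 10, so the four-week forecast is 150 + 160 + 170 + 180 = 660.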