Dynamic Optimization - optimization

I'm trying to set up a problem that maximizes consumption (u) over time (10 years), subject to the law of motion for wealth (x[t+1] - x[t] = -u[t]), where x0 = 1000. I've tried to solve it, but optimx can't evaluate the function at the initial parameters. The terminal conditions are x10 >= 0 and u10 >= 0. Here's the code I'm using:
library('optimx')

# Parameters
t <- seq(0, 10)          # time
desc <- 0.1              # discount rate
beta <- 1/(1 + desc)     # discount factor
alpha <- 0.8             # utility curvature (exponent)
xo <- 1000               # wealth in t = 0

# Variables
x <- rep(xo, length(t))  # wealth over time

# Objective function
utilidade_total <- function(u, npar = TRUE, print = TRUE){
  for (i in length(t)){
    obj[i] <- (beta^i)*(u[i]^alpha)
    x[i] <- x[i-1] - u[i-1]
  }
  result <- sum(obj)
  return(result)
}

maxuti <- optimx(par = u, fn = utilidade_total,
                 control = list(maximize = TRUE))
umax <- coef(maxuti)

for (i in 2:length(t)){
  print(obj[i] = (beta^i)*(u^alpha))
  print(stock[i] = x[i-1] - umax[i-1])
}

# Result
for (i in 2:length(t)){
  obj[i] <- (beta^i)*(u^alpha)
  x[i] <- x[i-1] - u[i-1]
}
plot(t, x, type = 'b')
plot(t, u, type = 'b')
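As written, the snippet cannot run: optimx() is given par = u although u is never created, obj is used inside the function before it exists, x[i-1] fails at i = 1, and for (i in length(t)) loops only over the single value 11 rather than 1:length(t). For reference, here is a minimal runnable sketch of the objective, not the poster's intended solution: the starting guess u0, the lower bound of 0, and the quadratic penalty enforcing the budget sum(u) <= xo are my assumptions, and a proper treatment would use a constrained solver or solve the Euler equation analytically.

library(optimx)

t     <- 0:10
desc  <- 0.1                 # discount rate
beta  <- 1/(1 + desc)        # discount factor
alpha <- 0.8                 # utility curvature (exponent)
xo    <- 1000                # initial wealth

# Discounted utility of a consumption path u (length 11, one value per period).
# Overspending the initial wealth is discouraged with a crude quadratic penalty.
utilidade_sketch <- function(u){
  obj     <- beta^t * u^alpha
  penalty <- 1e6 * max(0, sum(u) - xo)^2
  sum(obj) - penalty
}

u0 <- rep(xo/length(t), length(t))   # equal split as a feasible starting guess

maxuti <- optimx(par = u0, fn = utilidade_sketch,
                 method = "L-BFGS-B", lower = 0,
                 control = list(maximize = TRUE))
umax <- coef(maxuti)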

Related

For loop to read in multiple tables from SQLite database

I would like to create a for loop that reads in multiple tables from a SQLite database. It could read the first 300 tables, but ideally I would like it to read 300 random tables from my database into R.
For each table read in, I would like it to run the code below, save the graph at the end, and then start over with a new table. If possible, I would like all of the tables to be on the same graph. I have written the code for a single table, but I am unsure how to proceed from here.
for (i in 1:300){
  # Reads the selected table in database
  ind1 <- dbReadTable(mydb, i)
  # Formats the SQL data to appropriate R data structure
  cols <- c("Mortality", "AnimalID", "Species", "Sex", "CurrentCohort",
            "BirthYear", "CaptureUnit", "CaptureSubunit",
            "CaptureArea", "ProjectName")
  ind[cols] <- lapply(ind[cols], factor) ## as.factor() could also be used
  ind$DateAndTime <- as.POSIXct(ind$DateAndTime, tz = "UTC",
                                origin = '1970-01-01')
  # Converts the Longitude and Latitude to UTMs
  ind <- convert_utm(ind1)
  ind_steps <- ind %>%
    # It's always a good idea to *double check* that your data are sorted
    # properly before using lag() or lead() to get the previous/next value.
    arrange(AnimalID, DateAndTime) %>%
    # If we group_by() AnimalID, lead() will insert NAs in the proper
    # places when we get to the end of one individual's data and the beginning
    # of the next
    group_by(AnimalID) %>%
    # Now rename our base columns to reflect that they are the step's start point
    rename(x1 = utm_x,
           y1 = utm_y,
           t1 = DateAndTime) %>%
    # Attach the step's end point
    mutate(x2 = lead(x1),
           y2 = lead(y1),
           t2 = lead(t1)) %>%
    # Calculate differences in space and time
    mutate(dx = x2 - x1,
           dy = y2 - y1,
           DateAndTime = as.numeric(difftime(t2, t1, units = "hours"))) %>%
    # Calculate step length
    mutate(sl = sqrt(dx^2 + dy^2)) %>%
    # Calculate absolute angle
    mutate(abs_angle = (pi/2 - atan2(dy, dx)) %% (2*pi)) %>%
    # Calculate relative angle
    mutate(rel_diff = (abs_angle - lag(abs_angle)) %% (2*pi),
           rel_angle = ifelse(rel_diff > pi, rel_diff - 2*pi, rel_diff)) %>%
    # Drop this unnecessary column
    select(-rel_diff) %>%
    # Drop incomplete final step
    filter(!is.na(x2))
  ind_steps <- ind_steps %>%
    mutate(NSD = (x2 - x1[1])^2 + (y2 - y1[1])^2)
  # Plot NSD
  ind_steps %>%
    ggplot(aes(x = t2, y = NSD)) +
    geom_line() +
    theme_bw()
}
Any help would be greatly appreciated!
If there are, say, 1000 tables, you can use sample to pick 300 of them at random, create a list of length 300 to store the plots, and if you want to plot them together you can use cowplot::plot_grid.
random_tables <- sample(1000, 300)   # sample without replacement, so no table is repeated
plot_list <- vector('list', 300)

for (i in seq_along(random_tables)){
  # Reads the selected table in database
  ind1 <- dbReadTable(mydb, random_tables[i])
  #...Rest of the code
  #....
  #....
  # Plot NSD
  plot_list[[i]] <- ggplot(ind_steps, aes(x = t2, y = NSD)) +
    geom_line() + theme_bw()
}

cowplot::plot_grid(plotlist = plot_list, nrow = 30, ncol = 10)
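One caveat (an assumption on my part, since the database itself isn't shown): DBI::dbReadTable() expects a table name rather than a numeric position, so if your tables are identified by name you may want to sample from dbListTables() instead. A small sketch:

# Sketch: sample 300 table names (assumes mydb is a DBI connection to the SQLite file)
all_tables    <- DBI::dbListTables(mydb)
random_tables <- sample(all_tables, 300)
for (i in seq_along(random_tables)){
  ind1 <- DBI::dbReadTable(mydb, random_tables[i])
  # ...rest of the code as above...
}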

Hessian matrix with optim or numDeriv package?

I am doing maximum likelihood estimation for the log-likelihood function of a Poisson distribution. After the estimation I want to compute the standard errors of the coefficients. For that I need the Hessian matrix. Now I don't know which function I should use to get the Hessian matrix: optim() or hessian() from the numDeriv package.
The two functions give me different solutions, and if I try to compute the standard errors from the Hessian that I get from optim, I get one NaN entry in the result.
What is the difference between these two functions for computing the Hessian matrix?
library(numDeriv)  # for hessian()

logLikePois <- function(parameter, y, z) {
  betaKoef <- parameter
  lambda <- exp(betaKoef %*% t(z))
  logLikeliHood <- -(sum(-lambda + y*log(lambda) - log(factorial(y))))
  return(logLikeliHood)
}

grad <- function(y, z, parameter){
  betaKoef <- parameter
  # lambda of the Poisson regression
  lambda <- exp(betaKoef %*% t(z))
  gradient <- -((y - lambda) %*% (z))
  return(gradient)
}

data(discoveries)
disc <- data.frame(count = as.numeric(discoveries),
                   year = seq(0, (length(discoveries) - 1), 1))
yearSqr <- disc$year^2
formula <- count ~ year + yearSqr
form <- formula(formula)
model <- model.frame(formula, data = disc)
z <- model.matrix(formula, data = disc)
y <- model.response(model)

parFullModell <- rep(0, ncol(z))
optimierung <- optim(par = parFullModell, gr = grad, fn = logLikePois,
                     z = z, y = y, method = "BFGS", hessian = TRUE)
optimHessian <- optimierung$hessian
numderivHessian <- hessian(func = logLikePois, x = optimierung$par, y = y, z = z)

sqrt(diag(solve(optimHessian)))
sqrt(diag(solve(numderivHessian)))
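To get at the "which one should I use" part, here is a cross-check I added (not from the original post), using the objects created above: optim(..., hessian = TRUE) returns a fairly coarse finite-difference Hessian evaluated at the solution, whereas numDeriv::hessian() uses Richardson extrapolation by default and is usually more accurate, which is why the two disagree and why the cruder one can yield NaN once you take square roots of solve()'s diagonal. Since the model is an ordinary Poisson regression, both sets of standard errors can also be compared against glm():

# glm() fit of the same model; yearSqr is taken from the global environment, as above
glm_fit <- glm(count ~ year + yearSqr, family = poisson, data = disc)
se_glm  <- summary(glm_fit)$coefficients[, "Std. Error"]

se_optim    <- sqrt(diag(solve(optimHessian)))
se_numDeriv <- sqrt(diag(solve(numderivHessian)))

cbind(se_optim, se_numDeriv, se_glm)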

Depth Profiling visualization

I'm trying to create a depth profile graph with the variables depth, distance and temperature. The data come from 9 different points with known distances between them (stations 5 m apart, so 9 stations and 9 different sets of data). The temperature readings correspond to these 9 stations, where a sonde was dropped straight down, taking a temperature reading every 2 seconds. The maximum depth at each of the 9 stations was also measured from the boat.
So the data I have is:
Depth at each of the 9 stations (y axis)
Temperature readings at each of the 9 stations, at roughly 0.2 m vertical intervals until the bottom was reached (fill area)
Distance between the stations (x axis)
Is it possible to create a depth profile similar to this? (obviously without the high resolution of that graph)
I've already tried messing around with ggplot2 and raster but I just can't seem to figure out how to do this.
One of the problems I've come across is how to make ggplot2 distinguish between, say, the 5 m temperature reading at station 1 and the 5 m temperature reading at station 5, since they have the same depth value.
Even if you can just point me towards another program that would let me create a graph like this, that would be great.
[ REVISION ]
(Please comment if you know of more suitable interpolation methods, especially ones that don't require cutting the data below the bottom.)
ggplot() needs the data in long form.
library(ggplot2)

# example data
max.depths <- c(1.1, 4, 4.7, 7.7, 8.2, 7.8, 10.7, 12.1, 14.3)
depth.list <- sapply(max.depths, function(x) seq(0, x, 0.2))
temp.list <- list()
set.seed(1); for(i in 1:9) temp.list[[i]] <- sapply(depth.list[[i]], function(x) rnorm(1, 20 - x*0.5, 0.2))
set.seed(1); dist <- c(0, sapply(seq(5, 40, 5), function(x) rnorm(1, x, 1)))
dist.list <- sapply(1:9, function(x) rep(dist[x], length(depth.list[[x]])))

main.df <- data.frame(dist = unlist(dist.list), depth = unlist(depth.list) * -1, temp = unlist(temp.list))

# a raw graph
ggplot(main.df, aes(x = dist, y = depth, z = temp)) +
  geom_point(aes(colour = temp), size = 1) +
  scale_colour_gradientn(colours = topo.colors(10))

# a relatively raw graph (don't run with this example data)
ggplot(main.df, aes(x = dist, y = depth, z = temp)) +
  geom_raster(aes(fill = temp)) + # geom_contour() +
  scale_fill_gradientn(colours = topo.colors(10))
If you want a graph like the one you showed, you have to do interpolation. Several packages provide spatial interpolation methods; in this example I used the akima package, but you should think carefully about which interpolation method is appropriate.
I used nx = 300 and ny = 300 in the code below, but it would be better to choose those values carefully. Large nx and ny give a high-resolution graph, but don't forget the real nx and ny (in this example, the real nx is only 9 and ny is 101).
library(akima); library(dplyr)

interp.data <- interp(main.df$dist, main.df$depth, main.df$temp, nx = 300, ny = 300)
interp.df <- interp.data %>% interp2xyz() %>% as.data.frame()
names(interp.df) <- c("dist", "depth", "temp")

# draw interp.df
ggplot(interp.df, aes(x = dist, y = depth, z = temp)) +
  geom_raster(aes(fill = temp)) + # geom_contour() +
  scale_fill_gradientn(colours = topo.colors(10))

# to check the appropriateness of the interpolation (raw and interpolated data)
ggplot(interp.df, aes(x = dist, y = depth, z = temp)) +
  geom_raster(aes(fill = temp), alpha = 0.3) + # interpolation
  scale_fill_gradientn(colours = topo.colors(10)) +
  geom_point(data = main.df, aes(colour = temp), size = 1) + # raw
  scale_colour_gradientn(colours = topo.colors(10))
The bottoms don't match!! I found that ?interp says "interpolation only within convex hull!", oops... I'm worried about the interpolation around the problem area; is it OK? If it's not a problem, you only need to cut the data below the bottoms. If it is, ... I can't answer immediately (below is example code for the cutting).
bottoms <- max.depths * -1

# calculate bottom values using linear interpolation
approx.bottoms <- approx(dist, bottoms, n = 300) # n must be the same value as interp()'s nx

# change temp values under bottom into NA
library(dplyr)
interp.cut.df <- interp.df %>% cbind(bottoms = approx.bottoms$y) %>%
  mutate(temp = ifelse(depth >= bottoms, temp, NA)) %>% select(-bottoms)

ggplot(interp.cut.df, aes(x = dist, y = depth, z = temp)) +
  geom_raster(aes(fill = temp)) +
  scale_fill_gradientn(colours = topo.colors(10)) +
  geom_point(data = main.df, size = 1)
If you want to use stat_contour
It is harder to use stat_contour than geom_raster because it needs data on a regular grid. As far as I can see from your graph, your data (depth and distance) don't form a regular grid, which makes it much more difficult to use stat_contour with your raw data. So I used interp.cut.df to draw the contour plot. stat_contour also has an endemic problem (see How to fill in the contour fully using stat_contour), so you need to expand your data.
library(dplyr)

# 1st: change NA into a value outside temp's range (I used 0)
interp.contour.df <- interp.cut.df
interp.contour.df[is.na(interp.contour.df)] <- 0

# 2nd: expand the df (it's a little complex, so please use this function)
contour.support.func <- function(df) {
  colname <- names(df)
  names(df) <- c("x", "y", "z")
  Range <- as.data.frame(sapply(df, range))
  Dim <- as.data.frame(t(sapply(df, function(x) length(unique(x)))))
  arb_z <- Range$z[1] - diff(Range$z)/20
  df2 <- rbind(df,
               expand.grid(x = c(Range$x[1] - diff(Range$x)/20, Range$x[2] + diff(Range$x)/20),
                           y = seq(Range$y[1], Range$y[2], length = Dim$y), z = arb_z),
               expand.grid(x = seq(Range$x[1], Range$x[2], length = Dim$x),
                           y = c(Range$y[1] - diff(Range$y)/20, Range$y[2] + diff(Range$y)/20), z = arb_z))
  names(df2) <- colname
  return(df2)
}

interp.contour.df2 <- contour.support.func(interp.contour.df)

# 3rd: check the temp range (these values are used to define the contour's borders (breaks))
range(interp.cut.df$temp, na.rm = TRUE) # 12.51622 20.18904

# 4th: draw ... the bottom border is dirty !!
ggplot(interp.contour.df2, aes(x = dist, y = depth, z = temp)) +
  stat_contour(geom = "polygon", breaks = seq(12.51622, 20.18904, length = 11), aes(fill = ..level..)) +
  coord_cartesian(xlim = range(dist), ylim = range(bottoms), expand = FALSE) + # cut expanded area
  scale_fill_gradientn(colours = topo.colors(10)) # breaks has length 11, so 10 colours are needed
# [Note]
# You can define the contour's border values (breaks) and colors.
contour.breaks <- c(12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5, 20.5)
# = seq(12.5, 20.5, 1) or seq(12.5, 20.5, length = 9)
contour.colors <- c("darkblue", "cyan3", "cyan1", "green3", "green", "yellow2","pink", "darkred")
# breaks's length is 9, so 8 colors are needed.
# 5th: hide the bottom border with a bottom line
approx.df <- data.frame(dist = approx.bottoms$x, depth = approx.bottoms$y, temp = 0) # 0 is a dummy value
ggplot(interp.contour.df2, aes(x = dist, y = depth, z = temp)) +
  stat_contour(geom = "polygon", breaks = contour.breaks, aes(fill = ..level..)) +
  coord_cartesian(xlim = range(dist), ylim = range(bottoms), expand = FALSE) +
  scale_fill_gradientn(colours = contour.colors) +
  geom_line(data = approx.df, lwd = 1.5, color = "gray50")
Bonus: legend technique
library(dplyr)
interp.contour.df3 <- interp.contour.df2 %>% mutate(temp2 = cut(temp, breaks = contour.breaks))
interp.contour.df3$temp2 <- factor(interp.contour.df3$temp2, levels = rev(levels(interp.contour.df3$temp2)))
ggplot(interp.contour.df3, aes(x = dist, y = depth, z = temp)) +
  stat_contour(geom = "polygon", breaks = contour.breaks, aes(fill = ..level..)) +
  coord_cartesian(xlim = range(dist), ylim = range(bottoms), expand = FALSE) +
  scale_fill_gradientn(colours = contour.colors, guide = FALSE) + # add guide = FALSE
  geom_line(data = approx.df, lwd = 1.5, color = "gray50") +
  geom_point(aes(colour = temp2), pch = 15, alpha = 0) + # add
  guides(colour = guide_legend(override.aes = list(colour = rev(contour.colors), alpha = 1, cex = 5))) + # add
  labs(colour = "temp") # add
You want to treat this as a 3-D surface with temperature as the z dimension. The given plot is a contour plot and it looks like ggplot2 can do that with stat_contour.
I'm not sure how the contour lines are computed (often it's linear interpolation along a Delaunay triangulation). If you want more control over how to interpolate between your x/y grid points, you can calculate a surface model first and feed those z coordinates into ggplot2.
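To illustrate that last suggestion with a concrete sketch (my addition, assuming the main.df example data built earlier in this thread): a smoother such as loess() can serve as the surface model, with its predictions on a regular grid handed to ggplot2. loess() is only one possible choice of surface model, and span = 0.3 is an arbitrary value.

library(ggplot2)

# fit a smooth surface temp = f(dist, depth) and predict it on a regular grid
surf_fit <- loess(temp ~ dist * depth, data = main.df, span = 0.3)
grid <- expand.grid(
  dist  = seq(min(main.df$dist),  max(main.df$dist),  length.out = 100),
  depth = seq(min(main.df$depth), max(main.df$depth), length.out = 100)
)
grid$temp <- predict(surf_fit, newdata = grid)

ggplot(grid, aes(x = dist, y = depth, z = temp)) +
  geom_raster(aes(fill = temp)) + # geom_contour() +
  scale_fill_gradientn(colours = topo.colors(10))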

shiny sliderInput range minimum and maximum values

I need to adjust a histogram output in shiny using a range of values (where m is an arbitrary matrix):
#ui.R
sliderInput(inputId = "adjust", label = "Choose adjacency threshold",
            value = c(0.001, 0.9), min = 0.0001, max = 1),
plotOutput("hist")

#server.R
df <- reactive({
  idx = m > min(input$adjust) & m < max(input$adjust)
  data.frame(
    id = row(m)[idx],
    value = m[idx])
})
output$hist <- renderPlot({hist(df()$values)})
However this doesn't seem to affect the histogram: it re-renders when I toggle the slider but it's the same each time... it takes a very long time and it just seems to take all of the values into account.
Does anyone know how to make this work?
When I try to print the slider's min/max value, nothing comes up on the page:
#ui.R
verbatimTextOutput("x")
#server
output$x<-renderPrint({min(input$adjust)})
Hence I might be approaching this the completely wrong way... does anyone know how to do this?
FULL EXAMPLE
library(shiny)
runApp(list(ui = fluidPage(sliderInput(inputId = "adjust", label = "Choose adjacency threshold",
                                       value = c(0.001, 0.9), min = 0.0001, max = 1),
                           plotOutput("hist")
            server = function(input, output){
              adjacentmat <- reactive({adjacency(dat)})
              data <- reactive({
                adj_mat <- adjacentmat()
                adj_mat[adj_mat < input$adjust] <- 0
                m <- adj_mat
                idx = m > min(input$adjust) & m < max(input$adjust)
                data.frame(
                  source = row(m)[idx],
                  target = col(m)[idx],
                  corr = m[idx])
              })
              output$hist <- renderPlot({hist(data()$corr)})
            }
)
generate the dat variable with the following code:
library('dplyr')
set.seed(1)

# generate a couple of clusters
nodes_per_cluster <- 30
n <- 10
nvals <- nodes_per_cluster * n

# cluster 1 (increasing)
cluster1 <- matrix(rep((1:n)/4, nodes_per_cluster) +
                     rnorm(nvals, sd = 1),
                   nrow = nodes_per_cluster, byrow = TRUE)

# cluster 2 (decreasing)
cluster2 <- matrix(rep((n:1)/4, nodes_per_cluster) +
                     rnorm(nvals, sd = 1),
                   nrow = nodes_per_cluster, byrow = TRUE)

# noise cluster
noise <- matrix(sample(1:2, nvals, replace = TRUE) +
                  rnorm(nvals, sd = 1.5),
                nrow = nodes_per_cluster, byrow = TRUE)

dat <- rbind(cluster1, cluster2, noise)
colnames(dat) <- paste0('n', 1:n)
rownames(dat) <- c(paste0('cluster1_', 1:nodes_per_cluster),
                   paste0('cluster2_', 1:nodes_per_cluster),
                   paste0('noise_', 1:nodes_per_cluster))
This works for me:
library(shiny)
runApp(list(ui = fluidPage(
              mainPanel(sliderInput("test", "Select values", value = c(.001, .9), min = 0.0001, max = 1)),
              verbatimTextOutput("test2")
            ),
            server = function(input, output, session) {
              output$test2 <- renderPrint(min(input$test))
            }))
I'm guessing your problem is somewhere in the code you haven't shown us. Can you give the code for the entire running example of your problem?
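In case it helps, here is a sketch of a version of the FULL EXAMPLE that runs end to end. Two parts are my assumptions rather than the poster's code: abs(cor(t(dat))) stands in for the undefined adjacency(dat) call, and the line adj_mat[adj_mat < input$adjust] <- 0 is dropped, because comparing the matrix against the length-2 slider value is unlikely to be what was intended (the min()/max() filter on the next line already does the thresholding). Note also that in the posted example the fluidPage(...) call is never closed before server = ..., and in the first snippet df()$values does not match the column name value.

library(shiny)

runApp(list(
  ui = fluidPage(
    sliderInput(inputId = "adjust", label = "Choose adjacency threshold",
                value = c(0.001, 0.9), min = 0.0001, max = 1),
    plotOutput("hist")
  ),
  server = function(input, output){
    adjacentmat <- reactive({ abs(cor(t(dat))) })   # stand-in for adjacency(dat)
    data <- reactive({
      m <- adjacentmat()
      idx <- m > min(input$adjust) & m < max(input$adjust)   # use both slider values
      data.frame(source = row(m)[idx],
                 target = col(m)[idx],
                 corr = m[idx])
    })
    output$hist <- renderPlot({ hist(data()$corr) })
  }
))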

Convert topicmodels output to JSON

I use the following function to convert the topicmodels output to JSON for use in LDAvis.
topicmodels_json_ldavis <- function(fitted, corpus, doc_term){
  ## Required packages
  library(topicmodels)
  library(dplyr)
  library(stringi)
  library(tm)
  library(LDAvis)
  ## Find required quantities
  phi <- posterior(fitted)$terms %>% as.matrix
  theta <- posterior(fitted)$topics %>% as.matrix
  vocab <- colnames(phi)
  doc_length <- vector()
  for (i in 1:length(corpus)) {
    temp <- paste(corpus[[i]]$content, collapse = ' ')
    doc_length <- c(doc_length, stri_count(temp, regex = '\\S+'))
  }
  temp_frequency <- inspect(doc_term)
  freq_matrix <- data.frame(ST = colnames(temp_frequency),
                            Freq = colSums(temp_frequency))
  rm(temp_frequency)
  ## Convert to json
  json_lda <- LDAvis::createJSON(phi = phi, theta = theta,
                                 vocab = vocab,
                                 doc.length = doc_length,
                                 term.frequency = freq_matrix$Freq)
  return(json_lda)
}
but I receive the following error
Error in LDAvis::createJSON(phi = phi, theta = theta, vocab = vocab, doc.length = doc_length, : Length of doc.length not equal
to the number of rows in theta; both should be equal to the number of
documents in the data.
Here is my complete code:
data <- read.csv("textmining.csv")
corpus <- Corpus(DataframeSource(data.frame(data$reasonforleaving)))
# Remove punctuations and numbers because they are generally uninformative.
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
# Convert all words to lowercase.
corpus <- tm_map(corpus, content_transformer(tolower))
# Remove stopwords such as "a", "the", etc.
corpus <- tm_map(corpus, removeWords, stopwords("english"))
# Use the SnowballC package to do stemming.
library(SnowballC)
corpus <- tm_map(corpus, stemDocument)
# remove extra words
toSpace <- content_transformer(function (x , pattern ) gsub(pattern, " ", x))
corpus <- tm_map(corpus, toSpace, "still")
corpus <- tm_map(corpus, toSpace, "also")
# Remove excess white spaces between words.
corpus <- tm_map(corpus, stripWhitespace)
# Inspect the first document to see what it looks like.
corpus[[1]]$content
dtm <- DocumentTermMatrix(corpus)
# remove empty documents
library(slam)
dtm = dtm[row_sums(dtm)>0,]
# Use topicmodels package to conduct LDA analysis.
burnin <- 500
iter <- 1000
keep <- 30
k <- 5
result55 <- LDA(dtm, 5)
ldaoutput = topicmodels_json_ldavis(result55,corpus, dtm)
Do you know why I receive the error?
Thanks
I had the same issue with the same code, and found this function here:
topicmodels2LDAvis <- function(x, ...){
  post <- topicmodels::posterior(x)
  if (ncol(post[["topics"]]) < 3) stop("The model must contain > 2 topics")
  mat <- x@wordassignments
  LDAvis::createJSON(
    phi = post[["terms"]],
    theta = post[["topics"]],
    vocab = colnames(post[["terms"]]),
    doc.length = slam::row_sums(mat, na.rm = TRUE),
    term.frequency = slam::col_sums(mat, na.rm = TRUE)
  )
}
It is much, much simpler to use; just pass your LDA result as the argument:
result55 <- LDA(dtm, 5)
serVis(topicmodels2LDAvis(result55))
Problem
Your problem is in for (i in 1:length(corpus)) here:
doc_length <- vector()
for (i in 1:length(corpus)) {
  temp <- paste(corpus[[i]]$content, collapse = ' ')
  doc_length <- c(doc_length, stri_count(temp, regex = '\\S+'))
}
Remember, you have removed some "empty" documents from your DocumentTermMatrix with dtm = dtm[row_sums(dtm)>0,],
but this loop still runs over the full corpus, so your doc_length vector ends up longer than the number of rows in theta.
Suggestion
You may want to keep a vector of the empty docs around as it will help you not only to generate the JSON but also to go back and forth between your empty and full document sets.
doc.length = colSums( as.matrix(tdm) > 0 )[!empty.docs]
My suggestion assumes you have the full tdm with empty docs in place
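Building on that suggestion, here is one hypothetical way to keep the corpus and the DocumentTermMatrix aligned in the asker's own pipeline (a sketch using the question's variable names; the key point is simply that the same documents are dropped from both objects, so doc_length and theta end up describing the same set of documents):

library(tm)
library(slam)

dtm_full <- DocumentTermMatrix(corpus)
keep <- row_sums(dtm_full) > 0      # TRUE for non-empty documents; empty.docs would be !keep
dtm <- dtm_full[keep, ]
corpus <- corpus[keep]              # drop the same documents from the corpus

result55 <- LDA(dtm, 5)
ldaoutput <- topicmodels_json_ldavis(result55, corpus, dtm)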