Kevin,
I didn't have the data set (you might want to post a link to a
downloadable file instead), but on a quick look at the code your foo1
function looks as if it is not guaranteed to return an array of the same
length each time. It's testing for nonzero coefs in a fitted model and then
dropping exact zero coefs. That need not (and often will not) return the
same coefficients every time in a simulation. foo2 does the same general
kind of thing.
Could that be part of the problem?
S Ellison
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Kevin Egan
Sent: 05 January 2021 12:26
To: r-help <r-help at r-project.org>
Subject: [R] Error Running Bootstrap Function within Wrapper Function
===============
EXTERNAL EMAIL
===============
Hello,
I am currently trying to solve a problem with the boot package and
function within a function in R. I have developed several functions to
perform the lasso but continue to receive an error when bootstrapping
functions within a wrapper function. When I perform these methods using
tsboot outside the wrapper function, I do not get an error. However, when
placed within my function I continue to get the error "Error in t[r, ]
: number of items to replace is not a multiple of length."
I've attached an example of my functions as well as a data file of the
am using. I'm sorry the file is so large, but I do not get a problem
smaller number of observations.
library(boot)
library(glmnet)
library(np)
foo1 <- function(data,index){ #index is the bootstrap sample index
x <- data[index, -1] %>%
as.matrix() %>%
unname()
y <- data[index, 1] %>%
scale(center = TRUE, scale = FALSE) %>%
as.matrix() %>%
unname()
ols <- lm(y ~ x)
# The intercept estimate should be dropped.
ols.coef <- as.numeric(coef(ols))[-1]
ols.coef[is.na(ols.coef)] <- 0
## The intercept estimate should be dropped.
lasso <- cv.glmnet(x, y, alpha = 1,
penalty.factor = 1 / abs(ols.coef))
# Select nonzero coefficients from bic.out
coef <- as.vector(coef(lasso,
s = lasso$lambda.min))[-1]
return(coef)
}
foo2 <- function(data, index){ #index is the bootstrap sample index
x <- data[index, -1] %>%
as.matrix() %>%
unname()
y <- data[index, 1] %>%
scale(center = TRUE, scale = FALSE) %>%
as.matrix() %>%
unname()
# ic.glmnet provides coefficients with lowest BIC
ols <- lm(y ~ x)
# The intercept estimate should be dropped.
ols.coef <- as.numeric(coef(ols))[-1]
ols.coef[is.na(ols.coef)] <- 0
lasso <- cv.glmnet(x, y, alpha = 1,
penalty.factor = 1 / abs(ols.coef))
# Select nonzero coefficients from bic.out
coef <- as.vector(coef(lasso,
s = lasso$lambda.min))[-1]
coef_nonzero <- coef != 0
if(sum(coef_nonzero) > 0) {
ls_obj <- lm(y ~ x[, coef_nonzero, drop = FALSE])
ls_coef <- as.vector(coef(ls_obj))[-1]
coef[coef_nonzero] <- ls_coef
}
return(coef)
}
foo3 <- function(data, num_samples) {
bstar <- b.star(data[, 1], round = TRUE)
# Select Block Length of circular block result
blocklength <- bstar[, 2]
init_boot_ts <- tsboot(tseries = data,
statistic = foo1,
R = num_samples, l = blocklength,
sim = "fixed")
final_boot_ts <- tsboot(tseries = data,
statistic = foo2,
R = num_samples,
l = blocklength, sim = "fixed")
# point estimates
final_boot_t0 <- final_boot_ts$t0
return(list(point_estimates = final_boot_t0)) } num_samples <- 50
test.foo3 <- foo3(data, num_samples = num_samples)
Which *sometimes *works, however, sometimes I get the error: ?Error in
] <- res[[r]] : number of items to replace is not a multiple of length".
I've also gotten the error "Error in x[, coef_nonzero, drop = FALSE]
:(subscript) logical subscript too long" at times, when running foo2
foo3. Which seems to be unclear as there should never be an index which
greater than the number of columns in x.
As I stated, this error particularly occurs when I increase the number of
bootstrap samples I try to run, and only when I run foo3. When I run foo1
and foo2 separately, I don?t get an error at all. I?m wondering if there
something I need to add to my function ?foo3? to ensure that an error
this doesn?t occur since I am calling several other functions within this
function, ie., tsboot, foo1, and foo2.
Lastly, although I have not provided an example here, I get an error for
code when running with lapply for several dataframes similar to the
attached.
Again, I apologise for attaching such a large data frame but the function
seems to work with fewer observations. Perhaps it is a data issue?
Thanks