
Using sapply instead of for loop

10 messages · Amit Thombre, Charles Determan Jr

#
I am trying to replace a for loop with sapply. The code does forecasting with ARIMA and is as follows:
-------------------------------------------------------
far<-function(p)
{

cat("does it come here value of p", p)
tryCatch({
air.model <-Arima(tsa,order=c(i-1,j-1,k-1), seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=lbda)  # the arima model

f<- forecast(air.model,h=testsize1) # for getting the error

ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)

}, error=function(e)
{

return(NA)
}
)
cat("Value of error", ervalue[i,j,k,l,m,n,p])
cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
print(ervalue)
return(ervalue)
}
---------------------------
maxval=2  # set the array size as well as the maximum parameter value here.
pmax=maxval  # set max p value of the ARIMA model
dmax=maxval  # set max d value of the ARIMA model
qmax=maxval  # set max q value of the ARIMA model
Pmax=maxval  # set max P value of the ARIMA model
Dmax=maxval  # set max D value of the ARIMA model
Qmax=maxval  # set max Q value of the ARIMA model
Permax=2     # maximum value of period.

st=2013   # start year value for getting the time series
month=4
d<-c(10, 13, 14, 4, 5, 6, 7, 10, 12, 13, 14, 20, 3, 4, 5, 19, 23, 21, 18, 19, 21, 14, 15, 16, 17, 12, 20, 19, 17)
tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as  the time

A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the AIC values
ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values

for (i in 1:pmax)
{
for (j in 1:dmax)
{
for (k in 1:qmax)
{
for (l in 1:Pmax)
{
for (m in 1:Dmax)
{
for (n in 1:Qmax)
{
A<-sapply((1:Permax),function(p) far(p),simplify=FALSE)

}
}
}
}
}  #for looping through period value
}
------------------------------------------------------------------
The sapply call replaces this for loop:
for (p in 1:Permax)
{
cat("does it come here value of p", p)
tryCatch({
air.model <-Arima(tsa,order=c(i-1,j-1,k-1), seasonal=list(order=c(l-1,m-1,n-1),period=p), lambda=lbda)  # the arima model
A[i,j,k,l,m,n,p]<-AIC(air.model)
f<- forecast(air.model,h=testsize1) # for getting the error
er[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)
}, error=function(e)
{

return(NA)
}
)
 cat("Value of error", er[i,j,k,l,m,n,p])
 cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
}
--------------------------------------------------------------------------
Now er[i,j,k,l,m,n,p], i.e. the error, gets populated, but on every call to far() the array loses its previous values: they are replaced with NA and only the newly calculated error value remains. In the end the array A holds only the latest value and none of the old ones. Please help.


#
Amit,

Your question isn't necessarily complete.  You haven't provided a
reproducible example of your data or an error message.  At first glance you
aren't passing anything to your 'far' function except for 'p' and yet it
uses i,j,k,l,m,n,testsize1, and act1.  You should generally try to avoid
global variables as they can lead to broken code.  You should redefine your
function with all the needed parameters and try again.
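
As a minimal sketch of that advice (the function score() and its arguments are hypothetical stand-ins, not part of the posted code or of the forecast package), a function that receives everything it needs through its arguments might look like this:

```r
# Hypothetical stand-in for the model-fitting step.
# Every value the function needs arrives through its arguments,
# so nothing is looked up in the global environment.
score <- function(p, i, j, offset) {
  (i + j) * p + offset
}

i <- 1
j <- 2

# Only the iterator p varies; the other values are passed explicitly.
res <- sapply(1:3, function(p) score(p, i, j, offset = 0.5))
print(res)
```

Written this way, a forgotten input shows up immediately as a "missing argument" error instead of silently picking up whatever happens to be defined globally.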

Regards,

On Wed, Nov 19, 2014 at 3:47 AM, Amit Thombre <amitmt at techmahindra.com>
wrote:

  
    
#
Charles ,

I am not getting an error. The final array A does not have the values in it. Here is the reproducible code. I have even tried passing ervalue as a parameter to the function far.

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

errf<-function(act, res, testsize, flag)
{
j=1
if(flag==1)
{
j<-NROW(d)-testsize  # NROW() also works when d is a plain vector; nrow() would return NULL
}

print(act)
print(res)
print(flag)
diff<-0
s<-0
# loop for iterating to each value of the actual value and finding the difference with thepredicted value
for (mn in 1:length(act))
{
cat("Value of mn in err", mn)
cat("Value of j in err", j)
cat("Value of res[j] in err", res[j])
diff<-(act[mn]-res[j])
print(act[mn])
print(res[j])
print(diff)
s<-s+(diff*diff)

j<-j+1
}

er1<-sqrt(s/length(act)) #forecasting error
print(er1)
return(er1)
}



far<-function(p)
{

cat("does it come here value of p", p)
tryCatch({
air.model <-Arima(tsa,order=c(i-1,j-1,k-1), seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=lbda)  # the arima model

f<- forecast(air.model,h=testsize1) # for getting the error

ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)

}, error=function(e)
{

return(NA)
}
)
cat("Value of error", ervalue[i,j,k,l,m,n,p])
cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
print(ervalue)
return(ervalue)
}
---------------------------
library('TTR')
library('forecast')
library('timeSeries')
library('xts')
library('RODBC')


maxval=2  # set the array size as well as the maximum parameter value here.
pmax=maxval  # set max p value of the ARIMA model
dmax=maxval  # set max d value of the ARIMA model
qmax=maxval  # set max q value of the ARIMA model
Pmax=maxval  # set max P value of the ARIMA model
Dmax=maxval  # set max D value of the ARIMA model
Qmax=maxval  # set max Q value of the ARIMA model
Permax=2     # maximum value of period.
freq=12
d<-c(10, 13, 14, 4, 5, 6, 7, 10, 12, 13, 14, 20, 3, 4, 5, 19, 23, 21, 18, 19, 21, 14, 15, 16, 17, 12, 20, 19, 17)
st=2013   # start year value for getting the time series
month=4
tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as  the time

A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the AIC values
er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # array size depends on maxval; stores the error values
ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
for (i in 1:pmax)
{
for (j in 1:dmax)
{
for (k in 1:qmax)
{
for (l in 1:Pmax)
{
for (m in 1:Dmax)
{
for (n in 1:Qmax)
{
A<-sapply((1:Permax),function(p) far(p),simplify=FALSE)

}
}
}
}
}  #for looping through period value
}
#
If I pass all the variables to the function in the following way, I get this error:
Error in cat("Value of error", ervalue[i, j, k, l, m, n, p]) :
  argument "ervalue" is missing, with no default
In the end the A array should hold the root mean square error calculated for each run. Those values are missing both when I use global variables and when I pass the variables to the function (in the latter case I get the error above, so the code never fills A).



----------------------------------------------------------------

errf<-function(act, res, testsize, flag)
{

j=1
if(flag==1)
{
j<-NROW(d)-testsize  # NROW() also works when d is a plain vector; nrow() would return NULL
}

print(act)
print(res)
print(flag)

diff<-0
s<-0

# loop for iterating to each value of the actual value and finding the difference with thepredicted value
for (mn in 1:length(act))
{

cat("Value of mn in err", mn)
cat("Value of j in err", j)
cat("Value of res[j] in err", res[j])

diff<-(act[mn]-res[j])
print(act[mn])
print(res[j])
print(diff)
s<-s+(diff*diff)

j<-j+1

}

er1<-sqrt(s/length(act)) #forecasting error
print(er1)
return(er1)

}

far<-function(p, i, j, k, l, m, n, ervalue)
{

cat("does it come here value of p", p)

tryCatch({

air.model <-Arima(tsa,order=c(i-1,j-1,k-1), seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=lbda)  # the arima model

f<- forecast(air.model,h=testsize1) # for getting the error



ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)

}, error=function(e)

{

return(NA)

}

)

cat("Value of error", ervalue[i,j,k,l,m,n,p])
cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
print(ervalue)
return(ervalue)
}

---------------------------
library('TTR')
library('forecast')
library('timeSeries')
library('xts')
library('RODBC')





maxval=2  # set the array size as well as the maximum parameter value here.
pmax=maxval  # set max p value of the ARIMA model
dmax=maxval  # set max d value of the ARIMA model
qmax=maxval  # set max q value of the ARIMA model
Pmax=maxval  # set max P value of the ARIMA model
Dmax=maxval  # set max D value of the ARIMA model
Qmax=maxval  # set max Q value of the ARIMA model
Permax=2     # maximum value of period.

freq=12
d<-c(10, 13, 14, 4, 5, 6, 7, 10, 12, 13, 14, 20, 3, 4, 5, 19, 23, 21, 18, 19, 21, 14, 15, 16, 17, 12, 20, 19, 17)
st=2013   # start year value for getting the time series
month=4
tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as  the time



A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the AIC values
er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # array size depends on maxval; stores the error values
ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
for (i in 1:pmax)
{
for (j in 1:dmax)
{
for (k in 1:qmax)
{
for (l in 1:Pmax)
{
for (m in 1:Dmax)
{
for (n in 1:Qmax)
{

A<-sapply((1:Permax),function(p, i, j, k, l, m, n, ervalue) far(p, i, j, k, l, m,n, ervalue),simplify=FALSE)



}

}

}

}

}  #for looping through period value

}
------------------------------------------------------------------
The sapply call replaces this for loop:
for (p in 1:Permax)
{

cat("does it come here value of p", p)

tryCatch({

air.model <-Arima(tsa,order=c(i-1,j-1,k-1), seasonal=list(order=c(l-1,m-1,n-1),period=p), lambda=lbda)  # the arima model

A[i,j,k,l,m,n,p]<-AIC(air.model)

f<- forecast(air.model,h=testsize1) # for getting the error

er[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)

}, error=function(e)

{

return(NA)

}

)
 cat("Value of error", er[i,j,k,l,m,n,p])
 cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)

}
#
Amit,

Even if you aren't getting an error with your original global variables, it
is far better practice to avoid global variables to make your code much more
stable.  Of course you ultimately get to decide how your code is written.

That said, the error from the far function modified to take the extra
variables occurs because you added too much to the sapply statement.  Here is
what it should look like:

            A<-sapply((1:Permax),function(p) far(p, i, j, k, l, m,n, ervalue),simplify=FALSE)

You can think of apply statements as nothing more than a for loop that has
been made 'pretty'.  You wanted to iterate over 1:Permax and use the other
variables, so the anonymous function (i.e. function(p)) should take only the
iterator, and the other values from your nested for loops are supplied to
far() inside it.  When I run this with your code, making sure the function
accepts the extra parameters, the A array appears to fill appropriately,
with most entries being 'NA' as specified by your 'far' function.  Is this
what you expect?
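
To illustrate the "pretty for loop" point, here is a small self-contained sketch; scale_by() and k are made-up names for illustration only. The anonymous function takes just the iterator p, while k is picked up from the surrounding code, exactly as i, j, k, l, m, n would be inside the nested loops:

```r
# Hypothetical helper standing in for far(); not from the posted code.
scale_by <- function(p, k) p * k
k <- 10

# Explicit for loop that fills a pre-allocated vector.
out_loop <- numeric(3)
for (p in 1:3) {
  out_loop[p] <- scale_by(p, k)
}

# The same computation with sapply: only p is an argument of the
# anonymous function; k comes from the enclosing environment.
out_sapply <- sapply(1:3, function(p) scale_by(p, k))

same <- identical(out_loop, out_sapply)
print(same)
```

Listing extra arguments in function(p, ...) without supplying them is what produces the "argument is missing, with no default" error, because sapply only passes the iterator.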


On Wed, Nov 19, 2014 at 8:16 AM, Amit Thombre <amitmt at techmahindra.com>
wrote:

  
    
#
Charles,

Some variables were missing from the code; I have added them in this version. Now cat("Value of error", ervalue[i,j,k,l,m,n,p]) prints an error value for various runs, but those values are not in the final array A. You will have to go through the runs carefully. The ervalue array that gets printed shows the value only for the current run, with all previous values as NA. It is as if the previous values of ervalue are lost with every new value of p. For confirmation, the last value in both the A and ervalue arrays is 3.212016. This is just so that you can check whether you get the same value.
----------------------------------------------------------------------------------------------

errf<-function(act, res, testsize, flag)
{
j=1
if(flag==1)
{
j<-NROW(d)-testsize  # NROW() also works when d is a plain vector; nrow() would return NULL
}

print(act)
print(res)
print(flag)
diff<-0
s<-0
# loop for iterating to each value of the actual value and finding the difference with thepredicted value
for (mn in 1:length(act))
{
cat("Value of mn in err", mn)
cat("Value of j in err", j)
cat("Value of res[j] in err", res[j])
diff<-(act[mn]-res[j])
print(act[mn])
print(res[j])
print(diff)
s<-s+(diff*diff)

j<-j+1
}

er1<-sqrt(s/length(act)) #forecasting error
print(er1)
return(er1)
}



far<-function(p)
{
flagarima=0
cat("does it come here value of p", p)
tryCatch({
air.model <-Arima(tsa,order=c(i-1,j-1,k-1), seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=-0.254)  # the arima model

f<- forecast(air.model,h=5) # for getting the error

ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)

}, error=function(e)
{

return(NA)
}
)
cat("Value of error", ervalue[i,j,k,l,m,n,p])
cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
print(ervalue)
return(ervalue)
}
---------------------------
library('TTR')
library('forecast')
library('timeSeries')
library('xts')
library('RODBC')


maxval=2  # set the array size as well as the maximum parameter value here.
pmax=maxval  # set max p value of the ARIMA model
dmax=maxval  # set max d value of the ARIMA model
qmax=maxval  # set max q value of the ARIMA model
Pmax=maxval  # set max P value of the ARIMA model
Dmax=maxval  # set max D value of the ARIMA model
Qmax=maxval  # set max Q value of the ARIMA model
Permax=2     # maximum value of period.
freq=12
d<-c(3, 2, 5,29, 6, 10, 8, 4, 4, 5, 4, 6, 6, 1, 2, 3,5, 6, 9, 10)
st=2013   # start year value for getting the time series
month=4
tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as  the time
testsize1=5
act1<-d[16:20] # the array of actual values, the forecasted values will be compared against these values



A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the AIC values
er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # array size depends on maxval; stores the error values
ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
for (i in 1:pmax)
{
for (j in 1:dmax)
{
for (k in 1:qmax)
{
for (l in 1:Pmax)
{
for (m in 1:Dmax)
{
for (n in 1:Qmax)
{
A<-sapply((1:Permax),function(p) far(p),simplify=FALSE)

}
}
}
}
}  #for looping through period value
}
#
The following provides array A with 3.212016 as the last value.  The error
values are indeed in the array here.  There is also another with 6.281757
that I noticed at first glance.

errf<-function(act, res, testsize, flag)
{
  j=1
  if(flag==1)
  {
    j<-NROW(d)-testsize  # NROW() also works when d is a plain vector; nrow() would return NULL
  }

  print(act)
  print(res)
  print(flag)
  diff<-0
  s<-0
  # loop for iterating to each value of the actual value and finding the difference with the predicted value
  for (mn in 1:length(act))
  {
    cat("Value of mn in err", mn)
    cat("Value of j in err", j)
    cat("Value of res[j] in err", res[j])
    diff<-(act[mn]-res[j])
    print(act[mn])
    print(res[j])
    print(diff)
    s<-s+(diff*diff)

    j<-j+1
  }

  er1<-sqrt(s/length(act)) #forecasting error
  print(er1)
  return(er1)
}



far<-function(p, i, j, k, l, m, n, ervalue)
{
  flagarima=0
  testsize1 = 5
  cat("does it come here value of p", p)
  tryCatch({
    air.model <-Arima(tsa,order=c(i-1,j-1,k-1),
                      seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=-0.254)  # the arima model

    f<- forecast(air.model,h=testsize1) # for getting the error

    ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)

  }, error=function(e)
  {

    return(NA)
  }
  )
  cat("Value of error", ervalue[i,j,k,l,m,n,p])
  cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
  print(ervalue)
  return(ervalue)
}
---------------------------

library('TTR')
library('forecast')
library('timeSeries')
library('xts')
library('RODBC')


maxval=2  # set the array size as well as the maximum parameter value here.
pmax=maxval  # set max p value of the ARIMA model
dmax=maxval  # set max d value of the ARIMA model
qmax=maxval  # set max q value of the ARIMA model
Pmax=maxval  # set max P value of the ARIMA model
Dmax=maxval  # set max D value of the ARIMA model
Qmax=maxval  # set max Q value of the ARIMA model
Permax=2     # maximum value of period.
freq=12
d<-c(3, 2, 5,29, 6, 10, 8, 4, 4, 5, 4, 6, 6, 1, 2, 3,5, 6, 9, 10)
st=2013   # start year value for getting the time series
month=4
tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as the time series
testsize1=5
act1<-d[16:20] # the array of actual values, the forecasted values will be compared against these values

A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the AIC values
er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # array size depends on maxval; stores the error values
ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # array size depends on maxval; stores the error values
for (i in 1:pmax)
{
  for (j in 1:dmax)
  {
    for (k in 1:qmax)
    {
      for (l in 1:Pmax)
      {
        for (m in 1:Dmax)
        {
          for (n in 1:Qmax)
          {
            A<-sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
ervalue),simplify=FALSE)

          }
        }
      }
    }
  }  #for looping through period value
}

A


On Wed, Nov 19, 2014 at 9:46 AM, Amit Thombre <amitmt at techmahindra.com>
wrote:

  
    
#
The following is printed for  i,j,k,l,m,n,p 2 2 2 2 2 1 2

"Value of error 6.281757Value of i,j,k,l,m,n,p 2 2 2 2 2 1 2, , 1, 1, 1, 1, 1"
Thus ervalue[2,2,2,2,2,1,2] should be 6.281757. But after all the runs, that array element is NA. Also, I think A is a list, so I am not sure how to extract the value from it, but the following is displayed for the same slice when I type A after all the runs:
, , 2, 2, 2, 1, 2
     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA
I think the ervalue array itself loses the values, and hence A does not have them.
#
Ah, this is because you are overwriting your 'A' with each loop.  As a
simple way to demonstrate this I changed:

A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2))

to

A <- list()

and then I changed
A <- sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
ervalue),simplify=FALSE)

to

A<-append(A, sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
ervalue),simplify=FALSE))

Once the run is complete you can find the 6.281757 in A[126].  You can
create another index to locate it more easily in the list, but the
ervalue position is indeed , , 2,2,2,1,2 as you show above.
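
One possible reading of the "previous values are lost" symptom, offered as an assumption rather than something stated in this thread: R copies an array when a function modifies it, so each call to far() fills one cell of its own private copy and the caller's ervalue stays all NA. The sketch below uses toy names (store, fill_one, cell_value) and a 2 x 2 array instead of the 7-dimensional one. It shows the effect, then an alternative to the append() approach in which the function returns a single number and the caller writes it into the array:

```r
store <- array(NA_real_, dim = c(2, 2))

# Mirrors the pattern in far(): modify the array argument and return it.
fill_one <- function(p, store) {
  store[1, p] <- p * 1.5   # changes a local copy only
  store
}

returned <- lapply(1:2, function(p) fill_one(p, store))

# The caller's array was never changed ...
untouched <- all(is.na(store))
# ... and each returned copy holds only the cell set in that call.
filled_per_copy <- sapply(returned, function(a) sum(!is.na(a)))

# Alternative: return just the scalar and assign in the caller.
cell_value <- function(p) p * 1.5
store[1, ] <- sapply(1:2, cell_value)
print(store)
```

In the posted code the equivalent assignment would be along the lines of ervalue[i,j,k,l,m,n,] <- sapply(1:Permax, ...), with far() returning only the error value (or NA from the error handler).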


On Wed, Nov 19, 2014 at 11:46 AM, Amit Thombre <amitmt at techmahindra.com>
wrote:

  
    
#
Thanks Charles. I will have to extract the min value from the list A by selecting the proper index.
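
A rough sketch of that extraction, assuming A is a list of arrays in which each element holds at most a few non-NA error values (the small 2 x 2 arrays and the numbers below are made up for illustration):

```r
# Toy stand-in for the list built with append(); values are invented.
A <- list(
  array(c(NA, 6.28, NA, NA), dim = c(2, 2)),
  array(c(NA, NA, 3.21, NA), dim = c(2, 2)),
  array(NA_real_, dim = c(2, 2))
)

# Smallest non-NA value in each list element (NA if the element is all NA).
best_per_element <- sapply(A, function(a) {
  if (all(is.na(a))) NA_real_ else min(a, na.rm = TRUE)
})

# which.min() skips NA entries and returns the position of the minimum.
best_idx <- which.min(best_per_element)
best_val <- best_per_element[best_idx]

# Position of that value inside the winning array, as array indices.
pos <- which(A[[best_idx]] == best_val, arr.ind = TRUE)
print(best_idx)
print(best_val)
print(pos)
```

With the real 7-dimensional arrays, the row returned by which(..., arr.ind = TRUE) would give the i, j, k, l, m, n, p combination directly.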