Is combining mclapply and gbm tasks possible using R-3.0.1 ?

4 messages · Stephen Weston, Patrick Connolly

#
Apologies for such a long question.  The question is fairly simple but
takes a lot of describing.

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8    
 [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] datasets  parallel  splines   grDevices utils     stats     graphics 
[8] methods   base     

other attached packages:
[1] gbm_2.1          survival_2.37-4  cairoDevice_2.19 lattice_0.20-15 

loaded via a namespace (and not attached):
[1] grid_3.0.1      multicore_0.1-7 tools_3.0.1    


Using a system with the above characteristics, I made a function
modifying some of the code from the examples in the gbm() function
help.  The objective was to run some examples with different seeds,
and to do those in parallel using mclapply.  In the interests of
limiting the size of this email body and of avoiding email software
munging the function code, I've put the code into the attached file
testing.fn.sc which can be sourced into an R session.
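
The attached file is not reproduced in this archive.  As a rough
sketch of the general pattern only (run one task per seed under
mclapply), something like the following could be used.  The function
run_one and the seed values are hypothetical stand-ins, and the body
deliberately does not call gbm; it is not the code from testing.fn.sc.

```r
library(parallel)

# Stand-in for the real per-seed work (hypothetical; not the gbm call
# from the attachment): draw some data and report the worker's PID.
run_one <- function(seed) {
  set.seed(seed)
  x <- rnorm(100)
  data.frame(seed = seed, pid = Sys.getpid(), mean = mean(x))
}

# Named seeds, so the result list is named a, b, c, d.
seeds <- c(a = 1, b = 2, c = 3, d = 4)

# mclapply forks, so more than one core needs a Unix-alike.
nc <- if (.Platform$OS.type == "windows") 1L else 2L
res <- mclapply(seeds, run_one, mc.cores = nc)
```

Each element of res is a one-row data frame, and the pid column shows
which process handled each seed.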


That function runs fine when I use an unupdated installation of
R-2.13.1 and gbm 1.6-3.1, and it is quite capable of using four cores
simultaneously.  (It needs slight modification to use multicore
instead of parallel, and the call to gbm has no n.cores parameter.)

> testing(4)
2013-08-21 20:57:31  Begin using multicore method with phony data with 4 cores.
Core 1 uses 20442 

Core 2 uses 20443 

Core 3 uses 20445 

Core 4 uses 20447 

 2013-08-21 20:57:36 
....Completed testing multicore method with invented data.
$a
   CV Test OOB
1 126  131  79

[...]

$d
   CV Test OOB
1 123  140  81


However, when I try it with the current versions I get this:

> system.time(bbb <- testing(4))
  2013-08-21 16:18:03  Begin using multicore method with phony data with 4 cores.

Core 1 uses 22812 

Core 2 uses 22814 

Core 3 uses 22816 

Core 4 uses 22819 
This session PID is 22821:
begun at 2013-08-21 16:18:04:
This session PID is 22829:
begun at 2013-08-21 16:18:04:
This session PID is 22838:
begun at 2013-08-21 16:18:04:
This session PID is 22847:
begun at 2013-08-21 16:18:04:
 2013-08-21 16:18:07 
....Completed testing multicore method with invented data.
   user  system elapsed 
  0.460   1.760   3.926 
Warning message:
In mclapply(subsets, FUN = test.gbm, mc.cores = nc, mc.cleanup = FALSE,  :
  3 function calls resulted in an error


> bbb$b
[1] "Error in socketConnection(\"localhost\", port = port, server = TRUE, blocking = TRUE,  : \n  cannot open the connection\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in socketConnection("localhost", port = port, server = TRUE, blocking = TRUE,     open = "a+b", timeout = timeout): cannot open the connection>

(bbb$a works properly and the errors on bbb$c and bbb$d are identical
to the above.)

The two lines that look like this 

   This session PID is 22847:
   begun at 2013-08-21 16:18:04:

will look mysterious.  They appear because my .Rprofile cats the
start time and the process id at the beginning of each R session
(handy to know sometimes).  Those outputs indicate that extra R
processes are started and, I assume, end with nothing to do.
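
The actual .Rprofile lines are not shown in this thread.  A minimal
sketch that would print messages of the same shape is below; the
function name announce_session is made up for illustration.

```r
# Hypothetical helper for ~/.Rprofile: announce each new R session.
announce_session <- function() {
  cat("This session PID is ", Sys.getpid(), ":\n",
      "begun at ", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), ":\n",
      sep = "")
}

announce_session()
```

Sys.getpid() is part of base R and is available on Windows as well as
on Unix-alikes, so the same lines can be used on either platform.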

That happens even when there is no problem with mclapply, such as when
only a single core is used.

> system.time(aaa <- testing())
2013-08-21 16:09:46  Begin using multicore method with phony data with 1 cores.

Core 1 uses 18484 
This session PID is 18487:
begun at 2013-08-21 16:09:46:

Core 2 uses 18511 
This session PID is 18543:
begun at 2013-08-21 16:09:50:

Core 3 uses 18553 
This session PID is 18556:
begun at 2013-08-21 16:09:54:

Core 4 uses 18609 
This session PID is 18612:
begun at 2013-08-21 16:09:58:
 2013-08-21 16:10:01 
....Completed testing multicore method with invented data.
   user  system elapsed 
  1.540   2.070  15.488

Running that same code (minus the PID stuff) on a Windows 7
installation on identical hardware takes about half that time.
Since I don't know how to get the equivalent of the PID information on
Windows, I can't tell if extra R processes are started on that
platform too.  However, running a more demanding task did seem to show
that more than one core was being used, as though the OS is capable of
a degree of parallelism even when the R tasks are done serially.


My question is how can I use gbm and mclapply without reverting to an
ancient R version?

TIA
#
I think the problem has more to do with the version of gbm that you're
using, not the version of R.  Looking over gbm 2.1 briefly, it looks
like it always creates a cluster object and does the cross validation
with parLapply, even when you specify n.cores = 1.  That explains why
you see the messages from your .Rprofile within the mclapply tasks.
The error messages from socketConnection may be happening because
you're creating four cluster objects on the same machine at about the
same time, resulting in port collisions.  If you had any control over
how the cluster object was created by gbm, you might be able to
specify different ports for the different mclapply tasks, but it
doesn't look like gbm has made any provision for that kind of control.
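
To illustrate the idea of distinct ports only: makePSOCKcluster in
the parallel package accepts a port option, so code that creates its
own cluster could give each task its own port.  The helper names
task_port and with_task_cluster and the base port 11000 are
assumptions made up for this sketch, and none of this can be passed
through gbm 2.1 as it stands.

```r
library(parallel)

# Hypothetical helper: give task i its own port, offset from a base.
# The base 11000 is an arbitrary choice for this sketch.
task_port <- function(i, base = 11000L) {
  as.integer(base + i)
}

# Hypothetical wrapper: create a one-worker cluster on the task's port,
# run `fun` with it, and always shut the cluster down afterwards.
with_task_cluster <- function(i, fun) {
  cl <- makePSOCKcluster(1, port = task_port(i))
  on.exit(stopCluster(cl))
  fun(cl)
}

# Ports that four concurrent tasks would use; no two are the same.
ports <- vapply(1:4, task_port, integer(1))
```

Inside an mclapply task with index i, a call such as
with_task_cluster(i, function(cl) parLapply(cl, 1:3, sqrt)) would then
listen on a port that no sibling task is using.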

If this is important to you, I think you should either back off to an
older version of gbm, or talk to the package maintainer.  I don't
think the package expects you to call gbm using mclapply.

- Steve

On Wed, Aug 21, 2013 at 5:45 AM, Patrick Connolly
<p_connolly at slingshot.co.nz> wrote:
#
On Wed, 21-Aug-2013 at 09:17AM -0400, Stephen Weston wrote:
|> I think the problem has more to do with the version of gbm that you're
|> using, not the version of R.  Looking over gbm 2.1 briefly, it looks
|> like it always creates a cluster object and does the cross validation
|> with parLapply, even when you specify n.cores = 1.  That explains why
|> you see the messages from your .Rprofile within the mclapply tasks.
|> The error messages from socketConnection may be happening because
|> you're creating four cluster objects on the same machine at about the
|> same time, resulting in port collisions.  If you had any control over
|> how the cluster object was created by gbm, you might be able to
|> specify different ports for the different mclapply tasks, but it
|> doesn't look like gbm has made any provision for that kind of control.

It should be traceable, since it didn't work like that before version 2.x.

|> 
|> If this is important to you, I think you should either back off to an
|> older version of gbm, or talk to the package maintainer.  I don't
|> think the package expects you to call gbm using mclapply.

My gbm work is far more elaborate than in the example given.  There
aren't enough hours in a month to do them all serially, which wouldn't
make use of a machine with 8 cores and 24 GB of memory.  That is to
say, it is important to me.  It surprises me that others would use gbm
without using parallel jobs.  Though it is superior in several ways to
other methods such as what can be done in Weka, it's orders of
magnitude slower.

I tried both your suggestions.  Harry Southworth, the package
maintainer, says he won't have time to look at it for a week or so.  I
removed the current gbm package and tried reinstalling gbm 1.6-3.1 but
it wouldn't compile using R-3.0.1 since it didn't have a NAMESPACE
file.  However, there was a slightly newer one, version 1.6-3.2, which
did compile.  And yes, it did work with mclapply without all those
extra R processes.

However, it's a bit disconcerting to use an orphaned package.  In my
experience over the years, old stuff soon stops working when changes
aren't made to make it fit the way software evolves.

Harry (the package maintainer) suggested reporting the issue to the
project's googlecode home page.  I'm not familiar with the norms of
such pages.  Is what I have below too much?  It doesn't quite fit the
cookie cutter layout.


Thanx

#
I think you should simply request that they use lapply rather than
parLapply when n.cores = 1.  I believe that will allow your code to
work as before.
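
A minimal sketch of that kind of dispatch, with a made-up function
name cv_apply (this is not gbm's internal code, just the shape of the
change being suggested):

```r
library(parallel)

# Hypothetical dispatcher: run `fun` over `xs` serially when
# n.cores is 1, and on a cluster otherwise.
cv_apply <- function(xs, fun, n.cores = 1L) {
  if (n.cores <= 1L) {
    return(lapply(xs, fun))     # no cluster, no extra R processes
  }
  cl <- makeCluster(n.cores)
  on.exit(stopCluster(cl))      # always release the workers
  parLapply(cl, xs, fun)
}

res <- cv_apply(1:3, function(k) k^2, n.cores = 1L)
```

With n.cores = 1 no socket is opened, so several such calls running
at once under mclapply would not compete for a port.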

- Steve

On Thu, Aug 22, 2013 at 5:36 AM, Patrick Connolly
<p_connolly at slingshot.co.nz> wrote: