pass by reference
Hi,
On Aug 14, 2012, at 10:07 AM, Bert Gunter wrote:
(Offlist, as my comments are not worth bothering the list about).
Almost off list!
I don't understand the purpose of this tirade (whose reasonableness I make no judgment of). R is what it is. If you don't like it for whatever reason, don't use it. As a point of order, there are several packages that "automate" pass by reference/pointers in R to some extent: packages ref, R.oo, and proto are 3 that I know of, but I think there are others. My understanding is that this tends to be computationally inefficient in R, but I have no direct knowledge.
I wonder why Reference Classes are not mentioned. I think they may also be informally called R5 or RefClass. You can learn about them with ?setRefClass.
I have tried using this approach quite recently as I was working with very large data frames. I'm no software engineer, so I am not sure if using R5 style was a significant help to my problem, but with one exception it was pretty painless to give it a shake.* In fact, inside the methods (and therefore the object instance environment?) I probably was working in the pass-by-value paradigm.
I'm hoping that those in the know could shed some light on the pros and cons of using Reference Classes.
Cheers,
Ben
* There are two ways to add methods to a reference class: (1) in the call to setRefClass(), which creates the generator object, and (2) using the MyRefClass$methods() function after the generator object is created. After some puzzle-filled afternoons I settled on the latter as being waaaaay better. I have pasted an example below.
##### START
# generator
MyRefClassR5 <- setRefClass("MyRefClassR5",
   # here are the field properties
   fields = list(
      x = "numeric",
      y = "numeric",
      color = "character",
      flavor = "character"),
   # here we can add methods - but there is a handier way
   # see the length() method added below
   methods = list(
      plot = function(color = .self$color, flavor = .self$flavor, pch = 15, ...){
         graphics::plot(.self$x, .self$y, col = color, main = flavor, pch = pch, ...)
         speak("plotting")
      },
      speak = function(message = paste(.self$flavor, "is the color", .self$color)){
         cat("MyRefClass:", message, "\n")
      })
   )

# convenience wrapper around the generator with default field values
MyRefClass <- function(x = seq(from = 0, to = 10), y = x^2,
   color = "brown", flavor = "chocolate"){
   X <- MyRefClassR5$new(x = x, y = y,
      color = color, flavor = flavor)
   return(X)
}
# create an instance, a
a <- MyRefClass()
a$speak()
a$plot()
# So, now I have an instance, a, of MyRefClassR5.
# But suppose I want to add a new method called length.
# this is a handier way to add methods as it adds them to existing
# instances of the class - if this new method is added above in the
# setRefClass() generator, then subsequent instances of the object would
# have a length() method, but object a would be orphaned without it.
MyRefClassR5$methods(
   length = function(){
      len <- c(x = base::length(.self$x), y = base::length(.self$y))
      s <- paste("length x =", len["x"], " length y =", len["y"])
      speak(s)
      return(len)
   })
# can I use this method on instance a? Yup!
a$length()
#### END
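For readers wondering what "by reference" buys you in practice, here is a minimal sketch, separate from the example above. The class Counter, its method bump() and the function bump_twice() are hypothetical names made up for this illustration. It assumes only the methods package that ships with R. The point is that a function can change a Reference Class object without returning it, and that $copy() is needed when an independent copy is wanted.

```r
library(methods)

# Hypothetical class for illustration only (not part of the example above)
Counter <- setRefClass("Counter",
   fields = list(n = "numeric"),
   methods = list(
      bump = function(){
         n <<- n + 1        # modifies the field of this object in place
         invisible(.self)
      })
   )

# A function that changes its argument; nothing is returned to the caller
bump_twice <- function(obj){
   obj$bump()
   obj$bump()
   invisible(NULL)
}

a <- Counter$new(n = 0)
b <- a            # b refers to the same object as a
d <- a$copy()     # d is an independent copy

bump_twice(a)

a$n   # 2
b$n   # 2, because a and b are the same object
d$n   # 0, the copy was not touched
```

With an ordinary list or data frame in place of the Counter object, bump_twice() would only have changed a local copy and the caller would see no difference.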
As you presumably already know, you can also implement this manually through the use of environments and S3 or S4 semantics. You might also be interested in Luke Tierney's comments on references in R:
http://homepage.stat.uiowa.edu/~luke/R/references.html
Cheers,
Bert
On Tue, Aug 14, 2012 at 2:07 AM, Jan T Kim <jttkim at googlemail.com> wrote:
On Mon, Aug 13, 2012 at 11:20:26PM -0300, Alexandre Aguiar wrote:
Sachinthaka Abeywardana <sachin.abeywardana at gmail.com> escreveu:
Think you are missing the point,
As lover of C-style pointers, I must admit that hiding complexities (and associated problems) of pointers is a great feature of all successful high level languages (HLLs). As much as they spare time and can be easily learned by non-programmers, they impose penalties in performance and memory consumption. Most drawbacks of HLLs have been effectively and efficiently addressed by a number of strategies in such a way that currently we have a wide variety of options. Languages have become tools to solve problems and, as such, we must pick the proper tool for each problem. That means the very first step is properly assessing the problem.
Yes, I quite agree. In my experience, the root of the trouble leading to "how can I pass things by reference in R" requests is that many problems involve objects that retain their identity while their attributes change in a dynamic way. These are represented more adequately by having multiple functions changing the state of the same (identical!) object, rather than approximating this by repeatedly replacing an object with an updated copy of itself, or by using other "hacks". The overheads / inefficiencies that come with such hacks are really just a consequence of an inadequate representation of the problem (i.e. "as though one had not assessed it properly", if you will).
As a further indication that this is a design issue rather than one of optimisation at the implementation level, notice that from a database perspective, holding multiple copies that represent the same thing in memory amounts to a denormalised design.
Personally, I understand functional programming evangelists who object to state and side effects because this "purism" is adequate (and also often very elegant) for enabling parallel and distributed computing. But noticing that this is not much of an issue for R (which e.g. doesn't support multithreading much), I do think on a regular basis that providing a mechanism enabling multiple references to one instance would be an improvement that would not do too much damage.
As it is, I've resigned myself to using other languages where I need an object graph, and producing a first normal form type of table where I want to do something in R. But I do get the feeling that I'm doing something not quite right, and I frequently reiterate the problem analysis outlined above to myself in order to put that funny feeling behind me.
Just my 2 pence,
Jan
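As a rough illustration of "multiple functions changing the state of the same object", here is a sketch using a plain environment with an S3 class attribute, which is one way the manual approach mentioned elsewhere in this thread might look. The names new_account(), deposit() and withdraw() are hypothetical and chosen only for this example. It uses base R only.

```r
# Hypothetical constructor: an environment holds the mutable state
new_account <- function(balance = 0){
   self <- new.env()
   self$balance <- balance
   class(self) <- "account"    # S3 class attribute on the environment
   self
}

# Two separate functions that change the state of the same object
deposit <- function(acc, amount){
   acc$balance <- acc$balance + amount
   invisible(acc)
}
withdraw <- function(acc, amount){
   acc$balance <- acc$balance - amount
   invisible(acc)
}

acct  <- new_account(100)
alias <- acct               # a second reference, not a copy

deposit(acct, 50)
withdraw(alias, 30)

acct$balance                # 120
identical(acct, alias)      # TRUE, both names refer to one instance
```

Environments are not copied on assignment or when passed to a function, so both names see every change. Whether this is more efficient than replacing copies depends on the problem and is not measured here.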
My 2 cents.
--
Alexandre Aguiar, MD
SCT SPS Consultoria
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- +- Jan T. Kim -------------------------------------------------------+ | email: jttkim at gmail.com | | WWW: http://www.jtkim.dreamhosters.com/ | *-----=< hierarchical systems are for files, not for humans >=-----*
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Ben Tupper Bigelow Laboratory for Ocean Sciences 180 McKown Point Rd. P.O. Box 475 West Boothbay Harbor, Maine 04575-0475 http://www.bigelow.org