I don't think that using in-place modification as a general property would make sense. In-place modification brings in side effects, and that would mean that the order of evaluation can change the result. To get reliable results, the order of evaluation should not be a reason for different results, and that's why the functional approach is much better for reliable programs.

So, in general I would say this feature is a no-no, and I would rather discourage in-place modification. For certain cases it might help... but for such cases either such a boolean flag or programming a separate module in C would make sense. There could also be a global in-place flag (set via options, maybe), but if such a thing were implemented, the default value should be FALSE.

Ciao,
Oliver
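The point about evaluation order can be illustrated with a small R sketch. It is only an illustration: the environment `state` and the functions `double_it()`, `add_ten()` and `run()` are hypothetical names that do not appear in the thread, and an environment is used because it is one of the few R objects with reference semantics.

```r
# Hypothetical example: two functions that modify shared state as a side effect.
run <- function(double_first) {
  state <- new.env()          # environments are modified by reference
  state$x <- 1
  double_it <- function() { state$x <- state$x * 2;  state$x }
  add_ten   <- function() { state$x <- state$x + 10; state$x }
  if (double_first) {
    a <- double_it()
    b <- add_ten()
  } else {
    b <- add_ten()
    a <- double_it()
  }
  c(a = a, b = b)
}

order1 <- run(TRUE)    # a = 2,  b = 12
order2 <- run(FALSE)   # a = 22, b = 11

# Side-effect-free versions: each value depends only on its own input,
# so it does not matter which of the two calls is evaluated first.
pure_double <- function(x) x * 2
pure_add    <- function(x) x + 10
```

With the side-effecting versions, the values of `a` and `b` depend on which call runs first; with the pure versions they cannot.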
On Thu, Mar 08, 2012 at 04:21:42PM +0000, William Dunlap wrote:
So you propose an inplace=TRUE/FALSE entry for each argument to each function which may want to avoid allocating memory? The major problem is that the function writer has no idea what the value of inplace should be, as it depends on how the function gets called. This makes writing reusable functions (hence packages) difficult.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: oliver [mailto:oliver at first.in-berlin.de]
Sent: Thursday, March 08, 2012 7:40 AM
To: William Dunlap
Cc: R-devel
Subject: Re: [Rd] Julia

Ah, and you mean if it's an anonymous array it could be reused directly from the args. OK, now I see why you insist on the anonymous data thing. I didn't grasp it even in my last mail. But that somehow also relates to what I wrote about reusing an already existing, named vector. Only the moment of in-place modification is different. From
x <- runif(n)
cx <- cos(x)
instead of
cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value
to something like
cx <- runif(n)
cos( cx, inplace=TRUE)
or
cos( runif(n), inplace=TRUE)
This way it would be possible to specify the reuse of the input *explicitly*
(without implicit rules like anonymous vs. named values).
In pseudo-code, something like this:
if (in_place == TRUE )
{
input_val[idx] = cos( input_val[idx] );
return input_val;
}
else
{
result_val = alloc_vec( LENGTH(input_val), ... );
result_val[idx] = cos( input_val[idx] );
return result_val;
}
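At the R level, the behaviour of such a flag could be sketched roughly as follows. This is an assumption-laden illustration, not how R's cos() works: `cos_flag()` and the environment `h` are hypothetical, and the data sits in an environment because plain R vectors cannot be overwritten by a callee in a way the caller can see.

```r
# Hypothetical sketch: 'holder' is an environment with a numeric vector in $val.
cos_flag <- function(holder, inplace = FALSE) {
  if (inplace) {
    holder$val <- cos(holder$val)   # caller-visible overwrite of the input
    invisible(holder$val)
  } else {
    cos(holder$val)                 # input left untouched, new value returned
  }
}

h <- new.env()
h$val <- c(0, pi)

r1 <- cos_flag(h)                   # default: h$val is still c(0, pi)
kept <- h$val
r2 <- cos_flag(h, inplace = TRUE)   # now h$val holds cos() of the old values
```

The caller has to know whether anything else still needs the old contents of `h$val` before passing `inplace = TRUE`.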
Does this match what you were looking for?
Ciao,
Oliver
On Thu, Mar 08, 2012 at 02:56:24PM +0100, oliver wrote:
Hi,
ok, thank you for clarifying what you meant.
You only referred to the reuse of the args, not of an already
existing vector.
So I overgeneralized your example.
But when looking at your example,
and at how I would implement cos(),
I doubt I would copy the args
before calculating the result.
Just allocate a result vector, and then place the cos() of the
input vector into the result vector.
I haven't looked at how it is done in R, but I would guess it's like
that.
In pseudo-code, something like this:
cos_val[idx] = cos( input_val[idx] );
But R also handles complex data with cos() so it will look a bit more
laborious.
What I have seen so far from implementing C-extensions for R is rather
C-ish, and so you have the control on many details. Copying the input
just to read it would not make sense here.
I doubt that R internally is doing that.
Or did you find that in the R code?
The other problem someone mentioned was *changing* the contents of a
matrix... and that this is NOT done in-place when using a function
for it.
But the namespace-name / variable-name as "references" to the matrix
might solve that problem.
Ciao,
Oliver
On Wed, Mar 07, 2012 at 07:10:43PM +0000, William Dunlap wrote:
No, my examples are what I meant. My point was that a function, say
cos(), can act like it does call-by-value but conserve memory when
it can, if it can distinguish between the case
cx <- cos(x=runif(n)) # no allocation needed, use the input space for the return value
and the case
x <- runif(n)
cx <- cos(x=x) # return value cannot reuse the argument's memory, so allocate space for return value
sum(x) # otherwise sum(x) would return sum(cx)
The function needs to know if a memory block is referred to by a name in any environment in order to do that.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
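The caller-visible guarantee described here can be checked with a few lines of R. This is only a sketch: the seed and `n <- 5` are arbitrary choices for the illustration, and the code shows nothing about whether R reuses memory internally, only that a named argument is observably unchanged.

```r
set.seed(1)              # arbitrary seed, only for reproducibility
n <- 5                   # arbitrary small size for the illustration

x <- runif(n)
before <- sum(x)
cx <- cos(x = x)         # x is still bound to a name, so it must survive the call
after <- sum(x)

unchanged <- identical(before, after)   # value semantics: x was not modified

# Anonymous argument: the caller has no name through which it could
# inspect the runif() result afterwards.
cx_anon <- cos(x = runif(n))
```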
-----Original Message-----
From: oliver [mailto:oliver at first.in-berlin.de]
Sent: Wednesday, March 07, 2012 10:22 AM
To: Dominick Samperi
Cc: William Dunlap; R-devel
Subject: Re: [Rd] Julia

On Tue, Mar 06, 2012 at 12:49:32PM -0500, Dominick Samperi wrote:
On Tue, Mar 6, 2012 at 11:44 AM, William Dunlap <wdunlap at tibco.com> wrote:
S (and its derivatives and successors) promises that functions will not change their arguments, so in an expression like
val <- func(arg)
you know that arg will not be changed. You can do that by having func copy arg before doing anything, but that uses space and time that you want to conserve. If arg is not a named item in any environment then it should be fine to write over the original because there is no way the caller can detect that shortcut. E.g., in
cx <- cos(runif(n))
the cos function does not need to allocate new space for its output, it can just write over its input because, without a name attached to it, the caller has no way of looking at what runif(n) returned. If you did
x <- runif(n)
cx <- cos(x)

You have two names here, x and cx, hence your example does not fit what you want to explain. A better example would be:
x <- runif(n)
x <- cos(x)

then cos would have to allocate new space for its output because overwriting its input would affect a subsequent
sum(x)
I suppose that end-users and function-writers could learn to live with having to decide when to copy, but not having to make that decision makes S more pleasant (and safer) to use. I think that is a major reason that people are able to share S code so easily.
But don't forget the "Holy Grail" that Doug mentioned at the start of this thread: finding a flexible language that is also fast. Currently many R packages employ C/C++ components to compensate for the fact that the R interpreter can be slow, and the pass-by-value semantics of S provides no protection here.
[...]

The distinction imperative vs. functional has nothing to do with the distinction interpreted vs. directly executed.

Thinking again about the problem that was mentioned here, I think it might be circumvented. Looking again at R's properties, and looking again into U. Ligges' "Programmieren in R", I saw it mentioned that in R anything (?!) is an object... so then it's OOP; but it was also mentioned that R is a functional language. But this does not mean it's purely functional or has no imperative data structures.

As R relies heavily on vectors, here we have an imperative data structure. So it rather looks to me that "<-" does work in-place on the vectors, even though "<-" itself is a function (which does not matter for the problem).

If that's true (I assume here that it is; correct me if it's wrong), then I think assigning with "<<-" and assign() would also do an imperative (in-place) change of the contents. Then the copying-of-big-objects-when-passed-as-args problem can be circumvented by working either on a variable in the GlobalEnv (using "<<-"), or by using a certain environment for the big data and passing its name (and the variable) as a value to the function, which then uses assign() and get() to work on that data. Then in-place modification should be possible.
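A minimal sketch of the two mechanisms mentioned here, assign()/get() on a dedicated environment and "<<-" on a variable in the enclosing scope, might look as follows. The names `big`, `cos_by_name()`, `counter` and `bump()` are hypothetical. The sketch only shows that the change is visible to the caller; assign() rebinds the name, and whether R avoids a copy internally is not demonstrated.

```r
# Hypothetical container environment for the "big" data.
big <- new.env()
assign("v", c(0, pi), envir = big)

# The function receives the variable's name and its environment,
# not the data itself, and writes the result back under the same name.
cos_by_name <- function(name, env) {
  assign(name, cos(get(name, envir = env)), envir = env)
  invisible(NULL)
}

cos_by_name("v", big)
result <- get("v", envir = big)   # now cos() of the original contents

# "<<-" assigns to an existing variable in an enclosing scope
# instead of creating a local one.
counter <- 0
bump <- function() {
  counter <<- counter + 1
  invisible(counter)
}
bump()
bump()
```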
In 2008 Ross Ihaka and Duncan Temple Lang published the paper "Back to the Future: Lisp as a base for a statistical computing system" where they propose Common Lisp as a new foundation for R. They suggest that this could be done while maintaining the same familiar R syntax.

A key requirement of any strategy is to maintain easy access to the huge universe of existing C/C++/Fortran numerical and graphics libraries, as these libraries are not likely to be rewritten. Thus there will always be a need for a foreign function interface, and the problem is to provide a flexible and type-safe language that does not force developers to use another unfamiliar, less flexible, and error-prone language to optimize the hot spots.
If I hear "type safe" I would rather think of OCaml or maybe
Ada, but not LISP.
Also, LISP has so many "("s and ")"s that it drives people
crazy ;-)
Ciao,
Oliver
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel