Error: unexpected '<' in "<" when modifying existing functions

7 messages · Peter Langfelder, Rui Esteves, Duncan Murdoch

#
Hi.
I am trying to modify the kmeans function.
Something obvious seems to be going wrong with the workspace.
I am a newbie, and here is my code:

 myk = function (x, centers, iter.max = 10, nstart = 1, algorithm =
c("Hartigan-Wong",
+     "Lloyd", "Forgy", "MacQueen"))
+ {
+     do_one <- function(nmeth) {
+         Z <- switch(nmeth, {
+             Z <- .Fortran(R_kmns, as.double(x), as.integer(m),
+                 as.integer(ncol(x)), centers = as.double(centers),
+                 as.integer(k), c1 = integer(m), integer(m), nc = integer(k),
+                 double(k), double(k), integer(k), double(m),
+                 integer(k), integer(k), as.integer(iter.max),
+                 wss = double(k), ifault = 0L)
+             switch(Z$ifault, stop("empty cluster: try a better set
of initial centers",
+                 call. = FALSE), warning(gettextf("did not converge
in %d iterations",
+                 iter.max), call. = FALSE, domain = NA), stop("number
of cluster centres must lie between 1 and nrow(x)",
+                 call. = FALSE))
+             Z
+         }, {
+             Z <- .C(R_kmeans_Lloyd, as.double(x), as.integer(m),
+                 as.integer(ncol(x)), centers = as.double(centers),
+                 as.integer(k), c1 = integer(m), iter = as.integer(iter.max),
+                 nc = integer(k), wss = double(k))
+             myIter=Z$iter
+             if (Z$iter > iter.max) warning("did not converge in ",
+                 iter.max, " iterations", call. = FALSE)
+             if (any(Z$nc == 0)) warning("empty cluster: try a better
set of initial centers",
+                 call. = FALSE)
+             Z
+
+         }, {
+             Z <- .C(R_kmeans_MacQueen, as.double(x), as.integer(m),
+                 as.integer(ncol(x)), centers = as.double(centers),
+                 as.integer(k), c1 = integer(m), iter = as.integer(iter.max),
+                 nc = integer(k), wss = double(k))
+             if (Z$iter > iter.max) warning("did not converge in ",
+                 iter.max, " iterations", call. = FALSE)
+             if (any(Z$nc == 0)) warning("empty cluster: try a better
set of initial centers",
+                 call. = FALSE)
+             Z
+         })
+         Z
+     }
+     x <- as.matrix(x)
+     m <- nrow(x)
+     if (missing(centers))
+         stop("'centers' must be a number or a matrix")
+     nmeth <- switch(match.arg(algorithm), `Hartigan-Wong` = 1,
+         Lloyd = 2, Forgy = 2, MacQueen = 3)
+     if (length(centers) == 1L) {
+         if (centers == 1)
+             nmeth <- 3
+         k <- centers
+         if (nstart == 1)
+             centers <- x[sample.int(m, k), , drop = FALSE]
+         if (nstart >= 2 || any(duplicated(centers))) {
+             cn <- unique(x)
+             mm <- nrow(cn)
+             if (mm < k)
+                 stop("more cluster centers than distinct data points.")
+             centers <- cn[sample.int(mm, k), , drop = FALSE]
+         }
+     }
+     else {
+         centers <- as.matrix(centers)
+         if (any(duplicated(centers)))
+             stop("initial centers are not distinct")
+         cn <- NULL
+         k <- nrow(centers)
+         if (m < k)
+             stop("more cluster centers than data points")
+     }
+     if (iter.max < 1)
+         stop("'iter.max' must be positive")
+     if (ncol(x) != ncol(centers))
+         stop("must have same number of columns in 'x' and 'centers'")
+     Z <- do_one(nmeth)
+     best <- sum(Z$wss)
+     if (nstart >= 2 && !is.null(cn))
+         for (i in 2:nstart) {
+             centers <- cn[sample.int(mm, k), , drop = FALSE]
+             ZZ <- do_one(nmeth)
+             if ((z <- sum(ZZ$wss)) < best) {
+                 Z <- ZZ
+                 best <- z
+             }
+         }
+     centers <- matrix(Z$centers, k)
+     dimnames(centers) <- list(1L:k, dimnames(x)[[2L]])
+     cluster <- Z$c1
+     if (!is.null(rn <- rownames(x)))
+         names(cluster) <- rn
+     totss <- sum(scale(x, scale = FALSE)^2)
+      print(Z$iter)
+     structure(list(cluster = cluster, centers = centers, totss = totss,
+         withinss = Z$wss, tot.withinss = best, betweenss = totss -
+             best, size = Z$nc, iter = Z$iter), class = "kmeans")
+ }
> <environment: namespace:stats>
Error: unexpected '<' in "<"
#
On Fri, Jan 13, 2012 at 4:57 PM, Rui Esteves <ruimaximo at gmail.com> wrote:
Do not include the last line

<environment: namespace:stats>

It is not part of the function definition. Simply leave it out and
your function will be defined in the user workspace (a.k.a. the global
environment).
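
To see why that trailing line trips the parser: it is something print() appends when it displays a function, not R code. A small sketch, using stats::sd as a stand-in for kmeans (an assumption made only to keep the example short; any function from the stats namespace should behave the same way):

```r
# Printing a function from a package appends a line naming its environment.
printed <- capture.output(print(stats::sd))
env_line <- printed[length(printed)]
env_line

# That line is display output, not R syntax, so the parser rejects it.
parse_error <- tryCatch(
  parse(text = env_line),
  error = function(e) conditionMessage(e)
)
parse_error

# deparse() returns only the function's source text, without that line,
# so the result can be parsed and evaluated again.
sd_copy <- eval(parse(text = deparse(stats::sd)))
sd_copy(c(1, 2, 3))
```

The copy created this way lives in whatever environment it was evaluated in, which is the point the next reply picks up.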

HTH

Peter
#
On 12-01-13 8:05 PM, Peter Langfelder wrote:
That's only partly right.  Leaving it off will define the function in 
the global environment, but the definition might not work, because 
that's where it will look up variables, and the original function would 
look them up in the stats namespace.  I don't know if that will matter, 
but it might lead to tricky bugs.
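
A toy illustration of that lookup problem, not taken from the thread: `fake_namespace`, `hidden_helper` and `double_it` are hypothetical names, and a plain environment stands in for a package namespace.

```r
# A toy stand-in for a package namespace: it holds a helper that is
# not visible from the global environment.
fake_namespace <- new.env()
assign("hidden_helper", function(x) x * 2, envir = fake_namespace)

# Defined outside that environment, this function looks up free
# variables starting from the environment where it was created.
double_it <- function(x) hidden_helper(x)

# The helper is not on that lookup path, so the call fails.
first_try <- tryCatch(double_it(21), error = function(e) conditionMessage(e))
first_try

# After pointing the function at the toy namespace, the lookup succeeds.
environment(double_it) <- fake_namespace
double_it(21)
```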

What you should do when modifying a function from a package is set the 
environment to the same environment a function in the package would 
normally get, i.e. to the stats namespace.  I think the as.environment() 
function can do this, but I always forget the syntax; an easier way is 
the following:

Create the new function:

kmeansnew <- function (...) ...

Set its environment the same as the old one:

environment(kmeansnew) <- environment(stats::kmeans)

BTW, if you use the fix() function to get a copy for editing, it will do 
this for you automatically.
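
A minimal sketch of the two steps above, with a way to check the result. The body of `kmeansnew` here is only a placeholder wrapper (an assumption for brevity), not the edited kmeans source:

```r
# Hypothetical modified copy; the body is a placeholder wrapper,
# not the edited kmeans source from this thread.
kmeansnew <- function(x, centers, ...) {
  fit <- kmeans(x, centers, ...)
  fit
}

# Freshly defined, the function belongs to the environment it was typed in.
environment(kmeansnew)

# Give it the same environment as the original stats::kmeans.
environment(kmeansnew) <- environment(stats::kmeans)

# It should now report the stats namespace, like the original does.
environment(kmeansnew)
identical(environment(kmeansnew), asNamespace("stats"))
```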

Duncan Murdoch
#
Thank you both.

1) As Duncan said, if I leave <environment: namespace:stats> out, it
will not work, since it relies on the .C and .Fortran routines that
kmeans calls.

2) I don't know how to use as.environment() (I did not understand it
from reading the help).

3) Setting environment(kmeansnew) <- environment(stats::kmeans) does
not work either.

4) Using fix() works, but then I don't know how to store just the
function in an external file, for example to use it on another
computer.  If I use save(myfunc,"myFile.R", ASCII=TRUE), it doesn't work
when I try to load it again using myfunc=load("myFile.R").

Rui


On Sat, Jan 14, 2012 at 3:22 AM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
#
On 12-01-14 3:58 AM, Rui Esteves wrote:
I think you need to explain what "does not work" means.  What did you 
do, and how do you know it didn't work?
Don't use load() on a source file.  Use load() on a binary file produced 
by save().  You could save() your working function, but then you can't 
edit it outside of R.  To produce a .R file that you can use in another 
session, you're going to need to produce the function, then modify the 
environment, using 2 or 3 above.
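
A rough sketch of both routes, under some assumptions: the file names and the one-line `kmeansnew` body are purely illustrative, and the files go to a temporary directory. Note that save() needs the file given as a named `file` argument, and that load() returns only the names of the restored objects, so assigning its result does not give you the function.

```r
# Route A: a plain-text .R file that can be edited and sourced anywhere.
# The second line of the file sets the environment after the definition.
r_file <- file.path(tempdir(), "kmeansnew.R")   # illustrative path
writeLines(c(
  "kmeansnew <- function(x, centers, ...) kmeans(x, centers, ...)",
  "environment(kmeansnew) <- environment(stats::kmeans)"
), r_file)
source(r_file)            # defines kmeansnew in the global environment
environment(kmeansnew)

# Route B: a binary file written by save() and read back by load().
rdata_file <- file.path(tempdir(), "kmeansnew.RData")
save(kmeansnew, file = rdata_file)

# load() restores the object into the target environment and returns
# the object names, not the object itself.
scratch <- new.env()
loaded_names <- load(rdata_file, envir = scratch)
loaded_names
environment(scratch$kmeansnew)
```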

Duncan Murdoch
#
All of these attempts lead to the same result:
1) First I defined kmeansnew with the content of kmeans, but leaving
the <environment: namespace:stats> out.
Then I ran environment(kmeansnew)<- environment(stats::kmeans) at the
command line.
2) kmeansnew <- kmeans() {.... environment(kmeansnew)<-
environment(stats::kmeans) }
3) kmeansnew <- kmeans() {....}   environment(kmeansnew)<-
environment(stats::kmeans)

When I do kmeansnew(iris[-5],4) it returns:
 Error in do_one(nmeth) : object 'R_kmns' not found

'R_kmns' is the Fortran routine that the original kmeans() calls through
.Fortran(). It is the same error I get if I just leave <environment:
namespace:stats> out.



On Sat, Jan 14, 2012 at 11:50 AM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
#
On 12-01-14 6:08 AM, Rui Esteves wrote:
Number 1 is what you should do.  When you do that and print kmeansnew in 
the console, does it list the environment at the end?  What does
environment(kmeansnew) print?
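
The checks asked for here could look roughly like the sketch below. The one-line `kmeansnew` is a hypothetical stand-in, and the search for symbols containing "kmns" rests on an assumption that the compiled routine is registered under such a name in the stats namespace; the exact name may differ between R versions, which is one possible (unconfirmed) reason for an "object not found" error.

```r
# Hypothetical copy, set up as in attempt 1 of the thread.
kmeansnew <- function(x, centers, ...) kmeans(x, centers, ...)
environment(kmeansnew) <- environment(stats::kmeans)

# What the question asks for: print the function's environment.
environment(kmeansnew)

# Assumption: the compiled-routine symbol lives in the stats namespace
# and its name contains "kmns" (this may vary by R version).
stats_ns <- asNamespace("stats")
kmns_symbols <- ls(envir = stats_ns, pattern = "kmns")
kmns_symbols

# Check whether the exact name used in the edited source exists there.
exists("R_kmns", envir = stats_ns, inherits = FALSE)
```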

Duncan Murdoch