Hi All,

I am new to R and I am not sure how this should be done. I have a matrix of 985x100 values and the class is data.frame. A sample of my dataset looks like this (since it's a huge dataset and it would make the screen look more complex, I am pasting only the first few rows and columns):

            V2           V3           V4           V5           V6
2   0.009953966  -0.01586103 -0.016227028  0.016774711 -0.021342598
3  -0.230181145  0.203303786 -0.685321843  0.147050709 -0.122269004
4  -0.552905273 -0.034039644 -0.511356309 -0.330524909 -0.239088566
5  -0.089739322 -0.082768643 -0.411209134 -0.301011664  1.560185991
6  -1.986059137 -0.252217616 -0.369044526 -0.585619405  0.545903757
7  -1.635875161  2.741310455 -0.058411313 -1.458825827  0.078480977
8   0.525846706 -1.134643662 -0.067014844 -1.431990219 -0.557057121
9  -0.913511821  0.688374777  0.376412044 -0.861746434  2.065507172
10 -1.538179621  0.814330376  1.639939042  -1.41478931  1.802738289
11  0.817957993 -0.426560507  2.773380242 -0.123291817  1.316883748

When I try to use this command to convert it to numeric,

as.numeric(leu_cluster1)

I get an error:

Error: (list) object cannot be coerced to type 'double'

I tried several functions and looked into other forums too, but could not find a solution. I am trying to change it to a numeric data.frame and not to a matrix.

Thanks in advance. :)
How to create a numeric data.frame
13 messages · Sarah Goslee, Barry Rowlingson, Joshua Wiley +4 more
What are you trying to do? It looks numeric, although a visual assessment isn't reliable. The output of str() would be helpful. But I'm not sure what your objective is. What do you think your data frame is now, and what do you think it should be? Sarah
Sarah Goslee http://www.functionaldiversity.org
Hi,

If your matrix is already numeric, then:

as.data.frame(your_matrix_name)

will do the trick. However, if you have a matrix that is not numeric (say it is character), then you could first change its storage mode and then convert it:

storage.mode(your_matrix_name) <- "double"
as.data.frame(your_matrix_name)

(Calling as.numeric() directly on the matrix would drop its dimensions and return a plain vector.)

Matrices can only hold one type of data (for example, all numeric, all character, or all logical), so if *any* of your data is character (say one column contains people's names), then the entire matrix will be character, and converting it to numeric is probably not what you want (the character data will become NA). In which case, you might convert the matrix to a data frame first:

as.data.frame(your_matrix_name)

because data frames can contain different classes of data in their different columns. Once it is a data frame, you could convert the columns that should be numeric to numeric (say, columns 2 through 6 only) by:

your_data_name[, 2:6] <- lapply(your_data_name[, 2:6], as.numeric)

If those columns turn out to be factors rather than character vectors, as.numeric() alone would return the internal level codes, so convert via the levels instead.

For relevant documentation, see

?as.numeric
?as.data.frame
## under the "Details" section, it shows the hierarchy of data types
## that is how I could know that if there is character data, the numeric
## data will be converted up to the character class
?matrix

Hope this helps,

Josh
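The route described above (character matrix, then data frame, then converting only the columns that should be numeric) can be sketched as follows. The toy matrix and its column names are made up for illustration and are not the poster's data; stringsAsFactors = FALSE is passed explicitly so the behaviour does not depend on the R version.

```r
# Hypothetical toy data: one column of names plus two columns of numbers.
# cbind() of character and numeric vectors yields a character matrix.
m <- cbind(name = c("a", "b", "c"),
           x    = c(1.5, 2.5, 3.5),
           y    = c(10, 20, 30))
typeof(m)  # the whole matrix is character

# Convert to a data frame, keeping strings as character (not factor).
df <- as.data.frame(m, stringsAsFactors = FALSE)

# Convert only the columns that should be numeric (here columns 2 and 3).
df[, 2:3] <- lapply(df[, 2:3], as.numeric)

# Inspect the class of every column.
classes <- sapply(df, class)
classes
```

Only the selected columns change; the names column stays character, which a matrix could not allow.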
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
On Mon, Jun 13, 2011 at 11:06 AM, Aparna <aparna.sampath26 at gmail.com> wrote:
Hi All I am new to R and I am not sure of how this should be done. I have a matrix of 985x100 values and the class is data.frame.
You don't have a 'matrix' in the R sense of the word. You seem to have a table of numbers which are stored in an object of class 'data.frame'.
[snip]
When I try to use this command to convert it to numeric,
A data frame doesn't have an overall sense of itself being numeric or character. Only the columns have that, and they can be independent.
as.numeric(leu_cluster1): I get an error Error: (list) object cannot be coerced to type 'double'.
Because a data frame is implemented as a list where each element is
the same length. Each element is a vector of numbers or characters.
You are doing the equivalent of:
as.numeric(list(foo=c(1,2,3)))
now you may think it reasonable to do an 'as.numeric' on that, but what about:
as.numeric(list(foo=list(bar=c(1,2,3),baz=c(34,5)),bar=c("Hello","World")))
how would you 'as.numeric' that?
I tried several functions and looked into other forums too, but could not find a solution. i am trying to change it to numeric data.frame and not to a matrix.
There is no numeric data frame. There is only a numeric matrix, or a data frame with all numeric columns. Do summary(mydataframe) or str(mydataframe) to see what class your columns all are.

Barry
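A minimal sketch of the point made above, using a small made-up data frame (the values are only borrowed from the first rows of the sample): the data frame is a list, as.numeric() on the whole object fails, and the per-column classes are what actually matter.

```r
# Hypothetical two-column data frame with numeric columns only.
df <- data.frame(V2 = c(0.009953966, -0.230181145),
                 V3 = c(-0.01586103, 0.203303786))

# A data frame is a list of equal-length columns.
is.list(df)

# as.numeric() on the whole data frame fails; capture the message instead
# of stopping the script.
res <- tryCatch(as.numeric(df), error = function(e) conditionMessage(e))
res

# Check each column's class, and whether every column is numeric.
col_classes <- sapply(df, class)
all_numeric <- all(sapply(df, is.numeric))
```

If all_numeric is TRUE, the object is already "a data frame with all numeric columns", which is as numeric as a data frame gets.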
On Mon, Jun 13, 2011 at 7:45 AM, Barry Rowlingson
<b.rowlingson at lancaster.ac.uk> wrote:
[snip]
now you may think it reasonable to do an 'as.numeric' on that, but what about:
as.numeric(list(foo=list(bar=c(1,2,3),baz=c(34,5)),bar=c("Hello","World")))
how would you 'as.numeric' that?
Well, "Hello" is one of the first words spoken when meeting someone and in programming, and I think "World" represents not just the earth, but everything that exists. Everything seems best represented by a continuous circle, so logically I would 'as.numeric' that as:

$foo
$foo$bar
[1] 1 2 3

$foo$baz
[1] 34  5

$bar
[1] 1 0

However, R does not agree with my logic ;)

Josh
On Mon, Jun 13, 2011 at 4:45 PM, Barry Rowlingson
<b.rowlingson at lancaster.ac.uk> wrote:
On Mon, Jun 13, 2011 at 11:06 AM, Aparna <aparna.sampath26 at gmail.com> wrote:
Hi All I am new to R and I am not sure of how this should be done. I have a matrix of 985x100 values and the class is data.frame.
You don't have a 'matrix' in the R sense of the word. You seem to have a table of numbers which are stored in an object of class 'data.frame'.
but you could have one:

a <- data.frame(matrix(rnorm(100),10))  # get some data
class(a)                                # check for its class
as.numeric(a)                           # whoops, won't work
class(as.matrix(a))                     # change class, and
as.numeric(as.matrix(a))                # bingo, it works

PF
+----------------------------------------------------------------------- | Patrizio Frederic, | http://www.economia.unimore.it/frederic_patrizio/ +-----------------------------------------------------------------------
Hi

r-help-bounces at r-project.org wrote on 13.06.2011 17:19:39:
Patrizio Frederic <frederic.patrizio at gmail.com> On Mon, Jun 13, 2011 at 4:45 PM, Barry Rowlingson <b.rowlingson at lancaster.ac.uk> wrote:
On Mon, Jun 13, 2011 at 11:06 AM, Aparna <aparna.sampath26 at gmail.com> wrote:
Hi All I am new to R and I am not sure of how this should be done. I have a matrix of 985x100 values and the class is data.frame.
You don't have a 'matrix' in the R sense of the word. You seem to have a table of numbers which are stored in an object of class 'data.frame'.
but you could have one:

a <- data.frame(matrix(rnorm(100),10))  # get some data
class(a)                                # check for its class
as.numeric(a)                           # whoops, won't work
class(as.matrix(a))                     # change class, and
as.numeric(as.matrix(a))                # bingo, it works
Which results in a vector of numbers:

str(as.numeric(as.matrix(a)))
 num [1:100] 0.82 -1.339 1.397 0.673 -0.461 ...

A data frame is a convenient list structure which can contain vectors of various kinds (numeric, character, factor, logical, ...) and looks quite similar to an Excel table.

A matrix is a vector with (2) dimensions, but as it is a vector it cannot consist of objects of different kinds (classes). Therefore you can have a numeric or a character matrix, but not numeric and character columns in the same matrix.

And a vector is a vector (numeric, character, logical, ...), but again you cannot mix items of different classes in one vector.

Regards
Petr
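The distinction drawn above between vectors, matrices, and data frames can be checked directly. This is only an illustrative sketch with made-up values; the object names v, df, and m are hypothetical.

```r
# A vector cannot mix types: everything is promoted to character here.
v <- c(1, "a", TRUE)
class(v)

# A data frame keeps a separate class per column.
df <- data.frame(num = c(1.5, 2), chr = c("x", "y"),
                 stringsAsFactors = FALSE)
sapply(df, class)

# Converting the mixed data frame to a matrix forces a single type,
# so the numeric column is turned into character as well.
m <- as.matrix(df)
typeof(m)

# A data frame with numeric columns only converts to a numeric matrix.
m_num <- as.matrix(df["num"])
typeof(m_num)
```

This is why as.numeric(as.matrix(a)) is only a safe shortcut when every column of the data frame is already numeric.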
Hi Sarah,

Thanks for your advice. My dataset contains all the normalized values. I have to give this dataset as input to the ClusterCons package in R. In order to run the package, the data has to be converted to a numeric data.frame. When I check my data using class(mydataset), it is in the form of a data.frame. But when I try to convert it to numeric using as.numeric(mydataset), it gives me an error saying

Error: (list) object cannot be coerced to type 'double'

Could you tell me why this error occurs?

Thanks
Aparna
Hi Joshua,

While looking at the data, all the values seem to be numeric. As I mentioned, the dataset is already a data.frame. As suggested, I used str(mydata) and got the following result:

str(leu_cluster1)
'data.frame': 984 obs. of 100 variables:
 $ V2 : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...
 $ V3 : Factor w/ 986 levels "-0.000790437",..: 7 666 14 32 105 ...
 $ V4 : Factor w/ 986 levels "-0.0023231","-0.004207663",..: 6 353 267 208 187...
 $ V5 : Factor w/ 986 levels "-0.006466083",..: 585 627 146 131 263 ...
 $ V6 : Factor w/ 986 levels "-0.002119173",..: 11 56 111 898 780...

The columns are not numeric, which I understood from this. There is a function called data_check as part of the Clustercons package I am using for this project. This helps me check whether the input data is numeric or not. Using this I could tell that my data is not numeric, and that is why I was trying to convert it to numeric data.

This forum is of great help since I am able to learn more, and thanks for making this forum so helpful to people like us who are new to R.

Aparna
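The str() output above shows factor columns whose levels merely look like numbers. A small sketch of why that matters, using a made-up factor f rather than the poster's data: as.numeric() applied directly to a factor returns the internal integer codes, not the values printed on screen.

```r
# Hypothetical factor whose levels are number-like strings.
f <- factor(c("0.5", "-1.2", "3.0"))

# Applied directly, as.numeric() returns the position of each value in
# levels(f), i.e. small integers, not the numbers themselves.
codes <- as.numeric(f)
codes

# Going through the levels recovers the numbers that were printed.
values <- as.numeric(levels(f))[f]
values
```

So a column can print as -1.2 and still silently become a small integer code if it is converted the wrong way.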
Which results in a vector of numbers:

str(as.numeric(as.matrix(a)))
 num [1:100] 0.82 -1.339 1.397 0.673 -0.461 ...

A data frame is a convenient list structure which can contain vectors of various kinds (numeric, character, factor, logical, ...) and looks quite similar to an Excel table. A matrix is a vector with (2) dimensions, but as it is a vector it cannot consist of objects of different kinds (classes). Therefore you can have a numeric or a character matrix, but not numeric and character columns in the same matrix. And a vector is a vector (numeric, character, logical, ...), but again you cannot mix items of different classes in one vector.
Of course it is. I forgot to say that the way I proposed works only if the data frame contains numeric columns only.

R is a great tool because you can get to the very same result in many different ways. Depending on the problem you're dealing with, you have to choose the most efficient one. Often, in my research work, the most efficient is the one that uses as few lines of code as possible.

Suppose a is a data.frame which contains numeric objects only:
a <- data.frame(matrix(rnorm(100),10)) # some data

## 1 not very nice: grow the vector column by column
b <- 0
for (j in 1:length(a)) b <- c(b, as.numeric(a[[j]]))  # a[[j]] extracts column j as a vector
b <- b[-1]

## 2 long time ago I was a fortran guy
b <- numeric(prod(dim(a)))                      # one slot per cell
for (j in 1:dim(a)[2]){
  for (i in 1:dim(a)[1]){
    b[dim(a)[1]*(j-1)+i] <- as.numeric(a[i,j])  # column-major position
  }
}

## 3 better: sapply function
as.numeric(sapply(a,function(x)as.numeric(x)))

## 4 shorter
as.numeric(as.matrix(a))

## which type of data a has
a <- data.frame(a,fact=sample(c('F1','F2'),dim(a)[1],replace=TRUE))
class_a <- sapply(a,function(x)class(x))
class_a
a_numeric <- a[,class_a=='numeric']
as.numeric(as.matrix(a_numeric))
Regards,
PF
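As a possible variation on the last step of the message above, the numeric columns can be picked with is.numeric() instead of comparing class names, which would also keep integer columns (their class is "integer", not "numeric"). This is only a sketch on made-up data; is_num and v are hypothetical names.

```r
set.seed(1)  # only to make the toy data reproducible

# Hypothetical data: four numeric columns plus one factor column.
a <- data.frame(matrix(rnorm(20), 5))
a$fact <- factor(sample(c("F1", "F2"), nrow(a), replace = TRUE))

# vapply() returns a logical vector with one entry per column.
is_num <- vapply(a, is.numeric, logical(1))

# Flatten the numeric columns, column by column, into one vector.
v <- unlist(a[is_num], use.names = FALSE)
length(v)
```

The result has the same column-by-column order as as.numeric(as.matrix(a[is_num])).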
On Mon, Jun 13, 2011 at 6:47 PM, Aparna <aparna.sampath26 at gmail.com> wrote:
Hi Joshua While looking at the data, all the values seem to be in numeric. As i mentioned, the dataset is already in data.frame. As suggested, I used str(mydata) and got the following result: str(leu_cluster1) 'data.frame': 984 obs. of 100 variables: $ V2 : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...
Your data columns are not numeric but factors indeed. You may try this one:

a <- as.character(rnorm(100))  # some numeric data
## stringsAsFactors=TRUE is needed from R 4.0.0 on, where the default became FALSE
adf <- data.frame(matrix(a,10), stringsAsFactors=TRUE)  # which are misinterpreted as factors
adf
adf[,1]
class(adf[,1])                   # check for the class of the first column
sapply(adf,function(x)class(x))  # check classes for all columns
b <- sapply(adf,function(x)as.numeric(as.character(x)))
# as.character: use levels literally, as.numeric: transforms in numbers
b         # look at b
class(b)  # which is now a numeric matrix

best regards
PF
On Mon, Jun 13, 2011 at 2:11 PM, Patrizio Frederic
<frederic.patrizio at gmail.com> wrote:
On Mon, Jun 13, 2011 at 6:47 PM, Aparna <aparna.sampath26 at gmail.com> wrote:
Hi Joshua While looking at the data, all the values seem to be in numeric. As i mentioned, the dataset is already in data.frame. As suggested, I used str(mydata) and got the following result:

str(leu_cluster1)
'data.frame': 984 obs. of 100 variables:
 $ V2 : Factor w/ 986 levels "-0.00257361",..: 543 116 252 54 520 ...

your data columns are not numeric but factors indeed. you may try this one

a <- as.character(rnorm(100))    # some numeric data
adf <- data.frame(matrix(a,10))  # which are misinterpreted as factors
adf
adf[,1]
class(adf[,1])                   # check for the class of the first column
sapply(adf,function(x)class(x))  # check classes for all columns
b <- sapply(adf,function(x)as.numeric(as.character(x)))
But coercing to a character class first is not the recommended method. Also, I am leery about using sapply() with data frames, because it converts them to matrices, which can cause havoc if you have different classes of data.

You mentioned that as a first step, you had removed the names column from the data frame before trying to convert it to numeric. I would simply leave the names in, and then (supposing they are in column 101)

leu_cluster1[, 1:100] <- lapply(leu_cluster1[, 1:100], function(x) as.numeric(levels(x))[x])

apply the conversion to numeric on only the necessary columns. This simplifies life because you are not making interim data sets. Using lapply() allows you to work with (potentially) different classes of data (although I realize in this particular case you are only dealing with one class). So long as you are assigning the results back into a data frame (as above), the data frame assignment method will automatically convert the list back to data frame columns. If you are concerned about this, just wrap the call in as.data.frame()

leu_cluster1[, 1:100] <- as.data.frame(lapply(
  leu_cluster1[, 1:100], function(x) as.numeric(levels(x))[x]))

Cheers,

Josh
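The in-place conversion suggested above can be tried on a small stand-in for the real data. The data frame below is hypothetical (two factor columns with number-like levels plus a names column), and num_cols is an illustrative helper name, not part of the original advice.

```r
# Hypothetical stand-in for leu_cluster1: factor columns plus a names column.
df <- data.frame(V2   = factor(c("0.5", "-1.2", "3.0")),
                 V3   = factor(c("2", "10", "-7")),
                 name = c("g1", "g2", "g3"),
                 stringsAsFactors = FALSE)

# Convert only the columns that hold number-like factors.
num_cols <- c("V2", "V3")
df[num_cols] <- lapply(df[num_cols],
                       function(x) as.numeric(levels(x))[x])

# The object is still a data frame and the names column is untouched.
str(df)
```

Because the result is assigned back into the selected columns, no interim copy of the data set is needed.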