replacing a factor value in a data frame
On Fri, 28 Oct 2005, Federico Calboli wrote:
Hi All,
I have the following problem, that's driving me mad.
I have a dataframe of factors, from a genetic scan of SNPs. I DO have
NAs in the dataframe, which would look like:
   V4 V5 V6 V7   V8   V9 V10
1  TT GG TT AC   AG   AG  TT
2  AT CC TT AA   AA   AA  TT
3  AT CC TT AC   AA <NA>  TT
4  TT CC TT AA   AA   AA  TT
5  AT CG TT CC   AA   AA  TT
6  TT CC TT AA   AA   AA  TT
7  AT CC TT CC <NA> <NA>  TT
8  TT CC TT AC   AG   AG  TT
9  AT CC TT CC   AG <NA>  TT
10 TT CC TT CC   GG   GG  TT
In the dataframe I have one column where one factor level has erroneously
been given two alternative readings: CG and GC.
I want to change the instances of GC to CG and I use the code:
data[data[,30]=="GC", 30] = "CG"
but get the error:
Error in "[<-.data.frame"(`*tmp*`, all[, 30] == "GC", 30
missing values are not allowed in subscripted as
Any hints?
1) Use %in% not ==
2) (Better) As this is a factor, use levels<- to merge the levels. See ?levels.
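Both suggestions can be sketched roughly as follows. The data frame `snp` and its column `V5` are made-up stand-ins, not the poster's data. The original assignment fails because `==` returns NA wherever the column is NA, and a data frame assignment does not accept NA in a logical subscript. `%in%` returns FALSE for those rows instead.

```r
# Hypothetical stand-in for the poster's data: one factor column with an NA
# and the two spellings "CG" and "GC" of the same genotype.
snp <- data.frame(V5 = factor(c("GG", "CC", "GC", NA, "CG", "CC")))

# 1) %in% yields FALSE (never NA) for missing values, so the logical
#    subscript is safe in an assignment. "CG" is already a level here.
fixed1 <- snp
fixed1[fixed1[, "V5"] %in% "GC", "V5"] <- "CG"
fixed1$V5 <- droplevels(fixed1$V5)   # drop the now-unused "GC" level

# 2) Merge the levels: giving two levels the same name combines them,
#    and NA entries are left untouched.
fixed2 <- snp
levels(fixed2$V5)[levels(fixed2$V5) == "GC"] <- "CG"

print(levels(fixed2$V5))
print(table(fixed2$V5, useNA = "ifany"))
```

The second form never subscripts the data frame rows at all, which is presumably why it is the better choice: it renames the level once and leaves no empty "GC" level behind.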
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595