[newbie] aggregating table() results and simplifying code with loop

Comments in-line below.

Sorry to be so long getting back to you but sleep intervened.
Ah, so that's what WS stands for!  Do you have an abstract or short summary of what you are doing in English or French?  My Italian is limited to about 3 words or what I can guess from French/Latin.

I think that you are misunderstanding what is happening in m51. It is just summarizing the total counts across those two conditions.  Either I made a logic error or I did not understand what was needed. In any case, the dotplot is not plotting the actual occurrences but the number of times each crop_pattern was found.  The count is the actual number of occurrences per WS per crop_pattern.
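
To make the two kinds of count concrete, here is a tiny sketch in base R. The toy data frame and its values are invented for illustration and are not from your data; only the column names WS and crop_pattern follow the script further down.

```r
# Toy data, invented for illustration only (not the real sample file)
toy <- data.frame(
  WS = c(1, 1, 1, 2, 2, 3),
  crop_pattern = c("TRUE+FALSE", "TRUE+FALSE", "FALSE+TRUE",
                   "TRUE+FALSE", "FALSE+TRUE", "FALSE+TRUE"),
  stringsAsFactors = FALSE
)

# Number of rows showing each pattern, ignoring WS entirely
total_by_pattern <- table(toy$crop_pattern)

# Number of rows per WS per pattern (rows = WS, columns = pattern)
by_ws_and_pattern <- table(toy$WS, toy$crop_pattern)

print(total_by_pattern)
print(by_ws_and_pattern)
```

The first table collapses over WS, so it cannot tell you where a pattern occurred; the second keeps one cell per WS and pattern.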


Have a look at my new effort.  It is still ugly, but I think it more accurately supplies some of what you want.

Three caveats:
1. This is still very rough and a better programmer may have a better approach.
2. I used the package ggplot2 to do the graphs. One more thing to learn, but I am not used to lattice and I am used to ggplot2. Rather than spend a lot of time with lattice or
3. The last faceted plots are not intended to be of real use, just my quick look to see if anything looked like I expected. A bit of thinking probably would give something much better.

Yes, I suspect that there are lots of ways to 'clean' the process and automate it so that it would apply to all three data sets quite easily.  Some I think I can help with, and some you may need other people's help.

For example, once we have a working model to handle one or two conditions, it should be relatively easy to use an apply() or a loop to handle all of them, and so on.
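
A rough sketch of what that might look like, in base R only so it does not depend on plyr. The helper count_patterns(), the toy data frame and the second crop code "1Wh" are all made up for illustration; only "2Ma", WS and the use of columns 3:7 come from the script below. Treat it as one possible shape, not a finished solution.

```r
# Hypothetical helper (not part of the script below): flag which of the
# year columns match one crop code, paste the TRUE/FALSE flags into a
# pattern string, and count rows per WS per pattern.
count_patterns <- function(dat, crop, cols = 3:7) {
  hits <- dat[, cols] == crop                        # logical matrix
  crop_pattern <- apply(hits, 1, paste, collapse = "+")
  counts <- as.data.frame(table(WS = dat$WS, crop_pattern = crop_pattern),
                          responseName = "npattern",
                          stringsAsFactors = FALSE)
  counts$crop <- crop
  counts[counts$npattern > 0, ]                      # drop empty combinations
}

# Toy data, invented for illustration; "1Wh" is an assumed crop code
toy <- data.frame(
  id = 1:6,
  WS = c(1, 1, 2, 2, 3, 3),
  y1 = c("2Ma", "2Ma", "1Wh", "2Ma", "1Wh", "1Wh"),
  y2 = c("2Ma", "1Wh", "1Wh", "2Ma", "2Ma", "1Wh"),
  y3 = c("2Ma", "2Ma", "1Wh", "2Ma", "1Wh", "1Wh"),
  y4 = c("2Ma", "1Wh", "1Wh", "2Ma", "2Ma", "1Wh"),
  y5 = c("2Ma", "2Ma", "1Wh", "2Ma", "1Wh", "1Wh"),
  stringsAsFactors = FALSE
)

# Run the same steps for every crop code and stack the results
crops <- c("2Ma", "1Wh")
results <- lapply(crops, count_patterns, dat = toy)
all_counts <- do.call(rbind, results)
print(all_counts)
```

The same lapply() call could be wrapped in another loop over the three data sets once the single-crop version is doing what you need.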

Well, I'm off to work now so I probably won't be able to get back to much before late evening my time (probably after you are asleep). I think I'm 6 hours ahead of you.

##===================Revised approach====================
# load the various packages (plyr, latticeExtra, ggplot2, reshape2)
library(plyr)
library(latticeExtra)
library(ggplot2)
library(reshape2)

# sample data
T80<- read.csv("/home/john/rdata/sample.csv",  header = TRUE, sep = ";")
# Davide's actual read statement
# T80<-read.table(file="C:/sample.txt", header=T, sep=";")

# Looking for Maize
pattern  <-  c("2Ma", "2Ma","2Ma", "2Ma","2Ma")

# one row examples to see what is happening
T80[1,3:7]
T80[1, 3:7] == pattern

T80[405, 3:7]
T80[405, 3:7] == pattern

T80[55, 3:7] == pattern

# now we apply the patterns to the entire data set.
pp1  <-  T80[, 3:7] == pattern

# paste the TRUEs and FALSEs together to form a single variable
concatdat  <-  paste(pp1[, 1], pp1[, 2], pp1[, 3], pp1[, 4], pp1[, 5], sep = "+")

# Assemble new data frame.
maizedata  <-  data.frame(T80$WS, concatdat)
names(maizedata)  <-  c("WS", "crop_pattern")
str(maizedata)

maizedata$crop_pattern  <-  as.character(maizedata$crop_pattern)

pattern_count  <-  ddply(maizedata, .(crop_pattern), summarize, npattern = length(crop_pattern))
str(pattern_count)
head(pattern_count);  dim(pattern_count)  # quick look at data.frame and its size.
# FALSE+FALSE+FALSE+FALSE+FALSE accounts for 21,493 values.

# this does the same as looking at the data;
# not needed here but useful for larger datasets.
which(pattern_count$npattern == max(pattern_count$npattern))

# If we graph pattern_count as it stands we lose any useful detail because of that outlier. 
p  <-  ggplot(pattern_count  , aes(crop_pattern, npattern  )) + geom_point() +
           coord_flip()
p

(pattern1  <-  pattern_count[-1,])  # Drop the offending FALSE+FALSE+FALSE+FALSE+FALSE 
dim(pattern1)  # Okay now we have the maize patterns, without the WS who had no maize at all.

p  <-  ggplot(pattern1  , aes(crop_pattern, npattern   )) + geom_point() +
           coord_flip()
p 


newmaize  <-  subset(maizedata, maizedata$crop_pattern != "FALSE+FALSE+FALSE+FALSE+FALSE")
dim(newmaize) ;  head(newmaize)
str(newmaize)

summaize_by_WS   <-  ddply(newmaize, .(WS), summarize, crop_pattern_ws = length(crop_pattern))

# number of maize rows per WS (the axis mapping here is my assumption)
p  <-  ggplot(summaize_by_WS, aes(WS, crop_pattern_ws)) + geom_point() +
             coord_flip()
p

summaize_by_WS_and_crop   <-  ddply(newmaize, .(WS, crop_pattern), summarize, crop_pattern_ws = length(crop_pattern))

# crappy graph but just try to see what we might get.  probably need to subset or use better grid layout.
p  <-  ggplot( summaize_by_WS_and_crop  , aes(crop_pattern, crop_pattern_ws   )) + geom_point() + 
             coord_flip() + facet_grid(WS ~ . )
p 

# save the last graph to look at it in a graphics package== still terrible
ggsave( "/home/john/Rjunk/crop.png")

##
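
One note on the script above: pattern_count[-1,] drops the all-FALSE pattern only because that pattern happens to sit in row 1. A possibly safer sketch is to drop it by value instead. The small pattern_count here is a toy stand-in with invented counts, and no_maize is a name I made up.

```r
# Toy stand-in for pattern_count; the counts are invented
pattern_count <- data.frame(
  crop_pattern = c("FALSE+FALSE+FALSE+FALSE+FALSE",
                   "FALSE+TRUE+FALSE+TRUE+FALSE",
                   "TRUE+TRUE+TRUE+TRUE+TRUE"),
  npattern = c(120, 4, 9),
  stringsAsFactors = FALSE
)

# Build the "no maize in any of the five years" label instead of typing it
no_maize <- paste(rep("FALSE", 5), collapse = "+")

# Keep every pattern except the all-FALSE one, wherever it sits
pattern1 <- pattern_count[pattern_count$crop_pattern != no_maize, ]
print(pattern1)
```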