
Scripting in R -- pattern matching, logic, system calls, the works!

Don,
Excellent advice.  I've gone back and done a bit of coding and wanted to see
what you think and possibly "shore up" some of the technical stuff I am
still having a bit of difficulty with.

I'll paste the code I have to date with any important annotations:

topdir="~"
library(gmodels)

setwd(topdir)

### Will probably want to do two for loops as opposed to recursive
files=list.files(path=topdir,pattern="Coverage")

for (i in files)
{
        dir=paste("~/hangers/",i,sep="")

        files2=list.files(path=dir,pattern="Length")

        ### Make an empty matrix that will have the independent variable as
        ### the filenum and the dependent variable as the mean of the length,
        ### or should I have two vectors for the regression?  Basically the
        ### Length_(\d+) is the independent variable (taken from the
        ### filename), which all the regressions will have, and inside each
        ### Length_(\d+) file is a 1d set of numbers whose mean becomes the
        ### dependent variable.  So in essence the points are:
        ###   f(length)=mean(length$V1)
        ###   f(45)=50
        ###   f(50)=60
        ###   etc ...


        for (j in files2)
        {
        ## I just rearranged the following line but I'm not sure what the
        ## command is doing.  I am assuming 'as.numeric' means take the input
        ## as a number instead of a string, and the gsub has me stumped.
       
        filenum=as.numeric(gsub('Length_','',j))        
        
        ## Can I assign variables at the top instead of hardcoding, like
        ## upper=50, lower=30?
        ## And I don't need to put brackets for this if statement, do I?
        ## Does it basically just say that if the filenum is outside those
        ## parameters, just go to the next j in files2?
        if (filenum > 200 | filenum < -10) next

        dir2=paste("~/hangers",i,j,sep="/")

        tmp=read.table(dir2)

        mean(tmp$V1)

        ## Now should I put these in a matrix or a vector (all j values:
        ## length vs mean(tmp$V1) for each i iteration)?
        }
}
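
A minimal sketch of the inner loop, assuming two plain vectors are used to collect the points and that the thresholds live in variables instead of being hardcoded. The temporary directory, the three small files, and the names lower, upper, lens, avgs, and pts are all hypothetical stand-ins, not the real data. gsub('Length_','',j) replaces every match of 'Length_' in j with an empty string, so only the digits remain, and as.numeric turns that string into a number.

```r
# Stand-in data (hypothetical): a temporary directory with three small files,
# each holding one column of numbers, named like the real Length_<n> files.
dir <- file.path(tempdir(), "Coverage_demo")
dir.create(dir, showWarnings = FALSE)
write(c(40, 50, 60), file.path(dir, "Length_45"), ncolumns = 1)
write(c(55, 60, 65), file.path(dir, "Length_50"), ncolumns = 1)
write(c(1, 2, 3), file.path(dir, "Length_300"), ncolumns = 1)

# Thresholds assigned once at the top (assumed values, same as the loop above)
lower <- -10
upper <- 200

files2 <- list.files(path = dir, pattern = "Length")

# One slot per file; slots for skipped files stay NA
lens <- rep(NA_real_, length(files2))
avgs <- rep(NA_real_, length(files2))

for (k in seq_along(files2)) {
    j <- files2[k]
    # Strip the 'Length_' prefix, then convert the remaining digits to a number
    filenum <- as.numeric(gsub("Length_", "", j))
    # A single-statement if needs no braces; next jumps to the next k
    if (filenum > upper | filenum < lower) next
    tmp <- read.table(file.path(dir, j))
    lens[k] <- filenum
    avgs[k] <- mean(tmp$V1)
}

# Drop the skipped files and keep the points together, ordered by length
keep <- !is.na(lens)
pts <- data.frame(len = lens[keep], avg = avgs[keep])
pts <- pts[order(pts$len), ]
print(pts)
```

Keeping the pairs in a data frame is one option among several; a two-column matrix would hold the same numbers, but a data frame can be passed straight to a model formula.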

Lastly, I'd like to get a printout of each of the regressions (one per
iteration of i).  Is that when I use the summary command?  And, like in
unix, can I redirect the output to a file?
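
A minimal sketch of one way this could look, assuming a simple linear fit with lm. The first two points are the f(45)=50 and f(50)=60 examples from the comments; the other two are made up purely so the fit has enough points, and the output file name is hypothetical. capture.output collects what would be printed as a character vector, and sink diverts printed output to a file until sink() is called again.

```r
# Illustrative points for one iteration of i; the last two rows are made up
pts <- data.frame(len = c(45, 50, 55, 60), avg = c(50, 60, 68, 81))

# Regress the mean on the length taken from the file name
fit <- lm(avg ~ len, data = pts)

# Hypothetical output file in the session's temporary directory
outfile <- file.path(tempdir(), "regression_summary.txt")

# Option 1: capture the printed summary and write it out
writeLines(capture.output(summary(fit)), outfile)

# Option 2: divert everything printed between the two sink() calls
sink(outfile, append = TRUE)
print(coef(fit))
sink()
```

Inside the loop over i, the file name could include i (for example via paste) so that each regression gets its own file, or append = TRUE could be used to stack them in one file.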

Best
Don MacQueen wrote: