More than one loop??

19 messages · che, David Winsemius, jim holtman

che
#
Hello everyone,

How do I use more than one loop in R? I have the following problem, which has
to be solved with a three-loop structure. Can you help me, please?

The data is attached to this message.

The data is composed of two parts, cleaved (denoted by "cleaved") and
non-cleaved (denoted by "noncleaved").
- to access the ith peptide, you can use X$Peptide[i]
- to access the ith label, you can use X$Label[i]

define a set of amino acids using a string or another format if you want
amino.acid<-"ACDEFGHIKLMNPQRSTVWY"
define two matrices with initialised entries, one for cleaved peptides and
one for non-cleaved peptides
- matrix(0,AA,mer), where AA is the number of amino acids, and mer is the
number of residues detected from the data using the nchar function
- both matrices have the same size, the number of rows being equal to the
number of amino acids and the number of columns being equal to the number of
residues in peptides


use one three-loop structure to detect the frequency of amino acids in
cleaved peptides and one three-loop structure to detect the frequency of
amino acids in non-cleaved peptides. They should not be mixed in one
three-loop structure. The best way to handle this is to use a function. An
example of the three-loop structure is given below
for(i in 1:num) # scanning data for all peptides, where num means the number of peptides
{
  for(j in 1:mer) # scanning all residues in a peptide
  {
    for(k in 1:AA) # scanning 20 amino acids
    {
      # actions
    }
  }
}
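
A minimal sketch of how the "#actions" placeholder might be filled in. The toy peptides, the function name count_residues and the use of dimnames are assumptions made for illustration; they do not come from the assignment or from hiv.dat. The innermost loop compares the j-th letter of the i-th peptide with the k-th amino acid and increments the matching cell.

```r
# Toy data standing in for hiv.dat (hypothetical peptides, not the real file)
X <- data.frame(
  Peptide = c("ACDA", "ACCA", "DCDA"),
  Label   = c("cleaved", "cleaved", "noncleaved"),
  stringsAsFactors = FALSE
)
amino.acid <- "ACDEFGHIKLMNPQRSTVWY"
AA  <- nchar(amino.acid)      # number of amino acids (20)
mer <- nchar(X$Peptide[1])    # residues per peptide (4 in this toy data)

# count_residues is a hypothetical helper name
count_residues <- function(X, label) {
  freq <- matrix(0, AA, mer,
                 dimnames = list(strsplit(amino.acid, "")[[1]], NULL))
  for (i in seq_len(nrow(X))) {            # all peptides
    if (X$Label[i] != label) next          # skip peptides of the other class
    for (j in seq_len(mer)) {              # all residues in the peptide
      for (k in seq_len(AA)) {             # all 20 amino acids
        if (substr(X$Peptide[i], j, j) == substr(amino.acid, k, k)) {
          freq[k, j] <- freq[k, j] + 1
        }
      }
    }
  }
  freq
}

freq.cleaved    <- count_residues(X, "cleaved")
freq.noncleaved <- count_residues(X, "noncleaved")
```

Calling the same function once per label keeps the two classes in separate three-loop runs, as the instructions ask. With real data read by read.table, the Peptide column may be a factor, in which case wrapping it in as.character() before substr is the safer choice.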
http://n4.nabble.com/file/n1015851/hiv.dat hiv.dat
#
On Jan 16, 2010, at 7:09 PM, che wrote:

            
for (i in 1:3) {
    for (j in 1:2) {
        for (k in letters[1:4]) { print(paste(i, j, k)) }
    }
}

I'm afraid I develop a strong suspicion that a problem is homework  
when I am told it should be solved with a particular control structure  
and I then see no code.
che
#
Thank you very much for your help.

Your suspicion is understandable, but don't worry, I am not cheating. It is
not homework; rather, it is a pre-project task that I have to complete in
order to prepare for my project. I can't work out this programming on my own;
I tried my best, but I still can't deal with it properly. I am studying for a
master's and PhD in bioinformatics, and I need to develop a good
understanding of programming languages. I am still a beginner, and I am
starting to have some fears ... Whatever you send me, I study it until I know
exactly how it works, and believe me, that helps a lot to develop my skills.
Moreover, I am dealing with it honestly, in a way that does not break any
academic regulations.

Thanks again, I will try what you sent me.

Yours
7 days later
che
#
Thank you very much. With your help I was able to write proper code to count
the amino acids in all the cleaved and non-cleaved peptides and then display
the results in a matrix with 8 columns. I used only two loops instead of
three. The code works, but I still get a warning telling me that:
 "In if (x$Label[i] == nc) { ... :
  the condition has length > 1 and only the first element will be used"
Can you please help me with this warning? What is the reason for it? I can't
understand exactly what it means.
Here is the code that I am using, and the data file is attached:
x<-read.table("C:/Uni/502/CA2/hiv.dat", header=TRUE)
num<-nrow(x)
AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
nc<-x$Label[61:308]
c<-x$Label[nc]
noncleaved<-function(x)
{
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
for(i in 1:num){
if (x$Label[i]==nc)
{
for(j in 1:8){
res<-which(AA==substr(x$Peptide[i],j,j))
y[res, j]=y[res, j]+1
}
}
}
return (y/274*100)
}
cleaved<-function(x)
{
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
for(i in 1:num){
if (x$Label[i]==c)
{
for(j in 1:8){
res<-which(AA==substr(x$Peptide[i],j,j))
y[res, j]=y[res, j]+1
}
}
}
return (y/113*100)
}
   
http://n4.nabble.com/file/n1288922/hiv.dat hiv.dat
#
Notice that 'nc' is multivalued (nc<-x$Label[61:308]).  If you want to
check if x$Label[i] is one of the values in 'nc', then use %in%:

if (x$Label[i] %in% nc) ....

The warning happens because 'if' can only take a single logical value as
its condition, and your original 'if' statement would have resulted in
multiple logical values.  The warning message says that 'if' can only
use 1, so it picks the first and ignores the rest.  It is probably not
giving you the results you expect.
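
A small illustration of the difference, using made-up values rather than the real labels (the vector contents and the variable names eq, mem and found are assumptions for the example). Note that in recent versions of R (4.2.0 and later, as far as I know) a condition of length greater than one in 'if' is an error rather than a warning.

```r
label <- "cleaved"
nc    <- c("noncleaved", "noncleaved", "cleaved")   # hypothetical multivalued vector

eq  <- label == nc       # element-wise comparison: one logical per element of nc
mem <- label %in% nc     # membership test: a single TRUE/FALSE

length(eq)    # one value per element of nc
length(mem)   # a single value, safe to use in if()

if (label %in% nc) {
  found <- TRUE
} else {
  found <- FALSE
}
# any(label == nc) is another way to collapse the comparison to one value
```

Either %in% or any() gives 'if' the single logical value it expects.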
On Sun, Jan 24, 2010 at 6:14 PM, che <fadialnaji at live.com> wrote:

  
    
che
#
Can I ask for your help again? Please excuse my questions.
It is working perfectly now. I still have the last part, which I have tried a
lot with, but I still can't translate it properly into R. I need to draw
rectangles based on the frequency of each residue. I found the pattern, but I
am not able to translate it into an automatic R function.
First, I drew an empty plot where the rectangles should be placed. Then, with
the rect function, I drew the rectangles in the desired pattern but in a
manual way. I used this command for this purpose:
rect(x*10,y,x+10,y+round(cleaved(x)[k,j]),col=colmap[k])
Now I want to translate this pattern into an R function that goes through the
whole data set. In particular, the y value, which I struggled with, should be
cumulative.
Here is what I wrote. At the end I put the code which I used, but it does not
work properly to reproduce the pattern that I made manually:
hi<-function(x)
{
height<-rep(0,8)
for (j in 1:8){
height[j]<-sum(round(cleaved(x)[,j]))
max.height<-max(height)
}
plot(c(0,10*8),c(0,max.height+20),col="white")
}
recta<-function(x)
{
colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
"#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
"#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
"#FF9900", "#FF33FF", "#FF33CC")

rect(1*10,20,10+10,20+round(cleaved(x)[1,1]),col=colmap[1])
rect(1*10,40,10+10,40+round(cleaved(x)[2,1]),col=colmap[2])
rect(1*10,53,10+10,53+round(cleaved(x)[3,1]),col=colmap[3])
rect(1*10,63,10+10,63+round(cleaved(x)[4,1]),col=colmap[4])
rect(1*10,69,10+10,69+round(cleaved(x)[5,1]),col=colmap[5])
rect(1*10,73,10+10,73+round(cleaved(x)[6,1]),col=colmap[6])
rect(1*10,85,10+10,85+round(cleaved(x)[7,1]),col=colmap[7])
rect(1*10,89,10+10,89+round(cleaved(x)[8,1]),col=colmap[8])
rect(1*10,96,10+10,96+round(cleaved(x)[9,1]),col=colmap[9])
rect(1*10,110,10+10,110+round(cleaved(x)[10,1]),col=colmap[10])
rect(1*10,118,10+10,118+round(cleaved(x)[11,1]),col=colmap[11])
rect(1*10,123,10+10,123+round(cleaved(x)[12,1]),col=colmap[12])
rect(1*10,144,10+10,144+round(cleaved(x)[13,1]),col=colmap[13])
rect(1*10,149,10+10,149+round(cleaved(x)[14,1]),col=colmap[14])
rect(1*10,158,10+10,158+round(cleaved(x)[15,1]),col=colmap[15])
rect(1*10,170,10+10,170+round(cleaved(x)[16,1]),col=colmap[16])
rect(1*10,198,10+10,198+round(cleaved(x)[17,1]),col=colmap[17])
rect(1*10,213,10+10,213+round(cleaved(x)[18,1]),col=colmap[18])
rect(1*10,225,10+10,225+round(cleaved(x)[19,1]),col=colmap[19])
rect(1*10,229,10+10,225+round(cleaved(x)[20,1]),col=colmap[20])

}
recta<-function(x)
{
colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
"#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
"#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
"#FF9900", "#FF33FF", "#FF33CC")
for (j in 1:8){
	xx<-j*10
for(k in 1:20){
	yy<-cumsum(round(cleaved(x)[k,j]))
rect(xx,yy,xx+10,yy+round(cleaved(x)[k,j]),col=colmap[k])
}
}
}
che
#
Here is the whole code, and the data is attached:
noncleaved<-function(x)
{
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
for(i in 1:num){
if (x$Label[i] %in% nc)
{
for(j in 1:8){
res<-which(AA==substr(x$Peptide[i],j,j))
y[res, j]=y[res, j]+1
}
}
}
return (y/274*100)
}
cleaved<-function(x)
{
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
for(i in 1:num){
if (x$Label[i] %in% nc)
{
for(j in 1:8){
res<-which(AA==substr(x$Peptide[i],j,j))
y[res, j]=y[res, j]+1
}
}
}
return (y/113*100)
}
hi<-function(x)
{
height<-rep(0,8)
for (j in 1:8){
height[j]<-sum(round(cleaved(x)[,j]))
max.height<-max(height)
}
plot(c(0,10*8),c(0,max.height+20),col="white")
}
recta<-function(x)
{
colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
"#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
"#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
"#FF9900", "#FF33FF", "#FF33CC")
for (j in 1:8){
    xx<-j*10
for(k in 1:20){
        yy<-round(cleaved(x)[k,j])
        rect(xx,yy,xx+10,yy+round(cleaved(x)[k,j]),col=colmap[k])
 }
 }
 }
#
Does this do what you want?  You were not initializing the plot before
calling 'rect'.  Also, it appears that you only have to call cleaved(x)
once to get the matrix and then use it in the loop, which is more
efficient.

x<-read.table("C:/hiv.txt", header=TRUE)
num<-nrow(x)
AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
nc<-x$Label[61:308]
c<-x$Label[nc]
noncleaved<-function(x)
{
    y<-matrix(0,20,8)
    colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
    for(i in 1:num){
        if (x$Label[i] %in% nc)
        {
            for(j in 1:8){
                res<-which(AA==substr(x$Peptide[i],j,j))
                y[res, j]=y[res, j]+1
            }
        }
    }
    return (y/274*100)
}

cleaved<-function(x)
{
    y<-matrix(0,20,8)
    colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
    for(i in 1:num){
        if (x$Label[i] %in% nc)
        {
            for(j in 1:8){
                res<-which(AA==substr(x$Peptide[i],j,j))
                y[res, j]=y[res, j]+1
            }
        }
    }
    return (y/113*100)
}


hi<-function(x)
{
    height<-rep(0,8)
    for (j in 1:8){
        height[j]<-sum(round(cleaved(x)[,j]))
        max.height<-max(height)
    }
    plot(c(0,10*8),c(0,max.height+20),col="white")
}
recta<-function(x)
{
    plot(0, type='n', xlim=c(0,100), ylim=c(0,60))
    colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
    "#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
    "#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
    "#FF9900", "#FF33FF", "#FF33CC")
    x.c <- cleaved(x)
    for (j in 1:8){

        xx<-j*10
        for(k in 1:20){

                yy<-round(x.c[k,j])

                rect(xx,yy,xx+10,yy+round(x.c[k,j]),col=colmap[k])
         }
     }
 }

recta(x)
On Mon, Jan 25, 2010 at 9:51 PM, che <fadialnaji at live.com> wrote:

  
    
che
#
70% yes. The problem is that I am trying to produce a graph similar to the
one attached to this message, which represents the frequency of each letter
(amino acid) in the cleaved function and the noncleaved function. Something
else I added to the attachments is the pattern, which seems to work
correctly. I am now trying to write R code that loops and reproduces this
pattern in order to draw all the rectangles for the eight columns. But I
don't know exactly how to deal with the variable that I highlighted in yellow
in the image; it is cumulative in a challenging way.
http://n4.nabble.com/file/n1290048/cleaved.jpg cleaved.jpg 
http://n4.nabble.com/file/n1290048/pattern.jpg pattern.jpg
#
It sounds like you want to use 'barplot' like below given that it
appears that the value in x.c would be the matrix you want to graph:

x.c <- cleaved(x)
barplot(x.c, col=c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
   "#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
   "#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
   "#FF9900", "#FF33FF", "#FF33CC"))

This seems to produce something like what you want.
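
To show what 'barplot' does with a matrix, here is a small sketch on a made-up 3 x 2 matrix (the values, the colours and the names tops, centres and mids are assumptions for the example). By default the rows of each column are stacked on top of each other, which is the cumulative layout; the midpoints returned by 'barplot' can then be used to place the letters.

```r
# Hypothetical 3 x 2 matrix of percentages; each column sums to 100
m <- matrix(c(50, 30, 20,
              10, 60, 30),
            nrow = 3,
            dimnames = list(c("A", "C", "D"), c("N4", "N3")))

pdf(NULL)   # null device so the sketch runs non-interactively; drop it to see the plot
mids <- barplot(m, col = c("#FFFFCC", "#FFCC99", "#FF9999"))  # beside = FALSE: stacked

# Label each segment at its vertical midpoint, using cumulative sums per column
tops    <- apply(m, 2, cumsum)      # top edge of each segment
centres <- tops - m / 2             # midpoint of each segment
text(rep(mids, each = nrow(m)), as.vector(centres), rep(rownames(m), ncol(m)))
invisible(dev.off())
```

The cumulative sums do the bookkeeping that the manual 'rect' calls were doing by hand.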
On Mon, Jan 25, 2010 at 10:42 PM, che <fadialnaji at live.com> wrote:

  
    
1 day later
che
#
Yes, but the resulting graphs are almost the same, which means it is not
calculated in a cumulative way. If you run the following code, then hi(x),
and then recta(x), you will see how the shapes match the frequency of the
amino acids in the matrix. I am looking for code that can do this
automatically, starting from the first column and ending with the last
column. Data attached. Many thanks.
x<-read.table("C:/Uni/502/CA2/hiv.dat", header=TRUE)
num<-nrow(x)
AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
nc<-x$Label[61:308]
c<-x$Label[nc]
noncleaved<-function(x)
{
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
for(i in 1:num){
if (x$Label[i] %in% nc)
{
for(j in 1:8){
res<-which(AA==substr(x$Peptide[i],j,j))
y[res, j]=y[res, j]+1
}
}
}
return (y/274*100)
}

cleaved<-function(x)
{
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
for(i in 1:num){
if (x$Label[i] %in% nc)
{
for(j in 1:8){
res<-which(AA==substr(x$Peptide[i],j,j))
y[res, j]=y[res, j]+1
}
}
}
return (y/113*100)
}

hi<-function(x)
{
height<-rep(0,8)
for (j in 1:8){
height[j]<-sum(round(cleaved(x)[,j]))
max.height<-max(height)
}
plot(c(0,10*8),c(0,max.height+20),col="white")
}
suma<-function(i,j,A)
{
  if( j<= 0)
    {
    sum<-0
    }
  else
    {
     sum<-0
     for(k in 1:j)
       {
        sum<- sum + round(A[k,i])
         
        }
     }
return(sum)
}

grafica<- function(A)
{
for(i in 1:8)
  {
  for(j in 1:20)
    {
    rect((i-1)*10,suma(i,j-1,A),((i-1)*10)+10,suma(i,j,A), col=colmap[j])
    if ( A[j,i] != 0)
      {
       text( ((i-1)*10)+5, (suma(i,j-1,A) + round(A[j,i])/2), amino.acid[j],
cex=( (2*round(A[j,i])/round(max(A))
)))
       }
     }
  }  
}
recta<-function(x) 
{ 
colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33", 
"#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33", 
"#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933", 
"#FF9900", "#FF33FF", "#FF33CC") 

rect(1*10,20,10+10,20+round(cleaved(x)[1,1]),col=colmap[1]) 
rect(1*10,40,10+10,40+round(cleaved(x)[2,1]),col=colmap[2]) 
rect(1*10,53,10+10,53+round(cleaved(x)[3,1]),col=colmap[3]) 
rect(1*10,63,10+10,63+round(cleaved(x)[4,1]),col=colmap[4]) 
rect(1*10,69,10+10,69+round(cleaved(x)[5,1]),col=colmap[5]) 
rect(1*10,73,10+10,73+round(cleaved(x)[6,1]),col=colmap[6]) 
rect(1*10,85,10+10,85+round(cleaved(x)[7,1]),col=colmap[7]) 
rect(1*10,89,10+10,89+round(cleaved(x)[8,1]),col=colmap[8]) 
rect(1*10,96,10+10,96+round(cleaved(x)[9,1]),col=colmap[9]) 
rect(1*10,110,10+10,110+round(cleaved(x)[10,1]),col=colmap[10]) 
rect(1*10,118,10+10,118+round(cleaved(x)[11,1]),col=colmap[11]) 
rect(1*10,123,10+10,123+round(cleaved(x)[12,1]),col=colmap[12]) 
rect(1*10,144,10+10,144+round(cleaved(x)[13,1]),col=colmap[13]) 
rect(1*10,149,10+10,149+round(cleaved(x)[14,1]),col=colmap[14]) 
rect(1*10,158,10+10,158+round(cleaved(x)[15,1]),col=colmap[15]) 
rect(1*10,170,10+10,170+round(cleaved(x)[16,1]),col=colmap[16]) 
rect(1*10,198,10+10,198+round(cleaved(x)[17,1]),col=colmap[17]) 
rect(1*10,213,10+10,213+round(cleaved(x)[18,1]),col=colmap[18]) 
rect(1*10,225,10+10,225+round(cleaved(x)[19,1]),col=colmap[19]) 
rect(1*10,229,10+10,225+round(cleaved(x)[20,1]),col=colmap[20]) 

}
http://n4.nabble.com/file/n1312372/hiv.dat hiv.dat 
http://n4.nabble.com/file/n1312372/hiv.txt hiv.txt
2 days later
#
It is not entirely clear what you are trying to do.  Can you explain
what the matrix that you are creating out of 'cleaved' represents?

"Tell me what you want to do; not how you want to do it".  It is hard
to follow code when you have not explained what it is doing.  There
appear to be all kinds of "magic" numbers in the code.

Why do you 'return(y/113*100)'?  What does that mean?  There are
probably easier ways to do it if we understood exactly what you wanted
to do.  It sounds like once you can define the transformation to
create the matrix, barplot will probably work for you.
On Wed, Jan 27, 2010 at 9:04 PM, che <fadialnaji at live.com> wrote:

  
    
che
#
Here are the written instructions as I managed to get them from my professor;
the graphs and data are attached:

The graph below shows an example of the expected outcome of this coursework.
You may produce a better one. The graph for analysing the motifs of a set of
peptides is designed this way

- the graph is composed of columns of coloured rectangles

- a column corresponds to a residue from "N4" to "C4". Note that eight
residues are denoted by "N4", "N3", "N2", "N1", "C1", "C2", "C3", "C4". "N4"
means the 4th flanking residue of a cleavage site on the N-terminal side and
"C3" means the 3rd flanking residue of a cleavage site on the C-terminal
side. The cleavage occurs between "N1" and "C1".

- there are 20 rectangles in each column corresponding to 20 amino acids. The
rectangle of an amino acid has a larger height if the corresponding amino
acid occurs with a larger frequency at the residue, for instance, the
rectangle of "S" in the first column for the cleaved peptides.

- the letter of an amino acid is printed within its rectangle. Its font size
depends on the frequency of the amino acid in a residue.

In your package, you need to have the following functions
1. set a colour map using the following or your own design
- colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
"#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
"#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
"#FF9900", "#FF33FF", "#FF33CC")

2. define a set of amino acids using string or other format if you want
- amino.acid<-"ACDEFGHIKLMNPQRSTVWY"

3. read in the given peptide data ("hiv.dat") using
read.table("../data/hiv.dat",header=TRUE)
- The data I sent to you should not be saved in the same directory where you
save your R code!
- The data is composed of two parts, cleaved (denoted by "cleaved") and
non-cleaved (denoted by "noncleaved"). The first five lines of the data are
shown below
Peptide Label
TQIMFETF cleaved
GQVNYEEF cleaved
KVFGRCEL noncleaved
VFGRCELA noncleaved
- to access the ith peptide, you can use X$Peptide[i]
- to access the ith label, you can use X$Label[i]

4. detect the number of cleaved peptides and the number of non-cleaved
peptides using
- nrow(X)

5. define two matrices with initialised entries, one for positive peptides
and one for negative peptides
- matrix(0,AA,mer), where AA is the number of amino acids, and mer is the
number of residues detected from data using the nchar function
- both matrices have the same size, the number of rows being equal to the
number of amino acids and the number of columns being equal to the number of
residues in peptides
- name the columns of these two matrices using
c("N4","N3","N2","N1","C1","C2","C3","C4")

6. use one three-loop structure to detect the frequency of amino acids in
cleaved peptides and one three-loop structure to detect the frequency of
amino acids in non-cleaved peptides. They should not be mixed in one
three-loop structure. The best way to handle this is to use a function. An
example of the three-loop structure is given below
for(i in 1:num) # scanning data for all peptides, where num means the number of peptides
{
  for(j in 1:mer) # scanning all residues in a peptide
  {
    for(k in 1:AA) # scanning 20 amino acids
    {
      # actions
    }
  }
}

7. make sure that each frequency matrix is converted to a percentage, i.e.
each entry in the matrix is divided by the number of cleaved or non-cleaved
peptides and multiplied by 100. This converted frequency is named the
normalised frequency.

8. detect the maximum height of the normalised frequency of each residue in
cleaved or non-cleaved peptides using
height<-rep(0,mer)
for(j in 1:mer)
  height[j]<-sum(round(X.frequency[,j]))
max.height<-max(height)
- Note that the height of each column in a graph (see the graph on 3)
corresponds to the summation of 20 frequencies of 20 amino acids for a
residue.

9. draw a blank plot using the maximum height
- plot(c(0,10*mer),c(0,max.height),col="white", ...)
- in this blank plot, you can add graphics as discussed below

10. determine the x coordinate, but it is recommended to use i*10 as the
x-coordinate where i indexes the residues. The x-coordinate represents
columns in the graph shown in 3. If there are 8 residues in peptides, there
are 8 columns.

11. determine the y coordinate, which is cumulative (see next item below).
The y-coordinate represents rows in the graph shown in 3. There are always 20
rows for 20 amino acids. Note that the rows cannot be aligned because the
frequency of an amino acid in a residue varies.

12. draw a rectangle based on the frequency of each residue and each amino
acid
- rect(x,y,x+10,y+round(X.frequency[k,j]),col=colmap[k]), where k indicates
an amino acid and j indicates a residue
- after drawing this rectangle, the y-coordinate "y" should be increased by
round(X.frequency[k,j])
- after one column is drawn for one residue, the x-coordinate "x" should be
increased by 10

13. plot a text at the corresponding position using
- text((x+5),(y+round(X.frequency[k,j])/2),substr(amino.acid,k,k))

14. place two drawings in one plot using the par function
http://n4.nabble.com/file/n1457645/cleaved.jpg cleaved.jpg 
http://n4.nabble.com/file/n1457645/noncleaved.jpg noncleaved.jpg 
http://n4.nabble.com/file/n1457645/hiv.dat hiv.dat
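
A hedged sketch of how steps 7 to 13 of the instructions above might fit together, on a made-up count matrix with 3 amino acids and 2 residues. The counts, n.peptides, the shortened alphabet, the helper matrix y.start and the slightly wider x-range are all assumptions for the example, not part of the assignment. The point is that y is reset to 0 at the start of each column and then grows by the height of each rectangle drawn.

```r
# Hypothetical counts for 3 amino acids at 2 residues (not the real data)
amino.acid <- "ACD"
X.count <- matrix(c(5, 3, 2,
                    1, 6, 3), nrow = 3)
n.peptides <- 10
colmap <- c("#FFFFCC", "#FFCC99", "#FF9999")

X.frequency <- X.count / n.peptides * 100          # step 7: normalised frequency
mer <- ncol(X.frequency)
height <- colSums(round(X.frequency))              # step 8: column heights
max.height <- max(height)

pdf(NULL)                                          # null device; drop this to see the plot
# step 9: blank plot; the x-range is widened by 10 so the last column fits
plot(c(0, 10 * (mer + 1)), c(0, max.height), type = "n",
     xlab = "residue", ylab = "frequency (%)")

y.start <- matrix(0, nrow(X.frequency), mer)       # kept only to inspect the coordinates
for (j in seq_len(mer)) {
  x <- j * 10                                      # step 10
  y <- 0                                           # step 11: restart at the bottom of each column
  for (k in seq_len(nrow(X.frequency))) {
    h <- round(X.frequency[k, j])
    y.start[k, j] <- y
    rect(x, y, x + 10, y + h, col = colmap[k])     # step 12
    if (h > 0) text(x + 5, y + h / 2, substr(amino.acid, k, k))  # step 13
    y <- y + h                                     # the cumulative part
  }
}
invisible(dev.off())
```

The same starting heights could also be obtained without a running variable, by taking cumsum of the rounded column and shifting it down by one position.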
#
One quick comment about looking at the graphs you provided: why aren't
all 8 columns the same height, given that each column should have the
same number of amino acids in it?  For the cleaved case it is 114,
and even after normalizing, the column sums should be the same -- 100.
Are the graphs really correct?
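
One possible contributor, assuming the plotted heights come from sum(round(...)) as in step 8 of the instructions: rounding each entry before summing can push a column total away from 100. Whether this accounts for the differences in the attached graphs cannot be confirmed from the thread. The counts below are made up for the illustration.

```r
# Hypothetical counts of 3 amino acids at 2 residues among 7 peptides
counts <- matrix(c(3, 2, 2,
                   5, 1, 1), nrow = 3)
pct <- counts / 7 * 100

exact   <- colSums(pct)          # both columns sum to 100 (up to floating point)
rounded <- colSums(round(pct))   # rounded entries need not sum to 100
```

Here the first column rounds to 43 + 29 + 29 and the second to 71 + 14 + 14, so the two rounded totals differ even though the exact totals agree.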
On Sat, Jan 30, 2010 at 3:38 PM, che <fadialnaji at live.com> wrote:

  
    
#
Also, this looks like homework, so I cannot really reply with a
solution.  BTW, once you have the normalized matrix, barplot will
create your output without the complications of steps 8-13.  You will
have to use the data to place the text, but that again is relatively
easy with the data.
On Sat, Jan 30, 2010 at 3:38 PM, che <fadialnaji at live.com> wrote:

  
    
che
#
Hello,
I appreciate your help. Your comments and suggestions are really helpful for
developing not only my R skills but also my sense for programming languages.
As I said, I am not breaking any academic rules; this is software that I have
to develop to deal with my project in two months, along with some algorithms.
Anyway, regarding your question, jholtman: if one particular amino acid
(let's say the letter A) is missing from the data, it won't appear in the
graph. Anyway, I think I found the key to this work. Here is what I wrote. It
is working, but I would love to have any comments or advice. The data is
attached. Many thanks.
x<-read.table("C:/Uni/502/CA2/hiv.dat", header=TRUE)										
attach(x)																					
AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')		
num<-nrow(x)																				
frequency<-function(Q)
{																				
y<-matrix(0,20,8)
colnames(y)<-c("N4","N3","N2","N1","C1","C2","C3","C4")

	for(i in 1:num){
		if (x$Label[i]== Q)
			{	
			for(j in 1:8){
			res<-which(AA==substr(x$Peptide[i],j,j))
			y[res, j]=y[res, j]+1

 						}
			}
}
return (y)
}
freqC<-frequency("cleaved")
freqNc<-frequency("noncleaved")

 
ClPeptide<-114
nClPeptide<-248


Norm<-function(F,N)

{
No<-matrix(0,20,8)
  colnames(No)<-c("N4","N3","N2","N1","C1","C2","C3","C4")
	for (j in 1:8){
		for (k in 1:20){
			No[k,j]=(F[k,j]/N)*100
		}
	}
	return(No)
}
normalisedC<-Norm(freqC,ClPeptide)
normalisedNc<-Norm(freqNc,nClPeptide)


hi<-function(H)
{	
	height<-rep(0,8)
		for (j in 1:8){
		height[j]<-sum(round(H)[,j])
		max.height<-max(height)
					}
return(max.height)
}

CleH<-hi(normalisedC)
nCleH<-hi(normalisedNc)

colmap<-c("#009900", "#00CC00", "#00FF00", "#009933", "#00CC33",
"#00FF33", "#009966", "#00CC66", "#00FF66", "#009999", "#00CC99",
"#00FF99", "#0099CC", "#00CCCC", "#00FFCC", "#0099FF", "#00CCFF",
"#00FFFF", "#66FFFF", "#CCFFFF")

CumulativeY<-function(k,b,F)
{
  if( b<=0)
    {
	    cum=0
	}
  else
    {
	    cum=0
	    for(d in 1:b)
    {
        cum=cum + (round(F[d,k]))
         
        }
     }
return(cum)
}
graph<- function(F)
{
for(i in 1:8)
  {
  for(j in 1:20)
    {
    rect((i-1)*10,CumulativeY(i,j-1,F),((i-1)*10)+10,CumulativeY(i,j,F),
col=colmap[j])
    if ( F[j,i] != 0)
      {
       text( ((i-1)*10)+5, (CumulativeY(i,j-1,F) + round(F[j,i])/2), AA[j],
cex=((2*round(F[j,i])/round(max(F)))),col="#990000")
       }
     }
  }  
}

par(mfrow=c(1,2))
plot(c(0,10*8),c(0,CleH),col="#303030")
graph(normalisedC)
plot(c(0,10*8),c(0,nCleH),col="#303030")
graph(normalisedNc)
http://n4.nabble.com/file/n1458002/hiv.dat hiv.dat
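
As one possible comment on the code above: the counting and normalising steps can also be written without explicit loops. This is only a sketch on toy data; the peptides, the shortened alphabet and the name frequency2 are assumptions for the example. It tabulates the letters at each position against the amino-acid alphabet and takes the class size from the data instead of hard-coding 114 and 248.

```r
# Toy data (hypothetical peptides, not hiv.dat)
x <- data.frame(
  Peptide = c("ACDA", "ACCA", "DCDA"),
  Label   = c("cleaved", "cleaved", "noncleaved"),
  stringsAsFactors = FALSE
)
AA <- c("A", "C", "D")          # shortened alphabet for the example

# frequency2 is a hypothetical name for a loop-free variant of frequency()
frequency2 <- function(x, Q, AA) {
  pep <- as.character(x$Peptide[x$Label == Q])      # peptides of one class
  mer <- nchar(pep[1])
  # one column per position: tabulate the residue letters against AA
  sapply(seq_len(mer), function(j) {
    table(factor(substr(pep, j, j), levels = AA))
  })
}

freqC <- frequency2(x, "cleaved", AA)
# matrix arithmetic is element-wise, so no loop is needed to normalise
normalisedC <- freqC / sum(x$Label == "cleaved") * 100
```

Using factor levels keeps a row for every amino acid, including those that never occur at a position, so the matrix keeps the same shape for both classes.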