Howdy, I have done many searches and can't seem to find a way around this. I am reading in a .csv file where each row is a dataset and each column represents a position. The values are sparse (there are 2003 positions but usually only 100-200 values), and the idea is to be able to plot each dataset (row) and overlay them in different combinations. What I would like is a plot where each data point is connected by a line, but since there are numerous NA values between the real data values, I have yet to find out how to do this. Essentially, I would like each data point to be connected to the next non-NA data point. I also thought about subsetting the relevant data points out, but am unsure how to do this while retaining the column numbers so that the points keep their positions along the x-axis. Any help would be greatly appreciated. - Fincher
plotting lines when data has missing/NA values
2 messages · Justin Fincher, (Ted Harding)
On 08-Jul-10 19:52:36, Justin Fincher wrote:
Howdy, I have done many searches and can't seem to find a way around this. I am reading in a .csv file where each row is a dataset and each column represents a position. The values are sparse (there are 2003 positions but usually only 100-200 values), and the idea is to be able to plot each dataset (row) and overlay them in different combinations. What I would like is a plot where each data point is connected by a line, but since there are numerous NA values between the real data values, I have yet to find out how to do this. Essentially, I would like each data point to be connected to the next non-NA data point. I also thought about subsetting the relevant data points out, but am unsure how to do this while retaining the column numbers so that the points keep their positions along the x-axis. Any help would be greatly appreciated. - Fincher
The following small artificial example may do what you seem to want
(if the general idea is right, it should be straightforward to
modify the details to taste):
set.seed(54321)
D <- matrix(rep(NA,60),nrow=4)
D[sample(60,20)] <-runif(20)
print(D,3)
#       [,1] [,2]  [,3]   [,4] [,5]  [,6]  [,7]  [,8]
# [1,] 0.825   NA    NA     NA   NA 0.905    NA    NA
# [2,]    NA   NA    NA 0.0763 0.54    NA    NA    NA
# [3,]    NA   NA 0.130 0.2061   NA 0.622    NA 0.330
# [4,] 0.879   NA    NA     NA   NA 0.937 0.204 0.953
#        [,9] [,10] [,11] [,12] [,13] [,14] [,15]
# [1,] 0.0378 0.170    NA    NA    NA    NA    NA
# [2,]     NA 0.484 0.894 0.230    NA    NA    NA
# [3,]     NA    NA    NA    NA    NA 0.131    NA
# [4,]     NA    NA 0.389 0.859    NA    NA    NA
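posns <- 1:15
cols <- c("red","green","blue","yellow")
M <- max(D, na.rm=TRUE)
## Row 1 sets up the axes; ix.ok holds the columns of the row that are not NA
ix.ok <- which(!is.na(D[1,]))
plot(posns[ix.ok], D[1,ix.ok], pch="+", col=cols[1],
     xlim=c(0,16), ylim=c(0,M), xlab="position", ylab="value")
lines(posns[ix.ok], D[1,ix.ok], col=cols[1])
## Remaining rows are overlaid, each in its own colour
## (ylim is not passed to points(): it is not a graphical parameter there)
for(i in 2:nrow(D)){
  ix.ok <- which(!is.na(D[i,]))
  points(posns[ix.ok], D[i,ix.ok], pch="+", col=cols[i])
  lines(posns[ix.ok], D[i,ix.ok], col=cols[i])
}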
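Since the question also asks about overlaying rows "in different combinations", the same idea can be wrapped in a small function. What follows is only a sketch under assumptions: the names non_na_points, overlay_rows and D2 are made up for illustration and appear nowhere in the original posting, and the little matrix D2 is invented test data. The key point is unchanged: which(!is.na(...)) returns the column numbers of the real values, so indexing posns with it keeps every point at its true position.

```r
# Hypothetical helper (not from the original post): for row i, return the
# positions and values of its non-NA entries. Column numbers are retained
# because which() returns the indices of the non-NA columns.
non_na_points <- function(D, i, posns = seq_len(ncol(D))) {
  ok <- which(!is.na(D[i, ]))
  list(x = posns[ok], y = D[i, ok])
}

# Hypothetical helper: overlay any chosen combination of rows on one plot.
# Assumes the chosen rows contain at least one non-NA value between them.
overlay_rows <- function(D, rows, posns = seq_len(ncol(D)),
                         cols = seq_along(rows)) {
  # Empty plot whose axes span all positions and the values of the chosen rows
  plot(range(posns), range(D[rows, ], na.rm = TRUE), type = "n",
       xlab = "position", ylab = "value")
  for (k in seq_along(rows)) {
    p <- non_na_points(D, rows[k], posns)
    # type = "o" draws the points and the connecting line in one call
    lines(p$x, p$y, type = "o", pch = "+", col = cols[k])
  }
  invisible(NULL)
}

# Made-up data: 3 datasets (rows), 6 positions (columns)
D2 <- rbind(c( 1, NA, NA,  4, NA,  2),
            c(NA,  3, NA, NA,  5, NA),
            c( 2, NA,  2, NA, NA,  6))
p1 <- non_na_points(D2, 1)      # positions 1, 4, 6 with values 1, 4, 2
overlay_rows(D2, rows = c(1, 3))
```

With the real data one would presumably call something like overlay_rows(D, rows = c(2, 5, 7)) for whichever datasets are to be compared, having first read the .csv into a numeric matrix.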
Hoping this helps!
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Jul-10 Time: 21:34:18
------------------------------ XFMail ------------------------------