I read records using scan:
dat <- data.frame(scan(file="KDA.csv",
    what=list(t="%m/%d/%y %H:%M", f=0, p=0, d=0, o=0, s=0, a=0, l=0, c=0),
    skip=2, sep=",", nmax=np, flush=TRUE,
    na.strings=c("I/OTimeout", "ArcOff-line")))
which results in:
dat[1:5,]
             t     f    p  d  o   s    a  l c
1 1/21/09 5:01 16151  8.2 76 30 282 1060 53 7
2 1/21/09 5:02 16256  8.3 76 23 282 1059 54 7
3 1/21/09 5:03 16150  8.4 76 26 282 1059 55 7
4 1/21/09 5:04 16150  9.0 76 25 282 1051 57 6
5 1/21/09 5:05 15543 10.4 76  7 282 1024 58 6
I have been unable to find a way to convert this into a time series. I did
read the manuals and came across a way to coerce a data frame to a ts
object: as.ts()
The trouble is that I do not know how to keep the timestamps in column t of
the data frame above. The t column does not contain strings. If I do:
plot.ts(dat)
I can see that the first graphics panel indeed shows numbers, not text. So I
think scan converted the text correctly per the format string I put in.
There is a more difficult problem still. The data files I have contain
invalid data, missing values and other irrelevant information. I filter
this out using subset, which works brilliantly. However, how can I filter
using subset and convert to a time series afterwards? After subsetting
there will be 'holes', i.e. missing records. Can a ts object deal with
missing records? If so, how? Just point me to a document. I can and will
put in the work to figure it out myself.
Thank you!
Alex van der Spek
Try read.zoo from the zoo package, varying its arguments to accommodate
the precise format of your data. See the three zoo vignettes, ?read.zoo
and R News 4/1 for dates and times.
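As a rough sketch only, and not necessarily the call behind the output shown in this message: assuming the timestamps look like 1/21/09 5:01 and that UTC is an acceptable time zone, something along these lines reads a small stand-in file into a zoo series. The sample lines (including the made-up I/OTimeout entry), the temporary file, the column names and the to_time helper are all hypothetical. Because it uses POSIXct for the index, the timestamps will print in a different style than the parenthesized ones.

```r
library(zoo)

# Hypothetical stand-in for KDA.csv: two lines to skip, then the records.
# The "I/OTimeout" entry is invented to exercise na.strings.
csv <- c(
  "logger export",
  "t,f,p,d,o,s,a,l,c",
  "1/21/09 5:01,16151,8.2,76,30,282,1060,53,7",
  "1/21/09 5:02,16256,8.3,76,23,282,1059,54,7",
  "1/21/09 5:03,16150,I/OTimeout,76,26,282,1059,55,7",
  "1/21/09 5:04,16150,9.0,76,25,282,1051,57,6",
  "1/21/09 5:05,15543,10.4,76,7,282,1024,58,6"
)
path <- tempfile(fileext = ".csv")
writeLines(csv, path)

# Parse the first column as date-times; the time zone is an assumption.
to_time <- function(x) as.POSIXct(x, format = "%m/%d/%y %H:%M", tz = "UTC")

# Arguments other than FUN are passed on to read.table().
z <- read.zoo(path, sep = ",", skip = 2, header = FALSE,
              col.names = c("t", "f", "p", "d", "o", "s", "a", "l", "c"),
              na.strings = c("I/OTimeout", "ArcOff-line"),
              FUN = to_time)
print(z)
```

Note that in scan() the what argument only fixes the type of each field, so a format string placed there is not used to parse the dates; the conversion has to be done explicitly, as with to_time here.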
                        f    p  d  o   s    a  l c
(01/21/09 05:01:00) 16151  8.2 76 30 282 1060 53 7
(01/21/09 05:02:00) 16256  8.3 76 23 282 1059 54 7
(01/21/09 05:03:00) 16150  8.4 76 26 282 1059 55 7
(01/21/09 05:04:00) 16150  9.0 76 25 282 1051 57 6
(01/21/09 05:05:00) 15543 10.4 76  7 282 1024 58 6
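On the 'holes': a ts object assumes equally spaced observations, so missing records would have to be present as explicit NAs, whereas a zoo series simply carries whatever timestamps it has. A minimal sketch, assuming one-minute data and an invented filter threshold (the values, the missing 05:03 record and all variable names are made up for illustration):

```r
library(zoo)

# Hypothetical minute data; the 05:03 record is absent, as if dropped earlier.
tt <- as.POSIXct(c("2009-01-21 05:01", "2009-01-21 05:02",
                   "2009-01-21 05:04", "2009-01-21 05:05"), tz = "UTC")
z <- zoo(cbind(f = c(16151, 16256, 16150, 15543),
               p = c(8.2, 8.3, 9.0, 10.4)),
         order.by = tt)

# Filtering rows keeps the timestamps of the records that survive.
keep <- coredata(z)[, "p"] < 10      # made-up validity rule
z_ok <- z[keep, ]

# If a regular grid is needed, merge with an empty series on that grid;
# minutes without a record become NA.
grid  <- seq(start(z), end(z), by = "1 min")
z_reg <- merge(z, zoo(, grid))

# One possible way to fill the gaps: linear interpolation.
z_fill <- na.approx(z_reg)
```

Whether to leave the gaps, keep them as NA, or interpolate depends on the analysis; the sketch only shows that each option is available once the data are in a zoo object.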
On Wed, Apr 8, 2009 at 4:43 AM, <amvds at xs4all.nl> wrote: