Help needed for accessing factor data

It does,

Thanks a lot,

Have a great weekend ahead,

Sincerely
Bharat

On Fri, Sep 16, 2011 at 4:24 PM, R. Michael Weylandt wrote:
The following (admittedly inelegant) code worked for me to get the data in:

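# Read each raw line, then split it on commas
D <- readLines("test.txt")
D <- sapply(D, strsplit, ",")
# One row per line, one column per field
D <- t(simplify2array(D))
rownames(D) <- NULL

D <- as.data.frame(D)

# Columns 3 to 9 hold numbers; column 10 holds "true"/"false"
for (i in 3:9) {
    D[, i] <- as.double(as.character(D[, i]))
}
D[, 10] <- (D$V10 == "true")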

Columns 1 and 2 are still factors and probably require some additional
processing.
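
A minimal sketch of that extra processing for the two remaining columns. It assumes V1 is a ticker symbol and V2 is a timestamp of the form "20110815 09:30:00", as in the sample rows in this thread. The "UTC" time zone is a placeholder assumption, since the thread does not say which zone the data uses.

```r
# Two sample rows copied from the thread; real data would come from readLines("test.txt")
lines <- c("spy,20110815 09:30:00,119.18,119.19,119.18,119.19,0,0,0.0,false",
           "spy,20110815 09:30:01,119.21,119.21,119.19,119.21,0,0,0.0,false")

# Build a small data frame with factor columns, as in the code above
D <- as.data.frame(do.call(rbind, strsplit(lines, ",", fixed = TRUE)),
                   stringsAsFactors = TRUE)

# Column 1: plain character symbol
D$V1 <- as.character(D$V1)

# Column 2: date-time; tz = "UTC" is an assumption, not something the thread states
D$V2 <- as.POSIXct(as.character(D$V2), format = "%Y%m%d %H:%M:%S", tz = "UTC")
```

Going through as.character() first matters because the conversion should use the factor's labels, not its internal integer codes.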

Hope this helps,

Michael

On Fri, Sep 16, 2011 at 6:57 PM, Bharat Kherwa <bharatram.mnit at gmail.com>
wrote:
Hi Michael

Thanks for the quick reply. I tried your solution, but it still
doesn't work (it gives an error about the dimension):

apply(as.character(data$V2),1,strsplit,",")
Error in apply(as.character(data$V2), 1, strsplit, ",") :
  dim(X) must have a positive length

I tried the above with all the data too and still got a similar error.
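
The error is consistent with how apply() works: it expects a matrix or array, while as.character(data$V2) is a plain character vector with no dim attribute. strsplit() is already vectorised, so it can be called on the vector directly. A small sketch, using a made-up two-element factor `v` in place of data$V2:

```r
# Hypothetical stand-in for data$V2 (two values copied from the thread)
v <- factor(c("09:30:00,119.18,119.19,119.18,119.19,0,0,0.0,false",
              "09:30:01,119.21,119.21,119.19,119.21,0,0,0.0,false"))

# A character vector has no dimensions, which is what apply() complains about
dim(as.character(v))

# strsplit() accepts the whole vector and returns one character vector per element
parts <- strsplit(as.character(v), ",", fixed = TRUE)

# Stack the pieces into a character matrix: one row per observation
m <- do.call(rbind, parts)
```

The rbind step assumes every element splits into the same number of fields, which holds for the sample rows shown in this thread.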

I am attaching a file here whose data I need to access.

When I read the data using read.table('test'), it is read in as a data
frame, and str(data) gives this:

str(data)
'data.frame':   23399 obs. of  2 variables:
 $ V1: Factor w/ 1 level "spy,20110815": 1 1 1 1 1 1 1 1 1 1 ...
 $ V2: Factor w/ 23399 levels
"01:00:00,119.92,119.92,119.92,119.92,0,0,0.0 , ...

I need to access the individual elements of each row of V2.
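
The two-column result is consistent with read.table() splitting on whitespace, its default, so each line breaks only at the space between the date and the time. One option, sketched here under the assumption that every line has the same comma-separated layout as the sample rows, is to state the separator explicitly. The `text =` argument stands in for the real file.

```r
# Two sample rows copied from the thread
raw <- c("spy,20110815 09:30:00,119.18,119.19,119.18,119.19,0,0,0.0,false",
         "spy,20110815 09:30:01,119.21,119.21,119.19,119.21,0,0,0.0,false")

# sep = "," splits on commas; with a real file, pass its path instead of text =
dat <- read.table(text = raw, sep = ",", header = FALSE,
                  stringsAsFactors = FALSE)

str(dat)      # inspect the inferred column types
dat[1, 3]     # a single field of the first row, as a number
```

With stringsAsFactors = FALSE the text columns stay as character, so no factor conversion is needed afterwards.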

Please let me know,
It's a great help,

Thanks a lot
Bharat

On Fri, Sep 16, 2011 at 3:42 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
I'd guess that the data frame could be easily converted to strings without
losing any information, so why don't you try something like

apply(as.character(V),1,strsplit,",")

and then coerce the columns to their appropriate types and recombine them
into a big data frame?

If this doesn't work, I'd happily take a closer look at the problem, but
please use dput() so I can just copy it directly into my console and make
sure I get just the right sort of thing.
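
For readers unfamiliar with dput(): it prints an R expression that recreates an object, which is why it suits mailing-list posts. A small sketch with a made-up two-row data frame named `small`:

```r
# Hypothetical miniature of the poster's data frame
small <- data.frame(V1 = c("spy,20110815", "spy,20110815"),
                    V2 = c("09:30:00,119.18", "09:30:01,119.21"),
                    stringsAsFactors = FALSE)

# Prints a structure(...) call that can be pasted into an email
dput(small)

# Round trip: capture that text and evaluate it to rebuild the object
txt <- capture.output(dput(small))
restored <- eval(parse(text = txt))
```

For a large object, dput(head(data)) keeps the post short.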

Hope this helps,

Michael Weylandt

On Fri, Sep 16, 2011 at 6:18 PM, Bharat Kherwa
<bharatram.mnit at gmail.com>
wrote:
Hi Guys,

I have some daily tick data like this,
            V1                                                 V2
1 spy,20110815 09:30:00,119.18,119.19,119.18,119.19,0,0,0.0,false
2 spy,20110815 09:30:01,119.21,119.21,119.19,119.21,0,0,0.0,false
3 spy,20110815 09:30:02,119.22,119.27,119.21,119.27,0,0,0.0,false
4 spy,20110815 09:30:03,119.26,119.27,119.18,119.18,0,0,0.0,false
5 spy,20110815    09:30:04,119.2,119.2,119.18,119.2,0,0,0.0,false
6 spy,20110815 09:30:05,119.21,119.21,119.18,119.18,0,0,0.0,false

The structure of the data is as follows:

'data.frame':   23399 obs. of  2 variables:
 $ V1: Factor w/ 1 level "spy,20110815": 1 1 1 1 1 1 1 1 1 1 ...
 $ V2: Factor w/ 23399 levels
"01:00:00,119.92,119.92,119.92,119.92,0,0,0.0,false",..: 10800 10801
10802 10803 10804 10805 10806 10807 10808 10809 ...

Now I want to access the individual elements of data[1,2], which is
09:30:00,119.18,119.19,119.18,119.19,0,0,0.0,false,
i.e. access 119.18 separately, and similarly for the other entries in the
row.
How do I do that? I tried every combination I could think of to access
the individual elements, with no success.
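
One way to pull a single value out of such a cell, sketched with a one-cell stand-in for data[1,2]: convert the factor to character first, then split on commas. The names `cell`, `fields`, `first_price` and `flag` are illustrative only.

```r
# Stand-in for data[1, 2], which is a factor in the original data frame
cell <- factor("09:30:00,119.18,119.19,119.18,119.19,0,0,0.0,false")

# as.character() recovers the label; as.numeric() on a factor would give its integer code
fields <- strsplit(as.character(cell), ",", fixed = TRUE)[[1]]

first_price <- as.numeric(fields[2])    # second field of the row
flag        <- fields[9] == "true"      # last field as a logical
```

The [[1]] is needed because strsplit() always returns a list, even for a single input string.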

Any help will be greatly appreciated.

Thanks a lot
Bharat

_______________________________________________
R-SIG-Finance at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions
should go.