Reading in and writing out one line at a time

5 messages · Uwe Ligges, Jan van der Laan, Johannes Hüsing +1 more

#
I hope that somebody can help me with this (I think very simple) issue.

I am running a package that only accepts one line at a time, but I would
like to run this package on a dataframe of >500 lines. 

Dataframe "d" is a single column:

APPLES
PEARS
AUBERGINES
KUMQUATS

I would like to read one line of my dataframe "d" at a time into a new
frame "f", then execute the program on "f", which provides an output:
[[1]]
[1]  FREDBLOGGS

[2]  250

I would like to record this output to a two column dataframe, "r", such as:

FREDBLOGGS     250

and then repeat the process on the next line of dataframe "d", and so on to
the end of dataframe "d", writing each result into "r" so that the dataframe
"r" eventually reads:

FREDBLOGGS     250
JAMESJONES      175
TERRYTAITE       892
HARRYSMITH     320


I'm afraid that I'm new to this, but I think that this first step will be
very useful in general. Thank you kindly for any help.
#
On 20.05.2010 00:47, sedm1000 wrote:
This will be slow, therefore I'd suggest to read the whole data frame at 
once and then feed it row by row into your function.

Uwe Ligges
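
A minimal sketch of that suggestion, for illustration only. The package function is not named in the question, so mash_stub below is a hypothetical stand-in that returns output shaped like the example (a list holding a two-element character vector); the column names in "r" are assumptions as well.

```r
# Hypothetical stand-in for the real package function: it returns a list
# holding one character vector of length 2, like the output in the question.
mash_stub <- function(x) {
  list(c(paste0("NAME_", x), as.character(nchar(x))))
}

# The whole data frame is held in memory at once.
d <- data.frame(fruit = c("APPLES", "PEARS", "AUBERGINES", "KUMQUATS"),
                stringsAsFactors = FALSE)

# Preallocate the results data frame, one row per row of d.
r <- data.frame(name  = character(nrow(d)),
                value = numeric(nrow(d)),
                stringsAsFactors = FALSE)

# Feed d to the function row by row, in order.
for (i in seq_len(nrow(d))) {
  out <- mash_stub(d[i, 1])[[1]]    # unwrap the [[1]] element
  r$name[i]  <- out[1]
  r$value[i] <- as.numeric(out[2])
}

print(r)
```

Filling a preallocated "r" avoids growing the data frame with rbind on every iteration, which tends to get slow as the number of rows increases.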
#
Perhaps you mean something like sapply or apply?

When d is indeed a data.frame with one column: sapply(d[,1], mash)

Regards,

Jan van der Laan
On Thu, May 20, 2010 at 12:47 AM, sedm1000 <gdoran at mit.edu> wrote:
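
A rough sketch of the apply-style approach, assuming the function returns a list holding a two-element character vector as in the question. mash_stub is a hypothetical stand-in for the real function, and lapply is used here instead of sapply so that the shape of the result is predictable.

```r
# Hypothetical stand-in for the real package function.
mash_stub <- function(x) {
  list(c(paste0("NAME_", x), as.character(nchar(x))))
}

d <- data.frame(fruit = c("APPLES", "PEARS", "AUBERGINES", "KUMQUATS"),
                stringsAsFactors = FALSE)

# Call the function once per element and keep only the character vector.
res <- lapply(d[, 1], function(x) mash_stub(x)[[1]])

# Stack the length-2 vectors into a 4 x 2 character matrix,
# then build the two-column results data frame from it.
m <- do.call(rbind, res)
r <- data.frame(name  = m[, 1],
                value = as.numeric(m[, 2]),
                stringsAsFactors = FALSE)

print(r)
```

The list returned by lapply has its elements in the same order as the input, so the rows of "r" line up with the rows of "d".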
#
Jan van der Laan <djvanderlaan at gmail.com> [Fri, May 21, 2010 at 11:22:35AM CEST]:
Neither am I certain what he means exactly, but I would use apply and cousins
only if I don't care in which sequence the elements are processed. His problem
(reading line by line) sounds as if the order of processing matters.
#
Hi, thanks for your insight.

The problem that I have is that the program reports back an error when I try
to read in a multi-row frame:

Error in FUN(X[[1L]], ...) :
 STRING_ELT() can only be applied to a 'character vector', not a 'integer

So I can only feed in one row at a time. The data is then output thus:

[[1]]
[1]  FREDBLOGGS

[2]  250 


As best I can understand, I need to split my multi-row dataframe into
single rows, record each output as a single row in a new results
dataframe, and rbind the next output onto the results dataframe.

If anybody can explain the error message, that would be useful too.

Thanks.
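
One possible reading of the error message, offered as a guess rather than a diagnosis: the message says that something expecting a character vector received integers. In R versions of that time, data.frame() and read.table() converted strings to factors by default, and a factor is stored as integer codes. If the column of "d" is a factor, converting it with as.character() before calling the function may help. The sketch below only demonstrates the storage types; the column name is made up.

```r
# A string column stored as a factor (the default behaviour at the time).
d <- data.frame(fruit = c("APPLES", "PEARS"), stringsAsFactors = TRUE)

class(d[, 1])     # "factor"
typeof(d[, 1])    # "integer": the labels are stored as integer codes

# Convert to a plain character vector before passing it on.
fruit_chr <- as.character(d[, 1])
typeof(fruit_chr) # "character"
```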