Date: Wed, 6 May 2009 06:42:45 -0700
From: jrkrideau at yahoo.ca
To: r-help at stat.math.ethz.ch; fjbuch at gmail.com
CC: ross.lazarus at gmail.com; gregory_warnes at urmc.rochester.edu;
greg at warnes.net
Subject: Re: [R] Do you use R for data manipulation?
--- On Wed, 5/6/09, Farrel Buchinsky <fjbuch at gmail.com> wrote:
> Is R an appropriate tool for data manipulation and data reshaping and
> data organizing? I think so but someone who recently joined our group
> thinks not.
I only do small-scale projects and am by no means a programmer. Isn't Perl
something for earrings?
That said, I find R to be extremely useful for data manipulation and have
used it exclusively in my last three projects. The different data
structures alone are worth their weight in gold, if for no other reason
than making it harder to make stupid mistakes in coding.
> The new recruit believes that python or another language is a far
> better tool for developing data manipulation scripts that can be then
> used by several members of our research group. Her assessment is that
> R is useful only when it comes to data analysis and working with
> statistical models.
Any reason that she thinks this? How well does she know R? It is not
exactly a language that one picks up in a week, especially if one is
coming from using a stats package like SAS or SPSS. As an ex-SAS and
SYSTAT user it took me weeks to just get comfortable with the power of
subscripting and the ability to do all kinds of calculations "in-line".
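To illustrate what subscripting and "in-line" calculation look like, here
is a minimal sketch in base R. The data frame `df` and its columns are
invented purely for illustration and do not come from any real project.

```r
# Hypothetical data frame, made up for this example only
df <- data.frame(
  id     = 1:6,
  group  = c("a", "b", "a", "b", "a", "b"),
  weight = c(61.2, 75.0, 58.4, 82.3, 67.9, 90.1),  # kg
  height = c(1.65, 1.80, 1.60, 1.75, 1.70, 1.85)   # m
)

# Subscripting: keep only the rows that meet a condition,
# and only the columns named in the second index
heavy <- df[df$weight > 70, c("id", "group", "weight")]

# "In-line" calculation: the body-mass index is computed inside the
# subscript itself, without creating a temporary column first
high_bmi <- df[df$weight / df$height^2 > 25, ]

# Group-wise summary in a single expression
mean_by_group <- tapply(df$weight, df$group, mean)
```

In a package like SAS, each of these would typically be a separate data
step or procedure; in R each is a single expression on the data frame.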
> So what do you think:
> 1) R is a phenomenally powerful and flexible tool and since you are
> going to do analyses in R you might as well use it to read data in and
> merge it and reshape it to whatever you need.
Definitely. I am not a computer scientist or a statistician. I usually
work as a single contractor and normally with small datasets as part of
a larger project. R does what I want, usually very elegantly (albeit
perhaps after a lot of headbanging and calls for help to the R-list) and
it would be stupid for me to use more than one language when it is not
needed.
Another plus is that I can easily leave my data analysis work and a
working copy of R with the client. He/she may have a problem seeing what
I did but it is clearly readable & replicable by either the client or
another consultant.
> OR
> 2) Are you crazy? Nobody in their right mind uses R to pipe the data
> around their lab and assemble it for analysis.
Well, I don't work in a lab, but why complicate things? If everyone is using
the same tools then you have a good situation. Others who do work in labs
can address this point more cogently.
From a personnel point of view do you expect everyone in the lab to be
proficient with R and, for example, Perl? What happens when/if you lose
your Perl expert(s)? I've had occasions where I waited a week for data
simply because the division's MS Access "expert" was on holiday and the
only other "Access" person there only knew how to enter data and run the
monthly reports. Anything more complicated required the "expert".