Do you use R for data manipulation?
Warren Young wrote:
Farrel Buchinsky wrote:
Is R an appropriate tool for data manipulation and data reshaping and data organizing? I think so but someone who recently joined our group thinks not. The new recruit believes that python or another language is a far better tool for developing data manipulation scripts that can be then used by several members of our research group. Her assessment is that R is useful only when it comes to data analysis and working with statistical models.
It's hard to shift people's individual preferences, but impressive
objective comparisons are easy to come by. Ask her how many lines it
would take to do this trivial R task in Python:
data <- read.csv('original-data.csv')
write.csv(data * 10, 'scaled-data.csv')
you might want to learn that this is a question of appropriate
libraries. in r, read.csv and write.csv reside in the package utils.
in python, you'd use numpy:
from numpy import loadtxt, savetxt
savetxt('scaled.csv', loadtxt('original.csv', delimiter=',')*10,
delimiter=',')
this makes 2 lines, together with importing the library.
R's ability to do something to an entire data structure -- or a slice of it, or some other subset -- in a single operation is very useful when cleaning up data for presentation and analysis.
but this is really *hardly* r-specific. you can do that in many, many languages, be assured. just peek out.
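As a rough illustration of the point about whole-structure and subset operations, here is a minimal numpy sketch. The array contents and the variable names (data, scaled, by_column, by_mask) are made up for the example and do not come from anyone's actual data.

```python
import numpy as np

# Hypothetical sample data: 3 rows by 4 numeric columns (values 0..11).
data = np.arange(12, dtype=float).reshape(3, 4)

# The whole structure in a single operation.
scaled = data * 10

# A slice: scale only the second column (index 1) of a copy.
by_column = data.copy()
by_column[:, 1] *= 10

# Some other subset: scale only the entries greater than 5.
by_mask = data.copy()
by_mask[by_mask > 5] *= 10
```

The R equivalents would be along the lines of data * 10, data[, 2] * 10 and data[data > 5] * 10, so the two languages are quite close on this particular point.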
Also point out how easy it is to get data *out* of R, as above, not just into it, so you can then hack on it in Python, if that's the better language for further manipulation. If she gives you static about how a few more lines are no big deal, remind her that it's well established that bug count is always a simple function of line count. This fact has been known since the 1970s.
that's a slogan, esp. when you think of how compact (but unreadable, and thus error-prone) code written in perl can be. often, more lines of code make it easier to maintain, and thus help avoid bugs.
While making your points, remember that she has a good one, too: R is not the only good language out there. You should learn Python while she's learning R.
+1