Data.table vs dplyr handling multiple variables

2 messages · Ek Esawi, Jeff Newmiller

#
Hi All,

I often work with large datasets containing multiple variable types (integer,
decimal, string, complex, date, and time) that require processing,
cleaning, etc. I am relatively new to R and would like some input
on the following issue: I am trying to figure out which R package(s) is
most suitable for my work. I looked into data.table and dplyr. Both are
very good, but I found that data.table does not handle time data well
(one has to use the fasttime package), and I am not sure whether dplyr has
the same limitation. I am also not sure how either package handles the other
variable types listed above. I like data.table.


The questions: (1) Which package should I invest in learning, and how do I
deal with issues like time data and possibly other variables such as complex
numbers, dates, etc.? (2) What is the "best" practical solution for such
issues?



Thanks in advance,


EKE
#
All approaches have strong points and weak points. Your question has no clear answer.

I happen to like dplyr for many things (including lots of timestamp values), but base R is always there to solve problems if the analysis framework-du-jour has troubles. So learn base R ways of doing things if nothing else. 
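
As a rough illustration of the base R route, here is a minimal sketch that assumes a small made-up data frame (the column names, values, and the UTC time zone are all hypothetical, not taken from the original question). It only shows that a plain data frame can hold integer, decimal, string, complex, date, and time columns side by side, and that timestamps can be parsed with as.POSIXct() without any extra package.

```r
# Hypothetical data: every name and value here is made up for illustration.
df <- data.frame(
  id    = 1:3,
  value = c(1.5, 2.25, 3.0),
  label = c("a", "b", "c"),
  z     = complex(real = c(1, 2, 3), imaginary = c(0, -1, 2)),
  stamp = c("2017-01-01 08:30:00",
            "2017-01-02 12:00:00",
            "2017-01-03 23:15:00"),
  stringsAsFactors = FALSE
)

# Parse the text timestamps into POSIXct, stating the time zone explicitly
# (UTC is an assumption; use whatever zone the data were recorded in).
df$stamp <- as.POSIXct(df$stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Derive a calendar date column from the timestamp.
df$day <- as.Date(df$stamp, tz = "UTC")

# Elapsed hours since the first timestamp, as a plain numeric column.
df$hours <- as.numeric(difftime(df$stamp, df$stamp[1], units = "hours"))

# Show the (first) class of every column.
print(sapply(df, function(col) class(col)[1]))
```

Once the columns are stored as POSIXct and Date, the same data frame can usually be passed on to dplyr or data.table, so time spent on the base R classes is not wasted whichever package is chosen later.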

For next time: please read the Posting Guide. Give us a minimal example in R of what you are trying to accomplish, along with your description and what you think the right answer will look like (consider using the reprex R package). Also turn off HTML in your email program, at least for mails sent to this list: HTML gets damaged to varying degrees by the mailing list, and then we are left puzzled about what you were asking.