
Sum Question

4 messages · Edgar Alminar, Marc Schwartz, Dennis Murphy

#
     RID      SCRNO VISCODE RECNO CONTTIME
338   43 HBA0020036      bl     1        9
1187  95 HBA0020087      bl     1        3
3251 230 HBA0020209      bl     2        3
3258 230 HBA0020209      bl     1       28
3321 235 HBA0020213      bl     2        5
3351 235 HBA0020213      bl     1        6
3436 247 HBA0020222      bl     1        5
3456 247 HBA0020222      bl     2        4
4569 321 HBA0020292      bl    13        2
4572 321 HBA0020292      bl     5       13
4573 321 HBA0020292      bl     1       25
4576 321 HBA0020292      bl     7        5
4578 321 HBA0020292      bl     8        2
4581 321 HBA0020292      bl     4        4
4582 321 HBA0020292      bl     9        5
4586 321 HBA0020292      bl    12        2
4587 321 HBA0020292      bl     6        2
4590 321 HBA0020292      bl    10        3
4591 321 HBA0020292      bl    11        7
#
On Jun 30, 2011, at 11:20 AM, Edgar Alminar wrote:

            
That is not the entire dataset....HBA0020366 is missing, as an example.

I don't use the data.table package, but if you are getting an error indicating that CONTTIME is a factor, then something is wrong with either the data itself (there are non-numeric entries) or the way in which it was entered/imported into R.

Thus, I would first check your data for errors. Use str(YourDataSet) to review its structure; if CONTTIME is a factor, look through the data to find out why.

Lastly, review this R FAQ:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-do-I-convert-factors-to-numeric_003f
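
A minimal sketch of the conversion that FAQ entry describes, and of how to spot the entries that turned the column into a factor. The vector 'conttime' and its stray "n/a" value are made up for illustration; they are not from the posted data.

```r
# A column imported as a factor because of one non-numeric entry
# (hypothetical values, for illustration only)
conttime <- factor(c("9", "3", "n/a", "28"))

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(conttime)

# Convert through the labels instead
labels <- as.character(conttime)
fixed <- suppressWarnings(as.numeric(labels))  # non-numeric labels become NA

# The entries that could not be converted are the ones to fix at the source
bad <- labels[is.na(fixed)]
print(bad)
```

Once the offending entries are corrected in the source file, re-importing should give an integer or numeric column directly.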

Just as an alternative, with your data in 'DF':
     RID      SCRNO VISCODE RECNO CONTTIME
338   43 HBA0020036      bl     1        9
1187  95 HBA0020087      bl     1        3
3251 230 HBA0020209      bl     2        3
3258 230 HBA0020209      bl     1       28
3321 235 HBA0020213      bl     2        5
3351 235 HBA0020213      bl     1        6
3436 247 HBA0020222      bl     1        5
3456 247 HBA0020222      bl     2        4
4569 321 HBA0020292      bl    13        2
4572 321 HBA0020292      bl     5       13
4573 321 HBA0020292      bl     1       25
4576 321 HBA0020292      bl     7        5
4578 321 HBA0020292      bl     8        2
4581 321 HBA0020292      bl     4        4
4582 321 HBA0020292      bl     9        5
4586 321 HBA0020292      bl    12        2
4587 321 HBA0020292      bl     6        2
4590 321 HBA0020292      bl    10        3
4591 321 HBA0020292      bl    11        7
    DF$SCRNO CONTTIME
1 HBA0020036        9
2 HBA0020087        3
3 HBA0020209       31
4 HBA0020213       11
5 HBA0020222        9
6 HBA0020292       70


See ?aggregate
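
A minimal sketch of how per-SCRNO totals like these can be computed with aggregate(), assuming the data frame is named DF as above. Only the two columns needed for the sum are rebuilt here, and the formula interface shown is one option, not necessarily the exact call used in this message.

```r
# Rebuild SCRNO and CONTTIME from the posted rows
DF <- data.frame(
  SCRNO = rep(c("HBA0020036", "HBA0020087", "HBA0020209",
                "HBA0020213", "HBA0020222", "HBA0020292"),
              times = c(1, 1, 2, 2, 2, 11)),
  CONTTIME = c(9, 3, 3, 28, 5, 6, 5, 4,
               2, 13, 25, 5, 2, 4, 5, 2, 2, 3, 7)
)

# Sum CONTTIME within each SCRNO; columns are looked up in 'data',
# so no 'DF$' prefix is needed inside the formula
totals <- aggregate(CONTTIME ~ SCRNO, data = DF, FUN = sum)
print(totals)
```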

HTH,

Marc Schwartz
#
On Jun 30, 2011, at 12:30 PM, Marc Schwartz wrote:

            
Quick typo correction here. The 'DF$' in DF$SCRNO is superfluous. I did not clean that up before copying and pasting.
       SCRNO CONTTIME
1 HBA0020036        9
2 HBA0020087        3
3 HBA0020209       31
4 HBA0020213       11
5 HBA0020222        9
6 HBA0020292       70


Marc
#
Hi:

Here's a data.table solution. After I read in your data as a data
frame named dd, I used str() to check its contents:
'data.frame':   19 obs. of  5 variables:
 $ RID     : int  43 95 230 230 235 235 247 247 321 321 ...
 $ SCRNO   : Factor w/ 6 levels "HBA0020036","HBA0020087",..: 1 2 3 3 4 4 5 5 6 6 ...
 $ VISCODE : Factor w/ 1 level "bl": 1 1 1 1 1 1 1 1 1 1 ...
 $ RECNO   : int  1 1 2 1 2 1 1 2 13 5 ...
 $ CONTTIME: int  9 3 3 28 5 6 5 4 2 13 ...

If you were getting CONTTIME as a factor, I'm guessing you put all of
this into a matrix (cbind?) and then read it into data.table. If so,
you need to spend a little time reading up on the differences between
matrices and data frames. A data table is meant to be a generalization
of a data frame.  It's important that you know the classes of your
objects and how to coerce them from one class to another if necessary.
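
To illustrate the matrix versus data frame point, here is a small sketch using two of the posted rows. That the original data went through cbind() is only a guess, as said above.

```r
# cbind() on vectors builds a matrix, and a matrix holds a single type,
# so mixing text and numbers turns every entry into character
m <- cbind(SCRNO = c("HBA0020036", "HBA0020087"), CONTTIME = c(9, 3))
print(typeof(m))

# data.frame() keeps each column's own type
dd2 <- data.frame(SCRNO = c("HBA0020036", "HBA0020087"), CONTTIME = c(9, 3))
str(dd2)

# Coercing the matrix afterwards leaves CONTTIME as text (a factor in
# older R versions), so it has to be converted back explicitly
from_matrix <- as.data.frame(m)
from_matrix$CONTTIME <- as.numeric(as.character(from_matrix$CONTTIME))
str(from_matrix)
```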
That aside,
data.table 1.6
Quick start guide : vignette("datatable-intro")
Homepage : http://datatable.r-forge.r-project.org/
Help : help("data.table") or ?data.table (includes fast start examples)
          SCRNO csum
[1,] HBA0020036    9
[2,] HBA0020087    3
[3,] HBA0020209   31
[4,] HBA0020213   11
[5,] HBA0020222    9
[6,] HBA0020292   70

Using the list() wrapper is useful, especially if you want to output
multiple variables or if you want to assign a name to the derived
summary variable.
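
A minimal sketch of a grouped sum with the list() wrapper, assuming a current data.table release (the printed format and some features differ from version 1.6). Only the two columns needed are rebuilt from the posted rows; 'nrec' is an extra summary added here to show several outputs at once.

```r
library(data.table)

# Rebuild SCRNO and CONTTIME from the posted rows
dd <- data.frame(
  SCRNO = rep(c("HBA0020036", "HBA0020087", "HBA0020209",
                "HBA0020213", "HBA0020222", "HBA0020292"),
              times = c(1, 1, 2, 2, 2, 11)),
  CONTTIME = c(9, 3, 3, 28, 5, 6, 5, 4,
               2, 13, 25, 5, 2, 4, 5, 2, 2, 3, 7)
)
dt <- as.data.table(dd)

# list() names the derived column
res <- dt[, list(csum = sum(CONTTIME)), by = SCRNO]
print(res)

# Several summaries at once: the total and the number of records per SCRNO
res2 <- dt[, list(csum = sum(CONTTIME), nrec = .N), by = SCRNO]
print(res2)
```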

HTH,
Dennis
On Thu, Jun 30, 2011 at 9:20 AM, Edgar Alminar <eaalminar at ucsd.edu> wrote: