Message-ID: <ACE465E1-1207-432A-B253-B394701473A6@comcast.net>
Date: 2009-11-13T11:41:51Z
From: David Winsemius
Subject: processing log file
In-Reply-To: <313692.68426.qm@web28505.mail.ukl.yahoo.com>

On Nov 13, 2009, at 6:03 AM, Jabez Wilson wrote:

> Dear all, I'm trying to process a log file which logs the date, the  
> username and the computer number accessed. The table looks like this:
>> table.users
>          Date UserName Machine
> 1  2008-11-25     John     641
> 2  2008-11-25    Clive     611
> 3  2008-11-25   Jeremy     641
> 4  2008-11-25     Walt     722
> 5  2008-11-25     Tony     645
> 6  2008-11-26     Tony     645
> 7  2008-11-26     Tony     641
> 8  2008-11-26     Tony     641
> 9  2008-11-26     Walt     641
> 10 2008-11-26     Walt     645
> 11 2008-11-30     John     641
> 12 2008-11-30    Clive     611
> 13 2008-11-30     Tony     641
> 14 2008-11-30     John     641
> 15 2008-11-30     John     641
> ..................etc
> What I want to do is to find out how many unique users logged on  
> each day, and how many individual machines were accessed per day.  
> In the above example, therefore on 2008-11-25 there were 5 separate  
> users accessing 4 machines, on 2008-11-26 there were 2 unique users  
> who used 2 machines (although both logged on more than once).
> I've got as far as apply(table.users, 2, FUN=table) which gives me  
> an output of date, or username or machine and how many times they  
> were accessed, but not really what I want.
> Any help appreciated

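You were almost there. Applying lapply with length to the list object you
produced gives the number of distinct values in each column (counted over
the whole table, not per day):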

 > lapply(apply(table.users, 2, FUN=table), length)
$Date
[1] 3

$UserName
[1] 5

$Machine
[1] 4

Or, if you want just one of those counts on its own:

 > lapply(apply(table.users, 2, FUN=table), length)$UserName
[1] 5
 > lapply(apply(table.users, 2, FUN=table), length)$Machine
[1] 4


-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT