processing log file
3 messages · Jabez Wilson, David Winsemius, Karl Ove Hufthammer
On Nov 13, 2009, at 6:03 AM, Jabez Wilson wrote:
Dear all, I'm trying to process a log file which logs the date, the username and the computer number accessed. The table looks like this:
table.users
         Date UserName Machine
1  2008-11-25     John     641
2  2008-11-25    Clive     611
3  2008-11-25   Jeremy     641
4  2008-11-25     Walt     722
5  2008-11-25     Tony     645
6  2008-11-26     Tony     645
7  2008-11-26     Tony     641
8  2008-11-26     Tony     641
9  2008-11-26     Walt     641
10 2008-11-26     Walt     645
11 2008-11-30     John     641
12 2008-11-30    Clive     611
13 2008-11-30     Tony     641
14 2008-11-30     John     641
15 2008-11-30     John     641
..................etc

What I want to do is to find out how many unique users logged on each day, and how many individual machines were accessed per day. In the above example, on 2008-11-25 there were 5 separate users accessing 4 machines, and on 2008-11-26 there were 2 unique users who used 2 machines (although both logged on more than once).

I've got as far as

apply(table.users, 2, FUN=table)

which gives me, for each of date, username and machine, how many times each value occurs, but that is not really what I want.

Any help appreciated
You were almost there. Just use lapply on the list object you produced:

> lapply(apply(table.users, 2, FUN=table), length)
$Date
[1] 3

$UserName
[1] 5

$Machine
[1] 4

Or if you want the individual items that you requested:

> lapply(apply(table.users, 2, FUN=table), length)$UserName
[1] 5
> lapply(apply(table.users, 2, FUN=table), length)$Machine
[1] 4
David Winsemius, MD Heritage Laboratories West Hartford, CT
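
Note that the lapply() call above counts distinct values over the whole log (3 dates, 5 users, 4 machines) rather than per day. A minimal base R sketch of per-day counts follows, assuming the data frame is rebuilt from the 15 rows posted in the question; the helper name n.unique and the result names users.per.day and machines.per.day are made up for illustration and are not from the thread.

```r
# Rebuild the 15 sample rows from the original post.
table.users <- data.frame(
  Date     = rep(c("2008-11-25", "2008-11-26", "2008-11-30"), each = 5),
  UserName = c("John", "Clive", "Jeremy", "Walt", "Tony",
               "Tony", "Tony", "Tony", "Walt", "Walt",
               "John", "Clive", "Tony", "John", "John"),
  Machine  = c(641, 611, 641, 722, 645,
               645, 641, 641, 641, 645,
               641, 611, 641, 641, 641),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of distinct values in a vector.
n.unique <- function(x) length(unique(x))

# Group each column by Date and count the distinct values in each group.
users.per.day    <- tapply(table.users$UserName, table.users$Date, n.unique)
machines.per.day <- tapply(table.users$Machine,  table.users$Date, n.unique)

print(users.per.day)
print(machines.per.day)
```

On the sample rows this gives 5 users and 4 machines for 2008-11-25 and 2 users and 2 machines for 2008-11-26, which matches the counts asked for in the question.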
On Fri, 13 Nov 2009 11:03:31 +0000 (GMT) Jabez Wilson <jabezwuk at yahoo.co.uk> wrote:
What I want to do is to find out how many unique users logged on each day, and how many individual machines were accessed per day.
Use the 'plyr' package:

library(plyr)
ddply(table.users, .(Date), summarise,
      users=length(unique(UserName)),
      machines=length(unique(Machine)))
Karl Ove Hufthammer
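
If installing an extra package is not an option, a similar one-row-per-day data frame can probably be obtained with aggregate() from base R. This is only a sketch under the assumption that table.users has the three columns shown in the question (rebuilt here from the posted rows); the result name per.day is made up for illustration.

```r
# Rebuild the 15 sample rows from the original post.
table.users <- data.frame(
  Date     = rep(c("2008-11-25", "2008-11-26", "2008-11-30"), each = 5),
  UserName = c("John", "Clive", "Jeremy", "Walt", "Tony",
               "Tony", "Tony", "Tony", "Walt", "Walt",
               "John", "Clive", "Tony", "John", "John"),
  Machine  = c(641, 611, 641, 722, 645,
               645, 641, 641, 641, 645,
               641, 611, 641, 641, 641),
  stringsAsFactors = FALSE
)

# For every Date, count the distinct values in each of the other columns.
per.day <- aggregate(table.users[c("UserName", "Machine")],
                     by  = table.users["Date"],
                     FUN = function(x) length(unique(x)))

# Rename the count columns to match the names used in the ddply() call.
names(per.day) <- c("Date", "users", "machines")
print(per.day)
```

The result has one row per date with the number of distinct users and machines, so it can be merged or plotted like any other data frame.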