user in the first column.
Here is an example of my data:
Email Email_sent
john at doe.com "2013-09-26 15:59:55" "2013-09-27 09:48:29" "2013-09-27 10:00:02" "2013-09-27 10:12:54"
jane at shoe.com "2013-09-26 09:50:28" "2013-09-26 14:41:24" "2013-09-26 14:51:36" "2013-09-26 17:50:10" "2013-09-27 13:34:02" "2013-09-27 14:41:10" "2013-09-27 15:37:36"
...
I cannot find a way to calculate, for each user, the frequency at which emails are sent:
john at doe.com 0.02 email / hour
jane at shoe.com 0.15 email / hour
...
Can anyone help me with this problem?
You could do something like this:
## scan your data file
d <- scan(, what = "character")
## here I use the data from above
## (addresses are written with "@" here so that the grep() below can find them)
d <- scan(textConnection('john@doe.com "2013-09-26 15:59:55"
"2013-09-27 09:48:29" "2013-09-27 10:00:02" "2013-09-27 10:12:54"
jane@shoe.com "2013-09-26 09:50:28" "2013-09-26 14:41:24"
"2013-09-26 14:51:36" "2013-09-26 17:50:10" "2013-09-27 13:34:02"
"2013-09-27 14:41:10" "2013-09-27 15:37:36"'), what = "character")
## find position of e-mail addresses
n <- grep("@", d, fixed = TRUE)
## extract list of dates
n <- c(n, length(d) + 1)
x <- lapply(1:(length(n) - 1),
  function(i) as.POSIXct(d[(n[i] + 1):(n[i + 1] - 1)]))
## add e-mail addresses as names
names(x) <- d[head(n, -1)]
## functions that could extract quantities of interest such as
## number of mails per hour or mean time difference etc.
meantime <- function(timevec)
  mean(as.numeric(diff(timevec), units = "hours"))
numperhour <- function(timevec)
  length(timevec) / as.numeric(diff(range(timevec)), units = "hours")
## apply to full list
sapply(x, numperhour)
sapply(x, meantime)
## apply to list by date
sapply(x, function(timevec) tapply(timevec, as.Date(timevec), numperhour))
sapply(x, function(timevec) tapply(timevec, as.Date(timevec), meantime))
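One caveat with the per-day calls: on a day with a single e-mail, diff(range(timevec)) is zero, so numperhour() returns Inf and meantime() returns NaN. A minimal sketch of a guarded variant follows. The name numperhour_safe and the choice of returning NA are assumptions for illustration, not part of the code above, and the time zone is fixed to UTC only so that the grouping by day does not depend on the session settings.

```r
## hypothetical guarded variant of numperhour():
## returns NA when a rate cannot be computed (fewer than two time
## stamps, or all time stamps identical)
numperhour_safe <- function(timevec) {
  if (length(timevec) < 2) return(NA_real_)
  span <- as.numeric(diff(range(timevec)), units = "hours")
  if (span == 0) return(NA_real_)
  length(timevec) / span
}

## john's time stamps from the example data, with an explicit time zone
john <- as.POSIXct(c("2013-09-26 15:59:55", "2013-09-27 09:48:29",
                     "2013-09-27 10:00:02", "2013-09-27 10:12:54"),
                   tz = "UTC")

## group by calendar day (in the same time zone) and apply the guard
day <- format(john, "%Y-%m-%d")
res <- tapply(john, day, numperhour_safe)
res
```

Here 2013-09-26 has only one e-mail and comes out as NA, while 2013-09-27 gives three e-mails divided by the time between the first and the last of them. Note that this counts e-mails per elapsed hour; dividing length(timevec) - 1 by the span would count intervals instead, which may or may not be what is wanted.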
hth,
Z
The ultimate goal (which seems ambitious at this time) is to calculate, for each user and each day, the email frequency between the first and the last email sent
that day (to avoid taking nights into account), i.e.:
                  2013-09-26           2013-09-27
john at doe.com   1.32 emails / hour   0.56 emails / hour
jane at shoe.com  10.57 emails / hour  2.54 emails / hour
...
At this time it seems pretty impossible, but I guess I will eventually find a way :-)
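For what it is worth, a users-by-days table of this shape could be assembled roughly as follows. This is only a sketch under assumptions: the input is a named list of POSIXct vectors (one per user, rebuilt here from the example data), the helper name rate is made up for illustration, days with fewer than two emails are reported as NA, and UTC is used purely to keep the day grouping reproducible.

```r
## example data as a named list of time stamps, one element per user
x <- list(
  "john@doe.com"  = as.POSIXct(c("2013-09-26 15:59:55", "2013-09-27 09:48:29",
                                 "2013-09-27 10:00:02", "2013-09-27 10:12:54"),
                               tz = "UTC"),
  "jane@shoe.com" = as.POSIXct(c("2013-09-26 09:50:28", "2013-09-26 14:41:24",
                                 "2013-09-26 14:51:36", "2013-09-26 17:50:10",
                                 "2013-09-27 13:34:02", "2013-09-27 14:41:10",
                                 "2013-09-27 15:37:36"),
                               tz = "UTC"))

## hypothetical helper: emails per hour between first and last email,
## NA if that span is zero (e.g. only one email on that day)
rate <- function(timevec) {
  span <- as.numeric(diff(range(timevec)), units = "hours")
  if (length(timevec) < 2 || span == 0) NA_real_ else length(timevec) / span
}

## long format: one row per email, with its user and calendar day
long <- data.frame(
  user = rep(names(x), lengths(x)),
  day  = unlist(lapply(x, format, "%Y-%m-%d"), use.names = FALSE),
  time = do.call(c, unname(x)))

## rows = users, columns = days, cells = emails per hour
tab <- tapply(long$time, list(long$user, long$day), rate)
round(tab, 2)
```

Because each cell only uses the emails of one user on one day, the nights between days never enter the calculation.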
Thanks a lot,
Sartene Bel
R learner