Suggest method

1 message · Boris Steipe

Please keep the conversation on the list ...

Here is a toy example to help you think this through:


set.seed(11235)

# artificial random data ...
# miles: normalized between 0 and 1
# flights: number of flights within last year
# since: how many days ago was the last flight booked
# score: just placeholder zeros for now
pList <- data.frame(miles=runif(100), 
                    flights=sample(0:12,100, replace=TRUE), 
                    since=sample(1:365, 100, replace=TRUE), 
                    score=0)
                    
# make up some weighting scheme
fScore <- function(x) {
	m <- x[1]          # reward high number of miles
	f <- x[2]/3        # reward large number of flights
	s <- 10 / x[3]     # penalize if last flight was long ago
	return( m + f + s) # return score as sum of these factors
}

# calculate the scores and put the values into the data frame
pList$score <- apply(pList, MARGIN=1, FUN=fScore)

# Get the top three scoring passengers
pList[order(pList$score, decreasing=TRUE)[1:3], ]

        miles flights since    score
78 0.58376271       5     2 7.250429
94 0.01534421      12     7 5.443916
53 0.93216146      10    11 5.174586

# #78 flew very recently, #94 had lots of flights, #53 has lots of miles ...
# ... upgrade them to receive a free bag of peanuts each.


Note that which passengers get selected depends entirely on how the score function is constructed. Clustering would not contribute anything useful.
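
To illustrate that point, here is a minimal sketch that compares the scheme above with a second, purely hypothetical one, fScoreMiles, in which miles count ten times as much (the factor 10 is an arbitrary assumption, not a recommendation). It rebuilds the same kind of toy data so that it runs on its own, and both functions are written to take the whole data frame rather than a single row. The exact rows you get depend on your R version's random number generator, so no output is shown.

```r
set.seed(11235)

# same kind of toy data as in the example above
pList <- data.frame(miles   = runif(100),
                    flights = sample(0:12, 100, replace = TRUE),
                    since   = sample(1:365, 100, replace = TRUE))

# the original weighting, written in vectorized form (whole columns at once)
fScore <- function(d) d$miles + d$flights / 3 + 10 / d$since

# hypothetical alternative: miles dominate (the factor 10 is arbitrary)
fScoreMiles <- function(d) 10 * d$miles + d$flights / 3 + 10 / d$since

# row indices of the top three passengers under each scheme
topOriginal <- order(fScore(pList), decreasing = TRUE)[1:3]
topMiles    <- order(fScoreMiles(pList), decreasing = TRUE)[1:3]

topOriginal
topMiles

# passengers selected by one scheme but not by the other
setdiff(topOriginal, topMiles)
```

If setdiff() returns anything, the two schemes disagree about who gets the peanuts, even though the data are identical. The vectorized form also avoids apply(), which converts the data frame to a matrix first; that only matters once the data frame contains non-numeric columns.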

That's as much as I'll write about this. This looks like a homework problem anyway and none of this is really an R problem.

B.
On Apr 23, 2015, at 12:57 AM, Lalitha Kristipati <Lalitha.Kristipati at TechMahindra.com> wrote: