How to get the proportions of data with respect to two variables in R?
2 messages · umair durrani, jim holtman
Here is an example using data.table to get the proportion for the Length/Width:
input <- read.table(text = "'ID' 'Class' 'Length' 'Width'
2 2 13.5 4.5
2 2 13.5 4.5
2 2 13.5 4.5
2 2 13.5 4.5
3 2 13.5 4.0
3 2 13.5 4.0
3 2 13.5 4.0
3 2 13.5 4.0
4 2 10.0 4.5
4 2 10.0 4.5
4 2 10.0 4.5
4 2 10.0 4.5
5 3 23.0 4.5
5 3 23.0 4.5
5 3 23.0 4.5
5 3 23.0 4.5
6 3 76.5 4.5
6 3 76.5 4.5
6 3 76.5 4.5
6 3 76.5 4.5
6 3 76.5 4.5
7 1 10.0 3.0
7 1 10.0 3.0
7 1 10.0 3.0
7 1 10.0 3.0
8 2 13.5 5.5
8 2 13.5 5.5
8 2 13.5 5.5
8 2 13.5 5.5", header = TRUE)
# remove duplicates
input <- subset(input, !duplicated(input))
require(data.table)
input <- data.table(input)
# create counts by Length/Width
counts <- input[
    , list(count = .N)
    , keyby = 'Class,Length,Width'
    ]
# add proportion
counts$prop <- ave(counts$count
    , counts$Class
    , FUN = function(x) round(x / sum(x) * 100, 1)
    )
counts
   Class Length Width count prop
1:     1   10.0   3.0     1  100
2:     2   10.0   4.5     1   25
3:     2   13.5   4.0     1   25
4:     2   13.5   4.5     1   25
5:     2   13.5   5.5     1   25
6:     3   23.0   4.5     1   50
7:     3   76.5   4.5     1   50
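
For readers without data.table, the same idea can be sketched in base R. This is only an illustration, not part of the original reply: the data frame `veh` is a hypothetical, shortened stand-in for the sample input, and the proportion is returned as a fraction rather than a percentage.

```r
# Hypothetical miniature of the sample data (one row per observation)
veh <- data.frame(
  ID     = c(2, 2, 3, 3, 4, 5, 6, 7, 8),
  Class  = c(2, 2, 2, 2, 2, 3, 3, 1, 2),
  Length = c(13.5, 13.5, 13.5, 13.5, 10.0, 23.0, 76.5, 10.0, 13.5),
  Width  = c(4.5, 4.5, 4.0, 4.0, 4.5, 4.5, 4.5, 3.0, 5.5)
)

# Keep one row per vehicle so repeated observations do not inflate the counts
vehicles <- unique(veh)

# Count vehicles per Class/Length/Width combination
counts <- aggregate(ID ~ Class + Length + Width, data = vehicles, FUN = length)
names(counts)[names(counts) == "ID"] <- "count"

# Proportion of each model within its class (fraction, not percentage)
counts$prop <- ave(counts$count, counts$Class, FUN = function(x) x / sum(x))

# Sort for readability
counts <- counts[order(counts$Class, counts$Length, counts$Width), ]
counts
```

Because duplicates are dropped on the full row (including ID), two different vehicles of the same model would still be counted twice, which is what a per-class proportion of models needs.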
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Sun, Dec 1, 2013 at 3:05 AM, umair durrani <umairdurrani at outlook.com> wrote:
Thanks for your answers Arun. Unfortunately the code didn't work and I am getting the error: arguments must have same length. Here are sample input and output:

INPUT:
Vehicle ID  Vehicle Class  Vehicle Length  Vehicle Width
2           2              13.5            4.5
2           2              13.5            4.5
2           2              13.5            4.5
2           2              13.5            4.5
3           2              13.5            4.0
3           2              13.5            4.0
3           2              13.5            4.0
3           2              13.5            4.0
4           2              10.0            4.5
4           2              10.0            4.5
4           2              10.0            4.5
4           2              10.0            4.5
5           3              23.0            4.5
5           3              23.0            4.5
5           3              23.0            4.5
5           3              23.0            4.5
6           3              76.5            4.5
6           3              76.5            4.5
6           3              76.5            4.5
6           3              76.5            4.5
6           3              76.5            4.5
7           1              10.0            3.0
7           1              10.0            3.0
7           1              10.0            3.0
7           1              10.0            3.0
8           2              13.5            5.5
8           2              13.5            5.5
8           2              13.5            5.5
8           2              13.5            5.5

Note that in this input: Total number of cars=4, trucks=2, motorcycles=1

Sample Output

Group: cars
VehicleLength  VehicleWidth  Proportion
13.5           4.5           0.25
13.5           4.0           0.25
13.5           5.5           0.25
10.0           4.5           0.25

Group: trucks
VehicleLength  VehicleWidth  Proportion
23.0           4.5           0.5
76.5           4.5           0.5

Group: motorcycles
VehicleLength  VehicleWidth  Proportion
10.0           3.0           1.0

Umair Durrani
email: umairdurrani at outlook.com
Date: Sat, 30 Nov 2013 23:41:28 -0800
From: smartpink111 at yahoo.com
Subject: Re: [R] How to get the proportions of data with respect to two variables in R?
To: r-help at r-project.org
CC: umairdurrani at outlook.com
Hi,
It is better to provide a reproducible example.
Maybe this helps:
set.seed(252)
dat1 <- data.frame(`Vehicle ID` = sample(150, 150, replace = FALSE),
                   `Vehicle Class` = rep(1:4, c(20, 40, 30, 60)),
                   `Vehicle length` = sample(15:25, 150, replace = TRUE),
                   `Vehicle width` = sample(4:10, 150, replace = TRUE),
                   check.names = FALSE)
cars <- subset(dat1,`Vehicle Class`==2)
by(cars,INDICES=cars$`Vehicle length`,FUN=table(cars$`Vehicle width`))
#Error in FUN(X[[1L]], ...) : could not find function "FUN"
by(cars$`Vehicle width`,INDICES=cars$`Vehicle length`, table)
by(dat1$`Vehicle width`,list(dat1$`Vehicle Class`,dat1$`Vehicle length`), table)
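
The "could not find function" error above comes from passing `FUN=table(...)`, which hands by() the result of a call instead of a function. As a rough sketch (the data frame `cars_toy` is a hypothetical stand-in for the `cars` subset, not data from the thread), passing either the function itself or an anonymous function works:

```r
# Hypothetical toy data standing in for the 'cars' subset
cars_toy <- data.frame(
  Length = c(13.5, 13.5, 13.5, 10.0),
  Width  = c(4.5, 4.0, 4.5, 4.5)
)

# FUN must be a function; by() calls it once per group of INDICES
width_counts <- by(cars_toy$Width, INDICES = cars_toy$Length, FUN = table)

# An anonymous function also works, here returning within-group proportions
width_props <- by(cars_toy$Width, INDICES = cars_toy$Length,
                  FUN = function(w) prop.table(table(w)))
width_props
```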
#Also, you may check
ftable(dat1[2:4])
prop.table(ftable(dat1[2:4]),1)
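
In the prop.table() call above, the second argument 1 is the margin: each row is divided by its own total, so the proportions sum to 1 within a row. A small sketch on made-up data (the data frame `toy` is hypothetical and uses a plain two-way table rather than ftable) shows the effect:

```r
# Hypothetical toy data: vehicle class and width for six vehicles
toy <- data.frame(
  Class = c(1, 2, 2, 2, 3, 3),
  Width = c(3.0, 4.5, 4.5, 4.0, 4.5, 4.5)
)

# Cross-tabulate class (rows) against width (columns)
tab <- table(toy$Class, toy$Width)

# margin = 1 divides each row by its row total: within-class proportions
props <- prop.table(tab, margin = 1)
props
```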
A.K.
On Sunday, December 1, 2013 12:08 AM, umair durrani <umairdurrani at outlook.com> wrote:
I have 4 columns: Vehicle ID, Vehicle Class, Vehicle Length and Vehicle Width. Every vehicle has a unique vehicle ID (e.g. 2, 4, 5,...) and the data was collected every 0.1 seconds which means that vehicle IDs are repeated in Vehicle ID column for the number of times they were observed. There are three vehicle classes i.e. 1=motorcycles, 2=cars, 3=trucks in the Vehicle Class column and the lengths and widths are in their respective columns against every vehicle ID. I want to subset the data by vehicle class and then find the proportions of each vehicle model (unique length and width) within every class. For example, for the Vehicle Class = 2 i.e. car, I want to find different models of cars (unique length and width) and their proportions with respect to total number of cars. Here is what I have done so far:

To subset data by Vehicle Class

cars <- subset(b, b$'Vehicle class'==2)
trucks <- subset(b, b$'Vehicle class'==3)
motorcycles <- subset(b, b$'Vehicle class'==1)

To find the number of cars

numofcars <- length(unique(cars$'Vehicle ID')) # 2830
numoftrucks <- length(unique(trucks$'Vehicle ID')) # 137
numofmotorcycles <- length(unique(motorcycles$'Vehicle ID')) # 45

The above code worked but I could not find the proportions by using the code below:

by (cars, INDICES=cars$'Vehicle Length', FUN=table(class$'Vehicle width'))

R gives an error stating that it could not find 'FUN'. Please help me in finding the proportions of each model within all classes of vehicles.
Umair Durrani
email: umairdurrani at outlook.com
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.