Dear Shadee,
If you have a data.frame with the following columns:
n = 100; # population size
x = data.frame(
  Sex = sample(c("M","F"), n, T),
  Country = sample(c("AA", "BB", "US"), n, T),
  Income = as.factor(sample(1:3, n, T))
)
# Dummy variable
ONE = rep(1, nrow(x))
r = aggregate(ONE ~ Sex + Income + Country, length, data = x)
r = r[, c("Country", "Income", "Sex")]
print(r)
It is possible to write simpler code if you need only the particular combination of variables (which you specified in your mail). But this is the more general approach.
Note: you may want to use "sum" instead of "length", e.g. if you have a column specifying the number of individuals in that category.
Hope this helps,
Leonard
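The message above mentions that simpler code suffices when only one particular combination is needed. The original question is not reproduced in this thread, so the sketch below assumes a hypothetical combination (Sex "F", Country "US", Income level 3) and an arbitrary seed; it simply counts the matching rows in two equivalent ways.

```r
set.seed(42)  # arbitrary seed, only to make the simulated data reproducible
n <- 100
x <- data.frame(
  Sex = sample(c("M", "F"), n, TRUE),
  Country = sample(c("AA", "BB", "US"), n, TRUE),
  Income = as.factor(sample(1:3, n, TRUE))
)

# Hypothetical combination of interest: F / US / income level 3.
# A logical vector is TRUE for matching rows; sum() counts the TRUEs.
n_match <- sum(x$Sex == "F" & x$Country == "US" & x$Income == "3")

# The same count obtained by subsetting the rows first
n_match2 <- nrow(subset(x, Sex == "F" & Country == "US" & Income == "3"))

print(n_match)
```

This only answers one question at a time; the aggregate() approach returns the counts for all combinations in a single call.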
R code for overlapping variables -- count
4 messages · Leonard Mada, Rui Barradas
At 18:34 on 02/06/2024, Leo Mada via R-help wrote:
Dear Shadee,
If you have a data.frame with the following columns:
n = 100; # population size
x = data.frame(
  Sex = sample(c("M","F"), n, T),
  Country = sample(c("AA", "BB", "US"), n, T),
  Income = as.factor(sample(1:3, n, T))
)
# Dummy variable
ONE = rep(1, nrow(x))
r = aggregate(ONE ~ Sex + Income + Country, length, data = x)
r = r[, c("Country", "Income", "Sex")]
print(r)
It is possible to write simpler code if you need only the particular combination of variables (which you specified in your mail). But this is the more general approach.
Note: you may want to use "sum" instead of "length", e.g. if you have a column specifying the number of individuals in that category.
Hope this helps,
Leonard
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hello,
The following is simpler.
r2 <- xtabs(~ ., x) |> as.data.frame()
r2[-4L] # or r2[names(r2) != "Freq"]
Hope this helps,
Rui Barradas
Correcting a small glitch - see new code.
From: Leo Mada <leo.mada at syonic.eu>
Sent: Sunday, June 2, 2024 8:34 PM
To: Shadee Ashtari <shadee.ashtari at gmail.com>
Cc: r-help at r-project.org <r-help at r-project.org>
Subject: [R] R code for overlapping variables -- count
Dear Shadee,
If you have a data.frame with the following columns:
n = 100; # population size
x = data.frame(
  Sex = sample(c("M","F"), n, T),
  Country = sample(c("AA", "BB", "US"), n, T),
  Income = as.factor(sample(1:3, n, T))
)
# Dummy variable
ONE = rep(1, nrow(x))
# corrected
r = aggregate(ONE ~ Sex + Income + Country, length, data = x)
r = r[, c("Country", "Income", "Sex", "ONE")]
names(r)[4] = "Count"
print(r)
It is possible to write simpler code if you need only the particular combination of variables (which you specified in your mail). But this is the more general approach.
Note: you may want to use "sum" instead of "length", e.g. if you have a column specifying the number of individuals in that category.
Hope this helps,
Leonard
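The note above about using "sum" instead of "length" can be sketched as follows. The data frame x2 and its column N are hypothetical: they assume data that is already partly aggregated, where each row carries a head count rather than representing one individual.

```r
# Hypothetical pre-aggregated data: N is the number of individuals in each row
x2 <- data.frame(
  Sex = c("M", "F", "M", "F"),
  Country = c("US", "US", "US", "AA"),
  Income = factor(c(1, 1, 1, 2)),
  N = c(10, 5, 7, 3)
)

# sum() adds up the head counts within each Sex/Income/Country combination
r_sum <- aggregate(N ~ Sex + Income + Country, FUN = sum, data = x2)

# length() only counts how many rows fall into each combination
r_len <- aggregate(N ~ Sex + Income + Country, FUN = length, data = x2)

print(r_sum)
print(r_len)
```

Here the two rows for M / 1 / US are combined: sum() reports 17 individuals, while length() reports 2 rows.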
At 18:40 on 02/06/2024, Rui Barradas wrote:
Hello,
The following is simpler.
r2 <- xtabs(~ ., x) |> as.data.frame()
r2[-4L] # or r2[names(r2) != "Freq"]
Hope this helps,
Rui Barradas
Hello,
This is the same solution, but the code to keep only the columns in the
original data set is better. And it's an MRE.
n <- 100; # population size
x <- data.frame(
Sex = sample(c("M","F"), n, T),
Country = sample(c("AA", "BB", "US"), n, T),
Income = as.factor(sample(1:3, n, T))
)
r2 <- xtabs(~ ., x) |> as.data.frame()
# no need for constants, find the columns
# to keep from the data
r2[names(r2) %in% names(x)]
Hope this helps,
Rui Barradas
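One practical difference between the aggregate() and xtabs() approaches in this thread may be worth keeping in mind: aggregate() with a formula returns only the combinations that occur in the data, whereas xtabs() tabulates every combination of levels, including empty ones with a frequency of 0. The small data set below is hypothetical and chosen so that one combination (F / BB) never occurs; the Freq column is kept here so the zero is visible.

```r
# Hypothetical data in which the combination F / BB does not occur
x <- data.frame(
  Sex = c("M", "M", "F", "M"),
  Country = c("AA", "BB", "AA", "AA")
)

# aggregate(): one row per combination that is present in the data
x$ONE <- 1
r <- aggregate(ONE ~ Sex + Country, FUN = length, data = x)

# xtabs(): one row per combination of levels, with Freq = 0 for empty cells
r2 <- as.data.frame(xtabs(~ Sex + Country, data = x))

print(r)   # 3 rows
print(r2)  # 4 rows, including F / BB with Freq 0
```

If the empty combinations are not wanted, something like r2[r2$Freq > 0, ] drops them.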