Hi,

I need to select 15 elements, always considering the highest values
(descending order) but obeying the following configuration:

3A - 4B - 0C - 3D or
2A - 5B - 0C - 3D or
3A - 3B - 2C - 2D

If I have, for example, 5 A elements as the highest values, I can only choose
3 (first and third choice) or 2 (second choice) elements.

How to make this selection?

library(dplyr)

Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)

data = data.frame(Var.1, Var.2)
(data = data[order(data$Var.2, decreasing=TRUE), ])

Elements = data %>%
  arrange(desc(Var.2))

Thanks,

Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346
Selecting elements
11 messages · Jeff Newmiller, PIKAL Petr, Silvano +1 more
Hallo

I am confused. Maybe others know what you want, but could you be more specific?

Let's say you have data like this:

set.seed(123)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)

What should be the desired outcome?

You can sort

data <- data[order(data$Var.2, decreasing=TRUE), ]

and split the data

split(data$Var.2, data$Var.1)
$A
[1] 38 35 32 31 30 22 11 8 2 1
$B
[1] 39 28 25 23 16 15 7 6 5 4
$C
[1] 40 36 29 26 21 19 18 14 10 9
$D
[1] 37 34 33 27 24 20 17 13 12 3

to inspect the highest values. But here I am lost. As C is first and fourth biggest value, do you follow the third option and select the 3 highest A, 3 B, 2 C and 2 D?

Or I do not understand at all what you really want to achieve.

Cheers
Petr
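To make the question concrete, here is a minimal sketch of what "select the 3 highest A, 3 B, 2 C and 2 D" would mean for the split values printed above. The values are copied from that output rather than regenerated with sample(), since the result of set.seed(123) can differ between R versions. The names vals, scheme, picked and total are illustrative only, and the scheme shown is just one of the candidate configurations.

```r
# Per-letter values in descending order, copied from the split() output above
vals <- list(
  A = c(38, 35, 32, 31, 30, 22, 11, 8, 2, 1),
  B = c(39, 28, 25, 23, 16, 15, 7, 6, 5, 4),
  C = c(40, 36, 29, 26, 21, 19, 18, 14, 10, 9),
  D = c(37, 34, 33, 27, 24, 20, 17, 13, 12, 3)
)

# One candidate configuration (assumed here: 3A - 3B - 2C - 2D)
scheme <- c(A = 3, B = 3, C = 2, D = 2)

# Take the first n values of each letter, i.e. its n largest values
picked <- Map(head, vals[names(scheme)], scheme)
print(picked)

# Total of the selected values
total <- sum(unlist(picked))
print(total)
```

Because each vector is already sorted in descending order, head() is enough to pick the largest values per letter.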
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Silvano Cesar da Costa
Sent: Thursday, August 19, 2021 10:40 PM
To: r-help at r-project.org
Subject: [R] Selecting elements
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Agreed. Need the rest of a complete example.
On August 19, 2021 11:27:59 PM PDT, PIKAL Petr <petr.pikal at precheza.cz> wrote:
Sent from my phone. Please excuse my brevity.
Hi,

thank you for the answer. Sorry, English is not my native language.

But you got it right.

As C is first and fourth biggest value, you follow third option and
select 3 highest A, 3B 2C and 2D?

I must select the 10 (not 15) highest values, but they must follow a certain order:

3A - 3B - 2C - 2D or
2A - 5B - 0C - 3D or
3A - 3B - 2C - 2D

I'll put the example in Excel for a better understanding (with 20 elements only). I must select 10 elements (the highest values of variable Var.2) which fit one of the 3 options above.

Number  Position  Var.1  Var.2
     1        27      C     40
     2        30      B     39
     3         5      A     38
     4        16      D     37
     5        23      C     36
     6        13      A     35
     7        20      D     34
     8        12      D     33
     9         9      A     32
    10         1      A     31
    11        21      A     30
    12        35      C     29
    13        14      B     28
    14         8      D     27
    15         7      C     26
    16         6      B     25
    17        40      D     24
    18        26      B     23
    19        29      A     22
    20        31      C     21

Selected:

Number  Position  Var.1  Var.2
     1        27      C     40
     2        30      B     39
     3         5      A     38
     4        16      D     37
     5        23      C     36
     6        13      A     35
     7        20      D     34
    10         9      A     32
    13        14      B     28
    17         6      B     25

3A - 3B - 2C - 2D
3A - 3B - 1C - 3D
2A - 5B - 0C - 3D

Second option (other data set):

Number  Position  Var.1  Var.2
     1        36      D     20
     2        11      B     19
     3        39      A     18
     4        24      D     17
     5        34      B     16
     6         2      B     15
     7         3      A     14
     8        32      D     13
     9        28      D     12
    10        25      A     11
    11        19      B     10
    12        15      B      9
    13        17      A      8
    14        18      C      7
    15        38      B      6
    16        10      B      5
    17        22      B      4
    18         4      D      3
    19        33      A      2
    20        37      A      1

Selected:

Number  Position  Var.1  Var.2
     1        36      D     20
     2        11      B     19
     3        39      A     18
     4        24      D     17
     5        34      B     16
     6         2      B     15
     7         3      A     14
     8        32      D     13
     9        25      A     11
    10        18      C      7

3A - 3B - 2C - 2D
3A - 3B - 1C - 3D
2A - 5B - 0C - 3D

How to make the selection of these 10 elements that fit one of the 3 options using R?

Thanks,

Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346
Well, I don't think language is as much of a problem as your failure to compose your messages using plain text format. Your examples are all mushed together since the mailing list removes formatting. See what we see below for example.
On August 20, 2021 12:27:50 PM PDT, Silvano Cesar da Costa <silvano at uel.br> wrote:
2 days later
Hi

Only I got your HTML-formatted mail; the rest of the world got a complete mess. Do not use HTML formatting.

If I got it right, I wonder why in your second example you did not follow

3A - 3B - 2C - 2D

as D was positioned 1st and 4th.

I hope that you could use something like

sss <- split(data$Var.2, data$Var.1)
lapply(sss, cumsum)
$A
[1] 38 73 105 136 166 188 199 207 209 210
$B
[1] 39 67 92 115 131 146 153 159 164 168
$C
[1] 40 76 105 131 152 171 189 203 213 222
$D
[1] 37 71 104 131 155 175 192 205 217 220

Now you need to evaluate this result according to your sets. Here the highest value (76) is in C, so the set with 2C is the one you should choose, and you select your values according to this set.

With

set.seed(666)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)
data <- data[order(data$Var.2, decreasing=TRUE), ]
sss <- split(data$Var.2, data$Var.1)
lapply(sss, cumsum)
$A
[1] 36 70 102 133 163 182 200 207 212 213
$B
[1] 35 57 78 95 108 120 131 140 148 150
$C
[1] 40 73 102 130 156 180 196 211 221 225
$D
[1] 39 77 114 141 166 189 209 223 229 232

the highest value is in D, so either

3A - 3B - 2C - 2D or
3A - 3B - 2C - 2D

should be appropriate. And here I am lost again, as both sets are the same. Maybe you need to reconsider your statements.

Cheers
Petr
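One possible way to "evaluate this result according to your sets" is to read off, for each scheme, the cumulative sum at the required count for every letter and add these up. The sketch below does that for the seed-123 values shown earlier in the thread, using the three configurations from the original post. Whether the largest total is really the intended criterion is an assumption, and score_scheme, schemes, scores and best are hypothetical names.

```r
# Per-letter values in descending order (seed-123 example from this thread)
vals <- list(
  A = c(38, 35, 32, 31, 30, 22, 11, 8, 2, 1),
  B = c(39, 28, 25, 23, 16, 15, 7, 6, 5, 4),
  C = c(40, 36, 29, 26, 21, 19, 18, 14, 10, 9),
  D = c(37, 34, 33, 27, 24, 20, 17, 13, 12, 3)
)
cs <- lapply(vals, cumsum)

# The three configurations from the original post, one row per scheme
schemes <- rbind(
  "3A-4B-0C-3D" = c(A = 3, B = 4, C = 0, D = 3),
  "2A-5B-0C-3D" = c(A = 2, B = 5, C = 0, D = 3),
  "3A-3B-2C-2D" = c(A = 3, B = 3, C = 2, D = 2)
)

# Hypothetical helper: total of the top-n values per letter under one scheme.
# A count of 0 contributes nothing to the total.
score_scheme <- function(s) {
  sum(mapply(function(v, n) if (n > 0) v[n] else 0, cs[names(s)], s))
}

scores <- apply(schemes, 1, score_scheme)
print(scores)

# Scheme with the largest total (assumed selection criterion)
best <- names(which.max(scores))
print(best)
```

For these values the totals are 324, 308 and 344, so the scheme containing 2C comes out on top, in line with the observation about the value 76 in C.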
Hi,

I apologize for the confusion. I will try to be clearer in my explanation. I believe that with the R script it becomes clearer.

I have 4 variables with 10 repetitions and each one receives a value, randomly.
I order the dataset from largest to smallest value. I have to select 10 elements in
descending order of values, according to one of three schemes:

# 3A - 3B - 2C - 2D
# 2A - 5B - 0C - 3D
# 3A - 4B - 2C - 1D

If the first 3 elements (out of the 10 to be selected) are of the letter D, automatically
the adopted scheme will be the second. So, I then have to choose 2A, 5B and 0C.

How to make the selection automatically?

I created two selection examples, with different schemes:

set.seed(123)

Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)

data = data.frame(Var.1, Var.2)
(Order = data[order(data$Var.2, decreasing=TRUE), ])

# I must select the 10 highest values,
# but which follow a certain scheme:
#
# 3A - 3B - 2C - 2D or
# 2A - 5B - 0C - 3D or
# 3A - 4B - 2C - 1D
#
# In this case, I started with the highest value that refers to the letter C.
# Next comes only 1 of the letters B, A and D. All are selected once.
# The fifth observation is the letter C, completing 2 C values. In this case,
# following the 3 adopted schemes, note that the second scheme has 0C,
# so this scheme is out.
# Therefore, it can be the first scheme (3A - 3B - 2C - 2D) or the
# third scheme (3A - 4B - 2C - 1D).
# The next letter to be completed is the D (fourth and seventh elements),
# among the 10 elements being selected. Therefore, the scheme adopted is the
# first one (3A - 3B - 2C - 2D).
# Therefore, it is necessary to select 2 values with the letter B and 1 value
# with the letter A.
#
# Manual Selection -
# The end result is:
(Selected.data = Order[c(1,2,3,4,5,6,7,9,13,16), ])

# Scheme: 3A - 3B - 2C - 2D
sort(Selected.data$Var.1)

#------------------
# Second example: -
#------------------
set.seed(4)

Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)

data = data.frame(Var.1, Var.2)
(Order = data[order(data$Var.2, decreasing=TRUE), ])

# The end result is:
(Selected.data.2 = Order[c(1,2,3,4,5,6,7,8,9,11), ])

# Scheme: 3A - 4B - 2C - 1D
sort(Selected.data.2$Var.1)

How to make the selection of the 10 elements automatically?

Thank you very much.

Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346

On Mon, Aug 23, 2021 at 05:05, PIKAL Petr <petr.pikal at precheza.cz> wrote:
Hi.

Now it is understandable. However, the solution is not clear to me.

table(Order$Var.1[1:10])
A B C D
4 1 2 3

should give you a hint about which scheme could be acceptable, but I do not know how to do it programmatically.

Maybe start with a lower value in the table call and gradually increase it to check which scheme starts to be the chosen one:

table(Order$Var.1[1]) # scheme 2 is out
C
1
...
table(Order$Var.1[1:5]) # scheme 3
A B C D
1 1 2 1
table(Order$Var.1[1:6]) # scheme 3
A B C D
2 1 2 1
table(Order$Var.1[1:7]) # scheme 1
A B C D
2 1 2 2
table(Order$Var.1[1:8]) # no such scheme, so scheme 1 is the chosen one
A B C D
2 1 2 3
#Now you need to select values based on scheme 1.
# 3A - 3B - 2C - 2D
sss <- split(Order, Order$Var.1)
selection <- c(3,3,2,2)
result <- vector("list", 4)
#I would use loop
for(i in 1:4) {
result[[i]] <- sss[[i]][1:selection[i],]
}
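The stepwise table() check above can be automated. The following is only a sketch of that idea: it grows the prefix of the sorted letters until no scheme can accommodate the counts any more, and treats the schemes that still fit at the last valid prefix as the candidates. The data are rebuilt from the seed-123 split output shown in this thread so that the result does not depend on the R version. The names schemes, fits, candidates and chosen are made up for illustration, and taking the first candidate when several remain is an arbitrary assumption.

```r
# Rebuild the sorted data from the seed-123 split output in this thread
vals <- list(
  A = c(38, 35, 32, 31, 30, 22, 11, 8, 2, 1),
  B = c(39, 28, 25, 23, 16, 15, 7, 6, 5, 4),
  C = c(40, 36, 29, 26, 21, 19, 18, 14, 10, 9),
  D = c(37, 34, 33, 27, 24, 20, 17, 13, 12, 3)
)
Order <- data.frame(Var.1 = rep(names(vals), lengths(vals)),
                    Var.2 = unlist(vals, use.names = FALSE))
Order <- Order[order(Order$Var.2, decreasing = TRUE), ]

# The three schemes, one row per scheme
schemes <- rbind(
  scheme1 = c(A = 3, B = 3, C = 2, D = 2),
  scheme2 = c(A = 2, B = 5, C = 0, D = 3),
  scheme3 = c(A = 3, B = 4, C = 2, D = 1)
)
lv <- colnames(schemes)

# Which schemes can still hold the letter counts of the first k rows?
fits <- function(k) {
  counts <- tabulate(match(Order$Var.1[seq_len(k)], lv), nbins = length(lv))
  apply(schemes, 1, function(s) all(counts <= s))
}

# Grow the prefix while at least one scheme still fits
stopifnot(any(fits(1)))
k <- 1
while (k < nrow(Order) && any(fits(k + 1))) k <- k + 1

candidates <- rownames(schemes)[fits(k)]
chosen <- candidates[1]  # assumption: take the first remaining candidate

# Select the top rows per letter according to the chosen scheme
sss <- split(Order, Order$Var.1)
result <- do.call(rbind, Map(head, sss[lv], schemes[chosen, ]))
result <- result[order(result$Var.2, decreasing = TRUE), ]
print(chosen)
print(result)
```

On this data the prefix stops growing at k = 7 and only scheme1 remains, which agrees with the table() outputs above.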
Maybe someone will come up with another ingenious solution.
Cheers
Petr
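Another way to automate the rule, closer to the manual walk-through Silvano describes, is a single pass over the sorted rows: a row is kept only if at least one scheme that is still possible has room for its letter, and schemes without room are then discarded. This is a minimal sketch under that reading of the rule, not a verified solution; select_by_scheme is a hypothetical helper, and the data are rebuilt from the seed-123 split output in this thread.

```r
# Rebuild the sorted data from the seed-123 split output in this thread
vals <- list(
  A = c(38, 35, 32, 31, 30, 22, 11, 8, 2, 1),
  B = c(39, 28, 25, 23, 16, 15, 7, 6, 5, 4),
  C = c(40, 36, 29, 26, 21, 19, 18, 14, 10, 9),
  D = c(37, 34, 33, 27, 24, 20, 17, 13, 12, 3)
)
Order <- data.frame(Var.1 = rep(names(vals), lengths(vals)),
                    Var.2 = unlist(vals, use.names = FALSE))
Order <- Order[order(Order$Var.2, decreasing = TRUE), ]

schemes <- rbind(
  scheme1 = c(A = 3, B = 3, C = 2, D = 2),
  scheme2 = c(A = 2, B = 5, C = 0, D = 3),
  scheme3 = c(A = 3, B = 4, C = 2, D = 1)
)

# Hypothetical helper: greedy pass over rows sorted by descending value
select_by_scheme <- function(sorted, schemes, n = sum(schemes[1, ])) {
  counts <- setNames(integer(ncol(schemes)), colnames(schemes))
  alive  <- rep(TRUE, nrow(schemes))  # schemes not yet ruled out
  keep   <- integer(0)                # positions of the accepted rows
  for (i in seq_len(nrow(sorted))) {
    if (length(keep) == n) break
    l <- as.character(sorted$Var.1[i])
    # schemes that are still possible and have room for one more of letter l
    room <- alive & schemes[, l] > counts[[l]]
    if (any(room)) {
      counts[[l]] <- counts[[l]] + 1L
      alive <- room
      keep <- c(keep, i)
    }
  }
  list(rows = keep, scheme = rownames(schemes)[alive], data = sorted[keep, ])
}

res <- select_by_scheme(Order, schemes)
print(res$rows)
print(res$scheme)
print(res$data)
```

For this data the pass keeps positions 1, 2, 3, 4, 5, 6, 7, 9, 13 and 16 and ends with scheme1, the same rows as in the manual selection. A greedy pass like this is not guaranteed to give the largest possible total in every case, so it should be checked against the intended rule on other data.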
From: Silvano Cesar da Costa <silvano at uel.br>
Sent: Monday, August 23, 2021 7:54 PM
To: PIKAL Petr <petr.pikal at precheza.cz>
Cc: r-help at r-project.org
Subject: Re: [R] Selecting elements
Hi Silvano,
I was completely stumped by your problem until I looked through Petr's
response and guessed that you wanted the largest sum of 'Var.2'
constrained by the specified numbers in your three schemes. I think
this is what you want, but I haven't checked it exhaustively.
set.seed(123)
Var.1 <- rep(LETTERS[1:4], 10)
Var.2 <- sample(1:40, replace=FALSE)
data <- data.frame(Var.1, Var.2)
(Order <- data[order(data$Var.2, decreasing=TRUE), ])
# one row per scheme, one column per letter
allowed <- matrix(c(3,3,2,2, 2,5,0,3, 3,4,2,1), nrow=3, byrow=TRUE)
colnames(allowed) <- LETTERS[1:4]
# x must already be sorted by decreasing value (column 2)
select_largest <- function(x, allowed, n=10) {
  totals <- rep(0, nrow(allowed))
  indices <- matrix(0, ncol=n, nrow=nrow(allowed))
  for(i in 1:nrow(allowed)) {
    ii <- 1
    for(j in 1:ncol(allowed)) {
      if(allowed[i,j]) {
        # rows of x holding letter j, largest values first
        indx <- which(x[,1] == colnames(allowed)[j])
        totals[i] <- totals[i] + sum(x[indx[1:allowed[i,j]],2])
        indices[i,ii:(ii+allowed[i,j]-1)] <- indx[1:allowed[i,j]]
        ii <- ii + allowed[i,j]
      }
    }
  }
  # the scheme with the largest total wins
  largest <- which.max(totals)
  return(list(scheme=largest, total=totals[largest],
    indices=sort(indices[largest,])))
}
select_largest(Order, allowed)
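As a cross-check on the idea, the three scheme totals can be worked out directly from the per-letter values Petr listed for the set.seed(123) data. The sketch below hard-codes those values instead of calling sample(), so it does not depend on the R version. The names vals and scheme_total are made up for this illustration and are not part of Jim's code.

```r
# Per-letter values in decreasing order, copied from Petr's split() output
# for the set.seed(123) example (hard-coded; only the top five are needed).
vals <- list(
  A = c(38, 35, 32, 31, 30),
  B = c(39, 28, 25, 23, 16),
  C = c(40, 36, 29, 26, 21),
  D = c(37, 34, 33, 27, 24)
)

# One row per scheme, one column per letter.
allowed <- matrix(c(3,3,2,2, 2,5,0,3, 3,4,2,1), nrow = 3, byrow = TRUE)
colnames(allowed) <- LETTERS[1:4]

# Total of one scheme: for each letter, sum its k largest values.
scheme_total <- function(counts, vals) {
  sum(mapply(function(v, k) sum(head(v, k)), vals[names(counts)], counts))
}

totals <- apply(allowed, 1, scheme_total, vals = vals)
print(totals)             # one total per scheme
print(which.max(totals))  # index of the scheme with the largest total
```

For this data the totals are 344, 308 and 333, so the first scheme (3A - 3B - 2C - 2D) comes out largest, which agrees with the manual selection Silvano posted for the same seed.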
Jim
Wow,
that's exactly what I want. If possible, I would also like a list of the selected elements (variable and value). Is it possible to add it to the output file?
Thank you very much.
Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346
Hi Silvano,
Just add the selected elements to the return value:
set.seed(123)
Var.1 <- rep(LETTERS[1:4], 10)
Var.2 <- sample(1:40, replace=FALSE)
data <- data.frame(Var.1, Var.2)
(Order <- data[order(data$Var.2, decreasing=TRUE), ])
# one row per scheme, one column per letter
allowed <- matrix(c(3,3,2,2, 2,5,0,3, 3,4,2,1), nrow=3, byrow=TRUE)
colnames(allowed) <- LETTERS[1:4]
# x must already be sorted by decreasing value (column 2)
select_largest <- function(x, allowed, n=10) {
  totals <- rep(0, nrow(allowed))
  indices <- matrix(0, ncol=n, nrow=nrow(allowed))
  for(i in 1:nrow(allowed)) {
    ii <- 1
    for(j in 1:ncol(allowed)) {
      if(allowed[i,j]) {
        # rows of x holding letter j, largest values first
        indx <- which(x[,1] == colnames(allowed)[j])
        totals[i] <- totals[i] + sum(x[indx[1:allowed[i,j]],2])
        indices[i,ii:(ii+allowed[i,j]-1)] <- indx[1:allowed[i,j]]
        ii <- ii + allowed[i,j]
      }
    }
  }
  # the scheme with the largest total wins
  largest <- which.max(totals)
  # sort the indices here
  indices <- sort(indices[largest,])
  # elements holds the selected rows (variable and value)
  return(list(scheme=largest, total=totals[largest],
    indices=indices, elements=x[indices,]))
}
select_largest(Order, allowed)
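On the "output file" part of the question: once the selected rows are available as a data frame, base R's write.csv() can save them. This is a minimal sketch, assuming the ten rows Silvano selected by hand for the set.seed(123) example stand in for the elements component of the result. The temporary file path and the names counts, out_file and check are chosen only for illustration.

```r
# Stand-in for select_largest(Order, allowed)$elements: the ten rows
# Silvano selected by hand for the set.seed(123) example (hard-coded here).
elements <- data.frame(
  Var.1 = c("C", "B", "A", "D", "C", "A", "D", "A", "B", "B"),
  Var.2 = c(40, 39, 38, 37, 36, 35, 34, 32, 28, 25)
)

# Check the letter counts against the scheme 3A - 3B - 2C - 2D.
counts <- table(factor(elements$Var.1, levels = LETTERS[1:4]))
print(counts)

# Write the selection to a CSV file (temporary path, for illustration only).
out_file <- tempfile(fileext = ".csv")
write.csv(elements, out_file, row.names = FALSE)

# Read it back to confirm what was written.
check <- read.csv(out_file)
print(check)
```

In real use the hard-coded data frame would be replaced by the elements component returned by the function, and tempfile() by a path of your choice.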
Jim
On Thu, Aug 26, 2021 at 12:46 AM Silvano Cesar da Costa <silvano at uel.br> wrote:
Wow, That's exactly what I want. But, if possible, that a list was created with the selected elements (variable and value). Is it possible to add in the output file? Thank you very much. Prof. Dr. Silvano Cesar da Costa Universidade Estadual de Londrina Centro de Ci?ncias Exatas Departamento de Estat?stica Fone: (43) 3371-4346 Em qua., 25 de ago. de 2021 ?s 03:12, Jim Lemon <drjimlemon at gmail.com> escreveu:
Hi Silvano,
I was completely stumped by your problem until I looked through Petr's
response and guessed that you wanted the largest sum of 'Var.1"
constrained by the specified numbers in your three schemes. I think
this is what you want, but I haven't checked it exhaustively.
set.seed(123)
Var.1 <- rep(LETTERS[1:4], 10)
Var.2 <- sample(1:40, replace=FALSE)
data <- data.frame(Var.1, Var.2)
(Order <- data[order(data$Var.2, decreasing=TRUE), ])
allowed<-matrix(c(3,3,2,2,2,5,0,3,3,4,2,1),nrow=3,byrow=TRUE)
colnames(allowed)<-LETTERS[1:4]
select_largest<-function(x,allowed,n=10) {
totals<-rep(0,nrow(allowed))
indices<-matrix(0,ncol=n,nrow=nrow(allowed))
for(i in 1:nrow(allowed)) {
ii<-1
for(j in 1:ncol(allowed)) {
if(allowed[i,j]) {
indx<-which(x[,1] == colnames(allowed)[j])
totals[i]<-totals[i]+sum(x[indx[1:allowed[i,j]],2])
indices[i,ii:(ii+allowed[i,j]-1)]<-indx[1:allowed[i,j]]
ii<-ii+allowed[i,j]
}
}
}
largest<-which.max(totals)
return(list(scheme=largest,total=totals[largest],
indices=sort(indices[largest,])))
}
select_largest(Order,allowed)
Jim
On Tue, Aug 24, 2021 at 7:11 PM PIKAL Petr <petr.pikal at precheza.cz> wrote:
Hi. Now it is understandable. However the solution is not clear for me. table(Order$Var.1[1:10]) A B C D 4 1 2 3 should give you a hint which scheme could be acceptable, but how to do it programmatically I do not know. maybe to start with lower value in the table call and gradually increse it to check which scheme starts to be the chosen one
table(data.o$Var.1[1]) # scheme 2 is out
C 1 ...
table(data.o$Var.1[1:5]) #scheme 3
A B C D 1 1 2 1
table(data.o$Var.1[1:6]) #scheme 3
A B C D 2 1 2 1
table(data.o$Var.1[1:7]) # scheme1
A B C D 2 1 2 2
table(data.o$Var.1[1:8]) # no such scheme, so scheme 1 is chosen one
A B C D
2 1 2 3
#Now you need to select values based on scheme 1.
# 3A - 3B - 2C - 2D
sss <- split(Order, Order$Var.1)
selection <- c(3,3,2,2)
result <- vector("list", 4)
#I would use loop
for(i in 1:4) {
result[[i]] <- sss[[i]][1:selection[i],]
}
Maybe someone come with other ingenious solution.
Cheers
Petr
From: Silvano Cesar da Costa <silvano at uel.br>
Sent: Monday, August 23, 2021 7:54 PM
To: PIKAL Petr <petr.pikal at precheza.cz>
Cc: r-help at r-project.org
Subject: Re: [R] Selecting elements
Hi,
I apologize for the confusion. I will try to be clearer in my explanation. I believe that with the R script it becomes clearer.
I have 4 variables with 10 repetitions and each one receives a value, randomly.
I order the dataset from largest to smallest value. I have to select 10 elements in
descending order of values, according to one of three schemes:
# 3A - 3B - 2C - 2D
# 2A - 5B - 0C - 3D
# 3A - 4B - 2C - 1D
If the first 3 elements (out of the 10 to be selected) are of the letter D, automatically
the adopted scheme will be the second. So, I have to (following) choose 2A, 5B and 0C.
How to make the selection automatically?
I created two selection examples, with different schemes:
set.seed(123)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)
(Order = data[order(data$Var.2, decreasing=TRUE), ])
# I must select the 10 highest values (),
# but which follow a certain scheme:
#
# 3A - 3B - 2C - 2D or
# 2A - 5B - 0C - 3D or
# 3A - 4B - 2C - 1D
#
# In this case, I started with the highest value that refers to the letter C.
# Next comes only 1 of the letters B, A and D. All are selected once.
# The fifth observation is the letter C, completing 2 C values. In this case,
# following the 3 adopted schemes, note that the second scheme has 0C,
# so this scheme is out.
# Therefore, it can be the first scheme (3A - 3B - 2C - 2D) or the
# third scheme (3A - 4B - 2C - 1D).
# The next letter to be completed is the D (fourth and seventh elements),
# among the 10 elements being selected. Therefore, the scheme adopted is the
# first one (3A - 3B - 2C - 2D).
# Therefore, it is necessary to select 2 values with the letter B and 1 value
# with the letter A.
#
# Manual Selection -
# The end result is:
(Selected.data = Order[c(1,2,3,4,5,6,7,9,13,16), ])
# Scheme: 3A - 3B - 2C - 2D
sort(Selected.data$Var.1)
#------------------
# Second example: -
#------------------
set.seed(4)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)
(Order = data[order(data$Var.2, decreasing=TRUE), ])
# The end result is:
(Selected.data.2 = Order[c(1,2,3,4,5,6,7,8,9,11), ])
# Scheme: 3A - 4B - 2C - 1D
sort(Selected.data.2$Var.1)
How to make the selection of the 10 elements automatically?
Thank you very much.
Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ci?ncias Exatas
Departamento de Estat?stica
Fone: (43) 3371-4346
Em seg., 23 de ago. de 2021 ?s 05:05, PIKAL Petr <mailto:petr.pikal at precheza.cz> escreveu:
Hi
Only I got your HTML formated mail, rest of the world got complete mess. Do not use HTML formating.
As I got it right I wonder why in your second example you did not follow
3A - 3B - 2C - 2D
as D were positioned 1st and 4th.
I hope that you could use something like
sss <- split(data$Var.2, data$Var.1)
lapply(sss, cumsum)
$A
[1] 38 73 105 136 166 188 199 207 209 210
$B
[1] 39 67 92 115 131 146 153 159 164 168
$C
[1] 40 76 105 131 152 171 189 203 213 222
$D
[1] 37 71 104 131 155 175 192 205 217 220
Now you need to evaluate this result according to your sets. Here the highest value (76) is in C so the set with 2C is the one you should choose and select you value according to this set.
With
set.seed(666) Var.1 = rep(LETTERS[1:4], 10) Var.2 = sample(1:40, replace=FALSE) data = data.frame(Var.1, Var.2) data <- data[order(data$Var.2, decreasing=TRUE), ] sss <- split(data$Var.2, data$Var.1) lapply(sss, cumsum)
$A [1] 36 70 102 133 163 182 200 207 212 213 $B [1] 35 57 78 95 108 120 131 140 148 150 $C [1] 40 73 102 130 156 180 196 211 221 225 $D [1] 39 77 114 141 166 189 209 223 229 232 Highest value is in D so either 3A - 3B - 2C - 2D or 3A - 3B - 2C - 2D should be appropriate. And here I am again lost as both sets are same. Maybe you need to reconsider your statements. Cheers Petr From: Silvano Cesar da Costa <mailto:silvano at uel.br> Sent: Friday, August 20, 2021 9:28 PM To: PIKAL Petr <mailto:petr.pikal at precheza.cz> Cc: mailto:r-help at r-project.org Subject: Re: [R] Selecting elements Hi, thanks you for the answer. Sorry English is not my native language. But you got it right.
As C is first and fourth biggest value, you follow third option and select 3 highest A, 3B 2C and 2D?
I must select the 10 (not 15) highest values, but they must follow one of these configurations:

3A - 3B - 2C - 2D or
2A - 5B - 0C - 3D or
3A - 3B - 2C - 2D

I'll put the example in Excel for a better understanding (with 20 elements only). I must select 10 elements (the highest values of variable Var.2) that fit one of the 3 options above.

Number  Position  Var.1  Var.2
     1        27      C     40
     2        30      B     39
     3         5      A     38
     4        16      D     37
     5        23      C     36
     6        13      A     35
     7        20      D     34
     8        12      D     33
     9         9      A     32
    10         1      A     31
    11        21      A     30
    12        35      C     29
    13        14      B     28
    14         8      D     27
    15         7      C     26
    16         6      B     25
    17        40      D     24
    18        26      B     23
    19        29      A     22
    20        31      C     21

Selected:

Number  Position  Var.1  Var.2
     1        27      C     40
     2        30      B     39
     3         5      A     38
     4        16      D     37
     5        23      C     36
     6        13      A     35
     7        20      D     34
    10         9      A     32
    13        14      B     28
    17         6      B     25

Configurations listed beside the selection:

3A - 3B - 2C - 2D
3A - 3B - 1C - 3D
2A - 5B - 0C - 3D

Second option (other data set):

Number  Position  Var.1  Var.2
     1        36      D     20
     2        11      B     19
     3        39      A     18
     4        24      D     17
     5        34      B     16
     6         2      B     15
     7         3      A     14
     8        32      D     13
     9        28      D     12
    10        25      A     11
    11        19      B     10
    12        15      B      9
    13        17      A      8
    14        18      C      7
    15        38      B      6
    16        10      B      5
    17        22      B      4
    18         4      D      3
    19        33      A      2
    20        37      A      1

Selected:

Number  Position  Var.1  Var.2
     1        36      D     20
     2        11      B     19
     3        39      A     18
     4        24      D     17
     5        34      B     16
     6         2      B     15
     7         3      A     14
     8        32      D     13
     9        25      A     11
    10        18      C      7

Configurations listed beside the selection:

3A - 3B - 2C - 2D
3A - 3B - 1C - 3D
2A - 5B - 0C - 3D

How can I select these 10 elements so that they fit one of the 3 options using R?

Thanks,

Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346

On Fri, Aug 20, 2021 at 03:28, PIKAL Petr <petr.pikal at precheza.cz> wrote:

Hallo

I am confused; maybe others know what you want, but could you be more specific?

Let's say you have such data:

set.seed(123)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)

What should be the desired outcome?

You can sort

data <- data[order(data$Var.2, decreasing=TRUE), ]

and split the data

split(data$Var.2, data$Var.1)
$A
 [1] 38 35 32 31 30 22 11  8  2  1
$B
 [1] 39 28 25 23 16 15  7  6  5  4
$C
 [1] 40 36 29 26 21 19 18 14 10  9
$D
 [1] 37 34 33 27 24 20 17 13 12  3

to inspect the highest values. But here I am lost. As C is the first and fourth biggest value, do you follow the third option and select the 3 highest A, 3 B, 2 C and 2 D?

Or I do not understand at all what you really want to achieve.

Cheers
Petr
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
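One possible reading of the rule in the message above, offered only as a sketch: walk down the values in descending order and keep a row whenever the running counts per letter still fit inside at least one of the allowed configurations. This interpretation is an assumption and is not confirmed by the poster, although it happens to reproduce both hand-made selections from the two 20-row examples. The helper name select_greedy and the matrix configs are hypothetical; the three configurations are the ones listed beside the tables. The sketch does not check that a configuration can actually be completed with the rows that remain, so it may return fewer than n rows.

```r
# Allowed configurations, one per row (taken from the list beside the tables).
configs <- rbind(
  c(A = 3, B = 3, C = 2, D = 2),
  c(A = 3, B = 3, C = 1, D = 3),
  c(A = 2, B = 5, C = 0, D = 3)
)

# Hypothetical helper (not from the thread): greedy selection under quotas.
# A row is kept only if, after counting it, the per-letter counts are still
# less than or equal to every entry of at least one configuration.
select_greedy <- function(data, configs, n = 10) {
  data <- data[order(data$Var.2, decreasing = TRUE), ]
  counts <- setNames(integer(ncol(configs)), colnames(configs))
  keep <- logical(nrow(data))
  for (i in seq_len(nrow(data))) {
    if (sum(keep) == n) break
    g <- as.character(data$Var.1[i])
    if (!g %in% names(counts)) next   # letter not covered by any configuration
    trial <- counts
    trial[g] <- trial[g] + 1L
    # one TRUE/FALSE per configuration: can it still hold the trial counts?
    fits <- apply(configs, 1, function(cfg) all(trial <= cfg))
    if (any(fits)) {
      counts <- trial
      keep[i] <- TRUE
    }
  }
  data[keep, ]
}

# First example from the message (Var.1 and Var.2 columns only).
ex1 <- data.frame(
  Var.1 = c("C", "B", "A", "D", "C", "A", "D", "D", "A", "A",
            "A", "C", "B", "D", "C", "B", "D", "B", "A", "C"),
  Var.2 = 40:21
)
sel1 <- select_greedy(ex1, configs)
sel1$Var.2          # 40 39 38 37 36 35 34 32 28 25
table(sel1$Var.1)   # 3 A, 3 B, 2 C, 2 D

# Second example from the message.
ex2 <- data.frame(
  Var.1 = c("D", "B", "A", "D", "B", "B", "A", "D", "D", "A",
            "B", "B", "A", "C", "B", "B", "B", "D", "A", "A"),
  Var.2 = 20:1
)
sel2 <- select_greedy(ex2, configs)
sel2$Var.2          # 20 19 18 17 16 15 14 13 11 7
table(sel2$Var.1)   # 3 A, 3 B, 1 C, 3 D
```

If the intended rule is different, for example choosing the configuration with the largest total of Var.2, only the acceptance test inside the loop would need to change.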