Greetings, I am working on a project where we are applying the
Kruskal-Wallace test to some factor data to evaluate their correlation with
existing grade data. I know that the grade data is nonnormal therefore we
cannot rely on ANOVA or a similar parametric test. What I would like to
find is a mechanism for making power calculations for the KW test given the
nonparametric assumptions. My perusal of the literature has suggested that
a simulation would be the best method.
Can anyone point me to good examples of such simulations for KW in R? And
does anyone have a favourite package for generating simulated data or
conducting such tests?
Thank you,
Collin.
Kruskal-Wallace power calculations.
6 messages · Jeff Newmiller, Jim Lemon, Collin Lynch +1 more
Please stop... you are acting like a broken record, and are also posting in HTML format. Please read the Posting Guide and demonstrate that you have used a search engine on this topic before posting again.
---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On April 2, 2015 7:25:20 AM PDT, Collin Lynch <cflynch at ncsu.edu> wrote:
Greetings, I am working on a project where we are applying the Kruskal-Wallace test to some factor data to evaluate their correlation with existing grade data. I know that the grade data is nonnormal therefore we cannot rely on ANOVA or a similar parametric test. What I would like to find is a mechanism for making power calculations for the KW test given the nonparametric assumptions. My perusal of the literature has suggested that a simulation would be the best method. Can anyone point me to good examples of such simulations for KW in R? And does anyone have a favourite package for generating simulated data or conducting such tests? Thank you, Collin. [[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Collin,
Have a look at this:
http://stats.stackexchange.com/questions/70643/power-analysis-for-kruskal-wallis-or-mann-whitney-u-test-using-r
Although, thinking about it, this might have constituted your "perusal of the literature". Plus it always looks better when you spell the names properly.
Jim
On Fri, Apr 3, 2015 at 2:23 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
[...]
Thank you Jim, I did see those (though not my typo :) and am still pondering the warning about post-hoc analyses.
The situation that I am in is that I have a set of individuals who have been assigned a course grade. We have then clustered these individuals into about 50 communities using standard community detection algorithms, with the goal of determining whether community membership affects one of their grades. We are using the KW test because the grade data is strongly non-normal and my coauthors preferred KW as an alternative.
The two issues that I am struggling with are:
1) whether the post-hoc power analysis would be useful; and
2) how to code the simulation studies that are described in:
http://onlinelibrary.wiley.com/doi/10.1002/bimj.4710380510/abstract
Problem #1 is of course beyond the scope of this e-mail list, though I would welcome anyone's suggestions on that point. I am not sure that I buy the arguments against it offered here:
http://graphpad.com/support/faq/why-it-is-not-helpful-to-compute-the-power-of-an-experiment-to-detect-the-difference-actually-observed-why-is-post-hoc-power-analysis-futile/
It seems that the rationale boils down to "you didn't find it so you couldn't find it", but that does not tell me how far off I was from the goal. I am still perusing the articles the author cites, however.
With respect to question #2, I am trying to lay my hands on the article and did find this old r-help discussion:
http://r.789695.n4.nabble.com/Power-of-Kruskal-Wallis-Test-td4671188.html
However, I am not sure how to adapt the simulation studies that it links to for my current problem. The links it leads to focus on mixed-effects models.
This may be more of a pure stats question and not suited for this list, but I thought I'd ask in the hopes that someone has more specific KW code or knows of a good tutorial for the right kinds of simulation studies.
Thank you,
Collin.
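For concreteness, the analysis described above might look roughly like the following in R. This is only an illustrative sketch on simulated data: the variable names `grades` and `community`, the 0-100 grade scale, the beta-distributed scores, and the total of 1000 students are all hypothetical and are not taken from the actual study.

```r
set.seed(42)

n_students    <- 1000   # hypothetical number of students
n_communities <- 50     # "about 50 communities", as described above

# Hypothetical community labels; random assignment gives unequal group sizes
community <- factor(sample(seq_len(n_communities), n_students, replace = TRUE))

# Hypothetical skewed, bounded grades on a 0-100 scale (strongly non-normal)
grades <- round(100 * rbeta(n_students, shape1 = 5, shape2 = 1.5))

# Kruskal-Wallis test of grade by community membership
kw <- kruskal.test(grades ~ community)
kw$statistic   # Kruskal-Wallis chi-squared statistic (tie-corrected)
kw$parameter   # degrees of freedom: number of non-empty groups minus 1
kw$p.value
```

Because these grades are generated independently of `community`, the null hypothesis holds by construction here; the example only shows the shape of the data and the call.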
On Thu, Apr 2, 2015 at 6:35 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
[...]
Here is some sample code:
## Simulation function to create data, analyze it using
## kruskal.test, and return the p-value
## change rexp to change the simulation distribution
simfun <- function(means, k=length(means), n=rep(50,k)) {
  mydata <- lapply( seq_len(k), function(i) {
    rexp(n[i], 1) - 1 + means[i]
  })
  kruskal.test(mydata)$p.value
}
# simulate under the null to check proper sizing
B <- 10000
out1 <- replicate(B, simfun(rep(3,4)))
hist(out1)
mean( out1 <= 0.05 )
binom.test( sum(out1 <= 0.05), B, p=0.05)
### Now simulate for power
B <- 10000
out2 <- replicate(B, simfun( c(3,3,3.2,3.3)))
hist(out2)
mean( out2 <= 0.05 )
binom.test( sum(out2 <= 0.05), B, p=0.05 )
This simulates from a continuous exponential (skewed) and shifts to
get the means (shifted location is a common assumption, though not
required for the actual test).
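The same approach can be adapted to a setting closer to the one described earlier in the thread (many groups, discrete grades, unequal group sizes). What follows is only a sketch under stated assumptions: the function name `kw_power`, the five-point grade scale, the baseline grade probabilities, the group sizes, and the way the effect is introduced (moving probability mass toward higher grades in 5 of the 50 groups) are all hypothetical choices, not anything specified in the thread. Discrete grades produce many ties; `kruskal.test` applies a tie correction, but it still seems worth checking the size under the null first, as in the code above.

```r
## Estimate Kruskal-Wallis power by simulation for discrete (ordinal) outcomes.
## probs:  list with one probability vector per group over the grade levels
## n:      vector of group sizes (same length as probs)
## B:      number of simulated data sets
## levels: numeric codes for the grade levels (hypothetical: 1 = lowest)
kw_power <- function(probs, n, B = 2000, alpha = 0.05,
                     levels = seq_along(probs[[1]])) {
  stopifnot(length(probs) == length(n))
  group <- factor(rep(seq_along(n), times = n))
  pvals <- replicate(B, {
    y <- unlist(lapply(seq_along(n), function(i) {
      sample(levels, n[i], replace = TRUE, prob = probs[[i]])
    }))
    kruskal.test(y, group)$p.value
  })
  # A data set in which every value is tied gives an NaN p-value;
  # count it as a non-rejection
  mean(!is.na(pvals) & pvals <= alpha)
}

set.seed(1)
k    <- 50                                  # hypothetical number of groups
n    <- sample(10:30, k, replace = TRUE)    # hypothetical unequal group sizes
base <- c(0.05, 0.10, 0.20, 0.30, 0.35)     # hypothetical skewed grade distribution
alt  <- c(0.02, 0.05, 0.13, 0.30, 0.50)     # hypothetical shift toward higher grades

null_probs <- rep(list(base), k)
alt_probs  <- c(rep(list(base), k - 5), rep(list(alt), 5))  # 5 groups differ

size  <- kw_power(null_probs, n, B = 1000)  # rejection rate under the null
power <- kw_power(alt_probs,  n, B = 1000)  # rejection rate under the alternative
c(size = size, power = power)
```

Replacing `base`, `alt`, and `n` with values that reflect the observed grade distribution and group sizes, and the smallest difference that would matter in practice, is what makes the resulting power estimate meaningful.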
On Thu, Apr 2, 2015 at 8:19 PM, Collin Lynch <cflynch at ncsu.edu> wrote:
[...]
Gregory (Greg) L. Snow Ph.D. 538280 at gmail.com
Thank you very much Greg, I will give that a try.
Best,
Collin.
On Fri, Apr 3, 2015 at 1:43 PM, Greg Snow <538280 at gmail.com> wrote:
[...]