Dear all, Edzer Pebesma and I are combining forces into a new GitHub organisation called "r-quantities", to which we have moved the CRAN packages 'units', 'errors' and 'constants'. The idea is to write a new package called 'quantities' to integrate 'units' and 'errors' into a comprehensive solution for dealing with quantity values + uncertainty calculus. Given that a significant fraction of R users, both practitioners and researchers, use R to analyse measurements, we believe that the R community would benefit from such a project. Moreover, to the best of our knowledge, no such integrated and automated framework exists outside the R language. We would like to share a proposal [1] to be submitted to the R Consortium before October 15. Until then, we kindly invite R package developers to review it. Any feedback or contribution would be very helpful. [1] https://github.com/r-quantities/proposal Regards, Iñaki
[R-pkg-devel] r-quantities seeking feedback
9 messages · David Hugh-Jones, Bill Denney, Duncan Murdoch +1 more
One question that comes to mind: what's the synergy? I.e., why are units and errors best handled together? I use standard errors a lot, but never units... I would like a standard way to represent uncertainty but don't think I need the other stuff. Cheers, D
On 6 Oct 2017 at 19:13, "David Hugh-Jones" <davidhughjones at gmail.com> wrote: One question that comes to mind: what's the synergy? I.e., why are units and errors best handled together? I use standard errors a lot, but never units... I would like a standard way to represent uncertainty but don't think I need the other stuff. You will always be able to use errors (or units) alone if you wish, but every measurement has a unit and some uncertainty, so we think it's interesting to have the possibility of handling them in a unified way (I/O, propagation, automatic axes and error bars in plots...). Iñaki
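As a minimal sketch of what handling both together could look like, here is what the two packages already do side by side on CRAN (the combined behaviour in a single object is what the proposed 'quantities' package would provide, so treat this as illustrative, not as the final API):

```r
# Today 'errors' and 'units' work separately; 'quantities' would unify them.
library(errors)  # provides set_errors(): attach standard uncertainties
library(units)   # provides set_units(): attach measurement units

x <- set_errors(c(1.0, 2.0, 3.0), 0.1)  # values with standard uncertainty
x * 2                                    # uncertainties propagate automatically

d <- set_units(100, m)                   # a distance in metres
t <- set_units(9.8, s)                   # a time in seconds
d / t                                    # units propagate automatically: m/s
```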
Many measurements have no unit, but some uncertainty - e.g. the b and se from an arbitrary regression. Can you give specific examples of the advantages from binding these packages tightly together?
Hi Iñaki and David,

I fully see the need for a standardised unit package, and I understand the need for propagation of errors (though I'm in the opposite camp to David: I usually need unit tracking and conversion and rarely need error propagation, because my error propagation is often nonlinear and sometimes not normally distributed, so I have to do it myself).

I agree with David that error propagation and unit tracking/conversion are different problems with partially overlapping audiences. But I agree with Iñaki that there is a need for a consistent framework that can handle both. The reason is that if we have two separate packages, they will generally be unaware of each other and may not play nicely together (ref the recent discussion on tibbles not always playing nicely with code expecting data.frames). I think that three packages should generally be the goal:
1) one that handles units
2) one that handles error propagation
3) one that uses the other two to handle both units and error propagation

The component that I didn't see in your discussion of the proposal is extension of both libraries. For units, it should be possible to connect any set of units to any other set of units with a new conversion (e.g. mass and molar units could be connected with a molecular weight). And it should be possible to have multiple unit systems that can manage separate sets of rules (often an extension of a basic set of rules), and these should be possible to connect together. The example for me again is molecular weights: I may have molecule 1 with a molecular weight of 100 g/mole and molecule 2 with a molecular weight of 200 g/mole; I would need to be able to store both at the same time without the system confusing the two. And I would also need to store the rule that 2 counts of molecule 1 make 1 count of molecule 2.
(FYI, parts of this are in https://github.com/pacificclimate/Rudunits2/pull/9 )

For both units and error propagation, these will need to work with general functions in packages that do not explicitly support the new packages. As an example, the lm, glm, gls, etc. functions (along with thousands of others) are unlikely to be modified to support the packages. There should be some mechanism to make a simple wrapper function that looks at the input and understands how to map the output. Such as:

lm_quantities <- function(...) {
  # Look at the LHS of the formula argument, and apply any maths
  # required to determine the units of the LHS.
  # Call lm() normally.
  # Assign units and/or error propagation to the result of the lm() call.
}

That would have to be repeated for any other function of interest. Straightforward examples that are part of the recommended libraries would hopefully be covered, and other library authors should have a simple way of assessing what the right units and error measures are so that they can (optionally) add support to their own libraries. Thanks, Bill
On 06/10/2017 4:28 PM, David Hugh-Jones wrote:
Many measurements have no unit, but some uncertainty - e.g. the b and se from an arbitrary regression. Can you give specific examples of the advantages from binding these packages tightly together?
Just to nitpick: in the regression y = a + b x, b needs to have the units of y divided by the units of x. Its se has the same units. If you change the units for either x or y, you'll change the appropriate value of b and se. Duncan Murdoch
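Duncan's point can be seen directly with plain 'units' arithmetic. Note that lm() itself does not track units; this sketch only shows the dimensional analysis of a slope:

```r
library(units)
y <- set_units(c(1.2, 2.4, 3.1), kg)  # response measured in kilograms
x <- set_units(c(1, 2, 3), m)         # predictor measured in metres
# A slope estimate has the dimensions of y divided by x, and so does
# its standard error:
y[1] / x[1]                           # carries units kg/m
# Re-expressing x in centimetres rescales any slope (and its se) by 100:
set_units(x, cm)
```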
2017-10-06 22:28 GMT+02:00 David Hugh-Jones <davidhughjones at gmail.com>:
Many measurements have no unit, but some uncertainty - e.g. the b and se from an arbitrary regression. Can you give specific examples of the advantages from binding these packages tightly together?
As Duncan already pointed out, the units of b and se from an arbitrary regression depend on the units of your variables. The advantage of integrating both packages is that you get the benefits of each with the same workflow as if you were working with bare numbers. It seems that you are already aware of the advantages of automatic error propagation. Regarding the units package, it is very useful for painless conversion of units. A conversion from kg to g is elementary, but some others require more care, for example J to eV, or N.m-1 to dyn.cm-1. In electromagnetism, it is very common to work with the CGS units system, and an automatic conversion from/to the SI comes in handy. If you are not persuaded already, we can also talk about the Mars Climate Orbiter, a robotic space probe launched by NASA in 1998, which disintegrated in Mars' upper atmosphere due to a computation with wrong units. Iñaki
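For illustration, the conversions mentioned above expressed with 'units' (this assumes a standard udunits2 database, which defines eV and dyn):

```r
library(units)
# Elementary conversion: kilograms to grams
set_units(set_units(1, kg), g)        # 1000 [g]
# Less obvious: joules to electronvolts
set_units(set_units(1, J), eV)        # ~6.24e18 [eV]
# SI to CGS: surface tension in N/m to dyn/cm
set_units(set_units(1, N/m), dyn/cm)  # 1000 [dyn/cm]
```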
2017-10-06 22:38 GMT+02:00 Bill Denney <bill at denney.ws>:
Hi Iñaki and David, I fully see the need for a standardised unit package, and I understand the need for propagation of errors (though I'm in the opposite camp to David: I usually need unit tracking and conversion and rarely need error propagation, because my error propagation is often nonlinear and sometimes not normally distributed, so I have to do it myself).
I plan to extend 'errors' to support also arbitrary distributions and MC propagation methods. There are already excellent packages doing this, but unlike with 'errors', you need a separate workflow to propagate the uncertainty. I believe they could be integrated as backends for 'errors'.
I agree with David that error propagation and unit tracking/conversion are different problems with partially overlapping audiences. But I agree with Iñaki that there is a need for a consistent framework that can handle both. The reason is that if we have two separate packages, they will generally be unaware of each other and may not play nicely together (ref the recent discussion on tibbles not always playing nicely with code expecting data.frames). I think that three packages should generally be the goal: 1) one that handles units; 2) one that handles error propagation; 3) one that uses the other two to handle both units and error propagation.
Yep, that's exactly our intent.
The component that I didn't see in your discussion of the proposal is extension of both libraries. For units, it should be possible to connect any set of units to any other set of units with a new conversion (e.g. mass and molar units could be connected with a molecular weight). And it should be possible to have multiple unit systems that can manage separate sets of rules (often an extension of a basic set of rules), and these should be possible to connect together. The example for me again is molecular weights: I may have molecule 1 with a molecular weight of 100 g/mole and molecule 2 with a molecular weight of 200 g/mole; I would need to be able to store both at the same time without the system confusing the two. And I would also need to store the rule that 2 counts of molecule 1 make 1 count of molecule 2. (FYI, parts of this are in https://github.com/pacificclimate/Rudunits2/pull/9 )
I'm not sure how much discussion the proposal should dedicate to extending both libraries, because many issues and needs have yet to be identified. We are in conversations with David Flater, author of reference [3] in the proposal, and he raised very interesting points too regarding units. For example, operations with counting units: if you have 2 pixels * 2 pixels, you want 4 pixels as output, and not 4 pixels^2.
For both units and error propagation, these will need to work with general functions in packages that do not explicitly support the new packages. As an example, the lm, glm, gls, etc. functions (along with thousands of others) are unlikely to be modified to support the packages. There should be some mechanism to make a simple wrapper function that looks at the input and understands how to map the output. Such as:
lm_quantities <- function(...) {
  # Look at the LHS of the formula argument, and apply any maths
  # required to determine the units of the LHS.
  # Call lm() normally.
  # Assign units and/or error propagation to the result of the lm() call.
}
That would have to be repeated for any other function of interest. Straightforward examples that are part of the recommended libraries would hopefully be covered, and other library authors should have a simple way of assessing what the right units and error measures are so that they can (optionally) add support to their own libraries.
This, on the other hand, is not about new features, but about general compatibility, and I agree it should be further discussed in the proposal. I'll add some discussion along these lines. Thank you very much, Bill. This feedback is very useful. Iñaki
Hi Iñaki, OK, it sounds like we have no practical disagreement: you're planning to keep separate packages and then have a third one for integration. That will be fine for people like me who don't necessarily want to specify units for our regressions. I look forward to seeing this! Cheers, David