[R-pkg-devel] [External] Formula modeling
On Fri, 8 Oct 2021, pikappa.devel at gmail.com wrote:
Hi,

The different environments could become an issue in the future. I was not aware of the vector construction notation, and I think this is what I was mainly looking for. I could provide two initialization methods. The first would use the ugly vector notation, which binds the whole model to one particular environment. The second would be more user-friendly and accept a comma-separated list of formulas. Essentially, the second would prepare the vector formula and call the first initialization method.

The comment on the (|) operator makes sense, and I would also like to avoid it to the extent that this is feasible. So, I am currently thinking of something along the lines of:

c(d, s, p | subject | time) ~ c(p + x + y, p + w + y, z + y)
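A minimal sketch of the "second initialization method" described above, in base R only. The helper name combine_formulas and its env argument are hypothetical and not part of any package; the sketch also leaves out the subject and time identifiers. It takes a comma-separated list of two-sided formulas and builds the single vector-notation formula, bound to one environment.

```r
# Hypothetical helper: turn  d ~ ..., s ~ ..., p ~ ...  into
#   c(d, s, p) ~ c(..., ..., ...)
# so that the whole system lives in one formula with one environment.
combine_formulas <- function(..., env = parent.frame()) {
  fs <- list(...)
  # Every argument must be a two-sided formula (length 3: `~`, lhs, rhs).
  stopifnot(all(vapply(fs, inherits, logical(1), what = "formula")))
  stopifnot(all(vapply(fs, length, integer(1)) == 3L))

  # Collect left- and right-hand sides into unevaluated c(...) calls.
  lhs <- as.call(c(list(as.name("c")), lapply(fs, `[[`, 2L)))
  rhs <- as.call(c(list(as.name("c")), lapply(fs, `[[`, 3L)))

  # Evaluating a `~` call creates a formula bound to the evaluation environment.
  eval(call("~", lhs, rhs), envir = env)
}

model_env <- new.env()
system_formula <- combine_formulas(d ~ p + x + y,
                                   s ~ p + w + y,
                                   p ~ z + y,
                                   env = model_env)
print(system_formula)
```

The environments of the individual input formulas are deliberately ignored here; whether that is the right choice depends on how the package wants to resolve variables.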
From the perspective of a person who does not use formulas outside of
xyplot() and glm(), this is a bit hard to parse visually. One could
easily make the mistake of reading s as corresponding to x, rather than to p+w+y.
I wonder if there is a way to write something along the lines of
~c( d ~ p + x + y,
    s ~ p + w + y,
    p ~ z + y | subject | time
  )
A quick experiment with R shows that this is treated like a formula, so ~c
becomes a way to group formulas.
best
Vladimir Dergachev
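The quick experiment mentioned above can be reproduced along these lines (a sketch in base R; the variable names f and eqs are only for illustration). It also shows one detail worth knowing: `|` binds more tightly than `~`, so the identifiers end up inside the right-hand side of the third equation rather than applying to the group as a whole.

```r
# A one-sided formula whose right-hand side is a call to c().
f <- ~c(d ~ p + x + y,
        s ~ p + w + y,
        p ~ z + y | subject | time)

print(class(f))

# f[[1]] is `~`, f[[2]] is the unevaluated call c(...).
# Dropping the function name leaves the three grouped equations.
eqs <- as.list(f[[2]])[-1]
print(length(eqs))

# The third equation parses as  p ~ ((z + y | subject) | time).
print(eqs[[3]])
print(all.vars(eqs[[3]]))
```

A package using this notation would therefore have to peel the `| subject | time` part off the last equation itself.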
This is very similar to how the function ?lme4::lmer uses the bar to separate expressions for design matrices from grouping factors. Actually, the subject and time variables are needed for subsetting prices for various operations required for the model matrix.

Thanks for the suggestions; they are very helpful!

Best,
Pantelis

-----Original Message-----
From: Duncan Murdoch <murdoch.duncan at gmail.com>
Sent: Friday, October 8, 2021 2:04 AM
To: Richard M. Heiberger <rmh at temple.edu>; pikappa.devel at gmail.com
Cc: r-package-devel at r-project.org
Subject: Re: [R-pkg-devel] [External] Formula modeling

On 07/10/2021 5:58 p.m., Duncan Murdoch wrote:
I don't work with models like this, but I would find it more natural
to express the multiple formulas in a list:
list(d ~ p + x + y, s ~ p + w + y, p ~ z + y)
I'd really have no idea how either of the proposals below should be parsed.
There's a disadvantage to this proposal. I'd assume that "p" means the same in all 3 formulas, but with the notation I give, it could refer to 3 unrelated variables, because each of the formulas would have its own environment, and they could all be different. I guess you could make it a requirement that they all use the same environment, but that's likely going to be confusing to users, who won't know what it means.

Another possibility that wouldn't have this problem (but in my opinion is kind of ugly) is to use R vector construction notation:

c(d, s, p) ~ c(p + x + y, p + w + y, z + y)

Duncan Murdoch
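To illustrate the environment point, here is a small base-R sketch. The functions make_demand and split_system are hypothetical names used only for this example. The first part shows that formulas collected in a list need not share an environment; the second shows that the vector notation yields one formula, and so one environment, from which per-equation formulas can be rebuilt.

```r
# A formula remembers the environment in which it was created.
make_demand <- function() {
  p <- "a local p"   # not the same `p` the caller may have in mind
  d ~ p + x + y
}

eqs <- list(make_demand(), s ~ p + w + y, p ~ z + y)

# The first formula was created inside make_demand(), the others were not.
same_env <- identical(environment(eqs[[1]]), environment(eqs[[2]]))
print(same_env)   # FALSE

# Vector notation: a single formula object with a single environment.
f <- c(d, s, p) ~ c(p + x + y, p + w + y, z + y)

# Hypothetical helper: pair the elements of the two c() calls and rebuild
# one formula per equation, all bound to the environment of `f`.
split_system <- function(f) {
  lhs <- as.list(f[[2]])[-1]
  rhs <- as.list(f[[3]])[-1]
  stopifnot(length(lhs) == length(rhs))
  env <- environment(f)
  Map(function(l, r) eval(call("~", l, r), envir = env), lhs, rhs)
}

system_eqs <- split_system(f)
print(system_eqs)
```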
Of course, if people working with models like this are used to working with notation like yours, that would be a strong argument to use your notation.

Duncan Murdoch

On 07/10/2021 5:51 p.m., Richard M. Heiberger wrote:
I am responding to a subset of what you asked. There are packages
which use multiple formulas in their argument sequence.
What you have as a single formula with | as a separator

q | p | subject | time | rho ~ p + x + y | p + w + y | z + y

I think would be better as a comma-separated list of formulas

q , p , subject , time , rho ~ p + x + y , p + w + y , z + y
because in R notation | is usually an operator, not a separator.
lattice uses formulas and the | is used as a conditioning operator.
nlme and lme4 can have multiple formulas in the same calling sequence.
lme4 is newer. From its ?lme4-package: 'lme4' covers approximately
the same ground as the earlier 'nlme' package.
lme4 should probably be the model you look to for the package design.
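As a concrete illustration of `|` being an operator rather than a separator, the sketch below (base R only; split_bars is a hypothetical helper, not a function from lattice, nlme or lme4) shows that the bar-separated formula parses into nested calls to `|`, which a package would have to walk by hand to recover the individual parts.

```r
f <- q | p | subject | time | rho ~ p + x + y | p + w + y | z + y

# Each side is a left-nested chain of calls to `|`,
# e.g. the right-hand side is  ((p + x + y) | (p + w + y)) | (z + y).
print(f[[3]][[1]])

# Hypothetical helper: flatten a chain of `|` calls into a list of parts.
# Parts wrapped in parentheses are left intact, since their head is `(`.
split_bars <- function(expr) {
  if (is.call(expr) && identical(expr[[1]], as.name("|"))) {
    c(split_bars(expr[[2]]), split_bars(expr[[3]]))
  } else {
    list(expr)
  }
}

lhs_parts <- split_bars(f[[2]])
rhs_parts <- split_bars(f[[3]])
print(vapply(lhs_parts, deparse, character(1)))
print(vapply(rhs_parts, deparse, character(1)))
```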
On Oct 07, 2021, at 17:20, pikappa.devel at gmail.com wrote:

Dear R-package-devel subscribers,

My question concerns a package design issue relating to the usage of formulas. I am interested in describing via formulas systems of the form:

d = p + x + y
s = p + w + y
p = z + y
q = min(d,s).

The context in which I am working is that of market models with, primarily, panel data. In the above system, one may think of the first equation as demand, the second as supply, and the third as an equation (co-)determining prices. The fourth equation is implicitly used by the estimation method, and it does not need to be specified when programming the R formula. If you need more information about the system, you may check the package diseq.

Currently, I am using constructors to build market model objects. In a constructor call, I pass
[i] the right-hand sides of the first three equations as strings,
[ii] an argument indicating whether the equations of the system have correlated shocks,
[iii] the identifiers of the used dataset (one for the subjects of the panel and one for time), and
[iv] the quantity (q) and price (p) variables.
These four arguments contain all the necessary information for constructing a model.

I would now like to re-implement model construction using formulas, which would be more familiar to most R users. I am currently considering passing all the above information with a single formula of the form:

q | p | subject | time | rho ~ p + x + y | p + w + y | z + y

where subject and time are the identifiers, and rho indicates whether correlated or independent shocks should be used.

I am unaware of other packages that use formulas in this way (for instance, passing the identifiers in the formula), and I wonder if this would go against any good practices. Would it be better to exclude some of the necessary elements for constructing the model? This might make the resulting formulas more similar to those of models with multiple responses or multiple parts. I am not sure, though, how one would use such model formulas without all the relevant information. Is there any suggested design alternative that I could check? I would appreciate any suggestions and discussion!

Kind regards,
Pantelis
______________________________________________
R-package-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel