Request: Suggestions for "good teaching" packages, esp. with C code
On 16/02/2011 7:04 a.m., Paul Johnson wrote:
Hello, I am looking for CRAN packages that don't teach bad habits. Can I have suggestions? I don't mean the recommended packages that come with R, I mean the contributed ones. I've been sampling a lot of examples and am surprised that many ignore seemingly agreed-upon principles of R coding. In r-devel, almost everyone seems to support the "functional programming" theme in Chambers's book, Software for Data Analysis, but when I go look at randomly selected packages, programmers don't follow that advice. In particular:

1. Functions must avoid "mystery variables from nowhere." Consider a function's code: it should not be necessary to say "what's variable X?" and go hunting in the commands that lead up to the function call. If X is used in the function, it should be in a named argument, or extracted from one of the named arguments. People who rely on variables floating around in the user's environment are creating hard-to-find bugs.

2. We don't want functions with indirect effects (no "<<-"), almost always.

3. Code should be vectorized where possible; C-style for loops over vector members should be avoided.

4. We don't want gratuitous use of "return" at the end of functions. Why do people still do that?
Well, I for one (and Jeff as well, it seems) think it is good programming practice. It makes explicit what is being returned, eliminating the possibility of mistakes, and provides clarity for anyone reading the code.

David Scott
5. Neatness counts. Code should look nice! Check out how beautiful the functions in MASS look! I want code with spaces and "<-" rather than everything jammed together with "=".

I don't mean to criticize any particular person's code in raising this point. For teaching examples, where to focus?

Here's one candidate I've found: MNP. As far as I can tell, it meets the first 4 requirements. And it has some very clear C code with it as well. I'm only hesitant there because I'm not entirely sure that a package's C code should introduce its own functions for handling vectors and matrices, when some general-purpose library might be more desirable. But that's a small point, and clarity and completeness count a great deal in my opinion.
_________________________________________________________________
David Scott
Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018
Director of Consulting, Department of Statistics