R html help system [Was: How to document man/*.Rd pages with images?]
On 5/13/11 8:20 PM, "Simon Urbanek" <simon.urbanek at r-project.org> wrote:
On May 13, 2011, at 7:08 PM, Sean Robert McGuffee wrote:
On 5/12/11 9:13 AM, "Simon Urbanek" <simon.urbanek at r-project.org> wrote:
I just want to clarify the mechanics of the help system when using html. R has a built-in HTTP server (aka Rhttpd) which transforms HTTP requests to function calls. It is not your usual web server, because it doesn't map URL paths to files, it just allows R functions to do anything with it -- something like CGI except that we are talking about functions and not files. Therefore you won't find any files and there is no file structure involved. For the help system, the function handling the requests is tools:::httpd() - you can look at what it does.
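For anyone following along, a minimal sketch of one way to look at that handler. It assumes an R version where tools:::httpd is still the internal function behind the help server; it is unexported, so its arguments and behaviour may change between releases.

```r
# tools:::httpd is internal to the tools package (note the triple colon);
# treat anything learned from it as version-specific.
handler <- tools:::httpd

# Show the arguments the handler receives for each request.
print(names(formals(handler)))

# Print the first lines of its source to see which paths it handles.
print(head(deparse(handler), 20))
```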
Awesome, will do!
Basically, it generates pages according to the various paths it supports. As part of its path handling it allows certain paths to reference files, e.g. /library/myPackage/doc/randomStuffInPackagesDocDirectory.html will read the file doc/randomStuffInPackagesDocDirectory.html in myPackage. Note that the whole point of the dynamic help *is* to generate content on the fly, because the content depends on the state of your current workspace -- packages loaded, classes defined, etc. so you cannot pre-generate pages as they won't have correct links - that's why we shifted from static html pages to the dynamic ones.
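As a rough illustration of the doc-path case described above, the mapping could be sketched as follows. The helper doc_path_to_file is hypothetical (it is not part of R and is much simpler than what httpd() does); it only shows the idea of turning a /library/<pkg>/doc/<file> path into a file inside the installed package.

```r
# Hypothetical helper (not part of R): resolve a help-server path of the
# form /library/<pkg>/doc/<file> to the installed file. Returns "" if the
# path has another form or the file is not installed.
doc_path_to_file <- function(path) {
  m <- regmatches(path, regexec("^/library/([^/]+)/doc/(.+)$", path))[[1]]
  if (length(m) != 3) return("")
  system.file("doc", m[3], package = m[2])
}

# Path taken from the example in the text; myPackage is a placeholder name.
doc_path_to_file("/library/myPackage/doc/randomStuffInPackagesDocDirectory.html")
```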
This is interesting. It actually speaks to a constructive criticism I have of R. As a user of R, I don't want to have conditional dynamic help. I want to always get the same answer to a question, so to speak. If there is a way to do something that works and is reproducible, then I don't want to maybe or maybe not get that answer. Thus, I guess what I'm thinking is that there should maybe be selection within help that has organization based on packages loaded and classes defined, but I would hope that the state of my system doesn't change the help that is displayed. At least I, as one user, would prefer to see all of my options with all packages and not have any of it emphasized or excluded by how my system is currently set up.
I don't think I follow you. Your options will be different, by definition, depending on which packages you have installed and loaded. As one obvious example, you can't refer to documentation of packages that you did not install. As another example (more future-directed I suspect), the generics and methods depend on your currently defined classes, methods etc. That is true in R itself, so I don't see why you would prefer documentation that doesn't match your R.
I guess what I mean to say is that although a user's options depend on context, when I use help, I want to know what my options are outside of the current context. Especially at a stage where I need help, I definitely might not have the right context set up, so it's very important to me to have non-context-specific help. I recognize that this is my opinion and by no means agreed upon, but this was a huge barrier for me when I first began using R. For example, sometimes I would load a library while following an example in the documentation before I even knew about libraries or realized what I was doing. Then, within that context, a help command within that library would work. On another day, when I hadn't set up that context by accident, using my notes of "help(whatever)" wouldn't work. That was very confusing and frustrating for me, and I'm much more computer literate than many of the test-tube wet lab scientists I would hope to have using my packages. I realize that "??whatever" gets around this, but the "??" symbol is hard to look up and learn about, especially when there is a "help(whatever)" syntax that worked on a previous day. It took me months to understand that "??" even existed as something separate from "help(whatever)." In general, as a beginner, I found it very confusing that there was more than one help command, and I'm not sure if that's a good thing for people who need help. Those are my two cents anyway. The current setup might be best for most people--I tend to be a very literal and concrete thinker and may not be very representative. On the other hand, I don't see what the benefit is of having fewer options for help if a package isn't loaded. I suppose it provides more focus for someone who knows very well what they are doing and wants to know something specific to their current context. However, I'm not sure if helping that person focus is as critical to providing help as assisting a new user who is likely to be oblivious to contexts as a concept.
Am I making any sense? Let me make up an extreme example, just to clarify: Suppose there are packages A and B, both with command C. Regardless of what packages I have loaded, I would want help( C ) to bring up a list with two files to choose from: one corresponding to A::C and one corresponding to B::C, both containing the information that to use C one must first type library(A) or library(B) to create the context that makes the help information about how to use C relevant. My understanding is that as it stands, typing help( C ) would do one of three things, depending on whether library(A) or library(B) has been loaded. If neither has been loaded, help( C ) would produce an error. If library(A) had been loaded, then help would be generated regarding A::C and the user would be otherwise oblivious to B::C, even if that were the useful info to the user. Likewise for the vice versa case with library(B) loaded. So that's to help me explain my perspective of what I mean by non-context-specific help and why I think it would be advantageous. Does that make sense? What I don't understand is a case where it is advantageous to be oblivious to out-of-context options. Could someone show me an example of that? I'm sure I am missing something because I can't think of a case like that on my own and have been considering it for days.
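For what it is worth, a short sketch of the lookups that do not depend on which packages are attached, using only documented arguments of help() and help.search(). The topic "filter" in package stats is just a convenient stand-in for the hypothetical A::C.

```r
# Look up a topic in a named package without attaching it first.
# print(h) would display the page.
h <- help("filter", package = "stats")

# Ask help() to consider all installed packages, not only attached ones.
h_all <- help("filter", try.all.packages = TRUE)

# Search the documentation of all installed packages; this is the call
# that ??filter is shorthand for. print(hits) would list the matches.
hits <- help.search("filter")
```

The default for try.all.packages comes from getOption("help.try.all.packages"), so a user who always wants the wider lookup could set that option once in a startup file.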
In the status quo, I can see how the choice of which pages to look at is dynamic if more than one comes up on a search, but it seems inefficient to me to have the page itself be dynamic. I think it would be a good idea if package authors could at least have an option to have their help pages produced as files either way.
That decision is left to the user - you can use --html to generate html pages.
This is appealing to me, but I can't seem to find any info about it. Maybe having the "--" part of the "--help" is throwing off my searches. I couldn't find "--help" when I searched R's help. If my package is named MyPackage, how would a user generate the html pages from it with "--html"?
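A sketch of what this could look like, assuming the option meant here is the --html flag of R CMD INSTALL (the thread does not spell out the full command). The wrapper name install_with_html and the tarball name are hypothetical; the second half converts a single throwaway Rd file with tools::Rd2HTML to show what a static page looks like.

```r
# Hypothetical wrapper: install a source tarball and ask R CMD INSTALL to
# build static HTML help pages by passing --html through INSTALL_opts.
install_with_html <- function(tarball) {
  install.packages(tarball, repos = NULL, type = "source",
                   INSTALL_opts = "--html")
}
# Example call (file name is an assumption):
# install_with_html("MyPackage_1.0.tar.gz")

# Converting one Rd file to a standalone HTML file, using a placeholder page.
rd <- tempfile(fileext = ".Rd")
writeLines(c("\\name{demo}", "\\alias{demo}", "\\title{Demo page}",
             "\\description{A placeholder help page.}"), rd)
html <- tempfile(fileext = ".html")
tools::Rd2HTML(rd, out = html)
```

With --html, the generated pages should end up in the html directory of the installed package, though that location is worth confirming on the system in question.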
I mean, once my package is loaded, I certainly won't want options; I want to be able to give my users an unconditional file location to point their browser to.
You can do that with dynamically generated pages (you can't do that with static pages in fact) - the paths are well defined (unlike in your file system). Even better, in fact, because the dynamic help is smart enough to find packages in different libraries, for example.
I think this is a very good thing. Having the dynamic help is probably the very best way to go from that perspective. I personally find file-system paths to be annoyingly less well defined, so I'm completely sold by what you are saying. It seems from what you said below that the performance issue I have is from a bug. Am I right about that? I could be mistaking the context of what you said below because I was a bit ambiguous about more than one issue below. Since I can control what my package generates for help, my context-specific issue is not a problem for me, so this is a distinct issue I have with dynamic help: what appeared to me to be a performance problem. My machine tends to lag for quite a while on many packages when using the dynamic help system. Almost enough to discourage use. I found my case with a large image to be a particularly bad anecdotal example of a specific help problem, and maybe it is relevant to that more general issue if there is a general bug. If this is caused by a bug, I would see no problem with the concept of dynamic help at all--especially if it can react instantaneously. However, I'm not completely convinced of that yet. It seems very efficient to me to have a dynamic system point to pre-generated files. However, generating those files dynamically, especially if many of them are relevant, seems potentially inefficient. I don't claim to understand how it is working yet, although looking into it has been fascinating so far.
I'm curious about "If it's a large picture this process nearly crashes my machine when trying to access the file via help" - do you have an example package that would illustrate the problem?
I've tried to recreate the problem with a small fake package, and although it passes the check it doesn't seem to work quite right on my system. I might have some compiler issues or configuration issues though, so it might work as is on your system. If not, I think you could quickly find the relevant parts and add them to a package of your own to see the bug. I'm not really sure why this doesn't work on my machine. I did almost exactly the same thing as in the huge package that I can't fit on my file transfer site. However, that one is set up to only install in 64-bit and I couldn't remember how I set that up. So it might be the 32-bit part that is messing things up on my system. I think there should be a simple way to declare an architecture in a package DESCRIPTION or something. I can't remember. Anyway, that's beside the point. Here is an example of a syntax and image file that makes my help go extremely slow and not show images: http://ftsext.mskcc.org/FileExchange/FileList.aspx?id=9afb4fe1-ce1c-406d-b1a1-c9360493137c Please let me know if you see why this isn't working for me--both as to whether this works as is on your system and as to whether this causes the bug. At the moment this tells me " Error in gzfile(file, "rb") : cannot open the connection" even though it passes all the build, check, and install tests on my machine.
There seem to be two bugs AFAICS:
a) The path generated from the URL is either wrong or something is not in sync - it says /Library/Frameworks/R.framework/Versions/2.13/Resources/library/Meta/Rd.rdsBug /help but the meta file is really in /Library/Frameworks/R.framework/Versions/2.13/Resources/library/helpBug/Meta/Rd.rds. It seems like some strange permutation issue - but I didn't look at httpd() yet (I'm a bit puzzled as to why it doesn't affect other packages - maybe it's some regexp thing ...).
b) There seems to be an issue with the WebKit and Rhttpd interaction in that Rhttpd gets blocked by WebKit not fetching the data. If you look at the page from an external browser, all is well. This will be a bit tricky to address and will need modifications to R...
Should I look into making these modifications to R? Or would this type of thing be addressed by more official R personnel?
Cheers, Simon
Thanks, Simon
On May 11, 2011, at 7:14 PM, Sean Robert McGuffee wrote:
Thanks everyone for your help,
To summarize a resolution to my issue, it turns out that an image can be
included in a documentation file via html by putting an image file in the
inst/doc directory, for example inst/doc/myPic.png, and then pointing to it
in the man/myHelpPage.Rd file, for example as follows:
\if{html}{
\out{<img src="../doc/myPic.png" alt="image ../doc/myPic.png should be here"/>}
}\ifelse{latex}{}{}
Note, this doesn't mean that R's help browser will view those images inside
the properly generated html help files.
Also, note that without the \out{} part, the text of the <img .../> line
would show up instead of the html commands.
I have some concerns in case anyone on the list is interested. If it's a
large picture this process nearly crashes my machine when trying to access
the file via help--and I'm sure there must be some bug in that. I should
note that the picture won't actually display within R's help console (at
least on my machine--I'm on a mac with a binary version of R). To see that
the html files are created properly, I have to copy a link to the help file
and then point an actual browser such as firefox to the help file to see
the page with the image. I'm not sure how R is running httpd or how that
interacts with help. I'm not even sure about the basics of help. Is there a
way to configure R to use an actual web browser by default instead of its
slow one that doesn't show images? It would also be nice if there were an
address bar on R's help browser. I mean, until I put a link to my help file
inside another help file, there was no way for me to even get its address
to copy and paste into firefox. It would also be nice if it didn't almost
crash and let me more easily get the link, but ideally it would be best not
to have a semi-functional help browser. Furthermore, this brings up the
point that I can't find the files I'm browsing with the link. In this case,
I get a link such as:
http://127.0.0.1:23269/library/MyPackage/html/MyPackage.html
But I can't find the MyPackage.html file anywhere on my computer. It's
there in the web browser, but seems to be only in existence via R's httpd
without actually existing on my file system.
Is it there and I can't find it or is it encoded in R somehow? If it is
there, where would it be? If I close R, I no longer have access to the page
that R's httpd is serving. It seems to me that it's being created every
time I use help--and I think that is extremely inefficient. I think firefox
can handle file-type urls, so if there is a way to get R to both generate
these files and use firefox to browse them for help, I would very much like
to know more about it. It would be much faster and more useful than the
status quo on my machine if this file were generated once at installation
and remained as a file--and using help simply pointed a web browser to the
file.
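On the question of using an external browser, a minimal sketch under some assumptions: help_url is a hypothetical helper that follows the pattern of the link quoted above, and tools::startDynamicHelp(NA) is assumed to be available (it starts the server if needed and returns its port in recent R versions). Whether the R.app GUI on a mac honours the browser setting is not something this sketch can confirm.

```r
# Hypothetical helper: build the URL of a help page served by R's built-in
# server, following the pattern of the link quoted above.
help_url <- function(port, package, topic) {
  sprintf("http://127.0.0.1:%d/library/%s/html/%s.html",
          as.integer(port), package, topic)
}

if (interactive()) {
  options(help_type = "html")               # prefer HTML help pages
  port <- tools::startDynamicHelp(NA)       # start the server if not running
  browseURL(help_url(port, "stats", "lm"))  # opens getOption("browser")
}
```

The pages only exist while the R session that serves them is running, which matches the observation that the link stops working once R is closed.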
Anyway, I suppose this is a tangent. The main point is that there is a way
to provide help documentation with images--but even though it tries to view
them correctly via help--R's help browser displays broken images so I have
the awkward need to copy and paste links into other web browsers.
Regarding some feedback I've gotten about some users' interest in help
formatted as text, I think there are two things in this process that keep a
text help user on track: (1) the conditional html part and (2) even if
using a textual html browser, <img ... alt="alternate text"/> takes care of
displaying images as text. I think though that the other way around, the
users who require images in their help files are getting less functionality
via help in R. At least in this case, the best I could do was get R to
generate the proper help pages in html, but R's default html help browser
(at least on my machine) doesn't display the images (although they are
there and can be displayed by the same link in firefox).
Sometimes it's true what they say about a picture being worth a thousand
words--I think in general this is true for complex things that need computer
power to deal with, so I hope R can eventually support images in help files
due to the usefulness of doing so in some cases.
Thanks again,
Sean