
Better 'undefined columns' error for data.frame

8 messages · Martin Maechler, Duncan Murdoch, GILLIBERT, Andre +1 more

#
Dear R developers,


One of the error messages that make me lose the most time is the "undefined columns selected" error from `[.data.frame`.

It ought to specify the list of bad column names, but currently does not.

Fortunately, this can easily be fixed by a small patch I can write.
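
To illustrate the problem, here is a minimal sketch (the data frame and the column names are made up for the example). The error does not say which of the requested names are missing, so the offending names have to be worked out by hand, for instance with setdiff():

```r
# Toy data frame; 'x' and 'y' are deliberately not columns of it.
df <- data.frame(a = 1:3, b = 4:6)
wanted <- c("a", "x", "b", "y")

# Capture the error message instead of stopping the script.
msg <- tryCatch(df[, wanted], error = function(e) conditionMessage(e))
print(msg)

# Manual workaround: compute the requested names that do not exist.
missing_cols <- setdiff(wanted, names(df))
print(missing_cols)
```

With many requested columns, this manual step is what costs time; a message listing the missing names would make it unnecessary.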


Are you interested in that patch?

Is there a standard way to transfer patches for feature requests?


--

Sincerely

André GILLIBERT
#
    > Dear R developers, One of the error messages that make me
    > lose the most time is the "undefined columns selected" of
    > `[.data.frame`.

    > It ought to specify the list of bad column names, but
    > currently does not.

    > Fortunately, this can easily be fixed by a small patch I
    > can write.


    > Are you interested in that patch?

    > Is there a standard way to transfer patches for feature
    > requests?

Yes, the standard way is the same as for bug reports and patches:
R's bugzilla:  https://bugs.r-project.org/

One needs an account there, see
https://www.r-project.org/bugs.html  for an explanation (and
more),
but you have already had such an account for about a year,
so do go ahead.

Please keep the patch *minimal*,
i.e., no whitespace-only changes, etc.

Thank you in advance for trying to make R better !

With regards,
Martin


--
Martin Maechler
ETH Zurich  and  R Core team

    > --
    > Sincerely

    > André GILLIBERT
#
On 24/09/2022 9:56 a.m., GILLIBERT, Andre wrote:
I doubt if you'll get an answer to this without showing it to us.
The standard instructions are here:  https://www.r-project.org/bugs.html .
You're offering an enhancement rather than a bug fix, but the process is 
the same.  But since your change is probably quite small, you might not 
need to go through the whole process:  proposing it on this mailing list 
might be sufficient.

Duncan Murdoch
1 day later
#
Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
Please find the patch attached, based on the latest R SVN trunk code.
To avoid any accidental change to the behavior of `[.data.frame`, I only inserted code at the positions where the "undefined columns selected" error was generated.
To test all code paths, I wrote a small test script (code attached), but did not add clean, standardized tests to the R build.

I added a new base::.undefined_columns_msg closure. I am not sure that adding such new internal symbols to the base package is smart because it pollutes the base environment, but looking at the base-internal documentation page, it seems to be standard practice.

For message translations, I tried to use tools::xgettext() to re-generate the R-base.pot file, but this file seems to be out of sync for a few messages. So, I manually added the new messages to R-base.pot.
As I am a native French speaker, I added appropriate translations to R-fr.po.
The make command did not regenerate the .mo files, and I did not find any appropriate make target to rebuild them, so, for testing purposes, I generated them manually.  The R Installation and Administration manual did not help (https://cran.r-project.org/doc/manuals/r-release/R-admin.html#Localization-of-messages).

When compiling the source code on an x86_64 Ubuntu 16.04 system, I had a problem with position-independent code in the internal R LAPACK module. The -fPIC option was not properly passed to the gfortran command line. I fixed that by replacing $(ALL_FCFLAGS) with $(ALL_FFLAGS) on line 16 of src/modules/lapack/Makefile.in
That looks like a bug to me (a misspelling of FFLAGS), but I did not fix it in my patch, since it is a very different issue.

--
Sincerely
André GILLIBERT
#
Andre,
On 25 September 2022 at 18:09, GILLIBERT, Andre wrote:
| Please, find the patch attached, based on the latest R SVN trunk code.

Well, the mailing list software tends to drop attachments.  There is a reason
all these emails suggest using bugs.r-project.org.

Dirk
#
Hi.  That change is much more ambitious than I would have guessed.  R 
Core might prefer it to arrive via bugzilla.

A couple of small comments:

1.  Your last test, "dfa[,cols]" produces a really long message:

 > dfa[,cols]
Error in `[.data.frame`(dfa, , cols) :
   undefined columns selected:   4,   5,   6,   7,   8,   9,  10,  11, 
12,  13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25, 
26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,  39, 
40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,  52,  53, 
54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,  65,  66,  67, 
68,  69,  70,  71,  72,  73,  74,  75,  76,  77,  78,  79,  80,  81, 
82,  83,  84,  85,  86,  87,  88,  89,  90,  91,  92,  93,  94,  95, 
96,  97,  98,  99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 
110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 
124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 
138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 
152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 
166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 
180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 
194, 195, 196

but it is still incomplete, because cols is 1:1e6.  It might be 
preferable to use an ellipsis when the number of columns is too large, e.g.

 > dfa[,cols]
Error in `[.data.frame`(dfa, , cols) :
   undefined columns selected:   4,   5,   6,   7,   8,   9,  10,  ...

(I think it's not worth the trouble to give the count of missing 
columns, but you might choose to do that.)

2.  Your patch file should be produced by "svn diff", which would have 
included the revision number.
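
The ellipsis idea from point 1 could be sketched roughly as follows. This is only an illustration: format_undefined_cols() and its max_show argument are hypothetical names, not the function from the patch, and the cutoff of 7 is arbitrary.

```r
# Hypothetical helper (not the code from the patch): list at most
# 'max_show' undefined columns and append "..." if more were requested.
format_undefined_cols <- function(bad, max_show = 7L) {
  shown <- head(bad, max_show)
  txt <- paste(shown, collapse = ", ")
  if (length(bad) > max_show)
    txt <- paste0(txt, ", ...")
  paste0("undefined columns selected: ", txt)
}

# Same situation as in the test: a 3-column data frame indexed by 1:1e6.
dfa <- data.frame(a = 1, b = 2, c = 3)
cols <- 1:1e6
bad <- cols[cols > ncol(dfa)]
format_undefined_cols(bad)
# [1] "undefined columns selected: 4, 5, 6, 7, 8, 9, 10, ..."
```

Truncating with head() before paste() also avoids building a string from a million indices only to throw most of it away.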

I think your code is an improvement over the current code; let's hope it 
gets accepted.

Duncan Murdoch
#
On 25/09/2022 2:48 p.m., Dirk Eddelbuettel wrote:
I was named in the posting, so I saw the patch file; my copy didn't come 
via the list.  It's a good patch, and I hope Andre does post it there.

Duncan Murdoch
#
Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
Thank you.
I filed enhancement request #18409 at https://bugs.r-project.org/show_bug.cgi?id=18409
I took your suggestions into account to improve the error messages.

I chose to provide different messages for character and logical/numeric indices, but this increases the code size, the number of code paths and the number of translations to perform.
If you have suggestions to improve the patch, I am open to comments and ideas.
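
To make the trade-off concrete, here is a rough sketch of what separate messages per index type can look like. It is not the code submitted to bugzilla: describe_undefined() and the message wording are hypothetical, and for brevity it only handles character indices and positive numeric indices.

```r
# Hypothetical illustration only: one message per kind of index.
# Each branch is an extra code path and an extra string to translate.
describe_undefined <- function(j, x) {
  if (is.character(j)) {
    bad <- unique(j[!j %in% names(x)])
    if (length(bad))
      return(paste0("undefined column names: ",
                    paste0("'", bad, "'", collapse = ", ")))
  } else if (is.numeric(j)) {
    # Only positive out-of-range positions are considered here.
    bad <- unique(j[!is.na(j) & j > ncol(x)])
    if (length(bad))
      return(paste0("undefined column positions: ",
                    paste(bad, collapse = ", ")))
  }
  NULL  # nothing undefined (or an index type not covered by this sketch)
}

df <- data.frame(a = 1:2, b = 3:4)
describe_undefined(c("a", "x"), df)
# [1] "undefined column names: 'x'"
describe_undefined(c(1, 5), df)
# [1] "undefined column positions: 5"
```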

--
Sincerely
Andre GILLIBERT