Hi bioc-devel,
I noticed today that BioMart queries to phytozome return columns out of
order. This issue has come up before:
https://stat.ethz.ch/pipermail/bioconductor/2011-October/041384.html
Currently, when setting column names, biomaRt's getBM() assumes that the
results are returned in the same order in which the attributes were
specified, which is not always the case.
According to the BioMart documentation (version 0.8), the query XML format
has an option to set header="1" (see page 63 here:
http://www.biomart.org/other/rc6_documentation.pdf) which appears to solve
this issue. Locally, I set header="1" in the query built by getBM() in
R/biomaRt.R and removed the colnames(results) = attributes assignment in
the second-to-last line of the function; this appears to work with both
phytozome and ensembl. I can't find on the
BioMart website whether this is a recent addition to the query format. If
it only works with some recent BioMart versions, could we add header="1" to
the query for the appropriate versions and then call warning() to warn
users their column names may be incorrect for other versions? I would be
happy to supply a patch if needed.
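To illustrate the idea, here is a minimal sketch rather than the actual biomaRt code. build_query() and parse_result() are hypothetical helpers, the dataset and attribute names are placeholders, and the response text is made up. I am also assuming that the header row carries whatever column labels the server chooses, which may differ from the internal attribute names passed in the query.

```r
# Hypothetical helper: build a BioMart query XML string, optionally
# asking the server to include a header row (header = "1").
build_query <- function(dataset, attributes, header = TRUE) {
  attr_xml <- paste0("<Attribute name = \"", attributes, "\"/>",
                     collapse = "")
  paste0(
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?><!DOCTYPE Query>",
    "<Query virtualSchemaName = \"default\" formatter = \"TSV\" ",
    "header = \"", if (header) "1" else "0", "\" uniqueRows = \"1\">",
    "<Dataset name = \"", dataset, "\">", attr_xml, "</Dataset></Query>"
  )
}

# Hypothetical helper: parse a TSV response whose first line is the
# header, so column names come from the server rather than from the
# order in which attributes were requested.
parse_result <- function(tsv_text) {
  read.delim(text = tsv_text, header = TRUE, sep = "\t",
             check.names = FALSE, stringsAsFactors = FALSE)
}

# Placeholder dataset and attribute names, for illustration only.
query <- build_query("some_dataset", c("gene_id", "chromosome"))

# Made-up response: columns arrive in a different order than requested.
response <- "Chromosome\tGene ID\n1\tgeneA\n2\tgeneB\n"
results <- parse_result(response)
colnames(results)
```

With this approach the column names always match the data beneath them, whatever order the server returns. The trade-off is that callers relying on the names they passed as attributes would need a mapping back from the header labels.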
best,
Vince