
[Bioc-devel] Re: [BioC] can't instal MetaData package "mouse4302"

3 messages · Gordon K Smyth, Robert Gentleman, Seth Falcon

#
I've moved this to bioc-devel.
At 12:33 PM 13/03/2005, Robert Gentleman wrote:
Thanks, that's great.

I'd really like to have a better understanding of how the annotation 
packages are put together, the critical thing being the mapping of Affy 
probe set IDs to Entrez Gene (Locus Link) IDs. There's some information 
about this at

   http://www.bioconductor.org/data/annotation.html

but it doesn't fully describe the process. I notice that you're not just 
accepting the Affymetrix supplied Locus Link IDs. In your paper with John 
Zhang, http://bioinformatics.oupjournals.org/cgi/reprint/19/1/155, you 
describe a process which accepts the Affymetrix supplied GenBank IDs, and 
then uses a voting of other data sources to map GenBank to Locus Link. Is 
that still the strategy? Is that better than just accepting the Affymetrix 
supplied LL IDs?
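
To make the voting idea concrete, here is a minimal sketch in Python. It is not AnnBuilder's actual code. It assumes each data source is a plain dictionary from GenBank accession to Locus Link ID. The function name `vote_locuslink`, the rule that ties yield no mapping, and the toy accessions and IDs are all hypothetical.

```python
from collections import Counter

def vote_locuslink(genbank_id, sources):
    """Return the Locus Link ID that most sources agree on.

    Returns None if no source knows the accession or if the top
    two candidates are tied (an assumed rule, not taken from the paper).
    """
    votes = Counter()
    for source in sources:
        ll_id = source.get(genbank_id)
        if ll_id is not None:
            votes[ll_id] += 1
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave the probe set unmapped
    return ranked[0][0]

# Hypothetical sources; the accessions and IDs are made up.
sources = [
    {"ACC0001": "LL100", "ACC0002": "LL200"},
    {"ACC0001": "LL100", "ACC0002": "LL201"},
    {"ACC0001": "LL101"},
]

for acc in ("ACC0001", "ACC0002", "ACC0003"):
    print(acc, vote_locuslink(acc, sources))
```

Returning no mapping on a tie keeps disputed cases visible instead of silently picking one source.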

Would it be possible to include with each annotation package the code used 
to build it, say in a directory called /build or whatever? The code might 
just be a few calls to AnnBuilder functions, but it would be nice to document it.

Cheers
Gordon
...
#
On Mar 13, 2005, at 2:16 AM, Gordon Smyth wrote:

            
As far as I know - and things are changing rapidly - Affymetrix's
version of things is much better now than it was when we started (at
least for standard arrays; I am not sure about custom ones), and we are
going to start making more use of their data. When we first started,
the answer to your last question was yes: we were doing much better.

  In any event, Affymetrix does not give mappings to everything we want,  
and so we will still have to do some manipulations and in some cases  
there does seem to be reasonable disagreement (and it is nice to know  
about such cases) - however, as I said, for standard arrays things are  
pretty stable and we are generally only conflicting on fairly obscure  
things.

Well, that is the point of AnnBuilder itself, and we do document the
actual resources and version numbers that have been used. For example,
try:
   ?hgu95av2
and that is pretty explicit. We can look at adding more, but I doubt  
that we can easily give something that is going to be comprehensive.  
The data resources themselves can undergo substantial changes (which  
seems to be what is happening with the mouse data right now) and we are  
struggling to find enough resources to keep up with those changes. Our  
main priority is ensuring that what we produce is accurate - so we do  
appreciate your comments and observations and will try to get things  
running more smoothly.

   I hope that from now on we will be able to provide some basic  
summaries on what has changed when we produce new sets of annotation  
(but again, getting that to work in anything that looks like an  
automatic fashion is non-trivial).
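
As a rough illustration of what such a change summary might contain, here is a minimal sketch in Python. It assumes two releases of an annotation package can each be reduced to a dictionary from probe set ID to gene ID. The function name `summarize_changes` and the toy IDs are hypothetical, and nothing here reflects how AnnBuilder stores its data.

```python
def summarize_changes(old, new):
    """Compare two probe-set-to-gene mappings and list the differences."""
    old_ids, new_ids = set(old), set(new)
    return {
        # probe sets that only the new release maps
        "added": sorted(new_ids - old_ids),
        # probe sets that only the old release maps
        "removed": sorted(old_ids - new_ids),
        # probe sets mapped in both releases, but to different genes
        "changed": sorted(p for p in old_ids & new_ids if old[p] != new[p]),
    }

# Hypothetical mappings for two releases; all IDs are made up.
old_release = {"probe_a": "LL100", "probe_b": "LL200", "probe_c": "LL300"}
new_release = {"probe_a": "LL100", "probe_b": "LL201", "probe_d": "LL400"}

summary = summarize_changes(old_release, new_release)
for kind, probes in summary.items():
    print(kind, len(probes), probes)
```

The hard part described above, deciding which of the changes matter when the upstream sources themselves are reorganised, is not addressed by a plain diff like this.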

   Robert
+-----------------------------------------------------------------+
| Robert Gentleman                         phone:  (206) 667-7700 |
| Head, Program in Computational Biology   fax:    (206) 667-1319 |
| Division of Public Health Sciences       office: M2-B865        |
| Fred Hutchinson Cancer Research Center                          |
| email: rgentlem@fhcrc.org                                       |
+-----------------------------------------------------------------+
#
A couple of clarifications:
At 12:33 PM 13/03/2005, Robert Gentleman wrote:
The mouse4302_1.6.9 annotation package that is currently on
www.bioconductor.org was created, without any change to the data, from
version 1.6.8.  The 1.6.9 version resolves an issue with data file
packaging that prevented use of the package on Windows.

Version 1.6.8 was built in early January 2005 by Jianhua (according to
the DESCRIPTION file). My understanding is that this package would
have been built against LocusLink and not EntrezGene.

+ seth