[Bioc-devel] [BioC] homology package question
James W. MacDonald wrote:
Hi Nianhua,
Nianhua Li wrote:
Hi, James,
The source file for mmuhomology is
ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/hmlg.ftp (downloaded on 02/28/2007)
and the description is
ftp://ftp.ncbi.nih.gov/pub/HomoloGene/README-old
According to the description, the 4th and 7th columns of hmlg.ftp are Entrez Gene
IDs, and the 5th and 8th columns are internal HomoloGene IDs. If you look at the
hmlg.ftp file, even the current one, you can see that the internal HomoloGene
ID is the same as the Entrez Gene ID in most cases. That's why
mmuhomologyHGID2LL and mmuhomologyLL2HGID look identical.
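The column comparison described above could be checked with a short script along these lines. This is only a rough sketch in Python: the function name id_column_agreement is hypothetical, the "|" field delimiter is an assumption that should be checked against README-old, and the sample rows are made-up placeholders, not real hmlg.ftp records.

```python
import csv
import io


def id_column_agreement(lines, delimiter="|"):
    """Count rows, and rows where column 4 equals column 5 and column 7
    equals column 8 (1-based column numbers, as in the description).

    The delimiter is an assumption; adjust it to match the real file.
    """
    total = 0
    same = 0
    for fields in csv.reader(lines, delimiter=delimiter):
        if len(fields) < 8:
            continue  # skip blank or short lines
        total += 1
        if fields[3] == fields[4] and fields[6] == fields[7]:
            same += 1
    return total, same


# Made-up placeholder rows, NOT real HomoloGene data.
sample = io.StringIO(
    "c1|c2|c3|11111|11111|c6|22222|22222\n"
    "c1|c2|c3|33333|33333|c6|44444|44444\n"
    "c1|c2|c3|55555|other1|c6|66666|other2\n"
)
total, same = id_column_agreement(sample)
print(f"{same} of {total} rows have identical ID columns")
```

To run it on a downloaded copy, pass an open file handle instead of the sample, for example `with open("hmlg.ftp") as fh: print(id_column_agreement(fh))`.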
Odd. I wonder if they no longer even check to see if the data are correct. I checked several of the IDs, and AFAIK they really are Entrez Gene IDs, and they really are not HomoloGene IDs. Anyway, it's really easy to get the mappings from biomaRt so that might be the direction to point people until we start using an updated source of these data.
Moved over to bioc-devel.... Just a thought, and it is not my decision to make, but should the homology packages be part of the release at all if they can't be updated beforehand?
Sean