Message-ID: <462689CF.7020108@mail.nih.gov>
Date: 2007-04-18T21:12:47Z
From: Sean Davis
Subject: [Bioc-devel] [BioC] homology package question
In-Reply-To: <462685B0.9050405@med.umich.edu>
James W. MacDonald wrote:
> Hi Nianhua,
>
> Nianhua Li wrote:
>
>> Hi, James,
>>
>> The source file of mmuhomology is
>> ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/hmlg.ftp (downloaded on 02/28/2007)
>> and the description is
>> ftp://ftp.ncbi.nih.gov/pub/HomoloGene/README-old
>>
>> According to the description, the 4th and 7th columns of hmlg.ftp are Entrez Gene
>> IDs, and the 5th and 8th columns are internal HomoloGene IDs. If you look at the
>> hmlg.ftp file, even the current one, you can see that the internal HomoloGene
>> ID is the same as the Entrez Gene ID in most cases. That's why
>> mmuhomologyHGID2LL and mmuhomologyLL2HGID look identical.
>>
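For anyone who wants to check this against their own copy of hmlg.ftp,
here is a rough sketch (Python, not R) that counts the rows in which the
gene ID and HomoloGene ID columns hold the same value. The column
positions are the ones described above; the "|" delimiter, the function
name, and the sample rows are assumptions made up for illustration, so
adjust them to whatever README-old and the real file actually specify.

```python
import csv

def count_identical_ids(lines, delimiter="|"):
    """Return (rows_checked, rows_where_both_ID_pairs_match).

    Columns 4 and 7 are taken as Entrez Gene IDs and columns 5 and 8
    as internal HomoloGene IDs (1-based, as described in the thread).
    The delimiter is an assumption; pass a different one if needed.
    """
    total = same = 0
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 8:
            continue  # skip blank or malformed lines
        total += 1
        gene1, hgid1, gene2, hgid2 = row[3], row[4], row[6], row[7]
        if gene1 == hgid1 and gene2 == hgid2:
            same += 1
    return total, same

# Made-up rows for illustration only; these are not real HomoloGene data.
SAMPLE_ROWS = [
    "10090|9606|B|11111|11111|X1|22222|22222|Y1",
    "10090|9606|B|33333|33333|X2|44444|44444|Y2",
    "10090|9606|B|55555|777|X3|66666|777|Y3",
    "malformed|row",
]

total, same = count_identical_ids(SAMPLE_ROWS)
print(f"{same} of {total} rows have identical gene and HomoloGene IDs")
```

To run it on the real file, pass an open file handle instead of
SAMPLE_ROWS, e.g. count_identical_ids(open("hmlg.ftp")).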
>
> Odd. I wonder if they no longer even check to see if the data are
> correct. I checked several of the IDs, and AFAIK they really are Entrez
> Gene IDs, and they really are not HomoloGene IDs.
>
> Anyway, it's really easy to get the mappings from biomaRt so that might
> be the direction to point people until we start using an updated source
> of these data.
>
Moved over to bioc-devel....
Just a thought, and it is not my decision to make, but should the
homology packages be part of the release at all if they can't be updated
before then?
Sean