Hi,
I need to create a BiomartGeneRegionTrack dataset via the Gviz package to run the examples in my package. But when I run
"R CMD check coMET", I get a warning message during the check:
checking data for non-ASCII characters ... WARNING
Warning: found non-ASCII strings
'[alpha cell,acidophil cell,acinar cell,adipoblast,adipocyte,amacrine cell,beta cell,capsular cell,cementocyte,chief cell,chondroblast,chondrocyte,chromaffin cell,chromophobic cell,corticotroph,delta cell,dendritic cell,enterochromaffin cell,ependymocyte,epithelium,erythroblast,erythrocyte,fibroblast,fibrocyte,follicular cell,germ cell,germinal epithelium,giant cell,glial cell,glioblast,goblet cell,gonadotroph,granulosa cell,haemocytoblast,hair cell,hepatoblast,hepatocyte,hyalocyte,interstitial cell,juxtaglomerular cell,keratinocyte,keratocyte,lemmal cell,leukocyte,luteal cell,lymphocytic stem cell,lymphoid cell,lymphoid stem cell,macroglial cell,mammotroph,mast cell,medulloblast,megakaryoblast,megakaryocyte,melanoblast,melanocyte,mesangial cell,mesothelium,metamyelocyte,monoblast,monocyte,mucous neck cell,muscle cell,myelocyte,myeloid cell,myeloid stem cell,myoblast,myoepithelial cell,myofibrobast,neuroblast,neuroepithelium,neuron,odontoblast,osteoblast,osteoclast,osteocyte,oxyntic cell,parafollicular cell,paraluteal cell,peptic cell,pericyte,phaeochromocyte,phalangeal cell,pinealocyte,pituicyte,plasma cell,platelet,podocyte,proerythroblast,promonocyte,promyeloblast,promyelocyte,pronormoblast,reticulocyte,retinal pigment epithelium,retinoblast,somatotroph,stem cell,sustentacular cell,teloglial cell,zymogenic cell,small cell,Th1,Cell Type,M<c3><bc>ller cell,primary oocyte,Claudius' cell,Th2,follicular dendritic cell,astrocyte,white,T-lymphoblast,basal cell,T-lymphocyte,helper induced T-lymphocyte:Th2,B-lymphocyte,neutrophil,oocyte,unclassifiable (Cell Type),natural killer cell,helper induced T-lymphocyte,brown,CD4+,Hensen cell,lymphocyte,cardiac muscle cell,lymphoblast,Paneth cell,alveolar macrophage,macrophage,squamous cell,oligodendrocyte,smooth muscle cell,gamete,spermatid,Schwann cell,CD34+,spermatocyte,helper induced T-lymphocyte:Th1,astroblast,eosinophil,oligodendroblast,basophil,peripheral blood mononuclear cell,histiocyte,Sertoli 
cell,endothelium,granulocyte,spermatozoon,Merkel cell,skeletal muscle cell,thymocyte,foam cell,ovum,secondary spermatocyte,Langerhans cell,primary spermatocyte,transitional,Purkinje cell,Kupffer cell,secondary oocyte,B-lymphoblast]' in object 'biomTrack'
chrom <- "chr2"
start <- 38290160
end <- 38303219
gen <- "hg19"
# showId is assumed to be defined earlier in the session
biomTrack <- BiomartGeneRegionTrack(genome = gen,
                                    chromosome = chrom, start = start,
                                    end = end, name = "ENSEMBL",
                                    fontcolor = "black", groupAnnotation = "group",
                                    just.group = "above", showId = showId)
Do you have any idea how to correct this? I think we need to discuss this with EMBL to get it fixed, don't we?
Tiphaine
----------------------------
Tiphaine Martin
PhD Research Student | King's College
The Department of Twin Research & Genetic Epidemiology | Genetics & Molecular Medicine Division
St Thomas' Hospital
4th Floor, Block D, South Wing
SE1 7EH, London
United Kingdom
email : tiphaine.martin at kcl.ac.uk
Fax: +44 (0) 207 188 6761
[Bioc-devel] Non-ASCII in datase from Biomart EMBL via Gviz package
4 messages · Martin, Tiphaine, Vincent Carey, Hahne, Florian
I don't know exactly how you are triggering this warning. If you have the
ability to prefilter your content before serializing, that may be best.
The following is from the gwascat package. You have very little chance,
I believe, of getting an institutional guarantee that only ASCII will go
into their emissions.
fixNonASCII = function(df) {
  hasNonASCII = function(x) {
    asc = iconv(x, "latin1", "ASCII")
    any(asc != x | is.na(asc))
  }
  havebad = sapply(df, function(x) hasNonASCII(x))
  if (!(any(havebad))) return(df)
  message("NOTE: input data had non-ASCII characters replaced by '*'.")
  badinds = which(havebad)
  for (i in 1:length(badinds))
    df[,badinds[i]] = iconv(df[,badinds[i]], to="ASCII", sub="*")
  df
}
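To make the iconv() idea above concrete, here is a minimal sketch on a toy character vector (not part of gwascat). It assumes the offending strings are UTF-8, which is what the <c3><bc> bytes in the warning suggest; the variable names are made up for illustration, and iconv() behaviour can differ slightly between platforms.

```r
# Toy input: one entry contains a non-ASCII character (u-umlaut), written
# with a \u escape so that this script itself stays pure ASCII.
cell_types <- c("alpha cell", "M\u00fcller cell", "Schwann cell")

# Without 'sub', iconv() returns NA for elements it cannot convert,
# so NA flags the entries that contain non-ASCII characters.
# Assumption: the input is UTF-8 encoded.
has_non_ascii <- is.na(iconv(cell_types, from = "UTF-8", to = "ASCII"))

# With 'sub', non-convertible bytes are replaced instead of producing NA,
# which is the same replacement strategy fixNonASCII() uses.
cleaned <- iconv(cell_types, from = "UTF-8", to = "ASCII", sub = "*")

print(cell_types[has_non_ascii])
print(cleaned)
```

The detection step is useful on its own if you would rather inspect the offending entries than overwrite them.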
On Sun, Oct 12, 2014 at 2:14 PM, Martin, Tiphaine <tiphaine.martin at kcl.ac.uk> wrote:
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Hi Tiphaine,
You can follow Vince's advice and transform all the data into proper ASCII
characters. Or you can just get rid of the culprit (the @biomart slot
of the object) before serialising. The easiest way to do that is:
foo@biomart <- NULL
The slot is only present to cache the BiomaRt connection, which is lost
anyways when serialising. The object is smart enough to realise that and
just reconnects the next time it is plotted. That is how I handled things
for the serialised BiomartGeneRegionTracks in Gviz.
Florian
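The slot-dropping approach can be sketched without Gviz or a network connection. In the example below, ToyTrack, ListOrNULL and the cache slot are hypothetical stand-ins for the real track class and its @biomart slot; the sketch only illustrates the mechanism of setting a NULL-capable S4 slot to NULL before calling save().

```r
library(methods)

# Hypothetical stand-in for a track class: the 'cache' slot plays the role
# of the biomart slot and may hold either a list or NULL.
setClassUnion("ListOrNULL", c("list", "NULL"))
setClass("ToyTrack", representation(name = "character", cache = "ListOrNULL"))

# The cached content carries a non-ASCII string (u-umlaut via \u escape).
track <- new("ToyTrack", name = "ENSEMBL",
             cache = list(filterValues = "M\u00fcller cell"))

# Drop the cached content before serialising, so that the non-ASCII
# strings never end up in the saved data file.
track@cache <- NULL

path <- tempfile(fileext = ".rda")
save(track, file = path)
```

With a real BiomartGeneRegionTrack the corresponding step would be the single assignment quoted in the message, followed by save() into the package's data directory.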
On 12/10/14 20:35, "Vincent Carey" <stvjc at channing.harvard.edu> wrote:
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
Both methods work well. Thanks,
Tiphaine
From: Hahne, Florian <florian.hahne at novartis.com>
Sent: 13 October 2014 08:46
To: Vincent Carey; Martin, Tiphaine
Cc: bioc-devel at r-project.org
Subject: Re: [Bioc-devel] Non-ASCII in datase from Biomart EMBL via Gviz package