
Analyzing Publications from Pubmed via XML

"Farrel Buchinsky" <fjbuch at gmail.com> wrote in
news:bd93cdad0712141216s23071d27n17d87a487ad06950 at mail.gmail.com:
> Gabor's example already did that task.
I could not find it. The pubmed function appears to assume that you
already have a list of PMIDs. When I set up a function to take an
arbitrary PubMed search string (quoted by the user) and return the
PMIDs, I had success by following Gabor's example:
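library(XML)
pm.srch <- function() {
   srch.stem <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term="
   # read the quoted search string typed at the console
   query <- as.character(scan(file = "", what = "character"))
   doc <- xmlTreeParse(paste(srch.stem, query, sep = ""), isURL = TRUE,
                       useInternalNodes = TRUE)
   # return the PMIDs held in the <Id> nodes of the reply
   sapply(c("//Id"), xpathApply, doc = doc, fun = xmlValue)
}
pm.srch()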
1: "laryngeal neoplasms[mh]"
2: 
Read 1 item
      //Id      
 [1,] "18042931"
 [2,] "18038886"
 [3,] "17978930"
 [4,] "17974987"
 [5,] "17972507"
 [6,] "17970149"
 [7,] "17967299"
 [8,] "17962724"
 [9,] "17954109"
[10,] "17942038"
[11,] "17940076"
[12,] "17848290"
[13,] "17848288"
[14,] "17848287"
[15,] "17848278"
[16,] "17938330"
[17,] "17938329"
[18,] "17918311"
[19,] "17910347"
[20,] "17908862"
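
A search string like the one above contains spaces and square brackets, and it is pasted into the URL unchanged. A minimal sketch of percent-encoding the term first, using base R's URLencode(); the helper name esearch_url is hypothetical and not part of the original function:

```r
# Hypothetical helper (not from the original post): build an ESearch URL
# with the search term percent-encoded.
esearch_url <- function(term,
    stem = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=") {
  # reserved = TRUE also encodes characters such as space, quotes, [ and ]
  paste0(stem, utils::URLencode(term, reserved = TRUE))
}

url <- esearch_url("laryngeal neoplasms[mh]")
print(url)
# the term part becomes: laryngeal%20neoplasms%5Bmh%5D
```

The resulting string could be passed to xmlTreeParse(..., isURL = TRUE) in place of the plain paste() result. Whether the unencoded URL is accepted may depend on the HTTP client, so treat this as a precaution rather than a required fix.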

Emboldened by that minor success, I pushed on. PubMed said your example
was malformed, so I used its suggested modification:
("Laryngeal Neoplasms"[MeSH] AND "Papilloma"[MeSH]) OR (("recurrence"[TIAB] NOT Medline[SB]) OR "recurrence"[MeSH Terms] OR recurrent[Text Word]) AND respiratory[All Fields] AND (("papilloma"[TIAB] NOT Medline[SB]) OR "papilloma"[MeSH Terms] OR papillomatosis[Text Word]) 

That search returned 400+ citations, and I saved the search string to a text file.

After quite a bit of hacking (in the sense of ineffective chopping with 
a dull ax), I finally came up with:

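pm.srch <- function() {
  srch.stem <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term="
  # read the saved search string from a file chosen interactively
  query <- readLines(con = file.choose())
  # strip the double quotes from the search string
  query <- gsub("\"", "", x = query)
  doc <- xmlTreeParse(paste(srch.stem, query, sep = ""), isURL = TRUE,
                      useInternalNodes = TRUE)
  # return the PMIDs held in the <Id> nodes of the reply
  return(sapply(c("//Id"), xpathApply, doc = doc, fun = xmlValue))
}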

pm.srch()  #choosing the search-file
      //Id      
 [1,] "18046565"
 [2,] "17978930"
 [3,] "17975511"
 [4,] "17935912"
 [5,] "17851940"
 [6,] "17765779"
 [7,] "17688640"
 [8,] "17638782"
 [9,] "17627059"
[10,] "17599582"
[11,] "17589729"
[12,] "17585283"
[13,] "17568846"
[14,] "17560665"
[15,] "17547971"
[16,] "17428551"
[17,] "17419899"
[18,] "17419519"
[19,] "17385606"
[20,] "17366752"
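
Only 20 PMIDs come back even though the search matches 400+ citations. That is consistent with ESearch's documented default of retmax=20. A rough sketch of asking for more by adding retmax (and retstart for paging) to the URL; the helper name esearch_paged_url and the shortened query are illustrative assumptions, not part of the original post:

```r
# Hypothetical helper: ESearch URL with an encoded term plus paging parameters.
esearch_paged_url <- function(term, retmax = 500, retstart = 0,
    base = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi") {
  sprintf("%s?db=pubmed&term=%s&retmax=%d&retstart=%d",
          base,
          utils::URLencode(term, reserved = TRUE),  # encode spaces, quotes, brackets
          as.integer(retmax), as.integer(retstart))
}

# Stand-in for the saved search file; a shortened version of the query above.
query_file <- tempfile(fileext = ".txt")
writeLines('("Laryngeal Neoplasms"[MeSH] AND "Papilloma"[MeSH])', query_file)

# Collapse to one string in case the file holds more than one line.
term <- paste(readLines(query_file), collapse = " ")
paged_url <- esearch_paged_url(term, retmax = 500)
print(paged_url)
```

The URL can then go to xmlTreeParse(..., isURL = TRUE) as in pm.srch, and the same "//Id" XPath should pick up the longer list. Because the term is percent-encoded here, stripping the double quotes beforehand should not be needed, though that is untested against the live service.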
