[R-pkg-devel] ORCID ID finder via tools::CRAN_package_db() ?

Tue, Aug 20, 2024 6:47 AM

The variant attached drops the URL and applies unique().

Hmm, the ones in

  head(with(a, sort_by(a, ~ family + given)), 100)

without a family look suspicious ...
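
For a quick look at only those entries, here is a minimal sketch. It assumes
'a' has the character columns 'given', 'family' and 'oid' produced by the code
quoted further down; the toy rows and ORCID iDs are made up for illustration.

```r
# Toy stand-in for the data frame 'a'; names and ORCID iDs are made up.
a <- data.frame(given  = c("Jane", "R Core Team", "John"),
                family = c("Doe", "", "Smith"),
                oid    = c("0000-0000-0000-0001",
                           "0000-0000-0000-0002",
                           "0000-0000-0000-0003"))

# paste(NULL, collapse = " ") gives "", so a person without a family
# name ends up with an empty string in 'family', not NA.
no_family <- a[!nzchar(a$family), , drop = FALSE]
no_family
```

On the real data, the rows in 'no_family' are the candidates for a manual
check of how 'given' and 'family' were split.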

Best
-k


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: orcid.R
URL: <https://stat.ethz.ch/pipermail/r-package-devel/attachments/20240820/76546959/attachment.ksh>

-------------- next part --------------

Dirk Eddelbuettel writes:

On 20 August 2024 at 07:57, Dirk Eddelbuettel wrote:
| 
| Hi Kurt,
| 
| On 20 August 2024 at 14:29, Kurt Hornik wrote:
| | I think for now you could use something like what I attach below.
| | 
| | Not ideal: I had not too long ago started adding orcidtools.R to tools,
| | which e.g. has .persons_from_metadata(), but that works on the unpacked
| | sources and not the CRAN package db.  Need to think about that ...
| 
| We need something like that too as I fat-fingered the string 'ORCID'. See
| fortunes::fortune("Dirk can type").
| 
| Will try the function below later. Many thanks for sending it along.

Very nice. Resisted my common impulse to make it a data.table for easy
sorting via keys etc.  After running your code the line

head(with(a, sort_by(a, ~ family + given)), 100)

shows that we need a bit more QA: person entries are not properly split
between 'family' and 'given', some ORCIDs are given as URLs, and we have
repeats. Excluding those is next.

Right.  One should canonicalize the ORCID (having the URLs is from being
nice) and then do unique() ...
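
A minimal sketch of that canonicalize-then-unique() step, assuming 'a' has the
columns 'given', 'family' and 'oid' from the code quoted below. The helper
canonicalize_orcid() is hypothetical (it is not part of tools or utils), the
toy rows are made up, and the pattern only checks the shape of the iD, not its
checksum.

```r
# Hypothetical helper, not part of tools or utils: reduce an ORCID
# given as a URL or as a bare iD to the bare iD.
canonicalize_orcid <- function(x) {
    x <- trimws(x)
    # Drop a leading http(s)://orcid.org/ prefix, if present.
    x <- sub("^https?://orcid\\.org/", "", x, ignore.case = TRUE)
    # Keep only strings shaped like 0000-0000-0000-000X; the last
    # character may be the check character 'X'. Anything else is NA.
    ok <- grepl("^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$", x)
    x[!ok] <- NA_character_
    x
}

# Toy stand-in for the data frame 'a'; names and ORCID iDs are made up.
a <- data.frame(given  = c("Jane", "Jane", "John"),
                family = c("Doe", "Doe", "Smith"),
                oid    = c("https://orcid.org/0000-0000-0000-0001",
                           "0000-0000-0000-0001",
                           "0000-0000-0000-0003"))

a$oid <- canonicalize_orcid(a$oid)
# Drop rows whose ORCID did not parse, then remove exact repeats.
a <- unique(a[!is.na(a$oid), , drop = FALSE])
a
```

Note that unique() on the data frame only removes rows that agree in all three
columns, so the same ORCID listed with differently spelled names still shows
up more than once.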

Best
-k

Dirk

| Dirk
| 
| | 
| | Best
| | -k
| | 
| | ********************************************************************
| | x <- tools::CRAN_package_db()
| | a <- lapply(x[["Authors@R"]],
| |             function(a) {
| |                 if(!is.na(a)) {
| |                     a <- tryCatch(utils:::.read_authors_at_R_field(a), 
| |                                   error = identity)
| |                     if (inherits(a, "person")) 
| |                         return(a)
| |                 }
| |                 NULL
| |             })
| | a <- do.call(c, a)
| | a <- lapply(a,
| |             function(e) {
| |                 if(is.null(o <- e$comment["ORCID"]) || is.na(o))
| |                     return(NULL)
| |                 cbind(given = paste(e$given, collapse = " "),
| |                       family = paste(e$family, collapse = " "),
| |                       oid = unname(o))
| |             })
| | a <- as.data.frame(do.call(rbind, a))
| | ********************************************************************
| | 
| | > Salut Thierry,
| | 
| | > On 20 August 2024 at 13:43, Thierry Onkelinx wrote:
| | > | Happy to help. I'm working on a new version of the checklist package. I could
| | > | export the function if that makes it easier for you.
| | 
| | > Would be happy to help / iterate. Can you take a stab at making the
| | > per-column split more robust so that we can bulk-process all non-NA entries
| | > of the returned db?
| | 
| | > Best, Dirk
| | 
| | > -- 
| | > dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
| 
| -- 
| dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org

-- 
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
