
rcrossref's Introduction


Project Status: Active - The project has reached a stable, usable state and is being actively developed.

R interface to various CrossRef APIs

CrossRef documentation


Stable version from CRAN

install.packages("rcrossref")

Or development version from GitHub

remotes::install_github("ropensci/rcrossref")

Load rcrossref

library("rcrossref")

Citation search

Use CrossRef's DOI Content Negotiation service, where you can get citations back in various formats, including apa

cr_cn(dois = "10.1126/science.169.3946.635", format = "text", style = "apa")
#> [1] "Frank, H. S. (1970). The Structure of Ordinary Water: New data and interpretations are yielding new insights into this fascinating substance. Science, 169(3946), 635–641. doi:10.1126/science.169.3946.635"


cat(cr_cn(dois = "10.1126/science.169.3946.635", format = "bibtex"))
#> @article{Frank_1970,
#> 	doi = {10.1126/science.169.3946.635},
#> 	url = {},
#> 	year = 1970,
#> 	month = {aug},
#> 	publisher = {American Association for the Advancement of Science ({AAAS})},
#> 	volume = {169},
#> 	number = {3946},
#> 	pages = {635--641},
#> 	author = {H. S. Frank},
#> 	title = {The Structure of Ordinary Water: New data and interpretations are yielding new insights into this fascinating substance},
#> 	journal = {Science}
#> }


cr_cn(dois = "10.6084/m9.figshare.97218", format = "bibentry")

Citation count

Citation count, using OpenURL

cr_citation_count(doi = "10.1371/journal.pone.0042793")
#> [1] 21

Search Crossref metadata API

The following functions all use the CrossRef API.

Look up funder information

cr_funders(query = "NSF")
#> $meta
#>   total_results search_terms start_index items_per_page
#> 1            10          NSF           0             20
#> $data
#> # A tibble: 10 x 6
#>    id     location  name          alt.names        uri       tokens       
#>    <chr>  <chr>     <chr>         <chr>            <chr>     <chr>        
#>  1 50110… Norway    Norsk Sykepl… NSF, Norwegian … http://d… norsk, sykep…
#>  2 10000… United S… Center for H… CHM, NSF, Unive… http://d… center, for,…
#>  3 10000… United S… National Sle… NSF              http://d… national, sl…
#>  4 50110… Sri Lanka National Sci… National Scienc… http://d… national, sc…
#>  5 10000… Denmark   Statens Natu… Danish National… http://d… statens, nat…
#>  6 10000… United S… Office of th… NSF Office of t… http://d… office, of, …
#>  7 50110… Australia National Str… NSF              http://d… national, st…
#>  8 10000… United S… National Sci… NSF              http://d… national, sc…
#>  9 50110… China     National Nat… NSFC-Yunnan Joi… http://d… national, na…
#> 10 50110… China     National Nat… Natural Science… http://d… national, na…
#> $facets

Check the DOI minting agency

cr_agency(dois = '10.13039/100000001')
#> $DOI
#> [1] "10.13039/100000001"
#> $agency
#> $agency$id
#> [1] "crossref"
#> $agency$label
#> [1] "Crossref"

Search works (i.e., articles)

cr_works(filter = c(has_orcid = TRUE, from_pub_date = '2004-04-04'), limit = 1)
#> $meta
#>   total_results search_terms start_index items_per_page
#> 1       1614119           NA           0              1
#> $data
#> # A tibble: 1 x 26
#>   alternative.id container.title created deposited doi   indexed issn 
#>   <chr>          <chr>           <chr>   <chr>     <chr> <chr>   <chr>
#> 1 S200529011400… Journal of Acu… 2014-0… 2017-06-… 10.1… 2018-0… 2005…
#> # ... with 19 more variables: issue <chr>, issued <chr>, member <chr>,
#> #   page <chr>, prefix <chr>, publisher <chr>, reference.count <chr>,
#> #   score <chr>, source <chr>, title <chr>, type <chr>,
#> #   update.policy <chr>, url <chr>, volume <chr>, assertion <list>,
#> #   author <list>, funder <list>, link <list>, license <list>
#> $facets
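
The named filter vector passed to cr_works() above corresponds to the Crossref REST API's filter query parameter, which takes comma-separated name:value pairs with hyphenated names. The following is a minimal base R sketch of that translation. build_filter() is a hypothetical helper written only for illustration; it is not rcrossref's own implementation and does not call the API.

```r
# Hypothetical helper (not part of rcrossref): turn a named vector or list
# of filters into a Crossref-style "name:value,name:value" string.
build_filter <- function(x) {
  # R-friendly underscores become the hyphens used by the REST API
  keys <- gsub("_", "-", names(x), fixed = TRUE)
  vals <- as.character(x)
  # Logical flags are written in lowercase (true/false) in the query string
  is_flag <- vals %in% c("TRUE", "FALSE")
  vals[is_flag] <- tolower(vals[is_flag])
  paste(keys, vals, sep = ":", collapse = ",")
}

filter_string <- build_filter(c(has_orcid = TRUE, from_pub_date = "2004-04-04"))
filter_string
#> [1] "has-orcid:true,from-pub-date:2004-04-04"
```

Writing filters as a named vector keeps them readable in R while still mapping one-to-one onto the API's syntax.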

Search journals

cr_journals(issn = c('1803-2427','2326-4225'))
#> $data
#> # A tibble: 2 x 53
#>   title publisher issn  last_status_che… deposits_abstra… deposits_orcids…
#>   <chr> <chr>     <chr> <date>           <lgl>            <lgl>           
#> 1 Jour… "De Gruy… 1805… 2018-08-06       TRUE             FALSE           
#> 2 Jour… American… 2326… 2018-08-06       FALSE            FALSE           
#> # ... with 47 more variables: deposits <lgl>,
#> #   deposits_affiliations_backfile <lgl>,
#> #   deposits_update_policies_backfile <lgl>,
#> #   deposits_similarity_checking_backfile <lgl>,
#> #   deposits_award_numbers_current <lgl>,
#> #   deposits_resource_links_current <lgl>, deposits_articles <lgl>,
#> #   deposits_affiliations_current <lgl>, deposits_funders_current <lgl>,
#> #   deposits_references_backfile <lgl>, deposits_abstracts_backfile <lgl>,
#> #   deposits_licenses_backfile <lgl>,
#> #   deposits_award_numbers_backfile <lgl>,
#> #   deposits_open_references_backfile <lgl>,
#> #   deposits_open_references_current <lgl>,
#> #   deposits_references_current <lgl>,
#> #   deposits_resource_links_backfile <lgl>,
#> #   deposits_orcids_backfile <lgl>, deposits_funders_backfile <lgl>,
#> #   deposits_update_policies_current <lgl>,
#> #   deposits_similarity_checking_current <lgl>,
#> #   deposits_licenses_current <lgl>, affiliations_current <dbl>,
#> #   similarity_checking_current <dbl>, funders_backfile <dbl>,
#> #   licenses_backfile <dbl>, funders_current <dbl>,
#> #   affiliations_backfile <dbl>, resource_links_backfile <dbl>,
#> #   orcids_backfile <dbl>, update_policies_current <dbl>,
#> #   open_references_backfile <dbl>, orcids_current <dbl>,
#> #   similarity_checking_backfile <dbl>, references_backfile <dbl>,
#> #   award_numbers_backfile <dbl>, update_policies_backfile <dbl>,
#> #   licenses_current <dbl>, award_numbers_current <dbl>,
#> #   abstracts_backfile <dbl>, resource_links_current <dbl>,
#> #   abstracts_current <dbl>, open_references_current <dbl>,
#> #   references_current <dbl>, total_dois <int>, current_dois <int>,
#> #   backfile_dois <int>
#> $facets

Search license information

cr_licenses(query = 'elsevier')
#> $meta
#>   total_results search_terms start_index items_per_page
#> 1            25     elsevier           0             20
#> $data
#> # A tibble: 25 x 2
#>    URL                                                      work.count
#>    <chr>                                                         <int>
#>  1          1
#>  2                13
#>  3                 8
#>  4                    2
#>  5                       1
#>  6                        2
#>  7                       2
#>  8                      157
#>  9                   2175
#> 10           10
#> # ... with 15 more rows

Search based on DOI prefixes

cr_prefixes(prefixes = c('10.1016','10.1371','10.1023','10.4176','10.1093'))
#> $meta
#> $data
#>                               member                             name
#> 1                      Elsevier BV
#> 2 Public Library of Science (PLoS)
#> 3     Springer Nature America, Inc
#> 4             Co-Action Publishing
#> 5    Oxford University Press (OUP)
#>                                  prefix
#> 1
#> 2
#> 3
#> 4
#> 5
#> $facets
#> list()
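
If you start from full DOIs rather than prefixes, the prefix is the part of the DOI before the first slash. As a small sketch in base R (the variable names are illustrative, and the DOIs are ones used elsewhere in this README), the prefixes can be derived like this:

```r
# DOIs that appear in other examples in this README
dois <- c("10.1126/science.169.3946.635",
          "10.1371/journal.pone.0042793",
          "10.1155/2014/128505")

# Drop everything from the first "/" onward, then de-duplicate
prefixes <- unique(sub("/.*$", "", dois))
prefixes
#> [1] "10.1126" "10.1371" "10.1155"
```

The resulting character vector has the same shape as the prefixes argument shown above, so it could be passed along as cr_prefixes(prefixes = prefixes).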

Search CrossRef members

cr_members(query = 'ecology', limit = 5)
#> $meta
#>   total_results search_terms start_index items_per_page
#> 1            18      ecology           0              5
#> $data
#> # A tibble: 5 x 56
#>      id primary_name location last_status_che… total.dois current.dois
#>   <int> <chr>        <chr>    <date>           <chr>      <chr>       
#> 1   336 Japanese So… 5-3 Yon… 2018-08-06       1167       157         
#> 2  1950 Journal of … Suite 8… 2018-08-06       27         0           
#> 3  2080 The Japan S… 5-3 Yon… 2018-08-06       685        35          
#> 4  2151 Ecology and… 5-3 Yon… 2018-08-06       385        53          
#> 5  2169 Italian Soc… Diparti… 2018-08-06       1217       367         
#> # ... with 50 more variables: backfile.dois <chr>, prefixes <chr>,
#> #   coverge.affiliations.current <chr>,
#> #   coverge.similarity.checking.current <chr>,
#> #   coverge.funders.backfile <chr>, coverge.licenses.backfile <chr>,
#> #   coverge.funders.current <chr>, coverge.affiliations.backfile <chr>,
#> #   coverge.resource.links.backfile <chr>, coverge.orcids.backfile <chr>,
#> #   coverge.update.policies.current <chr>,
#> # <chr>, coverge.orcids.current <chr>,
#> #   coverge.similarity.checking.backfile <chr>,
#> #   coverge.references.backfile <chr>,
#> #   coverge.award.numbers.backfile <chr>,
#> #   coverge.update.policies.backfile <chr>,
#> #   coverge.licenses.current <chr>, coverge.award.numbers.current <chr>,
#> #   coverge.abstracts.backfile <chr>,
#> #   coverge.resource.links.current <chr>, coverge.abstracts.current <chr>,
#> # <chr>,
#> #   coverge.references.current <chr>,
#> #   flags.deposits.abstracts.current <chr>,
#> #   flags.deposits.orcids.current <chr>, flags.deposits <chr>,
#> #   flags.deposits.affiliations.backfile <chr>,
#> #   flags.deposits.update.policies.backfile <chr>,
#> #   flags.deposits.similarity.checking.backfile <chr>,
#> #   flags.deposits.award.numbers.current <chr>,
#> #   flags.deposits.resource.links.current <chr>,
#> #   flags.deposits.articles <chr>,
#> #   flags.deposits.affiliations.current <chr>,
#> #   flags.deposits.funders.current <chr>,
#> #   flags.deposits.references.backfile <chr>,
#> #   flags.deposits.abstracts.backfile <chr>,
#> #   flags.deposits.licenses.backfile <chr>,
#> #   flags.deposits.award.numbers.backfile <chr>,
#> # <chr>,
#> # <chr>,
#> #   flags.deposits.references.current <chr>,
#> #   flags.deposits.resource.links.backfile <chr>,
#> #   flags.deposits.orcids.backfile <chr>,
#> #   flags.deposits.funders.backfile <chr>,
#> #   flags.deposits.update.policies.current <chr>,
#> #   flags.deposits.similarity.checking.current <chr>,
#> #   flags.deposits.licenses.current <chr>, names <chr>, tokens <chr>
#> $facets

Get N random DOIs

cr_r() uses the function cr_works() internally.

cr_r()
#>  [1] "10.18472/cvt.18n1.2018.1488"                 
#>  [2] "10.1016/0161-5890(76)90161-9"                
#>  [3] "10.3109/09546634.2010.521811"                
#>  [4] "10.1007/978-3-642-86520-6_3"                 
#>  [5] "10.1093/benz/9780199773787.article.b00100239"
#>  [6] "10.1007/s10100-006-0016-5"                   
#>  [7] "10.1016/j.egypro.2011.10.348"                
#>  [8] "10.1097/01241398-199709000-00001"            
#>  [9] "10.1111/j.1445-5994.2009.02018.x"            
#> [10] "10.7215/ds_ip_20101029"

You can pass in the number of DOIs you want back (default is 10)

cr_r(2)
#> [1] "10.7868/s0869565217150221" "10.2307/442387"

Get full text

Publishers can optionally provide links in the metadata they provide to Crossref for full text of the work, but that data is often missing. Find out more about it at

Get some DOIs for articles that provide full text, and that have CC-BY 3.0 licenses (i.e., more likely to actually be open)

out <-
  cr_works(filter = list(has_full_text = TRUE,
    license_url = ""))
(dois <- out$data$doi)
#>  [1] "10.5194/acpd-14-24183-2014"     "10.5194/bgd-11-13343-2014"     
#>  [3] "10.5194/bgd-11-13455-2014"      "10.1155/2014/128505"           
#>  [5] "10.1155/2014/124592"            "10.1155/2014/154204"           
#>  [7] "10.1155/2014/718415"            "10.1155/2014/727135"           
#>  [9] "10.1155/2014/264217"            "10.1155/2014/484656"           
#> [11] "10.1155/2014/490386"            "10.1155/2014/528696"           
#> [13] "10.1155/2014/934510"            "10.1155/2014/913510"           
#> [15] "10.1155/2014/907584"            "10.1155/2014/936748"           
#> [17] "10.5194/amtd-7-9453-2014"       "10.1088/1742-6596/536/1/012003"
#> [19] "10.1088/1742-6596/536/1/012001" "10.1088/1742-6596/536/1/012016"

From the output of cr_works we can get full text links if we know where to look:

do.call("rbind", out$data$link)
#> # A tibble: 58 x 4
#>    URL                      content.type content.version intended.applica…
#>    <chr>                    <chr>        <chr>           <chr>            
#>  1 http://www.atmos-chem-p… unspecified  vor             similarity-check…
#>  2 http://www.biogeoscienc… unspecified  vor             similarity-check…
#>  3 http://www.biogeoscienc… unspecified  vor             similarity-check…
#>  4 http://downloads.hindaw… application… vor             text-mining      
#>  5 http://downloads.hindaw… application… vor             text-mining      
#>  6 http://downloads.hindaw… unspecified  vor             similarity-check…
#>  7 http://downloads.hindaw… application… vor             text-mining      
#>  8 http://downloads.hindaw… application… vor             text-mining      
#>  9 http://downloads.hindaw… unspecified  vor             similarity-check…
#> 10 http://downloads.hindaw… application… vor             text-mining      
#> # ... with 48 more rows
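
The link table mixes links meant for text mining with links meant for similarity checking, so it can help to keep only the former. The sketch below uses a small stand-in data frame so that it runs without network access. The URLs are placeholders, and the full column name intended.application and the full cell values (which are truncated in the printed output above) are assumptions.

```r
# Toy stand-in for do.call("rbind", out$data$link); URLs are placeholders
links <- data.frame(
  URL = c("http://example.org/a.pdf",
          "http://example.org/a",
          "http://example.org/b.xml"),
  content.type = c("application/pdf", "unspecified", "application/xml"),
  content.version = c("vor", "vor", "vor"),
  intended.application = c("text-mining", "similarity-checking", "text-mining"),
  stringsAsFactors = FALSE
)

# Keep only the links flagged for text mining
tdm_links <- links[links$intended.application == "text-mining", ]

# Group the remaining URLs by content type
urls_by_type <- split(tdm_links$URL, tdm_links$content.type)
urls_by_type
```

With real data, the same two steps would be applied to the table built from out$data$link.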

From there, you can grab the full text. Because most links require authentication, that step is handled by another package: crminer.

You'll need package crminer for the rest of the work.

Once we have DOIs, get URLs to full text content

if (!requireNamespace("crminer")) {
  install.packages("crminer")
}
library(crminer)
(links <- crm_links("10.1155/2014/128505"))
#> $pdf
#> <url>
#> $xml
#> <url>
#> $unspecified
#> <url>

Then use those URLs to get full text

crm_pdf(links)
#> <document>/Users/sckott/Library/Caches/R/crminer/128505.pdf
#>   Pages: 1
#>   No. characters: 1565
#>   Created: 2014-09-15

See also fulltext for getting scholarly text for text mining.


  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for rcrossref in R doing citation(package = 'rcrossref')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

This package is part of a richer suite called fulltext, along with several other packages, that provides the ability to search for and retrieve full text of open access scholarly articles.


rcrossref's People


sckott, njahn82, cboettig, haozhu233, karthik, noamross, poldham


