MIT Libraries

Resources and Tools for Computational Research: Descriptions

arXiv Preprint Server

What it does: Gives programmatic access to arXiv articles, metadata, and the search interface, including bulk access to metadata and full text.

How it’s accessed: OAI-PMH, API, and RSS for metadata access and various cloud options for the full-text access. 

How to register: Free to use, no registration or API key required

Limitations: No stated limits, but see the "Play nice" and "Consider the impact" sections of the arXiv documentation

Contact for technical questions: arXiv Google Group

For more information: arXiv Bulk Data Access
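
As a rough illustration of the API route listed above, the sketch below builds a query URL and pulls titles out of the Atom feed that comes back. The endpoint `http://export.arxiv.org/api/query` and its `search_query`, `start`, and `max_results` parameters are taken from arXiv's public API documentation rather than from this guide, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed endpoint, taken from arXiv's public API documentation.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def arxiv_query_url(search_query, start=0, max_results=10):
    """Build a query URL; no registration or API key is involved."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def entry_titles(atom_xml):
    """Return the whitespace-normalized title of every <entry> in an Atom feed."""
    root = ET.fromstring(atom_xml)
    return [
        " ".join(entry.findtext(f"{ATOM_NS}title", default="").split())
        for entry in root.iter(f"{ATOM_NS}entry")
    ]
```

The feed itself could be fetched with `urllib.request.urlopen(arxiv_query_url("all:electron")).read()`; when looping over many queries, pausing between requests is in the spirit of the "Play nice" guidance.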

SAO/NASA Astrophysics Data System (ADS) API

What it does: Provides access to ADS database of bibliographic data on astronomy and physics publications

How it’s accessed: HTTP GET requests, or via an unofficial Python client

Result format: varies

How to register: Free to register, API key required

Limitations: Rate limits apply

Contact for technical questions: adshelp@cfa.harvard.edu

For more information: http://adsabs.github.io/help/api/; terms of use are available at http://adsabs.github.io/help/terms/
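
To make the "HTTP GET requests" plus "API key required" combination concrete, here is a minimal sketch that prepares (but does not send) a search request. The endpoint URL, the `q`/`fl`/`rows` parameters, and the Bearer-token header follow the ADS API documentation as best understood here and should be treated as assumptions; `ads_search_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint, taken from the ADS API documentation.
ADS_SEARCH = "https://api.adsabs.harvard.edu/v1/search/query"


def ads_search_request(token, query, fields=("bibcode", "title"), rows=10):
    """Prepare a GET request that carries the API key as a Bearer token."""
    params = urlencode({"q": query, "fl": ",".join(fields), "rows": rows})
    return Request(
        f"{ADS_SEARCH}?{params}",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping the token in a header rather than the URL keeps it out of logs and browser history.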

BioMed Central Journal articles

What it does: Provides access to open access content published by BMC 

How it’s accessed: Through the SpringerNature Open Access API; text and data mining is also handled through SpringerNature

Result format: variety of different output formats, including XML and JSON

How to register: Registration to the developer portal required.

Limitations: No stated limitations for BMC content 

Contact for technical questions: supportapi@springernature.com

For more information: BMC's site on Indexing, archiving and access to data 

Caselaw Access Project API

What it does: Provides queryable access to all published US court decisions

How it’s accessed: In-browser API viewer or RESTful interface, also available as bulk download

Result format: structured XML, presentation HTML, or plain text

How to register: Most queries do not require registration, some jurisdictions with access restrictions require a free API key

Limitations: Full text of cases limited to 500 cases per person per day, unless otherwise authorized.  More on access limits here

Contact for technical questions: https://case.law/api/#problems

For more information: https://case.law/

Congress.gov

What it does: Congress.gov shares its application programming interface (API) to provide computational access to accurate and structured congressional data including bills, amendments, summaries, members, the Congressional Record, committee reports, nominations, treaties, and House Communications. Over time we will be adding other collections such as hearing transcripts and Senate Communications.

How it’s accessed: To use the API you must first get an API key.

Result format: Congress.gov API is a REST API and presents data in a hierarchical browse format with responses provided in XML or JSON. The XML format is the default for the API.

How to register: Free to register, API key required

Limitations: See the GitHub page for limits.

Contact for technical questions: Library of Congress Ask a Librarian service.

For more information:  see congress.gov's GitHub page for documentation, user guides, a change log that details changes to the API, and opportunities for feedback

Constellate

What it does: Constellate is a text and data analytics service from JSTOR and Portico that lets users build datasets from multiple content sources, then analyze and visualize them. MIT users can download up to 50,000 documents.

How it’s accessed: All MIT users can access Constellate via the access request form

Result format: Dataset files may be downloaded in CSV or JSON.  For all documents in Constellate, you may download bibliographic metadata, unigrams, bigrams, and full-text. For any content which is not rights restricted (e.g. Chronicling America, Reveal Digital, or early JSTOR content) your dataset files will contain the full-text.

You may read more about 

Contact for technical questions: If you have access to Constellate and need technical help, or need additional (typically more than 50,000) document downloads, please contact constellate@ithaka.org.

For more information: Contact lib-comptool@mit.edu with questions about the use of this tool. 

CORE

What it does: Gives programmatic access to the metadata and full text of millions of open access (OA) research papers

How it’s accessed: Through API or bulk data download. See an overview of CORE services for more information

How to register: Free to use, API key required, register for API key at https://core.ac.uk/api-keys/register

Limitations: Quota applied for query volume, details at https://core.ac.uk/services#api

Contact for technical questions: theteam@core.ac.uk  

For more information: https://core.ac.uk/services

CrossRef DOI Registry Agency

What it does: Allows access to metadata records for over 75 million scholarly works that have CrossRef DOIs, covering around 5000 publishers.  Can be used for text and data mining, checking against funder mandates, and to obtain metadata in a variety of representations.

How it’s accessed: General search interfaces and various APIs

Result format: JSON, Text, and XML

How to register: No registration required

Limitations: No stated data-use limitations; coverage may be limited by publisher participation

Contact for technical questions: support@crossref.org

For more information: https://www.crossref.org/documentation/retrieve-metadata/
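
Since no registration is needed and JSON is one of the result formats, a metadata lookup can be sketched in a few lines. The `https://api.crossref.org/works` endpoint, the `query`, `rows`, and `mailto` parameters, and the `message.items[].DOI` response layout are assumptions drawn from CrossRef's REST API documentation, not from this guide; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint, taken from CrossRef's REST API documentation.
CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_works_url(query, rows=20, mailto=None):
    """Build a works-search URL; `mailto` optionally identifies the caller."""
    params = {"query": query, "rows": rows}
    if mailto:
        params["mailto"] = mailto
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


def dois_from_response(body):
    """Extract DOIs from a JSON response body (str or bytes)."""
    payload = json.loads(body)
    return [item["DOI"] for item in payload["message"]["items"]]
```

Supplying a contact address is optional in this sketch; it is included because it lets the service reach heavy users instead of simply blocking them.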

DVN (Dataverse Network) APIs for Data Sharing

What they do: Multiple APIs available to allow programmatic access to data and metadata in the Dataverse Network, which includes the Harvard Dataverse Network, MIT Libraries-purchased data, and data deposited in other Dataverse Network repositories

How they’re accessed: the Data Access API 

Result format: DDI XML and JSON for partial records 

How to register: Access to restricted data sets requires approval by data owners.  To access MIT Libraries-purchased data, log in to Dataverse by selecting Massachusetts Institute of Technology and using your certificates or Touchstone.  More information available at: http://guides.dataverse.org/en/4.6/api/dataaccess.html#authentication-and-authorization

Limitations: No limitations on public data set downloads after agreeing to terms of use.  No limitations on restricted data set downloads after access is granted by data owners

Contact for technical questions: dvn_support@help.hmdc.harvard.edu; Questions can also be posted in https://groups.google.com/forum/#!forum/dataverse-community

For more information: http://guides.dataverse.org/en/4.6/api/

Digital Public Library of America (DPLA) API

What it does: Allows programmatic access to metadata in DPLA collections, including partner data from Harvard, New York Public Library, ARTstor, and others

How it’s accessed: RESTful interface

Result format: Structured JSON-LD objects

How to register: Free to use; API key must be requested with information here: https://dp.la/info/developers/codex/policies/#get-a-key

Limitations: No stated limitations

Contact for technical questions: codex@dp.la; Users can also submit issues to DPLA’s Issue Tracker

For more information: http://dp.la/info/developers/codex/

Europeana APIs

What they do: Four APIs available to allow access to metadata, annotation, and download of Europeana data

How they’re accessed: API details here

Result format: Varies by API

How to register: Free to register at https://pro.europeana.eu/pages/get-api

Limitations: No stated limitations

Contact for technical questions: api@europeana.eu

For more information: https://pro.europeana.eu/page/apis

HathiTrust Data API

What it does: Can be used to retrieve content (page images, OCR, and in some cases whole volume packages), and metadata for HathiTrust Digital Library volumes

How it’s accessed: RESTful interface

Result format: XML, JSON or binary depending on the resource queried

How to register: Two methods of access: via a Web client, requiring authentication (users who are not members of a HathiTrust partner institution must sign up for a University of Michigan “Friend” Account), or programmatically using an access key that can be obtained at http://babel.hathitrust.org/cgi/kgs/request

Limitations: No stated limitations but is not meant for large-scale retrieval of data

Contact for technical questions: feedback@issues.hathitrust.org or https://www.hathitrust.org/feedback

For more information: https://www.hathitrust.org/data_api

IEEE Xplore API

What it does: Provides flexible query and retrieval of metadata records for more than 4 million documents comprising IEEE journals, conference proceedings, and technical standards

How it’s accessed: HTTP requests using structured URL queries

Result format: JSON, XML

How to register: Follow the steps at https://developer.ieee.org/getting_started

Limitations: Maximum of 200 results may be retrieved in a single query.  A query term can only contain a maximum of 10 words

Contact for technical questions: onlinesupport@ieee.org

For more information: https://developer.ieee.org/
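
The two limits above (200 results per query, 10 words per query term) are easy to enforce client-side before a request is sent. The following is a minimal sketch: the endpoint URL and the `apikey`, `querytext`, `max_records`, and `start_record` parameter names are assumptions based on the IEEE developer documentation, the whole query text is treated as a single term, and `ieee_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names; check the IEEE developer docs.
IEEE_SEARCH = "https://ieeexploreapi.ieee.org/api/v1/search/articles"
MAX_RECORDS = 200      # stated cap on results per query
MAX_QUERY_WORDS = 10   # stated cap on words per query term


def ieee_search_url(api_key, query_text, max_records=25, start_record=1):
    """Build a structured URL query, enforcing the stated limits up front."""
    if len(query_text.split()) > MAX_QUERY_WORDS:
        raise ValueError(f"query term exceeds {MAX_QUERY_WORDS} words")
    params = {
        "apikey": api_key,
        "querytext": query_text,
        # Clamp rather than fail: asking for more than the cap is pointless.
        "max_records": min(max_records, MAX_RECORDS),
        "start_record": start_record,
    }
    return f"{IEEE_SEARCH}?{urlencode(params)}"
```

Larger result sets would be gathered by repeating the call with an increasing `start_record`.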

JSTOR Data for Research

What it does: Not a true API, but allows computational analysis and selection of JSTOR’s scholarly journal and primary resource collections.  Includes tools for faceted searching and filtering, text analysis, topic modeling, data extraction, and visualization

How it’s accessed: Web interface

Result format: CSV, varies depending on tool used

How to register: Free to access, registration is required to obtain results; no institutional affiliation required

Limitations: Datasets are capped by default at 1,000 articles; users seeking larger results are asked to contact JSTOR Data for Research

Contact for technical questions: support@jstor.org or http://about.jstor.org/contact

For more information: http://about.jstor.org/service/data-for-research

The Lens API

What it does: The Lens API provides programmatic access to Patent and Scholarly Works meta records. The Lens meta records of patents and scholarly works are metadata aggregated from various sources with persistent identifiers from the original data sources, and normalized with provenance maintained. All MIT affiliates have access via the Institutional User API Plans, which include the following rates and volumes for each user.

Scholarly Institutional User API Plan

●  5,000 requests per month

●  500 records per request

●  10 requests per minute

Patent Institutional User API Plan

●  5,000 requests per month

●  100 records per request

●  10 requests per minute

In addition, MIT Libraries has a small number of seats for high-volume API access via the Institutional Toolkit (ITK) Plan.  

How it’s accessed: All MIT affiliates can access the API via the request access form.

Result format: Provides programmatic access via REST API to the full corpus of Lens scholarly works and patents, with JSON output. The Lens API documentation includes details of request structure, searchable fields, code examples, and response fields. You may also use their Swagger UI for query development.

Contact for technical questions: If you have access to the API and need technical help, please create an issue on The Lens GitHub repo or use the feedback form on their support page at https://docs.api.lens.org/support.html. You may also contact support@lens.org

For more information: Contact lib-comptool@mit.edu with questions about our subscription and the use of this tool.
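
One way to stay under the per-minute ceiling listed for the Institutional User API Plans is a small client-side sliding-window limiter. This is a generic sketch, not part of The Lens tooling: the class name and methods are hypothetical, and it only tracks the 10-requests-per-minute figure (the monthly quota would need separate bookkeeping).

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls, period, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the logic can be tested
        self.calls = deque()    # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self):
        """Note that a call is being made now."""
        self.calls.append(self.clock())

    def acquire(self, sleep=time.sleep):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()
```

A script would create `SlidingWindowLimiter(10, 60.0)` once and call `acquire()` immediately before each API request.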

Library of Congress APIs

What they do: Multiple APIs available to download bibliographic data and search Library of Congress digital collections, including images, public radio and television, and historic newspapers

How they’re accessed: Varies by API used, more information available here

Result format: Varies by API used

How to register: Free to use, most APIs do not require an API key

Limitations: Not specified, varies by API used

Contact for technical questions: https://labs.loc.gov/lc-for-robots/

For more information: https://labs.loc.gov/lc-for-robots/

Nature Blogs API

What it does: Blog tracking and indexing service; tracks Nature blogs and other third-party science blogs

How it’s accessed: RESTful interface, queries are made as HTTP GET requests

Result format: Default is JSON, some queries return Atom/RSS

How to register: Free to register, API key no longer required as of 2013

Limitations: 2 calls per second; 5,000 calls per day; RSS results are limited to 100 items maximum

Contact for technical questions:  developers@nature.com

For more information: http://www.nature.com/developers/documentation/api-references/blogs-api/

Nature OpenSearch API

What it does: Bibliographic search service for Nature content

How it’s accessed: REST API with two interfaces: 1) an OpenSearch standard interface using keyword searches; 2) an SRU search interface using CQL structured queries

Result format: RSS, JSON, ATOM, SRU XML, TURTLE, depending on interface used

How to register: Free to register, API key no longer required as of 2013

Limitations: Results served in pages of 25 records. Additional records can be retrieved by paging through the result set. The page size can be varied and is capped at 100 records

Contact for technical questions:  developers@nature.com

For more information: http://www.nature.com/developers/documentation/api-references/opensearch-api/

NLM APIs

What they do: Multiple APIs and other data tools for accessing various NLM databases.

For more information: https://wwwcf.nlm.nih.gov/nlm_eresources/eresources/search_database.cfm

OECD Data APIs

What they do: Allow programmatic access to a selection of OECD datasets

How they’re accessed: two RESTful APIs available for queries in SDMX-JSON or SDMX-ML formats

Result format: JSON, XML

How to register: No registration required

Limitations: 1 million data points; not all OECD datasets are covered

Contact for technical questions: OECDdotStat@oecd.org

For more information: https://data.oecd.org/api/

ORCID API

What it does: Queries and searches the ORCID researcher identifier system and obtains researcher profile data

How it’s accessed: RESTful interface

Result format: HTML, XML, or JSON

How to register: Two options: 1) Users can access the Public API, which only returns data marked as “public”; 2) Become an ORCID member to receive API credentials: see here

Limitations: Data retrieved through Public API is limited

Contact for technical questions: https://orcid.org/help/contact-us

For more information: https://orcid.org/organizations/integrators/API

PLoS Article-Level Metrics API

What it does: Retrieves article-level metrics (including usage statistics, citation counts, and social networking activity) for articles published in PLOS journals and articles added to PLOS Hubs: Biodiversity

How it’s accessed: queries made as HTTP GET requests through a RESTful interface, or via web interface

Result format: XML, JSON, CSV

How to register: Free to register; API key needed; Go to http://api.plos.org/registration/

Limitations: Results limited to batches of 50 at a time

Contact for technical questions: alm@plos.org; Questions can also be posted in PLoS API Google Group

For more information: http://alm.plos.org/; http://almreports.plos.org/; http://alm.plos.org/docs/api

PLoS Search API

What it does: Allows PLoS content to be queried for integration into web, desktop, or mobile applications

How it’s accessed: RESTful interface, queries are made as HTTP GET requests

Result format: XML

How to register: Free to register; API key needed; go to http://api.plos.org/registration/.

Limitations: Max is 7200 requests a day, 300 per hour, 10 per minute; users should wait 5 seconds for each query to return results; requests should not return more than 100 rows; high-volume users should contact api@plos.org; API users are limited to no more than five concurrent connections from a single IP address

Contact for technical questions: api@plos.org; Questions can also be posted in PLoS API Google Group

For more information: http://api.plos.org/solr/faq/
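
The three request ceilings above interact: the per-minute limit alone would allow a request every 6 seconds, but the hourly and daily limits are tighter. A short worked calculation, using only the figures quoted in this entry (the helper names are hypothetical), finds the constant gap that respects all of them and caps the row count:

```python
# Ceilings quoted above: (max requests, window length in seconds).
PLOS_LIMITS = [(10, 60), (300, 3600), (7200, 86400)]
MAX_ROWS = 100  # requests should not return more than 100 rows


def min_spacing_seconds(limits):
    """Smallest constant gap between requests that satisfies every ceiling."""
    return max(window / count for count, window in limits)


def clamp_rows(rows):
    """Keep a requested row count within 1..MAX_ROWS."""
    return max(1, min(rows, MAX_ROWS))
```

With these figures the gaps are 6, 12, and 12 seconds respectively, so evenly spaced requests about 12 seconds apart keep to all three ceilings.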

ScienceDirect APIs

What they do: Multiple APIs available for different use cases, including text mining of full-text content, search widgets, displaying journal or book level data, federated searching, and indexing

How they’re accessed: varies, depending on use case

Result format: varies, depending on use case

How to register: Free to register.  MIT users should follow the steps on https://dev.elsevier.com/

Limitations: varies, depending on use case

Contact for technical questions: integrationsupport@elsevier.com

For more information: https://dev.elsevier.com/; https://dev.elsevier.com/sd_apis.html

Springer APIs

What they do: Multiple APIs providing access to Springer metadata and open access content

How they’re accessed: RESTful interface, using structured URL requests

Result format: XML, JSON, PRISM, A++ depending on query specifications

How to register: Free to register, API key required

Limitations: maximum results for a single query is 100 results for metadata queries, or 20 results for open access queries

Contact for technical questions: support.api@springer.com

For more information: https://dev.springer.com/; https://dev.springer.com/docs; https://dev.springer.com/restfuloperations
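
Because a single query returns at most 100 metadata results (or 20 open access results), larger result sets have to be fetched in pages. The sketch below only plans the pages; it assumes 1-based start positions and that each (start, count) pair maps onto the API's start-index and page-size parameters, which should be confirmed in the Springer documentation. The names are hypothetical.

```python
METADATA_PAGE_MAX = 100     # stated cap for metadata queries
OPEN_ACCESS_PAGE_MAX = 20   # stated cap for open access queries


def page_windows(total, page_size):
    """Return 1-based (start, count) pairs that together cover `total` records."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [
        (start, min(page_size, total - start + 1))
        for start in range(1, total + 1, page_size)
    ]
```

For example, 250 metadata records would take three requests, while 45 open access records would also take three because of the smaller page cap.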

STAT!Ref OpenSearch API

What it does: Bibliographic search service for displaying STAT!Ref results on a website.

How it’s accessed: OpenSearch specifications

Result format: RSS, ATOM, HTML

How to register: Free to register for users at a subscribing institution

Limitations: Limits exist but are not specified; high-volume users should contact STAT!Ref

Contact for technical questions: support@statref.com

For more information: http://online.statref.com/Resources/StatRefOpenSearch.aspx

TDM Studio

What it does: ProQuest’s TDM Studio is a text and data mining solution for research, teaching and learning and allows MIT users to analyze the content from eligible MIT Libraries ProQuest subscriptions. Please check the eligible content file for details and the date last updated. If you need a more recent version, please contact lib-comptool@mit.edu 

How it’s accessed: All MIT-affiliated users can access TDM Studio via the request access form. 

Result format: Dataset files of metadata and derived data can be downloaded in CSV format. Full-text content export is not allowed. 

Contact for technical questions: If you have access to TDM Studio and need technical help, please contact TDMStudio@clarivate.com.

For more information: Contact lib-comptool@mit.edu with questions about the use of this tool. 

UN Comtrade APIs

What they do: Allow access to data on International Merchandise Trade Statistics (IMTS) and the work of the International Merchandise Trade Statistics Section (IMTSS) of the United Nations Statistics Division

How they’re accessed: Some services in REST, some in SOAP

Result format: XML or CSV, depending on service

How to register: Comtrade Web Services requires IP authentication, and users must have a site license account; access to metadata and data availability information, however, is not restricted

Limitations: Depending on access rights, the following data can be obtained: Comtrade Data, Tariff Line Data, Total Trade, Annual Totals, Processed Data or Original Data. The last three are restricted to data exchange between the UN and OECD.

Contact for technical questions: comtrade@un.org

For more information: https://comtrade.un.org/ws/

Web of Science Lite

What it does: Allows text- and data-mining access to content in Web of Science Lite

How it’s accessed: Accessible via Clarivate’s Developers Portal

Result format: JSON or XML

How to register: Must be part of a subscribing institution to have full text access. MIT users must set up an individual account at Clarivate’s Developers Portal https://developer.clarivate.com and fill out a form with their name, email address, and an optional description of the project.

Limitations: Maximum number of tokens per user: 1. Maximum number of requests per second: 2.

Users may use the API to access the Data Fields in accordance with the applicable License Level, in each case as permitted by your subscription.

If a user is using Web of Science data in an article or presentation they must appropriately cite and credit Clarivate Analytics as the source.

Contact for technical questions: Contact support link here

For more information: https://developer.clarivate.com/

Web of Science API Expanded

What it does: The Web of Science™ API Expanded supports rich searching across the Web of Science to retrieve full item-level metadata from an expanded list of fields, including times cited counts, contributor addresses/affiliations and funding data. Additional operations support the ability to discover related records as well as cited references and citing items. The API offers full customization and flexibility for researchers who want to build more sophisticated queries, but requires some technical skill and coding ability to get up and running. MIT Libraries’ subscription includes 1 million record downloads per year for the MIT community.

How it’s accessed: Access is provided through an API key and the WoS Developer Portal (registration required). Because the MIT community has a limited number of downloads per year, requests for access need to be reviewed. Please use this form to request access.

Result format: provides REST-based programmatic access to the Web of Science™ documents, with JSON and XML output format 

Contact for technical questions: If you have access to the API and need technical help, please contact clarivate.customersupport@clarivate.com or use the contact form.

For more information: Contact lib-comptool@mit.edu with questions about the use of this tool.

Wiley Text and Data Mining

What it does: Allows text- and data-mining access to content in the Wiley Online Library

How it’s accessed: Accessible via CrossRef’s TDM service; RESTful interface

Result format: JSON

How to register: Must be part of a subscribing institution to have full text access. Users will encounter a click-through agreement and will receive a Client API Token, which is needed when requesting full text of articles

Limitations: rate-limits implemented through CrossRef rate-limiting headers, exact limitations not specified

Contact for technical questions: TDM@wiley.com; labs@crossref.org for support using the CrossRef TDM service

For more information: http://olabout.wiley.com/WileyCDA/Section/id-826542.html

World Bank APIs

What they do: Provide access to World Bank statistical databases, indicators, projects, and loans, credits, financial statements and other data related to financial operations

How they’re accessed: Three RESTful APIs available to provide access to different datasets: Indicators (time series data), Projects (data on the World Bank’s operations), Finances (World Bank financial data)

Result format: XML, JSON, RDF, and Atom, depending on specific API used

How to register: Free to use, no registration or API key required

Limitations: Request volume limits are unspecified, but should be “reasonable”

Contact for technical questions: data@worldbank.org or “Contact support” link here

For more information: https://datahelpdesk.worldbank.org/knowledgebase/topics/125589
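
For the Indicators API, a request and its JSON response can be sketched as follows. The base URL, the `country/.../indicator/...` path, and the two-item response layout (paging metadata followed by data rows) are assumptions drawn from the World Bank's API documentation rather than from this guide; the helper names are hypothetical, and error responses are not handled.

```python
import json

# Assumed base URL for the Indicators API.
WB_API = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, per_page=50):
    """Build a time-series request that asks for JSON output."""
    return (
        f"{WB_API}/country/{country}/indicator/{indicator}"
        f"?format=json&per_page={per_page}"
    )


def observations(body):
    """Map each date to its value, assuming a successful two-item response."""
    _paging, rows = json.loads(body)
    return {row["date"]: row["value"] for row in rows or []}
```

No key or registration step appears anywhere in the sketch, which matches the access terms above; missing observations come back as `None`.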

Alpha Vantage Stock API

What it does: The Alpha Vantage Stock API Service offers pre-processed and normalized finance and economic data for stocks, ETFs, mutual funds, foreign exchange rates, financial reports from SEC filings, and over 50 derived technical indicators

How it’s accessed: API calls are made using any web-enabled client (e.g. a web browser) to make an HTTP GET request to an appropriate URL. API users can use the programming language of their choice

Result format: JSON, CSV

How to register: A free API key can be obtained here

Limitations: Each free API key allows up to 500 API calls per day by default. Please reach out to support@alphavantage.co if a higher rate limit is needed

Contact for technical questions:  support@alphavantage.co

For more information: Please refer to the official documentation and the supplementary stock API review article for technical integration guide and financial modeling best practices
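
The HTTP GET pattern described above can be sketched as a URL builder plus a parser for the CSV result format. The `https://www.alphavantage.co/query` endpoint, the `function`, `symbol`, `apikey`, and `datatype` parameters, and the `timestamp`/`close` CSV column names are assumptions taken from the Alpha Vantage documentation; the helper names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint, taken from the Alpha Vantage documentation.
AV_ENDPOINT = "https://www.alphavantage.co/query"


def daily_series_url(symbol, api_key, datatype="csv"):
    """Build a GET URL for a daily price series."""
    params = {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "apikey": api_key,
        "datatype": datatype,
    }
    return f"{AV_ENDPOINT}?{urlencode(params)}"


def closing_prices(csv_text):
    """Map each timestamp to its closing price from a CSV response."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["timestamp"]: float(row["close"]) for row in reader}
```

Given the daily call limit on free keys, caching each downloaded series locally is a sensible default for repeated analyses.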