pys2's People

Contributors

mirandrom

pys2's Issues

Don't scrape all of a paper's neighbours by default when adding it to the S2Graph

if self.hopper.hop(gpath, self.graph):

Currently, adding a paper to the S2Graph implies scraping and adding all of its neighbours as well. This choice was made to allow dynamic graph exploration (e.g. with a reinforcement-learning-based GraphHopper) that requires information about a paper, and thus scraping it, before hopping to it.

However, for simple rule-based GraphHoppers this adds a lot of undesirable overhead when papers have large numbers of citations/references. I think the best solution would be to let S2DataStore objects lazily query the API when a paper/author is not locally cached. That way, if the GraphHopper doesn't need scraped paper information (e.g. if the decision is based only on the edge type), the API calls are avoided entirely.
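
A minimal sketch of the proposed lazy lookup. The class and method names below are illustrative, not pys2's actual API, and the Semantic Scholar endpoint/fields are only an assumed example of the upstream call:

```python
import requests

# Assumed public Semantic Scholar paper endpoint; pys2 may wrap this differently.
S2_PAPER_URL = "https://api.semanticscholar.org/graph/v1/paper/"


class LazyS2DataStore:
    """Illustrative data store that only hits the API on a cache miss."""

    def __init__(self):
        self._papers = {}  # paperId -> raw API payload

    def get_paper(self, paper_id: str) -> dict:
        # Serve from the local cache when possible, so a rule-based
        # GraphHopper that never inspects paper fields triggers no requests.
        if paper_id not in self._papers:
            resp = requests.get(S2_PAPER_URL + paper_id,
                                params={"fields": "title,year"})
            resp.raise_for_status()
            self._papers[paper_id] = resp.json()
        return self._papers[paper_id]
```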

add function for constructing the citation graph of a paper

Is your feature request related to a problem? Please describe.
Constructing a paper's citation graph is a common use case for the S2 API.

Describe the solution you'd like
A function for building the citation graph of a given paperId, with a specified traversal depth in the direction of references and in the direction of citations. Include intermediate data structures that allow recovery in case of interruption (i.e. a deque of unscraped paperIds, a set of scraped paperIds, a set of papers with errors, and a set of papers not found). Decide on the final data structure of the citation graph, how it's stored/saved, and possibly an HTML visualization.
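
A rough sketch of the traversal state described above, using a single depth limit for brevity. The `fetch_neighbours` callable is a hypothetical stand-in for whatever pys2 call returns a paper's citations/references:

```python
from collections import deque


def build_citation_graph(root_id: str, max_depth: int, fetch_neighbours):
    """Breadth-first traversal keeping the intermediate state needed for recovery.

    fetch_neighbours(paper_id) is assumed to return a list of neighbouring
    paperIds, raise KeyError for papers not found, and raise other exceptions
    on API errors.
    """
    unscraped = deque([(root_id, 0)])          # paperIds still to visit, with depth
    scraped, errored, not_found = set(), set(), set()
    edges = []                                 # (from_id, to_id) pairs

    while unscraped:
        paper_id, depth = unscraped.popleft()
        if paper_id in scraped or depth > max_depth:
            continue
        try:
            neighbours = fetch_neighbours(paper_id)
        except KeyError:
            not_found.add(paper_id)
            continue
        except Exception:
            errored.add(paper_id)
            continue
        scraped.add(paper_id)
        for neighbour_id in neighbours:
            edges.append((paper_id, neighbour_id))
            unscraped.append((neighbour_id, depth + 1))

    # Returning the intermediate structures lets a caller resume after
    # an interruption instead of re-scraping everything.
    return edges, unscraped, scraped, errored, not_found
```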

Describe alternatives you've considered
NA

Additional context
NA

add logging

Is your feature request related to a problem? Please describe.
Currently, if a response contains an error status code, information about the paperId/authorId is printed. This doesn't work well when performing large numbers of requests over an extended period of time.

Describe the solution you'd like
Use the standard logging module instead of print statements.
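
For example, the print call could be replaced with a module-level logger (a sketch only; the function and argument names below are illustrative, not pys2's actual ones):

```python
import logging

logger = logging.getLogger("pys2")  # module-level logger instead of print()


def report_error_response(paper_id: str, status_code: int) -> None:
    # Log error responses so long-running scrapes leave a persistent,
    # filterable record instead of flooding stdout.
    if status_code != 200:
        logger.warning("request for paperId %s failed with status %s",
                       paper_id, status_code)
```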

Describe alternatives you've considered
NA

Additional context
NA
