reconciler

reconciler is a Python package that reconciles tabular data against reconciliation services such as Wikidata. It works much like OpenRefine does, but entirely within Python, using pandas.

Quickstart

You can install the latest version of reconciler from PyPI with:

pip install reconciler

Then to use it:

from reconciler import reconcile
import pandas as pd

# A DataFrame with a column you want to reconcile.
test_df = pd.DataFrame(
    {
        "City": ["Rio de Janeiro", "São Paulo", "São Paulo", "Natal"],
        "Country": ["Q155", "Q155", "Q155", "Q155"]
    }
)

# Reconcile against type city (Q515), getting the best match for each item.
reconciled = reconcile(test_df["City"], type_id="Q515")

The resulting DataFrame would look like this:

id       match  name            score  type                    type_id   input_value
Q8678    True   Rio de Janeiro  100    city                    Q515      Rio de Janeiro
Q174     True   São Paulo       100    city                    Q515      São Paulo
Q131620  True   Natal           100    municipality of Brazil  Q3184121  Natal
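A common next step is attaching the matched IDs back to the original rows. The sketch below does this with a plain pandas left merge on the input_value column. To stay runnable offline, it does not call reconcile(); the reconciled frame is a hand-built stand-in copied from the table above, and the name annotated is only illustrative.

```python
import pandas as pd

test_df = pd.DataFrame(
    {
        "City": ["Rio de Janeiro", "São Paulo", "São Paulo", "Natal"],
        "Country": ["Q155", "Q155", "Q155", "Q155"],
    }
)

# Stand-in for the output of reconcile(), copied from the table above.
# In real use this frame would come from the reconciliation service.
reconciled = pd.DataFrame(
    {
        "id": ["Q8678", "Q174", "Q131620"],
        "match": [True, True, True],
        "name": ["Rio de Janeiro", "São Paulo", "Natal"],
        "score": [100, 100, 100],
        "type": ["city", "city", "municipality of Brazil"],
        "type_id": ["Q515", "Q515", "Q3184121"],
        "input_value": ["Rio de Janeiro", "São Paulo", "Natal"],
    }
)

# Left merge keeps every original row, including the repeated "São Paulo",
# and copies the matched id, match flag and score onto each of them.
annotated = test_df.merge(
    reconciled[["input_value", "id", "match", "score"]],
    left_on="City",
    right_on="input_value",
    how="left",
).drop(columns="input_value")

print(annotated)
```

Using how="left" means rows without a match would be kept with missing values in the id column rather than dropped, which makes unmatched inputs easy to spot.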

In case you want to ensure the results are cities from Brazil, you can specify the property_mapping argument with a specific property-value pair:

# Reconcile against type city (Q515), keeping items whose country (P17) property equals Brazil (Q155).
reconciled = reconcile(test_df["City"], type_id="Q515", property_mapping={"P17": test_df["Country"]})

Options

The reconcile() function accepts several options.

  • type_id - The type of items to reconcile against per the API specification.
  • top_res - Either the number of results to return per entry or the string 'all' to return all results.
  • property_mapping - A dictionary mapping property IDs to the column of values to filter results on, per the API specification, for example {"P17": test_df["Country"]}.
  • reconciliation_endpoint - The reconciliation service to connect to. Defaults to https://wikidata.reconci.link/en/api.
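To make "per the API specification" more concrete, the sketch below builds the kind of query batch the Reconciliation Service API describes: one entry per value, keyed q0, q1, and so on, each with a query string, a type, and optional properties given as pid/v pairs. The helper build_queries is hypothetical and written only for illustration; it is not part of reconciler, and the package may construct its requests differently.

```python
import json

import pandas as pd


def build_queries(values, type_id, property_mapping=None):
    """Build a reconciliation query batch (hypothetical helper, not reconciler's API)."""
    queries = {}
    for position, (index, value) in enumerate(values.items()):
        query = {"query": value, "type": type_id}
        if property_mapping:
            # Each property column is assumed to share the index of `values`,
            # so the same label picks the filter value for this row.
            query["properties"] = [
                {"pid": pid, "v": column.loc[index]}
                for pid, column in property_mapping.items()
            ]
        queries[f"q{position}"] = query
    return queries


test_df = pd.DataFrame(
    {
        "City": ["Rio de Janeiro", "São Paulo", "São Paulo", "Natal"],
        "Country": ["Q155", "Q155", "Q155", "Q155"],
    }
)

queries = build_queries(
    test_df["City"], type_id="Q515", property_mapping={"P17": test_df["Country"]}
)

# Show the first query as JSON, keeping non-ASCII characters readable.
print(json.dumps(queries["q0"], ensure_ascii=False, indent=2))
```

Seen this way, type_id corresponds to the type field of each query, and each key of property_mapping becomes a pid whose v is taken row by row from the mapped column.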

Other very useful packages

Although my opinion may be biased, I think reconciler is a pretty nice package. But the thing is, it probably won't fulfill all your Wikidata-related needs. Here are other packages that could help with that:

  • WikidataIntegrator has a lot of very nice low-level functions for dealing with various Wikidata-related activities, such as item acquisition and programmatic editing.

  • wikidata2df is a very simple utility package for quickly and easily turning Wikidata SPARQL queries into pandas DataFrames.
