pymediawiki's Introduction

pymediawiki

A package for extracting the list of categories of a Wikipedia page. It uses the MediaWiki API and aspires to cover all the features listed under MediaWiki API:Properties.

We have a specially curated list of resources for this project in case you are a first-time contributor. Check it out!

NB: This repository is aimed mainly at first-time contributors who are new to GitHub and open source.

pymediawiki's People

Contributors

abhigyank, abinashmeher999, ayuhsya, djr-jsr, ghostwriternr, theskcd

pymediawiki's Issues

Refactor the code to make it easier to extend and add features later

There are many things that can be improved:

  • Create a proper module
  • Do proper exception handling and create new exceptions if needed
  • Currently there is a class, WikiCatQuery, that must be instantiated before making any queries. This design is closed to extension: a new class would have to be added for every new feature, which is wrong. Instead, provide an interface through a WikiPage class, where a new feature only requires adding a method. (Resolved in #19)
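
The interface described in the last point could look roughly like the sketch below. Only the class name WikiPage comes from the issue; the exception classes, the `_query_prop` helper, and the injected `fetch` callable (which takes a dict of query parameters and returns the decoded JSON response) are assumptions made for illustration, not the project's actual code.

```python
class WikiError(Exception):
    """Base class for errors raised by this sketch (hypothetical name)."""


class PageMissingError(WikiError):
    """Raised when the API reports that the requested page does not exist."""


class WikiPage:
    """One object per page; each API property becomes one method."""

    def __init__(self, title, fetch):
        self.title = title
        # fetch is an assumed callable: dict of query params -> decoded JSON dict.
        self._fetch = fetch

    def _query_prop(self, prop, **extra):
        # Shared plumbing: build the query, call the API, collect the items.
        params = {
            "action": "query",
            "format": "json",
            "titles": self.title,
            "prop": prop,
        }
        params.update(extra)
        data = self._fetch(params)
        items = []
        for page in data.get("query", {}).get("pages", {}).values():
            if "missing" in page:
                raise PageMissingError(self.title)
            items.extend(page.get(prop, []))
        return items

    def categories(self):
        return [c["title"] for c in self._query_prop("categories", cllimit="max")]

    def links(self):
        # A new feature is one more method, not one more class.
        return [l["title"] for l in self._query_prop("links", pllimit="max")]
```

Passing `fetch` in from outside keeps the class independent of any particular HTTP library and makes it easy to test with canned responses.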

New name for the project

Now that the project involves more than just categories, I would like to hear suggestions on what it should be named.

Develop caching mechanisms

This would aim to minimize calls to the API for repeated requests. The idea is to reuse information already obtained from the API when the same page is queried again. There are further concerns, such as how long entries should live before the cache expires, which can be discussed here if anyone is interested.
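
As a starting point for that discussion, here is a minimal sketch of an in-memory cache with a fixed expiry time. The class name `TTLCache`, the `get_or_fetch` method, and the default of 300 seconds are all assumptions chosen for illustration; the issue does not prescribe a design.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (hypothetical)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (time stored, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only if needed."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

A key could be something like the pair of page title and property name, so that a second query for the same page's categories is served from memory until the entry expires.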

Set up tests

Use a testing framework and set up simple tests that can be run on Travis CI.
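
A simple test of this kind might look like the following sketch, which uses the standard library's unittest module. The helper `extract_category_titles` is a hypothetical stand-in for whatever parsing code the package ends up with; it is tested against a canned response so that no network access is needed.

```python
import unittest


def extract_category_titles(response):
    """Hypothetical helper: pull category titles out of a decoded API response."""
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        titles.extend(c["title"] for c in page.get("categories", []))
    return titles


class ExtractCategoryTitlesTest(unittest.TestCase):
    def test_returns_titles(self):
        response = {
            "query": {
                "pages": {
                    "1": {"categories": [{"title": "Category:A"}, {"title": "Category:B"}]}
                }
            }
        }
        self.assertEqual(extract_category_titles(response), ["Category:A", "Category:B"])

    def test_empty_response(self):
        self.assertEqual(extract_category_titles({}), [])
```

Tests written this way can be discovered and run with `python -m unittest`, which is also a command a CI job could invoke.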

Add method for each API property

Checklist of all API properties for reference:

  • images
  • linkshere
  • categories
  • categoryinfo
  • contributors
  • deletedrevisions
  • duplicatefiles
  • extlinks
  • fileusage
  • imageinfo
  • info
  • iwlinks
  • langlinks
  • links
  • pageprops
  • redirects
  • revisions
  • stashimageinfo
  • templates
  • transcludedin

Lazily fetch the list of linkshere

The response of linkshere is huge, and it blocks execution until it returns. So instead of setting the limit to max, it would be better to set it to something reasonable and wrap the calls in a generator.
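
The generator idea can be sketched as follows. The function name `iter_linkshere`, the default batch size of 50, and the injected `fetch` callable (dict of query parameters in, decoded JSON out) are assumptions for illustration. The sketch relies on the API's continuation mechanism, where the values under the response's `continue` key are merged into the next request.

```python
def iter_linkshere(title, fetch, batch_size=50):
    """Yield titles of pages linking to `title`, one API batch at a time."""
    params = {
        "action": "query",
        "format": "json",
        "titles": title,
        "prop": "linkshere",
        "lhlimit": batch_size,
    }
    while True:
        # Pass a copy so the caller never sees later mutations of params.
        data = fetch(dict(params))
        for page in data.get("query", {}).get("pages", {}).values():
            for link in page.get("linkshere", []):
                yield link["title"]
        cont = data.get("continue")
        if not cont:
            return  # no continuation data: every batch has been consumed
        params.update(cont)  # e.g. adds lhcontinue for the next request
```

Because this is a generator, no request is made until the first item is asked for, and later batches are fetched only if the caller keeps iterating.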

Set up a virtual environment

An isolated environment that developers can switch to before they start contributing. I would prefer a Python 3 environment. This might also require minor refactoring of the code.
