Comments (7)
Great catch, I hadn't looked at the title portion in a while. Would you like to open a new issue so that we can track this? Thanks!
from mediawiki.
@lebedov Thank you for reporting this issue. I hope that pymediawiki has been a useful library.
The issue, upon a cursory look, seems to stem from the parsing of disambiguation details. I am working on a fix and hope to push it to PyPI later today or over the holidays.
@lebedov: this should be resolved in version 0.3.16. You can get the latest version using pip:
pip install pymediawiki --upgrade
Please let me know how pymediawiki is working for you and if you run into any other issues!
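To confirm that the upgrade picked up the fixed release, something like the following could be run afterwards. This is only a sketch: `parse_version` and `is_at_least` are hypothetical helpers written for this example, and the comparison assumes purely numeric dotted versions such as 0.3.16.

```python
from importlib import metadata


def parse_version(text):
    """Turn a dotted version such as '0.3.16' into a tuple of ints.

    Assumption: the version contains only numeric parts separated by dots.
    """
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed, required="0.3.16"):
    """Return True if `installed` is the same as or newer than `required`."""
    return parse_version(installed) >= parse_version(required)


try:
    # Look up the installed distribution by its PyPI name.
    installed = metadata.version("pymediawiki")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("pymediawiki is not installed")
else:
    print(installed, "includes the fix:", is_at_least(installed))
```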
Looks good. Thanks for the quick fix!
Hi, thanks for all your work on this project! It has been very handy.
I think there might still be a small bug associated with this issue/associated PR.
243e7a0#diff-c017063330dbc8ad5b88a8c5049c389bR568
Following the link above to the code raising a DisambiguationError, I think the new line `if item and hasattr(item, 'title'):` will always evaluate to False, because `item` is what's returned by a BeautifulSoup `find_all()` call, which is a list. Should it be `if item and hasattr(item[0], 'title'):` instead?
This led to some confusion on my end, because I've always been seeing `disambiguation['title']` equal to the full text of the `li` rather than just the title of the link.
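The concern above can be illustrated with a minimal sketch that uses only the standard library. `FakeLink` is a hypothetical stand-in for a parsed link element, and a plain list stands in for the list-like result of `find_all()`; neither is pymediawiki's or BeautifulSoup's actual code.

```python
class FakeLink:
    """Hypothetical stand-in for a parsed <a> element (not a BeautifulSoup class)."""

    def __init__(self, title):
        self.title = title


# Assumption: a plain list stands in for the list-like result of find_all().
items = [FakeLink("Ford's Theatre")]

# Checking the list itself: a list has no 'title' attribute.
list_has_title = bool(items) and hasattr(items, "title")

# Checking the first element instead, as proposed in the comment above.
first_has_title = bool(items) and hasattr(items[0], "title")

print(list_has_title)   # False
print(first_has_title)  # True
```

With real BeautifulSoup elements, attribute access may not behave like it does on plain Python objects, so reading the HTML attribute explicitly (for example with the element's `.get('title')`) might be a more robust check. That is a suggestion, not a description of what the library does.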
@awbirdsall can you provide an example where this is not working correctly? For the pages I am testing, I am seeing what I expect to see. Thanks!
@barrust: here's an example for "Lincoln Museum" (https://en.wikipedia.org/wiki/Lincoln_Museum):
from mediawiki import MediaWiki, DisambiguationError

wikipedia = MediaWiki()
try:
    p = wikipedia.page("Lincoln Museum")
except DisambiguationError as d:
    print(d.details[5]['title'])
    print(d.details[5]['description'])
I'd expect the `title` here to be only "Ford's Theatre" since that's the title of the link, but instead the `title` is the same as the full `description`
(both are Ford's Theatre, Washington, DC, USA — where Abraham Lincoln was assassinated; known as Lincoln Museum from 1936 to 1965 and legally "Ford's Theater (Lincoln Museum)" since 1965
).
In [8]: print(mediawiki.__version__)
0.6.0
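The distinction being reported, the link's title versus the full text of the list item, can be sketched with the standard library's `html.parser`. The markup below is a hypothetical, shortened version of the entry quoted above, and `FirstLinkTitle` is an illustrative helper, not how pymediawiki parses disambiguation pages.

```python
from html.parser import HTMLParser


class FirstLinkTitle(HTMLParser):
    """Collects the title attribute of the first <a> tag and all text content."""

    def __init__(self):
        super().__init__()
        self.link_title = None
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Only record the title of the first link encountered.
        if tag == "a" and self.link_title is None:
            self.link_title = dict(attrs).get("title")

    def handle_data(self, data):
        # Accumulate every piece of text inside the list item.
        self.text_parts.append(data)


# Hypothetical markup modelled on the list item discussed above (shortened).
html = (
    '<li><a href="/wiki/Ford%27s_Theatre" title="Ford\'s Theatre">'
    "Ford's Theatre</a>, Washington, DC, USA</li>"
)

parser = FirstLinkTitle()
parser.feed(html)
parser.close()

title = parser.link_title
description = "".join(parser.text_parts).strip()

print(title)        # Ford's Theatre
print(description)  # Ford's Theatre, Washington, DC, USA
```

Here the title comes from the link's `title` attribute while the description is the full text of the `li`, which matches the behavior the comment above expects.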