Comments (9)
I ended up switching to a different library for my project, but for others' use it might help to understand which aspect of the Colab is outdated.
The package is installed using the setup instructions from the readme.
Is this the part that's outdated?
The method used is the first method mentioned under basic usage in the README.
Is that method no longer used?
from geograpy3.
Confirmed that it is not an environment issue with a minimal reproduction on Colab: https://colab.research.google.com/drive/1iZ8Qk8p044jF1hWNDxQjFPZ_MruOefNW?usp=sharing
The same thing also happens with:
places = geograpy.get_geoPlace_context(text="England")
print(places)
Output:
countries=['United States of America']
regions=[]
cities=['England']
other=[]
One more case of odd mapping to the United States:
places = geograpy.get_geoPlace_context(text="Republic of Korea")
print(places)
Output:
countries=['United States of America']
regions=[]
cities=['Republic', 'Korea']
other=[]
You are using an outdated API.
No, that method is outdated and unreliable. A more reliable approach is to combine Nominatim access with geograpy's details about countries and regions, together with a "voting" mechanism. See, e.g., the code below, which uses
locations=self.locationContext.locateLocation(locationText)
'''
Created on 2021-08-11
@author: wf
'''
#from lodstorage.entity import EntityManager
from geograpy.locator import LocationContext
from geograpy.nominatim import NominatimWrapper
import sys


class LocationLookup:
    '''
    lookup locations
    '''
    # maps a location text to a Wikidata ID; None marks a known non-place
    predefinedLocations={}

    @classmethod
    def initPredefinedLocations(cls):
        '''
        fill the predefined location map
        '''
        locMap={
            "Not Known": None,
            "Online": None,
            "Virtual": None,
            "Virtual, USA": None,
            "Virtual Event, USA": None,
            "Amsterdam": "Q727",
            "Amsterdam, Amsterdam": "Q727",
            "Amsterdam Netherlands": "Q727",
            "Amsterdam, Netherlands": "Q727",
            "Amsterdam, The Netherlands":"Q727",
            "Amsterdam The Netherlands": "Q727",
            "Będlewo, Poland": "Q5005546",
            "Bergen, Norway":"Q26793",
            "Bremen, Germany": "Q24879",
            "Brussels, Belgium": "Q239",
            "Brussels Belgium": "Q239",
            "Cancun, Mexico":"Q8969",
            "Cancún, Mexico": "Q8969",
            "Gdansk, Poland":"Q1792",
            "Heraklion, Crete, Greece":"Q160544",
            "Красноярск": "Q919",
            "Luxembourg, Luxembourg":"Q1842",
            "Macau, Macau, China":"Q14773",
            "Marina del Rey, CA": "Q988140",
            "Monterrey, Mexico":"Q81033",
            "Москва": "Q649",
            "New Delhi": "Q987",
            "New Delhi, India": "Q987",
            "Новосибирск":"Q883",
            "Salamanca, Spain": "Q15695",
            "Skovde, Sweden": "Q21166",
            "St. Petersburg": "Q656",
            "Санкт-Петербург": "Q656",
            "Saint-Petersburg, Russia":"Q656",
            "Thessaloniki": "Q17151",
            "Thessaloniki, Greece": "Q17151",
            "Trondheim, Norway":"Q25804",
            "Valencia": "Q8818",
            "Valencia, Spain": "Q8818",
            "Valencia, Valencia, Spain": "Q8818",
            "York, UK":"Q42462"
        }
        # expand each city into the spelling variants seen in the input data
        for city,region,regionCode,countryCode,country,wikiDataId in [
            ("Albuquerque","New Mexico","NM","USA","United States","Q34804"),
            ("Alexandria","Virginia","VA","USA","United States","Q88"),
            ("Cambridge",None,None,"UK","United Kingdom","Q21713103"),
            ("Cambridge","Massachusetts","MA","USA","United States","Q49111"),
            ("Charleston","South Carolina","SC","USA","United States","Q47716"),
            ("Lake Louise","Alberta","AB","CA","Canada","Q12826048"),
            ("Los Angeles","California","CA","USA","United States","Q65"),
            ("Miami Beach","Florida", "FL", "USA","United States","Q201516"),
            ("Montreal","Quebec","QC","CA","Canada","Q340"),
            ("Montréal","Quebec","QC","CA","Canada","Q340"),
            ("New Brunswick","New Jersey","NJ","USA","United States","Q138338"),
            ("New Port Beach","California","CA","USA","United States","Q268873"),
            ("Newport Beach","California","CA","USA","United States","Q268873"),
            ("New Orleans","Louisiana","LA","USA","United States","Q34404"),
            ("New York","New York","NY","USA","United States", "Q60"),
            ("Palo Alto","California","CA","USA","United States","Q47265"),
            ("Pasadena","California","CA","USA","United States","Q485176"),
            ("Phoenix","Arizona","AZ","USA","United States", "Q16556"),
            ("San Diego", "California","CA","USA","United States", "Q16552"),
            ("San Francisco","California","CA","USA","United States", "Q62"),
            ("San Jose","California","CA","USA","United States", "Q16553"),
            ("Santa Barbara","California","CA","USA","United States", "Q159288"),
            ("Santa Fe","New Mexico","NM","USA","United States","Q38555"),
            ("Snowbird","Utah","UT","USA","United States","Q3487194"),
            ("St. Louis","Missouri","MO","USA","United States", "Q38022"),
            ("Toronto","Ontario","ON","CA","Canada", "Q172"),
            ("Waikiki, Honolulu","Hawaii","HI","USA","United States","Q254861")
        ]:
            terms=[
                f"{city}, {country}",
                f"{city}, {countryCode}"
            ]
            if region is not None:
                terms.extend([
                    f"{city}, {region}",
                    f"{city} {region}",
                    f"{city}, {regionCode}",
                    f"{city} {regionCode}",
                    f"{city}, {region}, {country}",
                    f"{city} {region} {country}",
                    f"{city}, {region}, {countryCode}",
                    f"{city} {region} {countryCode}",
                    f"{city}, {regionCode}, {country}",
                    f"{city} {regionCode} {country}",
                    f"{city}, {regionCode}, {countryCode}",
                    f"{city} {regionCode} {countryCode}"
                ])
            for term in terms:
                locMap[term]=wikiDataId
        # further mappings; lookup() below does not consult them
        cls.other={
            "Washington, DC, USA": "Q61",
            "Bangalore": "Q1355",
            "Bangalore, India": "Q1355",
            "Xi'an": "Q5826",
            "Xi'an, China": "Q5826",
            "Virtual Event USA": "Q30",
            "Virtual USA": "Q30",
            "London United Kingdom": "Q84",
            "Brno":"Q14960",
            "Cancun":"Q8969",
            "Gothenburg Sweden": "Q25287",
            "Zurich, Switzerland": "Q72",
            "Barcelona Spain": "Q1492",
            "Vienna Austria": "Q1741",
            "Seoul Republic of Korea": "Q8684",
            "Seattle WA USA": "Q5083",
            "Singapore Singapore":"Q334",
            "Tokyo Japan": "Q1490",
            "Vancouver BC Canada": "Q24639",
            "Vancouver British Columbia Canada": "Q24639",
            "Paris France": "Q90",
            "Nagoya": "Q11751",
            "Marrakech":"Q101625",
            "Austin Texas":"Q16559",
            "Chicago IL USA":"Q1297",
            "Bangkok Thailand":"Q1861",
            "Firenze, Italy":"Q2044",
            "Florence Italy":"Q2044",
            "Timisoara":"Q83404",
            "Langkawi":"Q273303",
            "Beijing China":"Q956",
            "Berlin Germany": "Q64",
            "Prague Czech Republic":"Q1085",
            "Portland Oregon USA":"Q6106",
            "Portland OR USA":"Q6106",
            "Pittsburgh PA USA":"Q1342",
            "Новосибирск":"Q883",
            "Los Angeles CA USA":"Q65",
            "Kyoto Japan": "Q34600"
        }
        cls.predefinedLocations=locMap

    def __init__(self):
        '''
        Constructor
        '''
        LocationLookup.initPredefinedLocations()
        self.locationContext=LocationContext.fromCache()
        cacheRootDir=LocationContext.getDefaultConfig().cacheRootDir
        cacheDir=f"{cacheRootDir}/.nominatim"
        self.nominatimWrapper=NominatimWrapper(cacheDir=cacheDir)

    def getCityByWikiDataId(self,wikidataID:str):
        '''
        get the city for the given wikidataID
        '''
        citiesGen=self.locationContext.cityManager.getLocationsByWikidataId(wikidataID)
        if citiesGen is not None:
            cities=list(citiesGen)
            if len(cities)>0:
                return cities[0]
        return None

    def lookupNominatim(self,locationText:str):
        '''
        lookup the location for the given locationText (if any)
        Args:
            locationText(str): the location text to search for
        Return:
            the location by first finding the wikidata id for the location text and then looking up the location
        '''
        location=None
        wikidataId=self.nominatimWrapper.lookupWikiDataId(locationText)
        if wikidataId is not None:
            location=self.getCityByWikiDataId(wikidataId)
        return location

    def lookup(self,locationText:str,logFile=sys.stdout):
        '''
        lookup a location based on the given locationText
        Args:
            locationText(str): the location to lookup
            logFile: the logFile to use - default is sys.stdout
        '''
        # predefined entries take precedence over both services
        if locationText in LocationLookup.predefinedLocations:
            locationId=LocationLookup.predefinedLocations[locationText]
            if locationId is None:
                return None
            else:
                location=self.getCityByWikiDataId(locationId)
                if location is None:
                    print(f"❌❌-predefinedLocation {locationText}→{locationId} wikidataId not resolved",file=logFile)
                return location
        lg=self.lookupGeograpy(locationText)
        ln=self.lookupNominatim(locationText)
        # reject the result when both services answer but disagree
        if ln is not None and lg is not None and not ln.wikidataid==lg.wikidataid:
            print(f"❌❌{locationText}→{lg}!={ln}",file=logFile)
            return None
        return lg

    def lookupGeograpy(self,locationText:str):
        '''
        lookup the given location by the given locationText
        '''
        locations=self.locationContext.locateLocation(locationText)
        if len(locations)>0:
            return locations[0]
        else:
            return None
Hi, I'm not sure I follow the example provided. For example, what are the locMap and other dictionaries provided for?
The predefinedLocations are completely optional; they try to fix cases that were not properly solved by the voting process at the time the code was written. The key is to ask different services and find out whether they agree. You might want to take a more black-box view of the code, although I provided all the details.
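The "ask different services and see whether they agree" idea can be sketched without geograpy at all. The following is only an illustration under stated assumptions: agreed_id, gazetteer and geocoder are hypothetical names, and the two dictionaries merely stand in for geograpy and Nominatim. The Wikidata IDs are the ones listed in the code above.

```python
from typing import Callable, Optional

# A resolver maps free text to a Wikidata ID such as "Q727", or to None.
Resolver = Callable[[str], Optional[str]]


def agreed_id(text: str, primary: Resolver, secondary: Resolver) -> Optional[str]:
    """Return the primary answer unless the secondary resolver contradicts it."""
    first = primary(text)
    second = secondary(text)
    if first is not None and second is not None and first != second:
        # both services answered but disagree: treat the text as unresolved
        return None
    return first


# Hypothetical stand-ins for the two services (not real geograpy/Nominatim calls)
gazetteer: Resolver = {"Amsterdam": "Q727", "Cambridge": "Q49111"}.get
geocoder: Resolver = {"Amsterdam": "Q727", "Cambridge": "Q21713103"}.get

print(agreed_id("Amsterdam", gazetteer, geocoder))
print(agreed_id("Cambridge", gazetteer, geocoder))
```

Here "Amsterdam" resolves because both stand-ins return the same ID, while "Cambridge" is rejected because one points to Massachusetts and the other to the United Kingdom.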
Location Lookup Overview
The LocationLookup class is designed to handle the complexities of geographical location determination. It leverages predefined mappings and external services to ensure accurate and reliable location data.
Class Diagram (image not available)
Diagram Description
- LocationLookup: Main class that provides methods to initialize location settings, perform location lookups using predefined data, and integrate results from different geolocation services.
- LocationContext: Provides context-specific configuration and caching mechanisms.
- NominatimWrapper: Interfaces with the Nominatim API to fetch geographical data based on location text.
Core Functionalities
- initPredefinedLocations(): Sets up predefined location data to handle known edge cases.
- lookup(): Main method to perform location lookups. It cross-references multiple data sources to confirm location accuracy.
- lookupNominatim() and lookupGeograpy(): These methods are used to fetch location data from specific APIs and compare their results for consistency and accuracy.
The diagram illustrates how LocationLookup interacts with LocationContext and NominatimWrapper to provide comprehensive location resolution services. It is designed to handle ambiguities and discrepancies in location data retrieval, ensuring high reliability.
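The role of the predefined data can be shown in isolation: known spellings are expanded into lookup keys, and a key found there short-circuits any service call. This is a simplified sketch, not the class from the thread; spelling_variants, build_overrides and resolve are hypothetical names, and the fallback is a placeholder for whatever service lookup is used.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Row = Tuple[str, Optional[str], str, str]  # city, region, country, Wikidata ID


def spelling_variants(city: str, region: Optional[str], country: str) -> List[str]:
    """Generate a few comma- and space-separated spellings of one place."""
    variants = [f"{city}, {country}", f"{city} {country}"]
    if region is not None:
        variants += [
            f"{city}, {region}",
            f"{city} {region}",
            f"{city}, {region}, {country}",
            f"{city} {region} {country}",
        ]
    return variants


def build_overrides(rows: Iterable[Row]) -> Dict[str, Optional[str]]:
    """Map every spelling variant to its Wikidata ID; None marks a non-place."""
    overrides: Dict[str, Optional[str]] = {"Online": None, "Virtual": None}
    for city, region, country, wikidata_id in rows:
        for variant in spelling_variants(city, region, country):
            overrides[variant] = wikidata_id
    return overrides


def resolve(text: str,
            overrides: Dict[str, Optional[str]],
            fallback: Callable[[str], Optional[str]]) -> Optional[str]:
    """Predefined entries win; only unknown texts reach the fallback."""
    if text in overrides:
        return overrides[text]
    return fallback(text)


overrides = build_overrides([("Santa Fe", "New Mexico", "United States", "Q38555")])
print(resolve("Santa Fe New Mexico United States", overrides, lambda t: None))
print(resolve("Online", overrides, lambda t: "never called"))
```

Storing None for entries such as "Online" lets the lookup distinguish "known not to be a place" from "not predefined", which is why the membership test comes before the value is read.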
Maybe not a black box view of the code, but this class is not in the documentation and the README shows an example using:
import geograpy
url = 'https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay'
places = geograpy.get_geoPlace_context(url=url)
as previously noted. This is quite different from having to create a class, call two distinct functions, and then compare their respective outputs. A complete example that extracts locations from a website URL or a string would be very helpful.