photon

photon is an open source geocoder built for OpenStreetMap data. It is based on Elasticsearch, an efficient, powerful and highly scalable search platform.

photon was started by komoot and provides search-as-you-type and multilingual support. Find our public API and demo on photon.komoot.io. Until October 2020 the API was available under photon.komoot.de. Please update your apps accordingly.

Contribution

All code contributions and bug reports are welcome!

For questions please send an email to our mailing list.

Feel free to test and participate!

Licence

photon software is open source and licensed under Apache License, Version 2.0

Features

  • high performance
  • high scalability
  • search-as-you-type
  • multilingual search
  • location bias
  • typo tolerance
  • filter by osm tag and value
  • filter by bounding box
  • reverse geocode a coordinate to an address
  • OSM data import (based on Nominatim) including continuous updates

Installation

photon requires Java, at least version 11.

Download the search index (72 GB compressed, 159 GB uncompressed as of 2023-10-26; worldwide coverage; languages: English, German, French and local name). The search index is updated weekly and thankfully provided by GraphHopper with the support of lonvia. Then get the latest version of photon from the releases.

Make sure you have bzip2 or pbzip2 installed and execute one of these two commands in your shell. This will download, uncompress and extract the huge database in one step:

wget -O - https://download1.graphhopper.com/public/photon-db-latest.tar.bz2 | bzip2 -cd | tar x
# you can significantly speed up extracting using pbzip2 (recommended):
wget -O - https://download1.graphhopper.com/public/photon-db-latest.tar.bz2 | pbzip2 -cd | tar x
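
After extraction you should end up with a photon_data directory in the current working directory; this is the default location where photon looks for its index (see Usage below). As a quick sanity check, assuming the default extraction path:

ls -d photon_data
du -sh photon_data   # should be roughly the uncompressed size given above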

Building

photon uses gradle for building. To build the package from source make sure you have a JDK installed. Then run:

./gradlew app:es_embedded:build

This will build and test photon. The final jar can be found in target.

Experimental OpenSearch version

The repository also contains a version that runs against the latest version of OpenSearch. This version is still experimental. To build the OpenSearch version run:

./gradlew app:opensearch:build

The final jar can be found in target/photon-opensearch-<VERSION>.jar.

Indexes produced by this version are not compatible with the ElasticSearch version. There are no prebuilt indexes available. You need to create your own export from a Nominatim database. See 'Customized Search Data' below.

Usage

Start photon with the following command:

java -jar photon-*.jar

Use the -data-dir option to point to the parent directory of photon_data if that directory is not in the default location ./photon_data. Before you can send requests to photon, Elasticsearch needs to load some data into memory, so be patient for a few seconds.
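
For example, if the index was extracted to /srv/photon/photon_data (a hypothetical path), photon could be started like this:

java -jar photon-*.jar -data-dir /srv/photon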

Check the URL http://localhost:2322/api?q=berlin to see if photon is running without problems. You may want to use our leaflet plugin to see the results on a map.

To enable CORS (cross-site requests), use -cors-any to allow any origin or -cors-origin with a specific origin as the argument. By default, CORS support is disabled.
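
For example (https://example.org is a placeholder; use your own origin):

# allow cross-site requests from any origin
java -jar photon-*.jar -cors-any
# or allow only a specific origin
java -jar photon-*.jar -cors-origin https://example.org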

Discover more of photon's options by running java -jar photon-*.jar -h. The available options are as follows:

-h                    Show help / usage

-cluster              Name of elasticsearch cluster to put the server into (default is 'photon')

-transport-addresses  The comma separated addresses of external elasticsearch nodes to which the
                      client can connect (default is an empty string which forces an internal node to start)

-nominatim-import     Import nominatim database into photon (this will delete previous index)

-nominatim-update     Fetch updates from nominatim database into photon and exit (this updates the index only
                      without offering an API)

-languages            Languages nominatim importer should import and use at run-time, comma separated (default is 'en,fr,de,it')

-default-language     Language to return results in when no explicit language is chosen by the user

-country-codes        Country codes filter that nominatim importer should import, comma separated. If empty full planet is done

-extra-tags           Comma-separated list of additional tags to save for each place

-synonym-file         File with synonym and classification terms

-json                 Import nominatim database and dump it to JSON-like files (useful for development)

-host                 Postgres host (default 127.0.0.1)

-port                 Postgres port (default 5432)

-database             Postgres database name (default nominatim)

-user                 Postgres user (default nominatim)

-password             Postgres password (default '')

-data-dir             Data directory (default '.')

-listen-port          Listen to port (default 2322)

-listen-ip            Listen to address (default '0.0.0.0')

-cors-any             Enable cross-site resource sharing for any origin (default CORS not supported)

-cors-origin          Enable cross-site resource sharing for the specified origins, comma separated (default CORS not supported)

-enable-update-api    Enable the additional endpoint /nominatim-update, which allows triggering updates
                      from a nominatim database

Customized Search Data

If you need search data in other languages or restricted to a country, you will need to create your own search data. Once your Nominatim database is ready, you can import the data into photon.

If you haven't already set a password for your Nominatim database user, do it now (change user name and password as you like, below):

su postgres
psql
ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mysecretpassword';
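
The same can be done in a single command, assuming you can run psql as the postgres system user via sudo:

sudo -u postgres psql -c "ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mysecretpassword';"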

Import the data to photon:

java -jar photon-*.jar -nominatim-import -host localhost -port 5432 -database nominatim -user nominatim -password mysecretpassword -languages es,fr

The import of a worldwide data set will take several hours to days; SSD/NVMe disks are recommended to accelerate Nominatim queries.

Updating from OSM via Nominatim

To update an existing Photon database from Nominatim, first prepare the Nominatim database with the appropriate triggers:

java -jar photon-*.jar -database nominatim -user nominatim -password ... -nominatim-update-init-for update_user

This command must be run as a database user that has the right to create tables, functions and triggers.

'update_user' is the PostgreSQL user that will be used when updating the Photon database. This user needs read rights on the database; the necessary update rights will be granted during initialisation.

Now you can run updates on Nominatim using the usual methods as described in the documentation. To bring the Photon database up-to-date, stop the Nominatim updates and then run the Photon update process:

java -jar photon-*.jar -database nominatim -user nominatim -password ... -nominatim-update

You can also run the photon process with the update API enabled:

java -jar photon-*.jar -enable-update-api -database nominatim -user nominatim -password ...

Then you can trigger updates like this:

curl http://localhost:2322/nominatim-update

This will only start the updates. To check if the updates have finished, use the status API:

curl http://localhost:2322/nominatim-update/status

It returns a single JSON string "BUSY" when updates are in progress or "OK" when another update round can be started.
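
A minimal polling sketch (not part of photon itself), assuming the status endpoint returns the literal JSON string "BUSY" while an update is running:

curl http://localhost:2322/nominatim-update
while [ "$(curl -s http://localhost:2322/nominatim-update/status)" = '"BUSY"' ]; do
    sleep 10
done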

For your convenience, this repository contains a script to continuously update both Nominatim and Photon using Photon's update API. Make sure you have Photon started with -enable-update-api and then run:

export NOMINATIM_DIR=/srv/nominatim/...
./continuously_update_from_nominatim.sh

where NOMINATIM_DIR is the project directory of your Nominatim installation.

Search API

Search

http://localhost:2322/api?q=berlin

Search with Location Bias

http://localhost:2322/api?q=berlin&lon=10&lat=52

There are two optional parameters to influence the location bias. 'zoom' describes the radius around the center to focus on. This is a number that should correspond roughly to the map zoom parameter of a corresponding map. The default is zoom=16.

The location_bias_scale describes how much the prominence of a result should still be taken into account. Sensible values go from 0.0 (ignore prominence almost completely) to 1.0 (prominence has approximately the same influence). The default is 0.2.

http://localhost:2322/api?q=berlin&lon=10&lat=52&zoom=12&location_bias_scale=0.1

Reverse geocode a coordinate

http://localhost:2322/reverse?lon=10&lat=52

An optional radius parameter can be used to specify a value in kilometers to reverse geocode within. The value has to be greater than 0 and lower than 5000.

http://localhost:2322/reverse?lon=10&lat=52&radius=10

Adapt Number of Results

http://localhost:2322/api?q=berlin&limit=2

Adjust Language

http://localhost:2322/api?q=berlin&lang=it

If omitted the 'accept-language' HTTP header will be used (browsers set this by default). If neither is set the local name of the place is returned. In OpenStreetMap data that's usually the value of the name tag, for example the local name for Tokyo is 東京都.
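
For example, instead of the lang parameter the header can be set explicitly:

curl -H 'Accept-Language: it' 'http://localhost:2322/api?q=berlin'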

Filter results by bounding box

Expected format is minLon,minLat,maxLon,maxLat.

http://localhost:2322/api?q=berlin&bbox=9.5,51.5,11.5,53.5

Filter results by tags and values

Note: the filter only works on principal OSM tags; not all OSM tag/value combinations can be searched. The actual list depends on the import style used for the Nominatim database (e.g. settings/import-full.style): all tag/value combinations with the property 'main' are included in the photon database. If one or more query parameters named osm_tag are present, photon will attempt to filter results by those tags. The expected format for the value of an osm_tag parameter is as follows:

  1. Include places with tag: osm_tag=key:value
  2. Exclude places with tag: osm_tag=!key:value
  3. Include places with tag key: osm_tag=key
  4. Include places with tag value: osm_tag=:value
  5. Exclude places with tag key: osm_tag=!key
  6. Exclude places with tag value: osm_tag=:!value

For example, to search for all places named berlin with the tag tourism=museum, construct the URL as follows:

http://localhost:2322/api?q=berlin&osm_tag=tourism:museum

Or, just by the key:

http://localhost:2322/api?q=berlin&osm_tag=tourism

You can also use this feature for reverse geocoding. Want to see the 5 pharmacies closest to a location?

http://localhost:2322/reverse?lon=10&lat=52&osm_tag=amenity:pharmacy&limit=5
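
Negation (item 5 in the list above) works for search as well; for example, to search for places named berlin while excluding all results whose main tag key is highway:

http://localhost:2322/api?q=berlin&osm_tag=!highway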

Filter results by layer

List of available layers:

  • house
  • street
  • locality
  • district
  • city
  • county
  • state
  • country
  • other (e.g. natural features)

http://localhost:2322/api?q=berlin&layer=city&layer=locality

The example above will return both cities and localities.

Results as GeoJSON

{
  "features": [
    {
      "properties": {
        "name": "Berlin",
        "state": "Berlin",
        "country": "Germany",
        "countrycode": "DE",
        "osm_key": "place",
        "osm_value": "city",
        "osm_type": "N",
        "osm_id": 240109189
      },
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [13.3888599, 52.5170365]
      }
    },
    {
      "properties": {
        "name": "Berlin Olympic Stadium",
        "street": "Olympischer Platz",
        "housenumber": "3",
        "postcode": "14053",
        "state": "Berlin",
        "country": "Germany",
        "countrycode": "DE",
        "osm_key": "leisure",
        "osm_value": "stadium",
        "osm_type": "W",
        "osm_id": 38862723,
        "extent": [13.23727, 52.5157151, 13.241757, 52.5135972]
      },
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [13.239514674078611, 52.51467945]
      }
    }
  ]
}

Structured queries

The OpenSearch-based version of photon has opt-in support for structured queries. See docs/structured.md for details. Please note that structured queries are disabled on photon.komoot.io.


photon's Issues

Returning weak partial matches

Sometimes one needs to do a geocoding lookup on an address that contains irrelevant data, such as an unknown condominium name. For example when searching for "Grand Parkview Asoke Unit 233/11 Sukhumvit 21" then none of the information is in OSM's database other than "Sukhumvit 21".

I would consider this an example of a weak partial match. For my use case, such irrelevant data needs to be ignored and therefore partial matches need to be returned too. E.g. http://photon.komoot.de/api/?q=grand%20parkview%20asoke%20sukhumvit%2021 should return the same as http://photon.komoot.de/api/?q=20sukhumvit%2021

Is it possible to perhaps add a score to each returned feature, such that one can programmatically determine if the result's score is high enough to be displayed? This varies by application.

create a load test

check the performance and find bottlenecks of photon to evaluate usage on osm.org

maybe sarah can provide some logs of search terms from nominatim. at komoot we also have logs of this kind

Feature request: bounding box in results

To show the results for areas at the correct scale it would be helpful if the bounding box could be provided directly as part of the result. With the current osm webservices one would either request the full osm data via overpass/xapi or try to get the matching result from nominatim (which unfortunately does not support query by osm_id)

Wrong info in README

In the README it is indicated that the demo UI is in src/main/python/demo, but it seems to be in website/. Or is that not the demo UI in that directory?

[Question] Location bias

How does location bias work?

On http://photon.komoot.de/api results seem to be only loosely influenced by different lat|lon parameters, and some results that are much farther from the given point come before others that are much nearer.

Is it possible to order results based on distance from the point given as lat|lon?

Would it be possible to provide a bounding box to search, so that search wouldn't return results outside that box?

mittelberg

searching for the village mittelberg reveals a lot of places in mittelberg but the actual village cannot be found.

cannot find small housenumbers in type-ahead mode

when searching for street + housenumber, e.g. bödmerstraße 7 the right address can be found because both tokens can be found in collector's raw field.

searching for bödmerstr 7 won't be successful. bödmerstr is found in the edgengram field but 7 got stripped because edgengram starts with 2 characters:

"photonngram": {
    "type": "edgeNGram",
    "min_gram": 2,
    "max_gram": 15
}

Allow searching on osm_key and osm_value

There should be another filter parameter in the API to search in a specific area (bounding box) for a specific tag combination or even a set of combinations ('I want food').

For that osm_key and osm_value have to be indexed in the mapping. Or maybe we make this configurable if this increases size too much.

spatial extent of search item

when searching for extended items like germany you will see a marker in a wood at a very high zoom level.

currently only the centroid of the geometry is stored. if we stored the extent of items too, we could adjust the zoom level to show the entire item on the map.

Use alias 'photon' instead of index

Instead of creating an index called 'photon', it should be considered to create an index called e.g. 'photon_1' or 'photon_' and then add an alias called 'photon'. This would make it easy in production to feed a new index without touching the old one and to switch quickly. Not sure if this should be part of this project, but it is an easy change and increases flexibility.

Improve performance of Nominatim importer

Possible changes to the importer to speed up the process and reduce IO load:

  • parallelize nominatim export and elastic search import
  • query place_addressline directly instead of using get_addressdata
  • cache country names for context and use calculated_country_code for lookups
  • reuse address for places with rank_search > 27 and same parent

Data import log

I wanted to try photon with Italy data. I started the import yesterday and it is not over yet: how do I see how far along it is? Is there an import log available?
Thank you
Alessandra

[Question] Getting started

How does one import data without installing Nominatim? The readme says

curl http://localhost:4567/dump/import # not working yet!

so what should one do...?

osm2geojson

You mention in the readme that it can take up to 12 days to create the nominatim database dump from scratch. I spent a bit of time recently coming up with a way to convert the osm planet xml dump to a joined json data set that you can build in around 15 hours. All it does is join nodes, ways, and relations, but from there it should be quite easy to extract whatever information you need without having to set up nominatim.

I've actually been considering doing a little geocoder on top of elastic search based on this so I'm quite interested in your experiences with photon.

If you are interested, the project for my conversion tool is here: https://github.com/jillesvangurp/osm2geojson

Missing fields in schema.xml

When trying to upload Iceland data I got some errors about missing fields. I had to add these fields to the schema:

Probably I made some errors during the installation.

Provide filters

It always depends on how the search mechanism should work, but from my point of view filters could be generally very useful (and would be for our case).

Filters:
Provide a way to filter results, e.g.

q=Hauptstraße&city=Salzburg

and the result will only show results from the city of Salzburg.

and as an idea:

q=Hauptstraße&city=Salzburg&city:mode=fuzzy

this result will also include fuzzy matches, like
Salzberg and Salzbrug (typo).

Filters could be:

  • city
  • country
  • postcode (e.g. 5020 with high fuzziness would match 5* and low fuzziness 502?)
  • name
  • osm_key (or generalized version)
  • osm_value (or generalized version)

Why:
Often, when searching for addresses, the user is confronted with partial and incorrect addresses. The only thing they know is a part of the name and which city or part of the city the address is in. This would help greatly for people who have to research addresses.

For frontend-development (i.e. the komoot main site), the query could be totally adjusted to the needs or even user preferences just by the passed parameters.

Additionally, as a crazy idea (don't know if it is possible with elasticsearch):
Provide a "near" parameter, so you have:

q=Haubdstr&near=(Salzburg  OR lat/lon)

and the search term can match less strictly the nearer a result is to the point provided by "near".

Some entries are 'null', should be non-existing or contain a value

Is it expected that some entries don't have a 'default' or 'en' entry? Even worse they contain 'null' (not printed here as gson is too clever)

{
  "_index": "photon",
  "_type": "place",
  "_id": "1406467",
  "_score": 10.824916,
  "_source": {
    "osm_key": "place",
    "city": {
      "default": "Frankfort"
    },
    "osm_value": "village",
    "osm_id": "306444989",
    "ranking": 11,
    "context": {
      "default": "Saint Mary, Middlesex County"
    },
    "id": "1406467",
    "country": {
      "de": "Jamaika",
      "it": "Giamaica",
      "fr": "Jamaïque"
    },
    "coordinate": "18.418850,-77.053417",
    "name": {
      "default": "Frankfort"
    }
  }
}

provide data dumps

installing nominatim and importing osm files is time consuming. people who don't care about continuous updates will be very happy if we provide data dumps (world / country extracts in most common languages). setting up photon will then be dead easy and fast.

Wordending

Using wordending is kind of a workaround for edge ngram searches like

berlin erlange

which would match berlinerstraße erlangen but should preferably only match things like 'berlin erlange*'.

When this workaround is used, why not avoid edge ngram at all and tokenize the query, plus do a prefix query for the last term? This would save space and memory with the same quality. The only problem could be performance, but my simple tests on small data don't show any problems there.

Many address features are missing?

Address geocoding on a worldwide level is complicated for many reasons, of which one is that every single country lists addresses in different ways. For example, in some countries provinces and states are important (Netherlands), whereas in others districts and sub-districts are more important (Thailand).

When I query your API for a road in Bangkok, Thailand, like so:

http://photon.komoot.de/api/?q=sukhumvit

I get the following:

    {
      "features": [
        {
          "properties": {
            "name": "Sukhumvit",
            "osm_value": "neighbourhood",
            "country": "Thailand",
            "osm_id": 2203974487,
            "osm_key": "place",
            "postcode": "10110"
          },
          "geometry": {
            "coordinates": [
              100.565073,
              13.73384
            ],
            "type": "Point"
          },
          "type": "Feature"
        },
        {
          "properties": {
            "country": "Thailand",
            "name": "Sukhumvit Road",
            "osm_value": "tertiary",
            "street": "Sukhumvit Road",
            "osm_id": 232865089,
            "osm_key": "highway"
          },
          "geometry": {
            "coordinates": [
              102.45796,
              12.18993
            ],
            "type": "Point"
          },
          "type": "Feature"
        },
        ... ETC ...
      ],
      "type": "FeatureCollection"
    }

The first entry is the one I need, but the problem is that there is a lot more data in the OSM database about this entry than your API shows. For example, it should return the following entries when available, just like Nominatim does:

- (Object) administrative
- (Object) attraction
- (Object) city
- (Object) city_district
- (Object) clothes
- (Object) commercial
- (Object) country
- (Object) country_code
- (Object) county
- (Object) house_number
- (Object) pedestrian
- (Object) place
- (Object) postcode
- (Object) road
- (Object) state
- (Object) state_district
- (Object) suburb
- (Object) town
- (Object) village

In the case of my query we already know this data: postcode = 10110 (given), country = thailand (given), district = Watthana District (not given, but should be), sub-district = Sukhumvit (not given, but should be), city = Bangkok, province = Bangkok Metropolitan Area (not given, but should be), etc.

Can I make this change myself to let photon return all known data that is relevant to the query? If I would have queried a US street I want to know all data that is available too, so in that case it should return the state, among other fields.

I hope I expressed my request clearly, if not, please tell me. ;)

Proposal: Don't start server after import/dump actions

Right now, the server keeps running after import and dump actions.

This might not be the desired behavior all the time. How about one of these solutions:

1.) Introduce a "-shutdown" parameter, that shuts down photon after executing imports/export/indexing.

2.) Shut down automatically after one of the i/o actions.

I would refactor the code in App.java and send it as a pull request if so desired. Also I can add more description to the docs about the behavior.

500 Internal Error

Hi there,

I have imported a sub-region from geofabrik.de, imported it into the nominatim database and finally started photon. Everything works quite charmingly; unfortunately the query localhost:2322/api?q=Gutenberg results in a "500 Internal Error". The console output is:
"Error spark.webserver.MatchFilter - java.lang.NullPointerException"

I have no idea whats wrong, do you have any suggestions?

Cheers
Ivan

US postcodes are imported even for non-US instances

This is only reproducible on local installations:

Some queries include, after their normal results, several entries with completely wrong data. E.g. using an extract of Austria, the query

http://localhost:2322/api?lang=de&q=Kaisersch%C3%BCtzenstra%C3%9Fe%2017

returns normal results and afterwards there are several fields:

{
            "properties": {
                "osm_id": 31344,
                "postcode": "17503",
                "osm_value": "postcode",
                "country": "Vereinigte Staaten von Amerika"
            },
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    -76.08483701582313,
                    39.93753065153189
                ]
            }
        },

Which shouldn't even be in the extract. Is there an error in the extraction, or does photon "invent" these results?

how does search as you type work?

Could you provide an elaboration on the search as you type feature? Does it create a new query for every word that is inserted?

Moreover, is there an online demo available that I could check out?

Any feedback is highly appreciated.

Tom

Integration external test cases

there are a bunch of existing test cases (from nominatim, komoot, ...) that can be integrated into our own test framework to help us find bugs and improvements

Decompound words

One more normalization has to be done to improve search. E.g. Erlangerstraße will be split into "Erlanger straße". This has to be done while indexing and searching.

There is a plugin but it is GPL due to one library it uses; it could be less restrictive but the Python code would have to be ported to Java: jprante/elasticsearch-analysis-decompound#5

For POIs there is also often the Bahnhof vs. Hauptbahnhof problem. But a main railway station in a different country that is not named like this should probably also be found. This should probably be handled via a different fix: #318

Build failure / curly single quote problem

While trying to build photon I get the following build failure

/photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[131,17] error: unmappable character for encoding ASCII

The same error appears at other places in the same file; it would seem the character used in the source for the single quote in error messages ("Can´t...") is not mappable in ASCII: probably a curly quote instead of a straight single quote?

Would it be possible to search/replace that character in the source?

I think "Can't" would work, but "Can´t" can't.

Full stack trace below:

[DEBUG] Source roots:
[DEBUG]  /photon/src/main/java
[INFO] Compiling 16 source files to /photon/target/classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.332s
[INFO] Finished at: Wed May 28 09:05:31 UTC 2014
[INFO] Final Memory: 15M/481M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project photon: Compilation failure: Compilation failure:
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[131,17] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[131,18] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[205,17] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[205,18] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[230,17] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[230,18] error: unmappable character for encoding ASCII
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project photon: Compilation failure
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
        at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
        at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Caused by: org.apache.maven.plugin.CompilationFailureException: Compilation failure
        at org.apache.maven.plugin.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:516)
        at org.apache.maven.plugin.CompilerMojo.execute(CompilerMojo.java:114)
        at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
        ... 19 more
[ERROR]
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

Clarifications in readme: Import Hardware

Hi,

You mention in readme "It takes up to 10 days and sufficient RAM to import the entire world,".

Could you define "sufficient RAM"?
I may be about to rent a new server for this purpose (unless "sufficient" is 300GB...)

Phonetic analyzer

One could try this analyzer. Not sure if this is necessary if fuzzy is enabled. E.g. fuzzy is already able to find strasse vs. straße

Support arbitrary languages

Photon should be able to do searches over all available language variants mapped in OSM (i.e. all name:* tags). Bonus points if it can improve search results by guessing the language of the query correctly and re-weighting the results accordingly.

allow typos

currently small typos like jamiaca instead of jamaica return no result. typos are very common in the world of search engines; there are surely already best practices to solve this with elasticsearch.

test fails

QueryParsingException[[photon] script_score the script could not be loaded

Where is this script?

Support multiple names

Sometimes OSM items have more than one name which are stored in tags like int_name, alt_name, loc_name, official_name ...

photon would benefit from considering these names too

unknown -l option

Hi,

I tried to use NominatimImporter with the options described in the documentation sample, but it seems that the -l option doesn't exist (reading the source code I can't find it).

greetings
Mirko

Importer > Import "importance" Nominatim field

The Nominatim importance field is not imported from Nominatim to Solr.

I think it should be available to be used in boost functions in some cases:

I plan to use photon with a map and sort search results according to map bounds/center.
But...
I would like queries like "Eiffel tower" or "Statue of Liberty" to push the original places first or at least fairly up in the search results (I can accept that if a replica is in or near map bounds, it shows up first).

I believe this could be accomplished by introducing importance field from nominatim and a custom query.

  • Am I wrong ?
  • Would the drawback on index size be worth it for you ?
  • The importance field is mapped in the NominatimEntry model but does not make it to the Solr index.
    Any historical reason?

I think I can deal with the solr query, but the import part is tougher for me, since I am not exactly an experienced Java developer...
I'll give it a try (unless you think I am totally wrong), but if anyone feels like they can achieve this "easily" they're welcome.

Importing other sources

I would like at some time to import other sources than OSM into Photon, like the datasets from openaddresses.io or the BANO project, as soon as the licence is compliant.
This will need some changes in the mapping, like having source and source_id keys (instead of just osm_id)

And in the importer, we will need to find a way not to import the data from OSM where it would duplicate those other sources (which we think are "better" than OSM if we chose to import them), like skipping all place=house if country=France, etc.

I will work on importing some BANO data, to have a first use case that we can discuss.

core.properties

I am trying to install photon on Ubuntu. I have manually installed SOLR and tried to load the photon config. SOLR is not able to load collection1 without the core.properties file.
