photon

photon is an open source geocoder built for OpenStreetMap data. It is based on Elasticsearch, an efficient, powerful and highly scalable search platform.

photon was started by komoot and provides search-as-you-type and multilingual support. Find our public API and demo on photon.komoot.io. Until October 2020 the API was available under photon.komoot.de. Please update your apps accordingly.

Contribution

All code contributions and bug reports are welcome!

For questions please send an email to our mailing list.

Feel free to test and participate!

Licence

photon software is open source and licensed under Apache License, Version 2.0

Features

  • high performance
  • high scalability
  • search-as-you-type
  • multilingual search
  • location bias
  • typo tolerance
  • filter by osm tag and value
  • filter by bounding box
  • reverse geocode a coordinate to an address
  • OSM data import (based on Nominatim) including continuous updates

Installation

photon requires Java, at least version 11.

Download the search index (72 GB compressed, 159 GB uncompressed as of 2023-10-26; worldwide coverage; languages: English, German, French and local name). The search index is updated weekly and thankfully provided by GraphHopper with the support of lonvia. Then get the latest version of photon from the releases.

Make sure you have bzip2 or pbzip2 installed and execute one of these two commands in your shell. This will download, uncompress and extract the huge database in one step:

wget -O - https://download1.graphhopper.com/public/photon-db-latest.tar.bz2 | bzip2 -cd | tar x
# you can significantly speed up extracting using pbzip2 (recommended):
wget -O - https://download1.graphhopper.com/public/photon-db-latest.tar.bz2 | pbzip2 -cd | tar x
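
After extraction you should end up with a photon_data directory in the current working directory; this is the default location where photon looks for its index (see Usage below). As a quick sanity check, assuming the default extraction path:

ls -d photon_data
du -sh photon_data   # should be roughly the uncompressed size given above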

Building

photon uses gradle for building. To build the package from source make sure you have a JDK installed. Then run:

./gradlew app:es_embedded:build

This will build and test photon. The final jar can be found in target.

Experimental OpenSearch version

The repository also contains a version that runs against the latest version of OpenSearch. This version is still experimental. To build the OpenSearch version run:

./gradlew app:opensearch:build

The final jar can be found in target/photon-opensearch-<VERSION>.jar.

Indexes produced by this version are not compatible with the ElasticSearch version. There are no prebuilt indexes available. You need to create your own export from a Nominatim database. See 'Customized Search Data' below.

Usage

Start photon with the following command:

java -jar photon-*.jar

Use the -data-dir option to point to the parent directory of photon_data if that directory is not in the default location ./photon_data. Before you can send requests to photon, Elasticsearch needs to load some data into memory, so be patient for a few seconds.
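
For example, if the index was extracted to /srv/photon/photon_data (a hypothetical path), photon could be started like this:

java -jar photon-*.jar -data-dir /srv/photon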

Check the URL http://localhost:2322/api?q=berlin to see if photon is running without problems. You may want to use our leaflet plugin to see the results on a map.

To enable CORS (cross-site requests), use -cors-any to allow any origin or -cors-origin with a specific origin as the argument. By default, CORS support is disabled.
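
For example (https://example.org is a placeholder; use your own origin):

# allow cross-site requests from any origin
java -jar photon-*.jar -cors-any
# or allow only a specific origin
java -jar photon-*.jar -cors-origin https://example.org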

Discover more of photon's options by running java -jar photon-*.jar -h. The available options are as follows:

-h                    Show help / usage

-cluster              Name of elasticsearch cluster to put the server into (default is 'photon')

-transport-addresses  The comma separated addresses of external elasticsearch nodes to which the
                      client can connect (default is an empty string which forces an internal node to start)

-nominatim-import     Import nominatim database into photon (this will delete previous index)

-nominatim-update     Fetch updates from nominatim database into photon and exit (this updates the index only
                      without offering an API)

-languages            Languages nominatim importer should import and use at run-time, comma separated (default is 'en,fr,de,it')

-default-language     Language to return results in when no explicit language is chosen by the user

-country-codes        Country codes filter that nominatim importer should import, comma separated. If empty full planet is done

-extra-tags           Comma-separated list of additional tags to save for each place

-synonym-file         File with synonym and classification terms

-json                 Import nominatim database and dump it to JSON-like files (useful for development)

-host                 Postgres host (default 127.0.0.1)

-port                 Postgres port (default 5432)

-database             Postgres database name (default nominatim)

-user                 Postgres user (default nominatim)

-password             Postgres password (default '')

-data-dir             Data directory (default '.')

-listen-port          Listen to port (default 2322)

-listen-ip            Listen to address (default '0.0.0.0')

-cors-any             Enable cross-site resource sharing for any origin (default CORS not supported)

-cors-origin          Enable cross-site resource sharing for the specified origins, comma separated (default CORS not supported)

-enable-update-api    Enable the additional endpoint /nominatim-update, which allows triggering updates
                      from a nominatim database

Customized Search Data

If you need search data in other languages or restricted to a country, you will need to create your own search data. Once your Nominatim database is ready, you can import the data into photon.

If you haven't already set a password for your Nominatim database user, do it now (change user name and password as you like, below):

su postgres
psql
ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mysecretpassword';
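
The same can be done in a single command, assuming you can run psql as the postgres system user via sudo:

sudo -u postgres psql -c "ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mysecretpassword';"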

Import the data to photon:

java -jar photon-*.jar -nominatim-import -host localhost -port 5432 -database nominatim -user nominatim -password mysecretpassword -languages es,fr

The import of a worldwide data set will take several hours to days; SSD/NVMe disks are recommended to accelerate Nominatim queries.

Updating from OSM via Nominatim

To update an existing Photon database from Nominatim, first prepare the Nominatim database with the appropriate triggers:

java -jar photon-*.jar -database nominatim -user nominatim -password ... -nominatim-update-init-for update_user

This command must be run as a database user that has the right to create tables, functions and triggers.

'update_user' is the PostgreSQL user that will be used when updating the Photon database. This user needs read rights on the database; the necessary update rights will be granted during initialisation.

Now you can run updates on Nominatim using the usual methods as described in the documentation. To bring the Photon database up-to-date, stop the Nominatim updates and then run the Photon update process:

java -jar photon-*.jar -database nominatim -user nominatim -password ... -nominatim-update

You can also run the photon process with the update API enabled:

java -jar photon-*.jar -enable-update-api -database nominatim -user nominatim -password ...

Then you can trigger updates like this:

curl http://localhost:2322/nominatim-update

This will only start the updates. To check if the updates have finished, use the status API:

curl http://localhost:2322/nominatim-update/status

It returns a single JSON string "BUSY" when updates are in progress or "OK" when another update round can be started.
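
A minimal polling sketch (not part of photon itself), assuming the status endpoint returns the literal JSON string "BUSY" while an update is running:

curl http://localhost:2322/nominatim-update
while [ "$(curl -s http://localhost:2322/nominatim-update/status)" = '"BUSY"' ]; do
    sleep 10
done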

For your convenience, this repository contains a script to continuously update both Nominatim and Photon using Photon's update API. Make sure you have Photon started with -enable-update-api and then run:

export NOMINATIM_DIR=/srv/nominatim/...
./continuously_update_from_nominatim.sh

where NOMINATIM_DIR is the project directory of your Nominatim installation.

Search API

Search

http://localhost:2322/api?q=berlin

Search with Location Bias

http://localhost:2322/api?q=berlin&lon=10&lat=52

There are two optional parameters to influence the location bias. 'zoom' describes the radius around the center to focus on. This is a number that should correspond roughly to the map zoom parameter of a corresponding map. The default is zoom=16.

The location_bias_scale describes how much the prominence of a result should still be taken into account. Sensible values go from 0.0 (ignore prominence almost completely) to 1.0 (prominence has approximately the same influence). The default is 0.2.

http://localhost:2322/api?q=berlin&lon=10&lat=52&zoom=12&location_bias_scale=0.1

Reverse geocode a coordinate

http://localhost:2322/reverse?lon=10&lat=52

An optional radius parameter can be used to specify a value in kilometers to reverse geocode within. The value has to be greater than 0 and lower than 5000.

http://localhost:2322/reverse?lon=10&lat=52&radius=10

Adapt Number of Results

http://localhost:2322/api?q=berlin&limit=2

Adjust Language

http://localhost:2322/api?q=berlin&lang=it

If omitted the 'accept-language' HTTP header will be used (browsers set this by default). If neither is set the local name of the place is returned. In OpenStreetMap data that's usually the value of the name tag, for example the local name for Tokyo is 東京都.
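
For example, instead of the lang parameter the header can be set explicitly:

curl -H 'Accept-Language: it' 'http://localhost:2322/api?q=berlin'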

Filter results by bounding box

Expected format is minLon,minLat,maxLon,maxLat.

http://localhost:2322/api?q=berlin&bbox=9.5,51.5,11.5,53.5

Filter results by tags and values

Note: the filter only works on principal OSM tags; not all OSM tag/value combinations can be searched. The actual list depends on the import style used for the Nominatim database (e.g. settings/import-full.style): all tag/value combinations with the property 'main' are included in the photon database. If one or more query parameters named osm_tag are present, photon will attempt to filter results by those tags. The expected format for the value of an osm_tag parameter is as follows:

  1. Include places with tag: osm_tag=key:value
  2. Exclude places with tag: osm_tag=!key:value
  3. Include places with tag key: osm_tag=key
  4. Include places with tag value: osm_tag=:value
  5. Exclude places with tag key: osm_tag=!key
  6. Exclude places with tag value: osm_tag=:!value

For example, to search for all places named berlin with the tag tourism=museum, construct the URL as follows:

http://localhost:2322/api?q=berlin&osm_tag=tourism:museum

Or, just by the key:

http://localhost:2322/api?q=berlin&osm_tag=tourism

You can also use this feature for reverse geocoding. Want to see the 5 pharmacies closest to a location?

http://localhost:2322/reverse?lon=10&lat=52&osm_tag=amenity:pharmacy&limit=5
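
Negation (item 5 in the list above) works for search as well; for example, to search for places named berlin while excluding all results whose main tag key is highway:

http://localhost:2322/api?q=berlin&osm_tag=!highway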

Filter results by layer

List of available layers:

  • house
  • street
  • locality
  • district
  • city
  • county
  • state
  • country
  • other (e.g. natural features)

http://localhost:2322/api?q=berlin&layer=city&layer=locality

The example above will return both cities and localities.

Results as GeoJSON

{
  "features": [
    {
      "properties": {
        "name": "Berlin",
        "state": "Berlin",
        "country": "Germany",
        "countrycode": "DE",
        "osm_key": "place",
        "osm_value": "city",
        "osm_type": "N",
        "osm_id": 240109189
      },
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [13.3888599, 52.5170365]
      }
    },
    {
      "properties": {
        "name": "Berlin Olympic Stadium",
        "street": "Olympischer Platz",
        "housenumber": "3",
        "postcode": "14053",
        "state": "Berlin",
        "country": "Germany",
        "countrycode": "DE",
        "osm_key": "leisure",
        "osm_value": "stadium",
        "osm_type": "W",
        "osm_id": 38862723,
        "extent": [13.23727, 52.5157151, 13.241757, 52.5135972]
      },
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [13.239514674078611, 52.51467945]
      }
    }
  ]
}

Structured queries

The OpenSearch-based version of photon has opt-in support for structured queries. See docs/structured.md for details. Please note that structured queries are disabled on photon.komoot.io.


photon's Issues

Returning weak partial matches

Sometimes one needs to do a geocoding lookup on an address that contains irrelevant data, such as an unknown condominium name. For example when searching for "Grand Parkview Asoke Unit 233/11 Sukhumvit 21" then none of the information is in OSM's database other than "Sukhumvit 21".

I would consider this an example of a weak partial match. For my use case, such irrelevant data needs to be ignored and therefore partial matches need to be returned too. E.g. http://photon.komoot.de/api/?q=grand%20parkview%20asoke%20sukhumvit%2021 should return the same as http://photon.komoot.de/api/?q=20sukhumvit%2021

Is it possible to perhaps add a score to each returned feature, such that one can programmatically determine if the result's score is high enough to be displayed? This varies by application.

create a load test

check the performance and find bottlenecks of photon to evaluate usage on osm.org

maybe sarah can provide some logs of search terms from nominatim. at komoot we also have logs of this kind

Feature request: bounding box in results

To show the results for areas at the correct scale it would be helpful if the bounding box could be provided directly as part of the result. With the current osm webservices one would either request the full osm data via overpass/xapi or try to get the matching result from nominatim (which unfortunately does not support query by osm_id)

Wrong info in README

In the README it is indicated that the demo UI is in src/main/python/demo, but it seems to be in website/. Or is that not the demo UI in that directory?

[Question] Location bias

How does location bias work?

On http://photon.komoot.de/api results seem to be only loosely influenced by different lat|lon parameters, and some results that are much farther from the given point come before others that are much nearer.

Is it possible to order results based on distance from the point given as lat|lon?

Would it be possible to provide a bounding box to search, so that search wouldn't return results outside that box?

mittelberg

searching for the village mittelberg reveals a lot of places in mittelberg but the actual village cannot be found.

cannot find small housenumbers in type-ahead mode

when searching for street + housenumber, e.g. bödmerstraße 7 the right address can be found because both tokens can be found in collector's raw field.

searching for bödmerstr 7 won't be successful. bödmerstr is found in the edgengram field but 7 got stripped because edgengram starts with 2 characters:

"photonngram": {
    "type": "edgeNGram",
    "min_gram": 2,
    "max_gram": 15
}

Allow searching on osm_key and osm_value

There should be another filter parameter in the API to search in a specific area (bounding box) for a specific tag combination or even a set of combinations ('I want food').

For that osm_key and osm_value have to be indexed in the mapping. Or maybe we make this configurable if this increases size too much.

spatial extent of search item

when searching for extended items like germany you will see a marker in a wood at a very high zoom level.

currently only the centroid of the geometry is stored. if we stored the extent of items too, we could adjust the zoom level to show the entire item on the map.

Use alias 'photon' instead of index

Instead of creating an index called 'photon', it should be considered to create an index called e.g. 'photon_1' or 'photon_' and then add an alias called 'photon'. This would make it easy in production to feed a new index without touching the old one and to switch quickly. Not sure if this should be part of this project, but it is an easy change and increases flexibility.

Improve performance of Nominatim importer

Possible changes to the importer to speed up the process and reduce IO load:

  • parallelize nominatim export and elastic search import
  • query place_addressline directly instead of using get_addressdata
  • cache country names for context and use calculated_country_code for lookups
  • reuse address for places with rank_search > 27 and same parent

Data import log

I wanted to try photon with Italy data. I started the import yesterday and it is not over yet: how do I see how far along it is? Is there an import log available?
Thank you
Alessandra

[Question] Getting started

How does one import data without installing Nominatim? The readme says

curl http://localhost:4567/dump/import # not working yet!

so what should one do...?

osm2geojson

You mention in the readme that it can take up to 12 days to create the nominatim database dump from scratch. I spent a bit of time recently coming up with a way to convert the osm planet xml dump to a joined json data set that you can build in around 15 hours. All it does is join nodes, ways, and relations, but from there it should be quite easy to extract whatever information you need without having to set up nominatim.

I've actually been considering doing a little geocoder on top of elastic search based on this so I'm quite interested in your experiences with photon.

If you are interested, the project for my conversion tool is here: https://github.com/jillesvangurp/osm2geojson

Missing fields in schema.xml

When trying to upload Iceland data I got some errors about missing fields. I had to add these fields to the schema:

Probably I made some errors during the installation.

Provide filters

It always depends on how the search mechanism should work, but from my point of view filters could be generally very useful (and would be for our case).

Filters:
Provide a way to filter results, e.g.

q=Hauptstraße&city=Salzburg

and the result will only show results from the city of Salzburg.

and as an idea:

q=Hauptstraße&city=Salzburg&city:mode=fuzzy

this result will also include fuzzy matches, like
Salzberg and Salzbrug (typo).

Filters could be:

  • city
  • country
  • postcode (e.g. 5020 with high fuzziness would match 5* and low fuzziness 502?)
  • name
  • osm_key (or generalized version)
  • osm_value (or generalized version)

Why:
Often, when searching for addresses, the user is confronted with partial and incorrect addresses. The only thing they know is a part of the name and which city or part of the city the address is in. This would help greatly for people who have to research addresses.

For frontend-development (i.e. the komoot main site), the query could be totally adjusted to the needs or even user preferences just by the passed parameters.

Additionally, as a crazy idea (don't know if it is possible with elasticsearch):
Provide a "near" parameter, so you have:

q=Haubdstr&near=(Salzburg  OR lat/lon)

and the search term can match less strictly the nearer a result is to the point provided by "near".

Some entries are 'null', should be non-existing or contain a value

Is it expected that some entries don't have a 'default' or 'en' entry? Even worse they contain 'null' (not printed here as gson is too clever)

{
  "_index": "photon",
  "_type": "place",
  "_id": "1406467",
  "_score": 10.824916,
  "_source": {
    "osm_key": "place",
    "city": {
      "default": "Frankfort"
    },
    "osm_value": "village",
    "osm_id": "306444989",
    "ranking": 11,
    "context": {
      "default": "Saint Mary, Middlesex County"
    },
    "id": "1406467",
    "country": {
      "de": "Jamaika",
      "it": "Giamaica",
      "fr": "Jamaïque"
    },
    "coordinate": "18.418850,-77.053417",
    "name": {
      "default": "Frankfort"
    }
  }
}

provide data dumps

installing nominatim and importing osm files is time consuming. people who don't care about continuous updates will be very happy if we provide data dumps (world / country extracts in most common languages). setting up photon will then be dead easy and fast.

Wordending

Using wordending is kind of a workaround for edge ngram searches like

berlin erlange

which would match berlinerstraße erlangen but should preferably only match things like 'berlin erlange*'.

When this workaround is used, why not avoid edge ngram at all and tokenize the query, plus do a prefix query for the last term? This would save space and memory with the same quality. The only problem could be performance, but my simple tests on small data don't show any problems there.

Many address features are missing?

Address geocoding on a worldwide level is complicated for many reasons, of which one is that every single country lists addresses in different ways. For example, in some countries provinces and states are important (Netherlands), whereas in others districts and sub-districts are more important (Thailand).

When I query your API for a road in Bangkok, Thailand, like so:

http://photon.komoot.de/api/?q=sukhumvit

I get the following:

    {
      "features": [
        {
          "properties": {
            "name": "Sukhumvit",
            "osm_value": "neighbourhood",
            "country": "Thailand",
            "osm_id": 2203974487,
            "osm_key": "place",
            "postcode": "10110"
          },
          "geometry": {
            "coordinates": [
              100.565073,
              13.73384
            ],
            "type": "Point"
          },
          "type": "Feature"
        },
        {
          "properties": {
            "country": "Thailand",
            "name": "Sukhumvit Road",
            "osm_value": "tertiary",
            "street": "Sukhumvit Road",
            "osm_id": 232865089,
            "osm_key": "highway"
          },
          "geometry": {
            "coordinates": [
              102.45796,
              12.18993
            ],
            "type": "Point"
          },
          "type": "Feature"
        },
        ... ETC ...
      ],
      "type": "FeatureCollection"
    }

The first entry is the one I need, but the problem is that there is a lot more data in the OSM database about this entry than your API shows. For example, it should return the following entries when available, just like Nominatim does:

- (Object) administrative
- (Object) attraction
- (Object) city
- (Object) city_district
- (Object) clothes
- (Object) commercial
- (Object) country
- (Object) country_code
- (Object) county
- (Object) house_number
- (Object) pedestrian
- (Object) place
- (Object) postcode
- (Object) road
- (Object) state
- (Object) state_district
- (Object) suburb
- (Object) town
- (Object) village

In the case of my query we already know this data: postcode = 10110 (given), country = thailand (given), district = Watthana District (not given, but should be), sub-district = Sukhumvit (not given, but should be), city = Bangkok, province = Bangkok Metropolitan Area (not given, but should be), etc.

Can I make this change myself to let photon return all known data that is relevant to the query? If I would have queried a US street I want to know all data that is available too, so in that case it should return the state, among other fields.

I hope I expressed my request clearly, if not, please tell me. ;)

Proposal: Don't start server after import/dump actions

Right now, the server keeps running after import and dump actions.

This might not be the desired behavior all the time. How about one of these solutions:

1.) Introduce a "-shutdown" parameter, that shuts down photon after executing imports/export/indexing.

2.) Shut down automatically after one of the i/o actions.

I would refactor the code in App.java and send it as a pull request if so desired. Also I can add more description to the docs about the behavior.

500 Internal Error

Hi there,

I have imported a sub-region from geofabrik.de, imported it into the nominatim database and finally started photon. Everything works quite charmingly; unfortunately the query localhost:2322/api?q=Gutenberg results in a "500 Internal Error". The console output is:
"Error spark.webserver.MatchFilter - java.lang.NullPointerException"

I have no idea whats wrong, do you have any suggestions?

Cheers
Ivan

US postcodes are imported even for non-US instances

This is only reproducible on local installations:

Some queries include, after their normal results, several entries with completely wrong data. E.g. using an extract of Austria, the query

http://localhost:2322/api?lang=de&q=Kaisersch%C3%BCtzenstra%C3%9Fe%2017

returns normal results and afterwards there are several fields:

{
            "properties": {
                "osm_id": 31344,
                "postcode": "17503",
                "osm_value": "postcode",
                "country": "Vereinigte Staaten von Amerika"
            },
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [
                    -76.08483701582313,
                    39.93753065153189
                ]
            }
        },

Which shouldn't even be in the extract. Is there an error in the extraction, or does photon "invent" these results?

how does search as you type work?

Could you provide an elaboration on the search as you type feature? Does it create a new query for every word that is inserted?

Moreover, is there an online demo available that I could check out?

Any feedback is highly appreciated.

Tom

Integration external test cases

there are a bunch of existing test cases (from nominatim, komoot, ...) that can be integrated into our own test framework to help us find bugs and improvements

Decompound words

One more normalization has to be done to improve search. E.g. Erlangerstraße will be split into "Erlanger straße". This has to be done while indexing and searching.

There is a plugin but it is GPL due to one library it uses; it could be less restrictive but the Python code would have to be ported to Java: jprante/elasticsearch-analysis-decompound#5

For POIs there is also often the Bahnhof vs. Hauptbahnhof problem. But a main railway station in a different country that is not named like this should probably also be found. This should probably be handled via a different fix: #318

Build failure / curly single quote problem

While trying to build photon I get the following build failure

/photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[131,17] error: unmappable character for encoding ASCII

The same error appears at other places in the same file; it would seem the character used in the source for the single quote in error messages ("Can´t...") is not mappable in ASCII: probably a curly quote instead of a straight single quote?

Would it be possible to search/replace that character in the source?

I think "Can't" would work, but "Can´t" can't.

Full stack trace below:

[DEBUG] Source roots:
[DEBUG]  /photon/src/main/java
[INFO] Compiling 16 source files to /photon/target/classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.332s
[INFO] Finished at: Wed May 28 09:05:31 UTC 2014
[INFO] Final Memory: 15M/481M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project photon: Compilation failure: Compilation failure:
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[131,17] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[131,18] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[205,17] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[205,18] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[230,17] error: unmappable character for encoding ASCII
[ERROR]
[ERROR] /photon/src/main/java/de/komoot/photon/importer/elasticsearch/Server.java:[230,18] error: unmappable character for encoding ASCII
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project photon: Compilation failure
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
        at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
        at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
        at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
        at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
        at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
        at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
        at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
Caused by: org.apache.maven.plugin.CompilationFailureException: Compilation failure
        at org.apache.maven.plugin.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:516)
        at org.apache.maven.plugin.CompilerMojo.execute(CompilerMojo.java:114)
        at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
        at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
        ... 19 more
[ERROR]
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

Clarifications in readme: Import Hardware

Hi,

You mention in readme "It takes up to 10 days and sufficient RAM to import the entire world,".

Could you define "sufficient RAM"?
I may be about to rent a new server for this purpose (unless "sufficient" is 300GB...)

Phonetic analyzer

One could try this analyzer. Not sure if this is necessary if fuzzy is enabled. E.g. fuzzy is already able to find strasse vs. straße

Support arbitrary languages

Photon should be able to do searches over all available language variants mapped in OSM (i.e. all name:* tags). Bonus points if it can improve search results by guessing the language of the query correctly and re-weighting the results accordingly.

allow typos

currently small typos like jamiaca instead of jamaica return no result. typos are very common in the world of search engines; there are surely already best practices to solve this with elasticsearch.

test fails

QueryParsingException[[photon] script_score the script could not be loaded

Where is this script?

Support multiple names

Sometimes OSM items have more than one name which are stored in tags like int_name, alt_name, loc_name, official_name ...

photon would benefit from considering these names too

unknown -l option

Hi,

I tried to use NominatimImporter with the options described in the documentation sample, but it seems that the -l option doesn't exist (reading the source code I can't find it).

greetings
Mirko

Importer > Import "importance" Nominatim field

The Nominatim importance field is not imported from Nominatim to Solr.

I think it should be available to be used in boost functions in some cases:

I plan to use photon with a map and sort search results according to map bounds/center.
But...
I would like queries like "Eiffel tower" or "Statue of Liberty" to push the original places first or at least fairly up in the search results (I can accept that if a replica is in or near map bounds, it shows up first).

I believe this could be accomplished by introducing importance field from nominatim and a custom query.

  • Am I wrong ?
  • Would the drawback on index size be worth it for you ?
  • The importance field is mapped in the NominatimEntry model but does not make it to the Solr index.
    Any historical reason?

I think I can deal with the solr query, but the import part is tougher for me, since I am not exactly an experienced Java developer...
I'll give it a try (unless you think I am totally wrong), but if anyone feels like they can achieve this "easily" they're welcome.

Importing other sources

I would like at some time to import other sources than OSM into Photon, like the datasets from openaddresses.io or the BANO project, as soon as the licence is compliant.
This will need some changes in the mapping, like having source and source_id keys (instead of just osm_id)

And in the importer, we will need to find a way not to import the data from OSM where it would duplicate those other sources (which we think are "better" than OSM if we chose to import them), like skipping all place=house if country=France, etc.

I will work on importing some BANO data, to have a first use case that we can discuss.

core.properties

I am trying to install photon on Ubuntu. I have manually installed SOLR and tried to load the photon config. SOLR is not able to load collection1 without the core.properties file.
