mapquest-osm-server's Issues

Implement cache invalidation on the front-end

Once support for adding incremental changes to the data store is added (see #14, osmChange support to the dbmgr tool), we would need a way to invalidate 'stale' entries from the caches maintained inside the front-end servers.

A (too simple) solution could be to have the front-ends periodically reset their caches, for example by reading a 'generation count' of some kind from the data store at regular intervals and resetting their caches if this count has changed. Such an approach would be wasteful, however: updates to the data store are likely to be frequent (once a minute with the Minutely Mapnik process), yet each update touches only a very small number of elements.
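
For reference, the generation-count approach described above could look roughly like the sketch below. The class name and the `read_generation` callable are hypothetical; nothing here reflects code that exists in the front-end today.

```python
class GenerationCache:
    """A cache that is flushed wholesale when the store's generation changes."""

    def __init__(self, read_generation):
        # read_generation is a hypothetical callable that returns the data
        # store's current generation count.
        self._read_generation = read_generation
        self._generation = read_generation()
        self._entries = {}

    def check_generation(self):
        """Call at regular intervals; returns True if the cache was reset."""
        current = self._read_generation()
        if current != self._generation:
            # Every entry is dropped, even though an update probably
            # touched only a handful of elements.
            self._entries.clear()
            self._generation = current
            return True
        return False

    def get(self, key):
        return self._entries.get(key)

    def put(self, key, value):
        self._entries[key] = value
```

The sketch makes the cost visible: a single changed element discards every cached entry, which is why a finer-grained invalidation scheme is needed.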

Investigate alternate geo-grouping and storage schemes for OSM data

Investigate more cache efficient ways to group OSM elements.

"Real-life" usage (e.g., map queries from JOSM) appears to have a significant degree of geographical locality.

It may help to group OSM elements in a different manner, so that repeated queries in the same geographical area are served with fewer compute resources (and faster).

This ticket tracks the following tasks:

  1. Collect data on actual data access patterns.
  2. Implement a cache efficient grouping scheme that works well with these access patterns.

Reduce data transfer in the implementation of the /map API

In the implementation of the /map API, we can reduce the data transferred between the data store and the front end by modifying the implementation in the following way:

  • Store node coordinates along with node IDs in the geo-indexes created by the dbmgr tool.
  • After retrieving the geo-index information covering the bounding box passed to the /map API, use these node coordinates to immediately filter out nodes falling outside that bounding box.
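
The filtering step could be sketched as follows, assuming each geo-index entry carries a node ID together with its latitude and longitude. The `IndexedNode` and `BBox` types are hypothetical and do not describe the index format currently written by the dbmgr tool.

```python
from typing import Iterable, List, NamedTuple


class IndexedNode(NamedTuple):
    """Hypothetical geo-index entry: a node ID plus its coordinates."""
    node_id: int
    lat: float
    lon: float


class BBox(NamedTuple):
    """Bounding box in the same left,bottom,right,top order as /map."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def nodes_in_bbox(entries: Iterable[IndexedNode], bbox: BBox) -> List[int]:
    """Return the IDs of the indexed nodes that lie inside bbox."""
    return [
        e.node_id
        for e in entries
        if bbox.min_lat <= e.lat <= bbox.max_lat
        and bbox.min_lon <= e.lon <= bbox.max_lon
    ]
```

Only the node IDs returned by `nodes_in_bbox` would then need to be fetched from the data store.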

Investigate better ways to access a Membase backend

The code currently accesses Membase in memcache compatibility mode. This is sub-optimal in several ways:

  • Traffic to Membase is routed through a proxy (moxi), which implies an additional hop for data to traverse.
  • The Python client used by the current server does not deal with all the error responses that a Membase server could issue.
  • The memcache protocol itself has limits that Membase does not have.

Choose and use a suitable interface library so that a Membase backend can be accessed efficiently.

Move project documentation to the main source tree.

Move project documentation to the main source tree from the wiki.

This would make it easier to keep the project's documentation and code in sync.

  • Cloning the project would clone documentation automatically, without an extra step to clone the wiki's Git repository.
  • Collaboration on documentation can use GitHub's facilities.
  • Only one variant of Markdown syntax to worry about.

Create an installation guide

Create user-centric documentation covering the care and feeding of the API Server.

A rough outline of this article would be:

  • Introduction
    • A brief overview of the OSM project.
    • A description of what this API service does.
    • Limitations of this API service.
  • Pre-requisites for installation.
    • Supported operating systems.
    • Pre-requisite packages.
  • The installation process.
    • Describe this for each supported OS.
  • Monitoring the operation of the server.
    • Facilities for logging.
    • Monitoring performance metrics.
    • Looking out for errors.
  • Troubleshooting help.
  • Appendices:
    • How to report bugs.
    • Building the server from source.
    • Contributing source code patches.

The guide would need to be written using documentation tools that are in common use in the open-source ecosystem, say one of LaTeX, ConTeXt or DocBook.

The dbmgr tool needs to be faster/more frugal with memory

A complete planet.osm dump currently holds on the order of a billion nodes, ninety million ways and just under a million relations, per the current statistics for the OSM database.

In order to deal with a data set of this size in a reasonable amount of time, the dbmgr ingestion tool needs to be sped up considerably, and also made frugal in its memory consumption.

Retrieve slab configuration information from the data store

Slab configuration information is used by the dbmgr tool to group OSM elements into slabs.

The dbmgr tool and the front-end currently retrieve slab related information from the system configuration file (config/osm-api-server.cfg). It would be more robust for the front-end to retrieve slab information from the data store directly.

Configuration information in the data store should also be versioned, so that incompatible configuration schemas can be detected.
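
A version check of this kind might be sketched as below. The record layout, the `schema-version` and `slabs` keys, and the `slab-config` key name are all assumptions made for illustration; the data store is modelled as a plain mapping.

```python
SUPPORTED_SCHEMA_VERSION = 1  # assumed version understood by this build


class IncompatibleConfigError(Exception):
    """Raised when the stored configuration uses an unknown schema."""


def load_slab_config(store, key="slab-config"):
    """Fetch slab configuration from a mapping-like data store.

    The stored record is assumed to look like:
        {"schema-version": 1, "slabs": {...}}
    """
    record = store[key]
    version = record.get("schema-version")
    if version != SUPPORTED_SCHEMA_VERSION:
        raise IncompatibleConfigError(
            "unsupported configuration schema version: %r" % (version,)
        )
    return record["slabs"]
```

With such a check, a front-end started against a data store written by an incompatible dbmgr would fail at startup instead of serving wrongly grouped data.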

Provide APIs to query meta-data about the contents of the data store

It would be useful if meta-data about the contents of the data-store could be queried externally. For example:

  • A human readable comment describing the contents of the data store.
  • Upload history.
    • The time of each (full or incremental) update to the data store.
    • Information about the data used (file size, last modified time, source URL, if any).
  • Information about the OSM elements available in the data store, perhaps in the form of a min-max range for each type of element.

Add support for ingesting osmChange documents

Incremental updates to the map are distributed by the OpenStreetMap project in the form of OsmChange files. These need to be supported by the ingestor tool.

Open Issue: OsmChange files do not contain <changeset> elements; for now, the ingestor would need to fetch these elements from an upstream server.

See also: issue #22 which tracks related changes to the front-end.
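
As a starting point, an OsmChange document could be walked with the standard library as in the sketch below. The `iter_changes` helper is hypothetical; it only extracts the action, element type and id, and leaves the actual data store update open.

```python
import xml.etree.ElementTree as ET

# The three action blocks that can appear inside <osmChange>.
ACTIONS = ("create", "modify", "delete")


def iter_changes(document):
    """Yield (action, element_type, element_id) for each changed element."""
    root = ET.fromstring(document)
    for block in root:
        if block.tag not in ACTIONS:
            continue
        for element in block:
            yield block.tag, element.tag, int(element.get("id"))


SAMPLE = """\
<osmChange version="0.6">
  <create><node id="1" lat="0.0" lon="0.0" version="1"/></create>
  <modify><way id="2" version="3"><nd ref="1"/></way></modify>
  <delete><relation id="3" version="2"/></delete>
</osmChange>
"""

for change in iter_changes(SAMPLE):
    print(change)
```

For full-size change files an incremental parser such as `xml.etree.ElementTree.iterparse` would likely be a better fit than loading the whole document at once.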

Investigate support for queries using XAPI-style predicates

XAPI offers the ability to query the map using complex predicates.

For example:

  • /api/0.6/node[amenity=hospital]
    Queries the map for all nodes containing <tag k="amenity" v="hospital" />
  • /api/0.6/*[key1=value1][key2=value2]...
    Queries the map for all elements matching the predicates (key1 == value1 && key2 == value2).
  • /api/0.6/relation[not(way)]
    Retrieve relations that do not have way members.
  • /api/0.6/node[@user=name]
    Retrieve nodes last changed by a specific user.

Investigate how best these queries could be implemented in the current architecture.
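
To make the scope concrete, the sketch below parses only the simplest predicate form, `[key=value]` tag matches, and rejects everything else (child predicates such as `not(way)` and attribute predicates such as `@user`). The function names are hypothetical, and the grammar is a deliberately reduced subset rather than the full XAPI syntax.

```python
import re

# Element selector followed by zero or more bracketed predicates.
_REQUEST = re.compile(
    r"^(?P<type>\*|node|way|relation)(?P<preds>(?:\[[^\[\]]*\])*)$"
)
# A single [key=value] predicate.
_PREDICATE = re.compile(r"\[([^\[\]=]+)=([^\[\]]*)\]")


def parse_request(spec):
    """Split e.g. 'node[amenity=hospital]' into its type and tag predicates."""
    match = _REQUEST.match(spec)
    if match is None:
        raise ValueError("malformed request: %r" % spec)
    preds = match.group("preds")
    pairs = _PREDICATE.findall(preds)
    # Every bracket group must have parsed as key=value, and attribute
    # predicates (keys starting with '@') are out of scope for this sketch.
    if len(pairs) != preds.count("[") or any(k.startswith("@") for k, _ in pairs):
        raise ValueError("unsupported predicate in: %r" % spec)
    return match.group("type"), pairs


def matches(tags, predicates):
    """True if the element's tags satisfy every key=value predicate."""
    return all(tags.get(key) == value for key, value in predicates)
```

Evaluating such predicates efficiently is the harder part: without a tag index, each query would have to scan every element in the data store.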

Support "full" planet dumps

"Full" planet dumps contain the entire history of the map. In order to support full dumps, we would need to:

  • Represent edits to the map in a space efficient manner.
  • Implement support for ingesting change history.
  • Implement support for serving change history via the front-end server.

Redirect write requests to an upstream server

It would be useful for this API server to redirect requests that it cannot handle (using an appropriate HTTP response code) to an upstream server.

Examples of such requests include:

  • Requests for data that is not present in the datastore, e.g., GPS traces, or user data.
  • Requests that would modify the data, e.g., requests to the /create URIs or POST requests to URIs.

This feature would allow this API server to act as a fast and scalable "front end" for the current OSM API server.

See also: issue #12.
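
One possible shape for the redirect decision is sketched below. The upstream URL is a placeholder, the path prefixes are illustrative, and the function name is hypothetical. The sketch uses HTTP 307 (Temporary Redirect) because that status asks the client to repeat the request with the same method and body, which matters for write requests; the choice of response code is still open.

```python
UPSTREAM = "https://upstream.example.org"  # placeholder upstream server

READ_ONLY_METHODS = {"GET", "HEAD"}
# Illustrative prefixes for data not held in the data store
# (GPS traces and user data).
UNSERVED_PREFIXES = ("/api/0.6/gpx", "/api/0.6/user")


def redirect_for(method, path):
    """Return (status, location) if the request should go upstream, else None."""
    is_write = method.upper() not in READ_ONLY_METHODS
    if is_write or path.startswith(UNSERVED_PREFIXES):
        return 307, UPSTREAM + path
    return None
```

A request handler would call `redirect_for` first and fall through to the local data store only when it returns `None`.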

Provide a way for restarting interrupted ingestion runs

Since ingestion of a full OSM dump can take a long while, having an option to 'resume' from some point in the middle of a dump could be useful.

For example, a --resume=type:id option would instruct the ingestor tool to resume ingestion from the element with the given type and id. The type specifier would be one of changeset, node, way or relation.
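
Parsing such an option could be sketched with `argparse` as below. The `resume_point` and `build_parser` helpers are hypothetical and are not part of the existing dbmgr command line.

```python
import argparse

ELEMENT_TYPES = ("changeset", "node", "way", "relation")


def resume_point(value):
    """Convert 'type:id' into a (type, id) tuple, validating both parts."""
    kind, sep, ident = value.partition(":")
    if not sep or kind not in ELEMENT_TYPES or not ident.isdecimal():
        raise argparse.ArgumentTypeError(
            "expected TYPE:ID with TYPE one of %s" % ", ".join(ELEMENT_TYPES)
        )
    return kind, int(ident)


def build_parser():
    parser = argparse.ArgumentParser(prog="dbmgr")
    parser.add_argument(
        "--resume",
        type=resume_point,
        metavar="TYPE:ID",
        default=None,
        help="resume ingestion from the given element",
    )
    return parser
```

For example, `build_parser().parse_args(["--resume=way:42"]).resume` evaluates to `("way", 42)`, and the option defaults to `None` when ingestion starts from the beginning.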
