koop-socrata's Introduction

koop-socrata (Koop 2.x)

Note: this is the Koop 2.x version of the Socrata provider. For the Koop 3.x version, please see https://github.com/koopjs/koop-provider-socrata.

Socrata Provider for Koop

This provider makes it possible to access Socrata's JSON API as either GeoJSON or an Esri FeatureService. This is particularly useful for making maps and doing analysis on the web.

Install

To use this provider you first need a working installation of Koop. Then from within the koop directory you'll need to run the following:

npm install koop-socrata --save

Usage

koop-socrata needs to be registered as a provider in your Koop app in order to work.

var socrata = require('koop-socrata')
koop.register(socrata)

If you are using Postgres, you will then need to create a database and enable PostGIS:

createdb koop
psql -d "koop" -c "create extension postgis;"

Once that's done you can restart your server and the Socrata routes will be available.

If you're using the koop-sample-app template, you can start the server like this:

node server.js

Registering Socrata Hosts

Once this provider has been installed, you need to register a particular instance of Socrata with your Koop instance. To do this, make a POST request to the /socrata endpoint like so:

curl --data "host=https://data.nola.gov&id=nola" localhost:1337/socrata

Windows users can download cURL from http://curl.haxx.se/download.html or use a tool of their choice to generate the POST request.

What you'll need for that request to work is an ID and the URL of the Socrata instance. The ID is what you'll use to reference datasets that come from Socrata in Koop.

To make sure this works you can visit: http://localhost:1337/socrata and you should see all of the registered hosts.
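For those who would rather script the registration than use cURL, the same POST can be sketched in Node. The helper name `buildRegistrationRequest` is hypothetical, and the host, ID and port are taken from the cURL example above. The sketch only describes the request and does not send anything.

```javascript
// Hypothetical helper: describes the POST that registers a Socrata host with Koop.
// It mirrors the cURL example; nothing is sent over the network here.
function buildRegistrationRequest(koopBase, host, id) {
  // URLSearchParams percent-encodes the values; a form parser on the server
  // decodes them back to the same host and id fields that `curl --data` sends.
  const body = new URLSearchParams({ host, id }).toString();
  return {
    url: new URL('/socrata', koopBase).toString(),
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body
  };
}

const request = buildRegistrationRequest(
  'http://localhost:1337',
  'https://data.nola.gov',
  'nola'
);
console.log(request.url);
console.log(request.body);
```

The returned object can be handed to whichever HTTP client you already use.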

Add your app key

Socrata allows 1,000 requests per rolling hour period if you have an app key. If not, there is no guarantee of the number of queries you can make. It is strongly recommended to include an app token if you plan to run Koop-Socrata in production. See:

http://dev.socrata.com/docs/app-tokens.html

  1. Go to dev.socrata.com/register to create an app key
  2. Edit the default.json in your koop-app config to add
{
  "socrata": {
    "token": "your-app-token"
  }
}
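The `socrata` block sits at the top level of default.json, next to whatever keys the file already has. A minimal sketch of merging it into an existing config object follows. `addSocrataToken` is a hypothetical helper, and the sample config values are placeholders rather than required settings.

```javascript
// Hypothetical helper: returns a copy of a Koop config with the Socrata token set.
// Any existing "socrata" settings (for example a page limit) are preserved.
function addSocrataToken(config, token) {
  return { ...config, socrata: { ...(config.socrata || {}), token } };
}

// Placeholder config; your own default.json will have its own values.
const existing = { server: { port: 1337 }, db: { conn: 'connection' } };
const updated = addSocrataToken(existing, 'your-app-token');

// JSON.stringify yields text that could be saved as default.json.
console.log(JSON.stringify(updated, null, 2));
```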

Accessing Socrata Data

To access a dataset hosted in Socrata you'll need a "Resource ID" from Socrata. Datasets in Socrata can be accessed as raw JSON like this:

https://data.nola.gov/resource/fwm6-d78i.json

And then the ID fwm6-d78i can be referenced in Koop like so:

http://koop.dc.esri.com/socrata/nola/fwm6-d78i

If your Socrata data has more than one location column, you can specify the desired location column in the HTTP request like this:

https://path_to_koop/socrata/socrataProvider/dataSetID!spatialColumn
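To make the `!` convention concrete, here is a small sketch of splitting that last path segment into a resource ID and an optional spatial column. `parseResourceParam` is a hypothetical name for illustration; it is not necessarily how the provider parses the path internally.

```javascript
// Hypothetical parser for the "dataSetID!spatialColumn" path segment.
// Everything before the first "!" is the resource ID; the rest is the column.
function parseResourceParam(param) {
  const bang = param.indexOf('!');
  if (bang === -1) {
    return { id: param, spatialColumn: null };
  }
  return { id: param.slice(0, bang), spatialColumn: param.slice(bang + 1) };
}

console.log(parseResourceParam('fwm6-d78i'));
console.log(parseResourceParam('n7kb-xdcs!site_location'));
```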

Handling Large Datasets

Koop-Socrata will page through large datasets to gather all the rows. The default is 10,000 rows per request, but the Socrata API handles up to 50,000 rows per request very well. For production deployments it is recommended to set the Socrata page limit in the Koop configuration to 50,000.

{
	"socrata": {
		"pageLimit": 50000
	}
}
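The effect of the page limit on the number of requests can be sketched as follows. `buildPageUrls` is a hypothetical helper, the Seattle host and resource ID are only sample values, and starting offsets at 0 is an assumption based on SODA's zero-based `$offset`; the provider's own paging code may differ.

```javascript
// Hypothetical helper: lists the page URLs needed to fetch every row of a resource.
// Assumes SODA-style $order/$limit/$offset paging with offsets starting at 0.
function buildPageUrls(host, resourceId, rowCount, pageLimit) {
  const pages = Math.ceil(rowCount / pageLimit);
  const urls = [];
  for (let page = 0; page < pages; page++) {
    const offset = page * pageLimit;
    urls.push(
      `${host}/resource/${resourceId}.json?$order=:id&$limit=${pageLimit}&$offset=${offset}`
    );
  }
  return urls;
}

// A one-million-row dataset at the default and at the recommended page limit.
const defaultPages = buildPageUrls('https://data.seattle.gov', '3k2p-39jp', 1000000, 10000);
const largePages = buildPageUrls('https://data.seattle.gov', '3k2p-39jp', 1000000, 50000);
console.log(defaultPages.length, largePages.length);
```

Fewer, larger pages leave more of the hourly request allowance for other datasets.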

Clearing Koop's cache for individual resources

If you find yourself in a situation where Koop isn't returning data for a particular resource and you'd like to make sure it makes a fresh request, you can blow out the cobwebs by making the following request in the browser.

http://[koop]/socrata/[provider]/[resourceID]/drop
>>> true

Examples

Here are a few examples of data hosted in Socrata and accessed via Koop.

Contributing

Esri welcomes contributions from anyone and everyone. Please see our guidelines for contributing.

License

Apache 2.0

koop-socrata's People

Contributors

chelm, dmfenton, greenkeeper[bot], jgravois, rgwozdz, sirws, slibby, ungoldman

koop-socrata's Issues

Adding Socrata token

In the documentation it says:

"2. Edit the default.json in your koop-app config to add

{
  "socrata": {
    "token": "your-app-token"
  }
}"

Is this just appended to the default.json file? What exactly is the formatting? I get errors if I try to add this anywhere in default.json when I run npm start, but jslint says the formatting is valid.

My current config, with the token added, looks like this:

{
  "server": {
    "port": 1337
  },
  "socrata": {
    "token": "my_token"
  },
  "data_dir": "/usr/local/koop/",
  "db": {
    "conn": "connection"
  }
}

Version 10 of node.js has been released

Version 10 of Node.js (code name Dubnium) has been released! 🎊

To see what happens to your code in Node.js 10, Greenkeeper has created a branch with the following changes:

  • Added the new Node.js version to your .travis.yml

If you’re interested in upgrading this repo to Node.js 10, you can open a PR with these changes. Please note that this issue is just intended as a friendly reminder and the PR as a possible starting point for getting your code running on Node.js 10.

More information on this issue

Greenkeeper has checked the engines key in any package.json file, the .nvmrc file, and the .travis.yml file, if present.

  • engines was only updated if it defined a single version, not a range.
  • .nvmrc was updated to Node.js 10
  • .travis.yml was only changed if there was a root-level node_js that didn’t already include Node.js 10, such as node or lts/*. In this case, the new version was appended to the list. We didn’t touch job or matrix configurations because these tend to be quite specific and complex, and it’s difficult to infer what the intentions were.

For many simpler .travis.yml configurations, this PR should suffice as-is, but depending on what you’re doing it may require additional work or may not be applicable at all. We’re also aware that you may have good reasons to not update to Node.js 10, which is why this was sent as an issue and not a pull request. Feel free to delete it without comment, I’m a humble robot and won’t feel rejected 🤖


FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

An in-range update of request is breaking the build 🚨

☝️ Greenkeeper’s updated Terms of Service will come into effect on April 6th, 2018.

Version 2.84.0 of request was just published.

Branch: Build failing 🚨
Dependency: request
Current Version: 2.83.0
Type: dependency

This version is covered by your current version range and after updating it in your project the build failed.

request is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push The Travis CI build failed Details

Commits

The new version differs by 6 commits.

  • d77c839 Update changelog
  • 4b46a13 2.84.0
  • 0b807c6 Merge pull request #2793 from dvishniakov/2792-oauth_body_hash
  • cfd2307 Update hawk to 7.0.7 (#2880)
  • efeaf00 Fixed calculation of oauth_body_hash, issue #2792
  • 253c5e5 2.83.1

See the full diff

Error in the log file

Occasionally, I am getting the following message on the log:

error: Error querying {"name":"error","length":210,"severity":"ERROR","code":"42883","hint":"No function matches the given name and argument types. You might need to add explicit type casts.","position":"72","file":"parse_func.c","line":"523","routine":"ParseFuncOrColumn","msg":"function st_geomfromgeojson(text) does not exist"}

This happens when making the following call (or similar):

http://localhost:1337/socrata/nola/hpm5-48nj

which gives me null in the response. What does it mean, and how should I deal with it?

optimize socrata paging

We are seeing a massive slowdown in Socrata services that require paging through lots of data. One reason is that the current paging routine does not seem to throttle its requests. It currently creates a list of pages to GET and immediately requests them all. This is probably causing trouble on the server when we make 100 to 200 requests for data all at once.

The solution might be to scale it back a little and request pages more slowly.
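One way to request pages more slowly is to cap how many requests are in flight at once. The sketch below is a generic concurrency limiter, not the provider's code; `runThrottled` is a hypothetical name, and each task stands in for one page request.

```javascript
// Hypothetical limiter: runs async task functions with at most `limit` in flight.
// Results come back in the same order as the tasks, regardless of finish order.
async function runThrottled(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers = [];
  for (let i = 0; i < Math.min(limit, tasks.length); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}
```

With a page list in hand, each task would wrap one GET, for example `pages.map((url) => () => fetchPage(url))`, where `fetchPage` is whatever request function the application already has.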

displayField should not be hard-coded to 'name'

See this service that doesn't have a field called name. If there is no name, default to the first string?

{
  "currentVersion": 10.21,
  "id": 0,
  "name": "psp3-bvzw",
  "type": "Feature Layer",
  "displayField": "name",
  "description": "",
  "copyrightText": "",
  "defaultVisibility": true,
  "relationships": [
...
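The fallback suggested above could look roughly like this. `pickDisplayField` is a hypothetical helper, the field list is made-up sample data, and the sketch assumes fields carry Esri-style type names such as `esriFieldTypeString`.

```javascript
// Hypothetical helper: choose a displayField for a layer.
// Prefers a field called "name", then the first string field, then the first field.
function pickDisplayField(fields) {
  if (fields.length === 0) return null;
  const named = fields.find((f) => f.name.toLowerCase() === 'name');
  if (named) return named.name;
  const firstString = fields.find((f) => f.type === 'esriFieldTypeString');
  return (firstString || fields[0]).name;
}

// Made-up fields for a layer that has no "name" column.
const sampleFields = [
  { name: 'permit_no', type: 'esriFieldTypeInteger' },
  { name: 'address', type: 'esriFieldTypeString' }
];
console.log(pickDisplayField(sampleFields));
```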

Certain datasets from Socrata have incorrect spatial reference

Here is an example:
http://koop.dc.esri.com/socrata/wastate/f9h8-rtz6/FeatureServer

Notice:
spatialReference: {
  wkid: 4326,
  latestWkid: 4326
},
initialExtent: {
  xmin: -13610908.050271627,
  ymin: 6061419.6999203265,
  xmax: -13605556.778112192,
  ymax: 6066753.658368595,
  spatialReference: {
    wkid: 4326,
    latestWkid: 4326
  }
},

It appears these coordinates xmin, xmax etc. are web mercator coordinates (wkid:3857) but Koop is saying they are lat/long by setting the wkid to 4326. We need something that will test for the proper coordinate system.
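A rough test of this kind could compare the extent against the valid range of each coordinate system. The sketch below is a heuristic only, with `guessWkid` as a hypothetical name: Web Mercator values that happen to lie within ±180 and ±90 of the origin would still be read as degrees.

```javascript
// Half-width of the Web Mercator (wkid 3857) plane in meters: pi * 6378137.
const MERCATOR_MAX = Math.PI * 6378137;

// Hypothetical heuristic: guess a WKID from the magnitude of an extent's coordinates.
// Returns 4326 for degree-sized values, 3857 for meter-sized values, otherwise null.
function guessWkid(extent) {
  const { xmin, ymin, xmax, ymax } = extent;
  const fitsDegrees =
    Math.abs(xmin) <= 180 && Math.abs(xmax) <= 180 &&
    Math.abs(ymin) <= 90 && Math.abs(ymax) <= 90;
  if (fitsDegrees) return 4326;
  const fitsMercator = [xmin, ymin, xmax, ymax].every(
    (value) => Math.abs(value) <= MERCATOR_MAX
  );
  return fitsMercator ? 3857 : null;
}

// The extent reported by the service above.
const reported = {
  xmin: -13610908.050271627,
  ymin: 6061419.6999203265,
  xmax: -13605556.778112192,
  ymax: 6066753.658368595
};
console.log(guessWkid(reported));
```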

Unable to install on Windows

Perhaps all files under test/fixtures have invalid names under Windows fs rules (colon ':' is not permitted).

tests leave leftover logging/error messages

A successful run of the tests leaves two leftover files in the root of the repository.

..year-month-day and .error.year-month-day.

2015-08-24T19:49:10.641Z error Could not get rowCount. count::https://data.seattle.gov/resource/missing.json?$select=count(*)::404

2015-08-24T19:49:10.642Z error Could not get metadata. meta::https://data.seattle.gov/views/missing.json::404

2015-08-24T19:49:11.229Z error Could not get rowCount. Could not parse count JSON

2015-08-24T19:49:11.737Z error Could not get rowCount. count::https://data.seattle.gov/resource/filtered.json?$select=count(*)::500

2015-08-24T19:49:12.238Z error Could not get first row. first::https://data.seattle.gov/resource/countFail.json?$order=:id&$limit=1::500

A quick review suggests we are logging behavior that the tests provoke intentionally, so maybe it'd be best to just blow the files away afterward?

Koop-Socrata handles 1mm row datasets

Concept

As a user I can access a performant feature service that is proxying: http://data.seattle.gov/resource/3k2p-39jp.json

Details

  • Requests required to move data from Socrata should not overwhelm hourly limit of 1,000 queries
  • Requests for 10,000 or more rows at once from Socrata should not clog the node process
  • Query performance should be < 2 seconds when the data is already cached
    • Count Only
    • Single page
    • OutStatistics

cc @astauffer

Koop-Socrata Cache Timer Not Expiring

The Koop Cache timer is not expiring.

Environment:

  • Ubuntu 14.04 EC2
  • PostgreSQL 9.3.9 (w/PostGIS)
  • Most recent build of koop-sample-app
  • Most recent install of koop-socrata

STR:

@dmfenton

Multiple Location Column Designator not Working

When working with a dataset with multiple location columns (https://data.detroitmi.gov/Property-Parcels/BSEED-Permits-Issued/n7kb-xdcs) the syntax /socrata/provider/socrataId!columnName no longer appears to work.

Environment:

  • Most recent build of koop-sample-app and koop-socrata in Ubuntu 14.04 EC2
  • PostGIS Cache

Calling Path: /socrata/detroitmi/n7kb-xdcs!site_location creates the following log in Node:

info: CREATE TABLE "Socrata:n7kb-xdcs:0" (id SERIAL PRIMARY KEY,feature JSON,geom Geometry(POINT, 4326),geohash varchar(10))
debug: Updating info Socrata:n7kb-xdcs:0 processing
info: Processing: https://data.detroitmi.gov/resource/n7kb-xdcs.json?$order=:id&$limit=50000&$offset=1
error: insert partial ERROR error: relation "Socrata:n7kb-xdcs!site_location:0" does not exist, Socrata:n7kb-xdcs!site_location
error: Failed while inserting a page of Socrata:n7kb-xdcs!site_location:0. error: relation "Socrata:n7kb-xdcs!site_location:0" does not exist
error: Could not get info of Socrata:n7kb-xdcs!site_location:0 Key Not Found Socrata:n7kb-xdcs!site_location:0
debug: Updating info Socrata:n7kb-xdcs!site_location:0 undefined
info: Finished paging Socrata:n7kb-xdcs!site_location:0
express deprecated res.send(body, status): Use res.status(status).send(body) instead node_modules/koop-socrata/controller/index.js:28:13

Table written to PostGIS is Socrata:n7kb-xdcs:0

Failing to register hosts

Summary:

I've cloned the sample app and I'm trying to pull down some Socrata data, but I'm unable to register a host. It seems like this is because Koop is trying to query a table that doesn't exist. When I use the local cache, I can add a host, but I can't get a dataset from that host.

Details:

When I perform a github/ query like http://localhost:1337/github/chelm/grunt-geo/forks it seems to work fine. A new table is created in my local Postgres database and the expected GeoJSON is returned.

When I query /socrata, I get the helpful suggestion to POST a host. But when I use the one-liner example in the README (curl --data "host=https://data.nola.gov&id=nola" localhost:1337/socrata) the following error is printed:

{"name":"error","length":107,"severity":"ERROR","code":"42P01","position":"31","file":"parse_relation.c","line":"986","routine":"parserOpenTable"}

To make sure there wasn't something weird going on with my shell, I made a little Python script to make the request. Same outcome.

Some Googling shows that 42P01 is a postgres error meaning "Table does not exist".

Then I tried disabling PGCache in the sample app's server.js so I could default to the in-memory cache. Now I can successfully add a host! (I added New Orleans as shown in the README.) Then I tried using the sample query on the README:

http://localhost:1337/socrata/nola/fwm6-d78i

This stack trace got printed to the console:

TypeError: Cannot read property 'info' of undefined
    at Object.module.exports.getInfo (/Users/willengler/Sandbox/koop-sample-app/node_modules/koop/lib/Local.js:62:35)
    at Cache.getInfo (/Users/willengler/Sandbox/koop-sample-app/node_modules/koop/lib/Cache.js:178:13)
    at /Users/willengler/Sandbox/koop-sample-app/node_modules/koop-socrata/models/Socrata.js:167:20
    at /Users/willengler/Sandbox/koop-sample-app/node_modules/koop/lib/Cache.js:133:23
    at Object.module.exports.select (/Users/willengler/Sandbox/koop-sample-app/node_modules/koop/lib/Local.js:34:7)
    at Cache.get (/Users/willengler/Sandbox/koop-sample-app/node_modules/koop/lib/Cache.js:132:13)
    at Object.socrata.getResource (/Users/willengler/Sandbox/koop-sample-app/node_modules/koop-socrata/models/Socrata.js:165:16)
    at /Users/willengler/Sandbox/koop-sample-app/node_modules/koop-socrata/controller/index.js:69:17
    at /Users/willengler/Sandbox/koop-sample-app/node_modules/koop-socrata/models/Socrata.js:49:9
    at Object.module.exports.serviceGet (/Users/willengler/Sandbox/koop-sample-app/node_modules/koop/lib/Local.js:159:7)

But then there was this hopeful message:

info: Processing: https://data.nola.gov/resource/fwm6-d78i.json?$order=:id&$limit=10000&$offset=1
info: Processing: https://data.nola.gov/resource/fwm6-d78i.json?$order=:id&$limit=10000&$offset=1
info: Beginning to page through https://data.nola.gov/resourcefwm6-d78i 1 Pages.
info: Beginning to page through https://data.nola.gov/resourcefwm6-d78i 1 Pages.
info: Finished paging Socrata:fwm6-d78i:0

Subsequent GETs to http://localhost:1337/socrata/nola/fwm6-d78i give me something like {"checked_at":"2015-09-28T17:38:39.725Z"} instead of the data I'm expecting.

Am I doing something wrong?

Config:

For reference, here's the contents of my default.json file:

{
  "server": {
    "port": 1337
  },
  "data_dir": "/usr/local/koop/",
  "db": {
    "conn": "koop://localhost/koop_test"
  }
}

I've deleted the https config file.

Koop Not Handling Bad Values

FeatureServices in Koop are not responding when bad or out-of-range values are present in the Socrata table.

Example: http://detroitkoop-268609380.us-east-1.elb.amazonaws.com/socrata/detroitmi/uhnf-v2zs

  • Several rows in the location column have values like:
{"coordinates":[999998.9998,999999],"type":"Point",

The resulting featureService refuses to draw in a map: http://www.arcgis.com/home/webmap/viewer.html?url=http://detroitkoop-268609380.us-east-1.elb.amazonaws.com/socrata/detroitmi/uhnf-v2zs/featureserver/0
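One possible mitigation is to screen out points whose coordinates cannot be longitude and latitude before they reach the feature service. The sketch below is an illustration rather than the provider's behavior: `isValidLonLat` is a hypothetical helper, and the second sample row is made-up data.

```javascript
// Hypothetical filter: true only for Point geometries with plausible lon/lat values.
function isValidLonLat(geometry) {
  if (!geometry || geometry.type !== 'Point' || !Array.isArray(geometry.coordinates)) {
    return false;
  }
  const [lon, lat] = geometry.coordinates;
  return (
    Number.isFinite(lon) && Number.isFinite(lat) &&
    Math.abs(lon) <= 180 && Math.abs(lat) <= 90
  );
}

const rows = [
  // Out-of-range value like the one reported in this issue.
  { location: { type: 'Point', coordinates: [999998.9998, 999999] } },
  // Made-up, well-formed row for comparison.
  { location: { type: 'Point', coordinates: [-83.05, 42.33] } }
];
const usable = rows.filter((row) => isValidLonLat(row.location));
console.log(usable.length);
```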

Socrata Cache not Expiring

Source data: http://data.detroitmi.gov/resource/encd-2smf.json

  • Last-Modified header: "Mon, 13 Apr 2015 21:11:05 PDT"
  • Date header: "Tue, 14 Apr 2015 14:12:31 GMT"

Data already exists in the Koop cache (Postgres database):

  • Table: Socrata:encd-2smf:0
  • Entry in kooptimers table: id: Socrata:encd-2smf:timer; expires: '1428951734558' (Mon, 13 Apr 2015, 19:02:14 GMT)

New request initiated for: http://detroitkoop-268609380.us-east-1.elb.amazonaws.com/socrata/detroitmi/encd-2smf/featureserver/0

  • Koop does not drop expired cache, reports "old" data
  • kooptimer remains set as expired 1428951734558 value

The only workaround I have found is to drop the corresponding Koop table, delete the corresponding row from kooptimers, and make a new request.
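For reference, the timer value above can be checked against the request time with a few lines of Node. This only illustrates the comparison one would expect a cache to make; `isExpired` is a hypothetical helper, and how Koop itself evaluates kooptimers is not shown in this report.

```javascript
// Hypothetical check: a timer stored as epoch milliseconds has expired
// once the current time reaches or passes it.
function isExpired(expires, now) {
  return Number(expires) <= now;
}

// Values taken from this report: the kooptimers entry and the Date header.
const expires = '1428951734558';
const requestTime = Date.parse('Tue, 14 Apr 2015 14:12:31 GMT');

console.log(new Date(Number(expires)).toUTCString());
console.log(isExpired(expires, requestTime));
```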

needs updating

This module is still including a default config in YAML and the tests are requiring mocha and koop-server, neither of which are listed as dev dependencies or dependencies. Even when mocha, koop-server, and yaml modules are installed, I get a segmentation fault when trying to run the test. Needs to be brought up to speed with the latest version of koop and made usable.
