
osdi-docs's Introduction


OSDI


The Open Supporter Data Interface (OSDI) effort seeks to define an API and data structures for interoperability among products in the progressive cause-based, campaign and non-profit marketplace. The existence of a common API will reduce customer costs related to moving data between different systems, lower integration costs and enhance the ability of innovators to create products for the marketplace.

OSDI membership is made up of progressive vendors and organizations as well as invited non-partisan and mainstream industry vendors.

More Information about OSDI can be found at: opensupporter.org

The Github source for these documents can be found at: https://github.com/opensupporter/osdi-docs

If you are looking at those sources now and want to see the prettier github pages version, look here: https://opensupporter.github.io/osdi-docs

Experiment with our prototype server: http://api.opensupporter.org

Sections

API Overview and Structure

OSDI uses a combination of approaches to provide flexible reading of data, simple operations for simple scenarios, and general purpose CRUD access.

Version

This document represents OSDI version 1.2.0

Working with OSDI in Real Life

HAL Browser

OSDI Servers SHOULD expose the HAL Browser to provide a consistent interface for developers, scripters, digital, tech and data staff to work with.

In the course of writing scripts, reports, applications and other utilities that integrate via OSDI, examining and inspecting the different resources available on a server is a significant component of time spent. By having a consistent interface to work in, customers can further decrease their costs.

Ask your OSDI vendor for the URL to their HAL Browser.

Simple Code Examples

Python - Get People
Ruby - Get People

jQuery Plugin

An OSDI jQuery plugin has been created for use with OSDI's non-authenticated POST feature. The plugin facilitates a javascript implementation of OSDI, allowing for HTML forms to POST data into an OSDI system using AJAX in a user's browser, all without much coding.

Learn more about the plugin and download a copy here.

REST + HAL

Generally, OSDI follows traditional RESTful practices for accessing resources and collections of resources as well as creating, editing, updating, and deleting resources.

OSDI also implements the JSON+HAL spec hypermedia standard, providing links to associated collections and resources. JSON+HAL specifies a simple way to include these links in API output. The combination of linking and a specification allows generic clients to be written and, indeed, many languages have HAL clients. Linking itself makes it easier to both reason about and write clients for an API.

In addition to providing links for associated collections and resources, JSON+HAL specifies a way to embed the actual associated resources (or collections) in the same API response as the links. For example, a request for a collection of Person resources will return a link for each Person resource, and may also return the actual Person resources to which the links point. This allows OSDI providers and users to reduce the number of server round-trips needed to retrieve a set of data, at the expense of working with larger (possibly much larger) response sizes.

Embedding is optional for associated resources or collections. However, any resources or collections that are embedded must also be accessible by link. OSDI clients may check to see if a related resource is embedded in the response, and if not present, should fall back to getting the resource via its link.
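
The fallback described above can be sketched in Python as follows. This is only an illustration: the function names `fetch_json` and `related` are hypothetical and not part of OSDI, and the sketch assumes an unauthenticated GET is enough to follow a link.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body (plain GET, no authentication)."""
    with urlopen(url) as response:
        return json.load(response)


def related(resource, rel, fetch=fetch_json):
    """Return the resource(s) for `rel`, preferring the embedded copy."""
    embedded = resource.get("_embedded", {})
    if rel in embedded:
        # The server embedded the related data, so no extra round-trip is needed.
        return embedded[rel]
    link = resource.get("_links", {}).get(rel)
    if link is None:
        raise KeyError(f"no embedded resource or link for {rel!r}")
    # Not embedded: fall back to following the link.
    return fetch(link["href"])
```

A client would call `related(person, "osdi:attendances")` and get the same data whether or not the server chose to embed it.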

By default, server responses should expand first level instances unless otherwise specified. For example, in a response for a collection of resources, those resources should be embedded.

Finally, OSDI implements the OData query language for filtering collections.

Versioning

OSDI uses Semantic Versioning. In practice, this means:

  • Breaking changes will use a new major version number (eg: 2.0)
  • New features will use a minor version number (eg: 1.1)
  • Incremental Bug fixes may use a sub-minor version number (1.1.1)

Current Version

When accessing a server, a client can determine the OSDI version by examining the osdi_version attribute in the API Entry Point (AEP).
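
As a rough sketch, a client could read osdi_version from an already-fetched AEP and compare major versions before proceeding. The helper names `parse_version` and `is_compatible` are hypothetical, and the sketch assumes the attribute holds a dotted numeric string such as "1.2.0".

```python
def parse_version(text):
    """Split a version such as '1.2.0' into a (major, minor, patch) tuple."""
    parts = [int(part) for part in text.split(".")]
    # Pad short versions such as '2.0' so the tuple always has three entries.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_compatible(aep, supported_major=1):
    """True when the server's major version matches the one this client targets."""
    major, _minor, _patch = parse_version(aep["osdi_version"])
    return major == supported_major
```

Comparing only the major number follows from the rules above: minor and sub-minor releases should not introduce breaking changes.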

Helpers

Through helpers, OSDI also allows a client to perform in a single request a number of operations that would take multiple requests in a traditionally RESTful API. For example, a helper can create a new Person resource and, at the same time, register that this new person signed a petition, something that with plain REST would require two operations (first creating the person, then associating them with the petition).

API Entry Point and linking

All access through OSDI starts at the API Entry Point (AEP). The AEP is a resource that acts like a directory of the types of resources available on a server. It also includes capability information like the maximum query pagesize and links to helper endpoints.
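
To illustrate the directory role of the AEP, the sketch below lists the link relations found in an AEP that has already been fetched and parsed. The function name `available_links` is hypothetical, and the sketch assumes the AEP carries its directory in a HAL _links object like other OSDI resources.

```python
def available_links(aep):
    """Map each link relation in the AEP to its href, skipping curies and self."""
    links = aep.get("_links", {})
    return {
        rel: link["href"]
        for rel, link in links.items()
        if rel not in ("curies", "self")
    }
```

A client can then check, for example, whether "osdi:people" is among the keys before attempting to read the people collection.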

Your service provider can tell you what the AEP URL is for your account.

You can explore the AEP with a user-friendly interface by visiting our prototype endpoint.

Curies

You may have noticed that most links are prefaced with a name space "osdi" and that in the _links section there is a key labeled "curies." The link section defines links to relationships between objects and curies define those relationships. You will find documentation on the particular relationship by using the templated curie link. For example, given the following links section:

"_links": {
    "curies": [{ "name": "osdi", "href": "http://api.opensupporter.org/docs/v1/{rel}", "templated": true }],
    "self": {
        "href": "http://api.opensupporter.org/api/v1/answers/46"
    },
    "osdi:question": {
        "href": "http://api.opensupporter.org/api/v1/questions"
    }
}

In order to fetch documentation on the question relationship, you would visit the following url: http://api.opensupporter.org/docs/v1/question
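
The same expansion can be done programmatically. The following is a minimal sketch, with a hypothetical function name, that resolves a namespaced relation to its documentation URL using the curies in a _links object:

```python
def documentation_url(links, rel):
    """Expand a namespaced relation such as 'osdi:question' via its curie."""
    prefix, separator, name = rel.partition(":")
    if not separator:
        return None  # The relation has no curie namespace.
    for curie in links.get("curies", []):
        if curie["name"] == prefix:
            # Substitute the relation name into the templated href.
            return curie["href"].replace("{rel}", name)
    return None  # No curie is defined for this namespace.
```

Given the links section above, `documentation_url(links, "osdi:question")` produces the documentation URL shown, while an un-namespaced relation such as "self" yields `None`.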

Any links not prefaced with a curie name space are defined here.

Vendors who add their own vendor-specific relationships must define their own curie and preface their relationships with their own curie namespace. For example,

"_links": {
    "curies": [
        { "name": "osdi", "href": "http://api.opensupporter.org/docs/v1/{rel}", "templated": true },
        { "name": "fb", "href": "http://facebook.com/docs/v1/{rel}", "templated": true }
    ],
    "self": {
        "href": "http://api.opensupporter.org/api/v1/question_answers/46"
    },
    "osdi:question": {
        "href": "http://api.opensupporter.org/api/v1/questions"
    },
    "fb:profile": {
        "href": "http://facebook.com/profiles/1234"
    }
}

Collections and Navigation

When retrieving collections, the response representation will include some common attributes.

| Name | Type | Description |
|------|------|-------------|
| total_pages | integer | The number of pages applicable to this collection. |
| total_records | integer | The total number of resources matching this collection. |
| page | integer | The page number of this response. |

Collection responses may include additional links for navigation to previous and next pages.

| Name | Description |
|------|-------------|
| next | The link for the next page of results. |
| previous | The link for the previous page of results. |

The parameters per_page and page control pagination.

  • ?per_page specifies how many results to return per page.
  • ?page specifies which page of results to return.
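
A client can combine these parameters with the next link to walk a whole collection. The sketch below is illustrative only: `page_url` and `iter_pages` are hypothetical names, the default of 25 results per page is arbitrary, and `fetch` stands for any function that takes a URL and returns the parsed JSON response.

```python
from urllib.parse import urlencode


def page_url(collection_url, page=1, per_page=25):
    """Build a collection URL with explicit page and per_page parameters."""
    return f"{collection_url}?{urlencode({'page': page, 'per_page': per_page})}"


def iter_pages(first_url, fetch):
    """Yield each page of a collection, following 'next' links until none remain."""
    url = first_url
    while url:
        page = fetch(url)
        yield page
        next_link = page.get("_links", {}).get("next")
        # The last page carries no 'next' link, which ends the loop.
        url = next_link["href"] if next_link else None
```

Following the server-supplied next link, rather than incrementing the page number by hand, keeps the client independent of how the server builds its page URLs.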

Resource Expansion

Consult the documentation from your vendor or implementer to determine if expansion is supported.

Including Linked Related Resource Collections

The server may support including related resources in responses. For example, a client might wish to have a response containing people also contain their event attendance resources. In this case, the attendance resources would be contained within an _embedded collection within each person object.

To request that the server include these related resources, use the $expand query parameter. Its value is the name of the related OSDI collection, including its prefix, for example 'osdi:attendances'.

GET https://osdi-sample-system.org/api/v1/people?$expand=osdi:attendances

The $expand parameter can contain a comma separated list of resources to include.

Including Inline Related Objects

Certain OSDI resources have Related Objects, which are objects included inline in the resource rather than linked as a related resource collection.

Some servers may choose not to automatically return all inline related objects, for example if collecting the needed information for the object is expensive or time consuming. In this case the server will omit those objects, unless the client includes the $expand query parameter containing the comma separated additional objects to return, specified by their attribute name (without an osdi: prefix).

GET https://osdi-sample-system.org/api/v1/people?$expand=divisions
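
Both kinds of expansion use the same parameter, so a client can build the URL with one small helper. This is a sketch with a hypothetical function name; it leaves the colon and comma unescaped so the result matches the request lines shown above.

```python
from urllib.parse import quote


def expand_url(collection_url, *names):
    """Append a $expand parameter listing the given collections or objects."""
    value = ",".join(names)
    # Keep ':' and ',' literal; other unsafe characters are percent-encoded.
    return f"{collection_url}?$expand={quote(value, safe=':,')}"
```

For instance, `expand_url(url, "osdi:attendances", "divisions")` requests a linked collection and an inline related object in a single call.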

Filtering Collections

When retrieving collections, a client may request that the server filter the results according to a query. OSDI makes use of a subset of the OData query language to accomplish this. The filter string is the value of the 'filter' query parameter.

See OData Filter Query for more information.

General information can be found at odata.org.

Conventions

  • String literals are enclosed in single quotes, eg: 'Jon'
  • Integers are not quoted, eg: 5
  • The whole query string is not enclosed in any quotes
  • Object properties are referenced using /, not ., e.g. birthdate/month

Operators

OSDI supports the following OData operators:

| Name | Description | Example |
|------|-------------|---------|
| eq | Exact match | first_name eq 'John' |
| ne | Not equal (exact match) | first_name ne 'John' |
| gt | Greater than | birthdate/year gt 1980 |
| ge | Greater than or equal to | created ge '2013-11-17T18:27:35-05' |
| lt | Less than | birthdate/year lt 1980 |
| le | Less than or equal to | created le '2013-11-17T18:27:35-05' |
| or | Logical OR | first_name eq 'John' or first_name eq 'Jon' |
| and | Logical AND | first_name eq 'John' and last_name eq 'Doe' |
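
The conventions and operators above can be applied mechanically when building a request. The following sketch uses hypothetical helper names and covers only string and integer literals; it does not attempt to escape single quotes inside string values, which this document does not specify.

```python
from urllib.parse import quote


def literal(value):
    """Render a Python value as a filter literal: integers bare, strings quoted."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, int):
        return str(value)
    return "'" + str(value) + "'"


def clause(field, operator, value):
    """Build one comparison, e.g. clause('first_name', 'eq', 'John')."""
    return f"{field} {operator} {literal(value)}"


def filter_url(collection_url, expression):
    """Attach a percent-encoded filter expression to a collection URL."""
    return f"{collection_url}?filter={quote(expression)}"
```

Clauses can be joined with " and " or " or " before being passed to `filter_url`, which percent-encodes spaces and quotes while leaving the `/` in property paths such as birthdate/year intact.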

OSDI defines the following OPTIONAL extension operators:

| Name | Description | Example |
|------|-------------|---------|
| like | Case insensitive match | first_name like 'john' # returns John or john |
| re | Matches a regular expression | first_name regexp '/[Rr]ob/' # Returns robert, Robert, rob, roberto |

Functions

OSDI defines the following OPTIONAL extension functions:

| Name | Description | Example |
|------|-------------|---------|
| near | Returns entries near a location within a radius | gender eq 'Female' and near('10011', '5 miles') |

Virtual Field Names

There are some resource fields that should be searchable, but are not directly addressable using the filter syntax. This is the case for array fields and fields on array items. To allow querying of these properties, we expose Virtual Field Names at the filter level.

OSDI implementations should add special case query handlers for these filter options, where the parent resource should be returned if any of the array items match the condition.

| Resource | Field | Virtual Field |
|----------|-------|---------------|
| Donation | recipient.display_name | recipient_display_name |
| Donation | recipient.legal_name | recipient_legal_name |
| Message | targets.href | target_href |
| Outreach | targets.given_name | target_given_name |
| Outreach | targets.family_name | target_family_name |
| Outreach | targets.ocdid | target_ocdid |
| Petition | targets.name | target_name |
| Person | email_addresses.address | email_address |
| Person | phone_numbers.number | phone_number |
| Person | postal_addresses.postal_code | postal_code |
| Person | postal_addresses.region | region |
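
On the server side, the "any array item matches" rule might look like the sketch below for the Person virtual fields. The names `VIRTUAL_FIELDS` and `matches_eq` are hypothetical, only the eq operator is shown, and a real implementation would more likely translate the filter into a database query.

```python
# Virtual field name -> (array attribute on Person, attribute on each item).
VIRTUAL_FIELDS = {
    "email_address": ("email_addresses", "address"),
    "phone_number": ("phone_numbers", "number"),
    "postal_code": ("postal_addresses", "postal_code"),
    "region": ("postal_addresses", "region"),
}


def matches_eq(person, virtual_field, expected):
    """True when any item in the underlying array equals the expected value."""
    array_field, item_field = VIRTUAL_FIELDS[virtual_field]
    return any(
        item.get(item_field) == expected
        for item in person.get(array_field, [])
    )
```

A person with several postal addresses is therefore returned by `postal_code eq '10011'` if at least one of those addresses has that postal code.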

Examples

Find all males in a given ZIP code:
GET /api/v1/people?filter=gender eq 'Male' and postal_code eq '10011'

Find new signups on or since a date and time (Eastern Time):
GET /api/v1/people?filter=created ge '2013-11-17T18:27:35-05'

Find all people associated with a given email address:
GET /api/v1/people?filter=email_address eq '[email protected]'

Encryption

Providers should support secure HTTPS connections using TLS 1.0 and above, and reject non-secure HTTP connections.

If necessary, providers may support non-secure HTTP connections in addition or instead.

Authentication

Clients and providers may use a variety of mechanisms to authenticate and authorize operations. The specification does not currently require supporting a specific method. However, there are many choices which can work with this specification.

  • Cookie-Based Authentication
  • HTTP Basic
  • HTTP Digest
  • Token-Based Authentication
  • OAuth and OAuth 2.0
  • OpenID

Future versions of this specification may officially support one or more of these methods, or provide standard ways of implementing various methods, or may in other ways be more specific about security and authentication.

Not all interactions with the API will require authentication, and some behaviors might differ based on whether your request includes authentication or not. For example, when used in a javascript application, where authentication secrets can't be kept from users, a non-authenticated method may be used, and the server response may be truncated so as to not leak data to unauthenticated users. Some specific non-authenticated behaviors included in this specification are outlined on each resource page.

OSDI Token Based Authentication

While OSDI does not currently mandate implementation of token-based authentication, for those that do implement this method of authentication the following standard should be followed.

For header-based token authentication, the header should be named OSDI-API-Token (case sensitive), as in this example:

OSDI-API-Token: [your token here]

For URL query string-based token authentication, the query parameter should be named osdi-api-token (case insensitive), as in this example:

https://api.opensupporter.org/api/v1/?osdi-api-token=[your token here]
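
Both variants can be prepared with a few lines of Python. This is a sketch with hypothetical helper names; how the headers or URL are then sent depends on the HTTP client in use.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def token_headers(token):
    """Headers for header-based token authentication."""
    return {"OSDI-API-Token": token}


def with_token_param(url, token):
    """Return the URL with an osdi-api-token query parameter appended."""
    parts = urlsplit(url)
    # Preserve any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("osdi-api-token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the header name is described as case sensitive, it is worth checking that the chosen HTTP client sends it unchanged: some clients normalize header capitalization (Python's urllib.request.Request, for example, stores the name as "Osdi-api-token").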

Mime Types

When sending requests or responses, the preferred mime type is application/json.

Servers and clients are strongly encouraged to be liberal in accepting entities with a missing or incorrect mime type.

Error Handling

If the attempt to access, update, or create a resource or collection fails, the server shall return the appropriate HTTP error code representing the failure.

Within the response body, the server shall include descriptive information on the nature of the failure.
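
Since the format of the descriptive information is not defined here, a client should be prepared for either JSON or plain text in the body. The sketch below, with a hypothetical function name, turns a failed response into something that can be logged or shown to a user:

```python
import json


def describe_failure(status, body):
    """Summarise a failed response from its status code and raw body bytes."""
    text = body.decode("utf-8", errors="replace")
    try:
        detail = json.loads(text)
    except ValueError:
        # The body was not JSON, so keep the raw text for the caller to log.
        detail = text
    return {"status": status, "detail": detail}
```

With Python's urllib, for example, a failed request raises `urllib.error.HTTPError`, and the two arguments are available as `error.code` and `error.read()`.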

Flexibility and Server Behavior

Not all systems that implement OSDI will implement all aspects of the specification.

There are no required fields in OSDI, and many relations are left up to each individual system and server.

Some servers may support some or all of the different resource collections. For example, a peer to peer donation system might support Donations and People but not events. In order to find out what resources are available and what URIs to use to access them, do a GET on the AEP URL.

Some servers may support certain helpers and not others. The AEP and associated resources also include links to the helper endpoints available.

Similarly, matching behavior will be determined by each implementing system. For example, some systems may match people based on email address or other information.

Deviations from RESTful Behavior

This section outlines areas where the expectations of the OSDI customer community differ from boilerplate RESTful behavior and specifies the expected best-practice behavior.

Merging Objects (hashes) and Arrays on update

Numerous resources in OSDI contain embedded objects (hashes) and arrays. For example, on Person, Birthday and Custom Fields are Objects (hashes) and Postal Addresses is an array.

When dealing with updates to these elements of resources, servers should merge rather than replace these elements. This is more consistent with expected user behavior. This applies to helpers and to POSTs to collections.

For example, sending a person_signup_helper, or a POST on the person collection, with only 1 address should not delete any existing addresses. Similarly, including only 1 custom field should not delete all existing custom fields on a resource.

The behavior for PUT is currently unspecified and up to server behavior. Contact your vendor for more details.

Deleting the full contents of an Object (hash) or Array

In order to cause the deletion of the contents of an Object (hash) or Array, a request should include that element, but set its value to null.
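
One possible reading of these merge and deletion rules is sketched below. It is not a normative algorithm: the function name is hypothetical, array items are compared by exact equality (real servers apply their own matching rules), and a null value is treated as removing the element entirely.

```python
def merge_update(existing, update):
    """Merge an incoming representation into a stored resource."""
    merged = dict(existing)
    for key, value in update.items():
        if value is None:
            # An explicit null asks for the element's contents to be deleted.
            merged.pop(key, None)
        elif isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Objects (hashes): keep existing keys, overwrite those supplied.
            merged[key] = {**merged[key], **value}
        elif isinstance(value, list) and isinstance(merged.get(key), list):
            # Arrays: keep existing items and add any that are new.
            merged[key] = merged[key] + [
                item for item in value if item not in merged[key]
            ]
        else:
            merged[key] = value
    return merged
```

Under this sketch, posting a person with a single address adds that address to the stored list, while posting `"postal_addresses": null` clears the list.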

Common Elements

All OSDI resources share a set of common fields for consistency. These are listed below.

{% include global_fields.md %}

{% include control_headers_readme.md %}

Notational Conventions

In this specification, when defining models, the following notational conventions are used.

| Convention | Description |
|------------|-------------|
| Type[] | An array of objects of type 'type' |
| Type[]* | A reference to a collection of resources of type 'type' |
| Type* | A reference to a single resource of type 'type' |
| string | A string |
| datetime | A date and time representation. In JSON this is a string. The contents of this attribute shall be ISO 8601 |
| Object | A complex attribute represented by a JSON object |
| decimal | A number in decimal notation such as 12.15. Used for currency. |
| flexenum | One of a list of values, or another value. For example, for party_identification on people, if the person is a Democrat they should be marked as "Democratic" with that exact spelling and casing, but if they are not one of the defined types then you can use another value instead, such as "Working Families". |

In the description of string types, sometimes the specification will list a set of acceptable values such as

| Name | Type | Description |
|------|------|-------------|
| gender | string | one of "Male", "Female", "Other" |

In these cases, the string value should conform to one of the choices unless specified otherwise.

References

ID Title URL
RFC5646 Tags for Identifying Languages https://tools.ietf.org/html/rfc5646

Authors and Leadership

  • Leo Aguayo, Organizer
  • Tim Anderegg, New Organizing Institute (NOI)
  • Topper Bowers, Independent
  • Beth Becker, Indigo Strategies
  • Gilbert Chan, Organizer
  • Josh Cohen, Washington United For Marriage (Editor)
  • Jeff Crigler, Catalist
  • Gustavo Costa, The Action Network
  • Michael Eskin, Blue State Digital
  • Jascha Franklin-Hodge, Blue State Digital
  • Abraham Godong, FasterCampaigns
  • Tim Gutowski, Trilogy Interactive
  • Harlan Hill, Indigo Strategies
  • Tim Holahan, BroadStripes
  • Ben Krokower, FasterCampaigns
  • Eli Lee, The Quad
  • Dave Leichtman, Microsoft Corporation
  • Marc Love, Independent
  • Walter Ludwig, Indigo Strategies
  • Drew Miller, NGP VAN
  • Joe McLaughlin, Citizen Action of New York
  • Mark Paquette, TheDataBank
  • Charles Parsons, Salsa Labs
  • Rich Ranallo, Revolution Messaging
  • Jason Rosenbaum, The Action Network
  • Shaie Sachs, NGP VAN
  • Ben Stein, Mobile Commons
  • Ben Stroud, Targetsmart Communications
  • Ray Suelzer, UFCW International Union
  • Nate Thames, ActBlue
  • Jim Pugh, ShareProgress
  • Sylvia Rolle, Washington United for Marriage
  • Chris Thomas, Sierra Club
  • Brian Vallelunga, Trilogy Interactive (Editor)
  • Sandra Wechsler, The Quad
  • Nathan Woodhull, ControlShift Labs
  • Ryan Zarkesh
  • Misha Zhurkin, Catalist
  • Kayley Whalen, Trans United
  • Hayden Mora, Trans United
  • Sonya Reynolds, Independent

Additional Acknowledgments

  • Reed Probus, Web, Logo & Graphic Design
  • Nathan Tabak, Whitepaper writing and editing
  • Anthony Whittaker, Evangelism and Booth Duty
  • Scott Wooledge, V1 Logo

Leadership

See our governance committee members and executive officers.

Contributing and Contact

Anyone is welcome to contribute by filing GitHub issues. To join our committees for specification discussion, please contact us at http://opensupporter.org or via email at [email protected].

To build and view these documents locally, use Jekyll, as configured via the Gemfile in this repo:

  • cd /path/to/your/checkout/of/this/repo
  • bundle install
  • bundle exec jekyll serve

osdi-docs's Issues

Lists - Saved Lists

Create a mechanism to represent saved lists on a server. A user would create the list, or a saved query, for later access via the API.

Who determines the GUID?

And when there are inevitably duplicate records for the same person, what happens to the discarded GUID when the records are merged?

self links for resources that are composite in server implementation

Some systems don’t store linked or embedded resources such as address separately. What would they put for a self link?
Proposed:
They could have a proprietary convention for getting a partial resource via query params, which would be encoded in the self links. Clients would treat this as opaque.

Person collection names

I forgot to bring this up today on the call, but I'm a little concerned about the naming of collections for a person.

The three I'm concerned about are:

"phones" versus "phone_numbers"
"emails" versus "email_addresses"
"addresses" versus "postal_addresses"

I much prefer the ones on the right for clarity. Conversationally we always talk about "phone numbers", not "phones", which are things you use to talk on. "Emails" could be a collection of mailings sent to or from a person when we really mean "email addresses." "Addresses" is less ambiguous, but I'd probably differentiate it from "email addresses" just for consistency.

Schema.org uses "Postal Address", "Email", and "Telephone", but doesn't ever seem to use them in the plural. The plurals are more problematic than the singular forms in my mind though.

Thoughts?

DOB field on Person

Create a field (and set a standard format -- epoch? YYYY-MM-DD? YYYYMMDD?) for a Person's date of birth.

Targets like legislators

Some orgs have a need to maintain a list of targets like legislators. Including them in the people collection causes complexity in simple supporter queries for email blasts etc.

options

Separate targets collection

this would use the same schema as person, but be in a different collection

Lists

Create the concept of a list that people can be associated with. This could be used to associate legislators to a 'legislators' list.
It could also have lists for current supporters, past supporters, donors, as an example.
Some products create individual lists for volunteers to canvass

Tags

Use tags to associate people to a given virtual list

Model schema feedback

A few aesthetic points:

Address

  • state should be region, which is the more commonly used term when handling data from many countries. For example, it's the term used by vCard and Schema.org.
  • locality is a less urban-centric term than city. Used for example by vCard, Schema.org and xAL (an OASIS standard used by KML).
  • Either country_code should be abbreviated to country or state should be expanded to state_code. Right now, the naming is not consistent. I prefer removing _code.
  • Why prefix address_type and address_status with address_? Since these properties are on an Address document, then we already know they apply to the address.
  • Not sure why lat and lng are abbreviated if all other properties are not.

Person

  • For name components (first, last, etc.), if the intention is to handle data from many countries, terms like given_name, family_name, etc. are more appropriate - especially given that in many regions, the family name is written first, not the given name as is usually the case in the US. If you are looking to reuse standard terms, this part of the Popolo spec reuses the same terms as Schema.org, FOAF, vCard, etc. which are:
    • family_name
    • given_name
    • additional_name
    • honorific_prefix
    • honorific_suffix

W3C and DCMI provide some guidance on name representation.

Anyway, as I said, just aesthetic issues.

Latitude/Longitude naming

On addresses I propose renaming "lat" and "lng" to "latitude" and "longitude". We don't use abbreviations anywhere else for our fields and it would be nice to be consistent.

Notes and Attachments

This issue arose with a request to add a 'notes' attribute to interactions to capture CRM like info.
This can also be useful on all objects.
An alternative is suggested to have attachments collections which could include text notes, files, etc as well as metadata like author, date

Authentication?

Does the group plan on publishing any standards on how to authenticate users?

It could certainly be left open as well, but if there are plans on supporting certain authentication types officially would be good to get some guidance on that.

Minimizing datasize in responses

What about having embedded resources contain a subset of attributes vs a whole resource?
How does this impact client conditional complexity?

Approaches

One approach is to have a consistent summary and complete view of a resource
This would provide consistency, but we need to make sure that the different
responses where a person object would be returned (attendance, donation) have the attributes they need

Only include 1:1 resources, not collections
The default embed policy would be to consistently only include resources that have a 1:1 association and exclude embedded collection associations

Collection of links / list of named resources only
Only attributes are name and link. After more discussion: "What about email? that is also almost always needed?"

Note: this reduces to summary view embedded resource
We could document this with a column in schema table for summary or extended

Address Attribute Changes

A few questions and comments regarding address attributes.

  1. Would a "potential address" from TargetSmart or Catalist proximity matching have an address_type = "potential"? In our internal organizing software, we have collections of potential addresses which organizers will visit until they can determine a primary address.

  2. What about bad addresses? It is useful in many cases to know if an address is bad. I'd suggest an additional attribute to reflect when an address is bad, such as if a person has moved. For union organizers, it is helpful because we know not to visit that house again.

  3. Geocodes. It seems that "lat" and "lng" properties should be optional attributes? Perhaps with some accuracy attribute such as "rooftop" or "approximate".

Party affiliation clean up

For a person there's a field called "party" that I think needs some minor tweaks. My suggestions:

First, I'd suggest renaming the field to either "party_affiliation" or "party_identification". If you think about it, this is really the gender issue all over again. What data are we actually recording here? A person might be registered as a Republican, but identify as "Independent." It seems that both pieces of data would be useful. Renaming to either "party_affiliation" or "party_identification" allows us to add "party_registration" down the line.

Second, the options are currently "democrat", "republican", "independent", and "none". If we're naming the party, let's capitalize the options (as we do all other enumerations) and use the correct names: "Democratic", "Republican".

That leads us to "Independent", "None" and by association null. "Independent" should really mean "Unaffiliated" shouldn't it? And if so, shouldn't "Unaffiliated" mean the same as "None"? I suppose some people self-identify as "Independent" and are engaged in politics while apolitical people would say they don't have a party affiliation.

Thoughts?

Compliance Guidelines

Need to describe what compliance means.

Initial Thinking
AEP is required

all collections are optional as different products may only support certain kinds of scenarios. A system that does not do anything with donations would not need to support the donations collection.

If a product does support a collection, it must support the defined schema for that collection.
(modulo vendor extensions)

Event capacity

For an event, I suggest adding a "capacity" attribute. Our event system allows events to have a stated capacity when created and we shut off RSVPs when the capacity is reached (or slightly above).

Related to capacity, I notice that there is a guestsCanInviteOthers attribute. This makes sense, but you require each Attendance object to have a 1:1 ratio with a Person, which isn't necessarily something you'd have if guests can invite others.

For our system we have the concept of an Attendance that has a totalGuests count attribute on it for "anonymous" guests. We also have an array of simple strings to hold guest names.

I would definitely like to see a totalGuests or totalAttendees attribute on the Attendance type. What do others think?

Identifying required fields

One thing our documentation doesn't do well is note required/optional status. I think we should probably add a column to each table with this data.

For example, for the Phone Number type, we should require "number" and "primary" be sent over. The rest can be null, but these two should always be set.

Gender/Sex Confusion

The gender and sex fields can be confusing to some and may be seen as unnecessary complexity.
However, with increasing frequency, orgs have a need to represent the emerging reality of transgender identity.

Needs further discussion and community feedback

Reducing complexity by introducing value types

I believe the pain and complexity we've encountered around updating embedded resources are the result of incorrect modeling. I propose that we simplify the people model to reduce the complexity of updating embedded resources. This should be done through the introduction of value types for several objects which are currently resources.

To best explain the issue with our current design, I'm first going to review HTTP verbs and the concept of a resource. Then I'll review entities and value types and propose a simplification to our current person model.

REST and HTTP Verbs

Since OSDI is taking a RESTful approach to building the API, I want to ensure everyone has a firm understanding of the HTTP verbs that are used to manage the data resources exposed by the API.

Generally speaking, the HTTP verbs that operate on a resource have the following definitions (taken from Wikipedia):

| Verb | Definition |
|------|------------|
| POST | Requests that the server accept the entity enclosed in the request as a new subordinate of the web resource identified by the URI. |
| GET | Requests a representation of the specified resource at the supplied URI. |
| PUT | Requests that the enclosed entity be stored under the supplied URI. If the URI refers to an already existing resource, it is modified; if the URI does not point to an existing resource, then the server can create the resource with that URI. |
| PATCH | Applies partial modification to a resource at the supplied URI. |
| DELETE | Deletes the resource at the supplied URI. |

Given these definitions, there are a couple of points that should be highlighted:

  • PUT requires the full resource be sent to the server and actually allows for the creation of a resource at a client-supplied URI. In many scenarios, and likely ours, there's no way a full resource will be able to be created solely on the client without some server interaction. For example, the API server will likely have to take part in a resource's creation by supplying server-side IDs and corresponding URIs. As a result, we should limit PUTs to replacing existing resources.
  • PATCH doesn't define how partial modification of a resource is achieved. This will be up to the API and depends on the type of resource being updated. Since we are dealing with JSON, RFC 6902 is one method we could adopt. It is quite likely that we won't even need PATCH support to begin with.
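To make the PATCH option more concrete, here is a minimal sketch of an RFC 6902 patch document and one way a server might apply it. It covers only the `add`, `replace`, and `remove` operations on object members and array indexes. The `apply_patch` helper and the sample person are illustrative assumptions, not part of the OSDI spec.

```python
import copy


def _parse_pointer(path):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if path == "":
        return []
    if not path.startswith("/"):
        raise ValueError("invalid JSON Pointer: %r" % path)
    # Per RFC 6901, "~1" is unescaped before "~0".
    return [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]


def apply_patch(document, operations):
    """Apply a subset of RFC 6902 (add, replace, remove) to a copy of document."""
    doc = copy.deepcopy(document)
    for op in operations:
        tokens = _parse_pointer(op["path"])
        if not tokens:
            raise ValueError("patching the document root is not supported in this sketch")
        # Walk to the parent of the target location.
        parent = doc
        for token in tokens[:-1]:
            parent = parent[int(token)] if isinstance(parent, list) else parent[token]
        last = tokens[-1]
        name = op["op"]
        if isinstance(parent, list):
            if name == "add":
                # "-" means "append to the end of the array".
                index = len(parent) if last == "-" else int(last)
                parent.insert(index, op["value"])
            elif name == "replace":
                parent[int(last)] = op["value"]
            elif name == "remove":
                del parent[int(last)]
            else:
                raise ValueError("unsupported op: %r" % name)
        else:
            if name == "add":
                parent[last] = op["value"]
            elif name == "replace":
                if last not in parent:
                    raise KeyError(last)
                parent[last] = op["value"]
            elif name == "remove":
                del parent[last]
            else:
                raise ValueError("unsupported op: %r" % name)
    return doc


# Hypothetical person resource and patch document.
person = {
    "first_name": "Jane",
    "phone_numbers": [{"local_number": "555-1212", "type": "Home"}],
}
patch = [
    {"op": "replace", "path": "/first_name", "value": "Janet"},
    {"op": "add", "path": "/phone_numbers/-",
     "value": {"local_number": "555-3434", "type": "Mobile"}},
    {"op": "remove", "path": "/phone_numbers/0"},
]
updated = apply_patch(person, patch)
```

The operations are applied in order, so the `remove` at index 0 drops the original home number after the mobile number has been appended.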

Entities and Value Types

HAL and the HTTP verbs help us define the interaction and framework of our API. To better describe the models of our API, I'd like to define a couple additional terms that I've borrowed from Domain Driven Design:

Term Definition
Entity An object with an identity and life cycle
Value Type An object that describes data, but has no identity

Entities

An entity is something with an identity that does not depend on a parent context. An entity may have relationships with other entities. For example, a Donation entity might reference a Donor entity. Both the Donation and Donor exist as independent objects with their own identities and can be updated independently.

In the case of a HAL API, an entity is a resource that can be accessed at a URI. (For the DDD people out there, technically I'm proposing that a Person be treated as an aggregate, not just an entity.)

Value Types

A value type is an object that describes data but has no intrinsic identity. A great example of a value type we already use is date of birth. We describe a person's date of birth with the following value type:

date_of_birth: {
    month: 1,
    day: 15,
    year: 1978 
}

A date of birth is described entirely by its attributes. It has no identity and no meaning outside of the context, in this case a person, to which it is attached.

Similarly, a person could have an array of phone number value types:

phone_numbers: [
    {
        country_code: "1",
        area_code: "202",
        local_number: "555-1212",
        type: "Home"
    },
    {
        country_code: "1",
        area_code: "202",
        local_number: "555-1212",
        type: "Mobile"
    }
]

In the context of the OSDI data set, each phone number is essentially useless outside of a person and can be wholly identified from its values.

From the perspective of a RESTful API, value types are not resources. They cannot live on their own or be accessed independently and lacking identities they will not have URIs.
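The distinction can be sketched in code. In this hypothetical Python model, a phone number is a frozen dataclass compared purely by its attributes, while a person is compared by its URI regardless of its data. The class names and the `uri` field are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PhoneNumber:
    """Value type: equality is defined entirely by its attributes."""
    country_code: str
    area_code: str
    local_number: str
    type: str


@dataclass(eq=False)
class Person:
    """Entity: identified by its URI, independent of its current data."""
    uri: str
    first_name: str
    phone_numbers: List[PhoneNumber] = field(default_factory=list)

    def __eq__(self, other):
        return isinstance(other, Person) and self.uri == other.uri

    def __hash__(self):
        return hash(self.uri)


home_a = PhoneNumber("1", "202", "555-1212", "Home")
home_b = PhoneNumber("1", "202", "555-1212", "Home")

# Same entity seen at two points in time: the data differs, the identity does not.
before = Person("http://host/people/1", "Jane", [home_a])
after = Person("http://host/people/1", "Janet", [])
```

Two phone numbers with the same values are interchangeable, so there is nothing for a URI to point at; two person records with the same URI are the same person even when their fields differ.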

Proposal

I propose that we treat postal addresses, email addresses, and phone numbers as value types without any inherent identity of their own. These values would always be returned as intrinsic parts of the person entity. The result of this simplification is as follows:

  • Atomic creation and modification of a resource and its values is now straightforward.
  • Retrieving a resource would always include all its associated value types.
  • The value types would not be in the _embedded section of a HAL resource.
  • The value types would not be referenced in the _links section of a HAL resource.
  • Updating, adding, or removing a value type of a person would require either a PUT of the entire modified person resource or a PATCH describing the modification.

This proposal does not directly address updating embedded elements using a HAL API. Rather, by adding value types to our model, we reduce the complexity of the API for both hosts and consumers.

Update semantics with computed properties

Computed properties are generated by the system and read-only. Example: current attendee count on events.
When doing an update what happens if these are included in the representation?
Proposed:

They may be omitted. If they are present, the values are ignored.
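One way a server could implement the proposed rule is to drop computed properties from an incoming representation before applying it. This is only a sketch: the property name `attendee_count` and the `strip_computed` helper are hypothetical.

```python
# Hypothetical set of computed (read-only) property names. A real server
# would derive this from its own resource definitions.
COMPUTED_PROPERTIES = frozenset({"attendee_count"})


def strip_computed(representation, computed=COMPUTED_PROPERTIES):
    """Return a copy of the representation without computed properties."""
    return {key: value for key, value in representation.items()
            if key not in computed}


incoming = {"title": "Phone bank", "attendee_count": 12}
writable = strip_computed(incoming)
```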

Extensibility Guidelines

Need description of rules for vendor extensions

Where can extensions be added and how should they be done?

Possibilities

new collections
new values for enums
new attributes on an existing resource

Address field naming

For address lines, does it make more sense to simply have an array instead of "address1", "address2", "addressN"? The numbered fields seem hackish and don't map well to statically typed languages.

I suggest either removing the addressN fields and leaving address1 and address2 or simply have one string array field:

{
    "address_lines" :  [ "line1", "line2" ]
}

I doubt we'd ever need more than 1 and 2 unless we're doing international addresses (in which case the field names should probably be different anyway).
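For systems that already store numbered fields, converting to the array form is mechanical. The following sketch assumes field names of the form `address1`, `address2`, and so on; the `to_address_lines` helper is hypothetical.

```python
import re

_ADDRESS_KEY = re.compile(r"^address(\d+)$")


def to_address_lines(address):
    """Collapse numbered addressN fields into one ordered address_lines array."""
    numbered = []
    result = {}
    for key, value in address.items():
        match = _ADDRESS_KEY.match(key)
        if match:
            numbered.append((int(match.group(1)), value))
        else:
            result[key] = value
    # Sort numerically so address10 comes after address2, and skip empty lines.
    numbered.sort(key=lambda pair: pair[0])
    result["address_lines"] = [value for _, value in numbered if value]
    return result


converted = to_address_lines({
    "address2": "Suite 100",
    "address1": "123 Main St",
    "city": "Washington",
})
```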

Name fields

Since many if not most systems include a name-prefix field (Ms., Mr., Rev.) and a name-suffix field (Jr., Ph.D.), it would make sense to include them as properties here. If the systems on both ends of the API support those fields but the API itself doesn't, then prefix and suffix have to be munged into the other name fields, or left out, which is messy and/or data-corrupting.

Minor suggestion: rename "middle_initial" to "middle_name". Oftentimes you see whole middle names and maiden names in data, and if the field is called "middle_initial" it can cause confusion as to whether the data will be truncated in transit.

Partial Updates

Partial updates are a necessity, but should not be done via HTTP PUT as described here (https://github.com/wufm/osdi-docs#updating-a-resource). PUT's semantics are to replace the entire resource with the body of the request. O'Reilly's "HTTP: The Definitive Guide" elaborates, saying: "POST is used to send data to a server [for processing]. PUT is used to deposit data into a resource on the server (e.g., a file)."

For partial updates, I propose:

  • use POST
  • fields having blank posted values ("") are left unchanged on the server
  • null is explicitly submitted as the value for fields that are to be set to null. (javascript uses null, not nil)

The PATCH command is supposedly meant for partial data updates, but is patchily supported at the moment and is more effort to implement. I think POST is the best candidate verb.
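The blank-versus-null rule proposed above could be applied on the server roughly as follows. This is a sketch of the proposal only, not adopted OSDI behavior, and `apply_partial_update` is a hypothetical helper.

```python
import json


def apply_partial_update(stored, posted):
    """Merge posted fields into a copy of the stored record.

    Blank strings leave the stored value unchanged; an explicit JSON null
    (None in Python) sets the field to null.
    """
    updated = dict(stored)
    for key, value in posted.items():
        if value == "":
            continue  # blank value: keep whatever the server already has
        updated[key] = value  # includes None, which clears the field
    return updated


stored = {"first_name": "Jane", "middle_name": "Q", "email": "jane@example.org"}
posted = json.loads(
    '{"first_name": "", "middle_name": null, "email": "j.doe@example.org"}'
)
result = apply_partial_update(stored, posted)
```

One consequence of this rule is that a client cannot set a field to the empty string; it can only leave it alone or null it.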

HAL spec details

Josh and I had a lengthy discussion on the HAL mailing group: https://groups.google.com/forum/?fromgroups#!searchin/hal-discuss/documentation/hal-discuss/lt0CnC3eev4/K3nry6KxJ6cJ

So it appears we're supposed to namespace the links/embeds on documents (a curie - http://www.w3.org/TR/curie/ ).

That would make a simplified person model look like this:

{
  "_links": {
     "self": {
       "href": "http://host/path/to/self"
     },
     "curies": [{
         "name": "osdi"
        ,"href": "http://path/to/rel/docs/{rel}"
        ,"templated": true
        }
     ],
     "osdi:addresses": {
        "href": "http://path/to/addresses"
     }
  },
  "first_name": "name"
}

Where the osdi:addresses curie would translate into docs. What are everyone's thoughts on this?
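As a rough illustration of how a client might resolve such a namespaced relation to its documentation URL, the sketch below substitutes the part after the prefix into the matching curie's `{rel}` template. The `expand_curie` helper is hypothetical and handles only the single `{rel}` variable used above.

```python
def expand_curie(rel, links):
    """Expand a curie such as "osdi:addresses" using the _links.curies entries.

    Relations without a known prefix (for example "self") are returned as-is.
    """
    if ":" not in rel:
        return rel
    prefix, reference = rel.split(":", 1)
    for curie in links.get("curies", []):
        if curie.get("name") == prefix:
            return curie["href"].replace("{rel}", reference)
    return rel


links = {
    "self": {"href": "http://host/path/to/self"},
    "curies": [
        {"name": "osdi", "href": "http://path/to/rel/docs/{rel}", "templated": True}
    ],
    "osdi:addresses": {"href": "http://path/to/addresses"},
}
docs_url = expand_curie("osdi:addresses", links)
```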

Handles

Consider adding a "handles" collection to the person aggregate, to keep track of the many system IDs a person may have. E.g.,

handles = [
{
"service" : "linkedin",
"id" : "markpaquette1"
},
{
"service": "ngp_van",
"id": "123456798"
}, ...
]

This would remove the need for the dedicated properties "twitter_handle" and "guid", and make for a much more flexible system.
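With such a collection, a consumer would look up an ID by service name rather than by a dedicated property. A minimal sketch, assuming the structure shown above and a hypothetical `find_handle` helper:

```python
def find_handle(handles, service):
    """Return the id recorded for a service, or None if there is none."""
    for handle in handles:
        if handle.get("service") == service:
            return handle.get("id")
    return None


handles = [
    {"service": "linkedin", "id": "markpaquette1"},
    {"service": "ngp_van", "id": "123456798"},
]
van_id = find_handle(handles, "ngp_van")
```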

"@" in the Twitter handle

The twitter handle should not include the @. It can be safely inferred and there's no reason to include it in every record.

Geolocation accuracy

We agreed to split accuracy into its own issue from #46.

The API documentation I mistakenly sent us to during the call today was Google's Geocoding API for Business that's intended to be used in mobile clients. That's why the accuracy was in meters and not in these general categories.

Here's a sampling of what some of the major geocoding APIs provide:

Google's Geocoding API v3:

  • ROOFTOP indicates that the returned result is a precise geocode for which we have location information accurate down to street address precision.
  • RANGE_INTERPOLATED indicates that the returned result reflects an approximation (usually on a road) interpolated between two precise points (such as intersections). Interpolated results are generally returned when rooftop geocodes are unavailable for a street address.
  • GEOMETRIC_CENTER indicates that the returned result is the geometric center of a result such as a polyline (for example, a street) or polygon (region).
  • APPROXIMATE indicates that the returned result is approximate.

Bing has a similar, but not identical field in their responses:

  • Rooftop — The geocode point was matched to the rooftop of a building.
  • Interpolation — The geocode point was matched to a point on a road using interpolation.
  • InterpolationOffset — The geocode point was matched to a point on a road using interpolation with an additional offset to shift the point to the side of the street.
  • Parcel — The geocode point was matched to the center of a parcel.

Yahoo tackles it in a completely different manner. They have an integer scale of 0-99, where 99 equals an exact coordinate, 50 equals a neighborhood, 0 is no match, and there are a ton of point values in between.

Proposal

Adopt Google's accuracy levels as an enum with some slightly modified descriptions.

In descending order of accuracy:

  • rooftop — an exact match
  • interpolated — a street match with an interpolated point for the building number
  • geometric_center — the center of a polyline (street) or polygon (city, postal code, etc.)
  • approximate — less accurate than geometric_center

NOTE: Providers who use geocoding services which provide different accuracy categorization should provide the best match from the above enumerated options.
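A provider-side mapping to the proposed enum might look like the sketch below. The Google table follows the levels listed above directly. The Bing table and the fallback to `approximate` for unrecognized values are judgment calls made for illustration, not something the group has agreed on.

```python
# Direct mapping from the Google levels quoted above.
GOOGLE_TO_OSDI = {
    "ROOFTOP": "rooftop",
    "RANGE_INTERPOLATED": "interpolated",
    "GEOMETRIC_CENTER": "geometric_center",
    "APPROXIMATE": "approximate",
}

# Assumed "best match" mapping for the Bing values quoted above.
BING_TO_OSDI = {
    "Rooftop": "rooftop",
    "Interpolation": "interpolated",
    "InterpolationOffset": "interpolated",
    "Parcel": "geometric_center",
}

_TABLES = {"google": GOOGLE_TO_OSDI, "bing": BING_TO_OSDI}


def to_osdi_accuracy(provider, value):
    """Map a provider-specific accuracy value to the proposed OSDI enum.

    Unknown values fall back to "approximate" (an assumption of this sketch).
    """
    return _TABLES[provider].get(value, "approximate")
```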

Payment Status

The spec currently defines the payment processor mechanisms as out of scope.
However, some vendors need the ability to track the status of a donation (pending, declined, etc.).
This could be accommodated by status fields on donations.

HAL Client behavior wrt embedded vs linked

Is a HAL client supposed to transparently handle when a resource is linked vs embedded? If the caller of a client library API specifies a subordinate resource, is the client library supposed to go fetch the embedded resource if only a link is present?

pagination and embedded collections

Pagination is set up for primary collections, e.g., when a client is specifically getting a collection.
For subcollections, e.g., the addresses collection that is embedded in a person, pagination properties do not show up.

Is this right?
I think yes

Phone number type?

Looking at the definition of Person, I see primary_phone with a type of string. Later though, I see a collection of a Phone type. Are we planning on having a Phone type? I think it might be useful at least so we can capture the type of phone (mobile, work, home) in addition to the number.

Filters

One of my main objections to having query string filters is that a client then needs to know how to look at a url and append information... so it breaks the JSON+HAL convention of "I don't really need to know about the API" to use it. At Amicus we've been able to get away from using any "filters" - until now. So we started thinking about the problem again...

http://tools.ietf.org/html/draft-kelly-json-hal-05
The JSON+HAL spec has a concept that could fix this though:

{
  "_links": {
    "osdi:filter": [
      {
        "href": "/blah?tags={tagList}",
        "name": "tag",
        "templated": true
      },
      {
        "href": "/blah?state={state}",
        "name": "state",
        "templated": true
      }
    ]
  }
}

Now you can list out available filters and have a good way for clients to use them. One issue we haven't figured out is combining them (multiple filters)
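A client could then pick a filter by name and fill in its template without knowing how the server builds URLs. The sketch below handles only simple `{var}` expressions and a single filter at a time; it does not attempt to solve the combination problem. The `expand_template` and `filter_url` helpers are hypothetical.

```python
import re
from urllib.parse import quote


def expand_template(href, **values):
    """Expand simple {var} expressions, percent-encoding each value.

    Unlike full RFC 6570 expansion, this sketch raises KeyError for a
    variable that was not supplied.
    """
    def substitute(match):
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, href)


def filter_url(links, name, **values):
    """Find the osdi:filter link with the given name and expand it."""
    for link in links["osdi:filter"]:
        if link.get("name") == name:
            return expand_template(link["href"], **values)
    raise KeyError(name)


links = {
    "osdi:filter": [
        {"href": "/blah?tags={tagList}", "name": "tag", "templated": True},
        {"href": "/blah?state={state}", "name": "state", "templated": True},
    ]
}
by_state = filter_url(links, "state", state="DC")
by_tag = filter_url(links, "tag", tagList="volunteer donor")
```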

geolocation fields and types

From Ray
Geocodes. It seems that "lat" and "lng" properties should be optional attributes? Perhaps with some accuracy attribute such as "rooftop" or "approximate".

From Mark
Hello and thank you for having me in the group. Regarding geocodes:

  1. They are so ubiquitous and so useful for engaging supporters (find & map all supporters in an area, assign & query on legislative districts, etc). My preference would be that they be kept among the core fields and not be made optional attributes.

  2. They should be named "latitude" and "longitude", as none of the other address fields (and no person fields) are abbreviated.

  3. For accuracy, latitude and longitude should be served as text fields, not numeric. Numeric latitude and longitude fields are subject to corruption when moved from system to system, as rounding errors and differences in precision between systems cause the 4th, 5th and 6th decimal places to mutate. In most real-life applications, this kind of precision is not necessary, but in latitude and longitude it can mean the difference between assigning the right or wrong congressional district. Also, when viewed as text, it is clear how much precision* there is in the coordinates, as no digits have been artificially added or lopped off by numeric conversions. *precision != accuracy
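The difference between text and numeric handling can be seen in a short sketch. The coordinate values and the `decimal_places` helper are made up for illustration. A consumer that receives the coordinates as strings can parse them with `Decimal` and keep every digit, whereas converting to a binary float and back loses the trailing zeros that indicated the stated precision.

```python
import json
from decimal import Decimal

# Hypothetical payload with coordinates served as text.
payload = '{"latitude": "38.897700", "longitude": "-77.036500"}'
location = json.loads(payload)


def decimal_places(text):
    """Number of digits after the decimal point in a coordinate string."""
    return -Decimal(text).as_tuple().exponent


# Decimal keeps the digits exactly as they were sent.
latitude = Decimal(location["latitude"])

# A float round trip silently changes the apparent precision.
as_float_text = str(float(location["latitude"]))
```

Here `str(latitude)` is still `"38.897700"` with six decimal places, while `as_float_text` is `"38.8977"`.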
