w3c / web-annotation

Web Annotation Working Group repository, see README for links to specs

Home Page: https://w3c.github.io/web-annotation/

License: Other

HTML 98.61% Python 0.85% JavaScript 0.52% Shell 0.01%

web-annotation w3c annotation web json-ld web-annotation-wg

web-annotation's Introduction

Web Annotation Repository

Documents produced by the Web Annotation Working Group of the W3C. The Working Group has published its recommendations, and is now closed; see the separate list of the published documents.

This repository is now under the control of the W3C Open Annotation Community Group. Everyone is welcome to add issues to the repository, clearly marked as errata or proposals for future work (see the errata page for the details). The Community Group's mailing list is another avenue for further discussion; please join the Community Group if you are interested in the topic.

web-annotation's Issues

Intended Audience for Annotation

Some annotations are generated with an intended audience or class of consumer in mind. That could be a role (teacher vs student), or another property of the consumer (age range, human vs machine, vision impaired vs sighted, etc.).

Justification

This is important for accessibility and ensuring the best user experience by filtering or presenting the most appropriate content for the audience.

Proposal

The proposed solution is to use schema.org's set of audiences and roles, applied to the annotation or resource. This is the solution adopted by the IDPF's Open Annotation in EPUB specification.

{
  "@id": "http://example.org/epub/annotation/1.json",
  "@type": "oa:Annotation",
  "audience" : {
    "@type" : "schema:EducationalAudience",
    "schema:educationalRole" : "teacher"
  },
  "hasTarget": { … },
  "hasBody": { … }
}
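
As a rough illustration of the filtering this would enable, here is a minimal Python sketch. It assumes annotations have already been parsed into dictionaries shaped like the example above; the second and third annotations, the `for_role` helper, and the rule that an annotation without an audience is shown to everyone are assumptions made for the example, not part of the proposal.

```python
from typing import Any

# Hypothetical annotations shaped like the proposal above; targets and
# bodies are omitted because they do not matter for the filtering.
ANNOTATIONS: list[dict[str, Any]] = [
    {
        "@id": "http://example.org/epub/annotation/1.json",
        "@type": "oa:Annotation",
        "audience": {
            "@type": "schema:EducationalAudience",
            "schema:educationalRole": "teacher",
        },
    },
    {
        "@id": "http://example.org/epub/annotation/2.json",
        "@type": "oa:Annotation",
        "audience": {
            "@type": "schema:EducationalAudience",
            "schema:educationalRole": "student",
        },
    },
    {
        # No audience given: assumed here to mean "intended for everyone".
        "@id": "http://example.org/epub/annotation/3.json",
        "@type": "oa:Annotation",
    },
]


def for_role(annotations: list[dict[str, Any]], role: str) -> list[dict[str, Any]]:
    """Return annotations aimed at `role`, plus those with no stated audience."""
    selected = []
    for anno in annotations:
        audience = anno.get("audience")
        if audience is None or audience.get("schema:educationalRole") == role:
            selected.append(anno)
    return selected


teacher_ids = [anno["@id"] for anno in for_role(ANNOTATIONS, "teacher")]
print(teacher_ids)
```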

Background

Audience was discussed at rollout events but not during the CG's work.

Links

Define json-ld profile URI for OA serialization context and structure

Define a profile URI for the context and recommended structure of annotations in JSON-LD.

This is needed to allow systems to content negotiate (at the protocol level) for different flavors of JSON-LD on top of the same content. Any response that follows the serialization should have the profile included to let clients know how to interpret it, and clients need to be able to send it in the request to ask for it.

See in particular:

Unable to unsubscribe from mailing list

G'day. I've tried numerous times to unsub from the public-annotation mailing list (via [email protected]), but the second CONFIRM is never acted upon - instead, I get back a message from [email protected]. Could an admin unsubscribe me manually please?

Received: from m4.mxes.net ([unix socket])
     by m4.mxes.net (Cyrus v2.3.12) with LMTPA;
     Wed, 18 Feb 2015 07:42:40 -0500
X-Sieve: CMU Sieve 2.3
Return-Path: 
Received: from 216.86.168.176
    by m4.mxes.net (bayesd) with LMTP id 1424263360-76556-14
    for ; Wed, 18 Feb 2015 07:42:40 -0500 (EST)
Received: from local_scanner.mxes.net (mxout-01.mxes.net [216.86.168.176])
    by mxout-01.mxes.net (Postfix) with ESMTP id 061DB8A0DA
    for ; Wed, 18 Feb 2015 07:42:40 -0500 (EST)
Received: from frink.w3.org (frink.w3.org [128.30.52.56])
    (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))
    (No client certificate requested)
    by mxin.mxes.net (Postfix) with ESMTPS id CD4408A057
    for ; Wed, 18 Feb 2015 07:42:39 -0500 (EST)
Received: from lists by frink.w3.org with local (Exim 4.80)
    (envelope-from )
    id 1YO3xf-0006F1-4e
    for [email protected]; Wed, 18 Feb 2015 12:42:39 +0000
To: [email protected]
References: 
In-Reply-To: 
X-Loop: [email protected]
From: [email protected]
Auto-Submitted: auto-replied
Subject: CONFIRM u940223980
Message-Id: 
Date: Wed, 18 Feb 2015 12:42:39 +0000
X-Virus-Scanned: ClamAV
X-Spam-Allowed: Sender domain in global Allow list passes SPF check
X-Originating-IP: 128.30.52.56
X-Envelope-To: 
X-Spam-Check: Enabled,6.0,13.0,1,1,42,1,0,0,1,1,0,0,0,[SPAM],
X-Spam-Status: No, score=-4.0 threshold=6.0,13.0 Allow Listed
X-Spam-BayesResult: Unsure, 0.498447
X-Spam-Score: -4.0
X-Spam-Scoring: 0,0

It has been requested that the following address:

       [email protected]

should be deleted from the public-annotation mailing list.

It has NOT yet been unsubscribed from the list.
To unsubscribe you need to confirm the unsubscription
request by sending an email to the address:

        [email protected]

with the Subject string:

         CONFIRM u940223980

...

Content-Type: text/plain;
    charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\))
Subject: CONFIRM u940223980
X-Universally-Unique-Identifier: 492E18AD-BFA5-4533-85A8-DBECBE0AE4AA
From: Morbus Iff 
In-Reply-To: 
Date: Wed, 18 Feb 2015 07:43:48 -0500
Content-Transfer-Encoding: 7bit
X-Smtp-Server: smtp.mxes.net:morbus_disobey.com
Message-Id: 
References:  
To: [email protected]

Received: from m4.mxes.net ([unix socket])
     by m4.mxes.net (Cyrus v2.3.12) with LMTPA;
     Wed, 18 Feb 2015 07:44:15 -0500
X-Sieve: CMU Sieve 2.3
Return-Path: 
Received: from 216.86.168.176
    by m4.mxes.net (bayesd) with LMTP id 1424263455-76556-21
    for ; Wed, 18 Feb 2015 07:44:15 -0500 (EST)
Received: from local_scanner.mxes.net (mxout-01.mxes.net [216.86.168.176])
    by mxout-01.mxes.net (Postfix) with ESMTP id 42C7C8A05B
    for ; Wed, 18 Feb 2015 07:44:15 -0500 (EST)
Received: from frink.w3.org (frink.w3.org [128.30.52.56])
    (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))
    (No client certificate requested)
    by mxin.mxes.net (Postfix) with ESMTPS id C8C298A059
    for ; Wed, 18 Feb 2015 07:44:14 -0500 (EST)
Received: from lists by frink.w3.org with local (Exim 4.80)
    (envelope-from )
    id 1YO3zC-0006kD-3T
    for [email protected]; Wed, 18 Feb 2015 12:44:14 +0000
To: [email protected]
Subject: Re: CONFIRM u940223980
References:   
In-Reply-To: 
X-Loop: [email protected]
From: [email protected]
Message-Id: 
Date: Wed, 18 Feb 2015 12:44:14 +0000
X-Virus-Scanned: ClamAV
X-Spam-Allowed: Sender domain in global Allow list passes SPF check
X-Originating-IP: 128.30.52.56
X-Envelope-To: 
X-Spam-Check: Enabled,6.0,13.0,1,1,42,1,0,0,1,1,0,0,0,[SPAM],
X-Spam-Status: No, score=-4.0 threshold=6.0,13.0 Allow Listed
X-Spam-BayesResult: Unsure, 0.502396
X-Spam-Score: -4.0
X-Spam-Scoring: 0,0

******* About the W3C Mailing Lists *******

There are many mailing lists provided by the W3C for discussion

...

Model principles should state that inferencing is not a priority

Decision in the 2015-02-18 telcon was that inferencing or other graph based reasoning was not a priority for making decisions within the annotation model.

The model document needs to be updated to state this.

Allow literals directly as a body?

A request from CSV WG to allow simple literals as a body.

Support for Notification

The protocol should support notification of activity within the annotation ecosystem, such as sending a notification to subscribers when an annotation is created on a resource that is being monitored.

This is a high level tracker issue but involves at least the following steps:

Identify and describe use cases
Identify or design appropriate notification model, with features that fulfill the use cases
Identify or design appropriate transport mechanism for the model
Document
Test

TLDR prevention at beginning? of Protocol

Most of the Protocol spec is totally uninspiringly obvious to anyone who has thought about REST for more than 5 minutes, or anyone who has read the LDP spec. It would be advantageous to have a summary of the additional constraints easily available, rather than requiring adopters to read through the entire doc and pick them out individually. This could be at the beginning or an appendix clearly referenced at the beginning.

Make Turtle support optional?

LDP requires the turtle serialization to be supported for all RDF Sources, which would include both Annotations and Annotation Containers. For JSON based systems, which we expect to be the majority of annotation servers, this is a barrier to adoption. Requiring that implementers do something that none of their anticipated clients will ever use should be avoided if possible.
So the question (post FPWD) is the degree to which this is a real barrier, versus one that is just perceived.

Proposal: To evaluate, we should determine in at least two implementations in different languages, how much code was actually required to support it, given the existence of JSON-LD / RDF libraries. If that code is not extensive and can be easily provided to implementers, we could maintain full compatibility with LDP. If there are a lot of hoops to jump through we should strongly consider breaking compatibility with LDP, potentially at a "level 0" specification.

xsd: prefix not defined in appendix

Appendix A lists the namespaces used in the document, but is missing xsd.

Add RFC7111 to FragmentSelector list

To explicitly allow CSV to be annotated with fragment selectors.
Also look for other recent RFCs defining media type fragments.
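
For readers unfamiliar with RFC 7111, a minimal Python sketch of resolving a `row=` fragment against CSV text follows. It only handles the forms `row=N`, `row=N-M` and `row=N-*`; `col=`, `cell=` and multiple selections are left out, and the `select_rows` helper and the sample data are invented for illustration.

```python
import csv
import io
import re

# Matches row=N, row=N-M and row=N-* only.
ROW_FRAGMENT = re.compile(r"^row=(\d+)(?:-(\d+|\*))?$")


def select_rows(csv_text: str, fragment: str) -> list[list[str]]:
    """Apply a single RFC 7111 row selection to CSV text."""
    match = ROW_FRAGMENT.match(fragment)
    if match is None:
        raise ValueError(f"unsupported fragment: {fragment!r}")
    rows = list(csv.reader(io.StringIO(csv_text)))
    first = int(match.group(1))
    if first < 1:
        raise ValueError("row positions start at 1")
    last_text = match.group(2)
    if last_text is None:
        last = first          # row=N selects a single row
    elif last_text == "*":
        last = len(rows)      # row=N-* runs to the last row
    else:
        last = int(last_text)
    # RFC 7111 counts rows from 1 (a header row is counted like any other);
    # Python slices count from 0 and exclude the end index.
    return rows[first - 1:last]


# Made-up sample data.
SAMPLE = "date,temp\n2015-02-01,3\n2015-02-02,5\n2015-02-03,4\n"
selected = select_rows(SAMPLE, "row=2-3")
print(selected)
```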

Do we need modules for the specification?

A question was whether or not we need to use modules (ala CSS) for the specification.

Proposal

Defer until we have a specification large enough to need splitting up!

Background

The CG originally had two modules, core and extension, and for the second draft collapsed them into one. This was due to confusion about which features belonged in the core and which in a module, and the namespace issues that were generated by separating them.

Links

Tracker

Annotation Lists

Several downstream systems have a need for lists of annotations, including EPUB [1] and IIIF [2]. For search, we need to have a list of annotations for the result set of applying the query to the set of annotations. Other expressed use cases are user constructed "playlists" of Annotations, curated distribution lists of annotations, and general optimization of annotation retrieval to avoid thousands of HTTP calls for each annotation individually.

As an initial proposal, we could use Activity-Streams's OrderedCollection class[3], which seems to fulfill the (implicit, to be expressed) requirements:

{
  "@context": ["http://www.w3.org/ns/activitystreams", "http://www.w3.org/ns/oa"],
  "@type": "OrderedCollection",
  "totalItems": 10,
  "itemsPerPage": 1,
  "next": "http://example.org/foo?page=2",
  "self": "http://example.org/foo?page=1",
  "startIndex": 0,
  "orderedItems": [
    {
      "@type": "Annotation",
      "motivation": "commenting",
      "body": {"value": "I like this!"},
      "target": "http://www.cnn.com/"
    }
  ]
}

This would be consistent with a (to-be-proposed) use of AS2.0 for notifications about annotation activity.

[1] http://www.idpf.org/epub/oa/#h.48f1o3s9o9hf
[2] http://iiif.io/api/presentation/2.0/#other-content-resources
[3] http://www.w3.org/TR/activitystreams-core/#collections
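
To make the retrieval pattern concrete, here is a minimal Python sketch of a client walking such a paged collection by following `next`. The pages live in an in-memory dictionary standing in for HTTP responses; the second page, its body text, and the `iter_annotations` helper are assumptions for the example rather than anything defined by Activity Streams or the model.

```python
from typing import Any, Callable, Iterator

# Stand-in for a server: page URL -> parsed JSON-LD response.
PAGES: dict[str, dict[str, Any]] = {
    "http://example.org/foo?page=1": {
        "@type": "OrderedCollection",
        "totalItems": 2,
        "itemsPerPage": 1,
        "next": "http://example.org/foo?page=2",
        "orderedItems": [
            {
                "@type": "Annotation",
                "motivation": "commenting",
                "body": {"value": "I like this!"},
                "target": "http://www.cnn.com/",
            }
        ],
    },
    "http://example.org/foo?page=2": {
        "@type": "OrderedCollection",
        "totalItems": 2,
        "itemsPerPage": 1,
        "orderedItems": [
            {
                "@type": "Annotation",
                "motivation": "commenting",
                "body": {"value": "Me too."},  # made-up second annotation
                "target": "http://www.cnn.com/",
            }
        ],
    },
}


def iter_annotations(
    first_url: str, fetch: Callable[[str], dict[str, Any]]
) -> Iterator[dict[str, Any]]:
    """Yield annotations page by page, following `next` until it runs out."""
    url = first_url
    seen: set[str] = set()
    while url is not None and url not in seen:  # guard against link cycles
        seen.add(url)
        page = fetch(url)
        yield from page.get("orderedItems", [])
        url = page.get("next")


bodies = [
    anno["body"]["value"]
    for anno in iter_annotations("http://example.org/foo?page=1", PAGES.__getitem__)
]
print(bodies)
```

One request per page instead of one per annotation is the optimization mentioned above; a real client would pass an HTTP-backed `fetch` instead of the dictionary lookup.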

JSON-LD Profile Definition: Where?

The protocol document currently defines a profile for JSON-LD of the structure and context for an Annotation, that should be used. This could equally be in the model document, as it covers serialization, and we are simply giving an explicit identity to that particular set of serialization constraints. (See #30)

Proposal: Move the profile definition to Model and reference from Protocol.

Add reference for "LDP Container"

The protocol document should include a link/reference to "LDP Container".

Expanded role for agents' activities

Agents might play more roles than just "annotator" or "serializer" with respect to an Annotation. The model should allow a more complete description of the agents' activities with respect to the annotation, to ensure that an accurate provenance trail is maintained.

This issue is distinct from #8 as this is about activities that have taken place (provenance in the PROV sense) rather than the intended audience of the annotation.

Justification

To be discussed.

Proposal

To be discussed.

Background

Using PROV-O completely was discussed in the CG and there wasn't a use case presented for why the full model was required. The decision was to not require the creation of the full model, but to ensure that it could be derived from the information given.

Links

Tracker

add reference to Annotation Protocol to Model

Add Web Annotation Protocol informative reference to Model

see https://lists.w3.org/Archives/Public/public-annotation/2015Jun/0295.html

Specifically, update the second-to-last paragraph of the introduction from

"A further specification will be written that standardizes the transport protocol, which may be adopted separately."

to

"See the Web Annotation Protocol [Web-Annotation-Protocol] for details."

and add reference

[Web-Annotation-Protocol] Sanderson, R. "Web Annotation Protocol", 2 July 2015 http://www.w3.org/TR/annotation-protocol/

I've also shared this proposal as an annotation; see https://lists.w3.org/Archives/Public/public-annotation/2015Jul/0024.html

(apologies I thought by replying the subject would be filled in automatically on the email etc)

Do we need both Composite and List?

The model currently has three options for multiplicity: A Choice (where only one should be selected), a Composite (where all are relevant but without any explicit order), and a List (where all are relevant, and there is an order). These are broadly equivalent to rdf:Alt, rdf:Bag, and rdf:List respectively. The proposal is that Composite is unnecessary complexity, and in JSON-LD the serialization would be identical to that of a List. There doesn't seem to be any harm in asserting an order when there isn't any, as the client will need to present them in /some/ order to the user anyway.

Justification

The model should be as simple as possible, and there's no real distinction or use case for keeping Composite separate from List.

Proposal

Remove oa:Composite from the model.

Background

There was no discussion in the CG about this topic, but it has been raised to the editors within the context of the List discussion.

Links

Tracker
Dependency: #1

Clarify status of specification table in 4.2.1

Should be marked as informational / non-normative -- other specifications can be used than the ones listed. The list is to provide a starting point and to prevent everyone from having to look up the right RFC / specification and using the URI for different representations (RFC in HTML vs text/plain for example)

Serialization of Lists

Serialization of oa:List is more difficult than it needs to be as it is both the head node of the list and has other predicates associated with it. This means that typical serialization routines will either fail or generate inconsistent output, as they expect the list head node to be a blank node with no other properties. This situation could be avoided with a slightly different model.

Justification

Current model exposes RDF plumbing (rdf:first, rdf:rest) unnecessarily
Current model introduces unnecessary implementation difficulty
In JSON-LD, repeated predicates and actual lists are represented in the same way (a json array) and hence there's no overhead in using rdf:Lists.

Proposal

Have an rdf:List as the object of a new property of the oa:List.

{
 "@id": "http://example.org/annos/lists/1",
 "@type": "oa:List",
 "hasList": [ "target1", "target2" ]
}

Background
Lists, in general, are needed to enable the following requirements:

Ordering of multiple bodies. Bodies may need to be presented in a particular order to make sense to the user, when there is a logical flow between them. This might happen if a single logical whole is broken up into a sequence of parts.
Ordering of multiple targets. Similarly, the order of the targets may be important due to some interdependence, and particularly when they are from different resources.
Ordering of multiple selectors. Selectors must be applied in the correct order for the desired region to be selected. For example, selecting the right resource from a zip file/epub and then applying a text range selector.
Ordered resources may be shared between annotations, and hence must have identity.

Allow full SVG in SvgSelector?

Should the SvgSelector allow multiple shapes? (From Doug via annotation)

Discussion:
The rationale for this in the CG was that multiple non-overlapping/grouped shapes should probably be multiple targets, each with their own selector.

Allow Digitally Signed Annotations

Being certain that a given annotation was indeed created by the asserted author is important in some disciplines. For example, in scholarly communication it is important to know who reviewed a paper as a peer's criticism would be regarded more highly than a random individual from the internet. In highly controlled environments this may be taken care of by authentication/authorization, but in a distributed system digital signatures could provide an alternative solution.

A pre Open Annotation implementation of signed annotations: Fab4

Justification

Reputation attacks and other spoofing would be easy in an environment that does not have digitally signed content. For trust networks to be built, the data needs to be both distributed and reliable.

Proposal

Requires further discussion.

Background

Digitally signed annotations were brought up and rejected as overly complex for the first iteration in the community group.

Links

Tracker

Non-Annotation Containers not in scope for Protocol

Decision at 2015-02-11 telcon was that non annotation resources such as arbitrary body content, css for Styles and SVG documents for SvgSelectors were not in scope for management. They may be in scope via LDP or other similar systems, but we do not need to address them.

Protocol ED needs to be updated to reflect this.

"Open" or "Web" Annotation

There's a mix of the two in the specification documents and we should be consistent.

Pro "Open":

There's already brand name recognition
There are already implementations and usage of the ontology, which we have updated rather than re-designed.
The namespace is ns/oa# and it would be good not to change it [c.f. danbri and foaf]

Pro "Web":

The WG is the Web AWG, not the Open AWG.
The protocol and client APIs are not derived from the CG work

Embedded Content

The community draft uses ContentInRDF which has not been updated since 2011, and shows no sign of that changing. As it is only a Working Draft, we can't legitimately use it in a TR, and thus we need to replace it.

Justification

This is an issue due to W3C requirements. It is a requirement for the model because (especially initially) most annotations have embedded textual bodies, and we need a consistent model between external and embedded resources.

This also solves the issue of literals in RDF not being able to have both data type and language at the same time.

Proposal

We use our own minimal representation, modeled after the Content in RDF working draft.

Plain text encoding:

{
  "@type": "oa:EmbeddedContent",
  "rdf:value" : "content here",
  "dc:language" : "en",
  "dc:format" : "media/type"
}

And for base64 encoding:

{
  "@type": "oa:EmbeddedBase64",
  "rdf:value" : "base64-encoded-content-here",
  "dc:language" : "en",
  "dc:format" : "media/type"
}

Rationale for dropping cnt:characterEncoding is that the graph's serialization will have its own encoding, and any content should be transcoded to that. Somehow carrying a Windows code page within UTF-8 is not something that we should be recommending.

Rationale for dropping cnt:bytes / cnt:chars is that you would never have both of them on the same resource.

Rationale for dropping the ContentAsXml distinction is that it's overkill and no additional information is needed beyond the media type and what is included in the value. Other representations aren't broken up into declarations and so forth if they have them, and it makes for an unnecessary choice for developers between Text and XML.
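
A minimal Python sketch of how a client might read either form is given below. The property names follow the proposal above; the `body_text` helper, the sample bodies, and the assumption that base64 content decodes to UTF-8 text are illustrative only (binary media types would need different handling).

```python
import base64
from typing import Any


def body_text(body: dict[str, Any]) -> str:
    """Return the textual content of an embedded body in either encoding."""
    kind = body.get("@type")
    value = body["rdf:value"]
    if kind == "oa:EmbeddedContent":
        return value
    if kind == "oa:EmbeddedBase64":
        # Assumption: the decoded bytes are UTF-8 text.
        return base64.b64decode(value).decode("utf-8")
    raise ValueError(f"not an embedded body: {kind!r}")


plain = {
    "@type": "oa:EmbeddedContent",
    "rdf:value": "content here",
    "dc:language": "en",
    "dc:format": "text/plain",
}
encoded = {
    "@type": "oa:EmbeddedBase64",
    "rdf:value": base64.b64encode(b"content here").decode("ascii"),
    "dc:language": "en",
    "dc:format": "text/plain",
}
print(body_text(plain), "|", body_text(encoded))
```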

Background

The OA CG reached out to the editors and EARL WG several times to see if they would update their specification, or even had plans to, without any success.

Links

Tracker

Is fixing the list of fragment identifiers a good idea?

I was re-reading the fragment selector section in the model document; my reading is that the Recommendation would fix the fragment selectors that a conforming implementation can use.

I think this is a very bad idea. Fragment identifiers are defined all the time; by restricting the list to the fragment identifiers we know about at the time of publishing the specification we will incur the danger of being out of date very quickly and that would require updates of the Recommendation. At this moment we are already missing some on the list, like:

fragment identifiers for CSV files, defined by rfc7111 (this is the open #26 issue on our issue list)
fragment identifiers for EPUB files, called CFI
fragment identifiers for PDF, defined in the PDF mime type registration

And these are only a few examples. Within W3C, there is work on, e.g., Web packaging that may lead to new fragment identifiers defined for web packaging formats (and the publishing community may come up with alternatives for this), and we ourselves may define separate fragment identifiers for the RangeFinder API (as a serialization thereof). In the long term we will lose.

I believe it would be a much better approach to leave this open ended. We should accept fragment identifiers that are officially defined either directly as part of a media type specification (as the one for PDF above) or as separate RFCs (like rfc7111). I am sure there is a list somewhere maintained by IETF to refer to.

Support for search

The protocol should support search and retrieval of annotations according to user/client specified criteria (a query).

This is a tracker issue for progress, which will involve at least the following steps:

Identify and describe Use Cases
Identify or design appropriate query language
Identify or design appropriate transport representations for request (if needed beyond a URI) and response
Write specification
Tests

Yet Another JSON-LD the protocol spec to use?

The protocol spec (4.1.2) says:

The JSON-LD serialization of the Container's description should use the Open Annotation's context, http://www.w3.org/ns/oa. (Additional Constraint)

Are there any strong reasons to recommend using the annotation context URI? Technically speaking, the json in the example 3 would be equivalent to the json below, as far as it follows both JSON-LD and LDP specs.

{
  "@context": {
    "ldp": "http://www.w3.org/ns/ldp#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "iana": "http://something"
  },
  "@id": "http://example.org/annotations/",
  "@type": "ldp:BasicContainer",
  "rdf:label": "A Container for Open Annotations",
  "iana:alternate": ["http://example.org/annotations2/", "http://example.org/moreAnnotations/"],
  "ldp:contains": ["anno1", "anno2", "anno3", "anno4"]
}

I'm afraid that the line only introduces unnecessary confusion. Is it possible to remove it?

Choice should have a priority order

Currently, oa:Choice only allows two priorities for informing clients about the intended preference, the default option and all other alternatives. There is no order of preference within the alternatives. This may be desirable for gracefully degrading across a list of selectors, or representations with decreasing fidelity for body or target.

Justification

The proposal would allow more flexibility, without any increase in overhead. It addresses use cases where there is more priority information available than just a default and a set of equal alternatives, such as q values in HTTP or other prioritization systems.

Proposal

Have a single rdf:List of all the choices in descending priority order.

{
  "@type": "oa:Choice",
  "hasList": ["default", "secondChoice", "thirdChoice"]
}
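
A minimal Python sketch of how a client could consume such a list: walk the options in order and take the first one it can actually use. The `pick_choice` helper and the `is_usable` callback are hypothetical; what counts as usable (a language preference, a selector that resolves, and so on) is left to the client.

```python
from typing import Any, Callable, Optional


def pick_choice(
    choice: dict[str, Any], is_usable: Callable[[str], bool]
) -> Optional[str]:
    """Return the highest-priority usable option, or None if none is usable."""
    for option in choice.get("hasList", []):
        if is_usable(option):
            return option
    return None


choice = {
    "@type": "oa:Choice",
    "hasList": ["default", "secondChoice", "thirdChoice"],
}

# Pretend the default cannot be used, so the client degrades gracefully.
picked = pick_choice(choice, lambda option: option != "default")
print(picked)
```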

Dependencies

Depends on #1 for serialization of the list, and for consistency

Background

oa:Choice is needed for the following requirements:

Alternative bodies, where the user agent selects the most appropriate for the user based on preferences. For example translations of the body, and the user has set a language preference.
Alternative targets, where the same target has multiple URIs or identities. Only one target needs to be displayed, rather than all of them.
Alternative selectors to enable fallback scenarios where a very exact selector will work in some cases, but a more broad selector will be less precise but work in other cases, given a dynamic representation.

Should Annotation concept and document be distinguished?

As per Luc's comment in #7, and the editor's note in the Community specification, the current model does not require separation at the vocabulary level of the conceptual annotation and the instantiation of it as an Open Annotation resource. For example, it is clear that someone annotating a book in the 1800s did not create an Open Annotation document, but did create an annotation that could be modeled using the specification. In a more modern use case, the person that conceptualizes the annotation and the agent responsible for creating the annotation could be different, and the agent responsible for serializing it could be different again.

Further, collapsing the concept and the serialized model is convenient for simplicity, but makes it impossible to express further provenance without breaking out of the model. For example, if it was important to use the full PROV-O modeling features, the serialization and the annotation must be distinguished, and then annotatedAt / serializedAt don't belong.

Justification

The justification for separating them is expressiveness. The justification for not separating them is simplicity. There's always this trade off.

Proposal

Status quo, unless there's a solid use case provided that can't be accomplished.

Background

Links

Annotation Containers should permit non- ldp:BasicContainer

The protocol specification currently requires that Annotation Containers be implementations of ldp:BasicContainer. This prevents implementations from using other types of Container, such as Direct or Indirect Containers or any defined in the future, even if they may be appropriate for business needs.

Use Case: In managing a list of annotations, an LDP implementer wants to use a direct container to assert the membership of newly created annotations within the list which is separate from the container itself. This would most easily be done with a DirectContainer, but this is not possible as the Annotation Container must be Basic.

Note that BasicContainer and DirectContainer are siblings in the LDP class hierarchy, even though DirectContainer could trivially be a more specific subClassOf BasicContainer. 😿

Proposal: Benjamin and Rob to craft spec text that promotes a default of Basic, but allows for others when appropriate.

Should all responses have the anno json-ld profile?

Should all responses from Annotation Servers where the entity-body is a serialized annotation have a Content-Type of application/ld+json with the profile URI for the annotation profile? As most servers will have to explicitly set the Content-Type header, rather than allowing it to be set by an upstream system (such as Apache's mod_mime_magic or similar), the implementation cost seems low.

It doesn't seem totally necessary, but could be useful for systems that do support multiple profiles. Better to be liberal with what you accept and strict with what you send.
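
On the receiving side, a client that supports multiple profiles could check the header along these lines. This is a minimal Python sketch assuming the registered JSON-LD media type application/ld+json and its space-separated profile parameter; the profile URI and the `has_annotation_profile` helper are placeholders, since no profile URI has been settled in this issue.

```python
from email.message import Message

# Placeholder profile URI; the real one is still to be defined.
PROFILE_URI = "http://example.org/ns/anno-profile"


def has_annotation_profile(content_type: str, profile_uri: str = PROFILE_URI) -> bool:
    """Check whether a Content-Type value is JSON-LD carrying the given profile."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "application/ld+json":
        return False
    profile = msg.get_param("profile")
    if not isinstance(profile, str):
        return False
    # The profile parameter may list several URIs separated by spaces.
    return profile_uri in profile.split()


header = 'application/ld+json; profile="http://example.org/ns/anno-profile"'
print(has_annotation_profile(header))
```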

Annotation alsoKnownAs <uri>

Requirement:
In order to deduplicate annotations across multiple systems, it would be useful to know where the annotations were originally harvested from. Also, if a client assigns an internal URI (such as a UUID) to the annotation, recording this in the model would be valuable so the client can later re-discover the annotation.

Discussion:
In the CG, the model included oa:equivalentTo. This seems a much broader issue than just ours -- is there a better relationship that we can make use of? iana:via? prov:derivedFrom?

Should the namespace change?

From #46, there is the question of whether the namespace should change for the model.

Note that this is only a concern from the RDF perspective, not the preferred JSON serialization which won't have even a prefix, let alone the complete namespace (per #12).

From the telcon on 2015-07-08, some of the discussion included:

There's a lot of use of the namespace
Changing namespaces is generally a bad idea
Backwards compatibility without changing the namespace is important
The CG spec is explicitly a draft, so we need not feel too constrained by changing the definitions in a non-backwards-compatible way
It demonstrates continuity and inclusion, rather than division and competition, hopefully avoiding splitting the community of practice
It has the oa acronym ... which would be confusing without the history

The decision on the call was to defer the decision until later.

Role of Target/Body w.r.t. the Annotation?

Should the particular role of the Body or Target be explicitly modeled in the Annotation? This is related to #4 in which some Bodies are Tags and some are Comments.

Luc: I understand for multiple targets, we may want to distinguish roles of targets. If you come from JSON background you might want dictionaries, key / target pairs. Why don't we look at other forms of collections besides lists? minutes

Background:
Choice, Composite, List were the generic constructs discussed in the community group, and further roles were considered as community/domain specific.

Multiple JSON-LD Contexts

It is possible to have multiple JSON-LD contexts applied to exactly the same model to generate different serializations with the same structure. Compare the CG's context with the current WG's context, for example. The protocol should specify how to request the annotation using a particular context. For Create and Update this is not relevant as the context can be part of the JSON-LD payload of the request.

Proposal:
There is a recommended best practice of using profile URIs [1] from RFC 7284 [2]. The WG would need to register a profile for the base context and frame/structure, and further communities would then register their own profiles to specify alternatives. The registered profile URI is then carried in the Link header [3] on the HTTP request/response with a rel of 'profile' [4].

Thus the header might look like:

Link:  <http://www.w3.org/profile/oa/1>;rel="profile"

[1] http://www.iana.org/assignments/profile-uris/profile-uris.xhtml
[2] http://tools.ietf.org/html/rfc7284
[3] http://tools.ietf.org/html/rfc5988
[4] http://tools.ietf.org/html/rfc6906
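
A minimal Python sketch of pulling the profile URI back out of such a header follows. The regular expression covers only the simple `<uri>;rel="..."` shape shown above, not the full Link header grammar, and the `profile_links` helper and the second link in the sample are made up for the example.

```python
import re

# Simplified pattern: <uri> followed by a rel parameter, quoted or not.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def profile_links(link_header: str) -> list[str]:
    """Return the target URIs of all links whose rel includes 'profile'."""
    return [
        uri
        for uri, rel in LINK_PATTERN.findall(link_header)
        if "profile" in rel.split()
    ]


header = (
    '<http://example.org/annotations/?page=2>; rel="next", '
    '<http://www.w3.org/profile/oa/1>;rel="profile"'
)
profiles = profile_links(header)
print(profiles)
```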

Distinguishing Semantic Tags from Information Resources

The current model assigns the oa:SemanticTag class to existing Non-Information resources when they are used as a Tag in an Annotation. Given the open world, this asserts that the NIR is always a SemanticTag, not only in the context of the annotation.

Justification

This is a problem for two reasons:

We are polluting the global knowledge about a resource for a local usage scenario.
The same NIR could be used even within the annotation space as both a SemanticTag and the real world object that the URI identifies. For example one might imagine tagging the URI for the (physical) Eiffel Tower with the URI for the (physical) Paris, or Tower, or ...

Proposal

Proposed solution is to always use the same pattern as for using documents as tags in the draft, but with a different predicate along the lines of "hasConcept". Perhaps something from SKOS?

{
 "@type": "oa:Annotation",
 "oa:hasBody" : { 
    "@type": "oa:SemanticTag",
    "xxx:hasConcept": "http://dbpedia.org/resource/Tower"    
  },
  "oa:hasTarget" : "http://dbpedia.org/resource/Eiffel_Tower"
}

Background

This was not seen as crucial to solve in the CG as the likelihood of the pollution actually being relevant is quite low. There's very little danger in asserting that every non information resource is a SemanticTag, because SemanticTag has minimal implications.

The rationale for distinguishing the body resource as a Tag, and having oa:tagging as a motivation comes from multiple bodies on a single annotation. For example, commenting about a particular span of text and tagging it as needing to be updated at the same time.

Links

Tracker

Update skos:related to skos:exactMatch for semantic tags

The tag is the same concept as the one expressed by the object of the triple, so skos:exactMatch describes the relationship more accurately than skos:related.

Create dereferenceable JSON-LD context document

... And fix the URI in the model appendix from /xxx/yyy (!)

Decision needed as to where it lives (#36) and whether it includes protocol information as well as model.

Add reference to CORS

The protocol spec refers to CORS but there is no reference to that REC in the references (http://www.w3.org/TR/cors/).

JSON-LD contexts for Model/Protocol

The protocol currently refers to the model's default JSON-LD context. However, it requires additional protocol-specific features beyond the annotation model itself, such as ldp:contains, ldp:BasicContainer, iana:alternate and similar.

We would be mixing concerns by including those protocol-specific features in the model's context document. The same would be true if we included all of the model's features in the protocol's context. There is the option of multiple context documents; however, this is not known to be a common pattern.

Proposal: Create a common context document with both the protocol's and the model's features, and reference it from both specifications. This might create administrative challenges in the future (e.g. a revision of the protocol specification will change something referenced by the model), but that is better than requiring multiple contexts.
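
The two options discussed above can be sketched as follows. JSON-LD allows "@context" to be either a single value or an array of contexts; the context URLs, the container URI, and the helper name below are all hypothetical placeholders, since no location for the context documents has been decided.

```python
# Hypothetical context locations, used only for illustration.
MODEL_CONTEXT = "http://example.org/ns/anno-model.jsonld"
PROTOCOL_CONTEXT = "http://example.org/ns/anno-protocol.jsonld"
COMMON_CONTEXT = "http://example.org/ns/anno.jsonld"


def with_context(document: dict, *contexts: str) -> dict:
    """Return a copy of the document with "@context" set.

    One context is written as a plain string, several as an array,
    which JSON-LD processes in order.
    """
    if not contexts:
        raise ValueError("at least one context is required")
    value = contexts[0] if len(contexts) == 1 else list(contexts)
    rest = {k: v for k, v in document.items() if k != "@context"}
    return {"@context": value, **rest}


container = {
    "@id": "http://example.org/annotations/",
    "@type": "ldp:BasicContainer",
    "ldp:contains": ["http://example.org/annotations/anno1"],
}

# Option A: one common context shared by model and protocol.
common = with_context(container, COMMON_CONTEXT)

# Option B: separate model and protocol contexts listed together.
separate = with_context(container, MODEL_CONTEXT, PROTOCOL_CONTEXT)
```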

avoid constraining HTTP

http://www.w3.org/TR/2015/WD-annotation-protocol-20150702/#http-requirements has MUSTs in there that try to constrain HTTP servers. this is not something that HTTP servers reasonably can be required to do. more specifically, clients should have no knowledge of "specific servers" anyway; they simply follow links and interact via HTTP to accomplish application goals. they may interact with one or various "specific servers" along the way, and the web thrives because clients are not tightly coupled to specific servers. clients send self-contained HTTP requests and then have to handle requests individually. no assumptions should be made that go beyond the single request/response scope.
for example, the WD says "All supported methods for interacting with the Annotation Container MUST be advertised in the Allow header of all responses from the container." this constrains HTTP which defines a MAY (http://tools.ietf.org/html/rfc7231#section-7.4.1), and it does not accomplish anything because in the end, clients can try any method and servers can change their minds between request/response interactions. so in the end maybe "Allow" can be a helpful hint, but it's optional, not reliable, and clients still have to deal with servers responding with 405s (which per HTTP spec then MUST have "Allow").
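
A minimal sketch of the client behaviour described here: Allow is read as an optional hint, and each response is interpreted on its own, including a 405. The function names and the shape of the returned dictionary are assumptions made for illustration, not part of any specification.

```python
def parse_allow(headers: dict) -> set:
    """Return the methods listed in an Allow header, or an empty set if absent."""
    for name, value in headers.items():
        if name.lower() == "allow":
            return {m.strip() for m in value.split(",") if m.strip()}
    return set()


def interpret_response(method: str, status: int, headers: dict) -> dict:
    """Summarize what a client learned from one request/response pair.

    Nothing is assumed about earlier or later interactions: a method that
    was advertised before can still be refused with a 405 now.
    """
    allowed = parse_allow(headers)
    if status == 405:
        # Per HTTP, a 405 response carries Allow; treat it as the current hint.
        return {"succeeded": False, "alternatives": sorted(allowed - {method})}
    return {"succeeded": 200 <= status < 300, "alternatives": []}
```

For instance, a PUT answered with 405 and "Allow: GET, HEAD, OPTIONS" yields the alternatives GET, HEAD and OPTIONS, while a successful GET with no Allow header at all is still handled normally.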

Require dct:Text class for interpretation of literal bodies

To distinguish between tags and comments. This explicitly makes the body a comment and never a tag.

Annotation protocol available or not?

If the annotation protocol is not mandatory to support along with the model, how can a client determine whether the protocol is supported? If it is not supported, is there a way to describe or at least identify the service that /is/ available?

The protocol can advertise itself, via an HTTP header for example, but that doesn't help with other protocols. We should determine the scope of the interaction between the model and arbitrary protocols, and how our protocol can identify itself.

Unable to have a graph as the body of an annotation

Requirement: Have an explicit named graph as the body of an annotation to provide a semantic, machine-readable "comment". This might be used to assign properties or relationships to the target.

Justification: In human language, a comment of "I like this" would be perfectly acceptable as the body of an annotation. To express this concept in a semantic way, it could be encoded as a triple whose predicate is fb:likes, relating the user to the thing they like. To keep this triple separate from the annotation's own triples, it needs to be in a separate graph.

Proposal:

Allow a named graph as the object of hasBody.

{
  "@id": "http://example.org/annos/1",
  "@type": "oa:Annotation",
  "body": {
    "@graph": {
      "@context": "http://example.org/social/context.json",
      "@id": "http://example.org/users/rob",
      "fb:likes": "http://example.com/logo.jpg"
    }
  },
  "target": "http://example.com/logo.jpg"
}

(JSON-LD playground link: http://tinyurl.com/jvurj2m)
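
The separation this proposal is after can be sketched in a few lines: the statements inside the body's graph are read on their own and never mixed with the annotation's own properties. This is not a JSON-LD processor; the helper name is hypothetical, it assumes the "@graph" value is a single node object as in the example above, and it leaves prefixes such as fb: unexpanded.

```python
annotation = {
    "@id": "http://example.org/annos/1",
    "@type": "oa:Annotation",
    "body": {
        "@graph": {
            "@context": "http://example.org/social/context.json",
            "@id": "http://example.org/users/rob",
            "fb:likes": "http://example.com/logo.jpg",
        }
    },
    "target": "http://example.com/logo.jpg",
}


def body_graph_triples(anno: dict):
    """Yield (subject, predicate, object) for the node inside the body's graph.

    Keys starting with "@" are JSON-LD keywords, not statements, so they
    are skipped.
    """
    node = anno["body"]["@graph"]
    subject = node["@id"]
    for key, value in node.items():
        if not key.startswith("@"):
            yield (subject, key, value)


triples = list(body_graph_triples(annotation))
```

The only statement found in the body graph is the fb:likes triple; the annotation's own "body" and "target" properties stay outside it.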

Update protocol WD from 2015-07-22 call

Merge 4.1 and 4 into just Retrieving Annotations
Update references, per @fjh's comments
MAY to SHOULD in 5.2.1 with At Risk note

Create reference set of correct/incorrect annotations for test development

In order to drive testing, we need examples of valid and invalid annotations. Valid to ensure that the test framework correctly discovers all of the patterns that are okay and doesn't reject them, and invalid to ensure that the framework doesn't mistakenly assert that an annotation is okay when it isn't.

Tentative decision was to associate these with the specification, rather than maintaining them in the testing framework, in order to generate broader exposure to the resources for other systems to use. For example, an annotation client might wish to use the same content as unit tests for its own functionality. Also it's easier to keep them up to date when they're with the specification rather than potentially skipped over when they're out of sight in the testing area.

Tentative decision was to use the current working drafts, not FPWDs, as the source of these examples to be as up to date as possible.

4ba6e17 is first step towards this for model.

Correct Examples
Collection of Correct Examples
Error Examples

Client can't determine if user has authorization to modify annotation

Clients typically display whether the current user can modify an annotation when presenting them to the user. The client might wish to present an edit button only when the edit is possible, rather than for every annotation on the chance that the user can edit it.

Currently there's nowhere in the model where this information can be conveyed.

Recommend StillImage instead of Image?

dcterms:Image is the superclass of StillImage and MovingImage. Should we replace the recommendation for Image with StillImage to avoid potential inferential confusion?

JSON-LD Context keys

The current JSON-LD context maps the ontology predicates directly to keys of the same name in the JSON. This means that "hasBody" is the key in the JSON, whereas "body" might be more intuitive. By reusing the predicate names as keys, we miss out on one of the main strengths of JSON-LD: that the serialization can look like ordinary JSON.

Proposal:

Rename the keys in the context to be more familiar.

Background:

This was discussed in the CG but rejected.
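
As a sketch of what the renaming could look like, the context fragment below maps shorter keys onto the existing predicates, and a small helper rewrites documents that still use the old keys. The choice of "body" and "target" as the new keys and the helper name are assumptions for illustration; only the oa: predicates come from the model.

```python
OA = "http://www.w3.org/ns/oa#"

# Hypothetical context fragment: friendlier keys, same predicates.
FRIENDLY_CONTEXT = {
    "oa": OA,
    "body": {"@id": "oa:hasBody", "@type": "@id"},
    "target": {"@id": "oa:hasTarget", "@type": "@id"},
}

# Old key -> new key, for documents written against the current context.
RENAMES = {"hasBody": "body", "hasTarget": "target"}


def rename_keys(value):
    """Recursively replace old context keys with their friendlier names."""
    if isinstance(value, dict):
        return {RENAMES.get(k, k): rename_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_keys(v) for v in value]
    return value


old_style = {
    "@type": "oa:Annotation",
    "hasBody": {"@id": "http://example.org/body1"},
    "hasTarget": "http://example.com/page",
}
new_style = rename_keys(old_style)
```

Because only the keys change and the context still maps them to the same predicates, both documents would describe the same graph.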

Draft "Client-Side API for Annotations"

Obviously the title of this issue is easier said than done. :)

Currently the farthest progress I've stumbled upon along this front is DOMAnnotations by @nickstenning, which is a great first start!

Continuing along a thought process I started in an ActivityPump issue, I'd like to probe to see if there's a single Web Interface we can standardize that can help with both this deliverable of Web Annotations as well as part of the Social API Deliverable of the Social WG.

Thus, I'd like to throw another contender into the ring, a DOMActivity API, where one use case is an ActivityEvent where .activity is of the form "{actor} create {annotation}". Please annotate my straw man (for now) via GitHub Issues.
