oasis-tcs / cti-taxii2 Goto Github PK

View Code? Open in Web Editor NEW

9.0 9.0 4.0 4 KB

OASIS CTI TC: An official CTI TC repository for TAXII 2 work

Home Page: https://github.com/oasis-tcs/cti-taxii2

License: Other

cti-taxii2's People

Contributors

Stargazers

Watchers

Forkers

4n6strider masterscott 6un9-h0-dan justcherie

cti-taxii2's Issues

Statements About Sorting Appear to Conflict But on Closer Reading Do Not - Needs Clarification

In section 3.3, sorting, we have a conflicting statement.

For the Collection and Manifest Endpoints, objects MUST be sorted in ascending order by the date the object first appeared in the TAXII Collection (i.e., the added date).

Then further down we have

For the Collections Endpoint, Collections MUST be sorted and MAY be sorted in any order.

Investigate pagination support

We need to identify if pagination support is required for TAXII. We had item based pagination support in TAXII 2.0, and took it out for TAXII 2.1. The problems we ran in to were data sets that can change rapidly make item based pagination impossible. Further, item based pagination proved to be very computationally expensive for large data sets.

There is a use case where a system may add millions of records to the TAXII server in a single database transaction. Situations like this, may also record the same "date added" for each record in that database transaction. This means, that date added based filtering / pagination would not be possible.

There needs to be some sort of solution that can allow a client to tell the server, based on some monotonically increasing counter to start there and give you records either before or after that point.

Success criteria for this feature is the ability to handle rapidly changing datasets, datasets that are really large, and provide a performant solution.

The endpoints that need pagination are:
GET /collections/ - see section 5.1.
GET /collections//objects/ - see section 5.3.
GET <api-root/collections//objects// - see section 5.5
GET /collections//manifest/ - see section 5.6.

The Object by ID resource can contain a significant number of object versions, which become unwieldy to manage in a single request/response pair. Without a mechanism to manage highly-versioned objects, effective transport is significantly limited.

How do you handle content that you need to down convert

This one is more of an ethical or philosophical problem. Say my server supports STIX 2.0 and STIX 2.1 content. I get some data that is in STIX 2.1 format. Then someone comes along and asks for content in STIX 2.0 format. What do I do with the fields, properties, objects, relationship types, vocab terms, etc that are not valid in STIX 2.0?

What happens in a STIX Grouping that has content that is in both STIX 2.0 and STIX 2.1. Meaning, what happens if it has indicators and notes and opinions. Do you send the indicator and not the note and options? Do you prune the notes and opinions from the grouping? What do you do with confidence fields?

Update JSON Reference

TAXII 2.0 references RFC 7159 for JSON. It was obsoleted by RFC 8259

TAXII 2.1 should reference RFC 8259. Additionally, RFC 8529 changes course on Character Encoding, moving away from non-UTF-8 payloads.

Add clarifying text around the types of errors you should get and when

We probably need to add a bit more clarifying text to the spec to say what should happen under the following conditions, so that people implement this the same way:

If no data is returned for a filter
If the filter parameters are wrong
If some of the filter parameters are right and some are wrong

RFE: TAXII Channels

TAXII requires a data-bus type of mechanism where data can be posted to subscribers for receipt without the implementor of the TAXII server having to worry about storing that data for all time, as they currently have to with TAXII collections.

It has been suggested that this use case might also be able to be fulfilled by adding some notion of TTL to TAXII collections. I will leave this open to the possibility that this may also be a way to fill the use case.

The ID field in the status resource has the wrong data type

Location: Section 4.3.1

The data type for the ID field in the status resource is wrong. It is set to a string and then in the description it says that it is an "identifier". This is not how we have done this elsewhere. The type needs to be changed from string to identifier. A suggestion has been made in the 2.1 document to fix this.

Object Sort Order is Confusing and Surprising to Some

In the document we need to add a lot more clarity around what content is returned based on what type of query. My proposal would be:

If there are is no added_after or added_before like filter than the results should be the newest ones in the collections sorted in reverse order (not sure on the sort order). Meaning, if you have 100 records, 0-99, and you limit the results to 10 records. You should get records 99-90 or 90-99 depending on which sort order we decide on.
If you have an added_after or added_before then it should just start at these points. I am once again, not 100% sure on the sort order for each of these. But we should be consistent as much as possible so that a client can have a predictable interaction with the server.

Advertising a version of TAXII at a CSD level is not currently possible

Given the way we advertise the TAXII version via the API Root and Media Types, it is not clear how one would go about saying they are using TAXII 2.1 CSD01 for example.

The example for the get_objects endpoint is wrong

Section 5.5

The response uses old syntax for the bundle:

{  
  "type": "bundle",
  ...,
  "indicators": [
    {
      "type": "indicator",
      "id": "indicator--252c7c11-daf2-42bd-843b-be65edca9f61",
      ...,
    }
  ]
}

There is only one property now to store the SDOs and SROs - objects

Discovery Resource Does not Clearly Articulate API Root Allowable Values

Currently the specification is silent on if the URLs inside api_roots can be relative or not.

I would prefer that relative be allowed as it would make creating a modular TAXII service library simpler.... but regardless of the decision, we need to be clear here.

Manifest Resource Cannot Accurately Specify All Media Types and Version Combinations

For each object, the manifest resource specifies a list of versions and a list of media types. However, not all version and media type combinations are necessarily valid. For instance, a Collection may contain an object where v1 is available as STIX 2.0, and v2 is available as both STIX 2.0 and STIX 2.1. The manifest resource cannot accurately specify this condition.

The proposed change is to:
Reduce the scope of each manifest resource from "one manifest for all versions of an object" to "one manifest per object version and media type"
Modify the Manifest resource to permit only a single value for object version and media type.

This change breaks backward compatibility, and the impact is estimated to be moderate. While a large portion of the specification will change in a backward incompatible way (Section 5.6), the STIX2/TAXII2 Preferred program does not consider this area of the specification, and few implementations have implemented this area of the spec.
We discussed this at the F2F and the consensus in the room was that we should make the proposed change to the manifest resource.

This will solve the various pagination problems that exist. This means the object id, date_added, and version would just be a simple string and each "version" of an object will have its own entry in the manifest objects list. Fixing this will also resolve issue #30

API Root TAXII Version does not match Media Type Version

In the API Root resource we advertise the TAXII version as "taxii-2.0" or "taxii-2.1". However, in the media negotiation we do "version=2.0". These should probably be the same or should be done in such a way to make it more clear.

Object by ID Resource Responses Can be Huge and Needs Pagination Support

The Object resource can contain a significant number of object versions, which become unwieldy to manage in a single request/response pair. Without a mechanism to manage highly-versioned objects, effective transport is significantly limited.

The proposed change is to add pagination to this resource in section 5.5. This change is backward compatible.

Add ability to find all objects related to a particular STIX object ID, to prevent an indeterminate number of queries to find them all

One should be able to find objects that relate to a given ID, either embedded or external relationship.

Add Collections to X- Header list

We provide two headers useful for pagination outside of the standard way:

X-TAXII-Date-Added-First
X-TAXII-Date-Added-Last

They are required for:

GET <api-root>/collections/objects/
GET <api-root>/collections/manifest/

They should also be required for GET /<api-root>/collections/, since it also requires pagination support.

Add Delete Capability to Object Endpoints

We need to discuss how a client that has posted content to the taxii server can delete the objects from the server collections.

Currently, there is no easy way to do that and using revoke of an object has problems.

Adding so that we can discuss.

As a User, I want to auto dereference externally related objects, so that I don't have to make multiple web requests to get all the information

Externally related means "not embedded" e.g., sightings, SROs

Version key word 'all' should not be allowed when using multiple version parameters

Section 3.5.1

The spec does not currently say anything about using the "all" keyword in addition to other keywords. The "all" keyword should be restricted from multiple version selectors... Examples:

?match[version]=all = OKAY
?match[version]=last,first = OKAY
?match[version]=last,all,first === THIS SHOULD BE ILLEGAL

GET /status/<id> should be allowed to return HTTP response code 406 (Not Applicable)

TAXII 2 defined this //status// endpoint that allows one to poll for status of adding objects to a collection, to allow for cases when adding to a collection might be an asynchronous task.

However, due to how the spec is ambiguously worded, it is unclear to the implement what they are supposed to respond to this endpoint if they process inserts immediately.

The only thing the spec says is "TAXII Servers SHOULD provide status messages at this Endpoint while the request is in progress until at least 24 hours after it has been marked completed." - that statement is basically meaningless without more information as to what should be replied to a request for status that was immediately fufilled, because the only allowed response code is 404 "Not found".

HTTP2

There is no reason not to allow this in 2.1

We have implemented this in our server.

RFE: TAXII Observed Data Query

RFE to allow a way to query observed_data objects that match a given SCO pattern. Once the consumer retrieves those objects, they can pull other relationed objects if they desire.

Version filter description only talks about object(s)

In section 3.5.1 the version filter only talks about being used on object(s), when in reality it is also supported on manifests.

Need to clarify which data is returned, the oldest data or newest data.

We need to add some clarify around what data is returned if the client is making its first request of the system without any pagination or filtering controls and the server is limiting the data coming back. For example, say there is 100 million records and the client asks for all of them. If the server limits the client to 1000 records, which 1000 records should be returned. The oldest 1000 records or the newest 1000 records.

No way to select a range of versions

In 3.5.1, we can specify a specific version, but there is no way to select a range of versions. For example, say there are 1000 versions of an object, w/ 900 of them before Dec 1st, 2017, but you don't know the newer version yet, but only want the ones that happen after (or before)?

something like [version]=2017-12-01T00:00:00.000Z-

BLOB Store

The Issue

JSON is great for managing structured text. It's terrible for managing binary large objects (BLOBs).

Base64 is inefficient
Most JSON libraries decode the entire structure into memory without a SAX-like/streaming implementation

HTTP solutions have mechanisms in place to make dealing with large downloads easier:

Range Requests for resumable downloads
existing libraries handle POST from and GET to disk
caching

Most databases aren't designed to handle BLOBs and structured text in the same object. STIX 2.0 recognizes this to some extent, providing a URL alternative to Base64-encoded binaries in the url field of the Artifact Object. A custom property could choose to use a similar scheme. However, how to handle this URL is left as an exercise for the implementor:

It is incumbent on object creators to ensure that the URL is accessible for downstream consumers.

The requirements around this URL-based BLOB offloading may include:

Access controls similar to the STIX 2.x objects that reference them
We find and authenticate to the URL the same way can retrieve the object, i.e. TAXII 2.x.
TAXII 2.x implementations MAY verify the existence of a value at the URL before inserting the associated object. It makes no sense to add an Artifact with no value.

If we do not provide implementors a standardized mechanism to handle large BLOBs, we raise the barrier to entry. Proof of concept codebases may blow-up. Without a standardized way to upload BLOBs, it becomes impossible to upload an Artifact that exceeds an API Root's max_content_length. Clients will need to write custom JSON code to stream values larger than available RAM, etc.

STIX Patterning Language is another potential gotcha for large STIX/JSON objects. I don't see an existing way to offload a block of STIX Patterning Language that includes a massive Base64 binary. This could trip up Pattern Language lexers/parsers dealing with large BLOBs that exceed available memory.

The Proposed Solution

TAXII 2.x servers MAY implement the BLOB endpoint.

POST <api-root>/collections/<id>/blobs/
GET <api-root>/collections/<id>/blobs/<blob-id>/

Read/Write access to BLOBs will be governed by the existing can_read and can_write attributes on Collections.

I have deliberately chosen blob over a term like artifact because a custom property could reference a BLOB.

Sponsorship

NineFX will implement a prototype endpoint to support this.

TAXII Discovery URL Collides With Existing Product URLs

We should look at adding NAPTR record support to TAXII to further enable discovery of TAXII services.

Make Error responses optional

Issue

TAXII 2.x Error Messages take an opinionated view of where the TAXII server sits in an enterprise. The standard mandates TAXII-specific JSON responses to common HTTP errors:

TAXII responses with an HTTP error code (400-series and 500-series status codes, defined by sections 6.5 and 6.6 of [RFC7231]) that permit a response body (i.e. are not in response to a HEAD request) MUST contain an error message (see section 3.6.1) in the response body.

This raises a number of issues for implementers. Services often sit behind load balancers and gateways.

Scenarios

Scenario 1 - Authentication Gateway

I have a security gateway that handles authentication via basic auth or client certificates. I place my TAXII server behind this gateway. If authentication/authorization fails in the gateway with an HTTP 401, I have to intercept the auth gateway's response and lift it to JSON.

Scenario 2 - Libraries

A number of enterprises and vendors have software components for auth. Instead of dropping them in a stack, I have to do something similar to Scenario 1 and drop another piece of middleware between the auth layer and the client in my stack to lift the vanilla 401 to JSON.

Scenario 3 - General Purpose Gateways

Load balancers and tools like web application firewalls will process requests before they hit TAXII for any sizable deployment. They are likely to handle the following HTTP codes in sections 6.5 and 6.6 of RFC 7231:

400 Bad Request
404 Not Found
405 Method Not Allowed
408 Request Timeout
413 Payload Too Large
414 URI Too Long
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported

They won't respond with TAXII-specific JSON

Limited Value

There is limited value in many of the standard error fields. http_status is already in the response. I don't need a title on a 404, because this is machine-to-machine communication. You can believe that a request to /foo/collections/170f24af-c685-411d-bd2a-f45248adb245/ is "Collection not found". There is no way to know. It could be "not found" on the Get API Root Information endpoint where the API Root is /foo/collections/170f24af-c685-411d-bd2a-f45248adb245/. The same goes for invalid parameters. You may think that /foo/collections/170f24af-c685-411d-bd2a-notarealuuid/ is an invalid UUID parameter to a Collection, but it can also be a valid API Root. If you drop some crafted paths in a deterministic routing engine like Yesod's, you'll see how you can abuse nondeterminism in TAXII's API Root support.

Recommendation

TAXII responses with an HTTP error code that permits a response body (i.e. are not in response to a HEAD request) MAY contain an error message (see section 3.6.1) in the response body.

I have intentionally removed the HTTP 1.1 reference because I'd like to see optional HTTP2 support in 2.1.

The X headers are not clear about which date

In the spec we defined two headers X-TAXII-Date-Added-First and X-TAXII-Date-Added-Last. My question is, should this be the time the object was added to the server or the time it was added to the collection that you are pulling from... When I wrote this originally, I was thinking about the time it was added to the server. But now that I have working code and am trying to implement this feature, there is a difference between when an object is added to a server and when it was put in a collection. Some times these will be the same time, other times, an object may be added to a collection after some sort of analysis, which means it was already in the server.

Add clarifying text around can_read and can_write attributes of Collection

In the table in section 5.2.1:

For can_read, add If true, users are allowed to access the Get Objects, Get an Object, or Get Object Manifests endpoints for this Collection. If false, users are not allowed to access these endpoints.
For can_write, add If true, users are allowed to access the Add Objects endpoint for this Collection. If false, users are not allowed to access this endpoint.

In sections 5.3, 5.5, and 5.6 add If the Collection specifies can_read as false, this Endpoint SHOULD return a HTTP 403 error. (or maybe MUST)

In section 5.4, add If the Collection specifies can_write as false, this Endpoint SHOULD return a HTTP 403 error.

In all of the above, we may want to add "for a particular user", since I believe different users can have different rights to the same Collection.

Need to change the media type for TAXII per OASIS / IETF / IANA

We need to change the media type that we use for TAXII. We are supposed to use the standards tree, not the vendor tree.

Get STIX Object by ID does not describe why a bundle is required

We should add some clarifying text around why a bundle is needed for get an object by ID. The reason is that you might get more than one version of the STIX object.

Returning the number of results is not explicitly clear

In the 2.0 specification we talk about the server returning the total number of results. It is felt that this is not super clear. Does this mean the number of records in the collection or the number of records that match a certain query. All it says is "in a result set". If we keep this in our new pagination style, when that is designed, it should probably be made to be optional and it should probably be the total number of objects / items in the collection, not the number of objects that match the current filtered query.

As a User, I want to traverse the STIX graph over TAXII in an efficient manner, so I don't waste resources

Key components:

Embedded references
SROs
Depth
Graph cycles/loops
referenced_by vs references (e.g., graph directionality)

Add versions entry point on object by ID

We should add a manifest like entry point on the objects URL so that you can then get the list of all of the versions for a specific object ID.

max_content_length example value is not exemplary

Section 4.2.1

The max_content_length on the API Root resource needs some clarifying text about what this field is and how to compute it. Maybe an example or two for "common" possible values. In addition to this the example has a fake value that we picked out of thin air as a place holder and never went and updated. The example of 9765625 would equal 9.313MB

Add a filter of added_before

We should add a filter to allow historically discovery of information from a TAXII collection.

Need ability to request related objects in one request to a distance of 1(?)

TAXII should support the ability for a client to tell it to automatically send external relationship objects and their end points to some depth level. This depth level should be configurable on the server and probably advertised either at the api-root level, server level, or maybe even collection level.

The idea is if you ask for a malware object, you could also say, give me all relationships that point to it and the objects on the other side. We would need to be careful here as this could be very intensive for the server to walk the graph. Thus the need to have a depth parameter. Maybe the default depth should be 1. Meaning, just send the external relationships. Then if people want to get the other side, they can make that call separately. Or they could use the auto dereference feature to get the other side.

Now we may chose to not do the depth, and that is okay, but we do need to provide a way to tell the server to give you the relationships.

Add support to get embedded objects in a single request, dereference embedded objects

TAXII should support the ability to automatically dereference objects. Meaning, if you request a STIX indicator and the client says auto-dereference the object, then the TAXII server should send the identity object that is linked via the created_by_ref at the same time.

Manifests Can Advertise Media Types Beyond Collection Capabilities

In Section 5.6.1 we need to add some clarifying text to the media_type property of the manifest that says they need to be the same or a subset of the media_types listed on the collection.

Add clarifying text about the date of objects in a collection that have versions

This is semi implementation specific, however, it would be good to make sure everyone does this the same way. The problems comes from having multiple versions of an object. Say you have 10 versions of an object and the server limits you to only 8 objects at a time. If you do a filter for added_after or no filter at all, you can run in to a situation where the x-headers coming back might have the same date, depending on how you interpret the specification. This will cause an automated client to get the same content over and over and over.

So while some systems will implement this differently and may not see this problem, others may. It would be good to add clarifying text to make sure clients do not get in to this situation.

Trailing-slash normative requirement for resource identifiers is missing

All of the examples and resource definitions call out that the TAXII resources have a trailing slash. However, we forgot to add a normative statement saying as much. I propose we add a simple normative statement that says "All TAXII resources MUST have a trailing slash, as shown in their definitions." Or something like that.

Add support for Channels

We need to add support for channels (pub/sub) in TAXII.

Item Based Pagination is Unusable for Rapidly Changing Datasets

I have some concerns with how we designed pagination in TAXII 2.0. The whole range aspect which is basically a limit / offset design only works well if your collections are small. Once your collections get to a certain size, regardless of the database you use (MySQL, Postgres, Oracle or even a NoSQL), the performance impact is huge and the wheels come off the bus. Apparently this is very well known issue with REST APIs and large web2.0 companies basically say, never use range (limit/offset) for REST APIs.

Now there are several ways that you can try and get around this problem (like caching data, using pages, using cursors, using result tables and doing in-memory ranges), but non of them are very good and represent a lot of unnecessary hackery. After doing a lot more reading and trying to implement this in various ways with various database backends, I am thinking that there is a better way. This solution would be pretty simple to implement and would greatly improve performance.

I propose that we drop our pagination design and just use added_after with some limit value. This would represent very little change to the overall architecture, other than dropping some sections and rewording a few normative statements. From a code stand point, it would be MUCH easier to implement.

A client could then just say, "Server give me all records after 2016". The server could say "Hey client, you can have 100 records from 2016-01-01 and the last record I am sending you is 2016-03-01". The server could also optionally tell the client that there are 20,000,000 records in the collection.

If the client did not give an added_after filter, the server could just give the client the latest records that were written to the collection, up to the limit size that the client and server both support.

From a performance standpoint, this works a lot better and results in a lot less latency.

More Details

The concepts of items (what we have today) is only valid within a given instance of a single request. There is no guarantee that the server will have the same data associated with the same item numbers for a consecutive query.
The server could add or delete data between queries and the client will either get redundant data or will miss data.
About the best the server could do is hold a results table for a given TTL, and then expire it. So if the client does not make requests within that TTL, all bets are off. This solution would increase complexity and burden for no apparent value.
A client will never be able to know the associated item numbers for data without the server. Meaning, if the client asked for 1,991,124,214 - 1,991,124,800, there is no predictability on what the client will get.
About all the client will know is when was the last time it queried the collection and what did the server send as far as data.

So lets look at the use-cases of a client interacting with the server:

Step 1: Has the client previously talked to the collection:
If Yes: Then the client will know the timestamp of the query it last used and can use that information to gather information about what has changed.
The client will make a request with an added_after url parameter and the server will respond with data up to the limit that the server will support. The client could provide a limit that it will accept. If it is greater than the server limit, then the server limit wins. If the client limit is smaller than the server limit, then the client limit wins.
The server will also send the ending timestamp for the data it is delivering. The client can then use this to get additional data.
No: The client will just pull the collection and it will get the most current data. The server will also be able to tell the client the beginning timestamp and the ending timestamp for the data it sent and the server should be able to tell the client how many total records there are in the collection.
The client can use this information to gather additional data in the future by using the added_after URL parameter.
The client can ask for additional historical data by using a added_before URL parameter.

Manifest Resources Cannot Accurately Specify All Media Type and Version Combinations

The manifest resource currently allows an array of media types to be declared for an object. However, there are situations where an object may have more than one version and each version of the object may be in a different version of STIX. There is no way for the manifest resource to say that this version is in this media type and this version is in that media type.

As a side note, I am not sure putting the media types on the manifest record is a good thing. I am thinking that it would be better to advertise this at the API root or collection level. Or if we need to keep it, it should be changed from a list of media types to just a string value and force all versions to be of a single version.

Section 4.3.1. Example should use total_count instead of total_objects

Instead of

{
  "id": "2d086da7-4bdc-4f91-900e-d77486753710",
...
  "total_objects": 4,
...
}

Should be

{
  "id": "2d086da7-4bdc-4f91-900e-d77486753710",
...
  "total_count": 4,
...
}

How do we handle version based content negotiation

If you only support say STIX 2.1 content and someone requests STIX 2.0 content, the spec says that you should return a 415 error code. When we wrote the spec, that seemed like a good things to do. However, I now feel like this is in error. This provides a terrible user experience

There should be some way of telling the client that you do not support STIX 2.0, but you DO have the content in STIX 2.1

Version filter is unclear if result MUST only be the latest version

Section 3.5.1
The version filter description says you MUST return the latest version if no version parameter is specified. It is not clear if this means MUST ONLY return the last version or if it includes the older versions so long as the latest one is also included.

added_after needs clarifying text

In section 3.5, the text for added_after says a "timestamp", however, it should probably say a "a single timestamp"

Add text about TLS 1.3 0-rtt

We need to add clarifying text to TAXII 2.1 about not using TLS 1.3 0-rtt. I would suggest we use the following text. "Implementations MUST NOT use TLS 1.3 0-rtt for TAXII". The reason for this is the known security implications with 0-rtt with REST based protocols. These are well documented in the IETF TLS 1.3 document.