svt / orm
ORM: Origin Routing Machine
License: MIT License
Add `verify required` for the backend servers to verify certificates on health checks.
This will result in a failed healthcheck when the wrong domain is served (for example due to a stale DNS lookup), which will trigger HAProxy to perform a new DNS lookup.
(imported from internal issue)
If a rule is moved from one file to another, or its description changes, the cache entry will not be updated, because cache entries are keyed only by `fsm_cache_key = domain + str(match_tree)`. We have two alternatives:
The first solution is a bit simpler, but the second one is more performant (because then we don't need to calculate a new FSM every time the rule changes file / description / other metadata).
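The staleness described above can be illustrated with a minimal sketch. Only the key formula `domain + str(match_tree)` comes from the issue; the cache layout, the `CachedEntry` type, and the stand-in `build_fsm` function are hypothetical and not ORM's actual code.

```python
from typing import Dict, NamedTuple


class CachedEntry(NamedTuple):
    fsm: str          # placeholder for the generated FSM
    description: str  # rule metadata stored next to the FSM
    rule_file: str


def build_fsm(match_tree: dict) -> str:
    # Hypothetical stand-in for the expensive FSM generation step.
    return "FSM(" + str(match_tree) + ")"


def get_entry(cache: Dict[str, CachedEntry], domain: str, match_tree: dict,
              description: str, rule_file: str) -> CachedEntry:
    # Key formula taken from the issue: metadata is not part of the key.
    fsm_cache_key = domain + str(match_tree)
    if fsm_cache_key not in cache:
        cache[fsm_cache_key] = CachedEntry(
            build_fsm(match_tree), description, rule_file)
    return cache[fsm_cache_key]


cache: Dict[str, CachedEntry] = {}
tree = {"paths": {"exact": ["/a"]}}
first = get_entry(cache, "example.com", tree, "old description", "a.yml")
# Same domain and match tree, but the rule was moved and re-described:
# the lookup hits the old entry, so the stale metadata comes back.
second = get_entry(cache, "example.com", tree, "new description", "b.yml")
```

In this sketch `second` still carries the old description and file name, which is the symptom the issue reports.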
To reproduce:
As we need to bump the ORM schema for v2, it's a great opportunity to go through the schema and make changes where needed. For example, streamline the key names and fix typos.
Multiple jobs failing in tests due to env not being set up correctly. @splushii something to look at on a rainy day perhaps?
Document redirect type `*_allow_method_change`. It's not documented in the syntax reference.
AFAIK we have only a minor bugfix that is unreleased, but we figure it is worth making a new release anyway in order to fix logging in haproxy.
Currently, the collision check only parses and verifies that paths do not collide (in the `matches` section of ORM rules). We need to be sure that query strings (used with keyword `query` in the `matches` section) do not collide as well. There is also a need to check for collision for fields that may be used in `matches` in the future, for example headers.
In this proposal I assume the possibility to match by headers already exists, for a more complete example.
The idea is to generate an FSM for each `matches` element and then merge them all in the same way as done for `paths`. But the FSM for `paths` should not be merged with the FSM for `query` or `headers`. There will be a sub-FSM for each `matches` type for every rule. So `paths` FSM:s are merged with other `paths` FSM:s to become a single `paths` FSM for the whole rule, `query` FSM:s are merged with other `query` FSM:s to become a single `query` FSM for the whole rule, etc. I think it will be easier to implement and give us greater flexibility and better ability to debug, if we keep them separate.
Take this example rule matching:
matches: # rule A
  - all:
    - paths: #A1
        exact: 'path'
    - query: #A2
        parameter: 'key'
        exact: 'value'
    - headers: #A3
        field: 'field'
        value: 'value'
It would then result in the following regular expressions for each separate `matches` element under `all` (N/A is used when there is no value set):
match element | #A1 sub-FSM | #A2 sub-FSM | #A3 sub-FSM |
---|---|---|---|
paths | path | N/A | N/A |
query | N/A | key=value | N/A |
headers | N/A | N/A | field=value |
When combining values with nonexisting values the results are:
FIRST_VALUE OR SECOND_VALUE = ANOTHER_VALUE
FIRST_VALUE OR N/A = FIRST_VALUE
N/A OR SECOND_VALUE = SECOND_VALUE
N/A OR N/A = N/A
FIRST_VALUE AND SECOND_VALUE = ANOTHER_VALUE
FIRST_VALUE AND N/A = FIRST_VALUE
N/A AND SECOND_VALUE = SECOND_VALUE
N/A AND N/A = N/A
For the rule as a whole it will be (after applying the logical AND imposed by `all`):
match element | #A1 sub-FSM | #A2 sub-FSM | #A3 sub-FSM | rule A FSM |
---|---|---|---|---|
paths | path | N/A | N/A | path |
query | N/A | key=value | N/A | key=value |
headers | N/A | N/A | field=value | field=value |
If there is a sub-FSM that is still N/A when all matches in a rule have been evaluated and combined, it must be set to `.*` to match all, because it could be anything. For example:
matches: # rule B
  - all:
    - paths: #B1
        exact: 'path'
    - query: #B2
        parameter: 'key'
        exact: 'value'
match element | #B1 sub-FSM | #B2 sub-FSM | rule B FSM | final rule B FSM |
---|---|---|---|---|
paths | path | N/A | path | path |
query | N/A | key=value | key=value | key=value |
headers | N/A | N/A | N/A | .* |
An ORM rule collides with another ORM rule if every sub-FSM (`paths`, `query`, `headers`) collides with its counterpart in the other rule.
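The combination rules and the final collision test from this proposal could be sketched roughly as follows. This is an illustration under stated assumptions, not ORM's implementation: sub-FSMs are represented as plain pattern strings, `None` plays the role of N/A, and `naive_collides` is a hypothetical stand-in for a real check that the intersection of two FSMs is non-empty.

```python
from typing import Callable, Dict, Optional

MATCH_TYPES = ("paths", "query", "headers")
MATCH_ALL = ".*"

SubFsm = Optional[str]  # None plays the role of N/A


def combine(first: SubFsm, second: SubFsm,
            merge: Callable[[str, str], str]) -> SubFsm:
    """Combine two sub-FSMs of one match type; N/A yields the other side."""
    if first is None:
        return second
    if second is None:
        return first
    return merge(first, second)


def finalize(rule: Dict[str, SubFsm]) -> Dict[str, str]:
    """Replace every sub-FSM that is still N/A with the match-all pattern."""
    result = {}
    for match_type in MATCH_TYPES:
        sub_fsm = rule.get(match_type)
        result[match_type] = MATCH_ALL if sub_fsm is None else sub_fsm
    return result


def rules_collide(rule_a: Dict[str, SubFsm], rule_b: Dict[str, SubFsm],
                  sub_collides: Callable[[str, str], bool]) -> bool:
    """Two rules collide only if every sub-FSM collides with its counterpart."""
    final_a, final_b = finalize(rule_a), finalize(rule_b)
    return all(sub_collides(final_a[t], final_b[t]) for t in MATCH_TYPES)


def naive_collides(left: str, right: str) -> bool:
    # Stand-in only: a real check would intersect the two FSMs.
    return MATCH_ALL in (left, right) or left == right


rule_a = {"paths": "path", "query": "key=value", "headers": "field=value"}
rule_b = {"paths": "path", "query": "key=value"}  # headers stays N/A
rule_c = {"paths": "other"}                       # query, headers stay N/A

a_vs_b = rules_collide(rule_a, rule_b, naive_collides)
a_vs_c = rules_collide(rule_a, rule_c, naive_collides)
```

Keeping one sub-FSM per match type means a single non-colliding type (here `paths` for rule A versus rule C) is enough to rule out a collision, which matches the stated rule.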
A user has asked about the `synthetic_response` action, which is mentioned in the syntax reference but without usage documentation. An example or two of the usage would be useful.
One of our users reports unintuitive behaviour when using `begins_with` path matching.
The example rule looks like this (user-specific information removed):
- description: Redirect traffic to subpath
  domains:
    - host.domain.example
  matches:
    all:
      - paths:
          begins_with:
            - "/mypath"
  actions:
    req_path:
      - prefix:
          remove: "/mypath"
Expected behaviour:
Requests to https://host.domain.example/mypath should be handled by the rule and sent to the configured backend.
Actual behaviour:
Requests to https://host.domain.example/mypath return error code `503: VCL failed`, while requests to https://host.domain.example/mypath/ (with trailing slash) are handled correctly.
Needs reproducing and further investigation.
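One possible line of investigation, not confirmed by the report, is what remains of the path after the prefix is removed. The sketch below is plain Python string handling with a hypothetical `remove_prefix` helper; it does not reproduce ORM's or Varnish's actual rewrite logic.

```python
def remove_prefix(path: str, prefix: str) -> str:
    """Strip prefix from path when present (plain string handling only)."""
    if path.startswith(prefix):
        return path[len(prefix):]
    return path


# Request path equal to the prefix: nothing is left after removal.
without_slash = remove_prefix("/mypath", "/mypath")
# Request path with a trailing slash: a valid "/" remains.
with_slash = remove_prefix("/mypath/", "/mypath")
```

If the generated configuration ends up with an empty request path in the first case, that might explain the difference between the two requests, but this remains an assumption until the issue has been reproduced.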
I had a little fun today and packaged ORM as a snap package. When I wrote the README I realized that it was not clear what versions of HAProxy and Varnish ORM supports.
You suggest the latest HAProxy (currently 2.1.3) + Varnish 6 in the examples. But the LXD tests use HAProxy 1.8 and Varnish 5.2. Would it be safe to assume that HAProxy 1.8+ and Varnish 5.2+ work?
When a rule owner does not get the expected result from ORM, it is currently difficult to identify why this is happening. A debug feature that sets one or more headers, identifying the namespace/rule that the request matched, would be very helpful in these cases.
The debug feature should not be publicly available and preferably activated by including an appropriate header in the request. For example, if we set "X-ORM-DEBUG: True" the response should include the debug headers identifying the matching rule.
(imported from internal bug-tracker)
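The requested behaviour could look roughly like the sketch below. Only the request header `X-ORM-DEBUG: True` comes from the issue; the function name and the response header names `X-ORM-Debug-Namespace` and `X-ORM-Debug-Rule` are hypothetical, and the sketch does not address how to keep the feature from being publicly available.

```python
from typing import Dict, Mapping


def debug_headers(request_headers: Mapping[str, str],
                  namespace: str, rule_description: str) -> Dict[str, str]:
    """Return debug response headers only when the request opts in."""
    enabled = any(
        name.lower() == "x-orm-debug" and value.strip().lower() == "true"
        for name, value in request_headers.items()
    )
    if not enabled:
        return {}
    # Hypothetical header names identifying the matched namespace and rule.
    return {
        "X-ORM-Debug-Namespace": namespace,
        "X-ORM-Debug-Rule": rule_description,
    }


with_debug = debug_headers({"X-ORM-DEBUG": "True"}, "team-a", "Redirect traffic")
without_debug = debug_headers({"Accept": "*/*"}, "team-a", "Redirect traffic")
```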
Update the Travis CI configuration to drop pypy3.5 and python3.5 and run only Python 3.6 and 3.7.
Update the documentation to specify the officially supported Python versions: only Python 3.6 and 3.7, although it will probably work on pypy as well.
In order for users to be able to configure sane timeouts in origins, it must be clear what timeouts are set in ORM.
For example the HAProxy timeouts:
timeout connect 10s
timeout client 15s
timeout server 15s
timeout queue 10s
Currently, we only support matching on path, query (and domain / Host header via `domains`). There are use cases which would benefit from method matching.
A user has reported that the current documentation is ambiguous regarding regex matching of multiple paths. We should provide examples where the value of `regex` is a list of multiple items to clarify that multiple `path` declarations are not needed:
- paths:
    regex:
      - '^/foo/[0-9]/bar/.*'
      - '^/foo/baz/[0-9]+'
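To make the intended reading concrete, here is a small sketch that assumes the list items are alternatives, so a path matches if any one of the regexes matches. That semantics is an assumption based on how the example reads; the `matches_any` helper is hypothetical and uses Python's `re` module rather than ORM's own matching.

```python
import re
from typing import Iterable

PATTERNS = [
    r"^/foo/[0-9]/bar/.*",
    r"^/foo/baz/[0-9]+",
]


def matches_any(path: str, patterns: Iterable[str]) -> bool:
    """True if at least one pattern in the list matches the path."""
    return any(re.search(pattern, path) for pattern in patterns)


first_hit = matches_any("/foo/3/bar/x", PATTERNS)   # first regex applies
second_hit = matches_any("/foo/baz/42", PATTERNS)   # second regex applies
no_hit = matches_any("/foo/baz/", PATTERNS)         # neither regex applies
```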
If `--cache-path` is specified but is not a valid path, this will come to the user's attention only when the cache is written. This could take a while if there are a lot of files. For example:
Got 612 FSM:s. 0 from cache. 612 freshly generated.
FSM generation took: 399.18s
Path collision check took: 145.09s
Writing FSM cache to cache/external.pkl
Traceback (most recent call last):
File "/home/c/git/orm-rules/env/bin/orm", line 11, in <module>
sys.exit(main())
File "/home/c/git/orm-rules/env/lib/python3.7/site-packages/orm/__main__.py", line 78, in main
cache_path=args.cache_path):
File "/home/c/git/orm-rules/env/lib/python3.7/site-packages/orm/validator.py", line 48, in validate_rule_files
return validate_rule_constraints(yml_files, cache_path=cache_path)
File "/home/c/git/orm-rules/env/lib/python3.7/site-packages/orm/validator.py", line 573, in validate_rule_constraints
cache_path=cache_path):
File "/home/c/git/orm-rules/env/lib/python3.7/site-packages/orm/validator.py", line 558, in validate_constraints_rule_collision
with open(cache_path, 'wb') as fsm_cache_file:
FileNotFoundError: [Errno 2] No such file or directory: 'cache/external.pkl'
make: *** [Makefile:38: ci-output-file] Error 1
It would be better to validate `--cache-path` as early as possible, for example in `__main__.py`.
(imported from internal issue)
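A minimal sketch of such an early check, assuming the option is parsed with `argparse` (the traceback only shows `args.cache_path`, so the parser setup here is hypothetical, as is the `validate_cache_path` helper):

```python
import argparse
import os


def validate_cache_path(cache_path: str) -> str:
    """Fail at argument parsing time if the cache directory is missing."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    if not os.path.isdir(directory):
        raise argparse.ArgumentTypeError(
            "cache directory does not exist: " + directory)
    return cache_path


# Hypothetical parser; ORM's real argument handling may differ.
parser = argparse.ArgumentParser(prog="orm")
parser.add_argument("--cache-path", type=validate_cache_path)

# A file name in the current directory passes the check.
args = parser.parse_args(["--cache-path", "external.pkl"])
```

With a check like this, a path such as `cache/external.pkl` with a missing `cache` directory would be rejected before FSM generation starts, rather than several minutes later when the cache file is opened for writing.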
Would be nice to have #47 released...
It seems the syntax reference documentation is missing information for the 'tests' section, which can be found here: https://github.com/SVT/orm/blob/master/docs/syntax_reference.md#tests
Until the documentation is updated, the syntax can be found in the files orm/schemas/*.json.
For example, something like this:
backend:
  servers:
    - server: server1
      weight: 30
    - server: server2
      weight: 70
  balance: roundrobin
Sometimes the current health checks (TCP) are not enough. The currently generated health checks for HAProxy are:
check
check ssl verify none
One example that won't work with the above simple healthchecks is when a specified origin is using https and resolves to loadbalancers that won't respond to TLS connections without SNI (for example CloudFront).
I propose that we use the existing format used for the internal ORM health check between HAProxy and Varnish (https://github.com/SVT/orm/blob/master/docs/syntax_reference.md#custom_internal_healthcheck), and extend it to support the health check customizations we need.
We could add something similar to the examples below to https://github.com/SVT/orm/blob/master/docs/syntax_reference.md#origin_object :
healthcheck:
  tcp:
    tls: True
    domain: some.servername.example.com

healthcheck:
  http:
    tls: True
    domain: some.servername.example.com
    method: GET
    path: /some/healthcheck/path
When not setting healthcheck, we need to agree on some sane default for HAProxy (inspired by #7). For example:
check
check ssl check-sni <domain> verify required
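As a rough sketch of how the proposed `healthcheck` object and the suggested defaults might be turned into HAProxy configuration: the function names and the dictionary layout are hypothetical, the server option strings are the ones quoted in this issue, and `option httpchk <method> <uri>` is the standard HAProxy directive for HTTP checks. Sending the configured domain as a Host header in the HTTP check is left out.

```python
from typing import Optional


def server_check_options(scheme: str, domain: str,
                         healthcheck: Optional[dict] = None) -> str:
    """Render the check part of an HAProxy server line."""
    if healthcheck is None:
        # Proposed defaults: plain check for http, SNI + verification for https.
        use_tls, sni = scheme == "https", domain
    else:
        settings = healthcheck.get("tcp") or healthcheck.get("http") or {}
        use_tls = bool(settings.get("tls"))
        sni = settings.get("domain", domain)
    if use_tls:
        return f"check ssl check-sni {sni} verify required"
    return "check"


def httpchk_line(healthcheck: Optional[dict]) -> Optional[str]:
    """Render a backend-level 'option httpchk' line for HTTP health checks."""
    if not healthcheck or "http" not in healthcheck:
        return None
    settings = healthcheck["http"]
    method = settings.get("method", "GET")
    path = settings.get("path", "/")
    return f"option httpchk {method} {path}"


http_check = {"http": {"tls": True, "domain": "some.servername.example.com",
                       "method": "GET", "path": "/some/healthcheck/path"}}
server_opts = server_check_options("https", "origin.example.com", http_check)
backend_opt = httpchk_line(http_check)
```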
Currently, we only support matching on path, query (and domain / Host header via `domains`). There are use cases which would benefit from header matching.