google / cel-go Goto Github PK
View Code? Open in Web Editor NEWFast, portable, non-Turing complete expression evaluation with gradual typing (Go)
Home Page: https://cel.dev
License: Apache License 2.0
Fast, portable, non-Turing complete expression evaluation with gradual typing (Go)
Home Page: https://cel.dev
License: Apache License 2.0
The interpreter and type-checker have different notions of which types
are supported. The interpreter is quite good at resolving types from a
protobuf file descriptor, but the type-checker only looks at types that have
been provided to the InMemoryTypeProvider
. A unified type provider
would improve support for resolving enums at type-check time while
simplifying the type provision implementation.
// checker setup.
typeProvider := providers.NewTypeProvider(<proto-instances>)
env := checker.NewStandardEnv(common.NewErrors(), typeProvider)
c, errors := checker.Check(<expr>, env, <container>)
// interpreter setup referring to type provider and checked output `c`
i := interpreter.NewStandardInterpreter(typeProvider)
eval := i.NewInterpretable(interpreter.NewCheckedProgram(c, ""))
The type provision could remain separate between the type-checker and
interpreter as they have slightly different use cases. One really cares about
whether a field exists on a type (checker), and another really care about
how to construct a type (interpreter). However, as these concepts are heavily
related, it would be useful to combine them.
I am trying to get familiar with cel lang in the context of opa project. Since I can't open issues against the cel-spec, I am filing it here. I am no language designer. From the readme I see that
CEL is a non-Turing complete language designed to be portable and fast. It is best suited to applications where sandboxing a full-fledged language like JavaScript or Lua would be too resource intensive, but side-effect free dynamic computations are strongly desired.
I am curious to understand why the following 2 are desired properties:
Thanks.
String literals are supposed in CEL are supposed to align with GoogleSQL. At present,
not all escape sequences are supported properly:
\xhh
, \Xhh
conversion of 2 hex chars to unicode code point.\Uhhhhhhhh
conversion of unicode escape with 8 hex chars supported incorrectly as \Uhhhh
r|R
and b|B
prefixes supported, but no support for rb
for a raw byte string\n
in common/types/provider.go: 169
TODO generate an error when the type name is registered more than once.
In checker/decls/decls.go:181
return an error instead of panic
clean up visibility and add licenses for build package.
In parse/macro.go: 167
make consistent with the CEL semantics.
In bin/generate.sh:
Add High-level file comments for bin/generate.sh
For proto3 generated structs, primitive values are set by value. For proto2, primitive values
are set by pointer. ConvertToNative
needs to handle the change from value to pointer for
primitive values.
if typeDesc.Kind() == reflect.Ptr && typeDesc.Elem().Kind() == reflect.String {
return &string(s), nil
}
Add some simple steps in the README to do something simple with this repo, such a very small example of using the type checker?
In interpreter/activation.go: 43
TODO: supply references from checkedpb.proto.
Update server test.
$ go get -u github.com/google/cel-go/...
package github.com/google/cel-spec/proto/checked/v1/checked: cannot find package "github.com/google/cel-spec/proto/checked/v1/checked" in any of:
/usr/local/google/home/cbro/go/src/github.com/google/cel-spec/proto/checked/v1/checked (from $GOROOT)
/usr/local/google/home/cbro/src/github.com/google/cel-spec/proto/checked/v1/checked (from $GOPATH)
package github.com/google/cel-spec/proto/v1/syntax: cannot find package "github.com/google/cel-spec/proto/v1/syntax" in any of:
/usr/local/google/home/cbro/go/src/github.com/google/cel-spec/proto/v1/syntax (from $GOROOT)
/usr/local/google/home/cbro/src/github.com/google/cel-spec/proto/v1/syntax (from $GOPATH)
package github.com/google/cel-go/server: found packages main (main.go) and server (server.go) in /usr/local/google/home/cbro/src/github.com/google/cel-go/server
Related to #51.
I know Bazel has appeal, but if it's getting in the way of Go developers actually using the package, something is wrong.
In interpreter/functions/standard.go:119
TODO: Verify overflow, NaN, underflow cases for numeric
TODO in interpreter/prune.go: 30
Consider having a separate walk of the AST that finds common subexpressions. This can be called before or after constant folding to find common subexpressions.
parse/macro.go:201
use compiler internal method for faster, stateful add.
In parser/macro.go: 24
Consider moving macros to common
The checked.proto
has a type value which is used by the checker, but not elsewhere.
In the interest of making the evaluation results serializable between check and runtime,
the checked.proto should be wrapped to ensure that either the checked or runtime proto
value can be produced from some common internal representation.
When a proto generated struct contains a oneof
field, the reflection code under common/types/pb
does not appropriately account for struct
ural gymnastics of wrapping the oneof in the correct
wrapper.
The following example should create an expression with a literal value of true
:
google.api.expr.v1.Expr{
id: 1,
literal_expr: google.api.expr.v1.Literal{
bool_value: true}}
However, the output from the interpreter is: no such field "BoolValue"
In interpreter/activation.go: 53
TODO: consider passing the current activation to the supplier.
In interpreter/functions/standard.go:194
TODO: verify type conversion safety of numeric values.
in common/overloads/overloads.go:90,96
introduce concept of types.Message
In CEL, type definitions may exist within a specific container
as indicated on the
last argument of checker.Check(expr, env, <container>)
. The container is used to resolve
type and enum definitions. At interpretation time, the container is used to resolve enum
names, but not used in conjunction with resolving type names. This leads the checker
to indicate a program is valid, but the interpreter to fail.
parsed, errors := parser.ParseText("Expr{id: 1}")
provider := types.NewProvider(&expr.Expr{})
env := checker.NewStandardEnv(errors, provider)
checked := checker.Check(parsed, env, "google.api.expr.v1")
i := interpreter.NewStandardInterpreter(provider)
eval := i.NewInterpretable(interpreter.NewCheckedProgram(checked, "google.api.expr.v1"))
result, _ := eval.Eval(interpreter.NewActivation(map[string]interface{}{}))
// yields: unknown type 'Expr'
// using fully-qualified names avoids the issue.
Setup continuous integration testing with Travis to ensure the code remains healthy.
Recent versions of protobuf compiler generate extra fields in go structs. This causes some
tests to break. It is recommended that the new fields which appear to be for tracking internal
protobuf state, be expressly ignored by the code in common/types/pb/
directory as well as
by the object.go
and provider.go
files.
See an example of the new fields in https://godoc.org/github.com/golang/protobuf/proto.
To reproduce a test failure, update WORKSPACE to use https://github.com/bazelbuild/rules_go
version 0.12.0 or later and run all tests with
$ bazel test ...
and you'll see in the logs for common/types/go_default_test
--- FAIL: TestProtoObject_Iterator (0.00s)
object_test.go:51:
Got [id comprehension_expr XXX_NoUnkeyedLiteral XXX_unrecognized XXX_sizecache],
wanted [id comprehension_expr]
FAIL
One option is to rely on the extra fields being prefixed with XXX_
(hooray for in-band signaling!).
This might be simplest, but is fragile, since it seems to be an undocumented convention.
Another option is to eschew Go-level reflection and rely on the reflection in the proto library for
accessing proper fields names in a Message.
In parser/macro.go:220
use compiler internal method for faster, stateful add
Create a type registry for managing type identifiers and traits for
use with type-checking and interpretation.
CEL supports primitives and message based types. There is a need
for some of these types to be treated in an abstract manner as well
as a need to annotate types as having traits to indicate they have
support operator overloads.
The example in the readme...
// Parse the expression and returns the accumulated errors.
p, errors := parser.ParseText("a || b && c.exists(x, x > 2)")
if len(errors.GetErrors()) != 0 {
return nil, fmt.Error(errors.ToDisplayString()))
}
// ...
env := checker.NewStandardEnv(packages.DefaultPackage, typeProvider, errors)
// ...
if len(errors.GetErrors()) != 0 {
return nil, fmt.Error(errors.ToDisplayString()))
}
This is quite confusing - why is the errors
struct being passed around? What's the reason for returning it from the parser and passing it into the environment?
From my reading, the checker env isn't re-usable or goroutine-safe.
As a newcomer to the package, I'd expect a flow like:
The current interface seems to entangle these operations.
Support the evaluation of an expression with partial information.
If the expression a + expensive < 10 || c >= 11
receives values for a=11
and c=10
, but not expensive
the output of the partial evaluation should
yield:
11 + expensive < 10 || false
The computed expression can be simplified to: expensive < -1
. In order to
support such transformations, the interpreter should generate a stable (error
or non-error) or unknown value with tracking for the state observed for an
expression. The tracked state should be detailed enough to generate a new
expression or minimally detect which inputs were crucial to the computation
of the result.
CEL could only execute expressions in serial with complete information. This
would simplify the CEL implementation and align with the expectations of
many developers.
CEL evaluation may be distributed across processes, so the output of one
program may be treated as input to another. It is important for this reason
that interpretation results (valued, error, or unknown) are proto serializable.
CEL supports types based on protocol buffers. These types are treated
as object that support field accesses by name. Protocol buffers also
support JSON-like data via google.protobuf.Value
messages. These
messages are intended to be supported as primitives, maps, and lists
so the message field names should be abstracted away by the
type-checker and interpreter.
Update code to fix lint issues.
When a value has a google.protobuf.Value
type (or derivative), or a well-known type
such as google.protobuf.Any
or google.protobuf.Duration
, the type is treated as a
MESSAGE
by the checker.go
when it should be mapped to the appropriate built-in
or well-known type value.
Add instructions in CONTRIBUTING.md for how to report issues (via GitHub issues in this repo),
in common/types/provider.go: 111
TODO: verify that well-known message types are handled correctly
The docgen tool should generate documentation of the identifiers and functions
supported within an environment (checker/env.go
). The tool should probably
generate markdown.
Refactor proto alias in .go files for cel-go
In common/types/null.go: 56
TODO: determine whether there are other valid conversions for null
Here's a list of important functions and types that don't have documentation, and probably should:
common.Types.NativeToValue
packages
(the whole package, really. what is it for? an example would be good.)checker.NewStandardEnv
checker.StandardDeclarations
types.NativeToValue
Given an AST generate a human-readable string from it.
Translate an AST of the form call('_||_', ident(a), ident(b))
into the human readable string a || b
.
The translation step need not honor line-breaks or parentheses
present in the original expression, but must capture the semantic
meaning of the original.
This functionality does not yet exist. An alternative might be to
record the original string from which the AST was generated, but
this would likely be overkill and serving both the string and AST
form of an expression in production is undesirable.
In interpreter/metadata.go: 22
Replace with common.Source
Error messages are formatted strings which are not uniform. While this was
okay for development, the errors should be consolidated into error codes
with a set of associated details. Further, when multiple errors are encountered
they should be aggregated together rather than simply picking the first one.
Proposed error codes:
Code | Details | Description |
---|---|---|
ATTRIBUTE_NOT_FOUND |
type, attribute | Map or object access using an unknown attribute. |
DIVIDE_BY_ZERO |
n/a | Division by zero, also reported for modulus operations. |
DUPLICATE_ATTRIBUTE |
attribute name | Map or object construction supplies same key value more than once. |
IDENTIFIER_NOT_FOUND |
identifier name | Variable or function name could not be found. |
INVALID_ARGUMENT |
argument value | Invalid argument supplied to a function. |
OVERLOAD_NOT_FOUND |
function signature | Function defined, but no matching overload found. |
TYPE_NOT_FOUND |
type name | Type definition could not be found. |
return types.NewError(errors.ATTRIBUTE_NOT_FOUND, value.Type(), field)
In checker/decls/decls.go: 33
TODO: Replace with an abstract type registry
The termination condition for all comprehensions is <loop-cond> != false
in order
to ensure that when the loop-condition does not terminate on comprehensions where
the loop-step is computed from a non-strict operator (&&
, ||
). This can result in
inefficient computation when the loop-step is strict (literally any other operator).
To account for this difference, the parser macros should expand non-strict
loop-conditions to be a function call to a not_strictly_false(<loop-cond>)
function
which returns true when the loop condition value is error, unknown, or true. The new
function will absorb errors / unknowns just like the non-strict accumulators for exists
and all
macros.
CEL is a dynamically typed language. As such there is no way to specify a type-designator
for an aggregate type literal, e.g. list<int>[1, 2, 3]
. This means that all aggregate types defined
inline within an expression should be treated as dyn
values. Currently, the type-checker asserts
type-agreement between all elements in a list or map unless expressly cast to dyn
. This check
should be removed, preferring instead to identify the most general type element in the list.
Currently fails to type-check: [1u, 1]
but should succeed as the interpreter handles this
expression just fine.
At type-check time within the env.go, verify that declarations are non-overlapping
with special care to make sure that overload signatures for functions cannot collide
after type erasure.
The internal type system used within the parser and type-checker
is convenient for formatting debug strings, but using the protobuf
representations directly will simplify the implementation, provide
valuable feedback about the usability of the abstract CEL representation,
and reduce the likelihood of mismatches between the specification
and the implementation.
The lexing of negative numbers makes the parsing of arithmetic expressions
less than ideal. Moving the negative number detection to the parser will address
this usability issue.
The expression 4--4
is currently ambiguous without additional spaces. This should
read as call('-', '4', '-4')
once the parser is changed.
The CEL language definition indicates that google.protobuf.Any
values are automatically
packed / unpacked to their underlying proto message during expression evaluation. At present,
the interpreter does not handle this.
If msg
is declared as a message containing a field any
of type google.protobuf.Any
, the
value of any
could be any protobuf message type supported by the types.Provider
and
it should be automatically converted to this underlying type
// treat msg.any as a Timestamp when it is a google.protobuf.Timestamp
msg.any < timestamp("2017-01-01T00:00:00Z")
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.