nmondal / cowj
[C]onfiguration [O]nly [Web] on [J]VM
License: Apache License 2.0
Cowj should allow preloading data.
After a night of thinking, I believe we can simply add cron jobs, which solves the problem for most cases, in entirety.
Imagine the case of writing back to Google Storage.
Now, it is obvious that we can do it via:
thread( my_body = req.body ) as {
_storage.g_cloud.dumps("foo", "file.json", my_body)
}
But should we?
We can essentially have a block, like routes and proxies, which immediately returns while queueing the job:
post:
  /_async_/webhook : _/dump_g_cloud.zm
while dump_g_cloud.zm is:
// dump_g_cloud.zm
_storage.g_cloud.dumps("foo", "file.json", my_body)
Of course, thread() can neither create nor return a request-id for the async call that could be queried upon later.
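To make the request-id idea concrete, here is a minimal, hypothetical sketch (names like AsyncJobs and submit are made up, not cowj API) of how the _async_ route could enqueue the job and immediately return an id that a status endpoint could poll later:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not cowj API: enqueue the body of an _async_
// route and hand back a request-id immediately.
class AsyncJobs {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<?>> jobs = new ConcurrentHashMap<>();

    // e.g. job = the work of dump_g_cloud.zm
    String submit(Runnable job) {
        String requestId = UUID.randomUUID().toString();
        jobs.put(requestId, pool.submit(job));
        return requestId;
    }

    // A status endpoint can poll by request-id later.
    boolean isDone(String requestId) {
        Future<?> f = jobs.get(requestId);
        return f != null && f.isDone();
    }

    void shutdown() { pool.shutdown(); }
}
```

The point is only that the id is minted by the queueing layer, not by thread() itself.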
With the various engines' execution ability in mind, we should keep hitting an endpoint with https://github.com/wg/wrk and then print the stats.
The Cloud Storage implementation only supports JSON objects and strings. We might want to upload binary files, like images, and return them.
I am thinking of two functions:
byte_array = loadb("bucket", "path")
dumpb("bucket", "path", byte_array)
This should not be a big problem.
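As a sketch of the proposed pair, here is a local-filesystem analogue (the class name is made up; a real implementation would call the cloud storage SDK instead):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Local-filesystem analogue of the proposed loadb()/dumpb() pair;
// a real implementation would call the cloud storage SDK instead.
class BinaryStore {
    static byte[] loadb(String bucket, String path) throws IOException {
        return Files.readAllBytes(Paths.get(bucket, path));
    }

    static void dumpb(String bucket, String path, byte[] data) throws IOException {
        Path p = Paths.get(bucket, path);
        Files.createDirectories(p.getParent()); // bucket "folders"
        Files.write(p, data); // creates or overwrites the object
    }
}
```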
https://www.baeldung.com/java-jwt-token-decode
Need to think about it.
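The linked article boils down to base64url-decoding the middle segment of the token; a minimal sketch using only java.util.Base64 (class name hypothetical, and note this does no signature verification):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch (class name hypothetical): decode a JWT's payload, i.e. the
// middle base64url segment, without verifying the signature.
class Jwt {
    static String payload(String token) {
        String[] parts = token.split("\\.");
        byte[] json = Base64.getUrlDecoder().decode(parts[1]);
        return new String(json, StandardCharsets.UTF_8);
    }
}
```

Verification would still need a JWT library and the signing key.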
@hemil-ruparel-blox
Also, they should be script specific.
That is, there should be a clear way to identify the script which is logging.
This looks promising.
https://github.com/intellisrc/spark
We cannot avoid S3 bucket support, so it is better to have it.
JCasbin is a good choice for Auth.
https://github.com/casbin/jcasbin#documentation
With this,
https://graalvm.github.io/native-build-tools/latest/gradle-plugin.html
we can check whether the binary size decreases, and whether it runs better or not.
In some cases it makes sense to load schema and match against data loaded from various sources.
In those cases it makes sense to have programmatic schema verification.
The dump function of Google Storage is not working when the blob already exists.
Specifically, there would be cases where we use a string which needs substitution from variables already in use in the script.
x = 42
string = Test.resource("my_resource_id")
While the resource file might be something as follows:
resources.yaml
my_resource_id: "Hello, the variable used is : #{x}"
That would be good.
We can store the resources inside static folders (?)
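A sketch of how the #{x} substitution could work, using a plain regex over the script's variable bindings (class and method names are hypothetical, for illustration only):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: substitute #{name} in a resource string with the
// value of that name from the script's variable bindings.
class ResourceTemplate {
    private static final Pattern VAR = Pattern.compile("#\\{(\\w+)\\}");

    static String render(String template, Map<String, ?> vars) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object v = vars.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(v)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```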
Given Jython does not have off-the-shelf JSON ability, one small demo showing JSON processing in Jython is required.
Imagine the JDBC db server is booting up while cowj is booting up.
Now, cowj knowingly throws an error at boot, because it fails to connect to the db at load time in the cowj running thread.
Given the "architecture" is about getting a dedicated Connection per thread based on lazy loading,
this at best should be configurable.
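The lazy, per-thread idea can be sketched with ThreadLocal.withInitial, which defers the connect until first use on each thread (a generic sketch, not the cowj code):

```java
import java.util.function.Supplier;

// Generic sketch: nothing connects at boot; the first use on each
// thread runs the connector, so a db that is still booting only fails
// when actually hit - and that failure can then be made configurable.
class LazyPerThread<T> {
    private final ThreadLocal<T> holder;

    LazyPerThread(Supplier<T> connector) {
        this.holder = ThreadLocal.withInitial(connector); // deferred, per thread
    }

    T get() { return holder.get(); }
}
```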
Say I have a GET request to '/' which forwards the request to some other API, massages the data, and returns it. In this case I want the output validation to trigger after the finally filter, because I want to handle both the error and the success case. Right now it is triggering before finally.
Example -
api.yaml
port: 5003
proxies:
  get:
    /: json_place/users
filters:
  finally:
    /: _/after.zm
plugins:
  cowj.plugins:
    curl: CurlWrapper::CURL
data-sources:
  json_place:
    type: curl
    url: https://jsonplaceholder.typicode.com
static/types/schema.yaml
labels: # how system knows which label to invoke
  ok: "resp.status == 200" # when response status is 200
  err: "resp.status != 200" # when it is not
verify:
  out: true
routes:
  /:
    get:
      ok: output.json
static/types/output.json
{
  "$id": "https://blox.xyz/cowj_api/Generic.error.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "description": "Output for send otp service",
  "title": "Error",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "required": ["foo"],
  "additionalProperties": false
}
after.zm
resp.body("{\"foo\": \"bar\"}")
This setup correctly responds with {"foo": "bar"}, but on the console there is this log:
SEVERE: Output Schema Validation failed. Route '/' :
com.worldturner.medeia.api.ValidationFailedException: [Validation Failure
------------------
Rule: type
Message: Type mismatch, data has array and schema has object....
The schema validation expects {"foo": "bar"}, but it gets the response from the proxy directly, before the finally filter.
If we change the filter to after, it works.
Expectation -
For proxies, output validation should trigger after all finally filters are executed, because we might want to massage the data before returning the response using after and finally filters. finally is important because it allows us to handle the error case as well.
We would like to use COWJ to send notifications to users. However, a multicast using the Firebase REST API https://firebase.google.com/docs/cloud-messaging/send-message#send-messages-to-multiple-devices requires sub-requests, which are not usually exposed as part of HTTP libraries. Therefore we want to integrate the Firebase library into COWJ, specifically the com.google.firebase.messaging library.
As one can see from here:
https://serverfault.com/questions/684855/disable-authentication-for-http-options-method-preflight-request
For CORS, the preflight would be a problem: the OPTIONS request would need to be authenticated, which will not work.
Apache2 solves the problem by putting:
<LimitExcept OPTIONS>
Require valid-user
</LimitExcept>
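The same <LimitExcept OPTIONS> policy can be expressed as a predicate an auth filter could consult (a hypothetical helper, not existing cowj code):

```java
// Hypothetical helper: the <LimitExcept OPTIONS> policy as a predicate
// an auth filter could consult - every method except the CORS
// preflight OPTIONS requires authentication.
class PreflightPolicy {
    static boolean requiresAuth(String httpMethod) {
        return !"OPTIONS".equalsIgnoreCase(httpMethod);
    }
}
```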
We want to store all telemetry data to cloud storage.
Essentially all logs should get into proper buckets, in JSON form.
Here is how to do it:
https://www.baeldung.com/jetty-http-2
A clean risky mechanism should be the following, under routes:
risky:
  get:
    foo/bar
It also coincides with the OPTIONS issue.
The cloud storage plugin uses the gcloud default project id. This will not work if the project id is not set, or if you have multiple projects and want to work with them at the same time. A lot of people have access to the project but might not have set a default project id. And a lot of people might have multiple projects and might not want to change defaults every time they change projects. Forgetting to change the default project might end with them getting weird error messages.
In the yaml file, add an optional field for project-id. If project-id is not specified, use the gcloud default:
plugins:
  cowj.plugins:
    g_storage: GoogleStorageWrapper::STORAGE
data-sources:
  storage:
    type: g_storage
With project-id specified
plugins:
  cowj.plugins:
    g_storage: GoogleStorageWrapper::STORAGE
data-sources:
  storage:
    type: g_storage
    project-id: foo_bar
For input validations, there are lots of cases of:
if (foo) {
  return jstr({'error' : 'description'})
}
Figure out a better way for input validations.
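One possible shape for the "better way": declare the checks, collect the failures, and emit a single error payload (a hypothetical helper; the JSON is built naively, for illustration only):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: declare the checks, collect the failures, and
// emit one error payload (JSON built naively, for illustration only).
class Validator {
    private final List<String> errors = new ArrayList<>();

    Validator require(boolean condition, String description) {
        if (!condition) errors.add(description);
        return this; // chainable, so checks read declaratively
    }

    // Empty string means the input passed.
    String errorBody() {
        if (errors.isEmpty()) return "";
        return "{\"error\": \"" + String.join("; ", errors) + "\"}";
    }
}
```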
Seems to be a problem in Nashorn. The solution is easy: just extend TestAsserter for the same.
Test.print() is System.out.printf()
Test.printe() is System.err.printf()
If proxy is:
panic(true, 'message', 418)
We are getting a 500 Internal Server Error instead.
Async functionality is currently not handled for proxy endpoints.
Of the form:
{
  "error" : "whatever"
}
This helps a lot.
This is a HUGELY debated topic.
While schemas are necessary, it is also necessary to describe the current schema from the service itself.
When we sat in the WSDL consortium, that was key.
https://en.wikipedia.org/wiki/Web_Services_Description_Language
WSDL had an interesting way of making the service self-documenting.
We do not need WSDL for sure, but we need self-documenting API endpoints.
panic(true) throws an exception.
Should it?
There is a huge discussion on this topic:
https://stackoverflow.com/questions/978061/http-get-with-request-body
Yes. In other words, any HTTP request message is allowed to contain a message body, and thus must parse messages with that in mind. Server semantics for GET, however, are restricted such that a body, if any, has no semantic meaning to the request. The requirements on parsing are separate from the requirements on method semantics.
So, yes, you can send a body with GET, and no, it is never useful to do so.
This is why we need QUERY.
https://www.ietf.org/archive/id/draft-ietf-httpbis-safe-method-w-body-02.html
A route with the following code:
1 / 0
Returns response as:
java.lang.RuntimeException: zoomba.lang.core.types.ZException$ArithmeticLogicOperation: Invalid Arithmetic Logical Operation [/] : --> /Users/hemil/blox/cowj_apis/api/location.zm:1:5 to 5 --> ( Can not do operation ( DIVISION ) :
( 1 ) with ( 0 ) !
left: java.lang.Integer
right: java.lang.Integer )
instead of a generic internal server error.
The yaml file allows file paths to be specified using _/ notation. Example -
routes:
  get:
    /hello/g: _/hello.groovy
    /hello/j: _/hello.js
    /hello/p: _/hello.py
    /hello/z: _/hello.zm
_/hello.groovy means hello.groovy in the directory of the configuration file. This allows the files to be relocated, as long as the relative paths are preserved. But we are not exposing the base directory to data sources. Many data sources would need files as input, and these are best specified as relative paths in order to facilitate relocation.
https://www.baeldung.com/java-connection-pooling
https://www.baeldung.com/hikaricp
I mean, we can work without it, but it is better to have it.
Right now, when input validation fails, the response goes back to the client.
While doing so, the output validation gets triggered.
This is unexpected behavior and needs to stop.
This should have been done by now, but it is not.
We can start with scripts, all scripts. The functionality would be encoded in the yaml file as follows:
cache:
  scripts: false
  auth: false
  type: false
The default would be set to true.
This way people do not have to reload the server when they modify things in the scripts and such.
We should start with scripts and type.
A much better alternative would be just adding cache: false in the container yaml files.
The main config file cache would essentially talk about the scripts in question.
schema.yaml would only be applicable for the JSON files pointed to by the schema.
The function all() does not do a recursive traversal:
default Stream<Blob> all(String bucketName, String directoryPrefix) {}
This can be fixed easily by using overloads.
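To illustrate the overload idea without the storage SDK, here is the same logic over plain object names (class name hypothetical). Cloud listings are flat, so "recursive" just means not stopping at the next '/' after the prefix:

```java
import java.util.List;
import java.util.stream.Stream;

// Illustration with plain object names instead of Blob (class name
// hypothetical). Cloud listings are flat, so "recursive" just means:
// do not stop at the next '/' after the prefix.
class Listing {
    // existing behavior: direct children of the prefix only
    static Stream<String> all(List<String> names, String prefix) {
        return all(names, prefix, false);
    }

    // overload: recursive traversal includes nested "directories"
    static Stream<String> all(List<String> names, String prefix, boolean recursive) {
        return names.stream()
                .filter(n -> n.startsWith(prefix))
                .filter(n -> recursive || n.indexOf('/', prefix.length()) < 0);
    }
}
```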
Just do this for the script after input schema validation:
payload = req.attribute("_body") // this should already have the parsed data
Error happens:
java.lang.NullPointerException: Cannot invoke "String.length()" because "content" is null
at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1217)
at cowj.TypeSystem.lambda$outputSchemaVerificationFilter$6(TypeSystem.java:344)
at spark.FilterImpl$1.handle(FilterImpl.java:73)
at spark.http.matching.AfterAfterFilters.execute(AfterAfterFilters.java:55)
at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:187)
Expose environment variables in the scripting languages, so that zoomba can access the environment of the JVM process and do different things depending on the environment it is running in.
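A minimal sketch of the exposure: snapshot the process environment into an immutable map that the engine could bind under some name, say _env (the binding name is an assumption):

```java
import java.util.Map;

// Sketch: snapshot the JVM process environment into an immutable map
// the script engine could bind under some name (binding name is an
// assumption, e.g. _env).
class EnvBinding {
    static Map<String, String> env() {
        return Map.copyOf(System.getenv()); // immutable snapshot
    }
}
```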
All of these can be done easily with routes support as follows:
routes:
  errors:
    "404" : _/not_found.js
    "500" : "Internal Server Error!"
    "*" : _/custom_error_handler.js
https://sparkjava.com/documentation#error-handling
https://sparkjava.com/documentation#exception-mapping
This seems to make sense.
https://jpy.readthedocs.io/en/latest/install.html#running-python-from-java
6. Ensure ls returns cow-0.1-SNAPSHOT.jar as well as a deps/ folder. If those two don't exist, run gradle clean and gradle build -x test.
7. java -jar cow-0.1-SNAPSHOT.jar ../../samples/hello/hello.yaml
The corrected commands for https://github.com/nmondal/cowj#cowj-setup are below:
java -jar cowj-0.1-SNAPSHOT.jar and
java -jar cowj-0.1-SNAPSHOT.jar ../../samples/hello/hello.yaml
There is no function available to list data of a folder in a bucket.
This would be a problem:
local_redis:
  type: redis
  secrets: gcp
  urls: "${redis-urls}"
This would not work even if the gcp secret manager has a proper key which is redis-urls.
We have this:
https://raw.githubusercontent.com/Microsoft/TypeScript/02547fe664a1b5d1f07ea459f054c34e356d3746/lib/tsc.js
We can use this.
What we really have is a virtual machine with instructions.
Without proper logging for each action and each decision, it is next to impossible for people to decide why something was done.
Currently CurlWrapper makes a half-hearted effort at being a forward proxy: it massages the headers, request body, and query params from the client to the destination server.
def send( headers, body, params) {
// producing payload to destination server
}
But the reverse transform functionality is missing.
def receive( headers, body, status ) {
// producing payload to client
}
The proposal for the transform could be:
proxy:
  send: _/send.zm
  receive: _/rec.zm
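The send/receive pair could be wired as two pluggable transforms around the upstream call; a hypothetical sketch standing in for _/send.zm and _/rec.zm:

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: two pluggable transforms standing in for
// _/send.zm and _/rec.zm around the upstream call.
class ProxyTransform {
    final UnaryOperator<Map<String, Object>> send;    // client -> destination
    final UnaryOperator<Map<String, Object>> receive; // destination -> client

    ProxyTransform(UnaryOperator<Map<String, Object>> send,
                   UnaryOperator<Map<String, Object>> receive) {
        this.send = send;
        this.receive = receive;
    }

    Map<String, Object> forward(Map<String, Object> clientPayload) {
        Map<String, Object> toServer = send.apply(clientPayload);
        // ... the actual upstream call via the curl data source goes here ...
        return receive.apply(toServer); // reverse transform, back to the client
    }
}
```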