nmondal / cowj
[C]onfiguration [O]nly [Web] on [J]VM
License: Apache License 2.0
Cowj should allow preloading data.
After a night of thinking, I believe we can simply add cron jobs, which solves the problem for most cases, in entirety.
Imagine the case of writing back to Google Storage.
Now, it is obvious that we can do it via:
thread( my_body = req.body ) as {
_storage.g_cloud.dumps("foo", "file.json", my_body)
}
But should we?
We can essentially have a block, like routes and proxies, which immediately returns while queueing the job:
post:
  /_async_/webhook : _/dump_g_cloud.zm
while dump_g_cloud.zm is:
// dump_g_cloud.zm
_storage.g_cloud.dumps("foo", "file.json", my_body)
Of course, thread() can neither create nor return a request-id for the async call that could be queried upon later.
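To make the request-id idea concrete, here is a minimal, hypothetical sketch (names like AsyncJobs and submit are made up, not cowj API) of how the _async_ route could enqueue the job and immediately return an id that a status endpoint could poll later:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not cowj API: enqueue the body of an _async_
// route and hand back a request-id immediately.
class AsyncJobs {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<?>> jobs = new ConcurrentHashMap<>();

    // e.g. job = the work of dump_g_cloud.zm
    String submit(Runnable job) {
        String requestId = UUID.randomUUID().toString();
        jobs.put(requestId, pool.submit(job));
        return requestId;
    }

    // A status endpoint can poll by request-id later.
    boolean isDone(String requestId) {
        Future<?> f = jobs.get(requestId);
        return f != null && f.isDone();
    }

    void shutdown() { pool.shutdown(); }
}
```

The point is only that the id is minted by the queueing layer, not by thread() itself.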
With the various engines' execution ability in mind, we should keep hitting an endpoint with https://github.com/wg/wrk and then print the stats.
The Cloud Storage implementation only supports JSON objects and strings. We might want to upload binary files, like images, and return them.
I am thinking of two functions:
byte_array = loadb("bucket", "path")
dumpb("bucket", "path", byte_array)
This should not be a big problem.
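As a sketch of the proposed pair, here is a local-filesystem analogue (the class name is made up; a real implementation would call the cloud storage SDK instead):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Local-filesystem analogue of the proposed loadb()/dumpb() pair;
// a real implementation would call the cloud storage SDK instead.
class BinaryStore {
    static byte[] loadb(String bucket, String path) throws IOException {
        return Files.readAllBytes(Paths.get(bucket, path));
    }

    static void dumpb(String bucket, String path, byte[] data) throws IOException {
        Path p = Paths.get(bucket, path);
        Files.createDirectories(p.getParent()); // bucket "folders"
        Files.write(p, data); // creates or overwrites the object
    }
}
```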
https://www.baeldung.com/java-jwt-token-decode
Need to think about it.
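The linked article boils down to base64url-decoding the middle segment of the token; a minimal sketch using only java.util.Base64 (class name hypothetical, and note this does no signature verification):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch (class name hypothetical): decode a JWT's payload, i.e. the
// middle base64url segment, without verifying the signature.
class Jwt {
    static String payload(String token) {
        String[] parts = token.split("\\.");
        byte[] json = Base64.getUrlDecoder().decode(parts[1]);
        return new String(json, StandardCharsets.UTF_8);
    }
}
```

Verification would still need a JWT library and the signing key.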
@hemil-ruparel-blox
Also, they should be script specific.
That is, there should be a clear way to identify the script which is logging.
This looks promising.
https://github.com/intellisrc/spark
We cannot avoid S3 bucket support, so it is better to have it.
JCasbin is a good choice for Auth.
https://github.com/casbin/jcasbin#documentation
With this,
https://graalvm.github.io/native-build-tools/latest/gradle-plugin.html
we can check whether the binary size decreases, and whether it runs better or not.
In some cases it makes sense to load schema and match against data loaded from various sources.
In those cases it makes sense to have programmatic schema verification.
The dump function of Google Storage is not working when the blob already exists.
Specifically, there would be cases where we use a string which needs substitution from variables already in use in the script.
x = 42
string = Test.resource("my_resource_id")
While the resource file might be something as follows:
resources.yaml
my_resource_id: "Hello, the variable used is : #{x}"
That would be good.
We can store the resources inside static folders (?)
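A sketch of how the #{x} substitution could work, using a plain regex over the script's variable bindings (class and method names are hypothetical, for illustration only):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: substitute #{name} in a resource string with the
// value of that name from the script's variable bindings.
class ResourceTemplate {
    private static final Pattern VAR = Pattern.compile("#\\{(\\w+)\\}");

    static String render(String template, Map<String, ?> vars) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object v = vars.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(v)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```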
Given Jython does not have off-the-shelf JSON ability, one small demo showing JSON processing in Jython is required.
Imagine the JDBC db server is booting up while cowj is booting up.
Now, cowj knowingly throws an error at boot, because it fails to connect to the db at load time in the cowj running thread.
Given the "architecture" is about getting a dedicated Connection per thread based on lazy loading,
this at best should be configurable.
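The lazy, per-thread idea can be sketched with ThreadLocal.withInitial, which defers the connect until first use on each thread (a generic sketch, not the cowj code):

```java
import java.util.function.Supplier;

// Generic sketch: nothing connects at boot; the first use on each
// thread runs the connector, so a db that is still booting only fails
// when actually hit - and that failure can then be made configurable.
class LazyPerThread<T> {
    private final ThreadLocal<T> holder;

    LazyPerThread(Supplier<T> connector) {
        this.holder = ThreadLocal.withInitial(connector); // deferred, per thread
    }

    T get() { return holder.get(); }
}
```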
Say I have a GET request to '/' which forwards the request to some other API, massages the data, and returns it. In this case I want the output validation to trigger after the finally filter, because I want to handle both the error and the success case. Right now it is triggering before finally.
Example -
api.yaml
port: 5003
proxies:
  get:
    /: json_place/users
filters:
  finally:
    /: _/after.zm
plugins:
  cowj.plugins:
    curl: CurlWrapper::CURL
data-sources:
  json_place:
    type: curl
    url: https://jsonplaceholder.typicode.com
static/types/schema.yaml
labels: # how system knows which label to invoke
  ok: "resp.status == 200" # when response status is 200
  err: "resp.status != 200" # when it is not
verify:
  out: true
routes:
  /:
    get:
      ok: output.json
static/types/output.json
{
  "$id": "https://blox.xyz/cowj_api/Generic.error.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "description": "Output for send otp service",
  "title": "Error",
  "type": "object",
  "properties": {
    "foo": {
      "type": "string"
    }
  },
  "required": ["foo"],
  "additionalProperties": false
}
after.zm
resp.body("{\"foo\": \"bar\"}")
This setup correctly responds with {"foo": "bar"}, but on the console there is this log:
SEVERE: Output Schema Validation failed. Route '/' :
com.worldturner.medeia.api.ValidationFailedException: [Validation Failure
------------------
Rule: type
Message: Type mismatch, data has array and schema has object....
The schema validation expects {"foo": "bar"}, but it gets the response from the proxy directly, before the finally filter.
If we change the filter to after, it works.
Expectation -
For proxies, output validation should trigger after all finally filters are executed, because we might want to massage the data before returning the response using after and finally filters. finally is important because it allows us to handle the error case as well.
We would like to use COWJ to send notifications to users. However, a multicast using the Firebase REST API https://firebase.google.com/docs/cloud-messaging/send-message#send-messages-to-multiple-devices requires sub-requests, which are not usually exposed as part of HTTP libraries. Therefore we want to integrate the Firebase library into COWJ, specifically the com.google.firebase.messaging library.
As one can see from here:
https://serverfault.com/questions/684855/disable-authentication-for-http-options-method-preflight-request
For CORS, the preflight would be a problem: the OPTIONS request would need to be authenticated, which will not work.
Apache2 solves the problem by putting:
<LimitExcept OPTIONS>
Require valid-user
</LimitExcept>
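The same <LimitExcept OPTIONS> policy can be expressed as a predicate an auth filter could consult (a hypothetical helper, not existing cowj code):

```java
// Hypothetical helper: the <LimitExcept OPTIONS> policy as a predicate
// an auth filter could consult - every method except the CORS
// preflight OPTIONS requires authentication.
class PreflightPolicy {
    static boolean requiresAuth(String httpMethod) {
        return !"OPTIONS".equalsIgnoreCase(httpMethod);
    }
}
```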
We want to store all telemetry data to cloud storage.
Essentially all logs should get into proper buckets, in JSON form.
Here is how to do it:
https://www.baeldung.com/jetty-http-2
A clean risky mechanism should be the following, under routes:
risky:
  get:
    foo/bar
It also coincides with the OPTIONS issue.
The cloud storage plugin uses the gcloud default project id. This will not work if the project id is not set, or if you have multiple projects and want to work with them at the same time. A lot of people have access to the project but might not have set a default project id. And a lot of people might have multiple projects and might not want to change defaults every time they change projects. Forgetting to change the default project might end with them getting weird error messages.
In the yaml file, add an optional field for project-id. If project-id is not specified, use the gcloud default:
plugins:
  cowj.plugins:
    g_storage: GoogleStorageWrapper::STORAGE
data-sources:
  storage:
    type: g_storage
With project-id specified
plugins:
  cowj.plugins:
    g_storage: GoogleStorageWrapper::STORAGE
data-sources:
  storage:
    type: g_storage
    project-id: foo_bar
For input validations, there are lots of cases of:
if (foo) {
  return jstr({'error' : 'description'})
}
Figure out a better way for input validations.
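One possible shape for the "better way": declare the checks, collect the failures, and emit a single error payload (a hypothetical helper; the JSON is built naively, for illustration only):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: declare the checks, collect the failures, and
// emit one error payload (JSON built naively, for illustration only).
class Validator {
    private final List<String> errors = new ArrayList<>();

    Validator require(boolean condition, String description) {
        if (!condition) errors.add(description);
        return this; // chainable, so checks read declaratively
    }

    // Empty string means the input passed.
    String errorBody() {
        if (errors.isEmpty()) return "";
        return "{\"error\": \"" + String.join("; ", errors) + "\"}";
    }
}
```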
Seems to be a problem in Nashorn. The solution is easy: just extend TestAsserter for the same.
Test.print() is System.out.printf()
Test.printe() is System.err.printf()
If proxy is:
panic(true, 'message', 418)
We are getting a 500 Internal Server Error instead.
Async functionality is currently not handled for proxy endpoints.
Of the form:
{
  "error" : "whatever"
}
This helps a lot.
This is a HUGELY debated topic.
While schemas are necessary, it is also necessary to describe the current schema from the service itself.
When we sat in the WSDL consortium, that was key.
https://en.wikipedia.org/wiki/Web_Services_Description_Language
WSDL had an interesting way of making the service self-documenting.
We do not need WSDL for sure, but we need self-documenting API endpoints.
panic(true) throws an exception.
Should it?
There is a huge discussion on this topic:
https://stackoverflow.com/questions/978061/http-get-with-request-body
Yes. In other words, any HTTP request message is allowed to contain a message body, and thus must parse messages with that in mind. Server semantics for GET, however, are restricted such that a body, if any, has no semantic meaning to the request. The requirements on parsing are separate from the requirements on method semantics.
So, yes, you can send a body with GET, and no, it is never useful to do so.
This is why we need QUERY.
https://www.ietf.org/archive/id/draft-ietf-httpbis-safe-method-w-body-02.html
A route with the following code:
1 / 0
Returns response as:
java.lang.RuntimeException: zoomba.lang.core.types.ZException$ArithmeticLogicOperation: Invalid Arithmetic Logical Operation [/] : --> /Users/hemil/blox/cowj_apis/api/location.zm:1:5 to 5 --> ( Can not do operation ( DIVISION ) :
( 1 ) with ( 0 ) !
left: java.lang.Integer
right: java.lang.Integer )
instead of a generic internal server error.
The yaml file allows file paths to be specified using _/ notation. Example -
routes:
  get:
    /hello/g: _/hello.groovy
    /hello/j: _/hello.js
    /hello/p: _/hello.py
    /hello/z: _/hello.zm
_/hello.groovy means hello.groovy in the directory of the configuration file. This allows the files to be relocated, as long as the relative paths are preserved. But we are not exposing the base directory to data sources. Many data sources would need files as input, and these are best specified as relative paths in order to facilitate relocation.
https://www.baeldung.com/java-connection-pooling
https://www.baeldung.com/hikaricp
I mean, we can work without it, but it is better to have it.
Right now, when input validation fails, the response goes back to the client.
While doing so, the output validation gets triggered.
This is unexpected behavior and needs to stop.
This should have been done by now, but it is not.
We can start with scripts, all scripts. The functionality would be encoded in the yaml file as follows:
cache:
  scripts: false
  auth: false
  type: false
The default would be set to true.
This way people do not have to reload the server when they modify things in the scripts and such.
We should start with scripts and type.
A much better alternative would be just adding cache: false in the container yaml files.
The main config file cache would essentially talk about the scripts in question.
schema.yaml would only be applicable for the JSON files pointed to by the schema.
The function all() does not do a recursive traversal:
default Stream<Blob> all(String bucketName, String directoryPrefix) {}
This can be fixed easily by using overloads.
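To illustrate the overload idea without the storage SDK, here is the same logic over plain object names (class name hypothetical). Cloud listings are flat, so "recursive" just means not stopping at the next '/' after the prefix:

```java
import java.util.List;
import java.util.stream.Stream;

// Illustration with plain object names instead of Blob (class name
// hypothetical). Cloud listings are flat, so "recursive" just means:
// do not stop at the next '/' after the prefix.
class Listing {
    // existing behavior: direct children of the prefix only
    static Stream<String> all(List<String> names, String prefix) {
        return all(names, prefix, false);
    }

    // overload: recursive traversal includes nested "directories"
    static Stream<String> all(List<String> names, String prefix, boolean recursive) {
        return names.stream()
                .filter(n -> n.startsWith(prefix))
                .filter(n -> recursive || n.indexOf('/', prefix.length()) < 0);
    }
}
```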
Just do this for the script after input schema validation:
payload = req.attribute("_body") // this should already have the parsed data
Error happens:
java.lang.NullPointerException: Cannot invoke "String.length()" because "content" is null
at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1217)
at cowj.TypeSystem.lambda$outputSchemaVerificationFilter$6(TypeSystem.java:344)
at spark.FilterImpl$1.handle(FilterImpl.java:73)
at spark.http.matching.AfterAfterFilters.execute(AfterAfterFilters.java:55)
at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:187)
Expose environment variables in the scripting languages, so that zoomba can access the environment of the JVM process and do different things depending on the environment it is running in.
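A minimal sketch of the exposure: snapshot the process environment into an immutable map that the engine could bind under some name, say _env (the binding name is an assumption):

```java
import java.util.Map;

// Sketch: snapshot the JVM process environment into an immutable map
// the script engine could bind under some name (binding name is an
// assumption, e.g. _env).
class EnvBinding {
    static Map<String, String> env() {
        return Map.copyOf(System.getenv()); // immutable snapshot
    }
}
```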
All of these can be done easily with routes support as follows:
routes:
  errors:
    "404" : _/not_found.js
    "500" : "Internal Server Error!"
    "*" : _/custom_error_handler.js
https://sparkjava.com/documentation#error-handling
https://sparkjava.com/documentation#exception-mapping
This seems to make sense.
https://jpy.readthedocs.io/en/latest/install.html#running-python-from-java
6. Ensure ls returns cow-0.1-SNAPSHOT.jar as well as a deps/ folder. If those two don't exist, run gradle clean and gradle build -x test.
7. java -jar cow-0.1-SNAPSHOT.jar ../../samples/hello/hello.yaml
The corrected commands for https://github.com/nmondal/cowj#cowj-setup are below:
java -jar cowj-0.1-SNAPSHOT.jar and
java -jar cowj-0.1-SNAPSHOT.jar ../../samples/hello/hello.yaml
There is no function available to list data of a folder in a bucket.
This would be a problem:
local_redis:
  type: redis
  secrets: gcp
  urls: "${redis-urls}"
This would not work even if the gcp secret manager has a proper key which is redis-urls.
We have this:
https://raw.githubusercontent.com/Microsoft/TypeScript/02547fe664a1b5d1f07ea459f054c34e356d3746/lib/tsc.js
We can use this.
What we really have is a virtual machine with instructions.
Without proper logging for each action and each decision, it is next to impossible for people to decide why something was done.
Currently CurlWrapper makes a half-hearted effort at being a forward proxy: it massages the headers, request body, and query params from the client to the destination server.
def send( headers, body, params) {
// producing payload to destination server
}
But the reverse transform functionality is missing.
def receive( headers, body, status ) {
// producing payload to client
}
The proposal for the transform could be:
proxy:
  send: _/send.zm
  receive: _/rec.zm
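The send/receive pair could be wired as two pluggable transforms around the upstream call; a hypothetical sketch standing in for _/send.zm and _/rec.zm:

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: two pluggable transforms standing in for
// _/send.zm and _/rec.zm around the upstream call.
class ProxyTransform {
    final UnaryOperator<Map<String, Object>> send;    // client -> destination
    final UnaryOperator<Map<String, Object>> receive; // destination -> client

    ProxyTransform(UnaryOperator<Map<String, Object>> send,
                   UnaryOperator<Map<String, Object>> receive) {
        this.send = send;
        this.receive = receive;
    }

    Map<String, Object> forward(Map<String, Object> clientPayload) {
        Map<String, Object> toServer = send.apply(clientPayload);
        // ... the actual upstream call via the curl data source goes here ...
        return receive.apply(toServer); // reverse transform, back to the client
    }
}
```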