raymon-ai / raymon
The official http://raymon.ai data profiling and logging library.
License: MIT License
If sending data to the backend fails, we need to show a warning (but not let prod code crash!).
The Raymon logging library should call the backend with the Authorization header set to a Bearer JWT token. It should get the token from our Auth0 endpoint. To get the token for machine-to-machine logging, it should send a request with the following parameters to the Auth0 endpoint:
RAYMON_AUTH0_URL: the endpoint to query the token on
RAYMON_GRANT_TYPE:
RAYMON_AUDIENCE: the API we want to log to
RAYMON_CLIENT_ID: client id
RAYMON_CLIENT_SECRET: client secret
These parameters should be loaded from ~/.raymon/secrets.json, <working_dir>/.raymon.json, environment variables, or a specific file, in that order of ascending priority.
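A minimal sketch of that ascending-priority config loading (the function name and the treatment of the explicit file are assumptions, not the library's implementation):

```python
import json
import os
from pathlib import Path

KEYS = ["RAYMON_AUTH0_URL", "RAYMON_GRANT_TYPE", "RAYMON_AUDIENCE",
        "RAYMON_CLIENT_ID", "RAYMON_CLIENT_SECRET"]

def load_m2m_config(explicit_path=None, workdir=None):
    """Merge config sources in ascending priority:
    ~/.raymon/secrets.json < <working_dir>/.raymon.json < env vars < explicit file."""
    config = {}
    sources = [Path.home() / ".raymon" / "secrets.json"]
    if workdir:
        sources.append(Path(workdir) / ".raymon.json")
    for path in sources:
        if path.is_file():
            config.update(json.loads(path.read_text()))
    # environment variables override both files
    for key in KEYS:
        if key in os.environ:
            config[key] = os.environ[key]
    # an explicitly passed file has the highest priority
    if explicit_path and Path(explicit_path).is_file():
        config.update(json.loads(Path(explicit_path).read_text()))
    return config
```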
Important: this use case is for ingesting data. For users querying and inspecting data, we need another workflow.
We currently set up a connection every time we log an artefact. This can be optimized by using a session. https://requests.readthedocs.io/en/master/user/advanced/
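One way session reuse could look, sketched with requests.Session (the helper name and header handling are illustrative); a long-lived session pools TCP connections across log calls instead of doing a new handshake per artefact:

```python
import requests

def make_session(token):
    """Create one long-lived Session so connections are reused across
    log calls; auth headers are set once and sent on every request."""
    session = requests.Session()
    session.headers.update({"Authorization": f"Bearer {token}"})
    return session
```

The session would then be stored on the logger and reused for every call.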
When logging nested data, we should gracefully check for types like np.int64 and such and convert them to something JSON-dumpable. Packages probably exist for this.
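A small sketch of the graceful conversion, using the fact that NumPy scalars (np.int64, np.float32, ...) expose a `.item()` method returning the native Python value; the helper name is illustrative:

```python
import json

def np_safe_default(obj):
    """Fallback for json.dumps: NumPy scalar types expose .item(),
    which returns the equivalent native Python value."""
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
```

Usage would be `json.dumps(data, default=np_safe_default)`.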
Doing an API call on every log / tag statement is fine for demo and MVP purposes, but not for real use cases.
For simple functions, we should provide a wrapper class that takes care of loading / serialising the function.
We should support the following metrics for all component types (input, output, actual, scores) and make it configurable which ones to check for.
We need to support the batched API data ingestion method and do this async and error-proof.
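A hedged sketch of the buffering side of batched ingestion (names and sizes are illustrative; the real ingestion endpoint and async transport are out of scope here):

```python
import threading

class BatchBuffer:
    """Collect log messages and flush them in batches so we do not hit
    the API on every single log/tag call."""
    def __init__(self, flush_fn, max_size=100):
        self.flush_fn = flush_fn  # e.g. posts a batch to the ingestion endpoint
        self.max_size = max_size
        self._buf = []
        self._lock = threading.Lock()

    def add(self, msg):
        with self._lock:
            self._buf.append(msg)
            if len(self._buf) >= self.max_size:
                self._flush_locked()

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        if not self._buf:
            return
        batch, self._buf = self._buf, []
        try:
            self.flush_fn(batch)
        except Exception:
            # error-proof: never let a failed flush crash prod code;
            # a real implementation would warn and/or retry
            pass
```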
We currently just have a ModelProfile. We could add a class Profile that only has inputs.
trace.tag should accept both Tag objects and dicts. We should convert them internally if needed, and only convert them to JSON when writing to a file or the API.
We need a way to indicate how confident we are about the drift, which will be a function of the amount of data we've analysed. We currently use a two-sample KS test for NumericStats, but do not return a p-value. We could easily do this, but we need to decide what to do with the other types first:
IntComponents?
CategoricalComponents?
For all component types / stats types, we need a test that returns a value between 0 and 1, and can return a p-value, confidence interval, or something else.
To avoid passing the trace in all functions, we should support a workflow like this:
import raymon
from raymon import Trace
trace = Trace(... ,global=True) # default
trace2 = raymon.current_trace()
assert trace == trace2
The pytorch dependency is huge, and only has limited value.
Let's try to replace it by ONNX, which is hopefully smaller. Alternatively, we should move the extractors that depend on pytorch to a separate package.
Allow users to log to text files, and ingest those text files offline in the API, much like prometheus / logstash.
When one profile has input_components=[a, b, c] and output_components=[d, e], and the other profile has input_components=[a, b] and output_components=[d], a schema contrast should only take the common components.
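A minimal sketch of taking only the common components, assuming components are keyed by name in a dict (the function name and layout are illustrative):

```python
def common_components(components_a, components_b):
    """Keep only components present in both profiles; a contrast is
    then computed pairwise over this intersection."""
    shared = components_a.keys() & components_b.keys()
    return {name: (components_a[name], components_b[name]) for name in shared}
```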
This should format the error nicely, tag the trace with the error and log the stacktrace as a trace element.
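A minimal sketch of such an error guard; the trace.tag / trace.log calls and the tag layout are assumptions, not the real raymon API:

```python
import traceback

class TraceErrorGuard:
    """Context manager: format the exception, tag the trace with it,
    and attach the stack trace as a trace element instead of crashing."""
    def __init__(self, trace):
        self.trace = trace

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc is not None:
            self.trace.tag({"name": "error", "value": exc_type.__name__})
            self.trace.log(ref="stacktrace",
                           data="".join(traceback.format_exception(exc_type, exc, tb)))
        return True  # swallow the error so prod code does not crash
```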
All REST API endpoints should be queryable through the raymon library.
e.g. api.search_object(...)
Lib should be as lightweight as possible.
https://github.com/pandas-profiling/pandas-profiling
We should be able to parse the output into one of our profiles.
Browsing to page 2 on one component_type and then switching component_types results in showing page 2 on the new component_type page too, which may not have 2 pages.
When switching component_types, reset the page to 0.
When building a schema from the database, its stats can be empty, and no contrast to another schema can be made for those components. Currently these return a drift of -1, which is shown in green, yikes! We should alert users with "No Data" instead. This should actually be a new type of alert (invalids, drift, no data).
We already have a parameter domain on the stats objects from before. We need to re-enable calling those.
profile.build() should have a parameter domains of the form:
{
'input_components': {name: domain, name2: domain},
}
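A hypothetical sketch of how profile.build(domains=...) could consume that dict, overriding the inferred domain of matching components (the dict-of-dicts component layout is an assumption):

```python
def apply_domains(components, domains):
    """Override inferred domains with user-supplied ones; components not
    mentioned in `domains` keep their inferred domain."""
    for group, overrides in domains.items():  # e.g. 'input_components'
        for name, domain in overrides.items():
            if name in components.get(group, {}):
                components[group][name]["domain"] = domain
    return components
```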
We should enable them both again, like we did before. They can be plotted on the same axis.
I got errors in examples.
ingest_retinopathy_1 | Traceback (most recent call last):
ingest_retinopathy_1 | File "process.py", line 242, in <module>
ingest_retinopathy_1 | ray_ids = run()
ingest_retinopathy_1 | File "process.py", line 235, in run
ingest_retinopathy_1 | oracle.process(ray_id=ray_id, metadata=metadata)
ingest_retinopathy_1 | File "process.py", line 141, in process
ingest_retinopathy_1 | ray.info(f"Logging ground truth for {ray}")
ingest_retinopathy_1 | File "/usr/local/lib/python3.7/site-packages/raymon/ray.py", line 41, in info
ingest_retinopathy_1 | self.logger.info(ray_id=str(self), text=text)
ingest_retinopathy_1 | File "/usr/local/lib/python3.7/site-packages/raymon/loggers.py", line 84, in info
ingest_retinopathy_1 | self.data_logger.info(json.dumps(kafka_msg))
ingest_retinopathy_1 | AttributeError: 'RaymonFileLogger' object has no attribute 'data_logger'
From: https://setuptools.readthedocs.io/en/latest/pkg_resources.html
Use of pkg_resources is discouraged in favor of importlib.resources, importlib.metadata, and their backports (resources, metadata). Please consider using those libraries instead of pkg_resources.
https://docs.python.org/3/library/importlib.html#module-importlib.resources
When we contrast profiles, some component may be missing. We need to deal with this.
We should simply save the image as a PIL image and save / load it in lossless PNG format.
We need basic tests for:
Instead of the validation checks, we should simply try to parse the data as a float.
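A one-function sketch of that "just try to parse" approach (the helper name is illustrative):

```python
def is_floatlike(value):
    """Replace explicit validation checks with an attempt to parse the
    value as a float; anything float() accepts counts as numeric."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False
```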
Data types should have the to_jcr and from_jcr functions implemented, and these should be unit tested.
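A sketch of a generic round-trip unit test; to_jcr / from_jcr are the names from this issue, but their exact signatures are assumptions:

```python
def assert_jcr_roundtrip(obj):
    """Any data type exposing to_jcr / from_jcr should survive a
    serialize-deserialize round trip unchanged."""
    jcr = obj.to_jcr()
    restored = type(obj).from_jcr(jcr)
    assert restored.to_jcr() == jcr
    return restored
```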
We currently only support the client_credentials flow. We also need to support users logging in via the CLI using the device flow grant.
When machine-to-machine credentials are available, the system should use client credentials; if not, it should try to log in the user.
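The fallback order can be sketched as below; the two login callables are injected only to keep the sketch self-contained (the real library would call Auth0 directly):

```python
def login(m2m_credentials, login_m2m, login_device_flow):
    """Prefer machine-to-machine client_credentials when available,
    otherwise fall back to the interactive device-flow login."""
    if m2m_credentials:
        return login_m2m(m2m_credentials)
    return login_device_flow()
```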
Using the statement api.system_metrics(), we should be able to get and send the following tags or global metrics to the backend:
These should be tags of type global-metric (?) and should not be attached to the ray (?), using ray.process_metrics().
I got an error when logging in for the first time.
FileNotFoundError Traceback (most recent call last)
~/raymon/examples/setup_project.py in
19 login_env = None
20 # api = RaymonAPI(url=f"https://api{ENV}.raymon.ai/v0", env=login_env)
---> 21 api = RaymonAPI(url=f"http://localhost:8000/v0", env=login_env)
22
23
~/opt/miniconda3/envs/retinopathy/lib/python3.8/site-packages/raymon/api.py in __init__(self, url, project_id, auth_path, env)
17 self.token = None
18
---> 19 self.login()
20
21 """
~/opt/miniconda3/envs/retinopathy/lib/python3.8/site-packages/raymon/api.py in login(self)
24
25 def login(self):
---> 26 self.token = login(fpath=self.auth_path, project_id=self.project_id, env=self.env)
27 self.headers["Authorization"] = f"Bearer {self.token}"
28
~/opt/miniconda3/envs/retinopathy/lib/python3.8/site-packages/raymon/auth/__init__.py in login(fpath, project_id, env)
71 # If we did not find m2m credentials, let the user login interactively.
72 try:
---> 73 token = login_user(credentials=credentials, out=fpath, env=env)
74 except (SecretException, NetworkException) as exc:
75 print(f"Could not login with user credentials.")
~/opt/miniconda3/envs/retinopathy/lib/python3.8/site-packages/raymon/auth/__init__.py in login_user(credentials, out, env)
40 if not token_ok(token):
41 token = login_device_flow(config)
---> 42 save_user_config(
43 existing=credentials,
44 auth_endpoint=config["auth_url"],
~/opt/miniconda3/envs/retinopathy/lib/python3.8/site-packages/raymon/auth/user.py in save_user_config(existing, auth_endpoint, audience, client_id, token, out, env)
31 user_config[env["auth_url"]] = env_config
32 known_configs["user"] = user_config
---> 33 with open(out, "w") as f:
34 json.dump(known_configs, fp=f, indent=4)
35
FileNotFoundError: [Errno 2] No such file or directory: '/Users/emreozan/.raymon/secrets.json'
5 cells were canceled due to an error in the previous cell.
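The crash happens because `~/.raymon/` does not exist yet on a first login, so `open(out, "w")` fails. A sketch of the likely fix, creating parent directories before writing (the helper name is illustrative):

```python
import json
from pathlib import Path

def save_json_config(out, payload):
    """Create the config directory (parents included) before writing,
    so a first-time login does not crash on a missing ~/.raymon/."""
    out = Path(out)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(payload, indent=4))
```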
A ray should not be able to log to the same peephole twice. Peepholes must be unique.
Use Cerberus or Schematics to validate the config values.
Can serve as inspiration for RDV.
The current implementation does not have authentication. Add this ASAP.
Simple: https://pypi.org/project/falcon-auth0/
Future proof: https://pypi.org/project/falcon-auth0/
Must work the same as #33.
When comparing 2 stats objects we currently do not use any confidence interval or p-value check, which we should. Since p-value checks can be overly sensitive on big data sets (which we happen to have a lot of in "big data"), we want to use confidence intervals. They also make for nice plots.
We can build these confidence intervals with minimal changes to this library and the backend, based solely on our stats (EDF / frequencies and sample sizes).
To contrast 2 stats objects, we can simply measure the max distance between the confidence intervals instead of the observed functions.
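One way to build such a band from just the EDF and sample size is the Dvoretzky-Kiefer-Wolfowitz inequality; this sketch (names illustrative, not the library's implementation) contrasts two EDFs via the max distance between their bands:

```python
import math

def dkw_epsilon(n, alpha=0.05):
    """DKW band half-width: the true CDF lies within edf +/- epsilon
    with probability at least 1 - alpha."""
    return math.sqrt(math.log(2 / alpha) / (2 * n))

def banded_distance(edf1, n1, edf2, n2, alpha=0.05):
    """Max distance between the two confidence bands, evaluated on a
    shared grid; 0 whenever the bands overlap everywhere."""
    eps = dkw_epsilon(n1, alpha) + dkw_epsilon(n2, alpha)
    return max(max(0.0, abs(f1 - f2) - eps) for f1, f2 in zip(edf1, edf2))
```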
We should make it easy for users to measure elapsed time.
with ray.time("your-ref"):
pass
and ray.time_ref(peephole="your-ref")
+ easy calculation of the time elapsed since the previous ref.
This should be added as a tag to the ray.
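A sketch of how the context-manager form could work; the tag layout is an assumption, and a plain list stands in for the ray here:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(trace_tags, ref):
    """Measure the wall time of the block and record it as a tag,
    even when the block raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        trace_tags.append({"name": f"{ref}-time", "value": elapsed, "type": "metric"})
```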