
client_python's Introduction





Overview

A helper library to interact with Arize AI APIs.

Arize is an end-to-end ML observability and model monitoring platform. The platform is designed to help ML engineers and data science practitioners surface and fix issues with ML models in production faster with:

  • Automated ML monitoring and model monitoring
  • Workflows to troubleshoot model performance
  • Real-time visualizations for model performance monitoring, data quality monitoring, and drift monitoring
  • Model prediction cohort analysis
  • Pre-deployment model validation
  • Integrated model explainability

Quickstart

This guide will help you instrument your code to log observability data for model monitoring and ML observability. The types of data supported include prediction labels, human readable/debuggable model features and tags, actual labels (once the ground truth is learned), and other model-related data. Logging model data allows you to generate powerful visualizations in the Arize platform to better monitor model performance, understand issues that arise, and debug your model's behavior. Additionally, Arize provides data quality monitoring, data drift detection, and performance management of your production models.

Start logging your model data with the following steps:

1. Create your account

Sign up for a free account HERE.



2. Get your service API key

When you create an account, we generate a service API key. You will need this API Key and your Space Key for logging authentication.

3. Instrument your code

Python Client

If you are using the Arize Python client, add a few lines to your code to log predictions and actuals. Logs are sent to Arize asynchronously.

Install Library

Install the Arize library in an environment using Python >= 3.6.

$ pip3 install arize

Or clone the repo:

$ git clone https://github.com/Arize-ai/client_python.git
$ python3 -m pip install client_python/

Initialize Python Client

Initialize the arize client at the start of your service using your previously created API and Space Keys.

NOTE: We strongly suggest storing the API key as a secret or an environment variable.

import os

from arize.api import Client
from arize.utils.types import ModelTypes, Environments

# Read the API key from an environment variable rather than hard-coding it
API_KEY = os.environ.get('ARIZE_API_KEY')

# Replace 'ARIZE_SPACE_KEY' with your Space Key
arize_client = Client(space_key='ARIZE_SPACE_KEY', api_key=API_KEY)
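The note above about keeping keys out of source code can be sketched as follows. This is a minimal illustration and not part of the Arize client: `load_arize_credentials` is a hypothetical helper, and the `ARIZE_SPACE_KEY` environment variable name is an assumption (only `ARIZE_API_KEY` appears in the example above).

```python
import os


def load_arize_credentials(env=None):
    """Return (space_key, api_key) read from environment variables.

    Hypothetical helper: fails early instead of passing None to the client.
    """
    env = os.environ if env is None else env
    required = ("ARIZE_SPACE_KEY", "ARIZE_API_KEY")  # variable names are assumptions
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["ARIZE_SPACE_KEY"], env["ARIZE_API_KEY"]
```

The returned pair could then be passed to `Client(space_key=..., api_key=...)`, so a missing key is reported at startup rather than at the first log call.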

Collect your model input features and labels you'd like to track

Real-time single prediction:

For a single real-time prediction, you can track all input features used at prediction time by logging them via a key:value dictionary.

features = {
    'state': 'ca',
    'city': 'berkeley',
    'merchant_name': 'Peets Coffee',
    'pos_approved': True,
    'item_count': 10,
    'merchant_type': 'coffee shop',
    'charge_amount': 20.11,
    }

Bulk predictions:

When dealing with bulk predictions, you can pass in input features, prediction/actual labels, and prediction_ids for more than one prediction via a pandas DataFrame where df.columns contains the feature names.

import uuid

import numpy as np
import pandas as pd

## e.g. features from a CSV. Labels must be 2-D data frames where df.columns correspond to the label name
features_df = pd.read_csv('path/to/file.csv')

prediction_labels_df = pd.DataFrame(np.random.randint(1, 100, size=(features_df.shape[0], 1)))

ids_df = pd.DataFrame([str(uuid.uuid4()) for _ in range(len(prediction_labels_df.index))])

Log Predictions

Single real-time prediction:

## Returns a single concurrent.futures.Future
pred = arize_client.log(
    model_id='sample-model-1',
    model_version='v1.23.64',
    model_type=ModelTypes.BINARY,
    prediction_id='plED4eERDCasd9797ca34',
    prediction_label=True,
    features=features,
    )

#### To confirm that the log request completed successfully, wait for the future to resolve:
## NB: This is a blocking call
res = pred.result()
if res.status_code != 200:
  print(f'future failed with response code {res.status_code}, {res.text}')

Bulk upload of predictions:

responses = arize_client.bulk_log(
    model_id='sample-model-1',
    model_version='v1.23.64',
    model_type=ModelTypes.BINARY,
    prediction_ids=ids_df,
    prediction_labels=prediction_labels_df,
    features=features_df
    )
#### To confirm that the log requests completed successfully, wait for the futures to resolve:
## NB: This is a blocking call
import concurrent.futures as cf
for response in cf.as_completed(responses):
  res = response.result()
  if res.status_code != 200:
    print(f'future failed with response code {res.status_code}, {res.text}')

The client's log function returns a single concurrent future, while bulk_log returns a list of concurrent futures, so logging is asynchronous. To capture the logging response, wait for the futures to resolve. If you prefer a fire-and-forget pattern, you can ignore the returned futures altogether.
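The future-handling pattern described above can be tried without an Arize account. The sketch below is an illustration only: `fake_send` and `FakeResponse` are hypothetical stand-ins for the client call and its response object, and only the `concurrent.futures` usage mirrors the examples above.

```python
import concurrent.futures as cf
from collections import namedtuple

# Stand-in for an HTTP response; the real client returns its own response object.
FakeResponse = namedtuple("FakeResponse", ["status_code", "text"])


def fake_send(prediction_id):
    """Pretend to send one record; IDs starting with 'bad' fail."""
    if prediction_id.startswith("bad"):
        return FakeResponse(500, f"could not log {prediction_id}")
    return FakeResponse(200, "")


def collect_failures(futures):
    """Block until every future resolves and return the non-200 responses."""
    failures = []
    for future in cf.as_completed(futures):
        res = future.result()
        if res.status_code != 200:
            failures.append(res)
    return failures


with cf.ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(fake_send, pid) for pid in ("a1", "bad2", "a3")]
    failures = collect_failures(futures)

for res in failures:
    print(f"future failed with response code {res.status_code}, {res.text}")
```

For fire-and-forget, the call to `collect_failures` would simply be skipped, at the cost of never learning that a request failed.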

We automatically discover new models logged over time based on the model ID sent on each prediction.

Logging Actual Labels

NOTE: The prediction_id passed in must match the one sent with the original prediction in the example above.

response = arize_client.log(
    model_id='sample-model-1',
    model_type=ModelTypes.BINARY,
    prediction_id='plED4eERDCasd9797ca34',
    actual_label=False
    )

Bulk upload of actuals:

responses = arize_client.bulk_log(
    model_id='sample-model-1',
    model_type=ModelTypes.BINARY,
    prediction_ids=ids_df,
    actual_labels=actual_labels_df,
    )

#### To confirm that the log requests completed successfully, wait for the futures to resolve:
## NB: This is a blocking call
import concurrent.futures as cf
for response in cf.as_completed(responses):
  res = response.result()
  if res.status_code != 200:
    print(f'future failed with response code {res.status_code}, {res.text}')

Once the actual labels (ground truth) for your predictions have been determined, you can send them to Arize and evaluate your metrics over time. The prediction ID links each prediction to its corresponding actual label, so the two must be identical for the events to match.
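Because the ground truth often arrives long after the prediction, the service has to keep each prediction ID somewhere it can find again. The sketch below shows one way to do that, assuming a hypothetical in-memory store; `record_prediction` and `record_actual` are illustrative names and not part of the Arize client.

```python
import uuid

# Hypothetical in-memory store: prediction_id -> record. A real service would
# persist the ID alongside the request it scored.
predictions = {}


def record_prediction(features, prediction_label):
    """Create a unique prediction ID and remember what was predicted."""
    prediction_id = str(uuid.uuid4())
    predictions[prediction_id] = {
        "features": features,
        "prediction_label": prediction_label,
        "actual_label": None,
    }
    return prediction_id


def record_actual(prediction_id, actual_label):
    """Attach ground truth to an existing prediction; unknown IDs are an error."""
    if prediction_id not in predictions:
        raise KeyError(f"no prediction logged with id {prediction_id!r}")
    predictions[prediction_id]["actual_label"] = actual_label


pid = record_prediction({"state": "ca", "item_count": 10}, True)
record_actual(pid, False)
```

The same `pid` would be passed as `prediction_id` in both the prediction call and the later actuals call.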

Bulk upload of all your data (features, predictions, actuals, SHAP values) in a pandas.DataFrame

Use arize.pandas.logger to publish a dataframe with the features, predicted label, actual, and/or SHAP to Arize for monitoring, analysis, and explainability.

Initialize Arize Client from arize.pandas.logger

import os

from arize.pandas.logger import Client, Schema
from arize.utils.types import ModelTypes, Environments

# Read the API key from an environment variable rather than hard-coding it
API_KEY = os.environ.get('ARIZE_API_KEY')
arize_client = Client(space_key='ARIZE_SPACE_KEY', api_key=API_KEY)

Logging features & predictions only, then actuals

response = arize_client.log(
    dataframe=your_sample_df,
    model_id="fraud-model",
    model_version="1.0",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    schema = Schema(
        prediction_id_column_name="prediction_id",
        timestamp_column_name="prediction_ts",
        prediction_label_column_name="prediction_label",
        prediction_score_column_name="prediction_score",
        feature_column_names=feature_cols,
    )
)

response = arize_client.log(
    dataframe=your_sample_df,
    model_id="fraud-model",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    schema = Schema(
        prediction_id_column_name="prediction_id",
        actual_label_column_name="actual_label",
    )
)

Logging features, predictions, actuals, and SHAP values together

response = arize_client.log(
    dataframe=your_sample_df,
    model_id="fraud-model",
    model_version="1.0",
    model_type=ModelTypes.NUMERIC,
    environment=Environments.PRODUCTION,
    schema = Schema(
        prediction_id_column_name="prediction_id",
        timestamp_column_name="prediction_ts",
        prediction_label_column_name="prediction_label",
        actual_label_column_name="actual_label",
        feature_column_names=feature_col_name,
        shap_values_column_names=dict(zip(feature_col_name, shap_col_name))
    )
)
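The `dict(zip(...))` expression in the last example pairs each feature column with the column that holds its SHAP values. A minimal sketch of what it produces, assuming hypothetical column names and a `_shap` suffix convention that is not prescribed by the source:

```python
# Hypothetical column names; in practice these come from your DataFrame.
feature_col_name = ["state", "item_count", "charge_amount"]
shap_col_name = ["state_shap", "item_count_shap", "charge_amount_shap"]

# zip() pairs items by position, so both lists must be in the same order
# and have the same length; otherwise zip silently drops the extras.
if len(feature_col_name) != len(shap_col_name):
    raise ValueError("feature and SHAP column lists must have the same length")

shap_mapping = dict(zip(feature_col_name, shap_col_name))
print(shap_mapping)
```

Checking the lengths first is a design choice in this sketch: a silently truncated mapping would leave some features without SHAP values and be hard to notice later.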

4. Log In for Analytics

That's it! Once your service is deployed and predictions are logged, you'll be able to log in to your Arize account and dive into your data, slicing it by features, tags, models, time, etc.

Analytics Dashboard




Logging SHAP values

Log feature importance in SHAP values to the Arize platform to explain your model's predictions. By logging SHAP values you gain the ability to view the global feature importances of your predictions as well as the ability to perform cohort and prediction based analysis to compare feature importance values under varying conditions. For more information on SHAP and how to use SHAP with Arize, check out our SHAP documentation.


Other languages

If you are using a different language, you'll be able to post an HTTP request to our Arize edge-servers to log your events.

HTTP post request to Arize

curl -X POST -H "Authorization: YOUR_API_KEY" "https://log.arize.com/v1/log" -d '{"space_key": "YOUR_SPACE_KEY", "model_id": "test_model_1", "prediction_id":"test100", "prediction":{"model_version": "v1.23.64", "features":{"state":{"string": "CO"}, "item_count":{"int": 10}, "charge_amt":{"float": 12.34}, "physical_card":{"string": true}}, "prediction_label": {"binary": false}}}'
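For readers who want to see the structure of that request, here is the same call built with Python's standard library. It is a sketch that mirrors the URL, header, and a subset of the fields from the curl command above; the key values are placeholders, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Placeholders, as in the curl example; replace with your own keys.
API_KEY = "YOUR_API_KEY"
SPACE_KEY = "YOUR_SPACE_KEY"

# Each feature value is wrapped in an object naming its type.
payload = {
    "space_key": SPACE_KEY,
    "model_id": "test_model_1",
    "prediction_id": "test100",
    "prediction": {
        "model_version": "v1.23.64",
        "features": {
            "state": {"string": "CO"},
            "item_count": {"int": 10},
            "charge_amt": {"float": 12.34},
        },
        "prediction_label": {"binary": False},
    },
}

request = urllib.request.Request(
    "https://log.arize.com/v1/log",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Authorization": API_KEY},
    method="POST",
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```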

Website

Visit Us At: https://arize.com/model-monitoring/

Official Documentation: https://docs.arize.com/arize/

Additional Resources

Visit the Arize Blog and Resource Center for more resources on ML observability and model monitoring.

client_python's People

Contributors

arizedatngo, caroger, davidgmonical, fjcasti1, gabe0912, gurmeharsomal, hannahturk, harrisonchu, jackyxcs, mgordon-arize, parker-stafford, rogerhyang, shiwen1209


client_python's Issues

Support for release notes or changelogs?

Hi Arize team,

I'm wondering if your team could please publish release notes, or at least changelogs for version upgrades when arize has a new major/minor/patch upgrade? We have a dependabot that maintains our dependencies and frequently suggests upgrading major or minor versions of Arize, but it is difficult to understand which changes we'd be admitting, and further, if arize has deprecated anything important or otherwise introduced any breaking changes. This makes us reluctant to commit the suggested upgrade. Publishing changelogs or release notes will allow teams using arize for their ML platforms to more quickly adapt to the latest cool features you make available.

Thank you!
Frank

notebook Analyzing_Performance_Degredation needs fix

notebooks/Analyzing_Performance_Degredation needs a fix to update import statements (in Step 0 & Step 1) for the latest version
from arize.types import ModelTypes
to
from arize.utils.types import ModelTypes
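
Until the notebook is updated, code that must run against both package layouts could fall back from the new import path to the old one. The sketch below is one possible workaround; `import_first_available` is a hypothetical helper, and only the two module paths come from this issue.

```python
import importlib


def import_first_available(module_names, attribute):
    """Return `attribute` from the first module in `module_names` that provides it."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # this layout is not installed; try the next candidate
        if hasattr(module, attribute):
            return getattr(module, attribute)
    raise ImportError(f"{attribute} not found in any of: {', '.join(module_names)}")


# Intended use (requires the arize package to be installed):
# ModelTypes = import_first_available(["arize.utils.types", "arize.types"], "ModelTypes")
```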

ModuleNotFoundError: No module named 'arize.single_log'

Following these docs in a Colab environment.
Arize version: 7.11.0
The following code raises an exception:

from arize.api import Client

API_KEY = 'xx'
SPACE_KEY = 'xx'
arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)

Trace:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
[<ipython-input-4-afea86b840c6>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from arize.api import Client
      2 
      3 
      4 API_KEY = '6983bed'
      5 SPACE_KEY = 'ff4f4d6be3d5f6fbc6d'

[/usr/local/lib/python3.10/dist-packages/arize/api.py](https://localhost:8080/#) in <module>
     36 from .__init__ import __version__
     37 from .bounded_executor import BoundedExecutor
---> 38 from .single_log.casting import cast_dictionary
     39 from .utils.errors import AuthError, InvalidStringLength, InvalidTypeAuthKey, InvalidValueType
     40 from .utils.logging import get_truncation_warning_message, logger

ModuleNotFoundError: No module named 'arize.single_log'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
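
When debugging an import error like this, it can help to confirm which version is actually installed in the running environment, since a notebook kernel may not see the latest `pip` install. A small sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("arize version:", installed_version("arize"))
```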

Support unsigned int types in features

Hi,
I'm trying to programmatically upload my first training dataset, but even though the docs say you support missing data in features, your validation in fact prevents missing data in integer columns:

allowed_datatypes = (

The list of allowed arrow types only contains non-nullable integer types. Is this an oversight or because you don't really support missing data in features?

Also, and perhaps alternatively, since you support manual upload of parquet and arrow files, do you plan to also support these via the Python SDK? My data is in Arrow to begin with, and so that would save me some manual work of converting to pandas, especially since it'll get converted back to Arrow anyway.
