Comments (2)

Mohamed-Rafik-Bouguelia commented on September 2, 2024

The function model.predict(t, x) expects t to be a datetime and x to be a NumPy array representing a feature vector (i.e. a single data point).

From what I can see, the index of your dataframe is an integer. As a result, at each iteration of the loop for t, x in zip(df.index, df.values): ... the value of t is an integer (not a datetime as expected), and your data point x includes the timestamp (whereas it is expected to be a feature vector without the time).
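To make the mismatch concrete, here is a minimal sketch using pandas only. The inline CSV and its column names ("timestamp", "value") are assumptions standing in for your actual file; it simply compares what the loop yields with and without the timestamp column as the index.

```python
import io
import pandas as pd

# Hypothetical two-column CSV standing in for simulated_data.csv
csv_text = "timestamp,value\n2020-01-01,1.0\n2020-01-02,1.5\n2020-01-03,0.7\n"

# Without index_col, the index is a default integer range and the
# timestamp stays inside the row values.
df_bad = pd.read_csv(io.StringIO(csv_text))
t_bad, x_bad = next(zip(df_bad.index, df_bad.values))

# With index_col=0 and an explicit conversion, the index holds timestamps
# and each row contains only the feature values.
df_ok = pd.read_csv(io.StringIO(csv_text), index_col=0)
df_ok.index = pd.to_datetime(df_ok.index)
t_ok, x_ok = next(zip(df_ok.index, df_ok.values))

# Show the type of t and the content of x in both cases
print(type(t_bad).__name__, x_bad)
print(type(t_ok).__name__, x_ok)
```

In the second case t is a pandas Timestamp (a subclass of datetime) and x holds only the feature value, which matches the shape described above.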

A simple fix is to use the timestamp column as the index. Here is working code (with comments added where I changed something):

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime

from grand import IndividualAnomalyInductive, IndividualAnomalyTransductive, GroupAnomaly

df = pd.read_csv("simulated_data.csv", parse_dates=True, header = 0, index_col=0) # Added index_col=0
df.index = pd.to_datetime(df.index) # The timestamps are now our index column
df.columns = ["value"] # the columns (features) excluding the index

df.plot() # there is no column "timestamp" now, it's the index
plt.show()

model = IndividualAnomalyTransductive(ref_group = ["day-of-week"], w_martingale = 100)

# You can also try with "season-of-year" as the periodicity in your data seems seasonal
# model = IndividualAnomalyTransductive(ref_group = ["season-of-year"], w_martingale = 100)

for t, x in zip(df.index, df.values):
    info = model.predict(t, x)
    print("Time: {} ==> strangeness: {}, deviation: {}".format(t, info.strangeness, info.deviation), end="\r")

# Just added this line to see the results
model.plot_deviations(figsize=(12, 8), plots=["data", "strangeness", "deviation", "pvalue", "threshold"])
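If you want to catch this kind of input mismatch early, a small guard can be called on each pair before it is passed to model.predict. This is only a sketch: check_sample is a hypothetical helper, not part of the grand package, and the checks encode my reading of the expected input (a datetime for t, a 1-D numeric array for x).

```python
from datetime import datetime
import numpy as np

def check_sample(t, x):
    """Hypothetical helper: raise early if (t, x) does not look like (datetime, feature vector)."""
    # pandas Timestamp objects pass this check because they subclass datetime
    if not isinstance(t, datetime):
        raise TypeError(f"t must be a datetime, got {type(t).__name__}")
    x = np.asarray(x)
    # A row that still contains a timestamp string ends up with a non-numeric dtype
    if x.ndim != 1 or not np.issubdtype(x.dtype, np.number):
        raise ValueError("x must be a 1-D numeric feature vector")
    return t, x
```

For example, calling check_sample(t, x) as the first line inside the loop would raise a TypeError on an integer index instead of failing later inside the model.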

Regarding your second question, the expected input of IndividualAnomalyTransductive() is as described in the example Notebook:

model = IndividualAnomalyTransductive(
            ref_group = ["day-of-week"], # Criteria to use to construct reference data (check the notebook examples to see other possible criteria to use).
            external_percentage = 0.3,   # Percentage of samples to pick from historical data in the case where ref_group is set to "external".

            # The following parameters are the same as in IndividualAnomalyInductive
            non_conformity = "knn",     # Strangeness measure, e.g. "knn" or "median"
            k = 20,                     # Used if non_conformity is "knn"
            w_martingale = 15,          # Window size used for computing the deviation level
            dev_threshold = 0.6,        # Threshold on the deviation level (in [0, 1])
            columns = None              # Optional feature names (for interpreting the results)
)

For the moment there is no documentation other than the explanations given in the example Notebook. More will be added in the near future.

from group-anomaly-detection.

filipwastberg commented on September 2, 2024

That's great. Thanks. I really think that some documentation of the functions would be a great feature.

Furthermore, I think it would be great if we could install the package with pip install git+https://github.com/caisr-hh/group-anomaly-detection, instead of having to clone the whole project and then install it. Is that something you are considering?