
tods's Introduction

TODS: Automated Time-series Outlier Detection System


Chinese documentation (中文文档)

TODS is a full-stack automated machine learning system for outlier detection on multivariate time-series data. TODS provides exhaustive modules for building machine learning-based outlier detection systems: data processing, time series processing, feature analysis (extraction), detection algorithms, and a reinforcement module. Together, these modules cover general-purpose data preprocessing, time series smoothing and transformation, feature extraction from the time and frequency domains, a variety of detection algorithms, and human-in-the-loop calibration of the system. Three common outlier detection scenarios on time-series data are supported: point-wise detection (time points as outliers), pattern-wise detection (subsequences as outliers), and system-wise detection (sets of time series as outliers), with a wide range of corresponding algorithms provided for each. This package is developed by DATA Lab @ Rice University.

TODS is featured for:

  • Full Stack Machine Learning System, which supports exhaustive components from preprocessing and feature extraction to detection algorithms, as well as a human-in-the-loop interface.

  • Wide Range of Algorithms, including all of the point-wise detection algorithms supported by PyOD, state-of-the-art pattern-wise (collective) detection algorithms such as DeepLog and Telemanom, and various ensemble algorithms for performing system-wise detection.

  • Automated Machine Learning, which aims to provide a knowledge-free process that constructs an optimal pipeline for the given data by automatically searching for the best combination among all of the existing modules.

Examples and Tutorials

Resources

Cite this Work:

If you find this work useful, you may cite it as:

@article{Lai_Zha_Wang_Xu_Zhao_Kumar_Chen_Zumkhawaka_Wan_Martinez_Hu_2021, 
	title={TODS: An Automated Time Series Outlier Detection System}, 
	volume={35}, 
	number={18}, 
	journal={Proceedings of the AAAI Conference on Artificial Intelligence}, 
	author={Lai, Kwei-Herng and Zha, Daochen and Wang, Guanchu and Xu, Junjie and Zhao, Yue and Kumar, Devesh and Chen, Yile and Zumkhawaka, Purav and Wan, Minyang and Martinez, Diego and Hu, Xia}, 
	year={2021}, month={May}, 
	pages={16060-16062} 
}

Installation

This package works with Python 3.7+ and pip 19+. You need to have the following packages installed on the system (for Debian/Ubuntu):

sudo apt-get install libssl-dev libcurl4-openssl-dev libyaml-dev build-essential libopenblas-dev libcap-dev ffmpeg

Clone the repository (if you are in China and GitHub is slow, you can use the mirror on Gitee):

git clone https://github.com/datamllab/tods.git

Install locally with pip:

cd tods
pip install -e .

Examples

Examples are available in /examples. For basic usage, you can evaluate a pipeline on a given dataset. Here, we provide an example that loads our default pipeline and evaluates it on a subset of the Yahoo dataset.

import pandas as pd

from tods import schemas as schemas_utils
from tods import generate_dataset, evaluate_pipeline

table_path = 'datasets/anomaly/raw_data/yahoo_sub_5.csv'
target_index = 6 # what column is the target
metric = 'F1_MACRO' # F1 on both label 0 and 1

# Read data and generate dataset
df = pd.read_csv(table_path)
dataset = generate_dataset(df, target_index)

# Load the default pipeline
pipeline = schemas_utils.load_default_pipeline()

# Run the pipeline
pipeline_result = evaluate_pipeline(dataset, pipeline, metric)
print(pipeline_result)

We also provide AutoML support to help you automatically find a good pipeline for your data.

import pandas as pd

from axolotl.backend.simple import SimpleRunner

from tods import generate_dataset, generate_problem
from tods.searcher import BruteForceSearch

# Some information
table_path = 'datasets/yahoo_sub_5.csv'
target_index = 6 # what column is the target
time_limit = 30 # how many seconds to search
metric = 'F1_MACRO' # F1 on both label 0 and 1

# Read data and generate dataset and problem
df = pd.read_csv(table_path)
dataset = generate_dataset(df, target_index=target_index)
problem_description = generate_problem(dataset, metric)

# Start backend
backend = SimpleRunner(random_seed=0)

# Start search algorithm
search = BruteForceSearch(problem_description=problem_description,
                          backend=backend)

# Find the best pipeline
best_runtime, best_pipeline_result = search.search_fit(input_data=[dataset], time_limit=time_limit)
best_pipeline = best_runtime.pipeline
best_output = best_pipeline_result.output

# Evaluate the best pipeline
best_scores = search.evaluate(best_pipeline).scores

Acknowledgement

We gratefully acknowledge the Data Driven Discovery of Models (D3M) program of the Defense Advanced Research Projects Agency (DARPA)

tods's People

Contributors

daochenzha, haifeng-jin, hwy893747147, jjjzy, junjie-xu, lhenry15, lsc2204, mia1996, purav-zumkhawala, qazimbhat1, thuzxj, yileallenchen1, yzhao062


tods's Issues

AttributeError: TODS_PRIMITIVE

Traceback (most recent call last):
  File "<ipython-input>", line 3, in <module>
    from tods import schemas as schemas_utils
  File "c:\Users\14496\anaconda3\envs\anomaly\lib\site-packages\tods\__init__.py", line 2, in <module>
    from tods.data_processing import *
  File "c:\Users\14496\anaconda3\envs\anomaly\lib\site-packages\tods\data_processing\__init__.py", line 1, in <module>
    from tods.data_processing.CategoricalToBinary import CategoricalToBinaryPrimitive
  File "c:\Users\14496\anaconda3\envs\anomaly\lib\site-packages\tods\data_processing\CategoricalToBinary.py", line 119, in <module>
    class CategoricalToBinaryPrimitive(transformer.TransformerPrimitiveBase[Inputs, Outputs, Hyperparams]):
  File "c:\Users\14496\anaconda3\envs\anomaly\lib\site-packages\tods\data_processing\CategoricalToBinary.py", line 159, in CategoricalToBinaryPrimitive
    metadata_base.PrimitiveAlgorithmType.TODS_PRIMITIVE,
  File "c:\Users\14496\anaconda3\envs\anomaly\lib\enum.py", line 326, in __getattr__
    raise AttributeError(name) from None
AttributeError: TODS_PRIMITIVE

How to solve this bug?

Import of Certain Modules Not Working

I believe that I properly installed tods into my virtual environment. However, for many of the packages in the different modules, I am getting AttributeErrors. My Python environment is 3.8.10.

Ex: from tods.detection_algorithm import PyodHBOS

Provides the following error:

File "c:/Users/shankars/path/Code Files/TODS.py", line 8, in <module>
    from tods.detection_algorithm import PyodHBOS
  File "C:\Users\shankars\path\.py38env\lib\site-packages\tods\detection_algorithm\PyodHBOS.py", line 75, in <module>
    class HBOSPrimitive(UnsupervisedOutlierDetectorBase[Inputs, Outputs, Params, Hyperparams]):
  File "C:\Users\shankars\path\.py38env\lib\site-packages\tods\detection_algorithm\PyodHBOS.py", line 131, in HBOSPrimitive
    "algorithm_types": [metadata_base.PrimitiveAlgorithmType.HISTOGRAM_BASED_OUTLIER_DETECTION],
  File "C:\Users\shankars\AppData\Local\Programs\Python\Python38\lib\enum.py", line 384, in __getattr__
    raise AttributeError(name) from None
AttributeError: HISTOGRAM_BASED_OUTLIER_DETECTION

Apologies if this is a trivial question, but I have not been able to figure out why this would not be working. Any feedback or resolution to this issue would be appreciated.

Error installing TODS on Apple M1 Pro

I got this error installing on Apple M1 Pro. Suggestions greatly appreciated. I am closer to being a novice Python programmer.

        File "numpy/core/setup.py", line 661, in get_mathlib_info
          raise RuntimeError("Broken toolchain: cannot link a simple C program")
      RuntimeError: Broken toolchain: cannot link a simple C program
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.


Default pipeline & Other - Running into same issue as issue #31

@lhenry15

  • Running run_pipeline.py in examples with default_pipeline.json (corresponding to load_default_pipeline()) gives the error below

same issue as in
#31
#21 (comment)

Not all provided hyper-parameters for the data preparation pipeline 79ce71bd-db96-494b-a455-14f2e2ac5040 were used: ['method', 'number_of_folds', 'randomSeed', 'shuffle', 'stratified']
{'error': "[StepFailedError('Step 6 for pipeline "
"384bbfab-4f6d-4001-9f90-684ea5681f5d failed.',)]",
'method_called': 'evaluate',
'pipeline': '<d3m.metadata.pipeline.Pipeline object at 0x7f30a2d624a8>',
'status': 'ERRORED'}

  • But if I create a new pipeline (for example using build_LODA_pipline.py in examples) and substitute it in line#18 in run_pipeline.py it works fine.


  • If I run test.sh, it fails with the same error.

  • With Telemanom:

Running buildTelemanom.py and using the resulting pipeline in run_pipelines.py gives a similar error.

what might be the issue in all these cases?

Installation errors

I installed the tods package on Python 3.7.17 with pip 20.0.2, but got the following error in my Ubuntu terminal for some packages that were not compatible. I followed the exact installation instructions, so I am not sure why this is occurring.


ModuleNotFoundError: No module named 'tods.sk_interface'

I believe that I did a successful install of tods.

Giving the command:

from tods import generate_dataset, generate_problem

yields no error.

but, I get an error with

from tods.sk_interface.detection_algorithm.MatrixProfile_skinterface import MatrixProfileSKI
Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/Applications/PyCharm CE.app/Contents/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'tods.sk_interface'

I am sorry for asking such a rookie question. I suspect that the resolution to this is probably pretty easy.

Pipeline cannot handle special letters or symbols.

While working with a dataset in .csv format, I realized the pipeline cannot run if the data contains special letters or symbols such as "/".

For example, under a column named "Dates", a value like "1/9/12" will stop the pipeline from running, whether I use the searcher or build a pipeline manually.
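A common workaround (not an official TODS fix) is to convert such string columns to numeric values before calling generate_dataset, so no "/" characters reach the pipeline. The column name "Dates" and the %m/%d/%y format below are assumptions based on this issue:

```python
import pandas as pd

# Hypothetical toy frame standing in for the .csv from the issue.
df = pd.DataFrame({"Dates": ["1/9/12", "1/10/12", "1/11/12"],
                   "value": [1.0, 2.0, 50.0]})

# Convert the string dates to numeric epoch seconds so that the column
# handed to generate_dataset contains no special characters.
df["Dates"] = pd.to_datetime(df["Dates"], format="%m/%d/%y").map(pd.Timestamp.timestamp)
print(df["Dates"].tolist())
```

After this conversion the column is a plain float column, which the pipeline can consume like any other feature.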

git.exc.InvalidGitRepositoryError: E:\Anaconda3\envs\newtods\lib\site-packages\tods-0.0.2-py3.6.egg\tods\detection_algorithm

Thank you for sharing this work. I ran into the following problem while reproducing it. Is there a solution?

File "D:/pythonproject/newtods/examples/sk_examples/DeepLog_test.py", line 2, in <module>
    from tods.sk_interface.detection_algorithm.DeepLog_skinterface import DeepLogSKI
  File "E:\Anaconda3\envs\newtods\lib\site-packages\tods-0.0.2-py3.6.egg\tods\__init__.py", line 5, in <module>
    from tods.detection_algorithm import *
  File "E:\Anaconda3\envs\newtods\lib\site-packages\tods-0.0.2-py3.6.egg\tods\detection_algorithm\__init__.py", line 17, in <module>
    from tods.detection_algorithm.PyodMoGaal import Mo_GaalPrimitive
  File "E:\Anaconda3\envs\newtods\lib\site-packages\tods-0.0.2-py3.6.egg\tods\detection_algorithm\PyodMoGaal.py", line 124, in <module>
    class Mo_GaalPrimitive(UnsupervisedOutlierDetectorBase[Inputs, Outputs, Params, Hyperparams]):
  File "E:\Anaconda3\envs\newtods\lib\site-packages\tods-0.0.2-py3.6.egg\tods\detection_algorithm\PyodMoGaal.py", line 187, in Mo_GaalPrimitive
    git_commit=d3m_utils.current_git_commit(os.path.dirname(__file__)),
  File "E:\Anaconda3\envs\newtods\lib\site-packages\tamu_d3m-2021.2.12-py3.6.egg\d3m\utils.py", line 87, in current_git_commit
    repo = git.Repo(path=path, search_parent_directories=search_parent_directories)
  File "E:\Anaconda3\envs\newtods\lib\site-packages\git\repo\base.py", line 181, in __init__
    raise InvalidGitRepositoryError(epath)
git.exc.InvalidGitRepositoryError: E:\Anaconda3\envs\newtods\lib\site-packages\tods-0.0.2-py3.6.egg\tods\detection_algorithm

Process finished with exit code 1

UI

The documentation says TODS supports a UI. How do I launch the UI?

ModuleNotFoundError: No module named 'axolotl'

When running generate_dataset from the given example, the axolotl module is not found.
Running pip install -e . led to no errors. However, as suggested in another issue post, running python setup.py install also fails (screenshot omitted).

AttributeError: ISOLATION_FOREST

When I run LSTMOD_test.py:

Traceback (most recent call last):
  File "LSTMOD_test.py", line 2, in <module>
    from tods.sk_interface.detection_algorithm.LSTMODetector_skinterface import LSTMODetectorSKI
  File "/home/tods/tods/__init__.py", line 5, in <module>
    from tods.detection_algorithm import *
  File "/home/tods/tods/detection_algorithm/__init__.py", line 5, in <module>
    from tods.detection_algorithm.LSTMODetect import LSTMODetectorPrimitive
  File "/home/tods/tods/detection_algorithm/LSTMODetect.py", line 164, in <module>
    class LSTMODetectorPrimitive(UnsupervisedOutlierDetectorBase[Inputs, Outputs, Params, Hyperparams]):
  File "/home/tods/tods/detection_algorithm/LSTMODetect.py", line 199, in LSTMODetectorPrimitive
    "algorithm_types": [metadata_base.PrimitiveAlgorithmType.ISOLATION_FOREST, ],  # up to update
  File "/home/anaconda3/envs/python36/lib/python3.6/enum.py", line 326, in __getattr__
    raise AttributeError(name) from None
AttributeError: ISOLATION_FOREST

Issues while creating timeseries preprocessing pipeline

Hi @yzhao062,

Thanks for creating such a wonderful library for experimentation and we're also using it for our use case.

We're unable to create a pipeline / AutoML pipeline for time-series-based data. Could you please share some references or examples that perform all the steps in the pipeline? Based on your demo pipeline notebook, we followed similar steps in creating a pipeline, but we were not able to add time-series-based processing. Kindly give us your inputs on the same.


Thanks,
Manoj SH

AssertionError: d3m.primitive_interfaces.base.CallResult[d3m.container.list.List]

When I run example1.py:

import pandas as pd

from tods import schemas as schemas_utils
from tods.utils import generate_dataset, generate_problem, evaluate_pipeline

table_path = 'datasets/yahoo_sub_5.csv'
target_index = 6 # what column is the target
metric = 'F1_MACRO' # F1 on both label 0 and 1

# Read data and generate dataset and problem
df = pd.read_csv(table_path)
dataset = generate_dataset(df, target_index)
problem_description = generate_problem(dataset, metric)

# Load the default pipeline
pipeline = schemas_utils.load_default_pipeline()

# Run the pipeline
pipeline_result = evaluate_pipeline(dataset, pipeline, metric)

The result:

ssh://****/anaconda3/envs/tods/bin/python -u /home/****/tods/examples/example1.py
Not all provided hyper-parameters for the data preparation pipeline 79ce71bd-db96-494b-a455-14f2e2ac5040 were used: ['method', 'number_of_folds', 'randomSeed', 'shuffle', 'stratified']
Traceback (most recent call last):
  File "/home/****/tods/examples/example1.py", line 19, in <module>
    pipeline_result = evaluate_pipeline(dataset, pipeline, metric)
  File "/home/****/tods/tods/utils.py", line 86, in evaluate_pipeline
    data_preparation_params=data_preparation_params)
  File "/home/****/tods/src/axolotl/axolotl/backend/base.py", line 263, in evaluate_pipeline
    data_preparation_params=data_preparation_params, scoring_params=scoring_params, timeout=timeout
  File "/home/****/tods/src/axolotl/axolotl/backend/simple.py", line 166, in evaluate_pipeline_request
    volumes_dir=self.volumes_dir, scratch_dir=self.scratch_dir, runtime_environment=self.runtime_environment
  File "/home/****/tods/src/d3m/d3m/runtime.py", line 1530, in evaluate
    scratch_dir=scratch_dir, runtime_environment=runtime_environment,
  File "/home/****/tods/src/d3m/d3m/runtime.py", line 1488, in prepare_data
    scratch_dir=scratch_dir, environment=runtime_environment,
  File "/home/****/tods/src/d3m/d3m/runtime.py", line 236, in __init__
    self.pipeline.check(allow_placeholders=False, standard_pipeline=self.is_standard_pipeline)
  File "/home/****/tods/src/d3m/d3m/metadata/pipeline.py", line 1496, in check
    self._check(allow_placeholders, standard_pipeline, input_types)
  File "/home/****/tods/src/d3m/d3m/metadata/pipeline.py", line 1766, in _check
    assert issubclass(produce_type, base.CallResult), produce_type
AssertionError: d3m.primitive_interfaces.base.CallResult[d3m.container.list.List]

Process finished with exit code 1

AttributeError: TODS_PRIMITIVE

When I run

from tods.sk_interface.detection_algorithm.IsolationForest_skinterface import IsolationForestSKI

the traceback ends with:
d:\software\Anaconda3\envs\tods_env\lib\site-packages\tods\detection_algorithm\AutoRegODetect.py in AutoRegODetectorPrimitive()
    139         },
    140         "algorithm_types": [
--> 141             metadata_base.PrimitiveAlgorithmType.TODS_PRIMITIVE,
    142         ],
    143         "primitive_family": metadata_base.PrimitiveFamily.ANOMALY_DETECTION,

d:\software\Anaconda3\envs\tods_env\lib\enum.py in __getattr__(cls, name)
    352             return cls._member_map_[name]
    353         except KeyError:
--> 354             raise AttributeError(name) from None
    355
    356     def __getitem__(cls, name):

AttributeError: TODS_PRIMITIVE

Process images.

In the case of temporal image series, can TODS process them? And if yes, how?

regex._regex_core.error: bad escape \d at position 7

When I run the following code from the README.md:

import pandas as pd
from tods import schemas as schemas_utils
from tods import generate_dataset, evaluate_pipeline

table_path = 'datasets/anomaly/raw_data/yahoo_sub_5.csv'
target_index = 6 # what column is the target
metric = 'F1_MACRO' # F1 on both label 0 and 1

# Read data and generate dataset
df = pd.read_csv(table_path)
dataset = generate_dataset(df, target_index)

# Load the default pipeline
pipeline = schemas_utils.load_default_pipeline()

# Run the pipeline
pipeline_result = evaluate_pipeline(dataset, pipeline, metric)
print(pipeline_result)

I got an error saying: regex._regex_core.error: bad escape \d at position 7

Can anyone tell me how this happens?
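The error originates inside TODS's dependencies rather than in the snippet above, but the error class is easy to reproduce with the standard library: since Python 3.7, an unrecognized escape such as \d in a substitution *replacement* template (where it has no meaning, unlike in a pattern) raises a "bad escape" error. A minimal illustration, not TODS code:

```python
import re

def bad_escape_message():
    # \d is a valid escape in a regex *pattern*, but not in a *replacement*
    # template; since Python 3.7 this raises re.error("bad escape ...").
    try:
        re.sub("a", "\\d", "a")
    except re.error as exc:
        return str(exc)
    return None

print(bad_escape_message())
```

The fix belongs in the dependency (or in pinning dependency versions to match your d3m release); the snippet only shows where "bad escape" errors come from.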

ImportError: pycurl: libcurl link-time ssl backend (none/other) is different from compile-time ssl backend (openssl)

I get this error when importing axolotl:

from axolotl.utils import data_problem
Traceback (most recent call last):
File "", line 1, in
File "/Users/didi/Documents/hegsns/hegsns/TODS/tods_Guanchu/tods/src/axolotl/axolotl/utils/data_problem.py", line 9, in
from axolotl.utils.schemas import PROBLEM_DEFINITION
File "/Users/didi/Documents/hegsns/hegsns/TODS/tods_Guanchu/tods/src/axolotl/axolotl/utils/schemas.py", line 12, in
from d3m.metadata.pipeline import Pipeline
File "/Users/didi/Documents/hegsns/hegsns/TODS/tods_Guanchu/tods/src/d3m/d3m/metadata/pipeline.py", line 18, in
from d3m import container, deprecate, environment_variables, exceptions, index, utils
File "/Users/didi/Documents/hegsns/hegsns/TODS/tods_Guanchu/tods/src/d3m/d3m/index.py", line 21, in
import pycurl # type: ignore
ImportError: pycurl: libcurl link-time ssl backend (none/other) is different from compile-time ssl backend (openssl)

Reinstalling pycurl does not solve it.

Not all provided hyper-parameters for the data preparation pipeline

When I run the default pipeline example of tods, I get the error below, even though d3m.__version__ confirms I am using the correct d3m version (2020.05.18). I am using Python 3.6 and a venv virtual environment with pip 19.3.1.

Not all provided hyper-parameters for the data preparation pipeline 79ce71bd-db96-494b-a455-14f2e2ac5040 were used: ['method', 'number_of_folds', 'randomSeed', 'shuffle', 'stratified']
{'error': "[StepFailedError('Step 6 for pipeline "
          "384bbfab-4f6d-4001-9f90-684ea5681f5d failed.',)]",
 'method_called': 'evaluate',
 'pipeline': '<d3m.metadata.pipeline.Pipeline object at 0x7fdde588c2e8>',
 'status': 'ERRORED'}

Any suggestions, please?

python3.7 AttributeError: module 'typing' has no attribute 'GenericMeta'

Could your team help this project support Python 3.7? In Python 3.7, typing removed GenericMeta.

File "/usr/local/lib/python3.7/site-packages/d3m/container/__init__.py", line 5, in <module>
    from .dataset import *
  File "/usr/local/lib/python3.7/site-packages/d3m/container/dataset.py", line 32, in <module>
    from . import pandas as container_pandas
  File "/usr/local/lib/python3.7/site-packages/d3m/container/pandas.py", line 10, in <module>
    from . import list as container_list
  File "/usr/local/lib/python3.7/site-packages/d3m/container/list.py", line 8, in <module>
    from d3m.metadata import base as metadata_base
  File "/usr/local/lib/python3.7/site-packages/d3m/metadata/base.py", line 25, in <module>
    from . import hyperparams as hyperparams_module, primitive_names
  File "/usr/local/lib/python3.7/site-packages/d3m/metadata/hyperparams.py", line 24, in <module>
    from d3m import deprecate, exceptions, utils
  File "/usr/local/lib/python3.7/site-packages/d3m/utils.py", line 552, in <module>
    class GenericMetaclass(typing.GenericMeta, Metaclass):
AttributeError: module 'typing' has no attribute 'GenericMeta'
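This is indeed a Python version incompatibility rather than an installation problem: typing.GenericMeta existed in Python 3.6 but was removed in 3.7, so a d3m release that references it can only import on 3.6. A quick check (a diagnostic, not a fix):

```python
import sys
import typing

# typing.GenericMeta was removed in Python 3.7; any library that still
# references it will fail to import on 3.7 and later.
has_generic_meta = hasattr(typing, "GenericMeta")
print(sys.version_info[:2], has_generic_meta)
```

Until the d3m dependency is updated, running TODS under Python 3.6 (for the d3m version pinned here) is the practical workaround.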

Direction to install tods in China.

Downloading dependencies from pypi.org can be very slow in China. You can download the dependencies from a pypi.org mirror site by following these steps:

  1. Clone this repo to your local directory.

  2. Download this tods_install.txt, and put it in the directory tods (not tods/tods):
    tods_install.txt

  3. Rename it to tods_install.sh.

  4. Go to the directory tods (not tods/tods), and run sh tods_install.sh

  5. Run python setup.py install

  6. Run pip install -e .

ModuleNotFoundError: No module named 'axolotl.utils'

After installing tods with "pip install tods", I tried to run the example. However, it raises "ModuleNotFoundError: No module named 'axolotl.utils'" even after I installed axolotl with "pip install axolotl". After checking its source code myself, it seems the axolotl package on PyPI no longer has a utils module. Thanks!

Cannot find 'axolotl' python package

I successfully installed tods using a simple pip install tods; however, I cannot run the example code because I do not have the axolotl package installed. axolotl is not available through pip nor conda. Where can I find the package and how should I install it?

PyOD for point-wise detection

Hi

I'm really interested in using TODS for detecting outliers in multivariate timeseries data. However, I'm missing something. According to the official TODS's documentation:

"Wide-range of Algorithms, including all of the point-wise detection algorithms supported by PyOD, state-of-the-art pattern-wise (collective) detection algorithms such as DeepLog, Telemanon, and also various ensemble algorithms for performing system-wise detection."

So, TODS currently uses PyOD to perform point-wise detection on time series data. However, as indicated here, PyOD doesn't handle time series data directly. So, my question is: how does TODS adapt PyOD to perform point-wise detection on time series data correctly?

Best regards
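TODS's actual adaptation lives in its primitive wrappers, but the general idea is standard: embed each time point's trailing window as a feature vector, score the vectors with a static (PyOD-style) detector, and map the scores back to time points. A conceptual sketch in plain NumPy, not TODS or PyOD code; the distance-to-median scorer below merely stands in for a real PyOD detector:

```python
import numpy as np

def pointwise_scores(series, window=5):
    """Score each time point by embedding its trailing window as a feature
    vector and measuring its distance to the median window -- a stand-in
    for fitting a static outlier detector on the window matrix."""
    x = np.asarray(series, dtype=float)
    # Row i holds the window x[i:i+window]; one feature vector per window.
    W = np.lib.stride_tricks.sliding_window_view(x, window)
    ref = np.median(W, axis=0)              # the "typical" window
    dist = np.linalg.norm(W - ref, axis=1)  # outlyingness of each window
    scores = np.zeros_like(x)
    scores[window - 1:] = dist              # assign score to the window's end point
    return scores

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 200)) + 0.05 * rng.standard_normal(200)
series[50] += 10.0                          # inject a point outlier
scores = pointwise_scores(series)
print(int(np.argmax(scores)))
```

Any PyOD detector could replace the distance-to-median step: fit it on the window matrix W and read its per-sample outlier scores (PyOD exposes these as decision_scores_ after fit).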

AttributeError: TODS_PRIMITIVE

I encountered the following error when importing tods

Traceback (most recent call last):
  File "hello.py", line 2, in <module>
    import tods
  File "/home/miniconda3/envs/py36/lib/python3.6/site-packages/tods-0.0.2-py3.6.egg/tods/__init__.py", line 2, in <module>
    from tods.data_processing import *
  File "/home/miniconda3/envs/py36/lib/python3.6/site-packages/tods-0.0.2-py3.6.egg/tods/data_processing/__init__.py", line 1, in <module>
    from tods.data_processing.CategoricalToBinary import CategoricalToBinaryPrimitive
  File "/home/miniconda3/envs/py36/lib/python3.6/site-packages/tods-0.0.2-py3.6.egg/tods/data_processing/CategoricalToBinary.py", line 119, in <module>
    class CategoricalToBinaryPrimitive(transformer.TransformerPrimitiveBase[Inputs, Outputs, Hyperparams]):
  File "/home/miniconda3/envs/py36/lib/python3.6/site-packages/tods-0.0.2-py3.6.egg/tods/data_processing/CategoricalToBinary.py", line 159, in CategoricalToBinaryPrimitive
    metadata_base.PrimitiveAlgorithmType.TODS_PRIMITIVE,
  File "/home/miniconda3/envs/py36/lib/python3.6/enum.py", line 324, in __getattr__
    raise AttributeError(name) from None
AttributeError: TODS_PRIMITIVE

git clone error: invalid path 'tods/tests/sk_interface/detection_algorithm/\'

Hi there,

I encountered an issue when running git clone.

(tods) PS C:\git> git clone https://github.com/datamllab/tods.git
Cloning into 'tods'...
remote: Enumerating objects: 7680, done.
remote: Counting objects: 100% (4359/4359), done.
remote: Compressing objects: 100% (2772/2772), done.
remote: Total 7680 (delta 1857), reused 3919 (delta 1479), pack-reused 3321
Receiving objects: 100% (7680/7680), 14.83 MiB | 2.13 MiB/s, done.
Resolving deltas: 100% (3855/3855), done.
error: invalid path 'tods/tests/sk_interface/detection_algorithm/\'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
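In case it helps others hitting this on Windows: the checkout fails because the repository contains a file name with a trailing backslash, which NTFS cannot store. An untested workaround is to relax Git's NTFS path protection for the clone; cloning under Linux/macOS or WSL avoids the problem entirely.

```
# Untested workaround: core.protectNTFS=false lets Git attempt the checkout
# of paths that are invalid on NTFS. Use with care, or clone under WSL.
git clone -c core.protectNTFS=false https://github.com/datamllab/tods.git
```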

Construct prediction in dev branch

Construct prediction in the dev branch has an unknown issue: it only works when the d3m common-primitives package is installed (for both the tods version and the d3m version of python_path); otherwise it produces an empty prediction.

ModuleNotFoundError: No module named 'axolotl.backend'

When I try to run tods-master\examples\run_search.py, this error occurs.
Even though I installed axolotl==0.0.1 via pip, axolotl.backend cannot be found.
Could you please help me solve this problem?
Thanks!

d3m.primitives.tods.timeseries_processing.transformation

As of a week or so ago, I was able to run everything fine, including the AutoML example with my dataset. However, I believe some updates were made since then, and now I am getting this error even though I changed nothing in the code. Any insight into this is appreciated.

Thank you.

[Screenshot of the error, 2022-04-19]

Error 1: print("Prediction Score\n", prediction_score) prints nothing; Error 2: d3m.primitives.tods.detection_algorithm.LSTMODetector: Primitive is not providing a description through its docstring

Thank you for sharing this project; I think it is very valuable. I can run it successfully now, but a few details remain, mainly the following.
Question a
When running the example
"tods/examples/sk_examples/DeepLog_test.py"
with print("Prediction Score\n", prediction_score),
my output is
"Prediction Labels
[[0]
[0]
[0]
...
[0]
[0]
[0]]
Prediction Score"
whereas the expected result, per "tods/examples/Demo Notebook/TODS Official Demo Notebook.ipynb", is
" [[0. ]
[0.3569443 ]
[0.3569443 ]
...
[0.77054234]
[0.4575615 ]
[0.17499346]]"
Question b
When running the example
"tods/examples/sk_examples/DeepLog_test.py",
my output is
"d3m.primitives.tods.detection_algorithm.LSTMODetector: Primitive is not providing a description through its docstring."
whereas the expected result, per "tods/examples/Demo Notebook/TODS Official Demo Notebook.ipynb", is
"Primitive: d3m.primitives.tods.detection_algorithm.telemanom(hyperparams=Hyperparams({'contamination': 0.1, 'window_size': 1, 'step_size': 1, 'return_subseq_inds': False, 'use_columns': (), 'exclude_columns': (), 'return_result': 'new', 'use_semantic_types': False, 'add_index_columns': False, 'error_on_no_input': True, 'return_semantic_type': 'https://metadata.datadrivendiscovery.org/types/Attribute', 'smoothing_perc': 0.05, 'window_size_': 100, 'error_buffer': 50, 'batch_size': 70, 'dropout': 0.3, 'validation_split': 0.2, 'optimizer': 'Adam', 'lstm_batch_size': 64, 'loss_metric': 'mean_squared_error', 'layers': [10, 10], 'epochs': 1, 'patience': 10, 'min_delta': 0.0003, 'l_s': 2, 'n_predictions': 1, 'p': 0.05}), random_seed=0)"
Thank you very much for your help.

Run demo Error: ImportError: cannot import name 'get_config'

My environment:
python 3.6.9
pip 21.3
a fresh, clean virtualenv

I installed tods successfully, but running the demo runs into trouble; details follow:

$ python test.py
/home/uba/ML_env/tods/env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.preprocessing.data module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.preprocessing. Anything that cannot be imported from sklearn.preprocessing is now part of the private API.
  warnings.warn(message, FutureWarning)
/home/uba/ML_env/tods/env/lib/python3.6/site-packages/sklearn/utils/deprecation.py:144: FutureWarning: The sklearn.decomposition.truncated_svd module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.decomposition. Anything that cannot be imported from sklearn.decomposition is now part of the private API.
  warnings.warn(message, FutureWarning)
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    from tods import schemas as schemas_utils
  File "/home/uba/ML_env/tods/tods/__init__.py", line 5, in <module>
    from tods.detection_algorithm import *
  File "/home/uba/ML_env/tods/tods/detection_algorithm/__init__.py", line 2, in <module>
    from tods.detection_algorithm.DeepLog import DeepLogPrimitive
  File "/home/uba/ML_env/tods/tods/detection_algorithm/DeepLog.py", line 10, in <module>
    from keras.models import Sequential
  File "/home/uba/ML_env/tods/env/lib/python3.6/site-packages/keras/__init__.py", line 25, in <module>
    from keras import models
  File "/home/uba/ML_env/tods/env/lib/python3.6/site-packages/keras/models.py", line 19, in <module>
    from keras import backend
  File "/home/uba/ML_env/tods/env/lib/python3.6/site-packages/keras/backend.py", line 36, in <module>
    from tensorflow.python.eager.context import get_config
ImportError: cannot import name 'get_config'

I found a possible solution, but I have not tested it:

Replace every "import keras" with "from tensorflow import keras",
which is really a lot of work...
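If anyone wants to try that replacement mechanically rather than by hand, here is a sketch of the kind of sed rewrite involved (untested against the full tods tree; pinning compatible tensorflow/keras versions is likely the cleaner fix). It is demonstrated on a scratch copy rather than the real source:

```shell
# Back up before running anything like this on tods/ itself. Assumes GNU sed.
mkdir -p /tmp/keras_fix_demo && cd /tmp/keras_fix_demo
printf 'import keras\nfrom keras.models import Sequential\n' > DeepLog.py

# The rewrite itself: route every keras import through tensorflow.keras.
find . -name '*.py' -print0 | xargs -0 sed -i \
    -e 's/^import keras$/from tensorflow import keras/' \
    -e 's/^from keras\./from tensorflow.keras./' \
    -e 's/^from keras import /from tensorflow.keras import /'

cat DeepLog.py  # both imports now go through tensorflow
```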

Error when Running Pipeline using Autoregression

I created my own pipeline using autoregression in test.py and then ran the pipeline (as a JSON file) in pipeline.py. When I run the line print(pipeline_result), where pipeline_result = evaluate_pipeline(dataset, pipeline, args.metric), I get the following error:

[Screenshot of the error]

It says that Step 0 of the pipeline failed for the method_called: evaluated. I am not sure why this is occurring; I can't see why the pipeline would have an issue with step 0.

I am running this on Python 3.6.15 with tods 0.0.2.

Cannot print pipeline output

I ran an autoregression pipeline on some data and was able to obtain the pipeline metrics/scores with print(pipeline_result.scores). I wanted to see the actual output or anomaly values from the pipeline, so I wrote print(pipeline_result.output), but that prints None. I am not sure why the output will not print in this case; any advice on how to print it would be great.
