yzhao062 / suod

(MLSys '21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)

Home Page: https://www.andrew.cmu.edu/user/yuezhao2/papers/20-preprint-suod.pdf

License: BSD 2-Clause "Simplified" License

Python 100.00%
data-mining machine-learning anomaly-detection outlier-detection distributed-systems knowledge-distillation python machine-learning-algorithms machine-learning-library

suod's Issues

np.int deprecation

suod/models/parallel_processes.py:119: DeprecationWarning:

np.int is a deprecated alias for the builtin int. To silence this warning, use int by itself.
Doing this will not modify any behavior and is safe.
When replacing np.int, you may wish to use e.g. np.int64 or np.int32 to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype=np.int)
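
The warning points at a `dtype=np.int` argument. The alias was later removed entirely (NumPy 1.24), so on newer installs the same line fails with an AttributeError instead of warning. Here is a minimal sketch of the replacement the message recommends, using a made-up array rather than the actual code in parallel_processes.py (the variable names are illustrative assumptions):

```python
import numpy as np

# Before (deprecated since NumPy 1.20):
#   assignments = np.zeros(5, dtype=np.int)

# After: the builtin int selects the same default integer type.
assignments = np.zeros(5, dtype=int)

# If a fixed width matters, name it explicitly.
assignments64 = np.zeros(5, dtype=np.int64)

print(assignments.dtype, assignments64.dtype)
```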

many issues while running demo code

First, many thanks for your work.
When I run your API demo code (below), I get an IndexError:

base_estimators = [
    LOF(n_neighbors=5, contamination=contamination),
    LOF(n_neighbors=15, contamination=contamination),
    LOF(n_neighbors=25, contamination=contamination),
    HBOS(contamination=contamination),
    PCA(contamination=contamination),
    OCSVM(contamination=contamination),
    KNN(n_neighbors=5, contamination=contamination),
    KNN(n_neighbors=15, contamination=contamination),
    KNN(n_neighbors=25, contamination=contamination)]

model = SUOD(base_estimators=base_estimators, n_jobs=-1,  # number of workers
             rp_flag_global=True,  # global flag for random projection
             bps_flag=True,  # global flag for balanced parallel scheduling
             approx_flag_global=True,  # global flag for model approximation
             contamination=contamination)

model.fit(X)  # fit all models with X

When I run my own demo code (below), I get a TypeError:

knn = KNN(n_jobs=-1)
pca = PCA()
copod = COPOD()
mcd = MCD()
ocsvm = OCSVM()
lmdd = LMDD()
lof = LOF()
base_estimators = [
    pca,
    copod,
    mcd,
    ocsvm,
    lmdd,
    lof,
    COF,
    CBLOF,
]

s_model = SUOD(
    base_estimators=base_estimators,
    n_jobs=-1,
    rp_flag_global=True,  # whether to enable random-projection dimensionality reduction globally
    bps_flag=False,  # whether to enable parallel optimization
    approx_flag_global=True,  # whether to replace the unsupervised models with pseudo-supervised approximators at prediction time
)
s_model.fit(X)

Looking forward to your reply. Many thanks.
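
One thing worth checking in the second snippet, offered as a guess since the screenshots are not available: `COF` and `CBLOF` appear in the list as bare names rather than instantiated detectors like the others. The sketch below uses stand-in classes (not the real PyOD detectors) and a hypothetical helper, `find_uninstantiated`, to show how such entries can be spotted before calling fit.

```python
import inspect


class PCA:  # stand-in for a detector class, not pyod.models.pca.PCA
    pass


class COF:  # stand-in for a detector class, not pyod.models.cof.COF
    pass


def find_uninstantiated(estimators):
    """Return the names of entries that are classes rather than instances."""
    return [est.__name__ for est in estimators if inspect.isclass(est)]


base_estimators = [PCA(), COF]  # COF is missing its parentheses
print(find_uninstantiated(base_estimators))  # ['COF']
```

If the helper returns any names, adding the parentheses (for example `COF()`) gives the ensemble an object it can fit.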

Incompatibility with sklearn (and PyOD)

SUOD does not work with PyOD. An error raised inside sklearn/base prevents SUOD().fit() from running.

I have created several conda environments to try to resolve the sklearn compatibility issue, with no luck. An environment with this issue can be created easily from a new env that specifies only PyOD and SUOD as dependencies. Here the versions of sklearn and the other dependencies are set by conda; I have also manually specified the versions listed in the SUOD and PyOD docs, to no avail. The .yml file I used in this example is below (note that my only hard requirement for this project is Python 3.11):

name: pyod_suod_env
channels:
  - conda-forge
dependencies:
  - python>=3.11
  - pyod
  - pip
  - pip:
    - suod

To reproduce the error, all you need to do is call the suod fit method. Code to reproduce:

# Import packages
from pyod.models.suod import SUOD
# from suod.models.base import SUOD
from pyod.utils.data import generate_data

# Generate data
contamination = 0.1 
n_train = 200 
n_test = 100 

X_train, X_test, y_train, y_test = generate_data(
    n_train=n_train, n_test=n_test, contamination=contamination)

# Fit SUOD
od = SUOD(
    n_jobs=2,
    combination='average',
    verbose=True,
)
od.fit(X_train)

Note that I have tried the code above with both pyod.models.suod.SUOD and suod.models.base.SUOD, with the same result.

The entire resulting error trace is as follows:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[10], line 1
----> 1 od.fit(X_train)
      2 train_pred = od.labels_
      3 train_scores = od.decision_scores_

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/pyod/models/suod.py:210, in SUOD.fit(self, X, y)
    207 self._set_n_classes(y)
    209 # fit the model and then approximate it
--> 210 self.model_.fit(X)
    211 self.model_.approximate(X)
    213 # get the decision scores from each base estimators

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/suod/models/base.py:308, in SUOD.fit(self, X)
    304 if self.bps_flag:
    305     # load the pre-trained cost predictor to forecast the train cost
    306     cost_predictor = load_predictor_train(self.cost_forecast_loc_fit)
--> 308     print(cost_predictor)
    309     time_cost_pred = cost_forecast_meta(cost_predictor, X,
    310                                         self.base_estimator_names)
    312     # use BPS

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/sklearn/base.py:315, in BaseEstimator.__repr__(self, N_CHAR_MAX)
    307 # use ellipsis for sequences with a lot of elements
    308 pp = _EstimatorPrettyPrinter(
    309     compact=True,
    310     indent=1,
    311     indent_at_name=True,
    312     n_max_elements_to_show=N_MAX_ELEMENTS_TO_SHOW,
    313 )
--> 315 repr_ = pp.pformat(self)
    317 # Use bruteforce ellipsis when there are a lot of non-blank characters
    318 n_nonblank = len("".join(repr_.split()))

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/pprint.py:158, in PrettyPrinter.pformat(self, object)
    156 def pformat(self, object):
    157     sio = _StringIO()
--> 158     self._format(object, sio, 0, 0, {}, 0)
    159     return sio.getvalue()

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/pprint.py:175, in PrettyPrinter._format(self, object, stream, indent, allowance, context, level)
    173     self._readable = False
    174     return
--> 175 rep = self._repr(object, context, level)
    176 max_width = self._width - indent - allowance
    177 if len(rep) > max_width:

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/pprint.py:455, in PrettyPrinter._repr(self, object, context, level)
    454 def _repr(self, object, context, level):
--> 455     repr, readable, recursive = self.format(object, context.copy(),
    456                                             self._depth, level)
    457     if not readable:
    458         self._readable = False

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/sklearn/utils/_pprint.py:189, in _EstimatorPrettyPrinter.format(self, object, context, maxlevels, level)
    188 def format(self, object, context, maxlevels, level):
--> 189     return _safe_repr(
    190         object, context, maxlevels, level, changed_only=self._changed_only
    191     )

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/sklearn/utils/_pprint.py:440, in _safe_repr(object, context, maxlevels, level, changed_only)
    438 recursive = False
    439 if changed_only:
--> 440     params = _changed_params(object)
    441 else:
    442     params = object.get_params(deep=False)

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/sklearn/utils/_pprint.py:93, in _changed_params(estimator)
     89 def _changed_params(estimator):
     90     """Return dict (param_name: value) of parameters that were given to
     91     estimator with non-default values."""
---> 93     params = estimator.get_params(deep=False)
     94     init_func = getattr(estimator.__init__, "deprecated_original", estimator.__init__)
     95     init_params = inspect.signature(init_func).parameters

File ~/miniforge3/envs/pyod_suod_env/lib/python3.11/site-packages/sklearn/base.py:244, in BaseEstimator.get_params(self, deep)
    242 out = dict()
    243 for key in self._get_param_names():
--> 244     value = getattr(self, key)
    245     if deep and hasattr(value, "get_params") and not isinstance(value, type):
    246         deep_items = value.get_params().items()

AttributeError: 'RandomForestRegressor' object has no attribute 'monotonic_cst'
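
The final frame fails in get_params on a RandomForestRegressor, and the suod frame shows that this object is the pre-trained cost predictor loaded under `if self.bps_flag:`. A likely explanation, offered as an assumption rather than a confirmed diagnosis, is that the predictor was saved with an older scikit-learn whose RandomForestRegressor had no `monotonic_cst` parameter. Unpickling restores the saved attributes without running `__init__`, so attributes introduced by a newer class version are simply missing. The toy classes below simulate that mechanism without scikit-learn; `ModelV1` and `ModelV2` are invented names.

```python
class ModelV1:
    """Stand-in for the class as it existed when the object was saved."""

    def __init__(self):
        self.n_estimators = 100


class ModelV2:
    """Stand-in for the newer class, which gained a constructor parameter."""

    def __init__(self, monotonic_cst=None):
        self.n_estimators = 100
        self.monotonic_cst = monotonic_cst


# State captured from an object created by the old version.
saved_state = dict(ModelV1().__dict__)

# Default unpickling creates the instance without calling __init__ and
# then restores the saved attributes; this mimics that with the new class.
restored = ModelV2.__new__(ModelV2)
restored.__dict__.update(saved_state)

try:
    restored.monotonic_cst
except AttributeError as exc:
    print("AttributeError:", exc)
```

If that is what is happening here, passing `bps_flag=False` should skip the branch that loads the predictor, judging only from the `if self.bps_flag:` line in the traceback, at the cost of losing balanced parallel scheduling.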

Clarify the use of SUOD for XGBOD

Hi,

I see from your paper:

demonstrate SUOD’s effectiveness as an end-to-end framework on more complex combination models like unsupervised LSCP [35] and XGBOD [34]

So I am wondering whether this is still in progress. Any suggestions on how I could implement this feature?
My initial guess is to compute TODS scores via SUOD, then build the XGBoost model separately, combining the scores with the original features. Any suggestions are very welcome!
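
The approach described in the last paragraph amounts to appending each detector's score as an extra column before training a supervised model. A library-free sketch of just that stacking step follows. The two toy scorers and the name `augment_with_scores` are hypothetical stand-ins; in practice the scores would come from the fitted detectors and the augmented matrix would be passed to XGBoost.

```python
from statistics import mean


def distance_from_mean(X):
    """Toy outlier score: L1 distance of each row from the column means."""
    centers = [mean(col) for col in zip(*X)]
    return [sum(abs(v - c) for v, c in zip(row, centers)) for row in X]


def max_abs_value(X):
    """Toy outlier score: largest absolute feature value in each row."""
    return [max(abs(v) for v in row) for row in X]


def augment_with_scores(X, scorers):
    """Append one score column per scorer to the original features."""
    score_columns = [scorer(X) for scorer in scorers]
    return [list(row) + [col[i] for col in score_columns]
            for i, row in enumerate(X)]


X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
X_aug = augment_with_scores(X, [distance_from_mean, max_abs_value])

for row in X_aug:
    print(row)
```

Each output row keeps its original features and gains one column per scorer, so the supervised model sees both the raw data and the unsupervised signals.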

Installation error in latest build

Hi,

I'm facing this error when trying to install the suod package, both through pip and from source. The detailed log is below:

`
ERROR: Command errored out with exit status 1:
command: /home/alan/anaconda3/envs/cuongpx/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-6j6iaixz/suod/setup.py'"'"'; __file__='"'"'/tmp/pip-install-6j6iaixz/suod/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-ms_fv72w/install-record.txt --single-version-externally-managed --compile --install-headers /home/alan/anaconda3/envs/cuongpx/include/python3.6m/suod
cwd: /tmp/pip-install-6j6iaixz/suod/
Complete output (63 lines):
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/suod
copying suod/version.py -> build/lib/suod
copying suod/__init__.py -> build/lib/suod
creating build/lib/suod/utils
copying suod/utils/__init__.py -> build/lib/suod/utils
copying suod/utils/utility.py -> build/lib/suod/utils
creating build/lib/suod/models
copying suod/models/jl_projection.py -> build/lib/suod/models
copying suod/models/__init__.py -> build/lib/suod/models
copying suod/models/parallel_processes.py -> build/lib/suod/models
copying suod/models/cost_predictor.py -> build/lib/suod/models
copying suod/models/base.py -> build/lib/suod/models
creating build/lib/suod/models/saved_models
copying suod/models/saved_models/__init__.py -> build/lib/suod/models/saved_models
running egg_info
writing suod.egg-info/PKG-INFO
writing dependency_links to suod.egg-info/dependency_links.txt
writing requirements to suod.egg-info/requires.txt
writing top-level names to suod.egg-info/top_level.txt
reading manifest file 'suod.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'examples'
no previously-included directories found matching 'notebooks'
no previously-included directories found matching 'suod/test'
warning: no files found matching 'suod/models'
writing manifest file 'suod.egg-info/SOURCES.txt'
copying suod/models/saved_models/bps_prediction.joblib -> build/lib/suod/models/saved_models
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-6j6iaixz/suod/setup.py", line 52, in <module>
'Programming Language :: Python :: 3.7',
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/site-packages/setuptools/__init__.py", line 145, in setup
return distutils.core.setup(**attrs)
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/dist.py", line 955, in run_commands
self.run_command(cmd)
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/site-packages/setuptools/command/install.py", line 61, in run
return orig.install.run(self)
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/command/install.py", line 545, in run
self.run_command('build')
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/distutils/dist.py", line 974, in run_command
cmd_obj.run()
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/site-packages/setuptools/command/build_py.py", line 53, in run
self.build_package_data()
File "/home/alan/anaconda3/envs/cuongpx/lib/python3.6/site-packages/setuptools/command/build_py.py", line 126, in build_package_data
srcfile in self.distribution.convert_2to3_doctests):
TypeError: argument of type 'NoneType' is not iterable

----------------------------------------

ERROR: Command errored out with exit status 1: /home/alan/anaconda3/envs/cuongpx/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-6j6iaixz/suod/setup.py'"'"'; __file__='"'"'/tmp/pip-install-6j6iaixz/suod/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-ms_fv72w/install-record.txt --single-version-externally-managed --compile --install-headers /home/alan/anaconda3/envs/cuongpx/include/python3.6m/suod Check the logs for full command output.`

It installs fine on Windows but fails on Linux 16.04.

OSError [Errno 30] Read-only file system

While attempting to run suod on a cloud-based data ops platform, I receive this error:

OSError: [Errno 30] Read-only file system: '/home/site/wwwroot/.python_packages/lib/site-packages/suod/models/saved_models/bps_train_curr.joblib'

The platform provides a /tmp directory to which files can be written. Where in the suod/pyod library can I specify an alternate save path?
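
As a general pattern for read-only deployments, and not a statement about which suod or pyod option controls this path, a caller can pick the first writable directory and fall back to the system temp directory. `pick_writable_dir` is a hypothetical helper; only the directory and the `bps_train_curr.joblib` file name come from the error above.

```python
import os
import tempfile


def pick_writable_dir(candidates):
    """Return the first existing, writable directory, else the temp dir."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return tempfile.gettempdir()


# The package directory from the error message is tried first; when it is
# missing or read-only, the helper falls through to the temp directory.
package_dir = (
    "/home/site/wwwroot/.python_packages/lib/site-packages/"
    "suod/models/saved_models"
)
save_dir = pick_writable_dir([package_dir])
save_path = os.path.join(save_dir, "bps_train_curr.joblib")
print(save_path)
```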

parallel_processes.py issue at line 98

Hi,
I'm here to report an issue with the package. When I use it with parallel processing, it starts printing blank lines. I found that the cause is a misplaced print() at line 98 of parallel_processes.py, under suod/models. Removing that line fixes the issue.

Thanks in advance

ModuleNotFoundError: No module named 'sklearn.ensemble._forest'

Hello,

I tried running suod with the following code:

import pandas as pd, numpy as np
from pyod.models.ecod import ECOD
from pyod.models.suod import SUOD
from pyod.models.lof import LOF
from pyod.models.iforest import IForest
from pyod.models.copod import COPOD
from pyod.models.kpca import KPCA
from pyod.utils.utility import standardizer

prepared_np = standardizer(prepared_df)

# train SUOD
clf_name = 'ensemble'

# initialize a group of outlier detectors for acceleration
detector_list = [
    LOF(n_neighbors=20),
    ECOD(),
    COPOD(),
    IForest(n_estimators=200),
    KPCA()
]

# decide the number of parallel processes and the combination method
clf = SUOD(
    base_estimators=detector_list, 
    n_jobs=-1, 
    combination='average',
    verbose=True
)
clf.fit(prepared_np)

And I got this error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-f2fb948dc9a9> in <module>
      1 print('Training...')
      2 
----> 3 clf.fit(prepared_np)
      4 
      5 scores = clf.decision_scores_  # raw outlier scores

/app/dataiku/design/code-envs/python/SP3/lib/python3.6/site-packages/pyod/models/suod.py in fit(self, X, y)
    208 
    209         # fit the model and then approximate it
--> 210         self.model_.fit(X)
    211         self.model_.approximate(X)
    212 

/app/dataiku/design/code-envs/python/SP3/lib/python3.6/site-packages/suod/models/base.py in fit(self, X)
    265         if self.bps_flag:
    266             # load the pre-trained cost predictor to forecast the train cost
--> 267             cost_predictor = joblib.load(self.cost_forecast_loc_fit_)
    268 
    269             time_cost_pred = cost_forecast_meta(cost_predictor, X,

/app/dataiku/design/code-envs/python/SP3/lib/python3.6/site-packages/joblib/numpy_pickle.py in load(filename, mmap_mode)
    585                     return load_compatibility(fobj)
    586 
--> 587                 obj = _unpickle(fobj, filename, mmap_mode)
    588     return obj

/app/dataiku/design/code-envs/python/SP3/lib/python3.6/site-packages/joblib/numpy_pickle.py in _unpickle(fobj, filename, mmap_mode)
    504     obj = None
    505     try:
--> 506         obj = unpickler.load()
    507         if unpickler.compat_mode:
    508             warnings.warn("The file '%s' has been generated with a "

/usr/lib64/python3.6/pickle.py in load(self)
   1048                     raise EOFError
   1049                 assert isinstance(key, bytes_types)
-> 1050                 dispatch[key[0]](self)
   1051         except _Stop as stopinst:
   1052             return stopinst.value

/usr/lib64/python3.6/pickle.py in load_global(self)
   1336         module = self.readline()[:-1].decode("utf-8")
   1337         name = self.readline()[:-1].decode("utf-8")
-> 1338         klass = self.find_class(module, name)
   1339         self.append(klass)
   1340     dispatch[GLOBAL[0]] = load_global

/usr/lib64/python3.6/pickle.py in find_class(self, module, name)
   1386             elif module in _compat_pickle.IMPORT_MAPPING:
   1387                 module = _compat_pickle.IMPORT_MAPPING[module]
-> 1388         __import__(module, level=0)
   1389         if self.proto >= 4:
   1390             return _getattribute(sys.modules[module], name)[0]

ModuleNotFoundError: No module named 'sklearn.ensemble._forest'

I am using the following versions:

combo==0.1.3
joblib==1.1.1
multiprocess==0.70.12.2
numba==0.53.1
numpy==1.19.5
pandas==1.0.5
pyod==1.0.6
scikit-learn==0.20.4
scipy==1.5.4
statsmodels==0.10.2
suod==0.0.7

Well, my env has more libraries, but I left only the ones that I think might be relevant for the issue.

I would be grateful for any help.

And thanks a lot for your wonderful work!
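
The pickle is asking for `sklearn.ensemble._forest`, a module path that the error shows is absent from this environment. That suggests the saved cost predictor was created with a newer scikit-learn than the 0.20.4 listed above (an inference from the traceback, not something the maintainers have confirmed). A small check using only the standard library can confirm whether a given module path is importable in the current environment; `module_available` is a hypothetical helper name, and `sklearn.ensemble.forest` is, to my understanding, the older public name of the same module.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the module can be located in this environment."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (e.g. sklearn itself) is missing.
        return False


for name in ("sklearn.ensemble._forest", "sklearn.ensemble.forest"):
    print(name, module_available(name))
```

If the first name is reported as unavailable, upgrading scikit-learn in the environment or disabling the step that loads the saved predictor are the two directions to explore.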

Verbose = True

The SUOD library clutters my console.
After a bit of digging, I found that in the Parallel(...) call, the verbose parameter is hard-coded to True instead of using self.verbose.

It would be nice if it were controlled through the object's verbose parameter.

Thanks for your work.
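
A sketch of the requested behavior, using a made-up `Ensemble` class and a plain loop (`run_tasks`) in place of the real Parallel(...) call: the constructor's `verbose` value is stored and forwarded rather than a hard-coded True.

```python
import contextlib
import io


def run_tasks(tasks, verbose=False):
    """Stand-in for a parallel runner that logs only when asked to."""
    results = []
    for i, task in enumerate(tasks):
        if verbose:
            print(f"running task {i + 1}/{len(tasks)}")
        results.append(task())
    return results


class Ensemble:
    def __init__(self, verbose=False):
        self.verbose = verbose

    def fit(self, tasks):
        # Forward the instance setting rather than passing verbose=True.
        return run_tasks(tasks, verbose=self.verbose)


tasks = [lambda: 1, lambda: 2]
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    quiet_results = Ensemble(verbose=False).fit(tasks)

print(quiet_results, repr(buffer.getvalue()))
```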
