deepwisdom / autodl

1.1K 32.0 213.0 4.56 MB

Automated Deep Learning without ANY human intervention. 1st-place solution for the AutoDL challenge@NeurIPS.

Home Page: http://fuzhi.ai

License: Apache License 2.0

Python 100.00%

autodl automl nas feature-engineering model-selection full-automl artificial-intelligence lightgbm resnet pytorch

autodl's Issues

Fix Image pretrained model path

Data preparation should support stratified train/test splitting

Question

The tutorial and examples should support stratified train/test splitting, not only random splitting.

Description

If class_num is large, or the class imbalance is severe, the train/test datasets can easily miss some of the labels.
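
A minimal sketch of what a stratified split could look like for single-label data. The function name `stratified_split` and its parameters are hypothetical and are not part of autodl; the sketch uses only the standard library and keeps at least one sample of every class with two or more samples on each side of the split.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.2, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        if len(indices) > 1:
            # Keep at least one sample of this class on each side.
            n_test = int(round(len(indices) * test_ratio))
            n_test = min(max(n_test, 1), len(indices) - 1)
        else:
            # A class with a single sample can only go to the training set.
            n_test = 0
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


labels = [0] * 90 + [1] * 10  # imbalanced toy labels
train_idx, test_idx = stratified_split(labels, test_ratio=0.2)
```

With a purely random split, the rare class above could end up entirely in one subset; splitting per class avoids that. Multi-label data would need a different strategy.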

Add specific train file path to model input.

Move NLP files to a separate path

Make run_local_test as example

Make run_local_test as example.

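Is CPU training not supported?

It's not that I want to train on CPU. My Mac has no suitable GPU driver, so I usually develop and do simple debugging with the CPU version on the Mac, and once it runs I send it to a server for GPU training. It seems autodl is a different library? But installing autodl-gpu on the Mac fails because the tf-1.15-gpu version cannot be installed.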

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize.

When I run the speech model, the following error occurs:

  File "/usr/local/lib/python3.5/dist-packages/autodl-1.0-py3.5.egg/autodl/auto_models/at_speech/data_space/feats_engine.py", line 204, in make_features
    X = self.kapre_melspectrogram_extractor.predict(X)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node melgram/convolution_1}}]]
     [[{{node melgram/Maximum_1}}]]
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node melgram/convolution_1}}]]
     [[{{node melgram/Maximum_1}}]]

A possible solution is to assign a particular GPU at the beginning, like

import os
os.environ["CUDA_VISIBLE_DEVICES"] = '0'

The speech model will run well if you have 12 GB of GPU memory.

Fix pretrain paths for image/video/nlp/speech

Fix pretrain paths for image/video/nlp/speech
image/video/speech: pretrain models path
nlp: embedding data path.

Add 5 examples to Image Classification

Consider an independent history recorder and scorer to keep records without files.

Can it solve time series classification?

Hi,
I am a newbie.
I want to do beat detection in a music signal. Is that possible? Do you have similar code for that?
Regards,
Suti

Evaluation: Add hitrate evaluation metric for single label classification

Problem

For single-label classification, hit rate at top k is a useful metric, beyond NAUC and accuracy.
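
As a rough illustration of the requested metric, hit rate at top k counts a sample as a hit when its true class is among the k highest-scoring classes. The helper `hit_rate_at_k` below is a hypothetical name, not an existing autodl function, and it assumes one score per class for each sample.

```python
def hit_rate_at_k(scores, true_labels, k=5):
    """Fraction of samples whose true class is among the k top-scored classes."""
    if not true_labels or len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must be non-empty and equally long")

    hits = 0
    for row, label in zip(scores, true_labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


scores = [
    [0.1, 0.7, 0.2],  # true class 1, ranked first
    [0.5, 0.3, 0.2],  # true class 1, ranked second
    [0.2, 0.3, 0.5],  # true class 0, ranked last
]
true_labels = [1, 1, 0]
top1 = hit_rate_at_k(scores, true_labels, k=1)
top2 = hit_rate_at_k(scores, true_labels, k=2)
```

With k=1 the metric reduces to plain accuracy, which makes it easy to compare against the existing accuracy score.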

Add 5 examples to Video Classification

Trying to get in touch regarding a security issue

Hey there!

I'd like to report a security issue but cannot find contact instructions on your repository.

If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.

Thank you for your consideration, and I look forward to hearing from you!

(cc @huntr-helper)

Run without Docker(Speech)

How to improve acc?

I tried run_speech_classification_example.py for speech sentiment classification data, but only got acc of 0.16. How to improve the accuracy rate?

Unable to run the test program

System: Windows 10 1909 (18363.900)
Python version: 3.7.0 64-bit

D:\GitHub__AI\AutoDL>python run_local_test.py
2020-10-10 20:22:22.150496: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:25 INFO run_local_test.py: ##################################################
2020-10-10 20:22:25 INFO run_local_test.py: Begin running local test using
2020-10-10 20:22:25 INFO run_local_test.py: code_dir = AutoDL_sample_code_submission
2020-10-10 20:22:25 INFO run_local_test.py: dataset_dir = miniciao
2020-10-10 20:22:25 INFO run_local_test.py: ##################################################
2020-10-10 20:22:25 INFO run_local_test.py: Cleaning existing output directory of last run: D:\GitHub__AI\AutoDL\AutoDL_sample_result_submission
2020-10-10 20:22:25 INFO run_local_test.py: Cleaning existing output directory of last run: D:\GitHub__AI\AutoDL\AutoDL_scoring_output
2020-10-10 20:22:25.547415: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:25.561861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
python D:\GitHub__AI\AutoDL\AutoDL_scoring_program\score.py --solution_dir=D:\GitHub__AI\AutoDL\AutoDL_sample_data\miniciao
python D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py --dataset_dir=D:\GitHub__AI\AutoDL\AutoDL_sample_data\miniciao --code_dir=AutoDL_sample_code_submission --time_budget=1200
2020-10-10 20:22:31.002661: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:31,140 INFO score.py: ===== Start scoring program. Version: v20191204 =====
2020-10-10 20:22:33,430 INFO ingestion.py: ************************************************
2020-10-10 20:22:33,430 INFO ingestion.py: ******** Processing dataset D:\github__ai\autodl\autodl_sample_data\miniciao\miniciao ********
2020-10-10 20:22:33,430 INFO ingestion.py: ************************************************
2020-10-10 20:22:33,431 INFO ingestion.py: Reading training set and test set...
WARNING:tensorflow:From D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\dataset.py:47: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:From D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\dataset.py:263: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

WARNING:tensorflow:From D:\Software\Program\Python37\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.

WARNING:tensorflow:From D:\Software\Program\Python37\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.FixedLenSequenceFeature is deprecated. Please use tf.io.FixedLenSequenceFeature instead.

WARNING:tensorflow:From D:\Software\Program\Python37\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.parse_single_sequence_example is deprecated. Please use tf.io.parse_single_sequence_example instead.

2020-10-10 20:22:36.204593: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-10-10 20:22:36.240231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
2020-10-10 20:22:36.248122: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:36.256820: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-10-10 20:22:36.265364: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-10-10 20:22:36.273798: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-10-10 20:22:36.283813: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-10-10 20:22:36.293124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-10-10 20:22:36.309652: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2020-10-10 20:22:36.315878: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-10-10 20:22:36,470 ERROR ingestion.py: Failed to initializing model.
2020-10-10 20:22:36,471 ERROR ingestion.py: Encountered exception:
module 'signal' has no attribute 'SIGALRM'
Traceback (most recent call last):
  File "D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py", line 336, in <module>
    with timer.time_limit("Initialization"):
  File "D:\Software\Program\Python37\lib\contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py", line 203, in time_limit
    signal.signal(signal.SIGALRM, signal_handler)
AttributeError: module 'signal' has no attribute 'SIGALRM'
2020-10-10 20:22:36,473 INFO ingestion.py: ===== Start core part of ingestion program. Version: v20191204 =====
2020-10-10 20:22:36,474 INFO ingestion.py: Failed to run ingestion.
2020-10-10 20:22:36,475 ERROR ingestion.py: Encountered exception:
Your model object doesn't have the method `{}`. Please implement it in model.py.
Traceback (most recent call last):
  File "D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py", line 360, in <module>
    raise ModelApiError("Your model object doesn't have the method " +
ModelApiError: Your model object doesn't have the method `{}`. Please implement it in model.py.
2020-10-10 20:22:36,476 INFO ingestion.py: Wrote the file end.txt marking the end of ingestion.
2020-10-10 20:22:36,476 INFO ingestion.py: [-] Done, but encountered some errors during ingestion.
2020-10-10 20:22:36,476 INFO ingestion.py: [-] Overall time spent  0.00 sec
D:\GitHub__AI\AutoDL\AutoDL_sample_result_submission\end.txt
D:\GitHub__AI\AutoDL\AutoDL_sample_result_submission\start.txt
2 file(s) copied.
2020-10-10 20:22:36,496 INFO ingestion.py: [Ingestion terminated]
2020-10-10 20:22:37,149 INFO score.py: Detected the start of ingestion after 6 seconds. Start scoring.
2020-10-10 20:22:37,150 INFO score.py: Detected ingestion program had stopped running because an 'end.txt' file is written by ingestion. Stop scoring now.
2020-10-10 20:22:37,151 INFO score.py: Final area under learning curve for miniciao: 0.0000
2020-10-10 20:22:37,153 INFO score.py: Computing error bars with 10 scorings...
2020-10-10 20:22:37,154 INFO score.py:
Latest prediction NAUC:
* Mean: -1
* Standard deviation: -1
* Variance: -1
2020-10-10 20:22:37,154 INFO score.py: Computing ALC error bars with 5 curves...
2020-10-10 20:22:37,155 INFO score.py:
Area under Learning Curve:
* Mean: 0.0
* Standard deviation: 0.0
* Variance: 0.0
2020-10-10 20:22:37,157 ERROR score.py: [-] Some error occurred in ingestion program. Please see output/error log of Ingestion Step.
2020-10-10 20:22:37,157 INFO score.py: [Scoring terminated]

D:\GitHub__AI\AutoDL>

After looking it up, I learned that Windows cannot use this signal; it is not compatible.
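
One possible workaround, sketched under assumptions: `signal.SIGALRM` exists only on Unix-like systems, so a time limit can use it when available and otherwise fall back to checking the elapsed time after the block finishes. The `time_limit` and `TimeoutException` names below are illustrative and this is not the actual code in ingestion.py. Note that the fallback cannot interrupt a running block; it only reports the overrun afterwards.

```python
import signal
import threading
import time
from contextlib import contextmanager


class TimeoutException(Exception):
    """Raised when a block exceeds its time budget."""


@contextmanager
def time_limit(seconds, name="block"):
    message = "{} exceeded the time limit of {}s".format(name, seconds)
    can_use_alarm = (
        hasattr(signal, "SIGALRM")  # missing on Windows
        and threading.current_thread() is threading.main_thread()
    )
    if can_use_alarm:
        def handler(signum, frame):
            raise TimeoutException(message)

        previous = signal.signal(signal.SIGALRM, handler)
        signal.setitimer(signal.ITIMER_REAL, seconds)
        try:
            yield
        finally:
            # Cancel the timer and restore the previous handler.
            signal.setitimer(signal.ITIMER_REAL, 0)
            signal.signal(signal.SIGALRM, previous)
    else:
        # Fallback: no interruption, only a check once the block has ended.
        start = time.monotonic()
        yield
        if time.monotonic() - start > seconds:
            raise TimeoutException(message)


with time_limit(5, "Initialization"):
    value = sum(range(10))
```

A stricter Windows replacement would run the block in a separate process and terminate it on timeout, at the cost of more plumbing.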

Add basic tutorial

run_local_test issuing empty output

Hello,

I tried running your tutorial on the AutoDL public datasets by running e.g.
python scripts/run_local_test.py --dataset_dir=AutoDL_public_data/O3 --output_dir=res/O3

I get a few error messages along the way, such as
Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
failed call to cuInit: UNKNOWN ERROR (303)
kernel driver does not appear to be running on this host (mantis-cargo): /proc/driver/nvidia/version does not exist
run_ingestion: 62: Failed to initializing model.
FileNotFoundError: [Errno 2] No such file or directory: '/app/embedding/cc.zh.300.vec.gz'
UnboundLocalError: local variable 'model' referenced before assignment
ERROR scoring_process.py: run_scoring: 96: [-] Some error occurred in ingestion program. Please see output/error log of Ingestion Step.

as well as deprecation warnings
distutils Version classes are deprecated. Use packaging.version instead
UserWarning: Setuptools is replacing distutils.
The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

and ends up with an empty scores.txt file as well as a blank learning-curve-O3.png image.

Do you have any idea why it's not working?

Thanks for the help!

Build python package for pip.

Add tutorials and practical use cases

Add 5 examples to Text Classification

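Coming over from V2EX: could you put out some tutorials?

It is said to be even better than Google's AutoML. I'd like to know how to use it, so please provide a tutorial.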

Text-Prediction error when class_num and train_num are large

Problem

In the text domain, prediction fails when class_num (500~1000) and train_num (~400000) are large.

name 'M' is not defined; it seems M = Model(D_train.get_metadata()) encounters an error

When I run the unit test in Docker (CPU version), it reports an error:

root@85a655cc87d1:/app/codalab# python run_local_test.py
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Begin running local test using
2020-05-07 06:36:42 INFO run_local_test.py: code_dir = AutoDL_sample_code_submission
2020-05-07 06:36:42 INFO run_local_test.py: dataset_dir = miniciao
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_sample_result_submission
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_scoring_output
python /app/codalab/AutoDL_ingestion_program/ingestion.py --dataset_dir=/app/codalab/AutoDL_sample_data/miniciao --code_dir=/app/codalab/AutoDL_sample_code_submission --time_budget=1200.0
python /app/codalab/AutoDL_scoring_program/score.py --solution_dir=/app/codalab/AutoDL_sample_data/miniciao
2020-05-07 06:36:43,653 INFO score.py: ===== Start scoring program. Version: v20191204 =====
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: ******** Processing dataset Miniciao ********
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: Reading training set and test set...
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/tensor_array_ops.py:162: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-05-07 06:36:44,928 INFO ingestion.py: Creating model...this process should not exceed 20min.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 19, in <lambda>
    threading.Thread(target=lambda: torch.cuda.synchronize()),
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 398, in synchronize
    _lazy_init()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

2020-05-07 06:36:46,014 INFO ingestion.py: Initialization success, time spent so far 1.0854098796844482 sec
2020-05-07 06:36:46,014 ERROR ingestion.py: Failed to initializing model.
2020-05-07 06:36:46,015 ERROR ingestion.py: Encountered exception:
Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 208, in time_limit
    yield
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/app/codalab/AutoDL_sample_code_submission/model.py", line 54, in __init__
    self.domain_model = DomainModel(self.metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 42, in __init__
    super(Model, self).__init__(metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/skeleton/projects/logic.py", line 88, in __init__
    self.build()
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 66, in build
    self.model_9.init(model_dir=model_path, gain=1.0)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/architectures/resnet.py", line 244, in init
    model_dir=self.model_dir)
  File "/usr/local/lib/python3.5/dist-packages/torch/hub.py", line 499, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 576, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 155, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 131, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 115, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
2020-05-07 06:36:46,035 INFO ingestion.py: ===== Start core part of ingestion program. Version: v20191204 =====
2020-05-07 06:36:46,039 INFO ingestion.py: Failed to run ingestion.
2020-05-07 06:36:46,039 ERROR ingestion.py: Encountered exception:
name 'M' is not defined
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 358, in <module>
    if not hasattr(M, attr):
NameError: name 'M' is not defined
2020-05-07 06:36:46,044 INFO ingestion.py: Wrote the file end.txt marking the end of ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Done, but encountered some errors during ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Overall time spent  0.01 sec
2020-05-07 06:36:46,079 INFO ingestion.py: [Ingestion terminated]

At first I thought it was a network issue while downloading the training data, but I tried running the test with a proxy, downloaded the r9-xxx.pth.tar myself, and even rebuilt on another machine (with Docker, of course), still without luck.

It's weird that the log reports:

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.

even though I'm using the CPU version of the Docker image.
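
The error message itself points at the usual remedy: a checkpoint saved from a GPU has to be mapped to the CPU when CUDA is unavailable. The sketch below assumes PyTorch is installed; `choose_map_location` and `load_checkpoint` are hypothetical helper names and not part of autodl. In the traceback above the weights are loaded through `torch.hub.load_state_dict_from_url`, which also accepts a `map_location` argument.

```python
def choose_map_location(cuda_available):
    """Pick the device string that checkpoint storages should be mapped to."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path):
    """Load a checkpoint so that it also works on a CPU-only machine."""
    # Imported lazily so the helper above stays usable without PyTorch.
    import torch

    location = choose_map_location(torch.cuda.is_available())
    return torch.load(path, map_location=location)
```

On a CPU-only container, `load_checkpoint` would pass `map_location="cpu"`, which is what the RuntimeError recommends.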

Text-Language detection error when mixing Traditional Chinese and uppercase letters.

Question

For text data, language detection makes errors when it encounters Traditional Chinese and uppercase letters.

Description

t=我有一个梦想, lang=zh-cn
t=I have a dream, lang=en
t=我有一個夢想, lang=ko
t=我有一個夢想並且翻譯為I HAVE A DREAM., lang=ko
t=我有一个梦想并且翻译为I HAVE A DREAM., lang=vi
t=我有一个梦想并且翻译为i have a dream., lang=en
t=I HAVE A DREAM, lang=hu
t=IHAVEADREAM, lang=id
t=ihaveadream, lang=en
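
One way to guard against the misdetections listed above is a script-based pre-check that runs before the statistical detector. This is only a sketch: `script_hint` and the 0.3 threshold are assumptions made for illustration, not autodl code. It returns "zh" when Han characters make up a sizeable share of the letters and no kana or hangul is present, and None otherwise so that the regular detector can decide.

```python
def script_hint(text, min_share=0.3):
    """Return "zh" for text dominated by Han characters, else None."""
    han = kana_or_hangul = letters = 0
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, and punctuation
        letters += 1
        if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs
            han += 1
        elif "\u3040" <= ch <= "\u30ff" or "\uac00" <= ch <= "\ud7a3":
            kana_or_hangul += 1  # Japanese kana or Korean hangul syllables
    if letters == 0 or kana_or_hangul > 0:
        return None
    return "zh" if han / letters >= min_share else None


samples = ["我有一個夢想", "我有一个梦想并且翻译为I HAVE A DREAM.", "I HAVE A DREAM"]
hints = [script_hint(t) for t in samples]
```

For the all-uppercase Latin cases, the listed results suggest that lowercasing the text before detection may help, since the lowercase variants were detected as English.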
