autodl's People

Contributors

daemonyz, dependabot[bot], geekan, wp19991

autodl's Issues

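Is CPU training not supported?

It's not that I want to train on the CPU. My Mac has no matching GPU driver, so I usually develop and do basic debugging with the CPU version on the Mac, and once the code runs I send it to a server for GPU training. It looks like autodl is a different library? But installing autodl-gpu on the Mac fails because the tf-1.15-gpu version cannot be installed.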

Failed to get convolution algorithm. This is probably because cuDNN failed to initialize.

When I run the speech model, I get the following error:

  File "/usr/local/lib/python3.5/dist-packages/autodl-1.0-py3.5.egg/autodl/auto_models/at_speech/data_space/feats_engine.py", line 204, in make_features
    X = self.kapre_melspectrogram_extractor.predict(X)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node melgram/convolution_1}}]]
     [[{{node melgram/Maximum_1}}]]
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node melgram/convolution_1}}]]
     [[{{node melgram/Maximum_1}}]]

A possible solution is to assign a specific GPU at the beginning, for example:

import os
# Expose only GPU 0 to this process; set this before TensorFlow initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = '0'

The speech model runs well if you have 12 GB of GPU memory.

Trying to get in touch regarding a security issue

Hey there!

I'd like to report a security issue but cannot find contact instructions on your repository.

If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.

Thank you for your consideration, and I look forward to hearing from you!

(cc @huntr-helper)

How to improve acc?

I ran run_speech_classification_example.py on speech sentiment classification data, but only got an accuracy of 0.16. How can I improve the accuracy?

Unable to run the test program

System: Windows 10 1909 (18363.900)
Python version: 3.7.0, 64-bit

D:\GitHub__AI\AutoDL>python run_local_test.py
2020-10-10 20:22:22.150496: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:25 INFO run_local_test.py: ##################################################
2020-10-10 20:22:25 INFO run_local_test.py: Begin running local test using
2020-10-10 20:22:25 INFO run_local_test.py: code_dir = AutoDL_sample_code_submission
2020-10-10 20:22:25 INFO run_local_test.py: dataset_dir = miniciao
2020-10-10 20:22:25 INFO run_local_test.py: ##################################################
2020-10-10 20:22:25 INFO run_local_test.py: Cleaning existing output directory of last run: D:\GitHub__AI\AutoDL\AutoDL_sample_result_submission
2020-10-10 20:22:25 INFO run_local_test.py: Cleaning existing output directory of last run: D:\GitHub__AI\AutoDL\AutoDL_scoring_output
2020-10-10 20:22:25.547415: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:25.561861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
python D:\GitHub__AI\AutoDL\AutoDL_scoring_program\score.py --solution_dir=D:\GitHub__AI\AutoDL\AutoDL_sample_data\miniciao
python D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py --dataset_dir=D:\GitHub__AI\AutoDL\AutoDL_sample_data\miniciao --code_dir=AutoDL_sample_code_submission --time_budget=1200
2020-10-10 20:22:31.002661: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:31,140 INFO score.py: ===== Start scoring program. Version: v20191204 =====
2020-10-10 20:22:33,430 INFO ingestion.py: ************************************************
2020-10-10 20:22:33,430 INFO ingestion.py: ******** Processing dataset D:\github__ai\autodl\autodl_sample_data\miniciao\miniciao ********
2020-10-10 20:22:33,430 INFO ingestion.py: ************************************************
2020-10-10 20:22:33,431 INFO ingestion.py: Reading training set and test set...
WARNING:tensorflow:From D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\dataset.py:47: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

WARNING:tensorflow:From D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\dataset.py:263: The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

WARNING:tensorflow:From D:\Software\Program\Python37\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead.

WARNING:tensorflow:From D:\Software\Program\Python37\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.FixedLenSequenceFeature is deprecated. Please use tf.io.FixedLenSequenceFeature instead.

WARNING:tensorflow:From D:\Software\Program\Python37\lib\site-packages\tensorflow_core\python\autograph\converters\directives.py:119: The name tf.parse_single_sequence_example is deprecated. Please use tf.io.parse_single_sequence_example instead.

2020-10-10 20:22:36.204593: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-10-10 20:22:36.240231: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
2020-10-10 20:22:36.248122: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-10-10 20:22:36.256820: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-10-10 20:22:36.265364: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-10-10 20:22:36.273798: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-10-10 20:22:36.283813: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-10-10 20:22:36.293124: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-10-10 20:22:36.309652: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2020-10-10 20:22:36.315878: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-10-10 20:22:36,470 ERROR ingestion.py: Failed to initializing model.
2020-10-10 20:22:36,471 ERROR ingestion.py: Encountered exception:
module 'signal' has no attribute 'SIGALRM'
Traceback (most recent call last):
  File "D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py", line 336, in <module>
    with timer.time_limit("Initialization"):
  File "D:\Software\Program\Python37\lib\contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py", line 203, in time_limit
    signal.signal(signal.SIGALRM, signal_handler)
AttributeError: module 'signal' has no attribute 'SIGALRM'
2020-10-10 20:22:36,473 INFO ingestion.py: ===== Start core part of ingestion program. Version: v20191204 =====
2020-10-10 20:22:36,474 INFO ingestion.py: Failed to run ingestion.
2020-10-10 20:22:36,475 ERROR ingestion.py: Encountered exception:
Your model object doesn't have the method `{}`. Please implement it in model.py.
Traceback (most recent call last):
  File "D:\GitHub__AI\AutoDL\AutoDL_ingestion_program\ingestion.py", line 360, in <module>
    raise ModelApiError("Your model object doesn't have the method " +
ModelApiError: Your model object doesn't have the method `{}`. Please implement it in model.py.
2020-10-10 20:22:36,476 INFO ingestion.py: Wrote the file end.txt marking the end of ingestion.
2020-10-10 20:22:36,476 INFO ingestion.py: [-] Done, but encountered some errors during ingestion.
2020-10-10 20:22:36,476 INFO ingestion.py: [-] Overall time spent  0.00 sec
D:\GitHub__AI\AutoDL\AutoDL_sample_result_submission\end.txt
D:\GitHub__AI\AutoDL\AutoDL_sample_result_submission\start.txt
2 file(s) copied.
2020-10-10 20:22:36,496 INFO ingestion.py: [Ingestion terminated]
2020-10-10 20:22:37,149 INFO score.py: Detected the start of ingestion after 6 seconds. Start scoring.
2020-10-10 20:22:37,150 INFO score.py: Detected ingestion program had stopped running because an 'end.txt' file is written by ingestion. Stop scoring now.
2020-10-10 20:22:37,151 INFO score.py: Final area under learning curve for miniciao: 0.0000
2020-10-10 20:22:37,153 INFO score.py: Computing error bars with 10 scorings...
2020-10-10 20:22:37,154 INFO score.py:
Latest prediction NAUC:
* Mean: -1
* Standard deviation: -1
* Variance: -1
2020-10-10 20:22:37,154 INFO score.py: Computing ALC error bars with 5 curves...
2020-10-10 20:22:37,155 INFO score.py:
Area under Learning Curve:
* Mean: 0.0
* Standard deviation: 0.0
* Variance: 0.0
2020-10-10 20:22:37,157 ERROR score.py: [-] Some error occurred in ingestion program. Please see output/error log of Ingestion Step.
2020-10-10 20:22:37,157 INFO score.py: [Scoring terminated]

D:\GitHub__AI\AutoDL>

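After looking into it, I found that this signal (SIGALRM) is not available on Windows, so the program is incompatible with Windows.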
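The traceback above shows that time_limit in ingestion.py registers a handler with signal.SIGALRM, which Python's signal module does not provide on Windows. A minimal sketch of a portable variant follows. It is not the project's implementation: the TimeoutException class, the label and use_alarm parameters, and the fallback behaviour are assumptions made for illustration. On platforms without SIGALRM the fallback cannot interrupt the block; it only reports the overrun after the block has finished.

```python
import signal
import time
from contextlib import contextmanager


class TimeoutException(Exception):
    """Raised when a timed block runs past its budget (hypothetical name)."""


@contextmanager
def time_limit(seconds, label="block", use_alarm=None):
    """Hypothetical portable stand-in for a SIGALRM-based time limit."""
    if use_alarm is None:
        # SIGALRM exists on POSIX systems but not on Windows.
        use_alarm = hasattr(signal, "SIGALRM")

    if use_alarm:
        def handler(signum, frame):
            raise TimeoutException("{} timed out".format(label))

        previous = signal.signal(signal.SIGALRM, handler)
        signal.setitimer(signal.ITIMER_REAL, seconds)
        try:
            yield
        finally:
            signal.setitimer(signal.ITIMER_REAL, 0)  # cancel the timer
            signal.signal(signal.SIGALRM, previous)  # restore old handler
    else:
        # Fallback: the block cannot be interrupted, so only detect the
        # overrun once the block has finished.
        start = time.monotonic()
        yield
        if time.monotonic() - start > seconds:
            raise TimeoutException("{} timed out".format(label))
```

With this shape, a call such as `with time_limit(1200, "Initialization"):` no longer fails with AttributeError on Windows, although a hard limit there would need a separate process or thread.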

run_local_test issuing empty output

Hello,

I tried running your tutorial on the AutoDL public datasets by running e.g.
python scripts/run_local_test.py --dataset_dir=AutoDL_public_data/O3 --output_dir=res/O3

I get a few error messages along the way, such as
Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
failed call to cuInit: UNKNOWN ERROR (303)
kernel driver does not appear to be running on this host (mantis-cargo): /proc/driver/nvidia/version does not exist
run_ingestion: 62: Failed to initializing model.
FileNotFoundError: [Errno 2] No such file or directory: '/app/embedding/cc.zh.300.vec.gz'
UnboundLocalError: local variable 'model' referenced before assignment
ERROR scoring_process.py: run_scoring: 96: [-] Some error occurred in ingestion program. Please see output/error log of Ingestion Step.

as well as deprecation warnings
distutils Version classes are deprecated. Use packaging.version instead
UserWarning: Setuptools is replacing distutils.
The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
The name tf.gfile.Glob is deprecated. Please use tf.io.gfile.glob instead.

and the run ends with an empty scores.txt file as well as a blank learning-curve-O3.png image.

Do you have any idea why it's not working?

Thanks for the help!

name 'M' is not defined; it seems M = Model(D_train.get_metadata()) raises an error

When I run the unit test in Docker (CPU version), it reports an error:

root@85a655cc87d1:/app/codalab# python run_local_test.py
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Begin running local test using
2020-05-07 06:36:42 INFO run_local_test.py: code_dir = AutoDL_sample_code_submission
2020-05-07 06:36:42 INFO run_local_test.py: dataset_dir = miniciao
2020-05-07 06:36:42 INFO run_local_test.py: ##################################################
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_sample_result_submission
2020-05-07 06:36:42 INFO run_local_test.py: Cleaning existing output directory of last run: /app/codalab/AutoDL_scoring_output
python /app/codalab/AutoDL_ingestion_program/ingestion.py --dataset_dir=/app/codalab/AutoDL_sample_data/miniciao --code_dir=/app/codalab/AutoDL_sample_code_submission --time_budget=1200.0
python /app/codalab/AutoDL_scoring_program/score.py --solution_dir=/app/codalab/AutoDL_sample_data/miniciao
2020-05-07 06:36:43,653 INFO score.py: ===== Start scoring program. Version: v20191204 =====
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: ******** Processing dataset Miniciao ********
2020-05-07 06:36:44,673 INFO ingestion.py: ************************************************
2020-05-07 06:36:44,673 INFO ingestion.py: Reading training set and test set...
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/tensor_array_ops.py:162: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2020-05-07 06:36:44,928 INFO ingestion.py: Creating model...this process should not exceed 20min.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 19, in <lambda>
    threading.Thread(target=lambda: torch.cuda.synchronize()),
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 398, in synchronize
    _lazy_init()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

2020-05-07 06:36:46,014 INFO ingestion.py: Initialization success, time spent so far 1.0854098796844482 sec
2020-05-07 06:36:46,014 ERROR ingestion.py: Failed to initializing model.
2020-05-07 06:36:46,015 ERROR ingestion.py: Encountered exception:
Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 208, in time_limit
    yield
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 339, in <module>
    M = Model(D_train.get_metadata()) # The metadata of D_train and D_test only differ in sample_count
  File "/app/codalab/AutoDL_sample_code_submission/model.py", line 54, in __init__
    self.domain_model = DomainModel(self.metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 42, in __init__
    super(Model, self).__init__(metadata)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/skeleton/projects/logic.py", line 88, in __init__
    self.build()
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/model.py", line 66, in build
    self.model_9.init(model_dir=model_path, gain=1.0)
  File "/app/codalab/AutoDL_sample_code_submission/Auto_Image/architectures/resnet.py", line 244, in init
    model_dir=self.model_dir)
  File "/usr/local/lib/python3.5/dist-packages/torch/hub.py", line 499, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 576, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 155, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 131, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 115, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
2020-05-07 06:36:46,035 INFO ingestion.py: ===== Start core part of ingestion program. Version: v20191204 =====
2020-05-07 06:36:46,039 INFO ingestion.py: Failed to run ingestion.
2020-05-07 06:36:46,039 ERROR ingestion.py: Encountered exception:
name 'M' is not defined
Traceback (most recent call last):
  File "/app/codalab/AutoDL_ingestion_program/ingestion.py", line 358, in <module>
    if not hasattr(M, attr):
NameError: name 'M' is not defined
2020-05-07 06:36:46,044 INFO ingestion.py: Wrote the file end.txt marking the end of ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Done, but encountered some errors during ingestion.
2020-05-07 06:36:46,045 INFO ingestion.py: [-] Overall time spent  0.01 sec
2020-05-07 06:36:46,079 INFO ingestion.py: [Ingestion terminated]

At first I thought it was a network issue while downloading the training data, but I tried running the test through a proxy and also downloaded the r9-xxx.pth.tar file manually. Even after building on another machine (with Docker, of course), I still had no luck.

It's weird that the log reports:

Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.

even though I'm using the CPU version of the Docker image.
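The error message itself suggests the remedy: pass map_location to torch.load so that tensors saved on a GPU are mapped to the CPU. The sketch below illustrates that idea under stated assumptions. The helper names choose_map_location and load_checkpoint are hypothetical and are not part of autodl; only torch.load, torch.device and torch.cuda.is_available are real PyTorch calls.

```python
def choose_map_location(cuda_available):
    """Return the device name to use as map_location (hypothetical helper)."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path):
    """Load a checkpoint so that it also works on a CPU-only machine."""
    # Imported lazily so that the helper above has no PyTorch dependency.
    import torch

    location = choose_map_location(torch.cuda.is_available())
    # map_location remaps storages that were saved from a CUDA device.
    return torch.load(path, map_location=torch.device(location))
```

The traceback shows the weights being loaded through torch.hub.load_state_dict_from_url, which forwards a map_location argument to torch.load, so the same choice could be passed there. The earlier AssertionError from torch.cuda.synchronize() would likewise need a torch.cuda.is_available() guard on a CPU-only image.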

Text language detection errors with mixed Traditional Chinese and uppercase letters

Question

For text data, **language detection makes errors** when the text contains Traditional Chinese or uppercase letters.

Description

t=我有一个梦想, lang=zh-cn
t=I have a dream, lang=en
t=我有一個夢想, lang=ko
t=我有一個夢想並且翻譯為I HAVE A DREAM., lang=ko
t=我有一个梦想并且翻译为I HAVE A DREAM., lang=vi
t=我有一个梦想并且翻译为i have a dream., lang=en
t=I HAVE A DREAM, lang=hu
t=IHAVEADREAM, lang=id
t=ihaveadream, lang=en
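One possible workaround, sketched here as an assumption rather than as anything the project does, is to check the script of the characters before trusting a statistical detector. The functions cjk_ratio and looks_chinese and the 0.3 threshold are hypothetical. The heuristic only counts code points in the CJK Unified Ideographs block (U+4E00 to U+9FFF), which Japanese kanji also use, so it cannot separate Chinese from Japanese on its own.

```python
def cjk_ratio(text):
    """Fraction of alphabetic characters that are CJK Unified Ideographs."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = [ch for ch in letters if "\u4e00" <= ch <= "\u9fff"]
    return len(cjk) / len(letters)


def looks_chinese(text, threshold=0.3):
    """Heuristic pre-check; the threshold is an arbitrary illustrative value."""
    return cjk_ratio(text) >= threshold


samples = [
    "我有一個夢想",
    "我有一個夢想並且翻譯為I HAVE A DREAM.",
    "I HAVE A DREAM",
]
for sample in samples:
    print(sample, looks_chinese(sample))
```

Text that passes the check could be labelled as Chinese directly, and only the remaining text handed to the detector. Letter case does not affect this check, but it does nothing for the misdetected all-uppercase English samples.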

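A thumbs-up from a newcomer, and a request for getting-started advice

I'm a complete beginner (I don't know Python, C, or anything like that) and would like to learn to use this for classification and decision-making. Do the experts here have any suggestions?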

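Output dimension is computed incorrectly when loading tabular data

In autodl/convertor/tabular_to_tfrecords.py, line 55 computes the output dimension first and then converts the labels. However, the autodl-gpu version I installed directly with pip converts the labels first and then computes the output dimension. The labels have then already become a 0/1 encoding, so the computed dimension is at most 2, which makes the output dimension wrong.
The relevant code: output_dim = int(np.max(label)) + 1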
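The effect of the ordering can be reproduced with a small example. This is a sketch with hypothetical labels and a hypothetical one_hot helper, and it uses the built-in max in place of np.max so that it runs without NumPy; it is not the code from tabular_to_tfrecords.py.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into rows of 0s and 1s."""
    return [[1 if k == y else 0 for k in range(num_classes)] for y in labels]


labels = [0, 2, 1, 3]  # hypothetical integer labels covering 4 classes

# Correct order: compute the dimension from the raw labels first.
output_dim = max(labels) + 1
encoded = one_hot(labels, output_dim)

# Wrong order: after encoding, the largest value is 1, so the result is 2.
wrong_dim = max(max(row) for row in encoded) + 1

print(output_dim, wrong_dim)  # prints "4 2"
```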
