pistony / residualattentionnetwork
A Gluon implementation of Residual Attention Network. Best accuracy on CIFAR-10: 97.78%.
Home Page: https://pistony.github.io/ResidualAttentionNetwork/
License: MIT License
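How can I resolve this error? Is my mxnet installation broken? This is my first time using mxnet, and searching for an answer has not helped. Any help would be appreciated. Thanks.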
Traceback (most recent call last):
File "/home/cy/pycharm-community-2019.1/helpers/pydev/pydevd.py", line 1741, in <module>
main()
File "/home/cy/pycharm-community-2019.1/helpers/pydev/pydevd.py", line 1735, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/cy/pycharm-community-2019.1/helpers/pydev/pydevd.py", line 1135, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/cy/pycharm-community-2019.1/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/cy/PycharmProjects/ResidualAttentionNetwork-master/train_cifar.py", line 160, in <module>
net.initialize(init=mx.init.MSRAPrelu(), ctx=ctx) #ctx=1
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/gluon/block.py", line 502, in initialize
self.collect_params().initialize(init, ctx, verbose, force_reinit)
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/gluon/parameter.py", line 813, in initialize
v.initialize(None, ctx, init, force_reinit=force_reinit)
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/gluon/parameter.py", line 391, in initialize
self._finish_deferred_init()
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/gluon/parameter.py", line 285, in _finish_deferred_init
self._init_impl(data, ctx)
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/gluon/parameter.py", line 297, in _init_impl
self._data = [data.copyto(ctx) for ctx in self._ctx_list]
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/gluon/parameter.py", line 297, in
self._data = [data.copyto(ctx) for ctx in self._ctx_list]
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/ndarray/ndarray.py", line 2077, in copyto
return _internal._copyto(self, out=hret)
File "<string>", line 25, in _copyto
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
ctypes.byref(out_stypes)))
File "/home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/base.py", line 252, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [11:06:56] src/ndarray/ndarray.cc:1279: GPU is not enabled
Stack trace returned 10 entries:
[bt] (0) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(+0x21d8d4) [0x7fd8e83998d4]
[bt] (1) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(+0x21dcb1) [0x7fd8e8399cb1]
[bt] (2) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(mxnet::CopyFromTo(mxnet::NDArray const&, mxnet::NDArray const&, int, bool)+0x723) [0x7fd8eaec6f23]
[bt] (3) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(mxnet::imperative::PushFComputeEx(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocatormxnet::engine::Var* > const&, std::vector<mxnet::engine::Var*, std::allocatormxnet::engine::Var* > const&, std::vector<mxnet::Resource, std::allocatormxnet::Resource > const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&)::{lambda(mxnet::RunContext)#1}::operator()(mxnet::RunContext) const+0x110) [0x7fd8ead6d8c0]
[bt] (4) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(mxnet::imperative::PushFComputeEx(std::function<void (nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&, std::vector<mxnet::NDArray, std::allocatormxnet::NDArray > const&)> const&, nnvm::Op const*, nnvm::NodeAttrs const&, mxnet::Context const&, std::vector<mxnet::engine::Var*, std::allocatormxnet::engine::Var* > const&, std::vector<mxnet::engine::Var*, std::allocatormxnet::engine::Var* > const&, std::vector<mxnet::Resource, std::allocatormxnet::Resource > const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&)+0x3ca) [0x7fd8ead78b7a]
[bt] (5) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(mxnet::Imperative::InvokeOp(mxnet::Context const&, nnvm::NodeAttrs const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::OpReqType, std::allocatormxnet::OpReqType > const&, mxnet::DispatchMode, mxnet::OpStatePtr)+0x839) [0x7fd8ead7e5c9]
[bt] (6) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(mxnet::Imperative::Invoke(mxnet::Context const&, nnvm::NodeAttrs const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&, std::vector<mxnet::NDArray*, std::allocatormxnet::NDArray* > const&)+0x38c) [0x7fd8ead7ee4c]
[bt] (7) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(+0x2b0ec09) [0x7fd8eac8ac09]
[bt] (8) /home/cy/anaconda3/envs/mxnet/lib/python3.5/site-packages/mxnet/libmxnet.so(MXImperativeInvokeEx+0x6f) [0x7fd8eac8b1ff]
[bt] (9) /home/cy/anaconda3/envs/mxnet/lib/python3.5/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fd94b496ec0]
Process finished with exit code 1
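The last Python-level line of the traceback, `GPU is not enabled`, is what MXNet reports when the installed library was built without CUDA support and a GPU context is requested anyway. Either install a CUDA-enabled MXNet build that matches the local CUDA version, or run on CPU. The snippet below is a minimal sketch of a context fallback, not the repository's code: `choose_device_ids` and `build_ctx` are hypothetical helper names, and it assumes an MXNet version that provides `mx.context.num_gpus()`.

```python
def choose_device_ids(num_available_gpus, requested_gpus):
    """Return the GPU ids to use; an empty list means 'fall back to CPU'."""
    usable = min(max(num_available_gpus, 0), max(requested_gpus, 0))
    return list(range(usable))


def build_ctx(requested_gpus=1):
    """Build a context list for net.initialize(..., ctx=ctx) (hypothetical helper)."""
    # Imported lazily so choose_device_ids stays usable without MXNet installed.
    import mxnet as mx

    try:
        available = mx.context.num_gpus()
    except mx.base.MXNetError:
        # Assumption: some CPU-only builds raise here instead of returning 0.
        available = 0

    ids = choose_device_ids(available, requested_gpus)
    return [mx.gpu(i) for i in ids] if ids else [mx.cpu()]


# Possible usage in a training script:
#   ctx = build_ctx(requested_gpus=1)
#   net.initialize(init=mx.init.MSRAPrelu(), ctx=ctx)
```

With a CPU-only build this returns `[mx.cpu()]`, so initialization proceeds, although training a network of this size on CPU will be slow.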
Hi, first of all, thank you for your work. I am interested in it and want to fine-tune the model for another task. Could you please share the pre-trained ImageNet model?
DataBatch: data shapes: [(32L, 3L, 224L, 224L)] label shapes: [(32L,)]
Traceback (most recent call last):
File "train_imagenet.py", line 166, in <module>
lr_decay=lr_decay, train_loader=train_data, test_loader=val_data, cat_interval=cat_interval)
File "train_imagenet.py", line 97, in train
trans = gutils.split_and_load(batch[0], ctx)
TypeError: 'DataBatch' object does not support indexing
How do I index a DataBatch's data?
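An `mx.io.DataBatch` is not a sequence. It exposes its contents through the `data` and `label` attributes, each of which is a list of NDArrays, so `batch[0]` fails while `batch.data[0]` works. The helper below is a small sketch that accepts both batch styles; `unpack_batch` is a hypothetical name, not something defined in `train_imagenet.py`.

```python
def unpack_batch(batch):
    """Return (data, label) from either an mx.io.DataBatch or a (data, label) pair."""
    if hasattr(batch, "data") and hasattr(batch, "label"):
        # mx.io.DataBatch: data and label are lists, one entry per input/output.
        return batch.data[0], batch.label[0]
    # Gluon DataLoader style: the batch is already an indexable (data, label) pair.
    return batch[0], batch[1]


# Possible usage inside the training loop (gutils and ctx as in the traceback above):
#   data, label = unpack_batch(batch)
#   trans = gutils.split_and_load(data, ctx)
#   labels = gutils.split_and_load(label, ctx)
```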
I am using train_imagenet.py to train on my custom data. It runs and produces a log like this:
Iter 999. Loss: 0.51050, Train top1-acc 0.815984, Train top5-acc 1.000000.Time 00:20:55.lr 0.1
test_Loss: 1.741916, test top1-acc 0.541710, test top5-acc 1.000000.
Iter 1999. Loss: 0.50236, Train top1-acc 0.819633, Train top5-acc 1.000000.Time 00:21:45.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 2999. Loss: 0.49222, Train top1-acc 0.823680, Train top5-acc 1.000000.Time 00:18:54.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 3999. Loss: 0.48331, Train top1-acc 0.827703, Train top5-acc 1.000000.Time 00:19:06.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 4999. Loss: 0.46939, Train top1-acc 0.832125, Train top5-acc 1.000000.Time 00:19:08.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 5999. Loss: 0.46145, Train top1-acc 0.834523, Train top5-acc 1.000000.Time 00:18:30.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 6999. Loss: 0.46334, Train top1-acc 0.834242, Train top5-acc 1.000000.Time 00:17:58.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 7999. Loss: 0.44909, Train top1-acc 0.839078, Train top5-acc 1.000000.Time 00:17:31.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 8999. Loss: 0.45008, Train top1-acc 0.839844, Train top5-acc 1.000000.Time 00:17:03.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
Iter 9999. Loss: 0.44533, Train top1-acc 0.841063, Train top5-acc 1.000000.Time 00:17:09.lr 0.1
test_Loss: nan, test top1-acc nan, test top5-acc nan.
However, the process gets stuck at step 10000, and the GPU memory usage is:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130 Driver Version: 384.130 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:05:00.0 Off | N/A |
| 29% 37C P8 15W / 250W | 6729MiB / 11170MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 108... Off | 00000000:06:00.0 Off | N/A |
| 29% 33C P8 14W / 250W | 6717MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 108... Off | 00000000:09:00.0 Off | N/A |
| 29% 32C P8 15W / 250W | 6755MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 108... Off | 00000000:0A:00.0 Off | N/A |
| 29% 30C P8 14W / 250W | 6733MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 3355 C python 6715MiB |
| 1 3355 C python 6703MiB |
| 2 3355 C python 6741MiB |
| 3 3355 C python 6719MiB |
+-----------------------------------------------------------------------------+
GPU-Util stays at 0%. It seems the training process got stuck at some point. I tried to find an error in the code, but everything seems to be set up correctly, the same as in train_imagenet.py. Does anyone know what's wrong?
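One thing worth checking for the `nan` test metrics: in the log the first evaluation reports real numbers and every later one reports `nan`. That pattern would be consistent with a validation iterator that is exhausted after its first pass, because `mx.io.DataIter` objects yield nothing on later passes until `reset()` is called. This is only a guess from the log, and it does not explain the hang at step 10000. The sketch below shows an evaluation loop that resets the iterator and makes an empty pass visible; `evaluate`, `mean_or_nan` and `loss_fn` are hypothetical names.

```python
import math


def mean_or_nan(total, count):
    """Average over batches; an empty pass gives nan explicitly instead of by accident."""
    return total / count if count else math.nan


def evaluate(data_iter, loss_fn):
    """Run one full pass over data_iter and return (mean loss, number of batches)."""
    # mx.io.DataIter has reset(); a Gluon DataLoader does not, hence the check.
    if hasattr(data_iter, "reset"):
        data_iter.reset()

    total, count = 0.0, 0
    for batch in data_iter:
        total += loss_fn(batch)  # loss_fn is assumed to return a Python float
        count += 1

    if count == 0:
        print("warning: evaluation saw 0 batches; is the iterator exhausted or empty?")
    return mean_or_nan(total, count), count
```

Logging the batch count next to the test loss makes it easy to tell an empty evaluation pass apart from a numerically diverged model.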