maybeshewill-cv / bisenetv2-tensorflow Goto Github PK

Unofficial tensorflow implementation of real-time scene image segmentation model "BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation"

Home Page: https://maybeshewill-cv.github.io/bisenetv2-tensorflow/

License: MIT License

Python 99.52% Shell 0.12% C 0.10% Cython 0.26%

real-time-semantic-segmentation semantic-segmentation tensorflow-segmentation deep-learning cityscapes camvid bisenetv2 bisenet

bisenetv2-tensorflow's Introduction

🏆 Github Trophy

👀 Lovely Snake

👀 Active Repo

👀 Activity Graph

🏆 Github Status

🏆 3D Status Profile

👨‍💻 Languages and Tools

☕ Get In Touch

WeiBo
WeChat

⭐️ From Baidu.Inc

bisenetv2-tensorflow's People

Contributors

Stargazers

Watchers

Forkers

shelfcol shengzhang90 ucmmesa shuxiangguo luxiaohao asa008 pc2005 xrosliang yuhonghong95721 itpreson oym1994 pustar lxyzler nolovenolife templeblock pipecat super-ljg dmsql0816 insad janfschr zjj321 littleserendipity gyup super-daryl hust-wayne bobdeng1974 lukschwalb gaowq2017 cx1628555321 w-okada dudestin suzukidaichi-git protossdragoon qiliang72 ph0809 justinduy arishin zartris zachary-zch forword-1234 medical-projects leedk3 yueyedeai moyanyu pandinosaurus tianjun-world hehongjie fangwudi mrhagchwh sachinkmohan stellariser hbhflw2000 peterkim333 frankzhao1997 jackyuan4334 lu8000 1hu11 ysnreddy ffd000

bisenetv2-tensorflow's Issues

How long the time do all the trianing process comsume?

hello,I know you train the model with 1000 more epoches?but some other model of sematic segmentation need about 150k iterations,i train the pytorch version need nearly 3days.

Params

Could you tell me how many params about your job of the BiseNet v2

Data augmentation and data shuffling problem?

Hi,

I have been trying to train the cityscapes dataset but I was not able to reproduce the results. I am using TF1.12.0
Debugging the training code, I noticed the problem resides in the functionnext_batch(self, batch_size) in the cityscapes_tf_io.py file.

If I skip the data augmentation and the shuffle function, the images in the batch are fine:

            with tf.name_scope('input_tensor'):

                # TFRecordDataset opens a binary file and reads one record at a time.
                # `tfrecords_file_paths` could also be a list of filenames, which will be read in order.
                dataset = tf.data.TFRecordDataset(tfrecords_file_paths)

                # The map transformation takes a function and applies it to every element
                # of the dataset.
                dataset = dataset.map(
                    map_func=aug.decode,
                    num_parallel_calls=CFG.DATASET.CPU_MULTI_PROCESS_NUMS
                )
#                if self._dataset_flags == 'train':
#                    dataset = dataset.map(
#                        map_func=aug.preprocess_image_for_train,
#                        num_parallel_calls=CFG.DATASET.CPU_MULTI_PROCESS_NUMS
#                    )
#                elif self._dataset_flags == 'val':
#                    dataset = dataset.map(
#                        map_func=aug.preprocess_image_for_val,
#                        num_parallel_calls=CFG.DATASET.CPU_MULTI_PROCESS_NUMS
#                    )

                # The shuffle transformation uses a finite-sized buffer to shuffle elements
                # in memory. The parameter is the number of elements in the buffer. For
                # completely uniform shuffling, set the parameter to be the same as the
                # number of elements in the dataset.

#                dataset = dataset.shuffle(buffer_size=512)


                # repeat num epochs
                dataset = dataset.repeat(self._epoch_nums)

                dataset = dataset.batch(batch_size=batch_size, drop_remainder=True)
                dataset = dataset.prefetch(buffer_size=batch_size * 16)

                iterator = dataset.make_one_shot_iterator()

However, if I uncomment the shuffle function, the "pairs" are:

Using the original code (uncommenting also the data augmentation lines), the transformations of the RGB image and labeled image are not the same

This is the code I used to visualize the images:

A = iterator.get_next(name='{:s}_IteratorGetNext'.format(self._dataset_flags))
with tf.Session() as sess: X = A[0].eval(session=sess)
with tf.Session() as sess: Y = A[1].eval(session=sess)
x = np.zeros((256,512,3))
x[:,:] = Y[10]/125
plt.imshow(X[10]*0.5+0.5);plt.show();plt.imshow(x);plt.show()

Maybe I am doing something wrong, but I can not figure it out... Do you have any suggestion?
Regards

您好，您有coco的预训练模型吗？

TypeError: 'NoneType' object is not subscriptable

Hello, I tried the test model on my computer(GTX 1070), but I found the mean fps is only 14.
So I run the time profile tools to test the performance. Then the error went out.
I'm not familiar with tensorflow and segmentation. Could you tell me how to fix this problem?

Here is the error information:

Trt engine file: ./checkpoint/bisenetv2_cityscapes_frozen.trt has been generated
WARNING:tensorflow:From tools/timeprofile_bisenetv2.py:90: calling import_graph_def (from tensorflow.python.framework.importer) with op_dict is deprecated and will be removed in a future version.
Instructions for updating:
Please file an issue at https://github.com/tensorflow/tensorflow/issues if you depend on this feature.
Traceback (most recent call last):
File "tools/timeprofile_bisenetv2.py", line 319, in
pb_file_path=args.pb_file_path
File "tools/timeprofile_bisenetv2.py", line 157, in time_profile_tensorflow_graph
image_feed = image[:, :, (2, 1, 0)]
TypeError: 'NoneType' object is not subscriptable

By the way, I'm searching for a segmentation model as the preprocess of a semantic slam system.
Can I deploy your work with tensorflow framework to the project just like what PaddlePaddle do?
Thank you.

InvalidArgumentError (see above for traceback): slice index 262144 of dimension 0 out of bounds.

train loss: 7.23991, miou: 0.22505: 87%|██████████████████████████████████████████████████████████████████████████▊ | 1294/1487 [12:57<02:17, 1.40it/s]2020-11-02 16:43:36.873786: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at strided_slice_op.cc:106 : Invalid argument: slice index 262144 of dimension 0 out of bounds.
2020-11-02 16:43:36.873945: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at strided_slice_op.cc:106 : Invalid argument: slice index 262144 of dimension 0 out of bounds.
2020-11-02 16:43:36.873967: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at strided_slice_op.cc:106 : Invalid argument: slice index 262144 of dimension 0 out of bounds.
2020-11-02 16:43:36.874000: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at strided_slice_op.cc:106 : Invalid argument: slice index 262144 of dimension 0 out of bounds.
2020-11-02 16:43:36.874034: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at strided_slice_op.cc:106 : Invalid argument: slice index 262144 of dimension 0 out of bounds.
train loss: 7.23991, miou: 0.22505: 87%|██████████████████████████████████████████████████████████████████████████▊ | 1294/1487 [12:57<01:55, 1.66it/s]
Traceback (most recent call last):
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 262144 of dimension 0 out of bounds.
[[{{node tower_0/BiseNetV2/seg_head_segmentation_loss/strided_slice}}]]
[[{{node ConstantFoldingCtrl/miou/mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Switch_0}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tools/cityscapes/train_bisenetv2_cityscapes.py", line 42, in
train_model()
File "tools/cityscapes/train_bisenetv2_cityscapes.py", line 34, in train_model
worker.train()
File "/pxsj/fancp/ZLC/bisenetv2-tensorflow/trainner/cityscapes/cityscapes_bisenetv2_multi_gpu_trainner.py", line 393, in train
self._loss, self._global_step
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 262144 of dimension 0 out of bounds.
[[node tower_0/BiseNetV2/seg_head_segmentation_loss/strided_slice (defined at /pxsj/fancp/ZLC/bisenetv2-tensorflow/bisenet_model/bisenet_v2.py:905) ]]
[[{{node ConstantFoldingCtrl/miou/mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Switch_0}}]]

Caused by op 'tower_0/BiseNetV2/seg_head_segmentation_loss/strided_slice', defined at:
File "tools/cityscapes/train_bisenetv2_cityscapes.py", line 42, in
train_model()
File "tools/cityscapes/train_bisenetv2_cityscapes.py", line 29, in train_model
worker = multi_gpu_trainner.BiseNetV2CityScapesMultiTrainer()
File "/pxsj/fancp/ZLC/bisenetv2-tensorflow/trainner/cityscapes/cityscapes_bisenetv2_multi_gpu_trainner.py", line 153, in init
is_net_first_initialized=is_network_initialized
File "/pxsj/fancp/ZLC/bisenetv2-tensorflow/trainner/cityscapes/cityscapes_bisenetv2_multi_gpu_trainner.py", line 337, in _compute_net_gradients
reuse=is_net_first_initialized
File "/pxsj/fancp/ZLC/bisenetv2-tensorflow/bisenet_model/bisenet_v2.py", line 1112, in compute_loss
n_min=self._ohem_min_sample_nums
File "/pxsj/fancp/ZLC/bisenetv2-tensorflow/bisenet_model/bisenet_v2.py", line 905, in _compute_ohem_cross_entropy_loss
ohem_cond = tf.greater(loss[n_min], ohem_thresh)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 654, in _slice_helper
name=name)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 820, in strided_slice
shrink_axis_mask=shrink_axis_mask)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 9356, in strided_slice
shrink_axis_mask=shrink_axis_mask, name=name)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/pxsj/conda_envs/server/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): slice index 262144 of dimension 0 out of bounds.
[[node tower_0/BiseNetV2/seg_head_segmentation_loss/strided_slice (defined at /pxsj/fancp/ZLC/bisenetv2-tensorflow/bisenet_model/bisenet_v2.py:905) ]]
[[{{node ConstantFoldingCtrl/miou/mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Switch_0}}]]

about coco db training

Hi,
thank you for your great work.
I'd like to train your model on COCO db, any suggestion for dataset preparing and training tips?
thanks~

Poor inference result

Hi @MaybeShewill-CV
Thanks for your amazing work.

I tested your pre-trained model on the images in the repository, it works well. Unfortunately, on other images, it performs poorly.

这项目结构真的牛，佩服！！！

Could you share your released bisenetv2 model in baidu or google cloud?

针对小目标的识别分割应该从哪里下手优化

你好 ,针对小目标的识别分割应该从哪里下手优化

inference time

Thanks for the great work!
What is the inference time of the BiSeNetV2 model with input size 1024*512?
According to CoinCheung'repo, the speed of v2 is no different than v1 (https://github.com/CoinCheung/BiSeNet/tree/master/tensorrt). So, can anyone reproduce the results of 156fps in the paper?

Error while training

change batch size into 8, and training with command
CUDA_VISIBLE_DEVICES="0, 1" python tools/cityscapes/train_bisenetv2_cityscapes.py
I met this errors

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: slice index 262144 of dimension 0 out of bounds.
[[{{node tower_3/BiseNetV2/stage_1_segmentation_loss/strided_slice}}]]
[[ConstantFoldingCtrl/miou/mean_iou/confusion_matrix/assert_less/Assert/AssertGuard/Switch_0/_336]]
(1) Invalid argument: slice index 262144 of dimension 0 out of bounds.
[[{{node tower_3/BiseNetV2/stage_1_segmentation_loss/strided_slice}}]]
0 successful operations.
1 derived errors ignored.

It works until epoch 80 but suddenly stop.
How can I fix it?
And can you tell what MIN_SAMPLE_NUMS in cityscapes_bisenetv2.yaml mean?

是不是对一些细小的识别不太好,或是需要做哪些方面的优化

about training new custom data set

I have trained new data set it worked and successful and all imag is resize 960*540。

but when i try to train smaller dataset: every img is 896 * 544, it is always print error:slice index 262144 of dimension 0 out of bounds。and i changed even smaller image size: 512*320, it print same error。

somewhere should i need to do?

Questions about annotations method and train.py

I have two questions.
Which method did you use to annotate? In my annotation method (VGG Image Annotator ), the "json" file could be created, but the image after annotation "...gtFine_ The "labelTrainIds.png" could not be created.

When I run it, I get the following error.
From the downloaded file "cityscapes_bisenetv2.yaml" Only the path to the dataset has changed. The dataset uses the downloaded files; the rest of the dataset remains unchanged.
The dataset uses the file you downloaded. (\data\example_dataset\cityscapes)

python tools/cityscapes/train_bisenetv2_cityscapes.py
Error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 2097152 values, but the requested shape has 262144
[[{{node Reshape_1}}]]
[[graph_input_node/val_IteratorGetNext]]

Quantized Model?

Hi,

I am trying to use the trained model in my TPU, so I need to quantize the model (using TF1).

I think I must add the line contrib_quantize.create_training_graph(input_graph=self._sess.graph, quant_delay=0) inside the __init__ function of the file BiseNetV2CityScapesTrainer.py. However it is a bit unclear to me exactly where .

Would you have any recommendation?
Thanks

issues about don't have enough ROM

I don't have enough ROM , so i have to adjust the data_provider in training scripts into /data_provider/cityscapes_reader.

but i dont know how to modify the code in cityscapes_bisenetv2_single_gpu_trainner.py

can u help me?

thank you so much

Issue when resuming training

Trying to resume training in Cityscapes of a Bisenetv2 from a checkpoint from previous training of around 100 epochs, the validation and training mIoUs drop to much lower levels than before, instead of continuing improving them.

I've tried both disabling warm up and modifying learning rate manually in /config/cityscapes/cityscapes_bisenetv2.yaml, but the problem still occurs. Do you know why this behaviour occurs, and what could be made to fix it? Should some additional modification be done in the yaml file to resume training? Thanks!

When the tfrecord file contains images of different shapes, an error is reported, how should I modify the code

Traceback (most recent call last):
File "/projects/bisenetv2-tensorflow/train_bisenetv2_cityscapes.py", line 42, in
train_model()
File "/projects/bisenetv2-tensorflow/train_bisenetv2_cityscapes.py", line 33, in train_model
worker.train()
File "/projects/bisenetv2-tensorflow/trainner/cityscapes_bisenetv2_single_gpu_trainner.py", line 258, in train
self._loss, self._global_step
File "/opt/conda/envs/tf/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/opt/conda/envs/tf/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/opt/conda/envs/tf/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/opt/conda/envs/tf/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 1676700 values, but the requested shape has 1920000
[[{{node Reshape}} = Reshape[T=DT_UINT8, Tshape=DT_INT32, _device="/device:CPU:0"](DecodePng, Reshape/shape)]]
[[node graph_input_node/train_IteratorGetNext (defined at /projects/bisenetv2-tensorflow/data_provider/cityscapes_tf_io.py:268) = IteratorGetNextoutput_shapes=[[1,400,400,3], [1,400,400,1]], output_types=[DT_FLOAT, DT_UINT8], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
[[{{node miou/mean_iou/confusion_matrix/assert_less_1/Assert/AssertGuard/Assert/_218}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_4628_...ard/Assert", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

Performance issues in /data_provider (by P3)

Hello! I've found a performance issue in /data_provider: batch() should be called before map(), which could make your program more efficient. Here is the tensorflow document to support it.

Detailed description is listed below:

in /celebamask_hq/celebamask_hq_tf_io.py: dataset.batch(batch_size=batch_size, drop_remainder=True)(line 263) should be called before dataset.map(...)(line 240), dataset = dataset.map(...)(line 245) and dataset = dataset.map(...)(line 250).
in /cityscapes/cityscapes_tf_io.py: dataset.batch(batch_size=batch_size, drop_remainder=True)(line 263) should be called before dataset.map(...)(line 240), dataset.map(...)(line 245) and dataset.map(...)(line 250).

Besides, you need to check the function called in map()(e.g., aug.preprocess_image_for_val called in dataset = dataset.map(map_func=aug.preprocess_image_for_val, num_parallel_calls=CFG.DATASET.CPU_MULTI_PROCESS_NUMS)) whether to be affected or not to make the changed code work properly. For example, if aug.preprocess_image_for_val needs data with shape (x, y, z) as its input before fix, it would require data with shape (batch_size, x, y, z).

Looking forward to your reply. Btw, I am very glad to create a PR to fix it if you are too busy.

请教一下为什么训练时GPU使用率才百分之几

Errors in tools/cityscapes/make_cityscapes_tfrecords.py

When I use this command to test, I got error:
The following error occurred. Please point out what needs to be fixed.

I think the problem is where to place the train.txt. I put it in several directories, but I don't know the right answer, so I'm not sure where to put the file. Please tell me if I should.

python python tools/cityscapes/make_cityscapes_tfrecords.py

Error:
[Errno 2] No such file or directory: '/data/example_dataset/cityscapes/image_file_index/train.txt'
Traceback (most recent call last):
File "tools/cityscapes/make_cityscapes_tfrecords.py", line 32, in
generate_tfrecords()
File "tools/cityscapes/make_cityscapes_tfrecords.py", line 22, in generate_tfrecords
io = cityscapes_tf_io.CityScapesTfIO()
File "C:\Users\students\Desktop\rakuseki\seg\bisenetv2-tensorflow-master\bisenetv2-tensorflow-master\tools\cityscapes\data_provider\cityscapes\cityscapes_tf_io.py", line 279, in init
self._writer = _CityScapesTfWriter()
File "C:\Users\students\Desktop\rakuseki\seg\bisenetv2-tensorflow-master\bisenetv2-tensorflow-master\tools\cityscapes\data_provider\cityscapes\cityscapes_tf_io.py", line 68, in init
self._load_train_val_image_index()
File "C:\Users\students\Desktop\rakuseki\seg\bisenetv2-tensorflow-master\bisenetv2-tensorflow-master\tools\cityscapes\data_provider\cityscapes\cityscapes_tf_io.py", line 90, in _load_train_val_image_index
raise err
File "C:\Users\students\Desktop\rakuseki\seg\bisenetv2-tensorflow-master\bisenetv2-tensorflow-master\tools\cityscapes\data_provider\cityscapes\cityscapes_tf_io.py", line 79, in _load_train_val_image_index
with open(self._train_image_index_file_path, 'r') as file:
FileNotFoundError: [Errno 2] No such file or directory: '/data/example_dataset/cityscapes/image_file_index/train.txt'

suggest to isolate pretrained model with codes

hi，
预训练的模型放在代码库里，git clone的时候非常慢...
建议单独提供下载链接...
现在同步代码太难了

多谢~

Errors in tools/cityscapes/test_bisenetv2_cityscapes.py

When I use this command to test, I got error:
The following error occurred. Please point out what needs to be fixed.

python tools/cityscapes/test_bisenetv2_cityscapes.py --weights_path ./model/cityscapes/bisenetv2/cityscapes.ckpt.data-00000-of-00001 --src_image_path ./data/test_image/test_01.png

Error:
2020-06-26 22:22:14.113271: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-06-26 22:22:14.113348: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "tools/cityscapes/test_bisenetv2_cityscapes.py", line 20, in
from bisenet_model import bisenet_v2
ImportError: No module named 'bisenet_model'

The weight files and input image files are located in the following paths.
--weights_path ./model/cityscapes/bisenetv2/cityscapes.ckpt.data-00000-of-00001
--src_image_path ./data/test_image/test_01.png

Error while training of proprietary data

change dataset to proprietary data (1920*1080 pixel), and training with command

CUDA_VISIBLE_DEVICES="0, 1, 2, 3" python tools/cityscapes/train_bisenetv2_cityscapes.py

The program worked fine when using the cityscapes dataset (20481024 pixel).However, when I used my own data (19201080 pixel), I got the following error
The TRAIN_CROP_SIZE and EVAL_CROP_SIZE in cityscapes_bisenetv2.yaml have also been changed as follows
Could you please tell me where the problem is?

Error;
InvalidArgumentError (see above for traceback): slice index 262144 of dimension 0 out of bounds.

There was a similar question on the #19 issues, but I didn't know what exactly to do with the explanation there.

请教一下cityscapes数据集预处理的问题

对于训练集图片：
1.先随机水平翻转
2.然后对图片进行随机尺度的缩放，{ 0.75, 1, 1.25, 1.5, 1.75, 2.0}
3.随机裁剪为512x1024
4.归一化操作和Normalize
以上操作除了4，标签采用相同的操作。
对于验证集图片：
1.将图片缩小为512x1024，标签不变化
2.归一化操作和Normalize
最后将预测结果邻近插值为标签大小，计算miou。
不知道我的理解有没有问题，希望作者能够指教一下。我用pytorch实现的代码最多也就60%的miou。不知道是哪里出了问题

inference time

In time profile model, you describe
'timeprofile_cityscapes_bisenetv2.py' do

Convert the onnx model into tensorrt engine
Run origin tensorflow frozen model for 500 times to calculate a mean inference time comsuming and fps.
Run accelarated tensorrt engine for 500 times to calculate a mean inference time comsuming and fps.
Calculate the model's gflops statics.

when I try to get inference time, original tensorflow graph(step2 above) and acclerated tensorrt engine(step3 abve) get similar result (inference cost 0.007-0.0080, inference fps 119-130).
But according to tensorrt, shouldn't it have a faster speed than the original?
Did I misunderstand the description of the steps?

为什么训练中会出现 Train loss: 3.29611 Val loss:nan... val loss 怎么运算出了 nan

为什么训练中会出现 Train loss: 3.29611 Val loss:nan...
val loss 怎么运算出了 nan

Hello, when I run your code, the following problems always appear, how can I solve it? Looking forward to your reply, thank you very much

Hello, when I run your code, the following problems always appear, how can I solve it? Is it a problem during packaging?Looking forward to your reply, thank you very much.

ModuleNotFoundError: No module named 'trainner'
ModuleNotFoundError: No module named 'local_utils'

train my custom data

hi, it is a good joob!

when i tainning my own data , the data that are not in same sizes, translate the data to tfrecords successed, and then trained it turns:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 518400 values, but the requested shape has 262144.

question: how to train custom data and the trainning data must in same size as cityscapes in (2048,1024)?

thanks

checkpoint file of the pretrained model

Thank you very much for your excellent work. Could you please provide the checkpoint file of the pretrained model? thank you.

more detailed guideline

Thank you for your invitation. I am new to deeplearning, and I know little about this field. I just want to implement semantic segmentation in my reasearch, not to explore in depth. Could you please show the accuracy about your codes experimented on cityscapes dataset, and give me more specified introduction to use your codes? Looking forward to your reply.

关于cityscapes预训练模型的checkpoint文件

您好，首先非常感谢您的工作，结构非常清晰，平易近人，真的是很优秀的实现。
请问下方便放出cityscapes预训练模型的checkpoint文件吗？

请教模型几个参数的作用

你好,请问以下几个参数的作用是什么
GE_EXPAND_RATIO: 6
SEMANTIC_CHANNEL_LAMBDA: 0.25
SEGHEAD_CHANNEL_EXPAND_RATIO: 2

A question about GFLops

the GFLops of Paper is 21.15 with the input of spatial size is 2048 x 1024, but why your GLops is 34.53 ??? @MaybeShewill-CV

OSError: Config file: ./config/cityscapes/cityscapes_bisenet.yaml, can not be read

OSError: Config file: ./config/cityscapes/cityscapes_bisenet.yaml, can not be read， I don't know why.

Bisenet v2 does not work when using the cityscape dataset, the result has always been NAN, and using my own dataset miou is only 0.49

retrain issue with CityScapes

Hi, Thanks for your good jobs, i have a question, when i try to train with CitySpacesReader but not use tfrecord, i found it will OOM. so i change patch size from 20181024 to 2014512, change OHEM to
OHEM:
ENABLE: True
SCORE_THRESH: 0.65
MIN_SAMPLE_NUMS: 340788
it seems work fine,

train loss: 0.91353, miou: 0.65194: 100%|████████████████████████████████████████████████████████████████████| 185/185 [00:56<00:00, 3.30it/s]
2020-12-30 18:32:58.879 | INFO | trainner.cityscapes.cityscapes_bisenetv2_single_gpu_trainner:train:307 - => Epoch: 876 Time: 2020-12-30 18:32:58 Train loss: 0.91419 Train miou: 0.65191 ...

but when i do evaluation, the output is total different with your pretrained model. do you know how to resolve it?

the loss is always nan

4:36:20 Train loss: nan Train miou: nan Val loss: 15.78061 Val miou: 0.00047...

when train，the loss is always nan ，how to solve this problem

About train miou and val miou

Hi @MaybeShewill-CV , I had been finished model-training, and evaluated my model on the whole cityscapes validation dataset used your train code and val code with no changes. But i have a problem why train miou only achieve 0.68 and val miou only achieve 0.53 in the training process, but when I evaluated my model the miou can achieve 71.3 ? Thanks

the image is train results:

this one is the results of evaluate on whole validation dataset:

Explanation of the data augmentation strategy

Hi,

Could you explain the data augmentation strategy, what the following parameters mean:
FIX_RESIZE_SIZE: [720, 720] # (width, height), for unpadding
INF_RESIZE_VALUE: 500 # for rangescaling
MAX_RESIZE_VALUE: 600 # for rangescaling
MIN_RESIZE_VALUE: 400 # for rangescaling
MAX_SCALE_FACTOR: 2.0 # for stepscaling
MIN_SCALE_FACTOR: 0.75 # for stepscaling
SCALE_STEP_SIZE: 0.25 # for stepscaling

If you fix crop size for training and evaluating (by TRAIN_CROP_SIZE and EVAL_CROP_SIZE parameters), then why do you need above parameters ?

你好我无法解决这个问题

WARNING:tensorflow:From tools/cityscapes/test_bisenetv2_cityscapes.py:120: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /home/hdp123/bisenetv2-tensorflow/bisenet_model/bisenet_v2.py:1187: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /home/hdp123/bisenetv2-tensorflow/bisenet_model/cnn_basenet.py:71: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.

WARNING:tensorflow:From /home/hdp123/bisenetv2-tensorflow/bisenet_model/cnn_basenet.py:411: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.BatchNormalization instead. In particular, tf.control_dependencies(tf.GraphKeys.UPDATE_OPS) should not be used (consult the tf.keras.layers.batch_normalization documentation).
WARNING:tensorflow:From /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/tensorflow_core/python/layers/normalization.py:327: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.__call__ method instead.
WARNING:tensorflow:From /home/hdp123/bisenetv2-tensorflow/bisenet_model/cnn_basenet.py:222: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /home/hdp123/bisenetv2-tensorflow/bisenet_model/cnn_basenet.py:246: The name tf.nn.avg_pool is deprecated. Please use tf.nn.avg_pool2d instead.

WARNING:tensorflow:From /home/hdp123/bisenetv2-tensorflow/bisenet_model/bisenet_v2.py:585: The name tf.image.resize_bilinear is deprecated. Please use tf.compat.v1.image.resize_bilinear instead.

WARNING:tensorflow:From tools/cityscapes/test_bisenetv2_cityscapes.py:133: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From tools/cityscapes/test_bisenetv2_cityscapes.py:137: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2024-03-12 11:51:00.323528: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2024-03-12 11:51:00.346827: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3110400000 Hz
2024-03-12 11:51:00.347344: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5ee58a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-03-12 11:51:00.347364: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2024-03-12 11:51:00.348556: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2024-03-12 11:51:00.467029: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-03-12 11:51:00.467490: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5dc8ca0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-03-12 11:51:00.467508: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3060 Laptop GPU, Compute Capability 8.6
2024-03-12 11:51:00.467623: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-03-12 11:51:00.467908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: NVIDIA GeForce RTX 3060 Laptop GPU major: 8 minor: 6 memoryClockRate(GHz): 1.425
pciBusID: 0000:01:00.0
2024-03-12 11:51:00.467997: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468042: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468079: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468114: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468147: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468181: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468216: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/hdp123/anaconda3/envs/Unet/lib/python3.7/site-packages/cv2/../../lib64:/opt/ros/melodic/lib::/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
2024-03-12 11:51:00.468220: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2024-03-12 11:51:00.468230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-03-12 11:51:00.468234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2024-03-12 11:51:00.468237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
WARNING:tensorflow:From tools/cityscapes/test_bisenetv2_cityscapes.py:146: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

[ WARN:[email protected]] global loadsave.cpp:248 findDecoder imread_(''): can't open/read file: check file path/integrity
Traceback (most recent call last):
File "tools/cityscapes/test_bisenetv2_cityscapes.py", line 193, in
weights_path=args.weights_path
File "tools/cityscapes/test_bisenetv2_cityscapes.py", line 150, in test_bisenet_cityspaces
preprocessed_image = preprocess_image(src_image, input_tensor_size)
File "tools/cityscapes/test_bisenetv2_cityscapes.py", line 72, in preprocess_image
output_image = src_image[:, :, (2, 1, 0)]
TypeError: 'NoneType' object is not subscriptable
这是什么原因

ModuleNotFound when testing

When I try to run the test I get an error saying:

File "tools/cityscapes/test_bisenetv2_cityscapes.py", line 21, in
from local_utils.config_utils import parse_config_utils
ModuleNotFoundError: No module named 'local_utils'

It looks like it is unable to link to the directory where local_utils is.

Any idea?