
adaptive_affinity_fields's Introduction

Adaptive Affinity Fields for Semantic Segmentation

By Tsung-Wei Ke*, Jyh-Jing Hwang*, Ziwei Liu, and Stella X. Yu (* equal contribution)

Semantic segmentation has made much progress with increasingly powerful pixel-wise classifiers and incorporating structural priors via Conditional Random Fields (CRF) or Generative Adversarial Networks (GAN). We propose a simpler alternative that learns to verify the spatial structure of segmentation during training only. Unlike existing approaches that enforce semantic labels on individual pixels and match labels between neighbouring pixels, we propose the concept of Adaptive Affinity Fields (AAF) to capture and match the semantic relations between neighbouring pixels in the label space. We use adversarial learning to select the optimal affinity field size for each semantic category. It is formulated as a minimax problem, optimizing our segmentation neural network in a best worst-case learning scenario. AAF is versatile for representing structures as a collection of pixel-centric relations, easier to train than GAN and more efficient than CRF without run-time inference. Our extensive evaluations on PASCAL VOC 2012, Cityscapes, and GTA5 datasets demonstrate its above-par segmentation performance and robust generalization across domains.
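As a rough, illustrative sketch of the pixel-centric relations AAF is built on (not the repository's implementation), the NumPy snippet below computes the ground-truth affinity field of a label map: for every pixel and each of its eight neighbours at distance k, it records whether the two pixels carry the same semantic label. The AAF loss then, roughly speaking, pushes predicted label relations to agree with these affinities, with the field size per category selected adversarially.

import numpy as np

def label_affinity_field(labels, k=1):
    """Binary affinities between each pixel and its 8 neighbours at distance k.

    labels: (H, W) integer label map.
    Returns an (H, W, 8) array: 1 where centre and neighbour share a label,
    0 otherwise; neighbours that fall outside the image never match.
    """
    h, w = labels.shape
    # Pad with an invalid label so out-of-image neighbours compare unequal.
    padded = np.pad(labels, k, mode='constant', constant_values=-1)
    offsets = [(-k, -k), (-k, 0), (-k, k),
               (0, -k),           (0, k),
               (k, -k),  (k, 0),  (k, k)]
    field = np.zeros((h, w, 8), dtype=np.uint8)
    for i, (dy, dx) in enumerate(offsets):
        neighbour = padded[k + dy:k + dy + h, k + dx:k + dx + w]
        field[:, :, i] = (neighbour == labels).astype(np.uint8)
    return field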

AAF was published at ECCV 2018; see our paper for more details.

  • Multi-GPU SyncBatchNorm has been released!

Prerequisites

  1. Linux
  2. Python 2.7 or Python 3 (>= 3.5)
  3. CUDA 8.0 and cuDNN 6

Required Python Packages

  1. tensorflow 1.4 (versions >= 1.6 may cause OOM errors)
  2. numpy
  3. scipy
  4. tqdm
  5. PIL
  6. opencv

Data Preparation

ImageNet Pre-Trained Models

Download ResNet101.v1 from TensorFlow-Slim.

Training

  • Baseline Models:
python pyscripts/train/train.py
  • Baseline Models (Multi-GPUs):
python pyscripts/train/train_mgpu.py
  • Affinity:
python pyscripts/train/train_affinity.py
  • Affinity (Multi-GPUs):
python pyscripts/train/train_affinity_mgpu.py
  • AAF:
python pyscripts/train/train_aaf.py
  • AAF (Multi-GPUs):
python pyscripts/train/train_aaf_mgpu.py

Inference

  • Single-Scale Input only:
python pyscripts/inference/inference.py
  • Multi-Scale Inputs and Left-Right Flipping (opencv is required):
python pyscripts/inference/inference_msc.py
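The multi-scale script averages predictions over several input scales and a left-right flip. The sketch below only illustrates that averaging scheme; predict_fn, the scale list, and the use of cv2.resize are assumptions for the example, not the repository's code.

import cv2
import numpy as np

def predict_msc(image, predict_fn, scales=(0.5, 0.75, 1.0, 1.25, 1.5), flip=True):
    """Average per-pixel class probabilities over multiple scales and a flip.

    image: (H, W, 3) uint8 image.
    predict_fn: hypothetical callable mapping an image to probabilities
                of shape (h, w, num_classes).
    """
    h, w = image.shape[:2]
    total, count = None, 0
    for s in scales:
        scaled = cv2.resize(image, None, fx=s, fy=s, interpolation=cv2.INTER_LINEAR)
        inputs = [scaled, np.ascontiguousarray(scaled[:, ::-1])] if flip else [scaled]
        for i, inp in enumerate(inputs):
            probs = predict_fn(inp)
            if i == 1:
                # Undo the left-right flip on the prediction.
                probs = np.ascontiguousarray(probs[:, ::-1])
            # Resize probabilities back to the original resolution and accumulate.
            probs = cv2.resize(probs, (w, h), interpolation=cv2.INTER_LINEAR)
            total = probs if total is None else total + probs
            count += 1
    return total / count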

Benchmarking

  • mIoU (mean Intersection over Union; see the sketch after this section):
python pyscripts/benchmark/benchmark_by_mIoU.py
  • Instance-wise mIoU:
python pyscripts/benchmark/benchmark_by_instance.py

See our bash script examples for the corresponding input arguments.
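For reference, the mIoU benchmark reduces to accumulating a per-class confusion matrix over the whole evaluation set and averaging per-class IoU. The NumPy sketch below shows that standard formula; it is not the repository's benchmark script.

import numpy as np

def confusion_matrix(gt, pred, num_classes, ignore_label=255):
    """Accumulate a (num_classes x num_classes) confusion matrix for one image."""
    mask = gt != ignore_label
    idx = num_classes * gt[mask].astype(np.int64) + pred[mask].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def mean_iou(conf):
    """mIoU = mean over classes of TP / (TP + FP + FN)."""
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp
    fn = conf.sum(axis=1) - tp
    denom = tp + fp + fn
    iou = np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)
    return np.nanmean(iou)

# Usage: conf = sum(confusion_matrix(g, p, 21) for g, p in label_pairs); print(mean_iou(conf))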

Citation

If you find this code useful for your research, please consider citing our paper Adaptive Affinity Fields for Semantic Segmentation.

@inproceedings{aaf2018,
 author = {Ke, Tsung-Wei and Hwang, Jyh-Jing and Liu, Ziwei and Yu, Stella X.},
 title = {Adaptive Affinity Fields for Semantic Segmentation},
 booktitle = {European Conference on Computer Vision (ECCV)},
 month = {September},
 year = {2018} 
}

License

AAF is released under the MIT License (refer to the LICENSE file for details).

adaptive_affinity_fields's People

Contributors

buttomnutstoast, jyhjinghwang, twke18


adaptive_affinity_fields's Issues

How to calculate boundary recall?

The paper does not give details on how boundary recall (BR) is calculated, and I cannot find any clue in the code either.
You only cite Contour Detection and Hierarchical Image Segmentation.
But that paper describes how to produce a precision-recall curve rather than a single concrete value. So I wonder, how is BR computed here?
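One common way to reduce the boundary precision-recall framework to a single number is to fix a pixel tolerance and measure the fraction of ground-truth boundary pixels that have a predicted boundary pixel within that tolerance. The sketch below is only a guess at such a computation, not the authors' definition of BR.

import numpy as np
from scipy.ndimage import binary_dilation

def seg_boundary(labels):
    """Boundary pixels: label differs from the pixel below or to the right."""
    b = np.zeros(labels.shape, dtype=bool)
    b[:-1, :] |= labels[:-1, :] != labels[1:, :]
    b[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    return b

def boundary_recall(gt_labels, pred_labels, tolerance=2):
    """Fraction of ground-truth boundary pixels matched by a predicted
    boundary pixel within `tolerance` pixels."""
    gt_b = seg_boundary(gt_labels)
    pred_b = seg_boundary(pred_labels)
    # Dilate the predicted boundary so a match within `tolerance` counts.
    struct = np.ones((2 * tolerance + 1, 2 * tolerance + 1), dtype=bool)
    pred_dilated = binary_dilation(pred_b, structure=struct)
    if gt_b.sum() == 0:
        return 1.0
    return float((gt_b & pred_dilated).sum()) / float(gt_b.sum())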

The question about dataset

Hi, should I rename the folder VOCdevkit/VOC2012/SegmentationClass to VOCdevkit/VOC2012/segcls? There is no segcls folder in the original VOC2012 dataset.

Time overhead

Hi, I've adapted the code here for PyTorch and it seems to work. However, I'm seeing a large time overhead: without AAF a single epoch takes ~18 min, and with it around 29 min. Is an overhead of this magnitude to be expected?
Thanks
Kind regards,

Problem training with train.py

[TensorBoard screenshots (2018-09-29): segmentation loss, reduced loss, L2 loss]
Sorry, I am loading the pre-trained model according to the parameter settings in the paper and the sample file train_pspnet.sh in the bashscripts directory, using the PASCAL VOC dataset, but the same problem still occurs: the loss does not decrease, so the predictions on the validation set are wrong and show no content. Could you tell me how to resolve this situation? Thank you for your help.

Training with train_aaf.py does not converge, and the predictions on the validation set are wrong.

Excuse me, I have a question about running the training file train_aaf.py on the PASCAL VOC dataset: it does not converge, even though I follow the default parameter settings consistent with the paper. The predictions on the validation set are wrong; sometimes they show black images with noise points, but in most cases they are entirely black, so I suspect there is a problem with the model training. However, when I download the trained models you offer, the results on the test set are consistent with the paper. Could you please give me some advice on this situation? If you could offer more detailed parameter-setting instructions, I would be grateful.

inference

The inference code could not run. Can you give some default input parameters, so that I only need to set the input/output data directory and list? Thank you.

seg_models.models.pspnet error ?

Hi, I ran: python pyscripts/inference/inference.py
The following error message appears:
Traceback (most recent call last):
File "pyscripts/inference/inference.py", line 14, in
from seg_models.models.pspnet import pspnet_resnet101 as model
ImportError: No module named seg_models.models.pspnet

What is the problem, please?
Thanks!

AAF for binary labels

Hi, in the original implementation there are multiple labels (background, label 1, label 2, etc.). In my application I have only one class besides background, so I use a sigmoid and my logits have shape (B, H, W, 1). Should I expand my labels and predictions to (B, H, W, 2), where labels/pred[..., 1] = 1 - pred[..., 0]?

Or can I just compute on a single channel, since background and foreground are complementary in the binary case?

Thank you,
Kind regards
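For reference, expanding a single sigmoid channel into explicit background/foreground channels is straightforward; the NumPy sketch below follows the shapes in the question and is not part of the repository's code.

import numpy as np

def expand_binary_probs(fg_probs):
    """(B, H, W, 1) foreground probabilities -> (B, H, W, 2) [background, foreground]."""
    return np.concatenate([1.0 - fg_probs, fg_probs], axis=-1)

def expand_binary_labels(labels):
    """(B, H, W) binary labels in {0, 1} -> (B, H, W, 2) one-hot maps."""
    return np.stack([labels == 0, labels == 1], axis=-1).astype(np.float32)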

Binary segmentation

Hello!
I am a bit confused about your approach. You take the color labels with 21 classes, convert them to a grayscale image, and then use a binary loss to optimize, is that correct?

It seems the vocseg files are missing for the multi-GPU version

Thanks for sharing the code. For the multi-GPU version, the file train_mgpu.py contains
"from vocseg.models.pspnet_mgpu import pspnet_resnet101 as model", but the vocseg files seem to be missing. Could you please update the files? Many thanks.

Question about the implementation of function ignores_from_label in layers.py

I was trying to understand how you implement AAF, but I found it difficult to understand how it works exactly. Specifically, I am not sure what you mean in the comments of ignores_from_label by 'Retrieve eight corner pixels from the center, where the center is ignored. Note that it should be bi-directional'. Why should it be bi-directional?

Thanks in advance for your help!

Questions about the function "ignores_from_label" in layers.py

As the comments of the function "ignores_from_label" in layers.py say:

"""Retrieves ignorable pixels from the ground-truth labels. This function returns a binary map in which 1 denotes ignored pixels and 0 means not ignored ones. For those ignored pixels, they are not only the pixels with label value >= num_classes, but also the corresponding neighboring pixels, which are on the the eight cornerls from a (2size+1)x(2size+1) patch.

In my opinion, this means the function filters out invalid labels in the input (e.g., the 255 label in PASCAL VOC).
But the code does not seem correct. I made a simple attempt:

import numpy as np
import tensorflow as tf

# ignores_from_label is defined in layers.py in this repository;
# adjust the import path to wherever that module lives in your setup.
from layers import ignores_from_label

size = 1              # size=1 -> a (2*size+1) x (2*size+1) = 3x3 patch
s = 2 * size + 1
N = s * s - 1         # eight neighbours per pixel

# A random 5x5 binary label map with a batch dimension of 1.
tensor = tf.constant(np.random.randint(0, 2, 25, dtype=np.int32), dtype=tf.int32)
tensor = tf.reshape(tensor, (1, 5, 5))
ignore = ignores_from_label(tensor, num_classes=5, size=size)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(np.reshape(sess.run(tensor), (5, 5)))
    print('------------------')
    print(tensor.get_shape().as_list())
    out = np.reshape(sess.run(ignore), (5, 5, N))
    for n in range(N):
        print(n)
        print(out[:, :, n].astype(np.int32))

The output is:

[[1 1 0 1 0]
 [0 1 1 0 0]
 [0 0 1 0 0]
 [1 1 1 1 0]
 [1 0 1 1 0]]

[1, 5, 5]
0
[[1 1 1 1 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 1 1 1 1]]
1
[[1 1 1 1 1]
 [0 0 0 0 0]
 [0 0 0 0 0]
 [0 0 0 0 0]
 [1 1 1 1 1]]
2
[[1 1 1 1 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 1 1 1 1]]
3
[[1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]]
4
[[1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]]
5
[[1 1 1 1 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 1 1 1 1]]
6
[[1 1 1 1 1]
 [0 0 0 0 0]
 [0 0 0 0 0]
 [0 0 0 0 0]
 [1 1 1 1 1]]
7
[[1 1 1 1 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 0 0 0 1]
 [1 1 1 1 1]]

The result is unexpected. Does it mean that the boundary pixels of the image are ignored?

Non-edge loss value is too small, edge loss is NaN/Inf

Hi, I tried your affinity loss (not adaptive) as my loss function. My network is DeepLabV3+ with a MobileNet backbone, on my own dataset, and I set margin=3.0, lambda1=1.0, lambda2=1.0.
But something is wrong with the loss: the non-edge loss is very small and does not converge.

Here is part of the non-edge loss during training:
Mean Aff Loss is:[6.15826357e-05] Mean Aff Loss is:[7.15486458e-05] Mean Aff Loss is:[4.56848611e-05] Mean Aff Loss is:[5.51421945e-05] Mean Aff Loss is:[7.94407606e-05] Mean Aff Loss is:[0.000143873782] Mean Aff Loss is:[6.04316447e-05] Mean Aff Loss is:[9.94381699e-05] Mean Aff Loss is:[0.000107184518] Mean Aff Loss is:[6.87552383e-05] Mean Aff Loss is:[7.98113e-05] Mean Aff Loss is:[0.000122067388] Mean Aff Loss is:[5.42108719e-05]

As for the edge loss, it reports NaN or Inf at the beginning of training. It troubles me a lot :(

Could anyone give some advice?
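A frequent cause of NaN/Inf in losses that take the log of predicted probabilities is log(0) when a probability saturates; clamping before the log usually avoids it. The snippet below is a generic stability guard, an assumption about the failure mode rather than a diagnosis of this repository's loss.

import numpy as np

EPS = 1e-8

def safe_log(p, eps=EPS):
    """Log of probabilities clamped away from 0 and 1 to avoid inf/NaN."""
    return np.log(np.clip(p, eps, 1.0 - eps))

def stable_bce(p, q, eps=EPS):
    """-q*log(p) - (1-q)*log(1-p), element-wise, finite even for p == 0 or 1."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(q * np.log(p) + (1.0 - q) * np.log(1.0 - p))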

Problem with inference.py

Hi,
Could you provide your trained model on VOC 2012? I could not find it.
Thank you very much!
