
cutmix-semisup-seg's Issues

How to run the jobs on multiple GPUs.

Great work on the semi-supervised semantic segmentation problem!

After checking the implementation, we could not find any code that supports running jobs on multiple GPUs.

We only found that the job function in job_helper.py exposes a "num_gpus" parameter, as shown below:

def job(job_name, enumerate_job_names=True):
    """
    Decorator to turn a function into a job submitter.

    Usage:

    >>> @job('wait_some_time')
    ... def wait_some_time(submit_config: SubmitConfig, iteration_count):
    ...     # Create a run context (hides low level details, exposes simple API to manage the run)
    ...     with dnnlib.RunContext(submit_config) as ctx:
    ...
    ...         fn = os.path.join(submit_config.run_dir, "output.txt")
    ...         with open(fn, 'w') as f:
    ...             f.write("Works!")
    ...
    ...         print('Training...')
    ...         for i in range(iteration_count):
    ...             if ctx.should_stop():
    ...                 break
    ...
    ...             time.sleep(1.0)
    ...             ctx.update(loss='%2f' % i, cur_epoch=i, max_epoch=iteration_count)

    To submit a job:

    >>> wait_some_time.submit(on='local', job_desc='description_to_identify_specific_job', iteration_count=50)

    :param job_name: The name to be given to the job
    :param module_name: If necessary, name the module in which the job function resides
    :param docker_image: Provide the path to the docker image required for this job
    :param num_gpus: The number of GPUs required
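
As a concrete illustration of what we are hoping for, here is a minimal sketch of our own (not part of the repository) showing how the network built inside a training job could be replicated across the visible GPUs with torch.nn.DataParallel; student_net is just a hypothetical name for the segmentation model:

import torch
import torch.nn as nn

def to_multi_gpu(student_net: nn.Module) -> nn.Module:
    """Move a model to the first GPU and, if several are visible, replicate it with DataParallel."""
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    student_net = student_net.to(device)
    if torch.cuda.device_count() > 1:
        # DataParallel splits each input batch across the GPUs and gathers the outputs on cuda:0.
        student_net = nn.DataParallel(student_net)
    return student_net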

Pretrained models

Hi @Britefury,

Thank you for making the code publicly available.

Can you share the pretrained models for the deeplabv2 architecture (Cutmix, Cutout, VAT and ICT) please?
We are having trouble reproducing the results.

Thanks in advance.

Question about the consistency weight.

Hi. In Mean Teacher, the consistency weight is 100, but in this work all the consistency weights are 1. Isn't this value too small?
Can you share the details of how you set this parameter? I have also seen other work (such as CPS, CVPR 2021) use a consistency weight of around 100 when reproducing the Mean Teacher method.

Looking forward to your help.

best,

Small concerns about the experiments in Table 4

Really nice work and super impressive results in the low-data regime (shown in Table 4, also pasted below)!

[Screenshot: Table 4 from the paper, with the results in question circled in red]

We have some small concerns about the results marked with red circles:

  1. In the 1/100 subset column, the DeepLabv3+/PSPNet baselines reach only 37.95%/36.69%, while your method achieves 59.52%/67.20% respectively. We would really appreciate a detailed explanation of why your method achieves such huge gains! One of our small concerns is that applying the CutMix + Mean Teacher scheme to the supervised baseline without using unlabeled images might be a more reasonable baseline setting. It would be great if you could share some results for such a setting.

  2. In the 1/8 subset column, the baseline performance of DeepLabv3+ is slightly worse than that of PSPNet. In our experience, DeepLabv3+ should perform much better. Could you share an explanation for this?

  3. In the full-set column, we observe that the proposed method performs slightly worse than the baseline. Could you comment on the possible reasons?

  4. According to your code, all the experiments freeze the BN statistics and use a 321x321 crop size on a single GPU. Do you have any plans to train, or have you ever trained, your method in a stronger setting such as a 512x512 crop size + SyncBN + 8x V100 GPUs? MMSegmentation or openseg.pytorch might be good candidate codebases.

Many thanks for your valuable time; we look forward to your explanation!

What is the "mask_arr" in the code?

Hi. Thanks for your interesting work!

I have a question about the following code:

if self.mask_flag:
    if self.pipeline_type == 'pil':
        sample0['mask_pil'] = Image.new('L', size_xy, 255)
    elif self.pipeline_type == 'cv':
        sample0['mask_arr'] = np.full(size_xy[::-1], 255, dtype=np.uint8)
    else:
        raise RuntimeError

What is the "mask_arr" here? All elements of this array are set to 255. Why should we define it here?

In the main experiment file, "mask_arr" is converted to "mask" and used as follows:

batch_um0 = unsup_batch0['mask'].to(torch_device)
batch_um1 = unsup_batch1['mask'].to(torch_device)
batch_um_mixed = batch_um0 * (1 - batch_mix_masks) + batch_um1 * batch_mix_masks
loss_mask = batch_um_mixed

In the above code, a "loss_mask" is generated for the unlabeled loss, i.e., the consistency loss. Are all elements of "loss_mask" equal to 1?
Can you explain what it does?
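
For what it's worth, here is a small standalone check of the arithmetic above (my own sketch, not the repository's code): if both unlabeled masks are entirely 255, i.e. 1.0 after normalization, then the mixed loss mask is also all ones, whatever the CutMix mask is.

import torch

# Hypothetical shapes: batch of 2, one channel, 4x4 masks.
batch_um0 = torch.ones(2, 1, 4, 4)   # mask of the first unlabeled batch (255 -> 1.0 after normalization)
batch_um1 = torch.ones(2, 1, 4, 4)   # mask of the second unlabeled batch
batch_mix_masks = (torch.rand(2, 1, 4, 4) > 0.5).float()   # an arbitrary CutMix mask

batch_um_mixed = batch_um0 * (1 - batch_mix_masks) + batch_um1 * batch_mix_masks
assert torch.all(batch_um_mixed == 1.0)   # mixing two all-ones masks gives all ones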

Thanks in advance.

Confused about the "loss_mask"

Hi, really nice codebase!

We have tried to understand the details of your implementation and find the usage of the loss mask a little bit confusing (as shown below).

batch_ux0 = unsup_batch0['image'].to(torch_device)
batch_um0 = unsup_batch0['mask'].to(torch_device)
batch_ux1 = unsup_batch1['image'].to(torch_device)
batch_um1 = unsup_batch1['mask'].to(torch_device)

First, I have checked the code carefully and found that you define the 'mask' for the unlabeled images in the following code snippet:

if self.mask_flag:
    if self.pipeline_type == 'pil':
        sample0['mask_pil'] = Image.new('L', size_xy, 255)
    elif self.pipeline_type == 'cv':
        sample0['mask_arr'] = np.full(size_xy[::-1], 255, dtype=np.uint8)
    else:
        raise RuntimeError

Therefore, we can see that the initial 'mask' for every image is an array of 255s with the same size as the input image. Then you read the mask value as follows:

image, labels, mask, xf = sample0['image_arr'], sample0.get('labels_arr'), sample0.get('mask_arr'), sample0.get('xf_cv')

According to my understanding, some of the 'mask' values become zero only when we apply the cv2.warpAffine transform (as shown below):

if 'mask_arr' in sample0:
    sample0['mask_arr'] = cv2.warpAffine(sample0['mask_arr'], local_xf[0], self.crop_size[::-1], flags=interpolation, borderValue=0, borderMode=cv2.BORDER_CONSTANT)
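
To check this, I ran a small standalone experiment (my own code, not the repository's) showing that warping an all-255 mask with borderValue=0 zeroes exactly the pixels that fall outside the source image:

import cv2
import numpy as np

mask = np.full((8, 8), 255, dtype=np.uint8)    # all-255 mask, as in the dataset pipeline
shift = np.float32([[1, 0, 3], [0, 1, 0]])     # translate 3 pixels to the right
warped = cv2.warpAffine(mask, shift, (8, 8), flags=cv2.INTER_NEAREST,
                        borderValue=0, borderMode=cv2.BORDER_CONSTANT)

print(warped[0, :3])   # [0 0 0]        -> padding pixels introduced by the warp
print(warped[0, 3:])   # [255 ... 255]  -> pixels that still come from the source mask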

If my understanding is correct, the batch_um0 and batch_um1 might not be necessary if we do not apply the warpAffine transformation.

So my first question is about the influence of affine.cat_nx2x3 (shown below) on the final performance.

# Build affine transformation matrix
local_xf = affine.cat_nx2x3(
    affine.translation_matrices(self.crop_size_arr[None, ::-1] * 0.5),
    affine.rotation_matrices(rot_theta),
    affine.scale_matrices(scale_factor_yx[None, ::-1]),
    affine.translation_matrices(-centre[None, ::-1]),
)

So I guess the purpose of the loss mask is mainly to filter out the influence of the extra pixels introduced when applying the warpAffine augmentation. Is my understanding correct?
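
To make my understanding concrete, this is roughly how I picture the mask entering the consistency loss (a generic sketch of my own, not your exact implementation):

import torch
import torch.nn.functional as F

def masked_consistency_loss(student_logits, teacher_logits, loss_mask):
    """Mean per-pixel squared error between softmax outputs, ignoring masked-out pixels."""
    student_prob = F.softmax(student_logits, dim=1)
    teacher_prob = F.softmax(teacher_logits, dim=1)
    per_pixel = ((student_prob - teacher_prob) ** 2).mean(dim=1, keepdim=True)   # (N, 1, H, W)
    # Pixels where loss_mask == 0 (e.g. padding introduced by warpAffine) contribute nothing.
    return (per_pixel * loss_mask).sum() / loss_mask.sum().clamp(min=1.0)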

It would be great if you could point out any of my misunderstandings!

Questions about converting the Cityscapes dataset

Thanks for your excellent work and for kindly releasing the code. I have a problem when reproducing the Cityscapes experiment.

Following your tutorial, I have reproduced the experimental results: 44.37 vs. 51.73, a 6.79 improvement under the CutMix setting. When delving deeper into the code, I noticed that you implement a downsample_label_img function for downsampling the ground truth in convert_cityscapes.py. However, it is more common to use nearest-neighbor downsampling from cv2 or PIL, so I replaced the downsample_label_img call with the following:

y_img = cv2.resize(y_img, (1024, 512), interpolation=cv2.INTER_NEAREST)

and re-ran the Cityscapes experiment. The results degrade greatly: 43.79 vs. 47.02, an improvement of only 3.23.

I wonder why a different downsampling method affects the results so greatly, and what the motivation was for implementing a custom downsampling method rather than using a common one.
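
For reference, here is the kind of difference I have in mind (a sketch of my own; I am not claiming this is what downsample_label_img actually does): nearest-neighbor downsampling copies a single label per output pixel, whereas a majority-vote downsample over each 2x2 block can choose different labels near object boundaries.

import cv2
import numpy as np

def downsample_nearest(y_img):
    # Nearest-neighbor: each output pixel copies exactly one input label.
    return cv2.resize(y_img, (1024, 512), interpolation=cv2.INTER_NEAREST)

def downsample_majority(y_img, num_classes=34):
    # Hypothetical alternative for comparison: each output pixel takes the most
    # frequent label in the corresponding 2x2 block of the 2048x1024 input label image.
    h, w = y_img.shape
    blocks = y_img.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(num_classes)], axis=-1)
    return counts.argmax(axis=-1).astype(y_img.dtype)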

Improvement due to loss function or data augmentation?

Hi, I went through the paper and had a tough time understanding the gist of it. Is it that:

  1. CutMix for segmentation images is improving the accuracy metrics, or
  2. the consistency loss function is improving your scores?

Put more naively, is the work about CutMix for segmentation (the original is for classification), or is it about the consistency loss?
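
For context, my rough mental model of the method is the following (a sketch of my own, not the repository's code; the names and the area_frac default are just illustrative): sample a rectangular mask for each pair of unlabeled images, paste one image into the other with it, and train the student on the mixed image to match the teacher's predictions mixed with the same mask.

import torch

def random_box_masks(batch_size, height, width, area_frac=0.5):
    """Sample one rectangular CutMix mask per image, covering roughly `area_frac` of its area."""
    masks = torch.zeros(batch_size, 1, height, width)
    box_h = int(round(height * area_frac ** 0.5))
    box_w = int(round(width * area_frac ** 0.5))
    for i in range(batch_size):
        top = torch.randint(0, height - box_h + 1, (1,)).item()
        left = torch.randint(0, width - box_w + 1, (1,)).item()
        masks[i, :, top:top + box_h, left:left + box_w] = 1.0
    return masks

# mixed_image  = img0 * (1 - mask) + img1 * mask
# mixed_target = teacher_pred0 * (1 - mask) + teacher_pred1 * mask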

Requesting pretrained models

Hi, I found your paper really insightful and have been trying to replicate the results. Thanks for the code and the elaborate explanations with training scripts. I was wondering if you could share the pretrained models for the Cityscapes dataset (Cutout, CutMix, and baseline models) for comparison and evaluation.
