justchenhao / bit_cd

Official Pytorch Implementation of "Remote Sensing Image Change Detection with Transformers"

Python 99.13% Shell 0.87%

bit_cd's Issues

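Test metrics are higher than in the paper

Hello, I used the best_ckpt.pt file you provided, and the F1 on the LEVIR dataset is 0.94, which is much higher than in the paper. Was the model further optimized afterwards?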

Question about the 'stride' of the ResNet18 backbone

Hello, I have recently read your code and am very interested in your work. I have two questions:

In the paper, the stride of the last two stages of ResNet18 is replaced with 1, but I could not find the corresponding setting in the code. Line 137 of models/resnet.py has 'self. structures=[2, 2, 2, 2, 2]'. To follow the settings in the paper, should it be changed to 'self. structures=[2, 2, 1, 1]'?
Does the setting at line 100 of model/trainer.py mean that a saved model can be loaded to continue training?

Data

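Hello, I would like to ask about a problem I ran into when training on the DSIFN dataset.

This error appeared. How can I solve it?

ValueError: attempted relative import beyond top-level package

The import statements are rather disorganized, which causes this problem.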

About visualization

Hello, I would like to visualize the heatmap. I found the visualization code visulize_features, but where is it called?

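Detection results are all black

Hello, why are the test results completely black after training finishes? My input labels use the value 255. What could be the cause? Any help would be appreciated, thanks.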

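About visualizing the test outputs

Hello, does the test stage only output the jumbled composite images? Is a prediction map generated for each individual image?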

Are the results in the paper obtained with base_transformer_pos_s4_dd8 or base_transformer_pos_s4_dd8_dedim8 in the run_cd.sh file?

Hello, your work is amazing! I have a question about the paper. Transformers usually give better results only after pre-training on a large dataset. Did you pre-train your Transformer framework and then fine-tune it, or did you only load the pre-trained ResNet parameters?

Some questions about the image size during training

The paper says that the original 1024x1024 images are cut into 256x256 patches for training, but the source code appears to resize each 1024x1024 image directly to 256x256. Why does the code do this instead of following the paper?
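
One way to follow the paper's description is to tile each image instead of resizing it. The sketch below is not taken from the repository; `tile_boxes` and its parameters are names made up for illustration. It lists the non-overlapping 256x256 crop boxes of a 1024x1024 image.

```python
def tile_boxes(width, height, patch=256):
    """Return (left, upper, right, lower) boxes of non-overlapping patches.

    Hypothetical helper, not part of BIT_CD. Assumes width and height are
    multiples of `patch`; any remainder at the right/bottom edge is dropped.
    """
    boxes = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            boxes.append((left, top, left + patch, top + patch))
    return boxes


boxes = tile_boxes(1024, 1024)
print(len(boxes))   # 16 patches per 1024x1024 image
print(boxes[:2])    # [(0, 0, 256, 256), (256, 0, 512, 256)]
```

Each box uses the (left, upper, right, lower) order that PIL's `Image.crop` expects, so the same boxes can be applied to image A, image B and the label.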

Metrics

May I ask which metric in the paper each of these experiment results corresponds to?
Eval Historical_best_acc = 0.9421 (at epoch 199)

Begin evaluation...
Is_training: False. [1,24], running_mf1: 0.77105
acc: 0.89019 miou: 0.66832 mf1: 0.78176 iou_0: 0.87898 iou_1: 0.45765 F1_0: 0.93559 F1_1: 0.62793 precision_0: 0.92755 precision_1: 0.66103 recall_0: 0.94377 recall_1: 0.59798
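
The logged values are related to each other in a simple way, which can help when matching them to a table in the paper. The check below uses only numbers from the log above. It assumes, as another issue on this page states, that the paper's main F1 is the one for the change class (F1_1); the function name is illustrative.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation log above.
precision_1, recall_1 = 0.66103, 0.59798
f1_0, f1_1 = 0.93559, 0.62793
iou_0, iou_1 = 0.87898, 0.45765

# F1_1 is the F1 of the change class (label 1).
print(round(f1_from_pr(precision_1, recall_1), 4))  # 0.6279
# mf1 and miou are plain averages over the two classes.
print(round((f1_0 + f1_1) / 2, 4))                  # 0.7818
print(round((iou_0 + iou_1) / 2, 4))                # 0.6683
```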

size mismatch for transformer_decoder.layers.0.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).

The model I trained on the LEVIR dataset raises this error at test time:
Traceback (most recent call last):
File "eval_cd.py", line 58, in <module>
main()
File "eval_cd.py", line 54, in main
model.eval_models(checkpoint_name=args.checkpoint_name)
File "/tmp/pycharm_project_668/models/evaluator.py", line 158, in eval_models
self._load_checkpoint(checkpoint_name)
File "/tmp/pycharm_project_668/models/evaluator.py", line 70, in _load_checkpoint
self.net_G.load_state_dict(checkpoint['model_G_state_dict'])
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BASE_Transformer:
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
The images in the dataset are 256*256, and both training and testing followed the instructions in the README. Does anyone know how to solve this?
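
The 512 vs. 64 difference is consistent with an attention inner dimension of the form `heads * dim_head`, for example 8 x 64 in the checkpoint versus 8 x 8 in the model being built. This reading is an assumption based only on the shapes in the traceback, not on the repository code. If it holds, the checkpoint was trained with a different network option than the one used for evaluation. A small helper (hypothetical, not part of BIT_CD) can list such mismatches before calling `load_state_dict`:

```python
def shape_mismatches(ckpt_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing shapes.

    Both arguments map parameter names to shape tuples. Names present in
    only one of the two mappings are ignored here.
    """
    return {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in ckpt_shapes
        if name in model_shapes and ckpt_shapes[name] != model_shapes[name]
    }


# Shapes copied from the error message above (first decoder layer only).
prefix = "transformer_decoder.layers.0.0.fn.fn."
ckpt = {prefix + "to_q.weight": (512, 32), prefix + "to_out.0.weight": (32, 512)}
model = {prefix + "to_q.weight": (64, 32), prefix + "to_out.0.weight": (32, 64)}

for name, (saved, current) in shape_mismatches(ckpt, model).items():
    print(f"{name}: checkpoint {saved} vs model {current}")
```

With PyTorch, the two mappings can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`, once from `checkpoint['model_G_state_dict']` and once from `net_G.state_dict()`.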

WHU-CD

https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html
I can't open this page. Is there any other way to get the dataset?

How you implemented the ablation experiment

data

When I run sh scripts/run_CD.sh,
I get OSError: path to the root of LEVIR_CD dataset/list/trainval.txt not found
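
The text `path to the root of LEVIR_CD dataset` in the error looks like an unfilled placeholder for the dataset root, not a real directory, so the first thing to check is that the configured root points at a folder containing `A`, `B`, `label` and `list`. The sketch below is a hypothetical checker, not code from the repository; the root path is whatever you configured, and the list file name is the one from the error message.

```python
from pathlib import Path


def missing_dataset_parts(root, list_name="trainval.txt"):
    """Return the expected paths under `root` that do not exist.

    Assumed layout (taken from the issues on this page): root/A, root/B,
    root/label and root/list/<list_name>.
    """
    root = Path(root)
    expected = [root / "A", root / "B", root / "label", root / "list" / list_name]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    # Replace with your own dataset root.
    for path in missing_dataset_parts("path/to/LEVIR-CD"):
        print("missing:", path)
```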

WHU-CD dataset

Hello! I would like to reproduce your results. Would you mind sharing the WHU-CD dataset you have already processed? My email address is [email protected]. Thank you very much!

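For change detection, which is better: a semantic approach or a metric-based approach (STANet)?

Hello. As the author of STANet, which do you think is better: using a contrastive loss as in STANet to measure the distance between the two dates, or a semantic approach as in BIT in this work, ending with an activation function plus CE loss or BCE loss? In my experiments, the former is really unstable and depends heavily on precise parameter settings, while the latter is more stable and also performs better.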

I have a question.

Hello, this is nice code.
I am a beginner in deep learning.

When running prediction, isn't the model only performing inference?

I don't understand why labels have to be provided in addition to image 1 and image 2.

Lower metrics

I ran my experiments on a GPU server (NVIDIA-SMI 410.72). The parameters are the same as the ones you provided. My results are as follows: acc: 0.97853 miou: 0.78005 mf1: 0.86238 iou_0: 0.97787 iou_1: 0.58222 F1_0: 0.98881 F1_1: 0.73596 precision_0: 0.98745 precision_1: 0.76064 recall_0: 0.99017 recall_1: 0.71283.

TypeError: cannot pickle '_io.TextIOWrapper' object

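This error occurs for me during both training and testing. It roughly means that the txt files, which hold the image names, cannot be read with multiple threads. How can I fix it?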
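
This error typically appears when an object holding an open file handle has to be pickled, which happens when `DataLoader` worker processes are started with the spawn method (the default on Windows). Which object holds the handle in this project is not shown in the issue, so the following is only a generic illustration with hypothetical names: read the names into a plain list and close the file, so that nothing unpicklable is kept around. Setting the number of workers to 0 is another common workaround.

```python
import pickle
import tempfile
from pathlib import Path


def load_names(list_path):
    """Read one image name per line and return them as a list of strings."""
    with open(list_path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demonstration with a temporary list file.
with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "train.txt"
    list_path.write_text("img_001.png\nimg_002.png\n", encoding="utf-8")

    names = load_names(list_path)
    pickle.dumps(names)  # a plain list of strings pickles without error

    with open(list_path, "r", encoding="utf-8") as handle:
        try:
            pickle.dumps(handle)  # an open text file does not
        except TypeError as err:
            print(err)
```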

ValueError: Input and output must have the same number of spatial dimensions, but got input with with spatial dimensions of [256, 256] and output size of torch.Size([1, 256, 256, 3]). Please provide input tensor in (N, C, d1, d2, ...,dK) format and output size in (o1, o2, ...,oK) format.

ValueError: Input and output must have the same number of spatial dimensions, but got input with with spatial dimensions of [256, 256] and output size of torch.Size([1, 256, 256, 3]). Please provide input tensor in (N, C, d1, d2, ...,dK) format and output size in (o1, o2, ...,oK) format.

Windows fatal exception: code 0xc0000139 occurs, test setup failed

ERROR at setup of testt ___________________________
file D:\project_code\remote_pytorch\BIT_CD-master\main_cd.py, line 18
def testt(args):
E fixture 'args' not found

  available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
  use 'pytest --fixtures [testpath]' for help on them.

How can I solve this problem?

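Question about the demo

When testing with demo.py, I replaced the inputs with two of my own local 256*256 PNG images that differ from each other, and copied some other binary mask into the label folder to serve as the label for my test. However, the result in the predict folder is identical to the label image. In theory, the label should not affect the test result, whatever it is. Is the predict output simply the existing label copied over? Also, in theory testing should not need a label at all (provided I don't need to compute metrics).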

feature_map

Hi! This is very interesting research. I would like to reproduce your results, but I have a question about how to obtain the heat maps shown in the paper. Is that part of the code included in the project?

NaN loss. Sometimes weights initialize to -Inf when training from scratch

I've been trying to train the model on 4-band satellite images. Since I don't have 3-band RGB, I changed the ResNet input dimensions and tried to train from scratch.
In every run I get a NaN loss at some point, and often already in the first epoch. While debugging I found that the weights may be initialized to -Inf (not always, though; at some point the weights probably explode, causing NaN values).

I've spent a few hours debugging but haven't sorted it out so far.
Any ideas on why I get this behaviour?

backbone

Are the parameters of the backbone network (ResNet18) not being updated? Training is very fast.

test

How to download WHU-CD dataset?

Hi,

Can someone please help me to download the WHU-CD dataset (https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html)? The provided link is not working.

My email: [email protected]

Thanks.

IndexError: boolean index did not match indexed array along dimension 0

When I run scripts/eval.sh, there is an error.

Traceback (most recent call last):
  File "eval_cd.py", line 59, in <module>
    main()
  File "eval_cd.py", line 55, in main
    model.eval_models(checkpoint_name=args.checkpoint_name)
  File "/mnt/data/home/hanx/BIT_CD/models/evaluator.py", line 171, in eval_models
    self._collect_running_batch_states()
  File "/mnt/data/home/hanx/BIT_CD/models/evaluator.py", line 105, in _collect_running_batch_states
    running_acc = self._update_metric()
  File "/mnt/data/home/hanx/BIT_CD/models/evaluator.py", line 100, in _update_metric
    current_score = self.running_metric.update_cm(pr=G_pred.cpu().numpy(), gt=target.cpu().numpy())
  File "/mnt/data/home/hanx/BIT_CD/misc/metric_tool.py", line 56, in update_cm
    val = get_confuse_matrix(num_classes=self.n_class, label_gts=gt, label_preds=pr)
  File "/mnt/data/home/hanx/BIT_CD/misc/metric_tool.py", line 157, in get_confuse_matrix
    confusion_matrix += __fast_hist(lt.flatten(), lp.flatten())
  File "/mnt/data/home/hanx/BIT_CD/misc/metric_tool.py", line 152, in __fast_hist
    hist = np.bincount(num_classes * label_gt[mask].astype(int) + label_pred[mask],
IndexError: boolean index did not match indexed array along dimension 0; dimension is 65536 but corresponding boolean dimension is 196608

Could you please help me solve this?
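
The two sizes in the message differ by exactly a factor of three (196608 = 3 x 65536 = 3 x 256 x 256), which suggests that the label images were loaded with three channels while the prediction has one. That reading is an inference from the numbers, not something confirmed by the author. A minimal sketch of collapsing such a label to one channel before evaluation, using NumPy and a hypothetical helper name:

```python
import numpy as np


def to_single_channel(label):
    """Collapse an H x W x C label array to H x W.

    Assumes all channels carry the same mask (a grayscale mask saved as
    RGB), so the first channel is kept. 2-D inputs are returned unchanged.
    """
    label = np.asarray(label)
    if label.ndim == 3:
        label = label[:, :, 0]
    return label


rgb_label = np.zeros((256, 256, 3), dtype=np.uint8)
rgb_label[100:120, 50:80, :] = 255   # a changed region, identical in all channels

mask = to_single_channel(rgb_label)
print(rgb_label.size, mask.size)     # 196608 65536
```

Saving the converted labels back to disk as single-channel PNGs would avoid touching the evaluation code.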

Params. and FLOPs

Based on your source code, I cannot reproduce the Params. and FLOPs reported in the paper. In the paper, BIT (BIT_S4) has 3.55 M parameters, but I get 12.4 M, which is a big difference. From my reproduction, I think the main source is the ResNet backbone (Base_S4): in my results, Base_S4 alone has 11.69 M parameters.
Is there something I overlooked? I am looking forward to your answer.
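
The 11.69 M figure matches the parameter count of a complete torchvision-style ResNet18, including its last stage and the 1000-class classifier. One possible explanation for the gap, offered here as an assumption and not confirmed by the author, is that the paper counts only the stages that the S4 variant actually uses in its forward pass. The sketch below recomputes the per-stage counts of a standard ResNet18 in plain Python; the function names are illustrative.

```python
def conv_bn(c_in, c_out, k):
    """Parameters of a bias-free k x k conv followed by BatchNorm (weight + bias)."""
    return c_in * c_out * k * k + 2 * c_out


def basic_stage(c_in, c_out, downsample):
    """Two BasicBlocks; the first has a 1x1 projection shortcut if `downsample`."""
    first = conv_bn(c_in, c_out, 3) + conv_bn(c_out, c_out, 3)
    if downsample:
        first += conv_bn(c_in, c_out, 1)
    second = 2 * conv_bn(c_out, c_out, 3)
    return first + second


stem = conv_bn(3, 64, 7)
layer1 = basic_stage(64, 64, downsample=False)
layer2 = basic_stage(64, 128, downsample=True)
layer3 = basic_stage(128, 256, downsample=True)
layer4 = basic_stage(256, 512, downsample=True)
fc = 512 * 1000 + 1000

total = stem + layer1 + layer2 + layer3 + layer4 + fc
print(total)                  # 11689512, i.e. about 11.69 M
print(layer4 + fc)            # 8906728 parameters in the last stage and classifier
print(total - layer4 - fc)    # 2782784 parameters in the stem and first three stages
```

Under that assumption the backbone would contribute about 2.78 M parameters, which leaves room for the remaining modules within the reported 3.55 M.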

The number of parameters and FLOPs!

Dear author, your work is excellent and interesting! I have a question about the BIT. Could you tell me the number of parameters and FLOPs for the BIT? I am looking forward to your reply.

Fine tuning with data of lower resolution

Hi Authors,

First, I'd like to express my gratitude for sharing this wonderful work. The paper is well written and the code base is clear.
I was trying to fine-tune the provided LEVIR-CD model on OSCD, a small, lower-resolution dataset (10m/pixel, 24 pairs).
Using the default training specification (img_size 256, batch 8, lr 0.001, lr_policy linear), the model seemed unable to balance precision and recall between classes 0 and 1, so the inference output is either entirely black or entirely white. I am curious whether you have tried fine-tuning the model on other data, and whether you have any suggestions for this task.
Thank you very much!

Best,
Ping-Jung Liu

Is the IOU_1 value higher than in the paper?

Hi, I am using the pretrained model in the checkpoint folder. I ran: sh scripts/eval.sh,
and the result displayed is:
acc: 0.98980 miou: 0.90326 mf1: 0.94702 iou_0: 0.98931 iou_1: 0.81721 F1_0: 0.99463 F1_1: 0.89941 precision_0: 0.99440 precision_1: 0.90329 recall_0: 0.99485 recall_1: 0.89557
The result seems to differ from the paper. Is this result accurate, or is something wrong? I would appreciate your reply.

Question about 'update the best model'

Hello, thanks for the great work.
I would like to discuss the main evaluation metric for the 'best model'. I noticed that the F1-score of the change category is used as the main evaluation metric in your article.

However, I found that 'mF1', the mean of the F1-scores of the change and unchanged categories, is used to update the best model.

Therefore, I would like to ask which metric, mF1 or F1_1 (the F1-score of the change category), is more suitable as the criterion for saving the best model.
Thanks again and look forward to your reply.

How can I improve the accuracy?

I trained the model on the data you provided (https://pan.baidu.com/s/1fzNiOE7elGRmIo2h6MIhZw code:l7iv), but the accuracy is always lower than that of the pretrained model. I increased max_epochs to 500 and still have the same problem (Lastest model updated. Epoch_acc=0.8744, Historical_best_acc=0.8793 (at epoch 435)). How can I improve the accuracy?
I have tried changing the model and so on, but I don't think I have found the key point. Could you give some suggestions? Thanks!

train error

Training Error

Hello, we are trying to run your training code, and it fails with various errors when using a NAIP imagery dataset as training data. It works normally with the LEVIR-CD dataset.

We feed in the data as 256-pixel chips with the suggested folder structure (A, B, label, list). Is there anything we need to consider or change in the training data?

We tried to fix the errors, but fixing one leads to others, as shown below. Any help on this would be highly appreciated.
Thanks in advance.

Cannot reproduce the accuracy reported in the paper

I also trained the base_transformer_pos_s4_dd8 model on a V100, but I cannot reach your results.

How to preprocess the WHU-CD dataset

F1 Metric

Hi ChenHao,

I am a little confused about the F1 and IoU calculation.

Why are f1_0 and f1_1 calculated separately and then averaged?
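
The per-class scores treat each class in turn as the positive one, and `mf1` is their unweighted mean. When changed pixels are rare, F1_0 is dominated by the large unchanged class, so `mf1` can look good even when the change class is detected poorly. A small pure-Python sketch (the confusion-matrix counts are invented for illustration):

```python
def per_class_f1(cm):
    """Per-class F1 from a confusion matrix cm[true][pred]."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, truly other
        fn = sum(cm[c]) - tp                        # truly c, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Made-up counts: 9500 unchanged and 500 changed pixels.
cm = [[9400, 100],   # true unchanged: predicted unchanged / changed
      [200, 300]]    # true changed:   predicted unchanged / changed

f1_0, f1_1 = per_class_f1(cm)
mf1 = (f1_0 + f1_1) / 2
print(round(f1_0, 4), round(f1_1, 4), round(mf1, 4))  # 0.9843 0.6667 0.8255
```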

Detect my own image

I want to test my own image with demo.py, but I get "FileNotFoundError: [Errno 2] No such file or directory: './samples/label/zhenhua_1.png'". Does this mean I must have a label? If the goal is detection, how can I already have the label? I don't understand.
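
If the demo loader only needs a file to exist at `./samples/label/<name>` (an assumption; the issue does not show the loader code), one workaround is to create an all-zero placeholder label with the same name as the input image. Any metrics computed against such a placeholder are meaningless. The sketch below writes an 8-bit grayscale PNG using only the standard library; the function name and the default 256x256 size are illustrative choices.

```python
import struct
import zlib
from pathlib import Path


def _chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    return (struct.pack(">I", len(data)) + kind + data
            + struct.pack(">I", zlib.crc32(kind + data) & 0xFFFFFFFF))


def write_blank_label(path, width=256, height=256):
    """Write an all-black 8-bit grayscale PNG to `path`."""
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression 0, filter 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is a filter byte (0) followed by `width` zero pixels.
    raw = (b"\x00" + b"\x00" * width) * height
    png = (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
           + _chunk(b"IDAT", zlib.compress(raw)) + _chunk(b"IEND", b""))
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(png)
    return path
```

For the file named in the error message, this would be called as `write_blank_label('./samples/label/zhenhua_1.png')`.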

Issue about Unsupervised Learning

Hello! I am very interested in your research. I noticed that your study uses supervised learning, which requires numerous labeled samples. I wonder whether it can be adapted to unsupervised learning? I would appreciate a reply!

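Training error: file not found

When training on the LEVIR-CD dataset, an error reports that a file cannot be found. The dataset has been named as required and the path has been updated.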

F1 values of WHU-CD and DSIFN-CD

Hello, I am trying to reproduce your model. The results look wrong on the WHU-CD and DSIFN-CD datasets: they are much higher than yours. On the LEVIR-CD dataset they look normal. Do I need to change the pre-training?

python demo.py ModuleNotFoundError: No module named 'torchvision.models.utils'

When I run demo.py I get this error:
ModuleNotFoundError: No module named 'torchvision.models.utils'
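
Newer torchvision releases no longer ship `torchvision.models.utils`; the function usually imported from it, `load_state_dict_from_url`, is available from `torch.hub`. Which line of the project triggers the import is not shown in the issue, so the snippet below is a generic fallback pattern with a hypothetical helper name and not a patch for a specific file.

```python
import importlib


def first_available(candidates):
    """Return the first attribute that can be imported, else None.

    `candidates` is a list of (module_name, attribute_name) pairs that are
    tried in order.
    """
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    return None


# Old location first, then the torch.hub location used by newer releases.
load_state_dict_from_url = first_available([
    ("torchvision.models.utils", "load_state_dict_from_url"),
    ("torch.hub", "load_state_dict_from_url"),
])
```

In the project file that raises the error, the equivalent one-line change is to replace `from torchvision.models.utils import load_state_dict_from_url` with `from torch.hub import load_state_dict_from_url`.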

About the eval results

hi,

thanks for the great work.

When I test the checkpoint of this repo, I only get:
acc: 0.94999 miou: 0.48876 mf1: 0.51403 iou_0: 0.94992 iou_1: 0.02761 F1_0: 0.97431 F1_1: 0.05374 precision_0: 0.95038 precision_1: 0.74621 recall_0: 0.99949 recall_1: 0.02788

The mean F1 score is too poor compared to what your paper claims. Why?

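How do I run detection?

Hello,
I have finished training successfully. I have a before/after pair of aerial images and would like to display the changes between them,
but I don't know which Python file to run to get the change detection result.
Running demo.py shows FileNotFoundError: [Errno 2] No such file or directory: './samples/label\XXXXX.png'
(I put XXXXX.png in the sample/A and sample/B folders.)

Or can this model not be run on my own before/after images?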

Question about F1 during training

I used the BIT script in paddlers to train on the LEVIR-CD+ dataset, which was split into 512x512 patches beforehand. During training, I found that the F1 score was very low. Is training this slow for everyone else too? I have trained for nearly 100 epochs, and the F1 score is only 0.2.