justchenhao / bit_cd
Official Pytorch Implementation of "Remote Sensing Image Change Detection with Transformers"
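Why are my mask images all black?
Hello, I used the best_ckpt.pt file you provided and got an F1 of 0.94 on the LEVIR dataset, which is much higher than the value in the paper. Was the model optimized further after the paper?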
Hello, I have recently read your code and am very interested in your work. I have a few questions I would like to ask:
In the paper, the strides of the last two stages of ResNet18 are replaced with 1, but I could not find the corresponding setting in the code. Line 137 of models/resnet.py has 'self.structures=[2, 2, 2, 2, 2]'. To follow the settings in the paper, should it be changed to 'self.structures=[2, 2, 1, 1]'?
Does the setting in line 100 of model/trainer.py mean that a saved model can be loaded to continue training?
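To make the stride question above concrete, here is a minimal sketch that computes the overall downsampling factor implied by a list of per-stage strides. The five-entry list is the value quoted above. The second list, which sets the last two entries to 1, is only an illustrative assumption about what "replacing the last two strides" could mean, not a statement of how models/resnet.py is organised.

```python
from math import prod

def output_stride(strides):
    """Total downsampling factor of a stack of strided stages."""
    return prod(strides)

def feature_size(img_size, strides):
    """Spatial size of the final feature map for a square input."""
    size = img_size
    for s in strides:
        size = -(-size // s)  # ceiling division, as with padded strided convs
    return size

quoted = [2, 2, 2, 2, 2]        # value quoted in the question
last_two_one = [2, 2, 2, 1, 1]  # hypothetical variant: last two strides set to 1

for name, strides in [("quoted", quoted), ("last_two_one", last_two_one)]:
    print(name, output_stride(strides), feature_size(256, strides))
```

Comparing the two factors against the feature-map size described in the paper is one way to check which configuration the code actually uses.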
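The import statements are rather disorganized, which causes this problem.
Hello, I would like to visualize the heatmap. I found the visualization code visulize_features; where is it called?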
Hello, why are the test results all black after I finish training? The input labels use the value 255. What is the reason? Please help, thank you.
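A first diagnostic step for all-black outputs, sketched under assumptions: check which values a mask actually contains. A prediction stored as class indices {0, 1} looks black in an image viewer even when it contains changes, and a label whose value range differs from what the loader expects can train the model to predict "no change" everywhere. The helper names below are hypothetical and are not part of the repository.

```python
import numpy as np

def summarize_mask(mask):
    """Return the sorted unique values and the fraction of non-zero pixels."""
    mask = np.asarray(mask)
    values = np.unique(mask).tolist()
    changed = float(np.count_nonzero(mask)) / mask.size
    return values, changed

def to_viewable(mask):
    """Map any non-zero class index to 255 so changes show up as white."""
    mask = np.asarray(mask)
    return (mask > 0).astype(np.uint8) * 255

pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:3] = 1  # four "changed" pixels stored as class index 1

print(summarize_mask(pred))               # ([0, 1], 0.25)
print(summarize_mask(to_viewable(pred)))  # ([0, 255], 0.25)
```

If `summarize_mask` reports only `[0]` for the predictions, the model really predicts no change, and the label values and preprocessing are the next thing to inspect.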
Hello author, does the final test output consist only of the jumbled composite images? Is a prediction map generated for each individual image?
Hello, your work is amazing! I have a question about the paper. Transformers usually achieve better results only after pre-training on a large dataset. Did you pre-train your Transformer framework and then fine-tune it, or did you only load the pre-trained ResNet parameters?
May I ask which metric in the paper each value in the experiment output corresponds to?
Eval Historical_best_acc = 0.9421 (at epoch 199)
Begin evaluation...
Is_training: False. [1,24], running_mf1: 0.77105
acc: 0.89019 miou: 0.66832 mf1: 0.78176 iou_0: 0.87898 iou_1: 0.45765 F1_0: 0.93559 F1_1: 0.62793 precision_0: 0.92755 precision_1: 0.66103 recall_0: 0.94377 recall_1: 0.59798
I trained a model on the LEVIR dataset, and testing fails with this error:
Traceback (most recent call last):
File "eval_cd.py", line 58, in <module>
main()
File "eval_cd.py", line 54, in main
model.eval_models(checkpoint_name=args.checkpoint_name)
File "/tmp/pycharm_project_668/models/evaluator.py", line 158, in eval_models
self._load_checkpoint(checkpoint_name)
File "/tmp/pycharm_project_668/models/evaluator.py", line 70, in _load_checkpoint
self.net_G.load_state_dict(checkpoint['model_G_state_dict'])
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BASE_Transformer:
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.0.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.1.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.2.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.3.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.4.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.5.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.6.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_q.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_k.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_v.weight: copying a param with shape torch.Size([512, 32]) from checkpoint, the shape in current model is torch.Size([64, 32]).
size mismatch for transformer_decoder.layers.7.0.fn.fn.to_out.0.weight: copying a param with shape torch.Size([32, 512]) from checkpoint, the shape in current model is torch.Size([32, 64]).
The dataset images are 256*256, and both training and testing followed the instructions in the README. Does anyone know how to solve this problem?
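Every mismatch in the traceback above is in a decoder attention projection, with an inner width of 512 in the checkpoint and 64 in the current model. That pattern suggests (as an assumption, not something the log confirms) that the model built at evaluation time uses a different decoder configuration than the one that was trained. A minimal sketch for listing such mismatches before calling `load_state_dict`, using plain dictionaries of shapes:

```python
def find_shape_mismatches(ckpt_shapes, model_shapes):
    """List (name, checkpoint_shape, model_shape) for keys whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches

# Shapes copied from the first entries of the error message above.
ckpt = {
    "transformer_decoder.layers.0.0.fn.fn.to_q.weight": (512, 32),
    "transformer_decoder.layers.0.0.fn.fn.to_out.0.weight": (32, 512),
}
model = {
    "transformer_decoder.layers.0.0.fn.fn.to_q.weight": (64, 32),
    "transformer_decoder.layers.0.0.fn.fn.to_out.0.weight": (32, 64),
}

for name, saved, current in find_shape_mismatches(ckpt, model):
    print(f"{name}: checkpoint {saved} vs model {current}")
```

With real PyTorch objects, the two dictionaries can be built as `{k: tuple(v.shape) for k, v in state_dict.items()}` from the checkpoint and from `model.state_dict()`. If the report shows only decoder widths, checking that training and evaluation were run with the same network option is a reasonable next step.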
https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html
I can't access this page. Is there another way to get the dataset?
How did you implement the ablation experiments?
When I run sh scripts/run_CD.sh,
I get an OSError: path to the root of LEVIR_CD dataset/list/trainval.txt not found
Hello! I would like to reproduce your results. Would you mind sharing the WHU-CD dataset you have already processed? My email address is [email protected] . Thank you very much!
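The path in this OSError begins with the literal text "path to the root of LEVIR_CD dataset", which looks like a placeholder string that was never replaced with a real directory. A minimal sketch for checking a dataset root before training, assuming the A, B, label, list folder layout mentioned elsewhere in this thread; the function name is hypothetical.

```python
from pathlib import Path

EXPECTED_DIRS = ("A", "B", "label", "list")  # layout assumed from this thread

def missing_dataset_parts(root, split="trainval"):
    """Return the expected folders and split file that are absent under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    if not (root / "list" / f"{split}.txt").is_file():
        missing.append(f"list/{split}.txt")
    return missing

# The placeholder string from the error message is not a real directory:
print(missing_dataset_parts("path to the root of LEVIR_CD dataset"))
```

An empty list means the root directory has the expected structure; anything else names what still has to be created or pointed to.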
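Hello author. As the author of STANet, which approach do you think is better: a contrastive loss as in STANet, which measures the distance between the two dates, or the semantics-based approach of BIT in this paper, which ends with an activation function plus CE loss or BCE loss? In my experiments the former is really unstable and depends heavily on precise parameter settings, while the latter trains stably and performs better.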
Hello, this is nice code.
I am a beginner who is learning deep learning.
Isn't the PREDICT step supposed to be pure inference?
I don't understand why I have to provide labels in addition to image 1 and image 2.
I ran my experiments on a GPU server (NVIDIA-SMI 410.72). The parameters are the same as the ones you gave. My results are as follows: acc: 0.97853 miou: 0.78005 mf1: 0.86238 iou_0: 0.97787 iou_1: 0.58222 F1_0: 0.98881 F1_1: 0.73596 precision_0: 0.98745 precision_1: 0.76064 recall_0: 0.99017 recall_1: 0.71283.
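This error occurs during both training and testing. It roughly means that the txt file, i.e. the list of image names, cannot be read with multiple threads. How can I solve it?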
ValueError: Input and output must have the same number of spatial dimensions, but got input with with spatial dimensions of [256, 256] and output size of torch.Size([1, 256, 256, 3]). Please provide input tensor in (N, C, d1, d2, ...,dK) format and output size in (o1, o2, ...,oK) format.
ERROR at setup of testt ___________________________
file D:\project_code\remote_pytorch\BIT_CD-master\main_cd.py, line 18
def testt(args):
E fixture 'args' not found
available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory use 'pytest --fixtures [testpath]' for help on them.
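How can I solve this problem?
When testing with demo.py, I replaced the samples with two of my own 256*256 PNG images that differ from each other, and copied an unrelated binary mask into the label folder to serve as the label. However, the result in the predict folder is identical to the label image. In theory the label should not affect the prediction at all. Is the prediction simply replaced by the existing label? Also, in theory, testing should not need a label at all (provided I do not need to compute metrics).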
Hi! This is very interesting research. I would like to reproduce your results, but I have a question about how to obtain the heat map shown in the paper. Is this part of the code included in the project?
I've been trying to train the model using 4-band satellite images. Since the images are not 3-band RGB, I changed the ResNet input dimensions and tried to train from scratch.
In every run, at some point I get a NaN loss. Often the NaN loss already appears in the first epoch. When debugging I found that weights may be initialized to -Inf (not always, though; at some point the weights probably explode, causing NaN values).
I've spent a few hours debugging but haven't sorted it out so far.
Any ideas on why I get this behaviour?
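One thing worth ruling out for NaN losses with multi-band satellite imagery is the input itself: no-data pixels stored as NaN or Inf, or raw sensor values far outside the range the network expects. Whether that is the cause here is an assumption. The sketch below counts non-finite pixels and rescales each band to [0, 1]; the function name and the min-max scaling are illustrative choices, not the repository's preprocessing.

```python
import numpy as np

def check_and_scale(image):
    """Replace non-finite pixels with 0 and min-max scale each band to [0, 1].

    image: array of shape (H, W, C). Returns (scaled_image, n_bad_pixels).
    """
    image = np.asarray(image, dtype=np.float32)
    bad = ~np.isfinite(image)
    n_bad = int(bad.sum())
    image = np.where(bad, 0.0, image)
    lo = image.min(axis=(0, 1), keepdims=True)
    hi = image.max(axis=(0, 1), keepdims=True)
    scaled = (image - lo) / np.maximum(hi - lo, 1e-6)  # avoid division by zero
    return scaled, n_bad

# Hypothetical 4-band tile with one NaN pixel and large raw values.
tile = np.full((2, 2, 4), 1000.0, dtype=np.float32)
tile[0, 0, 0] = np.nan
tile[1, 1, :] = 3000.0

scaled, n_bad = check_and_scale(tile)
print(n_bad, float(scaled.min()), float(scaled.max()))
```

If the inputs are clean and well scaled, the remaining suspects are the initialization of the modified first layer and the learning rate.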
Are the parameters of the backbone network (ResNet18) frozen, i.e. not updated? Training is very fast.
Hi,
Can someone please help me to download the WHU-CD dataset (https://study.rsgis.whu.edu.cn/pages/download/building_dataset.html)? The provided link is not working.
My email: [email protected]
Thanks.
When I run scripts/eval.sh, there is an error.
Traceback (most recent call last):
File "eval_cd.py", line 59, in <module>
main()
File "eval_cd.py", line 55, in main
model.eval_models(checkpoint_name=args.checkpoint_name)
File "/mnt/data/home/hanx/BIT_CD/models/evaluator.py", line 171, in eval_models
self._collect_running_batch_states()
File "/mnt/data/home/hanx/BIT_CD/models/evaluator.py", line 105, in _collect_running_batch_states
running_acc = self._update_metric()
File "/mnt/data/home/hanx/BIT_CD/models/evaluator.py", line 100, in _update_metric
current_score = self.running_metric.update_cm(pr=G_pred.cpu().numpy(), gt=target.cpu().numpy())
File "/mnt/data/home/hanx/BIT_CD/misc/metric_tool.py", line 56, in update_cm
val = get_confuse_matrix(num_classes=self.n_class, label_gts=gt, label_preds=pr)
File "/mnt/data/home/hanx/BIT_CD/misc/metric_tool.py", line 157, in get_confuse_matrix
confusion_matrix += __fast_hist(lt.flatten(), lp.flatten())
File "/mnt/data/home/hanx/BIT_CD/misc/metric_tool.py", line 152, in __fast_hist
hist = np.bincount(num_classes * label_gt[mask].astype(int) + label_pred[mask],
IndexError: boolean index did not match indexed array along dimension 0; dimension is 65536 but corresponding boolean dimension is 196608
Could you please help me solve this?
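The two sizes in this IndexError differ by exactly a factor of three (196608 = 3 × 65536), which is what a 256×256 label stored with three channels would produce. That reading is an assumption. A minimal sketch for reducing such a label to a single-channel binary mask, with a hypothetical helper name and a threshold that assumes the label uses 0 and 255:

```python
import numpy as np

def to_single_channel(label, threshold=127):
    """Reduce an (H, W) or (H, W, C) label image to an (H, W) mask of 0/1."""
    label = np.asarray(label)
    if label.ndim == 3:
        label = label[..., 0]  # assumes all channels hold the same mask
    return (label > threshold).astype(np.uint8)

rgb_label = np.zeros((256, 256, 3), dtype=np.uint8)
rgb_label[:10, :10, :] = 255  # a small "changed" region saved as white RGB

mask = to_single_channel(rgb_label)
print(rgb_label.size, mask.size)  # 196608 65536
```

Labels that already use 0 and 1 would need a lower threshold (for example 0) to survive this conversion.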
Using your source code, I cannot reproduce the parameter count and FLOPs reported in the original paper. In the paper, BIT (BIT_S4) has 3.55 M parameters, but I get 12.4 M, which is a big difference. Based on my reproduction, I think the main cause is the ResNet backbone (Base_S4): in my results, Base_S4 alone has 11.69 M parameters.
Is there something I overlooked? I look forward to your answer.
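One possible explanation, stated as an assumption rather than the authors' answer: a parameter counter that sums over every registered module also counts layers the forward pass never uses, such as the last ResNet18 stage and the ImageNet classifier when the network stops at stage 4. The arithmetic below uses the standard ResNet18 layout (bias-free convolutions, two parameters per batch-norm channel) to size those two parts.

```python
def conv_params(c_in, c_out, k):
    """Parameters of a bias-free k x k convolution."""
    return c_in * c_out * k * k

def bn_params(channels):
    """Learnable batch-norm parameters: one weight and one bias per channel."""
    return 2 * channels

def basic_block_params(c_in, c_out):
    """Parameters of a ResNet BasicBlock (two 3x3 convs, optional 1x1 shortcut)."""
    n = conv_params(c_in, c_out, 3) + bn_params(c_out)
    n += conv_params(c_out, c_out, 3) + bn_params(c_out)
    if c_in != c_out:  # projection shortcut when the width changes
        n += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return n

layer4 = basic_block_params(256, 512) + basic_block_params(512, 512)
fc = 512 * 1000 + 1000  # final ImageNet classifier

print(layer4, fc)  # 8393728 513000
```

Subtracting these roughly 8.39 M and 0.51 M parameters from the 11.69 M quoted above leaves about 2.78 M for the remaining backbone. Whether that fully accounts for the 3.55 M reported in the paper is not confirmed here.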
Dear author, your work is excellent and interesting! I have a question about the BIT. Could you tell me the number of parameters and FLOPs for the BIT? I am looking forward to your reply.
Hi Authors,
First I'd like to express my gratitude to you for sharing this wonderful work. The paper is well written and the code base is clear.
I was trying to fine-tune the provided LEVIR-CD model on OSCD, a small, lower-resolution dataset (10 m/pixel, 24 pairs).
Using the default training specification (img_size 256, batch 8, lr 0.001, lr_policy linear), the model seemed unable to balance precision and recall between classes 0 and 1, so the inference output is either entirely black or entirely white. Have you tried fine-tuning the model on other data, and do you have any suggestions for this task?
Thank you very much!
Best,
Ping-Jung Liu
Hi, I am using the pretrained model from the checkpoint folder and ran: sh scripts/eval.sh
The result displayed is:
acc: 0.98980 miou: 0.90326 mf1: 0.94702 iou_0: 0.98931 iou_1: 0.81721 F1_0: 0.99463 F1_1: 0.89941 precision_0: 0.99440 precision_1: 0.90329 recall_0: 0.99485 recall_1: 0.89557
It seems that this result differs from the paper. Is the result accurate, or is something else wrong? I would appreciate your reply.
Hello, thanks for the great work.
I want to discuss the main evaluation metric used to select the 'best model'. I noticed that the F1-score with regard to the change category is used as the main evaluation metric in your article.
However, I found that 'mF1', the mean of the F1-scores for the change and no-change categories, is used when updating the best model.
So I want to talk about whe
Therefore, I would like to ask which metric, mF1 or F1_1 (the F1-score with regard to the change category), is more suitable as the criterion for saving the best model.
Thanks again and look forward to your reply.
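The choice of criterion matters because the two metrics can rank epochs differently. The sketch below selects the best epoch from a validation history by a configurable key. The scores are made-up illustrative values, and `best_epoch` is a hypothetical helper, not the repository's checkpointing logic.

```python
def best_epoch(history, key):
    """Return (epoch_index, score) for the epoch with the highest value of key."""
    epoch = max(range(len(history)), key=lambda i: history[i][key])
    return epoch, history[epoch][key]

# Hypothetical per-epoch validation scores (not taken from any real run).
history = [
    {"F1_0": 0.995, "F1_1": 0.805},
    {"F1_0": 0.990, "F1_1": 0.808},
]
for scores in history:
    scores["mf1"] = (scores["F1_0"] + scores["F1_1"]) / 2

print(best_epoch(history, "mf1")[0])   # 0
print(best_epoch(history, "F1_1")[0])  # 1
```

In this toy history the epoch with the best mean F1 is not the one with the best change-class F1, so the saved "best" checkpoint depends on which key is used.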
I have trained the model on the data you provided (https://pan.baidu.com/s/1fzNiOE7elGRmIo2h6MIhZw code:l7iv), but the accuracy is always lower than that of the pretrained model. I increased max_epochs to 500 and still have the same problem (Lastest model updated. Epoch_acc=0.8744, Historical_best_acc=0.8793 (at epoch 435)). How can I improve the accuracy?
I have tried changing the model and other settings, but I don't think I have found the cause. Could you give some suggestions? Thank you!
Hello, we are trying to run your training code, and it fails with various errors when we use a NAIP imagery dataset as training data. It works normally with the LEVIR-CD dataset.
We are feeding in the data as 256-pixel chips with the same folder structure as suggested (A, B, label, list). Is there anything we need to consider or change in the training data?
We tried to fix the errors, but fixing one leads to others, as shown below. Any help on this would be highly appreciated.
Thanks in advance.
I also trained the base_transformer_pos_s4_dd8 model on a V100, but I cannot reproduce your results.
Hi ChenHao,
I am a little confused about the F1 and IoU calculation.
Why are F1_0 and F1_1 calculated separately and then averaged?
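For reference, here is a minimal sketch of how per-class precision, recall, F1 and IoU follow from a confusion matrix, and how the mean values are simple averages over classes. The evaluation log earlier on this page is consistent with that averaging: (0.93559 + 0.62793) / 2 = 0.78176 for mf1 and (0.87898 + 0.45765) / 2 ≈ 0.66832 for miou. The toy matrix and function names below are illustrative and are not the repository's metric_tool.py.

```python
def per_class_scores(cm):
    """cm[i][j] = number of pixels with ground truth i that were predicted as j."""
    n = len(cm)
    scores = {}
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        scores[c] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "f1": 2 * tp / (2 * tp + fp + fn) if denom else 0.0,
            "iou": tp / denom if denom else 0.0,
        }
    return scores

def mean_of(scores, key):
    """Unweighted mean of one metric over all classes."""
    return sum(s[key] for s in scores.values()) / len(scores)

# Toy matrix: class 0 = no change, class 1 = change.
cm = [[90, 5],
      [10, 20]]
scores = per_class_scores(cm)
print(round(scores[0]["f1"], 4), round(scores[1]["f1"], 4))  # 0.9231 0.7273
print(round(mean_of(scores, "f1"), 4))                       # 0.8252
```

Because the no-change class usually dominates, its F1 is high and pulls the mean up, which is why the change-class F1 on its own is often the more informative number.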
I want to test my own images with demo.py, but I get "FileNotFoundError: [Errno 2] No such file or directory: './samples/label/zhenhua_1.png'". Does this mean I must provide a label? If the goal is to detect changes, how can I already have a label? I don't understand.
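The error indicates that the demo's data loader tries to open a label file for every sample, even though a prediction itself only needs the two input images. A label-free loader could simply pair files that exist in both the A and B folders, as in the sketch below. The function name and the .png suffix are assumptions for illustration; this is not the repository's loader.

```python
from pathlib import Path

def list_image_pairs(root, suffix=".png"):
    """Return sorted file names present in both root/A and root/B."""
    root = Path(root)
    names_a = {p.name for p in (root / "A").glob(f"*{suffix}")}
    names_b = {p.name for p in (root / "B").glob(f"*{suffix}")}
    return sorted(names_a & names_b)

def pair_paths(root, name):
    """Paths of the before and after images for one sample name."""
    root = Path(root)
    return root / "A" / name, root / "B" / name

print(list_image_pairs("./samples"))  # empty list if the folders do not exist
```

A workaround sometimes used when the loader cannot be changed is to supply an all-zero placeholder label of the right size. The metrics computed from such a label are meaningless, but the prediction does not depend on it.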
Hello! I am very interested in your research. I noticed that your method is supervised and therefore requires many labeled samples. Could it be adapted to unsupervised learning? I would appreciate a reply!
Hello, I am trying to reproduce your results. The results look wrong on the WHU-CD and DSIFN-CD datasets, where my scores are much higher than yours, but they look normal on the LEVIR-CD dataset. Do I need to change the pre-training?
When I run demo.py, I get this error:
ModuleNotFoundError: No module named 'torchvision.models.utils'
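This error usually means the code imports a helper from `torchvision.models.utils`, a module that newer torchvision releases no longer ship. The helper most often imported from there, `load_state_dict_from_url`, is also available from `torch.hub`. Assuming that is the import involved here, a fallback lookup can be sketched as follows; `first_available` is a hypothetical helper.

```python
import importlib

# Candidate locations, tried in order (assumed to be the helper the code needs).
CANDIDATES = [
    ("torchvision.models.utils", "load_state_dict_from_url"),  # older torchvision
    ("torch.hub", "load_state_dict_from_url"),                 # available in torch
]

def first_available(candidates):
    """Return the first attribute that can be imported, or None if none can."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:  # also covers ModuleNotFoundError
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    return None
```

Calling `first_available(CANDIDATES)` returns the loader function on either old or new installations. The simpler fix is to replace the failing import line with `from torch.hub import load_state_dict_from_url`.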
hi,
thanks for the great work.
When I test the checkpoint of this repo, I only get:
acc: 0.94999 miou: 0.48876 mf1: 0.51403 iou_0: 0.94992 iou_1: 0.02761 F1_0: 0.97431 F1_1: 0.05374 precision_0: 0.95038 precision_1: 0.74621 recall_0: 0.99949 recall_1: 0.02788
The mean F1 score is much lower than what your paper reports. Why?
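Hello,
I have finished training successfully. I have a before/after pair of aerial images and would like to detect and display the changes between them,
but I don't know which Python file to run to perform the change detection.
When I run demo.py, it shows FileNotFoundError: [Errno 2] No such file or directory: './samples/label\XXXXX.png'
(I put XXXXX.png in the sample/A and sample/B folders.)
Or does this model not support running on my own before/after images?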
I used the BIT script in paddlers to train on the LEVIR-CD+ dataset, which was pre-divided into 512x512 patches. During training I found that the F1 score was very low. Is everyone's training this slow? I have trained for nearly 100 epochs, and the F1 score is only 0.2.