sen1floods11's Issues
Preprocessing steps for permanent water
Hi, first off, thank you for this resource!
I was testing both flood and permanent-water segmentation, and I noticed that the Sentinel-1 images come in two slightly different formats: I believe Flood and Weakly share the same logarithmic (dB) scale, while the JRC perm_water subset has been normalized between 0 and 1:
flood range: [-60.44136, 36.832024]
water range: [0.0, 1.0]
weak range: [-70.71592, 39.25037]
Plotting them, the flood images look very "low contrast". I'm getting close results by applying np.clip(10 ** (img_flood / 50), 0, 1) to the flood images, but I can't find the exact transformation that you applied there.
Where can I find more info about that? Apologies if it's already documented somewhere, I couldn't find anything on this regard.
Thanks!
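The transform the question experiments with can be sketched as below. This is only a sketch of the asker's guess: the constant 50 and the clipping range come from the question above, not from the dataset's documented preprocessing.

```python
import numpy as np

# Hypothetical Sentinel-1 backscatter values in decibels (dB); real chips
# span roughly [-70, 40] dB according to the ranges reported above.
img_flood_db = np.array([[-60.0, -20.0], [0.0, 36.8]])

# The transform being tried: undo the log scale and clip to [0, 1].
# 10 ** (x / 50) maps -50 dB -> 0.1 and 0 dB -> 1.0; anything above 0 dB
# saturates to 1 after clipping.
img_linear = np.clip(10 ** (img_flood_db / 50), 0, 1)
```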
About file names: more details needed
In HandLabeled (v1.1), there are 5 files:
- JRCWaterHand (might this be the label for S1?)
- LabelHand (might this be the cloud label?)
- S1Hand (the S1 data?)
- S1OtsuLabelHand (?)
- S2Hand (the S2 data?)
but there are no further details about these file names.
If possible, could you add them to the README?
Thank you!
Sorry, I found it in the docs.
How were the hand-labeled training dataset normalization values ([0.6851, 0.5235], [0.0820, 0.1102]) computed? These are the mean and std used in
norm = transforms.Normalize([0.6851, 0.5235], [0.0820, 0.1102])
in the code. How do we compute these two values? The paper does not explain. How do we calculate the mean and standard deviation of S2?
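One common way such statistics are obtained (a sketch only; the chip shapes and the two-channel layout here are assumptions, not the authors' actual procedure) is to average over every pixel of every training chip, per channel:

```python
import numpy as np

# Stand-in for the training set: (num_chips, channels, height, width).
# Values like ([0.6851, 0.5235], [0.0820, 0.1102]) would come from the
# real chips after whatever scaling the pipeline applies.
chips = np.random.rand(8, 2, 64, 64)

# Per-channel mean and std over all chips and all pixels.
mean = chips.mean(axis=(0, 2, 3))
std = chips.std(axis=(0, 2, 3))
```

For a dataset too large to hold in memory, the same statistics can be accumulated chip by chip with running sums.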
Label TIF optimizations
While attempting to use the labels described in this repo, it became apparent that a couple of optimizations are advisable:
- Because the data has 3 possible values (-1, 0, 1), int16 TIFFs are significant overkill. A byte TIFF (int8) would save considerable space and transfer time.
- At the moment, these TIFFs have a NoData value of -32768. A NoData value of -1 would likely be more appropriate, since it tracks the advertised semantics more closely, and experience teaches that incorrectly set NoData values can be problematic for downstream processes.
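The suggested remapping can be sketched as below, assuming labels arrive as an int16 array with NoData = -32768 (the TIFF read/write itself is omitted; only the array logic is shown):

```python
import numpy as np

# Example int16 label tile with the current NoData sentinel of -32768.
label_int16 = np.array([[-32768, 0], [1, -32768]], dtype=np.int16)

# Remap NoData to -1 and store as a signed byte: every pixel then fits
# the advertised {-1, 0, 1} semantics at one byte each.
label_int8 = np.where(label_int16 == -32768, -1, label_int16).astype(np.int8)
```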
Describe data in each folder on Google Cloud Storage bucket
Thanks for documenting this research here!
I'm reading through the paper and this GitHub repo, and looking at the Google Cloud Storage bucket. When I list the GCS bucket, I get:
gs://cnn_chips/CNN_Chips_FTC.geojson
gs://cnn_chips/Sen1Floods11_labeled.tgz
gs://cnn_chips/flood_bolivia_data.csv
gs://cnn_chips/flood_test_data.csv
gs://cnn_chips/flood_train_data.csv
gs://cnn_chips/flood_valid_data.csv
gs://cnn_chips/permanent_water_data.csv
gs://cnn_chips/permanent_water_test_data.csv
gs://cnn_chips/permanent_water_train_data.csv
gs://cnn_chips/permanent_water_validation_data.csv
gs://cnn_chips/NoQC/
gs://cnn_chips/Perm/
gs://cnn_chips/PermJRC/
gs://cnn_chips/QC_v2/
gs://cnn_chips/S1/
gs://cnn_chips/S1Flood/
gs://cnn_chips/S1Flood_NoQC/
gs://cnn_chips/S1Perm/
gs://cnn_chips/S1_NoQC/
gs://cnn_chips/S2/
gs://cnn_chips/S2Flood/
gs://cnn_chips/S2_NoQC/
gs://cnn_chips/cnn_checkpoints/
As part of the documentation in this repo, it would be helpful to have a brief (under two sentences) human-readable description of what each of the directories and files in the bucket is for. For example, the only reference I can find to S1Flood is in the getTradFName function in Test_Models.ipynb. Does that mean these are the labels for the Otsu Threshold-VH dataset, or something else?
Download Issue
I downloaded the Sen1Floods11 dataset and managed to get all subdirectories. However, when I open some of the flood_events data, the majority of it is black and white. I'm new to this, so is this what I am supposed to be seeing? I thought I would see the original Sentinel imagery from which the flooding pixels were derived. Could someone explain what I should be seeing? Thank you!
Necessary Data to Distinguish Permanent Water from Flood Water
Thank you for this fine work.
In trying to reproduce the experiments found in your paper, it seemed to me that the data on Google Storage are not sufficient to reproduce all of the results in the paper.
For example, there are results given for the various models on permanent water, flood water, and all water, but I was not able to find the labels necessary to distinguish permanent water from flood water for the weakly-labeled Sentinel-1 case or the weakly-labeled Sentinel-2 case. (I think I could accomplish this by augmenting the dataset with additional labels from JRC, but I am curious whether you already have these labels prepared, and/or whether I have overlooked something.)
Similarly, in the case of the model trained on permanent water labels, the methodology to use to separately evaluate flood water and permanent water was not clear to me.
Thanks again.
Error with the train.ipynb example
Hello everyone,
I tried to run the train.ipynb example in Colab to understand how to use your dataset.
When I launch the training phase, I get this error at the end of Epoch 0:
Current Epoch: 0
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:61: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
model saved at checkpoints/Sen1Floods11_0_0.3114752471446991.cp
Training Loss: tensor(0.5943, device='cuda:0', grad_fn=<DivBackward0>)
Training IOU: tensor(0.2001, device='cuda:0')
Training Accuracy: tensor(0.8267, device='cuda:0')
Validation Loss: tensor(0.4088, device='cuda:0')
Validation IOU: tensor(0.3115, device='cuda:0')
Validation Accuracy: tensor(0.8518, device='cuda:0')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
[<ipython-input-17-8365ceef919c>](https://localhost:8080/#) in <module>()
18 epochs.append(i)
19 x = epochs
---> 20 plt.plot(x, training_losses, label='training losses')
21 plt.plot(x, training_accuracies, 'tab:orange', label='training accuracy')
22 plt.plot(x, training_ious, 'tab:purple', label='training iou')
6 frames
<__array_function__ internals> in atleast_1d(*args, **kwargs)
[/usr/local/lib/python3.7/dist-packages/torch/_tensor.py](https://localhost:8080/#) in __array__(self, dtype)
676 return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
677 if dtype is None:
--> 678 return self.numpy()
679 else:
680 return self.numpy().astype(dtype, copy=False)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
I tried to identify the problem, unfortunately without success. Could you give me some hints?
Moreover, could you add some more explanation to the example? For instance, the training loop iterates over 1000 epochs, but I don't understand the meaning of the number 10 in:
train_validation_loop(net, optimizer, scheduler, train_loader, valid_loader, 10, i)
Thanks in advance for your help.
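A likely fix for the traceback above (a sketch only, assuming the logged metrics are CUDA tensors with grad history, as the error message suggests) is to detach each value and copy it to host memory before handing it to matplotlib:

```python
import torch

# Stand-ins for the metric lists the notebook accumulates; on a GPU run
# these would be CUDA tensors, which numpy/matplotlib cannot consume.
training_losses = [torch.tensor(0.5943), torch.tensor(0.4088)]

# Detach from the autograd graph, move to CPU, and extract a plain float.
losses_cpu = [t.detach().cpu().item() for t in training_losses]

# plt.plot(epochs, losses_cpu, ...) would then work without the TypeError.
```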
Why does download_flood_water_data_from_list(l) assign arr_y twice? And doesn't if np.sum((arr_y != arr_y)) == 0: always return True?
arr_x = np.nan_to_num(getArrFlood(os.path.join("files/", im_fname)))
arr_y = getArrFlood(os.path.join("files/", mask_fname))
ignore = (arr_y == -1)
ignore = ((np.uint8(ignore) * -1) * 256) + 1
arr_y *= ignore
arr_y = np.uint8(getArrFlood(os.path.join("files/", mask_fname)))
if np.sum((arr_y != arr_y)) == 0:
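The observation in the title can be demonstrated directly (a sketch using a made-up label array): `x != x` is only True for NaN, and after the `np.uint8(...)` cast on the preceding line, arr_y is an integer array that cannot hold NaN, so the sum is always 0 and the branch is always taken.

```python
import numpy as np

# Integer arrays cannot contain NaN, so the self-inequality check is vacuous.
arr_y = np.uint8(np.array([[0, 1], [255, 1]]))
int_check = np.sum(arr_y != arr_y)  # always 0 for integer dtypes

# For float arrays, the same idiom does detect NaN.
arr_f = np.array([1.0, np.nan])
float_check = np.sum(arr_f != arr_f)  # counts the NaN entries
```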
Sentinel-2 weak label data
Does the dataset not include Sentinel-2 weak label data?
how to preprocess hlss,hlsl,s2 l1c and s1 image for the model?
I want to use the demo, but I don't know how to preprocess these data (HLSS, HLSL, S2 L1C, and S1). Are there any standard steps or data requirements?
RuntimeError: "round" "_vml_cpu" not implemented for 'Int'
I tried to duplicate the notebook in the Google Colab, and I am getting some errors in my final step: Train model and assess metrics over epochs
Current Epoch: 0
RuntimeError Traceback (most recent call last)
in ()
17
18 for i in range(start, 1000):
---> 19 train_validation_loop(net, optimizer, scheduler, train_loader, valid_loader, 10, i)
20 epochs.append(i)
21 x = epochs
7 frames
in processTestIm(data)
82 if torch.sum(labels.gt(.003) * labels.lt(.004)):
83 labels *= 255
---> 84 labels = labels.round()
85
86 return ims, labels
RuntimeError: "round" "_vml_cpu" not implemented for 'Int'
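A plausible workaround (a sketch, assuming `labels` is being loaded as an integer tensor, which is what the error message indicates) is to cast to float before the scaling-and-rounding step in processTestIm:

```python
import torch

# Stand-in for a label tile that arrives with an integer dtype.
labels = torch.tensor([[0, 1], [1, 0]], dtype=torch.int32)

# round() is not implemented for Int tensors on CPU, so cast first.
labels = labels.float()
labels = (labels * 255).round()
```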
Split image stack into separate bands
In order to improve cloud storage access and ML training, we should split the Sentinel-2 and Sentinel-1 imagery into single-band TIFFs. Labels can stay the same:
- Sentinel-2 13-band TIFF --> 13 single-band TIFFs, one per band
- Sentinel-1 2-band TIFF --> 2 single-band TIFFs, one per band
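The proposed split amounts to slicing the band axis (a sketch, assuming chips load as (bands, height, width) arrays; writing each slice back out as its own TIFF is omitted):

```python
import numpy as np

# Stand-in for a Sentinel-2 chip: 13 bands stacked in one array.
s2_chip = np.zeros((13, 512, 512), dtype=np.uint16)

# One single-band array per band; each would become its own TIFF,
# so a consumer can fetch only the bands it needs.
single_band_chips = [s2_chip[b] for b in range(s2_chip.shape[0])]
```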