dl4sits / breizhcrops

A Satellite Time Series Dataset for Crop Type Identification

License: GNU General Public License v3.0

Jupyter Notebook 57.93% Python 41.88% Shell 0.13% Dockerfile 0.06%

breizhcrops's Introduction

BreizhCrops:

A Time Series Dataset for Crop Type Mapping

Check our Breizhcrops Tutorial Colab Notebook for quick hands-on examples.

Installation

Linux and macOS

Install BreizhCrops as a Python package from PyPI:

pip install breizhcrops

Windows

If you use Windows, execute these lines.

git clone https://github.com/dl4sits/BreizhCrops.git
pip install torch==1.6.0 -f https://download.pytorch.org/whl/torch_stable.html
conda install gdal fiona geopandas
pip install .

Getting Started

This minimal working example

# import package
import breizhcrops as bzh

# initialize and download FRH04 data
dataset = bzh.BreizhCrops("frh04")

# get data sample
x, y, field_id = dataset[0]

# load pretrained model
model = bzh.models.pretrained("Transformer")

# create a batch of batchsize 1
x = x.unsqueeze(0)

# perform inference
y_pred = model(x)

downloads the FRH04 dataset partition (used for evaluation), loads a pretrained model and performs a prediction on the first sample.
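As a small follow-up to the example above, the prediction can be turned into a class index. The sketch below assumes the model returns one row of log-probabilities per sample (as the forward function quoted in one of the issues further down does) and uses a plain Python list with made-up numbers in place of the tensor y_pred. The helper name predicted_class is hypothetical; with a torch tensor, y_pred.argmax(dim=-1) would play the same role.

```python
import math


def predicted_class(log_probabilities):
    """Return the index of the largest log-probability in one row."""
    return max(range(len(log_probabilities)), key=lambda i: log_probabilities[i])


# hypothetical output row for a single sample with 4 classes
row = [math.log(0.1), math.log(0.6), math.log(0.2), math.log(0.1)]
class_index = predicted_class(row)
print(class_index)
```

Mapping the index back to a crop name depends on the class mapping of the dataset and is not shown here.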

Furthermore, for a detailed data analysis, you can check the Hands-on Tutorial on Time Series, a Jupyter Notebook for time series data exploration with the BreizhCrops benchmark.

Train a model

Train a model via the example script train.py

python train.py TransformerEncoder --learning-rate 0.001 --weight-decay 5e-08 --preload-ram

This script uses the default model parameters from breizhcrops.models.TransformerModel. When training for multiple epochs, the --preload-ram flag speeds up training significantly.

Acknowledgements

The model implementations from this repository are based on the following papers and github repositories.

TempCNN (reimplementation from keras source code ) Pelletier et al., 2019
LSTM Recurrent Neural Network adapted from Rußwurm & Körner, 2017
MS-ResNet implementation from Fei Wang
TransformerEncoder implementation was originally adapted from Yu-Hsiang Huang's GitHub repository, but later replaced by our own implementation when the torch.nn.transformer modules became available
InceptionTime Fawaz et al., 2019
StarRNN Turkoglu et al., 2019
OmniscaleCNN Tang et al., 2020

The raw label data originates from

Registre parcellaire graphique (RPG) of the French National Geographic Institute (IGN)

Reference

This work will be published in the proceedings of ISPRS Archives 2020. Preprint available on ArXiv

@article{breizhcrops2020,
  title={BreizhCrops: A Time Series Dataset for Crop Type Mapping},
  author={Ru{\ss}wurm, Marc and Pelletier, Charlotte and Zollner, Maximilian and Lef{\`e}vre, S{\'e}bastien and K{\"o}rner, Marco},
  journal={International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ISPRS (2020)},
  year={2020}
}

ISPRS virtual congress video can be found here

ICML workshop 2019

A previous version (see workshop website or arXiv version 1) was presented at the ICML 2019 Time Series workshop, Long Beach, USA. ICML workshop contributions do not appear in the ICML proceedings.

breizhcrops's Issues

some bugs for XX_Train and Evaluate Models

1.import models.transformer.Optim

ModuleNotFoundError Traceback (most recent call last)
in
12 from tqdm import tqdm
13 import numpy as np
---> 14 import models.transformer.Optim

ModuleNotFoundError: No module named 'models.transformer.Optim'

Problems while running the code

Could you elaborate on how to run your code? Some modules, such as AttentionModule, are missing, and we don't know what the correct input should be. The current input is the tslearn dataset named "Trace". I believe it is not the correct input, right?

some bugs for 01_Breizhcrops-Time-Series

plot_timeseries(idx=0,dataset=frh01) has bugs.

ValueError Traceback (most recent call last)
in
----> 1 plot_timeseries(idx=0, dataset=frh01)
2 plot_parcel_location(idx=0, dataset=frh01)

in plot_timeseries(idx, dataset)
5 def plot_timeseries(idx,dataset):
6
----> 7 X,y = dataset[idx]
8 fid = dataset.get_fid(idx)
9

ValueError: too many values to unpack (expected 2)

If I set X,y=dataset[idx],it will cause the problem below

KeyError Traceback (most recent call last)
~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2888 try:
-> 2889 return self._engine.get_loc(casted_key)
2890 except KeyError as err:

pandas_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'idx'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in
----> 1 plot_timeseries(idx=0, dataset=frh01)
2 plot_parcel_location(idx=0, dataset=frh01)

in plot_timeseries(idx, dataset)
6
7 X,y,fid = dataset[idx]
----> 8 fid = dataset.get_fid(idx)
9
10 X = X.append(X, columns=bands)

~\anaconda3\envs\pytorch_gpu\lib\site-packages\breizhcrops-0.0.2.4-py3.7.egg\breizhcrops\datasets\breizhcrops.py in get_fid(self, idx)
164
165 def get_fid(self, idx):
--> 166 return self.index[self.index["idx"] == idx].index[0]
167
168 def download_h5_database(self):

~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\frame.py in getitem(self, key)
2900 if self.columns.nlevels > 1:
2901 return self._getitem_multilevel(key)
-> 2902 indexer = self.columns.get_loc(key)
2903 if is_integer(indexer):
2904 indexer = [indexer]

~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2889 return self._engine.get_loc(casted_key)
2890 except KeyError as err:
-> 2891 raise KeyError(key) from err
2892
2893 if tolerance is not None:

KeyError: 'idx'

plot_parcel_location(idx=0,dataset=frh01)

pandas_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'idx'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in
----> 1 plot_parcel_location(idx=600, dataset=frh01)

in plot_parcel_location(idx, dataset)
54
55 gdf = dataset.geodataframe().to_crs(epsg=3857)
---> 56 parcel_geometry = gdf.loc[frh01.get_fid(idx)]
57
58 fig,ax = plt.subplots(1,1,figsize=(9, 9))

~\anaconda3\envs\pytorch_gpu\lib\site-packages\breizhcrops-0.0.2.4-py3.7.egg\breizhcrops\datasets\breizhcrops.py in get_fid(self, idx)
163
164 def get_fid(self, idx):
--> 165 return self.index[self.index["idx"] == idx].index[0]
166
167 def download_h5_database(self):

KeyError: 'idx'
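The first traceback in this issue suggests that dataset[idx] returns three values, and the library line "return X, y, row.id" quoted in a later traceback on this page points the same way. Under that assumption, the notebook could unpack the field id directly instead of calling get_fid(idx). The sketch below uses a hypothetical stand-in class (TinyDataset) with made-up values, not the real BreizhCrops dataset.

```python
class TinyDataset:
    """Hypothetical stand-in that returns (X, y, field_id) for each sample."""

    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        X, y, field_id = self.samples[idx]
        return X, y, field_id


dataset = TinyDataset([([[0.1, 0.2], [0.3, 0.4]], 3, 6017)])

# unpack all three values; the field id comes with the sample,
# so no separate get_fid(idx) lookup is needed
X, y, fid = dataset[0]

# when the id is not needed, star-unpacking ignores any extra values
X_only, y_only, *_ = dataset[0]
```

Whether get_fid itself should be repaired is a separate question for the maintainers.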

check dependencies

geopandas is currently required to import breizhcrops, but it is unused unless geodataframe() is called.
This is inconvenient if you only want to use the torch component of the dataset.
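One common way to make such a dependency optional is to import it inside the method that needs it. The sketch below illustrates the pattern with a hypothetical class (DatasetSketch); it is not the library's code, and the module name is a parameter only so that the example can be tried without geopandas installed.

```python
import importlib


class DatasetSketch:
    """Hypothetical dataset class; only geodataframe() needs the optional package."""

    def __init__(self, optional_module="geopandas"):
        # module name kept as a parameter so the sketch can be tried without geopandas
        self.optional_module = optional_module

    def __getitem__(self, idx):
        # the core data access path does not touch the optional dependency
        return [0.0, 0.0], 0, idx

    def geodataframe(self):
        # import lazily, and explain what is missing if the import fails
        try:
            module = importlib.import_module(self.optional_module)
        except ImportError as err:
            raise ImportError(
                f"{self.optional_module} is required for geodataframe(); "
                "install it to use the geometry features"
            ) from err
        return module
```

With this layout, importing the package and indexing the dataset would work without the optional module, and only a call to geodataframe() would raise.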

TODO outputs confusion matrices for test predictions.

Transformer Bug

Fixed a bug in the new transformer implementation. Need to redo tuning + upload of weights + re-adding transformer to tests

L2A Products

TODO: include bottom-of-atmosphere L2A products

Tests take 10 minutes

Currently the Unittests take 10 minutes.
I think this is mostly due to download of unnecessarily large files.
Would be great to refactor the tests and maybe increase the test coverage as well.

pretrained weights for Inceptiontime seem to underperform

it seems that something is wrong with the pretrained weights of inceptiontime model.
The performance is way below the evaluation.

TODO: investigate and check if we uploaded the right weights

Some bugs for 03_TraninEvaluateModels

from breizhcrops.models.TransformerEncoder import TransformerEncoder
It shows that the ModuleNotFoundError: No module named 'breizhcrops.models.TransformerEncoder'

2.Train the Long Short-Term Memory Network
It shows that
KeyError Traceback (most recent call last)
~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2888 try:
-> 2889 return self._engine.get_loc(casted_key)
2890 except KeyError as err:

pandas_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in
8 betas=(0.9, 0.98), eps=1e-09)
9
---> 10 lstm = train(lstm, optimizer, traindataloader, epochs)

in train(model, optimizer, dataloader, epochs)
18 loss_log = AverageMetric()
19
---> 20 for iteration, data in enumerate(dataloader):
21 optimizer.zero_grad()
22

~\anaconda3\envs\pytorch_gpu\lib\site-packages\torch\utils\data\dataloader.py in next(self)
343
344 def next(self):
--> 345 data = self._next_data()
346 self._num_yielded += 1
347 if self._dataset_kind == _DatasetKind.Iterable and \

~\anaconda3\envs\pytorch_gpu\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
383 def _next_data(self):
384 index = self._next_index() # may raise StopIteration
--> 385 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
386 if self._pin_memory:
387 data = _utils.pin_memory.pin_memory(data)

~\anaconda3\envs\pytorch_gpu\lib\site-packages\torch\utils\data_utils\fetch.py in fetch(self, possibly_batched_index)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]

~\anaconda3\envs\pytorch_gpu\lib\site-packages\torch\utils\data_utils\fetch.py in (.0)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]

~\anaconda3\envs\pytorch_gpu\lib\site-packages\torch\utils\data\dataset.py in getitem(self, idx)
205 else:
206 sample_idx = idx - self.cumulative_sizes[dataset_idx - 1]
--> 207 return self.datasets[dataset_idx][sample_idx]
208
209 @Property

~\anaconda3\envs\pytorch_gpu\lib\site-packages\breizhcrops-0.0.2.4-py3.7.egg\breizhcrops\datasets\breizhcrops.py in getitem(self, index)
260 X = self.transform(X)
261 if self.target_transform is not None:
--> 262 y = self.target_transform(y)
263
264 return X, y, row.id

in target_transform(y)
25
26 def target_transform(y):
---> 27 y = frh01.mapping.loc[y].id
28 return torch.tensor(y, dtype=torch.long, device=device)
29

~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\indexing.py in getitem(self, key)
877
878 maybe_callable = com.apply_if_callable(key, self.obj)
--> 879 return self._getitem_axis(maybe_callable, axis=axis)
880
881 def _is_scalar_access(self, key: Tuple):

~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1108 # fall thru to straight lookup
1109 self._validate_key(key, axis)
-> 1110 return self._get_label(key, axis=axis)
1111
1112 def _get_slice_axis(self, slice_obj: slice, axis: int):

~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\indexing.py in _get_label(self, label, axis)
1057 def _get_label(self, label, axis: int):
1058 # GH#5667 this will fail if the label is not present in the axis.
-> 1059 return self.obj.xs(label, axis=axis)
1060
1061 def _handle_lowerdim_multi_index_axis0(self, tup: Tuple):

~\anaconda3\envs\pytorch_gpu\lib\site-packages\pandas-1.1.1-py3.7-win-amd64.egg\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
3480 loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
3481 else:
-> 3482 loc = self.index.get_loc(key)
3483
3484 if isinstance(loc, np.ndarray):

KeyError: 0

the Transformer Encoder can not be trained

4.the parameter of tempCNN is wrong
it should be tempcnn = TempCNN(input_dim=13, num_classes=13, sequencelength=45, kernel_size=5,hidden_dims=64,dropout=0.5)

5.the Parameters of MSresnet is also wrong
it should be msresnet = MSResNet(input_dim=13, num_classes=13, hidden_dims=32)

Where can I find codes.csv

I was using BreizhCrops.ipynb. But "/home/marc/projects/ICML19_TSW/images/codes.csv" cannot be found. Could you tell me where to find the csv file?

Update the CLD field in the dataset with new GEE cloud mask product

All S2 images of GEE have now been cloud masked:
The product is here
https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_CLOUD_PROBABILITY

The index dataframe in the BreizhCrops dataset has a CLD field that originates from the QA60 field of the satellite data.
That cloud indicator is quite bad and hardly usable (from my own visual inspection).

We could gather the classified cloudiness and update the CLD field in the dataframes without changing the code.

@maxzoll @charlotte-pel what do you think?

TODO add random forest

Problematic Dataset for Colab

Hi,
I want to use your served models. Before transforming my datasets for your models, I tried to run your Colab codes in order to understand your dataset. But there is a problem in there: 'ValueError: Shape of passed values is (64, 17), indices imply (64, 19)' and also I do not get why these have 17 columns (13 bands + 4 unknown columns). Could you please help me?

Add an if statement

Add an if statement to check whether csvfolder already exists:
if not os.path.exists(self.csvfolder):

https://github.com/TUM-LMF/BreizhCrops/blob/843aae8e6b2c8cbfa4eec08a1dc1ec87bd8c6fe4/breizhcrops/datasets/breizhcrops.py#L93
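A minimal sketch of the suggested guard, using a temporary directory and a hypothetical folder layout rather than the dataset class itself. The helper name ensure_folder is made up for this example.

```python
import os
import tempfile


def ensure_folder(path):
    """Create the folder only when it is missing; return True if it was created."""
    if not os.path.exists(path):
        os.makedirs(path)
        return True
    return False


with tempfile.TemporaryDirectory() as root:
    csvfolder = os.path.join(root, "csv", "frh01")  # hypothetical layout
    first = ensure_folder(csvfolder)   # folder is created
    second = ensure_folder(csvfolder)  # already exists, nothing to do
```

os.makedirs(path, exist_ok=True) from the standard library has the same effect in a single call.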

HTTP ERROR for model URLS

It gives "urllib.error.HTTPError: HTTP Error 404: Not Found" error even when trying to run minimal working example in readme file. I think it causes from model urls in pretrained script (https://github.com/dl4sits/BreizhCrops/blob/43e7304cd6b446b73cfa46b10f4afb2f53b2c874/breizhcrops/models/pretrained.py)

INCEPTIONTIME_URL = "https://syncandshare.lrz.de/dl/fi3jqF5niKQJTufETbbBPp8N/InceptionTime_input-dim%3D13_num-classes%3D9_hidden-dims%3D64_num-layers%3D4_learning-rate%3D0.0005930998594456241_weight-decay%3D1.8660112778851542e-05.pth"
LSTM_URL = "https://syncandshare.lrz.de/dl/fiGjW6JtFuiUs6kcRHaYbNUr/LSTM_input-dim%3D13_num-classes%3D9_hidden-dims%3D128_num-layers%3D4_bidirectional%3DTrue_use-layernorm%3DTrue_dropout%3D0.5713020228087161_learning-rate%3D0.009880117756170353_weight-decay%3D5.256755602421856e-07.pth"
MSRESNET_URL = "https://syncandshare.lrz.de/dl/fi6FKvymvpyHZ4JVtyWo64wh/MSResNet_input-dim%3D13_num-classes%3D9_hidden-dims%3D32_learning-rate%3D0.0006271686393146093_weight-decay%3D4.750234747127917e-06.pth"
OMNISCALECNN_URL = "https://syncandshare.lrz.de/dl/fi8BZ53crPbExH79xMpNXop3/OmniScaleCNN_learning-rate%3D0.001057192239267413_weight-decay%3D2.2522895556530792e-07.pth"
STARRNN_URL = "https://syncandshare.lrz.de/dl/fiDxFhPxyFxAUVTJKCbncnnS/StarRNN_input-dim%3D13_num-classes%3D9_hidden-dims%3D128_num-layers%3D3_dropout%3D0.5_learning-rate%3D0.008960989762612663_weight-decay%3D2.2171861339535254e-06.pth"
TEMPCNN_URL = "https://syncandshare.lrz.de/dl/fiVpXRMKiEQKfLFnRrKGFhwV/TempCNN_input-dim%3D13_num-classes%3D9_sequencelenght%3D45_kernelsize%3D7_hidden-dims%3D128_dropout%3D0.18203942949809093_learning-rate%3D0.00023892874563871753_weight-decay%3D5.181869707846283e-05.pth"
TRANSFORMER_URL = "https://syncandshare.lrz.de/dl/fiJEVQ1SmvqwNh3EvTGSZnML/new_TransformerEncoder_input-dim%3D13_num-classes%3D9_d-model%3D64_d-inner%3D128_n-layers%3D5_n-head%3D2_dropout%3D0.017998950510888446_learning-rate%3D0.00017369201853408445_weight-decay%3D3.5156458637523697e-06.pth"

Use h5 database instead of large npz arrays

add rocket

https://pyts.readthedocs.io/en/stable/generated/pyts.transformation.ROCKET.html

Fix L2A loading issue

Band sequence of L2A loading is not correct.
fix via if-else case on level

https://github.com/TUM-LMF/BreizhCrops/blob/843aae8e6b2c8cbfa4eec08a1dc1ec87bd8c6fe4/examples/train.py#L73

Pointing out errors in the Breizhcrops Tutorial Colab Notebook

I'd like to point out that the latest changes to the Breizhcrops library affected the code in the Breizhcrops Tutorial Colab Notebook, so I'm going to point out the fixes.

Normalized Difference Vegetation Index

I think the error is probably caused by the addition of label, id to Bands.
I think the changes we made in this commit affected it.

It will work if you make the following changes.

def calculate_ndvi(input_timeseries):
  # https://github.com/dl4sits/BreizhCrops/pull/24/commits/cbe3009c9adee06f51d252b1a734a1ba5636a2b6
  # add ['label', 'id']
  bands = allbands["L1C"].copy()
  bands.remove('label')
  bands.remove('id')
  data = pd.DataFrame(input_timeseries,columns=bands)
  data["doa"] = pd.to_datetime(data["doa"])
  data = data.set_index("doa")
  
  l1c_bands = ['B1', 'B10', 'B11', 'B12', 'B2', 'B3', 'B4', 
               'B5', 'B6', 'B7', 'B8','B8A', 'B9']
  data[l1c_bands] *= 1e-4

  nir = data["B8"]#
  red = data["B4"]

  return (nir-red) / (nir+red+1e-8)

dataset_index = 125 #@param {type:"slider", min:0, max:1048, step:1}

fig,axs = plt.subplots(1,2, figsize=(22,3))

ax = axs[0]

regular_dataset = breizhcrops.BreizhCrops(region="belle-ile", level="L1C", transform=raw_transform)
x,y,field_id = datasets["L1C"][dataset_index]
date = pd.to_datetime(x[:,-1])
x = x[:,:-1]
ax.plot(date,x[:,:-1])
ax.set_title("all spectral bands")

fig.suptitle(regular_dataset.classname[y])
ax = axs[1]

ndvi_dataset = breizhcrops.BreizhCrops(region="belle-ile", level="L1C", transform=calculate_ndvi)
x,y,field_id = ndvi_dataset[dataset_index]
ax.plot(x)
ax.set_title("NDVI")
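To make the arithmetic in calculate_ndvi concrete, here is a worked example for a single observation in plain Python. The digital numbers are made up; the 1e-4 scaling and the small epsilon in the denominator follow the function above.

```python
def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index for two reflectance values."""
    return (nir - red) / (nir + red + eps)


# hypothetical raw digital numbers, scaled by 1e-4 as in the function above
b8 = 4200 * 1e-4  # near infrared
b4 = 600 * 1e-4   # red

value = ndvi(b8, b4)
print(round(value, 3))
```

Here the reflectances are 0.42 and 0.06, so the index comes out at about 0.36 / 0.48 = 0.75. The epsilon only matters when both bands are zero, where it avoids a division by zero.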

add link to original IGN data in readme

edit: found link
https://www.data.gouv.fr/fr/datasets/registre-parcellaire-graphique-rpg-contours-des-parcelles-et-ilots-culturaux-et-leur-groupe-de-cultures-majoritaire/

fix missing if statement

Is an if statement on level missing here?

fix folder structure representation

doc string in https://github.com/TUM-LMF/BreizhCrops/blob/843aae8e6b2c8cbfa4eec08a1dc1ec87bd8c6fe4/breizhcrops/datasets/breizhcrops.py#L102
is missing a region folder for the csv files

Visualizations of Google Colab bands at L2A seem to be incorrect

Dear,

I have found an issue in the Google Colab visualizations of the time series. It seems to me that the band order in 'allbands[L2A]' is incorrect, as it includes 'id' and 'code_cultu' before the actual reflectance bands. In the dataset (using Belle-ile) this does not seem to be the case: there, the doa comes first and is followed directly by the reflectance values of the bands. As a consequence, the bands_idxs in the 'plot_sample' function for L2A are incorrect. It states that B2 is in position 3, but in the dataset position 3 is actually B4. In the plots, I first noticed the strange behaviour of B11 and B12, which seem to be always (near) 0. I think this is because the values of 'CLD' and 'EDG' are plotted as B11 and B12, respectively. So I think B2 in the plot is actually showing B4, B3 is actually showing B5, B4 is actually showing B6, and so on. Eventually, B11 is actually showing CLD and B12 is actually showing EDG.
I don't know if this affects the modelling; I have not gone through that part in detail yet.

Kind regards,
Stien Heremans
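One way to avoid this kind of off-by-position error is to look bands up by name in the list of columns that actually describes the array, rather than hard-coding indices. The sketch below uses a hypothetical helper (band_positions) and a made-up column order for illustration; it does not reproduce the real L2A layout.

```python
def band_positions(columns, wanted):
    """Map each wanted band name to its position in the column list."""
    missing = [name for name in wanted if name not in columns]
    if missing:
        raise KeyError(f"bands not found: {missing}")
    return {name: columns.index(name) for name in wanted}


# hypothetical column order, for illustration only
columns = ["doa", "B2", "B3", "B4", "B5", "CLD", "EDG"]
positions = band_positions(columns, ["B4", "B2"])
```

This only helps if the column list matches the data, which is exactly what the issue calls into question, so the list itself would still need to be checked against the dataset.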

TODO implement starrfm

Query over the forward function of all models

BreizhCrops/breizhcrops/models/PETransformerModel.py

Lines 40 to 52 in 6de796e

    
def forward(self,x):
    x = self.inlinear(x)
    x = self.pe(x)
    x = x.transpose(0, 1) # N x T x D -> T x N x D
    x = self.transformerencoder(x)
    x = x.transpose(0, 1) # T x N x D -> N x T x D
    x = x.max(1)[0]
    x = self.relu(x)
    logits = self.outlinear(x)

    logprobabilities = F.log_softmax(logits, dim=-1)
    return logprobabilities

I wanted to apply your models to my data, and I found that at line 50 of the script PETransformerModel.py the log_softmax function is applied.

Secondly, when I saw your examples/train.py , I found criterion used is CrossEntropyLoss

BreizhCrops/examples/train.py

Lines 36 to 40 in 6de796e

    
criterion = torch.nn.CrossEntropyLoss(reduction="mean")

log = list()
for epoch in range(args.epochs):
    train_loss = train_epoch(model, optimizer, criterion, traindataloader, device)

BreizhCrops/examples/train.py

Lines 170 to 182 in 6de796e

    
def train_epoch(model, optimizer, criterion, dataloader, device):
    model.train()
    losses = list()
    with tqdm(enumerate(dataloader), total=len(dataloader), leave=True) as iterator:
        for idx, batch in iterator:
            optimizer.zero_grad()
            x, y_true, _ = batch
            loss = criterion(model.forward(x.to(device)), y_true.to(device))
            loss.backward()
            optimizer.step()
            iterator.set_description(f"train loss={loss:.2f}")
            losses.append(loss)
    return torch.stack(losses)

So, the loss is calculated on the log_softmax output, which is not recommended by PyTorch, as the nn.CrossEntropyLoss function already applies it internally (https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html).
I found that this is done in the other models too.

I am not sure whether this was done intentionally, is an implementation bug, or is a mistake in my understanding. Could you explain this before I run my computation?
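One way to reason about the question: log-softmax is idempotent. Log-probabilities exponentiate to values that sum to 1, so their log-sum-exp is 0 and a second log-softmax subtracts nothing. Assuming the criterion computes the negative log-softmax at the target index (the PyTorch documentation describes CrossEntropyLoss in these terms), the loss on log-probabilities would equal the loss on raw logits, with only some redundant computation. The plain-Python sketch below checks this for one made-up sample; the helper functions are written for this example, and it does not speak for the authors' intent.

```python
import math


def log_softmax(values):
    """Numerically stable log-softmax of a list of floats."""
    highest = max(values)
    log_sum = highest + math.log(sum(math.exp(v - highest) for v in values))
    return [v - log_sum for v in values]


def cross_entropy(scores, target):
    """Cross entropy for one sample: negative log-softmax at the target index."""
    return -log_softmax(scores)[target]


logits = [2.0, -1.0, 0.5]  # made-up scores for three classes
log_probs = log_softmax(logits)

# same loss whether the criterion sees logits or log-probabilities
loss_from_logits = cross_entropy(logits, target=0)
loss_from_log_probs = cross_entropy(log_probs, target=0)
```

The two losses agree up to floating-point rounding, which suggests the double application is harmless for training, although returning raw logits would be the more conventional pairing with this criterion.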

bands data question

After reviewing the paper and the data, I am not quite clear about the meaning of the 'QA10', 'QA20', 'QA60' bands in L1C, and also the 'CLD', 'EDG', 'SAT' bands in L2A. Could anyone kindly shed some light on this issue?

Pointing out another error in the hands-on tutorial notebook

This issue is very similar to #34. The addition of the labels "id" and "label" causes the plotting section to fail due to an unexpected dimension size. Removing these two entries fixes the execution of the cell.

This may not be the best solution though; detaching this removal process from the function would be most ideal.

Modifying cell 51 as such will fix this section of the code:

import numpy as np
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import breizhcrops

level = "L1C" #@param ["L1C", "L2A"]

if level == "L1C":
    selected_bands = ['B1', 'B10', 'B11', 'B12', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B8A', 'B9']
elif level == "L2A":
    selected_bands = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B8A', 'B11', 'B12']

def get_interpolate_transform(interpolation_frequency, interpolation_method):

  def interpolate_transform(input_timeseries):
    #input_timeseries = raw_transform(input_timeseries)

    bands = allbands["L1C"].copy()
    bands.remove('label')
    bands.remove('id')
    data = pd.DataFrame(input_timeseries, columns=bands)
    data["doa"] = pd.to_datetime(data["doa"])
    data = data.set_index("doa")
    data = data.reindex(pd.date_range(start=datetime.datetime(data.index[0].year,1,1),
                                end=datetime.datetime(data.index[0].year,12,31),
                                freq=interpolation_frequency))
    data = data.interpolate(method=interpolation_method)
    data = data.fillna(method="ffill").fillna(method="bfill")
    data = data[selected_bands] * 1e-4
    return data
  return interpolate_transform

frequencies = ["1D","2D","9D"]
methods = ["linear", "nearest", "quadratic", "cubic"]

fig, axs = plt.subplots(len(methods), len(frequencies), figsize=(25,12), sharex=True, sharey=True)
axs = iter(np.array(axs).reshape(-1))

for method in methods:
  for freq in frequencies:
    transform = get_interpolate_transform(freq, method)
    ds = breizhcrops.BreizhCrops(region="belle-ile", level=level, transform=transform)
    x,y,i = ds[10]
    ax = next(axs)
    ax.set_ylim(0,1)
    ax.plot(x)
    ax.set_title(f"method {method}, frequency {freq}")
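The idea of detaching the removal step from the transform could look like the helper below. It is a sketch with a hypothetical function name (without_columns) and a made-up excerpt of a column list, not the library's actual allbands["L1C"]. Unlike list.remove, it does not raise when a name is absent and it leaves the input list unchanged.

```python
def without_columns(bands, drop=("label", "id")):
    """Return a new list without the given column names; the input is left unchanged."""
    return [name for name in bands if name not in drop]


# hypothetical excerpt of a band list, for illustration only
all_columns = ["B1", "B2", "B3", "doa", "label", "id"]
frame_columns = without_columns(all_columns)
```

Inside the transform, the result would then take the place of the copy-and-remove lines when building the DataFrame.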

Bottleneck

Hi,

I think linear should be conv. It might not change anything for univariate time series, but I think it will for multivariate time series:

BreizhCrops/breizhcrops/models/InceptionTime.py

Line 50 in 4060d5c

self.bottleneck = nn.Linear(num_filters, out_features=1, bias=use_bias)

Cheers,
Charlotte.

cannot init the dataset, data.zip is gone?

Python 3.8.2 | packaged by conda-forge | (default, Feb 27 2020, 19:33:47)
[Clang 9.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from breizhcrops import BreizhCrops
>>> BreizhCrops(root="data",region="frh01")


Initializing BreizhCrops region frh01
classmapping.csv: 0.00B [00:01, ?B/s]
Traceback (most recent call last):
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 1319, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/http/client.py", line 1230, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/http/client.py", line 1276, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/http/client.py", line 1225, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/http/client.py", line 1004, in _send_output
    self.send(msg)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/http/client.py", line 944, in send
    self.connect()
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/http/client.py", line 1399, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jun.xiong/Desktop/projects/breizhcrops/breizhcrops/datasets/breizhcrops.py", line 57, in __init__
    self.load_classmapping(classmapping)
  File "/Users/jun.xiong/Desktop/projects/breizhcrops/breizhcrops/datasets/breizhcrops.py", line 128, in load_classmapping
    download_file(CLASSMAPPINGURL, classmapping)
  File "/Users/jun.xiong/Desktop/projects/breizhcrops/breizhcrops/utils.py", line 43, in download_file
    urllib.request.urlretrieve(url, filename=output_path, reporthook=t.update_to)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 1362, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Users/jun.xiong/anaconda/envs/breizhcrops/lib/python3.8/urllib/request.py", line 1322, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)>
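The message "self signed certificate in certificate chain" often appears when a proxy or security appliance re-signs HTTPS traffic; whether that is the cause here is an assumption. If it is, one option that keeps verification enabled is to install a urllib opener that trusts an additional CA bundle before initializing the dataset. The sketch below uses only standard-library calls; the function name and the cafile argument are hypothetical, and passing None keeps the system defaults.

```python
import ssl
import urllib.request


def install_opener_with_ca(cafile=None):
    """Install a global urllib opener that verifies HTTPS against the given CA bundle.

    cafile is a hypothetical path to a PEM bundle that contains the extra
    (for example proxy) certificate; None keeps the system defaults.
    """
    context = ssl.create_default_context(cafile=cafile)
    handler = urllib.request.HTTPSHandler(context=context)
    opener = urllib.request.build_opener(handler)
    # urlretrieve goes through urlopen, which uses the installed opener
    urllib.request.install_opener(opener)
    return context


context = install_opener_with_ca()
```

Certificate and hostname checks stay on with this approach, which is preferable to disabling verification.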

add inceptiontime

reuse pretrained weights if already downloaded
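A minimal sketch of what such reuse could look like: download only when the target file is missing. The function name, the file name, the URL, and the injected download callable are all hypothetical; in practice a real downloader such as urllib.request.urlretrieve would be passed in place of the stand-in.

```python
import os
import tempfile


def fetch_once(url, path, download):
    """Call download(url, path) only when path does not exist yet; return True if it ran."""
    if os.path.exists(path):
        return False
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    download(url, path)
    return True


calls = []


def fake_download(url, path):
    # stand-in for a real downloader such as urllib.request.urlretrieve
    calls.append(url)
    with open(path, "wb") as handle:
        handle.write(b"weights")


with tempfile.TemporaryDirectory() as cache:
    target = os.path.join(cache, "models", "Transformer.pth")  # hypothetical file name
    first = fetch_once("https://example.org/Transformer.pth", target, fake_download)
    second = fetch_once("https://example.org/Transformer.pth", target, fake_download)
```

A more careful version might also verify a checksum, so that a partially downloaded file is not mistaken for a valid cache entry.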

InceptionTime model does not classify correctly when "residual=True"

I tested all 7 models in ./breizhcrops/models on the same dataset with 4 classes, processed with the same code. The InceptionTime model assigns every sample to a single class, while the other six models work fine.

However, after changing line 45 of InceptionTime.py from

    def __init__(self, kernel_size=32, num_filters=128, residual=True, use_bias=False, device=torch.device("cpu")):

to

    def __init__(self, kernel_size=32, num_filters=128, residual=False, use_bias=False, device=torch.device("cpu")):

the results seem reasonable; all 4 classes are classified with high precision.

Do I have to uncomment lines 36 to 38 to use residual?

    #if self.use_residual and d % 3 == 2:
    #    x = self._shortcut_layer(input_res, x)
    #    input_res = x

The code I used to processing the datasets, fit the model and classify raster:
https://gist.github.com/GenghisYoung233/834909ced3531e57b8ec6e0353d17c27

Installation issue

OS: Windows 10
Architecture: Intel
Base: Python: 3.7.6 (anaconda distribution)

pip install breizhcrops raises the following issue
ERROR: Could not find a version that satisfies the requirement torch>=1.4.0 (from breizhcrops) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
ERROR: No matching distribution found for torch>=1.4.0 (from breizhcrops)
