I really appreciate your outstanding work! The fusion of image and time-series modalities is exactly the direction I have been researching recently, so I tested your model on my own dataset and compared it with some models I have used before. However, CrossViVit did not perform as well as expected, and I would appreciate some advice on how to improve its forecasting performance.
Similar to the dataset used in your work, mine consists of 15-minute satellite images with a single channel (image size = 96*96) and the corresponding photovoltaic power. The difference is that my data spans only 2.8 years, with the last 5 months held out for testing. The model takes the past 4 hours of data (16 steps) and predicts the next 4 hours of photovoltaic power (16 steps).
I trained the model using the default parameters of CrossViVit as in your experiments, with the following differences:
- Optical flow was not used.
- Loss criterion = `nn.MSELoss` (the paper used L1 loss?).
- AdamW optimizer with a learning rate of 0.001.
- The batch_size was 16.
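For clarity, here is a minimal sketch of the training configuration listed above (the model here is a placeholder stand-in, not CrossViVit itself, and the loop is reduced to a single dummy step):

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 16)                   # placeholder for CrossViVit

criterion = nn.MSELoss()                    # MSE, whereas the paper seems to use L1
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
batch_size = 16

# One dummy optimization step, just to show the loop shape
x = torch.randn(batch_size, 16)             # past 16 PV steps
y = torch.randn(batch_size, 16)             # future 16 PV steps
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```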
During the experiments, both the train loss and the validation loss kept fluctuating and did not decrease, and I am not sure what the issue might be. Below are the train and validation loss curves from the wandb logs. I compared them with Perceiver-RNN, a model used in the OCF project: https://github.com/openclimatefix/predict_pv_yield (experiment/003*.py)
train_loss: https://api.wandb.ai/links/740402059/uji2orxi
train_step_loss: https://api.wandb.ai/links/740402059/lwz71qqh
valid_loss: https://api.wandb.ai/links/740402059/k7a2abvx
I believe the cross-attention used in CrossViVit should outperform the concatenation used in Perceiver-RNN, but I am not sure what might be preventing the loss from decreasing.