Comments (20)

zezhishao commented on July 22, 2024

ckpt/DLinear/PEMS08/DLinear_best_val_MAE.pt is the checkpoint of the DLinear model; it stores the model's weights.
If you want to visualize your predictions and the ground truth, you should modify the test function in runners.base_tsf_runner.py.
Specifically, you can save the prediction and real_value before this line.
They are both tensors of shape [B, L, N, C], where B is the number of samples, L is the prediction length, N is the number of time series (i.e., variables), and C = 1 is the target feature dimension.
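
A minimal sketch of how this could look (the exact variable names inside the test loop may differ; prediction and real_value are the tensors mentioned above, and the file paths are just placeholders):

# inside the test function, right before the metrics are computed (torch is already imported there)
torch.save(prediction.detach().cpu(), "prediction.pt")  # [B, L, N, C], placeholder path
torch.save(real_value.detach().cpu(), "real_value.pt")  # [B, L, N, C], placeholder path

# later, in a separate script or notebook
import torch
import matplotlib.pyplot as plt

prediction = torch.load("prediction.pt")
real_value = torch.load("real_value.pt")

sample, node = 0, 0  # pick one test sample and one time series
plt.plot(real_value[sample, :, node, 0].numpy(), label="ground truth")
plt.plot(prediction[sample, :, node, 0].numpy(), label="prediction")
plt.legend()
plt.show()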

zezhishao commented on July 22, 2024

The configuration in this repo is appropriate. Kindly note that Crossformer is designed for long-term time series modeling and does not naturally support short-term inputs, e.g., 12 time steps.

zezhishao commented on July 22, 2024

It seems that there is no GPU available. You can run on the CPU by passing an empty GPU list:

python3.9 experiments/inference.py -m DLinear -d PEMS08 -g ""

faizanhakim commented on July 22, 2024

Is there any method to bypass the use of the GPU?

zezhishao commented on July 22, 2024

Does your virtual machine correctly recognize the GPU? For example, does nvidia-smi produce normal output?

faizanhakim commented on July 22, 2024

faizab@faizab-virtual-machine:~$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

This is the output after I first run "sudo apt install nvidia-utils-450-server".

My laptop has an NVIDIA 820M, but it does not seem to be recognized in the VMware 2016 virtual machine.

zezhishao commented on July 22, 2024

The NVIDIA driver is not installed correctly; you should install it first.
BTW, the NVIDIA 820M is quite old and may not be sufficient for running the more complex baselines in BasicTS.

faizanhakim commented on July 22, 2024

I am unable to install the NVIDIA driver because the GPU is not recognized in the virtual machine. Will the models work on Windows 10 with Python 3.9?

zezhishao commented on July 22, 2024

Honestly, considering the weak performance of the NVIDIA 820M, I wouldn't recommend using that GPU: it probably won't speed up training much, and installing the NVIDIA driver can be an annoying task.
If you want to use simple baselines like MLP- or Linear-based models, you can run them directly on the CPU. If you want to try more powerful baselines, such as STGNNs or Transformer-based models, the 820M is too weak to run them.

Back to your question: Windows 10 with Python 3.9 will work. BasicTS itself has no operating-system requirements. However, installing the NVIDIA driver and PyTorch on Windows may require additional work compared with Linux (e.g., Ubuntu).

faizanhakim commented on July 22, 2024

Hey, I tried running it on another PC with an NVIDIA RTX 3070, and this is the error that shows up:

PS D:\My Work\Fast Semester 7\Applied Machine Learning\ProjectNew\BasicTS-master\BasicTS-master> python experiments/inference.py -m DLinear -d PEMS08 -g "0"
2023-12-05 23:48:56,483 - easytorch-launcher - INFO - Launching EasyTorch runner.
DESCRIPTION: DLinear model configuration
RUNNER: <class 'basicts.runners.runner_zoo.simple_tsf_runner.SimpleTimeSeriesForecastingRunner'>
DATASET_CLS: <class 'basicts.data.dataset.TimeSeriesForecastingDataset'>
DATASET_NAME: PEMS08
DATASET_TYPE: Traffic Flow
DATASET_INPUT_LEN: 12
DATASET_OUTPUT_LEN: 12
GPU_NUM: 1
NULL_VAL: 0.0
ENV:
  SEED: 1
  CUDNN:
    ENABLED: True
MODEL:
  NAME: DLinear
  ARCH: <class 'baselines.DLinear.arch.dlinear.DLinear'>
  PARAM:
    seq_len: 12
    pred_len: 12
    individual: False
    enc_in: 170
  FORWARD_FEATURES: [0]
  TARGET_FEATURES: [0]
TRAIN:
  LOSS: masked_mae
  OPTIM:
    TYPE: Adam
    PARAM:
      lr: 0.002
      weight_decay: 0.0001
  LR_SCHEDULER:
    TYPE: MultiStepLR
    PARAM:
      milestones: [1, 25]
      gamma: 0.5
  NUM_EPOCHS: 100
  CKPT_SAVE_DIR: checkpoints\DLinear_100
  DATA:
    DIR: datasets/PEMS08
    BATCH_SIZE: 64
    PREFETCH: False
    SHUFFLE: True
    NUM_WORKERS: 2
    PIN_MEMORY: False
VAL:
  INTERVAL: 1
  DATA:
    DIR: datasets/PEMS08
    BATCH_SIZE: 64
    PREFETCH: False
    SHUFFLE: False
    NUM_WORKERS: 2
    PIN_MEMORY: False
TEST:
  INTERVAL: 1
  DATA:
    DIR: datasets/PEMS08
    BATCH_SIZE: 64
    PREFETCH: False
    SHUFFLE: False
    NUM_WORKERS: 2
    PIN_MEMORY: False
EVAL:
  USE_GPU: False
  HORIZONS: [12]

2023-12-05 23:48:56,493 - easytorch-env - INFO - Use devices 0.
2023-12-05 23:48:56,493 - easytorch-env - INFO - Disable TF32 mode
2023-12-05 23:48:56,499 - easytorch - INFO - Set ckpt save dir: 'checkpoints\DLinear_100\e3c9d1ca4d372a70afb22e2326370b81'
2023-12-05 23:48:56,499 - easytorch - INFO - Building model.
test len: 3566
2023-12-05 23:49:00,891 - easytorch-inference - INFO - Loading Checkpoint from 'ckpt/DLinear/PEMS08/DLinear_best_val_MAE.pt'
Traceback (most recent call last):
  File "D:\Python 3.9\lib\site-packages\easytorch\core\runner.py", line 289, in load_model
    checkpoint_dict = load_ckpt(self.ckpt_save_dir, ckpt_path=ckpt_path, logger=self.logger)
  File "D:\Python 3.9\lib\site-packages\easytorch\core\checkpoint.py", line 51, in load_ckpt
    return torch.load(ckpt_path, map_location=lambda storage, loc: to_device(storage))
  File "D:\Python 3.9\lib\site-packages\torch\serialization.py", line 594, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "D:\Python 3.9\lib\site-packages\torch\serialization.py", line 230, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "D:\Python 3.9\lib\site-packages\torch\serialization.py", line 211, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'ckpt/DLinear/PEMS08/DLinear_best_val_MAE.pt'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\My Work\Fast Semester 7\Applied Machine Learning\ProjectNew\BasicTS-master\BasicTS-master\experiments\inference.py", line 44, in <module>
    launch_runner(cfg_path, inference, (ckpt_path, args.batch_size), devices=args.gpus)
  File "D:\My Work\Fast Semester 7\Applied Machine Learning\ProjectNew\BasicTS-master\BasicTS-master\basicts\launcher.py", line 10, in launch_runner
    easytorch.launch_runner(cfg=cfg, fn=fn, args=args, device_type=device_type, devices=devices)
  File "D:\Python 3.9\lib\site-packages\easytorch\launcher\launcher.py", line 116, in launch_runner
    fn(cfg, runner, *args)
  File "D:\My Work\Fast Semester 7\Applied Machine Learning\ProjectNew\BasicTS-master\BasicTS-master\experiments\inference.py", line 18, in inference
    runner.load_model(ckpt_path=ckpt)
  File "D:\Python 3.9\lib\site-packages\easytorch\core\runner.py", line 295, in load_model
    raise OSError('Ckpt file does not exist') from e
OSError: Ckpt file does not exist
PS D:\My Work\Fast Semester 7\Applied Machine Learning\ProjectNew\BasicTS-master\BasicTS-master>

Any idea what this is about?

faizanhakim commented on July 22, 2024

[screenshot]

I had to create the path 'ckpt/DLinear/PEMS08/DLinear_best_val_MAE.pt' manually. Is this the type of output that I should expect, and if so, is there any script or notebook in the project that I can use to visualize the results?
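
(For reference, one way to populate the path the inference script expects is to copy the best checkpoint out of the training run's save directory; a minimal sketch, assuming the training run above produced DLinear_best_val_MAE.pt under its CKPT_SAVE_DIR as shown in the log:)

import shutil
from pathlib import Path

# checkpoint written by the training run (see the "Set ckpt save dir" line in the log above)
train_ckpt = Path("checkpoints/DLinear_100/e3c9d1ca4d372a70afb22e2326370b81/DLinear_best_val_MAE.pt")

# path that inference.py tries to load (see the FileNotFoundError above)
target = Path("ckpt/DLinear/PEMS08/DLinear_best_val_MAE.pt")
target.parent.mkdir(parents=True, exist_ok=True)
shutil.copy(train_ckpt, target)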

faizanhakim commented on July 22, 2024

Another thing I would like to ask: I trained the Crossformer model on the PEMS08 dataset, and when I run the inference code, all test results are 0.0000. Is there an error in the model, or can this be considered a legitimate result?

[screenshot]

zezhishao commented on July 22, 2024

I can't reproduce this error. Can you provide me with more information? For example, what are your modifications and configurations?
[screenshot]

faizanhakim commented on July 22, 2024

This is my configuration file for PEMS08:

import os
import sys

# TODO: remove it when basicts can be installed by pip
sys.path.append(os.path.abspath(__file__ + "/../../.."))
from easydict import EasyDict
from basicts.losses import masked_mae, masked_mse
from basicts.data import TimeSeriesForecastingDataset
from basicts.runners import SimpleTimeSeriesForecastingRunner

from .arch import Crossformer

CFG = EasyDict()

# ================= general ================= #
CFG.DESCRIPTION = "Crossformer model configuration"
CFG.RUNNER = SimpleTimeSeriesForecastingRunner
CFG.DATASET_CLS = TimeSeriesForecastingDataset
CFG.DATASET_NAME = "PEMS08"
CFG.DATASET_TYPE = "Traffic Flow"
CFG.DATASET_INPUT_LEN = 12
CFG.DATASET_OUTPUT_LEN = 12
CFG.GPU_NUM = 1
CFG.NULL_VAL = 0.0

# ================= environment ================= #
CFG.ENV = EasyDict()
CFG.ENV.SEED = 0
CFG.ENV.CUDNN = EasyDict()
CFG.ENV.CUDNN.ENABLED = True

# ================= model ================= #
CFG.MODEL = EasyDict()
CFG.MODEL.NAME = "Crossformer"
CFG.MODEL.ARCH = Crossformer
NUM_NODES = 170
CFG.MODEL.PARAM = {
    "data_dim": NUM_NODES,
    "in_len": CFG.DATASET_INPUT_LEN,
    "out_len": CFG.DATASET_OUTPUT_LEN,
    "seg_len": 24,
    "win_size": 2,
    # default parameters
    "factor": 10,
    "d_model": 256,
    "d_ff": 512,
    "n_heads": 4,
    "e_layers": 3,
    "dropout": 0.2,
    "baseline": False
}
CFG.MODEL.FORWARD_FEATURES = [0]
CFG.MODEL.TARGET_FEATURES = [0]

# ================= optim ================= #
CFG.TRAIN = EasyDict()
CFG.TRAIN.LOSS = masked_mae
CFG.TRAIN.OPTIM = EasyDict()
CFG.TRAIN.OPTIM.TYPE = "Adam"
CFG.TRAIN.OPTIM.PARAM = {
    "lr": 0.0002,
    "weight_decay": 0.0005,
}
CFG.TRAIN.LR_SCHEDULER = EasyDict()
CFG.TRAIN.LR_SCHEDULER.TYPE = "MultiStepLR"
CFG.TRAIN.LR_SCHEDULER.PARAM = {
    "milestones": [1, 5],
    "gamma": 0.5
}

# ================= train ================= #
CFG.TRAIN.NUM_EPOCHS = 50
CFG.TRAIN.CKPT_SAVE_DIR = os.path.join(
    'checkpoints',
    '_'.join([CFG.MODEL.NAME, str(CFG.TRAIN.NUM_EPOCHS)])
)
# train data
CFG.TRAIN.DATA = EasyDict()
# read data
CFG.TRAIN.DATA.DIR = 'datasets/' + CFG.DATASET_NAME
# dataloader args, optional
CFG.TRAIN.DATA.BATCH_SIZE = 8
CFG.TRAIN.DATA.PREFETCH = False
CFG.TRAIN.DATA.SHUFFLE = True
CFG.TRAIN.DATA.NUM_WORKERS = 2
CFG.TRAIN.DATA.PIN_MEMORY = False

# ================= validate ================= #
CFG.VAL = EasyDict()
CFG.VAL.INTERVAL = 1
# validating data
CFG.VAL.DATA = EasyDict()
# read data
CFG.VAL.DATA.DIR = 'datasets/' + CFG.DATASET_NAME
# dataloader args, optional
CFG.VAL.DATA.BATCH_SIZE = 64
CFG.VAL.DATA.PREFETCH = False
CFG.VAL.DATA.SHUFFLE = False
CFG.VAL.DATA.NUM_WORKERS = 2
CFG.VAL.DATA.PIN_MEMORY = False

# ================= test ================= #
CFG.TEST = EasyDict()
CFG.TEST.INTERVAL = 1
# test data
CFG.TEST.DATA = EasyDict()
# read data
CFG.TEST.DATA.DIR = 'datasets/' + CFG.DATASET_NAME
# dataloader args, optional
CFG.TEST.DATA.BATCH_SIZE = 64
CFG.TEST.DATA.PREFETCH = False
CFG.TEST.DATA.SHUFFLE = False
CFG.TEST.DATA.NUM_WORKERS = 2
CFG.TEST.DATA.PIN_MEMORY = False
# ================= evaluate ================= #
CFG.EVAL = EasyDict()
CFG.EVAL.USE_GPU = True
CFG.EVAL.HORIZONS = [12]

zezhishao commented on July 22, 2024

Have you read the Crossformer paper [1]? Crossformer is proposed for long-term time series forecasting, which usually requires long historical inputs, such as 336 time steps. Furthermore, the model hyperparameters in CFG.MODEL.PARAM must be set appropriately for your input sequence length, especially the seg_len and win_size parameters.

If you need more information about Crossformer, see [1]. You may also refer to our paper [2] for additional guidance.

[1] Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting. https://openreview.net/pdf?id=vSVLM2j9eie
[2] Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis. https://arxiv.org/pdf/2310.06119.pdf
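
As a rough illustration (a sanity-check sketch based on the parameters in the config above, not code from BasicTS or Crossformer): Crossformer pads the input of length in_len up to a multiple of seg_len, splits it into segments, and the hierarchical encoder then merges win_size neighbouring segments per layer. With in_len = 12 and seg_len = 24 there is only a single padded segment at every level, so the cross-time attention has nothing to attend over:

import math

def crossformer_segment_check(in_len, seg_len, win_size, e_layers):
    """Illustrative check of how many temporal segments each encoder level sees."""
    pad_in_len = math.ceil(in_len / seg_len) * seg_len   # input is padded to a multiple of seg_len
    num_segments = pad_in_len // seg_len
    print(f"in_len={in_len}, seg_len={seg_len} -> {num_segments} segment(s) after padding")
    for layer in range(1, e_layers):                     # each later encoder layer merges win_size segments
        num_segments = max(1, math.ceil(num_segments / win_size))
        print(f"after encoder layer {layer + 1}: {num_segments} segment(s)")

# values from the PEMS08 config above
crossformer_segment_check(in_len=12, seg_len=24, win_size=2, e_layers=3)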

faizanhakim commented on July 22, 2024

Could you share your configuration with me so that I can use it as a reference?

faizanhakim commented on July 22, 2024

Are you using the full, long version of the datasets?

zezhishao commented on July 22, 2024

What do you mean by "full" and "long"? There are no "long" and "short" versions of the dataset itself. We generate samples from the time series with a sliding window of length $T = P + F$, where $P$ is the length of the historical data and $F$ is the length of the future data. $P$ and $F$ are user-adjustable hyperparameters; see this script. For example, PEMS08 can be used in both long and short settings, as described in our paper [1].

[1] Exploring advances in multivariate time series forecasting: Comprehensive benchmarking and heterogeneity analysis. https://arxiv.org/pdf/2310.06119.pdf
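
A minimal sketch of this sliding-window sample generation (the function name and array shapes here are illustrative, not the exact BasicTS preprocessing code):

import numpy as np

def generate_samples(data, history_len, future_len):
    """Slide a window of length P + F over a [T, N, C] series and collect (history, future) pairs."""
    samples = []
    for t in range(history_len, data.shape[0] - future_len + 1):
        history = data[t - history_len:t]    # P past steps
        future = data[t:t + future_len]      # F future steps
        samples.append((history, future))
    return samples

# e.g. PEMS08 has shape [T, 170, C]; P = F = 12 is the short-term setting,
# while P = 336 or longer is the long-term setting used by models like Crossformer
data = np.random.rand(1000, 170, 1)          # dummy stand-in for the real dataset
samples = generate_samples(data, history_len=12, future_len=12)
print(len(samples), samples[0][0].shape, samples[0][1].shape)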

faizanhakim commented on July 22, 2024

Sometimes, after only a few epochs of training, all errors become zero with the given configuration. Is there anything I can do to avoid this?

[screenshot]

zezhishao commented on July 22, 2024

There could be many reasons, and I would need more information to reproduce your error.
