Comments (10)
I was able to get the fine-tuning tutorial working with the changes from these two PRs: Open-Catalyst-Project/tutorial#4 and #630. You can try these branches to see if they solve the problem.
from ocp.
I think you are correct that this was an oversight when converting to the new trainer/configs. The new location for the dataset format makes more sense but is not backwards compatible. You should be able to get around this error by adding `"format": "ase_db"` to the dataset config.
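For reference, a dataset block with the format key added might look like the following (a minimal sketch; the `src` path and `a2g_args` values are placeholders for your own setup):

```yaml
dataset:
  train:
    format: ase_db      # the format now lives under each split in the new configs
    src: train.db       # placeholder path to your ASE database
    a2g_args:
      r_energy: true
      r_forces: true
```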
That did the trick regarding that part - thank you!
However, I now receive the following error in the output after the model loads:
2024-02-26 09:22:56 (INFO): Loading dataset: ase_db
2024-02-26 09:22:56 (INFO): Batch balancing is disabled for single GPU training.
2024-02-26 09:22:56 (INFO): Batch balancing is disabled for single GPU training.
2024-02-26 09:22:56 (INFO): Batch balancing is disabled for single GPU training.
2024-02-26 09:22:56 (INFO): Loading model: gemnet_oc
C:\Users\gls5443\Desktop\ocp-main\ocpmodels\datasets\ase_datasets.py:108: UserWarning: Supplied sid is not numeric (or missing). Using dataset indices instead.
warnings.warn(
2024-02-26 09:22:59 (INFO): Loaded GemNetOC with 38864438 parameters.
2024-02-26 09:22:59 (WARNING): Model gradient logging to tensorboard not yet supported.
2024-02-26 09:22:59 (WARNING): Using `weight_decay` from `optim` instead of `optim.optimizer_params`.Please update your config to use `optim.optimizer_params.weight_decay`.`optim.weight_decay` will soon be deprecated.
2024-02-26 09:23:00 (INFO): Loading checkpoint from: gnoc_oc22_oc20_all_s2ef.pt
C:\Users\gls5443\Desktop\ocp-main\ocpmodels\datasets\ase_datasets.py:108: UserWarning: Supplied sid is not numeric (or missing). Using dataset indices instead.
warnings.warn(
C:\Users\gls5443\Desktop\ocp-main\ocpmodels\datasets\ase_datasets.py:108: UserWarning: Supplied sid is not numeric (or missing). Using dataset indices instead.
warnings.warn(
Traceback (most recent call last):
File "C:\Users\gls5443\Desktop\ocp-main\main.py", line 92, in <module>
Runner()(config)
File "C:\Users\gls5443\Desktop\ocp-main\main.py", line 36, in __call__
self.task.run()
File "C:\Users\gls5443\Desktop\ocp-main\ocpmodels\tasks\task.py", line 51, in run
self.trainer.train(
File "C:\Users\gls5443\Desktop\ocp-main\ocpmodels\trainers\ocp_trainer.py", line 158, in train
loss = self._compute_loss(out, batch)
File "C:\Users\gls5443\Desktop\ocp-main\ocpmodels\trainers\ocp_trainer.py", line 317, in _compute_loss
target = batch[target_name]
File "c:\Users\gls5443\AppData\Local\miniconda3\envs\ocp_new1\lib\site-packages\torch_geometric\data\batch.py", line 175, in __getitem__
return super().__getitem__(idx)
File "c:\Users\gls5443\AppData\Local\miniconda3\envs\ocp_new1\lib\site-packages\torch_geometric\data\data.py", line 498, in __getitem__
return self._store[key]
File "c:\Users\gls5443\AppData\Local\miniconda3\envs\ocp_new1\lib\site-packages\torch_geometric\data\storage.py", line 111, in __getitem__
return self._mapping[key]
KeyError: 'energy'
Are there additional tags I need to supply to the config for it to parse the databases?
Thanks for flagging this. The new trainer has renamed the targets from `y` and `force` to `energy` and `forces`, respectively. The ASE datasets were not updated to reflect this. Until the datasets are updated, you should be able to get around this by using the following in the dataset config:
key_mapping:
  y: energy
  force: forces
Referencing these lines from the new example config
https://github.com/Open-Catalyst-Project/ocp/blob/394e9bad7780a05d3371f52550c1f92c47a61ce3/configs/ocp_example.yml#L20
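Conceptually, `key_mapping` just renames the keys the dataset emits to the names the trainer looks up. A plain-Python sketch of that renaming (a dict stands in for the data object; this is an illustration, not the actual dataset code):

```python
# A plain dict stands in for the sample the ASE dataset produces.
# The old target names are what the dataset still emits...
sample = {"y": -1.23, "force": [[0.1, 0.2, 0.3]]}

# ...while the new trainer looks the targets up under the renamed keys.
key_mapping = {"y": "energy", "force": "forces"}

# What key_mapping does conceptually: rename mapped keys, pass others through.
remapped = {key_mapping.get(k, k): v for k, v in sample.items()}

print(remapped["energy"])   # -1.23
print("y" in remapped)      # False
```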
Unfortunately it still throws that error. Just for reference, here is the currently used config.yml:
amp: true
checkpoint: ./gnoc_oc22_oc20_all_s2ef.pt
dataset:
  test:
    a2g_args:
      r_energy: false
      r_forces: false
    format: ase_db
    key_mapping:
      force: forces
      y: energy
    src: test.db
  train:
    a2g_args:
      r_energy: true
      r_forces: true
    format: ase_db
    key_mapping:
      force: forces
      y: energy
    src: train.db
  val:
    a2g_args:
      r_energy: true
      r_forces: true
    format: ase_db
    key_mapping:
      force: forces
      y: energy
    src: val.db
eval_metrics:
  metrics:
    energy:
    - mae
    forces:
    - forcesx_mae
    - forcesy_mae
    - forcesz_mae
    - mae
    - cosine_similarity
    - magnitude_error
    misc:
    - energy_forces_within_threshold
  primary_metric: forces_mae
gpus: 1
loss_fns:
- energy:
    coefficient: 1
    fn: mae
- forces:
    coefficient: 1
    fn: l2mae
model:
  activation: silu
  atom_edge_interaction: true
  atom_interaction: true
  cbf:
    name: spherical_harmonics
  cutoff: 12.0
  cutoff_aeaint: 12.0
  cutoff_aint: 12.0
  cutoff_qint: 12.0
  direct_forces: true
  edge_atom_interaction: true
  emb_size_aint_in: 64
  emb_size_aint_out: 64
  emb_size_atom: 256
  emb_size_cbf: 16
  emb_size_edge: 512
  emb_size_quad_in: 32
  emb_size_quad_out: 32
  emb_size_rbf: 16
  emb_size_sbf: 32
  emb_size_trip_in: 64
  emb_size_trip_out: 64
  enforce_max_neighbors_strictly: false
  envelope:
    exponent: 5
    name: polynomial
  extensive: true
  forces_coupled: false
  max_neighbors: 30
  max_neighbors_aeaint: 20
  max_neighbors_aint: 1000
  max_neighbors_qint: 8
  name: gemnet_oc
  num_after_skip: 2
  num_atom: 3
  num_atom_emb_layers: 2
  num_before_skip: 2
  num_blocks: 4
  num_concat: 1
  num_global_out_layers: 2
  num_output_afteratom: 3
  num_radial: 128
  num_spherical: 7
  otf_graph: true
  output_init: HeOrthogonal
  qint_tags:
  - 1
  - 2
  quad_interaction: true
  rbf:
    name: gaussian
  regress_forces: true
  sbf:
    name: legendre_outer
noddp: false
optim:
  batch_size: 10
  clip_grad_norm: 10
  ema_decay: 0.999
  energy_coefficient: 1
  eval_batch_size: 10
  eval_every: 1
  factor: 0.8
  force_coefficient: 1
  load_balancing: atoms
  loss_energy: mae
  lr_initial: 0.0005
  max_epochs: 10
  mode: min
  num_workers: 2
  optimizer: AdamW
  optimizer_params:
    amsgrad: true
  patience: 3
  scheduler: ReduceLROnPlateau
  weight_decay: 0
outputs:
  energy:
    level: system
  forces:
    eval_on_free_atoms: true
    level: atom
    train_on_free_atoms: true
task:
  dataset: ase_db
trainer: forces
The `key_mapping` functionality has not hit main yet. It currently lives in this PR - #622.
@lbluque is there an update on what's blocking this?
In the meantime, you can check out that branch if you would like to use it before we land it on main.
Nothing is holding it up. This should be ready to merge, unless @mshuaibii or @emsunshine have any further suggestions.
Hello,
I now have a two-part problem: one part I was able to fix, though the fix may need to be committed in a future branch; the other I am unable to solve.
First, I think that line 1018 of ocpmodels/common/utils.py may need to change from
loss_fns=config.get("loss_functions", {}),
to
loss_fns=config.get("loss_fns", {}),
as that is the only way for the configs to be read properly without throwing NotImplementedError.
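The effect of that one-line mismatch can be seen with a plain dict: `config.get` with a key that is not present silently falls back to the default, so the trainer ends up with no loss functions at all (a minimal illustration, not the actual utils.py code):

```python
# A trimmed-down stand-in for the parsed YAML config, which uses the
# key "loss_fns" (as in the config posted above).
config = {"loss_fns": [{"energy": {"fn": "mae", "coefficient": 1}}]}

# Looking up the wrong key silently returns the empty default...
assert config.get("loss_functions", {}) == {}

# ...whereas the key actually present in the config returns the losses.
loss_fns = config.get("loss_fns", {})
print(len(loss_fns))  # 1
```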
While the new branch - #622 - which was recommended for using the ASE DBs does enable the first inference step, it quickly runs into the second error:
2024-02-26 14:25:48 (INFO): Loading dataset: ase_db
2024-02-26 14:25:49 (INFO): Batch balancing is disabled for single GPU training.
2024-02-26 14:25:49 (INFO): Batch balancing is disabled for single GPU training.
2024-02-26 14:25:49 (INFO): Batch balancing is disabled for single GPU training.
2024-02-26 14:25:49 (INFO): Loading model: gemnet_oc
2024-02-26 14:25:51 (INFO): Loaded GemNetOC with 38864438 parameters.
2024-02-26 14:25:51 (WARNING): Model gradient logging to tensorboard not yet supported.
2024-02-26 14:25:51 (WARNING): Using `weight_decay` from `optim` instead of `optim.optimizer_params`.Please update your config to use `optim.optimizer_params.weight_decay`.`optim.weight_decay` will soon be deprecated.
2024-02-26 14:25:52 (INFO): Loading checkpoint from: gnoc_oc22_oc20_all_s2ef.pt
2024-02-26 14:25:58 (INFO): Evaluating on val.
device 0: 0%| | 0/3 [00:00<?, ?it/s]
device 0: 33%|███▎ | 1/3 [00:03<00:06, 3.38s/it]
device 0: 67%|██████▋ | 2/3 [00:03<00:01, 1.48s/it]
device 0: 100%|██████████| 3/3 [00:03<00:00, 1.15it/s]
device 0: 100%|██████████| 3/3 [00:04<00:00, 1.37s/it]
2024-02-26 14:26:02 (INFO): energy_mae: 2.9834, forcesx_mae: 0.0080, forcesy_mae: 0.0130, forcesz_mae: 0.0073, forces_mae: 0.0094, forces_cosine_similarity: 0.1755, forces_magnitude_error: 0.0144, energy_forces_within_threshold: 0.0000, loss: 3.0039, epoch: 0.0417
2024-02-26 14:26:02 (INFO): Predicting on test.
device 0: 0%| | 0/3 [00:00<?, ?it/s]
device 0: 33%|███▎ | 1/3 [00:03<00:06, 3.41s/it]
device 0: 67%|██████▋ | 2/3 [00:03<00:01, 1.46s/it]
device 0: 100%|██████████| 3/3 [00:03<00:00, 1.18it/s]
device 0: 100%|██████████| 3/3 [00:04<00:00, 1.35s/it]
Traceback (most recent call last):
File "C:\Users\gls5443\Desktop\ocp-ase_data_updates\main.py", line 92, in <module>
Runner()(config)
File "C:\Users\gls5443\Desktop\ocp-ase_data_updates\main.py", line 36, in __call__
self.task.run()
File "C:\Users\gls5443\Desktop\ocp-ase_data_updates\ocpmodels\tasks\task.py", line 51, in run
self.trainer.train(
File "C:\Users\gls5443\Desktop\ocp-ase_data_updates\ocpmodels\trainers\ocp_trainer.py", line 215, in train
self.update_best(
File "C:\Users\gls5443\Desktop\ocp-ase_data_updates\ocpmodels\trainers\base_trainer.py", line 706, in update_best
self.predict(
File "c:\Users\gls5443\AppData\Local\miniconda3\envs\ocp_new1\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "C:\Users\gls5443\Desktop\ocp-ase_data_updates\ocpmodels\trainers\ocp_trainer.py", line 528, in predict
predictions[key] = np.array(predictions[key])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (29,) + inhomogeneous part.
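For what it's worth, that ValueError is NumPy's usual complaint (on recent NumPy versions) when `np.array` is handed per-batch arrays whose first dimensions differ, e.g. per-atom forces for systems with different atom counts. A minimal reproduction, with `np.concatenate` as the usual way to flatten such variable-length outputs:

```python
import numpy as np

# Per-atom predictions for three systems with 5, 7, and 4 atoms:
# the first dimensions differ, so the list cannot be stacked rectangularly.
predictions = [np.zeros((5, 3)), np.zeros((7, 3)), np.zeros((4, 3))]

try:
    np.array(predictions)            # raises ValueError on NumPy >= 1.24
except ValueError as err:
    print(err)

# Concatenating along the atom axis handles the ragged shapes instead.
flat = np.concatenate(predictions, axis=0)
print(flat.shape)  # (16, 3)
```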
This issue has been marked as stale because it has been open for 30 days with no activity.
This should be fixed in #622. Closing.