Comments (6)
You are right! That's what I used for the other conda environment: I used `conda create --use-local` and `pip install --no-deps` together. Somehow `pip install --no-deps` in a virtualenv didn't work for me. Maybe virtualenv has some integration problem with Perlmutter's modules on my side.
I will switch to `--no-deps` and the `develop` branch then. The conditional weight for forces may not be in `develop`, but I think the difference is small.
from matbench-discovery.
For long-term reproducibility, it would be preferable to install a specific release of the original MACE repo in case your fork is deleted or modified. Perhaps @ilyes319 can comment on when the `multi-gpu` branch is expected to get merged. A new release after that would be much appreciated!
Regarding module loading `pytorch`, I'm not sure I understand the problem. I didn't encounter any issues using the `slurm_submit` function in MBD and passing `pre_cmd="module load pytorch/2.0.1;"`.
matbench-discovery/models/alignn/train_alignn.py
Lines 49 to 57 in 71d77f4
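The referenced lines are not reproduced here. As a rough illustration of the `pre_cmd` idea only, the sketch below shows how a shell prefix such as `module load pytorch/2.0.1;` can be prepended to the Python call that Slurm runs. `build_sbatch_command` is a hypothetical helper written for this note and is not the `slurm_submit` implementation in matbench-discovery; the job name and script name are placeholders.

```python
import shlex


def build_sbatch_command(py_file: str, job_name: str, pre_cmd: str = "") -> list[str]:
    """Assemble an sbatch call whose wrapped shell line runs pre_cmd first.

    Hypothetical helper for illustration; not part of matbench-discovery.
    """
    # pre_cmd is expected to end with ';' so the shell runs it before python.
    wrapped = f"{pre_cmd} python {shlex.quote(py_file)}".strip()
    # --wrap makes sbatch run the given string as a shell script.
    return ["sbatch", "--job-name", job_name, "--wrap", wrapped]


cmd = build_sbatch_command(
    "train_alignn.py",
    "train-alignn",
    pre_cmd="module load pytorch/2.0.1;",
)
# Print the command instead of submitting it, so this runs without Slurm.
print(shlex.join(cmd))
```

Because the module is loaded inside the job's own shell line, the compute node sees the same pytorch as an interactive session that ran `module load` by hand.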
There are a couple of differences between my `mbd` branch and `mace/multi-gpu`. I cloned `mace/multi-gpu` as the `mbd` branch and merged `mace/gaussian` into `mace/multi-gpu`.
There is a subtlety here about where mace and pytorch will be (or have been) installed. To reiterate, `mace` depends on `pytorch`, and whenever we try to install it in a virtualenv, pip installs a new `pytorch` in the virtualenv and ignores the existing installation from Perlmutter's module.
If we create a new `conda` env instead, we need to use `conda create --use-local` to correctly pick up Perlmutter's pytorch, but `mace` needs to be installed in Perlmutter's conda env too. Installing mace in a virtualenv will succeed but will raise an error when we try to use it.
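One way to see which installation is actually picked up is to ask Python where it would import each package from. The following is a minimal sketch using only the standard library; the helper names are made up for this note, and `torch` and `mace` are assumed to be the import names of the two packages.

```python
import importlib.util
import sys
from typing import Optional


def in_virtualenv() -> bool:
    """True inside a venv/virtualenv (a conda env is not detected this way)."""
    return sys.prefix != sys.base_prefix


def package_location(name: str) -> Optional[str]:
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return spec.origin


print("interpreter:", sys.executable)
print("inside a virtualenv:", in_virtualenv())
for pkg in ("torch", "mace"):
    # A path under the virtualenv means a private copy is being used;
    # a path under the module's prefix means the system install is used.
    print(pkg, "->", package_location(pkg) or "not installed")
```

If `torch` and `mace` resolve to different environment prefixes, that mismatch is a likely source of the import-time error described above.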
I expect the `gaussian` and `multi-gpu` branches will both make their way into the next MACE release. The `gaussian` branch was already merged into `develop`.
I think the pytorch subtlety you mention is easily avoided by running `pip install git+https://github.com/ACEsuit/mace@develop --no-deps`, which means `pip` will skip installing any dependencies and only install MACE itself. That's what I did for testing your MACE checkpoint.
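For scripted setups, the same install can be issued from Python. This is only a sketch: `pip_install_no_deps_cmd` is a hypothetical helper, and it builds the command without running it. Calling pip as `sys.executable -m pip` ties the install to whichever interpreter is currently active, for example the one provided by a loaded pytorch module.

```python
import sys

# Requirement string taken from the command quoted above.
MACE_SPEC = "git+https://github.com/ACEsuit/mace@develop"


def pip_install_no_deps_cmd(requirement: str) -> list[str]:
    """Build a pip call that installs only the named package, no dependencies."""
    # "-m pip" runs the pip that belongs to this exact interpreter.
    return [sys.executable, "-m", "pip", "install", requirement, "--no-deps"]


cmd = pip_install_no_deps_cmd(MACE_SPEC)
print(" ".join(cmd))
# To perform the install, pass the list to subprocess.run(cmd, check=True).
```

With `--no-deps`, an already installed pytorch is left untouched, but any other MACE dependency that is missing has to be installed separately.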
I think we're good here? Feel free to re-open if not.
Yes, all good. Just to keep a note here: pytorch-dependent packages all need to be installed in the same environment. On Perlmutter, that means MACE needs to be installed via the pip provided by Perlmutter's pytorch module in order to use that pytorch directly without reinstalling it. The other way is to install a separate pytorch and mace together in a new virtual environment, in which case we don't need to `module load pytorch/2.0.1` on Perlmutter anymore.