Comments (6)

chiang-yuan commented on June 1, 2024

You are right! That's what I used for the other conda environment: conda create --use-local together with pip install --no-deps. For some reason, pip install --no-deps didn't work for me in a virtualenv. Maybe virtualenv has an integration problem with Perlmutter's modules on my side.

I will switch to --no-deps and the develop branch then. The conditional weight for forces may not be in develop yet, but I think the difference is small.

from matbench-discovery.

janosh commented on June 1, 2024

For long-term reproducibility, it would be preferable to install a specific release of the original MACE repo in case your fork is deleted or modified. Perhaps @ilyes319 can comment when the multi-gpu branch is expected to get merged. A new release after that would be much appreciated!

Regarding loading the pytorch module, I'm not sure I understand the problem. I didn't encounter any issues using the slurm_submit function in MBD and passing pre_cmd="module load pytorch/2.0.1;":

slurm_vars = slurm_submit(
    job_name=job_name,
    # partition="perlmutter",
    account="matgen",
    time="4:0:0",
    out_dir=out_dir,
    slurm_flags="--qos regular --constraint gpu --gpus 1",
    pre_cmd="module load pytorch/2.0.1;",
)

chiang-yuan commented on June 1, 2024

There are a couple of differences between my mbd branch and mace/multi-gpu. I cloned mace/multi-gpu as mbd branch and merged mace/gaussian into mace/multi-gpu.

There is a subtlety here regarding where mace and pytorch are installed. To reiterate, mace depends on pytorch, so whenever we install it in a virtualenv, pip installs a new pytorch into that virtualenv and ignores the existing installation from Perlmutter's module.

If we create a new conda env instead, we need to use conda create --use-local to correctly pick up Perlmutter's pytorch, but mace needs to be installed in Perlmutter's conda env too. Installing mace in a virtualenv succeeds but raises an error when we try to use it.
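
One way to see which copy of pytorch an interpreter would actually pick up is to ask Python where it would import each package from. The snippet below is only a rough sketch of that check, not something from this thread: the helper names package_location and lives_under are made up for illustration, and it assumes the packages are importable as torch and mace.

```python
from __future__ import annotations

import importlib.util
import sys
from pathlib import Path


def package_location(name: str) -> Path | None:
    """Return the directory holding a top-level package (or module file).

    Returns None if the package cannot be found or has no file on disk.
    find_spec locates a top-level package without importing it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin in (None, "built-in", "frozen"):
        return None
    return Path(spec.origin).resolve().parent


def lives_under(path: Path, root: Path) -> bool:
    """True if path is root itself or located somewhere below it."""
    return path == root or root in path.parents


if __name__ == "__main__":
    # sys.prefix is the root of the currently active environment.
    env_root = Path(sys.prefix).resolve()
    for pkg in ("torch", "mace"):
        loc = package_location(pkg)
        if loc is None:
            print(f"{pkg}: not importable from this interpreter")
        else:
            where = "inside" if lives_under(loc, env_root) else "outside"
            print(f"{pkg}: {loc} ({where} {env_root})")
```

If torch is reported inside the virtualenv even though the module's pytorch was meant to be used, pip has most likely installed a second copy as a dependency.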

janosh commented on June 1, 2024

I expect the gaussian and multi-gpu branches will both make their way into the next MACE release. The gaussian branch was already merged into develop.

I think the pytorch subtlety you mention is easily avoided by running pip install git+https://github.com/ACEsuit/mace@develop --no-deps, which makes pip skip all dependencies and install only MACE itself. That's what I did for testing your MACE checkpoint.
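
For anyone scripting this, the same install can be driven from Python. This is a minimal sketch rather than part of MBD: pip_install_cmd is a hypothetical helper, and calling pip through sys.executable is an assumption made so that the pip belonging to the currently active interpreter is the one used.

```python
from __future__ import annotations

import subprocess
import sys

# Requirement string taken from the command above.
MACE_DEVELOP = "git+https://github.com/ACEsuit/mace@develop"


def pip_install_cmd(requirement: str, no_deps: bool = True) -> list[str]:
    """Build a pip install command for the running interpreter.

    With no_deps=True, pip installs only the named package and leaves
    every dependency (including pytorch) untouched.
    """
    cmd = [sys.executable, "-m", "pip", "install", requirement]
    if no_deps:
        cmd.append("--no-deps")
    return cmd


if __name__ == "__main__":
    cmd = pip_install_cmd(MACE_DEVELOP)
    print("would run:", " ".join(cmd))
    # Uncomment to actually install into the current environment:
    # subprocess.run(cmd, check=True)
```

Note that --no-deps skips every dependency, not just pytorch, so any other packages MACE needs must already be present in the environment.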

janosh commented on June 1, 2024

I think we're good here? Feel free to re-open if not.

chiang-yuan commented on June 1, 2024

Yes, all good. Just to keep a note here: all pytorch-dependent packages need to be installed in the same environment. On Perlmutter, that means MACE needs to be installed with the pip provided by Perlmutter's pytorch module in order to use that pytorch directly without reinstalling it. The other way is to install a separate pytorch and mace together in a new virtual environment; in that case we no longer need to module load pytorch/2.0.1 on Perlmutter.
