Comments (11)

gc00 avatar gc00 commented on July 29, 2024

Hi Yulin,
At this time, we're still working on making MANA really robust. Our primary work environment for now is Cori at NERSC. For CentOS 7, we've partially ported it so that it can launch and checkpoint, but it cannot yet restart. As the project becomes more mature on Cori, we will roll it out to other platforms.
The safest thing is to check back later.

You wrote that you're using "intel open mpi". Is this Intel MPI or Open MPI? For certain technical reasons, we require the ability to create a statically linked MPI executable. We will fix this later. From what I understand, Open MPI does not support statically linked executables (i.e., it does not provide libmpi.a).
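
One quick way to see whether a given MPI installation ships a static library is to search its install prefix for libmpi.a. The following is a minimal Python sketch under stated assumptions: the helper name `find_static_mpi` and the example prefix are hypothetical and not part of MANA, and finding libmpi.a does not by itself guarantee that a fully static executable can be linked.

```python
import os

def find_static_mpi(prefix, names=("libmpi.a",)):
    """Return sorted paths of static MPI archives found under `prefix`.

    Hypothetical helper for illustration; `names` lists the archive
    file names to look for (libmpi.a is the one named in this thread).
    """
    hits = []
    for root, _dirs, files in os.walk(prefix):
        for name in names:
            if name in files:
                hits.append(os.path.join(root, name))
    return sorted(hits)

# Example call (the prefix is an assumption; use your MPI install path):
# print(find_static_mpi("/opt/mpich"))
```

An empty result suggests that the installation was built without static libraries and may need to be rebuilt or replaced.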

from mana.

yulingao avatar yulingao commented on July 29, 2024

@gc00 Hi gc00,
The HPC cluster at my school runs CentOS 7, and the MPI implementation is Intel MPI. For now, I am trying to checkpoint MPI jobs on the cluster. Using DMTCP, I was able to checkpoint and restart single-node MPI jobs. However, for multi-node MPI jobs, as you said, DMTCP can only save checkpoints but cannot restart them.
When will you be able to support CentOS 7? Thank you.

yulingao avatar yulingao commented on July 29, 2024

I haven't tried MANA yet because I had some problems installing it on CentOS 7.

/usr/bin/ld: cannot find -llzma
/usr/bin/ld: cannot find -lxml2

I don't have root permission on our HPC cluster, so I can't install the missing libraries. I'm still trying to get MANA working there.
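
For reference, `-lNAME` tells the linker to look for `libNAME.so` or `libNAME.a` in its search path, so each error above names one missing library file. A minimal Python sketch that turns such linker output into the file names to look for (the helper names `missing_libs` and `candidate_files` are hypothetical, chosen for illustration):

```python
import re

# Matches the library name in "cannot find -l<name>" linker errors.
_LD_MISSING = re.compile(r"cannot find -l(\S+)")

def missing_libs(ld_output):
    """Return the library names that ld reported as not found."""
    return [m.group(1) for m in _LD_MISSING.finditer(ld_output)]

def candidate_files(lib):
    """Return the shared and static file names the linker searches for."""
    return [f"lib{lib}.so", f"lib{lib}.a"]

errors = """/usr/bin/ld: cannot find -llzma
/usr/bin/ld: cannot find -lxml2"""

for lib in missing_libs(errors):
    print(lib, candidate_files(lib))
```

Once these files are available in a user-writable directory, that directory can be added to the linker search path (for example with `-L`) without root access.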

gc00 avatar gc00 commented on July 29, 2024

Hi Yulin,
MANA for CentOS 7 is still a work in progress. But it should be possible to download the rpm packages and extract what you need from them for the missing libraries.
I estimate another two weeks or so before I have it fully ported.
Best,
Gene

yulingao avatar yulingao commented on July 29, 2024

Hi Gene.
I will try to download the rpm packages for the missing libraries. Thank you very much for your help.
I am looking forward to the full port to CentOS 7.

Yulin

l00493405 avatar l00493405 commented on July 29, 2024

> Hi Gene.
> I will try to download the rpm packages for the missing libraries. Thank you very much for your help.
> I am looking forward to the full port to CentOS 7.
>
> Yulin

Hi,
I had the same problem. Did you solve it?

l00493405 avatar l00493405 commented on July 29, 2024

> Hi Yulin,
> MANA for CentOS 7 is still a work in progress. But it should be possible to download the rpm packages and extract what you need from them for the missing libraries.
> I estimate another two weeks or so before I have it fully ported.
> Best,
> Gene

Hi Gene,
I have tried to compile MANA with Open MPI, Intel MPI, and MPICH, but have run into many problems. Which MPI should I use to compile MANA, or are the problems caused by my CentOS 7 environment?

yulingao avatar yulingao commented on July 29, 2024

> > Hi Gene.
> > I will try to download the rpm packages for the missing libraries. Thank you very much for your help.
> > I am looking forward to the full port to CentOS 7.
> > Yulin
>
> Hi,
> I had the same problem. Did you solve it?

I solved this by following the instructions in mana/contrib/mpi-proxy-split/lower-half/README.mpich-static. After that, I ran into other problems installing MANA on CentOS 7, and I haven't fixed them yet.

yulingao avatar yulingao commented on July 29, 2024

@gc00 Hi Gene,
Have you successfully ported MANA to CentOS 7?
Also, DMTCP still cannot restart multi-node MPI jobs. I hope that DMTCP can support multi-node MPI checkpoint and restart.

Best,
Yulin

shuqianwang avatar shuqianwang commented on July 29, 2024

> @gc00 Hi gc00, The HPC cluster at my school runs CentOS 7, and the MPI implementation is Intel MPI. For now, I am trying to checkpoint MPI jobs on the cluster. Using DMTCP, I was able to checkpoint and restart single-node MPI jobs. However, for multi-node MPI jobs, as you said, DMTCP can only save checkpoints but cannot restart them. When will you be able to support CentOS 7? Thank you.

Hi, how is your progress? I compiled MANA successfully with MPICH on CentOS 7 and saved checkpoints of hellpmpi.c on a local node, but the restart fails.

kellekai avatar kellekai commented on July 29, 2024

@gc00 What is the status of Open MPI support in MANA? Thanks!
