Comments (26)

alazzaro commented on June 20, 2024

@barracuda156 I'm sorry, it cannot be today, likely next week. MPICH 4.1 is still not supported in our tests, so it is not a surprise you see this error.

from dbcsr.

alazzaro commented on June 20, 2024

@mkrack thanks for the confirmation, I did a similar test on macOS (GCC 13.1, MPICH 4.1.2) and it works.

@barracuda156 just for confirmation, I don't see -DUSE_MPI_F08=ON in https://github.com/macports/macports-ports/blob/master/math/dbcsr/Portfile....
There is another interesting consideration. I used brew to install MPICH. When I use -DUSE_MPI_F08=ON, cmake reports the following:

-- Found MPI: TRUE (found version "4.0") found components: C CXX Fortran 
CMake Warning at CMakeLists.txt:203 (message):
  The listed MPI implementation does not provide the required mpi_f08.mod
  interface.  The Fortran 90 bindings will be used instead.

So basically the brew version of MPICH is missing the F08 interface and we are back to the original problem of the F77 MPI interface (we will fix it). Can you check that your MPI has the F08 module? My understanding is that only recent GCC versions can compile MPICH with F08 support (see for example here).
In my case I have to recompile MPICH via:

brew reinstall --build-from-source mpich

to make sure I have the F08 module.
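
A quick way to look for the module file by hand is sketched below. It is only an illustration: the function name `has_mpi_f08_module` is made up here, and the directories to search depend on how MPICH was installed.

```shell
# Sketch: report whether any of the given directories contains mpi_f08.mod.
# The function name and the idea of passing candidate directories by hand
# are assumptions for illustration, not part of DBCSR, MPICH, or CMake.
has_mpi_f08_module() {
    for dir in "$@"; do
        # Search the directory tree and keep only the first match, if any.
        if [ -d "$dir" ] && [ -n "$(find "$dir" -name 'mpi_f08.mod' 2>/dev/null | head -n 1)" ]; then
            echo "found mpi_f08.mod under $dir"
            return 0
        fi
    done
    echo "mpi_f08.mod not found"
    return 1
}
```

With Homebrew, a call could look like `has_mpi_f08_module "$(brew --prefix mpich)/include"`; that the module lives under that prefix is an assumption, so adjust the path to your installation.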

alazzaro commented on June 20, 2024

OK, then we have a solution.

My summary is the following:

  • The old code was not fully standard-compliant with the F77 MPI interface for mpi_alloc_mem/mpi_free_mem. It was a hybrid of F77 and F08. The reason was to avoid the (non-standard) Cray pointers suggested in the MPI standard example. Apparently, the "hybrid" solution was supported by all the MPI implementations we are using. Now the new MPICH 4.1 enforces the correct standard interface, so the hybrid cannot work anymore.
  • The new F08 interface requires new compilers. Because of that, we can ask users to enable USE_MPI_F08 for MPICH 4.1 in combination with a new compiler that supports it. I tried it in our CI (GCC 9.4) and I see crashes in several places. For this reason, I would leave USE_MPI_F08 off by default and enable it only in the tests with the new GCC.

@barracuda156 is this reasonable for you? In your case it requires MPICH 4.1 with F08 support, the flag -DUSE_MPI_F08=ON, and a recent GCC compiler.
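
The rule of thumb from this summary can be written down as a small helper. This is a sketch, not part of DBCSR: the function name `mpi_f08_flag_for_mpich` is hypothetical, and treating every release from 4.1 onward like the 4.1.x versions discussed here is an assumption.

```shell
# Sketch: choose the USE_MPI_F08 CMake setting from an MPICH version string.
# Assumption: all MPICH releases >= 4.1 behave like the 4.1.x versions in this thread.
# Only plain numeric versions such as "4.1" or "4.1.2" are handled.
mpi_f08_flag_for_mpich() {
    version=$1
    major=${version%%.*}
    case $version in
        *.*) rest=${version#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;   # no minor component given, treat as .0
    esac
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 1 ]; }; then
        echo "-DUSE_MPI_F08=ON"
    else
        # Older MPICH: keep the default, which is off.
        echo "-DUSE_MPI_F08=OFF"
    fi
}
```

The output can be passed straight to the configure step, for example `cmake "$(mpi_f08_flag_for_mpich 4.1.2)" ..`. The ON case still needs an MPICH build that ships the F08 module and a recent GCC; the helper does not check either.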

I would like to thank @fstein93 who did the F08 porting of the MPI code and @mkrack for testing it.

alazzaro commented on June 20, 2024

I'm going to close this issue; please open a new one for further discussion.

alazzaro commented on June 20, 2024

The problem is in the new MPICH, so it has nothing to do with compilers.
That said, only new compilers support the F08 API.
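
Whether a particular compiler and MPI build accept the F08 API can be probed by compiling a tiny program that does `USE mpi_f08`. The helper below is a sketch of that idea; the name `probe_mpi_f08` and the temporary-file handling are illustrative and not taken from DBCSR.

```shell
# Sketch: return 0 if the given MPI Fortran wrapper can compile `use mpi_f08`.
# The function name and file layout are assumptions for illustration.
probe_mpi_f08() {
    wrapper=$1
    workdir=$(mktemp -d) || return 2
    # Minimal free-form Fortran source that only needs the mpi_f08 module.
    cat > "$workdir/probe.f90" <<'EOF'
program probe
   use mpi_f08
   implicit none
   integer :: ierr
   call MPI_Init(ierr)
   call MPI_Finalize(ierr)
end program probe
EOF
    # Compile only (-c): nothing is linked or run, so no MPI launcher is needed.
    if (cd "$workdir" && "$wrapper" -c probe.f90 -o probe.o) >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$workdir"
    return "$status"
}
```

A possible call is `probe_mpi_f08 mpif90-mpich-gcc12 && echo "F08 bindings available"`, using the wrapper name from the build log in this thread; substitute whatever wrapper your installation provides. A successful compile only shows that the module is usable, not that DBCSR's tests will pass.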

alazzaro commented on June 20, 2024

The latter... this is because MPICH 4.1 is strictly compliant with the standard, so it enforces the full F77 interface unless you ask for the F08 one. You get the first error in DBCSR simply because it is the first to compile...

hfp commented on June 20, 2024

@alazzaro: for instance dbcsr_mpiwrap.F:5299, could it be we need to use C_LOC(mp_baseptr)? Currently it seems we rely on passing arguments by pointer, i.e., mp_baseptr is passed by pointer (and hence it should work also).

@barracuda156: I wonder if the basepointer issue is "just a warning" or if the related code actually crashes...

barracuda156 commented on June 20, 2024

> @barracuda156: I wonder if the basepointer issue is "just a warning" or if the related code actually crashes...

@hfp There is a topic with test results: #645
If there is something more specific to check, I could do that.

barracuda156 commented on June 20, 2024

@alazzaro Could you please take a look?

fstein93 commented on June 20, 2024

I am currently working on the MPI backend and trying to upgrade DBCSR to the new mpi_f08 module. The warnings might be related to missing explicit interfaces on the library side, which are to be expected here. I wonder about the errors in the case of MPICH 4.1. For the mpi module (not mpi_f08), the MPI standard (including MPI 3.0 and MPI 3.1) requires MPI_Alloc_mem to be overloaded with a TYPE(C_PTR) version if the compiler supports TYPE(C_PTR). DBCSR's wrapper is in accordance with the example provided in the standard itself and thus should work, in my opinion. I am still searching for what it might be related to.

hfp commented on June 20, 2024

Maybe unrelated, sometimes it matters if TYPE(C_PTR) is passed as TYPE(C_PTR), VALUE because the opposite side expects the pointer address and not a pointer to the pointer...

fstein93 commented on June 20, 2024

I have observed it already in a different context but in the given case, even the MPI standard does not use the VALUE attribute indicating that actually a pointer to the pointer is expected.
I have just checked the build-files of MPICH 4.0.3. In case of the mpi_f08 module, TYPE(C_PTR) is used whereas for the mpi module, MPICH uses a placeholder which seems to be any possible type/kind/rank-combination (not standard-compliant).
My impression is that we should switch to the mpi_f08 module. I have code ready based on my currently opened PR. From CP2K, it could be that MPICH is not compatible with certain versions of gcc, but it worked with IntelMPI and OpenMPI (see cp2k/cp2k#2486).

alazzaro commented on June 20, 2024

@barracuda156 I've just merged #678, so the problem in this issue should now go away if you build DBCSR with -DUSE_MPI_F08=ON. Can you confirm that?

barracuda156 commented on June 20, 2024

> @barracuda156 I've just merged #678, so the problem in this issue should now go away if you build DBCSR with -DUSE_MPI_F08=ON. Can you confirm that?

Thank you very much! I will test that tonight.

barracuda156 commented on June 20, 2024

@alazzaro Unfortunately, still fails:

FAILED: src/CMakeFiles/dbcsr.dir/mpi/dbcsr_mpiwrap.F.o src/dbcsr_mpiwrap.mod 
/opt/local/bin/mpif90-mpich-gcc12 -I/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/build/src/mpi -I/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/base -I/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src -I/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/build/src -ffree-form -std=f2008ts -fimplicit-none -Werror=aliasing -Werror=ampersand -Werror=c-binding-type -Werror=intrinsic-shadow -Werror=intrinsics-std -Werror=line-truncation -Werror=tabs -Werror=target-lifetime -Werror=underflow -Werror=unused-but-set-parameter -Werror=unused-but-set-variable -Werror=unused-variable -Werror=unused-dummy-argument -Werror=conversion -Werror=zerotrip -Werror=uninitialized -Wno-maybe-uninitialized -Werror=unused-parameter -fallow-argument-mismatch -mmacosx-version-min=10.6 -Jsrc -fPIC -fopenmp -Wno-error -fpreprocessed -c src/CMakeFiles/dbcsr.dir/mpi/dbcsr_mpiwrap.F-pp.f -o src/CMakeFiles/dbcsr.dir/mpi/dbcsr_mpiwrap.F.o
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/mpi/dbcsr_mpiwrap.F:5543:65:

 5543 |          CALL MPI_ALLOC_MEM(mp_size, mp_info, mp_baseptr, mp_res)
      |                                                                 1
Error: Type mismatch in argument 'baseptr' at (1); passed TYPE(c_ptr) to INTEGER(4)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/mpi/dbcsr_mpiwrap.F:5543:65:

 5543 |          CALL MPI_ALLOC_MEM(mp_size, mp_info, mp_baseptr, mp_res)
      |                                                                 1
Error: Type mismatch in argument 'baseptr' at (1); passed TYPE(c_ptr) to INTEGER(4)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/mpi/dbcsr_mpiwrap.F:5543:65:

 5543 |          CALL MPI_ALLOC_MEM(mp_size, mp_info, mp_baseptr, mp_res)
      |                                                                 1
Error: Type mismatch in argument 'baseptr' at (1); passed TYPE(c_ptr) to INTEGER(4)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/mpi/dbcsr_mpiwrap.F:5543:65:

 5543 |          CALL MPI_ALLOC_MEM(mp_size, mp_info, mp_baseptr, mp_res)
      |                                                                 1
Error: Type mismatch in argument 'baseptr' at (1); passed TYPE(c_ptr) to INTEGER(4)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/mpi/dbcsr_mpiwrap.F:5543:65:

 5543 |          CALL MPI_ALLOC_MEM(mp_size, mp_info, mp_baseptr, mp_res)
      |                                                                 1
Error: Type mismatch in argument 'baseptr' at (1); passed TYPE(c_ptr) to INTEGER(4)
/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src/mpi/dbcsr_mpiwrap.F:5543:65:

 5543 |          CALL MPI_ALLOC_MEM(mp_size, mp_info, mp_baseptr, mp_res)
      |                                                                 1
Error: Type mismatch in argument 'baseptr' at (1); passed TYPE(c_ptr) to INTEGER(4)
[239/346] /opt/local/bin/mpicxx-mpich-gcc12  -I/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/build/src -I/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/src -pipe -Os -DNDEBUG -I/opt/local/include -D_GLIBCXX_USE_CXX11_ABI=0 -std=gnu++11 -arch ppc -mmacosx-version-min=10.6 -MD -MT tests/CMakeFiles/dbcsr_test.dir/dbcsr_test.cpp.o -MF tests/CMakeFiles/dbcsr_test.dir/dbcsr_test.cpp.o.d -o tests/CMakeFiles/dbcsr_test.dir/dbcsr_test.cpp.o -c /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_math_dbcsr/dbcsr/work/dbcsr-397bf0f80c293a0c6088a1314931a748cff4b5b6/tests/dbcsr_test.cpp

This is with the latest DBCSR commit, gcc 12.3.0, and mpich-gcc12 @4.1.1_0+fortran.

To the config in the portfile https://github.com/macports/macports-ports/blob/master/math/dbcsr/Portfile
I have added -DUSE_MPI_F08=ON.

mkrack commented on June 20, 2024

@alazzaro Building DBCSR with MPICH 4.1.2 and GCC 13.1.0 using the cmake flag -DUSE_MPI_F08=ON works fine for me. With -DUSE_MPI_F08=OFF, compilation errors as shown above and as reported in the CP2K issue #2808 occur: Error: Type mismatch in argument 'baseptr'.
However, make test reports that one test (test 18 out of 19) is failing, but that is also the case with MPICH 4.0.3.

barracuda156 commented on June 20, 2024

@alazzaro This is an awesome point. Indeed, it is disabled: https://github.com/macports/macports-ports/blob/100bdfdca9908a07bb07a92663434f401c1f71f9/science/mpich/Portfile#L179-L180
I recall taking part in the related discussion – we disabled it for a reason, it failed to build.

Need to review why it failed, but we do have new GCC (all tested systems use 12.3.0 now, including my PowerPC ones).

barracuda156 commented on June 20, 2024

@alazzaro I have now built MPICH 4.1.2 with F08 enabled, and DBCSR built fine. Your solution works.

barracuda156 commented on June 20, 2024

@alazzaro I do not control the MPICH port in MacPorts, while I am a maintainer of the DBCSR port, so I can only say I hope that will work. I have asked the MPICH maintainers to enable F08 in the next update to the port (that will have to be tested on other systems; I only verified that it builds for me locally). If that is done, I will add the fix to DBCSR, so that it can be built normally again.

The requirement for a new GCC will temporarily leave PowerPC builds broken on < 10.6, but I plan to update those to GCC 12 anyway, hopefully soon. (Technically everything is ready for that, but toolchain changes aren't the easiest to push through.)

alazzaro commented on June 20, 2024

Well, this is only required for the new MPICH 4.1. In all other cases, the default will work...

barracuda156 commented on June 20, 2024

> Well, this is only required for the new MPICH 4.1. In all other cases, the default will work...

Well, MPICH 4.x is the current reality. (Introducing back a legacy MPICH just to build one port is too much. I hope we can sort out enabling F08 instead.)

haampie commented on June 20, 2024

@alazzaro what's the current status of this?

I'm running into this issue with spack install [email protected] ^[email protected] and would like to add the relevant conflicts / defines.

So:

  1. dbcsr < 2.6.0 doesn't have the USE_MPI_F08 define, meaning it's incompatible with all mpich 4.1 and higher? Or is it also conditional on the underlying gcc?
  2. for dbcsr 2.6.0 and above, it's sufficient to set -DUSE_MPI_F08=ON when using [email protected]: for all compilers? Or only recent gcc?

haampie commented on June 20, 2024

Could you take a look at spack/spack#40494?

alazzaro commented on June 20, 2024

> Could you take a look at spack/spack#40494?

Seems reasonable. Do you have any other questions?

haampie commented on June 20, 2024

> Do you have any other question?

I didn't understand whether the CP2K build issue with mpich 4.1 arises because CP2K builds a vendored DBCSR and hits the issue in this thread, or because it also needs fixes inside CP2K itself.

barracuda156 commented on June 20, 2024

For the record, MacPorts is still stuck with a non-F08 MPICH, waiting for it to be updated. Locally I have MPICH 4.1.2 with F08 and gcc13, and it works fine on the old 10.6 PowerPC :)
