Code Monkey home page Code Monkey logo

Comments (6)

atorrez avatar atorrez commented on July 18, 2024

I was able to duplicate the problem on smog by increasing the number of hostdirs to 379 (same as cielo). By turning on the VERBOSE DEBUG compiler directive, I was able to see that nodes other than the rank 0 node were not seeing the full bitmap from MPI_Bcast.

What I discovered is that the number of bytes being transimitted was not 128 (full bitmap), but 32. This statement in code was the culprit:

int bitmap_bcast_sz = (BIT_ARRAY_LENGTH/8)/sizeof(MPI_UNSIGNED_CHAR);

this is the size used for broadcast

MPIBCAST(bitmap,bitmap_bcast_sz,MPI_UNSIGNED_CHAR,0,comm);

The problem is the sizeof MPI_UNSIGNED_CHAR is 4, so the resulting size was computing as follows: (1024/8)/4 = 32. I was only seeing 32 bytes being broadcast. So any hostdir > 256 (in the map) was not being broadcast and because of no bounds checking, the code kept running until if found a bit set, which was of course an illegal hostdir. Example MPI_Bcast code I have seen does not take into account the size of an MPI type so I am not sure why this was there in the first place. MPI must account for this internally.

Also, I am not sure why this worked previously on cielo. I browsed earlier versions of cray mpich with regard to the MPI_UNSIGNED_CHAR definition and that does not appear to have changed.

from plfs-core.

johnbent avatar johnbent commented on July 18, 2024

Wow. Nice debugging.

On Feb 25, 2014, at 7:45 AM, Alfred Torrez [email protected] wrote:

I was able to duplicate the problem on smog by increasing the number of hostdirs to 379 (same as cielo). By turning on the VERBOSE DEBUG compiler directive, I was able to see that nodes other than the rank 0 node were not seeing the full bitmap from MPI_Bcast.

What I discovered is that the number of bytes being transimitted was not 128 (full bitmap), but 32. This statement in code was the culprit:

int bitmap_bcast_sz = (BIT_ARRAY_LENGTH/8)/sizeof(MPI_UNSIGNED_CHAR);

this is the size used for broadcast

MPIBCAST(bitmap,bitmap_bcast_sz,MPI_UNSIGNED_CHAR,0,comm);

The problem is the sizeof MPI_UNSIGNED_CHAR is 4, so the resulting size was computing as follows: (1024/8)/4 = 32. I was only seeing 32 bytes being broadcast. So any hostdir > 256 (in the map) was not being broadcast and because of no bounds checking, the code kept running until if found a bit set, which was of course an illegal hostdir. Example MPI_Bcast code I have seen does not take into account the size of an MPI type so I am not sure why this was there in the first place. MPI must account for this internally.

Also, I am not sure why this worked previously on cielo. I browsed earlier versions of cray mpich with regard to the MPI_UNSIGNED_CHAR definition and that does not appear to have changed.


Reply to this email directly or view it on GitHub.

from plfs-core.

thewacokid avatar thewacokid commented on July 18, 2024

The interesting thing here is that the whole point of the MPI datatypes is so we don't have to handle this. MPI_BCast should take into account the size of the type you're sending (since the second argument is the number of items of some MPI datatype).

Did the broadcast code change? It should broadcast 128 bytes if you ask it to broadcast 32 MPI_UNSIGNED_CHARs just like it'll transmit 128 bytes if you ask it to send 10 MPI_LONG_DOUBLEs...

from plfs-core.

brettkettering avatar brettkettering commented on July 18, 2024

Yes, nice job, indeed Alfred!

I see this code in ad_plfs_open.c:

#define BIT_ARRAY_LENGTH MAX_HOSTDIRS

typedef char Bitmap;
Bitmap bitmap[BIT_ARRAY_LENGTH/8];

And, here is MAX_HOSTDIRS:
mallorca:src brettk$ grep MAX_HOSTDIRS *.h
plfs_internal.h:#define MAX_HOSTDIRS 1024

So, it seems to me that since Bitmap is a char, we are assuming a char is 8 bits. Sure, it is now, but that's what we have "sizeof" for. So, we need to do this:

typedef char Bitmap;
Bitmap bitmap[BIT_ARRAY_LENGTH/sizeof( Bitmap )];

Then, we need to do bounds checking so that we don't go off either end of bitmap when using it. And, we need to do the calculation properly to tell MPI how big bitmap is.

Alfred: Did you plan to make the code fixes?

Thanks,
Brett

From: John Bent <[email protected]mailto:[email protected]>
Reply-To: plfs/plfs-core <[email protected]mailto:[email protected]>
Date: Tuesday, February 25, 2014 8:01 AM
To: plfs/plfs-core <[email protected]mailto:[email protected]>
Subject: Re: [plfs-core] MPI Abort during Read Phase of fs_test (#340)

Wow. Nice debugging.

On Feb 25, 2014, at 7:45 AM, Alfred Torrez <[email protected]mailto:[email protected]> wrote:

I was able to duplicate the problem on smog by increasing the number of hostdirs to 379 (same as cielo). By turning on the VERBOSE DEBUG compiler directive, I was able to see that nodes other than the rank 0 node were not seeing the full bitmap from MPI_Bcast.

What I discovered is that the number of bytes being transimitted was not 128 (full bitmap), but 32. This statement in code was the culprit:

int bitmap_bcast_sz = (BIT_ARRAY_LENGTH/8)/sizeof(MPI_UNSIGNED_CHAR);

this is the size used for broadcast

MPIBCAST(bitmap,bitmap_bcast_sz,MPI_UNSIGNED_CHAR,0,comm);

The problem is the sizeof MPI_UNSIGNED_CHAR is 4, so the resulting size was computing as follows: (1024/8)/4 = 32. I was only seeing 32 bytes being broadcast. So any hostdir > 256 (in the map) was not being broadcast and because of no bounds checking, the code kept running until if found a bit set, which was of course an illegal hostdir. Example MPI_Bcast code I have seen does not take into account the size of an MPI type so I am not sure why this was there in the first place. MPI must account for this internally.

Also, I am not sure why this worked previously on cielo. I browsed earlier versions of cray mpich with regard to the MPI_UNSIGNED_CHAR definition and that does not appear to have changed.


Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHubhttps://github.com//issues/340#issuecomment-36015829.

from plfs-core.

atorrez avatar atorrez commented on July 18, 2024

Brett,

Thanks, yes, I planned on making the changes this morning. I just wanted to get some feedback before I made them. I also wanted to ask Nathan Hjelm about this – not necessarily before I make the changes.

Alfred

From: Brett Kettering <[email protected]mailto:[email protected]>
Reply-To: plfs/plfs-core <[email protected]mailto:[email protected]>
Date: Tue, 25 Feb 2014 07:20:09 -0800
To: plfs/plfs-core <[email protected]mailto:[email protected]>
Cc: Alfred Torrez <[email protected]mailto:[email protected]>
Subject: Re: [plfs-core] MPI Abort during Read Phase of fs_test (#340)

Yes, nice job, indeed Alfred!

I see this code in ad_plfs_open.c:

#define BIT_ARRAY_LENGTH MAX_HOSTDIRS

typedef char Bitmap;
Bitmap bitmap[BIT_ARRAY_LENGTH/8];

And, here is MAX_HOSTDIRS:
mallorca:src brettk$ grep MAX_HOSTDIRS *.h
plfs_internal.h:#define MAX_HOSTDIRS 1024

So, it seems to me that since Bitmap is a char, we are assuming a char is 8 bits. Sure, it is now, but that's what we have "sizeof" for. So, we need to do this:

typedef char Bitmap;
Bitmap bitmap[BIT_ARRAY_LENGTH/sizeof( Bitmap )];

Then, we need to do bounds checking so that we don't go off either end of bitmap when using it. And, we need to do the calculation properly to tell MPI how big bitmap is.

Alfred: Did you plan to make the code fixes?

Thanks,
Brett

From: John Bent <[email protected]mailto:[email protected]mailto:[email protected]>
Reply-To: plfs/plfs-core <[email protected]mailto:[email protected]mailto:[email protected]>
Date: Tuesday, February 25, 2014 8:01 AM
To: plfs/plfs-core <[email protected]mailto:[email protected]mailto:[email protected]>
Subject: Re: [plfs-core] MPI Abort during Read Phase of fs_test (#340)

Wow. Nice debugging.

On Feb 25, 2014, at 7:45 AM, Alfred Torrez <[email protected]mailto:[email protected]mailto:[email protected]> wrote:

I was able to duplicate the problem on smog by increasing the number of hostdirs to 379 (same as cielo). By turning on the VERBOSE DEBUG compiler directive, I was able to see that nodes other than the rank 0 node were not seeing the full bitmap from MPI_Bcast.

What I discovered is that the number of bytes being transimitted was not 128 (full bitmap), but 32. This statement in code was the culprit:

int bitmap_bcast_sz = (BIT_ARRAY_LENGTH/8)/sizeof(MPI_UNSIGNED_CHAR);

this is the size used for broadcast

MPIBCAST(bitmap,bitmap_bcast_sz,MPI_UNSIGNED_CHAR,0,comm);

The problem is the sizeof MPI_UNSIGNED_CHAR is 4, so the resulting size was computing as follows: (1024/8)/4 = 32. I was only seeing 32 bytes being broadcast. So any hostdir > 256 (in the map) was not being broadcast and because of no bounds checking, the code kept running until if found a bit set, which was of course an illegal hostdir. Example MPI_Bcast code I have seen does not take into account the size of an MPI type so I am not sure why this was there in the first place. MPI must account for this internally.

Also, I am not sure why this worked previously on cielo. I browsed earlier versions of cray mpich with regard to the MPI_UNSIGNED_CHAR definition and that does not appear to have changed.


Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHubhttps://github.com//issues/340#issuecomment-36015829.


Reply to this email directly or view it on GitHubhttps://github.com//issues/340#issuecomment-36018052.

from plfs-core.

atorrez avatar atorrez commented on July 18, 2024

Fixed Broadcast Size by calling MPI_Type_size instead of using sizeof(MPI_Type). Added bounds checking when determining if hostdir bitmap bit is set. Fix merged into Master

from plfs-core.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.