Comments (11)
This is on purpose. It makes sense for all of the cores to have the same
number of emulated samples, so this guarantees a total number of emulated
samples that is at least as large as the num_l_emulate you requested and is
a multiple of the number of cores you are running on. Otherwise you are not
using your computing resources optimally. Adding a few extra emulated points
incurs no added computational cost (it is actually slightly more efficient
than doing what you suggest).
Admittedly, when the num_l_emulate the user passes in is already a multiple
of the number of cores, this does add one more emulated point per core.
Since the number of emulated points should be very large compared to the
number of cores, adding one point per core shouldn't really make much of a
difference. If you think it is important that we use the EXACT number of
emulated points that the user specifies, then I guess we can change this a little.
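The rounding behavior under discussion can be sketched as follows (a
hypothetical standalone function, not the bet code itself; `//` is used to
make the integer division explicit):

```python
# Sketch of the rounding discussed above: each of `size` processors gets
# int(num_l_emulate/size) + 1 emulated samples, so the global total is
# rounded UP past num_l_emulate -- and when num_l_emulate is already a
# multiple of size, one extra point per core is still added.
def per_core(num_l_emulate, size):
    return num_l_emulate // size + 1

# 1000 points requested on 4 cores: 251 each, 1004 globally (4 extra).
assert per_core(1000, 4) * 4 == 1004
# 1001 points requested on 4 cores: 251 each, 1004 globally (3 extra).
assert per_core(1001, 4) * 4 == 1004
```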
On Tue, Jun 9, 2015 at 4:38 PM, eecsu [email protected] wrote:
In line 35 of calculateP in the emulate_iid_lebesgue method:

num_l_emulate = int(num_l_emulate/size)+1

Apparently size comes from bet.Comm and is the number of processors used
in the computation, so when this operation is called with only a single
processor it creates 1 more emulated sample than called for by the
num_l_emulate parameter.

I came across this while creating the verification example to re-create a
uniform distribution on \Lambda and worked around it by passing
num_l_emulate-1 into this method. I think the fix is to compute

remainder = num_l_emulate % size

and then for the first "remainder" number of processors add 1 to
num_l_emulate. I am not quite sure how to code this though.
(GitHub issue #92)
A minor change would be:

if num_l_emulate % size != 0:
    num_l_emulate = int(num_l_emulate/size) + 1
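The fragment above leaves the evenly-divisible case implicit; a minimal
sketch that fills it in (a hypothetical standalone function, assuming the
division is still wanted when the remainder is zero):

```python
# Sketch of the guarded variant above: round up only when the
# user-specified total does not divide evenly among the cores.
def per_core(num_l_emulate, size):
    if num_l_emulate % size != 0:
        return num_l_emulate // size + 1
    return num_l_emulate // size

assert per_core(1000, 4) == 250  # exact split: 1000 globally, no extras
assert per_core(1001, 4) == 251  # rounded up: 1004 globally, 3 extras
```

Note that even with the guard, the global total still overshoots the
request whenever the remainder is nonzero.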
Maybe a warning should be printed if the number changes. The issue I see is that someone will set a variable in a script (like I did) to the number of emulated samples and then use that variable elsewhere when working with the emulated samples, resulting in incorrect computations.
At worst, the result of my suggestion is that some processors hold 1 more sample than others, but these emulated samples never incur model evaluation cost, so this lack of an optimal division of resources seems negligible to me.
Then the best change would be to do:
num_l_emulate = (num_l_emulate/size) + (rank < num_l_emulate%size)
I will put that change in along with some other changes I am making for a new pull request that will include the validation example.
If you are doing that, then actually do:
num_l_emulate = (num_l_emulate/comm.size) + (comm.rank < num_l_emulate%comm.size)
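The balanced split proposed above can be checked without MPI by treating
rank and size as plain integers (a hypothetical standalone function;
comm.rank and comm.size stand in for the bet.Comm attributes):

```python
# Sketch of the balanced split: the first num_l_emulate % size ranks take
# one extra sample (the boolean comparison counts as 0 or 1), so the
# per-rank counts always sum to exactly the user-specified global total.
def local_count(num_l_emulate, rank, size):
    return num_l_emulate // size + (rank < num_l_emulate % size)

size = 4
counts = [local_count(1001, r, size) for r in range(size)]
assert counts == [251, 250, 250, 250]
assert sum(counts) == 1001  # global total matches the request exactly
```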
Excellent. I like the proposed change, with the inclusion of a warning in the docstring and a printed one if num_l_emulate % comm.size is nonzero.
With what I propose above, this would not be necessary. The global number of emulated points is what the user specifies.
Steve is correct, the warning is not needed with this fix.
Sorry, got it now. This shouldn't affect any other portions of the code, as we don't assume anywhere that num_l_emulate is the same on every processor.