
lhsmdu's People

Contributors

dilawar, sahilm89


lhsmdu's Issues

lhsmdu to work with scipy distributions (and ranges)

Hi there,

When creating samples with lhsmdu, is there a way to draw them not uniformly but according to some or all of the scipy distributions (e.g. normal, alpha, beta, lognormal; scipy has about 100 different distributions)?

Ideally, one would set the distribution and range for each variable, then the number of samples (which is obviously the same for all variables), and run Latin hypercube or MC sampling based on that. I'm just thinking out loud, by the way, as someone who uses it for simulation analyses (not as a programmer).

Kind regards
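
One common way to get non-uniform marginals from a uniform sample is inverse transform sampling: each uniform value is pushed through the inverse CDF of the target distribution. The sketch below is not part of lhsmdu. It uses the standard library's `statistics.NormalDist.inv_cdf`, and a hand-rolled one-dimensional stratified sample (`stratified_uniform`, a hypothetical helper) stands in for one row of `lhsmdu.sample` output. scipy.stats distributions expose the same operation as `ppf`, so `scipy.stats.norm(loc, scale).ppf(u)` would play the same role. The distribution parameters and the range are made-up example values.

```python
import random
from statistics import NormalDist


def stratified_uniform(n, rng):
    """One value per stratum [i/n, (i+1)/n), returned in shuffled order.

    Hypothetical helper standing in for one row of a uniform LHS sample.
    """
    values = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(values)
    return values


rng = random.Random(0)
u = stratified_uniform(20, rng)  # uniform values in [0, 1)

# Inverse transform: map the uniforms through the inverse CDF of the target.
target = NormalDist(mu=10.0, sigma=2.0)  # assumed example parameters
eps = 1e-12  # inv_cdf needs 0 < p < 1, so clamp away from the endpoints
x = [target.inv_cdf(min(max(p, eps), 1.0 - eps)) for p in u]

# A range [low, high] for a uniform variable is a simple rescale.
low, high = -5.0, 5.0  # assumed example range
y = [low + (high - low) * p for p in u]

print(x[:3])
print(y[:3])
```

Because the inverse CDF is monotonic, the stratification of the uniform sample carries over: each transformed value falls in the matching equal-probability slice of the target distribution.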

numpy.AxisError: axis -1 is out of bounds for array of dimension 0 after package update

Hi S,

I updated the lhsmdu package today and tested the code from the 'basics' section you provided, but I immediately got the following error (it never happened with the version of your package I had installed previously):

Traceback (most recent call last):
File "C:\Program Files\Python 3.5\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
runfile('C:/Users/myvhove/Documents/CE_70/Python/LHS_MDU_test.py', wdir='C:/Users/myvhove/Documents/CE_70/Python')
File "C:\Program Files\JetBrains\PyCharm 2019.2.3\helpers\pydev_pydev_bundle\pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm 2019.2.3\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/myvhove/Documents/CE_70/Python/LHS_MDU_test.py", line 6, in
k = lhsmdu.sample(2, 20) # Latin Hypercube Sampling with multi-dimensional uniformity
File "C:\Program Files\Python 3.5\lib\site-packages\lhsmdu_init_.py", line 109, in sample
matrixOfStrata = eliminateRealizationsToStrata(distance_1D, matrixOfRealizations, numSamples)
File "C:\Program Files\Python 3.5\lib\site-packages\lhsmdu_init_.py", line 45, in eliminateRealizationsToStrata
realizations = sort(averageDistance.keys())
File "<array_function internals>", line 6, in sort
File "C:\Program Files\Python 3.5\lib\site-packages\numpy\core\fromnumeric.py", line 970, in sort
a.sort(axis=axis, kind=kind, order=order)
numpy.AxisError: axis -1 is out of bounds for array of dimension 0

The problem is in the sample function:
k = lhsmdu.sample(2, 20) # Latin Hypercube Sampling with multi-dimensional uniformity
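
A likely cause, judging only from the traceback line `realizations = sort(averageDistance.keys())`: in Python 3, `dict.keys()` returns a view object rather than a list, and NumPy wraps such a view in a 0-dimensional object array, which cannot be sorted along an axis. The sketch below reproduces that behaviour with a made-up dictionary (`average_distance` and its contents are assumptions, not the package's data) and shows one possible fix, converting the view to a list first.

```python
import numpy as np

# Hypothetical stand-in for the dictionary used inside the package.
average_distance = {3: 0.42, 1: 0.17, 2: 0.88}

keys_view = average_distance.keys()

# NumPy does not treat a dict view as a sequence: it becomes a 0-d object array.
print(np.ndim(keys_view))

# Converting to a list first gives NumPy a proper 1-D sequence to sort.
realizations = np.sort(list(keys_view))
print(realizations)

# Plain Python sorting works as well and avoids NumPy entirely.
realizations_py = sorted(average_distance)
print(realizations_py)
```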


Incremental/Nested latin hypercube sampling

I have another question about the sampling. At one point in your sample code, you resample within the same strata as the previous sample to get an additional nested sample.

As far as I can tell, in the example (2 variables) this function just generates a new list of sampling points within the same selection of Latin hypercube squares.
Is there a way to create an incremental nested sample set? The first time you sample, you get a set of LHS-MDU-sampled points (e.g. 20). You then repeat this multiple times, but each new batch of 20 points takes into account the positions of the previously sampled points, so that together you have 40, 60, 80 sampled points.

That way, a single generated set of 40, 60 or 80 LHS-MDU-sampled points should be equivalent to an incremental nested set of 2, 3 or 4 times 20 points. The advantage is that you can split the calculations into steps of, for example, 20 and stop once convergence is achieved, instead of running the whole sampled set at once. Is there a way to do this with this package?

Attached is an example.
Two sampled sets, generated with the same seed number. In the first (blue), 120 samples are generated. In the second (red), 240 samples are generated, but the second set includes the data points from the first set, so that the first 120 sample points of the red dataset are the same as the blue dataset. Is this possible to generate with this package?
Nested sample.pdf
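
One way such a nested design can be built for a plain Latin hypercube (without the multi-dimensional uniformity step) is by doubling: split every stratum in two, keep the existing points, and place one new point in each half-stratum that is still empty. The sketch below illustrates that idea in pure Python. It is not a feature of lhsmdu, the function names `lhs` and `extend_lhs` are hypothetical, and it grows the sample by doubling (20, 40, 80, ...) rather than in fixed steps of 20.

```python
import random


def lhs(n, dims, rng):
    """Plain Latin hypercube: exactly one point per stratum in each dimension."""
    columns = []
    for _ in range(dims):
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]


def extend_lhs(points, rng):
    """Double an n-point Latin hypercube to 2n points, keeping the old points."""
    n = len(points)
    dims = len(points[0])
    m = 2 * n  # number of strata after refinement
    new_columns = []
    for d in range(dims):
        # Each old point occupies exactly one of the two halves of its stratum.
        occupied = {int(p[d] * m) for p in points}
        empty = [s for s in range(m) if s not in occupied]
        rng.shuffle(empty)  # pair the empty strata randomly across dimensions
        new_columns.append([(s + rng.random()) / m for s in empty])
    new_points = [tuple(col[i] for col in new_columns) for i in range(n)]
    return points + new_points


rng = random.Random(1)
base = lhs(20, 2, rng)            # first batch of 20 points
combined = extend_lhs(base, rng)  # 40 points; the first 20 are unchanged
print(len(combined), combined[:20] == base)
```

The combined set is again a Latin hypercube on the finer grid, so the first batch can be evaluated, checked for convergence, and extended only if needed.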

Implementation has exponential runtime (unusable for real-world cases)

The LHS implementation in this package has exponential runtime (depending on the number of samples). This makes the package unusable if one needs more than 20 samples. I am posting this as a warning for anyone who wants to use it with larger sample sizes, which will not work.

From my understanding of the LHS algorithm, the runtime should scale as O(n) in the number of samples.

import time

import lhsmdu
import matplotlib.pyplot as plt
import pandas as pd

def lhsmdu_runtime():
    runtimes = []
    # Time lhsmdu.sample for 1, 2, 4, ..., 256 samples
    for k in range(0, 9):
        ts = time.time()
        samples = 2**k
        lhsmdu.sample(2, samples)  # LHS-MDU of two variables with `samples` samples each
        te = time.time()
        res = {'samples': samples, 'time': te-ts}
        print(res)
        runtimes.append(res)
    df = pd.DataFrame(runtimes)
    fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(5, 10))
    for ax in axes:
        ax.plot(df.samples, df.time, '-o', markersize=10)
        ax.set_xlabel("sample size")
        ax.set_ylabel("runtime [s]")
        ax.grid(True)

    axes[1].set_xscale("log")
    axes[1].set_yscale("log")
    fig.savefig("lhsmdu_runtime.png")
    plt.show()

if __name__ == "__main__":
    # example1()
    lhsmdu_runtime()

[Figure: lhsmdu runtime [s] versus sample size, on linear and log-log axes]

So basically it takes 5 minutes to create 256 samples, around 1 hr for 500 samples. The current implementation is not usable for more than 20 samples.

{'samples': 1, 'time': 0.0008461475372314453}
{'samples': 2, 'time': 0.0018656253814697266}
{'samples': 4, 'time': 0.007106781005859375}
{'samples': 8, 'time': 0.030531644821166992}
{'samples': 16, 'time': 0.1478424072265625}
{'samples': 32, 'time': 0.8095762729644775}
{'samples': 64, 'time': 5.116601467132568}
{'samples': 128, 'time': 36.78028750419617}
{'samples': 256, 'time': 283.7775731086731}

Process finished with exit code 0
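
To see what growth rate these timings imply, one can compute the local exponent between successive measurements: since the sample size doubles at each step, log2 of the ratio of consecutive runtimes estimates p in t ~ n^p. The sketch below does this with the numbers reported above (rounded). On these numbers the exponent appears to approach 3 at the larger sizes, which would point to roughly cubic rather than strictly exponential growth, although nine measurements from one machine are not conclusive.

```python
import math

# Runtimes in seconds, copied (rounded) from the output above.
timings = {
    1: 0.000846,
    2: 0.001866,
    4: 0.007107,
    8: 0.030532,
    16: 0.147842,
    32: 0.809576,
    64: 5.116601,
    128: 36.780288,
    256: 283.777573,
}

sizes = sorted(timings)
# Sizes double at each step, so log2 of the runtime ratio is the local exponent.
slopes = [math.log2(timings[b] / timings[a]) for a, b in zip(sizes, sizes[1:])]

for (a, b), slope in zip(zip(sizes, sizes[1:]), slopes):
    print(f"{a:>3} -> {b:>3}: runtime ratio 2^{slope:.2f}")
```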

Resampling of realization matrix is incorrect

sortedIndicesOfStrata = argsort(ravel(matrixOfStrata[row,:]))

Resampling should be done by determining, per dimension, the index (rank) of each realization after sorting all realizations in that dimension. However, this is not what argsort does. For example, [0.62, 0.19, 0.92, 0.22, 0.07] should result in [3, 1, 4, 2, 0], but argsort gives [4, 1, 3, 0, 2] instead. Applying argsort again to sortedIndicesOfStrata gives the correct vector of indices: in this example, argsort([4, 1, 3, 0, 2]) gives the correct result [3, 1, 4, 2, 0].

The effect of this incorrect resampling is that the resulting Latin hypercube samples are far from uniformly distributed, as intended by LHSMDU.
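
The difference between the two vectors can be checked directly with NumPy, using the example values above. `np.argsort` returns the indices that would sort the array; applying it a second time turns that ordering into the rank of each element. The scatter-assignment variant at the end is an equivalent alternative, shown only for comparison.

```python
import numpy as np

values = np.array([0.62, 0.19, 0.92, 0.22, 0.07])

# Indices that would sort the array: values[order] is ascending.
order = np.argsort(values)

# Rank of each element, i.e. its position after sorting.
ranks = np.argsort(order)

# Equivalent alternative: write position i into slot order[i].
ranks_alt = np.empty_like(order)
ranks_alt[order] = np.arange(len(values))

print(order)      # [4 1 3 0 2]
print(ranks)      # [3 1 4 2 0]
print(ranks_alt)  # [3 1 4 2 0]
```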

Sample code needs to be updated.

The example in the Readme.md does not work with Python 3.6. The k and l are 2-D numpy matrices that need to be converted to np.arrays before being passed to matplotlib. In addition, the 'col' parameter name needs to be replaced with 'c' or 'color'.

import lhsmdu
import matplotlib.pyplot as plt
import numpy as np

k = lhsmdu.sample(2, 20) # Latin Hypercube Sampling with multi-dimensional uniformity
print(k)
l = lhsmdu.createRandomStandardUniformMatrix(2, 20) # Monte Carlo sampling

k = np.array(k)
l = np.array(l)

fig = plt.figure()
ax = fig.gca()
ax.set_xticks(np.arange(0,1,0.1))
ax.set_yticks(np.arange(0,1,0.1))
plt.scatter(k[0], k[1], color="g", label="LHS-MDU")
plt.scatter(l[0], l[1], color="r", label="MC")
plt.grid()
plt.show()

m = lhsmdu.resample()
n = lhsmdu.resample()
o = lhsmdu.resample()
m = np.array(m)
n = np.array(n)
o = np.array(o)

fig = plt.figure()
ax = fig.gca()
ax.set_xticks(np.arange(0,1,0.1))
ax.set_yticks(np.arange(0,1,0.1))
plt.title("LHS-MDU")
plt.scatter(k[0], k[1], c="g", label="sample 1")
plt.scatter(m[0], m[1], c="r", label="resample 2")
plt.scatter(n[0], n[1], c="b", label="resample 3")
plt.scatter(o[0], o[1], c="y", label="resample 4")
plt.grid()
plt.show()

lhsmdu does not work in Python 3

Hi there. I am trying to run the exact sample code included in readme.rst (in Python 3) and am getting various errors. Does lhsmdu only support Python 2?

error:
` ValueError Traceback (most recent call last)
in
10 ax.set_xticks(numpy.arange(0,1,0.1))
11 ax.set_yticks(numpy.arange(0,1,0.1))
---> 12 plt.scatter(k[0], k[1], color="b", label="LHS-MDU")
13 plt.scatter(l[0], l[1], color="r", label="MC")
14 plt.grid()

C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\pyplot.py in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, data, **kwargs)
2860 vmin=vmin, vmax=vmax, alpha=alpha, linewidths=linewidths,
2861 verts=verts, edgecolors=edgecolors, **({"data": data} if data
-> 2862 is not None else {}), **kwargs)
2863 sci(__ret)
2864 return __ret

C:\ProgramData\Anaconda3\lib\site-packages\matplotlib_init_.py in inner(ax, data, *args, **kwargs)
1808 "the Matplotlib list!)" % (label_namer, func.name),
1809 RuntimeWarning, stacklevel=2)
-> 1810 return func(ax, *args, **kwargs)
1811
1812 inner.doc = _add_data_doc(inner.doc,

C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axes_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
4260 x, y, s, c, colors, edgecolors, linewidths =
4261 cbook.delete_masked_points(
-> 4262 x, y, s, c, colors, edgecolors, linewidths)
4263
4264 scales = s # Renamed for readability below.

C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\cbook_init_.py in delete_masked_points(*args)
1030 if isinstance(x, np.ma.MaskedArray):
1031 if x.ndim > 1:
-> 1032 raise ValueError("Masked arrays must be 1-D")
1033 else:
1034 x = np.asarray(x)

ValueError: Masked arrays must be 1-D
`
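
The "Masked arrays must be 1-D" error is consistent with the sample being a `numpy.matrix`: indexing a row of a matrix keeps it two-dimensional, whereas the same index on a plain array gives a 1-D vector. The sketch below demonstrates this with a random matrix standing in for the output of `lhsmdu.sample(2, 20)` (that the package returns a matrix is taken from the report in the issue above, not verified here).

```python
import numpy as np

# Stand-in for the sampler output: a 2 x 20 numpy matrix (assumption).
k = np.matrix(np.random.rand(2, 20))

row_matrix = k[0]             # still a matrix, still 2-D
print(row_matrix.shape)       # (1, 20)

row_array = np.asarray(k)[0]  # convert first, then index: 1-D
print(row_array.shape)        # (20,)

row_flat = k[0].A1            # matrix attribute that flattens to a 1-D array
print(row_flat.shape)         # (20,)
```

Either 1-D form can be passed to `plt.scatter` without hitting the dimensionality check.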

`None` seed is not working

lhsmdu.setRandomSeed(None) leads to

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-70-04ad35276bcd> in <module>
----> 1 lhsmdu.setRandomSeed(None)

AttributeError: module 'lhsmdu' has no attribute 'setRandomSeed'

in Python 3.

Repeated sampling

Hi Sahil,

I am using your (extremely useful) script for sampling from parameter space, but this needs to be done identically on repeated simulations. I noticed that supplying randomSeed to sample() has no effect, so I have modified my version to have randomSeed = None by default, and to include

if randomSeed is not None: random.seed(randomSeed)

within sample()

This solved my issue. Reporting here in case it should be included into the main commit.

Apologies if this is the wrong place to log this issue; I'm new to github.

Thanks,

Sam
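
The pattern described in this report can be sketched in isolation as follows. `sample_uniform` is a hypothetical stand-in, not lhsmdu's `sample()`, and it uses the standard library's `random` module, which is an assumption about which `random` the report refers to. It only illustrates the "seed when a seed is given" logic and its effect on reproducibility.

```python
import random


def sample_uniform(num_dimensions, num_samples, random_seed=None):
    """Hypothetical sampler: reseeds only when a seed is explicitly supplied."""
    if random_seed is not None:
        random.seed(random_seed)
    return [
        [random.random() for _ in range(num_samples)]
        for _ in range(num_dimensions)
    ]


a = sample_uniform(2, 5, random_seed=42)
b = sample_uniform(2, 5, random_seed=42)  # same seed: identical to a
c = sample_uniform(2, 5)                  # no seed: continues the random stream

print(a == b)
print(a == c)
```

With `random_seed=None` as the default, callers who do not care about reproducibility are unaffected, while repeated simulations can pass the same seed to get identical samples.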
