
Comments (3)

MatthewReid854 commented on May 29, 2024

This is certainly possible, though I'm not sure if it is really an "issue" or just a matter of convenience and computational efficiency (not fitting things twice and not writing lots of if statements).

Your proposed implementation implies that every distribution object would need a probplot method. There is no reason this can't be done, but I don't understand why you can't just refit the best distribution individually to obtain the probability plot. The other problem with giving every distribution a probplot method is that you need to supply the failure and right censored data if you want more than just the straight line of the CDF. The way the functions in Probability_plotting are built, it is essential to provide the failure (and right_censored) data even if you give it the __fitted_dist_params (an internal argument that I use to make a probability plot skip the fitting step and just take what it is given).

If I was doing this myself, I would use Fit_Everything to tell me the best fit, then I would do the fit again (either with the right function from Fitters or with the right function from Probability_plotting) to obtain the probability plot. As you say, it is necessary to know the distribution you are fitting in order to achieve this so you've either got to do it manually or write in a lot of if statements. I can understand that it may be problematic for someone trying to automate something so it just spits out the correct probability plot every time without user input.
Can you tell me your use case and why you believe it is necessary to include a probplot method inside each distribution object rather than just obtain the probplot separately?

I haven't seen the use of getattr before, but using what you showed me and my knowledge of the hidden variables inside Fit_Everything, I can provide you with this somewhat hacky solution:

from reliability.Distributions import Weibull_Distribution
from reliability import Probability_plotting
from reliability.Fitters import Fit_Everything
import matplotlib.pyplot as plt

# make some data
data = Weibull_Distribution(alpha=500, beta=2).random_samples(500)

# fit every distribution but suppress the usual plots
results = Fit_Everything(failures=data, show_histogram_plot=False, show_probability_plot=False, show_PP_plot=False)

# look up the matching plotting function by name (e.g. Weibull_probability_plot)
plotter = getattr(Probability_plotting, f'{results.best_distribution.name}_probability_plot')

# fetch the parameters object from within Fit_Everything. This is a hidden
# variable with a name of the form _Fit_Everything__Weibull_2P_params
params = getattr(results, f'_Fit_Everything__{results.best_distribution.name2}_params')

# pass the fitted parameters so the fitting is not done a second time. We still
# need to provide the failures as all probability plots need these to generate the scatter plot.
plotter(failures=data, __fitted_dist_params=params)
plt.show()
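
If the name-based lookup is going into an automated pipeline, it may be worth wrapping it so that an unexpected distribution name fails with a clear message rather than a bare AttributeError. The following is only a sketch of that idea: `find_plotter` is a hypothetical helper, and `fake_plotting` is a stand-in namespace used here so the snippet runs without the library installed. In real use you would pass the Probability_plotting module instead.

```python
from types import SimpleNamespace


def find_plotter(module, dist_name):
    """Return module.<dist_name>_probability_plot, or raise a clear error.

    Hypothetical helper, not part of the reliability library.
    """
    attr = f"{dist_name}_probability_plot"
    plotter = getattr(module, attr, None)  # None if the name does not exist
    if not callable(plotter):
        raise LookupError(f"no plotting function named {attr!r}")
    return plotter


# Stand-in for the Probability_plotting module (assumption for illustration).
# Each fake plotter just reports which one was called and how many points it got.
fake_plotting = SimpleNamespace(
    Weibull_probability_plot=lambda failures, **kwargs: ("Weibull", len(failures)),
    Normal_probability_plot=lambda failures, **kwargs: ("Normal", len(failures)),
)

plotter = find_plotter(fake_plotting, "Weibull")
result = plotter(failures=[1.0, 2.0, 3.0])
```

The same guard could be applied to the lookup of the hidden parameters object, since that name is also built from a string.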


edwardreed81 commented on May 29, 2024

Yes, this is completely about convenience and user-facing simplicity, not something that needs to be fixed. The computational efficiency isn't really a big deal - I'm not fitting things millions of times or anything.

My use case is an automated process that uses Fit_Everything and then uses the resulting distribution for other work. Along the way the probability plots of all the distributions fitted are saved for future reference, but it would be nice to save the one for the selected distribution separately.

I agree that using getattr() to call things by concatenating strings feels hacky, but the code you included does accomplish what I need.
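
For anyone who would rather not build names from strings at all, one alternative is an explicit lookup table from distribution name to plotting function. This is a minimal sketch of that pattern only: `weibull_plot`, `normal_plot`, `PLOTTERS` and `plot_best` are hypothetical placeholders, not functions from the library, and the stand-ins return a string instead of drawing anything.

```python
def weibull_plot(failures):
    # Placeholder for a real Weibull probability plot call
    return f"Weibull plot of {len(failures)} failures"


def normal_plot(failures):
    # Placeholder for a real Normal probability plot call
    return f"Normal plot of {len(failures)} failures"


# Explicit table: adding a distribution means adding one entry here
PLOTTERS = {
    "Weibull": weibull_plot,
    "Normal": normal_plot,
}


def plot_best(best_name, failures):
    """Dispatch to the plotter registered for best_name."""
    try:
        plotter = PLOTTERS[best_name]
    except KeyError:
        raise ValueError(f"no plotter registered for {best_name!r}") from None
    return plotter(failures)


message = plot_best("Weibull", [10, 20, 30])
```

The trade-off is that the table has to be kept in sync by hand, whereas the getattr() approach picks up new plotting functions automatically as long as the naming convention holds.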


MatthewReid854 commented on May 29, 2024

I have decided to incorporate your suggestion. From v0.5.6 onward, Fitters.Fit_Everything includes the argument show_best_distribution_probability_plot. This will default to True, so Fit_Everything will produce 4 figures by default.

I am currently working on rewriting the Accelerated life testing (ALT) section and I will be introducing an ALT_Fit_Everything function that will fit all the ALT models. I think your idea to also return the probability plot of the best fitting distribution is worth including there as well.
Thanks for your suggestion.
