Code Monkey home page Code Monkey logo

Comments (6)

sweverett avatar sweverett commented on August 11, 2024

I believe it's only in reglib.py, but it includes all the rpy2 imports, lines 9-16, as well as what's done in the run_lrgs() function. So we actually don't need anything in lines 1-75 except for the numpy and linmix imports.

from clustr.

paigemkelly avatar paigemkelly commented on August 11, 2024

okay great, we just put that straight into the clustr.py code. We are having trouble figuring out the run_mcmc part of that function and the .chain part. I haven't seen that before and I'm having trouble relating what I'm finding on the internet to what's actually going on in the code. From what I'm understanding run_mcmc runs linmix through Nmin = 5000 to Nmax = 10000. What do Nmin and Nmax represent and how does that apply to linmix? or should we not worry about it?

from clustr.

sweverett avatar sweverett commented on August 11, 2024

I explained this a bit in #33. In principle we would try out every possible combination of values for (slope, intercept, intrinsic_scatter) and use the combination that gives the 'best fit' to the data. But this is quickly a computational nightmare as the time it takes to check all combination grows as the number of grid points ^ # of parameters. So instead there are smarter ways of sampling the parameter space of (slope, intercept, intrinsic_scatter), one of which is a Monte-Carlo Markov Chain (MCMC). It does a random walk in parameter space in a weighted way that will lead to it sampling better fit values than bad fit values. We can talk about this more in a zoom meeting if you'd like.

from clustr.

sweverett avatar sweverett commented on August 11, 2024

But it doesn't give a very reliable estimate if you only sample a few points. So you want to run until it 'converges', which is actually a bit of a complicated subject. Setting a minimum number of samples makes sure it has found the "correct" part of parameter space to sample from and the max number of samples will force it to quit if it's struggling to do the fit after a long time.

from clustr.

paigemkelly avatar paigemkelly commented on August 11, 2024

I think that makes sense, but just to clarify it will run until at least 5000 samples (Nmin) to make sure its reliable, and it will quit after 10000 (Nmax) if it's struggling? A zoom meeting would be helpful. I'll send an email, thanks!

from clustr.

sweverett avatar sweverett commented on August 11, 2024

Yes, that's correct. Those values aren't necessarily reliable however - it's always best to inspect the resulting chains by eye. We will do that eventually.

from clustr.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.