Code Monkey home page Code Monkey logo

lda-explorable's People

Contributors

omarshehata avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

lda-explorable's Issues

Consider explaining LDA without "optimization"

It was suggested on /r/statistics that thinking of LDA as an optimization problem as described in this article isn't how statisticians currently think about LDA. This would explain why it is not actually solved as an optimization problem in practice (which is something I skipped over in the article, see #3):

While I applaud the use of interactivity, I don't actually think this is the best way to go about thinking about LDA.

Firstly, you're talking about Fisher's original formulation of LDA (wiki). Nowadays we usually use the generative model version of LDA, and I think that is actually very intuitive.

Essentially, you assume that your data is generated from normal distributions, with a common covariance structure (if it's not the same, then you get QDA). That is, each class has its own normal distribution. Then, it's a little work to show that if you assume those distributions, then the (Bayes) optimal way to classify new data points correspond to linear separations (intuitively, you're just checking which density is higher, that is your classification).

I think reformulating this explanation would essentially be a different article, but it could still re-use most of the code and visualization here. Happy to support anyone who wants to explore this path.

Inaccurate Example of Metric

Th article says:

This particular metric is what Fisher chose, but it's certainly not the only possible one. A different metric could optimize for different results (perhaps you care a little more about minimizing scatter so you multiply the denominator by a large number to give it more weight).

However, if you just multiply the denominator by a constant, it changes all the scores the same way, so it won't change the relative "best". See:

https://twitter.com/D_M_Gregory/status/1069965514120851457

Maybe come up with a better example?

Justify Fisher's Formula

The article never explains why Fisher's formula has a denominator of (S1 + S2) as opposed to, say (S1 * S2). This is something I haven't been able to figure out, but some more explanation of how to arrive at that as opposed to other forms would be helpful.

Explain how the solution to LDA is found

Something I completely skipped over is how to actually solve this optimization problem. Partially because I thought it was outside the scope, but also partially because I don't know it well enough myself to explain it simply. It would be nice to say something about whether it's an iterative solution, whether it's some sort of gradient descent, or whether there's a closed form solution.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.