Comments (11)

fonnesbeck avatar fonnesbeck commented on August 26, 2024

You are correct, PyMC 2.x has never supported Pandas data structures. It will sometimes work, but not reliably, since PyMC does not account for Pandas' indexing rules. We chose to move new functionality like Pandas compatibility to PyMC 3. I will label this as an enhancement request.
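To illustrate what "Pandas' indexing rules" means here, a minimal sketch with made-up values: arithmetic on two Series aligns on index labels, while the same arithmetic on the underlying NumPy arrays is purely positional. The variable names are illustrative only and nothing below is PyMC code.

```python
import numpy as np
import pandas as pd

# Two Series holding the same labels in a different order (made-up values).
a = pd.Series([1.0, 2.0, 3.0], index=[0, 1, 2])
b = pd.Series([10.0, 20.0, 30.0], index=[2, 1, 0])

# Pandas matches elements by index label before adding:
# label 0 -> 1 + 30, label 1 -> 2 + 20, label 2 -> 3 + 10
label_sum = a + b

# NumPy adds element by element in storage order:
# position 0 -> 1 + 10, position 1 -> 2 + 20, position 2 -> 3 + 30
positional_sum = a.values + b.values

print(label_sum.loc[0], positional_sum[0])
```

Code that assumes positional semantics can therefore give different results when handed a Series instead of an array.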

from pymc2.

 avatar commented on August 26, 2024

Great. Thanks.


fonnesbeck avatar fonnesbeck commented on August 26, 2024

Can you check out and test the pandas branch of this repo? It should be compatible with Pandas DataFrame objects as values for stochastics.


 avatar commented on August 26, 2024

Are you sure the pandas branch is working with pd.Series as in the example I posted above? I tested it but I'm still getting the same behavior:

In [12]: print(mu.value)
0    1
1    2
dtype: float64

In [15]: print(mu.value)
0    1.559405
1   -0.453703
dtype: float64

By the way, I created a conda environment and ran pip install -e . inside the cloned repo and then checked out the pandas branch.


fonnesbeck avatar fonnesbeck commented on August 26, 2024

I'm looking at your sample code, and am confused by this:

mu = pm.Lambda('mean', lambda b=1: pd.Series(np.array([1.0, 2.0])) )

What is the b argument for? There don't appear to be any stochastic nodes in this expression. The use of Pandas in the branch is intended for data values in observed stochastic nodes (see the test case test_pandas_data in pymc/tests/test_instantiation.py for an example).


 avatar commented on August 26, 2024

Yes, I remember I had a computation involving b that returned a Series object, so I simply simulated the computation to reproduce the bug. Does the new branch work if I use b in the lambda function? It would be nice to have this working even if the node is not stochastic because it can be confusing to see the values changed in deterministic nodes.


fonnesbeck avatar fonnesbeck commented on August 26, 2024

The deterministic node should involve a stochastic somewhere; otherwise there is no point in creating one. Only stochastic variables or the children of stochastic variables require a PyMC object. The pandas branch was only intended to make Pandas data structures work as value arguments in observed stochastics.


 avatar commented on August 26, 2024

That's okay. I will test using the original dataset in a moment.


 avatar commented on August 26, 2024

Unfortunately, I think the same issue remains. This is the code:

import pymc as pm, pandas as pd, numpy as np
from scipy.spatial.distance import pdist, squareform
from numpy.linalg import inv

# Loading dataset
df = pd.read_table('http://pastebin.com/raw.php?i=41us4HVj', sep=' ')

# Setting priors
beta = pm.Normal('beta', 0.0, 0.1, size=3)
mu = pm.Lambda('mu', lambda b=beta: 
               b[0]+b[1]*df['LivingArea']/1000.0+b[2]*df['Age'])
tau = pm.Gamma('tau', 0.1, 0.1)
phi = pm.Uniform('phi', 0.1, 10)

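# Trying to build a covariate matrix (pairwise distances between locations)
# list(...) keeps this working where zip returns an iterator (Python 3)
A = squareform(pdist(np.array(list(zip(df['Latitude'], df['Longitude'])))))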

# Using the powered exponential to obtain a precision matrix
precision = pm.Lambda('exp', lambda u=A, tau=tau, phi=phi, kappa=1: 
                                inv((1/tau)*np.exp(-(phi*u)**kappa)))

Then print mu.value:

In [14]: mu.value
Out[14]: 
0     -30.180251
1      -9.762179
2      -5.213797
3     -34.702784
...

Finally, execute w = pm.MvNormal('w', mu, precision) and inspect mu.value again. It shouldn't change, but I get this:

In [16]: mu.value
Out[16]: 
0     1.407939
1     2.576305
2     1.074116
3    -3.164447

For the record, when using numpy arrays, mu.value is the same before and after pm.MvNormal.
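A minimal sketch of that NumPy-based workaround, under the assumption that converting the DataFrame columns to plain arrays before any arithmetic is acceptable. The helper name linear_predictor and the toy DataFrame are hypothetical; only the column names and the formula come from the code above.

```python
import numpy as np
import pandas as pd

def linear_predictor(b, living_area, age):
    """Return b[0] + b[1]*living_area/1000 + b[2]*age as a plain ndarray."""
    # np.asarray drops the pandas index, so the result is positional.
    living_area = np.asarray(living_area, dtype=float)
    age = np.asarray(age, dtype=float)
    return b[0] + b[1] * living_area / 1000.0 + b[2] * age

# Toy stand-in for the real dataset; these values are made up.
toy_df = pd.DataFrame({"LivingArea": [1500.0, 2000.0], "Age": [10.0, 20.0]})
mu_value = linear_predictor(np.array([1.0, 2.0, 0.5]),
                            toy_df["LivingArea"], toy_df["Age"])
print(type(mu_value).__name__, mu_value)
```

In the model above, the same helper could presumably be called inside the lambda passed to pm.Lambda in place of the inline expression, so that the node's value is an ndarray rather than a Series; that use is not tested here.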


fonnesbeck avatar fonnesbeck commented on August 26, 2024

I don't see a data likelihood in the code; is this the entire model?

At any rate, I'm not entirely surprised that this is not working as expected, as the pandas branch only makes PyMC compatible with DataFrame objects when they are used as observations in the likelihood. I suppose I should make Deterministic objects cast output to arrays as well.
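As a rough illustration of the "cast output to arrays" idea only, and not how PyMC does or would implement it, here is a hypothetical decorator that passes a function's return value through np.asarray. The names returns_ndarray and mean_fn are made up for this sketch.

```python
import functools

import numpy as np
import pandas as pd

def returns_ndarray(func):
    """Wrap func so that whatever it returns is cast to a NumPy array."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return np.asarray(func(*args, **kwargs))
    return wrapper

@returns_ndarray
def mean_fn(b=1):
    # Mirrors the earlier example: the computation yields a Series.
    return pd.Series(np.array([1.0, 2.0]))

result = mean_fn()
print(type(result).__name__, result)
```

Note that the wrapper hides the original default arguments behind *args and **kwargs, so it is probably not a drop-in for a lambda handed to pm.Lambda; a real fix would more likely live inside the Deterministic class itself.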


 avatar commented on August 26, 2024

No, the model is the following:

for (i in 1:N) {
    Y[i] ~ dbern(p[i])
    logit(p[i]) <- w[i]
    mu[i] <- beta[1]+beta[2]*LivingArea[i]/1000+beta[3]*Age[i]
    useless[i] <- HalfBaths[i]+x[i]+y[i]+Age[i]+OtherArea[i]+Beds[i]+Baths[i]+HalfBaths[i]
}

for (i in 1:3) {beta[i] ~ dnorm(0.0,0.001)}
w[1:N] ~ spatial.exp(mu[], x[], y[], spat.prec, phi, 1)     
#for (i in 1:N) {w[i] ~ dnorm(0.0,spat.prec)}
phi ~ dunif(0.1,10) 

spat.prec ~ dgamma(0.1, 0.1)
sigmasq <- 1/spat.prec

So, basically you have a boolean variable (almost certainly logSellingBool) associated with a Bernoulli distribution with probability p, where logit(p) = w and w follows a multivariate distribution that performs kriging.
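For reference, a NumPy-only sketch of the powered exponential precision matrix used for w, following the expression in the PyMC code posted earlier, inv((1/tau)*exp(-(phi*u)**kappa)). The function names and the three toy coordinates are assumptions for illustration; whether this matches the exact parameterisation of spatial.exp in BUGS is not checked here.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of rows in coords."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def powered_exp_precision(dist, tau, phi, kappa=1.0):
    """Invert the covariance (1/tau) * exp(-(phi * dist) ** kappa)."""
    cov = (1.0 / tau) * np.exp(-(phi * dist) ** kappa)
    return np.linalg.inv(cov)

# Three made-up locations standing in for (Latitude, Longitude) pairs.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
dist = pairwise_distances(coords)
precision = powered_exp_precision(dist, tau=2.0, phi=1.0)
print(precision.shape)
```

Everything here operates on plain arrays, so the Series index never enters the computation.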

Thanks. Let me know when you have the new version to test it.

