MCMC & why 3d matters

This example (although quite artificial) shows that viewing a posterior (ok, I have flat priors) in 3d can be quite useful. While the 2d projection may look quite ‘bad’, the 3d volume rendering shows that much of the volume is empty, and the posterior is much better defined than it seems in 2d.

In [1]:
import pylab
import scipy.optimize as op
import emcee
import numpy as np
%matplotlib inline
In [2]:
# our 'blackbox' 3 parameter model which is highly degenerate
def f_model(x, a, b, c):
    return x * np.sqrt(a**2 +b**2 + c**2) + a*x**2 + b*x**3
In [3]:
N = 100
a_true, b_true, c_true = -1., 2., 1.5

# our input and output
x = np.random.rand(N)*0.5#+0.5
y = f_model(x, a_true, b_true, c_true)

# + some (known) gaussian noise
error = 0.2
y += np.random.normal(0, error, N)

# and plot our data
pylab.scatter(x, y);
pylab.xlabel("$x$")
pylab.ylabel("$y$")
Out[3]:
<matplotlib.text.Text at 0x10d1cdb70>
_images/example_mcmc_3_1.png
In [4]:
# our likelihood
def lnlike(theta, x, y, error):
    a, b, c = theta
    model =  f_model(x, a, b, c)
    chisq = 0.5*(np.sum((y-model)**2/error**2))
    return -chisq
result = op.minimize(lambda *args: -lnlike(*args), [a_true, b_true, c_true], args=(x, y, error))
# find the max likelihood
a_ml, b_ml, c_ml = result["x"]
print("estimates", a_ml, b_ml, c_ml)
print("true values", a_true, b_true, c_true)
result["message"]
estimates 0.140490426416 -1.27278566041 2.56885371889
true values -1.0 2.0 1.5
Out[4]:
'Optimization terminated successfully.'
In [5]:
# do the mcmc walk
ndim, nwalkers = 3, 100
pos = [result["x"] + np.random.randn(ndim)*0.1 for i in range(nwalkers)]
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnlike, args=(x, y, error))
sampler.run_mcmc(pos, 1500);
samples = sampler.chain[:, 50:, :].reshape((-1, ndim))

Posterior in 2d

In [7]:
# plot the 2d pdfs
import corner
fig = corner.corner(samples, labels=["$a$", "$b$", "$c$"],
                      truths=[a_true, b_true, c_true])
_images/example_mcmc_7_0.png

Posterior in 3d

In [8]:
import vaex
import scipy.ndimage
import ipyvolume
In [9]:
ds = vaex.from_arrays(a=samples[...,0], b=samples[...,1], c=samples[...,2])
# get 2d histogram
v = ds.count(binby=["a", "b", "c"], shape=64)
# smooth it for visual pleasure
v = scipy.ndimage.gaussian_filter(v, 2)
In [11]:
ipyvolume.quickvolshow(v, lighting=True)

Note that actually a large part of the volume is empty.

In [ ]: